Jonathan Muhire

I build intelligent systems that ship.

ML systems, data infrastructure, and applied AI. 19 open-source repositories with real benchmarks and production code. GitHub has everything.

Selected Work · Resume · Email

CS & AI · GSoC 2025 · Prev co-founder @ Neotix

Selected Work

Research and systems I've shipped.

Agentic Long-Runner

ReAct agent with 4 memory modes. Chunked vector retrieval fully recovers needle-in-haystack performance on 5,000+ word documents.

ReAct Memory Eval
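
For readers who want a feel for the chunked-retrieval memory mode, here is a minimal sketch of the general technique, not the code in the repo. It assumes a toy bag-of-words vector in place of a real embedding model, and the helper names (`chunk_words`, `embed`, `retrieve`), the chunk size, and the haystack text are all hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical haystack: filler text with one needle sentence buried in the middle.
filler = "the quick brown fox jumps over the lazy dog " * 300
needle = "The secret passphrase is marigold-47."
haystack = filler + needle + " " + filler

chunks = chunk_words(haystack)
top = retrieve("What is the secret passphrase?", chunks, k=1)[0]
print(len(haystack.split()), "words,", len(chunks), "chunks")
print("needle retrieved:", "marigold-47" in top)
```

The overlap between windows is there so a sentence that straddles a chunk boundary still appears whole in at least one chunk.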

RenAIssance — GSoC 2025

Document AI pipeline for Renaissance manuscripts. Layout detection with LayoutLMv3, OCR, and post-correction on real historical pages.

Document AI LayoutLMv3 OCR

MinIO + LakeFS Infrastructure

3-node distributed cluster with erasure coding, LakeFS versioning, and Prometheus + Grafana monitoring for ML datasets.

MinIO LakeFS Docker

ArtExtract

CNN-RNN architecture for artwork classification. Trained on real art samples with confusion matrix validation and training curves.

Computer Vision CNN-RNN

CFN Biomedical Eval

Evaluation framework for biomedical NLP. Token-level and span-level benchmarks for clinical concept extraction.

NLP Biomedical Eval
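
The difference between token-level and span-level scoring is easiest to see on one sentence. The sketch below is a generic illustration that assumes BIO-tagged sequences and exact-match span scoring; the function names, the `PROBLEM` label, and the example sentence are hypothetical and not taken from the framework itself.

```python
def extract_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, label) spans; end is exclusive."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing sentinel flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def f1(true_pos, n_pred, n_gold):
    """F1 from a true-positive count and the predicted/gold totals."""
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def entity_label(tag):
    """Strip the B-/I- prefix; return None for the outside tag."""
    return None if tag == "O" else tag[2:]


def token_f1(gold, pred):
    """Micro F1 over entity tokens; a token counts if its entity label matches."""
    true_pos = sum(
        1 for g, p in zip(gold, pred)
        if entity_label(g) is not None and entity_label(g) == entity_label(p)
    )
    return f1(true_pos, sum(p != "O" for p in pred), sum(g != "O" for g in gold))


def span_f1(gold, pred):
    """Micro F1 over spans; a span counts only if boundaries and label match exactly."""
    g, p = extract_spans(gold), extract_spans(pred)
    return f1(len(g & p), len(p), len(g))


# Tokens: patient has chronic kidney disease and hypertension
gold = ["O", "O", "B-PROBLEM", "I-PROBLEM", "I-PROBLEM", "O", "B-PROBLEM"]
pred = ["O", "O", "O", "B-PROBLEM", "I-PROBLEM", "O", "B-PROBLEM"]

print(round(token_f1(gold, pred), 3))  # 0.857
print(span_f1(gold, pred))             # 0.5
```

The prediction misses one word of "chronic kidney disease", which costs little at the token level but turns the whole span into an error at the span level.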

ISSR — Crisis Detection

NLP system for crisis-signal classification using social text, sentiment analysis, and geospatial mapping.

NLP Sentiment Geospatial

Architecture

Systems I've built.

Real architectures from real projects. Each links to the source repo.

Memory-Augmented Agents

ReAct agent with 4 memory modes. Chunked retrieval lifts needle-in-haystack results from 0% to 100%.

Distributed Data Infrastructure

3-node MinIO cluster with erasure coding, LakeFS versioning, and Prometheus monitoring.

Document AI Pipeline

GSoC 2025. 3-stage architecture: layout detection, OCR, and post-correction on Renaissance manuscripts.

All Projects

Full timeline.

2025 — ML Research & Systems

Agentic Long-Runner

ReAct agent with 4 memory modes. Chunked retrieval takes needle-in-haystack evals from 0% to 100%.

Repo

ReAct Memory

RenAIssance — GSoC 2025

Document AI for historical manuscripts. Layout detection, OCR, post-correction.

Repo

LayoutLMv3 PyTorch

MinIO + LakeFS Infrastructure

Distributed object storage with dataset versioning for reproducible ML experiments.

Repo

MinIO Docker

CFN Biomedical Eval

Token-level and span-level benchmarks for clinical NLP concept extraction.

Repo

NLP Eval

ISSR — Crisis Detection

NLP crisis-signal classification with sentiment analysis and geospatial features.

Repo

NLP Sentiment

ArtExtract

CNN-RNN architecture for artwork classification with validation metrics.

Repo

PyTorch CV

Custom PyMyCobot

Leader-follower control system for bimanual manipulation and data collection.

Repo

Python Hardware

Swarm Simulator

Multi-agent simulation with flocking, foraging, and formation algorithms.

Repo

Python Multi-Agent
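
As a rough illustration of the flocking part, the sketch below applies the three classic boids rules (cohesion, alignment, separation) in 2D. It is a generic textbook version, not the simulator's implementation; the `step` function, the dict layout, and every weight and radius value are assumptions chosen for the example.

```python
import math


def step(boids, radius=5.0, w_sep=0.05, w_ali=0.05, w_coh=0.01, dt=1.0):
    """Advance the flock one tick. Each boid is a dict with 'pos' and 'vel' (x, y) tuples."""
    updated = []
    for b in boids:
        x, y = b["pos"]
        vx, vy = b["vel"]
        near = [o for o in boids if o is not b and math.dist(b["pos"], o["pos"]) < radius]
        if near:
            n = len(near)
            # Cohesion: steer toward the neighbors' center of mass.
            coh_x = sum(o["pos"][0] for o in near) / n - x
            coh_y = sum(o["pos"][1] for o in near) / n - y
            # Alignment: steer toward the neighbors' mean velocity.
            ali_x = sum(o["vel"][0] for o in near) / n - vx
            ali_y = sum(o["vel"][1] for o in near) / n - vy
            # Separation: steer away from each neighbor.
            sep_x = sum(x - o["pos"][0] for o in near)
            sep_y = sum(y - o["pos"][1] for o in near)
            vx += w_coh * coh_x + w_ali * ali_x + w_sep * sep_x
            vy += w_coh * coh_y + w_ali * ali_y + w_sep * sep_y
        # Positions are computed from the previous tick's state, so update order does not matter.
        updated.append({"pos": (x + vx * dt, y + vy * dt), "vel": (vx, vy)})
    return updated


flock = [
    {"pos": (0.0, 0.0), "vel": (1.0, 0.0)},
    {"pos": (0.0, 2.0), "vel": (0.0, 0.0)},
    {"pos": (100.0, 100.0), "vel": (0.0, 1.0)},  # out of range, keeps its velocity
]
for _ in range(3):
    flock = step(flock)
print(flock)
```

Building a new list each tick, rather than mutating boids in place, keeps every boid reacting to the same snapshot of the flock.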

G1 Humanoid RL

RL policy for dynamic locomotion in MuJoCo simulation.

Repo

RL MuJoCo

MediSync

Health records platform with scheduling and clinical workflows.

Repo

React Node.js

AgriFinance

Financial services for smallholder farmers. Credit scoring and mobile-first design.

Repo

Flutter Firebase

ConnectFarm

Agricultural marketplace connecting farmers to buyers.

Repo

Flutter Firebase

2024 — Full-Stack & Mobile

NutrAI

Full-stack nutrition app with meal planning and dietary analysis.

Repo

React Python

CampusBuddy

Flutter app for campus events, dining, and student resources.

Repo

Flutter Dart

POS Terminal

Java POS application with inventory tracking and sales reporting.

Repo

Java Desktop

2023 — Early Projects

Space Invader

C++ arcade game with custom rendering and gameplay mechanics.

Repo

C++ SFML

Pacman AI

Autonomous agents with search algorithms and gameplay heuristics.

Repo

Python Search

Writing

Notes and breakdowns.

November 17, 2025

The State of Robotics in 2025: Why the Hype Isn't Lying (Yet)

A deep dive into the four critical bottlenecks slowing the robotics revolution and why general-purpose robots remain inevitable

December 23, 2024

Making Sense of Multimodal Models with Partial Information Decomposition

A deep dive into how Partial Information Decomposition (PID) reveals how different modalities interact in AI systems, from redundancy to synergy

Contact

Open to research and systems roles.

Looking for teams building ML systems, data infrastructure, or applied AI.

Email · LinkedIn · Resume