Jonathan Muhire

I build intelligent systems that ship.

ML systems, data infrastructure, and applied AI. 19 open-source repositories with real benchmarks and production code. GitHub has everything.

CS & AI · GSoC 2025 · Prev co-founder @ Neotix

Research and systems I've shipped.

Agentic Long-Runner

ReAct agent with 4 memory modes. Chunked vector retrieval fully recovers needle-in-haystack performance on 5,000+ word documents.

ReAct Memory Eval
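As a rough illustration of why chunking helps on a needle-in-haystack test, here is a minimal sketch. It is not the code from the Agentic Long-Runner repo: the bag-of-words "embedding", the chunk sizes, the filler text, and the names `chunk_words`, `embed`, `cosine`, and `retrieve` are all assumptions made for this example. A real system would likely use a learned embedding model and a vector store.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping chunks of `size` words (hypothetical helper)."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy stand-in for an embedding: lowercase bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Build a 5,000+ word haystack and bury one needle sentence in the middle.
filler = ("the quick brown fox jumps over the lazy dog " * 600).split()
needle = "The secret passcode for the vault is 7421.".split()
haystack = " ".join(filler[:2700] + needle + filler[2700:])

chunks = chunk_words(haystack)
top = retrieve("What is the secret passcode for the vault?", chunks, k=1)[0]
print(len(haystack.split()), "words,", len(chunks), "chunks")
print("needle retrieved:", "7421" in top)
```

The point of the sketch is that the agent only needs the one chunk that contains the needle in its context, rather than the whole document.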
RenAIssance - GSoC 2025

Document AI pipeline for Renaissance manuscripts. Layout detection with LayoutLMv3, OCR, and post-correction on real historical pages.

Document AI LayoutLMv3 OCR
MinIO + LakeFS Infrastructure

3-node distributed cluster with erasure coding, LakeFS versioning, and Prometheus + Grafana monitoring for ML datasets.

MinIO LakeFS Docker
ArtExtract

CNN-RNN architecture for artwork classification. Trained on real art samples with confusion matrix validation and training curves.

Computer Vision CNN-RNN
CFN Biomedical Eval

Evaluation framework for biomedical NLP. Token-level and span-level benchmarks for clinical concept extraction.

NLP Biomedical Eval
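To show how token-level and span-level scores can differ on the same prediction, here is a minimal sketch. It assumes BIO-tagged sequences and exact-match span scoring. The labels `DRUG` and `DOSE` and the functions `extract_spans`, `token_f1`, and `span_f1` are hypothetical and are not taken from the CFN Biomedical Eval repo.

```python
def extract_spans(tags):
    """Collect (start, end, label) spans from a BIO sequence; end is exclusive."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside and label is not None:
            spans.add((start, i, label))
            start, label = None, None
        # A B- tag, or an I- tag that does not continue the open span, starts a new span.
        if tag.startswith("B-") or (tag.startswith("I-") and not inside):
            start, label = i, tag[2:]
    return spans


def f1(tp, n_pred, n_gold):
    """F1 from true positives and the predicted/gold totals."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def token_f1(gold, pred):
    """Score each non-O token independently."""
    tp = sum(1 for g, p in zip(gold, pred) if g != "O" and g == p)
    return f1(tp, sum(p != "O" for p in pred), sum(g != "O" for g in gold))


def span_f1(gold, pred):
    """Score whole spans; a span counts only if boundaries and label match exactly."""
    g, p = extract_spans(gold), extract_spans(pred)
    return f1(len(g & p), len(p), len(g))


gold = ["B-DRUG", "I-DRUG", "O", "B-DOSE", "O"]
pred = ["B-DRUG", "O", "O", "B-DOSE", "O"]  # truncates the two-token DRUG mention

print("token F1:", round(token_f1(gold, pred), 3))
print("span F1:", round(span_f1(gold, pred), 3))
```

In this example the truncated mention costs one token at the token level but the whole span at the span level, which is why reporting both is useful for concept extraction.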
ISSR — Crisis Detection

NLP system for crisis-signal classification using social text, sentiment analysis, and geospatial mapping.

NLP Sentiment Geospatial

Systems I've built.

Real architectures from real projects. Each links to the source repo.

Memory-Augmented Agents
ReAct agent with 4 memory modes. Chunked retrieval lifts needle-in-haystack results from 0% to 100%.
Distributed Data Infrastructure
3-node MinIO cluster with erasure coding, LakeFS versioning, and Prometheus monitoring.
Document AI Pipeline
GSoC 2025. 3-stage architecture: layout detection, OCR, and post-correction on Renaissance manuscripts.

Full timeline.

2025 — ML Research & Systems
Agentic Long-Runner

ReAct agent with 4 memory modes. On needle-in-haystack evals, retrieval recovers from 0% to 100%.

ReAct Memory
RenAIssance — GSoC 2025

Document AI for historical manuscripts. Layout detection, OCR, post-correction.

LayoutLMv3 PyTorch
MinIO + LakeFS Infrastructure

Distributed object storage with dataset versioning for reproducible ML experiments.

MinIO Docker
CFN Biomedical Eval

Token-level and span-level benchmarks for clinical NLP concept extraction.

NLP Eval
ISSR — Crisis Detection

NLP crisis-signal classification with sentiment analysis and geospatial features.

NLP Sentiment
ArtExtract

CNN-RNN architecture for artwork classification with validation metrics.

PyTorch CV
Custom PyMyCobot

Leader-follower control system for bimanual manipulation and data collection.

Python Hardware
Swarm Simulator

Multi-agent simulation with flocking, foraging, and formation algorithms.

Python Multi-Agent
G1 Humanoid RL

RL policy for dynamic locomotion in MuJoCo simulation.

RL MuJoCo
MediSync

Health records platform with scheduling and clinical workflows.

React Node.js
AgriFinance

Financial services for smallholder farmers. Credit scoring and mobile-first design.

Flutter Firebase
ConnectFarm

Agricultural marketplace connecting farmers to buyers.

Flutter Firebase
2024 — Full-Stack & Mobile
NutrAI

Full-stack nutrition app with meal planning and dietary analysis.

React Python
CampusBuddy

Flutter app for campus events, dining, and student resources.

Flutter Dart
POS Terminal

Java POS application with inventory tracking and sales reporting.

Java Desktop
2023 — Early Projects
Space Invader

C++ arcade game with custom rendering and gameplay mechanics.

C++ SFML
Pacman AI

Autonomous agents with search algorithms and gameplay heuristics.

Python Search

Notes and breakdowns.

Open to research and systems roles.

Looking for teams building ML systems, data infrastructure, or applied AI.