AI-Powered Decision-Support Health Guidance for Non-Expert Patients
⚠️ MEDICAL DISCLAIMER: This tool is NOT a diagnostic system. It provides health guidance and decision support only.
Always consult a qualified healthcare professional for medical advice.
The pipeline is built as a 10-node LangGraph directed acyclic graph:
```
[Image Upload] ─────────────────────────────┐
                                            ▼
[Symptom Text]               Node 1: Image Interpreter
       │                             │
       ▼                             │
Node 2: Symptom Interpreter ◄────────┘
       │
       ▼
Node 3: Context Builder
       │
       ▼
Node 4: RAG Retriever (FAISS)
       │
       ▼
Node 5: Clinical Reasoner (MedGemma-4B)
       │
       ▼
Node 6: Follow-up Generator
       │
       ▼
Node 7: Response Integrator
       │
       ▼
Node 8: Risk Classifier
       │
       ▼
Node 9: Explanation Generator
       │
       ▼
Node 10: Care Suggestion Generator
       │
       ▼
📋 Final Report (JSON + Markdown)
```
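Conceptually, each LangGraph node is a function that receives the shared pipeline state and returns an updated state, and the graph wires those functions together. A dependency-free sketch of that pattern (the node logic and state keys here are illustrative stubs, not the notebook's actual implementation):

```python
# Sketch of the pipeline as a chain of state-transforming functions.
# Real nodes call MedGemma / FAISS; these stubs only annotate the state dict.

def interpret_symptoms(state: dict) -> dict:
    # Node 2 (stub): parse free-text symptoms into a list of findings.
    state["findings"] = [s.strip() for s in state["symptom_text"].split(",")]
    return state

def classify_risk(state: dict) -> dict:
    # Node 8 (stub): toy rule — more findings means higher provisional risk.
    n = len(state.get("findings", []))
    state["risk"] = "High" if n >= 3 else "Medium" if n == 2 else "Low"
    return state

def generate_report(state: dict) -> dict:
    # Node 10 (stub): assemble the final report payload.
    state["report"] = {"risk": state["risk"], "findings": state["findings"]}
    return state

PIPELINE = [interpret_symptoms, classify_risk, generate_report]

def run_pipeline(symptom_text: str) -> dict:
    state = {"symptom_text": symptom_text}
    for node in PIPELINE:  # LangGraph expresses these as graph edges instead
        state = node(state)
    return state["report"]

report = run_pipeline("headache, dizziness")
```

LangGraph adds conditional edges on top of this (e.g., looping back to the follow-up generator), but the state-in/state-out contract per node is the same.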
Over 4 billion people worldwide lack adequate access to healthcare. Patients often cannot make sense of complex medical reports, panic over symptoms they don't recognize, and delay care out of confusion. This assistant addresses that by:
- Accepting medical images: X-rays, lab reports, prescriptions, discharge summaries
- Engaging in multi-turn conversation to build a complete clinical picture
- Generating plain-language, actionable health guidance, free of medical jargon
- Running 100% offline on consumer hardware: no cloud, no data leaks
| Feature | Description |
|---|---|
| 🖼️ Medical Image Understanding | Analyzes X-rays, lab reports, prescriptions via MedGemma multimodal inference |
| 🩺 Symptom Parsing | Categorizes symptoms across 8 clinical domains with emergency red-flag detection |
| 🚨 Emergency Escalation | Instantly surfaces emergency alerts for life-threatening symptoms |
| 🔁 Adaptive Follow-Up | Asks targeted clarifying questions on duration, severity, history, medications, triggers |
| 📚 RAG-Enhanced Reasoning | Retrieves clinical guidelines from a curated FAISS medical knowledge base |
| 🟢🟡🔴 Risk Stratification | Classifies risk as Low / Medium / High with confidence scores |
| 🛡️ Safety Layer | Filters diagnostic overreach, injects uncertainty, enforces disclaimers |
| 📴 100% Offline | Fully local inference; patient data never leaves the device |
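The RAG feature works by ranking knowledge-base snippets against the query embedding and handing the top matches to the reasoner. A minimal sketch of that retrieval idea, substituting bag-of-words cosine similarity for the real MiniLM embeddings (the snippet texts are illustrative, not the project's knowledge base):

```python
import math
from collections import Counter

# Illustrative stand-in for the curated FAISS medical knowledge base.
KNOWLEDGE_BASE = [
    "Chest pain with radiation to the left arm can indicate a cardiac emergency.",
    "Increased thirst and dizziness on standing may suggest dehydration.",
    "Painful urination with fever can accompany a urinary tract infection.",
]

def embed(text: str) -> Counter:
    # Stand-in for sentence-transformers: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list:
    # Rank every snippet by similarity to the query; FAISS does the same
    # ranking over dense vectors with an approximate index.
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

top = retrieve("chest pain radiating to left arm")
```

The real pipeline swaps `embed` for all-MiniLM-L6-v2 vectors and the sort for a FAISS index search; the interface stays the same.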
| Component | Technology | Version |
|---|---|---|
| Base Model | google/medgemma-4b-it | 4B parameters |
| Quantization | BitsAndBytes NF4 + double quant | >=0.43.0 |
| Agent Framework | LangGraph | >=0.1.0 |
| RAG / Retrieval | LangChain + FAISS + sentence-transformers | >=0.2.0 |
| Embeddings | all-MiniLM-L6-v2 | – |
| UI | Gradio | >=4.31.0 |
| Hardware Target | Kaggle T4 GPU (16 GB VRAM) | – |
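The quantization row corresponds to a `transformers` `BitsAndBytesConfig` along these lines (the compute dtype shown is a common choice, not something the table specifies):

```python
import torch
from transformers import BitsAndBytesConfig

# NF4 4-bit weights with double quantization, matching the table above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NormalFloat4 weight format
    bnb_4bit_use_double_quant=True,          # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute dtype is a choice
)
# Passed as from_pretrained(..., quantization_config=bnb_config)
# when loading google/medgemma-4b-it.
```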
- Python 3.9+
- CUDA GPU (16 GB VRAM recommended) or CPU with 8+ GB RAM
- HuggingFace account with access to `google/medgemma-4b-it`
```bash
git clone https://github.com/your-username/medgemma-health-assistant.git
cd medgemma-health-assistant
```

```bash
pip install "transformers>=4.40.0" "accelerate>=0.27.0" "bitsandbytes>=0.43.0" \
    "langchain>=0.2.0" "langchain-community>=0.2.0" "langgraph>=0.1.0" \
    faiss-cpu sentence-transformers "gradio>=4.31.0" \
    Pillow torch torchvision huggingface_hub peft einops timm
```

```python
from huggingface_hub import login
login(token="YOUR_HF_TOKEN")
```

On Kaggle, store your token as a secret named `HF_TOKEN`. The notebook retrieves it automatically.
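One way to make authentication work both locally and on Kaggle is to check the environment first. The helper below is an illustrative sketch, not the notebook's code (`kaggle_secrets` exists only inside Kaggle notebooks):

```python
import os

def get_hf_token():
    """Return the HuggingFace token from the env, or Kaggle's secret store."""
    # Prefer an explicit environment variable (local runs, CI).
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    # Fall back to Kaggle's secret store when running in a Kaggle notebook.
    try:
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret("HF_TOKEN")
    except ImportError:
        return None

# login(token=get_hf_token())  # huggingface_hub.login, as shown above
```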
Open `medgemma_health_assistant_final.ipynb` and run all cells. The Gradio app launches in Section 18.
- (Optional) Upload a medical image β X-ray, lab report, prescription, or discharge summary
- Describe your symptoms in plain language
- Answer follow-up questions to help the assistant refine its understanding
- Receive your report β risk level, possible concerns, recommended actions, and a plain-language explanation
Example inputs to try:
- "Chest pain this morning, feels like pressure, mild shortness of breath"
- "Severe headache for 3 days, dizzy when standing, very thirsty"
- "Stomach pain below belly button, painful urination, slight fever since yesterday"
- "Hit my head 2 hours ago, now headache and feeling confused"
```
Round 1 → Full pipeline runs → Clarity < 65%? → Ask follow-up questions
Round 2 → Pipeline re-runs with answers → Clarity < 65%? → Ask more questions
...
Round N → Clarity ≥ 65% OR emergency detected → Generate final report
```

Clarity threshold: 65% | Max follow-up rounds: 4
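The loop above amounts to a small controller around the pipeline. A sketch with the stated 65% threshold and 4-round cap (the three callables are stand-ins for the real stages):

```python
CLARITY_THRESHOLD = 0.65
MAX_ROUNDS = 4

def run_conversation(assess_clarity, ask_followups, is_emergency):
    """Drive follow-up rounds until the clinical picture is clear enough.

    The three callables are stand-ins for real pipeline stages.
    Returns the number of follow-up rounds that were needed.
    """
    for round_no in range(MAX_ROUNDS):
        clarity = assess_clarity()
        # Emergencies short-circuit: report immediately, no more questions.
        if is_emergency() or clarity >= CLARITY_THRESHOLD:
            return round_no
        ask_followups()  # otherwise ask targeted clarifying questions
    return MAX_ROUNDS    # cap reached: report with what we have

# Toy session: clarity improves each round, crossing 65% on round 3.
clarities = iter([0.30, 0.55, 0.80])
rounds = run_conversation(lambda: next(clarities), lambda: None, lambda: False)
```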
| Indicator | Level | Meaning |
|---|---|---|
| 🟢 | Low | Self-care likely sufficient; monitor symptoms |
| 🟡 | Medium | Schedule a doctor visit soon |
| 🔴 | High | Seek medical care promptly |
| 🚨 | Emergency | Call emergency services NOW |
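In code, the table above reduces to a simple lookup. A sketch (the wording mirrors the table; the function name is illustrative):

```python
# Mapping from risk level to indicator and guidance, mirroring the table.
RISK_GUIDANCE = {
    "low":       ("🟢", "Self-care likely sufficient; monitor symptoms"),
    "medium":    ("🟡", "Schedule a doctor visit soon"),
    "high":      ("🔴", "Seek medical care promptly"),
    "emergency": ("🚨", "Call emergency services NOW"),
}

def format_risk(level: str) -> str:
    # Normalize casing so "Medium" and "medium" both resolve.
    icon, advice = RISK_GUIDANCE[level.lower()]
    return f"{icon} {level.title()}: {advice}"
```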
- No diagnosis: hedged language replaces all diagnostic phrasing automatically
- Emergency-first: red-flag symptoms trigger escalation before any other response
- Mandatory disclaimers: appended to every single response without exception
- Content safety: regex filtering removes hopeless or harmful language
- Offline by design: no telemetry, no logging, full patient privacy
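The hedging and disclaimer rules can be pictured as a regex post-processing pass over model output. A sketch with illustrative patterns (not the project's actual rule set):

```python
import re

DISCLAIMER = ("\n\n⚠️ This is general health guidance, not a diagnosis. "
              "Consult a qualified healthcare professional.")

# Diagnostic phrasing is softened into hedged language (illustrative patterns).
HEDGES = [
    (re.compile(r"\byou have\b", re.IGNORECASE),
     "your symptoms may be consistent with"),
    (re.compile(r"\bthis is\b", re.IGNORECASE),
     "this could be"),
]

def apply_safety_layer(text: str) -> str:
    for pattern, replacement in HEDGES:
        text = pattern.sub(replacement, text)
    return text + DISCLAIMER  # disclaimer appended to every response
```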
Six built-in test cases validate the pipeline end-to-end:
| ID | Scenario | Expected Risk | Emergency |
|---|---|---|---|
| TC001 | Fever with body ache | Medium | ❌ |
| TC002 | Chest pain + left arm radiation + sweating | High | ✅ |
| TC003 | Dehydration signs | Medium | ❌ |
| TC004 | Possible UTI | Medium | ❌ |
| TC005 | Head injury with confusion | High | ✅ |
| TC006 | Minor finger wound infection | Low | ❌ |
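A harness for cases like these just compares expected and predicted risk labels. A minimal sketch (the `classify` argument stands in for the full pipeline; the two cases shown are copied from the table):

```python
# Illustrative evaluation harness; `classify` stands in for the real pipeline.
TEST_CASES = [
    ("TC001", "fever with body ache", "Medium"),
    ("TC006", "minor finger wound infection", "Low"),
]

def evaluate(classify) -> float:
    """Return the fraction of test cases whose predicted risk matches."""
    passed = sum(1 for _, text, expected in TEST_CASES
                 if classify(text) == expected)
    return passed / len(TEST_CASES)

# Trivial stub classifier: gets TC006 right and TC001 wrong.
accuracy = evaluate(lambda text: "Low")
```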
Run the full suite with:

```python
run_evaluation()
```

| Environment | Status | Notes |
|---|---|---|
| Kaggle T4 GPU | ✅ Primary target | 2–5 s response time |
| 16 GB RAM Laptop | ✅ Recommended | CPU inference ~10–30 s |
| Apple M2 / M3 | ✅ Metal acceleration | Good performance |
| 8 GB RAM Windows/Linux | ⚠️ | 4-bit quantization enables this |
| Android (Termux + llama.cpp) | ⚠️ | GGUF conversion required |
| iOS (CoreML) | ⚠️ | Swift integration required |
Memory footprint:
```
MedGemma 4-bit model   ≈ 2.1 GB VRAM
FAISS medical index    ≈ 10 MB RAM
MiniLM embeddings      ≈ 80 MB RAM
App overhead           ≈ 500 MB RAM
─────────────────────────────────────
Total minimum          ≈ 3 GB RAM
```
Target: 4+ billion underserved patients globally
Use cases: rural clinics, home monitoring, elderly care, community health workers
Cost: $0 API cost (fully local inference)
Privacy: 100% local (no data leaves the device)
Roadmap:
- Multilingual support (Hindi, Swahili, Spanish, Arabic)
- Voice input/output for low-literacy users
- Wearable sensor data integration
- Community health worker dashboard
- Fine-tuning on local disease prevalence data
- WhatsApp / SMS bot for feature phones
- Progressive Web App (PWA)
| Section | Description |
|---|---|
| 1 | Setup & package installation |
| 2 | Imports & GPU detection |
| 3 | HuggingFace authentication |
| 4 | MedGemma model loading (4-bit quantized) |
| 5 | Medical image understanding pipeline |
| 6 | Symptom intake module & emergency detection |
| 7 | RAG pipeline with FAISS medical knowledge base |
| 8 | Conversational follow-up engine |
| 9 | LangGraph state & MedGemma inference helper |
| 9b | LangGraph node definitions (Nodes 1β10) |
| 9c | LangGraph workflow compilation |
| 10 | Decision engine & session manager |
| 11 | Safety layer |
| 12 | Pipeline runner & report formatter |
| 13 | Gradio chat UI |
| 14 | Evaluation suite |
| 15 | Edge deployment notes |
| 16 | Competition writeup |
| 17 | Demo video script |
| 18 | App launch |
- Google MedGemma: medical multimodal language model
- LangChain / LangGraph: agentic pipeline framework
- FAISS: efficient vector similarity search (Meta AI)
- Gradio: ML demo UI (Hugging Face)
- BitsAndBytes: 4-bit quantization
This project is licensed under the MIT License; see the LICENSE file for details.
Built for the Google MedGemma Impact Challenge on Kaggle.
For research and demonstration purposes only. Not a certified medical device.
Made with ❤️ to make healthcare guidance accessible to everyone, everywhere.
⭐ Star this repo if you found it helpful!