utpalbarua/MedGemma-Conversational-Health-Assistant

πŸ₯ MedGemma Conversational Health Assistant

AI-Powered Decision-Support Health Guidance for Non-Expert Patients



⚠️ MEDICAL DISCLAIMER: This tool is NOT a diagnostic system. It provides health guidance and decision-support only.
Always consult a qualified healthcare professional for medical advice.


🖥️ Demo

App UI


πŸ—οΈ System Architecture

Architecture Diagram

The pipeline is built as a 10-node LangGraph directed acyclic graph:

[Image Upload] ──────────────────────────────────────────────┐
                                                              ▼
[Symptom Text] ──► Node 1: Image Interpreter                 │
                        │                                     │
                        ▼                                     │
                   Node 2: Symptom Interpreter  ◄────────────┘
                        │
                        ▼
                   Node 3: Context Builder
                        │
                        ▼
                   Node 4: RAG Retriever (FAISS)
                        │
                        ▼
                   Node 5: Clinical Reasoner (MedGemma-4B)
                        │
                        ▼
                   Node 6: Follow-up Generator
                        │
                        ▼
                   Node 7: Response Integrator
                        │
                        ▼
                   Node 8: Risk Classifier
                        │
                        ▼
                   Node 9: Explanation Generator
                        │
                        ▼
                   Node 10: Care Suggestion Generator
                        │
                        ▼
                  📋 Final Report (JSON + Markdown)
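The node sequence above can be illustrated with a minimal pure-Python stand-in: each node is a function that receives the shared state and returns it updated, and the runner chains them in order. This is only a sketch of the state-passing pattern, not the notebook's actual LangGraph wiring, and the node bodies are hypothetical stubs.

```python
# Minimal stand-in for the LangGraph DAG: state flows through nodes in order.
def run_pipeline(state, nodes):
    for node in nodes:
        state = node(state)
    return state

def symptom_interpreter(state):
    # Node 2 stand-in: split free text into symptom phrases
    state["symptoms"] = [s.strip() for s in state["raw_text"].lower().split(",")]
    return state

def risk_classifier(state):
    # Node 8 stand-in: trivially flag chest pain as high risk
    state["risk"] = "High" if "chest pain" in state["symptoms"] else "Low"
    return state

report = run_pipeline(
    {"raw_text": "Chest pain, mild shortness of breath"},
    [symptom_interpreter, risk_classifier],
)
print(report["risk"])  # -> High
```

In the real pipeline the state is a typed LangGraph state object and each node can also short-circuit (e.g. emergency escalation), but the data flow follows this shape.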

📌 Overview

Over 4 billion people worldwide lack adequate healthcare access. Patients struggle to understand complex medical reports, panic over symptoms they don't recognize, and delay care out of confusion. This assistant addresses that by:

  • Accepting medical images – X-rays, lab reports, prescriptions, discharge summaries
  • Engaging in intelligent multi-turn conversation to build a complete clinical picture
  • Generating plain-language, actionable health guidance – no medical jargon
  • Running 100% offline on consumer hardware – no cloud, no data leaks

✨ Features

| Feature | Description |
| --- | --- |
| 🖼️ Medical Image Understanding | Analyzes X-rays, lab reports, prescriptions via MedGemma multimodal inference |
| 🩺 Symptom Parsing | Categorizes symptoms across 8 clinical domains with emergency red-flag detection |
| 🚨 Emergency Escalation | Instantly surfaces emergency alerts for life-threatening symptoms |
| 🔁 Adaptive Follow-Up | Asks targeted clarifying questions on duration, severity, history, medications, triggers |
| 📚 RAG-Enhanced Reasoning | Retrieves clinical guidelines from a curated FAISS medical knowledge base |
| 🟢🟡🔴 Risk Stratification | Classifies risk as Low / Medium / High with confidence scores |
| 🛡️ Safety Layer | Filters diagnostic overreach, injects uncertainty, enforces disclaimers |
| 📴 100% Offline | Fully local inference – patient data never leaves the device |
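Emergency red-flag detection of the kind described above can be as simple as screening the symptom text against a curated keyword list. The sketch below uses a made-up sample list, not the project's actual clinical rules:

```python
# Illustrative red-flag screen; the keyword set here is a hypothetical sample.
RED_FLAGS = {
    "chest pain", "shortness of breath", "slurred speech",
    "loss of consciousness", "severe bleeding",
}

def detect_red_flags(symptom_text: str) -> list[str]:
    """Return any red-flag phrases found in the patient's free-text input."""
    text = symptom_text.lower()
    return sorted(flag for flag in RED_FLAGS if flag in text)

flags = detect_red_flags("Sudden chest pain and shortness of breath")
print(flags)  # -> ['chest pain', 'shortness of breath']
```

A non-empty result would route the session straight to the emergency escalation path, skipping further follow-up questions.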

🧱 Tech Stack

| Component | Technology | Version |
| --- | --- | --- |
| Base Model | google/medgemma-4b-it | 4B parameters |
| Quantization | BitsAndBytes NF4 + double quant | >=0.43.0 |
| Agent Framework | LangGraph | >=0.1.0 |
| RAG / Retrieval | LangChain + FAISS + sentence-transformers | >=0.2.0 |
| Embeddings | all-MiniLM-L6-v2 | – |
| UI | Gradio | >=4.31.0 |
| Hardware Target | Kaggle T4 GPU (16 GB VRAM) | – |
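The NF4 + double-quantization row corresponds to a standard BitsAndBytes configuration in transformers. A sketch of what the model load might look like; the notebook's exact arguments may differ, and the model is gated, so this requires an authenticated HuggingFace session and a GPU:

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",        # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,   # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForImageTextToText.from_pretrained(
    "google/medgemma-4b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("google/medgemma-4b-it")
```

This configuration is what brings the 4B-parameter model down to roughly 2 GB of VRAM, which is what makes the T4 and edge targets below feasible.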

🚀 Getting Started

Prerequisites

  • Python 3.9+
  • CUDA GPU (16 GB VRAM recommended) or CPU with 8+ GB RAM
  • HuggingFace account with access to google/medgemma-4b-it

1. Clone the Repository

git clone https://github.com/your-username/medgemma-health-assistant.git
cd medgemma-health-assistant

2. Install Dependencies

pip install transformers>=4.40.0 accelerate>=0.27.0 bitsandbytes>=0.43.0 \
            langchain>=0.2.0 langchain-community>=0.2.0 langgraph>=0.1.0 \
            faiss-cpu sentence-transformers gradio>=4.31.0 \
            Pillow torch torchvision huggingface_hub peft einops timm

3. Authenticate with HuggingFace

from huggingface_hub import login
login(token="YOUR_HF_TOKEN")

On Kaggle, store your token as a secret named HF_TOKEN. The notebook retrieves it automatically.

4. Run

Open medgemma_health_assistant_final.ipynb and run all cells. The Gradio app launches in Section 18.


📖 Usage

  1. (Optional) Upload a medical image – X-ray, lab report, prescription, or discharge summary
  2. Describe your symptoms in plain language
  3. Answer follow-up questions to help the assistant refine its understanding
  4. Receive your report – risk level, possible concerns, recommended actions, and a plain-language explanation

Example inputs to try:

  • "Chest pain this morning, feels like pressure, mild shortness of breath"
  • "Severe headache for 3 days, dizzy when standing, very thirsty"
  • "Stomach pain below belly button, painful urination, slight fever since yesterday"
  • "Hit my head 2 hours ago, now headache and feeling confused"

🧠 How It Works

Multi-Turn Session Flow

Round 1 → Full pipeline runs → Clarity < 65%? → Ask follow-up questions
Round 2 → Pipeline re-runs with answers → Clarity < 65%? → Ask more questions
  ...
Round N → Clarity ≥ 65% OR Emergency detected → Generate final report

Clarity threshold: 65% | Max follow-up rounds: 4
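The stopping rule above reduces to a small predicate; the sketch below hard-codes the stated threshold and round budget, while in the notebook the clarity score itself is produced by the pipeline:

```python
MAX_ROUNDS = 4
CLARITY_THRESHOLD = 0.65

def should_finalize(clarity: float, emergency: bool, round_no: int) -> bool:
    """Stop asking follow-ups once the picture is clear enough,
    an emergency is detected, or the round budget is spent."""
    return emergency or clarity >= CLARITY_THRESHOLD or round_no >= MAX_ROUNDS

print(should_finalize(0.40, False, 1))  # unclear, keep asking -> False
print(should_finalize(0.70, False, 2))  # clear enough        -> True
print(should_finalize(0.30, True, 1))   # emergency override  -> True
```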

Risk Levels

| Indicator | Level | Meaning |
| --- | --- | --- |
| 🟢 | Low | Self-care likely sufficient; monitor symptoms |
| 🟡 | Medium | Schedule a doctor visit soon |
| 🔴 | High | Seek medical care promptly |
| 🚨 | Emergency | Call emergency services NOW |
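One common way to map a classifier's output onto these four levels is a thresholded score with an emergency override. The cutoffs below are illustrative assumptions; the project's classifier is model-driven rather than a fixed formula:

```python
def risk_level(score: float, emergency: bool = False) -> str:
    """Map a hypothetical 0-1 risk score to the levels in the table above.
    Emergency detection always overrides the numeric score."""
    if emergency:
        return "Emergency"
    if score >= 0.7:
        return "High"
    if score >= 0.4:
        return "Medium"
    return "Low"

print(risk_level(0.85))                  # -> High
print(risk_level(0.5))                   # -> Medium
print(risk_level(0.1, emergency=True))   # -> Emergency
```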

🛡️ Safety Design

  • No diagnosis – hedged language replaces all diagnostic phrasing automatically
  • Emergency-first – red-flag symptoms trigger escalation before any other response
  • Mandatory disclaimers – appended to every single response without exception
  • Content safety – regex filtering removes hopeless or harmful language
  • Offline by design – no telemetry, no logging, full patient privacy
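The hedging and disclaimer steps can be sketched as a regex rewrite pass. The patterns and disclaimer text below are sample stand-ins, not the notebook's actual rule set:

```python
import re

# Sample hedging rules: diagnostic phrasing -> uncertain phrasing.
HEDGES = [
    (re.compile(r"\byou have\b", re.IGNORECASE), "you may have"),
    (re.compile(r"\bthis is\b", re.IGNORECASE), "this could be"),
]
DISCLAIMER = "\n\n⚠️ This is not medical advice. Consult a healthcare professional."

def apply_safety_layer(text: str) -> str:
    """Replace diagnostic phrasing with hedged language, then append
    the mandatory disclaimer."""
    for pattern, replacement in HEDGES:
        text = pattern.sub(replacement, text)
    return text + DISCLAIMER

print(apply_safety_layer("You have pneumonia."))
# -> "you may have pneumonia." followed by the disclaimer
```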

📊 Evaluation

Six built-in test cases validate the pipeline end-to-end:

| ID | Scenario | Expected Risk | Emergency |
| --- | --- | --- | --- |
| TC001 | Fever with body ache | Medium | ❌ |
| TC002 | Chest pain + left arm radiation + sweating | High | ✅ |
| TC003 | Dehydration signs | Medium | ❌ |
| TC004 | Possible UTI | Medium | ❌ |
| TC005 | Head injury with confusion | High | ❌ |
| TC006 | Minor finger wound infection | Low | ❌ |

Run the full suite with:

run_evaluation()
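A harness of this kind typically compares the pipeline's predicted risk against each case's expected label. The toy version below shows the shape with two of the six cases and a dummy classifier standing in for the real pipeline call:

```python
# Toy evaluation harness; `classify` stands in for the real pipeline.
TEST_CASES = [
    ("TC002", "chest pain radiating to left arm, sweating", "High"),
    ("TC006", "minor finger wound, slight redness", "Low"),
]

def evaluate(classify):
    """Return (case_id, passed) for each test case."""
    return [(case_id, classify(text) == expected)
            for case_id, text, expected in TEST_CASES]

def dummy_classify(text):
    # Stand-in classifier for demonstration only
    return "High" if "chest pain" in text else "Low"

print(evaluate(dummy_classify))  # -> [('TC002', True), ('TC006', True)]
```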

📱 Edge Deployment

| Environment | Status | Notes |
| --- | --- | --- |
| Kaggle T4 GPU | ✅ Primary target | 2–5 s response time |
| 16 GB RAM Laptop | ✅ Recommended | CPU inference ~10–30 s |
| Apple M2 / M3 | ✅ Metal acceleration | Good performance |
| 8 GB RAM Windows/Linux | ⚠️ Possible | 4-bit quantization enables this |
| Android (Termux + llama.cpp) | ⚠️ Experimental | GGUF conversion required |
| iOS (CoreML) | ⚠️ Experimental | Swift integration required |

Memory footprint:

MedGemma 4-bit model  →  ~2.1 GB VRAM
FAISS medical index   →  ~10  MB RAM
MiniLM embeddings     →  ~80  MB RAM
App overhead          →  ~500 MB RAM
─────────────────────────────────────
Total minimum         →  ~3   GB RAM

🌍 Impact & Roadmap

Target: 4+ billion underserved patients globally
Use cases: Rural clinics, home monitoring, elderly care, community health workers
Cost: $0 API cost – fully local inference
Privacy: 100% – no data leaves the device

Roadmap:

  • Multilingual support (Hindi, Swahili, Spanish, Arabic)
  • Voice input/output for low-literacy users
  • Wearable sensor data integration
  • Community health worker dashboard
  • Fine-tuning on local disease prevalence data
  • WhatsApp / SMS bot for feature phones
  • Progressive Web App (PWA)

📝 Notebook Structure

| Section | Description |
| --- | --- |
| 1 | Setup & package installation |
| 2 | Imports & GPU detection |
| 3 | HuggingFace authentication |
| 4 | MedGemma model loading (4-bit quantized) |
| 5 | Medical image understanding pipeline |
| 6 | Symptom intake module & emergency detection |
| 7 | RAG pipeline with FAISS medical knowledge base |
| 8 | Conversational follow-up engine |
| 9 | LangGraph state & MedGemma inference helper |
| 9b | LangGraph node definitions (Nodes 1–10) |
| 9c | LangGraph workflow compilation |
| 10 | Decision engine & session manager |
| 11 | Safety layer |
| 12 | Pipeline runner & report formatter |
| 13 | Gradio chat UI |
| 14 | Evaluation suite |
| 15 | Edge deployment notes |
| 16 | Competition writeup |
| 17 | Demo video script |
| 18 | App launch |

🤝 Acknowledgements


📜 License

This project is licensed under the MIT License – see the LICENSE file for details.

Built for the Google MedGemma Impact Challenge on Kaggle.
For research and demonstration purposes only. Not a certified medical device.


Made with ❤️ to make healthcare guidance accessible to everyone, everywhere.

⭐ Star this repo if you found it helpful!
