Medical RAG – Retrieval-Augmented Generation for Medical Documents

A Retrieval-Augmented Generation (RAG) system designed to answer questions from medical documents using semantic search and LLM-based reasoning.

The system processes medical PDFs, builds a vector database, retrieves relevant context for user queries, and generates grounded responses using an LLM.

This project demonstrates how vector search, LLMs, and a web API can be combined to build a domain-specific knowledge assistant.

Project Demo

🎥 Demo Video:

A short demo of the Medical RAG system processing medical PDFs and answering user queries using retrieval-augmented generation.

Demo Video

Key Features

Medical document question answering
Retrieval-Augmented Generation pipeline
Semantic search using vector embeddings
PDF ingestion and processing
Image/diagram description using Gemini
FastAPI backend
Web-based UI for interaction
ChromaDB vector storage

System Architecture

The system follows a typical RAG pipeline:

Document Ingestion
- Medical PDFs are loaded and processed
- Text is extracted page by page
Image Understanding
- Pages containing diagrams/images are processed with Gemini
- Image descriptions are added to the document text
Embedding Generation
- Text chunks are converted into embeddings
Vector Database
- Embeddings are stored in ChromaDB
Query Processing
- The user question is converted to an embedding
- Relevant chunks are retrieved from the vector database
LLM Response Generation
- Retrieved context is sent to the LLM
- The model generates a grounded answer
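
The chunking, embedding, and retrieval steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the project's code: `chunk_text`, `embed`, `cosine`, and `top_k` are hypothetical names, the word-count `embed` is a toy stand-in for a real sentence-embedding model, and a plain list stands in for ChromaDB.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (ASSUMPTION): lowercase word counts, not a real model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Placeholder chunks, not taken from any real medical document.
demo_chunks = [
    "Placeholder text about hypertension symptoms.",
    "Placeholder text about insulin and glucose.",
    "Placeholder text about bone fractures.",
]
print(top_k("What are the symptoms of hypertension?", demo_chunks, k=1))
```

In the real pipeline the similarity search would be delegated to the vector database, and the overlap between neighbouring chunks is one common way to avoid cutting a sentence's context in half.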

Example Workflow

User Question

What are the symptoms of hypertension?

Pipeline Steps

User Query
    ↓
Embedding Generation
    ↓
Vector Search (ChromaDB)
    ↓
Retrieve Relevant Medical Context
    ↓
LLM Answer Generation (Gemini)
    ↓
Final Response
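
The hand-off from "Retrieve Relevant Medical Context" to "LLM Answer Generation" usually comes down to assembling a prompt from the retrieved chunks. The README does not show the prompt used in `qa_service.py`, so the sketch below is only one plausible shape: `build_prompt` and its wording are assumptions, and the call to Gemini itself is left out.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the context.

    ASSUMPTION: hypothetical helper; the project's real prompt may differ.
    """
    # Number the chunks so an answer could point back to its source.
    numbered = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "What are the symptoms of hypertension?",
    ["<retrieved chunk 1>", "<retrieved chunk 2>"],
)
print(prompt)
```

Telling the model to admit when the context is insufficient is a simple guard against answers that are not grounded in the retrieved text.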

Project Structure

medical-rag/
│
├── main.py
├── vector.py
├── retriever_service.py
├── qa_service.py
├── app.py
├── requirements.txt
│
├── static/
│   └── index.html        # Web interface
│
├── data/
│   ├── pdfs/             # Source medical documents
│   ├── processed/        # Processed text files
│   ├── mapping/
│   └── chroma_db/        # Vector database

Tech Stack

Python
FastAPI
ChromaDB (Vector Database)
Gemini API (LLM + Vision)
Sentence Embeddings
HTML / JavaScript UI

Installation

Clone the repository

git clone https://github.com/MuntahaShams/medical-rag.git
cd medical-rag

Install dependencies

pip install -r requirements.txt

Environment Setup

Set your Gemini API key.

Mac / Linux

export GEMINI_API_KEY="your_api_key"

Windows (Command Prompt)

set GEMINI_API_KEY=your_api_key
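
On the Python side, a service can read the key once at startup and fail with a clear message if it is missing. This is a minimal sketch; `get_gemini_api_key` is a hypothetical helper, and the project may load the key differently.

```python
import os


def get_gemini_api_key() -> str:
    """Return GEMINI_API_KEY from the environment or raise a clear error.

    ASSUMPTION: hypothetical helper, not necessarily how the project reads the key.
    """
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before starting the server."
        )
    return key
```

Failing early like this surfaces a missing key when the server starts rather than on the first user question.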

Running the Application

Start the FastAPI server:

uvicorn app:app --reload --port 8000

Access the UI

Open your browser and go to:

http://127.0.0.1:8000/ui

You will see the Medical RAG interface where you can ask questions about the medical documents.
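
The same server can also be queried from a script instead of the browser. The README does not list the API routes, so the sketch below assumes a hypothetical `POST /ask` endpoint that accepts `{"question": ...}` and returns JSON; check `app.py` for the real route and payload before using it.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # host and port from the uvicorn command
ASK_PATH = "/ask"  # ASSUMPTION: hypothetical route, not confirmed by the README


def build_ask_request(question: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the question."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + ASK_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 30.0) -> dict:
    """Send the question to the running server and decode the JSON reply."""
    with urllib.request.urlopen(build_ask_request(question), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `ask("What are the symptoms of hypertension?")` would return whatever JSON the endpoint produces, provided the assumed route and payload match the application.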

Scripts Overview

Script	Purpose
`main.py`	Processes PDFs and extracts text and image descriptions
`vector.py`	Generates embeddings and builds the vector database
`retriever_service.py`	Retrieves relevant chunks from ChromaDB
`qa_service.py`	Sends context + query to Gemini for answer generation
`app.py`	FastAPI application and API endpoints

Challenges

Some challenges in building domain-specific RAG systems include:

handling long medical documents
maintaining context relevance
processing images and diagrams
reducing hallucinations
ensuring accurate retrieval

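This project addresses these by splitting documents into chunks before embedding them, retrieving only the chunks most semantically similar to the query, adding Gemini-generated descriptions of images and diagrams to the document text, and grounding the LLM's answer in the retrieved context.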

Future Improvements

citation support in answers
better chunking strategies
evaluation metrics for RAG quality
UI improvements
support for multiple medical datasets
hybrid search (BM25 + vector)
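
For the hybrid search item, one common way to merge a keyword ranking (such as BM25) with a vector ranking is reciprocal rank fusion. The sketch below is not part of the project: `reciprocal_rank_fusion` is a hypothetical helper, the two input rankings are assumed to come from separate BM25 and vector searches, and `k = 60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk ids into one fused ranking.

    Each id scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical chunk ids ranked by two different retrievers.
bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "c", "a"]
print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

Because fusion works on ranks rather than raw scores, the BM25 and cosine-similarity scales never need to be normalised against each other.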

Author

Muntaha Shams

AI Engineer – LLMs | NLP | Computer Vision | Document AI

GitHub https://github.com/MuntahaShams

Portfolio https://muntahashams.github.io/portfolio/projects
