# Real-Time Drug Safety Intelligence via Hybrid RAG
A production-style REST API that automates drug safety research for pharmacovigilance analysts. It combines live FDA adverse event data, real-time PubMed literature, and a pre-built biomedical vector knowledge base to deliver AI-synthesized drug safety assessments in seconds — replacing a process that traditionally takes 4–8 hours.
- 1. Problem Statement
- 2. Our Solution
- 3. How It Works — System Architecture
- 4. What We Built — Actions Taken
- 5. Results & Validation
- 6. Technical Stack — Why Each Tool Was Chosen
- 7. Getting Started
- 8. API Endpoints
- 9. Deployment
- 10. Future Scope
## 1. Problem Statement

Every drug that reaches the market must be continuously monitored for unexpected side effects — a regulated practice called pharmacovigilance. When a safety analyst suspects that a drug is causing a new adverse reaction, the investigation requires:
- Searching the FDA's FAERS database (millions of raw adverse event reports) for statistical patterns
- Cross-referencing those patterns with peer-reviewed medical literature on PubMed
- Writing a structured safety assessment that distinguishes confirmed signals from noise
This process is performed entirely by hand. For a single drug query, a trained analyst typically spends 4 to 8 hours pulling numbers from one system, abstracts from another, and then synthesizing it all into a coherent report.
Three specific problems make this workflow painful:
- Data fragmentation: FDA event data and PubMed literature exist in completely separate, unconnected systems with different query interfaces.
- Static knowledge goes stale: Drug safety data changes weekly. A pre-built knowledge base without live data is unreliable for current signal detection.
- Context without synthesis is useless: Knowing "47 heart failure reports were filed" means nothing without the clinical context to interpret whether that count is elevated relative to the drug's mechanism and patient population.
## 2. Our Solution

MedSignal is an API that collapses the entire pharmacovigilance research workflow into a single HTTP request.
A user sends a natural language question — for example, "What cardiac adverse events have been reported for semaglutide in the last 6 months in patients over 65?" — and receives back:
- Live FDA adverse event statistics (reaction counts, outcome severity, demographic breakdown)
- Relevant PubMed abstracts from the last indexed period
- Synthesized passages from a pre-built biomedical knowledge base
- A structured AI-generated safety assessment with confidence score and formatted citations
The core insight is Hybrid RAG (Retrieval-Augmented Generation): instead of relying solely on a pre-built static index (which goes stale) or solely on live APIs (which are unstructured), we fan out across three sources simultaneously and merge the results before handing them to the language model. This ensures the LLM always has both current data and deep pre-indexed biomedical context.
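The fan-out-and-merge pattern is the heart of the design and can be sketched in a few lines of asyncio. The function names below are illustrative stand-ins for the project's retrieval services, and the sleeps simulate network latency:

```python
import asyncio

# Stand-ins for the three retrieval services; each would normally make an
# async HTTP call or vector-store lookup.
async def fetch_openfda(drug: str) -> dict:
    await asyncio.sleep(0.1)
    return {"source": "openfda", "drug": drug}

async def fetch_pubmed(drug: str) -> dict:
    await asyncio.sleep(0.1)
    return {"source": "pubmed", "drug": drug}

async def search_vector_store(drug: str) -> dict:
    await asyncio.sleep(0.1)
    return {"source": "faiss", "drug": drug}

async def retrieve_all(drug: str) -> list[dict]:
    # All three coroutines run concurrently, so total wall time is roughly
    # the slowest single call rather than the sum of all three.
    return list(await asyncio.gather(
        fetch_openfda(drug),
        fetch_pubmed(drug),
        search_vector_store(drug),
    ))

results = asyncio.run(retrieve_all("semaglutide"))
print([r["source"] for r in results])  # → ['openfda', 'pubmed', 'faiss']
```

`asyncio.gather()` returns results in the order the coroutines were passed, regardless of which finishes first, so downstream code can rely on a stable layout.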
## 3. How It Works — System Architecture

```text
User / Analyst
      │
      │  POST /api/v1/query
      ▼
┌────────────────────────────────────────────────────────┐
│                    FastAPI Gateway                     │
│   (Pydantic validation → drug whitelist + date         │
│    format + chronology checks before any API call)     │
└──────────┬─────────────────────────────────────────────┘
           │
           │  asyncio.gather() — all three run in parallel
           ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                        Parallel Retrieval Fan-Out                        │
│                                                                          │
│ ┌─────────────────────┐ ┌─────────────────────┐ ┌────────────────────┐   │
│ │     openFDA API     │ │ PubMed E-utilities  │ │  FAISS Vector DB   │   │
│ │       (Live)        │ │       (Live)        │ │   (Static Index)   │   │
│ │                     │ │                     │ │                    │   │
│ │ - Total report count│ │ - MeSH-term search  │ │ - ~1,500 embedded  │   │
│ │ - Top reactions     │ │ - Abstracts (up to  │ │   PubMed abstracts │   │
│ │ - Outcome severity  │ │   5 most relevant)  │ │   per drug         │   │
│ │ - Sex demographics  │ │ - PMID + pub date   │ │ - Cosine sim. ≥    │   │
│ │ - Optional filters: │ │ - XML parsed        │ │   0.45 threshold   │   │
│ │   date_range,       │ │   server-side       │ │ - S-PubMedBert     │   │
│ │   age_group         │ │                     │ │   embeddings       │   │
│ └─────────────────────┘ └─────────────────────┘ └────────────────────┘   │
└───────────────────────────────────┬──────────────────────────────────────┘
                                    │
                                    ▼
                         ┌─────────────────────┐
                         │   Context Merger    │
                         │                     │
                         │ - Deduplicates by   │
                         │   PMID across all   │
                         │   three sources     │
                         │ - Truncates long    │
                         │   abstracts to      │
                         │   preserve context  │
                         │   window budget     │
                         │ - Formats as        │
                         │   labeled sections  │
                         └──────────┬──────────┘
                                    │
                                    ▼
                     ┌──────────────────────────────┐
                     │  Groq LLM (Llama 3.3-70B)    │
                     │                              │
                     │  System prompt enforces:     │
                     │  - Evidence-only responses   │
                     │  - Structured output format  │
                     │  - Confidence score (0–1)    │
                     │  - No hallucinated citations │
                     └──────────────┬───────────────┘
                                    │
                                    ▼
                         JSON Response with:
                         - synthesized_assessment
                         - adverse_events (counts + stats)
                         - literature_context (articles used)
                         - citations (formatted references)
                         - confidence_score
                         - metadata (latency, sources used)
```
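The merger stage is simple enough to sketch directly. This is an illustrative version, not the project's `context_merger.py`; the field names and the 1,500-character truncation limit are assumptions for the sketch:

```python
def merge_contexts(live_pubmed: list[dict], faiss_hits: list[dict],
                   max_chars: int = 1500) -> list[dict]:
    """Deduplicate abstracts by PMID and truncate long ones.

    Live PubMed results take precedence; any FAISS hit whose PMID was
    already seen is dropped, so the LLM never reads the same study twice.
    """
    seen: set = set()
    merged: list[dict] = []
    for doc in list(live_pubmed) + list(faiss_hits):
        pmid = doc.get("pmid")
        if pmid in seen:
            continue
        seen.add(pmid)
        # Truncate to protect the LLM context-window budget.
        merged.append({**doc, "abstract": doc.get("abstract", "")[:max_chars]})
    return merged

live = [{"pmid": "111", "abstract": "A" * 2000}]
static = [{"pmid": "111", "abstract": "duplicate"}, {"pmid": "222", "abstract": "B"}]
merged = merge_contexts(live, static)
print(len(merged))  # → 2 (the duplicate PMID 111 from FAISS is dropped)
```

Ordering live results first means the freshest copy of a duplicated study is the one that survives deduplication.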
## 4. What We Built — Actions Taken

Before the API can serve requests, a one-time offline pipeline builds the static knowledge base:
- `fetch_pubmed.py`: Queries PubMed E-utilities for up to 1,500 abstracts per target drug using relevance-sorted search. Fetches in batches of 50 (to respect URL length limits), with exponential backoff retries via `tenacity` and rate limiting (0.35–0.5 s between requests to respect NCBI's API policy).
- `build_index.py`: Loads the fetched abstracts, embeds each document using `S-PubMedBert-MS-MARCO` (a biomedical-domain sentence transformer), and indexes all embeddings into a FAISS `IndexFlatIP` (inner product = cosine similarity on normalized vectors). The index and a JSON metadata file are written to disk and loaded into memory at API startup.
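The reason `IndexFlatIP` doubles as a cosine-similarity index is that inner product equals cosine similarity on L2-normalized vectors. A numpy-only sketch of that retrieval math, with random vectors standing in for real S-PubMedBert embeddings and a brute-force dot product standing in for FAISS (the 0.45 floor mirrors the threshold used in the architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v: np.ndarray) -> np.ndarray:
    # L2-normalize rows so that inner product == cosine similarity
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in corpus: 1,500 documents embedded into 768-dim vectors
# (random here; the real pipeline uses S-PubMedBert embeddings).
corpus = normalize(rng.standard_normal((1500, 768)).astype("float32"))
query = normalize(rng.standard_normal((1, 768)).astype("float32"))

# Equivalent of IndexFlatIP.search(query, k=5): exact inner products
scores = corpus @ query.T                 # shape (1500, 1)
top_k = np.argsort(-scores[:, 0])[:5]     # indices of the 5 best matches

# Apply the similarity floor so weak matches are dropped entirely
hits = [(int(i), float(scores[i, 0])) for i in top_k if scores[i, 0] >= 0.45]
```

With random vectors the 0.45 floor typically filters out every candidate; real biomedical embeddings of semantically related abstracts score well above it, which is exactly what the threshold is for.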
- Three retrieval services (`openfda.py`, `pubmed.py`, `vector_store.py`) are each isolated modules with their own async clients and error handling. The API uses `asyncio.gather()` to hit all three concurrently — no request waits on another.
- `context_merger.py` deduplicates results by PMID across live PubMed and static FAISS results (so the LLM never sees the same study twice), truncates long abstracts to stay within the LLM context budget, and formats everything as labeled sections for the prompt.
- `llm.py` sends the merged context to Groq with a strict system prompt that forbids hallucination, requires structured output, and instructs the model to end every response with a parseable `CONFIDENCE_SCORE:` line.
- Pydantic schemas validate every incoming request: drug names are whitelisted against a configured list, and date ranges must be in `YYYYMMDD+TO+YYYYMMDD` format with chronological validity enforced before any downstream API call is made.
- `RequestLoggingMiddleware` attaches a UUID request ID to every request and logs structured JSON with method, path, status code, and latency in milliseconds.
## 5. Results & Validation

An adversarial test suite of 14 edge cases was designed to probe the system's failure modes before deployment, covering paradoxical queries, hallucination bait, injection attempts, invalid inputs, and excessive query lengths. It produced the following verified outcomes:
| Test Category | Test Case | Result |
|---|---|---|
| Input Validation | Unsupported drug (`ibuprofen`) | `422` rejected in 0.01s |
| Input Validation | Made-up drug name | 422 rejected in 0.00s |
| Input Validation | Invalid date range (future start date) | 422 rejected in 0.01s |
| Hallucination Resistance | "Does metformin cause neon green hair and levitation?" | LLM confirmed zero FDA evidence; cited real GI reactions |
| Paradoxical Query | Asking if a weight-loss drug causes uncontrollable weight gain | LLM cited 360 "weight decreased" reports to correctly refute the premise |
| Off-topic Query | "Can metformin make me better at chess?" | LLM returned factual safety profile; explicitly stated no evidence for cognitive enhancement |
| SQL Injection | `'; DROP TABLE users; --` injected into query field | Request processed safely; injection treated as plain text with no code execution |
| Gibberish Input | Random characters as query | System returned a coherent safety profile; ignored unintelligible query |
| Brand Name | `ozempic` (brand name for semaglutide) | `422` — highlights a known limitation for future brand-name normalization |
| Complex Clinical Query | Comorbidity + B12 deficiency question | All 3 sources engaged; structured clinical response returned |
| Signal Report | Full pharmacovigilance report requested | 7-section formal report generated with executive summary and risk characterization |
| Typical Response Time | Standard query (first 4 test cases before rate limit) | 3.73s – 5.87s end-to-end including 3 parallel API calls + LLM synthesis |
Key takeaways:
- Input guardrails blocked 100% of invalid drug names and malformed date ranges before any API credits were spent
- The LLM remained grounded in the provided context for all hallucination-bait questions
- Parallel retrieval keeps end-to-end latency under 6 seconds for clean queries
## 6. Technical Stack — Why Each Tool Was Chosen

| Technology | Role | Why This Choice |
|---|---|---|
| FastAPI | Web framework & API gateway | Native async/await support is essential here — the core architecture depends on firing three external API calls in parallel. FastAPI's asyncio.gather() integration and automatic OpenAPI docs made it the right fit over Flask or Django. |
| httpx | Async HTTP client | Unlike requests (which is synchronous), httpx supports async natively and integrates directly with FastAPI's event loop. This is what makes the parallel fan-out possible without threading overhead. |
| FAISS (Facebook AI Similarity Search) | Vector database for semantic retrieval | FAISS is the industry standard for high-performance similarity search. For a knowledge base of ~1,500–3,000 medical abstracts, IndexFlatIP with normalized vectors gives exact cosine similarity results with microsecond query latency — no approximate-search tradeoff needed at this scale. |
| S-PubMedBert-MS-MARCO | Sentence embedding model | This model was pre-trained on biomedical PubMed corpora and fine-tuned for semantic retrieval on MS MARCO. It significantly outperforms general-purpose embeddings (like all-MiniLM) on medical vocabulary because it understands terminology like "medullary thyroid carcinoma" or "GLP-1 receptor agonist" rather than treating them as rare tokens. |
| Groq API (Llama 3.3 70B) | LLM for synthesis | Groq's custom LPU (Language Processing Unit) hardware delivers Llama 3.3-70B inference at speeds that allow complete assessment generation in ~1–3 seconds. This is 10–20x faster than running the same model on standard GPU APIs, which is critical for keeping the total query latency under 6 seconds. |
| Pydantic v2 | Request validation & schema enforcement | The whitelist validator on drug_name and the chronological validator on date_range act as a security and cost-control firewall — invalid requests are rejected at the schema layer before any external API credits are spent. |
| Tenacity | Retry logic for ingestion | The offline PubMed ingestion pipeline fetches thousands of records across hundreds of batch requests. tenacity's exponential backoff decorator handles transient network errors without crashing the entire ingestion job. |
| Docker | Containerization | The FAISS index and sentence transformer model (~440MB) must be packaged with the application. Docker ensures the full runtime — Python version, system libraries, model weights, and index files — is reproducible across environments and deployable to any container platform. |
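A minimal Dockerfile along these lines would package the service; the paths and layout are assumptions about the repository, not taken from it:

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code plus the pre-built FAISS index and metadata
# (assumed to live under app/ after the ingestion scripts have run)
COPY app/ app/

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

The ~440MB sentence-transformer model can either be baked into the image at build time or downloaded on first startup; baking it in gives slower builds but faster, offline-capable cold starts.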
## 7. Getting Started

- Python 3.10+
- A free Groq API key
- (Optional) API keys for openFDA and NCBI — the API works without them but at lower rate limits
```bash
# 1. Clone the repository
git clone https://github.com/nikhilreddy00/MedSignal-API.git
cd MedSignal-API

# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Create your .env file
cat > .env << EOF
GROQ_API_KEY=gsk_your_key_here
OPENFDA_API_KEY=  # optional
NCBI_API_KEY=     # optional
LOG_LEVEL=INFO
EOF
```

Build the knowledge base:

```bash
# Step 1: Fetch PubMed abstracts (~2–5 minutes depending on API key)
python -m app.ingestion.fetch_pubmed

# Step 2: Embed and index into FAISS (~3–10 minutes, downloads ~440MB model on first run)
python -m app.ingestion.build_index
```

Start the API:

```bash
uvicorn app.main:app --reload --port 8000
```

Navigate to http://localhost:8000/docs for the interactive Swagger UI.
## 8. API Endpoints

### `POST /api/v1/query`

Submit a natural language pharmacovigilance question.
Request body:

```json
{
  "drug_name": "semaglutide",
  "query": "What cardiac adverse events have been reported in patients over 65?",
  "date_range": "20240101+TO+20241231",
  "age_group": "65+"
}
```

Response includes: `synthesized_assessment`, `adverse_events` (counts + severity + demographics), `literature_context` (articles used), `citations`, `confidence_score`, and `metadata` (latency, sources used, models).
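Assuming a local run on port 8000, the query endpoint can be exercised from the command line (a sketch using standard curl flags; requires the server to be running):

```bash
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
        "drug_name": "semaglutide",
        "query": "What cardiac adverse events have been reported in patients over 65?",
        "date_range": "20240101+TO+20241231",
        "age_group": "65+"
      }'
```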
A report endpoint generates a 7-section pharmacovigilance signal report (Executive Summary → Signal Description → Adverse Event Analysis → Literature Review → Risk Characterization → Recommendations → Data Sources).
Request body:

```json
{
  "drug_name": "semaglutide",
  "report_type": "comprehensive"
}
```

Supported `report_type` values: `comprehensive`, `cardiac`, `hepatic`, `renal`.
A health-check endpoint returns connectivity status for all three external dependencies (openFDA, PubMed, Groq), vector store load status, document count, and active model names.
## 10. Future Scope

If scaled to a production pharmacovigilance environment:
- Automated re-indexing: Nightly CI/CD pipeline to re-fetch the latest PubMed publications and rebuild the FAISS index, so the static knowledge base stays within days of current literature.
- Brand-name normalization: Map trade names (e.g., "Ozempic", "Wegovy") to their generic INN equivalents before validation, rather than rejecting them.
- RAGAS evaluation: Automated context relevance and answer faithfulness scoring on a held-out test set after every index rebuild (target: >0.92 faithfulness).
- Expanded drug coverage: Extending `TARGET_DRUGS` beyond the current PoC scope of semaglutide and metformin to cover full therapeutic classes.
- EHR integration: Add a POST endpoint that accepts de-identified patient records and generates patient-specific drug interaction alerts by cross-referencing the existing safety signal data.