MedSignal API

Real-Time Drug Safety Intelligence via Hybrid RAG

Python 3.10+ FastAPI FAISS Groq License: MIT

A production-style REST API that automates drug safety research for pharmacovigilance analysts. It combines live FDA adverse event data, real-time PubMed literature, and a pre-built biomedical vector knowledge base to deliver AI-synthesized drug safety assessments in seconds — replacing a process that traditionally takes 4–8 hours.


1. Problem Statement

Every drug that reaches the market must be continuously monitored for unexpected side effects — a regulated practice called pharmacovigilance. When a safety analyst suspects that a drug is causing a new adverse reaction, the investigation requires:

  1. Searching the FDA's FAERS database (millions of raw adverse event reports) for statistical patterns
  2. Cross-referencing those patterns with peer-reviewed medical literature on PubMed
  3. Writing a structured safety assessment that distinguishes confirmed signals from noise

This process is performed entirely by hand. For a single drug query, a trained analyst typically spends 4 to 8 hours pulling numbers from one system, abstracts from another, and then synthesizing it all into a coherent report.

Three specific problems make this workflow painful:

  • Data fragmentation: FDA event data and PubMed literature exist in completely separate, unconnected systems with different query interfaces.
  • Static knowledge goes stale: Drug safety data changes weekly. A pre-built knowledge base without live data is unreliable for current signal detection.
  • Context without synthesis is useless: Knowing "47 heart failure reports were filed" means nothing without the clinical context to interpret whether that count is elevated relative to the drug's mechanism and patient population.

2. Our Solution

MedSignal is an API that collapses the entire pharmacovigilance research workflow into a single HTTP request.

A user sends a natural language question — for example, "What cardiac adverse events have been reported for semaglutide in the last 6 months in patients over 65?" — and receives back:

  • Live FDA adverse event statistics (reaction counts, outcome severity, demographic breakdown)
  • Relevant PubMed abstracts from the last indexed period
  • Synthesized passages from a pre-built biomedical knowledge base
  • A structured AI-generated safety assessment with confidence score and formatted citations

The core insight is Hybrid RAG (Retrieval-Augmented Generation): instead of relying solely on a pre-built static index (which goes stale) or solely on live APIs (which are unstructured), we fan out across three sources simultaneously and merge the results before handing them to the language model. This ensures the LLM always has both current data and deep pre-indexed biomedical context.


3. How It Works — System Architecture

User / Analyst
     │
     │  POST /api/v1/query
     ▼
┌────────────────────────────────────────────────────────┐
│                   FastAPI Gateway                       │
│     (Pydantic validation → drug whitelist + date       │
│      format + chronology checks before any API call)   │
└──────────┬─────────────────────────────────────────────┘
           │
           │  asyncio.gather() — all three run in parallel
           ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                        Parallel Retrieval Fan-Out                         │
│                                                                            │
│  ┌─────────────────────┐  ┌─────────────────────┐  ┌────────────────────┐│
│  │   openFDA API        │  │  PubMed E-utilities  │  │   FAISS Vector DB  ││
│  │  (Live)              │  │  (Live)              │  │  (Static Index)    ││
│  │                      │  │                      │  │                    ││
│  │  - Total report count│  │  - MeSH-term search  │  │  - ~1,500 embedded ││
│  │  - Top reactions     │  │  - Abstracts (up to  │  │    PubMed abstracts││
│  │  - Outcome severity  │  │    5 most relevant)  │  │    per drug        ││
│  │  - Sex demographics  │  │  - PMID + pub date   │  │  - Cosine sim. ≥   ││
│  │  - Optional filters: │  │  - XML parsed        │  │    0.45 threshold  ││
│  │    date_range,       │  │    server-side        │  │  - S-PubMedBert    ││
│  │    age_group         │  │                      │  │    embeddings      ││
│  └─────────────────────┘  └─────────────────────┘  └────────────────────┘│
└───────────────────────────────────┬──────────────────────────────────────┘
                                    │
                                    ▼
                        ┌─────────────────────┐
                        │   Context Merger     │
                        │                      │
                        │  - Deduplicates by   │
                        │    PMID across all   │
                        │    three sources     │
                        │  - Truncates long    │
                        │    abstracts to      │
                        │    preserve context  │
                        │    window budget     │
                        │  - Formats as        │
                        │    labeled sections  │
                        └──────────┬──────────┘
                                   │
                                   ▼
                    ┌──────────────────────────────┐
                    │   Groq LLM (Llama 3.3-70B)   │
                    │                               │
                    │  System prompt enforces:      │
                    │  - Evidence-only responses    │
                    │  - Structured output format   │
                    │  - Confidence score (0–1)     │
                    │  - No hallucinated citations  │
                    └──────────────┬───────────────┘
                                   │
                                   ▼
                    JSON Response with:
                    - synthesized_assessment
                    - adverse_events (counts + stats)
                    - literature_context (articles used)
                    - citations (formatted references)
                    - confidence_score
                    - metadata (latency, sources used)

4. What We Built — Actions Taken

Step 1: Offline Ingestion Pipeline

Before the API can serve requests, a one-time offline pipeline builds the static knowledge base:

  • fetch_pubmed.py: Queries PubMed E-utilities for up to 1,500 abstracts per target drug using relevance-sorted search. Fetches in batches of 50 (to respect URL length limits), with exponential backoff retries via tenacity and rate limiting (0.35–0.5s between requests to respect NCBI's API policy).
  • build_index.py: Loads the fetched abstracts, embeds each document using S-PubMedBert-MS-MARCO (a biomedical-domain sentence transformer), and indexes all embeddings into a FAISS IndexFlatIP (inner product = cosine similarity on normalized vectors). The index and a JSON metadata file are written to disk and loaded into memory at API startup.
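Conceptually, the retrieval step that `build_index.py` prepares is exact inner-product search over L2-normalized embeddings. The following NumPy-only sketch shows what `IndexFlatIP` computes, under stated assumptions: random vectors stand in for S-PubMedBert embeddings, and `build_index`/`search` are illustrative names, not the project's actual API.

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """L2-normalize rows so that inner product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def build_index(doc_embeddings: np.ndarray) -> np.ndarray:
    """Stand-in for faiss.IndexFlatIP: a flat index just stores the vectors."""
    return normalize(doc_embeddings)

def search(index: np.ndarray, query_emb: np.ndarray, k: int = 3,
           threshold: float = 0.45) -> list[tuple[int, float]]:
    """Exact inner-product search with the 0.45 cosine cutoff described above."""
    q = normalize(query_emb.reshape(1, -1))[0]
    scores = index @ q                       # cosine similarity per document
    top = np.argsort(-scores)[:k]            # best k by descending similarity
    return [(int(i), float(scores[i])) for i in top if scores[i] >= threshold]

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 8)).astype("float32")   # pretend embeddings
index = build_index(docs)
hits = search(index, docs[42], k=3)
# The query vector itself should come back as the top hit with similarity ~1.0.
```

At this scale (a few thousand vectors), the exact flat index is fast enough that no approximate-search structure is needed.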

Step 2: The API Server

  • Each of the three retrieval services (openfda.py, pubmed.py, vector_store.py) is an isolated module with its own async client and error handling. The API uses asyncio.gather() to hit all three concurrently, so no request waits on another.
  • context_merger.py deduplicates results by PMID across live PubMed and static FAISS results (so the LLM never sees the same study twice), truncates long abstracts to stay within LLM context budget, and formats everything as labeled sections for the prompt.
  • llm.py sends the merged context to Groq with a strict system prompt that forbids hallucination, requires structured output, and instructs the model to end every response with a parseable CONFIDENCE_SCORE: line.
  • Pydantic schemas validate every incoming request: drug names are whitelisted against a configured list; date ranges must be in YYYYMMDD+TO+YYYYMMDD format with chronological validity enforced before any downstream API call is made.
  • RequestLoggingMiddleware attaches a UUID request ID to every request and logs structured JSON with method, path, status code, and latency in milliseconds.
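The fan-out-and-merge described above can be sketched in a few lines of asyncio. The fetchers below return canned data standing in for the real openfda.py / pubmed.py / vector_store.py modules; the function names and payload shapes are illustrative assumptions, not the project's actual interfaces.

```python
import asyncio

# Stand-in fetchers: the real modules make httpx calls and FAISS lookups
# instead of returning canned data.
async def fetch_openfda(drug: str) -> dict:
    return {"source": "openfda", "total_reports": 360}

async def fetch_pubmed(drug: str) -> list[dict]:
    return [{"pmid": "111", "title": "Live PubMed hit"},
            {"pmid": "222", "title": "Also found live"}]

async def search_faiss(drug: str) -> list[dict]:
    return [{"pmid": "222", "title": "Duplicate of a live hit"},
            {"pmid": "333", "title": "Static-index only"}]

def merge_by_pmid(*article_lists: list[dict]) -> list[dict]:
    """Keep the first occurrence of each PMID so the LLM never sees a study twice."""
    seen, merged = set(), []
    for articles in article_lists:
        for article in articles:
            if article["pmid"] not in seen:
                seen.add(article["pmid"])
                merged.append(article)
    return merged

async def retrieve(drug: str) -> tuple[dict, list[dict]]:
    # All three sources are awaited concurrently; total latency is roughly
    # that of the slowest source, not the sum of all three.
    fda, live, static = await asyncio.gather(
        fetch_openfda(drug), fetch_pubmed(drug), search_faiss(drug)
    )
    return fda, merge_by_pmid(live, static)

fda, articles = asyncio.run(retrieve("semaglutide"))
```

The same pattern scales to real httpx clients: because each coroutine yields while waiting on the network, no threads are needed for the parallel fan-out.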

Step 3: Adversarial Testing

An adversarial test suite of 14 edge cases was designed to probe the system's failure modes before deployment, including paradoxical queries, hallucination bait, injection attempts, invalid inputs, and excessive query lengths.


5. Results & Validation

The 14-case adversarial test suite produced the following verified outcomes:

| Test Category | Test Case | Result |
|---|---|---|
| Input Validation | Unsupported drug (ibuprofen) | 422 rejected in 0.01s |
| Input Validation | Made-up drug name | 422 rejected in 0.00s |
| Input Validation | Invalid date range (future start date) | 422 rejected in 0.01s |
| Hallucination Resistance | "Does metformin cause neon green hair and levitation?" | LLM confirmed zero FDA evidence; cited real GI reactions |
| Paradoxical Query | Asking if a weight-loss drug causes uncontrollable weight gain | LLM cited 360 "weight decreased" reports to correctly refute the premise |
| Off-topic Query | "Can metformin make me better at chess?" | LLM returned the factual safety profile; explicitly stated no evidence for cognitive enhancement |
| SQL Injection | '; DROP TABLE users; -- injected into query field | Request processed safely; injection treated as plain text with no code execution |
| Gibberish Input | Random characters as query | System returned a coherent safety profile; ignored the unintelligible query |
| Brand Name | ozempic (brand name for semaglutide) | 422 rejected; a known limitation slated for brand-name normalization |
| Complex Clinical Query | Comorbidity + B12 deficiency question | All 3 sources engaged; structured clinical response returned |
| Signal Report | Full pharmacovigilance report requested | 7-section formal report generated with executive summary and risk characterization |
| Typical Response Time | Standard query (first 4 test cases before rate limit) | 3.73–5.87s end-to-end, including 3 parallel API calls + LLM synthesis |

Key takeaways:

  • Input guardrails blocked 100% of invalid drug names and malformed date ranges before any API credits were spent
  • The LLM remained grounded in the provided context for all hallucination-bait questions
  • Parallel retrieval keeps end-to-end latency under 6 seconds for clean queries

6. Technical Stack — Why Each Tool Was Chosen

| Technology | Role | Why This Choice |
|---|---|---|
| FastAPI | Web framework & API gateway | Native async/await support is essential here: the core architecture depends on firing three external API calls in parallel. First-class async endpoints and automatic OpenAPI docs made it the right fit over Flask or Django. |
| httpx | Async HTTP client | Unlike requests (which is synchronous), httpx supports async natively and integrates directly with FastAPI's event loop. This is what makes the parallel fan-out possible without threading overhead. |
| FAISS (Facebook AI Similarity Search) | Vector database for semantic retrieval | FAISS is the industry standard for high-performance similarity search. For a knowledge base of ~1,500–3,000 medical abstracts, IndexFlatIP with normalized vectors gives exact cosine-similarity results with microsecond query latency; no approximate-search tradeoff is needed at this scale. |
| S-PubMedBert-MS-MARCO | Sentence embedding model | Pre-trained on biomedical PubMed corpora and fine-tuned for semantic retrieval on MS MARCO. It significantly outperforms general-purpose embeddings (like all-MiniLM) on medical vocabulary because it understands terms such as "medullary thyroid carcinoma" or "GLP-1 receptor agonist" rather than treating them as rare tokens. |
| Groq API (Llama 3.3 70B) | LLM for synthesis | Groq's custom LPU (Language Processing Unit) hardware delivers Llama 3.3-70B inference fast enough for complete assessment generation in ~1–3 seconds, roughly 10–20x faster than the same model on standard GPU APIs, which is critical for keeping total query latency under 6 seconds. |
| Pydantic v2 | Request validation & schema enforcement | The whitelist validator on drug_name and the chronological validator on date_range act as a security and cost-control firewall: invalid requests are rejected at the schema layer before any external API credits are spent. |
| Tenacity | Retry logic for ingestion | The offline PubMed ingestion pipeline fetches thousands of records across hundreds of batch requests. tenacity's exponential-backoff decorator handles transient network errors without crashing the entire ingestion job. |
| Docker | Containerization | The FAISS index and sentence-transformer model (~440MB) must be packaged with the application. Docker makes the full runtime (Python version, system libraries, model weights, index files) reproducible across environments and deployable to any container platform. |

7. Getting Started

Prerequisites

  • Python 3.10+
  • A free Groq API key
  • (Optional) API keys for openFDA and NCBI — the API works without them but at lower rate limits

Installation

# 1. Clone the repository
git clone https://github.com/nikhilreddy00/MedSignal-API.git
cd MedSignal-API

# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Create your .env file
cat > .env << EOF
GROQ_API_KEY=gsk_your_key_here
OPENFDA_API_KEY=         # optional
NCBI_API_KEY=            # optional
LOG_LEVEL=INFO
EOF

Build the Knowledge Base (one-time)

# Step 1: Fetch PubMed abstracts (~2–5 minutes depending on API key)
python -m app.ingestion.fetch_pubmed

# Step 2: Embed and index into FAISS (~3–10 minutes, downloads ~440MB model on first run)
python -m app.ingestion.build_index

Run the Server

uvicorn app.main:app --reload --port 8000

Navigate to http://localhost:8000/docs for the interactive Swagger UI.


8. API Endpoints

POST /api/v1/query — Drug Safety Intelligence

Submit a natural language pharmacovigilance question.

Request body:

{
  "drug_name": "semaglutide",
  "query": "What cardiac adverse events have been reported in patients over 65?",
  "date_range": "20240101+TO+20241231",
  "age_group": "65+"
}

Response includes: synthesized_assessment, adverse_events (counts + severity + demographics), literature_context (articles used), citations, confidence_score, and metadata (latency, sources used, models).
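For reference, a stdlib-only Python client could assemble this request as follows. The endpoint path and field names come from this README; the helper function is illustrative, and the actual send is left commented out since it requires a running server.

```python
import json
import urllib.request

def build_query_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Assemble the POST /api/v1/query request with a JSON body."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request(
    "http://localhost:8000",
    {
        "drug_name": "semaglutide",
        "query": "What cardiac adverse events have been reported in patients over 65?",
        "date_range": "20240101+TO+20241231",
        "age_group": "65+",
    },
)

# With the server running:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
#     print(result["confidence_score"])
```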


POST /api/v1/signal-report — Formal Safety Report

Generate a 7-section pharmacovigilance signal report (Executive Summary → Signal Description → Adverse Event Analysis → Literature Review → Risk Characterization → Recommendations → Data Sources).

Request body:

{
  "drug_name": "semaglutide",
  "report_type": "comprehensive"
}

Supported report_type values: comprehensive, cardiac, hepatic, renal.


GET /api/v1/health — Health Check

Returns connectivity status for all three external dependencies (openFDA, PubMed, Groq), vector store load status, document count, and active model names.


9. Future Scope

If scaled to a production pharmacovigilance environment:

  • Automated re-indexing: Nightly CI/CD pipeline to re-fetch the latest PubMed publications and rebuild the FAISS index, so the static knowledge base stays within days of current literature.
  • Brand-name normalization: Map trade names (e.g., "Ozempic", "Wegovy") to their generic INN equivalents before validation, rather than rejecting them.
  • RAGAS evaluation: Automated context relevance and answer faithfulness scoring on a held-out test set after every index rebuild (target: >0.92 faithfulness).
  • Expanded drug coverage: Extending TARGET_DRUGS beyond the current PoC scope of semaglutide and metformin to cover full therapeutic classes.
  • EHR integration: Add a POST endpoint that accepts de-identified patient records and generates patient-specific drug interaction alerts by cross-referencing the existing safety signal data.
