A Python implementation of the Corrective RAG (CRAG) pipeline using LangGraph. The system evaluates the quality of retrieved documents before generating an answer, and falls back to web search when local knowledge is insufficient.
The pipeline is a LangGraph state machine with the following flow:
START → retrieve → eval_each_doc → [route] → refine → generate → END
                                      ↓
                    rewrite_query → web_search → refine
- Retrieve — Fetches the top-k most similar chunks from a FAISS vector store built from local PDF documents.
- Evaluate — Scores each retrieved chunk's relevance to the question using an LLM. Produces one of three verdicts:
  - CORRECT — at least one chunk scored above the upper threshold → proceeds directly to refinement
  - INCORRECT — all chunks scored below the lower threshold → discards local docs, falls back to web search
  - AMBIGUOUS — mixed scores → rewrites the query and supplements with web results
- Rewrite Query (INCORRECT / AMBIGUOUS only) — Rewrites the user question into a concise web search query.
- Web Search (INCORRECT / AMBIGUOUS only) — Searches the web using Tavily and collects results as documents.
- Refine — Decomposes the combined context into individual sentences and uses the LLM to keep only those directly relevant to the question.
- Generate — Produces the final answer using the refined context.
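The evaluate-and-route step above can be sketched as a pure function. This is a minimal illustration, not the repository's actual code: the function name and the assumption that each chunk receives a relevance score in [0, 1] are hypothetical.

```python
def route_from_scores(scores: list[float],
                      upper_th: float = 0.7,
                      lower_th: float = 0.3) -> str:
    """Map per-chunk relevance scores to a CRAG verdict.

    CORRECT   -> at least one chunk scored above upper_th
    INCORRECT -> every chunk scored below lower_th
    AMBIGUOUS -> anything in between (mixed scores)
    """
    if any(s > upper_th for s in scores):
        return "CORRECT"
    if all(s < lower_th for s in scores):
        return "INCORRECT"
    return "AMBIGUOUS"
```

In a LangGraph pipeline, a function like this would back the conditional edge after the evaluation node, steering the state machine toward refinement or the web-search fallback.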
| Component | Tool |
|---|---|
| LLM | Groq (llama-3.1-8b-instant) |
| Embeddings | HuggingFace sentence-transformers/all-MiniLM-L6-v2 |
| Vector Store | FAISS |
| Orchestration | LangGraph |
| Web Search | Tavily |
| Document Loader | LangChain PyPDFLoader |
CRAG/
├── run.py # CLI entry point
├── requirements.txt # Python dependencies
└── app/
├── __init__.py
├── state.py # LangGraph state schema (TypedDict)
├── model.py # LLM and embeddings initialisation
├── retriever.py # PDF loading and FAISS vector store
├── utils.py # Sentence decomposition utility
└── pipeline.py # LangGraph nodes, edges, and graph compilation
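The sentence-decomposition utility in `app/utils.py` might look roughly like the naive regex-based sketch below; the repository's actual implementation may differ.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., !, or ? followed by whitespace.
    # The refine step would then ask the LLM to keep only the
    # sentences directly relevant to the question.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]
```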
git clone https://github.com/Saurav-kumar077/CRAG.git
cd CRAG
pip install -r requirements.txt

Place your PDF files in a ./data/ directory at the project root:
CRAG/
└── data/
├── document1.pdf
└── document2.pdf
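Discovering the PDFs under ./data/ can be sketched as below. The helper name is hypothetical; in the repository, `app/retriever.py` then feeds these files to LangChain's PyPDFLoader and indexes the chunks in FAISS.

```python
from pathlib import Path


def find_pdfs(data_dir: str = "./data") -> list[Path]:
    # Collect every PDF in the data directory, sorted for determinism.
    return sorted(Path(data_dir).glob("*.pdf"))
```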
Create a .env file in the project root:
GROQ_API_KEY=your_groq_api_key
TAVILY_API_KEY=your_tavily_api_key
# Optional overrides (defaults shown)
GROQ_MODEL=llama-3.1-8b-instant
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
UPPER_TH=0.7
LOWER_TH=0.3
TAVILY_MAX_RESULTS=5

Then run:

python run.py

You will be prompted to enter a question. The pipeline will print the verdict, the reason, the web query (if one was used), and the final answer.
Example output:
Enter your question: What is the capital of France?
VERDICT : CORRECT
REASON : At least one chunk scored > 0.7.
WEB QUERY:
ANSWER :
The capital of France is Paris.
| Variable | Default | Description |
|---|---|---|
| `GROQ_MODEL` | `llama-3.1-8b-instant` | Groq model to use for all LLM calls |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | HuggingFace embedding model |
| `UPPER_TH` | `0.7` | Score threshold to classify retrieval as CORRECT |
| `LOWER_TH` | `0.3` | Score threshold to classify retrieval as INCORRECT |
| `TAVILY_MAX_RESULTS` | `5` | Number of web search results to retrieve |
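Reading these overrides with their documented defaults can be sketched as follows. The helper is hypothetical and stdlib-only; the repository may structure its configuration differently.

```python
import os


def load_config() -> dict:
    # Read tunables from the environment, falling back to the
    # defaults listed in the table above.
    return {
        "groq_model": os.getenv("GROQ_MODEL", "llama-3.1-8b-instant"),
        "embedding_model": os.getenv(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        "upper_th": float(os.getenv("UPPER_TH", "0.7")),
        "lower_th": float(os.getenv("LOWER_TH", "0.3")),
        "tavily_max_results": int(os.getenv("TAVILY_MAX_RESULTS", "5")),
    }
```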
langchain
langchain-groq
faiss-cpu
python-dotenv
tqdm