# CRAG — Corrective Retrieval-Augmented Generation

A Python implementation of the Corrective RAG (CRAG) pipeline using LangGraph. The system evaluates the quality of retrieved documents before generating an answer, and falls back to web search when local knowledge is insufficient.


## How It Works

The pipeline is a LangGraph state machine with the following flow:

```
START → retrieve → eval_each_doc → [route] → refine → generate → END
                                       ↓
                                rewrite_query → web_search → refine
```

1. **Retrieve** — Fetches the top-k most similar chunks from a FAISS vector store built from local PDF documents.
2. **Evaluate** — Scores each retrieved chunk's relevance to the question using an LLM. Produces one of three verdicts:
   - **CORRECT** — at least one chunk scored above the upper threshold → proceeds directly to refinement
   - **INCORRECT** — all chunks scored below the lower threshold → discards the local docs and falls back to web search
   - **AMBIGUOUS** — mixed scores → rewrites the query and supplements the local docs with web results
3. **Rewrite Query** (INCORRECT / AMBIGUOUS only) — Rewrites the user question into a concise web search query.
4. **Web Search** (INCORRECT / AMBIGUOUS only) — Searches the web using Tavily and collects the results as documents.
5. **Refine** — Decomposes the combined context into individual sentences and uses the LLM to keep only those directly relevant to the question.
6. **Generate** — Produces the final answer from the refined context.
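The three-way verdict in step 2 can be sketched in plain Python. This is an illustrative sketch, not the repository's code: the function name `classify_retrieval` and the assumption that scores are floats in `[0, 1]` are ours; the thresholds mirror the documented defaults `UPPER_TH=0.7` and `LOWER_TH=0.3`.

```python
def classify_retrieval(scores, upper_th=0.7, lower_th=0.3):
    """Return CORRECT, INCORRECT, or AMBIGUOUS for a list of chunk relevance scores."""
    if not scores:
        return "INCORRECT"          # nothing retrieved → fall back to web search
    if any(s > upper_th for s in scores):
        return "CORRECT"            # at least one strong chunk → refine directly
    if all(s < lower_th for s in scores):
        return "INCORRECT"          # all weak → discard local docs, web search
    return "AMBIGUOUS"              # mixed → rewrite query, supplement with web


print(classify_retrieval([0.9, 0.2]))   # CORRECT
print(classify_retrieval([0.1, 0.2]))   # INCORRECT
print(classify_retrieval([0.5, 0.4]))   # AMBIGUOUS
```

Treating an empty retrieval as INCORRECT is a natural choice here, since an empty context should always trigger the web-search fallback.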

## Tech Stack

| Component | Tool |
| --- | --- |
| LLM | Groq (`llama-3.1-8b-instant`) |
| Embeddings | HuggingFace `sentence-transformers/all-MiniLM-L6-v2` |
| Vector Store | FAISS |
| Orchestration | LangGraph |
| Web Search | Tavily |
| Document Loader | LangChain `PyPDFLoader` |

## Project Structure

```
CRAG/
├── run.py              # CLI entry point
├── requirements.txt    # Python dependencies
└── app/
    ├── __init__.py
    ├── state.py        # LangGraph state schema (TypedDict)
    ├── model.py        # LLM and embeddings initialisation
    ├── retriever.py    # PDF loading and FAISS vector store
    ├── utils.py        # Sentence decomposition utility
    └── pipeline.py     # LangGraph nodes, edges, and graph compilation
```
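The sentence decomposition utility in `app/utils.py` could look something like the following regex-based sketch. The function name and the splitting rule are assumptions for illustration, not the repository's actual implementation:

```python
import re

def split_into_sentences(text: str) -> list[str]:
    """Naively split text into sentences on ., !, or ? followed by whitespace."""
    # A simple rule like this mishandles abbreviations ("e.g.") but is enough
    # to illustrate the refine step: each sentence becomes a candidate that
    # the LLM can keep or drop based on its relevance to the question.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

print(split_into_sentences("Paris is the capital. It is in France."))
# ['Paris is the capital.', 'It is in France.']
```

Filtering at sentence granularity rather than chunk granularity is what lets the refine step drop irrelevant material without discarding a whole retrieved chunk.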

## Setup

### 1. Clone the repository

```bash
git clone https://github.com/Saurav-kumar077/CRAG.git
cd CRAG
```

### 2. Install dependencies

```bash
pip install -r requirements.txt
```

### 3. Add your documents

Place your PDF files in a `./data/` directory at the project root:

```
CRAG/
└── data/
    ├── document1.pdf
    └── document2.pdf
```

### 4. Configure environment variables

Create a `.env` file in the project root:

```
GROQ_API_KEY=your_groq_api_key
TAVILY_API_KEY=your_tavily_api_key

# Optional overrides (defaults shown)
GROQ_MODEL=llama-3.1-8b-instant
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
UPPER_TH=0.7
LOWER_TH=0.3
TAVILY_MAX_RESULTS=5
```

## Usage

```bash
python run.py
```

You will be prompted to enter a question. The pipeline prints the verdict, the reason, the web query (if one was used), and the final answer.

Example output:

```
Enter your question: What is the capital of France?

VERDICT : CORRECT
REASON  : At least one chunk scored > 0.7.
WEB QUERY:

ANSWER  :
 The capital of France is Paris.
```

## Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `GROQ_MODEL` | `llama-3.1-8b-instant` | Groq model used for all LLM calls |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | HuggingFace embedding model |
| `UPPER_TH` | `0.7` | Score threshold above which retrieval is classified CORRECT |
| `LOWER_TH` | `0.3` | Score threshold below which retrieval is classified INCORRECT |
| `TAVILY_MAX_RESULTS` | `5` | Number of web search results to retrieve |
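These settings suggest a loading pattern along the following lines. This is a standard-library sketch using `os.getenv` with the documented defaults; the repository presumably loads the `.env` file via `python-dotenv` first (it is listed in the dependencies), and the variable placement in `app/model.py` is our assumption:

```python
import os

# Read each setting from the environment, falling back to the documented default.
GROQ_MODEL = os.getenv("GROQ_MODEL", "llama-3.1-8b-instant")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL",
                            "sentence-transformers/all-MiniLM-L6-v2")
UPPER_TH = float(os.getenv("UPPER_TH", "0.7"))   # CORRECT threshold
LOWER_TH = float(os.getenv("LOWER_TH", "0.3"))   # INCORRECT threshold
TAVILY_MAX_RESULTS = int(os.getenv("TAVILY_MAX_RESULTS", "5"))
```

Keeping the environment values as strings until the point of use and converting with `float`/`int` there avoids surprises when a `.env` file overrides a numeric default.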

## Dependencies

```
langchain
langchain-groq
faiss-cpu
python-dotenv
tqdm
```
