EduRAG: Syllabus-Aware Examination & Grading System

The Problem

Academic assessment suffers from a disconnect between static syllabus documents and active evaluation. Manual processes create three core inefficiencies:

Drafting Overhead: Manually aligning exam questions with specific modules and Bloom’s Taxonomy levels is repetitive and error-prone.
The Handwriting Gap: Grading handwritten student work traditionally requires manual transcription or physical handling, slowing down the feedback loop.
Contextual Accuracy: Standard AI graders often rely on general knowledge rather than the specific "ground truth" of a university’s official curriculum.

The Solution

EduRAG digitizes the academic lifecycle by anchoring every action to the official syllabus. Using Retrieval-Augmented Generation (RAG), it ensures that question generation and answer evaluation are strictly mapped to the uploaded curriculum. It bridges the gap between physical handwriting and digital evaluation through multimodal AI.

How It Works

Syllabus Digitization: Uses Tesseract OCR to extract text from PDF syllabi, cleaning and partitioning data into logical "Modules" to preserve the curriculum structure.
Vectorized Knowledge: Converts structured text into high-dimensional vectors stored in ChromaDB, allowing for topic-specific similarity searches.
Automated Exam Generation: Retrieves syllabus context based on user-defined parameters (Subject, Module, Bloom’s Level) and generates a structured exam paper following university-standard "Choice" logic (e.g., Q1 OR Q2).
Multimodal Evaluation: Processes images of handwritten answers by:
- Transcribing handwriting into digital text via Vision-Language models.
- Retrieving the specific syllabus "Truth" for the given question.
- Comparing student output against the syllabus to award marks and provide constructive reasoning.
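
The partition, retrieve, and compare steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the "Module N: Title" heading pattern, the function names, and the prompt wording are all hypothetical, and a simple bag-of-words cosine similarity stands in for the ChromaDB embedding search so the example runs without any external service.

```python
import math
import re
from collections import Counter

# Assumption: module headings in the OCR output look like "Module 1: Title".
MODULE_HEADING = re.compile(
    r"^Module[ \t]+(\d+)[ \t]*[:\-]?[ \t]*(.*)$", re.IGNORECASE | re.MULTILINE
)


def split_into_modules(syllabus_text):
    """Partition raw syllabus text into {module_number: {"title", "body"}}."""
    matches = list(MODULE_HEADING.finditer(syllabus_text))
    modules = {}
    for i, match in enumerate(matches):
        # A module's body runs until the next heading (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(syllabus_text)
        modules[int(match.group(1))] = {
            "title": match.group(2).strip(),
            "body": syllabus_text[match.end():end].strip(),
        }
    return modules


def _term_counts(text):
    """Lowercased word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_module(modules, question):
    """Return the number of the module most similar to the question."""
    query = _term_counts(question)
    return max(
        modules,
        key=lambda n: _cosine(
            query, _term_counts(modules[n]["title"] + " " + modules[n]["body"])
        ),
    )


def build_grading_prompt(question, student_answer, syllabus_context, max_marks):
    """Assemble a grading prompt (hypothetical wording) grounded in the syllabus."""
    return (
        "Grade the answer using ONLY the syllabus context below.\n"
        f"Syllabus context:\n{syllabus_context}\n\n"
        f"Question: {question}\n"
        f"Student answer (transcribed): {student_answer}\n"
        f"Award marks out of {max_marks} and explain the reasoning."
    )


if __name__ == "__main__":
    text = (
        "Module 1: Sorting\nBubble sort, merge sort, quick sort.\n"
        "Module 2: Graphs\nBFS, DFS, shortest path algorithms.\n"
    )
    modules = split_into_modules(text)
    question = "Explain how merge sort works"
    best = retrieve_module(modules, question)
    print(build_grading_prompt(question, "It splits and merges.", modules[best]["body"], 5))
```

In the real pipeline the similarity step would be a ChromaDB query over embedded module text, and the assembled prompt would be sent to the Gemini model rather than printed.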

Technical Stack

Orchestration: LangChain (Chains, PromptTemplates, OutputParsers).
Models: Google Gemini 1.5 Pro & 2.5 Flash-Lite (Multimodal/Vision).
OCR Engine: Tesseract OCR (Initial PDF-to-Text).
Vector Database: ChromaDB (Storage and Retrieval).
Document Generation: FPDF2 (PDF Rendering).
Interface: Streamlit (Web Dashboard).

Local Setup

Install Tesseract OCR: Ensure the Tesseract engine is installed on your OS and added to your system's PATH.
Clone Repository: Download the project files to your local environment.
Install Dependencies: Run pip install -r requirements.txt.
Configure API Key: Create a .env file in the root directory and add GOOGLE_API_KEY=your_key_here.
Launch App: Execute streamlit run main_app.py to start the local server.
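
Before launching, it can help to confirm the two prerequisites from the steps above: Tesseract on the PATH and a GOOGLE_API_KEY entry in .env. The script below is a small sketch using only the standard library; the function names are hypothetical and it is not part of the repository. It assumes the .env file uses plain KEY=value lines.

```python
import os
import shutil
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_setup(env_path=".env"):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which("tesseract") is None:
        problems.append("tesseract not found on PATH")
    key = os.environ.get("GOOGLE_API_KEY") or read_env_file(env_path).get(
        "GOOGLE_API_KEY"
    )
    if not key or key == "your_key_here":
        problems.append("GOOGLE_API_KEY missing or still a placeholder")
    return problems


if __name__ == "__main__":
    issues = check_setup()
    for issue in issues:
        print("setup issue:", issue)
    if not issues:
        print("setup looks complete")
```

Run it from the project root so the relative .env path resolves to the file created in the Configure API Key step.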
