Academic assessment suffers from a disconnect between static syllabus documents and active evaluation. Manual processes create three core inefficiencies:
- Drafting Overhead: Manually aligning exam questions with specific modules and Bloom’s Taxonomy levels is repetitive and error-prone.
- The Handwriting Gap: Grading handwritten student work traditionally requires manual transcription or physical handling, slowing down the feedback loop.
- Contextual Accuracy: Standard AI graders often rely on general knowledge rather than the specific "ground truth" of a university’s official curriculum.
EduRAG digitizes the academic lifecycle by anchoring every action to the official syllabus. Using Retrieval-Augmented Generation (RAG), it ensures that question generation and answer evaluation are strictly mapped to the uploaded curriculum. It bridges the gap between physical handwriting and digital evaluation through multimodal AI.
- Syllabus Digitization: Uses Tesseract OCR to extract text from PDF syllabi, cleaning and partitioning data into logical "Modules" to preserve the curriculum structure.
- Vectorized Knowledge: Converts structured text into high-dimensional vectors stored in ChromaDB, allowing for topic-specific similarity searches.
- Automated Exam Generation: Retrieves syllabus context based on user-defined parameters (Subject, Module, Bloom’s Level) and generates a structured exam paper following university-standard "Choice" logic (e.g., Q1 OR Q2).
- Multimodal Evaluation: Processes images of handwritten answers by:
  - Transcribing handwriting into digital text via Vision-Language models.
  - Retrieving the specific syllabus "Truth" for the given question.
  - Comparing student output against the syllabus to award marks and provide constructive reasoning.
- Orchestration: LangChain (Chains, PromptTemplates, OutputParsers).
- Models: Google Gemini 1.5 Pro & 2.5 Flash-Lite (Multimodal/Vision).
- OCR Engine: Tesseract OCR (Initial PDF-to-Text).
- Vector Database: ChromaDB (Storage and Retrieval).
- Document Generation: FPDF2 (PDF Rendering).
- Interface: Streamlit (Web Dashboard).
- Install Tesseract OCR: Ensure the Tesseract engine is installed on your OS and added to your system's PATH.
- Clone Repository: Download the project files to your local environment.
- Install Dependencies: Run `pip install -r requirements.txt`.
- Configure API Key: Create a `.env` file in the root directory and add `GOOGLE_API_KEY=your_key_here`.
- Launch App: Execute `streamlit run app.py` to start the local server.
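Before launching, it can help to confirm that the setup steps took effect. The following is an optional sketch, not part of the project: `read_env_file` and `preflight` are hypothetical helpers, and the small `.env` parser is a standard-library stand-in for whatever loader the app itself uses. It checks that a `tesseract` executable is on the PATH and that `GOOGLE_API_KEY` is set to something other than the placeholder.

```python
import os
import shutil
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env_path: str = ".env") -> list[str]:
    """Return a list of human-readable setup problems (empty if none were found)."""
    problems: list[str] = []
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which("tesseract") is None:
        problems.append("tesseract not found on PATH")
    # Prefer a key already exported in the shell, then fall back to the .env file.
    key = os.environ.get("GOOGLE_API_KEY") or read_env_file(env_path).get("GOOGLE_API_KEY")
    if not key or key == "your_key_here":
        problems.append("GOOGLE_API_KEY is missing or still a placeholder")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Setup issue:", problem)
```

Running the script from the project root prints one line per detected issue and nothing when both checks pass.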





