AI-Powered Study Buddy for MISM Students

An intelligent study assistant that helps MISM students at CMU learn from their course materials through AI-powered summaries, practice questions, and interactive Q&A using Retrieval-Augmented Generation (RAG).

Features

Core Functionality

Document Processing: Upload and process PDFs, PowerPoint slides, Word documents, and text files
RAG-Powered Q&A: Ask questions and get accurate answers grounded in your course materials
Smart Summaries: Generate concise summaries of topics or entire documents
Practice Questions: Auto-generate diverse question types:
- Multiple Choice Questions (MCQs)
- True/False
- Fill-in-the-Blanks
- Short Answer
- Match-the-Following
- Long Answer/Essay Questions
Evaluation Framework: Built-in RAGAS and DeepEval metrics for quality assessment
User Feedback: Collect and analyze user feedback for continuous improvement

Architecture

AI-study-buddy/
├── backend/              # FastAPI backend server
│   ├── main.py          # API endpoints
│   ├── config.py        # Configuration management
│   └── requirements.txt
├── frontend/            # Streamlit UI
│   ├── app.py          # Main application
│   └── requirements.txt
├── utils/              # Core utilities
│   ├── document_processor.py  # Document parsing and chunking
│   ├── vector_store.py        # Vector database and RAG pipeline
│   └── content_generator.py  # LLM-powered content generation
├── evaluation/         # Evaluation framework
│   └── evaluator.py   # RAGAS and DeepEval integration
├── data/              # Data storage
│   ├── raw/          # Uploaded documents
│   ├── processed/    # Processed documents
│   └── chromadb/     # Vector database
├── models/           # Model configurations
└── tests/           # Test suite for debugging

Quick Start

Prerequisites

Python 3.9 or higher
OpenAI API key (required)
Git

Setup

Get your OpenAI API key: create one on the API Keys page of your OpenAI account

Installation

Set up environment variables

cp .env.example .env
# Edit .env and add your OpenAI API key

Install backend dependencies

cd backend
pip install -r requirements.txt

Install frontend dependencies

cd ../frontend
pip install -r requirements.txt

Running the Application

Start the backend server (Terminal 1)

cd backend
python main.py

The API will be available at http://localhost:8000

Start the frontend (Terminal 2)

cd frontend
streamlit run app.py

The UI will open in your browser at http://localhost:8501

Usage Guide

1. Upload Documents

Navigate to the "Upload Documents" page
Select a PDF, PPTX, DOCX, or TXT file
Click "Process Document"
The system will extract text, create chunks, and generate embeddings
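
The chunking step can be pictured with a minimal sketch. It assumes fixed-size character windows driven by the CHUNK_SIZE and CHUNK_OVERLAP settings; the function name `chunk_text` is hypothetical, and the actual logic in utils/document_processor.py may split on tokens, sentences, or separators instead.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into character windows where neighbours share `chunk_overlap` characters.

    Hypothetical helper for illustration; not taken from the project source.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the default settings, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.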

2. Ask Questions

Go to the "Ask Questions" page
Type your question about the course materials
Get AI-generated answers with supporting context
Rate the answer to help improve the system

3. Generate Summaries

Visit the "Generate Summary" page
Choose a topic-based or custom-query summary
Receive a concise summary of the content
Download the summary for later reference

4. Practice Questions

Select "Practice Questions"
Choose question type (MCQ, True/False, etc.)
Specify number of questions
Review questions and answers for self-assessment

Configuration

Edit the .env file to customize:

# OpenAI Settings
OPENAI_API_KEY=your_key_here
OPENAI_MODEL=gpt-4-turbo-preview
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

# RAG Settings
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
TOP_K_RETRIEVAL=5
TEMPERATURE=0.4

# Vector Database
VECTOR_DB_TYPE=chromadb
CHROMADB_PATH=./data/chromadb
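
As a rough illustration of how these values might be read at startup, the sketch below parses them from the process environment with the defaults listed above. The `Settings` class and `load_settings` function are hypothetical names, not the contents of backend/config.py, and the sketch assumes the .env file has already been loaded into the environment (for example by a loader such as python-dotenv).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the .env values shown above."""
    openai_model: str
    embedding_model: str
    chunk_size: int
    chunk_overlap: int
    top_k: int
    temperature: float
    chromadb_path: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ), falling back to the README defaults."""
    env = os.environ if env is None else env
    settings = Settings(
        openai_model=env.get("OPENAI_MODEL", "gpt-4-turbo-preview"),
        embedding_model=env.get("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        top_k=int(env.get("TOP_K_RETRIEVAL", "5")),
        temperature=float(env.get("TEMPERATURE", "0.4")),
        chromadb_path=env.get("CHROMADB_PATH", "./data/chromadb"),
    )
    # An overlap as large as the chunk itself would never advance through the text.
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    return settings
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests.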

RAG Pipeline

The system implements a four-stage RAG pipeline:

Document Processing
- Extract text from various file formats
- Split into manageable chunks with overlap
- Preserve metadata and context
Embedding Generation
- Generate vector embeddings using OpenAI
- Store in ChromaDB for efficient retrieval
- Support batch processing for large documents
Retrieval
- Convert queries to embeddings
- Perform similarity search
- Retrieve the top-k most relevant chunks (k is set by TOP_K_RETRIEVAL)
Generation
- Provide context to LLM
- Generate accurate, grounded responses
- Include source references
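
The retrieval and generation stages above can be sketched in a few lines. This is a toy version under stated assumptions: the embeddings are hand-written 3-dimensional vectors rather than OpenAI embeddings, ranking uses cosine similarity (the project's ChromaDB collection may be configured with a different distance function), and the names `retrieve_top_k` and `build_prompt` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, indexed_chunks, k=5):
    """Return up to k (score, text) pairs, best match first.

    `indexed_chunks` is a list of (text, embedding) pairs standing in for the vector store.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, retrieved):
    """Number each retrieved chunk so the answer can cite its sources."""
    context = "\n\n".join(
        f"[{i}] {text}" for i, (_, text) in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy index: three chunks with made-up embeddings.
index = [
    ("RAG combines retrieval with generation.", [1.0, 0.0, 0.0]),
    ("Streamlit renders the user interface.", [0.0, 1.0, 0.0]),
    ("ChromaDB stores the chunk embeddings.", [0.7, 0.7, 0.0]),
]
top = retrieve_top_k([1.0, 0.0, 0.0], index, k=2)
prompt = build_prompt("What is RAG?", top)
```

Numbering the chunks in the prompt is one simple way to support the "include source references" step, since the model can point back to `[1]` or `[2]`.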

Evaluation

The system evaluates output quality along three lines:

RAGAS Metrics

Faithfulness: Is the answer factually consistent with the retrieved context?
Answer Relevancy: How relevant is the answer to the query?
Context Precision: Are the relevant chunks ranked near the top of the retrieved context?
Context Recall: Does the retrieved context cover the information needed for the reference answer?
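
To make the two context metrics concrete, here is a simplified sketch. RAGAS obtains its relevance verdicts from an LLM judge; the version below assumes those verdicts are already available as hand-labeled booleans, and the function names are hypothetical. The averaging scheme follows the commonly documented definition, so it is worth checking against the RAGAS version you have installed.

```python
def context_precision(relevance: list[bool]) -> float:
    """Average precision@k over the ranks k that hold a relevant chunk.

    `relevance[i]` says whether the chunk retrieved at rank i+1 was relevant.
    Relevant chunks ranked earlier produce a higher score.
    """
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / k  # precision@k at this relevant rank
    return total / hits if hits else 0.0


def context_recall(supported: list[bool]) -> float:
    """Fraction of reference-answer statements that the retrieved context supports."""
    return sum(supported) / len(supported) if supported else 0.0
```

For example, `context_precision([True, False, True])` scores higher than `context_precision([False, True, True])` even though both retrievals contain two relevant chunks, because the first ranks a relevant chunk at the top.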

DeepEval Metrics

Answer Relevancy: Semantic relevance of responses
Faithfulness: Consistency with source material
Coherence: Logical flow and clarity

User Feedback

Star ratings (1-5)
Qualitative comments
Usage analytics

Security & Privacy

Documents are stored locally
No data sharing with third parties, apart from the document text and queries sent to the OpenAI API for embeddings and generation
API keys stored securely in environment variables
User sessions isolated
Uploaded files can be deleted anytime

API Documentation

Once the backend is running, visit:

API Docs: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Key Endpoints

POST /upload - Upload and process a document
POST /query - Ask questions about materials
POST /summary - Generate summaries
POST /questions - Generate practice questions
GET /stats - Get knowledge base statistics
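
A client call to the /query endpoint might look like the sketch below, which uses only the standard library. The JSON field names `question` and `top_k` are assumptions made for illustration; the actual request schema is listed in the interactive docs at http://localhost:8000/docs.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from "Running the Application"


def build_query_request(question: str, top_k: int = 5) -> urllib.request.Request:
    """Build a POST request for /query. The payload field names are hypothetical."""
    payload = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, top_k: int = 5) -> dict:
    """Send the question to a running backend and return the decoded JSON response."""
    request = build_query_request(question, top_k)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `ask("What is retrieval-augmented generation?")` would return the response body as a dictionary; its exact keys depend on the backend's response model.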

Tips for Best Results

Upload Quality Materials: Clear, well-formatted documents work best
Specific Questions: More specific queries yield better answers
Chunk Size: Adjust based on your document structure
Regular Updates: Keep adding new materials for better coverage
Provide Feedback: Help improve the system through ratings

Built with ❤️ for MISM students at Carnegie Mellon University
