An LLM-powered agent that generates comprehensive repository- and file-level documentation and answers questions about codebases.
- 🔍 Repository-level and file-level code understanding and summarization
- 📝 Automatic documentation generation
- 💬 RAG-based Q&A chatbot
# Create conda environment
conda env create -f environment.yaml
conda activate repodocgen

# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

- Copy the example environment file:
cp .env.example .env
- Edit `.env` and add your API keys:
- Get OpenAI API key from: https://platform.openai.com/api-keys
- Get Voyage API key from: https://www.voyageai.com/
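
Before running the tool, it can help to confirm that both keys are actually set. The following is a minimal sketch, not part of the project: the variable names `OPENAI_API_KEY` and `VOYAGE_API_KEY` are assumptions (check `.env.example` for the names the project really uses), and `read_env_file` and `missing_keys` are hypothetical helpers that rely only on the standard library.

```python
import os
from pathlib import Path

# Assumed variable names; .env.example is the authoritative source.
REQUIRED_KEYS = ("OPENAI_API_KEY", "VOYAGE_API_KEY")


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `values`."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Values already exported in the shell take precedence over the file.
    env = {**read_env_file(".env"), **os.environ}
    missing = missing_keys(env)
    if missing:
        print("Missing API keys: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```

Run it from the repository root so that the relative `.env` path resolves; it only reports which keys are missing and never prints their values.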
# Generate documentation for a repository
python main.py /path/to/repo
# Generate and launch web interface
python main.py /path/to/repo --web
# Save vector index for later use
python main.py /path/to/repo --save-index --output ./my_output

from src.parser.code_parser import CodeParser
from src.summarizer.summarizer import CodeSummarizer
from src.rag.vector_store import VectorStore
from src.rag.hybrid_search import HybridSearch
from src.chatbot.qa_bot import QABot
# Parse repository
parser = CodeParser()
analyses = parser.parse_repository("/path/to/repo")
# Generate summaries
summarizer = CodeSummarizer()
summaries = summarizer.summarize_repository(analyses)
# Build RAG index
vector_store = VectorStore()
hybrid_search = HybridSearch(vector_store)
# ... (see examples/quickstart.py for complete example)
# Query repository
qa_bot = QABot(hybrid_search)
result = qa_bot.query("Where is function X defined?")
print(result['answer'])

RepoDocGen/
├── src/
│ ├── parser/ # Tree-sitter code parsing
│ ├── summarizer/ # Code summarization module
│ ├── rag/ # RAG and vector database
│ ├── chatbot/ # Q&A chatbot
│ └── web/ # Web interface
├── tests/ # Unit tests
├── examples/ # Quickstart example
├── main.py # Main entry point
├── requirements.txt
├── environment.yaml
├── SETUP_GUIDE.md
└── README.md
- Aryaman Velampalli (fin1cky)
- Chengtao Dai (HeyyDario)
- Ellena Jiang (ellenaj0)
This project was developed as a class project for COMS 6998-013 LLM Based Generative AI, Fall 2025, at Columbia University.