Augment your projects with a RAG agent
This project aims to create an assistant that can answer questions about your codebase. It is intended for very large codebases that the user does not know well. Asking questions about the project should save the time otherwise spent reading through the whole codebase each time.
To install, there are two recommended methods:
- Pip - simply run:
  pip install perpetua
- Homebrew - run:
  brew tap samikh-git/tools
  brew install perpetua
This tool runs from the command line and is meant to be used much like git. A project is initialized by creating a .rag directory that contains the following:
- database.db: a SQLite database that keeps track of tracked files and the graph state for the agent
- milvus.db: a Milvus Lite vector store for the RAG operations
- repo-graph-lock.json: a JSON file that represents the codebase as a graph
- threads.txt: a text file that keeps track of the thread id for the agent. Currently, only 1 thread is ever tracked per project. I will create a command that allows the user to create a new thread.
- staging: the staging directory
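As a quick sanity check, the layout above can be verified from Python. This is a minimal sketch: the entry names come from the list above, while the function name `missing_rag_entries` and the idea of checking them this way are illustrative assumptions, not part of perpetua.

```python
from pathlib import Path

# Entry names taken from the list above; the check itself is hypothetical.
EXPECTED_ENTRIES = [
    "database.db",
    "milvus.db",
    "repo-graph-lock.json",
    "threads.txt",
    "staging",
]


def missing_rag_entries(project_root: Path) -> list[str]:
    """Return the expected .rag entries that are absent under project_root."""
    rag_dir = project_root / ".rag"
    if not rag_dir.is_dir():
        # No .rag directory at all: everything is missing.
        return list(EXPECTED_ENTRIES)
    return [name for name in EXPECTED_ENTRIES if not (rag_dir / name).exists()]


if __name__ == "__main__":
    missing = missing_rag_entries(Path.cwd())
    print("initialized" if not missing else f"missing: {missing}")
```

An empty result suggests the project has been initialized in that directory.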
The principal commands of this project are:
- perpetua config: configures the package. Sets up the environment the package uses.
- perpetua init: initializes a localrag project. Creates the .rag directory.
- perpetua add file/to/path: adds a file/directory to the staging area
- perpetua rm file/to/path: removes a file from the staging area
- perpetua commit: vectorizes the staged files
- perpetua ask: prompts the agent
- perpetua reset: clears the staging area. With --hard, it reinitializes the project.
- perpetua ls: shows all files currently being tracked by the project
- perpetua diff: shows the difference between the staged files and currently tracked files
- perpetua search "query": searches the vector store directly and returns the 2 closest matches. Good for a sanity check before asking the LLM.
- perpetua help: provides links to documentation
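The git-like workflow behind add, rm, commit, reset, ls and diff can be pictured with a toy in-memory model. This is only a sketch for intuition: the class `StagingModel` is hypothetical, and details such as commit emptying the staging area or diff meaning "staged but not yet tracked" are assumptions, not documented behavior.

```python
class StagingModel:
    """Toy in-memory model of the add / rm / commit / reset workflow."""

    def __init__(self) -> None:
        self.staged: set[str] = set()
        self.tracked: set[str] = set()

    def add(self, path: str) -> None:
        self.staged.add(path)

    def rm(self, path: str) -> None:
        # Removing a path that is not staged is a no-op here (assumption).
        self.staged.discard(path)

    def diff(self) -> set[str]:
        # One possible reading of `diff`: staged paths that are not tracked yet.
        return self.staged - self.tracked

    def commit(self) -> None:
        # The real command vectorizes the staged files; here we only record them.
        self.tracked |= self.staged
        self.staged.clear()

    def reset(self, hard: bool = False) -> None:
        self.staged.clear()
        if hard:
            # Stands in for reinitializing the project.
            self.tracked.clear()

    def ls(self) -> list[str]:
        return sorted(self.tracked)


model = StagingModel()
model.add("src/app.py")
model.add("src/utils.py")
model.rm("src/utils.py")
model.commit()
print(model.ls())  # ['src/app.py']
```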
The CLI functionality was implemented using Typer. It is a very straightforward API for creating a CLI, and I would definitely recommend it. The agent is orchestrated using LangGraph with LangChain. The vector store is Milvus Lite and search is provided through Tavily. The relational database is SQLite. The package is built using Poetry.
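To give a feel for what a direct vector-store lookup such as perpetua search does, here is a dependency-free sketch of a "two closest matches" search. It does not use Milvus Lite. The cosine metric, the toy 2-dimensional embeddings, and the names `cosine_similarity` and `top_k` are all assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings; real ones would come from an embedding model.
chunks = {
    "app.py": [1.0, 0.0],
    "agent.py": [0.0, 1.0],
    "utils.py": [0.7, 0.7],
}
for name, score in top_k([1.0, 0.1], chunks):
    print(f"{name}: {score:.3f}")
```

A real vector store indexes the embeddings instead of scanning them all, but the ranking idea is the same.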
The requirements can be installed by running pip install -r requirements.txt.
Create a .env file in the localrag directory in your home directory with the following entries:
GOOGLE_API_KEY=
TAVILY_API_KEY=
# Optional: For evaluation and tracing
LANGSMITH_API_KEY=
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=localrag

This directory is structured in two subdirectories:
- perpetua - the Poetry package. This is what you should install to get the package in your environment, using poetry install.
- test - a mockup weather app that I used to test the package. It is entirely boilerplate generated by Cursor, and provides code with enough complexity to see whether localrag works. I would recommend it for testing, as some of the commands interact with files and may modify them if there is an error.
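Before trying the package, it can help to confirm that the .env file described earlier defines the keys that are not marked optional. The sketch below parses KEY=VALUE text with the standard library only. Treating GOOGLE_API_KEY and TAVILY_API_KEY as required is an inference from the "Optional" comment, and the helper names `parse_env` and `missing_keys` are hypothetical.

```python
# Assumed to be required because only the LangSmith entries are marked optional.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "TAVILY_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list[str]:
    """Return required keys that are absent or left empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


sample = "GOOGLE_API_KEY=abc\nTAVILY_API_KEY=\n# Optional\nLANGSMITH_TRACING=true\n"
print(missing_keys(sample))  # ['TAVILY_API_KEY']
```

To check a real file, pass in its contents, for example from `Path.home() / "localrag" / ".env"` (that path is my reading of the location given above).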
In perpetua, you will find the files for the project. Some highlights:
app.py:
This file contains the logic for the CLI.
agent/agent.py:
This file contains the logic for the agent.
agent/document_processing.py:
This file contains a class that abstracts adding files to the vector store and the relational database.
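To illustrate the kind of abstraction such a class might provide, here is a minimal sketch of the relational half only: recording tracked files in SQLite with a content hash so that unchanged files can be recognized. The class name `TrackedFiles`, the `tracked_files` table, and the hashing scheme are all hypothetical and are not taken from agent/document_processing.py.

```python
import hashlib
import sqlite3


class TrackedFiles:
    """Hypothetical helper: records file paths and content hashes in SQLite."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS tracked_files ("
            "path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
        )

    def add(self, path: str, content: bytes) -> bool:
        """Record the file; return True if it is new or its content changed."""
        digest = hashlib.sha256(content).hexdigest()
        row = self.conn.execute(
            "SELECT sha256 FROM tracked_files WHERE path = ?", (path,)
        ).fetchone()
        if row is not None and row[0] == digest:
            return False  # unchanged, so re-embedding could be skipped
        self.conn.execute(
            "INSERT OR REPLACE INTO tracked_files (path, sha256) VALUES (?, ?)",
            (path, digest),
        )
        self.conn.commit()
        return True

    def paths(self) -> list[str]:
        rows = self.conn.execute("SELECT path FROM tracked_files ORDER BY path")
        return [row[0] for row in rows]


tracker = TrackedFiles(sqlite3.connect(":memory:"))
print(tracker.add("src/app.py", b"print('hi')"))  # True: first time seen
print(tracker.add("src/app.py", b"print('hi')"))  # False: same content
```

In a fuller version, a True result would be the point at which the file is chunked, embedded, and written to the vector store.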
This is a very small app and is not very complicated. Please feel free to modify it rather extensively as it is not that hard to restore.
Please read the README.md for the package for more information about the commands and the agent's tooling.