An intelligent chatbot that lets users upload text-based Ayurveda PDFs and ask questions about their content. It uses Retrieval-Augmented Generation (RAG), combining semantic search with LLM-generated answers.
- Upload Ayurveda PDFs (text-based only)
- Ask natural-language questions based on uploaded content
- Chunk text smartly using LangChain
- Semantic search with MiniLM embeddings
- Fast retrieval using FAISS
- Powered by LLaMA 3 via Groq API
- Based on the RAG architecture (Retrieval-Augmented Generation)
- Easy-to-use interface via Streamlit
| Component | Technology |
|---|---|
| Frontend | Streamlit |
| Backend | FastAPI |
| Embeddings | HuggingFace MiniLM-L6-v2 |
| Vector Search | FAISS |
| Language Model | LLaMA 3 via Groq API |
| PDF Processing | PyMuPDF + LangChain |
| Prompting | LangChain + Custom PromptTemplate |
| Environment Variables | python-dotenv |
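The stack above maps to a `requirements.txt` roughly like the following. This is a sketch using the usual PyPI package names, not the project's actual file; pin versions as your environment requires:

```text
streamlit
fastapi
uvicorn
langchain
langchain-community
sentence-transformers
faiss-cpu
PyMuPDF
python-dotenv
groq
```

`uvicorn` is assumed here as the ASGI server for FastAPI; adjust if the project launches the backend differently.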
- Upload: User uploads a text-based PDF
- Text Extraction: PDF is read using PyMuPDF
- Chunking: Text is broken into smaller pieces using RecursiveCharacterTextSplitter
- Embedding: Chunks are embedded using HuggingFace MiniLM
- Vector DB: Chunks are stored in a FAISS vector store
- Q&A: the question is embedded, the most similar chunks are retrieved, and the chunks plus the question are sent to LLaMA 3, which generates a final context-based answer

This is the Retrieval-Augmented Generation (RAG) architecture end to end.
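The steps above can be sketched in plain Python. This is a simplified stand-in for illustration only: a toy bag-of-words vector replaces the MiniLM embeddings, a brute-force cosine ranking replaces FAISS, and the function names are illustrative, not the project's actual identifiers:

```python
import math
from collections import Counter

def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks (simplified stand-in for
    LangChain's RecursiveCharacterTextSplitter)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text):
    """Toy word-count 'embedding' (the real app uses MiniLM sentence vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=2):
    """Rank chunks by similarity to the question (stand-in for FAISS search)."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_chunks):
    """Assemble the context + question prompt that would go to LLaMA 3."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Walk through the pipeline on a tiny sample text
chunks = chunk_text("Ashwagandha is an adaptogenic herb used in Ayurveda. " * 10)
top = retrieve("What is ashwagandha used for?", chunks)
prompt = build_prompt("What is ashwagandha used for?", top)
```

In the real app, `embed` is the HuggingFace MiniLM model, `retrieve` is a FAISS similarity search over the stored vectors, and the prompt is a LangChain `PromptTemplate` filled with the retrieved context.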
```bash
git clone https://github.com/yourusername/ayurveda-chatbot.git
cd ayurveda-chatbot
python -m venv ayurveda_env
ayurveda_env\Scripts\activate   # Windows; on macOS/Linux: source ayurveda_env/bin/activate
pip install -r requirements.txt
```

Create a `.env` file with your Groq key:

```
GROQ_API_KEY=your_groq_api_key
```

Start the backend, then the Streamlit frontend:

```bash
python app.py
streamlit run streamlit_app.py
```