SannidhyaDas/VerbaVista-AI

🎬 VerbaVista-AI ( 🌐Live Demo )

YouTube Content Synthesizer using Gemini + LangChain + Streamlit

[Screenshot: VideoChat example]

[Screenshot: VideoNotes example]

🚀 Transform any YouTube video into structured notes or an interactive chatbot powered by Google Gemini and LangChain.


📖 Overview

VerbaVista-AI is a Generative AI application that automatically fetches a YouTube video transcript, translates it (if needed), and turns it into:

  • 🧠 Structured Notes – AI-generated, topic-wise summaries
  • πŸ’¬ Interactive Chatbot – Ask questions directly about the video content

It's built using LangChain, Google Gemini 2.5 Flash, and Streamlit, enabling seamless text extraction, translation, chunking, and contextual question answering, all in one elegant app.


✨ Key Features

| Feature | Description |
| --- | --- |
| 🎥 YouTube Transcript Fetching | Automatically extracts video transcripts in multiple languages using YouTubeTranscriptApi. |
| 🌐 Multilingual Translation | Uses Gemini LLM to translate transcripts into English while preserving tone and meaning. |
| 🧩 Chunking & Embeddings | Splits long transcripts and creates embeddings using GoogleGenerativeAIEmbeddings. |
| 💾 Vector Store (RAG) | Stores embeddings in a Chroma vector database for fast, context-based retrieval. |
| 🗂️ AI Notes Generator | Creates structured, human-readable notes using LLM prompting. |
| 💬 Chat with Video | Chatbot mode lets users ask natural language questions about any video content. |
| 🧠 Exponential Backoff Handling | Handles Google API quota limits gracefully with retry logic. |

🧱 Tech Stack

Category Tools / Libraries
πŸ’‘ LLM Gemini 2.5 Flash Lite via LangChain
🧩 Frameworks LangChain, Streamlit
πŸ”€ Embeddings GoogleGenerativeAIEmbeddings
πŸ—„οΈ Vector DB Chroma
🎞️ Transcript Extraction YouTubeTranscriptApi
βš™οΈ Other Utilities dotenv, regex, time, google.api_core.exceptions

⚙️ Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/SannidhyaDas/VerbaVista-AI.git
cd VerbaVista-AI

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Add Environment Variables

Create a .env file in the root directory with your Google API Key:

GOOGLE_API_KEY=your_google_api_key_here

🔑 You can get your key from Google AI Studio
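
As a rough sketch of how the key might be checked at startup: assuming the app loads `.env` with python-dotenv's `load_dotenv()` (the tech stack lists dotenv, but the exact loading code is not shown here), a small guard like the hypothetical `get_google_api_key` below could fail fast with a readable message instead of a later API error.

```python
import os


def get_google_api_key(environ=None):
    """Return GOOGLE_API_KEY or raise a clear error if it is unset.

    Hypothetical helper, not taken from the project. It assumes the
    .env file has already been loaded into the process environment.
    """
    if environ is None:
        environ = os.environ
    key = environ.get("GOOGLE_API_KEY", "").strip()
    # Treat the placeholder from the setup instructions as "not configured".
    if not key or key == "your_google_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is missing; add it to your .env file.")
    return key
```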

🚀 Run the App

streamlit run app.py

Then open your browser at the link Streamlit provides (usually http://localhost:8501).


🧩 How It Works β€” Behind the Scenes

[Figure: pipeline]

🔹 Step 1: Transcript Extraction - Extracts the video's transcript (in any supported language) using the YouTubeTranscriptApi.
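
Transcript lookups are keyed by the video ID rather than the full URL, so a step like this usually starts by pulling the ID out of whatever link the user pasted. The sketch below is one way that could look with `re`; `extract_video_id` and the URL shapes it covers are assumptions for illustration, not the project's own code.

```python
import re
from typing import Optional

# Matches the 11-character ID in watch, short-link, embed, and shorts URLs.
_VIDEO_ID_RE = re.compile(
    r"(?:v=|youtu\.be/|/embed/|/shorts/)([A-Za-z0-9_-]{11})"
)


def extract_video_id(url: str) -> Optional[str]:
    """Return the YouTube video ID found in url, or None if there is none."""
    match = _VIDEO_ID_RE.search(url)
    return match.group(1) if match else None
```

The returned ID is what would then be handed to the transcript library; the exact call is left out because it differs between library versions.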

🔹 Step 2: Translation (Optional) - If the video is not in English, Gemini translates the transcript with cultural and linguistic precision.

🔹 Step 3: Processing Options

  • Notes Mode → Extracts key topics and generates structured, concise notes.
  • Chat Mode → Creates embeddings, stores them in Chroma DB, and launches a Retrieval-Augmented Generation (RAG) chatbot.
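
Before embeddings can be created in Chat Mode, the transcript has to be cut into pieces small enough to embed. The app most likely relies on a LangChain text splitter for this; the plain-Python stand-in below only illustrates the idea of fixed-size chunks with overlap. The function name and the default sizes are assumptions.

```python
def split_into_chunks(text, chunk_size=1000, overlap=200):
    """Split text into character chunks that share `overlap` characters.

    Illustrative stand-in for a library text splitter; the overlap keeps
    sentences that straddle a boundary visible in both neighbouring chunks.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

Larger overlaps cost more embedding calls but reduce the chance that an answer is split across two chunks.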

🔹 Step 4: RAG-based Question Answering - When chatting, user queries are matched against the video transcript via embeddings → Gemini answers using only retrieved context.
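
The retrieval step can be sketched without any external services. In the toy example below, a bag-of-words counter stands in for GoogleGenerativeAIEmbeddings and a sorted list stands in for Chroma, so it shows the shape of the technique (embed, rank by cosine similarity, build a context-only prompt) rather than the project's implementation. All function names and the prompt wording are assumptions.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In use, `build_prompt(question, retrieve(question, chunks))` would produce the text sent to the LLM, which is what keeps answers tied to the transcript rather than to the model's general knowledge.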


🧰 Key Files

VerbaVista-AI/
│
├── assets/                        # Streamlit web interface
│   ├── appInterface_1.png         # Chat with Video example
│   ├── appInterface_2.png         # Notes from the video example
│   └── VerbaVista-pipeline        # working pipeline
│
├── deployment/                    # Streamlit deployment setup
│   ├── requirements.txt           # Python dependencies
│   ├── main.py                    # Core logic and LLM pipelines
│   └── app.py                     # Streamlit user interface
│
├── localhost/                     # setup to run app locally
│   ├── requirements.txt           # Python dependencies
│   ├── main.py                    # Core logic and LLM pipelines
│   └── app.py                     # Streamlit user interface
│
└── README.md                      # Project documentation

🧠 Example Use Cases

πŸ“š Smart Study Companion - Transform complex academic or lecture videos into clear, structured notes for faster learning and revision.

🎧 Podcast & Interview Analyst - Extract key takeaways and actionable insights from long-form conversations β€” save hours of manual listening.

🌐 Multilingual Research Assistant - Break language barriers by translating, summarizing, and analyzing global video content in real time.

🏒 Enterprise Knowledge Hub - Turn webinars, product demos, and training sessions into searchable, chat-enabled knowledge bases for internal teams.

πŸ’Ό Scalable Business Value - Integrate with CRMs or content libraries to automate learning, onboarding, and support, turning video data into searchable, revenue-driving intelligence.


🧩 Future Improvements

πŸŽ™οΈ Voice Interaction - Add Speech-to-Text and Text-to-Speech modules for fully voice-based question answering.

🧠 Enhanced Prompt Tuning - Fine-tune Gemini prompts for domain-specific or educational content understanding.

πŸ’Ύ Vector Store Caching - Implement caching for faster reloads and reduced embedding costs.

🧩 Batch Video Summarization - Enable multi-video summarization to process and analyze playlists or course modules efficiently.

About

VerbaVista is an LLM-powered Streamlit app that transforms any YouTube video into structured English notes and an interactive chatbot. It automatically fetches, translates, chunks, and embeds transcripts using Gemini + LangChain, enabling contextual Q&A through a RAG pipeline with Chroma.
