YouTube Chatbot

An intelligent, fully dynamic Retrieval-Augmented Generation (RAG) chatbot that answers questions about any YouTube video you provide.
It uses OpenAI Whisper for transcription, OpenAI's small embedding model for semantic search, and Gemini for response generation. All three are accessed through API keys, so no local models are required.

🚀 Key Features

Dynamic YouTube Video Learning: Enter any video URL and the bot automatically processes it.
Audio Conversion & Transcription: Video is converted to audio, then transcribed using OpenAI Whisper.
JSON Storage: Transcriptions are stored in JSON format for structured processing.
OpenAI Embeddings: Transcript chunks are converted into embeddings for semantic retrieval.
User Query Matching: Queries are converted into embeddings and matched using cosine similarity.
Gemini API Responses: Generates context-aware answers based on the relevant transcript; replies with "Information not available in the provided video" if no match exists.
Flask Web Interface: Interactive and user-friendly chat interface.
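
To make the JSON storage step concrete, here is a minimal sketch of how transcript segments could be flattened into chunk records and written to disk. It assumes Whisper-style segments with "start", "end", and "text" fields; the function names (segments_to_chunks, save_chunks, load_chunks) and the record layout are hypothetical and are not taken from mp3_to_json.py.

```python
import json
import tempfile
from pathlib import Path


def segments_to_chunks(segments, video_id):
    """Turn Whisper-style segments into flat chunk records (assumed layout)."""
    chunks = []
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue  # skip segments that contain only whitespace
        chunks.append({
            "video_id": video_id,
            "chunk_id": len(chunks),  # consecutive IDs among kept chunks
            "start": float(seg["start"]),
            "end": float(seg["end"]),
            "text": text,
        })
    return chunks


def save_chunks(chunks, path):
    """Write chunk records to a UTF-8 JSON file."""
    Path(path).write_text(
        json.dumps(chunks, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_chunks(path):
    """Read chunk records back from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real transcription result.
    sample = [
        {"start": 0.0, "end": 4.2, "text": " Welcome to the video. "},
        {"start": 4.2, "end": 9.0, "text": "Today we cover RAG."},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "transcript.json"
        save_chunks(segments_to_chunks(sample, "demo"), out)
        print(load_chunks(out)[0]["text"])
```

Keeping the timestamps next to each chunk would let a later version of the bot point users to the moment in the video that an answer came from.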

🛠️ Tech Stack

Python 3
Flask for web backend
HTML/CSS/JS for frontend templates
OpenAI Whisper API for transcription
OpenAI Embeddings (small) for vectorization
Gemini API for LLM responses
Cosine similarity for nearest-neighbor search

📁 Project Structure

youtube-chatbot/
├── app.py 
├── backend/
│ ├── embeddings.py 
│ ├── mp3_to_json.py 
│ ├── process_incomings.py
│ └── yt_to_mp3.py 
├── templates/ 
│ └── index.html 
├── static/ 
│ ├── style.css 
│ └── script.js 
├── requirements.txt 
└── README.md

🧭 How It Works

User provides a YouTube URL.
Video is converted to audio using yt_to_mp3.py.
Audio is transcribed via OpenAI Whisper using mp3_to_json.py and stored as JSON.
Transcript chunks are converted into embeddings with embeddings.py using OpenAI Embeddings (small).
User query is also converted into embeddings.
Cosine similarity is computed between query and transcript embeddings via process_incomings.py.
The most relevant transcript chunk is sent to Gemini API for a response.
If no relevant information exists, the bot replies: "Information not available in the provided video."
The response is displayed in the Flask web UI (index.html).
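
The matching step in the flow above can be sketched in plain Python. This is an illustration only, not the code in process_incomings.py: the function names and the 0.3 similarity threshold are assumptions, and the README does not say how the bot decides that no relevant information exists (it may equally be handled in the Gemini prompt). The project's requirements list numpy and scikit_learn, which offer vectorized equivalents such as sklearn.metrics.pairwise.cosine_similarity.

```python
import math

FALLBACK = "Information not available in the provided video."


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_chunk(query_vec, chunk_vecs, chunk_texts, threshold=0.3):
    """Return (text, score) for the closest chunk.

    Returns (None, score) when the best score is below the threshold.
    The threshold value is an assumption made for this sketch.
    """
    if not chunk_vecs:
        return None, 0.0
    scores = [cosine_similarity(query_vec, vec) for vec in chunk_vecs]
    top = max(range(len(scores)), key=scores.__getitem__)
    if scores[top] < threshold:
        return None, scores[top]
    return chunk_texts[top], scores[top]


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real embedding vectors.
    texts = ["intro to RAG", "cooking pasta"]
    vecs = [[0.9, 0.1], [0.0, 1.0]]
    context, score = best_chunk([1.0, 0.0], vecs, texts)
    # With a context, the app would call Gemini; without one, use the fallback.
    print(context if context is not None else FALLBACK)
```

Returning the score alongside the text makes it easy to log borderline matches and tune the threshold later.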

📌 Setup Instructions

Clone the repository:

git clone https://github.com/MuhammadUsman-Khan/youtube-chatbot.git
cd youtube-chatbot

Install dependencies:

pip install -r requirements.txt

Set environment variables for API keys:

export OPENAI_API_KEY="your_openai_key"
export GEMINI_API_KEY="your_gemini_key"

Run the Flask app:

python app.py

Open your browser at http://127.0.0.1:5000/ and start chatting with the bot.
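
If the app fails at startup, a missing API key is a likely cause. The sketch below shows one way the two environment variables from the setup steps could be checked before any request is made. The helper name load_api_keys is hypothetical, and how app.py actually reads its keys is not documented here; since python-dotenv appears in the requirements, the project may load them from a .env file instead.

```python
import os

# Variable names taken from the setup steps above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def load_api_keys(env=None):
    """Return the required keys, or raise if any is missing or empty.

    `env` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}


if __name__ == "__main__":
    try:
        keys = load_api_keys()
        print("Found keys:", ", ".join(sorted(keys)))
    except RuntimeError as exc:
        print(exc)
```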

📌 requirements.txt

Flask==3.1.2
imageio_ffmpeg==0.6.0
joblib==1.5.0
numpy==2.3.4
openai==2.7.1
openai_whisper==20250625
pandas==2.3.3
python-dotenv==1.2.1
Requests==2.32.5
scikit_learn==1.7.2

✅ Notes & Improvements

Fully dynamic; no local models required.
Transcript quality depends on the audio clarity of the video, since the transcript is produced from the audio by Whisper.
Planned improvement: reduce processing time so that videos are handled faster.
Frontend can be enhanced with chat history, typing indicators, and UI themes.

🤝 Contributing

Feel free to fork the repo, submit issues, or create pull requests. Contributions are welcome!

🖊️ Author & Developer

Muhammad Usman Khan
