A Python tool that downloads YouTube videos, transcribes them using AI, and turns the transcripts into a podcast episode. Built step by step as a learning project.
The full pipeline:
YouTube URL --> download --> extract audio --> transcribe --> LLM script --> Gemini TTS --> podcast .wav
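
As a rough mental model, the diagram above can be read as a chain of functions where each stage consumes the previous stage's output. The sketch below only illustrates that data flow: `run_pipeline`, `STAGES`, and the file names are hypothetical placeholders, not the project's actual code.

```python
from typing import Callable

# Stage names mirror the diagram above; the functions are placeholders,
# not the project's real implementations.
Stage = tuple[str, Callable[[str], str]]


def run_pipeline(stages: list[Stage], source: str) -> str:
    """Feed `source` through each stage in order and return the final artifact."""
    artifact = source
    for name, stage in stages:
        artifact = stage(artifact)
        print(f"{name}: {artifact}")
    return artifact


# Placeholder stages that only name the artifact they would produce.
# The file names here are assumptions made for illustration.
STAGES: list[Stage] = [
    ("download", lambda url: "video.mp4"),
    ("extract audio", lambda video: "audio.mp3"),
    ("transcribe", lambda audio: "transcript.txt"),
    ("LLM script", lambda transcript: "script.txt"),
    ("Gemini TTS", lambda script: "episode.wav"),
]
```

Calling `run_pipeline(STAGES, "https://example.com/watch")` prints one line per stage and returns the name of the last artifact.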
Work through the steps in order. Each one builds on the previous.
| Step | What You Learn |
|---|---|
| Step 1: Setup and Your First Transcription | Install dependencies, get an API key, run your first transcription |
| Step 2: Downloading Videos from YouTube | Use yt-dlp to download any YouTube video automatically |
| Step 3: Organizing with Projects | Group related videos into named projects with URL lists |
| Step 4: Building a CLI | Build a proper command-line tool with Typer |
| Step 5: Writing Tests | Write automated tests with pytest and mocking |
| Step 6: Writing the Podcast Script with AI | Use OpenAI to rewrite a transcript as a podcast monologue |
| Step 7: Generating Your Podcast Audio | Synthesize your podcast episode using Gemini TTS via Replicate |
| What You Can Build Next | Ideas for extending the project further |

    # Install dependencies
    uv sync

    # Create a project
    uv run transcript create my-project

    # Add YouTube URLs to projects/my-project/urls.txt
    # Format: URL optional-name

    # Run the pipeline
    uv run transcript run my-project

Output files land in:

- `projects/my-project/audio/`: extracted `.mp3` files
- `projects/my-project/transcripts/`: `.txt` and `.json` transcripts
- `projects/my-project/podcast/`: `script.txt` and `episode.wav`

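
To make the `URL optional-name` format and the output layout concrete, here is a minimal sketch of how a `urls.txt` file might be parsed and how the three output folders could be derived from a project name. `UrlEntry`, `parse_urls`, and `output_dirs` are hypothetical names, and skipping blank lines and `#` comments is an assumption rather than documented behavior.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class UrlEntry(NamedTuple):
    url: str
    name: Optional[str]  # None when the line gives no name


def parse_urls(text: str) -> list[UrlEntry]:
    """Parse lines of the form `URL optional-name`."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        parts = line.split(maxsplit=1)
        name = parts[1] if len(parts) > 1 else None
        entries.append(UrlEntry(parts[0], name))
    return entries


def output_dirs(project: str, root: Path = Path("projects")) -> dict[str, Path]:
    """Return the audio, transcripts, and podcast folders for a project."""
    base = root / project
    return {key: base / key for key in ("audio", "transcripts", "podcast")}
```

For example, `parse_urls("https://youtu.be/abc intro")` yields a single entry named `intro`, and `output_dirs("my-project")["podcast"]` points at `projects/my-project/podcast`.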
- Python 3.10+
- uv: fast Python package manager
- ffmpeg: audio/video processing
- Replicate account and API token
Full setup instructions are in Step 1.
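
Before starting Step 1, a quick check of the requirements listed above can save some debugging. The sketch below is one possible way to do it using only the standard library. `check_requirements` is a hypothetical helper, and reading the token from the `REPLICATE_API_TOKEN` environment variable is an assumption about how the project is configured.

```python
import os
import shutil
import sys


def check_requirements() -> dict[str, bool]:
    """Report which prerequisites appear to be available on this machine."""
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        # shutil.which returns None when the executable is not on PATH.
        "uv": shutil.which("uv") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # Assumption: the API token is provided via REPLICATE_API_TOKEN.
        "replicate token": bool(os.environ.get("REPLICATE_API_TOKEN")),
    }


if __name__ == "__main__":
    for item, ok in check_requirements().items():
        print(f"{'ok' if ok else 'MISSING':8} {item}")
```

Running the script prints one line per requirement, marking anything that could not be found as `MISSING`.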