Open-source AI pipeline that turns any topic into a publish-ready YouTube/Instagram/TikTok Short. One command. Research, script, voiceover, AI-generated visuals, AI-generated music, animated captions, scene transitions. Out comes a vertical MP4, ready to upload.
Web UI with live pipeline visualization, REST API, Docker, and CLI. Bring your own API keys.
Topic in, MP4 out. This video was generated in a single command:

```shell
openreels "the apollo 13 disaster, from explosion to miraculous return" --provider google
```
Live pipeline visualization. Watch research, script, voiceover, visuals, music, and assembly stages stream in real time.
## Screenshots
| ![]() | ![]() |
|---|---|
| Topic input with archetype gallery | Live pipeline with storyboard |

| ![]() | ![]() |
|---|---|
| Completed: video player, cost breakdown, quality review | Gallery with multiple generations |
Give it a topic. It handles everything:
| Stage | What happens |
|---|---|
| Research | Web search grounds the script in real facts, not hallucinations |
| Script | Writes a punchy short-form script with scene breakdowns, visual direction, and emotional arc |
| Voiceover | Generates TTS audio with word-level timestamps for karaoke-style captions |
| Visuals | AI images (Gemini, DALL-E), AI video clips (Google Veo 3.1, fal.ai Kling 2.6 Pro) with dedicated cinematography prompts and negative prompt guidance, plus vision-verified stock footage that rejects bad matches and retries automatically |
| Music | AI-generated background score via Google Lyria 3 Pro, scene-synced to match the video's emotional arc. Or pick from 25 bundled royalty-free tracks |
| Captions | Spring-animated 3-state captions with 7 distinct styles, per-archetype theming, and word-level karaoke highlighting |
| Assembly | Composites everything into a vertical MP4 via Remotion with crossfade, slide, wipe, and flip transitions |
| Critique | AI critic scores the output. If quality is below threshold, the pipeline re-runs |
Every stage streams live progress to the web UI. You watch the AI research, write, paint, compose, and render in real time.
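The Voiceover and Captions stages connect through word-level timestamps: karaoke highlighting reduces to finding which word the playhead is currently inside. A minimal sketch, using hypothetical types rather than OpenReels' actual data model:

```typescript
// Hypothetical word-timing shape; times are in seconds.
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Index of the word being spoken at time t, or -1 if between words.
function activeWordIndex(timings: WordTiming[], t: number): number {
  return timings.findIndex((w) => t >= w.start && t < w.end);
}

const timings: WordTiming[] = [
  { word: "five", start: 0.0, end: 0.3 },
  { word: "stoic", start: 0.3, end: 0.7 },
  { word: "lessons", start: 0.7, end: 1.2 },
];

console.log(activeWordIndex(timings, 0.5)); // 1 ("stoic" is highlighted)
```

The renderer can then style the active word differently from its neighbors on every frame, which is what produces the karaoke effect.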
Mix and match providers or go all-in on one ecosystem:
| Capability | Providers |
|---|---|
| LLM | Anthropic Claude, OpenAI GPT, Google Gemini, OpenRouter (300+ models), any OpenAI-compatible endpoint |
| Search | Native (provider built-in), Tavily, or parametric knowledge |
| TTS | ElevenLabs, Inworld, OpenAI TTS, Gemini TTS, Kokoro (free, local) |
| Images | Gemini Imagen, OpenAI DALL-E |
| Video | Google Veo 3.1, fal.ai Kling 2.6 Pro (with cross-provider fallback, negative prompts, structured cinematography prompts) |
| Music | Google Lyria 3 Pro (AI-generated, $0.08/track), Bundled library (free) |
| Stock | Pexels, Pixabay (both searched, vision-verified, with AI fallback) |
**One key, everything Google:** `--provider google` sets LLM, images, TTS, video, and music to Google APIs with a single `GOOGLE_API_KEY`.

**Zero-cost voiceover:** `--provider local` uses Kokoro for free local TTS. No API key needed.
```shell
git clone https://github.com/tsensei/OpenReels.git
cd OpenReels
cp .env.example .env   # fill in your API keys
docker compose up      # starts Redis + API + Worker
# Open http://localhost:3000
```

Type a topic, pick your providers, and watch the pipeline run. Research, script, voiceover, visuals, music, and assembly stages stream live to the browser. Download the final video when it's done.
```shell
cp .env.example .env   # fill in your API keys
docker run --env-file .env --shm-size=2gb -v ./output:/output ghcr.io/tsensei/openreels "5 stoic lessons that changed my life"
```

Or run through Docker Compose:

```shell
docker compose run worker npx tsx src/index.ts --yes "5 stoic lessons that changed my life"
```

Prerequisites: Node.js 22+, pnpm, ffprobe
```shell
git clone https://github.com/tsensei/OpenReels.git
cd OpenReels
pnpm install
cp .env.example .env   # fill in your API keys
```

```shell
# Full pipeline with AI music
pnpm start "the fall of the Roman Empire" --provider google

# Free local TTS, no API spend on voiceover
pnpm start "5 stoic lessons" --provider local

# Dry run (outputs DirectorScore JSON, no asset generation)
pnpm start "your topic" --dry-run

# Specific archetype and provider combo
pnpm start "your topic" --archetype anime_illustration --provider openai

# Creative direction file (guide the AI with a brief)
pnpm start "deep sea exploration" --direction examples/direction-brief.md

# Replay from a previous score (skip research + director, re-render)
pnpm start "your topic" --score output/2026-04-10-111939-.../score.json
```

Minimum to run (pick one LLM + one TTS):
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` or `GOOGLE_API_KEY` — Anthropic / OpenAI / Google AI Studio
- `ELEVENLABS_API_KEY` or `INWORLD_TTS_API_KEY` — ElevenLabs / Inworld. Or use `--tts-provider kokoro` (free, no key), `--tts-provider openai-tts`, or `--tts-provider gemini-tts`
- `GOOGLE_API_KEY` — also needed for Gemini image generation, AI video (Veo), AI music (Lyria), and Gemini TTS
Optional: `PEXELS_API_KEY` (Pexels), `PIXABAY_API_KEY` (Pixabay) for stock footage, `FAL_API_KEY` (fal.ai) for Kling video generation
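For example, a minimal `.env` for the all-Google path might look like this (key names are from the list above; the values are placeholders):

```shell
# Minimal .env for --provider google:
# one key covers LLM, images, TTS, video (Veo), and music (Lyria)
GOOGLE_API_KEY=your-google-ai-studio-key

# Optional extras
# PEXELS_API_KEY=your-pexels-key     # stock footage
# PIXABAY_API_KEY=your-pixabay-key   # stock footage
# FAL_API_KEY=your-fal-key           # Kling video via fal.ai
```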
| Flag | Description | Default |
|---|---|---|
| `--provider <name>` | LLM provider (`anthropic`, `openai`, `gemini`, `openrouter`, `openai-compatible`, `google`, `local`) | `anthropic` |
| `--llm-model <model>` | Model ID override (e.g. `anthropic/claude-sonnet-4` for OpenRouter) | provider default |
| `--llm-base-url <url>` | Base URL for `openai-compatible` (e.g. `http://localhost:11434/v1`) | — |
| `--search-provider <name>` | Search provider (`native`, `tavily`, `none`) | auto-detect |
| `--image-provider <name>` | Image provider (`gemini`, `openai`) | `gemini` |
| `--tts-provider <name>` | TTS provider (`elevenlabs`, `inworld`, `kokoro`, `gemini-tts`, `openai-tts`) | `elevenlabs` |
| `--music-provider <name>` | Music provider (`bundled`, `lyria`) | `bundled` |
| `--video-provider <name>` | Video provider (`gemini`, `fal`) | auto-detect |
| `--archetype <name>` | Override visual archetype | LLM chooses |
| `--platform <name>` | Target platform (`youtube`, `tiktok`, `instagram`) | `youtube` |
| `--dry-run` | Output DirectorScore JSON without generating assets | off |
| `--preview` | Open Remotion Studio after rendering | off |
| `-o, --output <dir>` | Output directory | `./output` |
| `--no-music` | Disable background music | music on |
| `--no-video` | Disable AI video generation | video on |
| `--no-stock-verify` | Disable VLM stock footage verification | verify on |
| `--stock-confidence <n>` | Min confidence for stock verification (0-1) | 0.6 |
| `--stock-max-attempts <n>` | Max stock API calls per scene | 4 |
| `--video-model <model>` | Video model override | provider default |
| `--kokoro-voice <voice>` | Kokoro voice preset | `af_heart` |
| `--direction <file>` | Creative brief file (markdown) to guide the AI | — |
| `--score <path>` | Replay from a saved `score.json`, skipping research + director | — |
| `-y, --yes` | Auto-confirm cost estimation (Docker/CI) | off |
Before spending any money, the pipeline shows a detailed cost breakdown and asks for confirmation:
```
Estimated cost: $0.686
  LLM:    $0.0029 (7 calls)
  TTS:    $0.0171 (853 chars)
  Images: $0.3030 (3 AI images @ $0.101/ea)
  Video:  $0.3000 (1 AI video)
  Music:  $0.0802 (Lyria AI generation)
  Stock:  free
```
After rendering, the actual cost is computed from real token usage. Use `--dry-run` to preview the DirectorScore without spending anything.
14 visual styles that control colors, captions, motion, lighting, and AI image prompting:
| Archetype | Best for |
|---|---|
| `editorial_caricature` | News commentary, satire, social issues |
| `warm_narrative` | Storytelling, history, human interest |
| `studio_realism` | Professional photography, editorial, luxury |
| `infographic` | Data, facts, explainers, rapid-fire content |
| `anime_illustration` | Dynamic, action-oriented, pop culture |
| `pastoral_watercolor` | Nature, contemplative, hand-painted aesthetic |
| `comic_book` | Action, adventure, energetic content |
| `gothic_fantasy` | Dark themes, mythology, epic content |
| `vintage_snapshot` | Nostalgic, intimate, personal stories |
| `surreal_dreamscape` | Sci-fi, fantasy, mind-bending topics |
| `warm_editorial` | Lifestyle, people stories, general purpose |
| `cinematic_documentary` | Factual, historical, science |
| `moody_cinematic` | Mystery, tension, crime, dark history |
| `bold_illustration` | Educational, how-to, listicles |
OpenReels is a full rewrite of ReelMistri, a CLI pipeline originally built for Bangla-language YouTube Shorts automation. ReelMistri proved the concept: one command, fully produced video, language-aware scripts, culturally coherent visuals, proper Bengali text rendering.
The rewrite moves from Python to TypeScript for native Remotion integration. Cleaner video rendering, better developer experience, no Python-to-TypeScript bridge.
v0.17.0 shipped. See `CHANGELOG.md` for full version history and `TODOS.md` for known issues and roadmap.
If OpenReels is useful to you, consider giving it a star. It helps others discover the project.
This project is licensed under the MIT License.