OpenReels

Open-source AI pipeline that turns any topic into a publish-ready YouTube/Instagram/TikTok Short. One command. Research, script, voiceover, AI-generated visuals, AI-generated music, animated captions, scene transitions. Out comes a vertical MP4, ready to upload.

Web UI with live pipeline visualization, REST API, Docker, and CLI. Bring your own API keys.

Demo

[Demo video: final.mp4]

Topic in, MP4 out. This video was generated in a single command:

openreels "the apollo 13 disaster, from explosion to miraculous return" --provider google

Web UI

[Animation: OpenReels Web UI demo]

Live pipeline visualization. Watch research, script, voiceover, visuals, music, and assembly stages stream in real time.

Screenshots

  • Home page: topic input with archetype gallery
  • Pipeline running: live pipeline with storyboard
  • Pipeline completed: video player, cost breakdown, quality review
  • Gallery view: gallery with multiple generations

How it works

Give it a topic. It handles everything:

| Stage | What happens |
| --- | --- |
| Research | Web search grounds the script in real facts, not hallucinations |
| Script | Writes a punchy short-form script with scene breakdowns, visual direction, and emotional arc |
| Voiceover | Generates TTS audio with word-level timestamps for karaoke-style captions |
| Visuals | AI images (Gemini, DALL-E), AI video clips (Google Veo 3.1, fal.ai Kling 2.6 Pro) with dedicated cinematography prompts and negative prompt guidance, plus vision-verified stock footage that rejects bad matches and retries automatically |
| Music | AI-generated background score via Google Lyria 3 Pro, scene-synced to match the video's emotional arc. Or pick from 25 bundled royalty-free tracks |
| Captions | Spring-animated 3-state captions with 7 distinct styles, per-archetype theming, and word-level karaoke highlighting |
| Assembly | Composites everything into a vertical MP4 via Remotion with crossfade, slide, wipe, and flip transitions |
| Critique | AI critic scores the output. If quality is below threshold, the pipeline re-runs |
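
The word-level timestamps from the Voiceover stage are what make karaoke highlighting possible in the Captions stage. A minimal sketch of the lookup, assuming a hypothetical `TimedWord` shape (the field names OpenReels actually uses may differ):

```typescript
// Hypothetical shape for one word of TTS output; real field names may differ.
interface TimedWord {
  text: string;
  startSec: number;
  endSec: number;
}

// Return the index of the word being spoken at time `t`, or -1 during silence.
function activeWordIndex(words: TimedWord[], t: number): number {
  return words.findIndex((w) => t >= w.startSec && t < w.endSec);
}

// Convert a video frame number to seconds at a given frame rate.
function frameToSeconds(frame: number, fps: number): number {
  return frame / fps;
}

const words: TimedWord[] = [
  { text: "Houston,", startSec: 0.0, endSec: 0.5 },
  { text: "we've", startSec: 0.5, endSec: 0.75 },
  { text: "had", startSec: 0.75, endSec: 1.0 },
  { text: "a", startSec: 1.0, endSec: 1.1 },
  { text: "problem", startSec: 1.1, endSec: 1.8 },
];

// Frame 24 at 30 fps is 0.8 s into the clip.
const idx = activeWordIndex(words, frameToSeconds(24, 30));
console.log(words[idx]?.text ?? "(silence)");
```

In a Remotion composition the frame number would typically come from the `useCurrentFrame()` hook, and the returned index decides which word gets the highlighted style.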

Every stage streams live progress to the web UI. You watch the AI research, write, paint, compose, and render in real time.
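
The Critique stage's "re-run if below threshold" behavior can be pictured as a bounded retry loop. The sketch below is illustrative only: `runWithCritique`, the 0..1 score scale, and the `maxAttempts` cap are assumptions, not names or limits taken from the OpenReels source, and the functions are synchronous for brevity where real stages would be async.

```typescript
// All names here are hypothetical; they are not taken from the OpenReels source.
interface Critique {
  score: number; // assumed scale: 0..1, higher is better
  notes: string;
}

interface RunResult<T> {
  output: T;
  critique: Critique;
  attempts: number; // total attempts made
}

// Produce an output, score it, and retry until the score meets `threshold`
// or `maxAttempts` is used up. Returns the best-scoring attempt seen.
function runWithCritique<T>(
  produce: (attempt: number) => T,
  critic: (output: T) => Critique,
  threshold: number,
  maxAttempts: number,
): RunResult<T> {
  let best: { output: T; critique: Critique } | undefined;
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts++;
    const output = produce(attempts);
    const critique = critic(output);
    if (!best || critique.score > best.critique.score) {
      best = { output, critique };
    }
    if (critique.score >= threshold) break;
  }
  if (!best) throw new Error("maxAttempts must be at least 1");
  return { ...best, attempts };
}

// Toy stages: each attempt scores a quarter point higher than the last.
const run = runWithCritique(
  (attempt) => ({ version: attempt }),
  (draft) => ({ score: draft.version / 4, notes: `draft v${draft.version}` }),
  0.7,
  5,
);
console.log(run.attempts, run.critique.score);
```

Keeping the best attempt rather than the last one means a capped run still returns the strongest draft if no attempt clears the threshold.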

Provider flexibility

Mix and match providers or go all-in on one ecosystem:

| Capability | Providers |
| --- | --- |
| LLM | Anthropic Claude, OpenAI GPT, Google Gemini, OpenRouter (300+ models), any OpenAI-compatible endpoint |
| Search | Native (provider built-in), Tavily, or parametric knowledge |
| TTS | ElevenLabs, Inworld, OpenAI TTS, Gemini TTS, Kokoro (free, local) |
| Images | Gemini Imagen, OpenAI DALL-E |
| Video | Google Veo 3.1, fal.ai Kling 2.6 Pro (with cross-provider fallback, negative prompts, structured cinematography prompts) |
| Music | Google Lyria 3 Pro (AI-generated, $0.08/track), Bundled library (free) |
| Stock | Pexels, Pixabay (both searched, vision-verified, with AI fallback) |
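
The cross-provider fallback mentioned for video can be sketched as trying each provider in order and moving on when one fails. The `VideoProvider` interface and `generateWithFallback` helper are hypothetical and synchronous for brevity; the real implementation is presumably async and may order or retry providers differently.

```typescript
// Hypothetical provider interface; the real OpenReels types may differ.
interface VideoProvider {
  name: string;
  generate(prompt: string): string; // returns a path or URL to the clip
}

interface FallbackResult {
  provider: string; // the provider that succeeded
  clip: string;
  errors: string[]; // failures from providers tried before it
}

// Try providers in order; return the first success, collecting earlier errors.
function generateWithFallback(providers: VideoProvider[], prompt: string): FallbackResult {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, clip: p.generate(prompt), errors };
    } catch (err) {
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all video providers failed: ${errors.join("; ")}`);
}

// Stub providers for illustration: the first always fails, the second succeeds.
const veo: VideoProvider = {
  name: "gemini",
  generate: () => {
    throw new Error("quota exceeded");
  },
};
const kling: VideoProvider = {
  name: "fal",
  generate: (prompt) => `clips/${prompt.length}.mp4`,
};

const result = generateWithFallback([veo, kling], "slow dolly shot of a lunar module");
console.log(result.provider, result.errors);
```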

One key, everything Google: --provider google sets LLM, images, TTS, video, and music to Google APIs with a single GOOGLE_API_KEY.

Zero-cost voiceover: --provider local uses Kokoro for free local TTS. No API key needed.
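
One way to think about these presets is as a bundle of per-capability overrides layered on top of the defaults. The sketch below is an assumption about how that could work, not the actual OpenReels code: the defaults are copied from the CLI flags table, the `google` expansion is inferred from the sentence above, and the `local` preset is assumed to change only TTS.

```typescript
type Capability = "llm" | "image" | "tts" | "video" | "music";
type ProviderConfig = Record<Capability, string>;

// Defaults as listed in the CLI flags table ("auto-detect" kept as a placeholder).
const DEFAULTS: ProviderConfig = {
  llm: "anthropic",
  image: "gemini",
  tts: "elevenlabs",
  video: "auto-detect",
  music: "bundled",
};

// Assumed preset expansions; the exact values OpenReels uses may differ.
const PRESETS = new Map<string, Partial<ProviderConfig>>([
  ["google", { llm: "gemini", image: "gemini", tts: "gemini-tts", video: "gemini", music: "lyria" }],
  ["local", { tts: "kokoro" }],
]);

// Layering (assumed): defaults < --provider < explicit per-capability flags.
function resolveProviders(
  provider?: string,
  overrides: Partial<ProviderConfig> = {},
): ProviderConfig {
  const preset = provider ? PRESETS.get(provider) : undefined;
  // A non-preset value such as "openai" is treated as an LLM choice only.
  const llmOnly: Partial<ProviderConfig> = provider && !preset ? { llm: provider } : {};
  return { ...DEFAULTS, ...llmOnly, ...preset, ...overrides };
}

console.log(resolveProviders("google"));
console.log(resolveProviders("local"));
console.log(resolveProviders("google", { tts: "kokoro" }));
```

Letting explicit flags win over the preset matches the "mix and match" idea: a run could use the Google preset for everything except voiceover.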

Quickstart

Web UI (recommended)

git clone https://github.com/tsensei/OpenReels.git
cd OpenReels
cp .env.example .env   # fill in your API keys
docker compose up      # starts Redis + API + Worker
# Open http://localhost:3000

Type a topic, pick your providers, and watch the pipeline run. Research, script, voiceover, visuals, music, and assembly stages stream live to the browser. Download the final video when it's done.

Docker CLI (single run)

cp .env.example .env   # fill in your API keys

docker run --env-file .env --shm-size=2gb -v ./output:/output ghcr.io/tsensei/openreels "5 stoic lessons that changed my life"

Or run through Docker Compose:

docker compose run worker npx tsx src/index.ts --yes "5 stoic lessons that changed my life"

Local development

Prerequisites: Node.js 22+, pnpm, ffprobe

git clone https://github.com/tsensei/OpenReels.git
cd OpenReels
pnpm install
cp .env.example .env   # fill in your API keys
# Full pipeline with AI music
pnpm start "the fall of the Roman Empire" --provider google

# Free local TTS, no API spend on voiceover
pnpm start "5 stoic lessons" --provider local

# Dry run (outputs DirectorScore JSON, no asset generation)
pnpm start "your topic" --dry-run

# Specific archetype and provider combo
pnpm start "your topic" --archetype anime_illustration --provider openai

# Creative direction file (guide the AI with a brief)
pnpm start "deep sea exploration" --direction examples/direction-brief.md

# Replay from a previous score (skip research + director, re-render)
pnpm start "your topic" --score output/2026-04-10-111939-.../score.json
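
To script several runs, the commands above can be assembled programmatically. This is a small sketch, assuming a hypothetical `buildArgs` helper; it only uses flags that appear in this README.

```typescript
// Options map onto flags documented in this README; the helper itself is hypothetical.
interface RunOptions {
  provider?: string;
  archetype?: string;
  direction?: string;
  dryRun?: boolean;
  yes?: boolean;
}

// Build the argument list for `pnpm`, starting with the `start` script name.
function buildArgs(topic: string, opts: RunOptions = {}): string[] {
  const args = ["start", topic];
  if (opts.provider) args.push("--provider", opts.provider);
  if (opts.archetype) args.push("--archetype", opts.archetype);
  if (opts.direction) args.push("--direction", opts.direction);
  if (opts.dryRun) args.push("--dry-run");
  if (opts.yes) args.push("--yes");
  return args;
}

const topics = ["the fall of the Roman Empire", "5 stoic lessons"];
for (const topic of topics) {
  // Printed as JSON so the boundaries between arguments stay visible.
  console.log(JSON.stringify(["pnpm", ...buildArgs(topic, { provider: "google", dryRun: true })]));
}
```

The array can be passed to `spawnSync("pnpm", args, { stdio: "inherit" })` from `node:child_process`. Passing arguments as an array avoids shell quoting problems with topics that contain spaces or quotes.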

API keys

Minimum to run (pick one LLM + one TTS):

  • ANTHROPIC_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY (from Anthropic, OpenAI, or Google AI Studio)
  • ELEVENLABS_API_KEY or INWORLD_TTS_API_KEY (from ElevenLabs or Inworld). Or use --tts-provider kokoro (free, no key), --tts-provider openai-tts, or --tts-provider gemini-tts
  • GOOGLE_API_KEY: also needed for Gemini image generation, AI video (Veo), AI music (Lyria), and Gemini TTS

Optional: PEXELS_API_KEY (Pexels), PIXABAY_API_KEY (Pixabay) for stock footage, FAL_API_KEY (fal.ai) for Kling video generation
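
A pre-flight check for the minimum keys might look like the sketch below. It is a simplification under stated assumptions: it accepts any one LLM key regardless of which `--provider` is selected, and the mapping of `openai-tts` to OPENAI_API_KEY is a guess (the README only confirms that Gemini TTS uses GOOGLE_API_KEY and that Kokoro needs no key).

```typescript
type Env = Record<string, string | undefined>;

const LLM_KEYS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY"];

// Key required per TTS provider; null means no key is needed.
// The openai-tts entry is an assumption.
const TTS_KEY = new Map<string, string | null>([
  ["elevenlabs", "ELEVENLABS_API_KEY"],
  ["inworld", "INWORLD_TTS_API_KEY"],
  ["gemini-tts", "GOOGLE_API_KEY"],
  ["openai-tts", "OPENAI_API_KEY"],
  ["kokoro", null],
]);

// Return a list of human-readable problems; an empty list means ready to run.
function missingKeys(env: Env, ttsProvider = "elevenlabs"): string[] {
  const problems: string[] = [];
  const has = (key: string): boolean => (env[key] ?? "").trim() !== "";

  if (!LLM_KEYS.some(has)) {
    problems.push(`one of ${LLM_KEYS.join(", ")}`);
  }

  const ttsKey = TTS_KEY.get(ttsProvider);
  if (ttsKey === undefined) {
    problems.push(`unknown TTS provider "${ttsProvider}"`);
  } else if (ttsKey !== null && !has(ttsKey)) {
    problems.push(ttsKey);
  }
  return problems;
}

console.log(missingKeys({ GOOGLE_API_KEY: "example" }, "kokoro")); // nothing missing
console.log(missingKeys({}, "elevenlabs")); // an LLM key and the ElevenLabs key
```

In a real script the first argument would be `process.env`.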

CLI flags

| Flag | Description | Default |
| --- | --- | --- |
| --provider <name> | LLM provider (anthropic, openai, gemini, openrouter, openai-compatible, google, local) | anthropic |
| --llm-model <model> | Model ID override (e.g. anthropic/claude-sonnet-4 for OpenRouter) | provider default |
| --llm-base-url <url> | Base URL for openai-compatible (e.g. http://localhost:11434/v1) | |
| --search-provider <name> | Search provider (native, tavily, none) | auto-detect |
| --image-provider <name> | Image provider (gemini, openai) | gemini |
| --tts-provider <name> | TTS provider (elevenlabs, inworld, kokoro, gemini-tts, openai-tts) | elevenlabs |
| --music-provider <name> | Music provider (bundled, lyria) | bundled |
| --video-provider <name> | Video provider (gemini, fal) | auto-detect |
| --archetype <name> | Override visual archetype | LLM chooses |
| --platform <name> | Target platform (youtube, tiktok, instagram) | youtube |
| --dry-run | Output DirectorScore JSON without generating assets | off |
| --preview | Open Remotion Studio after rendering | off |
| -o, --output <dir> | Output directory | ./output |
| --no-music | Disable background music | music on |
| --no-video | Disable AI video generation | video on |
| --no-stock-verify | Disable VLM stock footage verification | verify on |
| --stock-confidence <n> | Min confidence for stock verification (0-1) | 0.6 |
| --stock-max-attempts <n> | Max stock API calls per scene | 4 |
| --video-model <model> | Video model override | provider default |
| --kokoro-voice <voice> | Kokoro voice preset | af_heart |
| --direction <file> | Creative brief file (markdown) to guide the AI | |
| --score <path> | Replay from a saved score.json, skipping research + director | |
| -y, --yes | Auto-confirm cost estimation (Docker/CI) | off |
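
The two stock flags, --stock-confidence and --stock-max-attempts, suggest a verify-and-retry loop per scene. The sketch below shows one plausible shape for it; `pickStock`, the candidate and verdict types, and the fall-through to an AI-generated visual are illustrative assumptions, and the functions are synchronous for brevity.

```typescript
// Hypothetical types; not taken from the OpenReels source.
interface StockCandidate {
  id: number;
  url: string;
}
interface Verdict {
  confidence: number; // 0..1, as judged by a vision model
}

type Visual =
  | { kind: "stock"; url: string; confidence: number; attempts: number }
  | { kind: "ai-fallback"; attempts: number };

// Search, verify, and retry until a clip clears `minConfidence`
// or `maxAttempts` searches have been made.
function pickStock(
  search: (attempt: number) => StockCandidate | undefined,
  verify: (candidate: StockCandidate) => Verdict,
  minConfidence = 0.6, // default of --stock-confidence
  maxAttempts = 4, // default of --stock-max-attempts
): Visual {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const candidate = search(attempt);
    if (!candidate) continue; // nothing found, still counts as an API call
    const { confidence } = verify(candidate);
    if (confidence >= minConfidence) {
      return { kind: "stock", url: candidate.url, confidence, attempts: attempt };
    }
  }
  return { kind: "ai-fallback", attempts: maxAttempts };
}

// Stubbed search and verifier: the third clip is the first good match.
const scores = [0.2, 0.5, 0.75];
const search = (attempt: number): StockCandidate => ({ id: attempt - 1, url: `clip-${attempt}.mp4` });
const verify = (c: StockCandidate): Verdict => ({ confidence: scores[c.id] ?? 0 });

const picked = pickStock(search, verify);
console.log(picked);
```

Raising the threshold trades more API calls and more AI fallbacks for fewer off-topic clips.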

Cost transparency

Before spending any money, the pipeline shows a detailed cost breakdown and asks for confirmation:

Estimated cost: $0.686
  LLM:    $0.0029 (7 calls)
  TTS:    $0.0171 (853 chars)
  Images: $0.3030 (3 AI images @ $0.101/ea)
  Video:  $0.3000 (1 AI videos)
  Music:  $0.0802 (Lyria AI generation)
  Stock:  free

After rendering, actual cost is computed from real token usage. Use --dry-run to preview the DirectorScore without spending anything.
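
For illustration, an estimate in this format can be assembled from per-item costs. The sketch uses made-up inputs for a different, smaller run; the per-image price is taken from the sample output and the music price from the provider table, and `formatEstimate` is a hypothetical helper rather than the project's own code.

```typescript
interface LineItem {
  label: string;
  usd: number;
  detail?: string;
}

// Render line items in the style of the estimate shown above.
function formatEstimate(items: LineItem[]): string {
  const total = items.reduce((sum, item) => sum + item.usd, 0);
  const lines = items.map((item) => {
    const amount = item.usd === 0 ? "free" : `$${item.usd.toFixed(4)}`;
    const detail = item.detail ? ` (${item.detail})` : "";
    return `  ${item.label}: ${amount}${detail}`;
  });
  return [`Estimated cost: $${total.toFixed(3)}`, ...lines].join("\n");
}

const IMAGE_USD = 0.101; // per AI image, from the sample output
const MUSIC_USD = 0.08; // per Lyria track, from the provider table

// Hypothetical run: two AI images, one generated track, stock for the rest.
const estimate = formatEstimate([
  { label: "Images", usd: 2 * IMAGE_USD, detail: "2 AI images @ $0.101/ea" },
  { label: "Music", usd: MUSIC_USD, detail: "Lyria AI generation" },
  { label: "Stock", usd: 0 },
]);
console.log(estimate);
```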

Archetypes

14 visual styles that control colors, captions, motion, lighting, and AI image prompting. Same topic, four different archetypes:

Cinematic Documentary Anime Illustration Surreal Dreamscape Vintage Snapshot
| Archetype | Best for |
| --- | --- |
| editorial_caricature | News commentary, satire, social issues |
| warm_narrative | Storytelling, history, human interest |
| studio_realism | Professional photography, editorial, luxury |
| infographic | Data, facts, explainers, rapid-fire content |
| anime_illustration | Dynamic, action-oriented, pop culture |
| pastoral_watercolor | Nature, contemplative, hand-painted aesthetic |
| comic_book | Action, adventure, energetic content |
| gothic_fantasy | Dark themes, mythology, epic content |
| vintage_snapshot | Nostalgic, intimate, personal stories |
| surreal_dreamscape | Sci-fi, fantasy, mind-bending topics |
| warm_editorial | Lifestyle, people stories, general purpose |
| cinematic_documentary | Factual, historical, science |
| moody_cinematic | Mystery, tension, crime, dark history |
| bold_illustration | Educational, how-to, listicles |
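
When passing --archetype from a script, it can help to validate the name first. A small sketch, using the 14 names listed above; the `parseArchetype` helper and its forgiving normalization (accepting "Anime Illustration" for anime_illustration) are assumptions, and the CLI itself may only accept the exact identifiers.

```typescript
// The 14 archetype identifiers from the table above.
const ARCHETYPES = [
  "editorial_caricature",
  "warm_narrative",
  "studio_realism",
  "infographic",
  "anime_illustration",
  "pastoral_watercolor",
  "comic_book",
  "gothic_fantasy",
  "vintage_snapshot",
  "surreal_dreamscape",
  "warm_editorial",
  "cinematic_documentary",
  "moody_cinematic",
  "bold_illustration",
] as const;

type Archetype = (typeof ARCHETYPES)[number];

function isArchetype(value: string): value is Archetype {
  return ARCHETYPES.some((a) => a === value);
}

// Normalize user input and validate it. `undefined` means "let the LLM choose".
function parseArchetype(value: string | undefined): Archetype | undefined {
  if (value === undefined) return undefined;
  const normalized = value.trim().toLowerCase().replace(/[\s-]+/g, "_");
  if (!isArchetype(normalized)) {
    throw new Error(`unknown archetype "${value}"; expected one of: ${ARCHETYPES.join(", ")}`);
  }
  return normalized;
}

console.log(parseArchetype("Anime Illustration"));
console.log(parseArchetype(undefined));
```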

Background

OpenReels is a full rewrite of ReelMistri, a CLI pipeline originally built for Bangla-language YouTube Shorts automation. ReelMistri proved the concept: one command, fully produced video, language-aware scripts, culturally coherent visuals, proper Bengali text rendering.

The rewrite moves from Python to TypeScript for native Remotion integration. Cleaner video rendering, better developer experience, no Python-to-TypeScript bridge.

Status

v0.17.0 shipped. See CHANGELOG.md for full version history and TODOS.md for known issues and roadmap.

Star History

If OpenReels is useful to you, consider giving it a star. It helps others discover the project.

License

This project is licensed under the MIT License.
