lang315/ArcReel

ArcReel

Open-Source AI Video Generation Workspace — Novel to Short Video, Powered by AI Agents


Core Capabilities

🤖 AI Agent Workflow

Built on the Claude Agent SDK. An orchestrator Skill coordinates focused Subagents to run the full pipeline automatically, from script creation to video synthesis

🎨 Multi-Provider Image Generation

Gemini, Volcano Ark (ByteDance), Grok, OpenAI and custom providers. Character design images ensure character consistency; clue tracking maintains prop/scene coherence across shots

🎬 Multi-Provider Video Generation

Veo 3.1, Seedance, Grok, Sora 2 and custom providers, switchable globally or per project

⚡ Async Task Queue

RPM rate limiting, independent image and video concurrency channels, lease-based scheduling, and checkpoint resume
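The queue behaviour described above can be illustrated with a minimal asyncio sketch. This is not ArcReel's GenerationQueue; the class names `RpmLimiter` and `Channel`, the `window` parameter, and the limits used in `demo` are all assumptions made for illustration. It shows only two of the ideas: a sliding-window requests-per-minute limit, and separate concurrency slots for image and video jobs. Lease-based scheduling and checkpoint resume are not modelled.

```python
import asyncio
import time
from collections import deque


class RpmLimiter:
    """Sliding-window limiter: at most `rpm` acquisitions per `window` seconds."""

    def __init__(self, rpm: int, window: float = 60.0) -> None:
        self.rpm = rpm
        self.window = window
        self._stamps: deque[float] = deque()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop timestamps that have left the window.
                while self._stamps and now - self._stamps[0] >= self.window:
                    self._stamps.popleft()
                if len(self._stamps) < self.rpm:
                    self._stamps.append(now)
                    return
                # Wait until the oldest recorded request expires.
                await asyncio.sleep(self.window - (now - self._stamps[0]))


class Channel:
    """One concurrency channel (image or video) with its own limits."""

    def __init__(self, max_concurrency: int, rpm: int, window: float = 60.0) -> None:
        self._slots = asyncio.Semaphore(max_concurrency)
        self._limiter = RpmLimiter(rpm, window)

    async def run(self, job):
        # `job` is a zero-argument callable returning a coroutine.
        async with self._slots:
            await self._limiter.acquire()
            return await job()


async def demo() -> list[str]:
    # Hypothetical limits; a short window keeps the demo fast.
    image = Channel(max_concurrency=2, rpm=5, window=0.2)
    video = Channel(max_concurrency=1, rpm=2, window=0.2)

    async def fake_generate(name: str) -> str:
        await asyncio.sleep(0.01)
        return name

    jobs = [image.run(lambda n=f"img-{i}": fake_generate(n)) for i in range(3)]
    jobs += [video.run(lambda n=f"vid-{i}": fake_generate(n)) for i in range(3)]
    return await asyncio.gather(*jobs)
```

Running `asyncio.run(demo())` returns the job names in submission order. Because each channel owns its semaphore and limiter, a slow video job does not take a slot away from image generation.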

🖥️ Visual Workspace

Web UI for project management, asset preview, version rollback, real-time SSE task tracking, with built-in AI assistant

Workflow

graph TD
    A["📖 Upload Novel"] --> B["📝 AI Agent Generates Storyboard Script"]
    B --> C["👤 Generate Character Design Images"]
    B --> D["🔑 Generate Clue Design Images"]
    C --> E["🖼️ Generate Storyboard Images"]
    D --> E
    E --> F["🎬 Generate Video Clips"]
    F --> G["🎞️ FFmpeg Final Video Synthesis"]
    F --> H["📦 Export Jianying Draft"]
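The workflow graph above is a dependency graph, so the order in which stages can run follows from a topological sort. The sketch below encodes the same edges with Python's standard `graphlib` module; the stage identifiers are illustrative names chosen here, not identifiers from the ArcReel codebase.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on (edges from the diagram).
WORKFLOW = {
    "upload_novel": set(),
    "storyboard_script": {"upload_novel"},
    "character_designs": {"storyboard_script"},
    "clue_designs": {"storyboard_script"},
    "storyboard_images": {"character_designs", "clue_designs"},
    "video_clips": {"storyboard_images"},
    "final_video": {"video_clips"},
    "jianying_draft": {"video_clips"},
}


def runnable_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into batches; stages in one batch have no mutual dependency."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = runnable_batches(WORKFLOW)
```

Character and clue design images land in the same batch, which matches the diagram: both depend only on the storyboard script, and storyboard images need both.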

Quick Start

Default Deployment (SQLite)

git clone https://github.com/ArcReel/ArcReel.git
cd ArcReel/deploy
cp .env.example .env
docker compose up -d
# Visit http://localhost:1241

Production Deployment (PostgreSQL)

cd ArcReel/deploy/production
cp .env.example .env    # Set POSTGRES_PASSWORD
docker compose up -d

After the first startup, log in with the default account. The username is admin and the password is taken from AUTH_PASSWORD in .env; if that variable is not set, a password is generated on first launch and written back to .env. Then open the Settings page (/settings) to complete configuration:

  1. ArcReel Agent — Configure Anthropic API Key (powers the AI assistant), supports custom Base URL and model
  2. AI Image/Video Generation — Configure at least one provider's API Key (Gemini / Volcano Ark / Grok / OpenAI), or add a custom provider

📖 For detailed steps, see the Full Getting Started Guide

Features

  • Complete Production Pipeline — Novel → Script → Character Design → Storyboard Images → Video Clips → Final Video, one-click orchestration
  • Multi-Agent Architecture — Orchestrator Skill detects project state and automatically dispatches focused Subagents; each Subagent completes one task then returns a summary
  • Multi-Provider Support — Image/video/text generation supports four built-in providers: Gemini, Volcano Ark, Grok, OpenAI, switchable globally or per project
  • Custom Providers — Connect any OpenAI-compatible / Google-compatible API (e.g., Ollama, vLLM, third-party proxies), auto-discovers available models and assigns media types, with feature parity to built-in providers
  • Two Content Modes — Narration mode splits segments by reading rhythm; drama/animation mode organizes by scene/dialogue structure
  • Progressive Episode Planning — Human-AI collaboration for splitting long novels: peek probe → Agent suggests breakpoints → user confirms → physical split, produce on demand
  • Style Reference Images — Upload style images; AI automatically analyzes and applies them uniformly to all image generation, ensuring visual consistency across the project
  • Character Consistency — AI first generates character design images; all subsequent storyboards and videos reference that design
  • Clue Tracking — Key props and scene elements marked as "clues" maintain visual coherence across shots
  • Version History — Each regeneration automatically saves a historical version, supporting one-click rollback
  • Multi-Provider Cost Tracking — All image/video/text generation included in cost calculation, billed per provider strategy, with separate statistics by currency
  • Cost Estimation — Estimate project/episode/shot costs before generation, with three-level drill-down showing estimated vs. actual cost comparison
  • Jianying Draft Export — Export Jianying draft ZIPs by episode, supporting Jianying 5.x / 6+ (Operation Guide)
  • Project Import/Export — Package entire project as archive for easy backup and migration
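The cost tracking items in the list above mention per-provider billing and separate statistics by currency. A minimal sketch of that aggregation follows, assuming a simple "unit price times units" model; the `Charge` type and every price below are placeholders, not real provider rates or ArcReel's UsageTracker.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Charge:
    provider: str
    currency: str        # "USD" or "CNY"
    unit_price: Decimal  # price per image, per second, or per token
    units: Decimal       # number of images, seconds, or tokens

    @property
    def amount(self) -> Decimal:
        return self.unit_price * self.units


def totals_by_currency(charges: list[Charge]) -> dict[str, Decimal]:
    """Sum charges per currency; amounts in different currencies are never mixed."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for charge in charges:
        totals[charge.currency] += charge.amount
    return dict(totals)


# Placeholder prices for illustration only.
charges = [
    Charge("openai", "USD", Decimal("0.05"), Decimal(4)),                # per image
    Charge("grok", "USD", Decimal("0.10"), Decimal(8)),                  # per second
    Charge("volcano_ark", "CNY", Decimal("0.000002"), Decimal(50000)),   # per token
]
totals = totals_by_currency(charges)
```

Using `Decimal` rather than floats avoids rounding drift when many small per-token charges are summed.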

Provider Support

ArcReel supports multiple built-in providers and custom providers through unified ImageBackend / VideoBackend / TextBackend protocols, switchable globally or per project:

Image Providers

| Provider | Available Models | Capabilities | Billing |
| --- | --- | --- | --- |
| Gemini (Google) | Nano Banana 2, Nano Banana Pro | Text-to-image, image-to-image (multi-reference) | Resolution lookup table (USD) |
| Volcano Ark (ByteDance) | Seedream 5.0, Seedream 5.0 Lite, Seedream 4.5, Seedream 4.0 | Text-to-image, image-to-image | Per image (CNY) |
| Grok (xAI) | Grok Imagine Image, Grok Imagine Image Pro | Text-to-image, image-to-image | Per image (USD) |
| OpenAI | GPT Image 1.5, GPT Image 1 Mini | Text-to-image, image-to-image (multi-reference) | Per image (USD) |

Video Providers

| Provider | Available Models | Capabilities | Billing |
| --- | --- | --- | --- |
| Gemini (Google) | Veo 3.1, Veo 3.1 Fast, Veo 3.1 Lite | Text-to-video, image-to-video, video extension, negative prompts | Resolution × duration lookup table (USD) |
| Volcano Ark (ByteDance) | Seedance 2.0, Seedance 2.0 Fast, Seedance 1.5 Pro | Text-to-video, image-to-video, video extension, audio generation, seed control, offline inference | Per token usage (CNY) |
| Grok (xAI) | Grok Imagine Video | Text-to-video, image-to-video | Per second (USD) |
| OpenAI | Sora 2, Sora 2 Pro | Text-to-video, image-to-video | Per second (USD) |

Text Providers

| Provider | Available Models | Capabilities | Billing |
| --- | --- | --- | --- |
| Gemini (Google) | Gemini 3.1 Flash, Gemini 3.1 Flash Lite, Gemini 3 Pro | Text generation, structured output, visual understanding | Per token usage (USD) |
| Volcano Ark (ByteDance) | Doubao Seed series | Text generation, structured output, visual understanding | Per token usage (CNY) |
| Grok (xAI) | Grok 4.20, Grok 4.1 Fast series | Text generation, structured output, visual understanding | Per token usage (USD) |
| OpenAI | GPT-5.4, GPT-5.4 Mini, GPT-5.4 Nano | Text generation, structured output, visual understanding | Per token usage (USD) |
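To make the idea of a unified backend protocol concrete, here is a minimal sketch using `typing.Protocol`. Only the name `ImageBackend` comes from this README; the `ImageRequest` fields, the `generate` method, and the synchronous signature are assumptions, and the real interface may well be asynchronous and richer.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class ImageRequest:
    # Hypothetical request shape for illustration.
    prompt: str
    reference_images: tuple[str, ...] = ()
    aspect_ratio: str = "16:9"


@runtime_checkable
class ImageBackend(Protocol):
    """Anything with a `name` and a `generate` method counts as a backend."""

    name: str

    def generate(self, request: ImageRequest) -> bytes: ...


class EchoImageBackend:
    """Stand-in backend that returns the prompt as bytes instead of an image."""

    name = "echo"

    def generate(self, request: ImageRequest) -> bytes:
        return request.prompt.encode("utf-8")


def render(backend: ImageBackend, prompt: str) -> bytes:
    # Callers depend on the protocol only, so providers are interchangeable.
    return backend.generate(ImageRequest(prompt=prompt))


image_bytes = render(EchoImageBackend(), "a red door")
```

Structural typing means a custom provider does not have to inherit from a base class; it only has to offer the same attributes and methods.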

Custom Providers

In addition to built-in providers, you can connect any OpenAI-compatible or Google-compatible API:

  • Add a custom provider in the settings page with Base URL and API Key
  • Automatically calls /v1/models to discover available models, inferring media type (image/video/text) from model names
  • Feature parity with built-in providers: global/project-level switching, cost tracking, version management
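Model discovery as described in the list above can be sketched as follows. OpenAI-compatible servers return a JSON object from /v1/models whose `data` array holds entries with an `id` field; the keyword table used to infer the media type is an assumption made for this example, not ArcReel's actual rule set.

```python
import json

# Hypothetical keyword hints; checked in order, first match wins.
MEDIA_HINTS = (
    ("video", ("veo", "sora", "seedance", "video")),
    ("image", ("image", "seedream", "banana", "dall-e")),
)


def infer_media_type(model_id: str) -> str:
    """Guess image/video/text from a model name, defaulting to text."""
    lowered = model_id.lower()
    for media_type, keywords in MEDIA_HINTS:
        if any(keyword in lowered for keyword in keywords):
            return media_type
    return "text"


def classify_models(models_response: str) -> dict[str, str]:
    """Map each model id in a /v1/models response body to a media type."""
    payload = json.loads(models_response)
    return {m["id"]: infer_media_type(m["id"]) for m in payload.get("data", [])}


# Sample response body in the OpenAI-compatible list format.
sample = json.dumps({
    "object": "list",
    "data": [
        {"id": "gpt-image-1", "object": "model"},
        {"id": "sora-2", "object": "model"},
        {"id": "llama3:8b", "object": "model"},
    ],
})
classified = classify_models(sample)
```

Name-based inference is a heuristic, so a settings UI built on it would normally let the user correct a wrong guess.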

Provider selection priority: project-level settings > global default. When switching providers, common settings (resolution, aspect ratio, audio, etc.) carry over directly; provider-specific parameters are preserved.
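The priority rule and the switching behaviour in the paragraph above can be expressed in a few lines. The settings layout (`video_provider`, `common`, `provider_params`) is a hypothetical structure chosen for this sketch.

```python
from copy import deepcopy


def resolve_provider(project: dict, global_defaults: dict, media_type: str) -> str:
    """Project-level setting wins; otherwise fall back to the global default."""
    key = f"{media_type}_provider"
    return project.get(key) or global_defaults[key]


def switch_provider(settings: dict, new_provider: str) -> dict:
    """Return a copy with a new active provider.

    Common settings are left untouched, and parameters stored for other
    providers stay in place so switching back restores them.
    """
    updated = deepcopy(settings)
    updated["provider"] = new_provider
    updated["provider_params"].setdefault(new_provider, {})
    return updated


global_defaults = {"image_provider": "openai", "video_provider": "gemini"}
project = {"video_provider": "grok"}

settings = {
    "provider": "gemini",
    "common": {"resolution": "1080p", "aspect_ratio": "9:16", "audio": True},
    "provider_params": {"gemini": {"negative_prompt": "blurry"}},
}
switched = switch_provider(settings, "volcano_ark")
```

Here `resolve_provider(project, global_defaults, "video")` yields the project override, while the image provider falls back to the global default.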

Community

Scan the QR code to join the Feishu (Lark) community group for help and latest updates:

Feishu Community QR Code

AI Assistant Architecture

ArcReel's AI assistant is built on the Claude Agent SDK, using an Orchestrator Skill + Focused Subagent multi-agent architecture:

flowchart TD
    User["User Conversation"] --> Main["Main Agent"]
    Main --> MW["manga-workflow<br/>Orchestrator Skill"]
    MW -->|"State Detection"| PJ["Read project.json<br/>+ File System"]
    MW -->|"dispatch"| SA1["analyze-characters-clues<br/>Global Character/Clue Extraction"]
    MW -->|"dispatch"| SA2["split-narration-segments<br/>Narration Mode Segment Splitting"]
    MW -->|"dispatch"| SA3["normalize-drama-script<br/>Drama Animation Normalization"]
    MW -->|"dispatch"| SA4["create-episode-script<br/>JSON Script Generation"]
    MW -->|"dispatch"| SA5["Asset Generation Subagent<br/>Characters/Clues/Storyboards/Video"]
    SA1 -->|"Summary"| Main
    SA4 -->|"Summary"| Main
    Main -->|"Show Results<br/>Await Confirmation"| User

Core Design Principles:

  • Orchestrator Skill (manga-workflow) — Has state detection capability, automatically determines the current project phase (character design / episode planning / preprocessing / script generation / asset generation), dispatches the corresponding Subagent, supports entry from any phase and interruption/resume
  • Focused Subagent — Each Subagent completes only one task then returns; large context such as the novel source text stays inside the Subagent, while the main Agent only receives a refined summary, protecting context space
  • Skill vs. Subagent Boundary — Skills handle deterministic script execution (API calls, file generation); Subagents handle tasks requiring reasoning and analysis (character extraction, script normalization)
  • Inter-Phase Confirmation — After each Subagent returns, the main Agent presents a results summary to the user and waits for confirmation before proceeding to the next phase

OpenClaw Integration

ArcReel supports calls from external AI Agent platforms such as OpenClaw, enabling natural language-driven video creation:

  1. Generate an API Key (with arc- prefix) in ArcReel's settings page
  2. Load ArcReel's Skill definition in OpenClaw (visit http://your-domain/skill.md for automatic retrieval)
  3. Create projects, generate scripts, and produce videos through OpenClaw conversation

Technical implementation: requests authenticate with an API Key sent as a Bearer token and call a synchronous Agent conversation endpoint (POST /api/v1/agent/chat). Internally, the endpoint connects to the SSE streaming assistant and collects the complete response before returning it.
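A client call to that endpoint might be prepared as in the sketch below, which uses only the Python standard library. The endpoint path, the Bearer scheme, and the arc- key prefix come from this README; the JSON body shape (`{"message": ...}`), the key value, and the base URL are assumptions, so check the Skill definition for the real request schema.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, message: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the synchronous agent chat endpoint."""
    body = json.dumps({"message": message}).encode("utf-8")  # hypothetical body shape
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/agent/chat",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder key and host for illustration.
request = build_chat_request("http://localhost:1241/", "arc-example", "Create a new project")
```

Passing `request` to `urllib.request.urlopen` would send it. Since the endpoint waits for the full agent response, a generous timeout is a sensible choice.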

Technical Architecture

flowchart TB
    subgraph UI["Web UI — React 19"]
        U1["Project Management"] ~~~ U2["Asset Preview"] ~~~ U3["AI Assistant"] ~~~ U4["Task Monitor"]
    end

    subgraph Server["FastAPI Server"]
        S1["REST API<br/>Route Dispatch"] ~~~ S2["Agent Runtime<br/>Claude Agent SDK"]
        S3["SSE Stream<br/>Real-time Status Push"] ~~~ S4["Auth<br/>JWT + API Key"]
    end

    subgraph Core["Core Library"]
        C1["VideoBackend Abstraction<br/>Gemini · Volcano Ark · Grok · OpenAI · Custom"] ~~~ C2["ImageBackend Abstraction<br/>Gemini · Volcano Ark · Grok · OpenAI · Custom"]
        C5["TextBackend Abstraction<br/>Gemini · Volcano Ark · Grok · OpenAI · Custom"] ~~~ C3["GenerationQueue<br/>RPM Limiting · Image/Video Channels"]
        C4["ProjectManager<br/>File System + Version Management"]
    end

    subgraph Data["Data Layer"]
        D1["SQLAlchemy 2.0 Async ORM"] ~~~ D2["SQLite / PostgreSQL"]
        D3["Alembic Migrations"] ~~~ D4["UsageTracker<br/>Multi-Provider Cost Tracking"]
    end

    UI --> Server --> Core --> Data

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | React 19, TypeScript, Tailwind CSS 4, wouter, zustand, Framer Motion, Vite |
| Backend | FastAPI, Python 3.12+, uvicorn, Pydantic 2 |
| AI Agents | Claude Agent SDK (Skill + Subagent multi-agent architecture) |
| Image Generation | Gemini (google-genai), Volcano Ark (volcengine-python-sdk[ark]), Grok (xai-sdk), OpenAI (openai) |
| Video Generation | Gemini Veo 3.1 (google-genai), Volcano Ark Seedance 2.0/1.5 (volcengine-python-sdk[ark]), Grok (xai-sdk), OpenAI Sora 2 (openai) |
| Text Generation | Gemini (google-genai), Volcano Ark (volcengine-python-sdk[ark]), Grok (xai-sdk), OpenAI (openai), Instructor (structured output fallback) |
| Media Processing | FFmpeg, Pillow |
| ORM & Database | SQLAlchemy 2.0 (async), Alembic, aiosqlite, asyncpg; SQLite (default) / PostgreSQL (production) |
| Authentication | JWT (pyjwt), API Key (SHA-256 hash), Argon2 password hashing (pwdlib) |
| Deployment | Docker, Docker Compose (deploy/ default, deploy/production/ with PostgreSQL) |
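The Authentication row mentions API keys stored as SHA-256 hashes. A minimal sketch of that pattern follows, assuming keys are random tokens with the arc- prefix; the function names and the token length are illustrative, not taken from the codebase.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> tuple[str, str]:
    """Create a key and the digest to store; the plain key is shown to the user once."""
    key = "arc-" + secrets.token_urlsafe(32)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key, digest


def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Hash the presented key and compare in constant time."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, stored_digest)


key, stored = new_api_key()
```

A fast hash is generally considered acceptable here because the key is a long random token rather than a human-chosen secret, which is also why user passwords use Argon2 instead.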

Documentation

Contributing

Contributions, bug reports, and feature suggestions are welcome! Please see the Contributing Guide for local development setup, testing, and code standards.

License

AGPL-3.0


If you find this project useful, please give it a ⭐ Star!

About

Open-source video generation workspace driven by AI Agents: novel → character/scene/prop design → script → storyboard images → video, with characters and scenes kept consistent across shots. Powered by Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI
