Long-form Video Subtitle Generator

A command-line tool that automatically generates subtitles for long-form videos by splitting them at natural silence points and using OpenAI's Whisper model for transcription and translation.

Features

Detects natural silences in the audio and splits the video at those points, so chunks do not cut through speech and subtitles stay accurate across long-form content
Uses OpenAI's Whisper model for high-quality transcription and translation
Supports multiple languages
Generates properly formatted SRT subtitle files
Uses GPU acceleration (CUDA) when available, with fallback to CPU
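
As an illustration of the silence-based splitting described above, here is a minimal sketch of how split points might be chosen. It is not taken from subtitle_generator.py: the function name `choose_split_points`, the `(start, end)` silence format, and the "nearest silence to the target time" rule are all assumptions made for this example.

```python
from typing import List, Tuple


def choose_split_points(
    silences: List[Tuple[float, float]],
    total_duration: float,
    chunk_duration: float = 300.0,
) -> List[float]:
    """Pick split times near multiples of chunk_duration, snapped to silences.

    silences: (start, end) pairs in seconds, e.g. from a silence detector.
    Returns split times in ascending order, excluding 0 and total_duration.
    """
    # Candidate split times are the midpoints of the silent stretches.
    midpoints = sorted((start + end) / 2 for start, end in silences)
    splits: List[float] = []
    chunk_start = 0.0
    # Keep splitting while the remaining video is longer than one chunk.
    while total_duration - chunk_start > chunk_duration:
        target = chunk_start + chunk_duration
        # Only silences strictly inside the remaining video are usable.
        candidates = [m for m in midpoints if chunk_start < m < total_duration]
        if candidates:
            # Hypothetical rule: take the silence closest to the target time.
            split = min(candidates, key=lambda m: abs(m - target))
        else:
            # No silence left: fall back to a hard cut at the target time.
            split = target
        splits.append(split)
        chunk_start = split
    return splits
```

With silences around 100 s, 295 s, 615 s and 882 s in a 1000-second video and the default 300-second target, this sketch splits at the last three, so each cut lands in a pause rather than mid-sentence.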

Installation

Prerequisites

Python 3.8 or higher
A CUDA-compatible GPU (optional, but recommended for faster processing)
FFmpeg (required for video processing)

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
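
To confirm the prerequisites before running the tool, a small check along the following lines may help. This is a sketch that is not part of the repository, and the helper name `check_prerequisites` is hypothetical. It relies only on the standard library: `shutil.which` looks for `ffmpeg` on the PATH and `sys.version_info` reports the interpreter version.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8)):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    # The README asks for Python 3.8 or higher.
    if sys.version_info < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)
    # FFmpeg must be reachable on PATH for video processing.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```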

Setup

Clone this repository:

git clone https://github.com/pschua/generate-longform-video-subtitles.git
cd generate-longform-video-subtitles

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the required packages:
```
pip install -r requirements.txt
```

Usage

Basic usage:

python subtitle_generator.py video_file.mp4

Advanced options:

python subtitle_generator.py video_file.mp4 --chunk-duration 100 --output custom_name.srt --language ja

Options

video_file: Path to the video file (required)
--chunk-duration: Target duration in seconds for each chunk (default: 300)
--output, -o: Custom output filename for the subtitle file
--device: Device to use for transcription (default: cuda if available, otherwise cpu)
--language: Language code for transcription (default: ja for Japanese)
  Uses Whisper language codes (for example, en for English, ja for Japanese)
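
For readers who want to see how the options above fit together, the following is a sketch of an equivalent `argparse` setup. It mirrors the documented flags and defaults but is not copied from subtitle_generator.py. The helper names `default_device` and `build_parser` are hypothetical, and the device check assumes PyTorch (which Whisper depends on) is importable.

```python
import argparse


def default_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumed to be installed alongside Whisper
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed in this README."""
    parser = argparse.ArgumentParser(
        description="Generate subtitles for long-form videos."
    )
    parser.add_argument("video_file", help="Path to the video file")
    parser.add_argument(
        "--chunk-duration", type=int, default=300,
        help="Target duration in seconds for each chunk",
    )
    parser.add_argument(
        "--output", "-o", default=None,
        help="Custom output filename for the subtitle file",
    )
    parser.add_argument(
        "--device", default=default_device(),
        help="Device to use for transcription",
    )
    parser.add_argument(
        "--language", default="ja",
        help="Language code for transcription",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```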

Examples

Generate English subtitles using CPU:

python subtitle_generator.py my_video.mp4 --language en --device cpu

Generate subtitles from a Japanese video with 10-minute chunks:

python subtitle_generator.py my_video.mp4 --chunk-duration 600 --language ja
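
Because the video is transcribed in chunks, timestamps produced for each chunk presumably need to be shifted by that chunk's start time before they are written to the SRT file. The sketch below shows one way this could work. The names `srt_timestamp` and `chunks_to_srt` and the `(start, end, text)` segment layout are assumptions for illustration, not the tool's actual code.

```python
from typing import Iterable, List, Tuple

# (start seconds, end seconds, text), with times relative to the chunk start
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: Iterable[Tuple[float, List[Segment]]]) -> str:
    """Merge per-chunk segments into a single SRT document.

    chunks: (chunk_start, segments) pairs in playback order.
    """
    blocks = []
    index = 1  # SRT cue numbers start at 1 and run across all chunks
    for chunk_start, segments in chunks:
        for start, end, text in segments:
            # Shift chunk-relative times to positions in the full video.
            blocks.append(
                f"{index}\n"
                f"{srt_timestamp(chunk_start + start)} --> "
                f"{srt_timestamp(chunk_start + end)}\n"
                f"{text.strip()}\n"
            )
            index += 1
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

For example, a segment that starts 0.5 seconds into a chunk beginning at 300 seconds is written with the start time `00:05:00,500`.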

License

MIT License - See LICENSE file for details.
