
voxscribe

Offline audio and video transcription using local Whisper models.

Installation

macOS / Linux — shell installer:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/robdefeo/voxscribe/releases/latest/download/voxscribe-installer.sh | sh

macOS — Homebrew:

brew install robdefeo/tap/voxscribe

cargo-binstall (downloads pre-built binary):

cargo binstall voxscribe

Build from source:

mise install
just build

Usage

voxscribe <INPUT> [OPTIONS]

The first run downloads the selected model from Hugging Face and caches it locally; all subsequent runs work fully offline.

Models

Model           Size     Notes
--------------- -------- ---------------------------------------------------------
large-v3-turbo  ~809 MB  Recommended. Best speed/accuracy balance for most audio.
large-v3        ~1.5 GB  Maximum accuracy. ~2x slower than turbo.
medium          ~769 MB  Good accuracy, faster than large.
small           ~466 MB  Fast, less accurate.
base            ~141 MB  Very fast, lower accuracy.
tiny            ~75 MB   Fastest, lowest accuracy.
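If you are scripting model choice around a download or disk budget, the approximate sizes from the table can be encoded directly. A minimal sketch (note that size is only a rough proxy for accuracy here; large-v3-turbo is both larger and faster than medium):

```python
# Approximate model download sizes in MB, taken from the table above.
MODEL_SIZE_MB = {
    "large-v3-turbo": 809,
    "large-v3": 1500,
    "medium": 769,
    "small": 466,
    "base": 141,
    "tiny": 75,
}


def largest_model_under(budget_mb: int) -> str:
    """Pick the largest model whose download fits within a size budget."""
    candidates = [name for name, size in MODEL_SIZE_MB.items() if size <= budget_mb]
    if not candidates:
        raise ValueError(f"no model fits within {budget_mb} MB")
    return max(candidates, key=MODEL_SIZE_MB.get)
```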

Examples

# Transcribe to stdout
voxscribe audio.m4a

# Transcribe to file using the recommended model
voxscribe audio.m4a --model large-v3-turbo --output transcript.txt

# SRT subtitles with timestamps
voxscribe audio.mp4 --format srt --timestamps --output subtitles.srt

# Force language and apply a correction dictionary
voxscribe audio.m4a --language en --dict corrections.json

# Use a locally downloaded model file
voxscribe audio.m4a --model-path ~/models/ggml-large-v3-turbo.bin
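For the `--dict` example above, the schema of `corrections.json` is not shown here; the flat misheard-to-correct mapping below is an assumption for illustration, along with a sketch of how such post-processing typically works:

```python
import json
import re


def apply_corrections(transcript: str, dict_path: str) -> str:
    """Replace misheard words using a flat {"wrong": "right"} JSON mapping.

    The file format is assumed for illustration; check voxscribe's
    documentation for the actual corrections.json schema.
    """
    with open(dict_path) as f:
        corrections = json.load(f)
    for wrong, right in corrections.items():
        # Whole-word, case-sensitive replacement.
        transcript = re.sub(rf"\b{re.escape(wrong)}\b", right, transcript)
    return transcript
```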

Options

Run voxscribe --help for the full list of options.
