WhisperX Transcription GUI

A GUI for transcribing audio and video files using WhisperX, with CUDA acceleration on an NVIDIA RTX 4070.

Updated to support the latest version of WhisperX as of 4/1/2026.

Transcripts are saved as both .srt and .txt files in a transcripts/ folder next to the script.

Requirements

Windows 11 (AMD64)
NVIDIA RTX 4070 (or compatible GPU)
Git Bash
Python 3.12.0
CUDA Toolkit 12.8 (the setup steps below install 12.8.1)
cuDNN 9.x (I tested with 9.20 and 9.15)

Setup

1. Install CUDA Toolkit 12.8.1

Download and run the installer:

wget https://developer.download.nvidia.com/compute/cuda/12.8.1/network_installers/cuda_12.8.1_windows_network.exe

After installing, set the CUDA_PATH environment variable if it wasn't set automatically:

Value: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
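To confirm the variable is visible to Python, a small check like the one below may help. It is only a sketch: the helper name check_cuda_path is made up for illustration, and the expected path is simply the default location quoted above.

```python
import os
from pathlib import Path

# Default install location quoted in the step above; adjust if you installed elsewhere.
EXPECTED_CUDA_PATH = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8"


def check_cuda_path(env=None):
    """Return (ok, message) describing the state of the CUDA_PATH variable.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_PATH")
    if not value:
        return False, "CUDA_PATH is not set"
    if not Path(value).is_dir():
        return False, f"CUDA_PATH points to a missing folder: {value}"
    return True, f"CUDA_PATH = {value}"


if __name__ == "__main__":
    ok, message = check_cuda_path()
    print(message)
    if not ok:
        print(f"Expected something like: {EXPECTED_CUDA_PATH}")
```

Open a new terminal after changing environment variables, since shells that were already running keep the old values.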

2. Install cuDNN 9.20

Download and run the installer:

wget https://developer.download.nvidia.com/compute/cudnn/9.20.0/local_installers/cudnn_9.20.0_windows_x86_64.exe

Add the cuDNN bin folder to your system PATH:

C:\Program Files\NVIDIA\CUDNN\v9.20\bin\12.9
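One way to check that the folder actually ended up on PATH is sketched below. The helper is_on_path is hypothetical, and the folder constant is the path quoted above; change it if cuDNN was installed to a different location.

```python
import os

# Folder quoted in the step above; adjust if cuDNN was installed elsewhere.
CUDNN_BIN = r"C:\Program Files\NVIDIA\CUDNN\v9.20\bin\12.9"


def _normalize(entry):
    # Compare case-insensitively, accepting either slash style,
    # optional surrounding quotes, and trailing separators.
    return entry.strip().strip('"').replace("/", "\\").rstrip("\\").lower()


def is_on_path(folder, path_value, sep=";"):
    """Return True if `folder` appears as an entry in a Windows-style PATH string."""
    target = _normalize(folder)
    return any(
        _normalize(entry) == target
        for entry in path_value.split(sep)
        if entry.strip()
    )


if __name__ == "__main__":
    found = is_on_path(CUDNN_BIN, os.environ.get("PATH", ""))
    print("cuDNN bin folder on PATH:", found)
```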

3. Install Python 3.12.0

Download from python.org, or use the Python install manager (which provides the py install command):

py install 3.12.0

4. Run the install script

Open Git Bash in the project directory and run:

source install_4070_complete.bash

This script will:

Create and activate a Python 3.12.0 virtual environment at .venv
Install PyTorch and torchaudio 2.8.0+cu128
Verify that CUDA 12.8 and cuDNN 9.x are detected correctly; if not, download the installers for you and exit
Install WhisperX with all its dependencies

Note: If CUDA/cuDNN are not installed when you run the script, it will download the installers to the project directory and exit. Install them, then re-run the script.
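If you want to repeat the CUDA/cuDNN check by hand after the script finishes, something along these lines should work from inside the activated .venv. This is not the check the install script itself runs (that is not shown here); gpu_report is a hypothetical helper built on standard PyTorch attributes.

```python
def gpu_report():
    """Collect the versions PyTorch reports, or note that torch is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        # CUDA version the wheel was built against; None on CPU-only builds.
        "cuda_version": torch.version.cuda,
        # Integer cuDNN version, or None if cuDNN is unavailable.
        "cudnn_version": torch.backends.cudnn.version(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

With the cu128 build installed and the GPU visible, cuda_available should be True and cuda_version should read 12.8.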

Running the app

Activate the virtual environment, then start the app:

source .venv/Scripts/activate
python whisperx-gui.py

Usage

Click Add Files to add audio or video files (.mp4, .mkv, .mov, .wmv, .avi, .flv, .mp3, .wav, .aac, .flac, .ogg)
Select a Model — large-v2 is the most accurate, tiny is the fastest
Select the Language of the audio
Select a Compute Type — float16 is recommended for CUDA
Click Transcribe All
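For orientation, the options above map roughly onto the WhisperX Python API as sketched below. This is not the code in whisperx-gui.py; the names SUPPORTED_EXTENSIONS, is_supported and transcribe_file are hypothetical, and the batch size is an arbitrary assumption.

```python
from pathlib import Path

# File types listed in the Usage section.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".mkv", ".mov", ".wmv", ".avi", ".flv",
    ".mp3", ".wav", ".aac", ".flac", ".ogg",
}


def is_supported(path):
    """Return True if the file extension is one the GUI accepts."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe_file(path, model_name="large-v2", language=None,
                    compute_type="float16", device="cuda", batch_size=16):
    """Transcribe one file and return the WhisperX result dict."""
    # Imported lazily so the helpers above work without WhisperX installed.
    import whisperx

    model = whisperx.load_model(
        model_name, device, compute_type=compute_type, language=language
    )
    audio = whisperx.load_audio(str(path))
    # The result contains a "segments" list with start, end and text fields.
    return model.transcribe(audio, batch_size=batch_size)
```

Calling transcribe_file("talk.mp4", language="en") would then correspond to choosing large-v2, English and float16 in the GUI. A smaller batch_size may be needed if the GPU runs out of memory.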

Output files are saved to transcripts/<filename>_<timestamp>/ and the folder opens automatically when transcription completes.
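The sketch below shows one way such an output folder and .srt content could be produced. It is illustrative only: the timestamp format in the folder name is an assumption (the app's actual format is not documented here), the function names are hypothetical, and segments are assumed to be dicts with start, end and text keys.

```python
from datetime import datetime
from pathlib import Path


def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm, the timestamp form used in .srt files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def render_srt(segments):
    """Render segments as numbered .srt blocks separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def output_dir(source_file, now=None, base_dir="transcripts"):
    """Build transcripts/<filename>_<timestamp> for a source file."""
    now = now or datetime.now()
    # Assumed timestamp format: YYYYMMDD_HHMMSS.
    return Path(base_dir) / f"{Path(source_file).stem}_{now:%Y%m%d_%H%M%S}"
```

To write the files, create the folder with output_dir(...).mkdir(parents=True, exist_ok=True), then save render_srt(segments) as the .srt file and the joined segment texts as the .txt file.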
