Audio Transcription PoC 🎤 → 📝

A simple proof-of-concept (PoC) application that records audio from the microphone in a web front end, sends it to a FastAPI backend, and transcribes it with OpenAI's Whisper model.

📌 Features

Record audio directly from the browser
Send recorded audio to a FastAPI backend
Use OpenAI Whisper for speech-to-text transcription
Display the transcribed text in the UI

📦 Tech Stack

Frontend: JavaScript (Vanilla JS) + HTML + CSS
Backend: Python + FastAPI
Transcription Model: OpenAI Whisper
Audio Processing: ffmpeg

🚀 Setup & Installation

1️⃣ Install Backend Dependencies

Make sure you have Python 3.8+ installed.

1. Clone the repository

git clone https://github.com/abonvalle/PoC-OpenAI-Whisper.git
cd PoC-OpenAI-Whisper

2. Create a virtual environment (optional but recommended)

python -m venv .poc_openai_whisper_env
source .poc_openai_whisper_env/bin/activate # On Windows: .poc_openai_whisper_env\Scripts\activate

3. Install Python dependencies

pip install -r requirements.txt

4. Install ffmpeg (required for audio processing)

Linux (Ubuntu/Debian):

sudo apt install ffmpeg

macOS:

brew install ffmpeg

Windows (Chocolatey):

choco install ffmpeg
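
To confirm that the install worked, a quick check like the one below can help. This is a minimal sketch using only the Python standard library; the helper names `find_executable` and `ffmpeg_version_line` are made up for illustration and are not part of this repository.

```python
import shutil
import subprocess
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def ffmpeg_version_line(path: str) -> str:
    """Run `<path> -version` and return the first line of its output."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("ffmpeg not found on PATH; install it with one of the commands above.")
    else:
        print(ffmpeg_version_line(path))
```

If the script reports that ffmpeg is missing right after installation, opening a new terminal so that PATH is reloaded is usually worth trying first.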

2️⃣ Run the FastAPI Backend

1. Run the FastAPI server

fastapi dev app.py --port 5000

The API should now be running at: http://localhost:5000
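
Before opening the frontend, it can be useful to verify that something is actually listening on port 5000. The snippet below is a small sketch that only tests TCP reachability; it does not check that the process behind the port is this API. The function name `is_port_open` is an illustrative assumption.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    status = "reachable" if is_port_open() else "not reachable"
    print(f"Backend on localhost:5000 is {status}")
```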

3️⃣ Run the Frontend

Serve the project folder with the Live Server extension for VS Code (https://marketplace.visualstudio.com/items?itemName=ritwickdey.LiveServer, recommended), or open index.html directly in a browser.

🛠️ API Endpoints

🎙️ Transcribe Audio

Endpoint:

POST /transcribe

Request:

Content-Type: multipart/form-data
Body:
- file: The recorded audio file (WebM/WAV)

Response (JSON):

{
  "message": "Success",
  "transcription": "Hello, this is a test transcription."
}
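
The endpoint can also be exercised without the browser UI. The following is a hedged sketch of a command-line client built on the Python standard library: it encodes one file under the form field `file`, posts it to `/transcribe`, and reads the `transcription` field from the JSON response described above. The helper names, the default `audio/webm` content type, and the default URL are assumptions for illustration; the repository's own client is script.js.

```python
import json
import sys
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "audio/webm") -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_transcription(raw: bytes) -> str:
    """Extract the "transcription" field from the JSON response body."""
    return json.loads(raw.decode("utf-8"))["transcription"]


def transcribe(path: str, url: str = "http://localhost:5000/transcribe") -> str:
    """POST the audio file at `path` to the endpoint and return the transcribed text."""
    audio = Path(path)
    body, header = build_multipart("file", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": header}
    )
    with urllib.request.urlopen(request) as response:
        return parse_transcription(response.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python client.py recording.webm
    print(transcribe(sys.argv[1]))
```

Building the multipart body by hand keeps the sketch dependency-free; a library such as `requests` would shorten it if adding a dependency is acceptable.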

📜 Project Structure

/PoC-OpenAI-Whisper
│── index.html # Main UI
│── script.js # Handles recording & API calls
│── app.py # FastAPI application
└── requirements.txt # Python dependencies

TODO: update README with FastAPI

