suyashsahu00/WORKSHOP-DEMO

🎙️ LiveKit Voice Agent: Workshop Demo


A production-ready, multi-agent voice assistant built from scratch using the LiveKit Agents SDK. It features consent collection, manager escalation with a different Cartesia voice, semantic turn detection, and multi-model fallback.

🔗 Based on the workshop: Building Production-Ready Voice Agents with LiveKit


✨ Features

  • 🎤 Real-time voice conversation via WebRTC (LiveKit)
  • 🧠 Multi-model LLM fallback: OpenAI GPT-4.1 Mini → Google Gemini 2.5 Flash
  • 🗣️ Multi-model STT fallback: AssemblyAI Universal Streaming → Deepgram Nova-3
  • 🔊 Multi-model TTS fallback: Cartesia Sonic-3 → Inworld TTS-1
  • 🔇 Background noise cancellation via LiveKit BVC
  • 🛑 Semantic Turn Detection: no awkward mid-sentence interruptions (MultilingualModel)
  • ⚡ Preemptive generation for low-latency responses
  • ✅ Consent Collection Task: asks for recording consent before the conversation starts
  • 👨‍💼 Manager Escalation: seamless handoff to ManagerAgent with a different Cartesia voice
  • 🗂️ Full conversation history preserved across all agent handoffs
  • 🐳 Docker support for containerized deployment
  • ☁️ LiveKit Cloud deployment ready via lk CLI

📸 Demo Screenshots

[Screenshot: Local Console Testing]

[Screenshot: LiveKit Cloud Deployment]


🏗️ Tech Stack

| Category | Provider | Model / Details |
| --- | --- | --- |
| LLM (Primary) | OpenAI | gpt-4.1-mini |
| LLM (Fallback) | Google | gemini-2.5-flash |
| STT (Primary) | AssemblyAI | universal-streaming:en |
| STT (Fallback) | Deepgram | nova-3 |
| TTS (Assistant) | Cartesia | sonic-3 · voice 9626c31c-bec5-4cca-baa8-f8ba9e84c8bc |
| TTS (Manager) | Cartesia | sonic-3 · voice 6f84f4b8-58a2-430c-8c79-688dad597532 |
| TTS (Fallback) | Inworld | inworld-tts-1 |
| VAD | Silero | - |
| Turn Detection | LiveKit | MultilingualModel (semantic) |
| Noise Cancellation | LiveKit | BVC |
| Infrastructure | LiveKit Cloud | WebRTC |
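
The primary/fallback pairs in the table follow a simple pattern: try the first provider and move to the next one only if it fails. The sketch below illustrates that idea with the standard library only. It is not the code in agent.py, which presumably relies on the fallback adapters that the LiveKit Agents SDK provides; `call_with_fallback`, `flaky_primary`, and `steady_fallback` are hypothetical names used for illustration.

```python
from __future__ import annotations

from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
) -> tuple[str, str]:
    """Try each (name, fn) pair in order and return (name, result) from the first success."""
    errors: list[str] = []
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except Exception as exc:  # a real agent would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for the primary and fallback LLMs.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def steady_fallback(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [("gpt-4.1-mini", flaky_primary), ("gemini-2.5-flash", steady_fallback)]
used, reply = call_with_fallback(chain, "hello")
print(used, reply)
```

Here the primary always times out, so the call is answered by the second entry in the chain. The same ordering idea applies to the STT and TTS rows.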

🤖 Agent Architecture

User joins room
      │
      ▼
┌───────────────────────┐
│  CollectConsent Task  │  ◄─ Asks for recording permission (Yes / No)
└────────┬──────────────┘
         │
    Yes ─── No
         │     └─► Proceed without recording
         ▼
┌───────────────────────┐
│    Assistant Agent    │  ◄─ Friendly CSR · Cartesia Voice 1
│                       │     Handles general queries
└────────┬──────────────┘
         │
  "I want a manager"
         │
         ▼
┌───────────────────────┐
│    Manager Agent      │  ◄─ Empathetic Manager · Cartesia Voice 2
│                       │     Full chat history preserved ✅
└───────────────────────┘
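
The flow in the diagram can be read as a small state machine: consent is collected first, the assistant takes over, and a request for a manager switches the active agent and voice while the shared history stays intact. The sketch below models only that control flow in plain Python. It does not use the LiveKit SDK, the keyword match is a stand-in for however agent.py actually detects an escalation request, and `Session`, `hand_off`, and the voice labels are hypothetical.

```python
from dataclasses import dataclass, field

# Placeholder labels; the real voice ids are listed in the Tech Stack table.
VOICES = {"assistant": "cartesia-voice-1", "manager": "cartesia-voice-2"}


@dataclass
class Session:
    history: list = field(default_factory=list)  # shared by every agent
    recording: bool = False
    agent: str = "consent"
    voice: str = ""


def hand_off(session: Session, agent: str) -> None:
    """Switch the active agent and its voice; the history list is left untouched."""
    session.agent = agent
    session.voice = VOICES[agent]


def collect_consent(session: Session, answer: str) -> None:
    """Record the Yes/No answer, then continue to the assistant either way."""
    session.history.append(("user", answer))
    session.recording = answer.strip().lower() in {"yes", "y"}
    hand_off(session, "assistant")


def on_user_turn(session: Session, text: str) -> None:
    """Store the turn and escalate when the user asks for a manager."""
    session.history.append(("user", text))
    if session.agent == "assistant" and "manager" in text.lower():
        hand_off(session, "manager")


session = Session()
collect_consent(session, "No")
on_user_turn(session, "I want a manager")
print(session.agent, session.voice, session.recording, len(session.history))
```

After these two turns the active agent is the manager, recording is off because consent was declined, and both user turns are still in the history.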

📁 Project Structure

WORKSHOP-DEMO/
└── livekit-voice-agent/
    ├── agent.py            # Main voice agent: all agent classes & entrypoint
    ├── .env                # API keys (not committed to git)
    ├── .env.example        # Environment variable template
    ├── pyproject.toml      # uv project config & dependencies
    ├── uv.lock             # Locked dependency versions
    ├── Dockerfile          # Docker container config
    ├── .dockerignore       # Docker ignore rules
    ├── livekit.toml        # LiveKit Cloud deployment config
    └── README.md           # This file

⚙️ Installation & Setup

Prerequisites

Step 1: Clone the repo

git clone https://github.com/suyashsahu00/WORKSHOP-DEMO.git
cd WORKSHOP-DEMO/livekit-voice-agent

Step 2: Install dependencies

uv sync

Step 3: Set up environment variables

cp .env.example .env

Open .env and fill in your LiveKit credentials:

LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key_here
LIVEKIT_API_SECRET=your_api_secret_here

🔑 Get your API keys here: LiveKit Cloud API Keys

Note: All model inference (OpenAI, Google, AssemblyAI, Deepgram, Cartesia, Inworld) runs via LiveKit Cloud Inference, so no separate provider API keys are needed!
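
Before starting the agent, it can help to confirm that the three LiveKit variables are actually set. The snippet below is a minimal sketch of such a check. It assumes the variables are already in the process environment (exported in the shell or loaded from .env by your tooling), and `missing_env` is a hypothetical helper, not part of this repo.

```python
import os

REQUIRED = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_env(required=REQUIRED, env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Missing variables:", ", ".join(missing))
    else:
        print("All LiveKit variables are set.")
```

Passing a dictionary as `env` makes the helper easy to try without touching the real environment, for example `missing_env(env={"LIVEKIT_URL": "wss://your-project.livekit.cloud"})`.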


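🚀 Running the Agent

Option 1: Console Mode (recommended for testing)

uv run agent.py console

Console mode runs the agent in your terminal and uses your local microphone and speakers, so you can start talking right away.

Option 2: Dev Mode

uv run agent.py dev

Dev mode connects the agent to your LiveKit project. Open the LiveKit Agents Playground, connect to your room, and start talking!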

🐳 Docker Deployment

Build the Docker image

docker build -t livekit-voice-agent .

Run the container

docker run --env-file .env livekit-voice-agent

☁️ LiveKit Cloud Deployment

Step 1: Authenticate with LiveKit CLI

lk cloud auth

Step 2: Deploy your agent

lk agent deploy

Step 3: Verify on dashboard

LiveKit Cloud Dashboard → Agents → status should be Running ✅


📈 Development History

| Commit | Message |
| --- | --- |
| 8b6f5ba | feat: replace tech support persona with Dr. Sydney health assistant |
| 2f37fce | feat: initialize project with uv dependency management and configuration files |
| 7c8876e | feat: change Sydney persona from health assistant to weather girl |
| febb85d | feat: connecting voice agent to external services with MCP |
| ef7aaf7 | feat: implement production-ready LiveKit voice agent with semantic turn detection, fallback models, and manager escalation |
| latest | feat: add multi-agent voice system with consent workflow, manager handoff, and Cartesia TTS override |

🛠️ Troubleshooting

  • ImportError: Run uv sync to install all plugins listed in pyproject.toml.
  • API Key Errors: Double-check your .env file. The default setup needs only the three LiveKit credentials; on the Table_revision branch, GROQ_API_KEY, DEEPGRAM_API_KEY, and MURF_API_KEY must also be set.
  • Voice Not Working (Table_revision branch): Some voices, such as "Tanushree", are premium on Murf AI. Ensure your API key has access to the requested voice and model (FALCON/GEN2).
  • Network Issues: When running locally, make sure your internet connection is stable, since STT, LLM, and TTS all communicate with cloud providers.
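
For the ImportError case, a quick way to see which packages are missing is to ask Python whether each module can be located, without importing the agent itself. The sketch below does that with the standard library. The module names in `ASSUMED_MODULES` are assumptions based on the tech stack, so compare them with the imports at the top of agent.py; `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):  # e.g. the parent package is absent
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed module names; adjust to match the imports in agent.py.
ASSUMED_MODULES = [
    "livekit.agents",
    "livekit.plugins.silero",
    "livekit.plugins.noise_cancellation",
    "livekit.plugins.turn_detector",
]

print("Missing:", missing_modules(ASSUMED_MODULES) or "none")
```

Run it inside the project environment (for example with uv run) so that it inspects the same packages the agent would use. An empty result means the listed modules were all found.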

📚 Resources

| Resource | Link |
| --- | --- |
| 🔗 Workshop Tutorial | Building Production-Ready Voice Agents with LiveKit |
| 🔑 LiveKit API Keys | cloud.livekit.io → API Keys |
| 📖 LiveKit Agents Docs | docs.livekit.io/agents |
| 🧪 Agents Playground | agents-playground.livekit.io |
| 💬 LiveKit Community | LiveKit Slack |
| ☁️ LiveKit Cloud | cloud.livekit.io |

🪪 License

MIT License © 2026 Suyash Sahu


📝 Updates from Table_revision Branch

This branch replaces parts of the tech stack with the following providers:

  • 🧠 High-Performance LLM: Groq Llama-3.3-70b-versatile
  • 🗣️ Ultra-fast STT: Deepgram Nova-2
  • 🔊 Premium TTS: Murf AI (Tanushree Voice · FALCON Model)

Updated Tech Stack Table (Table_revision)

| Category | Provider | Model / Details |
| --- | --- | --- |
| LLM | Groq | llama-3.3-70b-versatile |
| STT | Deepgram | nova-2 |
| TTS | Murf AI | Tanushree (FALCON model) |

Required API Keys (for Table_revision)

To use the features from this branch, get an API key from each provider's console (Groq, Deepgram, and Murf AI) and add it to your .env file:

GROQ_API_KEY=your_groq_api_key_here
DEEPGRAM_API_KEY=your_deepgram_api_key_here
MURF_API_KEY=your_murf_api_key_here
