Add livekit-plugins-avaz for dashboard WebSocket TTS #6136
Open
tamerrkanak wants to merge 17 commits into
Provides env-based api_key, base_url, and model_id integration with unit tests, plugin integration tests, and a voice agent example. Co-authored-by: Cursor <cursoragent@cursor.com>
Devin review on livekit#6136: AvazSynthesizeStream was undefined; stream() and synthesize() now delegate to the existing SynthesizeStream class. Co-authored-by: Cursor <cursoragent@cursor.com>
Addresses Devin review on livekit#6136 for optional-dependencies and workspace source ordering. Co-authored-by: Cursor <cursoragent@cursor.com>
Move _mark_started() from first audio receipt to the WebSocket text send, matching Cartesia/ElevenLabs/Deepgram plugin conventions. Co-authored-by: Cursor <cursoragent@cursor.com>
Guard drain/flush paths against non-dict JSON payloads; clarify intentional text batching and chunk_notation normalization for Devin review on livekit#6136. Co-authored-by: Cursor <cursoragent@cursor.com>
Replace asyncio.timeout (3.11+) with wait_for to match requires-python >=3.10 and other LiveKit plugins. Co-authored-by: Cursor <cursoragent@cursor.com>
Pass ws explicitly into _drain_audio so streaming synthesis works after _connect_and_run_turn refactor; add a mocked unit test that would have caught the NameError. Co-authored-by: Cursor <cursoragent@cursor.com>
Retry warmup in _ensure_warmed when background prewarm failed, and count only zero-padding bytes in pcm_accum so audio_total_ms is not inflated. Co-authored-by: Cursor <cursoragent@cursor.com>
Let the base SynthesizeStream._main_task call end_input() once after _run returns instead of ending the segment explicitly inside _run. Co-authored-by: Cursor <cursoragent@cursor.com>
Cover replace-vs-append behavior for Avaz dashboard chunk boundaries in tests/test_avaz_tts.py per Devin review on livekit#6136. Co-authored-by: Cursor <cursoragent@cursor.com>
Log server payloads at DEBUG with truncated audio fields, and remove the unused model_id branch from _resolve_stream_model now that dashboard UUIDs flow through agent_model_id only. Co-authored-by: Cursor <cursoragent@cursor.com>
Signal PEP 561 inline typing support for mypy and other type checkers. Co-authored-by: Cursor <cursoragent@cursor.com>
Ensure base TTS teardown runs after cancelling the prewarm task. Co-authored-by: Cursor <cursoragent@cursor.com>
Re-raise APIConnectionError and convert other connect-time exceptions so base SynthesizeStream retry logic can handle transient network/handshake errors. Co-authored-by: Cursor <cursoragent@cursor.com>
Keep avaz optional extra after asyncai and align plugin pins with main (>=1.6.1). Co-authored-by: Cursor <cursoragent@cursor.com>
Align plugin version and livekit-agents dependency with main after the >=1.6.1 optional-extra merge. Co-authored-by: Cursor <cursoragent@cursor.com>
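Two of the commits above, the guard against non-dict JSON payloads and the switch from asyncio.timeout to wait_for, can be illustrated with a minimal sketch. The helper names parse_payload and recv_with_timeout are hypothetical, and an asyncio.Queue stands in for the WebSocket; this is not the plugin's actual code.

```python
import asyncio
import json
from typing import Any, Dict, Optional


def parse_payload(raw: str) -> Optional[Dict[str, Any]]:
    """Return the decoded message only when it is a JSON object.

    Lists, strings, numbers, null, and malformed JSON all yield None,
    so callers can safely use .get() on the result.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


async def recv_with_timeout(
    queue: "asyncio.Queue[str]", timeout_s: float
) -> Optional[str]:
    """Wait for one message, returning None on timeout.

    asyncio.wait_for is available on Python 3.10, whereas the
    asyncio.timeout context manager was added in 3.11.
    """
    try:
        return await asyncio.wait_for(queue.get(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None
```

Returning None for both malformed and non-object payloads keeps the calling loop to a single check before it reads any fields.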
Comment on lines +693 to +698

    await asyncio.sleep(0.05)
    await _drain_audio(ws, timeout=self._opts.recv_idle_timeout_s)

    # Match test_ws_avaz3.py: wait for trailing audio before flush.
    await asyncio.sleep(self._opts.post_text_drain_s)
    await _drain_audio(ws, timeout=self._opts.recv_idle_timeout_s)
🚩 Fixed sleep delays in _run_turn add minimum ~200ms latency per synthesis turn
Lines 693 and 697 add fixed asyncio.sleep calls (50ms + 150ms by default via post_text_drain_s) between sending text and sending flush. Combined with drain timeout windows (recv_idle_timeout_s=0.5s), the minimum turn time is significantly padded. These sleeps exist to match the Avaz server's expected protocol timing, but they add latency that may be unnecessary with faster server builds. The timeouts are configurable via constructor parameters, so users can tune them.
Use post_text_drain_s as a WebSocket recv idle window instead of asyncio.sleep plus a second full drain, removing the ~200ms minimum padding per turn while keeping trailing-chunk capture configurable. Co-authored-by: Cursor <cursoragent@cursor.com>
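The idle-window approach described in this commit might look roughly like the sketch below. It assumes an asyncio.Queue in place of the WebSocket, and the name drain_until_idle is hypothetical; the plugin's real _drain_audio may differ.

```python
import asyncio
from typing import List


async def drain_until_idle(
    queue: "asyncio.Queue[bytes]", idle_s: float
) -> List[bytes]:
    """Collect chunks until none arrives within idle_s seconds.

    Each receive is bounded by the idle window, so there is no
    unconditional sleep before reading begins: chunks that are
    already available are consumed immediately.
    """
    chunks: List[bytes] = []
    while True:
        try:
            chunk = await asyncio.wait_for(queue.get(), timeout=idle_s)
        except asyncio.TimeoutError:
            # Nothing arrived inside the idle window: stop draining.
            return chunks
        chunks.append(chunk)
```

With this shape, the only fixed cost per call is one idle window after the last chunk, and that window stays tunable through a single parameter.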
Summary
- Adds livekit-plugins-avaz, a LiveKit Agents TTS plugin for Avaz dashboard WebSocket streaming (/tts/stream-input).
- Configured via AVAZ_API_KEY, AVAZ_BASE_URL, and AVAZ_AGENT_MODEL_ID (UUID), with a separate upstream stream_model for WebSocket init.
- Unit tests (pytest --unit), env-gated plugin integration tests (pytest --plugin avaz), and examples/voice_agents/avaz_agent.py.

Supersedes #6122 (livekit-plugins-voxcpm), which targeted self-hosted vLLM-Omni rather than the Avaz dashboard API.

Test plan
- uv run pytest tests/test_avaz_tts.py --unit
- uv run pytest tests/test_plugin_avaz_tts.py --plugin avaz (requires AVAZ_API_KEY, AVAZ_BASE_URL, AVAZ_AGENT_MODEL_ID)
- Run examples/voice_agents/avaz_agent.py against a dashboard deployment

Made with Cursor
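For readers who want to check their environment before running the env-gated tests, a minimal sketch of resolving the three variables named in the summary follows. AvazSettings and load_settings are hypothetical names, and validating AVAZ_AGENT_MODEL_ID as a UUID is an assumption based on the "(UUID)" note; none of this is taken from the plugin itself.

```python
import os
import uuid
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_VARS = ("AVAZ_API_KEY", "AVAZ_BASE_URL", "AVAZ_AGENT_MODEL_ID")


@dataclass(frozen=True)
class AvazSettings:
    """Hypothetical container for the env-based configuration."""

    api_key: str
    base_url: str
    agent_model_id: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> AvazSettings:
    """Read the settings from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    agent_model_id = env["AVAZ_AGENT_MODEL_ID"]
    # Assumption: the ID must parse as a UUID; uuid.UUID raises ValueError otherwise.
    uuid.UUID(agent_model_id)
    return AvazSettings(
        api_key=env["AVAZ_API_KEY"],
        base_url=env["AVAZ_BASE_URL"].rstrip("/"),  # normalize trailing slash
        agent_model_id=agent_model_id,
    )
```

Calling load_settings() with no arguments reads os.environ; passing a plain dict makes the same check usable in unit tests without touching the process environment.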