Add livekit-plugins-avaz for dashboard WebSocket TTS #6136

Open
tamerrkanak wants to merge 17 commits into livekit:main from Mank-Technology:livekit-plugins-avaz
Conversation

@tamerrkanak

Summary

  • Adds livekit-plugins-avaz, a LiveKit Agents TTS plugin for Avaz dashboard WebSocket streaming (/tts/stream-input).
  • Supports env-based AVAZ_API_KEY, AVAZ_BASE_URL, and AVAZ_AGENT_MODEL_ID (UUID) with separate upstream stream_model for WebSocket init.
  • Includes unit tests (pytest --unit), env-gated plugin integration tests (pytest --plugin avaz), and examples/voice_agents/avaz_agent.py.
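The env-based configuration listed above could be resolved roughly as follows. This is a minimal sketch, not the plugin's code: `AvazEnvConfig` and `load_avaz_env` are hypothetical names, and the only details taken from the summary are the three variable names and the fact that `AVAZ_AGENT_MODEL_ID` is a UUID.

```python
import os
import uuid
from dataclasses import dataclass
from typing import Mapping, Optional

_REQUIRED = ("AVAZ_API_KEY", "AVAZ_BASE_URL", "AVAZ_AGENT_MODEL_ID")


@dataclass(frozen=True)
class AvazEnvConfig:
    """Hypothetical container; the plugin's real options class may differ."""

    api_key: str
    base_url: str
    agent_model_id: str


def load_avaz_env(env: Optional[Mapping[str, str]] = None) -> AvazEnvConfig:
    """Read the three AVAZ_* variables and fail early if any is unusable."""
    env = os.environ if env is None else env
    missing = [name for name in _REQUIRED if not env.get(name)]
    if missing:
        raise ValueError(f"missing environment variables: {', '.join(missing)}")
    # uuid.UUID raises ValueError when the value is not a valid UUID.
    agent_model_id = str(uuid.UUID(env["AVAZ_AGENT_MODEL_ID"]))
    return AvazEnvConfig(
        api_key=env["AVAZ_API_KEY"],
        base_url=env["AVAZ_BASE_URL"].rstrip("/"),
        agent_model_id=agent_model_id,
    )
```

Validating the UUID up front is a design choice of this sketch; it surfaces a misconfigured model ID before any WebSocket connection is attempted.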

Supersedes #6122 (livekit-plugins-voxcpm), which targeted self-hosted vLLM-Omni rather than the Avaz dashboard API.

Test plan

  • uv run pytest tests/test_avaz_tts.py --unit
  • uv run pytest tests/test_plugin_avaz_tts.py --plugin avaz (requires AVAZ_API_KEY, AVAZ_BASE_URL, AVAZ_AGENT_MODEL_ID)
  • Run examples/voice_agents/avaz_agent.py against a dashboard deployment

Made with Cursor

Provides env-based api_key, base_url, and model_id integration with unit tests, plugin integration tests, and a voice agent example.

Co-authored-by: Cursor <cursoragent@cursor.com>
@tamerrkanak requested a review from a team as a code owner June 17, 2026 11:53
devin-ai-integration[bot]

This comment was marked as resolved.

Devin review on livekit#6136: AvazSynthesizeStream was undefined; stream() and synthesize() now delegate to the existing SynthesizeStream class.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Addresses Devin review on livekit#6136 for optional-dependencies and workspace source ordering.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Move _mark_started() from first audio receipt to the WebSocket text send, matching Cartesia/ElevenLabs/Deepgram plugin conventions.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Guard drain/flush paths against non-dict JSON payloads; clarify intentional text batching and chunk_notation normalization for Devin review on livekit#6136.

Co-authored-by: Cursor <cursoragent@cursor.com>
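The non-dict guard described in the commit above might look like the sketch below. `parse_server_message` is a hypothetical helper name; the plugin's actual drain and flush code is not shown on this page.

```python
import json
from typing import Any, Optional


def parse_server_message(raw: str) -> Optional[dict[str, Any]]:
    """Return the decoded payload only when it is a JSON object."""
    try:
        payload = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    # Valid JSON can still be a list, string, number, or null; calling
    # .get() on any of those would raise, so reject them here.
    return payload if isinstance(payload, dict) else None
```

Callers can then skip `None` results instead of wrapping every field access in its own type check.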
devin-ai-integration[bot]

This comment was marked as resolved.

Replace asyncio.timeout (3.11+) with wait_for to match requires-python >=3.10 and other LiveKit plugins.

Co-authored-by: Cursor <cursoragent@cursor.com>
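For context on the commit above: `asyncio.timeout()` was added in Python 3.11, while `asyncio.wait_for()` is available on 3.10. A minimal sketch of the 3.10-compatible form follows; `recv_with_timeout` is a hypothetical helper, not a function from the plugin.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


async def recv_with_timeout(
    recv: Callable[[], Awaitable[Any]], timeout_s: float
) -> Optional[Any]:
    """Await one message, or return None if nothing arrives in time."""
    try:
        # On 3.11+ this could be written with `async with asyncio.timeout(...)`;
        # wait_for behaves the same here and also runs on 3.10.
        return await asyncio.wait_for(recv(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None
```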
devin-ai-integration[bot]

This comment was marked as resolved.

Pass ws explicitly into _drain_audio so streaming synthesis works after
_connect_and_run_turn refactor; add a mocked unit test that would have
caught the NameError.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Retry warmup in _ensure_warmed when background prewarm failed, and exclude
zero-padding bytes from pcm_accum so audio_total_ms is not inflated.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

tamerrkanak and others added 2 commits June 17, 2026 14:26
Let the base SynthesizeStream._main_task call end_input() once after _run
returns instead of ending the segment explicitly inside _run.

Co-authored-by: Cursor <cursoragent@cursor.com>
Cover replace-vs-append behavior for Avaz dashboard chunk boundaries in
tests/test_avaz_tts.py per Devin review on livekit#6136.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Log server payloads at DEBUG with truncated audio fields, and remove the
unused model_id branch from _resolve_stream_model now that dashboard UUIDs
flow through agent_model_id only.

Co-authored-by: Cursor <cursoragent@cursor.com>
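Truncating audio fields before DEBUG logging, as the commit above describes, could be done along these lines. The helper name, the `"audio"` key, and the 16-character prefix are assumptions for illustration; the real payload field names are not shown on this page.

```python
from typing import Any, Iterable


def redact_audio_fields(
    payload: dict[str, Any],
    audio_keys: Iterable[str] = ("audio",),  # assumed field name
    keep: int = 16,
) -> dict[str, Any]:
    """Return a shallow copy with long audio strings shortened for logging."""
    redacted = dict(payload)
    for key in audio_keys:
        value = redacted.get(key)
        if isinstance(value, str) and len(value) > keep:
            redacted[key] = f"{value[:keep]}...<{len(value)} chars>"
    return redacted
```

Working on a copy keeps the original payload intact for decoding, so logging cannot corrupt the audio path.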
devin-ai-integration[bot]

This comment was marked as resolved.

Signal PEP 561 inline typing support for mypy and other type checkers.

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Ensure base TTS teardown runs after cancelling the prewarm task.

Co-authored-by: Cursor <cursoragent@cursor.com>
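The teardown ordering in the commit above can be illustrated with stand-in classes. `BaseTTS` and `PrewarmingTTS` below are hypothetical and only mimic the shape of a base class with an async `aclose()`; they are not the LiveKit or Avaz classes.

```python
import asyncio
import contextlib
from typing import Optional


class BaseTTS:
    """Stand-in for a base class that owns shared teardown logic."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class PrewarmingTTS(BaseTTS):
    def __init__(self) -> None:
        super().__init__()
        self._prewarm_task: Optional[asyncio.Task] = None

    def prewarm(self) -> None:
        # Placeholder for real background warmup work.
        self._prewarm_task = asyncio.create_task(asyncio.sleep(3600))

    async def aclose(self) -> None:
        task, self._prewarm_task = self._prewarm_task, None
        try:
            if task is not None:
                task.cancel()
                with contextlib.suppress(asyncio.CancelledError):
                    await task
        finally:
            # Runs even if cancelling the prewarm task raises.
            await super().aclose()
```

Putting the base call in `finally` is what guarantees the base teardown is not skipped when the prewarm cleanup path fails.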
devin-ai-integration[bot]

This comment was marked as resolved.

tamerrkanak and others added 2 commits June 17, 2026 16:01
Re-raise APIConnectionError and convert other connect-time exceptions so
base SynthesizeStream retry logic can handle transient network/handshake errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
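A sketch of the error normalization in the commit above. It assumes the other connect-time exceptions are converted into `APIConnectionError`, which the commit message implies but does not state; the exception class here is a local stand-in, and `connect_with_normalized_errors` is a hypothetical helper.

```python
import asyncio
from typing import Any, Awaitable, Callable


class APIConnectionError(Exception):
    """Local stand-in for the connection error the retry logic recognizes."""


async def connect_with_normalized_errors(
    connect: Callable[[], Awaitable[Any]],
) -> Any:
    try:
        return await connect()
    except APIConnectionError:
        raise  # already the type the retry logic understands
    except Exception as exc:
        # Assumed conversion target; asyncio.CancelledError is a
        # BaseException, so cancellation is not swallowed here.
        raise APIConnectionError(f"connect failed: {exc}") from exc
```

Chaining with `from exc` keeps the original network or handshake error visible in tracebacks.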
Keep avaz optional extra after asyncai and align plugin pins with main (>=1.6.1).

Co-authored-by: Cursor <cursoragent@cursor.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Align plugin version and livekit-agents dependency with main after the
>=1.6.1 optional-extra merge.

Co-authored-by: Cursor <cursoragent@cursor.com>

devin-ai-integration[bot] left a comment


Devin Review found 1 new potential issue.

Comment on lines +693 to +698
await asyncio.sleep(0.05)
await _drain_audio(ws, timeout=self._opts.recv_idle_timeout_s)

# Match test_ws_avaz3.py: wait for trailing audio before flush.
await asyncio.sleep(self._opts.post_text_drain_s)
await _drain_audio(ws, timeout=self._opts.recv_idle_timeout_s)

🚩 Fixed sleep delays in _run_turn add minimum ~200ms latency per synthesis turn

Lines 693 and 697 add fixed asyncio.sleep calls (a hard-coded 50ms, plus 150ms by default via post_text_drain_s) between sending text and sending flush. Combined with the drain timeout windows (recv_idle_timeout_s=0.5s), this pads the minimum turn time significantly. The sleeps exist to match the Avaz server's expected protocol timing, but they add latency that may be unnecessary with faster server builds. The timeouts are configurable via constructor parameters, so users can tune them.

Use post_text_drain_s as a WebSocket recv idle window instead of
asyncio.sleep plus a second full drain, removing the ~200ms minimum
padding per turn while keeping trailing-chunk capture configurable.

Co-authored-by: Cursor <cursoragent@cursor.com>
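The idle-window approach in the commit above can be sketched as a receive loop that stops once no message arrives within the window. `drain_until_idle` is a hypothetical name, and `recv` stands for any zero-argument coroutine function such as a WebSocket receive call.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def drain_until_idle(
    recv: Callable[[], Awaitable[Any]], idle_window_s: float
) -> list:
    """Collect messages until none arrives for idle_window_s seconds."""
    chunks = []
    while True:
        try:
            chunks.append(await asyncio.wait_for(recv(), timeout=idle_window_s))
        except asyncio.TimeoutError:
            # The window elapsed with no traffic, so the turn is drained.
            return chunks
```

Unlike a fixed `asyncio.sleep`, the wait here is only paid once at the end of the stream; messages that are already available are returned without delay.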