
feat: voice dashboard, barge-in detection, kernel integration #1

Merged
chazmaniandinkle merged 3 commits into main from feat/dashboard-barge-in
Apr 13, 2026

Conversation

@chazmaniandinkle
Contributor

Summary

  • Voice dashboard at /dashboard — browser-based voice/text chat via WebSocket, Silero VAD v5, three inference providers (local Gemma 4, Ollama, CogOS kernel)
  • Barge-in detection — filesystem-based SuperWhisper recording detection, speak() returns "held" during user speech, interrupt context tracking
  • Kernel integration — agent loop pulls CogOS context per turn, exchanges logged to bus for observation by higher-level agents
  • Reliability fixes — non-blocking speak(), Kokoro pre-warm, WebSocket cleanup, graceful shutdown, standardized /health and /capabilities
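
The "held" behavior can be sketched roughly as follows. The signal-file path and JSON schema here are illustrative assumptions, not the actual format used by bargein-producer.py:

```python
import json
from pathlib import Path

# Hypothetical signal file written by the barge-in producer -- the real
# path and schema are not shown in this PR.
SIGNAL_FILE = Path("/tmp/bargein_signal.json")

def user_is_recording() -> bool:
    """True if the barge-in producer reports an active recording."""
    try:
        state = json.loads(SIGNAL_FILE.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return False
    return bool(state.get("recording"))

def speak(text: str) -> str:
    """Queue TTS unless the user is mid-utterance, in which case hold."""
    if user_is_recording():
        return "held"   # nothing is queued; the caller can retry after the user finishes
    # ...enqueue the TTS job here...
    return "queued"
```

Returning "held" instead of silently queueing is what prevents the "zombie queued jobs" the description mentions: audio that would otherwise play on top of (or right after) the user's own speech.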

New files

  • agent_loop.py — Gemma 4 tool-calling agent loop with kernel context injection
  • channels.py — BrowserChannel WebSocket adapter for dashboard
  • providers.py — MlxProvider, OllamaProvider, CogOSProvider + auto-detect
  • dashboard/ — HTML/JS/VAD assets for browser voice chat
  • integrations/bargein-producer.py — SuperWhisper barge-in signal producer

Test plan

  • python server.py --dashboard --port 7860, then open http://localhost:7860/dashboard
  • Text chat works (type message, get AI response + TTS)
  • Voice chat works (allow mic, speak, see transcript, hear response)
  • Barge-in: run bargein-producer.py, then start a SuperWhisper recording during TTS -> playback is interrupted
  • curl localhost:7860/health returns enriched status
  • curl localhost:7860/capabilities returns voice lists

Generated with Claude Code

chazmaniandinkle and others added 3 commits April 13, 2026 17:36
Dashboard:
- Browser-based voice/text chat at /dashboard via WebSocket (/ws/chat)
- Silero VAD v5 in-browser, PCM audio streaming, auto-reconnect
- Three inference providers: MlxProvider (local Gemma), OllamaProvider, CogOSProvider
- Agent loop with tool-calling (speak/send_text) — Gemma 4 E4B works natively
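
Provider auto-detection along these lines is a common pattern: probe each backend in priority order and fall back to the local model. The ports, URLs, and ordering below are assumptions for illustration, not the actual logic in providers.py:

```python
import urllib.request

def detect_provider() -> str:
    """Probe candidate backends in priority order; return the first one
    that answers, falling back to local MLX inference otherwise."""
    candidates = [
        ("cogos", "http://localhost:8000/health"),      # assumed kernel port
        ("ollama", "http://localhost:11434/api/tags"),  # Ollama's default port
    ]
    for name, url in candidates:
        try:
            urllib.request.urlopen(url, timeout=0.5)
            return name
        except OSError:  # URLError / timeout both subclass OSError
            continue
    return "mlx"  # local Gemma via MLX as the always-available fallback
```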

Barge-in:
- SuperWhisper recording detection via filesystem watching (bargein-producer.py)
- speak() returns "held" when user is recording — no zombie queued jobs
- Interrupt context written back to signal file (spoken_pct, delivered_text)
- 150ms poll, pure stdlib Python, zero external dependencies

Kernel integration:
- Agent loop pulls CogOS kernel context per turn (identity, state, barge-in history)
- Exchanges logged to CogOS bus for observation by higher-level agents
- CogOSProvider routes through kernel /v1/chat/completions
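
Routing through an OpenAI-style /v1/chat/completions endpoint typically looks like the sketch below. The base URL, model name, and the choice to inject kernel context as a system message are assumptions, not necessarily what CogOSProvider does:

```python
import json
import urllib.request

KERNEL_URL = "http://localhost:8000/v1/chat/completions"  # assumed endpoint

def build_request(user_msg: str, kernel_context: str) -> urllib.request.Request:
    """Build a chat-completions request with kernel context injected
    as the system prompt (injection shape is an assumption)."""
    payload = {
        "model": "cogos-kernel",  # illustrative model name
        "messages": [
            {"role": "system", "content": kernel_context},
            {"role": "user", "content": user_msg},
        ],
    }
    return urllib.request.Request(
        KERNEL_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```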

Reliability:
- Non-blocking speak() in agent loop (fire-and-forget via bus.act)
- Kokoro pre-warm on server startup (eliminates 60s cold start)
- WebSocket cleanup: cancel pending TTS jobs on disconnect
- Graceful /shutdown endpoint with drain + SIGINT
- Standardized /health with uptime, engine status, queue state
- /capabilities endpoint with dynamic voice lists
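
The fire-and-forget pattern behind the non-blocking speak() can be sketched with asyncio. The names here (tts_job, agent_turn) are hypothetical stand-ins for the real bus.act dispatch:

```python
import asyncio

spoken: list[str] = []

async def tts_job(text: str) -> None:
    """Stand-in for the real synthesis call dispatched via the bus."""
    await asyncio.sleep(0)      # real code would synthesize audio here
    spoken.append(text)

def speak(text: str) -> None:
    """Fire-and-forget: schedule the TTS job and return immediately,
    so the agent loop never blocks on audio playback."""
    asyncio.get_running_loop().create_task(tts_job(text))

async def agent_turn() -> str:
    speak("Working on it...")   # returns instantly
    return "next reply"         # the loop continues while audio plays
```

One caveat with fire-and-forget tasks: asyncio only keeps weak references to them, so production code should hold a reference (e.g. in a set) until the task completes, which also makes the disconnect-time cancellation described above straightforward.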

New files: agent_loop.py, channels.py, providers.py, dashboard/,
           integrations/bargein-producer.py
Modified: http_api.py, server.py, output_queue.py, requirements.txt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove unused imports (typing.Any, WebSocketDisconnect) and unused
variable assignments (full, loop). Update /health smoke test assertions
to match the new structured response format (engines dict, modalities
dict, status can be 'degraded' when no engines loaded).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@chazmaniandinkle chazmaniandinkle merged commit 0277629 into main Apr 13, 2026
7 checks passed