feat: voice dashboard, barge-in detection, kernel integration#1
Merged
chazmaniandinkle merged 3 commits intomainfrom Apr 13, 2026
Merged
feat: voice dashboard, barge-in detection, kernel integration#1chazmaniandinkle merged 3 commits intomainfrom
chazmaniandinkle merged 3 commits intomainfrom
Conversation
Dashboard:
- Browser-based voice/text chat at /dashboard via WebSocket (/ws/chat)
- Silero VAD v5 in-browser, PCM audio streaming, auto-reconnect
- Three inference providers: MlxProvider (local Gemma), OllamaProvider, CogOSProvider
- Agent loop with tool-calling (speak/send_text) — Gemma 4 E4B works natively
Barge-in:
- SuperWhisper recording detection via filesystem watching (bargein-producer.py)
- speak() returns "held" when user is recording — no zombie queued jobs
- Interrupt context written back to signal file (spoken_pct, delivered_text)
- 150ms poll, pure stdlib Python, zero external dependencies
Kernel integration:
- Agent loop pulls CogOS kernel context per turn (identity, state, barge-in history)
- Exchanges logged to CogOS bus for observation by higher-level agents
- CogOSProvider routes through kernel /v1/chat/completions
Reliability:
- Non-blocking speak() in agent loop (fire-and-forget via bus.act)
- Kokoro pre-warm on server startup (eliminates 60s cold start)
- WebSocket cleanup: cancel pending TTS jobs on disconnect
- Graceful /shutdown endpoint with drain + SIGINT
- Standardized /health with uptime, engine status, queue state
- /capabilities endpoint with dynamic voice lists
New files: agent_loop.py, channels.py, providers.py, dashboard/,
integrations/bargein-producer.py
Modified: http_api.py, server.py, output_queue.py, requirements.txt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove unused imports (typing.Any, WebSocketDisconnect) and unused variable assignments (full, loop). Update /health smoke test assertions to match the new structured response format (engines dict, modalities dict, status can be 'degraded' when no engines loaded). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
/dashboard— browser-based voice/text chat via WebSocket, Silero VAD v5, three inference providers (local Gemma 4, Ollama, CogOS kernel)New files
agent_loop.py— Gemma 4 tool-calling agent loop with kernel context injectionchannels.py— BrowserChannel WebSocket adapter for dashboardproviders.py— MlxProvider, OllamaProvider, CogOSProvider + auto-detectdashboard/— HTML/JS/VAD assets for browser voice chatintegrations/bargein-producer.py— SuperWhisper barge-in signal producerTest plan
python server.py --dashboard --port 7860then openhttp://localhost:7860/dashboardcurl localhost:7860/healthreturns enriched statuscurl localhost:7860/capabilitiesreturns voice listsGenerated with Claude Code