Commit e64c7f9

docs: add bus architecture guide + integration tests
ARCHITECTURE.md (216 lines):
- Core types (CognitiveEvent, CognitiveIntent, EncodedOutput)
- Bus operations (perceive, act, hud) with flow diagrams
- Current modalities (voice, text)
- How to add a new modality (4-step guide)
- Integration points (MCP + HTTP API)

tests/test_bus_wiring.py (6 tests, 5 pass, 1 skip):
- Bus singleton exists and has VoiceModule
- health() and hud() return expected structure
- diagnostics() MCP tool includes bus state
- http_api import check (skipped: missing pysbd dep)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent c2e5610 commit e64c7f9

2 files changed

Lines changed: 382 additions & 0 deletions

ARCHITECTURE.md

Lines changed: 216 additions & 0 deletions
@@ -0,0 +1,216 @@
# Mod3 Architecture: The Modality Bus

The modality bus is the sensorimotor boundary between cognitive agents and physical signals. Agents think in cognitive events ("someone spoke", "say this"); the bus translates between those events and raw bytes (audio and text today; vision and spatial planned).

```
                  ModalityBus
┌──────────────────────────────────────────────┐
│                                              │
│  ┌─────────┐   ┌─────────┐  ┌─────────┐      │
│  │  Voice  │   │  Text   │  │ Vision* │ ...  │
│  │ Module  │   │ Module  │  │ Module  │      │
│  └────┬────┘   └────┬────┘  └────┬────┘      │
│       │             │            │           │
│  ┌────┴─────────────┴────────────┴────┐      │
│  │       Event Log + Listeners        │      │
│  └────┬─────────────┬────────────┬────┘      │
│       │             │            │           │
│  ┌────┴────┐  ┌─────┴─────┐   ┌──┴───┐       │
│  │ Channel │  │  Channel  │   │ ...  │       │
│  │ discord │  │ http-api  │   │      │       │
│  └─────────┘  └───────────┘   └──────┘       │
└──────────────────────────────────────────────┘

* Vision/Spatial are defined in ModalityType but not yet implemented.
```

## Core Types (modality.py)

### Cognitive Primitives

The agent never touches raw bytes. It works with these types instead:

```python
from dataclasses import dataclass
from typing import Any

# ModalityType is the enum defined alongside these classes in modality.py.

@dataclass
class CognitiveEvent:               # Input percept
    modality: ModalityType          # VOICE, TEXT, VISION, SPATIAL
    content: str                    # The meaning (transcribed text, caption, etc.)
    source_channel: str             # Which channel it arrived on
    confidence: float               # Decoder certainty (0.0 - 1.0)
    timestamp: float
    metadata: dict[str, Any]

@dataclass
class CognitiveIntent:              # Output intent (not yet encoded)
    modality: ModalityType | None   # None = let the bus decide
    content: str                    # What to communicate
    target_channel: str             # Specific channel, or "" for bus routing
    priority: int                   # Higher = more urgent
    metadata: dict[str, Any]        # voice, speed, emotion, etc.

@dataclass
class EncodedOutput:                # Raw signal ready for delivery
    modality: ModalityType
    data: bytes                     # WAV, PNG, JSON, etc.
    format: str                     # "wav", "png", "text", etc.
    duration_sec: float
    metadata: dict[str, Any]
```
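
To make the primitives concrete, here is a minimal, self-contained sketch of how an event and an intent might be constructed. The enum string values and all field defaults are assumptions made so the snippet runs on its own; the real definitions live in `modality.py` and may differ.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class ModalityType(Enum):
    # Stand-in for the enum in modality.py; the string values are assumed.
    VOICE = "voice"
    TEXT = "text"
    VISION = "vision"
    SPATIAL = "spatial"


@dataclass
class CognitiveEvent:
    modality: ModalityType
    content: str
    source_channel: str = ""                               # assumed default
    confidence: float = 1.0                                # assumed default
    timestamp: float = field(default_factory=time.time)    # assumed default
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class CognitiveIntent:
    modality: Optional[ModalityType] = None   # None = let the bus decide
    content: str = ""                         # assumed default
    target_channel: str = ""                  # "" = let the bus route
    priority: int = 0                         # assumed default
    metadata: dict[str, Any] = field(default_factory=dict)


# A percept as the agent would see it after decoding.
heard = CognitiveEvent(
    ModalityType.VOICE,
    "what time is it?",
    source_channel="discord-voice",
    confidence=0.92,
)

# An intent with no modality set, leaving the choice to the bus.
reply = CognitiveIntent(
    content="It is noon.",
    metadata={"voice": "bm_lewis", "speed": 1.25},
)
```

Leaving `modality` as `None` and `target_channel` as `""` keeps the agent unaware of how its reply will be delivered, which is the point of the boundary.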

### Abstract Base Classes

A modality module supplies up to three components (gate, decoder, encoder), each of which is optional:

```python
from abc import ABC

class Gate(ABC):
    def check(self, raw: bytes, **kwargs) -> GateResult: ...

class Decoder(ABC):
    def decode(self, raw: bytes, **kwargs) -> CognitiveEvent: ...

class Encoder(ABC):
    def encode(self, intent: CognitiveIntent) -> EncodedOutput: ...

class ModalityModule(ABC):
    modality_type: ModalityType     # Which modality this handles
    gate: Gate | None               # Input filter (None = pass all)
    decoder: Decoder | None         # raw -> CognitiveEvent
    encoder: Encoder | None         # CognitiveIntent -> EncodedOutput
    state: ModuleState              # Live HUD state

    def health(self) -> dict: ...   # Diagnostics
```

The gate is optional: text has no gate (all text passes), while voice uses voice activity detection (VAD) to reject silence.

## The Bus (bus.py)

`ModalityBus` manages module registration, signal routing, and state tracking.

### perceive() -- Input Path

```
raw bytes ──→ Gate.check() ──→ Decoder.decode() ──→ CognitiveEvent
                   │                  │
              (rejected?)      (empty content?)
                   ↓                  ↓
                  None         None (filtered)
```

```python
def perceive(self, raw: bytes, modality: str | ModalityType,
             channel: str = "", **kwargs) -> CognitiveEvent | None: ...
```

1. Resolve the modality module from the registry.
2. If the module has a gate, run `gate.check(raw)` and emit a `modality.gate` bus event. Return `None` if the gate rejects the input.
3. Run `decoder.decode(raw)`. If the decoded content is empty (e.g., a hallucinated transcript was filtered out), emit `modality.filtered` and return `None`.
4. Stamp `source_channel`, emit `modality.input`, and return the event.
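
The input path above can be sketched in a few lines. This is an illustration only, not the code in `bus.py`: `MiniBus`, `Event`, and the tuple-based log are hypothetical stand-ins, the gate is reduced to a boolean callable, and registry lookup (step 1) is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Event:
    # Hypothetical, trimmed-down CognitiveEvent.
    content: str
    source_channel: str = ""


@dataclass
class MiniBus:
    decode: Callable[[bytes], Event]
    gate: Optional[Callable[[bytes], bool]] = None   # None = pass everything
    log: List[Tuple[str, object]] = field(default_factory=list)

    def perceive(self, raw: bytes, channel: str = "") -> Optional[Event]:
        # Step 2: run the gate if there is one and record the outcome.
        if self.gate is not None:
            passed = self.gate(raw)
            self.log.append(("modality.gate", passed))
            if not passed:
                return None
        # Step 3: decode, and drop events whose content came back empty.
        event = self.decode(raw)
        if not event.content:
            self.log.append(("modality.filtered", channel))
            return None
        # Step 4: stamp the channel and record the input.
        event.source_channel = channel
        self.log.append(("modality.input", channel))
        return event


bus = MiniBus(
    decode=lambda raw: Event(raw.decode("utf-8").strip()),
    gate=lambda raw: raw != b"<silence>",   # toy gate standing in for VAD
)
heard = bus.perceive(b"hello", channel="http-api")
```

Callers only need to handle two outcomes, an event or `None`; the reason for a `None` is visible in the event log rather than in the return value.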

### act() -- Output Path

```
CognitiveIntent ──→ resolve modality ──→ Encoder.encode() ──→ EncodedOutput
                                                                    ↓
                                                            channel.deliver()
```

```python
def act(self, intent: CognitiveIntent, channel: str = "",
        blocking: bool = False) -> QueuedJob | EncodedOutput: ...
```

1. Resolve the output modality: use the one set on the intent if present; otherwise infer it from the channel's capabilities (preferring voice over text); otherwise default to text.
2. Encode via the module's encoder. This emits `modality.encode_start` and `modality.output` bus events.
3. If the target channel has a `deliver` callback, call it with the encoded output.
4. If `blocking=True`, return the `EncodedOutput` directly. Otherwise queue the work via `OutputQueueManager` and return a `QueuedJob`.
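
Step 1 is the only branching logic in the output path, so it is worth spelling out. The function below is a sketch of that resolution order under stated assumptions: `resolve_output_modality` and the `channel_caps` mapping are hypothetical names, and the enum is a local stand-in for the one in `modality.py`.

```python
from enum import Enum
from typing import Dict, List, Optional


class ModalityType(Enum):
    # Stand-in for the enum in modality.py; the string values are assumed.
    VOICE = "voice"
    TEXT = "text"


def resolve_output_modality(
    intent_modality: Optional[ModalityType],
    channel: str,
    channel_caps: Dict[str, List[ModalityType]],
) -> ModalityType:
    # 1. An explicit modality on the intent always wins.
    if intent_modality is not None:
        return intent_modality
    # 2. Otherwise pick from what the channel supports, voice before text.
    supported = channel_caps.get(channel, [])
    for preferred in (ModalityType.VOICE, ModalityType.TEXT):
        if preferred in supported:
            return preferred
    # 3. Unknown channel or no usable capability: fall back to text.
    return ModalityType.TEXT


caps = {
    "discord-voice": [ModalityType.VOICE, ModalityType.TEXT],
    "http-api": [ModalityType.TEXT],
}
chosen = resolve_output_modality(None, "discord-voice", caps)
```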

### hud() -- Agent Awareness

```python
def hud(self) -> dict: ...
```

Returns a live snapshot of all modules and channels: current status, active jobs, queue depths, and recent events. It is designed to be injected into the agent's context window so the agent knows what its body is doing.

### Channels

Channels declare which modalities they support. The bus auto-routes output based on channel capabilities.

```python
# send_to_discord is the channel's delivery callback.
bus.register_channel("discord-voice", [ModalityType.VOICE, ModalityType.TEXT],
                     deliver=send_to_discord)
```

### Bus Events

Every boundary crossing is recorded as a `BusEvent` (type, modality, channel, timestamp, data). Listeners can subscribe via `bus.on_event(callback)`, for example to integrate with a ledger. The bus keeps the last 500 events in memory.
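
A bounded in-memory log with listeners can be sketched as follows. The source only states that the last 500 events are kept and that listeners subscribe through `on_event`; the use of `collections.deque(maxlen=...)`, the `EventLog` class, and the `emit`/`recent` method names are assumptions for illustration.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable, Deque, Dict, List


@dataclass
class BusEvent:
    # Field names follow the description above; the defaults are assumed.
    type: str
    modality: str
    channel: str
    timestamp: float = field(default_factory=time.time)
    data: Dict[str, Any] = field(default_factory=dict)


class EventLog:
    def __init__(self, max_events: int = 500) -> None:
        # A deque with maxlen drops the oldest entry once it is full.
        self._events: Deque[BusEvent] = deque(maxlen=max_events)
        self._listeners: List[Callable[[BusEvent], None]] = []

    def on_event(self, callback: Callable[[BusEvent], None]) -> None:
        self._listeners.append(callback)

    def emit(self, event: BusEvent) -> None:
        self._events.append(event)
        for callback in self._listeners:
            callback(event)

    def recent(self, n: int = 10) -> List[BusEvent]:
        return list(self._events)[-n:]


log = EventLog(max_events=3)   # small cap so the eviction is visible
seen: List[BusEvent] = []
log.on_event(seen.append)
for i in range(5):
    log.emit(BusEvent(type=f"e{i}", modality="text", channel="http-api"))
```

Listeners see every event, while the in-memory log only retains the most recent ones, so a ledger subscribed through `on_event` is the durable record.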

## Current Modalities

### Voice (modules/voice.py)

| Component | Class | Implementation |
|-----------|-------|----------------|
| Gate | `VoiceGate` | Silero VAD via `vad.detect_speech()`. Threshold-configurable (default 0.5). Rejects audio with no detected speech. |
| Decoder | `WhisperDecoder` | `mlx_whisper` STT on Apple Silicon. Lazy-loads `mlx-community/whisper-turbo`. Applies `vad.is_hallucination()` filter to reject phantom transcripts. |
| Decoder (legacy) | `PlaceholderDecoder` | Accepts pre-transcribed text. Used by the MCP server for the `speak` tool path where text is already known. |
| Encoder | `VoiceEncoder` | Wraps `engine.synthesize()` (Kokoro, Voxtral, Chatterbox, Spark). Default voice: `bm_lewis` at 1.25x speed. Returns WAV bytes. |

### Text (modules/text.py)

| Component | Class | Implementation |
|-----------|-------|----------------|
| Gate | None | All text passes through. |
| Decoder | `TextDecoder` | Identity transform: `bytes.decode("utf-8")` -> `CognitiveEvent`. |
| Encoder | `TextEncoder` | Identity transform: `intent.content.encode("utf-8")` -> `EncodedOutput`. |

The text module exists so that text is a first-class modality on the bus rather than a special case.
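
The two identity transforms are small enough to show in full. The sketch below uses trimmed-down stand-ins (`TextEvent`, `TextOutput`) instead of the real `CognitiveEvent` and `EncodedOutput`, and the confidence of 1.0 for decoded text is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TextEvent:
    # Hypothetical, trimmed-down CognitiveEvent.
    content: str
    confidence: float


@dataclass
class TextOutput:
    # Hypothetical, trimmed-down EncodedOutput.
    data: bytes
    format: str


def decode_text(raw: bytes) -> TextEvent:
    # Bytes in, meaning out: for text the meaning is the text itself.
    return TextEvent(content=raw.decode("utf-8"), confidence=1.0)


def encode_text(content: str) -> TextOutput:
    # Meaning in, bytes out: the inverse of decode_text.
    return TextOutput(data=content.encode("utf-8"), format="text")


round_trip = decode_text(encode_text("héllo, bus").data)
```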

## Integration Points

### MCP Server (server.py)

The MCP server creates the bus singleton at module level:

```python
_bus = _create_bus()  # ModalityBus with VoiceModule(decoder=PlaceholderDecoder())
```

MCP tools (`speak`, `diagnostics`, `vad_check`) use `_bus` for voice state tracking, health reports, and VAD. The `speak` tool resolves voices through the bus's voice module, sets encoder state, and uses the engine directly for synthesis (the adaptive player handles local playback).

The `diagnostics` tool returns `_bus.health()` and `_bus.hud()`.

### HTTP API (http_api.py)

The HTTP API imports the bus singleton from the MCP server:

```python
from server import _bus as _shared_bus  # Shared instance when co-hosted
_bus = _shared_bus                       # Falls back to fresh ModalityBus if import fails
```
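
The fallback mentioned in the comment above is not shown in the excerpt. One way it could look is sketched below; `load_shared_bus` is a hypothetical helper, the `ModalityBus` class here is an empty stand-in, and catching `AttributeError` alongside `ImportError` is an assumption rather than something the source states.

```python
import importlib


class ModalityBus:
    """Empty stand-in for the real class in bus.py."""


def load_shared_bus(module_name: str = "server", attr: str = "_bus") -> object:
    # Prefer the instance already created by the co-hosted MCP server.
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError):
        # Running standalone: no shared instance, so start with a fresh bus.
        return ModalityBus()


# A module name that does not exist exercises the fallback branch.
standalone_bus = load_shared_bus("no_such_server_module_for_this_sketch")
```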

It ensures both Text and Voice modules are registered, then exposes the bus directly:

| Endpoint | Bus Method |
|----------|------------|
| `GET /v1/bus/hud` | `_bus.hud()` |
| `GET /v1/bus/health` | `_bus.health()` |
| `POST /v1/bus/perceive` | `_bus.perceive(raw, modality, channel)` |
| `POST /v1/bus/act` | `_bus.act(intent, channel, blocking=True)` |
| `GET /health` | includes `_bus.health()` and `_bus.hud()` |

When running with `--all`, both MCP and HTTP share the same bus instance and model cache.

## Adding a New Modality

1. **Create `modules/your_modality.py`** -- implement `Gate`, `Decoder`, `Encoder` (all optional), and a `ModalityModule` subclass that wires them together. See `modules/text.py` for the minimal case or `modules/voice.py` for the full pattern.

2. **Add the modality type** to `ModalityType` in `modality.py` if needed. `VISION` and `SPATIAL` are already defined.

3. **Register with the bus** where it is created (`server.py` and/or `http_api.py`):
   ```python
   bus.register(VisionModule())
   bus.register_channel("webcam-feed", [ModalityType.VISION])
   ```

4. **No routing changes needed.** The bus auto-routes `act()` based on channel capabilities. The HTTP API's `/v1/bus/perceive` and `/v1/bus/act` already accept any registered modality via the `modality` parameter.
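
To illustrate why step 4 holds, here is a toy registry keyed by modality type. Everything in it is hypothetical (`TinyBus`, this `VisionModule`, the string modality keys, the placeholder caption); the only point is that a lookup-based bus needs no new branches when a module is registered.

```python
from typing import Dict, List


class VisionModule:
    """Hypothetical module; a real one would subclass ModalityModule."""

    modality_type = "vision"   # the real code would use ModalityType.VISION

    def decode(self, raw: bytes) -> str:
        # Placeholder "caption": a real decoder would run a vision model.
        return f"<image: {len(raw)} bytes>"


class TinyBus:
    def __init__(self) -> None:
        self._modules: Dict[str, object] = {}
        self._channels: Dict[str, List[str]] = {}

    def register(self, module) -> None:
        self._modules[module.modality_type] = module

    def register_channel(self, name: str, modalities: List[str]) -> None:
        self._channels[name] = list(modalities)

    def perceive(self, raw: bytes, modality: str, channel: str = "") -> str:
        # Routing is a dictionary lookup, so new modalities need no new code here.
        module = self._modules.get(modality)
        if module is None:
            raise KeyError(f"no module registered for {modality!r}")
        return module.decode(raw)


tiny = TinyBus()
tiny.register(VisionModule())
tiny.register_channel("webcam-feed", ["vision"])
caption = tiny.perceive(b"\x89PNG", "vision", channel="webcam-feed")
```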

tests/test_bus_wiring.py

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
"""Integration tests for Mod3 bus wiring.

Verify that the ModalityBus singleton in server.py is correctly
instantiated and wired through to http_api.py, with a VoiceModule
registered and all key APIs returning expected structures.

Run: python3 -m pytest tests/test_bus_wiring.py -v
"""

import json
import os
import sys

import pytest

# Ensure the project root is on the path so imports resolve
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _skip_if_import_fails(module_name: str):
    """Skip the calling test if the given module cannot be imported."""
    try:
        __import__(module_name)
    except ImportError as e:  # ModuleNotFoundError is a subclass of ImportError
        pytest.skip(f"{module_name} unavailable: {e}")


# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------


def test_bus_singleton_exists():
    """server._bus exists and is a ModalityBus instance."""
    from bus import ModalityBus
    from server import _bus

    assert _bus is not None, "_bus should not be None"
    assert isinstance(_bus, ModalityBus), f"_bus should be a ModalityBus, got {type(_bus).__name__}"


def test_bus_has_voice_module():
    """The server bus has a VoiceModule registered under ModalityType.VOICE."""
    from modality import ModalityType
    from modules.voice import VoiceModule
    from server import _bus

    modules = getattr(_bus, "_modules", {})
    assert ModalityType.VOICE in modules, "Bus should have a VOICE module registered"
    voice_module = modules[ModalityType.VOICE]
    assert isinstance(voice_module, VoiceModule), (
        f"VOICE module should be VoiceModule, got {type(voice_module).__name__}"
    )
    # The server uses PlaceholderDecoder (no heavy model deps)
    assert voice_module.gate is not None, "VoiceModule should have a gate"
    assert voice_module.decoder is not None, "VoiceModule should have a decoder"
    assert voice_module.encoder is not None, "VoiceModule should have an encoder"


def test_bus_health_returns_dict():
    """_bus.health() returns a dict with modules, channels, queues, event_count."""
    from server import _bus

    health = _bus.health()
    assert isinstance(health, dict), f"health() should return a dict, got {type(health).__name__}"

    expected_keys = {"modules", "channels", "queues", "event_count"}
    assert expected_keys.issubset(health.keys()), (
        f"health() missing keys: {expected_keys - health.keys()}"
    )

    # modules should contain at least 'voice'
    assert "voice" in health["modules"], "health() modules should include 'voice'"

    voice_health = health["modules"]["voice"]
    assert "has_gate" in voice_health, "voice health should report has_gate"
    assert "has_decoder" in voice_health, "voice health should report has_decoder"
    assert "has_encoder" in voice_health, "voice health should report has_encoder"
    assert voice_health["has_gate"] is True
    assert voice_health["has_decoder"] is True
    assert voice_health["has_encoder"] is True


def test_bus_hud_returns_dict():
    """_bus.hud() returns a dict with modules, channels, queues, recent_events."""
    from server import _bus

    hud = _bus.hud()
    assert isinstance(hud, dict), f"hud() should return a dict, got {type(hud).__name__}"

    expected_keys = {"modules", "channels", "queues", "recent_events"}
    assert expected_keys.issubset(hud.keys()), (
        f"hud() missing keys: {expected_keys - hud.keys()}"
    )

    # modules should contain 'voice' with status info
    assert "voice" in hud["modules"], "hud() modules should include 'voice'"
    voice_hud = hud["modules"]["voice"]
    assert "status" in voice_hud, "voice HUD entry should have 'status'"
    assert voice_hud["status"] == "idle", f"voice status should be 'idle', got {voice_hud['status']}"

    # timestamp should be present and numeric
    assert "timestamp" in hud, "hud() should include a timestamp"
    assert isinstance(hud["timestamp"], (int, float)), "timestamp should be numeric"

    # recent_events should be a list
    assert isinstance(hud["recent_events"], list), "recent_events should be a list"

    # channels and queues should be dicts
    assert isinstance(hud["channels"], dict), "channels should be a dict"
    assert isinstance(hud["queues"], dict), "queues should be a dict"


def test_diagnostics_includes_bus():
    """The diagnostics() MCP tool response includes a 'bus' key with health and hud."""
    from server import diagnostics

    raw = diagnostics()
    data = json.loads(raw)

    assert "bus" in data, "diagnostics() should include a 'bus' key"
    bus_data = data["bus"]

    assert "health" in bus_data, "bus section should include 'health'"
    assert "hud" in bus_data, "bus section should include 'hud'"

    # Verify nested structure is populated
    assert "modules" in bus_data["health"], "bus.health should have 'modules'"
    assert "modules" in bus_data["hud"], "bus.hud should have 'modules'"
    assert "voice" in bus_data["health"]["modules"], "bus health modules should include 'voice'"
    assert "voice" in bus_data["hud"]["modules"], "bus hud modules should include 'voice'"


def test_http_api_imports_bus():
    """http_api.py can import the bus from server without circular import errors."""
    # This import itself is the test: http_api does
    #   from server import _bus as _shared_bus
    # If there's a circular import, this will raise ImportError.
    # http_api also imports engine which may not be available, so we
    # handle that gracefully.
    try:
        from http_api import _bus as http_bus
    except ImportError as e:
        # If engine or another heavy dep is missing, that's OK for this test
        # as long as it's not a circular import error.
        if "circular" in str(e).lower():
            pytest.fail(f"Circular import detected: {e}")
        pytest.skip(f"http_api import failed (likely missing dep): {e}")
    except Exception as e:
        # Some deps (engine, sounddevice, etc.) may fail on import.
        # The key assertion is that it doesn't fail due to circular imports
        # between server.py and http_api.py.
        if "circular" in str(e).lower():
            pytest.fail(f"Circular import detected: {e}")
        pytest.skip(f"http_api import failed (non-circular): {e}")

    from bus import ModalityBus

    assert isinstance(http_bus, ModalityBus), (
        f"http_api._bus should be a ModalityBus, got {type(http_bus).__name__}"
    )
