Last Updated: 2025-12-27
Original Session: 2025-12-22 (Conversation ID: c305f5d6-89a1-4d5b-a311-e081142f51ae)
Phases Completed: 0-31.6
Executive Summary
The Distributed Task Observatory is a production-grade distributed task processing system demonstrating modern microservice architecture, event-driven design, observability, and polyglot development on Kubernetes.
Final System State
15+ Kubernetes pods running and healthy
5 microservices (Node.js/TypeScript, Python, Go x2, Rust)
Refactored monolithic 2710-line main.rs into 7 clean modules:
| File | Lines | Purpose |
| --- | --- | --- |
| main.rs | ~1130 | Entry point, rendering, event loop |
| lib.rs | 50 | Module re-exports |
| types.rs | ~405 | Data structures, App state |
| error.rs | ~460 | Error types, remediation helpers |
| doctor.rs | ~300 | Prerequisite checking, CLI handlers |
| cluster.rs | ~470 | Cluster ops, setup script, browser |
| install.rs | ~140 | Clipboard, install execution |
New Feature: Guided Prerequisites Setup
Automatic detection of Docker, PowerShell, kubectl, kind
Interactive selection of missing prerequisites
Clipboard copy via arboard crate (cross-platform)
Status feedback in TUI
Dependencies Added
| Crate | Purpose |
| --- | --- |
| arboard v3 | Cross-platform clipboard access |
Test Results
| Component | Tests |
| --- | --- |
| TUI (Rust) | 49 pass (3 new install tests) |
| All modules | Full coverage |
Phase 16: TypeScript Migration & Doctor Enhancement (2025-12-25)
Doctor Command Enhancement
The doctor command now displays OS-specific installation commands for missing prerequisites:
odd-dashboard doctor
====================
[OK] Platform: windows-x86_64 (supported)
[OK] Docker: Docker version 24.0.7
[FAIL] PowerShell Core: not found
[OK] kubectl: Client Version: v1.28.3
[FAIL] kind: not found
Installation Commands (windows):
----------------------------------------
PowerShell Core: winget install Microsoft.PowerShell
kind: winget install Kubernetes.kind
Some prerequisites are missing.
Run the commands above, then retry: odd-dashboard doctor
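The detect-then-suggest flow shown in this output can be sketched as follows. This is a minimal illustration in Python, not the Rust code in doctor.rs: the `PREREQUISITES` table, the executable names (`docker`, `pwsh`, `kubectl`, `kind`), and both function names are assumptions. Only the two Windows `winget` commands are taken from the sample output.

```python
import platform
import shutil

# Hypothetical table: display name -> (executable, {os name: install command}).
# Only the two Windows commands below appear in the sample doctor output;
# the executable names are assumed.
PREREQUISITES = {
    "Docker": ("docker", {}),
    "PowerShell Core": ("pwsh", {"windows": "winget install Microsoft.PowerShell"}),
    "kubectl": ("kubectl", {}),
    "kind": ("kind", {"windows": "winget install Kubernetes.kind"}),
}


def find_missing(which=shutil.which):
    """Return the names of prerequisites whose executable is not on PATH."""
    return [name for name, (exe, _) in PREREQUISITES.items() if which(exe) is None]


def install_commands(missing, os_name):
    """Map each missing tool to its install command for os_name, when one is known."""
    commands = {}
    for name in missing:
        _, per_os = PREREQUISITES[name]
        if os_name in per_os:
            commands[name] = per_os[os_name]
    return commands


if __name__ == "__main__":
    os_name = platform.system().lower()  # e.g. "windows", "linux", "darwin"
    for name, command in install_commands(find_missing(), os_name).items():
        print(f"{name}: {command}")
```

Passing `which` as a parameter keeps the detection step testable without touching the real PATH.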
TypeScript Migration
Converted JavaScript test files to TypeScript with strict typing:
| File | Changes |
| --- | --- |
| src/services/gateway/__tests__/index.test.ts | Added interfaces for EventEnvelope, EventProducer, OpenApiSpec |
| src/services/gateway/__tests__/web-smoke.test.ts | Added interfaces for Registry, RegistryEntry, JobPayload, ErrorResponse |
| tests/web-smoke.test.ts | Root-level tests migrated to TypeScript |
New Configuration Files
| File | Purpose |
| --- | --- |
| tsconfig.json | Root TypeScript config with strict mode |
| vitest.config.ts | Root vitest config for tests directory |
| src/services/gateway/vitest.config.ts | Gateway vitest config for tests directory |
| commitlint.config.cjs | Renamed from .js for ESM compatibility |
Package Updates
Added vitest ^1.6.0 to root devDependencies
Added typescript ^5.3.3 to root devDependencies
Set "type": "module" in root package.json
Added test and typecheck scripts to root
Bazel Updates
Updated src/services/gateway/BUILD.bazel to reference TypeScript test files.
Infrastructure Fix
RedisInsight port correction: Updated infra/k8s/redisinsight.yaml to use internal port 5540 (RedisInsight v2) while maintaining external access on port 8001
CHANGELOG Automation
Added @semantic-release/changelog plugin to .releaserc.json for automatic CHANGELOG.md generation on releases.
Test Results
| Component | Tests |
| --- | --- |
| TUI (Rust) | 72 pass |
| Gateway (TypeScript) | 17 pass |
| Root (TypeScript) | 10 pass |
Commits
c85b856 refactor: migrate JavaScript tests to TypeScript with strict typing
aa55c4f feat(tui): show OS-specific install commands in doctor output
ab9a126 fix(infra): correct RedisInsight port and enable changelog generation
Phase 17: Testing Optimizations & CI Hardening (2025-12-25)
Invariants Documentation
Created docs/INVARIANTS.md with formal system guarantees
Contract, cross-platform, coverage, integration, and automation invariants
Coverage Enforcement
coverage-config.json - Externalized thresholds (single source of truth)
scripts/check-coverage.py - Unified validator with self-test capability
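A validator driven by externalized thresholds might look like the sketch below. It is not the contents of scripts/check-coverage.py: the flat `{component: minimum percent}` layout, `check_coverage`, and `self_test` are all assumptions made for illustration. The 80% and 81.12% figures are the web-pty-server numbers reported elsewhere in this document.

```python
def check_coverage(thresholds, measured):
    """Return a list of failure messages; an empty list means every component passes.

    thresholds: {component: minimum percent}, e.g. loaded from a JSON config
                (this flat layout is an assumed schema).
    measured:   {component: actual percent} gathered from coverage reports.
    """
    failures = []
    for component, minimum in thresholds.items():
        actual = measured.get(component)
        if actual is None:
            failures.append(f"{component}: no coverage reported")
        elif actual < minimum:
            failures.append(f"{component}: {actual:.2f}% < {minimum:.2f}%")
    return failures


def self_test():
    """Check the validator against known pass and fail cases before trusting it in CI."""
    assert check_coverage({"web-pty-server": 80.0}, {"web-pty-server": 81.12}) == []
    assert check_coverage({"web-pty-server": 80.0}, {"web-pty-server": 79.5}) != []
    assert check_coverage({"web-pty-server": 80.0}, {}) != []


if __name__ == "__main__":
    self_test()
    print("self-test passed")
```

Treating a missing measurement as a failure keeps a service from silently dropping out of enforcement when its report is not generated.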
CI Optimizations
Added dorny/paths-filter@v3 for reliable change detection
Analyzed Go services and identified that ~70-80% of code in main.go files handles external connections (RabbitMQ, Redis, MongoDB, PostgreSQL) and infinite processing loops - code that cannot be meaningfully unit-tested without refactoring for dependency injection or integration tests.
Decision
Opted for Option B: Maximize unit test coverage for business logic within current architecture, retain realistic thresholds for infrastructure-heavy services, and document the tradeoff.
New Tests Added
metrics-engine/metrics_engine_test.go (11 new tests)
Add DOCKERHUB_USERNAME and DOCKERHUB_TOKEN secrets to GitHub
Push changes to main to trigger image builds
Update INVARIANTS.md to mark I3-I5 as ✅ CI after verification
Phase 20: Web Terminal Modernization (2025-12-25)
BREAKING CHANGE
Replaced the glassmorphic Web Dashboard with an xterm.js-based terminal that mirrors the TUI via WebSocket PTY streaming. This provides 100% visual fidelity with the native TUI.
Architecture: PTY Multiplexer
web-pty-ws (Rust): PTY broker running odd-dashboard in pseudo-terminal
web-ui-http (nginx): Static files + /ws proxy to PTY server
Split K8s deployments: HTTP can roll independently without killing PTY sessions
web-pty-server (Rust)
| Module | Purpose |
| --- | --- |
| main.rs | WebSocket server, cleanup task, metrics endpoint |
| session.rs | Session lifecycle, single-use reconnect tokens |
| protocol.rs | Client/server message types, input classification |
| auth.rs | Bearer token validation (never logged) |
| pty.rs | PTY spawning with xterm-256color, UTF-8, truecolor |
| config.rs | Environment-driven configuration |
Frontend (xterm.js)
| File | Purpose |
| --- | --- |
| terminal.js | WebSocket client, auto-reconnect, resize handling |
| styles.css | Terminal theme matching TUI colors |
| nginx.conf | /ws proxy, /api proxy for fallback stats |
Requirements Implemented
| Req | Feature |
| --- | --- |
| R1 | Split K8s deployments (PTY survival during HTTP rollouts) |
| R2 | Session model with single-use reconnect tokens |
| R3 | Environment-driven resource limits (logged at startup) |
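The single-use reconnect token idea in R2 can be illustrated with a small sketch. It is written in Python for brevity and is not the Rust code in session.rs: the `SessionStore` class and its methods are hypothetical, and rotating to a fresh token on each successful reconnect is an assumption about how such a model is usually built.

```python
import secrets


class SessionStore:
    """Hypothetical in-memory store mapping reconnect tokens to session IDs."""

    def __init__(self):
        self._tokens = {}  # token -> session id

    def issue_token(self, session_id):
        """Create an unguessable token that can reattach to session_id once."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = session_id
        return token

    def redeem(self, token):
        """Consume a token.

        Returns (session_id, new_token) on success, or None when the token is
        unknown or was already used. pop() removes the token, so a replayed
        token can never succeed a second time.
        """
        session_id = self._tokens.pop(token, None)
        if session_id is None:
            return None
        return session_id, self.issue_token(session_id)


if __name__ == "__main__":
    store = SessionStore()
    first = store.issue_token("session-1")
    session_id, second = store.redeem(first)
    print(session_id, store.redeem(first))  # second use of `first` is rejected
```

Because redemption removes the token, a captured token is only useful to an attacker if it is presented before the legitimate client reconnects.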
infra/k8s/web-pty-ws.yaml (PTY server deployment + secret)
src/interfaces/web/terminal.js (xterm.js client)
src/interfaces/web/nginx.conf (proxy config)
Files Removed
infra/k8s/web-ui.yaml (replaced by split deployments)
Visual Regression Tests (CI Integration)
Playwright tests: 6 tests for terminal rendering fidelity
CI job: visual-regression triggers on web_terminal path changes
Docker Compose: Services spun up automatically for tests
Artifacts: playwright-report and snapshots on failure
V7 invariant: Visual regression tests enforced in CI
Coverage
web-pty-server: 81.12% (35 unit tests)
Exceeds V6 threshold of 80%
Phase 31.5: Visual Test Strategy & Server-Side Failure Injection (2025-12-26)
Objective
Implement a tiered visual test strategy and server-side failure injection to enable deterministic fallback UI testing without relying on Playwright's unreliable WebSocket mocking.
Background
PTY_PER_IP_CAP=30 in docker-compose.integration.yml is an operational tuning parameter (not an invariant)
Re-enabled Fallback Dashboard tests with server injection
Test Results
| Component | Tests |
| --- | --- |
| web-pty-server (Rust) | 47 pass |
| Playwright (Tier 1 + 3) | 7 pass |
Key Design Decisions
Server-side injection over Playwright WebSocket mocking (unreliable)
Query param override allows per-request failure without server restart
Frontend passthrough makes tests self-contained and race-free
Nightly-only visual tests until TUI rendering stability proven
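The query-param override decision could work roughly as sketched below. Every name here is hypothetical: the `inject_failure` parameter, the `ALLOWED_FAILURES` set, and the server-wide `default` argument are invented for illustration and are not taken from web-pty-server.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical allow-list of failure modes the server is willing to simulate.
ALLOWED_FAILURES = {"connection"}


def injected_failure(url, default=None):
    """Decide which failure, if any, to simulate for one request.

    A per-request `inject_failure` query parameter (assumed name) takes
    precedence over the server-wide `default`, so a test can trigger the
    fallback UI without restarting the server. Unknown modes are ignored.
    """
    values = parse_qs(urlparse(url).query).get("inject_failure")
    mode = values[0] if values else default
    return mode if mode in ALLOWED_FAILURES else None


if __name__ == "__main__":
    print(injected_failure("ws://localhost/ws?inject_failure=connection"))
    print(injected_failure("ws://localhost/ws"))
```

Validating against an allow-list means an unexpected query value falls back to normal behavior.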
Phase 31.6: CI Docker Build Context Fix (2025-12-27)
Objective
Fix a latent bug where the CI build-images job used service-local build contexts while the Dockerfiles require the repo root as context. Add a prevention script to catch future drift.
Background
Phase 19 introduced CI build-images with service-local contexts
Dockerfiles were later updated to repo-relative COPY paths for start-all.ps1 compatibility
Bug hadn't surfaced because CI hadn't rebuilt images since the Dockerfile changes
Discovery
During INVARIANTS.md review, context mismatch was identified:
# CI used (WRONG):              context: src/services/gateway
# Dockerfile expects (CORRECT): context: . (repo root)
COPY src/services/gateway/...   # Fails with service-local context!
Fix Applied
1. CI Workflow (ci.yml)
# Before (BROKEN)
- service: gateway
  context: src/services/gateway
# After (FIXED)
- service: gateway
  context: .
2. Prevention Script Created
New scripts/validate-dockerfile-context.py:
Conservative approach with explicit service lists
Documents assumptions in script header
Added to CI as "Dockerfile context parity check"
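A context parity check of this kind might be sketched as follows. This is not the contents of scripts/validate-dockerfile-context.py: the function names and the prefix list are assumptions. The prefixes come from the rule stated in this document, that Dockerfiles using `COPY src/...` or `COPY contracts/` need the repo root as context.

```python
# Source prefixes that only resolve when the build context is the repo root
# (assumed from the "Repo Root Required" rule described in this document).
REPO_RELATIVE_PREFIXES = ("src/", "contracts/")


def copy_sources(dockerfile_text):
    """Yield source paths of shell-form COPY instructions.

    Deliberately conservative: skips `COPY --from=...` (copies from another
    build stage, not the context) and does not handle JSON-form COPY or
    line continuations.
    """
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if not parts or parts[0].upper() != "COPY":
            continue
        args = parts[1:]
        if any(arg.startswith("--from=") for arg in args):
            continue
        args = [arg for arg in args if not arg.startswith("--")]
        if len(args) >= 2:
            yield from args[:-1]  # the last argument is the destination


def context_mismatches(dockerfile_text, context):
    """Return COPY sources that need the repo root when `context` is not "."."""
    if context == ".":
        return []
    return [
        src
        for src in copy_sources(dockerfile_text)
        if src.startswith(REPO_RELATIVE_PREFIXES)
    ]


if __name__ == "__main__":
    dockerfile = "FROM node:20\nCOPY src/services/gateway/package.json ./\n"
    print(context_mismatches(dockerfile, "src/services/gateway"))
```

In CI, a non-empty result for any service in the build matrix would fail the job before an image build is attempted.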
3. INVARIANTS.md Updated
B1/B2 now CI-enforced via validate-dockerfile-context.py
Added "Service Context Categories" table
Documented new hazard: "CI/Local Context Drift"
Service Context Categories
| Category | Services | Context | Reason |
| --- | --- | --- | --- |
| Repo Root Required | gateway, processor, web-pty-server | . | Use COPY src/... and COPY contracts/ |
| Service-Local OK | metrics-engine, read-model, web-ui | Service dir | Self-contained, no shared assets |
Files Created
| File | Purpose |
| --- | --- |
| scripts/validate-dockerfile-context.py | CI prevention script for context/path drift |
Files Modified
| File | Changes |
| --- | --- |
| .github/workflows/ci.yml | Fixed contexts, removed cp contracts workaround, added validation step, added web-pty-server and web-ui to matrix |
| docs/INVARIANTS.md | Updated B1-B4 with accurate documentation and CI enforcement |
| README.md | Updated Docker Hub images table with web-pty-server and web-ui |