diff --git a/.agents/sow/current/SOW-0005-20260526-opencode-adapter.md b/.agents/sow/current/SOW-0005-20260526-opencode-adapter.md deleted file mode 100644 index 14523c9..0000000 --- a/.agents/sow/current/SOW-0005-20260526-opencode-adapter.md +++ /dev/null @@ -1,177 +0,0 @@ -# SOW-0005 - opencode adapter (read-only SQLite + cumulative-token deltas + schema-drift tolerance) - -## Status - -Status: in-progress - -Sub-state: active in `current/`. Approved under the operator's blanket Phase-2 backlog sign-off ("deliver them all, any order"). Prerequisites met: SOW-0001 Phase 1 in `done/`; SOW-0004 (codex) merged, which left the catalog idempotent under op re-emission (reused here). Pre-Implementation Gate filled 2026-05-30 (below). - -## Requirements - -### Purpose - -Deliver the opencode adapter end-to-end against the single live SQLite database at `~/.local/share/opencode/opencode.db` (3.9 GB on the operator's workstation). The adapter opens the database **strictly read-only**, polls per-table watermarks with PK-indexed `MAX(id)` queries, synthesizes turns and ops from the opencode `session → message → part` tree, computes per-LLM-op token deltas from opencode's cumulative `step-finish` totals, tolerates schema drift across ~30 historic migrations via `PRAGMA table_info` + dynamic SELECT, registers multi-provider sessions with `provider_alias`, and exposes an auto-discovery probe. Outcome: the operator sees every opencode session — including sub-agents linked via `session.parent_id` — without opencode ever observing a write from ai-viewer. - -### User Request - -From the operator's 2026-05-26 milestone list (recorded in conversation while planning post-Phase-1 work): "Add claude-code, codex, and opencode adapters next, one SOW each, so each can be reviewed and scoped independently." This SOW is the opencode slice of that instruction and inherits its full scope (parser + Scan + Tail + cursor + tests + fixtures + auto-discovery + spec sync). 
- -### Assistant Understanding - -Facts: - -- Opencode stores everything in one SQLite database with WAL companions at `~/.local/share/opencode/opencode.db` (3.9 GB main + 5.5 MB WAL + 32 KB SHM on the operator's workstation; `adapter-opencode.md` §"Source Format"). -- The defining read-safety constraint, recorded as a hard invariant in AGENTS.md and `adapter-opencode.md` §"Read Strategy": open with `mode=ro&_pragma=query_only(true)&_pragma=journal_mode(WAL)&_pragma=busy_timeout(5000)`. NEVER call `PRAGMA wal_checkpoint`, `PRAGMA optimize`, `VACUUM`, `BEGIN EXCLUSIVE`, `ATTACH ... rwc`, or any other write-path. This is the highest-risk adapter on read-safety because opencode's writer is live and concurrent. -- Schema is `session → message → part` with no native `turn` or `op` concept (`adapter-opencode.md` §"Source Format", §"Mapping to Canonical Events"). The adapter synthesizes: Turn = assistant message; LLM-Op = `step-start` → `step-finish` pair; Tool-Op = `tool` part nested under current step; Reasoning-Op = `reasoning` part nested under current step. -- **`step-finish` token counts are CUMULATIVE within a message, not per-step** (`adapter-opencode.md` §"Tool calls and Models — concrete field map", §"Canonical Model Gaps" #3). Observed monotonic sequence (input tokens 17438, 23075, 31713, 35407, …) confirms this. The adapter MUST compute deltas between successive `step-finish` values within the same message before emitting per-op tokens. Mixing cumulative for delta would triple-count tokens silently — this is the top-line defect to prevent. -- Sub-agent linkage is dual and 100% consistent on observed data: `session.parent_id` (authoritative, 1285 child sessions) + `part.data.state.metadata.sessionId` on `tool` parts where `tool='task'` (1274 of 1274 cross-checks match) (`adapter-opencode.md` §"Sub-Agent Linkage"). Adapter prefers `parent_id`. -- IDs are time-prefixed Sonyflake; `id > ''` is monotonic and PK-indexed. 
Cursor uses `MAX(id)` per table as the primary watermark, with `MAX(time_updated)` as a fallback for detecting in-place mutations behind an fsnotify-gated trigger (`adapter-opencode.md` §"Performance"). -- Schema evolves between opencode versions (~30 migrations in `packages/opencode/migration/` per `anomalyco/opencode @ 2b3ddf9`). The adapter queries `PRAGMA table_info(session|message|part|session_message)` at startup, builds dynamic SELECT lists naming only known columns (never `SELECT *`), tolerates missing columns with empty/zero values + one INF log per (table, column) on first occurrence (`adapter-opencode.md` §"Edge Cases" #1). -- Opencode is multi-provider: observed `providerID` values include `llm-netdata-cloud`, `zai-coding-plan`, `minimax-coding-plan`, `deepseek`, `kimi-for-coding`, `openrouter`, `alibaba-coding-plan`. These are user-defined aliases, not canonical vendors. Canonical model adds `sessions.provider_alias` + `catalog_providers` table per SOW-0002; the adapter populates the alias verbatim and emits a best-effort canonical mapping where known (`adapter-opencode.md` §"Multi-provider awareness"). -- Poll cadence: 2 s idle / 500 ms active / 250 ms after `opencode.db-wal` fsnotify mtime change for the next 5 s (`adapter-opencode.md` §"Watch Strategy"). Each delta page is its own short transaction (limit 1000 rows, target <50 ms) to avoid pinning the WAL. -- Phase 1 Foundation (SOW-0001) delivers `internal/canonical/`, `internal/ingest/`, `internal/store/`, `internal/adapters/registry.go`, the `canonical.Adapter` interface, pricing catalog, fixture sanitization tooling, and CI gates that this SOW reuses unchanged. - -Inferences: - -- Initial backfill of 6,778 sessions + 127,345 messages + 585,894 parts (~3.9 GB) is expected in 60-90 s wall-clock per `adapter-opencode.md` §"Performance" (SSD read ~100 MB/s + JSON decode CPU bound at ~50 MB/s in Go with `encoding/json`). 
Page reads at 1000 rows/transaction with `SourceProgress` every 1000 rows so restart resumes. -- Read-only enforcement is layered defense-in-depth: `mode=ro` (OS-level, the file is opened `O_RDONLY` so SQLite cannot upgrade), `_pragma=query_only(true)` (SQL-layer rejection of writes), and a test that asserts an attempted write panics or errors at the adapter's connection-helper boundary. -- The cumulative-token regression test should pin the delta math against a synthetic fixture with deliberately monotonic step-finish values and a committed `.golden.json` so a future change cannot silently revert to raw-value emission. - -Unknowns: - -- Whether the opencode binary on this workstation is currently running concurrently with the test runs (the adapter must be safe either way). Resolved by the read-only-enforcement test which uses a copy-on-write fixture; production runs against the live DB are validated only in the manual-walkthrough acceptance. -- Whether any `session_message.type` beyond `agent-switched` / `model-switched` appears on this workstation. Spec records "treat unknown types as forward-compatibility data and skip with structured WARN"; resolved by a `SELECT DISTINCT type` query during Pre-Implementation Gate authoring. -- Whether the dynamic `PRAGMA table_info`-driven SELECT can be tested against an older opencode schema. Acceptance #8 requires this; the fixture is a small synthetic SQLite file with a subset of columns mimicking an older migration state. - -### Acceptance Criteria - -1. `internal/adapters/opencode/` package compiles, lints clean, and is registered in `internal/adapters/registry.go`. **Verification**: `go build ./...` exits 0; `golangci-lint run` exits 0; `internal/adapters/registry_test.go` asserts the adapter is enumerable by name `"opencode"`. -2. 
**Read-only enforcement asserted in tests.** The adapter's connection helper opens with `mode=ro&_pragma=query_only(true)`; an explicit unit test invokes the helper and then attempts `INSERT`, `UPDATE`, `DELETE`, `PRAGMA wal_checkpoint`, `VACUUM`, and `ATTACH ... 'rwc'` against the returned `*sql.DB` and asserts every one returns an error. **Verification**: `internal/adapters/opencode/readonly_test.go` runs all six attempted-write probes and asserts errors; CI's gates include this test. -3. **Cumulative-token-delta regression test.** A synthetic SQLite fixture contains one assistant message with three `step-finish` parts whose `tokens.input` are `100, 250, 410` (cumulative). The adapter must emit per-LLM-op `tokens_in` of `100, 150, 160` (deltas). **Verification**: `internal/adapters/opencode/tokens_delta_test.go` asserts the exact delta sequence; the golden file pins the values so a regression to raw-value emission fails the gate. -4. Sub-agent linkage is correct: every `session` row with `parent_id` set emits a `SessionStartedEvent` with `Kind='sub_agent'` and `ParentNativeID=parent_id`; `tool` parts where `tool='task'` and `state.metadata.sessionId` is set emit both a tool Op AND a session Op (`Kind='session'`, `ChildSessionNativeID=state.metadata.sessionId`) per `adapter-opencode.md` §"Mapping to Canonical Events" rule for `tool` where `tool='task'`. **Verification**: golden test on a sanitized real-data fixture with one parent + one task-spawned child asserts both edges exist in the emitted event stream. -5. **Schema-drift tolerance proven against an older schema fixture.** A second synthetic SQLite fixture mimics a pre-`20260510033149_session_usage` schema (no `cost`/`tokens_*` columns on `session`). The adapter, reading `PRAGMA table_info` at startup, builds a dynamic SELECT that omits the missing columns, emits empty/zero values in the canonical event, and logs exactly one INF per (table, column) on first occurrence. 
**Verification**: `internal/adapters/opencode/schema_drift_test.go` opens the older-schema fixture, asserts the SELECT does not reference the missing columns (by inspecting the prepared statement or via a query-log probe), and asserts the INF log fires once per missing column then is suppressed. -6. Watermark cursor (per-table `MAX(id)` primary + `MAX(time_updated)` fallback gated by `opencode.db-wal` fsnotify) is durable across restart with zero duplicates and zero gaps; the `time_updated` query runs only after WAL mtime change or every 60 s safety net. **Verification**: integration test that ingests half a fixture, persists cursor, restarts, ingests rest, asserts identical end state to a one-shot ingest; a second test asserts the `MAX(time_updated)` query is NOT issued during steady-state idle polls (probed via a query-counting test driver). -7. Multi-provider sessions register correctly: every distinct opencode `providerID` observed becomes a row in `catalog_providers` with `alias=providerID` and `canonical=`; `sessions.provider_alias` is populated from `data.providerID`. **Verification**: golden test on a fixture with two sessions using different aliased providers asserts both `catalog_providers` rows and both `sessions.provider_alias` values. -8. Auto-discovery probe detects `~/.local/share/opencode/opencode.db` (and `$OPENCODE_DB` when set) at startup, opens read-only, queries `__drizzle_migrations` to record the schema hash, and exposes `(session_count, message_count, part_count, latest_migration_name)` in `/api/health`. **Verification**: unit test on the probe with a fixture DB; manual run on the operator's workstation registers the real source and `/api/sources` reports the live counts. - -## Analysis - -Sources checked: - -- `.agents/sow/specs/adapter-opencode.md` (full spec, all sections) — primary contract. 
-- `.agents/sow/specs/canonical-events.md` — target event types, including `Kind='sub_agent'`, `OpKind='reasoning'`, `provider_alias` field on SessionStartedEvent, indefinite-`running` SessionStatus (opencode never finalizes; only archives). -- `.agents/sow/specs/data-model.md` — SQLite schema, especially `sessions.provider_alias`, `catalog_providers`, cross-format compatibility matrix. -- `.agents/sow/done/SOW-0002-20260526-cross-format-data-model-analysis.md` — analysis context confirming opencode's cumulative-token quirk and read-only invariant. -- `.agents/sow/current/SOW-0001-phase-1-foundation.md` — infrastructure the adapter plugs into. -- Real evidence on the operator's workstation: `~/.local/share/opencode/opencode.db` (3.9 GB, 6778 sessions, 127345 messages, 585894 parts, 3985 session_messages, 20 migrations applied through `20260511000411_data_migration_state` as of 2026-05-26). -- Upstream source at `anomalyco/opencode @ 2b3ddf9f34546b9bcea25ec8e0ff57e2811c4537` — `packages/opencode/src/storage/db.ts`, `packages/opencode/src/session/session.sql.ts`, `packages/opencode/src/session/message-v2.ts`, `packages/core/src/session-message.ts`, `packages/opencode/migration/` per `adapter-opencode.md` §"References". - -Current state: - -- SOW-0001 (in-progress) delivers canonical event types, SQLite store, ingest pipeline, adapter registry, pricing catalog, fixture sanitization tooling, CI gates, and the ai-agent v3/v2 adapters end-to-end. This SOW assumes that infrastructure is in place; if SOW-0001 is not yet completed, this SOW remains in `pending/`. -- No `internal/adapters/opencode/` package exists yet (the bootstrap only documented the format). -- The canonical model already absorbed opencode's gaps in SOW-0002 (cache tokens, reasoning tokens, provider alias, catalog_providers table); this SOW does NOT propose new canonical changes — only adapter implementation. 
- -Risks: - -- **R1 — Read-only DB safety (CRITICAL).** Opencode's writer is live and concurrent; any accidental write from ai-viewer corrupts the operator's primary AI coding tool. Mitigation: layered defense — OS-level `mode=ro`, SQL-layer `query_only(true)`, explicit test asserting six write paths all error (acceptance #2). The DSN string is encoded as a constant in the adapter and validated by a unit test that pattern-matches the connection string to guarantee the read-only PRAGMAs are present. No code path in the adapter ever takes a `*sql.Tx` that begins anything other than `BEGIN DEFERRED`. -- **R2 — Cumulative-token miscount (CRITICAL).** The most likely silent defect; misses would triple-count tokens and corrupt every cost calculation. Mitigation: acceptance #3 regression test with a frozen golden file; the test fixture is named explicitly so future code review surfaces the contract; per-LLM-op delta computation is implemented in a single named function (`computeStepDeltas`) with its own table-driven unit tests covering reset-on-message-boundary, missing intermediate step-finish (cancelled step), and out-of-order observation. -- **R3 — Schema drift between opencode versions.** ~30 historic migrations; older rows lack newer columns. Mitigation: dynamic `PRAGMA table_info`-driven SELECT (acceptance #5), tolerance for missing columns with structured INF on first occurrence, and a `schema_hash` field in the cursor that detects new migrations and triggers a re-probe without resetting the cursor (only when a depended-on column disappears does the adapter perform a full re-ingest). -- **R4 — Multi-GB DB query latency.** Full-table `MAX(time_updated)` scans on the `part` table (585k rows, 2.3 GB) take 400-800 ms cold. Mitigation: PK-indexed `MAX(id)` is the primary watermark; the expensive `MAX(time_updated)` query runs only when fsnotify on `opencode.db-wal` signals activity OR every 60 s as a safety net (acceptance #6). 
The 1000-row page limit + sub-1 s transactions keep the WAL from growing unboundedly. -- **R5 — Sensitive content in fixtures.** Every real opencode message/part carries operator data — session titles, directories, prompts, tool outputs, patch file paths. Mitigation: every committed fixture under `testdata/opencode/` is a synthetic SQLite file constructed by a fixture-builder utility, NOT a copy of the operator's DB. The fixture-builder writes only sanitized data shaped like the real schema; the live DB is consulted only for shape verification (via `PRAGMA table_info` printed during Pre-Implementation Gate authoring) and never copied into `testdata/`. - -## Pre-Implementation Gate - -Filled 2026-05-30. A readiness-briefing subagent re-probed the live `opencode.db` read-only (`immutable=1`); load-bearing claims (the read-only DSN, acceptance #1-8) were re-verified against ground truth before this gate. - -### Problem / model - -Additive feature: a new `opencode` adapter that projects OpenCode's **SQLite** session store onto the canonical event model. Unlike the four JSONL/file adapters (byte-offset cursor + fsnotify-on-append), opencode keeps everything in one SQLite DB (`~/.local/share/opencode/opencode.db`, WAL, Drizzle-managed, ~4.36 GB live, 20 migrations). So the read model is SQL delta-queries + a watermark cursor + DB polling, not line streaming. The adapter is a read-only projection from `session`/`message`/`part`/`session_message` rows → canonical events, reusing the registry/payloads/golden patterns from `codex`/`claude_code` but replacing parser+scanner+stream+tailer with a query layer + poll loop. - -### Evidence reviewed - -- `.agents/sow/specs/adapter-opencode.md` (567 lines, evidence-driven) — primary contract. 
-- Live DB re-probe (read-only, 2026-05-30): `session` 7,775 (6,335 root / 1,440 child / 2 archived), `message` 144,551 (assistant `data` keys role/time/error/parentID/modelID/providerID/mode/path/cost/tokens), `part` 667,335 (tool 230,606 / step-start 133,186 / step-finish 132,595 / text 83,367 / reasoning 75,361 / patch 11,686 / compaction 495 / file 22 / retry 17), `session_message` 5,975 (**only** agent-switched + model-switched). `event`/`event_sequence` = 0 (ignore). Latest migration `20260510033149_session_usage`. -- `internal/store/store.go:303-318` — `buildDSN` forces `foreign_keys(on)`+`busy_timeout(5000)`, readers add `query_only(true)`; this is ai-viewer's OWN-DB reader (not for the external opencode.db). `internal/canonical/events.go` — `TurnFinalizedEvent` carries `TokensCacheRead/Write`+`CostUSD` (per-turn cache accounting works) but NO `Extras` (turn-extras unreachable — SOW-0021); `OpFinalizedEvent` carries cache tokens + ProviderAlias; `KindSubAgent`/`OpReasoning`/`OpSession`/`OpCompaction` exist (no canonical change needed). -- `internal/adapters/{codex,claude_code}/` — structural template; `registry.go` self-registration; codex `discovery.go`/golden harness. -- SOW-0005 Acceptance #1-8 + Risks R1-R5. - -### Affected contracts & surfaces - -- **NEW** package `internal/adapters/opencode/` (SQLite-backed; see structural map). -- **ADDITIVE** `cmd/ai-viewer-ingest/sources.go`: a 5th auto-discovery probe (`$OPENCODE_DB` else `~/.local/share/opencode/opencode.db`) + a `__drizzle_migrations` schema-hash + count helper (acceptance #8); blank-import for `init()` registration. -- **ADDITIVE** `testdata/opencode//` — synthetic SQLite fixtures built by a fixture-builder. -- **NO** change to `internal/canonical/` (all target fields exist), `internal/ingest/` (catalog already idempotent post-SOW-0004), `internal/store/` schema, or sibling adapters. - -### Spec deltas (LANDED before tests/code, committed with this gate) - -1. 
adapter-opencode.md task→session rule (was "TBD; emit both"): ratified to **emit both** (tool Op + session Op; session op is the topology parent). -2. adapter-opencode.md per-turn token rule (was "to be verified"): firmed to **delta from the previous assistant message's cumulative totals**, with an explicit implementer-verify-on-live-DB note (the step-finish cumulative pattern is verified; the message-level pattern is the analogous one level up, not yet independently confirmed). - -### Patterns to reuse vs differ (briefing §B) - -- **Reuse**: `init()→adapters.Register("opencode", Factory)`; `Adapter` struct + compile-time `var _ canonical.Adapter`; `Name()/Format()/ParseCursor()`; Scan-then-Tail single-thread lifecycle; fail-soft `onError`; codex `discovery.go` → the auto-discovery probe; the golden_test harness shape (seed a `.db` instead of files). -- **Differ**: parser+scanner+stream+tailer → a **`store.go` query layer** (prepared delta SQL + `database/sql` rows) + a **poll loop** (2 s idle / 500 ms active / 250 ms post-WAL-fsnotify; coarse fsnotify on `opencode.db-wal` as a wakeup hint only). `payloads.go` emits `opencode-sqlite://…?part_id=&field=…` URIs (spec 420-426), not `file://`. `mapper.go` keeps turn/op-synthesis but walks message+part trees. - -### Cursor model (decision) - -Per-table two-watermark JSON: `{version, schema_hash, tables:{session,message,part,session_message:{max_id, max_time_updated}}}`. Primary watermark = `MAX(id)` (the 30-char Sonyflake PK is time-prefixed + monotonic + PK-indexed → `WHERE id > :last` is cheap). `MAX(time_updated)` (13-digit ms, **unindexed** — a part-table full scan ~400-800 ms) catches in-place mutations and is gated to run only after an `opencode.db-wal` mtime change or a 60 s safety net. Delta page: `… WHERE time_updated>:u OR (time_updated=:u AND id>:id) ORDER BY time_updated,id LIMIT 1000`, page until empty. 
Scan→Tail resumes from persisted watermarks; re-reads are absorbed by the ingester's idempotent upserts + the now-idempotent catalog. - -### Canonical mapping (briefing §D) - -Session=`session` row; Turn=assistant `message` (seq by `(time_created,id)`); LLM-Op=`step-start`→`step-finish`; Tool-Op=`tool` part (namespace derived, e.g. `github_get_file_contents`→`github`/`get_file_contents`); Reasoning-Op=`reasoning` part; text/patch are not ops (text→presenter read; patch→op extras); compaction→INF LogEntry; retry→WRN LogEntry. Terminal status: assistant `data.time.completed` NULL → `running`; `data.error` → `failed` (ErrorClass=`data.error.name`); `time_archived` → `completed`; else stays `running` (no per-session terminal, like claude-code/codex). **Cumulative-token delta (AC#3, verified):** step-finish `tokens.*` are cumulative within a message → emit per-op deltas via one `computeStepDeltas`. Sub-agent (AC#4): `parent_id` child → `Kind=sub_agent`+ParentNativeID; `tool='task'` with `state.metadata.sessionId` → tool Op + session Op. Multi-provider (AC#7): `ProviderAlias=data.providerID` verbatim; `Provider`=best-effort canonical (default=alias). Turn-extras (cwd etc.) deferred to SOW-0021 (no canonical turn Extras); per-turn cache tokens DO work via `TurnFinalizedEvent`. - -### Risk & blast radius - -Purely additive (new package + registry blank-import + additive `sources.go` probe); no canonical/ingest/store change (target fields exist; catalog idempotent post-SOW-0004). **R1 (CRITICAL) read-safety:** the opencode writer is live + concurrent on a 4.36 GB DB — layered defense: own helper opens `mode=ro` (OS `O_RDONLY`) + `query_only(true)` + `busy_timeout`, never calls any write-path pragma, each delta page in its own short `BEGIN DEFERRED` (<1 s) to avoid pinning the WAL / blocking the writer's checkpoint; acceptance #2's six write-probes pin it. **R2 (CRITICAL)** cumulative-token miscount → `computeStepDeltas` + AC#3 golden. 
**R4** part-table `MAX(time_updated)` full scan → gated by `MAX(id)` primary + WAL-mtime. **R5** fixtures are synthetic SQLite (never copy the operator DB). - -### Sensitive-data plan - -Every committed fixture under `testdata/opencode/` is a synthetic SQLite file built by a fixture-builder writing only sanitized, schema-shaped data (synthetic titles/dirs/prompts; `git@github.com:example/example.git`; no operator PII). The live DB is consulted ONLY for shape verification (`PRAGMA table_info`), never copied. `scripts/scan-secrets.sh` is the net. - -### Implementation plan (chunked; each = spec → failing tests → subagent impl → gates → integrate) - -- **Chunk A** — read-only connection helper (own DSN constant + the 6 write-probe test, AC#2) + the watermark `cursor.go` + typed row/`data`-JSON structs + `store.go` schema introspection (`PRAGMA table_info` → dynamic SELECT, AC#5). -- **Chunk B** — `mapper.go` row→event synthesis: session/turn/op trees, terminal status, `computeStepDeltas` (AC#3), reasoning/tool/patch/compaction/retry, sub_agent + task→session linkage (AC#4), provider alias (AC#7). -- **Chunk C** — `store.go` delta queries + the poll-loop tailer (WAL-mtime fsnotify hint + idle/active cadence; `MAX(time_updated)` gating, AC#6). -- **Chunk D** — `payloads.go` (`opencode-sqlite://` URIs) + `adapter.go` (Scan/Tail/ParseCursor + `init()`) + the `sources.go` auto-discovery probe + `__drizzle_migrations` schema-hash/counts (AC#8) + registry_test. -- **Chunk E** — fixture-builder + synthetic-DB golden scenarios (happy, sub-agent+task-child, multi-provider, old-schema-drift, cumulative-token) + restart/resume + idle-no-MAX(time_updated) integration tests + fuzz on the `data`-JSON decode. - -### Validation plan (acceptance → tests) - -#1 registry_test asserts `"opencode"`. #2 `readonly_test.go` (6 write-probes error). #3 `tokens_delta_test.go` (100/250/410 → 100/150/160). #4 golden parent+task-child (both edges). 
#5 `schema_drift_test.go` (pre-`20260510033149` fixture, dynamic SELECT omits missing cols, one INF/col). #6 restart/resume integration + query-counter (no idle `MAX(time_updated)`). #7 multi-provider golden (two catalog_providers + provider_alias). #8 `cmd/ai-viewer-ingest/sources_test.go` (probe registers a fixture DB, reports counts + latest migration). Plus a `data`-JSON fuzz target. - -### Artifact impact plan - -Producer: the adapter's Scan (watermark backfill) + Tail (poll loop). Refresh: WAL-mtime fsnotify hint / poll cadence → delta query. Repair: cursor corruption → re-read from zero watermark (idempotent upserts absorb). Served by the existing presenter/REST + the now-idempotent catalog; `/api/sources` + `/api/health` report the opencode source + (session/message/part counts, latest migration) (AC#8). No DB migration (ai-viewer schema unchanged). - -### Open decisions — DECIDED by CTO (recorded) - -1. **Connection helper:** the adapter uses its OWN read-only helper (DSN `mode=ro&_pragma=query_only(true)&_pragma=busy_timeout(5000)`, `MaxOpenConns(2)`), NOT `store.OpenReader` (that targets ai-viewer's own DB + forces `foreign_keys(on)`/pool 8). The helper's DSN is a tested constant; acceptance #2's six write-probes are its contract. `foreign_keys` is immaterial for a read-only connection. **Decided.** -2. **Poll cadence:** 2 s idle / 500 ms active / 250 ms floor after a WAL-mtime fsnotify event (ratify spec). **Decided.** -3. **Cursor granularity:** per-table `MAX(id)` (primary, PK-indexed) + `MAX(time_updated)` (gated by WAL-mtime / 60 s). **Decided.** -4. **Turn-extras:** opencode per-turn extras (cwd, etc.) are DEFERRED to SOW-0021 (no canonical turn `Extras` carrier); do NOT half-build a write path; per-turn cache tokens use the existing `TurnFinalizedEvent` fields. State the limitation. **Decided.** -5. **task→session op:** emit BOTH the tool Op and the session Op (session = topology parent). **Decided** (spec ratified above). -6. 
**Provider alias:** `ProviderAlias = data.providerID` verbatim; `Provider` = best-effort canonical (default = alias unchanged). **Decided.** - -Open (implementer-verify, not blocking): the message-level per-turn cumulative-token pattern (spec row firmed but flagged for live-DB confirmation before pinning the golden); whether `immutable=1` is ever used in production (NO — production uses `mode=ro` to respect the live WAL; `immutable=1` only for static test fixtures). - -## Implementation - -(Empty placeholder. Filled as chunks complete.) - -## Validation - -(Empty placeholder. Filled at SOW close.) - -## Reviews - -(Empty placeholder. Filled as external reviewers run.) - -## Outcome - -Pending. - -## Lessons / Follow-Ups - -Pending. diff --git a/.agents/sow/done/SOW-0005-20260526-opencode-adapter.md b/.agents/sow/done/SOW-0005-20260526-opencode-adapter.md new file mode 100644 index 0000000..7ce22ec --- /dev/null +++ b/.agents/sow/done/SOW-0005-20260526-opencode-adapter.md @@ -0,0 +1,341 @@ +# SOW-0005 - opencode adapter (read-only SQLite + cumulative-token deltas + schema-drift tolerance) + +## Status + +Status: completed + +Sub-state: completed and merged (the 5th/final source adapter). Delivered under the operator's blanket Phase-2 backlog sign-off ("deliver them all, any order"). 5 chunk commits + 7 review-fix commits; 8 external-review rounds converged (codex + glm + minimax merge-ready); PR opened + self-merged per the branch-protection workflow. Moved to `done/`. Prerequisites had been met: SOW-0001 Phase 1 in `done/`; SOW-0004 (codex) merged, which left the catalog idempotent under op re-emission (reused here). Pre-Implementation Gate filled 2026-05-30 (below). Deferred follow-ups filed: SOW-0023/0024/0025. + +## Requirements + +### Purpose + +Deliver the opencode adapter end-to-end against the single live SQLite database at `~/.local/share/opencode/opencode.db` (3.9 GB on the operator's workstation). 
The adapter opens the database **strictly read-only**, polls per-table watermarks with PK-indexed `MAX(id)` queries, synthesizes turns and ops from the opencode `session → message → part` tree, computes per-LLM-op token deltas from opencode's cumulative `step-finish` totals, tolerates schema drift across ~30 historic migrations via `PRAGMA table_info` + dynamic SELECT, registers multi-provider sessions with `provider_alias`, and exposes an auto-discovery probe. Outcome: the operator sees every opencode session — including sub-agents linked via `session.parent_id` — without opencode ever observing a write from ai-viewer. + +### User Request + +From the operator's 2026-05-26 milestone list (recorded in conversation while planning post-Phase-1 work): "Add claude-code, codex, and opencode adapters next, one SOW each, so each can be reviewed and scoped independently." This SOW is the opencode slice of that instruction and inherits its full scope (parser + Scan + Tail + cursor + tests + fixtures + auto-discovery + spec sync). + +### Assistant Understanding + +Facts: + +- Opencode stores everything in one SQLite database with WAL companions at `~/.local/share/opencode/opencode.db` (3.9 GB main + 5.5 MB WAL + 32 KB SHM on the operator's workstation; `adapter-opencode.md` §"Source Format"). +- The defining read-safety constraint, recorded as a hard invariant in AGENTS.md and `adapter-opencode.md` §"Read Strategy": open with `mode=ro&_pragma=query_only(true)&_pragma=journal_mode(WAL)&_pragma=busy_timeout(5000)`. NEVER call `PRAGMA wal_checkpoint`, `PRAGMA optimize`, `VACUUM`, `BEGIN EXCLUSIVE`, `ATTACH ... rwc`, or any other write-path. This is the highest-risk adapter on read-safety because opencode's writer is live and concurrent. +- Schema is `session → message → part` with no native `turn` or `op` concept (`adapter-opencode.md` §"Source Format", §"Mapping to Canonical Events"). 
The adapter synthesizes: Turn = assistant message; LLM-Op = `step-start` → `step-finish` pair; Tool-Op = `tool` part nested under current step; Reasoning-Op = `reasoning` part nested under current step. +- **`step-finish` token counts are CUMULATIVE within a message, not per-step** (`adapter-opencode.md` §"Tool calls and Models — concrete field map", §"Canonical Model Gaps" #3). Observed monotonic sequence (input tokens 17438, 23075, 31713, 35407, …) confirms this. The adapter MUST compute deltas between successive `step-finish` values within the same message before emitting per-op tokens. Mistaking cumulative totals for per-step deltas would silently triple-count tokens; this is the top-line defect to prevent. +- Sub-agent linkage is dual and 100% consistent on observed data: `session.parent_id` (authoritative, 1285 child sessions) + `part.data.state.metadata.sessionId` on `tool` parts where `tool='task'` (1274 of 1274 cross-checks match) (`adapter-opencode.md` §"Sub-Agent Linkage"). Adapter prefers `parent_id`. +- IDs are time-prefixed Sonyflake; `id > ''` is monotonic and PK-indexed. Cursor uses `MAX(id)` per table as the primary watermark, with `MAX(time_updated)` as a fallback for detecting in-place mutations behind an fsnotify-gated trigger (`adapter-opencode.md` §"Performance"). +- Schema evolves between opencode versions (~30 migrations in `packages/opencode/migration/` per `anomalyco/opencode @ 2b3ddf9`). The adapter queries `PRAGMA table_info(session|message|part|session_message)` at startup, builds dynamic SELECT lists naming only known columns (never `SELECT *`), tolerates missing columns with empty/zero values + one INF log per (table, column) on first occurrence (`adapter-opencode.md` §"Edge Cases" #1). +- Opencode is multi-provider: observed `providerID` values include `llm-netdata-cloud`, `zai-coding-plan`, `minimax-coding-plan`, `deepseek`, `kimi-for-coding`, `openrouter`, `alibaba-coding-plan`. These are user-defined aliases, not canonical vendors. 
Canonical model adds `sessions.provider_alias` + `catalog_providers` table per SOW-0002; the adapter populates the alias verbatim and emits a best-effort canonical mapping where known (`adapter-opencode.md` §"Multi-provider awareness"). +- Poll cadence: 2 s idle / 500 ms active / 250 ms after `opencode.db-wal` fsnotify mtime change for the next 5 s (`adapter-opencode.md` §"Watch Strategy"). Each delta page is its own short transaction (limit 1000 rows, target <50 ms) to avoid pinning the WAL. +- Phase 1 Foundation (SOW-0001) delivers `internal/canonical/`, `internal/ingest/`, `internal/store/`, `internal/adapters/registry.go`, the `canonical.Adapter` interface, pricing catalog, fixture sanitization tooling, and CI gates that this SOW reuses unchanged. + +Inferences: + +- Initial backfill of 6,778 sessions + 127,345 messages + 585,894 parts (~3.9 GB) is expected in 60-90 s wall-clock per `adapter-opencode.md` §"Performance" (SSD read ~100 MB/s + JSON decode CPU bound at ~50 MB/s in Go with `encoding/json`). Page reads at 1000 rows/transaction with `SourceProgress` every 1000 rows so restart resumes. +- Read-only enforcement is layered defense-in-depth: `mode=ro` (OS-level, the file is opened `O_RDONLY` so SQLite cannot upgrade), `_pragma=query_only(true)` (SQL-layer rejection of writes), and a test that asserts an attempted write panics or errors at the adapter's connection-helper boundary. +- The cumulative-token regression test should pin the delta math against a synthetic fixture with deliberately monotonic step-finish values and a committed `.golden.json` so a future change cannot silently revert to raw-value emission. + +Unknowns: + +- Whether the opencode binary on this workstation is currently running concurrently with the test runs (the adapter must be safe either way). Resolved by the read-only-enforcement test which uses a copy-on-write fixture; production runs against the live DB are validated only in the manual-walkthrough acceptance. 
+- Whether any `session_message.type` beyond `agent-switched` / `model-switched` appears on this workstation. Spec records "treat unknown types as forward-compatibility data and skip with structured WARN"; resolved by a `SELECT DISTINCT type` query during Pre-Implementation Gate authoring. +- Whether the dynamic `PRAGMA table_info`-driven SELECT can be tested against an older opencode schema. Acceptance #8 requires this; the fixture is a small synthetic SQLite file with a subset of columns mimicking an older migration state. + +### Acceptance Criteria + +1. `internal/adapters/opencode/` package compiles, lints clean, and is registered in `internal/adapters/registry.go`. **Verification**: `go build ./...` exits 0; `golangci-lint run` exits 0; `internal/adapters/registry_test.go` asserts the adapter is enumerable by name `"opencode"`. +2. **Read-only enforcement asserted in tests.** The adapter's connection helper opens with `mode=ro&_pragma=query_only(true)` (layered OS + SQL guard). An explicit unit test invokes the helper and probes `INSERT`, `UPDATE`, `DELETE`, `PRAGMA wal_checkpoint`, `VACUUM`, and `ATTACH ... 'rwc'`, asserting each probe **cannot mutate `opencode.db`** plus a byte-untouched read-back. **Verified ground truth (modernc.org/sqlite, Chunk A):** INSERT/UPDATE/DELETE/VACUUM return an error (`attempt to write a readonly database`); `PRAGMA wal_checkpoint(TRUNCATE)` is a no-op under `mode=ro` (returns the `busy=1, log=-1, checkpointed=-1` "nothing checkpointed" sentinel — asserted); `ATTACH ... 'rwc'` attaches a SEPARATE side file (never `opencode.db`) and `query_only(true)` blocks the write INTO it (the side `CREATE TABLE` errors — asserted). Asserting "all six error" would pin a false mechanism; the test asserts each probe's precise no-mutation property instead. **Verification**: `internal/adapters/opencode/conn_test.go` runs all six probes + the read-back; CI's gates include it. +3. 
**Cumulative-token-delta regression test.** A synthetic SQLite fixture contains one assistant message with three `step-finish` parts whose `tokens.input` are `100, 250, 410` (cumulative). The adapter must emit per-LLM-op `tokens_in` of `100, 150, 160` (deltas). **Verification**: `internal/adapters/opencode/tokens_delta_test.go` asserts the exact delta sequence; the golden file pins the values so a regression to raw-value emission fails the gate. **DONE (Chunk E):** `testdata/opencode/e_cumulative_tokens/{fixture.sql,expected.jsonl}` pins cumulative 100/250/410/400 → per-op deltas 100/150/160/0 (the 4th decreases → clamps to 0); `golden_resume_test.go:TestGoldenInvariant_ECumulativeTokens` asserts the exact sequence independent of the golden bytes (so a `-update-golden` cannot launder a regression). Hand-verified against `expected.jsonl` lines 4,6,8,10. +4. Sub-agent linkage is correct: every `session` row with `parent_id` set emits a `SessionStartedEvent` with `Kind='sub_agent'` and `ParentNativeID=parent_id`; `tool` parts where `tool='task'` and `state.metadata.sessionId` is set emit both a tool Op AND a session Op (`Kind='session'`, `ChildSessionNativeID=state.metadata.sessionId`) per `adapter-opencode.md` §"Mapping to Canonical Events" rule for `tool` where `tool='task'`. **Verification**: golden test on a sanitized real-data fixture with one parent + one task-spawned child asserts both edges exist in the emitted event stream. **DONE (Chunk E):** `testdata/opencode/b_subagent_task/{fixture.sql,expected.jsonl}` (synthetic, not real-data) pins BOTH edges; `golden_invariants_test.go:TestGoldenInvariant_BSubagentTask` asserts the session Op (`ChildSessionNativeID=ses_child01`) AND the tool Op (`name=task`) in the same turn, plus the child `Kind=sub_agent`/`ParentNativeID=ses_parent01`. Hand-verified against `expected.jsonl` lines 4,5,10. +5. 
**Schema-drift tolerance proven against an older schema fixture.** A second synthetic SQLite fixture mimics a pre-`20260510033149_session_usage` schema (no `cost`/`tokens_*` columns on `session`). The adapter, reading `PRAGMA table_info` at startup, builds a dynamic SELECT that omits the missing columns, emits empty/zero values in the canonical event, and logs exactly one INF per (table, column) on first occurrence. **Verification**: `internal/adapters/opencode/schema_drift_test.go` opens the older-schema fixture, asserts the SELECT does not reference the missing columns (by inspecting the prepared statement or via a query-log probe), and asserts the INF log fires once per missing column then is suppressed. **DONE (Chunk E + INF wiring):** (a) the dynamic-SELECT omission + empty/zero values + no-rejection are proven end-to-end: `testdata/opencode/d_schema_drift/{fixture.sql,expected.jsonl}` (pre-`20260510033149` session: no `agent`/`model`/`cost`/`tokens_*`/`time_archived`) + `golden_invariants_test.go:TestGoldenInvariant_DSchemaDrift` (SessionStarted `Model=""`/`AgentName=""`, Extras without `providerID`/`variant`; op/turn token+provider survive from `message.data`) + the chunk-A `schema_test.go:TestIntrospectAll_OldSchema` (SELECT omits missing cols). (b) the "one INFO per missing optional column" requirement is now wired in production: `tailer.go:logMissingColumns` iterates each table's `tableSchema.Missing` right after `introspectAll` succeeds in BOTH `scanLoop` and `tailLoop`, emitting one `logger.Info("opencode: optional column absent on this database schema; omitted from projection (old opencode version)", "table", table, "column", col)` per (table, column) in deterministic order; the logger is threaded from `Adapter.logger` via `adapter.go` `Scan`/`Tail`. `Scan` and `Tail` each emit the set once (per-phase, accepted). 
**Verification**: `golden_invariants_test.go:TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF` `Scan`s the fixture through the public adapter with a record-capturing `slog.Handler` (`golden_loghandler_test.go:captureHandler`) and asserts the set of logged (table, column) pairs equals the set introspection reports Missing — exactly one INFO record per missing column, nothing extra. (The INF set is a log, not a canonical event, so it is correctly absent from `expected.jsonl`.) +6. Watermark cursor (per-table `MAX(id)` primary + `MAX(time_updated)` fallback gated by `opencode.db-wal` fsnotify) is durable across restart with zero duplicates and zero gaps; the `time_updated` query runs only after WAL mtime change or every 60 s safety net. **Verification**: integration test that ingests half a fixture, persists cursor, restarts, ingests rest, asserts identical end state to a one-shot ingest; a second test asserts the `MAX(time_updated)` query is NOT issued during steady-state idle polls (probed via a query-counting test driver). **DONE:** chunk-C `tailer_resume_test.go:TestScanLoop_ResumeZeroDupesZeroGaps` (two-stage seed, union==cold-baseline) + `tailer_counting_test.go` (no idle `MAX(time_updated)` via the counting driver). **Chunk E adds the scenario-level complement** over static fixtures: `golden_resume_test.go:{TestGoldenInvariant_ResumeIdempotentReScan, TestGoldenInvariant_ResumeFromZeroIsDeterministic, TestGoldenInvariant_ResumeMultiSessionFinalCursor}` — re-scan from the final cursor emits 0 content events, two cold scans are identical, and a 2-session fixture re-emits neither session on re-scan. Together: resume/re-scan never drops or duplicates a content event. +7. Multi-provider sessions register correctly: every distinct opencode `providerID` observed becomes a row in `catalog_providers` with `alias=providerID` and `canonical=`; `sessions.provider_alias` is populated from `data.providerID`. 
**Verification**: golden test on a fixture with two sessions using different aliased providers asserts both `catalog_providers` rows and both `sessions.provider_alias` values. **DONE (Chunk E):** `testdata/opencode/c_multi_provider/{fixture.sql,expected.jsonl}` (two turns, providerID anthropic + openai) + `golden_invariants_test.go:TestGoldenInvariant_CMultiProvider` assert each LLM op carries its `ProviderAlias` verbatim + canonical `Provider`, and both providers surface (the adapter-side guarantee that seeds two `catalog_providers` downstream — the catalog write itself is the ingester's job, exercised by the ingester tests). Hand-verified against `expected.jsonl` lines 3,8. NOTE: the canonical event carries `ProviderAlias` per op (the chosen mechanism); there is no per-session `provider_alias` column write in the adapter — the alias is also in SessionStarted `Extras.providerID`. **Review P2.3:** `data-model.md` defines `sessions.provider_alias`/`provider`, but the canonical `SessionStartedEvent` carries neither field and the ingest writer's `sessions` upsert does not map them (confirmed in `internal/canonical/events.go` + `internal/ingest/writer.go`). Populating the session-level provider columns needs a canonical-event + writer change (a shared-surface edit outside this adapter's additive scope), deferred to **follow-up SOW-0023**. The op-scoped provider/alias — complete + correct, including multi-provider sessions — is the authoritative source today; a single session-level alias is inherently lossy for multi-provider sessions anyway. +8. Auto-discovery probe detects `~/.local/share/opencode/opencode.db` (and `$OPENCODE_DB` when set) at startup, opens read-only, queries `__drizzle_migrations` to record the schema hash, and exposes `(session_count, message_count, part_count, latest_migration_name)` in `/api/health`. 
**Verification**: unit test on the probe with a fixture DB; manual run on the operator's workstation registers the real source and `/api/sources` reports the live counts. **DONE (Chunk D + review fix P1.4):** `opencodeDBPath` resolves `$OPENCODE_DB` (verbatim) → `$XDG_DATA_HOME/opencode/opencode.db` → `~/.local/share/opencode/opencode.db` (`cmd/ai-viewer-ingest/discovery.go`; `TestOpencodeDBPath_Resolution`). The probe opens read-only, reads `__drizzle_migrations` (schema hash + latest migration via `migrations.go:ProbeStatus`), and LOGS `(session/message/part counts, latest_migration)` at startup while registering the source (visible in `/api/sources` as the standard source row). **AMENDED (review P2.2):** surfacing the per-source row counts as bespoke `/api/health` fields needs cross-cutting presenter+ingester+schema work that does not generalize to the file-based adapters (which have no cheap row count); that generalized per-source-metadata surface is deferred to **follow-up SOW-0024**. The load-bearing AC#8 — detect default + `$OPENCODE_DB`, read-only open, schema hash, source registration, startup-logged counts — is DONE. + +## Analysis + +Sources checked: + +- `.agents/sow/specs/adapter-opencode.md` (full spec, all sections) — primary contract. +- `.agents/sow/specs/canonical-events.md` — target event types, including `Kind='sub_agent'`, `OpKind='reasoning'`, `provider_alias` field on SessionStartedEvent, indefinite-`running` SessionStatus (opencode never finalizes; only archives). +- `.agents/sow/specs/data-model.md` — SQLite schema, especially `sessions.provider_alias`, `catalog_providers`, cross-format compatibility matrix. +- `.agents/sow/done/SOW-0002-20260526-cross-format-data-model-analysis.md` — analysis context confirming opencode's cumulative-token quirk and read-only invariant. +- `.agents/sow/current/SOW-0001-phase-1-foundation.md` — infrastructure the adapter plugs into. 
+- Real evidence on the operator's workstation: `~/.local/share/opencode/opencode.db` (3.9 GB, 6778 sessions, 127345 messages, 585894 parts, 3985 session_messages, 20 migrations applied through `20260511000411_data_migration_state` as of 2026-05-26). +- Upstream source at `anomalyco/opencode @ 2b3ddf9f34546b9bcea25ec8e0ff57e2811c4537` — `packages/opencode/src/storage/db.ts`, `packages/opencode/src/session/session.sql.ts`, `packages/opencode/src/session/message-v2.ts`, `packages/core/src/session-message.ts`, `packages/opencode/migration/` per `adapter-opencode.md` §"References". + +Current state: + +- SOW-0001 (in-progress) delivers canonical event types, SQLite store, ingest pipeline, adapter registry, pricing catalog, fixture sanitization tooling, CI gates, and the ai-agent v3/v2 adapters end-to-end. This SOW assumes that infrastructure is in place; if SOW-0001 is not yet completed, this SOW remains in `pending/`. +- No `internal/adapters/opencode/` package exists yet (the bootstrap only documented the format). +- The canonical model already absorbed opencode's gaps in SOW-0002 (cache tokens, reasoning tokens, provider alias, catalog_providers table); this SOW does NOT propose new canonical changes — only adapter implementation. + +Risks: + +- **R1 — Read-only DB safety (CRITICAL).** Opencode's writer is live and concurrent; any accidental write from ai-viewer corrupts the operator's primary AI coding tool. Mitigation: layered defense — OS-level `mode=ro`, SQL-layer `query_only(true)`, and an explicit test asserting that none of six write probes can mutate `opencode.db` (acceptance #2, which pins each probe's precise no-mutation property rather than "all six error"). The DSN string is encoded as a constant in the adapter and validated by a unit test that pattern-matches the connection string to guarantee the read-only PRAGMAs are present. No code path in the adapter ever takes a `*sql.Tx` that begins anything other than `BEGIN DEFERRED`.
+- **R2 — Cumulative-token miscount (CRITICAL).** The most likely silent defect: emitting the cumulative totals as per-op values would triple-count tokens and corrupt every cost calculation. Mitigation: acceptance #3 regression test with a frozen golden file; the test fixture is named explicitly so future code review surfaces the contract; per-LLM-op delta computation is implemented in a single named function (`computeStepDeltas`) with its own table-driven unit tests covering reset-on-message-boundary, missing intermediate step-finish (cancelled step), and out-of-order observation. +- **R3 — Schema drift between opencode versions.** ~30 historic migrations; databases written by older opencode versions lack newer columns. Mitigation: dynamic `PRAGMA table_info`-driven SELECT (acceptance #5), tolerance for missing columns with structured INF on first occurrence, and a `schema_hash` field in the cursor that detects new migrations and triggers a re-probe without resetting the cursor (only when a depended-on column disappears does the adapter perform a full re-ingest). +- **R4 — Multi-GB DB query latency.** Full-table `MAX(time_updated)` scans on the `part` table (585k rows, 2.3 GB) take 400-800 ms cold. Mitigation: PK-indexed `MAX(id)` is the primary watermark; the expensive `MAX(time_updated)` query runs only when fsnotify on `opencode.db-wal` signals activity OR every 60 s as a safety net (acceptance #6). The 1000-row page limit + sub-1 s transactions keep the WAL from growing unboundedly. +- **R5 — Sensitive content in fixtures.** Every real opencode message/part carries operator data — session titles, directories, prompts, tool outputs, patch file paths. Mitigation: every committed fixture under `testdata/opencode/` is a synthetic SQLite file constructed by a fixture-builder utility, NOT a copy of the operator's DB.
The fixture-builder writes only sanitized data shaped like the real schema; the live DB is consulted only for shape verification (via `PRAGMA table_info` printed during Pre-Implementation Gate authoring) and never copied into `testdata/`. + +## Pre-Implementation Gate + +Filled 2026-05-30. A readiness-briefing subagent re-probed the live `opencode.db` read-only (`immutable=1`); load-bearing claims (the read-only DSN, acceptance #1-8) were re-verified against ground truth before this gate. + +### Problem / model + +Additive feature: a new `opencode` adapter that projects OpenCode's **SQLite** session store onto the canonical event model. Unlike the four JSONL/file adapters (byte-offset cursor + fsnotify-on-append), opencode keeps everything in one SQLite DB (`~/.local/share/opencode/opencode.db`, WAL, Drizzle-managed, ~4.36 GB live, 20 migrations). So the read model is SQL delta-queries + a watermark cursor + DB polling, not line streaming. The adapter is a read-only projection from `session`/`message`/`part`/`session_message` rows → canonical events, reusing the registry/payloads/golden patterns from `codex`/`claude_code` but replacing parser+scanner+stream+tailer with a query layer + poll loop. + +### Evidence reviewed + +- `.agents/sow/specs/adapter-opencode.md` (567 lines, evidence-driven) — primary contract. +- Live DB re-probe (read-only, 2026-05-30): `session` 7,775 (6,335 root / 1,440 child / 2 archived), `message` 144,551 (assistant `data` keys role/time/error/parentID/modelID/providerID/mode/path/cost/tokens), `part` 667,335 (tool 230,606 / step-start 133,186 / step-finish 132,595 / text 83,367 / reasoning 75,361 / patch 11,686 / compaction 495 / file 22 / retry 17), `session_message` 5,975 (**only** agent-switched + model-switched). `event`/`event_sequence` = 0 (ignore). Latest migration `20260510033149_session_usage`. 
+- `internal/store/store.go:303-318` — `buildDSN` forces `foreign_keys(on)`+`busy_timeout(5000)`, readers add `query_only(true)`; this is ai-viewer's OWN-DB reader (not for the external opencode.db). `internal/canonical/events.go` — `TurnFinalizedEvent` carries `TokensCacheRead/Write`+`CostUSD` (per-turn cache accounting works) but NO `Extras` (turn-extras unreachable — SOW-0021); `OpFinalizedEvent` carries cache tokens + ProviderAlias; `KindSubAgent`/`OpReasoning`/`OpSession`/`OpCompaction` exist (no canonical change needed). +- `internal/adapters/{codex,claude_code}/` — structural template; `registry.go` self-registration; codex `discovery.go`/golden harness. +- SOW-0005 Acceptance #1-8 + Risks R1-R5. + +### Affected contracts & surfaces + +- **NEW** package `internal/adapters/opencode/` (SQLite-backed; see structural map). +- **ADDITIVE** `cmd/ai-viewer-ingest/sources.go`: a 5th auto-discovery probe (`$OPENCODE_DB` else `~/.local/share/opencode/opencode.db`) + a `__drizzle_migrations` schema-hash + count helper (acceptance #8); blank-import for `init()` registration. +- **ADDITIVE** `testdata/opencode//` — synthetic SQLite fixtures built by a fixture-builder. +- **NO** change to `internal/canonical/` (all target fields exist), `internal/ingest/` (catalog already idempotent post-SOW-0004), `internal/store/` schema, or sibling adapters. + +### Spec deltas (LANDED before tests/code, committed with this gate) + +1. adapter-opencode.md task→session rule (was "TBD; emit both"): ratified to **emit both** (tool Op + session Op; session op is the topology parent). +2. adapter-opencode.md per-turn token rule (was "to be verified"): firmed to **delta from the previous assistant message's cumulative totals**, with an explicit implementer-verify-on-live-DB note (the step-finish cumulative pattern is verified; the message-level pattern is the analogous one level up, not yet independently confirmed). 
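
The cumulative-to-delta rule behind spec delta 2 and AC#3 can be illustrated with a minimal sketch. The function name `stepDeltas` and its `[]int64` signature are hypothetical (the SOW names `computeStepDeltas` but does not record its signature), and keeping the baseline at the running high-water mark after a decrease is an assumption; the SOW only records that a decreasing total clamps to 0.

```go
package main

import "fmt"

// stepDeltas converts the cumulative step-finish totals of ONE message into
// per-op deltas. Hypothetical helper: name and signature are assumptions.
func stepDeltas(cumulative []int64) []int64 {
	deltas := make([]int64, 0, len(cumulative))
	var highWater int64 // largest cumulative total seen so far in this message
	for _, total := range cumulative {
		d := total - highWater
		if d < 0 {
			d = 0 // a decreasing total clamps to zero instead of going negative
		}
		deltas = append(deltas, d)
		if total > highWater {
			// Assumption: the baseline never moves backwards, so a later
			// recovery is measured from the highest total already counted.
			highWater = total
		}
	}
	return deltas
}

func main() {
	// The AC#3 fixture values: cumulative 100/250/410/400.
	fmt.Println(stepDeltas([]int64{100, 250, 410, 400})) // prints [100 150 160 0]
}
```

Calling the function once per message gives the reset-on-message-boundary behaviour for free, since the baseline starts at zero on every call.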
+ +### Patterns to reuse vs differ (briefing §B) + +- **Reuse**: `init()→adapters.Register("opencode", Factory)`; `Adapter` struct + compile-time `var _ canonical.Adapter`; `Name()/Format()/ParseCursor()`; Scan-then-Tail single-thread lifecycle; fail-soft `onError`; codex `discovery.go` → the auto-discovery probe; the golden_test harness shape (seed a `.db` instead of files). +- **Differ**: parser+scanner+stream+tailer → a **`store.go` query layer** (prepared delta SQL + `database/sql` rows) + a **poll loop** (2 s idle / 500 ms active / 250 ms post-WAL-fsnotify; coarse fsnotify on `opencode.db-wal` as a wakeup hint only). `payloads.go` emits `opencode-sqlite://…?part_id=&field=…` URIs (spec 420-426), not `file://`. `mapper.go` keeps turn/op-synthesis but walks message+part trees. + +### Cursor model (decision) + +Per-table two-watermark JSON: `{version, schema_hash, tables:{session,message,part,session_message:{max_id, max_time_updated}}}`. Primary watermark = `MAX(id)` (the 30-char Sonyflake PK is time-prefixed + monotonic + PK-indexed → `WHERE id > :last` is cheap). `MAX(time_updated)` (13-digit ms, **unindexed** — a part-table full scan ~400-800 ms) catches in-place mutations and is gated to run only after an `opencode.db-wal` mtime change or a 60 s safety net. Delta page: `… WHERE time_updated>:u OR (time_updated=:u AND id>:id) ORDER BY time_updated,id LIMIT 1000`, page until empty. Scan→Tail resumes from persisted watermarks; re-reads are absorbed by the ingester's idempotent upserts + the now-idempotent catalog. + +### Canonical mapping (briefing §D) + +Session=`session` row; Turn=assistant `message` (seq by `(time_created,id)`); LLM-Op=`step-start`→`step-finish`; Tool-Op=`tool` part (namespace derived, e.g. `github_get_file_contents`→`github`/`get_file_contents`); Reasoning-Op=`reasoning` part; text/patch are not ops (text→presenter read; patch→op extras); compaction→INF LogEntry; retry→WRN LogEntry. 
Terminal status: assistant `data.time.completed` NULL → `running`; `data.error` → `failed` (ErrorClass=`data.error.name`); `time_archived` → `completed`; else stays `running` (no per-session terminal, like claude-code/codex). **Cumulative-token delta (AC#3, verified):** step-finish `tokens.*` are cumulative within a message → emit per-op deltas via one `computeStepDeltas`. Sub-agent (AC#4): `parent_id` child → `Kind=sub_agent`+ParentNativeID; `tool='task'` with `state.metadata.sessionId` → tool Op + session Op. Multi-provider (AC#7): `ProviderAlias=data.providerID` verbatim; `Provider`=best-effort canonical (default=alias). Turn-extras (cwd etc.) deferred to SOW-0021 (no canonical turn Extras); per-turn cache tokens DO work via `TurnFinalizedEvent`. + +### Risk & blast radius + +Purely additive (new package + registry blank-import + additive `sources.go` probe); no canonical/ingest/store change (target fields exist; catalog idempotent post-SOW-0004). **R1 (CRITICAL) read-safety:** the opencode writer is live + concurrent on a 4.36 GB DB — layered defense: own helper opens `mode=ro` (OS `O_RDONLY`) + `query_only(true)` + `busy_timeout`, never calls any write-path pragma, each delta page in its own short `BEGIN DEFERRED` (<1 s) to avoid pinning the WAL / blocking the writer's checkpoint; acceptance #2's six write-probes pin it. **R2 (CRITICAL)** cumulative-token miscount → `computeStepDeltas` + AC#3 golden. **R4** part-table `MAX(time_updated)` full scan → gated by `MAX(id)` primary + WAL-mtime. **R5** fixtures are synthetic SQLite (never copy the operator DB). + +### Sensitive-data plan + +Every committed fixture under `testdata/opencode/` is a synthetic SQLite file built by a fixture-builder writing only sanitized, schema-shaped data (synthetic titles/dirs/prompts; `git@github.com:example/example.git`; no operator PII). The live DB is consulted ONLY for shape verification (`PRAGMA table_info`), never copied. `scripts/scan-secrets.sh` is the net. 
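
The R1 mitigation mentions a unit test that pattern-matches the connection string. A minimal sketch of such a check follows, using only the standard library. `buildDSN`, `checkReadOnlyDSN`, and the `file:` URI prefix are illustrative assumptions, not the adapter's actual helper; the query string is the one recorded in this SOW.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// readOnlyParams is the read-only query string recorded in this SOW.
const readOnlyParams = "mode=ro&_pragma=query_only(true)&_pragma=busy_timeout(5000)"

// buildDSN is a hypothetical helper; the "file:" URI form is an assumption.
func buildDSN(path string) string {
	return "file:" + path + "?" + readOnlyParams
}

// checkReadOnlyDSN rejects any DSN that lacks the layered read-only guards.
func checkReadOnlyDSN(dsn string) error {
	_, rawQuery, found := strings.Cut(dsn, "?")
	if !found {
		return errors.New("dsn has no query string")
	}
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return fmt.Errorf("parse dsn query: %w", err)
	}
	if modes := q["mode"]; len(modes) != 1 || modes[0] != "ro" {
		return fmt.Errorf("mode must be exactly ro, got %v", modes)
	}
	var queryOnly, busyTimeout bool
	for _, pragma := range q["_pragma"] {
		switch {
		case pragma == "query_only(true)":
			queryOnly = true
		case strings.HasPrefix(pragma, "busy_timeout("):
			busyTimeout = true
		}
	}
	if !queryOnly {
		return errors.New("missing _pragma=query_only(true)")
	}
	if !busyTimeout {
		return errors.New("missing _pragma=busy_timeout(...)")
	}
	return nil
}

func main() {
	dsn := buildDSN("/tmp/opencode.db")
	fmt.Println(dsn, checkReadOnlyDSN(dsn))
	fmt.Println(checkReadOnlyDSN("file:/tmp/opencode.db?mode=rwc"))
}
```

A string-level check like this only guards against the constant drifting; it does not replace the six write probes of acceptance #2, which exercise the real driver.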
+ +### Implementation plan (chunked; each = spec → failing tests → subagent impl → gates → integrate) + +- **Chunk A** — read-only connection helper (own DSN constant + the 6 write-probe test, AC#2) + the watermark `cursor.go` + typed row/`data`-JSON structs + `store.go` schema introspection (`PRAGMA table_info` → dynamic SELECT, AC#5). +- **Chunk B** — `mapper.go` row→event synthesis: session/turn/op trees, terminal status, `computeStepDeltas` (AC#3), reasoning/tool/patch/compaction/retry, sub_agent + task→session linkage (AC#4), provider alias (AC#7). +- **Chunk C** — `store.go` delta queries + the poll-loop tailer (WAL-mtime fsnotify hint + idle/active cadence; `MAX(time_updated)` gating, AC#6). +- **Chunk D** — `payloads.go` (`opencode-sqlite://` URIs) + `adapter.go` (Scan/Tail/ParseCursor + `init()`) + the `sources.go` auto-discovery probe + `__drizzle_migrations` schema-hash/counts (AC#8) + registry_test. +- **Chunk E** — fixture-builder + synthetic-DB golden scenarios (happy, sub-agent+task-child, multi-provider, old-schema-drift, cumulative-token) + restart/resume + idle-no-MAX(time_updated) integration tests + fuzz on the `data`-JSON decode. + +### Validation plan (acceptance → tests) + +#1 registry_test asserts `"opencode"`. #2 `readonly_test.go` (6 write-probes error). #3 `tokens_delta_test.go` (100/250/410 → 100/150/160). #4 golden parent+task-child (both edges). #5 `schema_drift_test.go` (pre-`20260510033149` fixture, dynamic SELECT omits missing cols, one INF/col). #6 restart/resume integration + query-counter (no idle `MAX(time_updated)`). #7 multi-provider golden (two catalog_providers + provider_alias). #8 `cmd/ai-viewer-ingest/sources_test.go` (probe registers a fixture DB, reports counts + latest migration). Plus a `data`-JSON fuzz target. + +### Artifact impact plan + +Producer: the adapter's Scan (watermark backfill) + Tail (poll loop). Refresh: WAL-mtime fsnotify hint / poll cadence → delta query. 
Repair: cursor corruption → re-read from zero watermark (idempotent upserts absorb). Served by the existing presenter/REST + the now-idempotent catalog; `/api/sources` + `/api/health` report the opencode source + (session/message/part counts, latest migration) (AC#8). No DB migration (ai-viewer schema unchanged). + +### Open decisions — DECIDED by CTO (recorded) + +1. **Connection helper:** the adapter uses its OWN read-only helper (DSN `mode=ro&_pragma=query_only(true)&_pragma=busy_timeout(5000)`, `MaxOpenConns(2)`), NOT `store.OpenReader` (that targets ai-viewer's own DB + forces `foreign_keys(on)`/pool 8). The helper's DSN is a tested constant; acceptance #2's six write-probes are its contract. `foreign_keys` is immaterial for a read-only connection. **Decided.** +2. **Poll cadence:** 2 s idle / 500 ms active / 250 ms floor after a WAL-mtime fsnotify event (ratify spec). **Decided.** +3. **Cursor granularity:** per-table `MAX(id)` (primary, PK-indexed) + `MAX(time_updated)` (gated by WAL-mtime / 60 s). **Decided.** +4. **Turn-extras:** opencode per-turn extras (cwd, etc.) are DEFERRED to SOW-0021 (no canonical turn `Extras` carrier); do NOT half-build a write path; per-turn cache tokens use the existing `TurnFinalizedEvent` fields. State the limitation. **Decided.** +5. **task→session op:** emit BOTH the tool Op and the session Op (session = topology parent). **Decided** (spec ratified above). +6. **Provider alias:** `ProviderAlias = data.providerID` verbatim; `Provider` = best-effort canonical (default = alias unchanged). **Decided.** + +Open (implementer-verify, not blocking): the message-level per-turn cumulative-token pattern (spec row firmed but flagged for live-DB confirmation before pinning the golden); whether `immutable=1` is ever used in production (NO — production uses `mode=ro` to respect the live WAL; `immutable=1` only for static test fixtures). 
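
Decisions 2 and 3 combine into two small pure functions: one picks the next poll interval, the other decides whether the expensive `MAX(time_updated)` probe may run. The sketch below is a hedged illustration. The constant names, `nextInterval`, and the `at`/`never` demo helpers are hypothetical, and the 5 s floor window is taken from the spec's watch strategy quoted earlier in this SOW.

```go
package main

import (
	"fmt"
	"time"
)

const (
	idleInterval   = 2 * time.Second
	activeInterval = 500 * time.Millisecond
	walFloor       = 250 * time.Millisecond // cadence while the WAL window is open
	walWindow      = 5 * time.Second        // how long a WAL event keeps the floor
	safetyNetEvery = 60 * time.Second
)

// never is the zero time: no WAL event has been observed yet.
var never time.Time

// at returns a fixed reference instant plus sec seconds (demo helper).
func at(sec int) time.Time {
	return time.Unix(1_000_000, 0).Add(time.Duration(sec) * time.Second)
}

// shouldProbeTimeUpdated gates the unindexed MAX(time_updated) query: run it
// only after a WAL event newer than the last probe, or once the safety-net
// period has elapsed.
func shouldProbeTimeUpdated(now, lastWALEvent, lastProbe time.Time, safetyNet time.Duration) bool {
	return lastWALEvent.After(lastProbe) || now.Sub(lastProbe) >= safetyNet
}

// nextInterval picks idle or active cadence, lowered to the WAL floor while
// the post-event window is still open. Hypothetical shape.
func nextInterval(now, lastWALEvent time.Time, active bool) time.Duration {
	base := idleInterval
	if active {
		base = activeInterval
	}
	if !lastWALEvent.IsZero() && now.Sub(lastWALEvent) < walWindow && walFloor < base {
		return walFloor
	}
	return base
}

func main() {
	fmt.Println(shouldProbeTimeUpdated(at(10), never, at(0), safetyNetEvery)) // idle poll
	fmt.Println(shouldProbeTimeUpdated(at(10), at(5), at(0), safetyNetEvery)) // WAL event since last probe
	fmt.Println(nextInterval(at(1), at(0), false))                            // inside the WAL window
}
```

Keeping both functions free of I/O is what makes a truth-table test practical; the clock and the WAL timestamps are plain arguments.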
+ +## Implementation + +### Chunk C — delta-query layer + poll-loop tailer (2026-05-30) + +Delivered the SQL delta-query layer and the poll-loop tailer (the backfill scan loop + the realtime poll loop). Purely additive inside `internal/adapters/opencode/`; no sibling adapter, `canonical`, `ingest`, or `store` package touched. Read-only invariant held: the only production DB open is the chunk-A `openReadOnly` helper, and every transaction is `BeginTx{ReadOnly:true}` (BEGIN DEFERRED); no write-path pragma anywhere. + +Files: + +- `store_query.go` (NEW, 241 lines) — paged delta query per table (`scanTableDelta` → `scanOnePage`, each page its own short read tx; pages until a short page), the cheap PK-indexed `maxID` probe, the expensive gated `maxTimeUpdated` probe, the affected-session set (`affectedSet`, first-seen dedup), and `resolvePartSession` (denormalized `session_id` → message-map → indexed `message_id` lookup fallback). +- `store_load.go` (NEW, 400 lines) — full-session-tree load (`loadSession`, `loadSessionTree` → ordered `[]messageWithParts`), the per-table dynamic-column scanners (present-columns only, never `SELECT *`), and the present-column point/ordered SELECT builders. `errSessionGone` for an affected id whose row vanished. +- `store.go` (MODIFIED) — added `buildSelectByID` companion to `tableSchema` (the old-schema `time_updated`-absent fallback: `WHERE id > ? ORDER BY id LIMIT 1000`). +- `tailer.go` (NEW, 357 lines) — `scanLoop` (backfill), `tailLoop` (realtime follow with the idle/active/WAL-floor cadence state machine), `pollOnce`, `detectChange` (cheap `MAX(id)` every poll; gated `MAX(time_updated)`), the pure `shouldProbeTimeUpdated` gate (AC#6), `watchWAL` (best-effort fsnotify hint with non-fatal missing-WAL fallback), `emitProgress`/`emitEvents` (ctx-aware, codex shape). 
+- `tailer_changes.go` (NEW, 309 lines) — the shared `processChanges` pipeline (delta → affected → reload → map → emit → advance), `collectDeltas` (with the every-~1000-rows `SourceProgress` checkpoint), `reloadAndEmit`, `loadAndMapSession`, the `pollState` cadence machine, and `coerceScanCursor` + `schemaFingerprint` (records a present-column schema-shape hash into the cursor; the `__drizzle_migrations`-name hash is deferred to chunk D). +- Tests (NEW): `store_query_test.go`, `store_load_test.go`, `store_testhelpers_test.go` (synthetic-DB builders + a registered query-counting `driver.Driver` wrapper), `tailer_test.go`, `tailer_gate_test.go` (AC#6 pure gate + cadence), `tailer_counting_test.go` (literal no-idle-`MAX(time_updated)` via the counter), `tailer_resume_test.go` (zero-dupes/zero-gaps), `tailer_branch_test.go`, `tailer_wal_test.go`, `tailer_pollcycle_test.go`. + +Key decisions locked (honoring the recorded SOW/spec): + +- **Delta page SQL** = `buildSelect` (present columns) with `WHERE time_updated > :u OR (time_updated = :u AND id > :id) ORDER BY time_updated, id LIMIT 1000`; old-schema fallback (no `time_updated`) = `buildSelectByID` (`WHERE id > :id ORDER BY id`), watermark advancing on `MaxID` only. Chosen from the introspected schema (`tableSchema.has("time_updated")`), never crashes. +- **Affected-session derivation**: session row → own id; message → `session_id`; part → denormalized `session_id` (fallback to `message_id`→`session_id` lookup on a hypothetical old schema lacking it); session_message → `session_id`. De-duplicated; full-tree reload per affected session (the mapper's per-turn cumulative-token delta requires the whole ordered message list). +- **Cadence state machine**: idle 2 s / active 500 ms / 250 ms floor for 5 s after a WAL fsnotify event; next interval = min(active|idle, floor-while-open). 
+- **`MAX(time_updated)` gate (AC#6)**: pure `shouldProbeTimeUpdated(now, lastWALEvent, lastProbe, safetyNet)` = `lastWALEvent.After(lastProbe) || now.Sub(lastProbe) >= 60s`. Proven false on idle polls by both the pure-truth-table test and the query-counting-driver test (zero `MAX(time_updated)` across 5 idle polls; `MAX(id)` runs every poll). +- **WAL-watch-missing fallback**: a missing `-wal` file / Add failure / watcher error → one `onError` + a closed hint channel → pure timer polling; the 60 s safety net still catches in-place mutations. A watcher error never kills the loop. + +Gates (run 2026-05-30): `go build ./...` exit 0; `go vet` exit 0; `golangci-lint` 0 issues; `gosec -severity medium -confidence medium ./...` exit 0 (two justified `// #nosec G202` on `MAX(id)`/`MAX(time_updated)` where the only interpolated token is a fixed `trackedTables` name via `quoteIdent`); `go test -race -cover` pass at **91.6%** (package was 96.1% pre-chunk; the delta is the new code's defensive error branches, all new code ≥ target). All new `.go` files ≤ 400 lines. + +### Chunk D — payloads + adapter wiring + auto-discovery + real schema-hash (2026-05-30) + +Wired the chunk-A/B/C pure pieces into a registered `canonical.Adapter`, formalized the payload-URI grammar in `payloads.go`, replaced chunk C's present-column placeholder with the REAL `__drizzle_migrations` schema hash, and added the opencode auto-discovery probe (AC#8). Purely additive inside `internal/adapters/opencode/` plus the documented `cmd/ai-viewer-ingest/sources.go` integration point; no sibling adapter, `canonical`, `ingest`, `store`, or `presenter` touched. Read-only invariant held: every new production DB open goes through the chunk-A `openReadOnly` helper (`adapter.go:snapshotCursor`, `migrations.go:ProbeStatus`); no write-path pragma, no `rwc`, no `mkdir`, no `ATTACH`. 
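
The payload-URI grammar that chunk D formalizes in `payloads.go` can be illustrated with a minimal sketch. Only the function name `buildPayloadURI(partID, field)`, the `opencode-sqlite://?part_id=&field=` shape, and the use of `net/url` for encoding come from this SOW; the function body and the sample ids below are assumptions made for illustration, not the adapter's actual source.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildPayloadURI sketches the grammar opencode-sqlite://?part_id=&field=
// with both values URL-encoded. The body is an assumption; the SOW only
// states the name, the two parameters, and that net/url does the encoding.
func buildPayloadURI(partID, field string) string {
	return "opencode-sqlite://?part_id=" + url.QueryEscape(partID) +
		"&field=" + url.QueryEscape(field)
}

func main() {
	// Hypothetical ids; real part ids come from the opencode `part` table.
	fmt.Println(buildPayloadURI("prt_example01", "state.output"))
	// Reserved characters are escaped so they cannot split the query string.
	fmt.Println(buildPayloadURI("a&b", "x y"))
}
```

Keeping the encoding in one function means a part id or field name containing `&`, `=`, or whitespace cannot change how the URI parses, which is presumably why the SOW insists on a single source of truth for the grammar.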
+ +Files (NEW): + +- `payloads.go` (47 lines) — `buildPayloadURI(partID, field)`, the SINGLE source of truth for the `opencode-sqlite://?part_id=&field=` grammar, URL-encoding both values via `net/url`. No resolver/parser (no consumer yet — that would be dead code; the `/api/payloads` resolver is a separate Phase-2 SOW). +- `migrations.go` (190 lines) — `readMigrations` (ordered `name` list by `id ASC` + latest; missing-table → `errNoMigrationsTable` soft sentinel), `schemaHash` (length-prefixed sha256 of the ordered names — injection-safe framing, replacing the chunk-C present-column fingerprint), `readSchemaHash`/`recordSchemaHash` (the poll-loop hook; mismatch → WARN + re-read + watermarks preserved), and `ProbeStatus` (read-only session/message/part `COUNT(*)` + latest migration for AC#8, degrading gracefully on a foreign DB). +- `adapter.go` (218 lines) — the registered `Adapter` (mirrors codex): `New`/`Name`/`Format`, `Scan` (records `scanCursor` even on cancel), `Tail` (resumes from `scanCursor` or cold-`snapshotCursor` HEAD), `ParseCursor`, `coerceCursor`, `snapshotCursor` (HEAD watermarks via `maxID`/`maxTimeUpdated` + real schema hash), `Factory`, `init()→adapters.Register(Format, Factory)`, `var _ canonical.Adapter`. +- Tests (NEW): `payloads_test.go`, `migrations_test.go`, `adapter_test.go` (construction + cursor), `adapter_lifecycle_test.go` (Scan/Tail/snapshot lifecycle), `cmd/ai-viewer-ingest/discovery_test.go` (codex probe tests, split from `sources_test.go`). + +Files (MODIFIED): + +- `mapper_turn.go` — `defaultPayloadURI` now delegates to `payloads.go:buildPayloadURI` (byte-identical; the chunk-B mapper goldens are unchanged, confirmed by the full-repo `go test`). +- `tailer.go` / `tailer_changes.go` — `coerceScanCursor` reduced to pure cursor-shaping (Tables/Version); the REAL migration-name hash is recorded by the new `recordSchemaHash`, called by `scanLoop`/`tailLoop` after `introspectAll`. 
The chunk-C `schemaFingerprint` placeholder is fully removed. +- `cmd/ai-viewer-ingest/sources.go` — named import of the opencode adapter (registers via `init()` AND exposes `ProbeStatus`), an `opencode` probe entry (`opencodeDBPath(home)`, a regular-file `os.Stat`), and a `case "opencode"` rich-attrs branch logging `sessions`/`messages`/`parts`/`latest_migration` (best-effort: a `ProbeStatus` error logs `probe_error` and still registers the source). The discovery counters + path helpers were extracted to a new `discovery.go` to bring `sources.go` back under the 400-line budget (it was already 464 at HEAD; the split also reduces that pre-existing overage). + +Key decisions locked (honoring the recorded SOW/spec): + +- **Scan→Tail cursor hand-off**: `Scan` records the final watermark on the instance even on ctx-cancel; `Tail` resumes from it. A cold `Tail` (no preceding `Scan`) snapshots current HEAD per table (`maxID`+`maxTimeUpdated`) + records the schema hash, so it follows from NOW (the SQLite analogue of codex stat'ing EOF). Re-emission is absorbed by idempotent upserts. +- **Real schema hash**: `sha256` of the `__drizzle_migrations.name` list ordered by `id ASC` (application order), length-prefixed so the digest is unambiguous regardless of name content. On a tail-time mismatch the loop logs a structured WARN, re-reads, and CONTINUES without resetting watermarks (column drift is per-column via the dynamic SELECT) — spec adapter-opencode.md §"Cursor". A missing `__drizzle_migrations` (foreign/old DB) leaves the hash empty and degrades gracefully. +- **Payload-URI grammar home**: `payloads.go` is the single source of truth; the mapper default delegates to it; behavior is byte-identical so chunk-B goldens are unchanged. 
+- **Probe reporting + graceful degradation**: `ProbeStatus` opens read-only, `COUNT(*)`s the three tables, reads the latest migration; a missing table → count 0 + soft error (not a hard failure), a hard open failure → returned so discovery logs it but the source still registers. + +Carried-forward chunk-C notes resolved: + +1. **`buildSelectByID` reachability**: KEEP (not dead code). It is reached by `scanTableDelta` (store_query.go:96-97) when `!s.has("time_updated")`. The migration history shows `time_updated` is part of the base `Timestamps` mixin on all four tracked tables across the entire observed schema (adapter-opencode.md lists `time_updated INTEGER NOT NULL` for session/message/part/session_message), so on every observed schema `time_updated` IS universal — but the fallback remains a genuine, tested backward-compat safeguard (tailer tests + `TestPureHelpers` cover it). It should NOT be flagged for removal. +2. **`schemaFingerprint` placeholder**: fully removed. `coerceScanCursor` no longer computes any hash; the real `__drizzle_migrations`-name digest is recorded by `recordSchemaHash` at scan/tail/snapshot start. + +Gates (run 2026-05-30): `go build ./...` exit 0; `go vet ./internal/adapters/opencode/... ./cmd/ai-viewer-ingest/...` exit 0; `golangci-lint` **0 issues**; `gosec -severity medium -confidence medium ./...` exit 0 (added one justified `// #nosec G202` on the `__drizzle_migrations` name read — fixed package constant via `quoteIdent`, never user input); `go test -race -cover` pass — **opencode 91.8%** (up from chunk-C's 91.6%, no regression), cmd unchanged at 47.8%; full-repo `go test -race ./...` all pass (mapper goldens intact); `scan-secrets.sh` PASS. Every new/modified `.go` file ≤ 400 lines. 
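
The "length-prefixed sha256 of the ordered names" used for the schema hash can be sketched as follows. This is an illustration under stated assumptions: the 8-byte big-endian length prefix, the hex output, and the migration names are hypothetical choices, and the real `schemaHash` in `migrations.go` may frame its input differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// schemaHash digests an ordered list of migration names. Each name is
// preceded by its byte length, so ["ab","c"] and ["a","bc"] cannot
// collide by concatenation. The 8-byte big-endian prefix is an assumed
// framing, not taken from the adapter source.
func schemaHash(names []string) string {
	h := sha256.New()
	var lenBuf [8]byte
	for _, name := range names {
		binary.BigEndian.PutUint64(lenBuf[:], uint64(len(name)))
		h.Write(lenBuf[:])
		h.Write([]byte(name))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Hypothetical migration names, in application (id ASC) order.
	applied := []string{"0000_init", "0001_add_parent_id"}
	fmt.Println(schemaHash(applied))
	// Same bytes, different name boundaries: the digests differ.
	fmt.Println(schemaHash([]string{"ab", "c"}) == schemaHash([]string{"a", "bc"}))
}
```

Without the length prefix, hashing the plain concatenation would make the two boundary cases above indistinguishable, which is the ambiguity the "injection-safe framing" note refers to.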
+ +### Chunk E — golden harness + synthetic-DB fixtures + golden scenarios + resume + data-JSON fuzz (2026-05-30) + +The final chunk: pinned the adapter→events boundary with a committed golden suite, per-scenario invariant assertions, a scenario-level resume golden, and a `data`-JSON fuzz target. Purely additive — the entire change is `internal/adapters/opencode/*_test.go` + `testdata/opencode/**`; NO production package touched (no `internal/ingest`, `internal/canonical`, `internal/store`, `internal/presenter`, no sibling adapter, no `cmd/`). The full-pipeline ingester path is adapter-agnostic (already covered by the aiagent e2e); the opencode-specific risk is the adapter→events boundary, which this chunk pins. + +Files (NEW, all ≤400 lines): + +- `golden_test.go` (257 lines) — the `-update-golden` flag (same name as codex), the auto-discovering `testdata/opencode/*/` loop, `buildFixtureDB` (builds a throwaway SQLite DB from `fixture.sql` via a SEPARATE read-write conn, then reopens read-only via `New`/`openReadOnly`), the `SourceProgress` filter, the `{kind,payload}` JSONL `encodeEvents` with `opencode:`→`opencode:` substitution, and a small statement splitter (the modernc driver does not run multi-statement Exec). +- `golden_invariants_test.go` (335 lines) — makes the goldens NOT self-justifying: re-scans each fixture and asserts the load-bearing invariant keyed on canonical-event FIELDS (AC#4 dual-edge, AC#5 degrade, AC#7 multi-provider + two-level tokens, baseline tree shape) plus `TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF` asserting the AC#5 missing-optional-column INFO logs (one per Missing column, via the `golden_loghandler_test.go:captureHandler` record-capturing `slog.Handler`). [INF wiring update: the placeholder `…_MissingINFNotImplemented` doc test was replaced by the affirmative assertion once the production emission landed in `tailer.go:logMissingColumns`.] 
+- `golden_resume_test.go` (185 lines) — AC#3 cumulative-token invariant (deltas 100/150/160/0) + the scenario-level resume golden (AC#6): idempotent re-scan (final cursor → 0 new content), from-zero determinism (two cold scans identical), and multi-session final-cursor (parent+child both fully consumed → 0 re-emit). Reuses chunk-C's `eventFingerprint`/`contentFingerprints`/`multisetDiff`. +- `data_fuzz_test.go` (131 lines) — `FuzzDecodeMessageData` + `FuzzDecodePartData` over the `types.go` `data`-JSON decoders (the untrusted-bytes boundary). No-panic contract; seeds cover both message roles, all 12 part `$.type` variants (incl. the `tool='task'` metadata.sessionId edge), unknown types, and malformed/truncated/empty/deeply-nested bodies. +- `testdata/opencode/{a_happy,b_subagent_task,c_multi_provider,d_schema_drift,e_cumulative_tokens}/{fixture.sql,expected.jsonl}` — 5 scenarios, each a human-reviewable `fixture.sql` (NO binary `.db` committed) + a hand-verified `expected.jsonl`. All synthetic ids/content (R5); `scan-secrets.sh` PASS. + +Hand-verification (the critical step — every golden was read line-by-line against the spec + fixture intent BEFORE being trusted, NOT just regenerated): + +- **a_happy**: confirmed the full part-order op tree (LLM/reasoning/text-payload/tool/step-finish) + PayloadRef URIs + ms→µs (Ts=1000000 for a 1000 ms session) + NO SessionFinalized. +- **b_subagent_task (AC#4)**: confirmed BOTH edges in `expected.jsonl` — line 4 `op_started` `Kind=session`/`ChildSessionNativeID=ses_child01` (topology parent, emitted first) AND line 5 `op_started` `Kind=tool`/`Name=task`, same turn — plus the child `session_started` (line 10) `Kind=sub_agent`/`ParentNativeID=ses_parent01`/`RootNativeID=ses_parent01`. 
+- **c_multi_provider (AC#7)**: confirmed line 3 LLM op `Provider/ProviderAlias=anthropic` and line 8 `Provider/ProviderAlias=openai` (both surface); plus the two-level tokens — turn-2 op (line 10) 300/80 (per-message cumulative) vs turn-2 turn (line 11) 200/50 (session-level delta). +- **d_schema_drift (AC#5)**: confirmed line 1 `Model=""`/`AgentName=""` and Extras WITHOUT `providerID`/`variant` (dropped columns), while line 3 op keeps `Provider=anthropic`/`Model=claude-x` and line 6 turn keeps 60/15 (from `message.data`). Introspection ACCEPTED the old schema (no rejection). +- **e_cumulative_tokens (AC#3)**: confirmed op token_in deltas 100,150,160,0 across the four step-finish ops by reading lines 4,6,8,10 of `expected.jsonl` (cumulative 100/250/410/400; the 4th DECREASES→clamps to 0); turn rollup 400/80 (line 11). + +Determinism: every non-SourceProgress golden `Ts` derives from a fixture row timestamp ×1000 — NO wall-clock leak (golden passes `-count=3`; `-update-golden` is byte-idempotent). The only wall-clock (`emitProgress` `time.Now()`) is on `SourceProgressEvent`, which the harness filters out. + +**AC#5 INF gap — CLOSED (post-chunk-E wiring, 2026-05-30):** the chunk-E finding (the spec/AC#5 "one INF log per missing optional column" was computed in `store.go` `tableSchema.Missing` but consumed by no production code, and `Adapter.logger` never logged it) is now fixed. `tailer.go` gained `logMissingColumns(logger, schema)`, called right after `introspectAll` succeeds in BOTH `scanLoop` and `tailLoop`; it emits one `logger.Info(...)` per (table, column) with a stable message + `table`/`column` keys, in deterministic order (`trackedTables` order, columns sorted). The logger is threaded as a new `*slog.Logger` param into `scanLoop`/`tailLoop` (right before `onError`) and passed from `Adapter.logger` in `adapter.go` `Scan`/`Tail` (both loops nil-guard defensively for direct test callers). 
`Scan` and `Tail` each emit the set once (per-phase duplication on the rare old-schema path, accepted; noted in a code comment). The placeholder `TestGoldenInvariant_DSchemaDrift_MissingINFNotImplemented` was REPLACED by `TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF`, which `Scan`s the fixture through the public adapter with a record-capturing `slog.Handler` and asserts the logged (table, column) set equals the Missing set. All chunk-E tests preserved unchanged in what they assert (the `scanLoop`/`tailLoop` test call sites got a mechanical silent-logger arg via the new `store_testhelpers_test.go:silentLogger()`). Gates green; opencode coverage 92.4% (unchanged). Scope: `internal/adapters/opencode/` only + this SOW + `adapter-opencode.md`. + +**Fuzz finding:** none. Both targets ran 30s clean — `FuzzDecodeMessageData` 3.10M execs, `FuzzDecodePartData` 3.06M execs, zero crashes. No `types.go` decoder change needed. + +Gates (run 2026-05-30): `go build ./...` exit 0; `go vet ./internal/adapters/opencode/` exit 0; `golangci-lint run ./internal/adapters/opencode/...` **0 issues**; `gosec -quiet -severity medium -confidence medium ./...` exit 0; `go test -race -count=1 -cover ./internal/adapters/opencode/` pass at **92.4%** (up from chunk-D's 91.8% — goldens pushed coverage UP, no regression); both fuzz targets no-crash in 30s; `scan-secrets.sh` PASS (602 tracked files). Every new `.go` file ≤ 400 lines (257/312/185/131). NO binary fixture committed. + +## Validation + +(Empty placeholder. Filled at SOW close.) + +## Reviews + +### Round 1 — 2026-05-30 (codex + glm-5.1 + minimax-m2.7, parallel, full opencode surface) + +glm and minimax: no P1/P2 (only doc/style P3 + design-tradeoff observations). 
codex (decisive) found real defects; each adjudicated against the spec + code before acting (not on reviewer convergence): + +- **P1.1 (data loss) — FIXED.** `collectDeltas` emitted a `SourceProgress` checkpoint mid-paging, advancing the persisted watermark BEFORE `reloadAndEmit` emitted the affected sessions' content; a crash/cancel between them skips those sessions on restart (worst on cold backfill). Fix: `tailer_batch.go` `batchProcessor` — each bounded batch pages → `reloadAndEmit` → THEN promotes `committed` + `emitProgress`; on cancel/error returns the last content-committed cursor. Pinned by `TestProcessChanges_CheckpointAfterEmit_NoLoss`. +- **P1.2 (read-safety hardening) — FIXED.** `buildReadOnlyDSN` only stripped name-colliding `_pragma`, so `_pragma=wal_checkpoint(...)`/`_txlock=exclusive` from a crafted DSN survived (mode=ro still blocked real mutation, but the contract says unreachable). Fix: allowlist — discard ALL caller params, rebuild with `mode=ro` + `_txlock=deferred` + the read-only pragma set. Pinned by `TestBuildReadOnlyDSN_MaliciousDSNNeutralised`. +- **P1.3 (live turns finalized as completed) — FIXED.** `turnStatus`/`finalizeTurn` finalized every assistant message; spec adapter-opencode.md:472 finalizes only when `data.time.completed` OR a `step-finish` part exists. Fix: `turnIsTerminal(data, hasStepFinish)` gates `TurnFinalizedEvent` (mapper_parts.go:143); running turns emit TurnStarted only. Pinned by `TestMapper_RunningTurnNotFinalized` + `TestMapper_TurnFinalizedWhenTerminal`. +- **P1.4 ($OPENCODE_DB AC miss) — FIXED.** `opencodeDBPath` now resolves `$OPENCODE_DB` → `$XDG_DATA_HOME/opencode/opencode.db` → default. Pinned by `TestOpencodeDBPath_Resolution`. +- **P2.4 (nested-subagent RootNativeID) — FIXED.** `store_root.go:resolveRootID` walks the `parent_id` chain to the true tree root (depth-cap + cycle guard). 
Pinned by `testdata/opencode/g_nested_subagent` + `TestGoldenInvariant_GNestedSubagent` (grandchild Parent=`ses_gchild`, Root=`ses_groot`). +- **P2.5 (orphan step-start) — FIXED.** A new `step-start` over an open LLM op now force-closes the prior with `Status="cancelled"` (spec Edge #5). Pinned by `TestMapper_TwoStepStartsForceCloseFirst`. +- **P2.6 (silent parse failures) — FIXED.** Malformed model JSON / task metadata / corrupt numeric cells now route to `onError`/WARN with context instead of silent zero (hard rule #6). Pinned by `TestMapper_Malformed*Warns` + `TestLoadSession_CorruptNumericWarns`. +- **P2.7 (unknown session_message.type) — FIXED.** The `session_message` delta now decodes `type` and WARNs on an unrecognized value (spec Edge #1). Pinned by `TestSessionMessage_UnknownTypeWarns`. +- **P2.8/P2.1/P3.2 (N+1 / long txn / cross-snapshot) — ADDRESSED.** `loadSessionTree` loads the session + messages + parts under one bounded read transaction. +- **P3.1 (dead `buildSelectByID`) — FIXED.** Removed; `time_updated` is required by introspection (universal across opencode's schema), so the fallback was unreachable. +- **P3 minors — FIXED.** recursive `itoa`→`strconv.Itoa`; `probe.Close()` on the error path. + +**Deferred (shared-surface, out of this adapter's additive scope) → follow-up SOWs:** +- **P2.3** → `SOW-0023` — `sessions.provider`/`provider_alias` need a `SessionStartedEvent` field + writer mapping (`internal/canonical` + `internal/ingest`). +- **P2.2** → `SOW-0024` — per-source row counts in `/api/health` (generalized source metadata; presenter+ingester+schema). + +All fixes verified at golden + gate level (golangci 0, gosec 0, race, coverage 92.3%, fuzz clean). + +### Round 2 — 2026-05-30 (codex decisive; full opencode surface) + +Committed in `9630dc0`. 
codex confirmed round-1 fixes held and found a further set; each adjudicated against code: +- **P1-A (idle-scan re-arm) — FIXED.** A single `MaxID` conflated the monotonic insert-detect id with the `(time_updated, id)` paging position; an in-place UPDATE of an OLD row regressed `MaxID` so the cheap `MAX(id)` check fired every idle poll → the expensive `(time_updated)` scan ran forever. Fix: split the watermark into `MaxIDSeen` (monotonic, never regresses) + `MaxTimeUpdatedMs/ID` (paging position); cursor bumped to v2. Pinned by `cursor_regression_test.go` + the counting-driver `TestP1A_OldRowUpdateDoesNotReArmIdleScan`. +- **P1-B (sticky failed) — FIXED.** Session-failed was a sticky OR across turns; now it is the LAST assistant turn's state (cleared when a later turn recovers). +- **P1-C / P2-A (tool/turn error class) — FIXED.** Tool error → canonical `failed` + `ErrorClass`/`ErrorMessage`; error PRESENCE (not a non-empty name) is the terminal predicate; empty name → `defaultErrorClass`. +- **P2-B (N+1 parts) — FIXED.** One `WHERE session_id = ?` part query per session, partitioned in memory. +- **P2-D (blanked ops.name) — FIXED.** The patch-enrichment re-emit now carries the full op identity (the writer updates `ops.name` unconditionally). +- **P2-E (compaction) — FIXED.** A non-NULL `time_compacting` session is skipped that cycle. +- **P2-F (overflow) — FIXED (PARTIAL — see round-3 P2-1).** `subClampWarn` clamps the token-delta subtraction + WARNs; `msToMicros` saturated but SILENTLY, and the `ctx_used` add was still unguarded. +- **P3-B/P3-C — FIXED.** No `journal_mode` in the read-only DSN; one `SourceProgress` checkpoint layer. + +### Round 3 — 2026-05-31 (codex decisive; full opencode surface) + +codex confirmed round-1/2 hold and the read-safety/checkpoint/goldens are good, then found 2 live-concurrency P1 + 4 P2 + 2 P3. 
CTO adjudicated each; all implemented inside the adapter + the ingest CLI source/discovery files (no shared-surface edits): + +- **P1-1 (same-ms in-place update lost forever) — FIXED.** An already-seen LOW-id row updated in place at exactly the cursor's boundary ms moves neither `MAX(id)` nor `MAX(time_updated)`, and the forward delta's strict `(=T AND id > highID)` excludes it → skipped permanently when it is the session's only change. Fix: a dedicated bounded boundary-bucket re-scan (`tailer_boundary.go`) that, on a WAL-driven probe where no detector advanced (`probed && walDriven && !changed`), selects the FULL `time_updated = T` bucket per table, derives the owning sessions, and re-emits them idempotently WITHOUT advancing the cursor. Gating on `walDriven && !changed` keeps the cheap idle path untouched (AC#6) and stops a cold-Tail HEAD snapshot from replaying its boundary session (its writes are inserts → `changed`). Pinned by `TestP1_1_BoundaryUpdateReEmitted` (+ skip/empty/error/cross-table cases); the round-2 idle-scan tests stay green. +- **P1-2 (compaction TOCTOU / cross-snapshot) — FIXED.** `loadAndMapSession` now reads the session row, checks `time_compacting`, resolves the parent-chain root, AND loads messages+parts within ONE `BeginTx{ReadOnly:true}` snapshot (was: row+check in no-tx queries, tree in a SEPARATE tx). `loadSessionTree`/`loadSession`/`resolveRootID` take a `roQuerier` so the same code runs against the shared tx. Pinned by `TestP1_2_CompactingSkippedAtomically` + `TestP1_2_TreeLoadRunsInCallerTx`. +- **P2-1 (overflow fix completion) — FIXED.** `msToMicros` clamp now surfaces via `onWarn` at the mapper's method emission sites (`msToMicrosWarn`, used by `sessionStarted`/`sessionFinalized`); `ctx_used = tokens.input + tokens.cache.read` uses a new warning-capable saturating `addClampWarn`. Pinned by `TestP2_1_MsToMicrosWarnsOnClamp` + `TestP2_1_CtxUsedAddSaturatesAndWarns`. 
+- **P2-2 (malformed JSON invisible to /api/health) — FIXED.** The malformed-message and malformed-part branches now route through `mwarn` → the adapter's `onError` → `SourceErrorEvent` → `sources.parse_errors` (health), IN ADDITION to the session `LogEntry`. Pinned by `TestP2_2_MalformedDataRoutesToOnError` + `TestP2_2_MalformedPartRoutesToOnError`. +- **P2-3 (unbounded full-tree load) — FIXED.** Defensive `maxSessionMessagesWarn`/`maxSessionPartsWarn` (100k) bounds: a session over either emits ONE WARN via `onError` and is still processed in full (the whole ordered tree is required for token-delta synthesis — documented as an intentional design constraint). Pinned by `TestP2_3_OversizedSessionWarns`. +- **P2-4 (file:/:memory: DSN vs CLI os.Stat) — DOCUMENTED.** The CLI source location is a filesystem path; the `file:`/`:memory:` DSN forms are adapter programmatic/test use only. Doc comment at `buildReadOnlyDSN` + notes in `deployment.md`/`adapter-opencode.md`. No CLI code change (correct: documenting the contract is the fix). +- **P3-1 (dead old-schema part fallback) — REMOVED.** Found: `requiredColumns["part"]` includes `session_id` (store.go), so `introspectAll` makes a part table lacking it FATAL upstream → the round-2 `loadPartsByMessageIDs`/`selectPartsByMessageIDs` fallback was unreachable. Removed it + its introspection-bypassing test (`TestP2B_OldSchemaPartFallbackOneQuery`); the live single-query path keeps `TestP2B_PartsLoadedInOneQuery`. (The delta-path `resolvePartSession` is a separate function, out of scope, kept.) +- **P3-2 (directory named opencode.db registers) — FIXED.** The opencode auto-discovery probe now requires `info.Mode().IsRegular()` (other adapters' directory probes unchanged). Pinned by `TestAutoDiscover_OpencodeDirectoryNotRegistered`. 
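
P2-3 above keeps the full-tree load because token-delta synthesis needs the whole ordered message list. A minimal sketch of that delta rule, using the cumulative input-token values from the `e_cumulative_tokens` fixture (100/250/410/400), is shown here. The helper name `stepDeltas` is hypothetical; the adapter's own `subClampWarn` also emits a WARN, which this sketch omits, and carrying the decreased value forward as the next baseline is an assumption.

```go
package main

import "fmt"

// stepDeltas converts cumulative step-finish totals within one message
// into per-op deltas. A decrease clamps to 0 rather than going negative.
// Hypothetical helper: it assumes non-negative inputs and does not log.
func stepDeltas(cumulative []int64) []int64 {
	deltas := make([]int64, 0, len(cumulative))
	var prev int64
	for _, cur := range cumulative {
		d := cur - prev
		if d < 0 {
			d = 0 // a decreasing cumulative total clamps to zero
		}
		deltas = append(deltas, d)
		prev = cur // assumption: the latest total becomes the next baseline
	}
	return deltas
}

func main() {
	// Cumulative totals from the e_cumulative_tokens scenario.
	fmt.Println(stepDeltas([]int64{100, 250, 410, 400})) // prints [100 150 160 0]
}
```

Emitting the cumulative values directly as per-op tokens would over-count, which is the defect the SOW calls out as the top-line risk for this adapter.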
+ +Gates: `go build ./...` 0; `go vet` 0; golangci-lint "0 issues"; gosec medium+ 0; `go test -race -cover` opencode 92.4%; both `FuzzDecode*` 20s clean; whole-module `go test -race` all pass; all changed files ≤400 lines; scan-secrets PASS; 7 committed goldens byte-identical (`-update-golden` wrote zero diffs). + +### Round 4 — 2026-05-31 (codex decisive; full opencode surface; commit `pending`) + +codex confirmed round-1/2/3 hold + read-safety/checkpoint/goldens good, found 1 P1 + 5 P2 + 2 P3 (glm rated production-ready but MISSED the P1; minimax clean — codex remains the decisive reviewer). CTO adjudicated each; all implemented inside the adapter + the ingest CLI source/discovery files: + +- **P1 (boundary re-scan missed on the safety-net path) — FIXED.** The round-3 boundary re-scan fired only on `walDriven && !changed`; the 60s safety-net probe did not run it, so a same-ms in-place update was stranded when the WAL hint was missed (dropped event / watcher-setup failure / timer-only polling) — violating the spec's safety-net guarantee. Fix: a `priorProbe` flag on `pollState` (set in `markProbe`); the boundary re-scan now runs on `probed && (walDriven || priorProbe)`, so the safety-net probe also triggers it, while a cold-Tail's FIRST probe (`priorProbe==false`, not WAL-driven) never replays its snapshot boundary. Idle AC#6 property + idempotency + no-cursor-advance preserved. Also covers `time_compacting` clearing at the boundary ms. Pinned by the rewritten `TestP1_1_BoundaryUpdateReEmitted` (cold-first NO / safety-net YES / WAL YES) + `TestP1_1_CompactingClearsAtBoundaryReSurfacesOnSafetyNet`. +- **P2-1 (delta scanners silently coerced corrupt cells) — FIXED.** All four delta scanners now `.withWarn`; corrupt OPTIONAL cells WARN+degrade-to-0, but the REQUIRED cursor columns (`id`, `time_updated`) ERROR via `requiredWatermark`/`i64Required`/`strRequired` so a poisoned watermark can never be persisted. 
Pinned by `TestP2_1_CorruptRequiredCellErrorsNoCursorAdvance` + `...OptionalCellWarns...`. +- **P2-2 (silent `msToMicros` at most emitters) — FIXED.** Every emitted-event timestamp now routes through `msToMicrosWarn`; only the pure free-helper definition keeps the silent form. Pinned by `TestP2_2_*TimestampClampsAndWarns`. +- **P2-3 (non-canonical `PayloadKind: "user_attachment"`) — FIXED.** The canonical `PayloadKind` set has no attachment value; the `file` part now emits an INF `LogEntry` with `{filename,url,mime}` extras (no-loss, canonical-clean) instead of a non-canonical PayloadRef. A first-class canonical attachment kind is deferred to **follow-up SOW-0025**. Pinned by `TestMapSession_FilePartLogEntry` + a `canonicalPayloadKinds` guard; goldens carry only canonical kinds. +- **P3-1 (ProbeStatus `context.Background()`) — FIXED.** Now a bounded `context.WithTimeout(…, opencodeProbeTimeout=10s)`. Pinned by `TestOpencodeProbeRespectsCancelledContext`. +- **P3-2 (bare path split on `?`) — FIXED.** Query-string splitting is scoped to `file:`/`:memory:` URI forms; a bare filesystem path (POSIX allows `?`) is opaque. Pinned by `TestBuildReadOnlyDSN_BarePathWithQuestionMarkIsOpaque` + the literal-open test. +- **P2-4 (subtask/agent/snapshot) — SPEC-AMENDED (CTO).** Spec said `subtask`→session-op, but live counts are zero; documented as intentionally-ignored v1 no-ops with the zero-observed evidence + a deferred-SOW note (not implemented against zero data). +- **P2-5 (per-batch full-tree re-emit) — SPEC-DOCUMENTED (CTO).** An intentional crash-safety/idempotency tradeoff bounded by session size; cross-batch coalescing deferred. + +Gates: build/vet 0; golangci "0 issues"; gosec 0; `go test -race -cover` opencode 92.5%; both `FuzzDecode*` 20s clean; whole-module `go test -race` all pass; production files ≤400 lines; scan-secrets PASS; goldens byte-identical (no file-part scenario, all canonical PayloadKinds). 
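
Round 4's P1 depends on the 60 s safety-net probe firing when no WAL hint arrives. The chunk-C gate that decides this is recorded in this SOW as `lastWALEvent.After(lastProbe) || now.Sub(lastProbe) >= 60s`; the sketch below reproduces that expression. The `demoGate` wrapper and its fixed timestamps are hypothetical, added only to make the example self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// shouldProbeTimeUpdated mirrors the gate expression recorded for AC#6:
// probe when a WAL event arrived after the last probe, or when the
// safety-net interval has elapsed since the last probe.
func shouldProbeTimeUpdated(now, lastWALEvent, lastProbe time.Time, safetyNet time.Duration) bool {
	return lastWALEvent.After(lastProbe) || now.Sub(lastProbe) >= safetyNet
}

// demoGate is a hypothetical wrapper for this example: it fixes lastProbe,
// places "now" elapsedSeconds later, and puts the last WAL event one
// second before or after the probe.
func demoGate(elapsedSeconds int, walAfterProbe bool) bool {
	lastProbe := time.Date(2026, 5, 30, 12, 0, 0, 0, time.UTC)
	now := lastProbe.Add(time.Duration(elapsedSeconds) * time.Second)
	lastWAL := lastProbe.Add(-time.Second)
	if walAfterProbe {
		lastWAL = lastProbe.Add(time.Second)
	}
	return shouldProbeTimeUpdated(now, lastWAL, lastProbe, 60*time.Second)
}

func main() {
	fmt.Println(demoGate(2, false))  // idle poll, no new WAL event: false
	fmt.Println(demoGate(2, true))   // WAL event since the last probe: true
	fmt.Println(demoGate(60, false)) // safety net elapsed, no WAL hint: true
}
```

The third case is the one round 4 cares about: with the watcher missing or an event dropped, only the elapsed-time branch opens the gate, so any work tied to a probe has to run on that branch too.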
+ +### Round 5 — 2026-05-31 (codex decisive; commit `3569b08`) + +codex found **0 P1** + 2 P2 + 2 P3 (glm/minimax merge-ready). Fixed: no warn/error/content emission while a source-DB read tx is open (warnSink buffers, flushes post-tx — WAL-pin avoidance); required OWNERSHIP-id columns (message/part/session_message `session_id`, `part.message_id`) error on corrupt in the delta path (no silent `affectedSet.add("")` cursor gap); failed-session `ErrorMessage` from `data.error.data.message` (+ `i_failed_assistant` golden); stale cumulative-token comment fixed. Coverage 92.4%. + +### Round 6 — 2026-05-31 (codex decisive; commit `9dd8aaf`) + +codex found 1 P1 + 2 P2 + 2 P3. Fixed: same-ms boundary re-scan run pre-advance regardless of `changed` (deeper co-occurring-forward-change case) + `boundaryReal` cold guard; `tool_response` PayloadRef only when `state.output` non-empty (bogus failed-tool ref removed, h_failed_tool golden corrected); retry log includes `error.name`; removed dead `resolvePartSession` fallback; `j_file_attachment` golden; spec-amended (opencode does not emit `SessionUpdatedEvent` — idempotent `SessionStarted` re-emission is the update path). Coverage 92.8%. + +### Round 7 — 2026-05-31 (codex decisive; commit `f0c2b8b`) + +codex found the same-ms gap a 4th time (cheap-`MAX(id)` path bypassed the `probed` gate) + 4 more. **Closed the same-ms CLASS, not the case:** the boundary re-scan now fires on `boundaryReal && (changed || probeGateOpen)` — every detection path (cheap insert, gated probe, WAL, 60s net) — guarded by a deterministic-seed same-ms STRESS test (random insert/in-place-update interleavings; FAILS against the old trigger, passes `-count=5 -race`). Also: `reloadAndEmit` propagates non-skip errors (cursor not promoted on transient failures); `boundaryReal` guard on the `changed==false` path; `watchWAL` goroutine awaited in `closeWatch` (no send-on-closed-channel race); full-tree scanners validate required ownership ids. Coverage 92.7%. 
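
The evolution of the boundary re-scan trigger across rounds 3, 4, and 7 can be compared side by side. The three boolean expressions below are copied from the review notes; the `pollFacts` struct, the function names, and the interpretation of each flag (for example, that `boundaryReal` is false while the cursor boundary is only a cold-Tail HEAD snapshot) are assumptions made for illustration.

```go
package main

import "fmt"

// pollFacts is a hypothetical bundle of the per-poll flags named in the
// review notes. Field meanings are this sketch's reading of the SOW.
type pollFacts struct {
	probed        bool // the gated MAX(time_updated) probe ran this poll
	walDriven     bool // the probe was triggered by a WAL hint
	priorProbe    bool // at least one probe ran before this one
	changed       bool // some detector (e.g. cheap MAX(id)) saw a change
	boundaryReal  bool // assumed false for a cold-Tail snapshot boundary
	probeGateOpen bool // WAL hint or 60 s safety net opened the probe gate
}

func round3Trigger(p pollFacts) bool { return p.probed && p.walDriven && !p.changed }
func round4Trigger(p pollFacts) bool { return p.probed && (p.walDriven || p.priorProbe) }
func round7Trigger(p pollFacts) bool { return p.boundaryReal && (p.changed || p.probeGateOpen) }

func main() {
	// Cheap MAX(id) insert detected, gated probe did not run this poll.
	cheapInsert := pollFacts{changed: true, boundaryReal: true}
	// Safety-net probe with no WAL hint and no detector change.
	safetyNet := pollFacts{probed: true, priorProbe: true, boundaryReal: true, probeGateOpen: true}
	// First poll of a cold Tail: the boundary is only the HEAD snapshot.
	coldFirst := pollFacts{probed: true, probeGateOpen: true}

	for name, p := range map[string]pollFacts{
		"cheapInsert": cheapInsert, "safetyNet": safetyNet, "coldFirst": coldFirst,
	} {
		fmt.Println(name, round3Trigger(p), round4Trigger(p), round7Trigger(p))
	}
}
```

Under these assumed flag values, only the round-7 form fires on the cheap-insert path while still skipping the cold first poll, which matches the note that the earlier triggers were bypassed whenever the `probed` gate stayed closed.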
+ +### Round 8 — 2026-05-31 (codex + glm + minimax; CONVERGED) + +**All three reviewers converged on merge.** codex: *"no actionable P1 or P2 in the supported Scan→Tail lifecycle. I would merge this adapter."* glm + minimax: merge-ready. codex's only remaining note is operational scale (full-tree re-emit + boundary-bucket scans) — a deliberate, spec-documented, test-guarded tradeoff, not a correctness issue. Final whole-repo gates on `f0c2b8b`: gofmt clean, `go build` 0, `golangci-lint run ./...` 0 issues, gosec 0, whole-module `go test -race` all pass, scan-secrets + scan-ai-attribution PASS. + +**codex P1 trend across the 8 rounds: 4 → 3 → 2 → 1 → 0 → 1 → 2 → 0.** The recurring thread was the same-ms incremental-cursor boundary (rounds 3/4/6/7), finally closed in round 7 by unifying the re-scan trigger across all detection paths + a property/stress test rather than per-case patching. codex was the decisive reviewer every round (glm/minimax never surfaced a substantive P1/P2 and twice rated the adapter merge-ready while a real codex P1 was open). Deferred follow-ups filed: SOW-0023 (session provider columns), SOW-0024 (per-source /api/health counts), SOW-0025 (canonical attachment PayloadKind). + +## Outcome + +Delivered. The `opencode` adapter — the 5th and final source adapter — projects OpenCode's live, concurrently-written, multi-GB SQLite store onto the canonical event model, strictly read-only, and is registered + auto-discovered by the ingester. Shipped in 5 chunk commits (`c4e4170` A → `ee5c77d` E) + 7 review-fix commits (`9630dc0` … `f0c2b8b`); 8 external-review rounds converged (codex + glm + minimax all merge-ready). All 8 acceptance criteria met with automated test evidence (see the AC section). Gates green at merge: golangci 0, gosec 0, whole-module `go test -race` pass, opencode coverage 92.7%, `FuzzDecode*` clean, same-ms stress `-count=5 -race` clean, files ≤400 lines, secret + AI-attribution scans PASS. 
Read-safety (the defining R1 risk): every production open routes through `openReadOnly` (`mode=ro` + `query_only(true)` + `_txlock=deferred` + short `BEGIN DEFERRED` per page; warnings buffered + flushed post-tx); no reachable write-path pragma/`VACUUM`/`ATTACH`/write `Exec`; six write-probes + the allowlist-DSN test pin it. PR opened + self-merged per the branch-protection/merge workflow (SOW sign-off was the only gate). + +## Lessons / Follow-Ups + +- **The same-ms incremental-cursor boundary is the hard part of tailing a live mutable SQLite DB.** A `(time_updated, id)` watermark cannot detect an in-place update of an old low-id row whose `time_updated` lands at the current boundary millisecond (neither `MAX(id)` nor `MAX(time_updated)` advances). codex surfaced a distinct case of this across 4 rounds (3/4/6/7); each per-case patch revealed the next. The fix that finally closed it was structural — a single boundary re-scan trigger covering ALL detection paths (`boundaryReal && (changed || probeGateOpen)`) plus a deterministic-seed **property/stress test** over random insert + in-place-update interleavings. Lesson: when a reviewer finds the same bug class N times, stop patching cases and (a) unify the mechanism, (b) write a property test that exercises the interleavings, not a per-case assertion. +- **codex is the decisive reviewer; glm/minimax are corroborators, not gates.** Across all 8 rounds glm and minimax never surfaced a substantive P1/P2, and twice declared the adapter "merge-ready / production-ready" while a real codex P1 (cursor data-loss; same-ms gap) was still open. Adjudicate on ground truth + codex; never merge on glm/minimax convergence alone. (Also disproved 2 false-positive findings on ground truth — glm's writer-COALESCE P2 and a glm same-ms false alarm.) 
+- **Subagent IDE diagnostics during file-splits/signature-changes are stale.** Every review-fix round produced a wave of ✘ DuplicateDecl / WrongArgCount / UndeclaredName captured mid-edit (file extraction, the `warmStart`/`bool` signature ripple). Ground truth was `go build` + `go vet` (which compiles tests) every time — all were stale. Verify, don't trust the diagnostic snapshot OR the subagent's gate claims. +- **Canonical-surface gaps belong in follow-up SOWs, not adapter hacks.** Three real gaps (session provider columns, per-source `/api/health` counts, a canonical attachment PayloadKind) needed `internal/canonical`/`ingest`/`presenter` changes outside the adapter's additive scope — filed as SOW-0023/0024/0025 rather than smuggled in. The adapter kept the canonical contract clean (no non-canonical op status or PayloadKind reached the DB). +- **Follow-ups to pick up:** SOW-0021 (turn-extras carrier), SOW-0022 (codex dup rollout-id), SOW-0023 (session provider carrier), SOW-0024 (per-source health counts), SOW-0025 (canonical attachment PayloadKind). Pre-existing test-file budget overages (`mapper_test.go`, `mapper_branch_test.go`, `cmd/ai-viewer-ingest/main_test.go` >400 lines) are a low-priority cleanup candidate. diff --git a/.agents/sow/pending/SOW-0023-20260530-session-provider-carrier.md b/.agents/sow/pending/SOW-0023-20260530-session-provider-carrier.md new file mode 100644 index 0000000..15ae3d3 --- /dev/null +++ b/.agents/sow/pending/SOW-0023-20260530-session-provider-carrier.md @@ -0,0 +1,78 @@ +# SOW-0023 - session provider carrier (populate sessions.provider / provider_alias) + +## Status + +Status: open + +Sub-state: proposed follow-up, awaiting operator prioritization. Discovered during SOW-0005 (opencode adapter) round-1 external review (codex P2.3). Not blocking SOW-0005 — the op-scoped provider/alias is complete and authoritative; this is a denormalized session-level convenience. 
+ +## Requirements + +### Purpose + +Make `sessions.provider` and `sessions.provider_alias` reachable. `data-model.md` §sessions defines both columns (`provider TEXT`, `provider_alias TEXT -- user-defined provider alias (opencode); NULL otherwise`), but **no canonical session event carries provider fields**: `SessionStartedEvent` (`internal/canonical/events.go`) carries `Model` only — no `Provider`/`ProviderAlias` — and the ingest writer's `sessions` upsert (`internal/ingest/writer.go`) never writes the `provider`/`provider_alias` columns (they are written only for `ops`). So the session-level provider columns the data-model promises are structurally unreachable from any adapter. This SOW adds a session provider carrier to the canonical event model + ingest writer so adapters can populate the session-level provider columns, and wires the opencode adapter (which already knows the per-message provider) to set them. + +### User Request + +Implied by `data-model.md` §sessions (the columns exist) and the opencode adapter's multi-provider awareness. SOW-0005 round-1 review surfaced that the home is unwired. + +### Assistant Understanding + +Facts: + +- `internal/canonical/events.go`: `SessionStartedEvent` = {EventBase, SessionNativeID, ParentNativeID, RootNativeID, Kind, AgentName, Model, Cwd, CallPath, Extras, ...} — no `Provider`/`ProviderAlias`. `OpStartedEvent` carries `Provider`/`ProviderAlias`. +- `internal/ingest/writer.go`: the `sessions` UPSERT maps `kind, agent_name, model, cwd, call_path, status, ...`; it does NOT reference `provider`/`provider_alias` (those are in the `ops` UPSERT only). +- `data-model.md` §sessions: `provider`, `provider_alias` columns are defined ("primary/last-known"; alias is opencode-specific). +- The opencode mapper has the per-message `providerID` (alias) and a best-effort canonical mapping; it currently surfaces them per-op + in SessionStarted `Extras.providerID` only. 
+ +Inferences: + +- Cleanest carrier: add `Provider` + `ProviderAlias` (+ optionally `Model` already present) to `SessionStartedEvent` (and/or a `SessionUpdatedEvent` for last-known refresh), mirroring how ops carry them; the writer marshals them into the `sessions` columns with the same `COALESCE(NULLIF(excluded.x,''), sessions.x)` idempotency discipline used elsewhere. +- A single session-level alias is inherently lossy for **multi-provider** sessions (opencode sessions can span providers). Decide in the gate: "primary = last-known" vs "first" vs leave NULL when >1 distinct provider. The op-scoped data remains the authoritative complete record; this column is a UI convenience. +- Shared infrastructure: claude-code/codex/opencode could all populate it. Deliberately out of SOW-0005's adapter-additive blast radius. + +Unknowns: + +- Whether the session row should carry "primary" provider (last-known) or be NULL for genuinely multi-provider sessions. Resolve in the gate against the presenter's intended use. +- Re-emit/idempotency: SessionStarted may re-emit (tailer full-tree re-feed); the writer must not corrupt the column. Confirm against the idempotent-write model. + +### Acceptance Criteria + +1. `SessionStartedEvent` (and/or `SessionUpdatedEvent`) carries `Provider` + `ProviderAlias`; `internal/canonical` tests cover it. **Verification**: `go test` for canonical. +2. The ingest writer marshals session `Provider`/`ProviderAlias` into `sessions.provider`/`sessions.provider_alias`, idempotently under re-emit. **Verification**: an ingester test asserts a `SessionStartedEvent{Provider:...,ProviderAlias:...}` lands in the columns and a re-emit does not corrupt them. +3. The opencode adapter populates the session provider columns from its per-message provider (per the gate's primary-provider decision). **Verification**: an opencode golden/invariant test asserts the session event's provider fields; the `c_multi_provider` fixture exercises the multi-provider decision. +4. 
Specs reconciled: SOW-0005 AC#7 note resolved; `data-model.md` + `canonical-events.md` describe the session provider carrier. **Verification**: spec-drift sweep clean. + +## Analysis + +Sources checked: `internal/canonical/events.go`, `internal/ingest/writer.go`, `.agents/sow/specs/{data-model.md,canonical-events.md,adapter-opencode.md}`, `internal/adapters/opencode/mapper*.go`. Discovered 2026-05-30 during SOW-0005 round-1 review. + +Risks: + +- **R1 — Shared-surface change.** Touches `internal/canonical` + `internal/ingest` (every adapter). Mitigation: additive field (no existing adapter sets it → no behavior change until opt-in); full gate + external review. +- **R2 — Multi-provider lossiness.** A single session alias cannot represent a multi-provider session. Mitigation: the gate's primary-vs-NULL decision; the op-scoped data stays authoritative. +- **R3 — Idempotency.** Re-emitted SessionStarted must not corrupt the column. Mitigation: COALESCE/NULLIF write + a re-emit test. + +## Pre-Implementation Gate + +(To be filled by the assistant picking this SOW up. Required before moving to `current/`.) + +## Implementation + +(Empty placeholder.) + +## Validation + +(Empty placeholder.) + +## Reviews + +(Empty placeholder.) + +## Outcome + +Pending. + +## Lessons / Follow-Ups + +Pending. Related: [[SOW-0021-20260530-turn-extras-carrier]] (the same class of canonical-carrier gap, for turns). diff --git a/.agents/sow/pending/SOW-0024-20260530-per-source-counts-health.md b/.agents/sow/pending/SOW-0024-20260530-per-source-counts-health.md new file mode 100644 index 0000000..cba56a5 --- /dev/null +++ b/.agents/sow/pending/SOW-0024-20260530-per-source-counts-health.md @@ -0,0 +1,76 @@ +# SOW-0024 - per-source row counts in /api/health + +## Status + +Status: open + +Sub-state: proposed follow-up, awaiting operator prioritization. Discovered during SOW-0005 (opencode adapter) round-1 external review (codex P2.2). 
Not blocking SOW-0005 — the opencode probe LOGS its counts at startup and registers the source (visible in `/api/sources`); this SOW generalizes the richer per-source metadata into `/api/health`. + +## Requirements + +### Purpose + +Surface per-source content metadata (e.g. session/message/part counts, latest schema/migration marker) in `/api/health` (and/or `/api/sources`) as a GENERAL, all-adapter feature. Today the opencode auto-discovery probe computes `(session_count, message_count, part_count, latest_migration)` and writes them to the startup log only; SOW-0005 AC#8 originally asked for them in `/api/health`, but a bespoke opencode-only health field does not generalize and would special-case the presenter. This SOW designs a generic source-metadata surface so every adapter can contribute health-relevant counts without per-adapter presenter branches. + +### User Request + +Implied by SOW-0005 AC#8 ("exposes (session_count, message_count, part_count, latest_migration_name) in /api/health") — amended during SOW-0005 to log-only + this follow-up, because the full surface is cross-cutting and should be general. + +### Assistant Understanding + +Facts: + +- `internal/presenter/health.go` builds `/api/health` from a `sources` query + a parse-error rollup; it has no per-source content-count field. +- `cmd/ai-viewer-ingest/discovery.go` + `internal/adapters/opencode/migrations.go:ProbeStatus` compute opencode counts at startup and LOG them; they are not persisted into ai-viewer's DB for the presenter to read. +- `source_progress` / `sources` tables (`data-model.md`) hold per-source state (cursor, last_seq, last_ts, parse_errors). There is no general per-source "content summary" column. +- File-based adapters (aiagent/claude-code/codex) have no cheap O(1) row count analogous to opencode's `COUNT(*)`; a generalized surface must tolerate "count unknown/not-applicable". 
+ +Inferences: + +- A general design: a small per-source metadata blob (JSON) the adapter/probe can populate (e.g. `sources.meta_json` or `source_progress.extras`), surfaced verbatim under each source in `/api/health` (or `/api/sources`). Adapters that have cheap counts populate them; others omit. Avoids per-adapter presenter branches. +- Alternatively, a periodic ingester-side rollup of canonical row counts per source (sessions/turns/ops the ingester already wrote) — which is adapter-agnostic and always available — may be more useful than source-native counts. Decide in the gate (source-native probe counts vs ingested-canonical counts). + +Unknowns: + +- Which counts are actually useful for health triage (source-native vs ingested-canonical), and whether they belong in `/api/health` (triage) or `/api/sources` (inventory). Resolve in the gate with the presenter spec. +- Staleness: probe counts are point-in-time at startup; ingested counts are live. The gate picks the model + documents freshness. + +### Acceptance Criteria + +1. A general per-source metadata surface exists (schema + writer + presenter) that any adapter can populate without a presenter code branch. **Verification**: presenter test asserts a source's metadata round-trips into `/api/health` (or `/api/sources`). +2. The opencode source surfaces its `(session/message/part counts, latest_migration)`; file-based adapters omit gracefully (no error, no zero-as-real). **Verification**: an integration test with an opencode fixture + a file-based fixture asserts the opencode metadata appears and the file-based one is absent/omitted. +3. Specs reconciled: SOW-0005 AC#8 amendment resolved; `data-model.md` + `observability.md`/`rest-api.md` describe the surface. **Verification**: spec-drift sweep clean. + +## Analysis + +Sources checked: `internal/presenter/health.go`, `cmd/ai-viewer-ingest/discovery.go`, `internal/adapters/opencode/migrations.go`, `.agents/sow/specs/{data-model.md,observability.md,rest-api.md}`. 
Discovered 2026-05-30 during SOW-0005 round-1 review. + +Risks: + +- **R1 — Cross-cutting surface.** Touches schema + ingester + presenter. Mitigation: additive (new optional metadata; no existing field changes); full gate + external review. +- **R2 — Generalization.** Must not special-case opencode in the presenter. Mitigation: the metadata blob/rollup is adapter-agnostic; adapters opt in. +- **R3 — Freshness semantics.** Probe-time vs live counts. Mitigation: the gate decides + documents which, and `/api/health` labels it. + +## Pre-Implementation Gate + +(To be filled by the assistant picking this SOW up. Required before moving to `current/`.) + +## Implementation + +(Empty placeholder.) + +## Validation + +(Empty placeholder.) + +## Reviews + +(Empty placeholder.) + +## Outcome + +Pending. + +## Lessons / Follow-Ups + +Pending. diff --git a/.agents/sow/pending/SOW-0025-20260531-canonical-attachment-payloadkind.md b/.agents/sow/pending/SOW-0025-20260531-canonical-attachment-payloadkind.md new file mode 100644 index 0000000..ca75ce4 --- /dev/null +++ b/.agents/sow/pending/SOW-0025-20260531-canonical-attachment-payloadkind.md @@ -0,0 +1,75 @@ +# SOW-0025 - canonical attachment PayloadKind (file/user attachments) + +## Status + +Status: open + +Sub-state: proposed follow-up, awaiting operator prioritization. Discovered during SOW-0005 (opencode adapter) round-4 external review (codex P2-3). Not blocking SOW-0005 — opencode ships an interim no-loss representation (an INF LogEntry carrying the attachment metadata). + +## Requirements + +### Purpose + +Give user/file attachments a first-class canonical representation. The canonical `PayloadRefEvent.PayloadKind` set is `llm_request | llm_response | llm_sdk_request | llm_sdk_response | llm_reasoning | tool_request | tool_response | log` (`internal/canonical/events.go`). None represents a USER file attachment (an image/file a user attached to a turn). opencode's `file` part carries exactly that (filename/url/mime). 
SOW-0005 round-4 surfaced that the adapter was emitting a non-canonical `PayloadKind: "user_attachment"` — a contract violation — and fixed it by emitting an INF `LogEntryEvent` with the attachment metadata in extras instead (no data loss, canonical-clean, but not a first-class payload servable via the future `/api/payloads`). This SOW adds a canonical attachment payload kind so file attachments across adapters (opencode today; codex/claude-code likely have analogues) are a first-class, servable payload. + +### User Request + +Implied by the data model (payloads are first-class) + opencode's file parts. SOW-0005 round-4 review flagged the canonical-contract gap. + +### Assistant Understanding + +Facts: + +- `internal/canonical/events.go`: `PayloadRefEvent.PayloadKind` documents 8 kinds; none is a user/file attachment. +- `internal/adapters/opencode/mapper_emitters.go`: a `file` part now emits an INF `LogEntryEvent` ("file attachment", extras `{filename,url,mime}`) — the SOW-0005 round-4 interim (was a non-canonical `user_attachment` PayloadRef). +- The `/api/payloads` serving route is itself Phase 2 (unregistered today), so the interim LogEntry loses no currently-served capability. + +Inferences: + +- Cleanest carrier: add a canonical `user_attachment` (or `attachment`) value to the `PayloadRefEvent.PayloadKind` set, document it in `canonical-events.md` + `data-model.md`, ensure the ingest writer accepts it (payload_refs.kind is TEXT — likely no enum constraint, confirm), and the presenter/UI renders it. Then opencode's `file` part emits a PayloadRef of that kind (LocationURI = the file url / `opencode-sqlite://` ref) instead of (or in addition to) the LogEntry. +- This is a shared-surface change (canonical + ingest + presenter + every adapter that has attachments) — deliberately out of SOW-0005's adapter-additive scope. 
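The carrier inferred above can be sketched in Go. This is only an illustration under stated assumptions, not the contents of `internal/canonical/events.go`: the `PayloadKind` named type, the constant names, and the `isCanonicalKind` helper are hypothetical. Only the eight existing kind values and the candidate `user_attachment` value come from the text.

```go
package main

import "fmt"

// PayloadKind is a hypothetical named string type used for illustration.
type PayloadKind string

// The first eight values are the kinds listed in this SOW. The last one is the
// proposed addition; its final name ("user_attachment" vs "attachment") is
// still an open decision for the gate.
const (
	KindLLMRequest     PayloadKind = "llm_request"
	KindLLMResponse    PayloadKind = "llm_response"
	KindLLMSDKRequest  PayloadKind = "llm_sdk_request"
	KindLLMSDKResponse PayloadKind = "llm_sdk_response"
	KindLLMReasoning   PayloadKind = "llm_reasoning"
	KindToolRequest    PayloadKind = "tool_request"
	KindToolResponse   PayloadKind = "tool_response"
	KindLog            PayloadKind = "log"
	KindUserAttachment PayloadKind = "user_attachment" // proposed, not yet canonical
)

// canonicalKinds is the allowed set a contract test could check emitted events against.
var canonicalKinds = map[PayloadKind]struct{}{
	KindLLMRequest: {}, KindLLMResponse: {}, KindLLMSDKRequest: {},
	KindLLMSDKResponse: {}, KindLLMReasoning: {}, KindToolRequest: {},
	KindToolResponse: {}, KindLog: {}, KindUserAttachment: {},
}

// isCanonicalKind reports whether k belongs to the allowed set.
func isCanonicalKind(k PayloadKind) bool {
	_, ok := canonicalKinds[k]
	return ok
}

func main() {
	for _, k := range []PayloadKind{"llm_request", "user_attachment", "image"} {
		fmt.Printf("%s canonical=%v\n", k, isCanonicalKind(k))
	}
}
```

A membership check of this shape would have caught the round-4 contract violation at test time, since any kind outside the set fails the lookup.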
+ +Unknowns: + +- Whether `/api/payloads` (Phase 2) should land first so the attachment payload is actually servable, or whether the kind can be added ahead of the serving route. Sequence in the gate. +- Whether codex/claude-code/ai-agent have attachment analogues to map at the same time (cross-adapter consistency). + +### Acceptance Criteria + +1. The canonical `PayloadRefEvent.PayloadKind` set includes an attachment kind; `internal/canonical` + `data-model.md` + `canonical-events.md` document it. **Verification**: `go test` for canonical; spec-drift sweep clean. +2. The ingest writer persists the new kind into `payload_refs.kind`; a presenter/UI surface renders it (or it is explicitly deferred with the kind reserved). **Verification**: ingester + presenter tests. +3. The opencode `file` part emits a PayloadRef of the attachment kind (replacing or complementing the interim LogEntry); golden updated. **Verification**: an opencode golden with a file part asserts the canonical attachment PayloadRef. +4. Specs reconciled: adapter-opencode.md file-part mapping updated; SOW-0005 round-4 interim note resolved. **Verification**: spec-drift sweep clean. + +## Analysis + +Sources checked: `internal/canonical/events.go`, `internal/ingest/writer.go` (payload_refs upsert), `internal/adapters/opencode/mapper_emitters.go`, `.agents/sow/specs/{canonical-events.md,data-model.md,adapter-opencode.md}`. Discovered 2026-05-31 during SOW-0005 round-4 review. + +Risks: + +- **R1 — Shared-surface change.** Touches canonical + ingest + presenter + adapters. Mitigation: additive enum value (no existing kind changes); full gate + external review. +- **R2 — Serving route ordering.** `/api/payloads` is Phase 2; the kind may land before it is servable. Mitigation: the gate sequences this; the kind is reserved + rendered even if served later. + +## Pre-Implementation Gate + +(To be filled by the assistant picking this SOW up. Required before moving to `current/`.) 
+ +## Implementation + +(Empty placeholder.) + +## Validation + +(Empty placeholder.) + +## Reviews + +(Empty placeholder.) + +## Outcome + +Pending. + +## Lessons / Follow-Ups + +Pending. Related canonical-carrier-gap follow-ups: SOW-0021 (turn extras), SOW-0023 (session provider), SOW-0024 (per-source health counts). diff --git a/.agents/sow/specs/adapter-opencode.md b/.agents/sow/specs/adapter-opencode.md index 0c5160b..35dce11 100644 --- a/.agents/sow/specs/adapter-opencode.md +++ b/.agents/sow/specs/adapter-opencode.md @@ -248,29 +248,40 @@ Observed on the operator's DB: **0 rows** in both tables. This is an opencode-in | `name` TEXT NULL | migration directory name | | `applied_at` TEXT NULL | | -Observed: 20 migrations applied (range `20260127222353_familiar_lady_ursula` … `20260511000411_data_migration_state`). The adapter queries this table at startup to determine which optional columns are present (see Edge Cases). +Observed: 20 migrations applied (range `20260127222353_familiar_lady_ursula` … `20260511000411_data_migration_state`). opencode applies migrations from a journal of `{sql, timestamp, name}` entries ordered by the migration directory name, which embeds a `YYYYMMDDHHMMSS` timestamp prefix (anomalyco/opencode `packages/opencode/src/storage/db.ts`); Drizzle's standard `__drizzle_migrations` row carries an auto-increment `id` that increases in application order. The adapter reads the `name` column ordered by `id ASC` (application order) at scan/tail start. + +That ordered name list serves two purposes (chunk D): + +- **Schema hash.** `schema_hash` in the cursor (see Cursor) is `sha256(strings.Join(names, "\n"))` over the ordered names — a stable digest that changes only when opencode applies a new migration. It supersedes chunk C's interim present-column-shape fingerprint (which hashed the readable column shape as a placeholder before this table was read). 
The watermark semantics are unchanged: a hash mismatch logs a structured WARN, re-reads, and continues WITHOUT resetting watermarks (column drift is handled per-column by the dynamic SELECT; a depended-on column vanishing is the only re-ingest trigger). +- **Latest migration / counts (AC#8).** `latest_migration` is the name with the highest `id` (last applied). The auto-discovery probe (`ProbeStatus`) reads it alongside `COUNT(*)` of `session`/`message`/`part` so `/api/health` and the discovery log surface what the source will yield. A missing `__drizzle_migrations` table (a very old or foreign SQLite file) is non-fatal: the probe returns empty names + a soft sentinel, the schema hash is left empty, and no migration is reported — the adapter degrades rather than crashing. + +`ProbeStatus` opens the database read-only via the same `openReadOnly` helper. The three `COUNT(*)` queries are full counts; on a multi-GB database that costs a few hundred ms ONCE at startup, which is acceptable for a one-time discovery probe (the steady-state tailer never runs them). A table that does not exist makes its count 0 and is noted as a soft error rather than failing the probe, so a foreign SQLite file the probe stumbles on degrades gracefully. ## Read Strategy The defining constraint: opencode's writer holds the database open and may commit transactions at any time. ai-viewer is a strict read-only consumer. The adapter MUST: -1. Open with `mode=ro&_journal_mode=WAL&_busy_timeout=5000&_txlock=deferred`. +1. Open with `mode=ro&_txlock=deferred` plus the fixed read-only PRAGMA set below. 2. Use `modernc.org/sqlite` (CGO-free, per AGENTS.md tech stack). -3. Open a **fresh connection per poll cycle** or keep a pool with `SetMaxOpenConns(1)` for the read path. Opening read-only against a WAL-mode database is non-blocking for the writer — multiple readers and a single writer can proceed concurrently (SQLite WAL guarantee). Concrete DSN: +3. 
Open a **fresh connection per poll cycle** or keep a pool with `SetMaxOpenConns(1)` for the read path. Opening read-only against a WAL-mode database is non-blocking for the writer — multiple readers and a single writer can proceed concurrently (SQLite WAL guarantee). Concrete DSN (the one `buildReadOnlyDSN` rebuilds, `conn.go`): ``` -file:%2Fhome%2Foperator%2F.local%2Fshare%2Fopencode%2Fopencode.db?mode=ro&_pragma=busy_timeout(5000)&_pragma=journal_mode(WAL)&_pragma=query_only(true)&_pragma=foreign_keys(off)&_txlock=deferred +file:%2Fhome%2Foperator%2F.local%2Fshare%2Fopencode%2Fopencode.db?mode=ro&_txlock=deferred&_pragma=query_only(true)&_pragma=busy_timeout(5000) ``` Key choices: - `mode=ro` — refuse any write at the OS level. The OS opens the file `O_RDONLY`; SQLite cannot upgrade the connection. - `_pragma=query_only(true)` — defense-in-depth; rejects any UPDATE/INSERT/DELETE at the SQL layer. -- `_pragma=journal_mode(WAL)` — does NOT change the file (the writer already set it); confirms our connection enters WAL reader mode so we read a consistent snapshot from the WAL checkpoint and never block the writer. - `_pragma=busy_timeout(5000)` — wait up to 5 s when an exclusive lock is held (which should be rare in WAL mode but happens during `PRAGMA wal_checkpoint(TRUNCATE)`). -- `_pragma=foreign_keys(off)` — readers don't need FK enforcement; cheaper. - `_txlock=deferred` — defer BEGIN until first statement; for reads this just means "snapshot taken on first SELECT". +**No `journal_mode` in the DSN (SOW-0005 round-2 P3-B).** `conn.go`'s `readOnlyPragmas` allowlist deliberately OMITS `journal_mode(WAL)`: the journal mode is a WRITER concern recorded in the database header by whoever created/opened it read-write (opencode), and a read-only connection inherits the database's existing mode — it cannot (and must not try to) change it. 
Setting `journal_mode` on a `mode=ro` connection is a no-op at best and an attempted write at worst; either way it earns nothing, so the reader simply does not send it. The reader still gets WAL-reader snapshot semantics automatically because the database file is already in WAL mode. Likewise `foreign_keys(off)` is not sent — FK enforcement only matters for writes, and the reader issues none. + +**DSN is an ALLOWLIST, not a denylist (SOW-0005 P1.2).** `buildReadOnlyDSN` parses the caller-supplied query string only to VALIDATE it, then DISCARDS it and rebuilds the query from scratch with exactly: `mode=ro`, `_txlock=deferred`, and the fixed `readOnlyPragmas` set (`query_only(true)`, `busy_timeout(5000)`). Therefore NO caller-supplied `_pragma` survives — neither one that name-collides with the read-only set NOR a non-colliding write-path pragma (`wal_checkpoint(TRUNCATE)`, `optimize`, `foreign_keys(on)`, …) — and a caller `_txlock=exclusive` is replaced with `deferred`. A maliciously-constructed path string therefore cannot reach a write-path pragma or an exclusive (write-lock) BEGIN. The earlier denylist that stripped only colliding `_pragma` names is replaced by this allowlist. + +**`buildReadOnlyDSN` input is a filesystem path on the CLI; `file:`/`:memory:` are programmatic-only (SOW-0005 round-3 P2-4).** `buildReadOnlyDSN` accepts three shapes: a bare filesystem path (normalised to an absolute `file:` URI), an already-built `file:` URI, and the in-memory `:memory:` form. The CLI's opencode source location (auto-discovery + `--source opencode:`) is ALWAYS a filesystem path — `cmd/ai-viewer-ingest`'s `startSource` calls `os.Stat(location)` before constructing the adapter, which fails for a `file:`/`:memory:` DSN string. The `file:`/`:memory:` shapes exist for the adapter's own programmatic and test callers (throwaway shared-cache DBs); they are NOT supported `--source` locations and are not a CLI feature. 
This is a documented contract, not a code restriction: the adapter still accepts those shapes when called directly. + **Never** call `PRAGMA wal_checkpoint`, `PRAGMA optimize`, `VACUUM`, `BEGIN EXCLUSIVE`, or `ATTACH … AS … 'rwc'`. The connection MUST remain a pure reader. Connection pool settings: @@ -281,6 +292,57 @@ Connection pool settings: Read transactions: every query batch wraps in `BEGIN DEFERRED ... COMMIT` so the adapter sees a consistent snapshot across multi-statement cursors. Long transactions on a WAL DB pin the WAL and stop checkpointing, so each cycle keeps its transaction shorter than 1 s of wall time. If a backfill must read more than a few thousand rows, it does so in **paged transactions** (close the transaction every N rows, then start a new one), accepting that the snapshot advances between pages. +### Delta query, affected-session derivation, and tree load (Chunk C) + +The delta-query layer is the bridge between the watermark cursor and the pure mapper. It runs three steps per change cycle, each in its own short read transaction: + +1. **Paged delta query per tracked table.** Each `session`/`message`/`part`/`session_message` table is paged from its `TableWatermark` with the composite-key SELECT (`buildSelect`, naming only live columns, never `SELECT *`): + + ```sql + SELECT <live columns> FROM <table> + WHERE time_updated > :u OR (time_updated = :u AND id > :id) + ORDER BY time_updated, id LIMIT 1000 + ``` + + Each page runs inside a `BEGIN DEFERRED` read transaction opened via `database/sql`'s `BeginTx{ReadOnly:true}` and committed promptly, keeping the WAL unpinned. Paging continues until a short page (`< 1000` rows) returns. The new max `(time_updated, id)` seen across all pages becomes the table's advanced watermark. + + **No id-only fallback (SOW-0005 P3.1).** `time_updated` is a REQUIRED column for every tracked table (`requiredColumns`); `introspectAll` fails fast when it is absent, so a table that reaches a delta query ALWAYS has `time_updated`. 
The composite-key SELECT above is therefore the ONLY delta query. The earlier pre-`Timestamps`-mixin id-only fallback (`buildSelectByID`) was unreachable dead code — `introspectAll`'s required-column gate makes a `time_updated`-less table fatal upstream, never a delta-query input — and it was removed along with its introspection-bypassing isolation test. + + **Corrupt delta-row cells (SOW-0005 round-4 P2-1, extended round-5 P2-2).** The per-table delta scanners carry the same `onWarn` hook the non-delta `loadSession` path uses, so a corrupt OPTIONAL numeric cell (a non-NULL value unparseable as its column's numeric type — e.g. `cost`, `tokens_*`, `time_created`) surfaces a structured WARN with table/column context and degrades to 0 rather than being silently coerced. Two classes of column are treated more strictly and ERROR the row instead (`i64Required`/`strRequired`, which aborts the delta page so the cursor is NEVER advanced past the corrupt row; the error is non-fatal to the loop — it is surfaced via `onError` and the cursor stays at the last good position): + - the two REQUIRED **cursor-watermark** columns (`id`, `time_updated`, via `requiredWatermark`) — a `time_updated` coerced to 0 could regress the watermark (round-4 P2-1); + - the REQUIRED **owning-id** columns (`message.session_id`, `part.message_id`, `part.session_id`, `session_message.session_id`, via `requiredOwner`) — these derive the AFFECTED session the tailer reloads (Affected-session derivation, below). An empty/corrupt value would be silently swallowed by `affectedSet.add("")` while the row handler SUCCEEDED, so the cursor would advance PAST a change that emitted no content — a permanent, health-invisible loss. For `part`, the owning session is `part.session_id` (denormalized, required); if it is empty the page errors rather than falling through `resolvePartSession` to an empty affected id (round-5 P2-2). 
+ `session_message.type` is NOT an owning id — it keeps its existing unknown-type WARN behaviour (a missing/unrecognized type never becomes a fatal error). All these columns are in `requiredColumns`, so the column itself is always PRESENT (`introspectAll` makes its absence fatal); the only failure mode reaching the scanner is a corrupt/empty cell value. The boundary re-scan shares the same scanners and therefore the same guard. + +2. **Affected-session derivation.** From the changed rows, the layer computes the SET of session ids whose full tree must be reloaded and re-mapped: + - a changed `session` row contributes its own `id`; + - a changed `message` row contributes its `session_id`; + - a changed `part` row contributes its `session_id` (the `part` table denormalizes `session_id`); on a hypothetical old schema where `part` lacks `session_id`, the owning session is resolved via an indexed `SELECT session_id FROM message WHERE id = :message_id` lookup (with the changed-message delta consulted first to avoid the query). This delta-path resolver (`resolvePartSession`) is distinct from the *tree-load* path below; + - a changed `session_message` row contributes its `session_id`. + + The set is de-duplicated: a session touched by several tables in one cycle is reloaded exactly once. + + **No tree-load `message_id IN (...)` fallback (SOW-0005 round-3 P3-1).** `part.session_id` is a REQUIRED column (`requiredColumns["part"]`), so `introspectAll` makes a `part` table lacking it FATAL upstream — it can never be a tree-load input. The round-2 P2-B `loadPartsByMessageIDs` / `selectPartsByMessageIDs` fallback (a `WHERE message_id IN (?,…)` part query for a `part` table without `session_id`) was therefore UNREACHABLE in production and has been removed, along with its introspection-bypassing isolation test. The tree load always reads parts via the single indexed `WHERE session_id = ?` query. 
(The delta-path `resolvePartSession` above is a separate concern and keeps its message-PK lookup as documented.) + + **Full-tree scanners validate the required ownership/id columns too (SOW-0005 round-7 P2-3).** The DELTA scanners abort the page on a corrupt/empty required ownership column (round-5 P2-2, above). The FULL-TREE load path (`scanPartRows` in `store_load.go`, which partitions a session's parts into a `map[message_id][]partRow` for the mapper) must apply the SAME discipline: a `part` row whose REQUIRED `message_id` (the partition key) or `session_id` is empty is **NOT** silently attached to the `out[""]` bucket — where it would be dropped when the mapper looks up parts by a message's real id — but is **skipped with a surfaced WARN** (table/column context, no raw value). The WARN is buffered in the same post-tx `warnSink` discipline (§"No warning/error/content EMISSION while a source-DB read tx is open"): it is collected during the read tx and flushed through `onWarn` only after the snapshot is released. A corrupt historical `part.message_id`/`session_id` therefore surfaces in the logs and (via the adapter's `onError` → `SourceError`) in `/api/health`, rather than vanishing into `out[""]`. The message loader's own required columns (`id`/`session_id`) are read the same way; an empty owning id is surfaced, not silently zeroed. + +3. **Full-session-tree load + map.** For each affected session id the layer loads the whole tree — the `session` row, all its `message` rows ordered by `(time_created, id)`, and each message's `part` rows ordered by `(id)` — under **ONE** bounded read-only transaction (`loadAndMapSession`), assembles `[]messageWithParts`, and calls the pure mapper (`mapSession`) on it. 
The session row read, the `time_compacting` check (Edge Cases #8), the parent-chain root resolution (`resolveRootID`), AND the message+part load all share **one consistent snapshot** (SOW-0005 round-3 P1-2): opening a second transaction for the tree after checking `time_compacting` in a first one is a TOCTOU — opencode could begin compaction between the two reads and the adapter would emit a partial/mutating tree despite the Edge #8 skip rule, and the session metadata would come from a different snapshot than its tree. A single `BeginTx{ReadOnly:true}` closes both gaps. Full-tree reload is mandatory, not partial: the mapper computes per-turn cumulative-token deltas across the ordered message list, so a partial reload would miscompute deltas. Re-emitting an unchanged session is harmless — the ingester's idempotent upserts + the (post-SOW-0004) idempotent catalog absorb it. + + **Full-tree load is an intentional design constraint, bounded by a defensive safety WARN (SOW-0005 round-3 P2-3).** The whole ordered message list must be in memory at once because per-turn token deltas are synthesized by subtracting successive cumulative snapshots (Edge Cases / AC#3) — there is no correct streaming decomposition. This is bounded in practice by real opencode session size (the largest observed session is well under the cap). As a defensive signal against a pathological or corrupt session, the loader counts the loaded messages and parts and emits ONE structured WARN via `onError` when a session exceeds `maxSessionMessagesWarn` / `maxSessionPartsWarn` (each set generously, 100 000). The session is still processed in full — the WARN surfaces the anomaly (so `/api/health` and the logs show it) without silently truncating, which would corrupt the token-delta chain. 
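The subtraction of successive cumulative snapshots described above can be sketched as follows. This is a minimal illustration, not the mapper's actual code: `stepDeltas` is a hypothetical helper name, the input totals are illustrative, and clamping a decreasing total to zero is an assumption the spec does not state.

```go
package main

// stepDeltas converts an ordered list of cumulative token totals into
// per-step deltas by subtracting each snapshot from its predecessor.
// Hypothetical helper: the real mapper may structure this differently.
func stepDeltas(cumulative []int64) []int64 {
	deltas := make([]int64, 0, len(cumulative))
	var prev int64
	for _, total := range cumulative {
		d := total - prev
		if d < 0 {
			// Assumption: a total that goes backwards yields a zero delta
			// instead of a negative token count.
			d = 0
		}
		deltas = append(deltas, d)
		prev = total
	}
	return deltas
}
```

Because every delta depends on the previous snapshot, dropping any earlier element of the ordered list changes every later delta, which is why a partial reload cannot produce correct per-op tokens.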
+ + **True tree root (SOW-0005 P2.4).** Before mapping, the layer resolves the session's TRUE tree root by walking its `parent_id` chain to the topmost ancestor (`resolveRootID`: an indexed `SELECT parent_id FROM session WHERE id=?` walked up, read-only, depth-capped at 32 with a seen-set cycle guard) and injects it into the mapper (`WithRootNativeID`). A nested sub-agent's `RootNativeID` is therefore the whole tree's root, not its direct parent; `ParentNativeID` still points at the direct parent. If the chain cannot be fully resolved (a missing ancestor row, a cycle, or the depth cap), it falls back to the furthest resolvable ancestor (the direct parent on a one-step failure) and surfaces one WARN via `onError`. + + **`reloadAndEmit` error policy — only known non-retryable cases are skip-and-continue (SOW-0005 round-7 P1-2).** When reloading an affected session, `reloadAndEmit` treats exactly two outcomes as non-fatal skip-and-continue: (1) `errSessionGone` — the `session` row could not be loaded (deleted between the delta page and the load, or a part/message orphaned from its session); the session is legitimately gone, so it is skipped with one structured `onError` and the cursor may advance past it. (2) the `time_compacting` pause (`skipped == true`) — the session is mid-compaction and re-surfaces in a later delta when the column clears (Edge Cases #8). ANY OTHER load/map error (a transient tree-load/read error, a commit failure, a corrupt-tree decode error) **propagates up** so `commitBatch` does **NOT** promote the cursor: the same rows are retried on the next cycle. The earlier code swallowed every non-context error (logged + continued) and then let `commitBatch` advance the cursor anyway — a transient read error therefore skipped an affected session's content while persisting a cursor beyond the rows that required that reload, a permanent, health-invisible content loss. 
Propagating the error keeps the checkpoint-after-emit invariant intact: a cursor is promoted only after every affected session's content was successfully emitted, never after a session was skipped on a retryable error. + +**Checkpoint-after-emit invariant (SOW-0005 P1.1, data-loss fix).** A `SourceProgress` checkpoint carrying cursor `W` is emitted ONLY after every session affected by rows ≤ `W` in this run has been reloaded, mapped, and emitted. The pipeline (`processChanges` → `batchProcessor`) runs in BOUNDED BATCHES: each batch pages ≤ `progressEveryRows` delta rows forward ACROSS the tracked tables (one shared row budget, so a session touched by several tables is reloaded once per batch — cross-table dedupe), `reloadAndEmit`s that batch's affected sessions, and ONLY THEN advances the persisted cursor + checkpoints. On ctx-cancel/error mid-batch the LAST fully-committed cursor (the previous batch's) is returned, never the in-progress batch's scanned watermark — so a restart from the persisted cursor can never resume PAST rows whose canonical events were never emitted. The earlier scheme that emitted a `SourceProgress` every `progressEveryRows` rows DURING paging (before the affected sessions were emitted) advanced the watermark ahead of content and is replaced. + +**No warning/error/content EMISSION while a source-DB read tx is open (SOW-0005 round-5 P2-1).** Every warning/error the adapter raises ultimately reaches `onError` → a `SourceErrorEvent` send on the adapter's out channel (`adapter.go` `OnError`); a content event is a send on the same channel. A blocking send (a slow/backpressured ingester) must NEVER happen while a source-DB read transaction is open, or it would pin the WAL snapshot on the live multi-GB opencode database and delay opencode's own checkpoint. 
Therefore: + - The delta scanners (`scanOnePage`), the boundary re-scan (`scanBoundaryBucket`), and the full-session-tree load (`loadAndMapSession`) BUFFER every WARN/ERROR raised inside a tx into an in-memory `warnSink` (a non-blocking slice append), instead of calling the live `onError` under the open snapshot. + - The tx is committed/rolled back FIRST (explicitly, not via a deferred rollback that would still be open during a flush), and ONLY THEN are the buffered warnings flushed through the real `onError`. + - Content events are likewise emitted only after the tx closes: `loadAndMapSession` runs the pure mapper AFTER `tx.Commit()` and returns the events for the caller (`reloadAndEmit`) to emit — so neither a warning nor a content event reaches the channel with the snapshot held. + - A FATAL row error (a corrupt REQUIRED watermark/owning-id cell — round-4 P2-1 / round-5 P2-2) is RETURNED (the page is aborted so the cursor does not advance), and its surfacing to `onError` also happens after the tx is closed. + +**Per-batch full-tree re-emit is an accepted v1 tradeoff (SOW-0005 round-4 P2-5).** Affected-session dedupe is PER BATCH, not across batches: a session whose rows span multiple batches (a large session being backfilled, or a session that changed across several poll cycles) has its WHOLE tree reloaded and re-emitted once per batch it appears in. This is intentional — it is the simplest scheme that preserves the checkpoint-after-emit crash-safety invariant (a batch's cursor is only committed after that batch's sessions are emitted, so dedupe state cannot outlive a committed checkpoint without risking a resume that skips an un-emitted session). The re-emission is absorbed idempotently by the ingester's upserts + the idempotent catalog, and the cost is bounded by real session size (well under the defensive cap). 
Cross-batch coalescing (a session-level dedupe cache spanning the whole run) is a deferred optimization, not a v1 requirement — it would add state that complicates the crash-safety reasoning for a saving that only matters for the rare multi-batch session. + ## Watch Strategy SQLite does not notify external consumers of writes. The adapter has two complementary signals: @@ -309,6 +371,63 @@ Rationale for the floor at 250 ms (not lower): opencode writes can be very chatt Rationale for using `time_updated` over `id` (rowid) for the watermark: opencode rewrites in-place to update `tokens`, `cost`, `status` etc. on existing rows (via Drizzle `.$onUpdate`). An `id`-based cursor would miss these mutations. `time_updated` catches both inserts and updates. +### Poll-loop state machine and the `MAX(time_updated)` gate (Chunk C) + +The realtime tailer is a timer-driven poll loop with an fsnotify wakeup hint. It is implemented as two free functions Chunk D's `adapter.go` calls (mirroring codex's free-function tailer rather than methods on the `Adapter` struct): + +- `scanLoop(ctx, dbPath, sourceID, since, out, logger, onError) (Cursor, error)` — the historical backfill: introspect once, emit one INFO per missing optional column (see Edge Cases #1), record the schema hash into the cursor, page every tracked table from `since`, derive affected sessions, reload + map each, emit, checkpoint `SourceProgress` every ~1000 rows processed and once at the end, return the advanced cursor. +- `tailLoop(ctx, dbPath, sourceID, cur, out, logger, onError) error` — the realtime follow until `ctx` is cancelled (returns `nil` on cancel); also emits the missing-optional-column INFO set once at its introspection. + +The `logger` parameter is `*slog.Logger`, threaded from `Adapter.logger` (non-nil after `New`, which defaults to `slog.Default()`); both loops guard a nil logger defensively (`slog.Default()`) so a direct test caller passing nil does not panic. 
+ +**Adapter Scan→Tail cursor hand-off (Chunk D).** `adapter.go` mirrors codex: `Scan` records the final advanced cursor on the `Adapter` instance (`scanCursor`) even on `ctx` cancellation, and a following `Tail` on the SAME instance resumes from it instead of snapshotting current HEAD — closing the data-loss window where rows committed between `Scan` finishing and `Tail` starting would otherwise be skipped. Any re-emission of an already-seen session tree is absorbed by the ingester's idempotent upserts. A **cold `Tail`** (no preceding `Scan`, e.g. a resumed daemon whose `Scan` ran in a previous process) builds a HEAD-snapshot cursor: open read-only, introspect, and set each tracked table's watermark to its current `MAX(id)` + `MAX(time_updated)` (via `maxID`/`maxTimeUpdated`). This is the SQLite analogue of codex stat'ing current file sizes — `Tail` then follows from NOW rather than replaying full history. The HEAD snapshot also records the real `__drizzle_migrations` schema hash into the cursor. A missing/unreadable database during the snapshot surfaces one structured error and `Tail` returns cleanly (the daemon keeps serving other sources). + +**Cadence intervals** (decided, SOW-0005 Open Decision #2): + +- **Idle** poll interval: 2 s (the previous cycle produced no change). +- **Active** poll interval: 500 ms (the previous cycle produced a change). +- **WAL-event floor**: 250 ms for a 5 s window after an `opencode.db-wal` fsnotify Write/Chmod event. + +The next interval is the minimum of the active/idle interval and the WAL-event floor when the floor window is open. + +**Cheap primary change check.** Every poll first runs the PK-indexed `MAX(id)` per table (a b-tree lookup on the time-prefixed Sonyflake PK, ~µs). When any table's `MAX(id)` exceeds the cursor's `MaxID`, the cycle runs the delta+reload+emit path. On a current schema this catches every INSERT. 
+ +**The gated `MAX(time_updated)` probe.** In-place row mutations (token totals, status, archive) do NOT change `MAX(id)`, so they are caught by the unindexed `MAX(time_updated)` probe — which on the 585k-row `part` table is a 400–800 ms full scan and therefore MUST NOT run on every idle poll. It runs only when the gate is open. The gate predicate is a pure function: + +``` +shouldProbeTimeUpdated(now, lastWALEvent, lastProbe, safetyNet) = + lastWALEvent.After(lastProbe) || now.Sub(lastProbe) >= safetyNet +``` + +i.e. the probe is issued only when (a) a WAL-mtime fsnotify event has fired since the last probe, OR (b) the 60 s safety-net interval has elapsed since the last probe. `safetyNet` is **60 s** (decided, matching Performance §"suspect activity gate" and AC#6). During steady-state idle with no WAL events the predicate is false on every poll, so the expensive scan never runs — the property AC#6 pins. + +**Boundary-millisecond re-scan (SOW-0005 round-3 P1-1).** An already-seen LOW-id row updated **in place at exactly the cursor's boundary millisecond `T`** is otherwise lost forever: it moves neither `MAX(id)` (no insert → the cheap path stays silent) nor `MAX(time_updated)` (the bucket value is unchanged → the gated `MAX(time_updated) > T` check stays silent), and the forward delta query's strict tie-break `(time_updated = T AND id > highID)` excludes that low-id row. The gap exists only when that boundary-ms in-place update is the session's ONLY change (any other change flags the session and the affected-session reload re-emits its whole tree, picking the row up). 
+ +The fix is a dedicated, bounded **boundary-bucket re-scan** (`emitBoundarySessions`/`boundaryAffectedSessions`): for each tracked table whose cursor `MaxTimeUpdatedMs == T > 0`, it runs `SELECT FROM WHERE time_updated = :T ORDER BY time_updated, id` (the FULL boundary bucket, regardless of id — an equality on the boundary ms), collects the owning session ids (reusing the same per-table `deltaRowHandler` derivation, including the part→session resolver), and feeds them into the SAME `reloadAndEmit` path. The cursor is **NOT advanced** by this scan (the boundary rows are already at the watermark; re-emitting their session trees is idempotent). It is bounded to the single boundary millisecond — it never walks earlier buckets and never pages — so its cost is the tiny set of rows sharing `T`. It is deliberately NOT folded into the forward delta query (which both Scan and Tail share): lowering that query's bound to `>=` would risk a non-advancing backfill loop, whereas the separate one-shot bucket scan is self-contained. + +**Unified pre-advance trigger (SOW-0005 round-7 P1-1).** The boundary re-scan runs FIRST in `pollOnce` — against the cursor's *current* (pre-advance) `MaxTimeUpdatedMs == T` — and ONLY THEN does `processChanges` page the forward delta (which advances the cursor). It fires whenever, **for a warm/real boundary (`boundaryReal == true`)**, EITHER: + +- **`changed == true` on ANY detect path** — the cheap `MAX(id) > MaxIDSeen` insert path OR the gated `MAX(time_updated) > T` probe. The re-scan does **NOT** key off `detectChange`'s `probed` output (round-7 P1-1 closes that gap); a true INSERT detected by the cheap path (`changed == true, probed == false`) still arms the boundary re-scan, OR +- **the probe gate is open this cycle** — `shouldProbeTimeUpdated(now, lastWALEvent, lastProbe, safetyNet)` is true (a WAL event since the last probe, or the 60 s net elapsed). 
This is the idle in-place-update path: no INSERT (`changed == false` via the cheap path) but the gate opened, so the bounded bucket re-scan re-checks `T`. + +Formally: `runBoundary := boundaryReal && (changed || probeGateOpen)`. + +**Why the cheap-path co-occurrence needed closing (the round-7 P1-1 class).** Before round-7 the trigger keyed off `probed` (`gateOpen := probed && …`). But `detectChange` returns EARLY on the cheap `MAX(id) > MaxIDSeen` path with `changed == true, probed == false`, BEFORE the gated `MAX(time_updated)` probe runs. So when a true INSERT (`MAX(id)` advances) **co-occurred in the same poll** with an in-place UPDATE of a LOW-id row re-stamped to the boundary ms `T`, the cheap path short-circuited (`probed == false`) → the boundary re-scan was skipped → `processChanges` advanced the watermark past `T` for the INSERT → the low-id in-place update (excluded from the forward delta by the strict tie-break `id > highID`) fell permanently below the new watermark, never seen (a zero-gaps violation). The round-7 trigger arms on `changed == true` **regardless of which path detected it**, so the co-occurring INSERT case re-emits the same-ms session's tree against the pre-advance `T` before the forward delta moves the cursor. (Round-3 introduced the boundary re-scan, round-4 added the safety-net path, round-6 ran it before the forward delta on the gated path; round-7 closes the cheap-`MAX(id)` co-occurrence — the 4th same-ms case.) + +**`boundaryReal` is the single cold-`Tail` guard, applied CONSISTENTLY to every trigger (SOW-0005 round-7 P2-1).** `boundaryReal` gates the WHOLE unified trigger above — both the `changed` path and the gate-open path — so it is impossible for any boundary-re-scan path to replay a never-emitted cold snapshot. It means the cursor's boundary `T` is a position whose bucket was ALREADY emitted, so re-scanning it is idempotent rather than a cold replay. 
It starts **true for a WARM `Tail`** (resumed from a Scan cursor — Scan already emitted the boundary) and **false for a COLD `Tail`** (HEAD snapshot, follow-from-now: the boundary bucket was never emitted); `pollOnce` flips it true once the cursor first advances (the new boundary is the just-emitted forward position). The earlier `priorProbe` flag — which guarded only the `changed == false` path and was a SEPARATE, partial cold guard — is **removed** (round-7 P2-1): a cold Tail whose first poll happened to be WAL-driven, or whose first safety-net probe ran with `priorProbe` already set, could otherwise replay the HEAD-snapshot boundary bucket on the `changed == false` path. `boundaryReal == false` until the first genuine cursor advance now suppresses the re-scan on ALL paths, closing that hole and collapsing two guards into one. + +This precision is what reconciles the fix with the existing contracts: +- A steady-state IDLE DB **WITHIN the 60 s net** with no WAL event has `changed == false` AND `probeGateOpen == false`, so the boundary re-scan never runs on an idle poll. AC#6's zero-expensive-query idle property is untouched (the cheap `MAX(id)` idle path is unchanged — strict `>` against the monotonic `MaxIDSeen` — and the gated `MAX(time_updated)` probe is still issued only when `shouldProbeTimeUpdated` is true); the round-2 counting-driver / `TestP1A_OldRowUpdateDoesNotReArmIdleScan` tests stay green. The re-scan fires only when something changed OR the gate OPENS — a WAL event or the 60 s net tick — which is exactly the safety net's purpose (a once-per-60 s bounded bucket re-emit on an otherwise-idle DB is the accepted cost of guaranteeing a missed-WAL in-place update is eventually surfaced). 
+- A **cold `Tail` HEAD snapshot** never replays its boundary session: `boundaryReal == false` until the cursor first advances suppresses the re-scan on EVERY path (round-7 P2-1) — the cheap-`MAX(id)` change path, the gated `MAX(time_updated)` probe path, the WAL-driven gate-open path, and the safety-net gate-open path. So even a post-snapshot forward change whose new row id sorts *below* the snapshot `MAX(id)` (which trips the `time_updated` probe rather than the cheap `MAX(id)` path) does NOT replay the snapshot boundary; and a cold Tail's first WAL-driven or safety-net probe (the round-7 P2-1 hole) is suppressed too. Once the cursor advances past the snapshot, `boundaryReal` flips true and the boundary is a genuinely-emitted position thereafter. A WARM `Tail` (resumed from a Scan cursor) starts `boundaryReal == true` because Scan already emitted the boundary. +- A genuine in-place boundary update (opencode re-stamps a row's `time_updated` to the same ms → WAL mtime changes, but no id/max-time advance) is caught on the gate-open path: immediately on the WAL-driven probe, or — if the WAL hint is **missed** (a dropped fsnotify event, a watcher whose `Add` failed, or timer-only polling) — on the next 60 s safety-net probe. It is also caught when it **co-occurs with a true INSERT**: the cheap `MAX(id)` path makes `changed == true`, which arms the re-scan against the pre-advance `T` before the forward delta advances the cursor (round-7 P1-1). +- A session whose `time_compacting` **clears at the boundary ms** (compaction finished, re-stamped to the same `T`) is the same invisible-in-place-update shape: it re-surfaces via the same boundary re-scan (the cleared session row is in the boundary bucket) and emits its tree (it is no longer skipped by the Edge #8 pause). + +A table with no boundary watermark yet (cold start, `MaxTimeUpdatedMs == 0`) or without `time_updated` is skipped by the re-scan, so an empty/old-schema table never spuriously fires. 
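The unified trigger `runBoundary := boundaryReal && (changed || probeGateOpen)` and the per-table skip rule can be sketched together. `tableCursor`, `shouldRunBoundary`, and `tablesToRescan` are hypothetical names for illustration; only the boolean formula and the `MaxTimeUpdatedMs == 0` / missing-`time_updated` skip come from the text above.

```go
package main

import "sort"

// tableCursor is a simplified stand-in for one tracked table's cursor state.
type tableCursor struct {
	HasTimeUpdated   bool  // false on an old schema without time_updated
	MaxTimeUpdatedMs int64 // 0 means no boundary watermark yet
}

// shouldRunBoundary is the unified pre-advance trigger: boundaryReal guards
// every path, so a cold Tail never replays its snapshot boundary.
func shouldRunBoundary(boundaryReal, changed, probeGateOpen bool) bool {
	return boundaryReal && (changed || probeGateOpen)
}

// tablesToRescan returns the tables eligible for the boundary-bucket
// re-scan, sorted by name so the order is deterministic.
func tablesToRescan(cursors map[string]tableCursor) []string {
	var names []string
	for name, c := range cursors {
		if c.HasTimeUpdated && c.MaxTimeUpdatedMs > 0 {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}
```

The four bullet cases map onto the truth table: an idle poll inside the net is `(true, false, false)`, a cold Tail is `(false, *, *)`, and the co-occurring INSERT case is `(true, true, false)`.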
+ +**WAL fsnotify hint.** The loop sets up an fsnotify watch on the `opencode.db-wal` companion path (`-wal`) as a wakeup hint only. A Write/Chmod event records `lastWALEvent = now` (opening the 250 ms floor window and the probe gate). The hint is best-effort: if the WAL file does not exist, the watch `Add` fails, or the watcher errors, that is **non-fatal** — the loop logs once via `onError` and falls back to pure timer polling (the 60 s safety net still guarantees in-place mutations are eventually seen). A watcher error never terminates the loop. + +**Manual reload** (`/api/sources//reload`) is out of scope for Chunk C (the route does not exist yet); when added it will force one immediate cycle with the probe gate open. + ## Cursor Cursor shape, stored as opaque JSON in `sources.cursor`: @@ -345,7 +464,7 @@ The 1000-row page LIMIT keeps each read transaction short. The adapter pages unt `schema_hash` invalidates the cursor when opencode applies a new migration that affects shape we read; on mismatch the adapter logs a structured WARN, re-reads `__drizzle_migrations`, and continues without resetting the cursor (column drift is handled per-column; see Edge Cases). A full re-ingest is only triggered when a column we depend on disappears or its type changes incompatibly. -`SourceProgress` events are emitted every 1000 rows or every 5 s during steady state, whichever comes first. +`SourceProgress` events are emitted per BATCH, AFTER that batch's affected sessions are emitted (the checkpoint-after-emit invariant — see Read Strategy §"Full-session-tree load + map"). A batch's row budget is `progressEveryRows` (1000) across the tracked tables, so the persisted cursor advances at most one batch ahead of fully-emitted content, never past un-emitted content. The earlier "every 1000 rows or every 5 s" mid-paging cadence is superseded. 
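The per-batch checkpoint ordering can be sketched as a loop that promotes the cursor only after every affected session in the batch has been emitted. This is a simplified model under stated assumptions: the cursor is reduced to a single `int64` watermark (the real cursor tracks per-table state), and `batch`, `processBatches`, and `errEmitFailed` are hypothetical names.

```go
package main

import "errors"

// errEmitFailed is a hypothetical sentinel for a retryable emit failure.
var errEmitFailed = errors.New("emit failed")

// batch is a simplified stand-in for one bounded batch of delta rows.
type batch struct {
	Watermark int64    // highest cursor position scanned in this batch
	Sessions  []string // de-duplicated affected session ids
}

// processBatches emits every affected session of a batch and only then
// promotes the committed cursor to that batch's watermark. On error it
// returns the last fully committed cursor, never the in-progress one.
func processBatches(committed int64, batches []batch, emit func(sessionID string) error) (int64, error) {
	for _, b := range batches {
		for _, id := range b.Sessions {
			if err := emit(id); err != nil {
				return committed, err
			}
		}
		// All sessions for this batch were emitted: the checkpoint is safe.
		committed = b.Watermark
	}
	return committed, nil
}
```

A restart from the returned cursor re-scans the failed batch in full, so the worst case is an idempotent re-emit rather than skipped content.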
## Mapping to Canonical Events @@ -379,22 +498,26 @@ When a new `session` row appears (delta on `session` table): - `Ts = session.time_created * 1000` (convert ms→µs) - `SourceSeq = deterministic per-event identifier` (stable across rescans; observability counter, not a dedup gate — see Idempotency) -When `session.time_updated` changes for a row already known: +When `session.time_updated` changes for a row already known (SOW-0005 round-6 P2-2): -- Emit `SessionUpdatedEvent` with the changed fields (agent/model/cost/tokens). +- The adapter **re-emits the session's whole tree** — a fresh `SessionStartedEvent` carrying the *current* `agent`/`model`/`cwd`/`Extras` (read live off the re-read `session` row), followed by the re-mapped turns/ops. It does **NOT** emit a `SessionUpdatedEvent`. The ingest writer absorbs the re-emission idempotently: `applySessionStarted` is an `INSERT … ON CONFLICT(source_id, native_id) DO UPDATE` that `COALESCE(NULLIF(excluded.col,''), sessions.col)`s `agent_name`/`model`/`cwd`/`call_path` onto the existing row (`internal/ingest/writer.go` — `applySessionStarted`), so a re-emit applies the changed metadata without inventing a new event type. +- **Why opencode needs no `SessionUpdatedEvent` (sibling-adapter contrast).** The sibling adapters emit `SessionUpdatedEvent` only to backfill a *single session-level field first learned mid-stream* when no full-row re-read is available: `codex`/`claude-code` emit one to set `sessions.model` the first time a turn reveals it (`!modelSeen` — `internal/adapters/codex/ops.go`, `internal/adapters/claude_code/ops.go`), `aiagent_v3` to attach final-report metadata, and `claude-code` additionally for the late sub-agent `.meta.json` repair (a partial `UPDATE` that makes **no** catalog call — `repairChangedMetas` in `internal/adapters/claude_code/tailer.go`). 
opencode has **no such mid-stream gap**: every delta re-reads the *whole* `session` row, so the current `model`/`agent`/`cwd` are already on the re-emitted `SessionStartedEvent` — there is nothing left for a separate partial-update event to carry. (opencode's `session.id` is also stable from row insert, so it has no late canonical-identity rewrite either.) +- **`last_activity_ts`** is advanced by `MAX()` over the re-emitted session/turn/op timestamps — the `applySessionStarted` upsert `MAX(last_activity_ts, excluded.Ts)` plus the aggregates rollup `MAX(last_activity_ts, MAX(op.end_ts))` (`internal/ingest/aggregates.go`). A metadata-only `time_updated` bump that adds no newer turn/op carries no later activity timestamp, so it does not by itself move `last_activity_ts`; the figure tracks real session activity, not a bare `session`-row touch. When `session.time_archived` becomes non-NULL: - Emit `SessionFinalizedEvent` with `Status = "completed"`, `EndTs = time_archived * 1000`. -Opencode does not have a `status='failed'` row column. Failed sessions are inferred from the last assistant message carrying `data.error` (any `AssistantError` variant). When the adapter sees an assistant message with non-NULL `data.error`, the session is finalized with `Status = "failed"`, `ErrorClass = data.error.name`. +Opencode does not have a `status='failed'` row column. Failed sessions are inferred from the last assistant message carrying `data.error` (any `AssistantError` variant). When the adapter sees an assistant message with non-NULL `data.error`, the session is finalized with `Status = "failed"`, `ErrorClass = data.error.name` (or a default class when the name is empty — SOW-0005 round-2 P2-A), and `ErrorMessage = data.error.data.message`. 
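A best-effort read of `data.error` into `ErrorClass` and `ErrorMessage` might look like the sketch below. It assumes the `{"name", "data"}` tagged shape with `message` inside `data`; `failureFromError` and the `defaultErrorClass` value are hypothetical, since the spec does not name the default class.

```go
package main

import "encoding/json"

// defaultErrorClass is a hypothetical placeholder for the default class
// used when the error name is empty.
const defaultErrorClass = "unknown_error"

// assistantError models the tagged-union envelope; Data stays raw so a
// non-object body does not fail the outer decode.
type assistantError struct {
	Name string          `json:"name"`
	Data json.RawMessage `json:"data"`
}

// failureFromError extracts the error class and, best-effort, the message.
// Any missing or malformed piece yields an empty message, never an error.
func failureFromError(raw []byte) (class, message string) {
	var e assistantError
	if err := json.Unmarshal(raw, &e); err != nil {
		return defaultErrorClass, ""
	}
	class = e.Name
	if class == "" {
		class = defaultErrorClass
	}
	var body map[string]json.RawMessage
	if err := json.Unmarshal(e.Data, &body); err != nil {
		return class, "" // absent data or a non-object body
	}
	var msg string
	if err := json.Unmarshal(body["message"], &msg); err != nil {
		return class, "" // missing or non-string message
	}
	return class, msg
}
```

Decoding `data` lazily keeps the session finalized as failed with its class even when the message cannot be read.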
+ +**`ErrorMessage` from the tagged error `data` (SOW-0005 round-5 P3-1).** opencode's `AssistantError` is a tagged union created via `NamedError.create(name, dataSchema)` (`anomalyco/opencode @ 2b3ddf9 :: packages/core/src/util/error.ts:13-64`), so it serializes as `{"name": , "data": }`. Every shipping variant EXCEPT `MessageOutputLengthError` carries a `message` string inside `data`: `MessageAbortedError` (`{message}`), `UnknownError` (`{message, ref?}`), `APIError` (`{message, statusCode?, isRetryable, …}`), `ContextOverflowError` (`{message, responseBody?}`), `StructuredOutputError` (`{message, retries}`), `ProviderAuthError` (`{providerID, message}`) — `message-v2.ts:41-57` + `message-error.ts:4-11`. Confirmed against the reference DB: of 422 messages carrying `data.error`, every `MessageAbortedError`/`UnknownError`/`APIError` row populates `data.error.data.message`. The adapter therefore reads `data.error.data.message` (best-effort: an absent `data`, a non-object body, or a missing/non-string `message` — e.g. the message-less `MessageOutputLengthError` — yields an empty `ErrorMessage`; the session is still finalized failed with its `ErrorClass`). This mirrors how the tool-op path surfaces `state.error` verbatim into `OpFinalizedEvent.ErrorMessage`. The canonical `TurnFinalizedEvent` carries only `ErrorClass` (no `ErrorMessage` field), so the error detail enriches the SESSION terminal only; the failed turn still records its `ErrorClass`. When a new `message` row appears (role=`assistant`): - Emit `TurnStartedEvent` with: - `SessionNativeID = message.session_id` - `Seq = (count of prior assistant messages in same session) + 1` -- When `data.time.completed` is set (or when the message has at least one `step-finish` part), emit `TurnFinalizedEvent` with the message-level `cost`/`tokens` and `Status` derived from `data.finish` (`stop`→completed, anything else→completed unless `data.error` is set). 
+- Emit `TurnFinalizedEvent` ONLY when the turn is TERMINAL (`turnIsTerminal`): `data.time.completed` is set, OR `data.error` is present, OR the message has at least one `step-finish` part. Otherwise the turn is **RUNNING** — `TurnStartedEvent` with NO `TurnFinalizedEvent` (opencode writes the assistant `message` row LIVE while the turn is still in progress; finalizing every live row would wrongly mark an in-flight turn completed). A later poll re-emits the whole tree and finalizes the turn once it actually completes; the re-emit is idempotent. When terminal, `TurnFinalizedEvent` carries the message-level `cost`/per-turn token delta and `Status` derived from `data.finish` (`stop`→completed, anything else→completed unless `data.error` is set → failed). For each `part` row of the assistant message, walking in `id` order: @@ -402,19 +525,20 @@ For each `part` row of the assistant message, walking in `id` order: |---|---| | `step-start` | open a new LLM Op (record state in adapter memory; emit `OpStartedEvent` with kind=`llm`, name=``, provider=`` from the parent message) | | `step-finish` | close the current LLM Op (emit `OpFinalizedEvent` with the step's `tokens`/`cost`; `Status="completed"`) | -| `reasoning` | emit `OpStartedEvent`+`OpFinalizedEvent` (kind=`reasoning`, ParentOpSeq=current LLM Op) using `data.time.start`/`data.time.end`; on missing `end`, `Status="running"` and end ts is null | -| `text` | NOT an op; surface as the assistant's final text. Skip canonical-event emission; the presenter retrieves text via a payload-style read. | +| `reasoning` | emit `OpStartedEvent`+`OpFinalizedEvent` (kind=`reasoning`, ParentOpSeq=current LLM Op) using `data.time.start`/`data.time.end`; on missing `end`, `Status="running"` and end ts is null. 
**ReasoningKind** (canonical-events.md:202 — `summary` \| `raw`): opencode reasoning parts carry no native summary-vs-raw discriminator, so the adapter emits `raw` (the part is the model's raw chain-of-thought text), unless `data.metadata.summary` is truthy, in which case it emits `summary`. The reasoning body (`data.text`) is referenced as a PayloadRef (kind `llm_reasoning`, field `text`), never inlined. | +| `text` | NOT an op; surface as the assistant's final text. The adapter does NOT emit an op for a `text` part, but DOES emit a `PayloadRef` (kind `llm_response`, field `text`) scoped to the turn's most-recent LLM op so the presenter can retrieve the assistant's text on demand without ai-viewer copying it. When no LLM op is open yet (a `text` part before any `step-start`), the ref is dropped (it has no op to attach to; `payload_refs.op_id` is NOT NULL). | | `tool` | emit `OpStartedEvent`+`OpFinalizedEvent` (kind=`tool`, ParentOpSeq=current LLM Op, name=`tool`, ToolNamespace=derived from `tool` (e.g. 
`github_get_file_contents` → namespace `github`, name `get_file_contents`)) using `state.time.start`/`state.time.end`; `Status` derived from `state.status` | | `tool` where `tool='task'` AND `state.metadata.sessionId` set | emit BOTH the tool Op AND an `OpStartedEvent` of kind=`session` with `ChildSessionNativeID = state.metadata.sessionId` (SOW-0005 decision: emit both; the `session` op is the topology parent so the sub-agent attaches in the topology view) | | `patch` | NOT an op; record in extras of the surrounding LLM op for the "Files changed" UI tab | | `compaction` | emit `LogEntry` severity=`INF`, source=`opencode`, message=`session compacted (auto=)` | | `retry` | emit `LogEntry` severity=`WRN`, message=`API retry attempt : ` | -| `file` | recorded as a `PayloadRef` with `kind="user_attachment"`, `format="json"`, `LocationURI = data.url` | -| `subtask` | emit `OpStartedEvent` of kind=`session` with `Extras.prompt`/`description`/`agent`/`model`; finalize is implicit when the child session finalizes | -| `agent`, `snapshot` | recorded in extras; no op emission | +| `file` | emit an INF `LogEntry` message=`file attachment`, source=`opencode`, with `Extras = {filename, url, mime}` (only the present fields), scoped to the turn and the open LLM op when one is open (`OpSeq` may be 0). **NOT** a `PayloadRef` (SOW-0005 round-4 P2-3): the canonical `PayloadRefEvent.PayloadKind` set is exactly `llm_request \| llm_response \| llm_sdk_request \| llm_sdk_response \| llm_reasoning \| tool_request \| tool_response \| log` (internal/canonical/events.go) — none of which is a user file attachment — so the earlier `kind="user_attachment"` ref was a canonical-contract violation. The attachment is surfaced as a `LogEntry` (mirroring compaction/retry) so the observability is preserved without inventing a non-canonical kind. 
A richer canonical attachment `PayloadKind` (and a resolver for it) is **deferred to a follow-up SOW**; the adapter must not emit a non-canonical kind in the interim. A `file` part with no `url`/`filename`/`mime` at all emits nothing. | +| `subtask`, `agent`, `snapshot` | **No-op in v1** (SOW-0005 round-4 P2-4). The earlier spec said `subtask` → a `session` op; live counts for all three part types are **ZERO** on the reference DB (the `subtask` part type is planned upstream but not yet populated — cross-referenced in §"Sub-Agent Linkage": the 11 parent-set sessions without a matching task part come from `subtask`/manual fork mechanisms that do not emit a populated `subtask` part). The adapter therefore treats `subtask`/`agent`/`snapshot` as **known, intentionally-ignored** part types (no op, no payload, no WARN — they are recognized, just unused), rather than implementing against zero data. Implementing `subtask` → `session` op is **deferred to a follow-up SOW** if/when these part types appear in a real database. | **Op `seq` numbering within a turn**: increment a counter per part processed, regardless of part type that gets emitted. `ParentOpSeq` for parts that fall inside a step is the LLM Op's seq (the `step-start`'s seq). +**Malformed `message.data` / `part.data` JSON (SOW-0005 round-3 P2-2).** A `message` or `part` whose `data` blob cannot be JSON-decoded is skipped with a session-scoped `LogEntry` severity=`WRN` (for the per-session detail view) **AND** routed through the adapter's `onError` callback. `data` is NOT-NULL in opencode's schema, so an undecodable blob is a corruption signal, not benign forward-compat drift; routing it through `onError` turns it into a source-scoped `SourceErrorEvent` that increments `sources.parse_errors` so a corrupt opencode DB **degrades `/api/health`** (the session `LogEntry` alone never reaches the health source-status panel). 
The two surfaces are complementary: the `LogEntry` carries the per-session/turn/op context for the detail view; the `SourceErrorEvent` carries the source-level health signal. (An unknown but well-formed `$.type`/role is different — that is forward-compat data, a `WRN` `LogEntry` only, NOT an `onError` health signal.) + ### Payload references Opencode keeps payloads in the SQLite database itself (`part.data.state.output`, `part.data.text`, etc.). It does not write payload files to disk. The adapter emits `PayloadRefEvent` with a custom URI scheme: @@ -425,6 +549,24 @@ LocationURI = "opencode-sqlite://opencode.db?part_id=&field=state.outpu The presenter resolves this scheme by re-querying SQLite for the named field. This keeps payloads out of ai-viewer's own database (they may be hundreds of MB total) and respects the read-only contract. +**Mapper/URI seam (SOW-0005 chunk split).** The row→event mapper (chunk B) is pure and DB-agnostic: it knows the owning `part.id` and the `field` path (`state.output`, `state.input`, `text`, …) but NOT how to build the final `opencode-sqlite://` URI. The canonical URI grammar lives in ONE place — `payloads.go`'s `buildPayloadURI(partID, field)` (chunk D) — mirroring how codex/claude_code keep URI construction in their `payloads.go`. The grammar is: + +- scheme `opencode-sqlite` (no host, no path); +- query params `part_id=&field=`, with both values URL-encoded via `net/url` so a part id or field path containing a reserved character is safe; +- producing exactly `opencode-sqlite://?part_id=&field=`. + +The mapper's built-in default (`defaultPayloadURI`, used in mapper-only unit tests) delegates to `buildPayloadURI`, so there is a single source of truth and the relative form is byte-identical to chunk B's contract. 
The future `/api/payloads` resolver (a separate Phase-2 SOW, NOT this chunk) will look up the owning source's database path from the `payload_ref`'s `source_id` and `SELECT part.` for that `part_id` read-only; chunk D builds NO resolver/parser (there is no consumer yet — that would be dead code). This mirrors codex, whose mapper defers `file://` construction to a `payloadURI` helper. The PayloadRef field map per part type: + +| part type | PayloadKind | field | +|---|---|---| +| `text` | `llm_response` | `text` | +| `reasoning` | `llm_reasoning` | `text` | +| `tool` with non-empty `state.output` | `tool_response` | `state.output` | + +**`tool_response` is emitted ONLY when `state.output` is non-empty (SOW-0005 round-6 P2-1).** A failed tool (`state.status == "error"`) typically carries only `state.error` and no `state.output`; emitting a `tool_response` PayloadRef pointing at `field=state.output` for such a tool would reference a body that does not exist (the future `/api/payloads` resolver would `SELECT part.state.output` and find nothing). The op's failure detail is carried by `OpFinalizedEvent.ErrorMessage` (= `state.error`) instead, so dropping the empty-output ref loses nothing. A failed tool that *does* produce partial output before failing still emits the ref (the gate is on `state.output != ""`, not on the status). + +(A `file` part is **NOT** a `PayloadRef` — SOW-0005 round-4 P2-3 removed the non-canonical `user_attachment` kind; a file part emits an INF `LogEntry` carrying `filename`/`url`/`mime` in extras, see the part-type table above. A canonical attachment `PayloadKind` is deferred to a follow-up SOW.) 
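A minimal sketch of the URI grammar described above, for illustration only. The function name `sketchPayloadURI` is hypothetical; the actual body of `buildPayloadURI` in `payloads.go` is not shown in this document and may differ. The sketch assumes the string is assembled by hand, because `url.Values.Encode` sorts keys and would emit `field` before `part_id`.

```go
package main

import (
	"fmt"
	"net/url"
)

// sketchPayloadURI is a hypothetical stand-in for the buildPayloadURI helper
// named in the spec. It follows the stated grammar: scheme opencode-sqlite,
// no host, no path, and the query params part_id then field, each URL-encoded.
func sketchPayloadURI(partID, field string) string {
	// Built by concatenation so the parameter order and the literal "://?"
	// prefix match the grammar exactly (assumption: order matters for the
	// byte-identical contract mentioned in the spec).
	return "opencode-sqlite://?part_id=" + url.QueryEscape(partID) +
		"&field=" + url.QueryEscape(field)
}

func main() {
	// A typical tool-output reference.
	fmt.Println(sketchPayloadURI("prt_a01", "state.output"))
	// A part id with reserved characters is escaped rather than corrupting the query.
	fmt.Println(sketchPayloadURI("prt a&b", "text"))
}
```

Keeping the construction in one function is what lets the mapper stay DB-agnostic: it passes only the part id and the field path, and never needs to know the scheme.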
+ ### Sub-Agent Linkage Confirmed two parallel mechanisms (both observed on the operator's DB): @@ -457,7 +599,7 @@ Op | Source field | | `op.kind=llm, model` | parent `message.data.modelID` | | `op.kind=llm, provider` | parent `message.data.providerID` | | `op.tokens_in/out/cost` | from the step-finish part: `part.data.tokens.input`, `.output`, `.cost`. NOTE: step-finish tokens **appear cumulative across steps within one assistant message** based on observed data (a sequence of input tokens 17438, 23075, 31713, 35407, … all monotonically increasing). The adapter records the **delta** between successive step-finish values within the same message, not the raw value, so per-LLM-op tokens are correct. | -| `turn.tokens_in/out/cost` | from the assistant `message.data.tokens.input/output/cost`. SOW-0005 decision: per-turn tokens are the **delta from the previous assistant message's cumulative totals** within the session (matching the opencode UI). The implementer MUST confirm the cumulative pattern on the live DB before pinning the golden — the step-finish cumulative pattern (row above / AC#3) is verified; this message-level pattern is the analogous one level up and is not yet independently confirmed. | +| `turn.tokens_in/out/cost` | from the assistant `message.data.tokens.input/output/cost`. SOW-0005 decision: per-turn tokens are the **delta from the previous assistant message's cumulative totals** within the session (matching the opencode UI). This message-level cumulative→delta behavior is the analogue, one level up, of the verified step-finish cumulative pattern (row above / AC#3) and is **pinned** by `TestMapSession_TurnNumberingAndTokenDeltas` plus the `e_cumulative_tokens` golden (`TestGoldenInvariant_ECumulativeTokens`, whose per-turn rollup is the message-level cumulative). 
| | `session.tokens_in/out/cost` | the rolled-up `session` columns (`tokens_input`, `tokens_output`, `cost`) when present; fall back to summing turns for sessions written before migration `20260510033149` | | `ctx_max` | static pricing table per `(providerID, modelID)`; opencode does not store it | | `ctx_used` | `tokens.input + tokens.cache.read` at the most recent step-finish for the turn | @@ -467,7 +609,7 @@ Op | Source field | 1. **Schema drift across opencode versions.** Sessions span ~30 migrations. Older rows may lack `cost`, `tokens_input`, `tokens_output`, `tokens_reasoning`, `tokens_cache_read`, `tokens_cache_write` (added by `20260510033149`), `workspace_id` (added by `20260227213759`), `path` (added by `20260428004200`), `agent`, `model`, `time_compacting`, `time_archived`. Drizzle adds them with NOT-NULL DEFAULT 0 or NULL where appropriate; all rows in the operator's DB have the columns now, but the column **values** are zero on old rows. The adapter: - At startup, queries `PRAGMA table_info(session)`, `PRAGMA table_info(message)`, `PRAGMA table_info(part)`, `PRAGMA table_info(session_message)`. - Builds the SELECT list dynamically — naming only known columns; never `SELECT *`. - - Tolerates missing columns by emitting empty/zero values in the canonical event and logging a structured INF on first occurrence per (table, column). + - Tolerates missing columns by emitting empty/zero values in the canonical event and emitting one structured INFO log per wanted-but-absent OPTIONAL column, at introspection time. The log carries a stable message (`opencode: optional column absent on this database schema; omitted from projection (old opencode version)`) plus structured keys `table` and `column`, emitted in a deterministic order (tables in `trackedTables` order, columns sorted). Required-column loss is fatal upstream (`introspectAll`), so every column that reaches the INFO path is an optional one the dynamic SELECT silently omitted. 
`Scan` and `Tail` EACH emit this set once on (re)start — they each introspect once, so on the rare old-schema path the missing-column set appears twice per source lifetime; that per-phase duplication is accepted (it is not deduplicated across phases). Production wiring: `tailer.go` `logMissingColumns` is called right after `introspectAll` succeeds in both `scanLoop` and `tailLoop`; the logger is threaded from `Adapter.logger` (`adapter.go` `Scan`/`Tail`). - Tolerates unknown tables and unknown `session_message.type`/`part.data.type` by skipping with a structured WARN. 2. **Soft delete.** `session.time_archived` is set when a session is archived in the opencode UI (2 sessions on the operator's DB). The adapter treats archive as `SessionFinalizedEvent` with `Status="completed"`. The data is never physically deleted by opencode under normal operation; the FK ON DELETE CASCADE only fires if a project is deleted, which would cascade to sessions+messages+parts. The adapter should not delete its own canonical rows when an opencode row disappears (deletion is rare and we want history). A follow-up SOW will decide deletion semantics. @@ -478,11 +620,11 @@ Op | Source field | 5. **`step-start` without matching `step-finish`.** 530 messages on the operator's DB have unbalanced step pairs (117119 starts vs 116589 finishes = 530 orphans). Treat orphan step-start as a running LLM op (no finalize); when a new step-start appears in the same message, force-close the previous one with `Status="cancelled"` and synthetic end_ts = next step-start's start_ts. -6. **Time units.** All opencode timestamps are **milliseconds since epoch**. Canonical events use **microseconds**. The adapter multiplies by 1000 on every emission. Mixing units is the most likely class of bug; a unit test pins this with fixture rows that span boundary values. +6. **Time units.** All opencode timestamps are **milliseconds since epoch**. Canonical events use **microseconds**. 
The adapter multiplies by 1000 on every emission. Mixing units is the most likely class of bug; a unit test pins this with fixture rows that span boundary values. A non-positive ms maps to 0 (an absent timestamp never fabricates a 1970-adjacent time). A crafted/corrupt ms whose `*1000` would overflow `int64` **saturates** at `math.MaxInt64` rather than wrapping to a negative time that would reorder events — and, since round-3 P2-1, that clamp is **surfaced via `onWarn`** (with the table/field context) when a warn callback is wired, so a corrupt timestamp is no longer a silent saturation. The pure mapper-only path (no `onWarn`) still degrades silently. The same warning-capable saturating arithmetic guards the `ctx_used = tokens.input + tokens.cache.read` ADDITION at step-finish: a crafted pair whose sum overflows `int64` clamps to `math.MaxInt64` with an `onWarn` rather than wrapping negative. 7. **`event` / `event_sequence` tables empty.** They exist in the schema but are unused on the operator's DB. The adapter ignores them. If opencode starts populating `event` in a future version, the adapter logs an INF and continues; a follow-up SOW will integrate it (it may give us monotonic per-session sequence numbers we currently synthesize). -8. **Compaction reshapes data.** When opencode compacts a session, message and part rows can change (text/tool output get summarized, marker `compaction` parts get inserted). The adapter detects this via `time_updated` and `time_compacting`. Strategy: when `time_compacting` becomes non-NULL the adapter pauses delta reads for that session until `time_compacting` returns to NULL, then re-reads the whole session's messages and parts and emits `SessionUpdatedEvent`+re-emits ops with new content (the ingester absorbs the re-emission via SQL-layer idempotent upserts, not a `SourceSeq` gate). Compaction is rare (432 out of 127k messages = 0.3%). +8. 
**Compaction reshapes data.** When opencode compacts a session, message and part rows can change (text/tool output get summarized, marker `compaction` parts get inserted). The adapter detects this via `time_updated` and `time_compacting`. Strategy: when `time_compacting` becomes non-NULL the adapter pauses delta reads for that session until `time_compacting` returns to NULL, then re-reads the whole session's messages and parts and emits `SessionUpdatedEvent`+re-emits ops with new content (the ingester absorbs the re-emission via SQL-layer idempotent upserts, not a `SourceSeq` gate). Compaction is rare (432 out of 127k messages = 0.3%). **The `time_compacting` check and the tree read are ATOMIC (SOW-0005 round-3 P1-2):** `loadAndMapSession` reads the session row, checks `time_compacting`, and loads messages+parts in ONE read-only transaction, so a non-NULL `time_compacting` skips the tree emit on a single consistent snapshot — compaction cannot begin *between* the check and the tree read (a TOCTOU that would have emitted a partial/mutating tree). The skipped session re-surfaces in a later delta when `time_compacting` clears (its `time_updated` bumps). 9. **Cross-process WAL inheritance.** If opencode crashes mid-transaction, the WAL file may contain uncommitted pages. SQLite handles this transparently on the next open: any reader sees only committed pages. We rely on this — never call `wal_checkpoint`. @@ -490,6 +632,87 @@ Op | Source field | 11. **Sensitive content.** `session.title`, `session.directory`, `message.data.summary.title`, every `text`/`reasoning` part, every tool `state.input`/`state.output`, and every `patch.files` entry contain real operator data. ai-viewer never copies these into its own database except as references (via `payload_refs.location_uri`). The presenter fetches them on demand at request time and is the only component that materializes payload bytes. 
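The time-unit rule in edge case 6 can be sketched as below. This is an illustrative sketch, not the adapter's code: the names `msToMicros` and `addSaturating` and the shape of the `onWarn` callback are assumptions, and the addition helper assumes non-negative token counts.

```go
package main

import (
	"fmt"
	"math"
)

// msToMicros converts opencode milliseconds to canonical microseconds.
// Hypothetical helper: non-positive input maps to 0, and a value whose
// product would overflow int64 saturates at math.MaxInt64.
func msToMicros(ms int64, onWarn func(msg string)) int64 {
	if ms <= 0 {
		return 0 // an absent timestamp must not fabricate a 1970-adjacent time
	}
	if ms > math.MaxInt64/1000 {
		if onWarn != nil {
			onWarn(fmt.Sprintf("timestamp %d ms overflows int64 microseconds; clamped", ms))
		}
		return math.MaxInt64
	}
	return ms * 1000
}

// addSaturating sketches the same guard for ctx_used = input + cache.read.
// Assumes both operands are non-negative token counts.
func addSaturating(a, b int64, onWarn func(msg string)) int64 {
	if b > 0 && a > math.MaxInt64-b {
		if onWarn != nil {
			onWarn("ctx_used addition overflows int64; clamped")
		}
		return math.MaxInt64
	}
	return a + b
}

func main() {
	warn := func(msg string) { fmt.Println("WARN:", msg) }
	fmt.Println(msToMicros(1748563200000, warn)) // ordinary timestamp
	fmt.Println(msToMicros(0, warn))             // absent timestamp
	fmt.Println(msToMicros(math.MaxInt64, warn)) // corrupt value, clamped with a warning
	fmt.Println(addSaturating(17438, 5637, warn))
}
```

Passing a nil `onWarn` models the pure mapper-only path, which degrades silently.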
+## Testing / Golden Fixtures (Chunk E) + +The adapter's row→event behaviour is pinned by a committed golden suite plus +direct per-scenario invariant assertions and a `data`-JSON fuzz target. This is +the SQLite analogue of the codex golden harness (`codex/golden_test.go`). + +**Fixture format — `fixture.sql`, never a binary `.db`.** The repo commits ZERO +binary database fixtures (opaque to diffs, can't be secret-scanned). Each scenario +under `testdata/opencode//` ships a human-reviewable `fixture.sql` +(`CREATE TABLE` + `INSERT`s, the faithful real schema for normal scenarios; the +reduced pre-`20260510033149` schema for the drift scenario). At run time the +golden harness (`buildFixtureDB`) builds a throwaway SQLite database in +`t.TempDir()` from `fixture.sql` via a SEPARATE read-write `database/sql` +connection (production NEVER opens opencode.db read-write — this is the harness +constructing the fixture), closes the writer, and the adapter under test reopens +the path strictly read-only through `New`/`openReadOnly`. All fixture content is +synthetic and invented (ids like `ses_happy01`/`prt_a01`, providers +`anthropic`/`openai`, models `claude-x`/`gpt-y`); the operator's real database is +never read or copied (R5). + +**Golden encoding.** `golden_test.go` auto-discovers every `testdata/opencode/*/` +directory, scans the built DB, filters out `SourceProgressEvent` (a checkpoint, +not content), and serialises the remaining events one `{kind,payload}` JSONL line +per event into `expected.jsonl`. The only absolute path embedded in a +non-SourceProgress event is the `SourceID` (`opencode:`), rewritten to +`opencode:` for portability and PII hygiene. The `opencode-sqlite://?part_id=&field=` +PayloadRef URIs are DB-relative (no path, no basename) and need no substitution. 
+Run `go test ./internal/adapters/opencode/ -update-golden` to regenerate; the +generator is deterministic (every non-SourceProgress `Ts` derives from a fixture +row timestamp ×1000, with no wall-clock leaks), so regeneration is byte-idempotent. + +**Goldens are not self-justifying.** A `-update-golden` run pins whatever the code +emitted, regressions included. `golden_invariants_test.go` therefore re-scans each +fixture and asserts the load-bearing invariant keyed on canonical-event FIELDS +(not golden text), so a regression fails there even after a golden refresh. + +The six scenarios and what each pins: + +| scenario | pins | +|---|---| +| `a_happy` | baseline `session→turn→op` tree: root SessionStarted → TurnStarted → LLM op (step-start) → reasoning op (+ `llm_reasoning` PayloadRef) → `llm_response` PayloadRef (text) → tool op (+ `tool_response` PayloadRef) → LLM op_finalized (step-finish) → TurnFinalized; NO SessionFinalized (running). PayloadRef URIs + ms→µs. | +| `b_subagent_task` | sub-agent linkage BOTH ways (AC#4): the child `session.parent_id` row maps to `Kind=sub_agent`+`ParentNativeID`/`RootNativeID`=parent; the parent's `tool='task'` part (with `state.metadata.sessionId`) emits BOTH a session Op (`Kind=session`, `ChildSessionNativeID`, the topology parent, emitted first) AND a tool Op (`Kind=tool`, `name=task`) in the same turn. | +| `c_multi_provider` | multi-provider (AC#7): two turns with `providerID` anthropic then openai → each LLM op carries its `ProviderAlias` verbatim + canonical `Provider` (two catalog providers downstream). Also the two-level token model: per-op tokens reset per message (turn2 op = 300/80) while per-turn tokens are the session-level delta (turn2 turn = 200/50). 
| +| `d_schema_drift` | graceful degrade on the pre-`20260510033149` schema (AC#5): `introspectAll` ACCEPTS it (required cols present), the dynamic SELECT omits the 9 missing optional `session` columns, SessionStarted carries empty `Model`/`AgentName` and Extras WITHOUT `providerID`/`variant`, while op/turn token+provider values survive (they come from `message.data`, untouched by the column drift). | +| `e_cumulative_tokens` | cumulative→delta token math (AC#3): four step-finish parts with CUMULATIVE inputs 100/250/410/400 (outputs 20/50/90/80) → per-LLM-op deltas 100/150/160/0 and 20/30/40/0 (the 4th clamps to 0 because the cumulative decreased). The per-turn rollup is the message-level cumulative (400/80). | +| `g_nested_subagent` | nested-root resolution (SOW-0005 P2.4): a 3-level tree root→child→grandchild. The grandchild's `RootNativeID` is the TRUE tree root (`ses_groot`), NOT its direct parent (`ses_gchild`), while its `ParentNativeID` is the direct parent — proving `resolveRootID` walks the chain to the top. Each session's turn finalizes (completed ts present, P1.3). | + +**Resume property (AC#6, scenario-level).** Complementary to chunk C's +two-stage-insert `TestScanLoop_ResumeZeroDupesZeroGaps`, `golden_resume_test.go` +pins the durability properties expressible over a STATIC fixture: (a) a re-scan +from the final cursor (persisted+reparsed) emits ZERO content events (no duplicate +on restart), (b) two cold scans from the zero cursor emit the identical content +multiset (no nondeterministic drop/duplicate), and (c) on the two-session +`b_subagent_task` fixture a re-scan from the final cursor re-emits neither session +(the watermark advances past every session touched in one cycle). Together: +resume/re-scan never drops or duplicates a content event. 
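The cumulative→delta arithmetic pinned by the `e_cumulative_tokens` scenario in the table above can be sketched as follows. This is a sketch under stated assumptions: the function name `deltasFromCumulative` is hypothetical, and resetting the baseline to the latest reading after a decrease is an assumption the four-value fixture does not settle.

```go
package main

import "fmt"

// deltasFromCumulative turns cumulative step-finish totals within one message
// into per-op deltas. A decrease clamps to 0 instead of going negative.
// Hypothetical helper; the baseline-after-decrease choice is an assumption.
func deltasFromCumulative(cumulative []int64) []int64 {
	deltas := make([]int64, 0, len(cumulative))
	var prev int64 // the first step's baseline is zero
	for _, cur := range cumulative {
		d := cur - prev
		if d < 0 {
			d = 0 // cumulative went down: report no new tokens for this op
		}
		deltas = append(deltas, d)
		prev = cur
	}
	return deltas
}

func main() {
	// Values taken from the e_cumulative_tokens scenario description.
	fmt.Println(deltasFromCumulative([]int64{100, 250, 410, 400})) // input tokens
	fmt.Println(deltasFromCumulative([]int64{20, 50, 90, 80}))     // output tokens
}
```

For the scenario's inputs this yields 100/150/160/0 and for its outputs 20/30/40/0, matching the per-LLM-op deltas listed in the table.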
+ +**`data`-JSON fuzz.** `data_fuzz_test.go` fuzzes `decodeMessageData` (the +message.data user|assistant union) and `decodePartData` (the part.data 12-variant +`$.type` union) — the untrusted-bytes boundary where a malformed/truncated blob +from the live database meets the adapter (opencode's analogue of codex's +`FuzzParseLine`). Contract: the decoder NEVER panics on any input — it returns a +struct or a wrapped error; the typed helpers reachable from a decoded value +(`role`/`kind`/`subAgentSessionID`/`modelID`/`reasoningKind`) must also not panic. +Seeds cover both message roles, all 12 part variants (incl. the `tool='task'` +metadata.sessionId edge), unknown `$.type`/role, and malformed/truncated/empty/ +deeply-nested bodies. + +**AC#5 INF logging (wired).** The "one INFO log per missing optional column" +promise (Edge Cases #1) is implemented: `tailer.go` `logMissingColumns` iterates +each table's `tableSchema.Missing` right after `introspectAll` succeeds in BOTH +`scanLoop` and `tailLoop`, emitting one `logger.Info(...)` per (table, column) +with the stable message + `table`/`column` keys. `TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF` +(`golden_invariants_test.go`) proves it: it `Scan`s the `d_schema_drift` fixture +through the public adapter with a record-capturing `slog.Handler` +(`golden_loghandler_test.go` `captureHandler`) and asserts the set of logged +(table, column) pairs equals the set introspection reports Missing — exactly one +INFO record per missing column, nothing extra. The `d_schema_drift` golden still +pins the graceful DEGRADE (accept + omit columns + zero values); the INF set is +not serialised into `expected.jsonl` (it is a log, not a canonical event). + ## Canonical Model Gaps 1. **Multi-provider opencode vs single-provider canonical `Provider`.** Opencode's `providerID` is a user-defined alias (`llm-netdata-cloud`, `zai-coding-plan`, `kimi-for-coding`, etc.), not a canonical vendor. 
The canonical `Provider` field is documented (per `canonical-events.md:101`) as a vendor name like `anthropic`. The adapter passes the alias through; the canonical model needs either (a) a `provider_alias` field separate from `provider`, or (b) acceptance that `provider` is the source-recorded value and the UI must dereference. **Recommendation**: extend canonical event with `ProviderAlias` (optional) and reserve `Provider` for canonical vendor names; add a vendor-mapping table in `internal/canonical/providers.go`. Filed as a follow-up SOW question. diff --git a/.agents/sow/specs/canonical-events.md b/.agents/sow/specs/canonical-events.md index 0d68984..b5e8f45 100644 --- a/.agents/sow/specs/canonical-events.md +++ b/.agents/sow/specs/canonical-events.md @@ -106,7 +106,7 @@ Notes on terminal signal availability per source: - **ai-agent v2** — `opTree.success`/`opTree.error` carry terminal state; the `'final'` snapshot marks completed/failed. Pre-final snapshots → `running`. - **claude-code** — has **no native terminal signal** (sessions are resumable indefinitely). Adapter never emits `SessionFinalizedEvent` for claude-code; sessions stay `running`. UI filters via `last_activity_ts` for staleness display. - **codex** — emits `task_complete` per turn but has **no per-session terminal signal** (a clean rollout simply stops being appended; `recorder.rs:1610` may even append metadata after a turn ends). Like claude-code, the adapter does **not** emit `SessionFinalizedEvent(completed)` for a cleanly-ended session — it stays `running` and the UI uses `last_activity_ts` for staleness. The *only* `SessionFinalizedEvent` codex emits is the synthetic `failed/incomplete` for a session whose most-recent turn was left hanging (no `task_complete`/`turn_aborted`) and whose file is mtime-stale ≥ 1 h (a crash); see `adapter-codex.md` state-machine rule #23. -- **opencode** — no explicit session-end column. 
Adapter infers terminal status from the last assistant message's `data.error` and `data.completedAt`. +- **opencode** — no explicit session-end column. Adapter infers terminal status from the last assistant message's `data.error` and `data.completedAt`. On a failed terminal it sets `ErrorClass` from `data.error.name` (or a default when empty) and `ErrorMessage` from `data.error.data.message` — opencode's `AssistantError` union serializes as `{name, data:{message, …}}` and every shipping variant except `MessageOutputLengthError` carries `data.message` (so `ErrorMessage` is empty only for that one message-less variant or a malformed body; decode is best-effort and never aborts the session). ### TurnStartedEvent / TurnFinalizedEvent diff --git a/.agents/sow/specs/deployment.md b/.agents/sow/specs/deployment.md index 435c071..0e0da1d 100644 --- a/.agents/sow/specs/deployment.md +++ b/.agents/sow/specs/deployment.md @@ -114,10 +114,9 @@ against. Each existing location becomes a source; missing locations are silently skipped. Phase 1 shipped the `aiagent_v3` and `aiagent_v2` adapters; Phase 2 added -`claude-code` (SOW-0003) and `codex` (SOW-0004), both now wired into the -binary. The `opencode` row is reserved for its Phase 2 SOW (SOW-0005); its -adapter package is not yet compiled in. The `Format` column is the registry -key the adapter registers under (note `claude-code` is hyphenated). +`claude-code` (SOW-0003), `codex` (SOW-0004), and `opencode` (SOW-0005), all now +wired into the binary. The `Format` column is the registry key the adapter +registers under (note `claude-code` is hyphenated). | Format | Probe | Status | |---|---|---| @@ -125,7 +124,43 @@ key the adapter registers under (note `claude-code` is hyphenated). 
| aiagent_v2 | `~/.ai-agent/sessions/` exists | live (Chunk 11) | | claude-code | `~/.claude/projects/` (or `$CLAUDE_CONFIG_DIR/projects/`) exists | live (SOW-0003) | | codex | `$CODEX_HOME/sessions/` (default `~/.codex/sessions/`) exists | live (SOW-0004) | -| opencode | `~/.local/share/opencode/opencode.db` exists | adapter pending (Phase 2 SOW) | +| opencode | `~/.local/share/opencode/opencode.db` exists (a regular file) | live (SOW-0005) | + +The `opencode` probe targets a single regular file (the SQLite database), unlike +the four directory probes above. Because `os.Stat` succeeds on a directory too, the +opencode probe additionally requires `info.Mode().IsRegular()` (SOW-0005 round-3 +P3-2) so a *directory* named `opencode.db` does NOT register as a source (it would +fail to open as a database). The other adapters' probes intentionally accept a +directory and are unchanged. When a regular file exists the source is registered +read-only and the discovery log line carries `sessions`, `messages`, `parts` row +counts and `latest_migration` (the newest `__drizzle_migrations` name), read once +at startup via `opencode.ProbeStatus`. A `ProbeStatus` error never blocks +discovery: the source is still registered and the failure is logged as a +`probe_error` attribute (the adapter's own `Scan`/`Tail` then surface any fatal +schema problem via `/api/health`). + +**The opencode source location is a filesystem path (SOW-0005 round-3 P2-4).** Both +auto-discovery and `--source opencode:` resolve to a real file path that +`startSource` validates with `os.Stat` before the adapter opens it. The adapter's +`buildReadOnlyDSN` (`conn.go`) additionally accepts pre-built `file:` URIs and the +in-memory `:memory:` form, but those DSN shapes are for the adapter's +**programmatic/test use only** — they are NOT valid `--source` locations because +`os.Stat` cannot stat a `file:`/`:memory:` DSN string. Operators always pass a +filesystem path; the DSN forms never appear on the CLI. 
+ +The opencode database path resolution order (`opencodeDBPath`, SOW-0005 P2.4) is: + +1. `$OPENCODE_DB`, if non-empty — used **verbatim** as a full path to the database. +2. else `$XDG_DATA_HOME/opencode/opencode.db`, if `$XDG_DATA_HOME` is non-empty. +3. else `~/.local/share/opencode/opencode.db` (the XDG default base). + +CAVEAT: `$OPENCODE_DB` is honoured as the conventional override name but could +NOT be confirmed against opencode's upstream source during SOW-0005 (the mirror +was unavailable), so it is treated as best-effort; the XDG base (`~/.local/share` +== `$XDG_DATA_HOME`) IS opencode's verified default location. Per-channel +`opencode-.db` variants (anomalyco/opencode +`packages/opencode/src/storage/db.ts`) remain out of scope for auto-discovery — +point `--source opencode:` at a non-default database explicitly. The Chunk 11 v2 probe checks for the parent `sessions/` directory rather than the glob `*.json.gz` documented earlier: a freshly-bootstrapped diff --git a/cmd/ai-viewer-ingest/discovery.go b/cmd/ai-viewer-ingest/discovery.go new file mode 100644 index 0000000..6fcc049 --- /dev/null +++ b/cmd/ai-viewer-ingest/discovery.go @@ -0,0 +1,171 @@ +// Auto-discovery helpers for ai-viewer-ingest. Split out of sources.go so each +// file stays under the 400-line budget. This file owns the default-location +// resolvers (one per adapter, honoring the relevant env override) and the +// best-effort observability counters the auto-discovery log line carries +// (acceptance #8). The counters are deliberately lightweight predicates that +// mirror each adapter's ingest match WITHOUT importing the adapter package, so +// the surfaced count matches what the source will actually yield. They never +// fail discovery: a read/walk error yields 0. +package main + +import ( + "os" + "path/filepath" + "strings" +) + +// claudeProjectsDir returns the claude-code projects root, honoring +// $CLAUDE_CONFIG_DIR (spec adapter-claude-code.md §2.1). 
When the env var is +// set, the root is "$CLAUDE_CONFIG_DIR/projects"; otherwise "~/.claude/projects". +func claudeProjectsDir(home string) string { + if cfg := os.Getenv("CLAUDE_CONFIG_DIR"); cfg != "" { + return filepath.Join(cfg, "projects") + } + return filepath.Join(home, ".claude", "projects") +} + +// countProjectDirs returns the number of immediate subdirectories under the +// claude-code projects root (each is one sanitized-cwd project). Returns 0 +// on any read error — the count is observability, not a gate. +func countProjectDirs(root string) int { + entries, err := os.ReadDir(root) + if err != nil { + return 0 + } + n := 0 + for _, e := range entries { + if e.IsDir() { + n++ + } + } + return n +} + +// codexSessionsDir returns the codex sessions root, honoring $CODEX_HOME +// (SOW-0004 C#3). When the env var is set, the root is "$CODEX_HOME/sessions"; +// otherwise "~/.codex/sessions". This is the directory the adapter walks and +// tails; the probe checks it for existence. +func codexSessionsDir(home string) string { + if ch := os.Getenv("CODEX_HOME"); ch != "" { + return filepath.Join(ch, "sessions") + } + return filepath.Join(home, ".codex", "sessions") +} + +// opencodeDBPath resolves the opencode database file path the auto-discovery +// probe checks (deployment.md §"Source Auto-Discovery"). Resolution order: +// +// 1. $OPENCODE_DB, if non-empty — used VERBATIM as a full path to the database. +// 2. else $XDG_DATA_HOME/opencode/opencode.db, if $XDG_DATA_HOME is non-empty. +// 3. else ~/.local/share/opencode/opencode.db (the XDG default base). +// +// CAVEAT: $OPENCODE_DB is honoured as the conventional override name, but it +// could NOT be confirmed against opencode's upstream source during this work +// (the mirror was unavailable), so it is treated as best-effort. The XDG base +// (~/.local/share == $XDG_DATA_HOME) IS opencode's verified default location. 
+// Either way the probe os.Stats the resolved path and registers the source only +// if it exists; pointing --source opencode: at a non-default database +// remains the explicit escape hatch. +func opencodeDBPath(home string) string { + if db := os.Getenv("OPENCODE_DB"); db != "" { + return db + } + if xdg := os.Getenv("XDG_DATA_HOME"); xdg != "" { + return filepath.Join(xdg, "opencode", "opencode.db") + } + return filepath.Join(home, ".local", "share", "opencode", "opencode.db") +} + +// codexRolloutPrefix is the shared filename prefix for both modern and legacy +// codex rollouts (openai/codex codex-rs/rollout/src/list.rs filters on +// starts_with("rollout-")). Duplicated here as a lightweight observability +// predicate; the adapter's discovery.go holds the authoritative anchored +// regexes used for actual ingest. +const codexRolloutPrefix = "rollout-" + +// codexArchivedDir is the codex session archive, pruned from both ingest and +// these observability counts (spec adapter-codex.md §"Filesystem Layout"). +const codexArchivedDir = "archived_sessions" + +// countRolloutFiles returns the number of modern sharded codex rollouts +// ("rollout-*.jsonl") under the sessions root, counting ONLY files at the +// YYYY/MM/DD shard depth and pruning archived_sessions/. Returns 0 on any walk +// error — the count is observability for acceptance #8, not a gate, so it is +// read best-effort and never blocks discovery. Mirrors discovery.go's modern +// match (^rollout-.*\.jsonl$) AND its shard-depth requirement (F8) without +// importing the adapter package, so the surfaced count matches what is ingested. 
+func countRolloutFiles(root string) int { + n := 0 + _ = filepath.WalkDir(root, func(path string, d os.DirEntry, err error) error { + if err != nil { + if d != nil && d.IsDir() { + return filepath.SkipDir + } + return nil + } + if d.IsDir() { + if d.Name() == codexArchivedDir && path != root { + return filepath.SkipDir + } + return nil + } + name := d.Name() + if strings.HasPrefix(name, codexRolloutPrefix) && strings.HasSuffix(name, ".jsonl") && codexAtShardDepth(root, path) { + n++ + } + return nil + }) + return n +} + +// codexAtShardDepth reports whether path is a rollout at the required YYYY/MM/DD +// shard depth relative to root: exactly three leading numeric path components +// then the basename (F8). Mirrors discovery.go's hasShardDepth without importing +// the adapter package, so countRolloutFiles never over-counts a stray +// rollout-*.jsonl placed at the wrong depth. A relpath failure counts the file +// out (best-effort observability). +func codexAtShardDepth(root, path string) bool { + rel, err := filepath.Rel(root, path) + if err != nil { + return false + } + parts := strings.Split(filepath.ToSlash(rel), "/") + if len(parts) != 4 { + return false + } + for _, p := range parts[:3] { + if len(p) == 0 { + return false + } + for _, c := range p { + if c < '0' || c > '9' { + return false + } + } + } + return true +} + +// countLegacyJSON returns the number of legacy flat codex rollouts +// ("rollout-*.json") directly under the sessions root (NOT in shards). These are +// recognized but NOT ingested in v1 (one informational SourceError per file); +// the count is surfaced separately so the operator sees the deferred-legacy +// volume (acceptance #8). Returns 0 on any read error. Mirrors discovery.go's +// legacy match (^rollout-.*\.json$, root-only). 
+func countLegacyJSON(root string) int { + entries, err := os.ReadDir(root) + if err != nil { + return 0 + } + n := 0 + for _, e := range entries { + if e.IsDir() { + continue + } + name := e.Name() + if strings.HasPrefix(name, codexRolloutPrefix) && strings.HasSuffix(name, ".json") { + n++ + } + } + return n +} diff --git a/cmd/ai-viewer-ingest/discovery_test.go b/cmd/ai-viewer-ingest/discovery_test.go new file mode 100644 index 0000000..13c951e --- /dev/null +++ b/cmd/ai-viewer-ingest/discovery_test.go @@ -0,0 +1,251 @@ +// Tests for the codex auto-discovery probe and its observability counters +// (SOW-0004 acceptance #8), plus the shared discovery-helper counters. Split out +// of sources_test.go so each test file stays under the 400-line budget, mirroring +// the discovery.go / sources.go production split. They pin: +// +// - the probe registers a source at $CODEX_HOME/sessions (default +// ~/.codex/sessions) when the directory exists, with location = the +// walked sessions dir; +// - $CODEX_HOME overrides the default location; +// - an absent sessions dir registers no codex source; +// - countRolloutFiles / countLegacyJSON report the modern (sharded .jsonl) +// and legacy (root .json) volumes SEPARATELY; +// - the discovery log line carries both counts as distinct keys. +package main + +import ( + "bytes" + "log/slog" + "os" + "path/filepath" + "testing" + + "github.com/netdata/ai-viewer/internal/adapters" + "github.com/netdata/ai-viewer/internal/canonical" +) + +// plantCodexLayout writes a sessions tree under root with `modern` sharded +// rollout-*.jsonl files (in a YYYY/MM/DD shard), `legacy` root rollout-*.json +// files, and four decoys that must NOT be counted (an archived_sessions rollout, +// a non-rollout .jsonl, a root non-rollout file, a wrong-depth rollout-*.jsonl). 
+func plantCodexLayout(t *testing.T, root string, modern, legacy int) { + t.Helper() + shard := filepath.Join(root, "2025", "11", "20") + if err := os.MkdirAll(shard, 0o755); err != nil { + t.Fatalf("mkdir shard: %v", err) + } + for i := 0; i < modern; i++ { + name := filepath.Join(shard, "rollout-2025-11-20T10-00-0"+itoa(i)+"-uuid.jsonl") + if err := os.WriteFile(name, []byte(`{"type":"session_meta"}`+"\n"), 0o644); err != nil { + t.Fatalf("write modern rollout: %v", err) + } + } + for i := 0; i < legacy; i++ { + name := filepath.Join(root, "rollout-2025-06-0"+itoa(i)+"-uuid.json") + if err := os.WriteFile(name, []byte(`{}`), 0o644); err != nil { + t.Fatalf("write legacy rollout: %v", err) + } + } + // Decoys: an archived shard rollout (pruned), a non-rollout .jsonl, a + // non-rollout file at the root, AND a rollout-*.jsonl at the WRONG depth + // (directly under the sessions root, not in a YYYY/MM/DD shard — F8). None of + // these may be counted. + arch := filepath.Join(root, "archived_sessions", "2025", "11", "20") + if err := os.MkdirAll(arch, 0o755); err != nil { + t.Fatalf("mkdir archive: %v", err) + } + if err := os.WriteFile(filepath.Join(arch, "rollout-archived-uuid.jsonl"), []byte("{}"), 0o644); err != nil { + t.Fatalf("write archived: %v", err) + } + if err := os.WriteFile(filepath.Join(shard, "not-a-rollout.jsonl"), []byte("{}"), 0o644); err != nil { + t.Fatalf("write decoy jsonl: %v", err) + } + if err := os.WriteFile(filepath.Join(root, "history.jsonl"), []byte("{}"), 0o644); err != nil { + t.Fatalf("write decoy root file: %v", err) + } + // A rollout-*.jsonl placed directly under the sessions root (wrong shard + // depth) must NOT be counted as a modern rollout (F8). 
+ if err := os.WriteFile(filepath.Join(root, "rollout-2025-11-20T10-00-09-strayroot.jsonl"), []byte(`{"type":"session_meta"}`+"\n"), 0o644); err != nil { + t.Fatalf("write stray-root rollout: %v", err) + } +} + +// itoa is a tiny single-digit int→string helper so plantCodexLayout stays free +// of strconv for the small counts the tests use. +func itoa(i int) string { return string(rune('0' + i)) } + +// TestAutoDiscover_CodexProbe verifies acceptance #8: a tmpdir +// ~/.codex/sessions tree with modern sharded rollouts is auto-discovered as a +// codex source whose location is the sessions root, and the registered factory +// can construct it. +func TestAutoDiscover_CodexProbe(t *testing.T) { + // Not parallel: t.Setenv mutates process-wide HOME / CODEX_HOME. + tmp := t.TempDir() + t.Setenv("HOME", tmp) + t.Setenv("CODEX_HOME", "") + sessions := filepath.Join(tmp, ".codex", "sessions") + plantCodexLayout(t, sessions, 2, 3) + + got, err := resolveSources(nil, silentLogger()) + if err != nil { + t.Fatalf("resolveSources: %v", err) + } + var cdx *configuredSource + for i := range got { + if got[i].format == "codex" { + cdx = &got[i] + } + } + if cdx == nil { + t.Fatalf("codex source not auto-discovered; got %+v", got) + } + if cdx.location != sessions { + t.Fatalf("codex location = %q, want %q", cdx.location, sessions) + } + // The discovered source must be constructable via the registry, proving the + // adapter's init() ran (acceptance #1). + factory, ok := adapters.Get("codex") + if !ok { + t.Fatal("codex factory not registered") + } + if _, err := factory(cdx.location, canonical.AdapterOptions{Logger: silentLogger()}); err != nil { + t.Fatalf("codex factory(%q): %v", cdx.location, err) + } +} + +// TestAutoDiscover_CodexHomeOverride verifies the probe honors $CODEX_HOME +// (SOW-0004 C#3): the sessions root is "$CODEX_HOME/sessions", not ~/.codex. +func TestAutoDiscover_CodexHomeOverride(t *testing.T) { + // Not parallel: mutates process-wide env. 
+ tmp := t.TempDir() + t.Setenv("HOME", tmp) // no ~/.codex here + codexHome := filepath.Join(tmp, "custom-codex") + t.Setenv("CODEX_HOME", codexHome) + sessions := filepath.Join(codexHome, "sessions") + plantCodexLayout(t, sessions, 1, 0) + + got, err := resolveSources(nil, silentLogger()) + if err != nil { + t.Fatalf("resolveSources: %v", err) + } + var loc string + for _, s := range got { + if s.format == "codex" { + loc = s.location + } + } + if loc != sessions { + t.Fatalf("codex location = %q, want %q (CODEX_HOME honored)", loc, sessions) + } +} + +// TestAutoDiscover_NoCodexWhenAbsent verifies a workstation without +// ~/.codex/sessions does not register a codex source. +func TestAutoDiscover_NoCodexWhenAbsent(t *testing.T) { + // Not parallel: mutates process-wide env. + tmp := t.TempDir() + t.Setenv("HOME", tmp) + t.Setenv("CODEX_HOME", "") + + got, err := resolveSources(nil, silentLogger()) + if err != nil { + t.Fatalf("resolveSources: %v", err) + } + for _, s := range got { + if s.format == "codex" { + t.Fatalf("codex registered with no sessions dir present: %+v", got) + } + } +} + +// TestAutoDiscover_CodexProbeLogsBothCountsSeparately verifies the probe's +// discovery log line carries the modern and legacy volumes as DISTINCT keys +// (acceptance #8: "/api/sources reports both counts separately" — the structured +// log is the operator-facing surface at discovery time). +func TestAutoDiscover_CodexProbeLogsBothCountsSeparately(t *testing.T) { + // Not parallel: mutates process-wide env. 
+ tmp := t.TempDir() + t.Setenv("HOME", tmp) + t.Setenv("CODEX_HOME", "") + sessions := filepath.Join(tmp, ".codex", "sessions") + plantCodexLayout(t, sessions, 2, 3) + + var buf bytes.Buffer + logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: slog.LevelInfo})) + if _, err := resolveSources(nil, logger); err != nil { + t.Fatalf("resolveSources: %v", err) + } + out := buf.String() + if !bytes.Contains(buf.Bytes(), []byte("modern_rollouts=2")) { + t.Errorf("discovery log missing modern_rollouts=2; got:\n%s", out) + } + if !bytes.Contains(buf.Bytes(), []byte("legacy_json=3")) { + t.Errorf("discovery log missing legacy_json=3; got:\n%s", out) + } +} + +// TestCountRolloutFiles verifies the modern-rollout counter mirrors discovery.go's +// match: rollout-*.jsonl under YYYY/MM/DD shards, archived_sessions pruned, +// non-rollout .jsonl, root non-rollout files, AND a rollout-*.jsonl at the wrong +// shard depth (directly under the root) all ignored (F8). +func TestCountRolloutFiles(t *testing.T) { + t.Parallel() + tmp := t.TempDir() + plantCodexLayout(t, tmp, 4, 2) + if n := countRolloutFiles(tmp); n != 4 { + t.Fatalf("countRolloutFiles = %d, want 4 (archived + decoys + wrong-depth stray excluded)", n) + } + if n := countRolloutFiles(filepath.Join(tmp, "missing")); n != 0 { + t.Fatalf("countRolloutFiles(missing) = %d, want 0", n) + } +} + +// TestCountLegacyJSON verifies the legacy counter mirrors discovery.go's match: +// rollout-*.json directly under the root only (not in shards), non-rollout root +// files ignored. 
+func TestCountLegacyJSON(t *testing.T) { + t.Parallel() + tmp := t.TempDir() + plantCodexLayout(t, tmp, 4, 2) + if n := countLegacyJSON(tmp); n != 2 { + t.Fatalf("countLegacyJSON = %d, want 2", n) + } + if n := countLegacyJSON(filepath.Join(tmp, "missing")); n != 0 { + t.Fatalf("countLegacyJSON(missing) = %d, want 0", n) + } +} + +// TestOpencodeDBPath_Resolution pins the opencode DB path resolution order +// (SOW-0005 AC#8 / P1.4): $OPENCODE_DB verbatim wins; else a non-empty $XDG_DATA_HOME +// yields "$XDG_DATA_HOME/opencode/opencode.db"; else the ~/.local/share default. Each +// case uses t.Setenv so the process env is restored after the subtest. +func TestOpencodeDBPath_Resolution(t *testing.T) { + // Not parallel: t.Setenv mutates process-wide env. + const home = "/home/u" + + t.Run("OPENCODE_DB verbatim wins", func(t *testing.T) { + t.Setenv("OPENCODE_DB", "/custom/path/oc.db") + t.Setenv("XDG_DATA_HOME", "/xdg/data") // must be ignored when OPENCODE_DB set + if got := opencodeDBPath(home); got != "/custom/path/oc.db" { + t.Errorf("opencodeDBPath = %q, want /custom/path/oc.db ($OPENCODE_DB verbatim)", got) + } + }) + + t.Run("XDG_DATA_HOME derivation", func(t *testing.T) { + t.Setenv("OPENCODE_DB", "") + t.Setenv("XDG_DATA_HOME", "/xdg/data") + want := filepath.Join("/xdg/data", "opencode", "opencode.db") + if got := opencodeDBPath(home); got != want { + t.Errorf("opencodeDBPath = %q, want %q ($XDG_DATA_HOME derived)", got, want) + } + }) + + t.Run("home default", func(t *testing.T) { + t.Setenv("OPENCODE_DB", "") + t.Setenv("XDG_DATA_HOME", "") + want := filepath.Join(home, ".local", "share", "opencode", "opencode.db") + if got := opencodeDBPath(home); got != want { + t.Errorf("opencodeDBPath = %q, want %q (~/.local/share default)", got, want) + } + }) +} diff --git a/cmd/ai-viewer-ingest/sources.go b/cmd/ai-viewer-ingest/sources.go index 4d3dd78..b1792bb 100644 --- a/cmd/ai-viewer-ingest/sources.go +++ b/cmd/ai-viewer-ingest/sources.go @@ -19,6 +19,7 @@ import ( "path/filepath" 
"strings" "sync" + "time" "github.com/netdata/ai-viewer/internal/adapters" // Side-effect import: the codex adapter registers its factory with @@ -27,6 +28,12 @@ import ( // codex is registered from here to keep this chunk's change additive and // co-located with its probe. _ "github.com/netdata/ai-viewer/internal/adapters/codex" + // Named import: the opencode adapter both registers its factory via init() + // (like codex) AND exposes ProbeStatus, which the opencode rich-attrs branch + // below calls to surface session/message/part counts + the latest migration + // at discovery (SOW-0005 AC#8). Co-located with its probe for the same + // additive reason as codex. + "github.com/netdata/ai-viewer/internal/adapters/opencode" "github.com/netdata/ai-viewer/internal/canonical" "github.com/netdata/ai-viewer/internal/ingest" ) @@ -40,6 +47,14 @@ type configuredSource struct { location string } +// opencodeProbeTimeout bounds the one-time opencode auto-discovery ProbeStatus +// COUNT(*) (SOW-0005 round-4 P3-1). The probe is best-effort observability; a +// slow or locked opencode database must not stall startup discovery, so the probe +// runs under this short deadline and discovery proceeds (source registered) on +// timeout. 10 s is generous for a COUNT(*) even on a multi-GB database while still +// bounding a pathological stall. +const opencodeProbeTimeout = 10 * time.Second + // resolveSources returns the source list to start. When the operator // passes any --source flag, auto-discovery is bypassed entirely (per // deployment.md §"Source Auto-Discovery": explicit replaces implicit). @@ -95,6 +110,12 @@ func autoDiscoverSources(logger *slog.Logger) []configuredSource { format string location string probe string + // requireRegular gates the probe on info.Mode().IsRegular() in addition to + // existence. opencode's source is a single SQLite FILE, so a *directory* + // named opencode.db must NOT register (it cannot be opened as a database). 
+ // The four directory-based probes leave this false — os.Stat-exists is the + // right check for them (SOW-0005 round-3 P3-2). + requireRegular bool }{ { format: "aiagent_v3", @@ -116,12 +137,32 @@ func autoDiscoverSources(logger *slog.Logger) []configuredSource { location: codexSessionsDir(home), probe: codexSessionsDir(home), }, + { + // opencode's source is a single SQLite FILE (not a directory like the + // four probes above). The location IS the database path the adapter + // opens read-only (deployment.md §"Source Auto-Discovery"). It requires a + // REGULAR file: a directory named opencode.db must not register, because + // the adapter would fail to open it as a database (SOW-0005 round-3 P3-2). + format: "opencode", + location: opencodeDBPath(home), + probe: opencodeDBPath(home), + requireRegular: true, + }, } var out []configuredSource seen := make(map[string]struct{}, len(probes)) for _, p := range probes { - if _, err := os.Stat(p.probe); err != nil { + info, err := os.Stat(p.probe) + if err != nil { + continue + } + if p.requireRegular && !info.Mode().IsRegular() { + // A non-regular file at the opencode DB path (e.g. a directory named + // opencode.db) is not a usable source; skip it rather than registering a + // source the adapter cannot open. + logger.Warn("ai-viewer-ingest: skipping opencode source — path is not a regular file", + "format", p.format, "location", p.location) continue } key := p.format + ":" + p.location @@ -146,144 +187,34 @@ func autoDiscoverSources(logger *slog.Logger) []configuredSource { attrs = append(attrs, "modern_rollouts", countRolloutFiles(p.location), "legacy_json", countLegacyJSON(p.location)) + case "opencode": + // Surface session/message/part counts + the latest applied migration + // (SOW-0005 acceptance #8) via the adapter's read-only ProbeStatus. 
+ // Best-effort: a probe error (unreadable file, foreign schema) is + // logged as a probe_error attr and discovery STILL registers the + // source — counting must never block discovery. The COUNT(*) cost is + // a one-time startup hit (see opencode.ProbeStatus). The probe is + // BOUNDED by a short timeout (SOW-0005 round-4 P3-1) so a slow/locked + // database cannot stall startup discovery indefinitely; on timeout the + // probe returns its error and discovery proceeds with the source + // registered (the counts are observability, not a gate). + probeCtx, cancelProbe := context.WithTimeout(context.Background(), opencodeProbeTimeout) + sessions, messages, parts, latest, perr := opencode.ProbeStatus(probeCtx, p.location) + cancelProbe() + attrs = append(attrs, + "sessions", sessions, + "messages", messages, + "parts", parts, + "latest_migration", latest) + if perr != nil { + attrs = append(attrs, "probe_error", perr.Error()) + } } logger.Info("ai-viewer-ingest: auto-discovered source", attrs...) } return out } -// claudeProjectsDir returns the claude-code projects root, honoring -// $CLAUDE_CONFIG_DIR (spec adapter-claude-code.md §2.1). When the env var is -// set, the root is "$CLAUDE_CONFIG_DIR/projects"; otherwise "~/.claude/projects". -func claudeProjectsDir(home string) string { - if cfg := os.Getenv("CLAUDE_CONFIG_DIR"); cfg != "" { - return filepath.Join(cfg, "projects") - } - return filepath.Join(home, ".claude", "projects") -} - -// countProjectDirs returns the number of immediate subdirectories under the -// claude-code projects root (each is one sanitized-cwd project). Returns 0 -// on any read error — the count is observability, not a gate. -func countProjectDirs(root string) int { - entries, err := os.ReadDir(root) - if err != nil { - return 0 - } - n := 0 - for _, e := range entries { - if e.IsDir() { - n++ - } - } - return n -} - -// codexSessionsDir returns the codex sessions root, honoring $CODEX_HOME -// (SOW-0004 C#3). 
When the env var is set, the root is "$CODEX_HOME/sessions"; -// otherwise "~/.codex/sessions". This is the directory the adapter walks and -// tails; the probe checks it for existence. -func codexSessionsDir(home string) string { - if ch := os.Getenv("CODEX_HOME"); ch != "" { - return filepath.Join(ch, "sessions") - } - return filepath.Join(home, ".codex", "sessions") -} - -// codexRolloutPrefix is the shared filename prefix for both modern and legacy -// codex rollouts (openai/codex codex-rs/rollout/src/list.rs filters on -// starts_with("rollout-")). Duplicated here as a lightweight observability -// predicate; the adapter's discovery.go holds the authoritative anchored -// regexes used for actual ingest. -const codexRolloutPrefix = "rollout-" - -// codexArchivedDir is the codex session archive, pruned from both ingest and -// these observability counts (spec adapter-codex.md §"Filesystem Layout"). -const codexArchivedDir = "archived_sessions" - -// countRolloutFiles returns the number of modern sharded codex rollouts -// ("rollout-*.jsonl") under the sessions root, counting ONLY files at the -// YYYY/MM/DD shard depth and pruning archived_sessions/. Returns 0 on any walk -// error — the count is observability for acceptance #8, not a gate, so it is -// read best-effort and never blocks discovery. Mirrors discovery.go's modern -// match (^rollout-.*\.jsonl$) AND its shard-depth requirement (F8) without -// importing the adapter package, so the surfaced count matches what is ingested. 
-func countRolloutFiles(root string) int { - n := 0 - _ = filepath.WalkDir(root, func(path string, d os.DirEntry, err error) error { - if err != nil { - if d != nil && d.IsDir() { - return filepath.SkipDir - } - return nil - } - if d.IsDir() { - if d.Name() == codexArchivedDir && path != root { - return filepath.SkipDir - } - return nil - } - name := d.Name() - if strings.HasPrefix(name, codexRolloutPrefix) && strings.HasSuffix(name, ".jsonl") && codexAtShardDepth(root, path) { - n++ - } - return nil - }) - return n -} - -// codexAtShardDepth reports whether path is a rollout at the required YYYY/MM/DD -// shard depth relative to root: exactly three leading numeric path components -// then the basename (F8). Mirrors discovery.go's hasShardDepth without importing -// the adapter package, so countRolloutFiles never over-counts a stray -// rollout-*.jsonl placed at the wrong depth. A relpath failure counts the file -// out (best-effort observability). -func codexAtShardDepth(root, path string) bool { - rel, err := filepath.Rel(root, path) - if err != nil { - return false - } - parts := strings.Split(filepath.ToSlash(rel), "/") - if len(parts) != 4 { - return false - } - for _, p := range parts[:3] { - if len(p) == 0 { - return false - } - for _, c := range p { - if c < '0' || c > '9' { - return false - } - } - } - return true -} - -// countLegacyJSON returns the number of legacy flat codex rollouts -// ("rollout-*.json") directly under the sessions root (NOT in shards). These are -// recognized but NOT ingested in v1 (one informational SourceError per file); -// the count is surfaced separately so the operator sees the deferred-legacy -// volume (acceptance #8). Returns 0 on any read error. Mirrors discovery.go's -// legacy match (^rollout-.*\.json$, root-only). 
-func countLegacyJSON(root string) int { - entries, err := os.ReadDir(root) - if err != nil { - return 0 - } - n := 0 - for _, e := range entries { - if e.IsDir() { - continue - } - name := e.Name() - if strings.HasPrefix(name, codexRolloutPrefix) && strings.HasSuffix(name, ".json") { - n++ - } - } - return n -} - // cursorLookup is the minimal contract startSource needs to resume from // the durable cursor. The production wiring uses *sql.DB through // sqlCursorLookup; tests inject a fake to verify the round-trip without diff --git a/cmd/ai-viewer-ingest/sources_test.go b/cmd/ai-viewer-ingest/sources_test.go index a99fbfb..f48dd89 100644 --- a/cmd/ai-viewer-ingest/sources_test.go +++ b/cmd/ai-viewer-ingest/sources_test.go @@ -1,214 +1,316 @@ -// Tests for the codex auto-discovery probe and its observability counters -// (SOW-0004 acceptance #8). They pin: +// Tests for the opencode auto-discovery probe and its observability counters +// (SOW-0005 acceptance #8). The shared discovery counters and the codex probe +// tests live in discovery_test.go (split for the 400-line budget). These pin: // -// - the probe registers a source at $CODEX_HOME/sessions (default -// ~/.codex/sessions) when the directory exists, with location = the -// walked sessions dir; -// - $CODEX_HOME overrides the default location; -// - an absent sessions dir registers no codex source; -// - countRolloutFiles / countLegacyJSON report the modern (sharded .jsonl) -// and legacy (root .json) volumes SEPARATELY; -// - the discovery log line carries both counts as distinct keys. 
+// - a synthetic opencode DB at the default path is auto-discovered as an +// "opencode" source whose location is the database FILE (not a directory), +// and the registered factory can construct it; +// - the discovery log line carries session/message/part counts + the latest +// migration as distinct keys; +// - an absent DB registers no opencode source; +// - a probe error (a file that is not a valid opencode DB) still registers the +// source and logs a probe_error attr (counting must not block discovery). package main import ( "bytes" + "context" + "database/sql" "log/slog" "os" "path/filepath" "testing" "github.com/netdata/ai-viewer/internal/adapters" + "github.com/netdata/ai-viewer/internal/adapters/opencode" "github.com/netdata/ai-viewer/internal/canonical" + + // The opencode probe test builds a synthetic SQLite database with the + // modernc driver (same CGO-free driver the adapter uses), then opens it via + // the registered adapter read-only. Synthetic, schema-shaped, never the + // operator's data (SOW-0005 R5). + _ "modernc.org/sqlite" ) -// plantCodexLayout writes a sessions tree under root with `modern` sharded -// rollout-*.jsonl files (in a YYYY/MM/DD shard), `legacy` root rollout-*.json -// files, and a couple of decoys that must NOT be counted (an archived_sessions -// shard, a non-rollout file, a .jsonl outside the rollout prefix). -func plantCodexLayout(t *testing.T, root string, modern, legacy int) { +// plantOpencodeDB builds a synthetic opencode SQLite database at the default +// discovery path under home (~/.local/share/opencode/opencode.db) with the four +// tracked tables, a populated __drizzle_migrations table, and the given number of +// sessions/messages/parts. It is built via a throwaway read-write connection +// (the adapter NEVER opens opencode.db read-write; the probe reopens it +// read-only) and the handle is closed so the WAL is flushed before the probe +// runs. Content is synthetic, never the operator's data. 
+func plantOpencodeDB(t *testing.T, home string, sessions, messages, parts int, latestMigration string) string { t.Helper() - shard := filepath.Join(root, "2025", "11", "20") - if err := os.MkdirAll(shard, 0o755); err != nil { - t.Fatalf("mkdir shard: %v", err) - } - for i := 0; i < modern; i++ { - name := filepath.Join(shard, "rollout-2025-11-20T10-00-0"+itoa(i)+"-uuid.jsonl") - if err := os.WriteFile(name, []byte(`{"type":"session_meta"}`+"\n"), 0o644); err != nil { - t.Fatalf("write modern rollout: %v", err) + dbPath := filepath.Join(home, ".local", "share", "opencode", "opencode.db") + if err := os.MkdirAll(filepath.Dir(dbPath), 0o755); err != nil { + t.Fatalf("mkdir opencode data dir: %v", err) + } + rw, err := sql.Open("sqlite", "file:"+dbPath+"?_pragma=busy_timeout(5000)") + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { _ = rw.Close() }() + + ddl := []string{ + `CREATE TABLE session (id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, version TEXT NOT NULL, + agent TEXT, model TEXT, time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, time_archived INTEGER)`, + `CREATE TABLE message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + `CREATE TABLE part (id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + `CREATE TABLE session_message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + `CREATE TABLE __drizzle_migrations (id INTEGER PRIMARY KEY AUTOINCREMENT, hash TEXT NOT NULL, + created_at NUMERIC, name TEXT, applied_at TEXT)`, + } + for _, stmt := range ddl { + if _, err := rw.Exec(stmt); err != nil { + t.Fatalf("create schema: %v\nstmt: %s", err, stmt) } } 
- for i := 0; i < legacy; i++ { - name := filepath.Join(root, "rollout-2025-06-0"+itoa(i)+"-uuid.json") - if err := os.WriteFile(name, []byte(`{}`), 0o644); err != nil { - t.Fatalf("write legacy rollout: %v", err) + for i := 0; i < sessions; i++ { + if _, err := rw.Exec( + `INSERT INTO session (id, project_id, slug, directory, title, version, time_created, time_updated) + VALUES (?,?,?,?,?,?,?,?)`, + itoaWide("ses", i), "prj_1", "slug", "/work", "Title", "9.9.9", 100+i, 100+i); err != nil { + t.Fatalf("insert session: %v", err) } } - // Decoys: an archived shard rollout (pruned), a non-rollout .jsonl, a - // non-rollout file at the root, AND a rollout-*.jsonl at the WRONG depth - // (directly under the sessions root, not in a YYYY/MM/DD shard — F8). None of - // these must be counted. - arch := filepath.Join(root, "archived_sessions", "2025", "11", "20") - if err := os.MkdirAll(arch, 0o755); err != nil { - t.Fatalf("mkdir archive: %v", err) + for i := 0; i < messages; i++ { + if _, err := rw.Exec( + `INSERT INTO message (id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?)`, + itoaWide("msg", i), itoaWide("ses", 0), 200+i, 200+i, `{"role":"assistant"}`); err != nil { + t.Fatalf("insert message: %v", err) + } } - if err := os.WriteFile(filepath.Join(arch, "rollout-archived-uuid.jsonl"), []byte("{}"), 0o644); err != nil { - t.Fatalf("write archived: %v", err) + for i := 0; i < parts; i++ { + if _, err := rw.Exec( + `INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?,?)`, + itoaWide("prt", i), itoaWide("msg", 0), itoaWide("ses", 0), 300+i, 300+i, `{"type":"text","text":"x"}`); err != nil { + t.Fatalf("insert part: %v", err) + } } - if err := os.WriteFile(filepath.Join(shard, "not-a-rollout.jsonl"), []byte("{}"), 0o644); err != nil { - t.Fatalf("write decoy jsonl: %v", err) + // Two migrations; the second (latest) is the one the probe must report. 
+ if _, err := rw.Exec(`INSERT INTO __drizzle_migrations (hash, name) VALUES (?,?)`, "h0", "20260127222353_first"); err != nil { + t.Fatalf("insert migration: %v", err) } - if err := os.WriteFile(filepath.Join(root, "history.jsonl"), []byte("{}"), 0o644); err != nil { - t.Fatalf("write decoy root file: %v", err) + if _, err := rw.Exec(`INSERT INTO __drizzle_migrations (hash, name) VALUES (?,?)`, "h1", latestMigration); err != nil { + t.Fatalf("insert migration: %v", err) } - // A rollout-*.jsonl placed directly under the sessions root (wrong shard - // depth) must NOT be counted as a modern rollout (F8). - if err := os.WriteFile(filepath.Join(root, "rollout-2025-11-20T10-00-09-strayroot.jsonl"), []byte(`{"type":"session_meta"}`+"\n"), 0o644); err != nil { - t.Fatalf("write stray-root rollout: %v", err) + return dbPath +} + +// itoaWide zero-pads a small index into a 12-wide lexicographically-sortable id +// suffix so synthetic ids sort in creation order like real Sonyflake ids. +func itoaWide(prefix string, n int) string { + digits := []byte("000000000000") + i := len(digits) - 1 + for n > 0 && i >= 0 { + digits[i] = byte('0' + n%10) + n /= 10 + i-- } + return prefix + "_" + string(digits) } -// itoa is a tiny single-digit int→string helper so plantCodexLayout stays free -// of strconv for the small counts the tests use. -func itoa(i int) string { return string(rune('0' + i)) } +// clearOtherAdapterEnv unsets the env overrides for the codex/claude probes so an +// opencode probe test sees only the HOME-rooted opencode DB (the other probes +// look under HOME too, but their directories will not exist). It also clears the +// opencode resolution overrides ($OPENCODE_DB, $XDG_DATA_HOME) so the probe falls +// through to the ~/.local/share default these tests plant under HOME. 
+func clearOtherAdapterEnv(t *testing.T) { + t.Helper() + t.Setenv("CODEX_HOME", "") + t.Setenv("CLAUDE_CONFIG_DIR", "") + t.Setenv("OPENCODE_DB", "") + t.Setenv("XDG_DATA_HOME", "") +} -// TestAutoDiscover_CodexProbe verifies acceptance #8: a tmpdir -// ~/.codex/sessions tree with modern sharded rollouts is auto-discovered as a -// codex source whose location is the sessions root, and the registered factory -// can construct it. -func TestAutoDiscover_CodexProbe(t *testing.T) { - // Not parallel: t.Setenv mutates process-wide HOME / CODEX_HOME. +// TestAutoDiscover_OpencodeProbe verifies acceptance #8: a synthetic opencode DB +// at the default path is auto-discovered as an "opencode" source whose location +// is the database file, and the registered factory can construct it. +func TestAutoDiscover_OpencodeProbe(t *testing.T) { + // Not parallel: t.Setenv mutates process-wide HOME. tmp := t.TempDir() t.Setenv("HOME", tmp) - t.Setenv("CODEX_HOME", "") - sessions := filepath.Join(tmp, ".codex", "sessions") - plantCodexLayout(t, sessions, 2, 3) + clearOtherAdapterEnv(t) + dbPath := plantOpencodeDB(t, tmp, 2, 3, 4, "20260510033149_latest") got, err := resolveSources(nil, silentLogger()) if err != nil { t.Fatalf("resolveSources: %v", err) } - var cdx *configuredSource + var oc *configuredSource for i := range got { - if got[i].format == "codex" { - cdx = &got[i] + if got[i].format == "opencode" { + oc = &got[i] } } - if cdx == nil { - t.Fatalf("codex source not auto-discovered; got %+v", got) + if oc == nil { + t.Fatalf("opencode source not auto-discovered; got %+v", got) } - if cdx.location != sessions { - t.Fatalf("codex location = %q, want %q", cdx.location, sessions) + if oc.location != dbPath { + t.Fatalf("opencode location = %q, want %q (the DB file)", oc.location, dbPath) } - // The discovered source must be constructable via the registry, proving the - // adapter's init() ran (acceptance #1). 
- factory, ok := adapters.Get("codex") + factory, ok := adapters.Get("opencode") if !ok { - t.Fatal("codex factory not registered") + t.Fatal("opencode factory not registered") } - if _, err := factory(cdx.location, canonical.AdapterOptions{Logger: silentLogger()}); err != nil { - t.Fatalf("codex factory(%q): %v", cdx.location, err) + if _, err := factory(oc.location, canonical.AdapterOptions{Logger: silentLogger()}); err != nil { + t.Fatalf("opencode factory(%q): %v", oc.location, err) } } -// TestAutoDiscover_CodexHomeOverride verifies the probe honors $CODEX_HOME -// (SOW-0004 C#3): the sessions root is "$CODEX_HOME/sessions", not ~/.codex. -func TestAutoDiscover_CodexHomeOverride(t *testing.T) { +// TestAutoDiscover_OpencodeProbeLogsCountsAndMigration verifies the discovery log +// line carries the session/message/part counts and the latest migration as +// distinct keys (acceptance #8: the structured log is the operator-facing surface +// at discovery time). +func TestAutoDiscover_OpencodeProbeLogsCountsAndMigration(t *testing.T) { // Not parallel: mutates process-wide env. 
tmp := t.TempDir() - t.Setenv("HOME", tmp) // no ~/.codex here - codexHome := filepath.Join(tmp, "custom-codex") - t.Setenv("CODEX_HOME", codexHome) - sessions := filepath.Join(codexHome, "sessions") - plantCodexLayout(t, sessions, 1, 0) + t.Setenv("HOME", tmp) + clearOtherAdapterEnv(t) + plantOpencodeDB(t, tmp, 2, 3, 4, "20260510033149_latest") - got, err := resolveSources(nil, silentLogger()) - if err != nil { + var buf bytes.Buffer + logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: slog.LevelInfo})) + if _, err := resolveSources(nil, logger); err != nil { t.Fatalf("resolveSources: %v", err) } - var loc string - for _, s := range got { - if s.format == "codex" { - loc = s.location + out := buf.String() + for _, want := range []string{"sessions=2", "messages=3", "parts=4", "latest_migration=20260510033149_latest"} { + if !bytes.Contains(buf.Bytes(), []byte(want)) { + t.Errorf("discovery log missing %q; got:\n%s", want, out) } } - if loc != sessions { - t.Fatalf("codex location = %q, want %q (CODEX_HOME honored)", loc, sessions) - } } -// TestAutoDiscover_NoCodexWhenAbsent verifies a workstation without -// ~/.codex/sessions does not register a codex source. -func TestAutoDiscover_NoCodexWhenAbsent(t *testing.T) { +// TestAutoDiscover_NoOpencodeWhenAbsent verifies a workstation without the +// opencode DB does not register an opencode source. +func TestAutoDiscover_NoOpencodeWhenAbsent(t *testing.T) { // Not parallel: mutates process-wide env. 
tmp := t.TempDir() t.Setenv("HOME", tmp) - t.Setenv("CODEX_HOME", "") + clearOtherAdapterEnv(t) got, err := resolveSources(nil, silentLogger()) if err != nil { t.Fatalf("resolveSources: %v", err) } for _, s := range got { - if s.format == "codex" { - t.Fatalf("codex registered with no sessions dir present: %+v", got) + if s.format == "opencode" { + t.Fatalf("opencode registered with no DB present: %+v", got) } } } -// TestAutoDiscover_CodexProbeLogsBothCountsSeparately verifies the probe's -// discovery log line carries the modern and legacy volumes as DISTINCT keys -// (acceptance #8: "/api/sources reports both counts separately" — the structured -// log is the operator-facing surface at discovery time). -func TestAutoDiscover_CodexProbeLogsBothCountsSeparately(t *testing.T) { +// TestAutoDiscover_OpencodeProbeErrorStillRegisters verifies that when the file +// at the probe path exists but is NOT a valid opencode database (no tables), +// ProbeStatus errors yet the source is STILL registered (counting must not block +// discovery) and the log carries a probe_error attr. +func TestAutoDiscover_OpencodeProbeErrorStillRegisters(t *testing.T) { // Not parallel: mutates process-wide env. tmp := t.TempDir() t.Setenv("HOME", tmp) - t.Setenv("CODEX_HOME", "") - sessions := filepath.Join(tmp, ".codex", "sessions") - plantCodexLayout(t, sessions, 2, 3) + clearOtherAdapterEnv(t) + dbPath := filepath.Join(tmp, ".local", "share", "opencode", "opencode.db") + if err := os.MkdirAll(filepath.Dir(dbPath), 0o755); err != nil { + t.Fatalf("mkdir: %v", err) + } + // A non-SQLite regular file at the probe path: os.Stat succeeds (so it is + // discovered) but ProbeStatus fails (the count queries hit no tables). 
+ if err := os.WriteFile(dbPath, []byte("not a database"), 0o644); err != nil { + t.Fatalf("write bogus db: %v", err) + } var buf bytes.Buffer logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: slog.LevelInfo})) - if _, err := resolveSources(nil, logger); err != nil { + got, err := resolveSources(nil, logger) + if err != nil { t.Fatalf("resolveSources: %v", err) } - out := buf.String() - if !bytes.Contains(buf.Bytes(), []byte("modern_rollouts=2")) { - t.Errorf("discovery log missing modern_rollouts=2; got:\n%s", out) + var registered bool + for _, s := range got { + if s.format == "opencode" && s.location == dbPath { + registered = true + } + } + if !registered { + t.Fatalf("opencode source NOT registered despite a probe error; got %+v", got) } - if !bytes.Contains(buf.Bytes(), []byte("legacy_json=3")) { - t.Errorf("discovery log missing legacy_json=3; got:\n%s", out) + if !bytes.Contains(buf.Bytes(), []byte("probe_error=")) { + t.Errorf("discovery log missing probe_error attr; got:\n%s", buf.String()) } } -// TestCountRolloutFiles verifies the modern-rollout counter mirrors discovery.go's -// match: rollout-*.jsonl under YYYY/MM/DD shards, archived_sessions pruned, -// non-rollout .jsonl, root non-rollout files, AND a rollout-*.jsonl at the wrong -// shard depth (directly under the root) all ignored (F8). -func TestCountRolloutFiles(t *testing.T) { - t.Parallel() +// TestAutoDiscover_OpencodeDirectoryNotRegistered pins SOW-0005 round-3 P3-2: a +// DIRECTORY named opencode.db at the default discovery path must NOT register as +// an opencode source — os.Stat succeeds on a directory, so the probe additionally +// requires info.Mode().IsRegular(). The companion positive case (a regular DB +// file IS discovered) is TestAutoDiscover_OpencodeProbe above. +func TestAutoDiscover_OpencodeDirectoryNotRegistered(t *testing.T) { + // Not parallel: mutates process-wide env. 
tmp := t.TempDir() - plantCodexLayout(t, tmp, 4, 2) - if n := countRolloutFiles(tmp); n != 4 { - t.Fatalf("countRolloutFiles = %d, want 4 (archived + decoys + wrong-depth stray excluded)", n) + t.Setenv("HOME", tmp) + clearOtherAdapterEnv(t) + // Create a DIRECTORY exactly where the DB file would live. + dirAsDB := filepath.Join(tmp, ".local", "share", "opencode", "opencode.db") + if err := os.MkdirAll(dirAsDB, 0o755); err != nil { + t.Fatalf("mkdir dir-as-db: %v", err) + } + + var buf bytes.Buffer + logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: slog.LevelWarn})) + got, err := resolveSources(nil, logger) + if err != nil { + t.Fatalf("resolveSources: %v", err) + } + for _, s := range got { + if s.format == "opencode" { + t.Fatalf("opencode registered for a DIRECTORY named opencode.db: %+v", got) + } } - if n := countRolloutFiles(filepath.Join(tmp, "missing")); n != 0 { - t.Fatalf("countRolloutFiles(missing) = %d, want 0", n) + if !bytes.Contains(buf.Bytes(), []byte("not a regular file")) { + t.Errorf("expected a WARN that the opencode path is not a regular file; got:\n%s", buf.String()) } } -// TestCountLegacyJSON verifies the legacy counter mirrors discovery.go's match: -// rollout-*.json directly under the root only (not in shards), non-rollout root -// files ignored. -func TestCountLegacyJSON(t *testing.T) { - t.Parallel() +// TestOpencodeProbeRespectsCancelledContext pins SOW-0005 round-4 P3-1: the startup +// ProbeStatus is now passed a bounded/cancellable context (autoDiscoverSources uses +// context.WithTimeout(opencodeProbeTimeout)) instead of context.Background(), so a +// cancelled context aborts the probe promptly with an error rather than running the +// COUNT(*) queries to completion. A normal context still returns the counts. 
The +// probe is best-effort: discovery surfaces the error and still registers the source +// (covered by TestAutoDiscover_OpencodeProbeErrorStillRegisters), but the +// cancellation must be HONORED rather than ignored. +func TestOpencodeProbeRespectsCancelledContext(t *testing.T) { + // Not parallel: t.Setenv mutates process-wide HOME. tmp := t.TempDir() - plantCodexLayout(t, tmp, 4, 2) - if n := countLegacyJSON(tmp); n != 2 { - t.Fatalf("countLegacyJSON = %d, want 2", n) + t.Setenv("HOME", tmp) + clearOtherAdapterEnv(t) + dbPath := plantOpencodeDB(t, tmp, 2, 3, 4, "20260510033149_init") + + // An already-cancelled context: ProbeStatus must return an error (it does not + // silently run to completion ignoring cancellation). + cancelled, cancel := context.WithCancel(context.Background()) + cancel() + if _, _, _, _, err := opencode.ProbeStatus(cancelled, dbPath); err == nil { + t.Error("ProbeStatus with a cancelled context returned nil error; the probe must honor cancellation (round-4 P3-1)") + } + + // A normal (bounded) context still succeeds and returns the planted counts — + // proving the timeout/cancellable wiring did not break the happy path. 
+ ctx, cancel2 := context.WithTimeout(context.Background(), opencodeProbeTimeout) + defer cancel2() + sessions, messages, parts, latest, err := opencode.ProbeStatus(ctx, dbPath) + if err != nil { + t.Fatalf("ProbeStatus(valid ctx): %v", err) + } + if sessions != 2 || messages != 3 || parts != 4 { + t.Errorf("ProbeStatus counts = (%d,%d,%d), want (2,3,4)", sessions, messages, parts) } - if n := countLegacyJSON(filepath.Join(tmp, "missing")); n != 0 { - t.Fatalf("countLegacyJSON(missing) = %d, want 0", n) + if latest != "20260510033149_init" { + t.Errorf("ProbeStatus latest migration = %q, want the planted one", latest) } } diff --git a/internal/adapters/opencode/adapter.go b/internal/adapters/opencode/adapter.go new file mode 100644 index 0000000..7282315 --- /dev/null +++ b/internal/adapters/opencode/adapter.go @@ -0,0 +1,225 @@ +package opencode + +import ( + "context" + "errors" + "fmt" + "log/slog" + + "github.com/netdata/ai-viewer/internal/adapters" + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file is the registered canonical.Adapter for opencode (SOW-0005 chunk D). +// It mirrors codex/adapter.go exactly, substituting the SQLite specifics: the +// "location" is the opencode database file path (not a sessions directory), and +// the cursor is the per-table watermark cursor (cursor.go) rather than per-file +// byte offsets. The DB is opened ONLY through the chunk-A openReadOnly helper; +// this file never opens a write path. +// +// Format is declared in mapper.go (const Format = "opencode"); it is the single +// stable identifier shared by the mapper (which stamps it onto every LogEntry's +// Source) and this file (which registers it). Defining it once mirrors codex. + +// sourceIDPrefix is prepended to the configured database path to produce the +// canonical events' SourceID. Used only for log attribution; idempotency is a +// SQL-layer guarantee keyed on each row's natural identity (not SourceSeq). +// Mirrors codex. 
+const sourceIDPrefix = Format + ":" + +// Adapter is the opencode source adapter. One instance corresponds to one +// opencode database file (default ~/.local/share/opencode/opencode.db). The +// instance is safe for a single Scan goroutine followed by a single Tail +// goroutine; concurrent Scan+Tail on one instance is not part of the contract +// (specs/adapter-contract.md). Mirrors codex.Adapter. +type Adapter struct { + dbPath string + sourceID string + logger *slog.Logger + // onError surfaces non-fatal per-record parse errors. Never nil after + // construction; New and Factory substitute a no-op when nil so adapter code + // can call it unconditionally. + onError func(error) + // scanCursor holds the final watermark cursor recorded by the most recent + // Scan, so a following Tail on the SAME instance resumes from where Scan left + // off rather than snapshotting current HEAD (closing the Scan→Tail data-loss + // window). Nil until Scan runs (a cold Tail then falls back to + // snapshotCursor). The ingester drives Scan→Tail on one instance + // (cmd/ai-viewer-ingest/sources.go runAdapter), single-threaded, so a plain + // field needs no synchronisation. Mirrors codex. + scanCursor *Cursor +} + +// Compile-time conformance to the canonical.Adapter interface. +var _ canonical.Adapter = (*Adapter)(nil) + +// New constructs an Adapter for the given opencode database file with the shared +// canonical.AdapterOptions bundle. An empty location (the DB path) is rejected so +// misconfigured ingesters fail fast. Mirrors codex.New. 
+func New(location string, opts canonical.AdapterOptions) (*Adapter, error) { + if location == "" { + return nil, errors.New("opencode: location (database path) must be non-empty") + } + logger := opts.Logger + if logger == nil { + logger = slog.Default() + } + logger = logger.With("adapter", Format, "db", location) + onError := opts.OnError + if onError == nil { + onError = func(error) {} + } + return &Adapter{ + dbPath: location, + sourceID: sourceIDPrefix + location, + logger: logger, + onError: onError, + }, nil +} + +// Name implements canonical.Adapter. +func (a *Adapter) Name() string { return Format } + +// Format implements canonical.Adapter. +func (a *Adapter) Format() string { return Format } + +// Scan implements canonical.Adapter. Opens the database read-only, pages every +// tracked table from `since` forward, and emits the affected sessions' events. +// Returns when caught up or when ctx is cancelled. The caller owns `out`; Scan +// never closes it. Mirrors codex.Scan: the final watermark cursor is recorded on +// the instance even on cancellation so a following Tail resumes from completed +// work rather than replaying from HEAD. +func (a *Adapter) Scan(ctx context.Context, since canonical.Cursor, out chan<- canonical.Event) error { + start := a.coerceCursor(since) + final, sErr := scanLoop(ctx, a.dbPath, a.sourceID, start, out, a.logger, a.onError) + // Record the final watermark even on cancellation so a Tail that follows a + // context-cancelled Scan still resumes from the watermark reached so far (the + // cursor reflects only fully-consumed rows). On a hard error it is still the + // best resume point available. + cursorCopy := final + a.scanCursor = &cursorCopy + if sErr != nil { + if errors.Is(sErr, context.Canceled) || errors.Is(sErr, context.DeadlineExceeded) { + return nil + } + return fmt.Errorf("opencode: scan: %w", sErr) + } + return nil +} + +// Tail implements canonical.Adapter. 
Follows the database with the poll-loop +// tailer until ctx is cancelled. Same channel-ownership and cancellation rules as +// Scan. Tail resumes from the watermark cursor the preceding Scan recorded on +// this instance, closing the data-loss window where rows committed BETWEEN Scan +// finishing and Tail starting would be skipped if Tail snapshotted current HEAD. +// Any re-emission of an already-seen session tree is absorbed by the ingester's +// SQL-layer idempotent upserts. A cold Tail with no preceding Scan falls back to a +// current-HEAD snapshot so it follows from now rather than replaying full +// history. Mirrors codex.Tail. +func (a *Adapter) Tail(ctx context.Context, out chan<- canonical.Event) error { + var cur Cursor + // warmStart distinguishes a Tail resumed from a Scan cursor (the boundary bucket + // was already emitted by Scan) from a cold HEAD-snapshot Tail (follow-from-now, + // boundary never emitted). It seeds the round-6 P1 boundaryReal gate so a cold Tail + // never replays its snapshot boundary on the first post-snapshot forward change. + warmStart := a.scanCursor != nil + if warmStart { + cur = a.coerceCursor(*a.scanCursor) + } else { + snap, err := a.snapshotCursor(ctx) + if err != nil { + return fmt.Errorf("opencode: tail snapshot: %w", err) + } + cur = snap + } + return tailLoop(ctx, a.dbPath, a.sourceID, cur, warmStart, out, a.logger, a.onError) +} + +// ParseCursor implements canonical.Adapter. Empty input yields the zero Cursor; +// non-empty input is decoded as JSON. The returned Cursor is opaque to the +// ingester and used only via Cursor.String() and Cursor.After(). Mirrors +// codex.ParseCursor. 
+func (a *Adapter) ParseCursor(stored string) (canonical.Cursor, error) { + c, err := ParseCursor(stored) + if err != nil { + return nil, err + } + return c, nil +} + +// coerceCursor accepts a Cursor produced by this adapter, a nil Cursor (first +// run), or an alien cursor type (treated as empty so the ingester's "I lost +// track" path re-scans from the zero watermark). Never returns nil, and never +// returns a cursor that would skip data on an alien type. Mirrors +// codex.coerceCursor; opencode's cursor carries Tables (not Files), so that is +// the map normalised here. +func (a *Adapter) coerceCursor(c canonical.Cursor) Cursor { + if c == nil { + return newCursor() + } + if typed, ok := c.(Cursor); ok { + if typed.Tables == nil { + typed.Tables = map[string]TableWatermark{} + } + if typed.Version == 0 { + typed.Version = cursorVersion + } + return typed + } + return newCursor() +} + +// snapshotCursor builds a cursor at the database's current HEAD so a cold Tail +// (no preceding Scan) follows changes from now on rather than replaying historical +// events (existing content is Scan's job). It opens read-only, introspects the +// schema, sets each tracked table's watermark to its current MAX(id) + +// MAX(time_updated), and records the real __drizzle_migrations schema hash. This +// is the SQLite analogue of codex stat'ing current file sizes. At a HEAD snapshot +// the monotonic high-water (MaxIDSeen) and the (time_updated, id) paging-position +// id (MaxTimeUpdatedID) both start at the current MAX(id) — paging then follows +// strictly from NOW (SOW-0005 round-2 P1-A). A table on an old schema without +// time_updated contributes the id watermarks only (MaxTimeUpdatedMs stays 0). 
+func (a *Adapter) snapshotCursor(ctx context.Context) (Cursor, error) { + db, err := openReadOnly(ctx, a.dbPath, withMaxOpenConns(2)) + if err != nil { + return Cursor{}, err + } + defer func() { _ = db.Close() }() + + schema, err := introspectAll(ctx, db) + if err != nil { + return Cursor{}, err + } + + cur := newCursor() + for _, table := range trackedTables { + mid, mErr := maxID(ctx, db, table) + if mErr != nil { + return Cursor{}, mErr + } + var mtu int64 + if schema[table].has("time_updated") { + mtu, mErr = maxTimeUpdated(ctx, db, table) + if mErr != nil { + return Cursor{}, mErr + } + } + cur = cur.withTable(table, TableWatermark{MaxIDSeen: mid, MaxTimeUpdatedMs: mtu, MaxTimeUpdatedID: mid}) + } + return recordSchemaHash(ctx, db, cur, a.onError), nil +} + +// Factory adapts New to canonical.AdapterFactory so the registry can construct an +// Adapter from the generic (location, opts) pair. The location is the opencode +// database file path. Mirrors codex.Factory. +func Factory(location string, opts canonical.AdapterOptions) (canonical.Adapter, error) { + a, err := New(location, opts) + if err != nil { + return nil, err + } + return a, nil +} + +func init() { + adapters.Register(Format, Factory) +} diff --git a/internal/adapters/opencode/adapter_lifecycle_test.go b/internal/adapters/opencode/adapter_lifecycle_test.go new file mode 100644 index 0000000..60c98d3 --- /dev/null +++ b/internal/adapters/opencode/adapter_lifecycle_test.go @@ -0,0 +1,325 @@ +package opencode + +import ( + "context" + "database/sql" + "path/filepath" + "sync" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the opencode Adapter's Scan/Tail/snapshot LIFECYCLE tests, +// split out of adapter_test.go to keep each test file under the 400-line budget. +// The construction/cursor tests + the shared discardOpts/hasSession helpers live +// in adapter_test.go (same package). 
+ +// TestAdapter_ScanRecordsCursorAndCancelReturnsNil verifies Scan emits the +// expected events, records scanCursor on the instance, and that a cancelled Scan +// returns nil while still recording a best-effort cursor. +func TestAdapter_ScanRecordsCursorAndCancelReturnsNil(t *testing.T) { + t.Parallel() + path := seedBackfillDB(t, t.TempDir(), 2) + opts, _, _ := discardOpts() + a, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + + out := make(chan canonical.Event, 4096) + if err := a.Scan(context.Background(), nil, out); err != nil { + t.Fatalf("Scan: %v", err) + } + got := drainAll(out) + if c := countKind(got, canonical.EvSessionStarted); c != 2 { + t.Errorf("Scan SessionStarted = %d, want 2", c) + } + if a.scanCursor == nil { + t.Fatal("Scan did not record scanCursor on the instance") + } + if !a.scanCursor.hasProgress() { + t.Error("recorded scanCursor has no progress after a non-empty scan") + } + + // A cancelled Scan returns nil and still records a best-effort cursor. + aCancel, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + ctx, cancel := context.WithCancel(context.Background()) + cancel() + out2 := make(chan canonical.Event, 4096) + if err := aCancel.Scan(ctx, nil, out2); err != nil { + t.Fatalf("cancelled Scan = %v, want nil", err) + } + if aCancel.scanCursor == nil { + t.Fatal("cancelled Scan did not record a best-effort cursor") + } +} + +// TestAdapter_ScanThenTailHandoff is the load-bearing Scan→Tail hand-off: Scan +// records the watermark on the instance, and a following Tail resumes from it +// (NOT from HEAD — no full-history replay). A session inserted AFTER Scan is +// emitted by Tail, and an OLD session sitting BELOW the resumed boundary ms is +// never re-emitted. 
+// +// Note (SOW-0005 round-7 P1-1): the warm Tail's first gate-open poll runs the +// bounded boundary-bucket re-scan against the resume watermark's ms `T` — that is +// the fix's contract (a same-ms in-place update of an already-emitted row must be +// re-checked). A session whose rows sit AT `T` is therefore idempotently re-emitted +// (harmless — the writer's upserts absorb it; pinned by TestP1_R6_/TestP1_R7_*). +// The hand-off property that matters is the one this test pins: Tail does NOT +// replay the FULL history — a session at an EARLIER ms than the boundary is never +// re-emitted. The fixture puts ses_old well below the boundary to prove that. +func TestAdapter_ScanThenTailHandoff(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // ses_old sits far below the eventual resume boundary (ms 10) — Scan emits it, + // and the resumed Tail must NEVER replay it (it is not in the boundary bucket). + insertSession(t, rw, "ses_old", "", 10, 10, 0) + insertAssistantMessage(t, rw, "msg_old", "ses_old", 10, 10, 3, 1) + // ses_a is the last-scanned session; its tree sits at the resume boundary ms. + insertSession(t, rw, "ses_a", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", 110, 110, 5, 2) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + opts, _, _ := discardOpts() + a, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + + scanOut := make(chan canonical.Event, 4096) + if err := a.Scan(context.Background(), nil, scanOut); err != nil { + t.Fatalf("Scan: %v", err) + } + scanEvents := drainAll(scanOut) + if !hasSession(scanEvents, "ses_a") || !hasSession(scanEvents, "ses_old") { + t.Fatal("Scan did not emit the seeded sessions") + } + if a.scanCursor == nil { + t.Fatal("Scan did not record scanCursor") + } + + // Insert a NEW session after Scan, then run Tail. Tail must resume from the + // recorded watermark and emit the new session. 
+ rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + defer func() { _ = rw2.Close() }() + insertSession(t, rw2, "ses_b", "", 200, 200, 0) + insertAssistantMessage(t, rw2, "msg_b", "ses_b", 210, 210, 7, 3) + + tailOut := make(chan canonical.Event, 4096) + ctx, cancel := context.WithCancel(context.Background()) + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer wg.Done() + _ = a.Tail(ctx, tailOut) + }() + defer func() { cancel(); wg.Wait() }() + + got, ok := waitForSession(tailOut, "ses_b", 8*time.Second) + if !ok { + t.Fatal("Tail did not emit the session inserted after Scan") + } + // The resumed Tail must NOT replay full history: ses_old (far below the resume + // boundary ms) is never re-emitted. (ses_a sits AT the boundary ms and may be + // idempotently re-emitted by the bounded boundary re-scan — round-7 P1-1 — which + // is NOT a hand-off break; the watermark resume is intact, only the single + // boundary ms is re-checked.) + if hasSession(got, "ses_old") { + t.Error("resumed Tail replayed ses_old from below the boundary (full-history replay — cursor hand-off broken)") + } +} + +// TestAdapter_TailColdSnapshot covers the cold-Tail path (no preceding Scan): +// Tail snapshots current HEAD so it follows from now and does NOT replay a +// pre-existing session; a session inserted AFTER the loop starts is emitted. +func TestAdapter_TailColdSnapshot(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // A pre-existing session the cold snapshot must NOT replay. 
+ insertSession(t, rw, "ses_old", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_old", "ses_old", 110, 110, 5, 2) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + opts, _, _ := discardOpts() + a, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + + tailOut := make(chan canonical.Event, 4096) + ctx, cancel := context.WithCancel(context.Background()) + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer wg.Done() + _ = a.Tail(ctx, tailOut) // cold: no preceding Scan → HEAD snapshot + }() + defer func() { cancel(); wg.Wait() }() + + // Let the snapshot + watch establish, then insert a NEW session. + time.Sleep(200 * time.Millisecond) + rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + defer func() { _ = rw2.Close() }() + insertSession(t, rw2, "ses_fresh", "", 300, 300, 0) + insertAssistantMessage(t, rw2, "msg_fresh", "ses_fresh", 310, 310, 9, 4) + + got, ok := waitForSession(tailOut, "ses_fresh", 8*time.Second) + if !ok { + t.Fatal("cold Tail did not emit the session inserted after it started") + } + if hasSession(got, "ses_old") { + t.Error("cold Tail replayed the pre-existing session ses_old (HEAD snapshot broken)") + } +} + +// TestAdapter_SnapshotCursor pins snapshotCursor: it records the DB's current +// HEAD watermarks for every tracked table AND the real __drizzle_migrations +// schema hash. 
+func TestAdapter_SnapshotCursor(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db", drizzleMigrationsDDL) + insertSession(t, rw, "ses_1", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_1", "ses_1", 110, 110, 5, 2) + insertPart(t, rw, "prt_1", "msg_1", "ses_1", 120, 120, textBody("a")) + insertMigration(t, rw, "20260127222353_a") + insertMigration(t, rw, "20260510033149_b") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + opts, _, _ := discardOpts() + a, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + cur, err := a.snapshotCursor(context.Background()) + if err != nil { + t.Fatalf("snapshotCursor: %v", err) + } + + // Each table's watermark equals the DB maxima. + db, _ := introspect(t, path) + for _, table := range trackedTables { + wantMaxID, _ := maxID(ctxBG(), db, table) + wantMaxTU, _ := maxTimeUpdated(ctxBG(), db, table) + w := cur.Tables[table] + // A cold-Tail HEAD snapshot starts both the monotonic high-water and the + // paging-position id at MAX(id) (SOW-0005 round-2 P1-A). + if w.MaxIDSeen != wantMaxID || w.MaxTimeUpdatedID != wantMaxID || w.MaxTimeUpdatedMs != wantMaxTU { + t.Errorf("table %q snapshot watermark = %+v, want {MaxIDSeen:%q MaxTimeUpdatedID:%q MaxTimeUpdatedMs:%d}", table, w, wantMaxID, wantMaxID, wantMaxTU) + } + } + // The real migration-name hash is recorded. + wantHash := schemaHash([]string{"20260127222353_a", "20260510033149_b"}) + if cur.SchemaHash != wantHash { + t.Errorf("snapshot SchemaHash = %q, want real migration digest %q", cur.SchemaHash, wantHash) + } +} + +// TestAdapter_TailColdSnapshotMissingDB verifies a cold Tail over a missing DB +// surfaces the snapshot open error (Tail wraps and returns it before the loop). 
+func TestAdapter_TailColdSnapshotMissingDB(t *testing.T) { + t.Parallel() + opts, _, _ := discardOpts() + a, err := New(t.TempDir()+"/no-such.db", opts) + if err != nil { + t.Fatalf("New: %v", err) + } + // snapshotCursor surfaces the open error directly (avoids racing tailLoop). + if _, err := a.snapshotCursor(context.Background()); err == nil { + t.Error("snapshotCursor over a missing DB = nil error, want open error") + } +} + +// seedIncompatibleSchemaDB builds a DB whose session table LACKS the required +// time_updated column, so introspectAll fails fast — driving the fatal-schema +// error path in Scan and the cold-Tail snapshot. +func seedIncompatibleSchemaDB(t *testing.T, dir string) string { + t.Helper() + path := filepath.Join(dir, "opencode-bad.db") + rw, err := sqlOpenRW(t, path) + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { _ = rw.Close() }() + stmts := []string{ + // session is MISSING time_updated (a required column) → unreadable. + `CREATE TABLE session (id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, version TEXT NOT NULL, + time_created INTEGER NOT NULL)`, + `CREATE TABLE message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + `CREATE TABLE part (id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + `CREATE TABLE session_message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + } + for _, s := range stmts { + if _, err := rw.Exec(s); err != nil { + t.Fatalf("seed incompatible schema: %v\nstmt: %s", err, s) + } + } + return path +} + +// TestAdapter_ScanIncompatibleSchemaHardError drives Scan's fatal-error branch: +// an incompatible schema (a 
required column missing) makes scanLoop return a +// non-cancel error, which Scan wraps and returns (NOT nil). The best-effort +// cursor is still recorded on the instance. +func TestAdapter_ScanIncompatibleSchemaHardError(t *testing.T) { + t.Parallel() + path := seedIncompatibleSchemaDB(t, t.TempDir()) + opts, _, _ := discardOpts() + a, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + out := make(chan canonical.Event, 16) + if err := a.Scan(context.Background(), nil, out); err == nil { + t.Fatal("Scan over an incompatible schema = nil error, want fatal schema error") + } + if a.scanCursor == nil { + t.Error("Scan did not record a best-effort cursor on a hard error") + } +} + +// TestAdapter_TailColdSnapshotIncompatibleSchema drives the cold-Tail snapshot's +// introspect-error branch: snapshotCursor surfaces the incompatible-schema error, +// which Tail wraps and returns before entering the poll loop. +func TestAdapter_TailColdSnapshotIncompatibleSchema(t *testing.T) { + t.Parallel() + path := seedIncompatibleSchemaDB(t, t.TempDir()) + opts, _, _ := discardOpts() + a, err := New(path, opts) + if err != nil { + t.Fatalf("New: %v", err) + } + if _, err := a.snapshotCursor(context.Background()); err == nil { + t.Error("snapshotCursor over an incompatible schema = nil error, want introspect error") + } +} + +// sqlOpenRW opens a writable handle to a synthetic DB path (test-only; production +// never opens opencode.db read-write). Mirrors store_testhelpers_test.rwDSNFor.
+func sqlOpenRW(t *testing.T, path string) (*sql.DB, error) { + t.Helper() + return sql.Open(driverName, "file:"+escapeURIPath(filepath.ToSlash(path))+"?_pragma=busy_timeout(5000)") +} diff --git a/internal/adapters/opencode/adapter_test.go b/internal/adapters/opencode/adapter_test.go new file mode 100644 index 0000000..a15f99a --- /dev/null +++ b/internal/adapters/opencode/adapter_test.go @@ -0,0 +1,176 @@ +package opencode + +import ( + "errors" + "io" + "log/slog" + "sync" + "testing" + + "github.com/netdata/ai-viewer/internal/adapters" + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the opencode Adapter's CONSTRUCTION + cursor tests +// (New/Name/Format/Factory/ParseCursor/coerceCursor) plus the shared test +// helpers. The Scan/Tail/snapshot lifecycle tests live in +// adapter_lifecycle_test.go (split for the 400-line budget). + +// discardOpts returns AdapterOptions with a discard logger and a recording +// onError, plus the slice the errors land in (guarded by mu for the Tail +// goroutine). Mirrors codex's silentOpts. +func discardOpts() (canonical.AdapterOptions, *[]string, *sync.Mutex) { + var mu sync.Mutex + errs := &[]string{} + opts := canonical.AdapterOptions{ + Logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + OnError: func(e error) { + mu.Lock() + *errs = append(*errs, e.Error()) + mu.Unlock() + }, + } + return opts, errs, &mu +} + +// alienCursor is a foreign canonical.Cursor used to drive coerceCursor's +// type-assertion-miss branch. +type alienCursor struct{} + +func (alienCursor) String() string { return "{}" } +func (alienCursor) After(canonical.Cursor) bool { return false } + +// TestAdapter_NewRejectsEmptyLocation pins the fail-fast guard on an empty DB path. 
+func TestAdapter_NewRejectsEmptyLocation(t *testing.T) { + t.Parallel() + if _, err := New("", canonical.AdapterOptions{}); err == nil { + t.Fatal("New(\"\") = nil error, want non-nil") + } +} + +// TestAdapter_NewDefaultsNilDeps verifies New tolerates a nil Logger and nil +// OnError (substituting defaults) so adapter code can call them unconditionally. +func TestAdapter_NewDefaultsNilDeps(t *testing.T) { + t.Parallel() + a, err := New("/some/opencode.db", canonical.AdapterOptions{}) + if err != nil { + t.Fatalf("New: %v", err) + } + if a.logger == nil { + t.Error("logger is nil; want default") + } + if a.onError == nil { + t.Error("onError is nil; want no-op default") + } + a.onError(errors.New("x")) // the no-op must be callable + if a.sourceID != "opencode:/some/opencode.db" { + t.Errorf("sourceID = %q, want opencode:/some/opencode.db", a.sourceID) + } +} + +// TestAdapter_NameAndFormat pins the registry identifiers. +func TestAdapter_NameAndFormat(t *testing.T) { + t.Parallel() + a, err := New("/some/opencode.db", canonical.AdapterOptions{}) + if err != nil { + t.Fatalf("New: %v", err) + } + if a.Name() != "opencode" || a.Format() != "opencode" { + t.Errorf("Name()/Format() = %q/%q, want opencode/opencode", a.Name(), a.Format()) + } + if a.Name() != Format || a.Format() != Format { + t.Errorf("Name/Format must equal Format const %q", Format) + } +} + +// TestAdapter_FactoryAndRegistry builds an Adapter through the registry factory, +// proving init() ran (acceptance #1), and rejects the empty location. 
+func TestAdapter_FactoryAndRegistry(t *testing.T) { + t.Parallel() + factory, ok := adapters.Get("opencode") + if !ok { + t.Fatal("opencode factory not registered (init did not run)") + } + a, err := factory(t.TempDir()+"/opencode.db", canonical.AdapterOptions{}) + if err != nil { + t.Fatalf("registry factory: %v", err) + } + if a == nil { + t.Fatal("registry factory returned nil adapter") + } + if a.Name() != "opencode" || a.Format() != "opencode" { + t.Errorf("factory adapter Name()/Format() = %q/%q, want opencode/opencode", a.Name(), a.Format()) + } + // The package-level Factory rejects an empty location. + if _, err := Factory("", canonical.AdapterOptions{}); err == nil { + t.Fatal("Factory(\"\") = nil error, want non-nil") + } +} + +// TestAdapter_ParseCursor round-trips a cursor and verifies that an unknown +// version yields a fresh zero cursor rather than an error. +func TestAdapter_ParseCursor(t *testing.T) { + t.Parallel() + a, err := New("/some/opencode.db", canonical.AdapterOptions{}) + if err != nil { + t.Fatalf("New: %v", err) + } + // Empty → zero cursor (not nil). + c, err := a.ParseCursor("") + if err != nil { + t.Fatalf("ParseCursor(\"\"): %v", err) + } + if c == nil { + t.Fatal("ParseCursor(\"\") = nil cursor") + } + // Round-trip a non-empty cursor (current v2 split-watermark shape). + seed := newCursor().withTable("session", TableWatermark{MaxIDSeen: "ses_9", MaxTimeUpdatedMs: 42, MaxTimeUpdatedID: "ses_9"}) + got, err := a.ParseCursor(seed.String()) + if err != nil { + t.Fatalf("ParseCursor(round-trip): %v", err) + } + if !got.After(newCursor()) { + t.Error("round-tripped cursor should be After the empty cursor") + } + // A future/unknown version is NOT an error: it re-scans from zero (our own + // cursor shape drifting is recoverable by an idempotent backfill; SOW-0005 P1-A). 
+ reScan, err := a.ParseCursor(`{"version":999}`) + if err != nil { + t.Errorf("ParseCursor(unknown version) = %v, want nil (re-scan from zero)", err) + } + if tc, ok := reScan.(Cursor); !ok || tc.hasProgress() { + t.Errorf("ParseCursor(unknown version) = %+v, want a fresh zero cursor", reScan) + } +} + +// TestAdapter_CoerceCursor covers the nil, typed-zero, and alien-type branches. +func TestAdapter_CoerceCursor(t *testing.T) { + t.Parallel() + a, err := New("/some/opencode.db", canonical.AdapterOptions{}) + if err != nil { + t.Fatalf("New: %v", err) + } + // nil → fresh cursor with non-nil map + version. + if c := a.coerceCursor(nil); c.Tables == nil || c.Version != cursorVersion { + t.Errorf("coerceCursor(nil) = %+v, want initialized map + version", c) + } + // Typed cursor with nil map + zero version is normalized in place. + if c := a.coerceCursor(Cursor{}); c.Tables == nil || c.Version != cursorVersion { + t.Errorf("coerceCursor(typed-zero) = %+v, want normalized", c) + } + // Alien cursor type → fresh cursor (full re-scan; never skips data). + c := a.coerceCursor(alienCursor{}) + if c.Version != cursorVersion || c.hasProgress() { + t.Errorf("coerceCursor(alien) = %+v, want fresh zero-watermark cursor", c) + } +} + +// hasSession reports whether evs contains a SessionStarted for nativeID. Shared +// with the lifecycle tests in adapter_lifecycle_test.go. +func hasSession(evs []canonical.Event, nativeID string) bool { + for _, ev := range evs { + if s, ok := ev.(canonical.SessionStartedEvent); ok && s.NativeID == nativeID { + return true + } + } + return false +} diff --git a/internal/adapters/opencode/conn.go b/internal/adapters/opencode/conn.go new file mode 100644 index 0000000..18dd0c1 --- /dev/null +++ b/internal/adapters/opencode/conn.go @@ -0,0 +1,320 @@ +package opencode + +import ( + "context" + "database/sql" + "fmt" + "net/url" + "path/filepath" + "strings" + "time" + + // Registers the "sqlite" driver with database/sql. 
Pure-Go, CGO-free + // per AGENTS.md tech stack — the same driver internal/store uses, but + // this adapter opens a DIFFERENT database (opencode's, not ai-viewer's) + // under a DIFFERENT, read-only-only contract. See openReadOnly. + _ "modernc.org/sqlite" +) + +// driverName is the database/sql driver identifier registered by +// modernc.org/sqlite. Kept in a constant so the DSN builder and openReadOnly +// agree; mirrors internal/store/store.go:driverName. +const driverName = "sqlite" + +// readOnlyPragmas is the EXACT, ordered set of connection-time PRAGMAs the +// adapter appends to every opencode DSN. It is the read-safety contract, in +// one place, asserted by conn_test.go's six write-probes: +// +// - query_only(true): SQL-layer rejection of any INSERT/UPDATE/DELETE and +// of write-path PRAGMAs (wal_checkpoint, etc). Defence-in-depth on top +// of the OS-level mode=ro below. +// - busy_timeout(5000): wait up to 5 s for a lock rather than failing +// immediately. Locks are rare in WAL mode but happen during opencode's +// own checkpoint; matches opencode's own busy_timeout +// (anomalyco/opencode @ 2b3ddf9 :: packages/opencode/src/storage/db.ts). +// +// Deliberately NOT included (SOW-0005 Open Decision #1, recorded): +// +// - foreign_keys: immaterial for a read-only connection — no write can +// violate a constraint — so it is omitted rather than forced off. +// - journal_mode(WAL): a read-only connection cannot change the journal +// mode, and opening a WAL-mode database with mode=ro already enters WAL +// reader mode (consistent snapshot from the last checkpoint, never +// blocking opencode's writer). Setting it would be a no-op at best and a +// spurious write attempt at worst, so it is left out. +// +// This is intentionally NARROWER than internal/store.buildDSN, which targets +// ai-viewer's OWN database and forces foreign_keys(on) + journal_mode(wal) + +// synchronous(normal) for a writable pool of 8. 
That helper must never be +// pointed at opencode.db: its writer contract and FK enforcement are wrong +// for an external, live, read-only source. +var readOnlyPragmas = []string{ + "query_only(true)", + "busy_timeout(5000)", +} + +// txlockDeferred is the forced _txlock value: a deferred BEGIN takes its +// snapshot on the first SELECT and never acquires a write lock. The DSN builder +// drops any caller _txlock (e.g. "exclusive", which would open a write-path +// BEGIN) and sets this, so the read-only contract holds even against a +// maliciously-constructed path string (adapter-opencode.md §"Read Strategy"). +const txlockDeferred = "deferred" + +// maxOpenConns bounds the read pool. SOW-0005 Open Decision #1: two +// connections — one for the watch poll, one for a rare presenter-triggered +// re-read. A live multi-GB WAL database tolerates many concurrent readers, +// but the adapter needs no more than two and a small bound keeps file +// descriptors and WAL page cache predictable. +const maxOpenConns = 2 + +// connMaxLifetime recycles a pooled connection periodically so stale WAL +// pages held in a long-lived connection's cache are released back, per +// adapter-opencode.md §"Read Strategy" (SetConnMaxLifetime(30 * time.Minute)). +const connMaxLifetime = 30 * time.Minute + +// buildReadOnlyDSN turns an opencode database file path into the read-only +// modernc.org/sqlite DSN. The path is made absolute and wrapped in the +// "file:" URI form so the driver preserves the query string — without the +// "file:" prefix modernc.org/sqlite strips everything after the first '?' +// (conn.go:53-55), which would silently drop mode=ro and the PRAGMAs and let +// the OS open the file read+write. The path component is percent-escaped so +// a directory containing '?', '#', or spaces cannot corrupt the query. 
+// +// The resulting DSN always carries, in this order: +// +// file:?_pragma=query_only(true)&_pragma=busy_timeout(5000)&_txlock=deferred&mode=ro +// +// mode=ro asks the OS to open the file O_RDONLY: SQLite cannot upgrade the +// connection to writable, so it is the primary, OS-enforced guard. The +// _pragma entries are the SQL-layer second line of defence. +// +// Read-safety policy (adapter-opencode.md §"Read Strategy"): the DSN is built +// as an ALLOWLIST, not a denylist. ALL caller-supplied `_pragma` values are +// DROPPED and replaced with exactly the readOnlyPragmas set, so no caller +// pragma — colliding or not — survives. A write-path pragma the old denylist +// did not name (wal_checkpoint(TRUNCATE), optimize, foreign_keys(on), …) can +// therefore never reach the connection. `_txlock` is forced to `deferred` +// (any caller `_txlock`, e.g. `exclusive`, is dropped) so a BEGIN can never +// take a write lock. mode=ro is forced regardless of the caller. +// +// A DSN that is already a "file:" URI (or an in-memory ":memory:" form used +// only by tests that want a throwaway shared cache) is accepted; its query is +// parsed only to be DISCARDED and rebuilt from the read-only set, so callers +// may hand either a bare path or a pre-built URI without weakening the guard. +// +// Bare-path opacity (SOW-0005 round-4 P3-2): the `?`-split that strips the query +// runs ONLY for the URI forms (file: / :memory:). A BARE filesystem path is +// treated as OPAQUE — POSIX allows '?' in a filename, so a bare path containing +// '?' opens the LITERAL file rather than misparsing everything after the '?' as a +// DSN query. The whole bare path (including any '?') is percent-escaped into the +// file: URI path. The default opencode database path contains no '?', so this is a +// correctness guard for an unusual --source location, not a change to the common +// case. 
+// +// CLI CONTRACT (SOW-0005 round-3 P2-4): the ingest CLI's opencode source +// location is always a FILESYSTEM PATH — both auto-discovery and +// `--source opencode:` resolve to a real path, which +// cmd/ai-viewer-ingest's startSource validates with os.Stat BEFORE constructing +// the adapter. os.Stat fails for the "file:"/":memory:" DSN forms, so those +// shapes are NOT valid --source locations: they exist purely for this package's +// programmatic and test callers (throwaway shared-cache DBs). buildReadOnlyDSN +// still accepts them when called directly; the filesystem-path-only rule is a +// CLI-layer contract, not a restriction enforced here. +func buildReadOnlyDSN(dbPath string) (string, error) { + if dbPath == "" { + return "", fmt.Errorf("opencode: database path must be non-empty") + } + + // Only the URI FORMS (file: / :memory:) carry a `?`-delimited query string; + // a BARE filesystem path is treated as OPAQUE (SOW-0005 round-4 P3-2). POSIX + // allows '?' in a filename, so splitting a bare path on '?' would misparse a + // real file whose name contains '?' — dropping part of the path into a bogus + // query. The default opencode path has no '?', but pointing --source + // opencode: at such a file must open the LITERAL file. Query splitting is + // therefore scoped to the URI forms only; the bare-path branch escapes the + // whole dbPath (including any '?') as the file: path. + var fileURI, existingQuery string + switch { + case strings.HasPrefix(dbPath, "file:"): + fileURI, existingQuery = splitQuery(dbPath) + case isMemoryDSN(dbPath): + fileURI, existingQuery = splitQuery(dbPath) + default: + // Bare filesystem path: opaque. Do NOT split on '?'. + abs, err := filepath.Abs(dbPath) + if err != nil { + return "", fmt.Errorf("opencode: resolve db path %q: %w", dbPath, err) + } + // SQLite URIs require forward slashes and percent-escaping of the + // path. 
escapeURIPath applies url.PathEscape per segment, which leaves the '/' + // separators intact and escapes '?'/'#'/spaces, giving a valid opaque + // file: path for a filename containing any of those. + uriPath := filepath.ToSlash(abs) + if !strings.HasPrefix(uriPath, "/") { + uriPath = "/" + uriPath + } + fileURI = "file:" + escapeURIPath(uriPath) + } + + // Parse the caller query only to VALIDATE it (a malformed query is a hard + // error); its contents are then discarded. Building from a fresh url.Values + // guarantees no caller _pragma or _txlock can leak through. existingQuery is + // empty for a bare path (never split), so this is a no-op there. + if _, err := url.ParseQuery(existingQuery); err != nil { + return "", fmt.Errorf("opencode: invalid db DSN query for %q: %w", dbPath, err) + } + + params := url.Values{} + // mode=ro is the OS-level guard; forced regardless of the caller. + params.Set("mode", "ro") + // _txlock=deferred: a read snapshot taken on first SELECT, never a write + // lock. Forced so a caller _txlock=exclusive cannot open a write-path BEGIN. + params.Set("_txlock", txlockDeferred) + // The ONLY pragmas on the connection are our read-only set (allowlist). + for _, p := range readOnlyPragmas { + params.Add("_pragma", p) + } + + return fileURI + "?" + params.Encode(), nil +} + +// openReadOnly opens the opencode database at dbPath strictly read-only and +// returns the pooled *sql.DB. It is the ONLY way this package acquires a +// connection to opencode's database, and the single chokepoint the +// read-safety tests exercise. +// +// Why a separate helper and not store.OpenReader: +// +// - store.OpenReader targets ai-viewer's OWN database. It forces +// foreign_keys(on) and a pool of 8 and is paired with a writer process +// (the ingester) that runs migrations. None of that is correct for an +// external, live, concurrently-written source we must never touch. 
+// - The opencode DB is the highest read-safety risk in the project: a +// stray write corrupts the operator's primary coding tool. Isolating its +// connection logic in this package keeps the contract — DSN, PRAGMAs, +// pool bounds — visible and independently testable, decoupled from +// ai-viewer's own-DB concerns. +// +// sql.Open is lazy (it never contacts the database), so PingContext is +// invoked immediately to surface a missing or unreadable file at the open +// call rather than at the first delta query. Because the DSN carries +// mode=ro, the OS refuses to create a missing file, so a non-existent path +// fails here rather than silently materialising an empty database. +func openReadOnly(ctx context.Context, dbPath string, opts ...connOption) (*sql.DB, error) { + cfg := connConfig{ + maxOpenConns: maxOpenConns, + connMaxLifetime: connMaxLifetime, + } + for _, o := range opts { + o(&cfg) + } + + dsn, err := buildReadOnlyDSN(dbPath) + if err != nil { + return nil, err + } + + db, err := sql.Open(driverName, dsn) + if err != nil { + return nil, fmt.Errorf("opencode: open %q (ro): %w", dbPath, err) + } + db.SetMaxOpenConns(cfg.maxOpenConns) + db.SetMaxIdleConns(1) + db.SetConnMaxLifetime(cfg.connMaxLifetime) + + if err := db.PingContext(ctx); err != nil { + _ = db.Close() + return nil, fmt.Errorf("opencode: ping %q (ro): %w", dbPath, err) + } + return db, nil +} + +// connConfig holds the tunables openReadOnly applies to the pool. Defaults +// come from the package constants; tests override them via connOption to +// keep production defaults the single source of truth. +type connConfig struct { + maxOpenConns int + connMaxLifetime time.Duration +} + +// connOption mutates a connConfig. Used by tests to pin a single connection +// or a short lifetime; production callers pass no options and inherit the +// package defaults. +type connOption func(*connConfig) + +// withMaxOpenConns overrides the pool size. Test-only knob. 
+func withMaxOpenConns(n int) connOption { + return func(c *connConfig) { c.maxOpenConns = n } +} + +// splitQuery splits a DSN into its prefix (path or file: URI) and the query +// string after the first '?'. Mirrors store.splitDSNQuery; kept local so the +// adapter stays self-contained per SOW-0005 Open Decision #1. +func splitQuery(dsn string) (prefix, query string) { + if p, q, ok := strings.Cut(dsn, "?"); ok { + return p, q + } + return dsn, "" +} + +// isMemoryDSN reports whether dsn refers to an in-memory SQLite database. +// Only tests use the in-memory form; production always passes a file path. +func isMemoryDSN(dsn string) bool { + if dsn == ":memory:" { + return true + } + return strings.HasPrefix(dsn, "file::memory:") || + strings.Contains(dsn, ":memory:?") || + strings.HasPrefix(dsn, ":memory:?") +} + +// pragmaName extracts the lowercase pragma identifier from a _pragma value, +// tolerating the "name(value)" and "name=value" forms and an optional +// "." qualifier. A trimmed, lowercased identifier is returned. +func pragmaName(v string) string { + v = strings.TrimSpace(v) + if dot := strings.IndexByte(v, '.'); dot > 0 && isBareIdent(v[:dot]) { + v = v[dot+1:] + } + end := len(v) + for i, r := range v { + if r == '(' || r == '=' || r == ' ' || r == '\t' { + end = i + break + } + } + return strings.ToLower(strings.TrimSpace(v[:end])) +} + +// isBareIdent reports whether s is a bare SQL identifier +// ([A-Za-z_][A-Za-z0-9_]*) — used to recognise a "." qualifier so +// pragmaName strips exactly one legitimate schema prefix and nothing else. 
+func isBareIdent(s string) bool { + if s == "" { + return false + } + for i := 0; i < len(s); i++ { + c := s[i] + isAlpha := c == '_' || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') + isDigit := c >= '0' && c <= '9' + if i == 0 && !isAlpha { + return false + } + if i > 0 && !isAlpha && !isDigit { + return false + } + } + return true +} + +// escapeURIPath percent-escapes a slash-separated absolute path for use as +// the opaque path of a "file:" SQLite URI, preserving the '/' separators. +// url.PathEscape escapes a single segment (including '/'), so the path is +// split on '/' and each segment escaped independently. +func escapeURIPath(p string) string { + segs := strings.Split(p, "/") + for i, s := range segs { + segs[i] = url.PathEscape(s) + } + return strings.Join(segs, "/") +} diff --git a/internal/adapters/opencode/conn_dsn_test.go b/internal/adapters/opencode/conn_dsn_test.go new file mode 100644 index 0000000..6322de4 --- /dev/null +++ b/internal/adapters/opencode/conn_dsn_test.go @@ -0,0 +1,192 @@ +package opencode + +import ( + "net/url" + "os" + "path/filepath" + "reflect" + "sort" + "strings" + "testing" +) + +// This file pins the read-only DSN ALLOWLIST policy (SOW-0005 P1.2): every +// caller-supplied _pragma is dropped and the query is rebuilt from the fixed +// read-only set + mode=ro + _txlock=deferred, so no path-string vector can reach +// a write-path pragma or an exclusive (write-lock) BEGIN. Split out of +// conn_test.go to keep each file ≤400 lines. + +// TestBuildReadOnlyDSN_AllowlistDropsAllCallerPragmas asserts the ALLOWLIST +// policy: EVERY caller-supplied _pragma is dropped — both the ones that collide +// with the read-only set (query_only, busy_timeout, including a schema-qualified +// form) AND any other (cache_size). The built DSN's _pragma set is EXACTLY the +// read-only set, nothing else. This is the inversion of the old denylist that let +// non-colliding pragmas pass through. 
+func TestBuildReadOnlyDSN_AllowlistDropsAllCallerPragmas(t *testing.T) { + t.Parallel() + in := "file:/db/opencode.db?_pragma=query_only(false)&_pragma=main.busy_timeout(1)&_pragma=cache_size(-2000)" + dsn, err := buildReadOnlyDSN(in) + if err != nil { + t.Fatalf("buildReadOnlyDSN: %v", err) + } + _, query := splitQuery(dsn) + params, err := url.ParseQuery(query) + if err != nil { + t.Fatalf("parse query: %v", err) + } + got := append([]string(nil), params["_pragma"]...) + sort.Strings(got) + want := append([]string(nil), readOnlyPragmas...) + sort.Strings(want) + if !reflect.DeepEqual(got, want) { + t.Errorf("_pragma set = %v, want EXACTLY the read-only set %v (allowlist drops all caller pragmas)", got, want) + } +} + +// TestBuildReadOnlyDSN_MaliciousDSNNeutralised is the P1.2 read-safety proof: a +// path string crafted with write-path pragmas (wal_checkpoint(TRUNCATE), +// foreign_keys(on)) and _txlock=exclusive yields a DSN that carries ONLY the +// read-only pragmas, mode=ro, and _txlock=deferred — none of the injected write +// vectors survive. The old denylist would have let the non-colliding pragmas and +// the exclusive txlock through. +func TestBuildReadOnlyDSN_MaliciousDSNNeutralised(t *testing.T) { + t.Parallel() + in := "file:/x/opencode.db?_pragma=wal_checkpoint(TRUNCATE)&_txlock=exclusive&_pragma=foreign_keys(on)" + dsn, err := buildReadOnlyDSN(in) + if err != nil { + t.Fatalf("buildReadOnlyDSN: %v", err) + } + _, query := splitQuery(dsn) + params, err := url.ParseQuery(query) + if err != nil { + t.Fatalf("parse query: %v", err) + } + + // _pragma must be EXACTLY the read-only set — no wal_checkpoint, no foreign_keys. + got := append([]string(nil), params["_pragma"]...) + sort.Strings(got) + want := append([]string(nil), readOnlyPragmas...) 
+ sort.Strings(want) + if !reflect.DeepEqual(got, want) { + t.Errorf("_pragma = %v, want EXACTLY %v (no injected write-path pragma may survive)", got, want) + } + for _, p := range params["_pragma"] { + switch pragmaName(p) { + case "wal_checkpoint", "foreign_keys", "optimize": + t.Errorf("injected write-path pragma survived: %q", p) + } + } + if got := params.Get("_txlock"); got != txlockDeferred { + t.Errorf("_txlock = %q, want %q (injected exclusive must be replaced)", got, txlockDeferred) + } + if got := params.Get("mode"); got != "ro" { + t.Errorf("mode = %q, want ro", got) + } +} + +// --- P3-2: a bare filesystem path is opaque (not split on '?') ----------------- + +// TestBuildReadOnlyDSN_BarePathWithQuestionMarkIsOpaque pins SOW-0005 round-4 P3-2: +// a BARE filesystem path containing '?' must be treated as the LITERAL path — the +// '?' is part of the filename (POSIX allows it), NOT a DSN query delimiter. The +// built DSN therefore percent-escapes the '?' (as %3F) into the file: path and +// carries ONLY the forced read-only query (mode=ro, _txlock, the read-only +// pragmas), with no fragment of the path leaking into the query. +func TestBuildReadOnlyDSN_BarePathWithQuestionMarkIsOpaque(t *testing.T) { + t.Parallel() + bare := "/data/oc?weird/opencode.db" + dsn, err := buildReadOnlyDSN(bare) + if err != nil { + t.Fatalf("buildReadOnlyDSN(bare-with-?): %v", err) + } + prefix, query := splitQuery(dsn) + // The path portion must contain the escaped '?', proving it was NOT split off. + if !strings.Contains(prefix, "%3F") { + t.Errorf("bare-path DSN prefix %q lost the literal '?' (should be %%3F-escaped, not split)", prefix) + } + // The query must be EXACTLY the forced read-only set — none of the path's + // "weird/opencode.db" tail may appear as a query param. 
+ params, err := url.ParseQuery(query) + if err != nil { + t.Fatalf("parse query: %v", err) + } + if params.Get("mode") != "ro" { + t.Errorf("mode = %q, want ro", params.Get("mode")) + } + if strings.Contains(query, "weird") || strings.Contains(query, "opencode.db") { + t.Errorf("part of the bare path leaked into the query: %q", query) + } +} + +// TestOpenReadOnly_BarePathWithQuestionMarkOpensLiteralFile is the end-to-end proof: +// an opencode DB created at a path whose DIRECTORY name contains '?' is opened +// read-only via the literal path (round-4 P3-2). Before the fix the '?' split the +// DSN and the OS opened a different (non-existent) path, failing the ping. +func TestOpenReadOnly_BarePathWithQuestionMarkOpensLiteralFile(t *testing.T) { + t.Parallel() + base := t.TempDir() + weird := filepath.Join(base, "q?dir") + if err := os.MkdirAll(weird, 0o755); err != nil { + t.Fatalf("mkdir %q: %v", weird, err) + } + // newEmptyDB creates the schema at weird/; its rwDSNFor escapes the path, + // so the file is created literally inside the '?'-named directory. + path, rw := newEmptyDB(t, weird, "opencode.db") + if !strings.Contains(path, "?") { + t.Fatalf("test path %q unexpectedly has no '?'", path) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, err := openReadOnly(ctxBG(), path) + if err != nil { + t.Fatalf("openReadOnly(bare path with '?') failed — the '?' was misparsed as a query: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + // A trivial query confirms the connection is live (the literal file opened). + if _, err := introspectAll(ctxBG(), db); err != nil { + t.Fatalf("introspect over the '?'-path DB: %v", err) + } +} + +// TestBuildReadOnlyDSN_FileURIFormStillStripsQuery guards the OTHER side of the +// round-4 P3-2 scoping: the URI forms (file:) STILL split + rebuild the query, so a +// caller-supplied query on a file: DSN is parsed and replaced by the read-only set. 
+func TestBuildReadOnlyDSN_FileURIFormStillStripsQuery(t *testing.T) { + t.Parallel() + in := "file:/db/opencode.db?_pragma=query_only(false)&mode=rwc" + dsn, err := buildReadOnlyDSN(in) + if err != nil { + t.Fatalf("buildReadOnlyDSN(file: form): %v", err) + } + _, query := splitQuery(dsn) + params, err := url.ParseQuery(query) + if err != nil { + t.Fatalf("parse query: %v", err) + } + if params.Get("mode") != "ro" { + t.Errorf("file: form mode = %q, want ro (caller mode=rwc must be replaced)", params.Get("mode")) + } + got := append([]string(nil), params["_pragma"]...) + sort.Strings(got) + want := append([]string(nil), readOnlyPragmas...) + sort.Strings(want) + if !reflect.DeepEqual(got, want) { + t.Errorf("file: form _pragma = %v, want EXACTLY the read-only set %v", got, want) + } +} + +// TestBuildReadOnlyDSN_MalformedQueryRejectedForURIForm pins that a malformed query +// is still rejected for the URI forms (the validation path is unchanged); a bare +// path has no query to validate so it never hits this error. +func TestBuildReadOnlyDSN_MalformedQueryRejectedForURIForm(t *testing.T) { + t.Parallel() + if _, err := buildReadOnlyDSN("file:/db/opencode.db?%zz"); err == nil { + t.Error("malformed query on a file: DSN should be rejected") + } + // The same bytes as a BARE path: '?%zz' is part of the filename, so it is + // escaped, NOT validated as a query — no error. 
+ if _, err := buildReadOnlyDSN("/db/opencode.db?%zz"); err != nil { + t.Errorf("bare path containing '?%%zz' must be opaque (no query validation), got %v", err) + } +} diff --git a/internal/adapters/opencode/conn_test.go b/internal/adapters/opencode/conn_test.go new file mode 100644 index 0000000..58afb6f --- /dev/null +++ b/internal/adapters/opencode/conn_test.go @@ -0,0 +1,342 @@ +package opencode + +import ( + "context" + "database/sql" + "net/url" + "path/filepath" + "strings" + "testing" +) + +// seedSyntheticDB creates a throwaway opencode-shaped SQLite database in dir +// via a SEPARATE read-write connection and returns its file path. The schema +// mirrors the four tracked tables (verified shape, never real data); content +// is synthetic. Callers then reopen the path through openReadOnly to assert +// the read-only contract. The read-write handle is closed before return so +// the WAL is flushed and the read-only opener sees a complete file. +func seedSyntheticDB(t *testing.T, dir string) string { + t.Helper() + path := filepath.Join(dir, "opencode.db") + // A plain read-write file: URI. This is the ONLY writable handle in the + // test; production never opens opencode.db this way. 
+ rwDSN := "file:" + escapeURIPath(filepath.ToSlash(path)) + "?_pragma=busy_timeout(5000)" + rw, err := sql.Open(driverName, rwDSN) + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { + if cerr := rw.Close(); cerr != nil { + t.Errorf("close rw: %v", cerr) + } + }() + stmts := []string{ + `CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER)`, + `CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `INSERT INTO session (id, project_id, slug, directory, title, version, time_created, time_updated) + VALUES ('ses_aaa','prj_aaa','calm-otter','/work/example','synthetic title','1.0.0',1700000000000,1700000000000)`, + `INSERT INTO message (id, session_id, time_created, time_updated, data) + VALUES ('msg_aaa','ses_aaa',1700000000000,1700000000000,'{"role":"assistant"}')`, + `INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) + VALUES ('prt_aaa','msg_aaa','ses_aaa',1700000000000,1700000000000,'{"type":"text","text":"synthetic"}')`, + 
`INSERT INTO session_message (id, session_id, type, time_created, time_updated, data) + VALUES ('evt_aaa','ses_aaa','model-switched',1700000000000,1700000000000,'{}')`, + } + for _, s := range stmts { + if _, err := rw.Exec(s); err != nil { + t.Fatalf("seed exec failed: %v\nstmt: %s", err, s) + } + } + return path +} + +// TestOpenReadOnly_RejectsAllWrites is the SOW-0005 AC#2 read-only +// enforcement test. It seeds a synthetic opencode-shaped DB with a SEPARATE +// read-write connection, reopens it through the adapter's openReadOnly helper, +// and exercises the six write paths AC#2 names. The contract being asserted is +// the one that actually protects the operator's live opencode database: NO +// write to that database ever succeeds. +// +// SQLite enforces this with two distinct mechanisms, and the six probes split +// into two groups accordingly (verified against modernc.org/sqlite v1.50.1): +// +// - Direct write statements — INSERT, UPDATE, DELETE, VACUUM — are rejected +// outright with "attempt to write a readonly database (8)". mode=ro (OS +// O_RDONLY) blocks them at the file layer and query_only(true) blocks them +// at the SQL layer; either alone suffices, so the guard is defence in +// depth. +// - Two probes do NOT return an error, and asserting that they did would pin +// a false mechanism: +// - PRAGMA wal_checkpoint is a NO-OP on a read-only connection: it returns +// a status row of (busy, -1, -1) meaning "no frames checkpointed". It +// physically cannot mutate the WAL under mode=ro. The safety property is +// "no checkpoint occurred", asserted by the -1/-1 sentinel, not an error. +// - ATTACH ... 'rwc' attaches a SEPARATE side database (never opencode.db) +// and succeeds, but query_only(true) then blocks any write INTO that +// attached schema. The safety property is "no durable mutation path +// opens", asserted by the attached-write erroring — and opencode.db is +// untouched regardless. 
+// +// Asserting the precise property of each probe is a STRONGER read-safety proof +// than a blanket "all error", and it matches how SQLite actually behaves. +func TestOpenReadOnly_RejectsAllWrites(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path := seedSyntheticDB(t, dir) + + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + + // Group 1: direct write statements that MUST be rejected with an error. + directWrites := []struct { + name string + sql string + }{ + {"INSERT", `INSERT INTO session (id, project_id, slug, directory, title, version, time_created, time_updated) + VALUES ('ses_bbb','prj_bbb','x','/x','t','1',1,1)`}, + {"UPDATE", `UPDATE session SET title = 'mutated' WHERE id = 'ses_aaa'`}, + {"DELETE", `DELETE FROM session WHERE id = 'ses_aaa'`}, + {"VACUUM", `VACUUM`}, + } + for _, p := range directWrites { + p := p + t.Run(p.name, func(t *testing.T) { + t.Parallel() + if _, err := db.ExecContext(context.Background(), p.sql); err == nil { + t.Fatalf("%s succeeded against a read-only connection; want error", p.name) + } + }) + } + + // Group 2a: PRAGMA wal_checkpoint must be a verified no-op. The result row + // is (busy, log, checkpointed); a read-only connection cannot checkpoint, + // so log and checkpointed come back -1 ("nothing done"). Asserting the + // no-op is the real safety property — it proves the WAL was not mutated. + t.Run("PRAGMA wal_checkpoint is a no-op", func(t *testing.T) { + t.Parallel() + var busy, logFrames, checkpointed int + row := db.QueryRowContext(context.Background(), `PRAGMA wal_checkpoint(TRUNCATE)`) + if err := row.Scan(&busy, &logFrames, &checkpointed); err != nil { + // A hard error here is also acceptable read-safety — the checkpoint + // definitely did not run. Only a SUCCESSFUL checkpoint of frames + // would be a contract breach. 
+ t.Logf("wal_checkpoint errored (also safe): %v", err) + return + } + if logFrames != -1 || checkpointed != -1 { + t.Fatalf("wal_checkpoint mutated the WAL: busy=%d log=%d checkpointed=%d (want log=-1 checkpointed=-1)", busy, logFrames, checkpointed) + } + }) + + // Group 2b: ATTACH 'rwc' attaches a side database, but any write into it + // must be blocked by query_only(true). The ATTACH targets a DIFFERENT file + // and never touches opencode.db; the durable-mutation path is the write, + // and that write must error. + t.Run("ATTACH rwc blocks the attached write", func(t *testing.T) { + t.Parallel() + side := "file:" + escapeURIPath(filepath.ToSlash(filepath.Join(dir, "attached.db"))) + "?mode=rwc" + // The ATTACH itself may succeed (it opens a separate file). What must + // NOT succeed is a write into the attached schema. + _, _ = db.ExecContext(context.Background(), `ATTACH DATABASE '`+side+`' AS side`) + if _, err := db.ExecContext(context.Background(), `CREATE TABLE side.t (x INTEGER)`); err == nil { + t.Fatal("write into ATTACHed rwc database succeeded; query_only must block it") + } + _, _ = db.ExecContext(context.Background(), `DETACH DATABASE side`) + }) + + // Belt-and-braces: the row the seed wrote must be untouched after all the + // rejected/no-op writes, proving none partially applied to opencode.db. + var title string + if err := db.QueryRow(`SELECT title FROM session WHERE id = 'ses_aaa'`).Scan(&title); err != nil { + t.Fatalf("read-back select: %v", err) + } + if title != "synthetic title" { + t.Fatalf("row mutated despite read-only: title=%q", title) + } +} + +// TestBuildReadOnlyDSN_ContainsContract asserts the constructed DSN carries +// exactly the read-safety contract: the file: scheme (so the driver keeps the +// query string), mode=ro (OS-level guard), and both required PRAGMAs. This +// pins the constant so a future edit cannot quietly drop a guard. 
+func TestBuildReadOnlyDSN_ContainsContract(t *testing.T) { + t.Parallel() + dsn, err := buildReadOnlyDSN("/var/lib/opencode/opencode.db") + if err != nil { + t.Fatalf("buildReadOnlyDSN: %v", err) + } + if !strings.HasPrefix(dsn, "file:") { + t.Errorf("DSN missing file: scheme: %q", dsn) + } + prefix, query := splitQuery(dsn) + if !strings.HasSuffix(prefix, "/var/lib/opencode/opencode.db") { + t.Errorf("DSN path not preserved: %q", prefix) + } + params, err := url.ParseQuery(query) + if err != nil { + t.Fatalf("parse query %q: %v", query, err) + } + if got := params.Get("mode"); got != "ro" { + t.Errorf("mode = %q, want ro", got) + } + pragmas := params["_pragma"] + wantPragma := map[string]bool{"query_only(true)": false, "busy_timeout(5000)": false} + for _, p := range pragmas { + if _, ok := wantPragma[p]; ok { + wantPragma[p] = true + } + } + for p, seen := range wantPragma { + if !seen { + t.Errorf("DSN missing required pragma %q (got %v)", p, pragmas) + } + } +} + +// The DSN allowlist tests (TestBuildReadOnlyDSN_AllowlistDropsAllCallerPragmas + +// TestBuildReadOnlyDSN_MaliciousDSNNeutralised, SOW-0005 P1.2) live in +// conn_dsn_test.go (split to keep this file ≤400 lines). + +// TestBuildReadOnlyDSN_Errors covers the rejected inputs: an empty path and a +// query string that cannot be parsed. +func TestBuildReadOnlyDSN_Errors(t *testing.T) { + t.Parallel() + if _, err := buildReadOnlyDSN(""); err == nil { + t.Error("empty path: want error") + } + if _, err := buildReadOnlyDSN("file:/db.sqlite?%zz"); err == nil { + t.Error("malformed query: want error") + } +} + +// TestOpenReadOnly_MissingFileFails asserts a non-existent database path +// fails at the open call (mode=ro refuses to create it), not silently +// materialising an empty database. 
+func TestOpenReadOnly_MissingFileFails(t *testing.T) { + t.Parallel() + missing := filepath.Join(t.TempDir(), "does-not-exist.db") + if _, err := openReadOnly(context.Background(), missing); err == nil { + t.Fatalf("openReadOnly on missing file: want error (mode=ro must not create)") + } +} + +// TestOpenReadOnly_HonoursMaxOpenConns exercises the test-only pool override +// and confirms a connection can be acquired and queried under it. +func TestOpenReadOnly_HonoursMaxOpenConns(t *testing.T) { + t.Parallel() + path := seedSyntheticDB(t, t.TempDir()) + db, err := openReadOnly(context.Background(), path, withMaxOpenConns(1)) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + if got := db.Stats().MaxOpenConnections; got != 1 { + t.Errorf("MaxOpenConnections = %d, want 1", got) + } + var n int + if err := db.QueryRow(`SELECT count(*) FROM session`).Scan(&n); err != nil { + t.Fatalf("count: %v", err) + } + if n != 1 { + t.Errorf("session count = %d, want 1", n) + } +} + +// TestIsBareIdent covers the identifier recognition used to strip a single +// schema qualifier (e.g. "main."): valid bare names, an empty string, a leading digit, +// and an embedded non-identifier character. +func TestIsBareIdent(t *testing.T) { + t.Parallel() + cases := map[string]bool{ + "main": true, + "temp": true, + "_x9": true, + "": false, + "9bad": false, + "has-dash": false, + "has.dot": false, + } + for in, want := range cases { + if got := isBareIdent(in); got != want { + t.Errorf("isBareIdent(%q) = %v, want %v", in, got, want) + } + } +} + +// TestIsMemoryDSN covers the in-memory DSN forms the helper recognises so a +// future test that wants a throwaway shared-cache DB is routed correctly. 
+func TestIsMemoryDSN(t *testing.T) { + t.Parallel() + cases := map[string]bool{ + ":memory:": true, + "file::memory:?cache=shared": true, + ":memory:?cache=shared": true, + "file:/tmp/x.db": false, + "/tmp/x.db": false, + } + for in, want := range cases { + if got := isMemoryDSN(in); got != want { + t.Errorf("isMemoryDSN(%q) = %v, want %v", in, got, want) + } + } +} + +// TestBuildReadOnlyDSN_MemoryAndFileURIPassthrough asserts a pre-built file: +// URI and an in-memory DSN are accepted and still gain the read-only query +// parameters, exercising the non-path branches of buildReadOnlyDSN. +func TestBuildReadOnlyDSN_MemoryAndFileURIPassthrough(t *testing.T) { + t.Parallel() + for _, in := range []string{"file:/already/a/uri.db", ":memory:"} { + dsn, err := buildReadOnlyDSN(in) + if err != nil { + t.Fatalf("buildReadOnlyDSN(%q): %v", in, err) + } + if !strings.Contains(dsn, "mode=ro") { + t.Errorf("buildReadOnlyDSN(%q) = %q, missing mode=ro", in, dsn) + } + } +} + +// TestPragmaName covers the identifier extraction across the forms the strip +// pass must recognise: bare, valued, schema-qualified, and whitespace. +func TestPragmaName(t *testing.T) { + t.Parallel() + cases := map[string]string{ + "query_only(true)": "query_only", + "busy_timeout=5000": "busy_timeout", + "main.query_only(false)": "query_only", + " foreign_keys (on) ": "foreign_keys", + "cache_size(-64000)": "cache_size", + "": "", + } + for in, want := range cases { + if got := pragmaName(in); got != want { + t.Errorf("pragmaName(%q) = %q, want %q", in, got, want) + } + } +} diff --git a/internal/adapters/opencode/cursor.go b/internal/adapters/opencode/cursor.go new file mode 100644 index 0000000..46a3da6 --- /dev/null +++ b/internal/adapters/opencode/cursor.go @@ -0,0 +1,271 @@ +package opencode + +import ( + "encoding/json" + "fmt" + "maps" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// cursorVersion is the on-disk version of the persisted cursor. 
Bumped to 2 by +// SOW-0005 round-2 P1-A: the watermark split (a single MaxID conflated the +// monotonic insert-detection id with the (time_updated, id) paging position, so +// an in-place UPDATE of an OLD row regressed MaxID and re-armed the expensive +// idle scan forever). ParseCursor does not error on a version mismatch: a v1, +// version-less, or unknown-version blob is treated as a fresh zero cursor, forcing a one-time full re-scan +// that is idempotent (the ingester upserts). Mirrors codex/cursor.go. +const cursorVersion = 2 + +// trackedTables is the fixed set of opencode tables the cursor watermarks, +// in the order they are read. session/message/part are the canonical tree; +// session_message is the agent/model-switch sidecar. Any other opencode table +// is out of scope (adapter-opencode.md §"Tables we read"). +var trackedTables = []string{"session", "message", "part", "session_message"} + +// Cursor is the resume token persisted in sources.cursor for the opencode +// adapter. Unlike the four file adapters (byte-offset per file), opencode is +// a single SQLite database, so the cursor is a per-table watermark over +// time-prefixed Sonyflake IDs and the auto-bumped time_updated column. +// See adapter-opencode.md §"Cursor". +// +// There are NO byte offsets here: opencode never appends lines to a file the +// adapter tails. The watermarks are the only progress state. +type Cursor struct { + // Version is the on-disk format version. Defaults to cursorVersion on + // construction; ParseCursor resets any other version to a fresh zero cursor. + Version int `json:"version"` + // SchemaHash is a digest of the applied-migration name list + // (__drizzle_migrations.name). It detects that opencode applied a new + // migration between runs; on mismatch the adapter re-probes the schema + // (later chunks) but does NOT reset the watermarks — column drift is + // handled per-column. Observability/invalidation only; NOT part of + // After() ordering. Empty until the first probe records it. 
+ SchemaHash string `json:"schema_hash,omitempty"` + // Tables maps each tracked table name to its watermark. A table absent + // from the map has had no rows observed yet (cold start). + Tables map[string]TableWatermark `json:"tables,omitempty"` +} + +// TableWatermark is the per-table progress state. SOW-0005 round-2 P1-A split +// the former single MaxID into TWO INDEPENDENT concepts because the old field +// conflated two roles that can move in opposite directions: +// +// - MaxIDSeen — the monotonic highest id EVER observed for the table. It +// drives the CHEAP, PK-indexed insert detection in detectChange +// (MAX(id) > MaxIDSeen is a b-tree seek). It NEVER regresses: advancing it +// with `if id > MaxIDSeen` keeps it the true high-water id even when an OLD +// row is updated in place. The pre-P1-A code set MaxID to the LAST-PAGED +// row's id (which sorts by (time_updated, id)); when an old low-id row was +// re-stamped with a fresh time_updated it sorted LAST → MaxID regressed to +// that small id → MAX(id) stayed permanently greater → every idle poll re-ran +// the unindexed (time_updated, id) full scan on the live multi-GB DB, +// defeating AC#6's gate. MaxIDSeen cannot regress, so that scan never re-arms. +// - MaxTimeUpdatedMs + MaxTimeUpdatedID — the (time_updated, id) PAGING +// POSITION: the last-paged row's pair, the source of truth for the delta +// query's WHERE/ORDER (time_updated > :tu OR (time_updated = :tu AND +// id > :tuid) ORDER BY time_updated, id). MaxTimeUpdatedID is the in-place +// tie-break id (it MAY be small if an old row was just re-paged) and is the +// only id the delta query binds. It is also the resume-ordering id for +// cmpWatermark/After (the order rows are actually consumed in). +type TableWatermark struct { + // MaxIDSeen is the monotonic highest opencode row id EVER observed for this + // table (e.g. "prt_..."). 
It only ever increases (advanced via + // `if id > MaxIDSeen`), so the cheap MAX(id) > MaxIDSeen insert check never + // re-arms the expensive probe after an in-place update of an old row. + // Empty means no row observed yet. + MaxIDSeen string `json:"max_id_seen,omitempty"` + // MaxTimeUpdatedMs is the highest time_updated reached by paging, in + // milliseconds since the UNIX epoch (opencode's native unit — the mapper + // converts to canonical microseconds, never the cursor). It is the delta + // query's :tu bind. 0 means no row paged yet. + MaxTimeUpdatedMs int64 `json:"max_time_updated,omitempty"` + // MaxTimeUpdatedID is the id of the last row paged at MaxTimeUpdatedMs — the + // (time_updated, id) tie-break the delta query binds as :tuid. It MAY be a + // small id (an old row re-stamped with a new time_updated), which is exactly + // why it is kept SEPARATE from MaxIDSeen. Empty means no row paged yet. + MaxTimeUpdatedID string `json:"max_time_updated_id,omitempty"` +} + +// newCursor returns an empty Cursor ready for use. +func newCursor() Cursor { + return Cursor{ + Version: cursorVersion, + Tables: map[string]TableWatermark{}, + } +} + +// String implements canonical.Cursor. Returns stable JSON (encoding/json +// sorts map keys) suitable for persistence. Mirrors codex/cursor.go. +func (c Cursor) String() string { + out := c + if out.Tables == nil { + out.Tables = map[string]TableWatermark{} + } + if out.Version == 0 { + out.Version = cursorVersion + } + b, err := json.Marshal(out) + if err != nil { + // json.Marshal on a struct of known-encodable types cannot fail; if it + // ever does, surface a sentinel so callers don't silently persist an + // empty value. + return fmt.Sprintf(`{"error":%q}`, err.Error()) + } + return string(b) +} + +// After implements canonical.Cursor. Reports whether c is strictly after +// other on at least one table's watermark, with NO table regressing. 
A +// table's watermark advances when its (MaxTimeUpdatedMs, MaxTimeUpdatedID) pair +// increases lexicographically — time first, then id as the tiebreaker, +// matching the delta-query ordering. A lower pair on any shared table, or a +// table the other has progress on that c lacks, defeats After. MaxIDSeen and +// SchemaHash are NOT part of the ordering: MaxIDSeen is the cheap-detect +// high-water and can advance independently of the paging position, and +// SchemaHash is observability-only. The discipline is codex/cursor.go's After +// verbatim, lifted from byte offsets to the paging-position pair. +func (c Cursor) After(other canonical.Cursor) bool { + o, ok := other.(Cursor) + if !ok { + // A different cursor concrete type is comparable only by emptiness: + // c is After it iff c has any table progress. + return c.hasProgress() + } + advancedOne := false + for name, mine := range c.Tables { + theirs, present := o.Tables[name] + if !present { + if mine.nonZero() { + advancedOne = true + } + continue + } + switch cmpWatermark(mine, theirs) { + case -1: + return false + case 1: + advancedOne = true + } + } + // Missing any table the other has progress on is a regression. + for name, theirs := range o.Tables { + if _, present := c.Tables[name]; present { + continue + } + if theirs.nonZero() { + return false + } + } + return advancedOne +} + +// cmpWatermark orders two watermarks by the PAGING POSITION (MaxTimeUpdatedMs, +// MaxTimeUpdatedID): time first, then id as the tiebreaker. Returns -1 if a +// sorts before b, 0 if they are equal, and +1 if a sorts after b. This is the same composite key the delta query sorts +// by, so After's notion of "advanced" matches the order in which rows are +// actually consumed. MaxIDSeen is intentionally NOT compared here — it is the +// cheap-detect high-water, not the resume position. 
+func cmpWatermark(a, b TableWatermark) int { + switch { + case a.MaxTimeUpdatedMs < b.MaxTimeUpdatedMs: + return -1 + case a.MaxTimeUpdatedMs > b.MaxTimeUpdatedMs: + return 1 + } + switch { + case a.MaxTimeUpdatedID < b.MaxTimeUpdatedID: + return -1 + case a.MaxTimeUpdatedID > b.MaxTimeUpdatedID: + return 1 + } + return 0 +} + +// nonZero reports whether the watermark carries any progress (either the +// cheap-detect high-water or the paging position has moved). +func (w TableWatermark) nonZero() bool { + return w.MaxIDSeen != "" || w.MaxTimeUpdatedID != "" || w.MaxTimeUpdatedMs != 0 +} + +// advanceMaxIDSeen returns a copy of the watermark whose MaxIDSeen is raised to +// id when id is lexicographically greater (the monotonic insert-detection +// high-water never regresses). A smaller/equal id leaves it unchanged. The +// paging-position fields are untouched. +func (w TableWatermark) advanceMaxIDSeen(id string) TableWatermark { + if id > w.MaxIDSeen { + w.MaxIDSeen = id + } + return w +} + +// hasProgress reports whether any tracked table has a non-zero watermark. +func (c Cursor) hasProgress() bool { + for _, w := range c.Tables { + if w.nonZero() { + return true + } + } + return false +} + +// ParseCursor decodes a stored cursor JSON blob into a Cursor. Empty input +// yields an empty Cursor (first run). Only a blob that fails to decode as JSON +// is rejected with an error. A v1, version-less, or unknown-version (or +// any pre-P1-A old-shape) cursor is treated as a FRESH ZERO cursor: the +// watermark split (P1-A) changed the on-disk shape, and a one-time full re-scan +// from zero is idempotent (the ingester upserts) and is the safe migration — +// far cheaper to reason about than partially re-deriving the new MaxIDSeen from +// the old MaxID. Mirrors codex/cursor.go's version discipline. 
+func ParseCursor(stored string) (Cursor, error) { + if stored == "" { + return newCursor(), nil + } + var c Cursor + if err := json.Unmarshal([]byte(stored), &c); err != nil { + return Cursor{}, fmt.Errorf("opencode: decode cursor: %w", err) + } + if c.Version == 0 { + // A version-less blob predates explicit versioning; treat it as the + // retired v1 shape → fresh re-scan (P1-A migration). + return newCursor(), nil + } + if c.Version != cursorVersion { + // A v1 cursor (or any other retired version) re-scans from zero rather + // than erroring: column/shape drift in OUR own cursor is recoverable by a + // one-time idempotent backfill, unlike a corrupt blob. + return newCursor(), nil + } + if c.Tables == nil { + c.Tables = map[string]TableWatermark{} + } + return c, nil +} + +// withTable returns a new Cursor with the given table's watermark replaced. +// The receiver is not mutated. Mirrors codex's withFile. +func (c Cursor) withTable(table string, w TableWatermark) Cursor { + out := c.clone() + out.Tables[table] = w + return out +} + +// withSchemaHash returns a new Cursor carrying the given schema hash. The +// receiver is not mutated. Used by later chunks when __drizzle_migrations is +// probed; lives here so the cursor stays the single owner of its fields. +func (c Cursor) withSchemaHash(hash string) Cursor { + out := c.clone() + out.SchemaHash = hash + return out +} + +// clone deep-copies the cursor's map so callers can mutate the result without +// affecting the receiver. Mirrors codex/cursor.go. 
+func (c Cursor) clone() Cursor { + out := Cursor{ + Version: cursorVersion, + SchemaHash: c.SchemaHash, + Tables: make(map[string]TableWatermark, len(c.Tables)+1), + } + maps.Copy(out.Tables, c.Tables) + return out +} diff --git a/internal/adapters/opencode/cursor_regression_test.go b/internal/adapters/opencode/cursor_regression_test.go new file mode 100644 index 0000000..89faa6c --- /dev/null +++ b/internal/adapters/opencode/cursor_regression_test.go @@ -0,0 +1,136 @@ +package opencode + +import ( + "database/sql" + "testing" + "time" +) + +// This file is the load-bearing P1-A regression proof (SOW-0005 round-2): the +// cursor's MaxIDSeen (monotonic insert-detect high-water) is kept SEPARATE from +// the (time_updated, id) paging position, so an in-place UPDATE of an OLD row — +// whose time_updated jumps above the newest row but whose id stays small — does +// NOT regress the cheap-detect watermark and does NOT permanently re-arm the +// expensive unindexed MAX(time_updated) full scan on every idle poll (which is +// exactly what the pre-P1-A single-MaxID code did, defeating AC#6's gate). + +// pageSession pages the session table forward from `from` via scanTableDelta and +// returns the advanced watermark. It mirrors scanMessagesFrom (store_query_test) +// for the session table, which this regression seeds. +func pageSession(t *testing.T, db *sql.DB, schema schemaSet, from TableWatermark) TableWatermark { + t.Helper() + s := schema["session"] + idx := newColumnIndex(s) + scan, _ := scanSessionRow(idx, len(s.Present), nil) + delta, err := scanTableDelta(ctxBG(), db, s, from, func(rows *sql.Rows) (rowKey, error) { + return scan(rows) + }, &warnSink{}, nil) + if err != nil { + t.Fatalf("scanTableDelta(session): %v", err) + } + return delta.watermark +} + +// TestP1A_OldRowUpdateDoesNotReArmIdleScan is the decisive regression. 
It seeds +// monotonic sessions, pages them to establish the cursor, UPDATEs the OLDEST +// (lowest-id) row in place so its time_updated sorts LAST, re-pages that update, +// and then asserts an IDLE detectChange (gate closed): +// +// - returns changed=false (the in-place update of an already-seen id is not a +// new insert), and +// - executes ZERO MAX(time_updated) queries (the cheap MAX(id) path is +// satisfied by MaxIDSeen, so the expensive unindexed scan never runs). +// +// Pre-P1-A this failed both ways: the paging position's id regressed to the +// small updated id, so MAX(id) stayed permanently greater than the watermark, +// flipping changed=true and forcing the expensive scan on every idle poll. +// +// NOT t.Parallel(): the counting driver shares one global queryLog (see +// store_testhelpers_test.go), so counting tests run serially. +func TestP1A_OldRowUpdateDoesNotReArmIdleScan(t *testing.T) { + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // Five monotonic sessions: id and time_updated both increase together. + for i := 1; i <= 5; i++ { + insertSession(t, rw, fmtID("ses", i), "", int64(i*10), int64(i*10), 0) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + + // Page from zero to establish the cursor. The monotonic fixture's max id and + // max time_updated both belong to ses_5. + wm := pageSession(t, db, schema, TableWatermark{}) + if wm.MaxIDSeen != fmtID("ses", 5) { + t.Fatalf("after initial paging MaxIDSeen = %q, want %q", wm.MaxIDSeen, fmtID("ses", 5)) + } + if wm.MaxTimeUpdatedID != fmtID("ses", 5) || wm.MaxTimeUpdatedMs != 50 { + t.Fatalf("after initial paging paging-position = {%d,%q}, want {50,%q}", wm.MaxTimeUpdatedMs, wm.MaxTimeUpdatedID, fmtID("ses", 5)) + } + + // In-place UPDATE of the OLDEST row (ses_000...001): its time_updated jumps + // ABOVE the newest row (50 → 999) while its id stays the lowest. 
This is the + // Drizzle .$onUpdate pattern that re-stamps time_updated on an existing row. + rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + if _, err := rw2.Exec(`UPDATE session SET time_updated = 999 WHERE id = ?`, fmtID("ses", 1)); err != nil { + _ = rw2.Close() + t.Fatalf("in-place update of old row: %v", err) + } + if err := rw2.Close(); err != nil { + t.Fatalf("close rw2: %v", err) + } + + // Re-page the in-place update from the established watermark. The paging + // position advances to the updated old row (999, ses_1), but MaxIDSeen MUST + // NOT regress — it stays at ses_5 (the true high-water id). + wm = pageSession(t, db, schema, wm) + if wm.MaxTimeUpdatedMs != 999 || wm.MaxTimeUpdatedID != fmtID("ses", 1) { + t.Fatalf("after re-paging the update paging-position = {%d,%q}, want {999,%q}", wm.MaxTimeUpdatedMs, wm.MaxTimeUpdatedID, fmtID("ses", 1)) + } + if wm.MaxIDSeen != fmtID("ses", 5) { + t.Fatalf("MaxIDSeen REGRESSED to %q after an old-row update; want it pinned at %q (P1-A)", wm.MaxIDSeen, fmtID("ses", 5)) + } + + // Build the post-update cursor (only the session table is seeded here; the + // other tracked tables stay at zero watermark, which is correct — they are + // empty, so their MAX(id) is "" and the cheap check is also satisfied). + cur := newCursor().withTable("session", wm) + + // Now drive an IDLE detectChange through the COUNTING driver and assert no + // MAX(time_updated) is issued and no change is reported. 
+ cdb, log := openCounting(t, path) + cschema, err := introspectAll(ctxBG(), cdb) + if err != nil { + t.Fatalf("introspectAll(counting): %v", err) + } + log.reset() + + now := time.Unix(1_700_000_000, 0) + st := newPollState(false) + st.markProbe(now) // lastProbe = now → 60 s net not yet due + st.lastWALEvent = now.Add(-time.Second) // a WAL event BEFORE the probe → gate stays CLOSED + + changed, probed, derr := detectChange(ctxBG(), cdb, cschema, cur, &st, now.Add(activePollInterval)) + if derr != nil { + t.Fatalf("detectChange: %v", derr) + } + if changed { + t.Error("idle detectChange reported changed=true after an in-place update of an OLD, already-seen row (P1-A regression: MaxIDSeen must absorb it)") + } + if probed { + t.Error("idle detectChange ran the gated probe with the gate CLOSED") + } + if n := log.countContaining("MAX(time_updated)"); n != 0 { + t.Errorf("idle detectChange executed MAX(time_updated) %d times after an old-row update; want 0 (P1-A: the expensive scan must not re-arm)", n) + } + // Sanity: the cheap MAX(id) DID run (one per tracked table), proving the cheap + // path is what closed the cycle. + if n := log.countContaining("MAX(id)"); n < len(trackedTables) { + t.Errorf("cheap MAX(id) ran %d times, want >= %d (one per tracked table)", n, len(trackedTables)) + } +} diff --git a/internal/adapters/opencode/cursor_test.go b/internal/adapters/opencode/cursor_test.go new file mode 100644 index 0000000..33113e4 --- /dev/null +++ b/internal/adapters/opencode/cursor_test.go @@ -0,0 +1,287 @@ +package opencode + +import ( + "encoding/json" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +func TestParseCursor_Empty(t *testing.T) { + t.Parallel() + c, err := ParseCursor("") + if err != nil { + t.Fatalf("ParseCursor(\"\"): %v", err) + } + if len(c.Tables) != 0 || c.Version != cursorVersion { + t.Fatalf("empty cursor wrong: %+v", c) + } +} + +func TestParseCursor_RoundTrip(t *testing.T) { + t.Parallel() + orig := newCursor(). 
+ withSchemaHash("deadbeef"). + withTable("part", TableWatermark{MaxIDSeen: "prt_zzz", MaxTimeUpdatedMs: 1779793313250, MaxTimeUpdatedID: "prt_zzz"}). + withTable("message", TableWatermark{MaxIDSeen: "msg_yyy", MaxTimeUpdatedMs: 1779793313106, MaxTimeUpdatedID: "msg_yyy"}) + encoded := orig.String() + got, err := ParseCursor(encoded) + if err != nil { + t.Fatalf("ParseCursor: %v", err) + } + if got.SchemaHash != "deadbeef" { + t.Errorf("schema_hash lost: %q", got.SchemaHash) + } + w := got.Tables["part"] + if w.MaxIDSeen != "prt_zzz" || w.MaxTimeUpdatedMs != 1779793313250 || w.MaxTimeUpdatedID != "prt_zzz" { + t.Errorf("part watermark lost: %+v", w) + } + if got.Tables["message"].MaxTimeUpdatedMs != 1779793313106 { + t.Errorf("message watermark lost: %+v", got.Tables["message"]) + } +} + +// TestParseCursor_V1ReScans verifies a v1 (pre-P1-A) cursor — which used the +// conflated single max_id field — is treated as a FRESH ZERO cursor so the +// adapter does a one-time idempotent full re-scan onto the new split-watermark +// shape (SOW-0005 round-2 P1-A), rather than mis-loading the old max_id into the +// new fields or erroring. 
+func TestParseCursor_V1ReScans(t *testing.T) { + t.Parallel() + v1 := `{"version":1,"schema_hash":"abc","tables":{"part":{"max_id":"prt_a","max_time_updated":10}}}` + got, err := ParseCursor(v1) + if err != nil { + t.Fatalf("ParseCursor(v1): %v", err) + } + if got.Version != cursorVersion { + t.Errorf("v1 cursor not upgraded to version %d: %+v", cursorVersion, got) + } + if got.hasProgress() { + t.Errorf("v1 cursor must re-scan from zero (no progress), got %+v", got.Tables) + } + if got.SchemaHash != "" { + t.Errorf("v1 cursor must drop its old schema_hash on re-scan, got %q", got.SchemaHash) + } +} + +// TestParseCursor_VersionlessReScans verifies a cursor JSON that omits the +// version field (a pre-versioned or hand-written blob) is treated as the retired +// shape → fresh re-scan, not silently coerced to the current version with stale +// watermarks. +func TestParseCursor_VersionlessReScans(t *testing.T) { + t.Parallel() + got, err := ParseCursor(`{"tables":{"part":{"max_id":"prt_a","max_time_updated":7}}}`) + if err != nil { + t.Fatalf("ParseCursor(no version): %v", err) + } + if got.Version != cursorVersion { + t.Errorf("version = %d, want defaulted to %d", got.Version, cursorVersion) + } + if got.hasProgress() { + t.Errorf("versionless cursor must re-scan from zero, got %+v", got.Tables) + } +} + +// TestParseCursor_UnknownVersionReScans asserts a future/unknown version is +// treated as a fresh re-scan rather than erroring: our OWN cursor shape drifting +// is recoverable by an idempotent backfill (unlike a corrupt blob, which errors). 
+func TestParseCursor_UnknownVersionReScans(t *testing.T) { + t.Parallel() + got, err := ParseCursor(`{"version":99}`) + if err != nil { + t.Fatalf("ParseCursor(version 99): unexpected error %v", err) + } + if got.Version != cursorVersion || got.hasProgress() { + t.Errorf("unknown version must yield a fresh zero cursor, got %+v", got) + } +} + +func TestParseCursor_Malformed(t *testing.T) { + t.Parallel() + if _, err := ParseCursor(`{not json`); err == nil { + t.Fatal("ParseCursor(malformed): want error") + } +} + +func TestCursor_StringStableSortedKeys(t *testing.T) { + t.Parallel() + c := newCursor(). + withTable("session", TableWatermark{MaxIDSeen: "ses_b", MaxTimeUpdatedMs: 2, MaxTimeUpdatedID: "ses_b"}). + withTable("part", TableWatermark{MaxIDSeen: "prt_a", MaxTimeUpdatedMs: 1, MaxTimeUpdatedID: "prt_a"}) + first := c.String() + second := c.String() + if first != second { + t.Fatalf("cursor String() not stable:\n first: %s\n second: %s", first, second) + } + var probe struct { + Tables map[string]json.RawMessage `json:"tables"` + } + if err := json.Unmarshal([]byte(c.String()), &probe); err != nil { + t.Fatalf("unmarshal: %v", err) + } + if len(probe.Tables) != 2 { + t.Fatalf("want 2 tables, got %d", len(probe.Tables)) + } +} + +// TestCursor_After exercises the single-table advance/hold/regress cases and +// the empty-vs-progress comparisons, mirroring codex's After discipline lifted +// to the paging-position pair (time_updated, MaxTimeUpdatedID). 
+func TestCursor_After(t *testing.T) { + t.Parallel() + base := newCursor().withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_m", MaxTimeUpdatedMs: 50}) + aheadByTime := newCursor().withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_m", MaxTimeUpdatedMs: 100}) + aheadByID := newCursor().withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_z", MaxTimeUpdatedMs: 50}) + behind := newCursor().withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_a", MaxTimeUpdatedMs: 10}) + + if !aheadByTime.After(base) { + t.Error("aheadByTime.After(base) = false, want true") + } + if !aheadByID.After(base) { + t.Error("aheadByID.After(base) = false, want true (id is the tiebreaker at equal time)") + } + if base.After(aheadByTime) { + t.Error("base.After(aheadByTime) = true, want false") + } + if behind.After(base) { + t.Error("behind.After(base) = true, want false") + } + if newCursor().After(base) { + t.Error("empty.After(base) = true, want false") + } + if !base.After(newCursor()) { + t.Error("base.After(empty) = false, want true") + } +} + +// TestCursor_AfterIgnoresMaxIDSeen asserts MaxIDSeen does NOT participate in +// After ordering (SOW-0005 round-2 P1-A): two cursors with the SAME paging +// position but different MaxIDSeen are equal under After (neither is after the +// other). MaxIDSeen is the cheap-detect high-water, not the resume position. 
+func TestCursor_AfterIgnoresMaxIDSeen(t *testing.T) { + t.Parallel() + lowSeen := newCursor().withTable("part", TableWatermark{MaxIDSeen: "prt_a", MaxTimeUpdatedMs: 50, MaxTimeUpdatedID: "prt_m"}) + highSeen := newCursor().withTable("part", TableWatermark{MaxIDSeen: "prt_z", MaxTimeUpdatedMs: 50, MaxTimeUpdatedID: "prt_m"}) + if lowSeen.After(highSeen) || highSeen.After(lowSeen) { + t.Errorf("MaxIDSeen must not affect After: low.After(high)=%v high.After(low)=%v", + lowSeen.After(highSeen), highSeen.After(lowSeen)) + } +} + +// TestCursor_AfterMultiTable asserts After requires at least one table to +// advance with NO table regressing — the no-regression invariant that +// prevents a partial cursor from being treated as forward progress. +func TestCursor_AfterMultiTable(t *testing.T) { + t.Parallel() + base := newCursor(). + withTable("message", TableWatermark{MaxTimeUpdatedID: "msg_m", MaxTimeUpdatedMs: 50}). + withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_m", MaxTimeUpdatedMs: 50}) + // One table advances, the other holds: After. + oneAdvances := newCursor(). + withTable("message", TableWatermark{MaxTimeUpdatedID: "msg_z", MaxTimeUpdatedMs: 60}). + withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_m", MaxTimeUpdatedMs: 50}) + if !oneAdvances.After(base) { + t.Error("oneAdvances.After(base) = false, want true") + } + // One advances but the other regresses: NOT After. + mixed := newCursor(). + withTable("message", TableWatermark{MaxTimeUpdatedID: "msg_z", MaxTimeUpdatedMs: 60}). + withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_a", MaxTimeUpdatedMs: 40}) + if mixed.After(base) { + t.Error("mixed (one regresses).After(base) = true, want false") + } + // Missing a table the other has progress on: regression, NOT After. 
+ missing := newCursor().withTable("message", TableWatermark{MaxTimeUpdatedID: "msg_z", MaxTimeUpdatedMs: 100}) + if missing.After(base) { + t.Error("missing-table.After(base) = true, want false") + } +} + +func TestCursor_AfterAlienType(t *testing.T) { + t.Parallel() + type alien struct{ canonical.Cursor } + c := newCursor().withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_a", MaxTimeUpdatedMs: 1}) + if !c.After(alien{}) { + t.Error("cursor with progress should be After an alien cursor type") + } + if newCursor().After(alien{}) { + t.Error("empty cursor should not be After an alien cursor type") + } +} + +// TestCursor_SchemaHashNotPartOfAfter asserts schema_hash is +// observability/invalidation-only and does NOT participate in After ordering. +func TestCursor_SchemaHashNotPartOfAfter(t *testing.T) { + t.Parallel() + a := newCursor().withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_a", MaxTimeUpdatedMs: 50}) + b := newCursor(). + withTable("part", TableWatermark{MaxTimeUpdatedID: "prt_a", MaxTimeUpdatedMs: 50}). + withSchemaHash("changed") + if a.After(b) || b.After(a) { + t.Errorf("schema_hash must not affect After: a.After(b)=%v b.After(a)=%v", a.After(b), b.After(a)) + } +} + +// TestCursor_CloneIndependent asserts clone produces an independent map so +// mutating a derived cursor never affects the receiver. +func TestCursor_CloneIndependent(t *testing.T) { + t.Parallel() + orig := newCursor().withTable("part", TableWatermark{MaxIDSeen: "prt_a", MaxTimeUpdatedMs: 10, MaxTimeUpdatedID: "prt_a"}) + derived := orig. + withTable("part", TableWatermark{MaxIDSeen: "prt_z", MaxTimeUpdatedMs: 20, MaxTimeUpdatedID: "prt_z"}). 
+ withSchemaHash("x") + if orig.Tables["part"].MaxIDSeen != "prt_a" { + t.Errorf("receiver mutated: orig MaxIDSeen = %q, want prt_a", orig.Tables["part"].MaxIDSeen) + } + if orig.SchemaHash != "" { + t.Errorf("receiver mutated: orig SchemaHash = %q, want empty", orig.SchemaHash) + } + if derived.Tables["part"].MaxIDSeen != "prt_z" { + t.Errorf("derived MaxIDSeen = %q, want prt_z", derived.Tables["part"].MaxIDSeen) + } +} + +// TestAdvanceMaxIDSeen asserts the monotonic high-water never regresses: a +// greater id raises it, a smaller/equal id leaves it unchanged, and the paging +// position is untouched (SOW-0005 round-2 P1-A). +func TestAdvanceMaxIDSeen(t *testing.T) { + t.Parallel() + w := TableWatermark{MaxIDSeen: "prt_m", MaxTimeUpdatedMs: 99, MaxTimeUpdatedID: "prt_m"} + // A greater id raises the high-water. + if got := w.advanceMaxIDSeen("prt_z"); got.MaxIDSeen != "prt_z" { + t.Errorf("advanceMaxIDSeen(prt_z).MaxIDSeen = %q, want prt_z", got.MaxIDSeen) + } + // A SMALLER id (an old row re-stamped) must NOT pull the high-water back. + got := w.advanceMaxIDSeen("prt_a") + if got.MaxIDSeen != "prt_m" { + t.Errorf("advanceMaxIDSeen(prt_a).MaxIDSeen = %q, want prt_m (no regression)", got.MaxIDSeen) + } + // The paging position is untouched by the high-water advance. + if got.MaxTimeUpdatedMs != 99 || got.MaxTimeUpdatedID != "prt_m" { + t.Errorf("advanceMaxIDSeen mutated the paging position: %+v", got) + } +} + +// TestCmpWatermark covers the composite ordering directly: time dominates, the +// PAGING-POSITION id (MaxTimeUpdatedID) breaks ties; MaxIDSeen is not compared. 
+func TestCmpWatermark(t *testing.T) { + t.Parallel() + cases := []struct { + name string + a, b TableWatermark + want int + }{ + {"equal", TableWatermark{MaxTimeUpdatedID: "x", MaxTimeUpdatedMs: 5}, TableWatermark{MaxTimeUpdatedID: "x", MaxTimeUpdatedMs: 5}, 0}, + {"time less", TableWatermark{MaxTimeUpdatedMs: 4}, TableWatermark{MaxTimeUpdatedMs: 5}, -1}, + {"time greater", TableWatermark{MaxTimeUpdatedMs: 6}, TableWatermark{MaxTimeUpdatedMs: 5}, 1}, + {"id tiebreak less", TableWatermark{MaxTimeUpdatedID: "a", MaxTimeUpdatedMs: 5}, TableWatermark{MaxTimeUpdatedID: "b", MaxTimeUpdatedMs: 5}, -1}, + {"id tiebreak greater", TableWatermark{MaxTimeUpdatedID: "c", MaxTimeUpdatedMs: 5}, TableWatermark{MaxTimeUpdatedID: "b", MaxTimeUpdatedMs: 5}, 1}, + {"time beats id", TableWatermark{MaxTimeUpdatedID: "a", MaxTimeUpdatedMs: 6}, TableWatermark{MaxTimeUpdatedID: "z", MaxTimeUpdatedMs: 5}, 1}, + {"MaxIDSeen ignored", TableWatermark{MaxIDSeen: "z", MaxTimeUpdatedID: "a", MaxTimeUpdatedMs: 5}, TableWatermark{MaxIDSeen: "a", MaxTimeUpdatedID: "a", MaxTimeUpdatedMs: 5}, 0}, + } + for _, tc := range cases { + if got := cmpWatermark(tc.a, tc.b); got != tc.want { + t.Errorf("%s: cmpWatermark = %d, want %d", tc.name, got, tc.want) + } + } +} diff --git a/internal/adapters/opencode/data_fuzz_test.go b/internal/adapters/opencode/data_fuzz_test.go new file mode 100644 index 0000000..acebe2a --- /dev/null +++ b/internal/adapters/opencode/data_fuzz_test.go @@ -0,0 +1,141 @@ +package opencode + +import "testing" + +// This file fuzzes the opencode `data`-JSON decoders — decodeMessageData (the +// message.data user|assistant union) and decodePartData (the part.data 12-variant +// $.type union) in types.go. They are opencode's analogue of codex's JSONL line +// parser (codex/parser_fuzz_test.go FuzzParseLine): the untrusted-bytes boundary +// where a malformed/truncated/adversarial blob from a live, concurrently-written +// SQLite database first meets the adapter. 
The contract under fuzz is the same: +// the decoder NEVER panics on any input — it returns either a decoded struct or a +// wrapped error. Every typed helper reachable from a SUCCESSFULLY decoded value +// (role/kind/subAgentSessionID/modelID/reasoningKind) must also not panic. All +// seeds are synthetic with placeholder identities (no real session content). + +// messageDataSeeds covers both message roles plus the malformed/edge inputs. +func messageDataSeeds() [][]byte { + return [][]byte{ + // Valid user message (nested model object, agent, tools). + []byte(`{"id":"msg_u1","sessionID":"ses_x","role":"user","time":{"created":1000},"agent":"general","model":{"providerID":"anthropic","modelID":"claude-x","variant":"default"},"tools":{"read":true}}`), + // Valid assistant message with tokens/cost/finish. + []byte(`{"id":"msg_a1","sessionID":"ses_x","role":"assistant","parentID":"msg_u1","agent":"general","modelID":"claude-x","providerID":"anthropic","mode":"general","cost":0.02,"tokens":{"total":1000,"input":500,"output":80,"reasoning":16,"cache":{"read":100,"write":0}},"time":{"created":2000,"completed":9000},"finish":"stop"}`), + // Assistant with an error (tagged AssistantError union). + []byte(`{"role":"assistant","modelID":"claude-x","providerID":"anthropic","error":{"name":"ProviderError","data":{"message":"boom"}},"time":{"created":2000}}`), + // Assistant with completed absent (still-running turn). + []byte(`{"role":"assistant","modelID":"m","providerID":"p","time":{"created":2000}}`), + // Unknown role (forward-compat → roleUnknown, not an error). + []byte(`{"role":"system","time":{"created":1}}`), + // Role absent entirely. + []byte(`{"time":{"created":1}}`), + // Numbers as strings / wrong types in a sibling field (dropped/ignored). + []byte(`{"role":"assistant","cost":"not-a-number"}`), + // Empty object / null / blank / whitespace / garbage.
+ []byte(`{}`), + []byte(`null`), + []byte(``), + []byte(` `), + []byte(`{not json`), + // Deeply-nested blob (defends against unbounded recursion / stack issues). + []byte(`{"role":"assistant","tokens":{"cache":{"read":1}},"error":{"data":{"a":{"b":{"c":{"d":{"e":[1,2,3]}}}}}}}`), + } +} + +// partDataSeeds covers each of the 12 known $.type variants + an unknown type + +// the malformed/edge inputs. +func partDataSeeds() [][]byte { + return [][]byte{ + []byte(`{"type":"step-start","snapshot":"snap_1"}`), + []byte(`{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":100,"output":20,"reasoning":0,"cache":{"read":0,"write":0}}}`), + []byte(`{"type":"text","text":"hello","synthetic":false,"time":{"start":1,"end":2}}`), + []byte(`{"type":"reasoning","text":"thinking","time":{"start":1,"end":2},"metadata":{"summary":true}}`), + // tool with state.metadata.sessionId (the sub-agent task edge, AC#4). + []byte(`{"type":"tool","callID":"call_1","tool":"task","state":{"status":"completed","input":{"prompt":"go"},"output":"done","metadata":{"sessionId":"ses_child"},"time":{"start":1,"end":2}}}`), + // tool error state. + []byte(`{"type":"tool","callID":"c","tool":"bash","state":{"status":"error","input":{"cmd":"x"},"error":"exit 1","time":{"start":1,"end":2}}}`), + // tool running (no end → still running). + []byte(`{"type":"tool","callID":"c","tool":"read","state":{"status":"running","input":{},"time":{"start":1}}}`), + // tool with a null/absent state. 
+ []byte(`{"type":"tool","callID":"c","tool":"grep","state":null}`), + []byte(`{"type":"patch","hash":"abc123","files":["/work/a.go","/work/b.go"]}`), + []byte(`{"type":"snapshot","snapshot":"hash_1"}`), + []byte(`{"type":"compaction","auto":true,"overflow":false}`), + []byte(`{"type":"retry","attempt":2,"error":{"name":"APIError"},"time":{"created":1}}`), + []byte(`{"type":"file","mime":"image/png","filename":"x.png","url":"https://example.invalid/x.png"}`), + []byte(`{"type":"subtask","prompt":"do x","description":"d","agent":"reviewer"}`), + []byte(`{"type":"agent","name":"reviewer","source":"task"}`), + // task tool with state (and thus metadata.sessionId) absent (subAgentSessionID must be safe). + []byte(`{"type":"tool","tool":"task","callID":"c"}`), + // metadata that is not an object (subAgentSessionID guards malformed). + []byte(`{"type":"tool","tool":"task","state":{"status":"completed","metadata":"oops","time":{"start":1,"end":2}}}`), + // Unknown $.type (forward-compat → partUnknown, not an error). + []byte(`{"type":"brand_new_part","x":1}`), + // Crafted/corrupt step-finish token values at the int64 extremes (SOW-0005 + // round-2 P2-F): the cumulative→delta subtraction must clamp, not wrap/panic. + []byte(`{"type":"step-finish","reason":"stop","tokens":{"input":9223372036854775807,"output":-9223372036854775808,"cache":{"read":9223372036854775807,"write":-1}}}`), + []byte(`{"type":"step-finish","reason":"stop","tokens":{"input":-9223372036854775808,"output":9223372036854775807,"total":9223372036854775807}}`), + // $.type absent. + []byte(`{"text":"orphan"}`), + // Empty / null / blank / garbage. + []byte(`{}`), + []byte(`null`), + []byte(``), + []byte(` `), + []byte(`{"type":`), + // Deeply-nested state blob. + []byte(`{"type":"tool","tool":"x","state":{"status":"completed","input":{"a":{"b":{"c":{"d":{"e":1}}}}},"metadata":{"sessionId":"s"},"time":{"start":1,"end":2}}}`), + } +} + +// FuzzDecodeMessageData feeds arbitrary bytes into decodeMessageData.
Contract: +// never panics; returns a struct or a wrapped error. On a decoded value, the +// reachable typed helpers (role, modelID via the nested model object) must also +// not panic. +func FuzzDecodeMessageData(f *testing.F) { + for _, s := range messageDataSeeds() { + f.Add(s) + } + // Cross-seed: a part body fed to the message decoder must also be safe. + for _, s := range partDataSeeds() { + f.Add(s) + } + f.Fuzz(func(_ *testing.T, data []byte) { + d, err := decodeMessageData(data) + if err == nil { + _ = d.role() + if d.Model != nil { + _ = d.Model.modelID() + } + } + }) +} + +// FuzzDecodePartData feeds arbitrary bytes into decodePartData. Contract: never +// panics; returns a struct or a wrapped error. On a decoded value, kind() and the +// tool-state sub-agent extraction (subAgentSessionID, which parses raw metadata) +// must also not panic. +func FuzzDecodePartData(f *testing.F) { + for _, s := range partDataSeeds() { + f.Add(s) + } + // Cross-seed: a message body fed to the part decoder must also be safe. + for _, s := range messageDataSeeds() { + f.Add(s) + } + f.Fuzz(func(_ *testing.T, data []byte) { + d, err := decodePartData(data) + if err == nil { + _ = d.kind() + _ = reasoningKind(data) + if d.State != nil { + _ = d.State.subAgentSessionID() + } + // The decoded tokens flow into the cumulative→delta math, whose checked + // subtraction must clamp (not wrap/panic) on crafted extremes (P2-F). A + // single-element sequence exercises the first-snapshot path; a two-element + // sequence (the same tokens twice) exercises the prev-subtraction path. 
+ _ = computeStepDeltas([]tokenCounts{d.Tokens}, nil) + _ = computeStepDeltas([]tokenCounts{d.Tokens, d.Tokens}, nil) + } + }) +} diff --git a/internal/adapters/opencode/doc.go b/internal/adapters/opencode/doc.go new file mode 100644 index 0000000..aa221f1 --- /dev/null +++ b/internal/adapters/opencode/doc.go @@ -0,0 +1,58 @@ +// Package opencode implements the canonical.Adapter for the opencode CLI +// session store. It is the only adapter that does not read filesystem +// snapshots: opencode keeps every session, message, and part in a single +// live, multi-GB SQLite database that opencode itself writes to +// concurrently. +// +// # Source layout +// +// opencode stores everything in one SQLite database with WAL companions: +// +// ~/.local/share/opencode/opencode.db main database +// ~/.local/share/opencode/opencode.db-wal write-ahead log +// ~/.local/share/opencode/opencode.db-shm shared-memory index +// +// The schema is Drizzle-managed and evolves across ~30 historic migrations. +// The adapter reads four tables — session, message, part, session_message — +// and treats the schema as evolving: it introspects each table with +// PRAGMA table_info at startup and names only columns that actually exist +// (never SELECT *), so an older-schema database missing a newer column is +// tolerated rather than failing the query. +// +// # Read-only delta-query model +// +// Because the source is a live database with a concurrent writer, this +// adapter does NOT stream JSONL like the four file adapters +// (aiagent_v2/v3, claude_code, codex). Instead it opens the database +// strictly read-only (see conn.go) and runs SQL delta-queries gated by a +// watermark cursor (see cursor.go). The read-safety contract is the +// dominant design constraint: any accidental write would corrupt the +// operator's primary coding tool, so the connection helper layers +// OS-level mode=ro with the query_only(true) PRAGMA and never issues a +// write-path statement. 
+// +// # Watermark cursor +// +// opencode IDs are time-prefixed Sonyflake strings (ses_/msg_/prt_/evt_) that +// sort lexicographically in time order and are PK-indexed. The cursor records, per +// table, TWO independent watermarks (SOW-0005 round-2 P1-A split): MaxIDSeen, +// the MONOTONIC highest id ever observed (the cheap, PK-b-tree insert check +// MAX(id) > MaxIDSeen — it never regresses, so an in-place update of an old row +// cannot re-arm the expensive scan); and the (MaxTimeUpdatedMs, MaxTimeUpdatedID) +// PAGING POSITION, the last-paged row's (time_updated, id) pair that the delta +// query binds and that catches in-place mutations (the unindexed MAX(time_updated) +// probe, gated to run only after WAL activity). There are no byte offsets — the +// file-adapter cursor model does not apply here. +// +// This file (doc.go) and the package siblings deliver only the read-only +// foundation: the connection helper, the watermark cursor, the typed row +// and discriminated-data structs, and the schema-introspection layer. The +// row→event mapper, the delta-query bodies, the poll-loop tailer, the +// payload-URI builder, and the adapter/registry wiring arrive in later +// chunks. +// +// See .agents/sow/specs/adapter-opencode.md for the full format reference +// (SQLite schema, read strategy, watch strategy, cursor design, canonical +// mapping, edge cases) and .agents/sow/specs/adapter-contract.md for the +// universal adapter rules. +package opencode diff --git a/internal/adapters/opencode/golden_invariants_test.go b/internal/adapters/opencode/golden_invariants_test.go new file mode 100644 index 0000000..b040286 --- /dev/null +++ b/internal/adapters/opencode/golden_invariants_test.go @@ -0,0 +1,468 @@ +package opencode + +import ( + "log/slog" + "path/filepath" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file makes the committed goldens NOT self-justifying.
TestGolden pins the +// EXACT emitted bytes, but a future `-update-golden` could silently launder a +// regression past review. These tests re-scan each fixture and assert the +// load-bearing INVARIANT each scenario exists to prove (AC#3/#4/#5/#7), keyed on +// canonical-event fields rather than golden text — so a regression that changed +// the math/linkage/degrade would fail HERE even after a golden refresh. + +// scenarioEvents builds the named scenario's fixture DB and returns its scanned, +// SourceProgress-filtered event stream (the same content TestGolden pins). +func scenarioEvents(t *testing.T, scenario string) []canonical.Event { + t.Helper() + dbPath := buildFixtureDB(t, fixtureSQLPath(scenario)) + abs, err := filepath.Abs(dbPath) + if err != nil { + t.Fatalf("abs: %v", err) + } + out := scanScenario(t, abs) + filtered := make([]canonical.Event, 0, len(out)) + for _, ev := range out { + if _, ok := ev.(canonical.SourceProgressEvent); ok { + continue + } + filtered = append(filtered, ev) + } + return filtered +} + +// sessionStarts returns every SessionStartedEvent in the stream. +func sessionStarts(events []canonical.Event) []canonical.SessionStartedEvent { + var out []canonical.SessionStartedEvent + for _, ev := range events { + if s, ok := ev.(canonical.SessionStartedEvent); ok { + out = append(out, s) + } + } + return out +} + +// sessionStartByID finds the SessionStartedEvent for a native id (fatal if absent). 
+func sessionStartByID(t *testing.T, events []canonical.Event, id string) canonical.SessionStartedEvent { + t.Helper() + for _, s := range sessionStarts(events) { + if s.NativeID == id { + return s + } + } + t.Fatalf("no SessionStartedEvent for %q", id) + return canonical.SessionStartedEvent{} +} + +// TestGoldenInvariant_AHappy pins the baseline tree shape: exactly one root +// session, one turn, one LLM op, one reasoning op, one tool op, and NO +// SessionFinalized (a running session with neither archive nor error stays +// running — adapter-opencode.md "Canonical Model Gaps" #5). +func TestGoldenInvariant_AHappy(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "a_happy") + + if got := countKind(ev, canonical.EvSessionStarted); got != 1 { + t.Fatalf("SessionStarted = %d, want 1", got) + } + ss := firstStarted(t, ev) + if ss.Kind != canonical.KindRoot { + t.Errorf("Kind = %q, want root", ss.Kind) + } + if ss.ParentNativeID != "" { + t.Errorf("root ParentNativeID = %q, want empty", ss.ParentNativeID) + } + if ss.Model != "claude-x" { + t.Errorf("Model = %q, want claude-x", ss.Model) + } + if got := countKind(ev, canonical.EvTurnStarted); got != 1 { + t.Errorf("TurnStarted = %d, want 1", got) + } + if got := len(llmOps(ev)); got != 1 { + t.Errorf("llm ops = %d, want 1", got) + } + if got := countKindOpKind(ev, canonical.OpReasoning); got != 1 { + t.Errorf("reasoning ops = %d, want 1", got) + } + if got := len(toolOps(ev)); got != 1 { + t.Errorf("tool ops = %d, want 1", got) + } + if got := countKind(ev, canonical.EvSessionFinalized); got != 0 { + t.Errorf("SessionFinalized = %d, want 0 (running session)", got) + } +} + +// TestGoldenInvariant_BSubagentTask is the AC#4 dual-edge proof. The single +// tool='task' part must yield BOTH a session Op (kind=session, +// ChildSessionNativeID=ses_child01) AND a tool Op (kind=tool, name=task), and the +// child session row must map to Kind=sub_agent with ParentNativeID=ses_parent01. 
+func TestGoldenInvariant_BSubagentTask(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "b_subagent_task") + + // Edge 1: session Op naming the child as topology parent. + var sessionOp *canonical.OpStartedEvent + for _, s := range opStarts(ev) { + if s.Kind == canonical.OpSession { + cp := s + sessionOp = &cp + break + } + } + if sessionOp == nil { + t.Fatal("no session Op emitted for tool='task' (AC#4 edge 1 missing)") + } + if sessionOp.ChildSessionNativeID != "ses_child01" { + t.Errorf("session Op ChildSessionNativeID = %q, want ses_child01", sessionOp.ChildSessionNativeID) + } + + // Edge 2: tool Op for the task tool, in the SAME turn as the session Op. + var taskTool *canonical.OpStartedEvent + for _, s := range toolOps(ev) { + if s.Name == "task" { + cp := s + taskTool = &cp + break + } + } + if taskTool == nil { + t.Fatal("no tool Op name=task emitted (AC#4 edge 2 missing)") + } + if taskTool.TurnSeq != sessionOp.TurnSeq { + t.Errorf("task tool TurnSeq %d != session Op TurnSeq %d (must be same turn)", taskTool.TurnSeq, sessionOp.TurnSeq) + } + if taskTool.Seq == sessionOp.Seq { + t.Errorf("task tool Seq %d must differ from session Op Seq %d (two distinct ops)", taskTool.Seq, sessionOp.Seq) + } + + // Edge 3 (parent_id): the child session maps to sub_agent linked to the parent. + child := sessionStartByID(t, ev, "ses_child01") + if child.Kind != canonical.KindSubAgent { + t.Errorf("child Kind = %q, want sub_agent", child.Kind) + } + if child.ParentNativeID != "ses_parent01" { + t.Errorf("child ParentNativeID = %q, want ses_parent01", child.ParentNativeID) + } + if child.RootNativeID != "ses_parent01" { + t.Errorf("child RootNativeID = %q, want ses_parent01", child.RootNativeID) + } +} + +// TestGoldenInvariant_CMultiProvider is the AC#7 multi-provider proof: the two +// turns' LLM ops carry distinct ProviderAlias verbatim (anthropic, openai) plus a +// canonical Provider, so two catalog providers seed downstream.
It ALSO pins the +// two-level token model: per-op tokens reset per message (turn2 op = 300/80) while +// per-turn tokens are the session-level delta (turn2 turn = 200/50). +func TestGoldenInvariant_CMultiProvider(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "c_multi_provider") + + aliases := map[string]bool{} + for _, s := range llmOps(ev) { + if s.ProviderAlias == "" { + t.Errorf("LLM op (turn %d) has empty ProviderAlias", s.TurnSeq) + } + if s.Provider != s.ProviderAlias { + t.Errorf("turn %d Provider %q != alias %q (both in knownProviderAliases here)", s.TurnSeq, s.Provider, s.ProviderAlias) + } + aliases[s.ProviderAlias] = true + } + for _, want := range []string{"anthropic", "openai"} { + if !aliases[want] { + t.Errorf("provider alias %q absent (want both anthropic+openai)", want) + } + } + + // Per-op (per-message-reset) vs per-turn (session-delta) tokens for turn 2. + op2 := opFinalForTurnSeq(t, ev, 2) + if op2.TokensIn != 300 || op2.TokensOut != 80 { + t.Errorf("turn2 LLM op tokens = %d/%d, want 300/80 (per-message cumulative)", op2.TokensIn, op2.TokensOut) + } + tf2 := turnFinalForSeq(t, ev, 2) + if tf2.TokensIn != 200 || tf2.TokensOut != 50 { + t.Errorf("turn2 turn tokens = %d/%d, want 200/50 (session-level delta)", tf2.TokensIn, tf2.TokensOut) + } +} + +// TestGoldenInvariant_DSchemaDrift is the AC#5 graceful-degrade proof. Against the +// pre-session_usage schema (session lacks agent/model/cost/tokens_*) the adapter +// must NOT reject (it emits a full tree), session-level Model/AgentName must be +// empty (their columns are gone), Extras must lack providerID/variant (both come +// from session.model), yet the op/turn token+provider values must survive because +// they come from message.data (untouched by the column drift). 
+func TestGoldenInvariant_DSchemaDrift(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "d_schema_drift") + + if got := countKind(ev, canonical.EvSessionStarted); got != 1 { + t.Fatalf("SessionStarted = %d, want 1 (old schema must still ingest, not be rejected)", got) + } + ss := firstStarted(t, ev) + if ss.Model != "" { + t.Errorf("Model = %q, want empty (session.model column absent on old schema)", ss.Model) + } + if ss.AgentName != "" { + t.Errorf("AgentName = %q, want empty (session.agent column absent)", ss.AgentName) + } + if _, ok := ss.Extras["providerID"]; ok { + t.Errorf("Extras has providerID on old schema; want absent (derives from session.model)") + } + if _, ok := ss.Extras["variant"]; ok { + t.Errorf("Extras has variant on old schema; want absent") + } + // Always-present columns still populate Extras. + for _, k := range []string{"directory", "project_id", "slug", "title", "version"} { + if _, ok := ss.Extras[k]; !ok { + t.Errorf("Extras missing always-present key %q", k) + } + } + // Op-level provider/model + token values come from message.data, unaffected. + ops := llmOps(ev) + if len(ops) != 1 { + t.Fatalf("llm ops = %d, want 1", len(ops)) + } + if ops[0].Provider != "anthropic" || ops[0].Model != "claude-x" { + t.Errorf("LLM op provider/model = %q/%q, want anthropic/claude-x (from message.data)", ops[0].Provider, ops[0].Model) + } + tf := turnFinals(ev) + if len(tf) != 1 || tf[0].TokensIn != 60 || tf[0].TokensOut != 15 { + t.Errorf("turn tokens = %v, want one turn 60/15 (from message.data)", tf) + } +} + +// TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF is the AC#5 INF-logging +// proof. Against the pre-`20260510033149` schema (session lacks the optional +// columns the dynamic SELECT omits — including time_compacting, SOW-0005 round-2 +// P2-E), a Scan through the public adapter must emit +// exactly one INFO record per missing optional column, each carrying the matching +// `table`+`column` attributes. 
The set of logged (table, column) pairs must equal +// the set of columns introspection reports Missing — no more, no less. (Scan and +// Tail each emit the set once; this test exercises Scan, so one record per column.) +func TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF(t *testing.T) { + t.Parallel() + dbPath := buildFixtureDB(t, fixtureSQLPath("d_schema_drift")) + abs, err := filepath.Abs(dbPath) + if err != nil { + t.Fatalf("abs: %v", err) + } + + // Ground truth: the columns introspection reports Missing on this schema. + set, err := introspectAll(ctxBG(), openRO(t, dbPath)) + if err != nil { + t.Fatalf("introspectAll must accept the old schema (graceful degrade), got: %v", err) + } + want := map[[2]string]bool{} + for _, table := range trackedTables { + for _, col := range set[table].Missing { + want[[2]string{table, col}] = true + } + } + if len(want) == 0 { + t.Fatal("d_schema_drift fixture has no missing optional columns; the INF assertion is vacuous") + } + + // Scan through the public adapter with a record-capturing logger. + rec := &captureHandler{} + a, err := New(abs, canonical.AdapterOptions{Logger: slog.New(rec)}) + if err != nil { + t.Fatalf("New: %v", err) + } + out := make(chan canonical.Event, 8192) + if err := a.Scan(ctxBG(), nil, out); err != nil { + t.Fatalf("Scan: %v", err) + } + + // Collect the (table, column) pairs from the missing-column INFO records. + const wantMsg = "opencode: optional column absent on this database schema; omitted from projection (old opencode version)" + got := map[[2]string]int{} + for _, r := range rec.records() { + if r.level != slog.LevelInfo || r.message != wantMsg { + continue + } + got[[2]string{r.attrs["table"], r.attrs["column"]}]++ + } + + // Every Missing column logged exactly once; nothing extra logged. 
+ for key := range want { + if got[key] != 1 { + t.Errorf("missing-column INF for %v logged %d times, want exactly 1", key, got[key]) + } + } + for key, n := range got { + if !want[key] { + t.Errorf("unexpected missing-column INF for %v (logged %d times)", key, n) + } + } +} + +// TestGoldenInvariant_GNestedSubagent is the SOW-0005 P2.4 proof: in a 3-level +// session tree (root → child → grandchild) every session's RootNativeID is the +// TRUE tree root (ses_groot), NOT its direct parent. The grandchild is the +// load-bearing case: the pre-P2.4 code set its RootNativeID to its direct parent +// (ses_gchild); the chain-walk resolver must set it to ses_groot. ParentNativeID +// still points at the DIRECT parent (the immediate link), so the two differ for +// the grandchild — exactly what pins the fix. +func TestGoldenInvariant_GNestedSubagent(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "g_nested_subagent") + + root := sessionStartByID(t, ev, "ses_groot") + if root.Kind != canonical.KindRoot { + t.Errorf("root Kind = %q, want root", root.Kind) + } + if root.RootNativeID != "ses_groot" { + t.Errorf("root RootNativeID = %q, want ses_groot (its own id)", root.RootNativeID) + } + + child := sessionStartByID(t, ev, "ses_gchild") + if child.ParentNativeID != "ses_groot" || child.RootNativeID != "ses_groot" { + t.Errorf("child parent/root = %q/%q, want ses_groot/ses_groot", child.ParentNativeID, child.RootNativeID) + } + + grand := sessionStartByID(t, ev, "ses_ggrand") + if grand.Kind != canonical.KindSubAgent { + t.Errorf("grandchild Kind = %q, want sub_agent", grand.Kind) + } + // The DIRECT parent is the child; the TRUE ROOT is the topmost ancestor. 
+ if grand.ParentNativeID != "ses_gchild" { + t.Errorf("grandchild ParentNativeID = %q, want ses_gchild (direct parent)", grand.ParentNativeID) + } + if grand.RootNativeID != "ses_groot" { + t.Errorf("grandchild RootNativeID = %q, want ses_groot (tree root, NOT the direct parent ses_gchild)", grand.RootNativeID) + } +} + +// countKindOpKind counts OpStartedEvents of a given OpKind. +func countKindOpKind(events []canonical.Event, kind canonical.OpKind) int { + n := 0 + for _, s := range opStarts(events) { + if s.Kind == kind { + n++ + } + } + return n +} + +// opFinalForTurnSeq returns the (single) LLM op_finalized in the given turn. +// c_multi_provider has one LLM op per turn, so this is unambiguous. +func opFinalForTurnSeq(t *testing.T, events []canonical.Event, turnSeq int) canonical.OpFinalizedEvent { + t.Helper() + for _, f := range opFinals(events) { + if f.TurnSeq == turnSeq { + return f + } + } + t.Fatalf("no op_finalized in turn %d", turnSeq) + return canonical.OpFinalizedEvent{} +} + +// turnFinalForSeq returns the TurnFinalizedEvent for a turn seq (fatal if absent). +func turnFinalForSeq(t *testing.T, events []canonical.Event, seq int) canonical.TurnFinalizedEvent { + t.Helper() + for _, f := range turnFinals(events) { + if f.Seq == seq { + return f + } + } + t.Fatalf("no turn_finalized for seq %d", seq) + return canonical.TurnFinalizedEvent{} +} + +// TestGoldenInvariant_IFailedAssistant pins SOW-0005 round-5 P3-1: a session whose +// LAST assistant message carries data.error finalizes as SessionFinalized +// Status=failed with BOTH ErrorClass (data.error.name) AND ErrorMessage +// (data.error.data.message). The failed turn carries ErrorClass too +// (TurnFinalizedEvent has no ErrorMessage field, so the message rides only on the +// session terminal). Keyed on canonical-event fields so a regression that dropped +// ErrorMessage fails HERE even after a -update-golden refresh. 
+func TestGoldenInvariant_IFailedAssistant(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "i_failed_assistant") + + if got := countKind(ev, canonical.EvSessionFinalized); got != 1 { + t.Fatalf("SessionFinalized = %d, want 1 (failed assistant message)", got) + } + fin := sessionFinal(ev) + if fin == nil { + t.Fatal("no SessionFinalizedEvent emitted for a failed assistant message") + } + if fin.Status != canonical.StatusFailed { + t.Errorf("Status = %q, want failed", fin.Status) + } + if fin.ErrorClass != "MessageAbortedError" { + t.Errorf("ErrorClass = %q, want MessageAbortedError (data.error.name)", fin.ErrorClass) + } + if fin.ErrorMessage != "request was aborted by the user" { + t.Errorf("ErrorMessage = %q, want the data.error.data.message string (P3-1)", fin.ErrorMessage) + } + // The failed turn carries the same ErrorClass (the canonical TurnFinalizedEvent + // has no ErrorMessage field — the detail enriches the session terminal only). + tf := turnFinalForSeq(t, ev, 1) + if tf.Status != "failed" { + t.Errorf("turn1 Status = %q, want failed", tf.Status) + } + if tf.ErrorClass != "MessageAbortedError" { + t.Errorf("turn1 ErrorClass = %q, want MessageAbortedError", tf.ErrorClass) + } +} + +// TestGoldenInvariant_JFileAttachment pins SOW-0005 round-4 P2-3 + round-6 P3-3 +// end-to-end: a file part flows through the full load→map pipeline as an INF +// LogEntry carrying {filename,url,mime} in extras, and the stream emits NO +// PayloadRefEvent at all (a file attachment has no canonical PayloadKind — the +// removed "user_attachment" kind was a contract violation). Keyed on canonical +// fields so a future -update-golden cannot launder a regression that re-introduced +// a non-canonical PayloadRef for a file part. 
+func TestGoldenInvariant_JFileAttachment(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "j_file_attachment") + + // NO PayloadRef anywhere — the file part must not emit one (and nothing else in + // this scenario does either: text/tool parts are absent). + if got := countKind(ev, canonical.EvPayloadRef); got != 0 { + t.Fatalf("PayloadRef count = %d, want 0 (a file part is a LogEntry, not a payload ref; round-6 P3-3)", got) + } + // Defence in depth: any PayloadRef that DID slip through must at least carry a + // canonical kind (this also guards the assertion above against a kind rename). + for _, e := range ev { + if p, ok := e.(canonical.PayloadRefEvent); ok && !canonicalPayloadKinds[p.PayloadKind] { + t.Fatalf("non-canonical PayloadRef kind=%q emitted for the file-attachment scenario", p.PayloadKind) + } + } + + // Exactly one INF LogEntry "file attachment" with the three extras, scoped to the + // turn and the open LLM op. + var found int + for _, e := range ev { + l, ok := e.(canonical.LogEntryEvent) + if !ok || l.Message != "file attachment" { + continue + } + found++ + if l.Severity != "INF" { + t.Errorf("file-attachment LogEntry severity = %q, want INF", l.Severity) + } + if l.Source != Format { + t.Errorf("file-attachment LogEntry source = %q, want %q", l.Source, Format) + } + if l.Extras["filename"] != "diagram.png" { + t.Errorf("extras.filename = %v, want diagram.png", l.Extras["filename"]) + } + if l.Extras["mime"] != "image/png" { + t.Errorf("extras.mime = %v, want image/png", l.Extras["mime"]) + } + if l.Extras["url"] != "https://cdn.example.invalid/diagram.png" { + t.Errorf("extras.url = %v, want the verbatim data.url", l.Extras["url"]) + } + if l.TurnSeq != 1 || l.OpSeq != 1 { + t.Errorf("file-attachment LogEntry scope = (turn %d, op %d), want (1, 1)", l.TurnSeq, l.OpSeq) + } + } + if found != 1 { + t.Fatalf("file-attachment INF LogEntry count = %d, want 1", found) + } +} diff --git a/internal/adapters/opencode/golden_loghandler_test.go 
b/internal/adapters/opencode/golden_loghandler_test.go new file mode 100644 index 0000000..e1bb0fa --- /dev/null +++ b/internal/adapters/opencode/golden_loghandler_test.go @@ -0,0 +1,88 @@ +package opencode + +import ( + "context" + "log/slog" + "sync" +) + +// This file holds the record-capturing slog.Handler the AC#5 missing-column INF +// assertion (golden_invariants_test.go:TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF) +// uses to capture structured log records rather than parsing text logs. + +// capturedRecord is one slog record captured by captureHandler: its level, +// message, and string-valued attributes (the only kind the adapter logs here), +// flattened across any WithAttrs presets and the per-call attrs. +type capturedRecord struct { + level slog.Level + message string + attrs map[string]string +} + +// captureHandler is a minimal structured slog.Handler that records every Handle +// call for assertion (level + message + string attrs). New() wraps the logger +// with .With("adapter", …, "db", …) before scanLoop/tailLoop log the per- +// (table,column) attrs, so WithAttrs returns a DERIVED handler that the adapter +// actually logs through. All derived handlers share one *captureStore (mutex + +// records slice), so the original handler the test holds sees every record +// regardless of which derived handler appended it (append-reallocation safe). +// Concurrency-safe so the Scan goroutine and the test reader never race. +type captureHandler struct { + store *captureStore + presets map[string]string +} + +// captureStore is the shared sink every derived captureHandler appends to. 
+type captureStore struct { + mu sync.Mutex + recs []capturedRecord +} + +func (h *captureHandler) ensure() *captureStore { + if h.store == nil { + h.store = &captureStore{} + } + return h.store +} + +func (h *captureHandler) Enabled(context.Context, slog.Level) bool { return true } + +func (h *captureHandler) Handle(_ context.Context, r slog.Record) error { + st := h.ensure() + attrs := map[string]string{} + for k, v := range h.presets { + attrs[k] = v + } + r.Attrs(func(a slog.Attr) bool { + attrs[a.Key] = a.Value.String() + return true + }) + st.mu.Lock() + st.recs = append(st.recs, capturedRecord{level: r.Level, message: r.Message, attrs: attrs}) + st.mu.Unlock() + return nil +} + +func (h *captureHandler) WithAttrs(as []slog.Attr) slog.Handler { + merged := map[string]string{} + for k, v := range h.presets { + merged[k] = v + } + for _, a := range as { + merged[a.Key] = a.Value.String() + } + return &captureHandler{store: h.ensure(), presets: merged} +} + +func (h *captureHandler) WithGroup(string) slog.Handler { return h } + +// records returns a snapshot of every captured record, across all derived +// handlers (they share h's store). +func (h *captureHandler) records() []capturedRecord { + st := h.ensure() + st.mu.Lock() + defer st.mu.Unlock() + out := make([]capturedRecord, len(st.recs)) + copy(out, st.recs) + return out +} diff --git a/internal/adapters/opencode/golden_resume_test.go b/internal/adapters/opencode/golden_resume_test.go new file mode 100644 index 0000000..3c5f30b --- /dev/null +++ b/internal/adapters/opencode/golden_resume_test.go @@ -0,0 +1,185 @@ +package opencode + +import ( + "path/filepath" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the AC#3 cumulative-token invariant (a direct, golden-text- +// independent assertion of the per-op delta sequence) and the SCENARIO-LEVEL +// resume golden (AC#6 durability over a committed fixture.sql). 
+// +// Relationship to chunk C's TestScanLoop_ResumeZeroDupesZeroGaps: that test +// builds the DB in two INSERT stages on a throwaway DB and proves +// union(part1,part2)==cold-baseline. This chunk-E test pins the COMPLEMENTARY +// resume properties the brief calls out for a STATIC fixture (where two-stage +// seeding is impossible): (1) a re-scan from the FINAL cursor emits ZERO new +// content events (idempotent re-scan), and (2) two cold scans from the zero +// cursor emit identical content — together "resume/re-scan never drops or +// duplicates a content event". It uses the same eventFingerprint/ +// contentFingerprints/multisetDiff helpers (defined in tailer_resume_test.go). + +// TestGoldenInvariant_ECumulativeTokens is the AC#3 regression, asserted on the +// scanned events (independent of the golden bytes). The four step-finish parts +// carry CUMULATIVE inputs 100/250/410/400 and outputs 20/50/90/80; the per-LLM-op +// deltas MUST be 100/150/160/0 and 20/30/40/0 (the 4th clamps because the +// cumulative decreased). A regression to raw-value emission would make the +// sequence 100/250/410/400 and fail here. +func TestGoldenInvariant_ECumulativeTokens(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "e_cumulative_tokens") + + // op_finalized events for the four LLM ops, in op-seq order. + fins := opFinals(ev) + bySeq := map[int]canonical.OpFinalizedEvent{} + for _, f := range fins { + bySeq[f.Seq] = f + } + wantIn := []int64{100, 150, 160, 0} + wantOut := []int64{20, 30, 40, 0} + for i := 0; i < 4; i++ { + seq := i + 1 + f, ok := bySeq[seq] + if !ok { + t.Fatalf("missing op_finalized seq %d", seq) + } + if f.TokensIn != wantIn[i] { + t.Errorf("op seq %d TokensIn = %d, want %d (cumulative->delta)", seq, f.TokensIn, wantIn[i]) + } + if f.TokensOut != wantOut[i] { + t.Errorf("op seq %d TokensOut = %d, want %d", seq, f.TokensOut, wantOut[i]) + } + } + + // The per-turn rollup is the message-level cumulative (first turn = own total). 
+ tf := turnFinals(ev) + if len(tf) != 1 { + t.Fatalf("turn_finalized count = %d, want 1", len(tf)) + } + if tf[0].TokensIn != 400 || tf[0].TokensOut != 80 { + t.Errorf("turn tokens = %d/%d, want 400/80 (message-level cumulative)", tf[0].TokensIn, tf[0].TokensOut) + } +} + +// TestGoldenInvariant_ResumeIdempotentReScan pins AC#6 over a static fixture: a +// cold scan to the final cursor, then a SECOND scan from that persisted+reparsed +// cursor, emits ZERO new content events — every row is at-or-below the watermark, +// so nothing re-emits and nothing is dropped. This is the "no duplicate on +// restart" half of the durability contract for the SQLite adapter (the ingester's +// idempotent upserts would absorb a re-emission, but the cursor must prevent one +// in the first place when there is no new data). +func TestGoldenInvariant_ResumeIdempotentReScan(t *testing.T) { + t.Parallel() + dbPath := buildFixtureDB(t, fixtureSQLPath("a_happy")) + + // Cold scan from zero → final cursor + baseline content. + out1 := make(chan canonical.Event, 8192) + var ce1 collectErrs + final, err := scanLoop(ctxBG(), dbPath, "opencode:x", newCursor(), out1, silentLogger(), ce1.onError) + if err != nil { + t.Fatalf("cold scanLoop: %v", err) + } + baseline := contentFingerprints(drainAll(out1)) + if len(baseline) == 0 { + t.Fatal("cold scan produced no content events") + } + + // Persist + reparse the cursor (the durable round-trip the ingester performs). + reparsed, err := ParseCursor(final.String()) + if err != nil { + t.Fatalf("ParseCursor: %v", err) + } + + // Re-scan from the final cursor → expect ZERO content events (no new rows). 
+ out2 := make(chan canonical.Event, 8192) + var ce2 collectErrs + if _, err := scanLoop(ctxBG(), dbPath, "opencode:x", reparsed, out2, silentLogger(), ce2.onError); err != nil { + t.Fatalf("re-scan from final cursor: %v", err) + } + resumed := contentFingerprints(drainAll(out2)) + if len(resumed) != 0 { + t.Errorf("re-scan from final cursor emitted %d content events, want 0 (no dup on restart):\n%v", len(resumed), resumed) + } +} + +// TestGoldenInvariant_ResumeFromZeroIsDeterministic pins the other half of AC#6: +// two independent cold scans from the zero cursor over the SAME fixture emit the +// IDENTICAL content multiset (no gap, no nondeterministic drop/duplicate). Run on +// the two-turn multi-provider fixture so the determinism covers multi-turn +// ordering and the cumulative-delta math, not just a single turn. +func TestGoldenInvariant_ResumeFromZeroIsDeterministic(t *testing.T) { + t.Parallel() + dbPath := buildFixtureDB(t, fixtureSQLPath("c_multi_provider")) + + scan := func() []string { + out := make(chan canonical.Event, 8192) + var ce collectErrs + if _, err := scanLoop(ctxBG(), dbPath, "opencode:x", newCursor(), out, silentLogger(), ce.onError); err != nil { + t.Fatalf("scanLoop: %v", err) + } + return contentFingerprints(drainAll(out)) + } + + first := scan() + second := scan() + if len(first) == 0 { + t.Fatal("scan produced no content events") + } + if diff := multisetDiff(first, second); diff != "" { + t.Fatalf("two cold scans differ (nondeterministic drop/dup):\n%s", diff) + } +} + +// TestGoldenInvariant_ResumeMultiSessionFinalCursor pins the multi-session case: a +// cold scan over the parent+child fixture, then a re-scan from the final cursor +// re-emits NEITHER session (both fully consumed). This guards against a watermark +// that fails to advance past one of several sessions touched in a single cycle — +// which would re-walk a completed session on every poll. 
+func TestGoldenInvariant_ResumeMultiSessionFinalCursor(t *testing.T) { + t.Parallel() + dbPath := buildFixtureDB(t, fixtureSQLPath("b_subagent_task")) + + out1 := make(chan canonical.Event, 8192) + var ce1 collectErrs + final, err := scanLoop(ctxBG(), dbPath, "opencode:x", newCursor(), out1, silentLogger(), ce1.onError) + if err != nil { + t.Fatalf("cold scanLoop: %v", err) + } + base := drainAll(out1) + // Sanity: the cold scan saw BOTH sessions. + if !sessionPresent(base, "ses_parent01") || !sessionPresent(base, "ses_child01") { + t.Fatalf("cold scan missing a session; parent=%v child=%v", + sessionPresent(base, "ses_parent01"), sessionPresent(base, "ses_child01")) + } + + reparsed, err := ParseCursor(final.String()) + if err != nil { + t.Fatalf("ParseCursor: %v", err) + } + out2 := make(chan canonical.Event, 8192) + var ce2 collectErrs + if _, err := scanLoop(ctxBG(), dbPath, "opencode:x", reparsed, out2, silentLogger(), ce2.onError); err != nil { + t.Fatalf("re-scan from final cursor: %v", err) + } + resumed := contentFingerprints(drainAll(out2)) + if len(resumed) != 0 { + t.Errorf("re-scan re-emitted %d events across 2 fully-consumed sessions, want 0:\n%v", len(resumed), resumed) + } +} + +// sessionPresent reports whether a SessionStartedEvent for nativeID is in the slice. +func sessionPresent(events []canonical.Event, nativeID string) bool { + for _, s := range sessionStarts(events) { + if s.NativeID == nativeID { + return true + } + } + return false +} + +// fixtureSQLPath returns the repo-relative path to a scenario's fixture.sql. 
+func fixtureSQLPath(scenario string) string { + return filepath.Join("..", "..", "..", "testdata", "opencode", scenario, "fixture.sql") +} diff --git a/internal/adapters/opencode/golden_test.go b/internal/adapters/opencode/golden_test.go new file mode 100644 index 0000000..ada492b --- /dev/null +++ b/internal/adapters/opencode/golden_test.go @@ -0,0 +1,257 @@ +package opencode + +import ( + "context" + "database/sql" + "encoding/json" + "flag" + "fmt" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file is the GOLDEN-TEST HARNESS for the opencode adapter (SOW-0005 chunk +// E). It mirrors codex/golden_test.go's auto-discovering shape — the +// -update-golden flag, the testdata/// loop, SourceProgress +// filtering, the goldenEvent{kind,payload} JSONL wire shape, and the +// placeholder substitution — adapting only the FIXTURE LOADING. Codex walks a +// $CODEX_HOME directory of rollout JSONL files; opencode reads a single SQLite +// database, which git cannot carry as a binary blob. So each scenario commits a +// human-reviewable fixture.sql (CREATE TABLE + INSERTs) and the harness builds a +// throwaway SQLite DB from it at run time via a SEPARATE read-write connection; +// the ADAPTER under test still opens that path strictly read-only via New → +// openReadOnly (the read-only contract is unchanged). +// +// The only embedding of an absolute path in a non-SourceProgress event is the +// SourceID ("opencode:"); it is rewritten to "opencode:" so the +// golden is portable and carries no operator filesystem path. The +// opencode-sqlite://?part_id=&field= PayloadRef URIs are DB-relative (no path, +// no basename — see payloads.go) and therefore already portable; they need no +// substitution. 
+ +var updateGolden = flag.Bool("update-golden", false, "rewrite golden expected.jsonl files for the opencode adapter") + +// goldenEvent is the wire shape written into expected.jsonl: the kind +// discriminator plus the concrete payload. Resilient to field additions on the +// canonical types — updating with -update-golden picks up new fields. Mirrors +// codex/golden_test.go. +type goldenEvent struct { + Kind string `json:"kind"` + Payload json.RawMessage `json:"payload"` +} + +// rootPlaceholder replaces the test's absolute database path inside the +// "opencode:" SourceID so golden files are portable across workstations +// and CI AND carry no operator filesystem path. Mirrors codex's . +const rootPlaceholder = "" + +// TestGolden runs every scenario directory under testdata/opencode/ that +// contains a fixture.sql and asserts Scan produces the canonical events recorded +// in expected.jsonl. Run with -update-golden to refresh. Mirrors codex's +// auto-discovering harness; the opencode INPUT is a fixture.sql the harness +// loads into a fresh temp SQLite DB (see buildFixtureDB). +// +// CRITICAL (chunk brief): a golden generated with -update-golden is NOT +// self-justifying — it pins whatever the code emitted, bugs included. Every +// expected.jsonl in this suite was hand-verified line by line against the spec +// and the fixture's intent before being trusted; the per-scenario invariants are +// additionally asserted in scenario-specific tests (golden_invariants_test.go) +// so a future -update-golden cannot silently launder a regression past review. 
+func TestGolden(t *testing.T) { + t.Parallel() + + base := filepath.Join("..", "..", "..", "testdata", "opencode") + entries, err := os.ReadDir(base) + if err != nil { + t.Fatalf("readdir %s: %v", base, err) + } + for _, e := range entries { + if !e.IsDir() { + continue + } + name := e.Name() + t.Run(name, func(t *testing.T) { + t.Parallel() + runGoldenScenario(t, filepath.Join(base, name)) + }) + } +} + +// runGoldenScenario builds the scenario's SQLite DB from fixture.sql, scans it +// through the public adapter, filters out SourceProgress (a checkpoint, not +// content), encodes the remaining events with the placeholder, and +// compares to (or rewrites, under -update-golden) expected.jsonl. +func runGoldenScenario(t *testing.T, scenarioDir string) { + t.Helper() + fixturePath := filepath.Join(scenarioDir, "fixture.sql") + if _, err := os.Stat(fixturePath); err != nil { + t.Skipf("fixture.sql missing: %v", err) + return + } + + dbPath := buildFixtureDB(t, fixturePath) + absDB, err := filepath.Abs(dbPath) + if err != nil { + t.Fatalf("abs: %v", err) + } + absDB = filepath.Clean(absDB) + + events := scanScenario(t, absDB) + + filtered := make([]canonical.Event, 0, len(events)) + for _, ev := range events { + if _, ok := ev.(canonical.SourceProgressEvent); ok { + continue + } + filtered = append(filtered, ev) + } + + encoded, err := encodeEvents(filtered, absDB) + if err != nil { + t.Fatalf("encode: %v", err) + } + + goldenPath := filepath.Join(scenarioDir, "expected.jsonl") + if *updateGolden { + if err := os.WriteFile(goldenPath, encoded, 0o600); err != nil { + t.Fatalf("write golden: %v", err) + } + t.Logf("updated golden: %s", goldenPath) + return + } + + want, err := os.ReadFile(goldenPath) // #nosec G304 -- goldenPath is a fixed testdata path under the scenario dir, not user input + if err != nil { + t.Fatalf("read golden %s: %v (run with -update-golden to create)", goldenPath, err) + } + if string(want) != string(encoded) { + t.Errorf("golden mismatch for 
%s\n--- want ---\n%s\n--- got ---\n%s", + goldenPath, string(want), string(encoded)) + } +} + +// scanScenario opens absDB through the public adapter (New + Scan), drains the +// emitted events, and returns them. The adapter opens the database read-only via +// its own openReadOnly helper; the harness never hands it a writable handle. +func scanScenario(t *testing.T, absDB string) []canonical.Event { + t.Helper() + a, err := New(absDB, canonical.AdapterOptions{}) + if err != nil { + t.Fatalf("New: %v", err) + } + out := make(chan canonical.Event, 8192) + if err := a.Scan(context.Background(), nil, out); err != nil { + t.Fatalf("Scan: %v", err) + } + return drainAll(out) +} + +// buildFixtureDB creates a fresh temp SQLite database in t.TempDir() and applies +// the scenario's fixture.sql (the human-reviewable INPUT analogue). The DB is +// built through a SEPARATE read-write database/sql connection (production NEVER +// opens opencode.db read-write; this is the test harness building the fixture); +// the connection is closed before the adapter reopens the path read-only so the +// WAL is flushed and the adapter sees a stable file. Returns the DB path. +// +// fixture.sql is executed statement-by-statement (split on ";\n") so the +// modernc.org/sqlite driver — which does not run multi-statement strings through +// database/sql's Exec — applies every CREATE/INSERT. Synthetic content only; no +// operator data ever reaches testdata/. 
+func buildFixtureDB(t *testing.T, fixturePath string) string { + t.Helper() + sqlBytes, err := os.ReadFile(fixturePath) // #nosec G304 -- fixturePath is a fixed testdata path, not user input + if err != nil { + t.Fatalf("read fixture %s: %v", fixturePath, err) + } + + path := filepath.Join(t.TempDir(), "opencode.db") + rw, err := sql.Open(driverName, rwDSNFor(path)) + if err != nil { + t.Fatalf("open rw fixture db: %v", err) + } + for _, stmt := range splitSQLStatements(string(sqlBytes)) { + if _, err := rw.Exec(stmt); err != nil { + _ = rw.Close() + t.Fatalf("apply fixture stmt: %v\nstmt: %s", err, stmt) + } + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw fixture db: %v", err) + } + return path +} + +// splitSQLStatements splits a fixture.sql blob into individual executable +// statements. opencode fixtures are simple (CREATE TABLE + INSERT, no triggers, +// no procedural blocks, no embedded ';' inside string literals beyond what the +// fixtures avoid by construction), so a split on ";" terminating a line is +// sufficient and keeps the harness dependency-free. Blank/comment-only fragments +// (-- lines) are dropped. Each returned statement is trimmed and non-empty. +func splitSQLStatements(blob string) []string { + var out []string + for _, raw := range strings.Split(blob, ";\n") { + stmt := stripSQLComments(raw) + if stmt != "" { + out = append(out, stmt) + } + } + // A trailing statement not followed by a newline (no ";\n") is caught by the + // final fragment; strip a lone trailing ';' it may carry. + for i, s := range out { + out[i] = strings.TrimSuffix(strings.TrimSpace(s), ";") + } + cleaned := out[:0] + for _, s := range out { + if strings.TrimSpace(s) != "" { + cleaned = append(cleaned, s) + } + } + return cleaned +} + +// stripSQLComments removes whole-line "--" comments and trims surrounding +// whitespace, returning "" for a fragment that is only comments/whitespace. 
It +// does not attempt to strip inline trailing comments (the fixtures keep "--" +// comments on their own lines), keeping the splitter simple and predictable. +func stripSQLComments(frag string) string { + var b strings.Builder + for _, line := range strings.Split(frag, "\n") { + trimmed := strings.TrimSpace(line) + if strings.HasPrefix(trimmed, "--") { + continue + } + b.WriteString(line) + b.WriteByte('\n') + } + return strings.TrimSpace(b.String()) +} + +// encodeEvents serialises events one goldenEvent per line, with the absolute +// test-machine database path inside the "opencode:" SourceID replaced by +// rootPlaceholder so golden files are portable AND carry no operator filesystem +// path (sensitive-data hygiene). Mirrors codex's encodeEvents; opencode embeds +// the path in exactly ONE field (SourceID) — the PayloadRef LocationURI is the +// path-free opencode-sqlite://?part_id=&field= form, so no second rewrite is +// needed. +func encodeEvents(events []canonical.Event, absDB string) ([]byte, error) { + var b strings.Builder + for _, ev := range events { + payload, err := json.Marshal(ev) + if err != nil { + return nil, fmt.Errorf("marshal %T: %w", ev, err) + } + s := strings.ReplaceAll(string(payload), sourceIDPrefix+absDB, sourceIDPrefix+rootPlaceholder) + ge := goldenEvent{Kind: string(ev.EventKind()), Payload: json.RawMessage(s)} + enc, err := json.Marshal(ge) + if err != nil { + return nil, err + } + b.Write(enc) + b.WriteByte('\n') + } + return []byte(b.String()), nil +} diff --git a/internal/adapters/opencode/mapper.go b/internal/adapters/opencode/mapper.go new file mode 100644 index 0000000..025d1e6 --- /dev/null +++ b/internal/adapters/opencode/mapper.go @@ -0,0 +1,385 @@ +package opencode + +import ( + "encoding/json" + "fmt" + "math" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file is the PURE row→event mapper for the opencode adapter (SOW-0005 +// chunk B). 
Given one session row plus its ordered assistant/user messages and +// each message's ordered parts, mapSession emits the full canonical event +// stream for that session. It is DETERMINISTIC and RE-EMITTABLE: chunk C's +// tailer re-feeds an affected session's WHOLE tree on any change, and the +// ingester's idempotent upserts + the (post-SOW-0004) idempotent catalog +// absorb the re-emission. The mapper performs NO I/O and runs NO SQL — chunk C +// owns the database; the mapper consumes the chunk-A typed rows only +// (adapter-opencode.md §"Mapping to Canonical Events"; SOW-0005 Pre-Impl Gate +// "Canonical mapping"). +// +// File split (each ≤ ~400 lines per the chunk brief): this file drives the +// session + turn loop and terminal-status decision; mapper_parts.go walks a +// message's parts; mapper_ops.go holds the op emitters, computeStepDeltas, the +// tool-namespace + provider-canonicalization helpers, and the PayloadRef seam +// that chunk D fills with the opencode-sqlite:// URI builder. + +// Format is the stable adapter identifier ("opencode"). It is the Source the +// mapper stamps on every LogEntry and the name chunk D's adapter.go registers +// with the adapter registry. It is defined here (the mapper is the first +// non-test consumer) so chunk B compiles and is testable in isolation; chunk D +// references this const rather than redefining it (mirrors codex, where one +// Format const is shared by mapper.go and adapter.go). +const Format = "opencode" + +// messageWithParts pairs one message row with its parts, already ordered by +// (time_created, id) — the unit chunk C hands the mapper. The mapper does not +// re-sort; ordering is the query layer's job (adapter-opencode.md §"Mapping to +// Canonical Events": parts walked in id order, assistant messages ordered by +// (time_created, id)). +type messageWithParts struct { + Message messageRow + Parts []partRow +} + +// mapSession projects one opencode session tree onto the canonical event +// stream. 
It is the package's single mapper entry point. +// +// Emission order (deterministic): +// 1. SessionStartedEvent (always, first). +// 2. For each assistant message in input order: TurnStartedEvent, then its +// parts' ops/payloads/logs in part order, then TurnFinalizedEvent. User +// messages anchor the following assistant turn but emit no events of their +// own (opencode pairs a user→assistant cycle; the assistant message IS the +// turn — adapter-opencode.md §"Turn synthesis"). +// 3. SessionFinalizedEvent IFF a terminal signal is present (archived → +// completed; last assistant message carries data.error → failed). A session +// with neither stays running with no finalize, like claude-code/codex +// (adapter-opencode.md §"Per-table emit rules", "Canonical Model Gaps" #5). +// +// SourceSeq is a deterministic per-event counter (observability only; the +// durable resume state is the watermark cursor, not SourceSeq — see +// canonical.EventBase). It is packed from a monotonically increasing record +// index so a re-emit of the same tree yields identical SourceSeqs. +func mapSession(sourceID string, s sessionRow, msgs []messageWithParts, opts ...MapOption) ([]canonical.Event, error) { + m := newSessionMapper(sourceID, s) + for _, o := range opts { + o(m) + } + out := make([]canonical.Event, 0, 16+4*len(msgs)) + + out = append(out, m.sessionStarted()) + + for i := range msgs { + evs, err := m.mapMessage(msgs[i]) + if err != nil { + return nil, err + } + out = append(out, evs...) + } + + if fin := m.sessionFinalized(); fin != nil { + out = append(out, fin) + } + return out, nil +} + +// sessionMapper threads the per-session inference state: turn numbering, the +// previous assistant message's cumulative token totals (for the message-level +// per-turn delta — SOW decision #4), and the running SourceSeq record index. +// One sessionMapper processes exactly one session start-to-finish; it is not +// reused (turn/op numbering is per-session). 
+type sessionMapper struct { + sourceID string + session sessionRow + + // turnSeq is the last assigned 1-based turn Seq (assistant-message order). + turnSeq int + + // prevTurnTokens is the PRIOR assistant message's cumulative token totals, + // used to compute the current turn's per-turn delta (SOW decision #4: the + // message-level tokens are the session-running total at completion of the + // turn, so the per-turn value is this turn's cumulative minus the previous + // turn's cumulative). havePrevTurn distinguishes "no prior turn" (turn 1, + // whose delta is its own cumulative) from a genuine zero prior. + // + // The message-level cumulative→delta behavior is PINNED by tests + // (TestMapSession_TurnNumberingAndTokenDeltas) and the e_cumulative_tokens + // golden (TestGoldenInvariant_ECumulativeTokens: turn rollup = the + // message-level cumulative). It is the analogue, one level up, of the + // step-level cumulative pattern (AC#3, TestComputeStepDeltas_AC3): opencode's + // assistant message.data.tokens is the session-running total at the turn's + // completion, so the per-turn delta is this cumulative minus the previous + // turn's. The arithmetic is isolated here behind subClampWarn so a future + // upstream change is a one-spot edit. + prevTurnTokens tokenCounts + havePrevTurn bool + + // recordIdx is the running 0-based record ordinal feeding SourceSeq. It is + // advanced once per emitted event via the seq() closure; its absolute value + // is never load-bearing (SourceSeq is observability-only), only its + // determinism across re-emits matters. + recordIdx uint64 + + // failError / failEndUs carry the session's failed-terminal signal as the + // LAST assistant turn's terminal error (SOW-0005 round-2 P1-B). 
mapAssistantTurn + // SETS them when a turn carries data.error and CLEARS them when a turn does not, + // so they reflect only the final turn's state — matching adapter-opencode.md + // §"Per-table emit rules" ("failure is the LAST assistant message's state"). A + // session that errored on an early turn but whose last turn succeeded is NOT + // failed. sessionFinalized consumes them when the session is not archived. + // failError stays nil for a clean (or recovered) session. + failError *assistantError + failEndUs int64 + + // uriBuilder is the chunk-D-injectable PayloadRef LocationURI builder (the + // opencode-sqlite:// seam — see mapper_ops.go payloadURIBuilder). nil in + // mapper-only unit tests, where defaultPayloadURI is used. + uriBuilder payloadURIBuilder + + // rootID is the resolved TRUE tree root of this session (the topmost ancestor + // of the parent_id chain), injected by the loader via WithRootNativeID + // (SOW-0005 P2.4). When empty, rootNativeID falls back to the direct-parent + // heuristic (the mapper-only path that has no DB to walk the chain). For a + // root session this is its own id; for a nested sub-agent it is the whole + // tree's root, not the direct parent. + rootID string + + // warn surfaces a load-bearing decode failure (malformed session.model JSON, + // malformed task metadata) with structured context, so a corrupt field + // degrades to zero WITHOUT being silently swallowed (SOW-0005 P2.6; "no + // silent failures"). Injected by the loader (WithOnWarn → onError); a no-op in + // mapper-only unit tests. It NEVER aborts the session — the mapper degrades + // the affected field and continues. + warn func(error) +} + +// MapOption configures mapSession. Options inject the chunk-D PayloadRef URI +// builder, the resolved tree root, and the warn callback; opts is variadic so +// mapper-only tests can omit them all and inherit the deterministic defaults.
+type MapOption func(*sessionMapper) + +// WithPayloadURIBuilder injects the production PayloadRef LocationURI builder +// (chunk D). When unset, mapSession uses defaultPayloadURI (the relative +// opencode-sqlite:// form with no database basename). +func WithPayloadURIBuilder(b payloadURIBuilder) MapOption { + return func(m *sessionMapper) { m.uriBuilder = b } +} + +// WithRootNativeID injects the resolved TRUE tree root for this session (the +// topmost parent_id ancestor — SOW-0005 P2.4), which the loader computes via +// resolveRootID by walking the parent_id chain. When unset (mapper-only tests +// with no DB), rootNativeID falls back to the direct-parent heuristic. An empty +// root is ignored (falls back), so a caller that cannot resolve it degrades +// rather than emitting an empty RootNativeID. +func WithRootNativeID(root string) MapOption { + return func(m *sessionMapper) { m.rootID = root } +} + +// WithOnWarn injects a callback for load-bearing decode failures (SOW-0005 P2.6), +// so a malformed session.model JSON or task metadata is surfaced with structured +// context rather than silently degraded to zero. The loader wires this to the +// adapter's onError; mapper-only tests may omit it (the failure then degrades +// silently, as before, in the no-DB path). +func WithOnWarn(warn func(error)) MapOption { + return func(m *sessionMapper) { m.warn = warn } +} + +// mwarn surfaces a decode failure via the injected warn callback when one is set +// (a no-op otherwise), so the pure mapper degrades a field without aborting and +// WITHOUT silently swallowing the error (SOW-0005 P2.6). +func (m *sessionMapper) mwarn(err error) { + if m.warn != nil { + m.warn(err) + } +} + +// newSessionMapper constructs a mapper for one session. +func newSessionMapper(sourceID string, s sessionRow) *sessionMapper { + return &sessionMapper{sourceID: sourceID, session: s} +} + +// nativeID is the session's canonical native id (session.id). 
+func (m *sessionMapper) nativeID() string { return m.session.ID } + +// rootNativeID returns the root of this session's tree. When the loader injected +// a resolved root (WithRootNativeID — the topmost parent_id ancestor, SOW-0005 +// P2.4) that wins, so a >2-level nested sub-agent points at the true tree root, +// not its direct parent. Without an injected root (mapper-only path with no DB to +// walk), it falls back to the direct parent for a sub-agent (a meaningful pointer +// even before the parent row lands — mirrors codex/claude_code), else the +// session's own id (adapter-opencode.md §"Per-table emit rules"). +func (m *sessionMapper) rootNativeID() string { + if m.rootID != "" { + return m.rootID + } + if m.session.ParentID != "" { + return m.session.ParentID + } + return m.session.ID +} + +// nextBase returns the next EventBase with a deterministic SourceSeq and the +// given canonical (microsecond) timestamp. Each call advances recordIdx, so the +// emitted stream's SourceSeqs are stable across re-emits of the same tree. +func (m *sessionMapper) nextBase(tsUs int64) canonical.EventBase { + b := canonical.EventBase{SourceID: m.sourceID, SourceSeq: m.recordIdx, Ts: tsUs} + m.recordIdx++ + return b +} + +// sessionStarted builds the once-per-session SessionStartedEvent (adapter- +// opencode.md §"Per-table emit rules"). Kind = sub_agent when parent_id is set +// (+ParentNativeID), else root. Model is session.model $.id; Cwd is the start- +// of-session directory; AgentName is session.agent. Per-session extras carry +// the provider alias, version, slug, title, project id, and directory so the UI +// can attribute the capture (turn-extras like per-turn cwd are deferred to +// SOW-0021; no canonical turn Extras carrier exists — SOW decision #4). 
+func (m *sessionMapper) sessionStarted() canonical.SessionStartedEvent { + kind := canonical.KindRoot + if m.session.ParentID != "" { + kind = canonical.KindSubAgent + } + mr := m.sessionModel() + ev := canonical.SessionStartedEvent{ + EventBase: m.nextBase(m.msToMicrosWarn(m.session.TimeCreatedMs, "session.time_created")), + NativeID: m.session.ID, + RootNativeID: m.rootNativeID(), + ParentNativeID: m.session.ParentID, + Kind: kind, + AgentName: m.session.Agent, + Model: mr.modelID(), + Cwd: m.session.Directory, + Extras: m.sessionExtras(mr), + } + return ev +} + +// sessionModel decodes the session.model JSON ({id, providerID, variant?}), +// returning a zero modelRef when absent or malformed. An ABSENT model column +// (older schema, nil bytes) is silent forward-compat; a PRESENT-but-malformed +// blob is load-bearing (it drops the session's model/provider attribution), so it +// is surfaced via mwarn with the session id rather than silently zeroed +// (SOW-0005 P2.6). The field still degrades to zero — the session is not aborted. +func (m *sessionMapper) sessionModel() modelRef { + if len(m.session.Model) == 0 { + return modelRef{} + } + var mr modelRef + if err := json.Unmarshal(m.session.Model, &mr); err != nil { + m.mwarn(fmt.Errorf("opencode: malformed session.model JSON (table=session id=%s field=model); model/provider omitted: %w", m.session.ID, err)) + return modelRef{} + } + return mr +} + +// sessionExtras builds sessions.extras_json from the session row. Only non-empty +// values are included so an older-schema row missing a column contributes +// nothing rather than an empty string. The provider alias is surfaced verbatim +// (the canonical provider mapping for ops is best-effort; see canonicalProvider). 
+func (m *sessionMapper) sessionExtras(mr modelRef) map[string]any { + extras := map[string]any{} + putStr(extras, "providerID", mr.ProviderID) + putStr(extras, "variant", mr.Variant) + putStr(extras, "version", m.session.Version) + putStr(extras, "slug", m.session.Slug) + putStr(extras, "title", m.session.Title) + putStr(extras, "project_id", m.session.ProjectID) + putStr(extras, "directory", m.session.Directory) + if len(extras) == 0 { + return nil + } + return extras +} + +// sessionFinalized decides the session's terminal classification (adapter- +// opencode.md §"Per-table emit rules", "Canonical Model Gaps" #5): +// +// - time_archived set → SessionFinalized(completed, EndTs = archived ms→µs). +// Archival is the only clean terminal signal opencode records. +// - else the LAST assistant turn carries data.error → SessionFinalized(failed, +// ErrorClass = error.name or defaultErrorClass when empty, ErrorMessage = +// error.data.message when present (round-5 P3-1), EndTs = that message's +// completed-or-created ts). A session whose last turn recovered is NOT failed +// even if an earlier turn errored (SOW-0005 round-2 P1-B). +// - else running: NO SessionFinalized (opencode never finalizes a session, it +// only archives — like claude-code/codex which have no per-session terminal). +// +// Archival WINS over an error: an archived session is a user action and its +// archive timestamp is the authoritative terminal, even if its last turn +// errored. Returns nil when the session stays running. 
+func (m *sessionMapper) sessionFinalized() canonical.Event { + if m.session.TimeArchivedMs > 0 { + archUs := m.msToMicrosWarn(m.session.TimeArchivedMs, "session.time_archived") + ev := canonical.SessionFinalizedEvent{ + EventBase: m.nextBase(archUs), + NativeID: m.session.ID, + Status: canonical.StatusCompleted, + EndTs: archUs, + } + return ev + } + if m.failError != nil { + ev := canonical.SessionFinalizedEvent{ + EventBase: m.nextBase(m.failEndUs), + NativeID: m.session.ID, + Status: canonical.StatusFailed, + ErrorClass: errorClass(m.failError), // defaultErrorClass when name empty (P2-A) + ErrorMessage: errorMessage(m.failError), // data.message when present (round-5 P3-1) + EndTs: m.failEndUs, + } + return ev + } + return nil +} + +// clampMsToMicros converts opencode's native milliseconds to canonical +// microseconds, reporting whether it had to CLAMP. A non-positive input (absent +// timestamp) maps to (0, false) so an unset column never fabricates a +// 1970-adjacent time (adapter-opencode.md §"Edge Cases" #6 — the single ms→µs +// conversion point). A crafted/corrupt huge ms whose *1000 would overflow int64 +// SATURATES at math.MaxInt64 and returns clamped=true (SOW-0005 round-2 P2-F): a +// wrapped timestamp would become negative and reorder events nonsensically; the +// far-future clamp is the safe degradation. This is the pure core; callers with a +// warn channel surface the clamp via msToMicrosWarn (SOW-0005 round-3 P2-1). +func clampMsToMicros(ms int64) (us int64, clamped bool) { + if ms <= 0 { + return 0, false + } + if ms > math.MaxInt64/1000 { + return math.MaxInt64, true + } + return ms * 1000, false +} + +// msToMicros is the non-warning ms→µs conversion used by the pure free-function +// helpers (toolStartUs / turnEndUs / cancelOpenLLMOp) that have NO warn channel. +// It saturates silently; the mapper-method emission sites use msToMicrosWarn so a +// clamp is surfaced (SOW-0005 round-3 P2-1). 
+func msToMicros(ms int64) int64 { + us, _ := clampMsToMicros(ms) + return us +} + +// msToMicrosWarn is the warning-capable ms→µs conversion the mapper's method sites +// use: it saturates exactly like msToMicros but, on a clamp, surfaces a structured +// WARN via mwarn with the field label so a crafted/corrupt timestamp is no longer +// a silent saturation (SOW-0005 round-3 P2-1). The pure mapper-only path (no warn +// wired) still degrades silently. field names the source column/path for context +// (e.g. "session.time_created"). +func (m *sessionMapper) msToMicrosWarn(ms int64, field string) int64 { + us, clamped := clampMsToMicros(ms) + if clamped { + m.mwarn(fmt.Errorf("opencode: timestamp ms→µs overflow (field=%s ms=%d); clamped to MaxInt64 (P2-1)", field, ms)) + } + return us +} + +// putStr inserts k=v into the extras map only when v is non-empty, so an +// older-schema zero value contributes nothing. +func putStr(m map[string]any, k, v string) { + if v != "" { + m[k] = v + } +} diff --git a/internal/adapters/opencode/mapper_branch_test.go b/internal/adapters/opencode/mapper_branch_test.go new file mode 100644 index 0000000..161dfeb --- /dev/null +++ b/internal/adapters/opencode/mapper_branch_test.go @@ -0,0 +1,447 @@ +package opencode + +import ( + "encoding/json" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the mapper's defensive / edge branches that the happy-path +// tests in mapper_test.go do not reach: malformed bodies, orphan steps, +// missing-timestamp fallbacks, the chunk-D PayloadRef-URI seam, known-but-unused +// part types, and the top-level (no-LLM-op) op-parent case. Each test asserts a +// real behavior, not just line coverage. 
+ +// --- chunk-D PayloadRef URI seam (WithPayloadURIBuilder, payloadURI inject) --- + +func TestMapSession_InjectedPayloadURIBuilder(t *testing.T) { + s := rootSession("ses_x", 0) + end := int64(2200) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), + reasoningPart("prt_2", 2000, &end, false), + ), + } + // Chunk D injects a builder that prefixes the resolved db basename. + builder := func(partID, field string) string { + return "opencode-sqlite://opencode.db?part_id=" + partID + "&field=" + field + } + evs, err := mapSession(testSourceID, s, msgs, WithPayloadURIBuilder(builder)) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + var found bool + for _, ev := range evs { + if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "llm_reasoning" { + found = true + want := "opencode-sqlite://opencode.db?part_id=prt_2&field=text" + if p.LocationURI != want { + t.Fatalf("LocationURI = %q want %q (injected builder)", p.LocationURI, want) + } + } + } + if !found { + t.Fatal("no llm_reasoning PayloadRef") + } +} + +func TestDefaultPayloadURI(t *testing.T) { + got := defaultPayloadURI("prt_9", "state.output") + want := "opencode-sqlite://?part_id=prt_9&field=state.output" + if got != want { + t.Fatalf("defaultPayloadURI = %q want %q", got, want) + } +} + +// --- orphan step-finish (no matching step-start) ------------------------------ + +func TestMapSession_OrphanStepFinishNoCrash(t *testing.T) { + s := rootSession("ses_x", 0) + // A step-finish with no preceding step-start: no LLM op to close. Must be a + // no-op (adapter-opencode.md §"Edge Cases" #5), emitting no op finalize. 
+ msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepFinish("prt_1", 100, 10, 0, 0, 0, 0.1), + ), + } + evs := run(t, s, msgs) + if n := len(opStarts(evs)); n != 0 { + t.Fatalf("op starts = %d want 0 (orphan step-finish opens nothing)", n) + } + if n := len(opFinals(evs)); n != 0 { + t.Fatalf("op finals = %d want 0 (orphan step-finish closes nothing)", n) + } +} + +// --- nextStepDelta out-of-range (more step-finishes than deltas) -------------- + +func TestNextStepDelta_OutOfRange(t *testing.T) { + tc := &turnContext{stepDeltas: []tokenCounts{{Input: 5}}} + if d := tc.nextStepDelta(); d.Input != 5 { + t.Fatalf("first delta = %d want 5", d.Input) + } + // Second call is past the end → zero delta, no panic. + if d := tc.nextStepDelta(); d != (tokenCounts{}) { + t.Fatalf("out-of-range delta = %+v want zero", d) + } +} + +// --- missing-timestamp fallbacks ---------------------------------------------- + +func TestMapSession_ReasoningStartFallsBackToPartCreated(t *testing.T) { + s := rootSession("ses_x", 0) + // reasoning part with no time.start → falls back to the part's time_created. + raw, _ := json.Marshal(map[string]any{"type": "reasoning", "text": "x"}) + p := partRow{ID: "prt_2", MessageID: "msg_a", SessionID: "ses_x", TimeCreatedMs: 1900, Data: raw} + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), p), + } + evs := run(t, s, msgs) + for _, op := range opStarts(evs) { + if op.Kind == canonical.OpReasoning { + if op.Ts != 1900*1000 { + t.Fatalf("reasoning Ts = %d want %d (part time_created fallback)", op.Ts, 1900*1000) + } + return + } + } + t.Fatal("no reasoning op") +} + +func TestMapSession_ToolStartFallsBackToPartCreated(t *testing.T) { + s := rootSession("ses_x", 0) + // tool state with no time.start → op start falls back to part time_created. 
+ state := map[string]any{"status": "running", "input": map[string]any{}} + raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "c", "tool": "bash", "state": state}) + p := partRow{ID: "prt_2", MessageID: "msg_a", SessionID: "ses_x", TimeCreatedMs: 1950, Data: raw} + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), p), + } + evs := run(t, s, msgs) + tools := toolOps(evs) + if len(tools) != 1 { + t.Fatalf("tool op count = %d want 1", len(tools)) + } + if tools[0].Ts != 1950*1000 { + t.Fatalf("tool Ts = %d want %d (part time_created fallback)", tools[0].Ts, 1950*1000) + } +} + +// --- malformed part body → one WRN, no op ------------------------------------- + +func TestMapSession_MalformedPartSkippedWithWarn(t *testing.T) { + s := rootSession("ses_x", 0) + bad := partRow{ID: "prt_bad", MessageID: "msg_a", SessionID: "ses_x", Data: []byte("{not json")} + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + bad), + } + evs := run(t, s, msgs) + warns := 0 + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" { + warns++ + } + } + if warns != 1 { + t.Fatalf("WRN count = %d want 1 for malformed part", warns) + } + if n := len(opStarts(evs)); n != 0 { + t.Fatalf("op starts = %d want 0 for malformed part", n) + } +} + +// --- unknown message role → one WRN ------------------------------------------- + +func TestMapSession_UnknownRoleSkippedWithWarn(t *testing.T) { + s := rootSession("ses_x", 0) + raw, _ := json.Marshal(map[string]any{"role": "system", "time": map[string]any{"created": 1500}}) + msgs := []messageWithParts{ + {Message: messageRow{ID: "msg_sys", SessionID: "ses_x", TimeCreatedMs: 1500, Data: raw}}, + } + evs := run(t, s, msgs) + if n := countKind(evs, canonical.EvTurnStarted); n != 0 { + t.Fatalf("TurnStarted = %d want 0 for unknown role", n) + } + warns := 
0 + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" { + warns++ + } + } + if warns != 1 { + t.Fatalf("WRN count = %d want 1 for unknown role", warns) + } +} + +// --- known-but-unused part types (snapshot/subtask/agent) → no-op ------------- + +func TestMapSession_KnownNoOpParts(t *testing.T) { + s := rootSession("ses_x", 0) + mk := func(id, typ string) partRow { + raw, _ := json.Marshal(map[string]any{"type": typ}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} + } + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), + mk("prt_2", "snapshot"), + mk("prt_3", "subtask"), + mk("prt_4", "agent"), + ), + } + evs := run(t, s, msgs) + // Only the LLM op from step-start; no ops or logs from the known-no-op parts. + for _, op := range opStarts(evs) { + if op.Kind != canonical.OpLLM { + t.Fatalf("unexpected op kind %q from known-no-op part", op.Kind) + } + } + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok { + t.Fatalf("unexpected log %q for known-no-op part (must be silent)", l.Message) + } + } +} + +// --- tool with unknown status + an end → finalized "completed" at that end ---- + +// TestMapSession_ToolUnknownStatusWithEnd pins the P1-C audit (SOW-0005 round-2): +// a FUTURE/unknown opencode tool status that carries an end still finalizes, but +// with the CANONICAL status "completed" (it ended) — never the raw opencode +// string, which is not a canonical op status (canonical-events.md:196). 
+func TestMapSession_ToolUnknownStatusWithEnd(t *testing.T) { + s := rootSession("ses_x", 0) + end := int64(2600) + state := map[string]any{ + "status": "weird-future-status", + "input": map[string]any{"k": "v"}, + "output": "out", + "time": map[string]any{"start": int64(2000), "end": end}, + } + raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "c", "tool": "bash", "state": state}) + p := partRow{ID: "prt_2", MessageID: "msg_a", SessionID: "ses_x", Data: raw} + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), p), + } + evs := run(t, s, msgs) + // No finalize may carry the raw opencode status. + for _, f := range opFinals(evs) { + if f.Status == "weird-future-status" { + t.Fatalf("finalize carries the raw opencode status %q (P1-C: must be canonical)", f.Status) + } + } + var fin *canonical.OpFinalizedEvent + for i, f := range opFinals(evs) { + if f.Status == "completed" { + fin = &opFinals(evs)[i] + } + } + if fin == nil { + t.Fatal("unknown-status tool with an end must finalize with canonical status 'completed'") + } + if fin.EndTs != end*1000 { + t.Fatalf("EndTs = %d want %d", fin.EndTs, end*1000) + } +} + +// --- tool with nil state → no finalize ---------------------------------------- + +func TestMapSession_ToolNilStateNoFinalize(t *testing.T) { + s := rootSession("ses_x", 0) + raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "c", "tool": "bash"}) // no state + p := partRow{ID: "prt_2", MessageID: "msg_a", SessionID: "ses_x", TimeCreatedMs: 2000, Data: raw} + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), p), + } + evs := run(t, s, msgs) + tools := toolOps(evs) + if len(tools) != 1 { + t.Fatalf("tool op count = %d want 1", len(tools)) + } + for _, f := range opFinals(evs) { + if f.Seq == tools[0].Seq && f.TurnSeq == tools[0].TurnSeq { + t.Fatal("tool with nil 
state must NOT finalize") + } + } +} + +// --- tool op before any step-start → ParentOpSeq = -1 (top-level) ------------- + +func TestMapSession_ToolBeforeStepIsTopLevel(t *testing.T) { + s := rootSession("ses_x", 0) + end := int64(2500) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + toolPart("prt_1", "bash", "completed", 2000, &end, nil), // no preceding step-start + ), + } + evs := run(t, s, msgs) + tools := toolOps(evs) + if len(tools) != 1 { + t.Fatalf("tool op count = %d want 1", len(tools)) + } + if tools[0].ParentOpSeq != -1 { + t.Fatalf("ParentOpSeq = %d want -1 (top-level, no LLM op open)", tools[0].ParentOpSeq) + } +} + +// --- file part before any LLM op → INF LogEntry, op 0 (round-4 P2-3) ---------- + +// TestMapSession_FileBeforeLLMOp pins the round-4 P2-3 contract for a file part +// that arrives BEFORE any step-start: unlike the old PayloadRef (which had to be +// dropped because payload_refs.op_id is NOT NULL), an INF LogEntry's OpSeq may be +// 0, so the attachment is STILL surfaced — turn-scoped, op 0 — and no non-canonical +// PayloadRef is emitted. +func TestMapSession_FileBeforeLLMOp(t *testing.T) { + s := rootSession("ses_x", 0) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + filePart("prt_1", "https://cdn.example.invalid/x.png"), // before any step-start + ), + } + evs := run(t, s, msgs) + // No non-canonical PayloadRef at all. + for _, ev := range evs { + if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "user_attachment" { + t.Fatal("file part must NOT emit a non-canonical user_attachment PayloadRef (round-4 P2-3)") + } + } + // The attachment IS surfaced as an INF LogEntry with OpSeq 0 (no LLM op open). 
+ var found int + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok && l.Message == "file attachment" { + found++ + if l.Severity != "INF" { + t.Errorf("file-attachment severity = %q, want INF", l.Severity) + } + if l.OpSeq != 0 { + t.Errorf("file-attachment OpSeq = %d, want 0 (no LLM op open)", l.OpSeq) + } + } + } + if found != 1 { + t.Fatalf("file-attachment INF LogEntry count = %d, want 1 (surfaced even before any op)", found) + } +} + +// --- patch before any step-start → dropped (no LLM op to attach extras) ------- + +func TestMapSession_PatchBeforeStepDropped(t *testing.T) { + s := rootSession("ses_x", 0) + c := int64(3000) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, &c, "the-alias", "the-model", tokenCounts{Input: 10}, 0.1, "stop", ""), + patchPart("prt_1"), // before any step-start → no LLM op, patch dropped + stepStart("prt_2"), + stepFinish("prt_3", 10, 1, 0, 0, 0, 0.1), + ), + } + evs := run(t, s, msgs) + // The LLM op opens AFTER the patch, so its extras must NOT carry patch_files. + for _, op := range opStarts(evs) { + if op.Kind == canonical.OpLLM { + if _, ok := op.Extras["patch_files"]; ok { + t.Fatal("patch before any step-start must be dropped, not grafted onto a later op") + } + } + } +} + +// --- jsonTrimBytes null/empty handling ---------------------------------------- + +func TestJSONTrimBytes(t *testing.T) { + if b := jsonTrimBytes([]byte(" null ")); b != nil { + t.Fatalf("null → %q want nil", b) + } + if b := jsonTrimBytes([]byte(" ")); b != nil { + t.Fatalf("blank → %q want nil", b) + } + if b := jsonTrimBytes([]byte(` {"a":1} `)); string(b) != `{"a":1}` { + t.Fatalf("object → %q want trimmed object", b) + } +} + +// --- byte-accounting + logEntry defensive guards (nil state / nil extras) ----- + +func TestToolBytes_NilState(t *testing.T) { + // A partData with no state contributes zero bytes either way. 
+ if n := toolBytesIn(partData{}); n != 0 { + t.Fatalf("toolBytesIn(nil state) = %d want 0", n) + } + if n := toolBytesOut(partData{}); n != 0 { + t.Fatalf("toolBytesOut(nil state) = %d want 0", n) + } +} + +func TestLogEntry_NilExtrasDefaulted(t *testing.T) { + m := newSessionMapper(testSourceID, rootSession("ses_x", 0)) + // A nil extras map must be defaulted to an empty (non-nil) map so the event + // carries an addressable Extras. + ev := m.logEntry(m.nextBase(0), "INF", 0, 0, "x", nil) + if ev.Extras == nil { + t.Fatal("logEntry Extras must be non-nil even when passed nil") + } +} + +// --- bytes_in / bytes_out on a completed tool --------------------------------- + +func TestMapSession_ToolBytesAccounting(t *testing.T) { + s := rootSession("ses_x", 0) + end := int64(2500) + state := map[string]any{ + "status": "completed", + "input": map[string]any{"command": "ls"}, + "output": "file1\nfile2", + "time": map[string]any{"start": int64(2000), "end": end}, + } + raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "c", "tool": "bash", "state": state}) + p := partRow{ID: "prt_2", MessageID: "msg_a", SessionID: "ses_x", Data: raw} + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), p), + } + evs := run(t, s, msgs) + for _, f := range opFinals(evs) { + if f.Status == "completed" && f.BytesOut > 0 { + if int(f.BytesOut) != len("file1\nfile2") { + t.Fatalf("BytesOut = %d want %d", f.BytesOut, len("file1\nfile2")) + } + if f.BytesIn <= 0 { + t.Fatalf("BytesIn = %d want >0 (serialized input length)", f.BytesIn) + } + return + } + } + t.Fatal("no completed tool finalize with byte accounting") +} + +// --- malformed session.model is tolerated (zero model) ------------------------ + +func TestMapSession_MalformedSessionModelTolerated(t *testing.T) { + s := rootSession("ses_x", 0) + s.Model = []byte("{not json") + evs := run(t, s, nil) + st := firstStarted(t, evs) + if st.Model 
!= "" { + t.Fatalf("Model = %q want empty for malformed session.model", st.Model) + } +} + +// --- session with no extras at all → nil Extras ------------------------------- + +func TestMapSession_NoExtrasYieldsNil(t *testing.T) { + // A bare session row (older schema, only required cols) carries no extras. + s := sessionRow{ID: "ses_bare", TimeCreatedMs: 1000} + evs := run(t, s, nil) + st := firstStarted(t, evs) + if st.Extras != nil { + t.Fatalf("Extras = %v want nil for a bare session", st.Extras) + } +} diff --git a/internal/adapters/opencode/mapper_emitters.go b/internal/adapters/opencode/mapper_emitters.go new file mode 100644 index 0000000..00c96f0 --- /dev/null +++ b/internal/adapters/opencode/mapper_emitters.go @@ -0,0 +1,100 @@ +package opencode + +import ( + "fmt" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the NON-OP part emitters the part walker (mapper_parts.go) +// delegates to: text → PayloadRef, patch → LLM-op extras, compaction → INF log, +// retry → WRN log, file → INF log (SOW-0005 round-4 P2-3: a file part is an +// attachment, NOT a payload-with-op; it is surfaced as an INF LogEntry carrying +// filename/url/mime in extras rather than a PayloadRef with a non-canonical +// PayloadKind). Split out of mapper_ops.go to keep each file ≤400 lines (SOW-0005 +// round-2; the P1-C/P2-D additions pushed mapper_ops.go over budget). The OP +// emitters (LLM/reasoning/tool + the task→session op) stay in mapper_ops.go; the +// turn finalizer + token math in mapper_turn.go. + +// emitTextPayload handles a text part (adapter-opencode.md §"Per-table emit +// rules": text is NOT an op; emit a PayloadRef for the assistant text scoped to +// the turn's most-recent LLM op). When no LLM op is open yet (a text part before +// any step-start) the ref is DROPPED — payload_refs.op_id is NOT NULL, so a ref +// with no op would FK-roll-back the ingest batch (mirrors codex's discipline). 
+func (m *sessionMapper) emitTextPayload(tc *turnContext, p partRow) []canonical.Event { + if tc.llmOpSeq == 0 { + return nil + } + return []canonical.Event{m.payloadRef(m.nextBase(m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (text)")), tc.turnSeq, tc.llmOpSeq, "llm_response", "text", p.ID, "text", -1)} +} + +// recordPatch handles a patch part (adapter-opencode.md §"Per-table emit rules": +// patch is NOT an op; record file-change info in the surrounding LLM op's extras +// for the "Files changed" UI tab). The info is stashed on tc.llmExtras and +// re-emitted onto the LLM op at step-finish (closeLLMOp). When no LLM op is open +// the patch is dropped (no op to attach to); this is rare (patch always follows a +// step-start in practice). Returns no events of its own. +func (m *sessionMapper) recordPatch(tc *turnContext, data partData) []canonical.Event { + if tc.llmOpSeq == 0 || tc.llmExtras == nil { + return nil + } + if data.Hash != "" { + tc.llmExtras["patch_hash"] = data.Hash + } + if len(data.Files) > 0 { + tc.llmExtras["patch_files"] = data.Files + } + return nil +} + +// emitCompactionLog handles a compaction part (adapter-opencode.md §"Per-table +// emit rules": compaction → INF LogEntry). It records the auto flag. +func (m *sessionMapper) emitCompactionLog(tc *turnContext, p partRow, data partData) []canonical.Event { + base := m.nextBase(m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (compaction)")) + return []canonical.Event{m.logEntry(base, "INF", tc.turnSeq, tc.llmOpSeq, + fmt.Sprintf("session compacted (auto=%t)", data.Auto), + map[string]any{"auto": data.Auto})} +} + +// emitRetryLog handles a retry part (adapter-opencode.md §"Per-table emit rules": +// retry → WRN LogEntry message `API retry attempt <attempt>: <error.name>`). It records +// the attempt number AND the triggering error's name (opencode's RetryPart carries +// an `error: ApiError` whose `name` classifies the failure — SOW-0005 round-6 P3-1).
+// When the error name is absent (older/forward-compat retry part), the message and +// extras omit it so an empty `: ` suffix never leaks. +func (m *sessionMapper) emitRetryLog(tc *turnContext, p partRow, data partData) []canonical.Event { + base := m.nextBase(m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (retry)")) + msg := fmt.Sprintf("API retry attempt %d", data.Attempt) + extras := map[string]any{"attempt": data.Attempt} + if data.Error.Name != "" { + msg += ": " + data.Error.Name + extras["error.name"] = data.Error.Name + } + return []canonical.Event{m.logEntry(base, "WRN", tc.turnSeq, tc.llmOpSeq, msg, extras)} +} + +// emitFileLog handles a file part (adapter-opencode.md §"Per-table emit rules": +// file → INF LogEntry). SOW-0005 round-4 P2-3: a file part is a user file +// ATTACHMENT, not an op-scoped payload artifact. The canonical PayloadRefEvent +// PayloadKind set (internal/canonical/events.go) is exactly +// llm_request|llm_response|llm_sdk_request|llm_sdk_response|llm_reasoning| +// tool_request|tool_response|log — none of which is a user file attachment — so +// emitting a "user_attachment" PayloadRef violated the canonical contract. Instead +// the attachment is surfaced as an INF LogEntry carrying filename/url/mime in its +// extras, scoped to the turn and (when open) the LLM op — mirroring how +// compaction/retry parts emit a LogEntry. A richer canonical attachment +// PayloadKind is deferred to a follow-up SOW. Unlike the dropped PayloadRef path, +// this is NOT gated on an open LLM op (a LogEntry's OpSeq may be 0): a file +// attachment before any step-start is still surfaced, turn-scoped, op 0. A part +// with no url/filename/mime at all emits nothing (no attachment to record). 
+func (m *sessionMapper) emitFileLog(tc *turnContext, p partRow, data partData) []canonical.Event { + if data.URL == "" && data.Filename == "" && data.MIME == "" { + return nil + } + extras := map[string]any{} + putStr(extras, "filename", data.Filename) + putStr(extras, "url", data.URL) + putStr(extras, "mime", data.MIME) + base := m.nextBase(m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (file)")) + return []canonical.Event{m.logEntry(base, "INF", tc.turnSeq, tc.llmOpSeq, "file attachment", extras)} +} diff --git a/internal/adapters/opencode/mapper_ops.go b/internal/adapters/opencode/mapper_ops.go new file mode 100644 index 0000000..f1dfb42 --- /dev/null +++ b/internal/adapters/opencode/mapper_ops.go @@ -0,0 +1,338 @@ +package opencode + +import ( + "encoding/json" + "fmt" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the per-part OP EMITTERS the part walker (mapper_parts.go) +// delegates to: the LLM op (step-start/step-finish), reasoning, tool (+ the +// task→session op, AC#4), and the non-op text/patch/file/compaction/retry +// emitters. The pure tool helpers live in mapper_tools.go; the token math +// (computeStepDeltas, AC#3), turn finalizer (SOW decision #4), provider +// canonicalization (AC#7), and the PayloadRef opencode-sqlite:// seam (chunk D) +// live in mapper_turn.go. + +// --- LLM op (step-start / step-finish) ---------------------------------------- + +// openLLMOp handles a step-start part: it opens a new LLM op (adapter- +// opencode.md §"Per-table emit rules": step-start → open LLM Op, name=modelID, +// provider=providerID from the parent message). The op stays open until the +// next step-finish closes it (closeLLMOp). Model/Provider/ProviderAlias come +// from the assistant message: ProviderAlias is data.providerID verbatim; Provider +// is the best-effort canonical mapping (default = alias) so the catalog seeds a +// provider row (catalog.go seeds only when Provider != "") (AC#7). 
+// +// Force-close (adapter-opencode.md §"Edge Cases" #5): if the PREVIOUS LLM op is +// still open (a step-start with no intervening step-finish), it is force-closed +// with Status="cancelled" and a synthetic EndTs = THIS step-start's start ts, +// emitted BEFORE the new OpStarted so the prior op is finalized in order. An op +// still open at TURN end stays running (no finalize) per Edge #4 — only a NEW +// step-start triggers the cancel. +func (m *sessionMapper) openLLMOp(tc *turnContext, msg *messageData, p partRow, _ partData) []canonical.Event { + startUs := m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (step-start)") + out := make([]canonical.Event, 0, 2) + if tc.llmOpOpen { + out = append(out, m.cancelOpenLLMOp(tc, startUs)) + } + + tc.opSeq++ + tc.llmOpSeq = tc.opSeq + tc.llmOpOpen = true + tc.llmStartUs = startUs + tc.llmExtras = map[string]any{} + alias := msg.ProviderID + // Snapshot the op identity so the patch-enrichment re-emit (closeLLMOp) is + // self-contained and survives the writer's unconditional ops.name update (P2-D). + tc.llmName = msg.ModelID + tc.llmModel = msg.ModelID + tc.llmProvider = canonicalProvider(alias) + tc.llmProviderAlias = alias + out = append(out, canonical.OpStartedEvent{ + EventBase: m.nextBase(startUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: tc.llmOpSeq, + ParentOpSeq: -1, + Kind: canonical.OpLLM, + Name: tc.llmName, + Model: tc.llmModel, + Provider: tc.llmProvider, + ProviderAlias: tc.llmProviderAlias, + }) + return out +} + +// cancelOpenLLMOp synthesizes the cancelled OpFinalizedEvent for a previously- +// open LLM op that a new step-start supersedes (adapter-opencode.md §"Edge Cases" +// #5). EndTs is the new step-start's start ts (nextStartUs), floored to the open +// op's start so a clock anomaly never produces end < start. 
No tokens are folded +// in — a cancelled step never finished its accounting; its step-finish (if any) +// is consumed normally by the next closeLLMOp via stepCumIdx. The caller has +// already confirmed tc.llmOpOpen. +func (m *sessionMapper) cancelOpenLLMOp(tc *turnContext, nextStartUs int64) canonical.OpFinalizedEvent { + endUs := nextStartUs + if endUs < tc.llmStartUs { + endUs = tc.llmStartUs + } + tc.llmOpOpen = false + return canonical.OpFinalizedEvent{ + EventBase: m.nextBase(endUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: tc.llmOpSeq, + Status: "cancelled", + EndTs: endUs, + } +} + +// closeLLMOp handles a step-finish part: it closes the currently-open LLM op +// with the per-step token DELTA (computed up front for the whole message; +// addressed by stepCumIdx) and the step's cost (adapter-opencode.md §"Per-table +// emit rules": step-finish → close LLM Op with per-op tokens via computeStepDeltas). +// If a patch part landed inside this step, its info was stashed in tc.llmExtras +// and is re-emitted onto the op via an idempotent OpStarted re-emit before the +// finalize (mirrors codex's enrichment re-emit; the writer upserts (turn,seq)). +// A step-finish with no open LLM op (orphan, adapter-opencode.md §"Edge Cases" +// #5) is a no-op rather than a crash. +func (m *sessionMapper) closeLLMOp(tc *turnContext, p partRow, data partData) []canonical.Event { + if tc.llmOpSeq == 0 { + // Orphan step-finish (no matching step-start). Forward-compat: nothing to + // close. The step's tokens were already folded into stepDeltas, so they + // are not lost for the turn rollup path; the op simply does not exist. + tc.stepCumIdx++ + return nil + } + delta := tc.nextStepDelta() + endUs := m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (step-finish)") + if endUs < tc.llmStartUs { + endUs = tc.llmStartUs + } + // The op is now closed; a new step-start opens a fresh one. 
Clearing this + // before any cancelled-finalize check means a normal close is never + // force-cancelled (Edge #5 fires only when the prior op was still open). + tc.llmOpOpen = false + out := make([]canonical.Event, 0, 2) + // Re-emit the LLM OpStarted carrying any accumulated patch extras so they + // reach ops.extras_json before the finalize (idempotent UPDATE on (turn,seq)). + // The re-emit carries the FULL op identity (Name/Model/Provider/ProviderAlias), + // not just Extras: the ingest writer updates ops.name UNCONDITIONALLY (model/ + // provider are COALESCE-protected, name is NOT — writer.go:587), so an + // identity-less re-emit would BLANK ops.name (SOW-0005 round-2 P2-D). + if len(tc.llmExtras) > 0 { + out = append(out, canonical.OpStartedEvent{ + EventBase: m.nextBase(tc.llmStartUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: tc.llmOpSeq, + ParentOpSeq: -1, + Kind: canonical.OpLLM, + Name: tc.llmName, + Model: tc.llmModel, + Provider: tc.llmProvider, + ProviderAlias: tc.llmProviderAlias, + Extras: tc.llmExtras, + }) + } + out = append(out, canonical.OpFinalizedEvent{ + EventBase: m.nextBase(endUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: tc.llmOpSeq, + Status: "completed", + EndTs: endUs, + TokensIn: delta.Input, + TokensOut: delta.Output, + TokensCacheRead: delta.Cache.Read, + TokensCacheWrite: delta.Cache.Write, + CostUSD: data.Cost, + // CtxUsed = input + cache.read at this step-finish (the most-recent step's + // cumulative input is the live context occupancy — adapter-opencode.md + // "ctx_used" row). Uses the CUMULATIVE value (data.Tokens), not the delta: + // context occupancy is a level, not a per-step increment. Saturating add with + // a WARN on overflow so a crafted/corrupt pair cannot wrap to a negative + // ctx_used (SOW-0005 round-3 P2-1). 
+ CtxUsed: addClampWarn(data.Tokens.Input, data.Tokens.Cache.Read, "ctx_used (tokens.input+tokens.cache.read)", m.mwarn), + }) + // The LLM op is now closed; subsequent reasoning/tool parts (until the next + // step-start) have no parent step. They still attach to the just-closed op's + // seq as ParentOpSeq so the topology stays under the LLM call that produced + // them (matches the spec's "ParentOpSeq = the step-start's seq"); a new + // step-start re-opens a fresh op. + tc.llmExtras = map[string]any{} + return out +} + +// nextStepDelta returns the per-step token delta for the next step-finish in +// order, advancing stepCumIdx. Out-of-range (more step-finishes than precomputed +// deltas, which cannot happen for a well-formed message) yields a zero delta. +func (tc *turnContext) nextStepDelta() tokenCounts { + if tc.stepCumIdx < 0 || tc.stepCumIdx >= len(tc.stepDeltas) { + tc.stepCumIdx++ + return tokenCounts{} + } + d := tc.stepDeltas[tc.stepCumIdx] + tc.stepCumIdx++ + return d +} + +// computeStepDeltas (the cumulative→delta token math) and nonNeg / jsonTrimBytes +// live in mapper_turn.go alongside the turn finalizer that also uses them. + +// --- reasoning op ------------------------------------------------------------- + +// emitReasoningOp handles a reasoning part (adapter-opencode.md §"Per-table emit +// rules": reasoning → reasoning Op, ParentOpSeq=current LLM Op). ReasoningKind is +// raw by default (the part is the model's raw chain-of-thought text) and summary +// when data.metadata.summary is truthy (spec firming). A missing time.end leaves +// the op running (no finalize); the reasoning body (data.text) is referenced as +// an llm_reasoning PayloadRef, never inlined. 
+func (m *sessionMapper) emitReasoningOp(tc *turnContext, p partRow, data partData) []canonical.Event { + tc.opSeq++ + seq := tc.opSeq + startUs := m.msToMicrosWarn(data.Time.Start, "part.data.time.start (reasoning)") + if startUs == 0 { + startUs = m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (reasoning)") + } + out := make([]canonical.Event, 0, 3) + out = append(out, canonical.OpStartedEvent{ + EventBase: m.nextBase(startUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: seq, + ParentOpSeq: tc.parentSeq(), + Kind: canonical.OpReasoning, + ReasoningKind: reasoningKind(p.Data), + }) + // Body → llm_reasoning PayloadRef on the reasoning op (field=text). + if data.Text != "" { + out = append(out, m.payloadRef(m.nextBase(startUs), tc.turnSeq, seq, "llm_reasoning", "text", p.ID, "text", int64(len(data.Text)))) + } + if data.Time.End != nil { + endUs := m.msToMicrosWarn(*data.Time.End, "part.data.time.end (reasoning)") + if endUs < startUs { + endUs = startUs + } + out = append(out, canonical.OpFinalizedEvent{ + EventBase: m.nextBase(endUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: seq, + Status: "completed", + EndTs: endUs, + }) + } + return out +} + +// reasoningKind classifies a reasoning part as summary or raw (canonical-events +// .md:202). opencode carries no native discriminator, so the default is raw; a +// truthy data.metadata.summary flips it to summary (spec firming). The reasoning +// metadata is not a typed field on the chunk-A partData struct (which decodes +// only fields earlier chunks needed), so it is decoded locally from the raw part +// body here — keeping the new mapper-only concern out of the chunk-A types. 
+func reasoningKind(raw []byte) string { + var d struct { + Metadata struct { + Summary bool `json:"summary"` + } `json:"metadata"` + } + if json.Unmarshal(raw, &d) == nil && d.Metadata.Summary { + return "summary" + } + return "raw" +} + +// --- tool op (+ task→session op, AC#4) ---------------------------------------- + +// emitToolOp handles a tool part (adapter-opencode.md §"Per-table emit rules": +// tool → tool Op, namespace derived, status from state.status; tool='task' with +// state.metadata.sessionId → ALSO a session Op as the topology parent, AC#4). +// Start/end come from state.time; a missing end (running/pending) leaves the op +// running (no finalize). The output body (completed/error) is referenced as a +// tool_response PayloadRef. The session op is emitted FIRST so it becomes the +// topology parent the sub-agent attaches under. +func (m *sessionMapper) emitToolOp(tc *turnContext, p partRow, data partData) []canonical.Event { + out := make([]canonical.Event, 0, 4) + + // task→session op (AC#4): emit the session op first so it is the topology + // parent. The tool op follows so the turn still records the task invocation. + childID, metaMalformed := taskChildSessionID(data) + if metaMalformed { + // Present-but-unparseable task metadata: a possible sub-agent linkage was + // dropped. Surface it (SOW-0005 P2.6) rather than silently losing the edge; + // the tool op below is still emitted so the task invocation is recorded. 
+ m.mwarn(fmt.Errorf("opencode: malformed task metadata (table=part id=%s field=state.metadata); sub-agent linkage omitted", p.ID)) + } + if childID != "" { + tc.opSeq++ + sessSeq := tc.opSeq + out = append(out, canonical.OpStartedEvent{ + EventBase: m.nextBase(m.toolStartUs(data, p)), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: sessSeq, + ParentOpSeq: tc.parentSeq(), + Kind: canonical.OpSession, + ChildSessionNativeID: childID, + }) + } + + tc.opSeq++ + seq := tc.opSeq + name, namespace := toolNameNamespace(data.Tool) + startUs := m.toolStartUs(data, p) + out = append(out, canonical.OpStartedEvent{ + EventBase: m.nextBase(startUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: seq, + ParentOpSeq: tc.parentSeq(), + Kind: canonical.OpTool, + Name: name, + ToolNamespace: namespace, + }) + + status, endPtr, errMsg, hasOutput := toolTerminal(data) + if hasOutput { + out = append(out, m.payloadRef(m.nextBase(startUs), tc.turnSeq, seq, "tool_response", "json", p.ID, "state.output", -1)) + } + if endPtr != nil { + endUs := m.msToMicrosWarn(*endPtr, "part.data.state.time.end (tool)") + if endUs < startUs { + endUs = startUs + } + // A failed tool op carries an ErrorClass label alongside the opencode detail + // (state.error → ErrorMessage). opencode's tool error is a bare string with + // no class, so the class is the generic defaultErrorClass (SOW-0005 round-2 + // P1-C). A non-failed status carries neither. 
+ var errClass string + if status == "failed" { + errClass = defaultErrorClass + } + out = append(out, canonical.OpFinalizedEvent{ + EventBase: m.nextBase(endUs), + SessionNativeID: m.nativeID(), + TurnSeq: tc.turnSeq, + Seq: seq, + Status: status, + ErrorClass: errClass, + ErrorMessage: errMsg, + EndTs: endUs, + BytesIn: toolBytesIn(data), + BytesOut: toolBytesOut(data), + }) + } + return out +} + +// The pure tool helpers (toolStartUs, toolTerminal, toolBytesIn/Out, +// taskChildSessionID, toolNameNamespace) live in mapper_tools.go. + +// The non-op part emitters (emitTextPayload, recordPatch, emitCompactionLog, +// emitRetryLog, emitFileLog) live in mapper_emitters.go; finalizeTurn, the +// cumulative→delta token math, provider canonicalization, the turnContext +// op-parent helper, and the PayloadRef URI seam live in mapper_turn.go (split out +// to keep each file ≤400 lines). diff --git a/internal/adapters/opencode/mapper_parts.go b/internal/adapters/opencode/mapper_parts.go new file mode 100644 index 0000000..3f32051 --- /dev/null +++ b/internal/adapters/opencode/mapper_parts.go @@ -0,0 +1,265 @@ +package opencode + +import ( + "fmt" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file walks one message's parts and synthesizes the turn + its ops. The +// session driver (mapper.go) calls mapMessage once per message in input order; +// the op emitters it delegates to live in mapper_ops.go. The part dispatch +// table is adapter-opencode.md §"Per-table emit rules" (the part-type table): +// step-start/step-finish bound an LLM op; reasoning/tool are nested ops; +// text/patch are not ops (text → PayloadRef on the LLM op; patch → LLM-op +// extras); compaction → INF log; retry → WRN log; file → INF log (an attachment, +// round-4 P2-3); an unknown $.type is forward-compat data skipped with one WARN. 
+ +// turnContext holds the per-turn inference state mapMessage threads while +// walking parts: the canonical turn Seq, the running op-seq counter, the +// most recent LLM op (parent for reasoning/tool ops and the attach point for +// text PayloadRefs and patch extras), and the per-step token deltas precomputed +// from the message's cumulative step-finish snapshots (via computeStepDeltas). +type turnContext struct { + turnSeq int + // opSeq is the 1-based op counter within this turn; it increments for EVERY + // op the mapper emits (LLM, reasoning, tool, session), so ParentOpSeq always + // names a real, already-emitted op. text/patch do not consume a seq (they + // are not ops). + opSeq int + + // llmOpSeq is the Seq of the MOST RECENT LLM op, whether still open or + // already closed by a step-finish. It is the ParentOpSeq for reasoning/tool + // ops and the OpSeq for text PayloadRefs and patch extras within (or after) + // the step (adapter-opencode.md "Op seq numbering within a turn"). It stays + // set after a step-finish so a trailing reasoning/tool/text still nests under + // the LLM call that produced it; llmOpOpen reports whether it is OPEN. + llmOpSeq int + // llmOpOpen reports whether the most-recent LLM op (llmOpSeq) is still OPEN + // (a step-start with no intervening step-finish). It distinguishes the + // force-close case (a new step-start while the prior op is still open → + // emit a cancelled finalize, adapter-opencode.md §"Edge Cases" #5) from the + // normal case (a new step-start after the prior op already closed). Reset to + // false by closeLLMOp and cancelOpenLLMOp. + llmOpOpen bool + // llmStartUs is the open LLM op's start timestamp (µs), carried so the + // step-finish that closes it can supply a sane EndTs floor. 
+ llmStartUs int64 + // llmExtras accumulates patch info (and any future non-op step metadata) to + // be re-emitted onto the LLM op at step-finish via an idempotent OpStarted + // re-emit (mirrors codex's enrichment re-emit; the writer upserts (turn,seq)). + llmExtras map[string]any + // llmName / llmModel / llmProvider / llmProviderAlias snapshot the open LLM op's + // identity (from the assistant message) so the patch-enrichment OpStarted + // re-emit in closeLLMOp is SELF-CONTAINED. The ingest writer updates ops.name + // UNCONDITIONALLY (model/provider are COALESCE-protected, name is NOT — writer.go + // :587), so a re-emit that omitted Name would BLANK ops.name. Carrying the full + // identity makes the re-emit survive the unconditional update (SOW-0005 round-2 + // P2-D). + llmName string + llmModel string + llmProvider string + llmProviderAlias string + + // stepCumIdx is the index of the NEXT step-finish within the message, used + // to address the precomputed per-step deltas. The deltas are computed once + // for the whole message up front (computeStepDeltas) because a step-finish's + // tokens are cumulative within the message (AC#3); walking sequentially and + // indexing keeps the emit loop simple. + stepCumIdx int + // stepDeltas is the precomputed per-step token delta sequence for this + // message's step-finish parts, in order. + stepDeltas []tokenCounts +} + +// mapMessage projects one message (assistant or user) plus its parts onto +// canonical events. A user message anchors the following assistant turn but +// emits nothing of its own. A malformed/empty message body is skipped with one +// WRN log (forward-compat; the column is NOT NULL so an empty body is a +// corruption signal — types.go errEmptyData). The assistant turn is opened, +// its parts walked in order, and the turn finalized with the message-level +// per-turn token delta + cost (SOW decision #4). 
+func (m *sessionMapper) mapMessage(mwp messageWithParts) ([]canonical.Event, error) { + data, err := decodeMessageData(mwp.Message.Data) + if err != nil { + // One structured WRN session LogEntry (detail view) AND route through onError + // so /api/health degrades: message.data is NOT-NULL, so an undecodable blob is + // a corruption signal, not benign forward-compat drift (SOW-0005 round-3 P2-2). + // The row is skipped, not aborted (adapter-opencode.md §"Edge Cases" #1). + m.mwarn(fmt.Errorf("opencode: undecodable message.data (table=message id=%s); row skipped: %w", mwp.Message.ID, err)) + base := m.nextBase(m.msToMicrosWarn(mwp.Message.TimeCreatedMs, "message.time_created (undecodable)")) + return []canonical.Event{m.logEntry(base, "WRN", 0, 0, + "message data undecodable: "+err.Error(), + map[string]any{"message_id": mwp.Message.ID})}, nil + } + + switch data.role() { + case roleAssistant: + return m.mapAssistantTurn(mwp, data) + case roleUser: + // User messages are turn anchors only; opencode pairs user→assistant and + // the assistant message IS the turn (adapter-opencode.md §"Turn + // synthesis"). Nothing to emit. + return nil, nil + default: + // Unknown role: forward-compat skip with one WRN (types.go roleUnknown). + base := m.nextBase(m.msToMicrosWarn(mwp.Message.TimeCreatedMs, "message.time_created (unknown role)")) + return []canonical.Event{m.logEntry(base, "WRN", 0, 0, + fmt.Sprintf("unknown message role %q", data.Role), + map[string]any{"message_id": mwp.Message.ID})}, nil + } +} + +// mapAssistantTurn opens a turn for an assistant message, walks its parts, and +// finalizes the turn. Turn Seq is the assistant-message order (1-based). It also +// records the failed-terminal signal: when this message carries data.error, the +// session's terminal becomes failed (the LAST such message wins because messages +// are walked in order — adapter-opencode.md §"Per-table emit rules"). 
+func (m *sessionMapper) mapAssistantTurn(mwp messageWithParts, data messageData) ([]canonical.Event, error) { + m.turnSeq++ + tc := &turnContext{ + turnSeq: m.turnSeq, + stepDeltas: computeStepDeltas(stepFinishTokens(mwp.Parts), m.mwarn), + } + startUs := m.msToMicrosWarn(mwp.Message.TimeCreatedMs, "message.time_created (turn start)") + out := make([]canonical.Event, 0, 4+2*len(mwp.Parts)) + + out = append(out, canonical.TurnStartedEvent{ + EventBase: m.nextBase(startUs), + SessionNativeID: m.nativeID(), + Seq: tc.turnSeq, + }) + + hasStepFinish := false + for i := range mwp.Parts { + evs, err := m.mapPart(tc, &data, mwp.Parts[i]) + if err != nil { + return nil, err + } + if mwp.Parts[i].isStepFinish() { + hasStepFinish = true + } + out = append(out, evs...) + } + + // A step-start still OPEN at turn end (no step-finish closing it) stays a + // RUNNING LLM op with no finalize (adapter-opencode.md §"Edge Cases" #4/#5: + // orphan step-start is a running LLM op). It is the within-message force-close + // (a NEW step-start arriving) that synthesizes a cancelled finalize — handled + // in openLLMOp; the trailing open op is intentionally left running here. + + // Finalize the turn ONLY when it is terminal (adapter-opencode.md §"Per-table + // emit rules": data.time.completed set, OR data.error, OR ≥1 step-finish + // part). opencode writes the assistant message row live while the turn is + // still running; a non-terminal turn stays RUNNING (TurnStarted with no + // TurnFinalized) and a later poll re-emits + finalizes it once it completes + // (idempotent). Without this gate every live row would be wrongly finalized. + if turnIsTerminal(&data, hasStepFinish) { + out = append(out, m.finalizeTurn(tc, &data, mwp.Message)) + } + + // Track the session's failed-terminal signal as the LAST assistant turn's + // terminal error, NOT a sticky OR (SOW-0005 round-2 P1-B). 
Messages are walked + // in order, so: if THIS turn carries an error, record it (error PRESENCE, not a + // non-empty name — P2-A); if it does NOT, CLEAR any previously-tracked error + // (a later turn recovered, so the session did not end failed). sessionFinalized + // then reflects only the last turn's state — a session that errored on turn 3 + // but succeeded on turn 5 is NOT marked failed. + if data.Error != nil { + m.failError = data.Error + m.failEndUs = m.turnEndUs(&data, mwp.Message) + } else { + m.failError = nil + m.failEndUs = 0 + } + return out, nil +} + +// mapPart dispatches one part to its emitter per the part-type table (adapter- +// opencode.md §"Per-table emit rules"). Returns the events for that part, +// advancing tc's op/LLM/step state. An unknown $.type is skipped with one WRN. +func (m *sessionMapper) mapPart(tc *turnContext, msg *messageData, p partRow) ([]canonical.Event, error) { + data, err := decodePartData(p.Data) + if err != nil { + // Session LogEntry (detail view) AND onError (health): part.data is NOT-NULL, + // so an undecodable blob is corruption that must degrade /api/health, not just + // a per-session WRN (SOW-0005 round-3 P2-2). The part is skipped, not aborted. 
+ m.mwarn(fmt.Errorf("opencode: undecodable part.data (table=part id=%s); part skipped: %w", p.ID, err)) + base := m.nextBase(0) + return []canonical.Event{m.logEntry(base, "WRN", tc.turnSeq, tc.llmOpSeq, + "part data undecodable: "+err.Error(), + map[string]any{"part_id": p.ID})}, nil + } + + switch data.kind() { + case partStepStart: + return m.openLLMOp(tc, msg, p, data), nil + case partStepFinish: + return m.closeLLMOp(tc, p, data), nil + case partReasoning: + return m.emitReasoningOp(tc, p, data), nil + case partTool: + return m.emitToolOp(tc, p, data), nil + case partText: + return m.emitTextPayload(tc, p), nil + case partPatch: + return m.recordPatch(tc, data), nil + case partCompaction: + return m.emitCompactionLog(tc, p, data), nil + case partRetry: + return m.emitRetryLog(tc, p, data), nil + case partFile: + return m.emitFileLog(tc, p, data), nil + case partSnapshot, partSubtask, partAgent: + // Known-but-not-an-op part types observed as 0-count on the live DB + // (adapter-opencode.md §"part" distribution). They carry no op/payload + // the mapper materializes in v1; recorded as no-ops here (NOT a WARN — + // they are known, just unused). A future SOW may surface subtask as a + // session op once the part type is populated. + return nil, nil + default: + // Unknown $.type: forward-compatibility data, skipped with one WRN + // (types.go partUnknown; adapter-opencode.md §"Edge Cases" #1). + base := m.nextBase(0) + return []canonical.Event{m.logEntry(base, "WRN", tc.turnSeq, tc.llmOpSeq, + fmt.Sprintf("unknown part type %q", data.RawType), + map[string]any{"part_id": p.ID})}, nil + } +} + +// stepFinishTokens extracts the ordered cumulative token snapshots from a +// message's step-finish parts, in part order. The result feeds computeStepDeltas +// so per-op tokens are the deltas between successive cumulative values (AC#3). +// A part that fails to decode is skipped (it cannot contribute a snapshot); the +// mapPart walk surfaces its WRN separately. 
+func stepFinishTokens(parts []partRow) []tokenCounts { + var out []tokenCounts + for i := range parts { + d, err := decodePartData(parts[i].Data) + if err != nil { + continue + } + if d.kind() == partStepFinish { + out = append(out, d.Tokens) + } + } + return out +} + +// logEntry builds a LogEntryEvent attached to the session and the given turn/op +// scope (0 when not turn/op-scoped). Source is the adapter Format. +func (m *sessionMapper) logEntry(base canonical.EventBase, severity string, turnSeq, opSeq int, message string, extras map[string]any) canonical.LogEntryEvent { + if extras == nil { + extras = map[string]any{} + } + return canonical.LogEntryEvent{ + EventBase: base, + SessionNativeID: m.nativeID(), + TurnSeq: turnSeq, + OpSeq: opSeq, + Severity: severity, + Source: Format, + Message: message, + Extras: extras, + } +} diff --git a/internal/adapters/opencode/mapper_test.go b/internal/adapters/opencode/mapper_test.go new file mode 100644 index 0000000..283ebfb --- /dev/null +++ b/internal/adapters/opencode/mapper_test.go @@ -0,0 +1,1196 @@ +package opencode + +import ( + "encoding/json" + "strings" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file is the executable contract for the pure row→event mapper (SOW-0005 +// chunk B). Every test feeds SYNTHETIC typed rows (sessionRow / messageRow / +// partRow built by the helpers below) directly to mapSession and asserts the +// exact emitted canonical event stream. No DB, no operator data, no AI-vendor +// names — only schema-shaped synthetic values (adapter-opencode.md §"Sensitive +// content"; SOW-0005 R5). + +const testSourceID = "opencode:/test/opencode.db" + +// canonicalPayloadKinds is the EXACT canonical PayloadRefEvent.PayloadKind set +// (internal/canonical/events.go:323-326). The adapter must never emit a kind +// outside this set (SOW-0005 round-4 P2-3 removed the non-canonical +// "user_attachment"); tests assert every emitted PayloadRef's kind is a member. 
+var canonicalPayloadKinds = map[string]bool{ + "llm_request": true, + "llm_response": true, + "llm_sdk_request": true, + "llm_sdk_response": true, + "llm_reasoning": true, + "tool_request": true, + "tool_response": true, + "log": true, +} + +// --- synthetic-row builders --------------------------------------------------- + +// asgMsg builds an assistant messageRow with the given id/time and a data body +// carrying providerID/modelID/tokens/cost/finish, an optional completed ms, and +// an optional error name. The tokens object is the TURN ROLLUP (cumulative across +// the session per the SOW decision); per-op tokens are derived from the +// cumulative counts on the step-finish parts (see stepFinish). +func asgMsg(id string, createdMs int64, completedMs *int64, provider, model string, tok tokenCounts, cost float64, finish string, errName string) messageRow { + d := map[string]any{ + "role": "assistant", + "providerID": provider, + "modelID": model, + "agent": "test-agent", + "cost": cost, + "tokens": tok, + "time": map[string]any{"created": createdMs}, + "finish": finish, + } + if completedMs != nil { + d["time"] = map[string]any{"created": createdMs, "completed": *completedMs} + } + if errName != "" { + d["error"] = map[string]any{"name": errName} + } + raw, _ := json.Marshal(d) + return messageRow{ID: id, SessionID: "ses_x", TimeCreatedMs: createdMs, TimeUpdatedMs: createdMs, Data: raw} +} + +// usrMsg builds a user messageRow (the mapper emits no turn for it; it only +// anchors the assistant turn that follows). +func usrMsg(id string, createdMs int64) messageRow { + raw, _ := json.Marshal(map[string]any{"role": "user", "time": map[string]any{"created": createdMs}}) + return messageRow{ID: id, SessionID: "ses_x", TimeCreatedMs: createdMs, TimeUpdatedMs: createdMs, Data: raw} +} + +// stepStart / stepFinish build the LLM-op delimiter parts. stepFinish carries +// CUMULATIVE token counts within the message (the mapper converts them to per-step deltas). 
+func stepStart(id string) partRow { + raw, _ := json.Marshal(map[string]any{"type": "step-start"}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func stepFinish(id string, inCum, outCum, reasonCum, cacheRdCum, cacheWrCum int64, cost float64) partRow { + raw, _ := json.Marshal(map[string]any{ + "type": "step-finish", + "reason": "stop", + "cost": cost, + "tokens": map[string]any{ + "input": inCum, + "output": outCum, + "reasoning": reasonCum, + "cache": map[string]any{"read": cacheRdCum, "write": cacheWrCum}, + }, + }) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +// toolPart builds a tool partRow with the given tool name and state status. +func toolPart(id, tool, status string, startMs int64, endMs *int64, metadata map[string]any) partRow { + state := map[string]any{ + "status": status, + "input": map[string]any{"q": "x"}, + "output": "result-bytes", + "time": map[string]any{"start": startMs}, + } + if endMs != nil { + state["time"] = map[string]any{"start": startMs, "end": *endMs} + } + if metadata != nil { + state["metadata"] = metadata + } + raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "call_" + id, "tool": tool, "state": state}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +// reasoningPart builds a reasoning partRow. summary=true sets metadata.summary. 
+func reasoningPart(id string, startMs int64, endMs *int64, summary bool) partRow { + d := map[string]any{"type": "reasoning", "text": "thinking", "time": map[string]any{"start": startMs}} + if endMs != nil { + d["time"] = map[string]any{"start": startMs, "end": *endMs} + } + if summary { + d["metadata"] = map[string]any{"summary": true} + } + raw, _ := json.Marshal(d) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func textPart(id string) partRow { + raw, _ := json.Marshal(map[string]any{"type": "text", "text": "final answer"}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func patchPart(id string) partRow { + raw, _ := json.Marshal(map[string]any{"type": "patch", "hash": "deadbeef", "files": []string{"/x/a.go", "/x/b.go"}}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func compactionPart(id string, auto bool) partRow { + raw, _ := json.Marshal(map[string]any{"type": "compaction", "auto": auto}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func retryPart(id string, attempt int) partRow { + raw, _ := json.Marshal(map[string]any{"type": "retry", "attempt": attempt, "error": map[string]any{"name": "RateLimit"}}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func filePart(id, url string) partRow { + raw, _ := json.Marshal(map[string]any{"type": "file", "mime": "image/png", "filename": "x.png", "url": url}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +func unknownPart(id string) partRow { + raw, _ := json.Marshal(map[string]any{"type": "future-thing", "foo": 1}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw} +} + +// rootSession builds a root sessionRow with the given id and archive timestamp (archivedMs; 0 means not archived). 
+func rootSession(id string, archivedMs int64) sessionRow {
+	model, _ := json.Marshal(map[string]any{"id": "the-model", "providerID": "the-alias"})
+	return sessionRow{
+		ID: id, ProjectID: "prj_1", Slug: "test-slug", Directory: "/work/dir",
+		Title: "Test Title", Version: "9.9.9", Agent: "test-agent", Model: model,
+		TimeCreatedMs: 1000, TimeUpdatedMs: 1000, TimeArchivedMs: archivedMs,
+	}
+}
+
+// run drives mapSession with a fresh mapper and returns the emitted stream.
+func run(t *testing.T, s sessionRow, msgs []messageWithParts) []canonical.Event {
+	t.Helper()
+	evs, err := mapSession(testSourceID, s, msgs)
+	if err != nil {
+		t.Fatalf("mapSession: %v", err)
+	}
+	return evs
+}
+
+// mwp pairs a message with its ordered parts (the unit mapSession consumes).
+func mwp(m messageRow, parts ...partRow) messageWithParts {
+	return messageWithParts{Message: m, Parts: parts}
+}
+
+// --- assertion helpers (mirror codex/mapper_helpers_test.go) ------------------
+
+func countKind(events []canonical.Event, kind canonical.EventKind) int {
+	n := 0
+	for _, ev := range events {
+		if ev.EventKind() == kind {
+			n++
+		}
+	}
+	return n
+}
+
+func opStarts(events []canonical.Event) []canonical.OpStartedEvent {
+	var out []canonical.OpStartedEvent
+	for _, ev := range events {
+		if s, ok := ev.(canonical.OpStartedEvent); ok {
+			out = append(out, s)
+		}
+	}
+	return out
+}
+
+func opFinals(events []canonical.Event) []canonical.OpFinalizedEvent {
+	var out []canonical.OpFinalizedEvent
+	for _, ev := range events {
+		if f, ok := ev.(canonical.OpFinalizedEvent); ok {
+			out = append(out, f)
+		}
+	}
+	return out
+}
+
+func turnFinals(events []canonical.Event) []canonical.TurnFinalizedEvent {
+	var out []canonical.TurnFinalizedEvent
+	for _, ev := range events {
+		if f, ok := ev.(canonical.TurnFinalizedEvent); ok {
+			out = append(out, f)
+		}
+	}
+	return out
+}
+
+func firstStarted(t *testing.T, events []canonical.Event) canonical.SessionStartedEvent {
+	t.Helper()
+	for _, ev := range events {
+		if s, ok := ev.(canonical.SessionStartedEvent); ok {
+			return s
+		}
+	}
+	t.Fatal("no SessionStartedEvent in stream")
+	return canonical.SessionStartedEvent{}
+}
+
+func llmOps(events []canonical.Event) []canonical.OpStartedEvent {
+	var out []canonical.OpStartedEvent
+	for _, s := range opStarts(events) {
+		if s.Kind == canonical.OpLLM {
+			out = append(out, s)
+		}
+	}
+	return out
+}
+
+func toolOps(events []canonical.Event) []canonical.OpStartedEvent {
+	var out []canonical.OpStartedEvent
+	for _, s := range opStarts(events) {
+		if s.Kind == canonical.OpTool {
+			out = append(out, s)
+		}
+	}
+	return out
+}
+
+// --- SessionStarted + terminal status ----------------------------------------
+
+func TestMapSession_RootSessionStarted(t *testing.T) {
+	s := rootSession("ses_root", 0)
+	evs := run(t, s, nil)
+	st := firstStarted(t, evs)
+	if st.NativeID != "ses_root" {
+		t.Fatalf("NativeID = %q, want ses_root", st.NativeID)
+	}
+	if st.Kind != canonical.KindRoot {
+		t.Fatalf("Kind = %q, want root", st.Kind)
+	}
+	if st.RootNativeID != "ses_root" {
+		t.Fatalf("RootNativeID = %q, want ses_root (self)", st.RootNativeID)
+	}
+	if st.ParentNativeID != "" {
+		t.Fatalf("ParentNativeID = %q, want empty", st.ParentNativeID)
+	}
+	if st.AgentName != "test-agent" {
+		t.Fatalf("AgentName = %q, want test-agent", st.AgentName)
+	}
+	if st.Model != "the-model" {
+		t.Fatalf("Model = %q, want the-model (from session.model $.id)", st.Model)
+	}
+	if st.Cwd != "/work/dir" {
+		t.Fatalf("Cwd = %q, want /work/dir", st.Cwd)
+	}
+	// Ts is ms→µs.
+	if st.Ts != 1000*1000 {
+		t.Fatalf("Ts = %d, want %d (ms→µs)", st.Ts, 1000*1000)
+	}
+	// Extras carry providerID/version/slug/title/project_id/directory.
+	for _, k := range []string{"providerID", "version", "slug", "title", "project_id", "directory"} {
+		if _, ok := st.Extras[k]; !ok {
+			t.Fatalf("Extras missing %q: %v", k, st.Extras)
+		}
+	}
+	// Running session (no archive, no error) => NO SessionFinalized.
+	if n := countKind(evs, canonical.EvSessionFinalized); n != 0 {
+		t.Fatalf("SessionFinalized count = %d, want 0 (running)", n)
+	}
+}
+
+func TestMapSession_SubAgentLinkage(t *testing.T) {
+	s := rootSession("ses_child", 0)
+	s.ParentID = "ses_parent"
+	evs := run(t, s, nil)
+	st := firstStarted(t, evs)
+	if st.Kind != canonical.KindSubAgent {
+		t.Fatalf("Kind = %q, want sub_agent", st.Kind)
+	}
+	if st.ParentNativeID != "ses_parent" {
+		t.Fatalf("ParentNativeID = %q, want ses_parent", st.ParentNativeID)
+	}
+	if st.RootNativeID != "ses_parent" {
+		t.Fatalf("RootNativeID = %q, want ses_parent", st.RootNativeID)
+	}
+}
+
+func TestMapSession_TerminalArchivedCompleted(t *testing.T) {
+	s := rootSession("ses_arch", 5000)
+	evs := run(t, s, nil)
+	fins := finalizes(evs)
+	if len(fins) != 1 {
+		t.Fatalf("SessionFinalized count = %d, want 1", len(fins))
+	}
+	if fins[0].Status != canonical.StatusCompleted {
+		t.Fatalf("Status = %q, want completed", fins[0].Status)
+	}
+	if fins[0].EndTs != 5000*1000 {
+		t.Fatalf("EndTs = %d, want %d (archived ms→µs)", fins[0].EndTs, 5000*1000)
+	}
+}
+
+func TestMapSession_TerminalFailedFromError(t *testing.T) {
+	s := rootSession("ses_fail", 0)
+	completed := int64(2000)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, &completed, "the-alias", "the-model", tokenCounts{Input: 10}, 0.1, "error", "ProviderError")),
+	}
+	evs := run(t, s, msgs)
+	fins := finalizes(evs)
+	if len(fins) != 1 {
+		t.Fatalf("SessionFinalized count = %d, want 1", len(fins))
+	}
+	if fins[0].Status != canonical.StatusFailed {
+		t.Fatalf("Status = %q, want failed", fins[0].Status)
+	}
+	if fins[0].ErrorClass != "ProviderError" {
+		t.Fatalf("ErrorClass = %q, want ProviderError", fins[0].ErrorClass)
+	}
+}
+
+func TestMapSession_RunningNoTerminalWhenIncomplete(t *testing.T) {
+	s := rootSession("ses_run", 0)
+	// assistant message with no completed ts and no error => session stays running.
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{Input: 10}, 0.1, "", "")),
+	}
+	evs := run(t, s, msgs)
+	if n := countKind(evs, canonical.EvSessionFinalized); n != 0 {
+		t.Fatalf("SessionFinalized count = %d, want 0 (running)", n)
+	}
+}
+
+func finalizes(events []canonical.Event) []canonical.SessionFinalizedEvent {
+	var out []canonical.SessionFinalizedEvent
+	for _, ev := range events {
+		if f, ok := ev.(canonical.SessionFinalizedEvent); ok {
+			out = append(out, f)
+		}
+	}
+	return out
+}
+
+// --- Turn synthesis + per-turn token deltas -----------------------------------
+
+func TestMapSession_UserMessageEmitsNothing(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	c := int64(3000)
+	// A user message precedes the assistant turn (opencode pairs user→assistant;
+	// the assistant message IS the turn). The user message must emit nothing and
+	// must NOT consume a turn Seq.
+	msgs := []messageWithParts{
+		mwp(usrMsg("msg_u", 1400)),
+		mwp(asgMsg("msg_a", 1500, &c, "the-alias", "the-model", tokenCounts{Input: 50, Output: 5}, 0.1, "stop", "")),
+	}
+	evs := run(t, s, msgs)
+	if n := countKind(evs, canonical.EvTurnStarted); n != 1 {
+		t.Fatalf("TurnStarted = %d want 1 (user message must not open a turn)", n)
+	}
+	tf := turnFinals(evs)
+	if len(tf) != 1 || tf[0].Seq != 1 {
+		t.Fatalf("turn finals = %+v want one turn with Seq=1", tf)
+	}
+}
+
+func TestMapSession_TurnNumberingAndTokenDeltas(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	c1 := int64(2000)
+	c2 := int64(4000)
+	// Turn 1 cumulative-at-completion tokens: in=100,out=20. Turn 2: in=260,out=55.
+	// Per-turn DELTA: turn1 = 100/20; turn2 = 160/35 (SOW decision: message-level
+	// cumulative → delta from prior assistant message).
+	t1 := tokenCounts{Input: 100, Output: 20, Cache: cacheTokens{Read: 5, Write: 1}}
+	t2 := tokenCounts{Input: 260, Output: 55, Cache: cacheTokens{Read: 30, Write: 4}}
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, &c1, "the-alias", "the-model", t1, 0.10, "stop", "")),
+		mwp(asgMsg("msg_b", 3500, &c2, "the-alias", "the-model", t2, 0.25, "stop", "")),
+	}
+	evs := run(t, s, msgs)
+	if n := countKind(evs, canonical.EvTurnStarted); n != 2 {
+		t.Fatalf("TurnStarted count = %d, want 2", n)
+	}
+	tf := turnFinals(evs)
+	if len(tf) != 2 {
+		t.Fatalf("TurnFinalized count = %d, want 2", len(tf))
+	}
+	if tf[0].Seq != 1 || tf[1].Seq != 2 {
+		t.Fatalf("turn seqs = %d,%d want 1,2", tf[0].Seq, tf[1].Seq)
+	}
+	if tf[0].TokensIn != 100 || tf[0].TokensOut != 20 {
+		t.Fatalf("turn1 tokens = %d/%d want 100/20", tf[0].TokensIn, tf[0].TokensOut)
+	}
+	if tf[1].TokensIn != 160 || tf[1].TokensOut != 35 {
+		t.Fatalf("turn2 tokens = %d/%d want 160/35 (delta)", tf[1].TokensIn, tf[1].TokensOut)
+	}
+	// Per-turn cache deltas work via TurnFinalizedEvent (SOW decision #4).
+	if tf[1].TokensCacheRead != 25 || tf[1].TokensCacheWrite != 3 {
+		t.Fatalf("turn2 cache = %d/%d want 25/3 (delta)", tf[1].TokensCacheRead, tf[1].TokensCacheWrite)
+	}
+	// Per-turn cost is the message-level cost verbatim (not a delta — cost is
+	// already per-message in opencode).
+	if tf[1].CostUSD != 0.25 {
+		t.Fatalf("turn2 cost = %v want 0.25", tf[1].CostUSD)
+	}
+	if tf[0].Status != "completed" {
+		t.Fatalf("turn1 status = %q want completed", tf[0].Status)
+	}
+}
+
+// --- computeStepDeltas (AC#3) -------------------------------------------------
+
+func TestComputeStepDeltas_AC3(t *testing.T) {
+	// AC#3: three step-finish parts with cumulative input 100,250,410 → per-op
+	// deltas 100,150,160.
+	cums := []tokenCounts{
+		{Input: 100, Output: 10, Reasoning: 1, Cache: cacheTokens{Read: 1000, Write: 5}},
+		{Input: 250, Output: 25, Reasoning: 3, Cache: cacheTokens{Read: 2500, Write: 5}},
+		{Input: 410, Output: 40, Reasoning: 6, Cache: cacheTokens{Read: 4100, Write: 5}},
+	}
+	got := computeStepDeltas(cums, nil)
+	wantIn := []int64{100, 150, 160}
+	wantOut := []int64{10, 15, 15}
+	wantReason := []int64{1, 2, 3}
+	wantCacheRd := []int64{1000, 1500, 1600}
+	wantCacheWr := []int64{5, 0, 0}
+	if len(got) != 3 {
+		t.Fatalf("deltas len = %d want 3", len(got))
+	}
+	for i := range got {
+		if got[i].Input != wantIn[i] {
+			t.Errorf("delta[%d].Input = %d want %d", i, got[i].Input, wantIn[i])
+		}
+		if got[i].Output != wantOut[i] {
+			t.Errorf("delta[%d].Output = %d want %d", i, got[i].Output, wantOut[i])
+		}
+		if got[i].Reasoning != wantReason[i] {
+			t.Errorf("delta[%d].Reasoning = %d want %d", i, got[i].Reasoning, wantReason[i])
+		}
+		if got[i].Cache.Read != wantCacheRd[i] {
+			t.Errorf("delta[%d].Cache.Read = %d want %d", i, got[i].Cache.Read, wantCacheRd[i])
+		}
+		if got[i].Cache.Write != wantCacheWr[i] {
+			t.Errorf("delta[%d].Cache.Write = %d want %d", i, got[i].Cache.Write, wantCacheWr[i])
+		}
+	}
+}
+
+func TestComputeStepDeltas_NonMonotonicClampsToZero(t *testing.T) {
+	// Defensive: a non-monotonic sequence (a reset / out-of-order observation)
+	// must never emit a negative delta. The delta clamps to 0 (spec gap #3 —
+	// reconciliation recomputes the whole message; a clamp keeps a transient
+	// observation from corrupting cost with negatives).
+	cums := []tokenCounts{
+		{Input: 300},
+		{Input: 100}, // regression
+		{Input: 150},
+	}
+	got := computeStepDeltas(cums, nil)
+	want := []int64{300, 0, 50}
+	for i := range got {
+		if got[i].Input != want[i] {
+			t.Errorf("delta[%d].Input = %d want %d", i, got[i].Input, want[i])
+		}
+	}
+}
+
+func TestComputeStepDeltas_Empty(t *testing.T) {
+	if got := computeStepDeltas(nil, nil); len(got) != 0 {
+		t.Fatalf("computeStepDeltas(nil) len = %d want 0", len(got))
+	}
+}
+
+// --- LLM ops from step-start/step-finish + per-op token deltas ----------------
+
+func TestMapSession_LLMOpsStepDeltas(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	c1 := int64(4000)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, &c1, "the-alias", "the-model", tokenCounts{Input: 410, Output: 40}, 0.3, "stop", ""),
+			stepStart("prt_1"),
+			stepFinish("prt_2", 100, 10, 0, 0, 0, 0.1),
+			stepStart("prt_3"),
+			stepFinish("prt_4", 250, 25, 0, 0, 0, 0.2),
+			stepStart("prt_5"),
+			stepFinish("prt_6", 410, 40, 0, 0, 0, 0.3),
+		),
+	}
+	evs := run(t, s, msgs)
+	llm := llmOps(evs)
+	if len(llm) != 3 {
+		t.Fatalf("LLM op count = %d want 3", len(llm))
+	}
+	for _, op := range llm {
+		if op.Model != "the-model" {
+			t.Errorf("LLM op Model = %q want the-model", op.Model)
+		}
+		if op.ProviderAlias != "the-alias" {
+			t.Errorf("LLM op ProviderAlias = %q want the-alias", op.ProviderAlias)
+		}
+		if op.Provider == "" {
+			t.Errorf("LLM op Provider must be non-empty (catalog seeding requires it)")
+		}
+	}
+	fin := opFinals(evs)
+	// Collect LLM finalize token deltas in op order.
+	var inDeltas []int64
+	for _, f := range fin {
+		// LLM ops are seqs 1,3,5 (steps interleave with finishes); match by
+		// presence of token deltas — only LLM finalizes carry tokens here.
+		if f.TokensIn > 0 || f.TokensOut > 0 {
+			inDeltas = append(inDeltas, f.TokensIn)
+		}
+	}
+	if len(inDeltas) != 3 || inDeltas[0] != 100 || inDeltas[1] != 150 || inDeltas[2] != 160 {
+		t.Fatalf("LLM op token-in deltas = %v want [100 150 160]", inDeltas)
+	}
+}
+
+// --- Tool ops + namespace derivation ------------------------------------------
+
+func TestMapSession_ToolOpNamespaceMCP(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2500)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			toolPart("prt_2", "github_get_file_contents", "completed", 2000, &end, nil),
+			stepFinish("prt_3", 10, 1, 0, 0, 0, 0.01),
+		),
+	}
+	evs := run(t, s, msgs)
+	tools := toolOps(evs)
+	if len(tools) != 1 {
+		t.Fatalf("tool op count = %d want 1", len(tools))
+	}
+	if tools[0].Name != "get_file_contents" {
+		t.Fatalf("tool Name = %q want get_file_contents", tools[0].Name)
+	}
+	if tools[0].ToolNamespace != "github" {
+		t.Fatalf("tool ToolNamespace = %q want github", tools[0].ToolNamespace)
+	}
+	// ParentOpSeq must point at the open LLM op (the step-start's seq).
+	if tools[0].ParentOpSeq <= 0 {
+		t.Fatalf("tool ParentOpSeq = %d want >0 (under LLM op)", tools[0].ParentOpSeq)
+	}
+}
+
+func TestMapSession_ToolOpBuiltinNoNamespace(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2500)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			toolPart("prt_2", "bash", "completed", 2000, &end, nil),
+		),
+	}
+	evs := run(t, s, msgs)
+	tools := toolOps(evs)
+	if len(tools) != 1 {
+		t.Fatalf("tool op count = %d want 1", len(tools))
+	}
+	if tools[0].Name != "bash" {
+		t.Fatalf("tool Name = %q want bash", tools[0].Name)
+	}
+	if tools[0].ToolNamespace != "" {
+		t.Fatalf("tool ToolNamespace = %q want empty (builtin)", tools[0].ToolNamespace)
+	}
+}
+
+// TestMapSession_ToolOpStatusError pins P1-C (SOW-0005 round-2): an opencode tool
+// whose state.status == "error" must finalize with the CANONICAL op status
+// "failed" (NOT the non-canonical "error"), carrying the opencode detail in
+// ErrorClass (a class label) + ErrorMessage (state.error). canonical op statuses
+// are running|completed|failed|cancelled|truncated (canonical-events.md:196).
+func TestMapSession_ToolOpStatusError(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2500)
+	errState := func(id string) partRow {
+		state := map[string]any{
+			"status": "error",
+			"input":  map[string]any{},
+			"error":  "boom",
+			"time":   map[string]any{"start": int64(2000), "end": end},
+		}
+		raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "c", "tool": "bash", "state": state})
+		return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw}
+	}
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			errState("prt_2"),
+		),
+	}
+	evs := run(t, s, msgs)
+	fin := opFinals(evs)
+	// No finalize may carry the non-canonical "error" status.
+	for i := range fin {
+		if fin[i].Status == "error" {
+			t.Fatalf("tool OpFinalized carries non-canonical status %q (P1-C: must be 'failed')", fin[i].Status)
+		}
+	}
+	var toolFin *canonical.OpFinalizedEvent
+	for i := range fin {
+		if fin[i].Status == "failed" {
+			toolFin = &fin[i]
+		}
+	}
+	if toolFin == nil {
+		t.Fatalf("no tool OpFinalized with status=failed in %d finals (P1-C: opencode 'error' → canonical 'failed')", len(fin))
+	}
+	if toolFin.ErrorMessage != "boom" {
+		t.Fatalf("tool error message = %q want boom", toolFin.ErrorMessage)
+	}
+	if toolFin.ErrorClass != defaultErrorClass {
+		t.Fatalf("tool ErrorClass = %q want %q (P1-C carries a class label)", toolFin.ErrorClass, defaultErrorClass)
+	}
+	// SOW-0005 round-6 P2-1: this failed tool carries ONLY state.error (no
+	// state.output), so NO tool_response PayloadRef may be emitted — the future
+	// resolver would fetch a state.output body that does not exist. The detail is
+	// already in ErrorMessage above.
+	for _, ev := range evs {
+		if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "tool_response" {
+			t.Fatalf("failed tool with only state.error emitted a tool_response PayloadRef (uri=%q); none must be emitted (round-6 P2-1)", p.LocationURI)
+		}
+	}
+}
+
+// TestMapSession_FailedToolNoOutputNoPayloadRef pins SOW-0005 round-6 P2-1 directly:
+// a failed tool (state.status="error") whose state has ONLY an error string and NO
+// state.output emits NO tool_response PayloadRef (its detail rides in ErrorMessage),
+// while a COMPLETED tool WITH state.output still emits the tool_response ref. The two
+// cases share one mapSession run so the gate (state.output != "", not the status) is
+// pinned against both shapes at once.
+func TestMapSession_FailedToolNoOutputNoPayloadRef(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2600)
+	// A failed tool with only state.error (no output).
+	failNoOut := func(id string) partRow {
+		state := map[string]any{
+			"status": "error",
+			"input":  map[string]any{"command": "make"},
+			"error":  "command failed",
+			"time":   map[string]any{"start": int64(2000), "end": end},
+		}
+		raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "c1", "tool": "bash", "state": state})
+		return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw}
+	}
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			failNoOut("prt_fail"),                                    // failed, only state.error → NO ref
+			toolPart("prt_ok", "read", "completed", 2700, &end, nil), // completed WITH output → ref
+		),
+	}
+	evs := run(t, s, msgs)
+
+	// Collect tool_response refs and the part ids they point at.
+	var refParts []string
+	for _, ev := range evs {
+		if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "tool_response" {
+			refParts = append(refParts, p.LocationURI)
+		}
+	}
+	// Exactly ONE tool_response ref, for the completed tool (prt_ok) — never for the
+	// failed-no-output tool (prt_fail).
+	if len(refParts) != 1 {
+		t.Fatalf("tool_response PayloadRef count = %d (%v), want exactly 1 (the completed tool with output)", len(refParts), refParts)
+	}
+	if !strings.Contains(refParts[0], "part_id=prt_ok") {
+		t.Errorf("the sole tool_response ref points at %q, want the completed tool prt_ok", refParts[0])
+	}
+	for _, u := range refParts {
+		if strings.Contains(u, "part_id=prt_fail") {
+			t.Fatalf("failed tool prt_fail (only state.error) emitted a tool_response ref %q (round-6 P2-1: must not)", u)
+		}
+	}
+
+	// The failed tool still finalizes failed with ErrorMessage carrying state.error.
+	var failFin *canonical.OpFinalizedEvent
+	for i, f := range opFinals(evs) {
+		if f.Status == "failed" {
+			ff := opFinals(evs)[i]
+			failFin = &ff
+		}
+	}
+	if failFin == nil {
+		t.Fatal("failed tool produced no OpFinalized with status=failed")
+	}
+	if failFin.ErrorMessage != "command failed" {
+		t.Errorf("failed tool ErrorMessage = %q, want %q (detail rides in ErrorMessage, not a payload ref)", failFin.ErrorMessage, "command failed")
+	}
+}
+
+func TestMapSession_RunningToolNoFinalize(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			toolPart("prt_2", "bash", "running", 2000, nil, nil), // no end => running
+		),
+	}
+	evs := run(t, s, msgs)
+	// A running tool emits OpStarted but no OpFinalized for that op.
+	tools := toolOps(evs)
+	if len(tools) != 1 {
+		t.Fatalf("tool op count = %d want 1", len(tools))
+	}
+	for _, f := range opFinals(evs) {
+		if f.Seq == tools[0].Seq && f.TurnSeq == tools[0].TurnSeq {
+			t.Fatalf("running tool op %d:%d must NOT be finalized", f.TurnSeq, f.Seq)
+		}
+	}
+}
+
+// --- task tool → session op (AC#4) --------------------------------------------
+
+func TestMapSession_TaskToolEmitsSessionOp(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2500)
+	md := map[string]any{"sessionId": "ses_child"}
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			toolPart("prt_2", "task", "completed", 2000, &end, md),
+		),
+	}
+	evs := run(t, s, msgs)
+	// Must emit BOTH a tool op (Name=task) AND a session op (ChildSessionNativeID).
+	var sawTool, sawSession bool
+	for _, op := range opStarts(evs) {
+		if op.Kind == canonical.OpTool && op.Name == "task" {
+			sawTool = true
+		}
+		if op.Kind == canonical.OpSession && op.ChildSessionNativeID == "ses_child" {
+			sawSession = true
+		}
+	}
+	if !sawTool {
+		t.Fatal("missing tool op for task")
+	}
+	if !sawSession {
+		t.Fatal("missing session op with ChildSessionNativeID=ses_child")
+	}
+}
+
+func TestMapSession_TaskToolNoSessionIDNoSessionOp(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2500)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			toolPart("prt_2", "task", "completed", 2000, &end, nil), // no sessionId
+		),
+	}
+	evs := run(t, s, msgs)
+	for _, op := range opStarts(evs) {
+		if op.Kind == canonical.OpSession {
+			t.Fatal("task with no sessionId must NOT emit a session op")
+		}
+	}
+}
+
+// --- reasoning op (AC: ParentOpSeq, ReasoningKind) ----------------------------
+
+func TestMapSession_ReasoningOpDefaultRaw(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2200)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			reasoningPart("prt_2", 2000, &end, false),
+		),
+	}
+	evs := run(t, s, msgs)
+	var r *canonical.OpStartedEvent
+	for _, op := range opStarts(evs) {
+		if op.Kind == canonical.OpReasoning {
+			o := op
+			r = &o
+		}
+	}
+	if r == nil {
+		t.Fatal("no reasoning op")
+	}
+	if r.ReasoningKind != "raw" {
+		t.Fatalf("ReasoningKind = %q want raw (default)", r.ReasoningKind)
+	}
+	if r.ParentOpSeq <= 0 {
+		t.Fatalf("reasoning ParentOpSeq = %d want >0 (under LLM op)", r.ParentOpSeq)
+	}
+}
+
+func TestMapSession_ReasoningOpSummaryFromMetadata(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2200)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			reasoningPart("prt_2", 2000, &end, true),
+		),
+	}
+	evs := run(t, s, msgs)
+	for _, op := range opStarts(evs) {
+		if op.Kind == canonical.OpReasoning && op.ReasoningKind == "summary" {
+			return
+		}
+	}
+	t.Fatal("no reasoning op with ReasoningKind=summary")
+}
+
+func TestMapSession_ReasoningRunningWhenNoEnd(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			reasoningPart("prt_2", 2000, nil, false), // no end
+		),
+	}
+	evs := run(t, s, msgs)
+	var rSeq, rTurn int
+	for _, op := range opStarts(evs) {
+		if op.Kind == canonical.OpReasoning {
+			rSeq, rTurn = op.Seq, op.TurnSeq
+		}
+	}
+	for _, f := range opFinals(evs) {
+		if f.Seq == rSeq && f.TurnSeq == rTurn {
+			t.Fatal("reasoning op with no end must NOT be finalized")
+		}
+	}
+}
+
+// --- text → PayloadRef (no op) ------------------------------------------------
+
+func TestMapSession_TextEmitsPayloadRefNotOp(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	end := int64(2200)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			stepFinish("prt_2", 10, 1, 0, 0, 0, 0.01),
+			textPart("prt_3"),
+		),
+	}
+	_ = end
+	evs := run(t, s, msgs)
+	// text must NOT add an op (only the one LLM op exists).
+	if n := len(llmOps(evs)); n != 1 {
+		t.Fatalf("LLM op count = %d want 1 (text is not an op)", n)
+	}
+	for _, op := range opStarts(evs) {
+		if op.Kind != canonical.OpLLM {
+			t.Fatalf("unexpected non-LLM op kind %q (text must not be an op)", op.Kind)
+		}
+	}
+	// text DOES emit a PayloadRef (llm_response, field=text) attached to the LLM op.
+	var sawTextRef bool
+	for _, ev := range evs {
+		if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "llm_response" {
+			sawTextRef = true
+			if p.OpSeq <= 0 {
+				t.Fatalf("text PayloadRef OpSeq = %d want >0 (attached to LLM op)", p.OpSeq)
+			}
+		}
+	}
+	if !sawTextRef {
+		t.Fatal("no llm_response PayloadRef for text part")
+	}
+}
+
+func TestMapSession_TextBeforeAnyLLMOpDropped(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			textPart("prt_1"), // before any step-start => no op to attach to
+		),
+	}
+	evs := run(t, s, msgs)
+	for _, ev := range evs {
+		if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "llm_response" {
+			t.Fatal("text PayloadRef before any LLM op must be dropped (op_id NOT NULL)")
+		}
+	}
+}
+
+// --- patch → op extras (not an op) --------------------------------------------
+
+func TestMapSession_PatchNotAnOpAddsExtras(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	c := int64(3000)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, &c, "the-alias", "the-model", tokenCounts{Input: 10}, 0.1, "stop", ""),
+			stepStart("prt_1"),
+			patchPart("prt_2"),
+			stepFinish("prt_3", 10, 1, 0, 0, 0, 0.1),
+		),
+	}
+	evs := run(t, s, msgs)
+	// patch must NOT add a DISTINCT op. The mapper re-emits the LLM OpStarted
+	// (idempotent UPDATE on (turn,seq)) to graft the patch extras before the
+	// finalize — mirrors codex's enrichment re-emit — so count DISTINCT (turn,seq)
+	// ops, not raw OpStarted events.
+	distinct := map[[2]int]canonical.OpKind{}
+	for _, op := range opStarts(evs) {
+		distinct[[2]int{op.TurnSeq, op.Seq}] = op.Kind
+	}
+	if len(distinct) != 1 {
+		t.Fatalf("distinct op count = %d want 1 (patch is not an op)", len(distinct))
+	}
+	for _, k := range distinct {
+		if k != canonical.OpLLM {
+			t.Fatalf("only op must be the LLM op, got kind %q", k)
+		}
+	}
+	// The LLM op's re-emit carries patch info in Extras.
+	var found bool
+	for _, op := range opStarts(evs) {
+		if op.Kind == canonical.OpLLM {
+			if files, ok := op.Extras["patch_files"]; ok {
+				found = true
+				arr, _ := files.([]string)
+				if len(arr) != 2 {
+					t.Fatalf("patch_files len = %d want 2", len(arr))
+				}
+			}
+		}
+	}
+	if !found {
+		t.Fatal("LLM op Extras missing patch_files")
+	}
+}
+
+// --- compaction → INF log; retry → WRN log ------------------------------------
+
+func TestMapSession_CompactionInfoLog(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			compactionPart("prt_2", true),
+		),
+	}
+	evs := run(t, s, msgs)
+	var found bool
+	for _, ev := range evs {
+		if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "INF" {
+			found = true
+			if l.Source != Format {
+				t.Fatalf("log Source = %q want %q", l.Source, Format)
+			}
+		}
+	}
+	if !found {
+		t.Fatal("no INF LogEntry for compaction")
+	}
+}
+
+// TestMapSession_RetryWarnLog pins the retry → WRN LogEntry, including the
+// triggering error's name in the message AND extras (SOW-0005 round-6 P3-1).
+// retryPart builds an error with name "RateLimit".
+func TestMapSession_RetryWarnLog(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			retryPart("prt_2", 3),
+		),
+	}
+	evs := run(t, s, msgs)
+	var retryLog *canonical.LogEntryEvent
+	for i, ev := range evs {
+		if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" {
+			le := evs[i].(canonical.LogEntryEvent)
+			retryLog = &le
+		}
+	}
+	if retryLog == nil {
+		t.Fatal("no WRN LogEntry for retry")
+	}
+	// The message carries the attempt AND the error name (P3-1).
+	if retryLog.Message != "API retry attempt 3: RateLimit" {
+		t.Errorf("retry WRN message = %q, want %q (attempt + error.name)", retryLog.Message, "API retry attempt 3: RateLimit")
+	}
+	if retryLog.Extras["error.name"] != "RateLimit" {
+		t.Errorf("retry WRN extras[error.name] = %v, want RateLimit", retryLog.Extras["error.name"])
+	}
+	if retryLog.Extras["attempt"] != 3 {
+		t.Errorf("retry WRN extras[attempt] = %v, want 3", retryLog.Extras["attempt"])
+	}
+}
+
+// TestMapSession_RetryWarnLogNoErrorName pins the forward-compat fallback (P3-1):
+// a retry part with NO error.name emits the bare "API retry attempt " message
+// with no trailing ": " and no error.name extra (an older/partial retry part).
+func TestMapSession_RetryWarnLogNoErrorName(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	bareRetry := func(id string, attempt int) partRow {
+		raw, _ := json.Marshal(map[string]any{"type": "retry", "attempt": attempt})
+		return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", Data: raw}
+	}
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			bareRetry("prt_2", 2),
+		),
+	}
+	evs := run(t, s, msgs)
+	var retryLog *canonical.LogEntryEvent
+	for i, ev := range evs {
+		if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" {
+			le := evs[i].(canonical.LogEntryEvent)
+			retryLog = &le
+		}
+	}
+	if retryLog == nil {
+		t.Fatal("no WRN LogEntry for retry")
+	}
+	if retryLog.Message != "API retry attempt 2" {
+		t.Errorf("retry WRN message = %q, want bare %q (no error.name → no ': ' suffix)", retryLog.Message, "API retry attempt 2")
+	}
+	if _, ok := retryLog.Extras["error.name"]; ok {
+		t.Errorf("retry WRN extras must omit error.name when absent; got %v", retryLog.Extras["error.name"])
+	}
+}
+
+// --- file part → INF LogEntry (round-4 P2-3) ----------------------------------
+
+// TestMapSession_FilePartLogEntry pins SOW-0005 round-4 P2-3: a file part emits an
+// INF LogEntry carrying filename/url/mime in its extras (an attachment record),
+// and NO PayloadRefEvent with a non-canonical PayloadKind. The old "user_attachment"
+// PayloadKind is not in the canonical PayloadRefEvent set (internal/canonical/
+// events.go), so the adapter must not emit it.
+func TestMapSession_FilePartLogEntry(t *testing.T) {
+	s := rootSession("ses_x", 0)
+	msgs := []messageWithParts{
+		mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""),
+			stepStart("prt_1"),
+			filePart("prt_2", "https://cdn.example.invalid/x.png"),
+		),
+	}
+	evs := run(t, s, msgs)
+
+	// Every PayloadRef in the stream must carry a CANONICAL PayloadKind (the
+	// internal/canonical/events.go set); in particular the removed "user_attachment"
+	// kind must never appear.
+	for _, ev := range evs {
+		if p, ok := ev.(canonical.PayloadRefEvent); ok {
+			if !canonicalPayloadKinds[p.PayloadKind] {
+				t.Fatalf("non-canonical PayloadRef kind=%q emitted (round-4 P2-3); canonical set only", p.PayloadKind)
+			}
+		}
+	}
+
+	// Exactly one INF LogEntry "file attachment" with filename/url/mime in extras.
+	var found int
+	for _, ev := range evs {
+		l, ok := ev.(canonical.LogEntryEvent)
+		if !ok || l.Message != "file attachment" {
+			continue
+		}
+		found++
+		if l.Severity != "INF" {
+			t.Errorf("file-attachment LogEntry severity = %q, want INF", l.Severity)
+		}
+		if l.Extras["url"] != "https://cdn.example.invalid/x.png" {
+			t.Errorf("file-attachment extras.url = %v, want the verbatim data.url", l.Extras["url"])
+		}
+		if l.Extras["filename"] != "x.png" {
+			t.Errorf("file-attachment extras.filename = %v, want x.png", l.Extras["filename"])
+		}
+		if l.Extras["mime"] != "image/png" {
+			t.Errorf("file-attachment extras.mime = %v, want image/png", l.Extras["mime"])
+		}
+		// Scoped to the turn and the open LLM op (prt_1 opened one).
+ if l.TurnSeq != 1 { + t.Errorf("file-attachment LogEntry TurnSeq = %d, want 1", l.TurnSeq) + } + } + if found != 1 { + t.Fatalf("file-attachment INF LogEntry count = %d, want 1", found) + } +} + +// --- unknown part → forward-compat skip + one WARN ---------------------------- + +func TestMapSession_UnknownPartSkippedWithWarn(t *testing.T) { + s := rootSession("ses_x", 0) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), + unknownPart("prt_2"), + ), + } + evs := run(t, s, msgs) + // No op for the unknown part; exactly one WRN log. + warns := 0 + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" { + warns++ + } + } + if warns != 1 { + t.Fatalf("WRN count = %d want 1 for unknown part", warns) + } +} + +// --- provider alias canonicalization (AC#7) ----------------------------------- + +func TestCanonicalProvider(t *testing.T) { + cases := []struct{ alias, want string }{ + {"openrouter", "openrouter"}, // known passthrough + {"my-private-alias", "my-private-alias"}, // unknown → alias verbatim (default) + {"", ""}, + } + for _, c := range cases { + if got := canonicalProvider(c.alias); got != c.want { + t.Errorf("canonicalProvider(%q) = %q want %q", c.alias, got, c.want) + } + } +} + +// --- empty/malformed message data is skipped, not fatal ----------------------- + +func TestMapSession_EmptyMessageDataSkipped(t *testing.T) { + s := rootSession("ses_x", 0) + msgs := []messageWithParts{ + {Message: messageRow{ID: "msg_bad", SessionID: "ses_x", Data: []byte(" ")}}, + } + evs := run(t, s, msgs) + // A blank data body must not produce a turn; one WRN log surfaces it. 
+ if n := countKind(evs, canonical.EvTurnStarted); n != 0 { + t.Fatalf("TurnStarted = %d want 0 for empty message data", n) + } + warns := 0 + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" { + warns++ + } + } + if warns != 1 { + t.Fatalf("WRN count = %d want 1 for empty message data", warns) + } +} + +// --- ordering / determinism: re-emission yields identical streams ------------- + +func TestMapSession_Deterministic(t *testing.T) { + s := rootSession("ses_x", 0) + c := int64(3000) + end := int64(2500) + build := func() []messageWithParts { + return []messageWithParts{ + mwp(asgMsg("msg_a", 1500, &c, "the-alias", "the-model", tokenCounts{Input: 100, Output: 20}, 0.2, "stop", ""), + stepStart("prt_1"), + reasoningPart("prt_2", 1800, &end, false), + toolPart("prt_3", "github_search", "completed", 2000, &end, nil), + stepFinish("prt_4", 100, 20, 0, 0, 0, 0.2), + textPart("prt_5"), + ), + } + } + a := run(t, s, build()) + b := run(t, s, build()) + if len(a) != len(b) { + t.Fatalf("non-deterministic length: %d vs %d", len(a), len(b)) + } + for i := range a { + if a[i].EventKind() != b[i].EventKind() { + t.Fatalf("event %d kind differs: %q vs %q", i, a[i].EventKind(), b[i].EventKind()) + } + if a[i].EventSourceSeq() != b[i].EventSourceSeq() { + t.Fatalf("event %d SourceSeq differs: %d vs %d", i, a[i].EventSourceSeq(), b[i].EventSourceSeq()) + } + } +} diff --git a/internal/adapters/opencode/mapper_tools.go b/internal/adapters/opencode/mapper_tools.go new file mode 100644 index 0000000..819b972 --- /dev/null +++ b/internal/adapters/opencode/mapper_tools.go @@ -0,0 +1,117 @@ +package opencode + +import "strings" + +// This file holds the pure tool-part helpers the tool-op emitter (emitToolOp in +// mapper_ops.go) delegates to: the op start/terminal derivation, the byte +// accounting, the task→child-session extraction (AC#4), and the MCP namespace +// heuristic. 
They are pure functions of a decoded partData; split out of +// mapper_ops.go to keep each file ≤ ~400 lines (mirrors codex's ops_tools.go). + +// toolStartUs returns the tool op's start timestamp (µs) from state.time.start, +// falling back to the part's time_created when the state has no start. It is a +// mapper METHOD (not a free function) so the ms→µs conversion goes through the +// warning-capable msToMicrosWarn (SOW-0005 round-4 P2-2): a crafted/corrupt huge +// tool timestamp clamps AND surfaces a WARN rather than silently saturating, since +// the result becomes an emitted op's Ts. +func (m *sessionMapper) toolStartUs(data partData, p partRow) int64 { + if data.State != nil && data.State.Time.Start > 0 { + return m.msToMicrosWarn(data.State.Time.Start, "part.state.time.start") + } + return m.msToMicrosWarn(p.TimeCreatedMs, "part.time_created (tool)") +} + +// toolTerminal derives a tool op's CANONICAL terminal status, end timestamp, +// error message, and whether an output body exists, from +// state.status/time/error/output (adapter-opencode.md §"Tool calls and Models"). +// The returned status is ALWAYS in the canonical op-status set +// (running|completed|failed|cancelled|truncated — canonical-events.md:196): +// opencode's "error" maps to canonical "failed" (SOW-0005 round-2 P1-C; the raw +// "error" string is NOT a canonical status and breaks status consumers), and an +// unknown future opencode status maps to "completed" once it carries an end +// (it finished) else "" (running, no finalize). A running/pending state has no +// end (endPtr nil → no finalize). failed carries state.error as the message. 
+func toolTerminal(data partData) (status string, endPtr *int64, errMsg string, hasOutput bool) { + if data.State == nil { + return "", nil, "", false + } + st := data.State + switch st.Status { + case "completed": + return "completed", st.Time.End, "", st.Output != "" + case "error": + // opencode "error" → canonical "failed" (the only valid terminal-error op + // status); the detail rides in errMsg → OpFinalizedEvent.ErrorMessage. + // hasOutput keys ONLY on state.output (SOW-0005 round-6 P2-1): a failed tool + // usually carries only state.error and NO output, so a tool_response PayloadRef + // at field=state.output would point at a body that does not exist (the future + // resolver would fetch nothing). The failure detail is carried by ErrorMessage + // (state.error), not a payload ref. A failed tool that produced partial output + // before failing still gets the ref (state.output != ""). + return "failed", st.Time.End, st.Error, st.Output != "" + case "running", "pending": + // In-flight: OpStarted only, no finalize (adapter-opencode.md §"Edge + // Cases" #4). A later poll observing the part now completed re-emits the + // whole tree and finalizes it (chunk C). No canonical status is emitted + // (endPtr nil suppresses the finalize), so the raw "running"/"pending" + // string is never surfaced AS a canonical op status. + return "", nil, "", false + default: + // Unknown future opencode status: keep the emitted status canonical. If it + // carries an end it finished → "completed"; otherwise leave it running (no + // finalize). Never surface the raw opencode string as a canonical status. + if st.Time.End != nil { + return "completed", st.Time.End, st.Error, st.Output != "" + } + return "", nil, "", false + } +} + +// toolBytesIn approximates op.bytes_in as the byte length of the tool's +// serialized input (adapter-opencode.md §"Tool calls and Models": +// len(JSON.stringify(state.input))). 
The input is already raw JSON on the +// decoded state, so its length is the serialized size. 0 when absent. +func toolBytesIn(data partData) int64 { + if data.State == nil { + return 0 + } + body := jsonTrimBytes(data.State.Input) + return int64(len(body)) +} + +// toolBytesOut approximates op.bytes_out as the byte length of the tool's output +// string (adapter-opencode.md §"Tool calls and Models": len(state.output)). +func toolBytesOut(data partData) int64 { + if data.State == nil { + return 0 + } + return int64(len(data.State.Output)) +} + +// taskChildSessionID returns the spawned child session id for a tool='task' +// part that carries state.metadata.sessionId, else "" (AC#4; adapter-opencode.md +// §"Sub-Agent Linkage"). Only tool='task' qualifies — the session-op edge is the +// task tool's dispatch, not any tool with a metadata.sessionId. The second return +// reports whether the task metadata was PRESENT but malformed, so the caller can +// surface a structured WARN rather than silently dropping a sub-agent linkage +// (SOW-0005 P2.6). +func taskChildSessionID(data partData) (childID string, metaMalformed bool) { + if data.Tool != "task" || data.State == nil { + return "", false + } + return data.State.subAgentSessionIDChecked() +} + +// toolNameNamespace derives the canonical op Name and ToolNamespace from an +// opencode tool name (adapter-opencode.md §"Tool calls and Models"). MCP tools +// are namespaced with an underscore convention (e.g. github_get_file_contents → +// namespace github, name get_file_contents); builtins (read/bash/grep/…) have no +// underscore and yield an empty namespace + the verbatim name. The split is on +// the FIRST underscore so a name like github_get_file_contents keeps the rest of +// the name intact. 
+func toolNameNamespace(tool string) (name, namespace string) { + if i := strings.IndexByte(tool, '_'); i > 0 && i < len(tool)-1 { + return tool[i+1:], tool[:i] + } + return tool, "" +} diff --git a/internal/adapters/opencode/mapper_turn.go b/internal/adapters/opencode/mapper_turn.go new file mode 100644 index 0000000..1b9368d --- /dev/null +++ b/internal/adapters/opencode/mapper_turn.go @@ -0,0 +1,370 @@ +package opencode + +import ( + "bytes" + "encoding/json" + "fmt" + "math" + "strings" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the turn finalizer, the cumulative→delta token math +// (computeStepDeltas, AC#3), the best-effort provider canonicalization (AC#7), +// the turnContext op-parent helper, and the PayloadRef URI seam chunk D fills +// with the opencode-sqlite:// builder. The per-part op emitters live in +// mapper_ops.go; the session/turn driver in mapper.go; the part dispatch in +// mapper_parts.go. Split out of mapper_ops.go to keep each file ≤ ~400 lines. + +// --- cumulative→delta token math (AC#3) --------------------------------------- + +// computeStepDeltas converts a message's ORDERED cumulative step-finish token +// snapshots into per-step deltas (AC#3). opencode reports step-finish tokens +// CUMULATIVELY within a message (adapter-opencode.md §"Tool calls and Models", +// "Canonical Model Gaps" #3): input 100,250,410 → per-op 100,150,160. Each field +// (input/output/reasoning/total/cache.read/cache.write) is diffed independently +// against the running previous cumulative via a CHECKED subtraction (subClampWarn): a +// non-monotonic value yields a negative delta CLAMPED to 0, and a crafted/corrupt +// value whose subtraction would OVERFLOW int64 is clamped to [0, MaxInt64] with a +// WARN rather than wrapping (SOW-0005 round-2 P2-F). The first snapshot's delta is +// itself (previous = zero). onWarn (may be nil) surfaces an overflow with context. 
+func computeStepDeltas(cumulative []tokenCounts, onWarn func(error)) []tokenCounts { + if len(cumulative) == 0 { + return nil + } + out := make([]tokenCounts, len(cumulative)) + var prev tokenCounts + for i, cur := range cumulative { + out[i] = tokenCounts{ + Input: subClampWarn(cur.Input, prev.Input, "step-finish tokens.input", onWarn), + Output: subClampWarn(cur.Output, prev.Output, "step-finish tokens.output", onWarn), + Reasoning: subClampWarn(cur.Reasoning, prev.Reasoning, "step-finish tokens.reasoning", onWarn), + Total: subClampWarn(cur.Total, prev.Total, "step-finish tokens.total", onWarn), + Cache: cacheTokens{ + Read: subClampWarn(cur.Cache.Read, prev.Cache.Read, "step-finish tokens.cache.read", onWarn), + Write: subClampWarn(cur.Cache.Write, prev.Cache.Write, "step-finish tokens.cache.write", onWarn), + }, + } + prev = cur + } + return out +} + +// nonNeg clamps a delta to a non-negative value so a non-monotonic cumulative +// observation never emits a negative token count. +func nonNeg(v int64) int64 { + if v < 0 { + return 0 + } + return v +} + +// subClampWarn returns a-b clamped to [0, MaxInt64], detecting int64 overflow on +// the subtraction (a crafted/corrupt cumulative value) and surfacing a WARN with +// the field label rather than wrapping (SOW-0005 round-2 P2-F). On overflow it +// clamps: a positive overflow (a huge, b very negative) saturates to MaxInt64; a +// negative overflow clamps to 0 (negative token counts are meaningless). The +// normal non-monotonic case (a<b, no overflow) clamps to 0 without a WARN. onWarn +// may be nil, in which case the clamp is silent but still safe. +func subClampWarn(a, b int64, field string, onWarn func(error)) int64 { + d := a - b + // Overflow on a-b iff the operands differ in sign AND the difference's sign + // differs from a's (standard signed-subtraction overflow detection). 
+ if (a < 0) != (b < 0) && (d < 0) != (a < 0) { + if onWarn != nil { + onWarn(fmt.Errorf("opencode: %s delta overflow (%d-%d); clamped (P2-F)", field, a, b)) + } + if a > 0 { + return math.MaxInt64 + } + return 0 + } + return nonNeg(d) +} + +// addClampWarn returns a+b clamped to [0, MaxInt64], detecting int64 overflow on +// the ADDITION (crafted/corrupt cumulative token values) and surfacing a WARN with +// the field label rather than wrapping to a negative count (SOW-0005 round-3 +// P2-1). 
Both addends are non-negative token counts in practice; a positive +// overflow saturates to MaxInt64, and the (defensive) negative-input path clamps +// to 0. onWarn may be nil (the pure mapper-only path), in which case the clamp is +// silent but still safe. +func addClampWarn(a, b int64, field string, onWarn func(error)) int64 { + s := a + b + // Overflow on a+b iff both addends share a sign AND the sum's sign differs + // (standard signed-addition overflow detection). + if (a < 0) == (b < 0) && (s < 0) != (a < 0) { + if onWarn != nil { + onWarn(fmt.Errorf("opencode: %s sum overflow (%d+%d); clamped (P2-1)", field, a, b)) + } + if a > 0 || b > 0 { + return math.MaxInt64 + } + return 0 + } + return nonNeg(s) +} + +// jsonTrimBytes returns the raw JSON with surrounding whitespace trimmed, +// treating a bare null (or empty) as no bytes so an absent input does not +// contribute a phantom 4-byte ("null") size to bytes_in. +func jsonTrimBytes(raw []byte) []byte { + b := strings.TrimSpace(string(raw)) + if b == "" || b == "null" { + return nil + } + return []byte(b) +} + +// --- turn finalize ------------------------------------------------------------ + +// finalizeTurn builds the TurnFinalizedEvent for an assistant message. Per-turn +// tokens are the message-level cumulative totals MINUS the previous assistant +// message's cumulative totals (SOW decision #4 — see sessionMapper.prevTurnTokens +// for the implementer-verify note); cost is the message cost verbatim (already +// per-message in opencode, not cumulative). Per-turn cache tokens DO work via +// TurnFinalizedEvent's TokensCacheRead/Write fields (per-turn extras like cwd are +// deferred to SOW-0021). Status derives from data.finish/error (stop or any +// non-error finish → completed; data.error → failed). EndTs is the message's +// completed-or-created ts. The previous-cumulative snapshot is advanced AFTER +// computing this turn's delta. 
+func (m *sessionMapper) finalizeTurn(tc *turnContext, data *messageData, msg messageRow) canonical.TurnFinalizedEvent { + cum := data.Tokens + var delta tokenCounts + if m.havePrevTurn { + // Checked subtraction (subClampWarn): a crafted/corrupt cumulative value + // clamps with a WARN instead of wrapping int64 (SOW-0005 round-2 P2-F). + delta = tokenCounts{ + Input: subClampWarn(cum.Input, m.prevTurnTokens.Input, "turn tokens.input", m.mwarn), + Output: subClampWarn(cum.Output, m.prevTurnTokens.Output, "turn tokens.output", m.mwarn), + Reasoning: subClampWarn(cum.Reasoning, m.prevTurnTokens.Reasoning, "turn tokens.reasoning", m.mwarn), + Cache: cacheTokens{ + Read: subClampWarn(cum.Cache.Read, m.prevTurnTokens.Cache.Read, "turn tokens.cache.read", m.mwarn), + Write: subClampWarn(cum.Cache.Write, m.prevTurnTokens.Cache.Write, "turn tokens.cache.write", m.mwarn), + }, + } + } else { + delta = tokenCounts{ + Input: cum.Input, + Output: cum.Output, + Reasoning: cum.Reasoning, + Cache: cacheTokens{Read: cum.Cache.Read, Write: cum.Cache.Write}, + } + } + m.prevTurnTokens = cum + m.havePrevTurn = true + + status, errClass := turnStatus(data) + endUs := m.turnEndUs(data, msg) + return canonical.TurnFinalizedEvent{ + EventBase: m.nextBase(endUs), + SessionNativeID: m.nativeID(), + Seq: tc.turnSeq, + Status: status, + ErrorClass: errClass, + EndTs: endUs, + TokensIn: delta.Input, + TokensOut: delta.Output, + TokensCacheRead: delta.Cache.Read, + TokensCacheWrite: delta.Cache.Write, + CostUSD: data.Cost, + } +} + +// defaultErrorClass is the safe ErrorClass label for an error object that +// carries no name (SOW-0005 round-2 P2-A). It is a CLASS string (a human label +// for the failure category), NOT a canonical op/turn status, so a generic +// constant here is correct — the terminal status is "failed", and this only +// names the error class when the source did not. 
+const defaultErrorClass = "error" + +// errorClass returns the ErrorClass for an assistant error, defaulting to +// defaultErrorClass when the source supplied an error object with an empty name +// (SOW-0005 round-2 P2-A: error PRESENCE is what makes a turn failed; a missing +// name must not blank the class). err must be non-nil. +func errorClass(err *assistantError) string { + if err.Name != "" { + return err.Name + } + return defaultErrorClass +} + +// errorMessage extracts the human-readable detail for an assistant error from +// its tagged `data` body (SOW-0005 round-5 P3-1). opencode's AssistantError union +// serializes (NamedError.toObject, anomalyco/opencode core/util/error.ts) as +// {"name":<name>,"data":<fields>}; every shipping variant EXCEPT +// MessageOutputLengthError carries a `message` string in `data` +// (MessageAbortedError/UnknownError/APIError/ContextOverflowError/ +// StructuredOutputError/ProviderAuthError — confirmed against the reference DB: +// MessageAbortedError/UnknownError/APIError all populate data.message). It +// becomes the canonical SessionFinalizedEvent.ErrorMessage, mirroring how the +// tool-op path surfaces state.error verbatim (mapper_tools.go toolTerminal); the +// canonical TurnFinalizedEvent carries only ErrorClass (no ErrorMessage field), +// so this enriches the SESSION terminal only. Decode is best-effort: an absent +// data, a non-object body, or a missing/non-string `message` yields "" — the +// session is still finalized failed with its ErrorClass (degrade, never abort). +// err must be non-nil. 
+func errorMessage(err *assistantError) string { + if len(bytes.TrimSpace(err.Data)) == 0 { + return "" + } + var d struct { + Message string `json:"message"` + } + if json.Unmarshal(err.Data, &d) != nil { + return "" + } + return d.Message +} + +// turnStatus derives a turn's terminal status (adapter-opencode.md §"Per-table +// emit rules"): an error OBJECT being PRESENT → failed (ErrorClass = error.name, +// or defaultErrorClass when the name is empty — SOW-0005 round-2 P2-A); else +// completed (data.finish="stop" or any non-error finish all map to completed — +// opencode does not record a per-turn aborted distinct from a session error). +// The predicate is error PRESENCE (data.Error != nil), not a non-empty name: an +// opencode error object with an empty name is still a failure. +func turnStatus(data *messageData) (status, errClass string) { + if data.Error != nil { + return "failed", errorClass(data.Error) + } + return "completed", "" +} + +// turnIsTerminal reports whether an assistant message represents a COMPLETED +// turn — the predicate that gates TurnFinalizedEvent emission (adapter-opencode +// .md §"Per-table emit rules": finalize ONLY when data.time.completed is set, or +// the message carries an error, or it has at least one step-finish part). +// opencode writes a turn's message row LIVE while the turn is still in progress +// (data.time.completed nil, no step-finish part yet), so finalizing every +// assistant message would wrongly mark an in-flight turn completed. A turn that +// is not terminal stays RUNNING (TurnStarted with no TurnFinalized); a later +// poll re-emits the whole tree and finalizes it once it actually completes (the +// re-emit is idempotent — adapter-opencode.md §"Edge Cases" #4). hasStepFinish +// is supplied by the part walk (mapMessage), which already decoded the parts. +// Error PRESENCE (data.Error != nil) is terminal regardless of the error name +// (SOW-0005 round-2 P2-A). 
+func turnIsTerminal(data *messageData, hasStepFinish bool) bool { + if data.Time.Completed != nil { + return true + } + if data.Error != nil { + return true + } + return hasStepFinish +} + +// turnEndUs returns a turn's end timestamp (µs): the assistant message's +// data.time.completed when set, else the message row's time_created (a turn with +// no completed ts is still ordered by its creation). It is a mapper METHOD so the +// ms→µs conversion goes through the warning-capable msToMicrosWarn (SOW-0005 +// round-4 P2-2): the result becomes the TurnFinalized Ts and the session's +// failed-terminal EndTs, so a crafted/corrupt timestamp clamps WITH a WARN. +func (m *sessionMapper) turnEndUs(data *messageData, msg messageRow) int64 { + if data.Time.Completed != nil { + return m.msToMicrosWarn(*data.Time.Completed, "message.data.time.completed") + } + return m.msToMicrosWarn(msg.TimeCreatedMs, "message.time_created (turn end)") +} + +// --- provider canonicalization (AC#7) ----------------------------------------- + +// knownProviderAliases maps a handful of well-known opencode provider aliases to +// their canonical vendor name. opencode provider ids are USER-DEFINED aliases +// (adapter-opencode.md §"Multi-provider awareness"), so this is intentionally +// SMALL and conservative: only aliases whose canonical vendor is unambiguous are +// listed. Everything else passes through unchanged (the alias IS the catalog +// provider name until a future SOW normalizes — SOW decision #6). The canonical +// model has no shared providers table yet (internal/canonical has no +// providers.go), so this best-effort map lives in the adapter package. 
+var knownProviderAliases = map[string]string{ + "openrouter": "openrouter", + "deepseek": "deepseek", + "openai": "openai", + "anthropic": "anthropic", + "google": "google", +} + +// canonicalProvider maps an opencode provider alias to a best-effort canonical +// vendor name, defaulting to the alias UNCHANGED when unknown (AC#7; SOW decision +// #6). An empty alias yields an empty provider (the LLM op then carries no +// provider and the catalog does not seed a provider row — catalog.go gates on +// Provider != ""). The mapper sets ProviderAlias to the verbatim alias regardless. +func canonicalProvider(alias string) string { + if alias == "" { + return "" + } + if c, ok := knownProviderAliases[alias]; ok { + return c + } + return alias +} + +// --- turnContext helpers ------------------------------------------------------ + +// parentSeq returns the ParentOpSeq for a reasoning/tool/session op: the open +// (or most-recently-closed) LLM op's seq, so the op nests under the LLM call that +// produced it (adapter-opencode.md "Op seq numbering within a turn"). Returns -1 +// when no LLM op has been emitted in the turn (a tool/reasoning part before any +// step-start), making the op top-level within its turn (canonical ParentOpSeq=-1). +func (tc *turnContext) parentSeq() int { + if tc.llmOpSeq == 0 { + return -1 + } + return tc.llmOpSeq +} + +// --- PayloadRef seam (chunk D fills the opencode-sqlite:// URI builder) -------- + +// payloadURIBuilder turns an owning part id + field path into a PayloadRef +// LocationURI. The mapper is pure and DB-agnostic (adapter-opencode.md +// §"Payload references", "Mapper/URI seam"): it knows the part id and field but +// NOT the resolved database basename or the escaping, which live with the +// connection/discovery layer (chunk D). Chunk D injects the production builder +// (prefixing the resolved opencode.db basename); mapper-only tests inherit the +// deterministic default (defaultPayloadURI) so the seam is testable now. 
This +// mirrors codex, whose mapper defers file:// construction to payloadURI. +type payloadURIBuilder func(partID, field string) string + +// defaultPayloadURI is the mapper's built-in PayloadRef URI builder, used when +// no production builder is injected (mapper-only unit tests). It delegates to +// buildPayloadURI in payloads.go — the single source of truth for the +// opencode-sqlite:// grammar (SOW-0005 chunk D) — so the relative form here and +// the form an injected production builder uses share one definition. The form is: +// +// opencode-sqlite://?part_id=<part_id>&field=<field> +// +// Behaviour is byte-identical to the pre-chunk-D literal: for the Sonyflake part +// ids and fixed field names opencode emits, every character is URL-unreserved. +func defaultPayloadURI(partID, field string) string { + return buildPayloadURI(partID, field) +} + +// payloadURI builds a PayloadRef LocationURI for the given part/field via the +// injected builder, falling back to defaultPayloadURI when none is set (the +// zero-value mapper case in unit tests). +func (m *sessionMapper) payloadURI(partID, field string) string { + if m.uriBuilder != nil { + return m.uriBuilder(partID, field) + } + return defaultPayloadURI(partID, field) +} + +// payloadRef builds a PayloadRefEvent scoped to the owning op (turnSeq/opSeq) so +// it references an op that EXISTS — payload_refs.op_id is NOT NULL REFERENCES +// ops(id), so an orphan ref would FK-roll-back the ingest batch (mirrors codex's +// discipline). The LocationURI is built from the part id + field via the +// chunk-D-injectable seam. originalBytes is the body byte length when known +// (-1 otherwise). 
+func (m *sessionMapper) payloadRef(base canonical.EventBase, turnSeq, opSeq int, kind, format, partID, field string, originalBytes int64) canonical.PayloadRefEvent { + return canonical.PayloadRefEvent{ + EventBase: base, + SessionNativeID: m.nativeID(), + TurnSeq: turnSeq, + OpSeq: opSeq, + PayloadKind: kind, + Format: format, + LocationURI: m.payloadURI(partID, field), + OriginalBytes: originalBytes, + } +} diff --git a/internal/adapters/opencode/migrations.go b/internal/adapters/opencode/migrations.go new file mode 100644 index 0000000..8a01c69 --- /dev/null +++ b/internal/adapters/opencode/migrations.go @@ -0,0 +1,203 @@ +package opencode + +import ( + "context" + "crypto/sha256" + "database/sql" + "encoding/hex" + "errors" + "fmt" + "strconv" + "strings" +) + +// This file reads opencode's `__drizzle_migrations` table (SOW-0005 chunk D). +// It serves two consumers: +// +// - the cursor schema hash (schemaHash over the ordered migration names), +// replacing chunk C's interim present-column-shape fingerprint; +// - the auto-discovery probe (ProbeStatus: session/message/part counts + +// the latest migration name) the ingester surfaces at startup (AC#8). +// +// All SQL here uses fixed table identifiers, never operator input. The DB is +// always opened via the chunk-A openReadOnly helper; this file never opens a +// write path. + +// migrationsTable is opencode's Drizzle migration journal. Its `name` column +// holds the migration directory name (e.g. "20260510033149_session_usage"), +// which embeds a YYYYMMDDHHMMSS timestamp prefix; Drizzle's `id` increments in +// application order (adapter-opencode.md §"__drizzle_migrations"). +const migrationsTable = "__drizzle_migrations" + +// errNoMigrationsTable is the soft sentinel returned by readMigrations when the +// __drizzle_migrations table does not exist (a very old or foreign SQLite file). 
+// Callers treat it as non-fatal: no schema hash, no migration reported, but the +// adapter keeps running rather than crashing on a database that is not +// opencode's. It is distinguished from a genuine query error (a corrupt table, +// an I/O fault) which IS propagated. +var errNoMigrationsTable = errors.New("opencode: __drizzle_migrations table not present") + +// readMigrations reads the applied-migration names from __drizzle_migrations in +// application order (ORDER BY id ASC) and returns them plus the latest (the name +// with the highest id, i.e. the last element). A missing table yields +// (nil, "", errNoMigrationsTable) so callers degrade gracefully; any other query +// error is wrapped and returned. Rows with a NULL/empty name are skipped (the +// column is nullable in opencode's schema) so a stray empty name never pollutes +// the hash or masquerades as the latest migration. +func readMigrations(ctx context.Context, db *sql.DB) (names []string, latest string, err error) { + present, presentErr := migrationsTablePresent(ctx, db) + if presentErr != nil { + // A genuine query error (corruption, ctx-cancel, closed DB) is NOT "table + // missing" — propagate it so the caller surfaces the failure rather than + // silently degrading to no-migrations (SOW-0005 round-2 P2-C). + return nil, "", presentErr + } + if !present { + return nil, "", errNoMigrationsTable + } + // id is the Drizzle auto-increment applied-order key; ORDER BY id ASC gives + // application order. Fixed identifiers only (migrationsTable), never input. 
+ q := `SELECT name FROM ` + quoteIdent(migrationsTable) + ` ORDER BY id ASC` // #nosec G202 -- migrationsTable is a fixed package constant via quoteIdent, never user input + rows, err := db.QueryContext(ctx, q) + if err != nil { + return nil, "", fmt.Errorf("opencode: read %s: %w", migrationsTable, err) + } + defer func() { _ = rows.Close() }() + + for rows.Next() { + var name sql.NullString + if scanErr := rows.Scan(&name); scanErr != nil { + return nil, "", fmt.Errorf("opencode: scan %s.name: %w", migrationsTable, scanErr) + } + if name.Valid && name.String != "" { + names = append(names, name.String) + } + } + if rErr := rows.Err(); rErr != nil { + return nil, "", fmt.Errorf("opencode: iterate %s: %w", migrationsTable, rErr) + } + if len(names) > 0 { + latest = names[len(names)-1] + } + return names, latest, nil +} + +// migrationsTablePresent reports whether __drizzle_migrations exists, via +// sqlite_master, returning (present, err). It queries the catalog (which always +// exists), so a NON-nil err is a GENUINE fault (corruption, ctx-cancel, closed +// DB) — NOT a missing table — and is propagated so readMigrations does not +// silently treat it as "no migrations" (SOW-0005 round-2 P2-C: the prior version +// folded every error into present=false, hiding real failures). A clean count of +// 0 is the only soft-absent case (a foreign/old database); it returns +// (false, nil). The name is a fixed constant, bound as a parameter here +// (sqlite_master.name accepts a bind, unlike a PRAGMA argument). 
+func migrationsTablePresent(ctx context.Context, db *sql.DB) (bool, error) { + var count int + if err := db.QueryRowContext(ctx, + `SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name=?`, + migrationsTable).Scan(&count); err != nil { + return false, fmt.Errorf("opencode: probe %s presence: %w", migrationsTable, err) + } + return count > 0, nil +} + +// schemaHash returns the hex SHA-256 of the ordered migration-name list — the +// cursor's schema_hash (adapter-opencode.md §"Cursor"). Each name is framed with +// its byte length and a newline ("<len>:<name>\n") before hashing, so the digest +// is UNAMBIGUOUS regardless of the names' content (a length-prefix is the same +// injection-safe framing the presenter cursor fingerprint uses): two different +// migration lists, the same names in a different order, or a single name that +// happens to contain a separator all yield distinct hashes. An empty list +// (missing/foreign table) yields "" so the cursor records no hash rather than +// the digest of the empty string. +func schemaHash(names []string) string { + if len(names) == 0 { + return "" + } + var b strings.Builder + for _, n := range names { + b.WriteString(strconv.Itoa(len(n))) + b.WriteByte(':') + b.WriteString(n) + b.WriteByte('\n') + } + sum := sha256.Sum256([]byte(b.String())) + return hex.EncodeToString(sum[:]) +} + +// readSchemaHash reads __drizzle_migrations and returns the schema hash for the +// applied migrations, or "" when the table is absent (a foreign/old database). +// A genuine query error is propagated so scanLoop/tailLoop can surface it. This +// is the single helper the poll loops call at scan/tail start to stamp the +// cursor's schema_hash with the REAL migration-name digest (replacing chunk C's +// present-column-shape placeholder). 
+func readSchemaHash(ctx context.Context, db *sql.DB) (string, error) { + names, _, err := readMigrations(ctx, db) + if err != nil { + if errors.Is(err, errNoMigrationsTable) { + return "", nil + } + return "", err + } + return schemaHash(names), nil +} + +// probeCountTables are the three tracked tables ProbeStatus counts, in the order +// it reports them. They are the canonical tree (session/message/part); the +// session_message sidecar is not surfaced in the startup probe (it carries only +// agent/model-switch markers, not session volume). +var probeCountTables = []string{"session", "message", "part"} + +// ProbeStatus opens the opencode database at dbPath strictly read-only and +// returns the row counts of the session/message/part tables plus the latest +// applied migration name (AC#8). The ingester's auto-discovery calls it once at +// startup to surface what the source will yield via /api/health and the +// discovery log. +// +// Cost note: each count is a full COUNT(*). On a multi-GB database that is a few +// hundred ms ONCE at startup, which is acceptable for a one-time probe; the +// steady-state tailer never runs these (it uses the PK-indexed MAX(id) gate). +// +// Graceful degradation: a table that does not exist makes its count 0 and is +// recorded as a soft error (joined into the returned err) rather than failing +// the whole probe, so a foreign SQLite file the probe stumbles on still +// registers and is observable. A missing __drizzle_migrations table likewise +// leaves latestMigration empty without erroring. A hard open/ping failure (the +// file is unreadable) IS returned so discovery can log it. 
+func ProbeStatus(ctx context.Context, dbPath string) (sessions, messages, parts int64, latestMigration string, err error) { + db, openErr := openReadOnly(ctx, dbPath) + if openErr != nil { + return 0, 0, 0, "", fmt.Errorf("opencode: probe open %s (ro): %w", dbPath, openErr) + } + defer func() { _ = db.Close() }() + + counts := make([]int64, len(probeCountTables)) + var softErrs []error + for i, table := range probeCountTables { + n, cErr := countRows(ctx, db, table) + if cErr != nil { + softErrs = append(softErrs, cErr) + continue + } + counts[i] = n + } + + _, latest, mErr := readMigrations(ctx, db) + if mErr != nil && !errors.Is(mErr, errNoMigrationsTable) { + softErrs = append(softErrs, mErr) + } + + return counts[0], counts[1], counts[2], latest, errors.Join(softErrs...) +} + +// countRows returns COUNT(*) for a tracked table. The table name is a fixed +// probeCountTables entry, never operator input, so it is safe to interpolate +// (quoted as an identifier defensively, mirroring maxID/maxTimeUpdated). +func countRows(ctx context.Context, db *sql.DB, table string) (int64, error) { + var n int64 + q := `SELECT COUNT(*) FROM ` + quoteIdent(table) // #nosec G202 -- table is a fixed probeCountTables identifier via quoteIdent, never user input + if err := db.QueryRowContext(ctx, q).Scan(&n); err != nil { + return 0, fmt.Errorf("opencode: count %s: %w", table, err) + } + return n, nil +} diff --git a/internal/adapters/opencode/migrations_test.go b/internal/adapters/opencode/migrations_test.go new file mode 100644 index 0000000..2c715f2 --- /dev/null +++ b/internal/adapters/opencode/migrations_test.go @@ -0,0 +1,334 @@ +package opencode + +import ( + "database/sql" + "errors" + "testing" +) + +// drizzleMigrationsDDL mirrors opencode's real __drizzle_migrations shape +// (adapter-opencode.md §"__drizzle_migrations"): an auto-increment id that +// increases in application order plus a nullable name column. 
Synthetic only — +// never the operator's database (SOW-0005 R5). +const drizzleMigrationsDDL = `CREATE TABLE __drizzle_migrations ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + hash TEXT NOT NULL, + created_at NUMERIC, + name TEXT, + applied_at TEXT)` + +// insertMigration inserts one applied-migration row. id is assigned by +// AUTOINCREMENT in call order so the natural application order matches insertion +// order. +func insertMigration(t *testing.T, rw *sql.DB, name string) { + t.Helper() + if _, err := rw.Exec( + `INSERT INTO __drizzle_migrations (hash, name, applied_at) VALUES (?,?,?)`, + "hash_"+name, name, "2026-05-30"); err != nil { + t.Fatalf("insert migration %q: %v", name, err) + } +} + +// newMigrationsDB builds a current-schema DB that ALSO carries a populated +// __drizzle_migrations table with the given names (in application order), and +// returns the read-only handle. The rw handle is closed before reopening RO so +// the WAL is flushed. +func newMigrationsDB(t *testing.T, names ...string) *sql.DB { + t.Helper() + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db", drizzleMigrationsDDL) + for _, n := range names { + insertMigration(t, rw, n) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + return openRO(t, path) +} + +// TestReadMigrations_OrderedAndLatest pins that readMigrations returns the names +// in application order (id ASC) and reports the highest-id name as latest: id, +// not name, is the applied-order key Drizzle maintains. The fixture inserts names +// in timestamp order, so the two orders coincide here. 
+func TestReadMigrations_OrderedAndLatest(t *testing.T) { + t.Parallel() + db := newMigrationsDB(t, + "20260127222353_familiar_lady_ursula", + "20260510033149_session_usage", + "20260511000411_data_migration_state", + ) + names, latest, err := readMigrations(ctxBG(), db) + if err != nil { + t.Fatalf("readMigrations: %v", err) + } + want := []string{ + "20260127222353_familiar_lady_ursula", + "20260510033149_session_usage", + "20260511000411_data_migration_state", + } + if len(names) != len(want) { + t.Fatalf("readMigrations names = %v, want %v", names, want) + } + for i := range want { + if names[i] != want[i] { + t.Fatalf("readMigrations names[%d] = %q, want %q (application order)", i, names[i], want[i]) + } + } + if latest != want[len(want)-1] { + t.Fatalf("latest = %q, want %q", latest, want[len(want)-1]) + } +} + +// TestReadMigrations_SkipsNullNames verifies a NULL/empty name row never pollutes +// the list or becomes the latest (the name column is nullable). +func TestReadMigrations_SkipsNullNames(t *testing.T) { + t.Parallel() + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db", drizzleMigrationsDDL) + insertMigration(t, rw, "20260127222353_first") + // A row with a NULL name (real schema allows it). 
+ if _, err := rw.Exec(`INSERT INTO __drizzle_migrations (hash, name) VALUES (?, NULL)`, "h_null"); err != nil { + t.Fatalf("insert null-name migration: %v", err) + } + insertMigration(t, rw, "20260510033149_last") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db := openRO(t, path) + + names, latest, err := readMigrations(ctxBG(), db) + if err != nil { + t.Fatalf("readMigrations: %v", err) + } + if len(names) != 2 { + t.Fatalf("readMigrations names = %v, want 2 (NULL skipped)", names) + } + if latest != "20260510033149_last" { + t.Fatalf("latest = %q, want the last non-null name", latest) + } +} + +// TestReadMigrations_MissingTableSentinel verifies a DB WITHOUT a +// __drizzle_migrations table (a very old or foreign SQLite file) returns the soft +// sentinel + empty results, not a hard error — so callers degrade gracefully. +func TestReadMigrations_MissingTableSentinel(t *testing.T) { + t.Parallel() + // newEmptyDB without the migrations DDL → no __drizzle_migrations table. + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db := openRO(t, path) + + names, latest, err := readMigrations(ctxBG(), db) + if !errors.Is(err, errNoMigrationsTable) { + t.Fatalf("readMigrations(no table) err = %v, want errNoMigrationsTable", err) + } + if names != nil || latest != "" { + t.Fatalf("readMigrations(no table) = (%v,%q), want (nil,\"\")", names, latest) + } +} + +// TestSchemaHash_StableOrderSensitiveAndDistinct pins the schema-hash contract: +// stable for the same ordered list, different for a different order, different +// for different content, and empty for an empty list. 
+func TestSchemaHash_StableOrderSensitiveAndDistinct(t *testing.T) { + t.Parallel() + a := []string{"m1", "m2", "m3"} + if schemaHash(a) != schemaHash([]string{"m1", "m2", "m3"}) { + t.Error("schemaHash not stable for the same ordered list") + } + if schemaHash(a) == schemaHash([]string{"m1", "m3", "m2"}) { + t.Error("schemaHash not order-sensitive (reordered names hashed the same)") + } + if schemaHash(a) == schemaHash([]string{"m1", "m2"}) { + t.Error("schemaHash collided for different lists") + } + if schemaHash(nil) != "" || schemaHash([]string{}) != "" { + t.Error("schemaHash(empty) must be \"\"") + } + // A separator-injection guard: ["m1\nm2"] must not collide with ["m1","m2"]; + // the length-prefix framing keeps the digest unambiguous even when a name contains a newline. + if schemaHash([]string{"m1\nm2"}) == schemaHash([]string{"m1", "m2"}) { + t.Error("schemaHash framing is ambiguous (newline-in-name collides with two names)") + } +} + +// TestReadSchemaHash_RealMigrations verifies readSchemaHash returns the digest of +// the live migration names and "" (no error) when the table is absent. +func TestReadSchemaHash_RealMigrations(t *testing.T) { + t.Parallel() + db := newMigrationsDB(t, "20260127222353_a", "20260510033149_b") + got, err := readSchemaHash(ctxBG(), db) + if err != nil { + t.Fatalf("readSchemaHash: %v", err) + } + want := schemaHash([]string{"20260127222353_a", "20260510033149_b"}) + if got != want { + t.Fatalf("readSchemaHash = %q, want %q", got, want) + } + + // Missing table → "" + nil (degrade, do not error). 
+ pathNo, rwNo := newEmptyDB(t, t.TempDir(), "opencode.db") + if err := rwNo.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + dbNo := openRO(t, pathNo) + if h, err := readSchemaHash(ctxBG(), dbNo); err != nil || h != "" { + t.Fatalf("readSchemaHash(no table) = (%q,%v), want (\"\",nil)", h, err) + } +} + +// TestRecordSchemaHash_RecordsReal asserts recordSchemaHash stamps the cursor with +// the REAL migration-name digest (replacing chunk C's present-column placeholder). +func TestRecordSchemaHash_RecordsReal(t *testing.T) { + t.Parallel() + db := newMigrationsDB(t, "20260127222353_a", "20260510033149_b") + var ce collectErrs + got := recordSchemaHash(ctxBG(), db, newCursor(), ce.onError) + want := schemaHash([]string{"20260127222353_a", "20260510033149_b"}) + if got.SchemaHash != want { + t.Fatalf("recordSchemaHash SchemaHash = %q, want %q (real migration digest)", got.SchemaHash, want) + } + if ce.count() != 0 { + t.Errorf("recordSchemaHash on a fresh cursor logged %d errors, want 0", ce.count()) + } +} + +// TestRecordSchemaHash_MismatchContinuesPreservingWatermarks pins the spec +// behaviour (adapter-opencode.md §"Cursor"): a cursor carrying a STALE hash +// (opencode applied a migration between runs) is re-stamped with the new hash, +// the watermarks are PRESERVED (no reset), and a structured WARN is surfaced via +// onError. Column drift is handled per-column by the dynamic SELECT, so a benign +// migration never forces a re-ingest. +func TestRecordSchemaHash_MismatchContinuesPreservingWatermarks(t *testing.T) { + t.Parallel() + db := newMigrationsDB(t, "20260127222353_a", "20260510033149_b") + newHash := schemaHash([]string{"20260127222353_a", "20260510033149_b"}) + + // A persisted cursor from an EARLIER schema (stale hash) with live watermarks. + stale := newCursor(). + withSchemaHash("deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef"). 
+ withTable("session", TableWatermark{MaxIDSeen: "ses_9", MaxTimeUpdatedMs: 1779, MaxTimeUpdatedID: "ses_9"}). + withTable("message", TableWatermark{MaxIDSeen: "msg_9", MaxTimeUpdatedMs: 1780, MaxTimeUpdatedID: "msg_9"}) + + var ce collectErrs + got := recordSchemaHash(ctxBG(), db, stale, ce.onError) + + if got.SchemaHash != newHash { + t.Fatalf("hash not re-stamped: got %q, want %q", got.SchemaHash, newHash) + } + // Watermarks must be preserved (NOT reset). + if w := got.Tables["session"]; w.MaxIDSeen != "ses_9" || w.MaxTimeUpdatedMs != 1779 || w.MaxTimeUpdatedID != "ses_9" { + t.Errorf("session watermark reset on mismatch: %+v", w) + } + if w := got.Tables["message"]; w.MaxIDSeen != "msg_9" || w.MaxTimeUpdatedMs != 1780 || w.MaxTimeUpdatedID != "msg_9" { + t.Errorf("message watermark reset on mismatch: %+v", w) + } + // A structured WARN must have been surfaced. + if ce.count() == 0 { + t.Error("schema-hash mismatch did not surface a WARN via onError") + } +} + +// TestProbeStatus_CountsAndLatest verifies the auto-discovery probe reports the +// session/message/part counts and the latest migration from a synthetic DB. +func TestProbeStatus_CountsAndLatest(t *testing.T) { + t.Parallel() + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db", drizzleMigrationsDDL) + // Two sessions, three messages, four parts. 
+ insertSession(t, rw, "ses_1", "", 100, 100, 0) + insertSession(t, rw, "ses_2", "", 110, 110, 0) + insertAssistantMessage(t, rw, "msg_1", "ses_1", 101, 101, 10, 5) + insertAssistantMessage(t, rw, "msg_2", "ses_1", 102, 102, 20, 10) + insertAssistantMessage(t, rw, "msg_3", "ses_2", 111, 111, 30, 15) + insertPart(t, rw, "prt_1", "msg_1", "ses_1", 103, 103, textBody("a")) + insertPart(t, rw, "prt_2", "msg_1", "ses_1", 104, 104, textBody("b")) + insertPart(t, rw, "prt_3", "msg_2", "ses_1", 105, 105, textBody("c")) + insertPart(t, rw, "prt_4", "msg_3", "ses_2", 112, 112, textBody("d")) + insertMigration(t, rw, "20260127222353_a") + insertMigration(t, rw, "20260510033149_latest") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + sessions, messages, parts, latest, err := ProbeStatus(ctxBG(), path) + if err != nil { + t.Fatalf("ProbeStatus: %v", err) + } + if sessions != 2 || messages != 3 || parts != 4 { + t.Fatalf("ProbeStatus counts = (%d,%d,%d), want (2,3,4)", sessions, messages, parts) + } + if latest != "20260510033149_latest" { + t.Fatalf("ProbeStatus latest = %q, want 20260510033149_latest", latest) + } +} + +// TestProbeStatus_MissingMigrationsTableDegrades verifies a DB without +// __drizzle_migrations still returns counts (no migration), with no hard error — +// a foreign SQLite file degrades gracefully. 
+func TestProbeStatus_MissingMigrationsTableDegrades(t *testing.T) { + t.Parallel() + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db") + insertSession(t, rw, "ses_1", "", 100, 100, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + sessions, messages, parts, latest, err := ProbeStatus(ctxBG(), path) + if err != nil { + t.Fatalf("ProbeStatus(no migrations) err = %v, want nil (degrade)", err) + } + if sessions != 1 || messages != 0 || parts != 0 { + t.Fatalf("ProbeStatus counts = (%d,%d,%d), want (1,0,0)", sessions, messages, parts) + } + if latest != "" { + t.Fatalf("ProbeStatus latest = %q, want \"\" (no migrations table)", latest) + } +} + +// TestProbeStatus_OpenErrorIsHard verifies a non-existent database file is a hard +// probe error (mode=ro refuses to create it), so discovery can log it. +func TestProbeStatus_OpenErrorIsHard(t *testing.T) { + t.Parallel() + _, _, _, _, err := ProbeStatus(ctxBG(), t.TempDir()+"/does-not-exist.db") + if err == nil { + t.Fatal("ProbeStatus(missing file) = nil error, want hard open error") + } +} + +// TestRecordSchemaHash_ReadErrorKeepsPriorCursor drives recordSchemaHash's +// non-sentinel read-error branch: a __drizzle_migrations table that EXISTS but +// lacks the `id` column makes the `ORDER BY id` query fail (a genuine error, not +// the missing-table sentinel). recordSchemaHash must surface a WARN via onError +// and keep the prior cursor unchanged (the backfill/poll still proceeds). +func TestRecordSchemaHash_ReadErrorKeepsPriorCursor(t *testing.T) { + t.Parallel() + // A migrations table WITHOUT an id column: present (so not the sentinel) but + // `ORDER BY id` errors. 
+ noIDDDL := `CREATE TABLE __drizzle_migrations (hash TEXT NOT NULL, name TEXT)` + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db", noIDDDL) + if _, err := rw.Exec(`INSERT INTO __drizzle_migrations (hash, name) VALUES (?,?)`, "h0", "m1"); err != nil { + t.Fatalf("insert: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db := openRO(t, path) + + // readSchemaHash must surface the error (not the sentinel). + if _, err := readSchemaHash(ctxBG(), db); err == nil { + t.Fatal("readSchemaHash over a no-id migrations table = nil error, want query error") + } + + prior := newCursor().withSchemaHash("priorhash"). + withTable("session", TableWatermark{MaxIDSeen: "ses_5", MaxTimeUpdatedMs: 99, MaxTimeUpdatedID: "ses_5"}) + var ce collectErrs + got := recordSchemaHash(ctxBG(), db, prior, ce.onError) + if got.SchemaHash != "priorhash" { + t.Errorf("recordSchemaHash on read error changed the hash to %q, want prior 'priorhash'", got.SchemaHash) + } + if w := got.Tables["session"]; w.MaxIDSeen != "ses_5" || w.MaxTimeUpdatedMs != 99 || w.MaxTimeUpdatedID != "ses_5" { + t.Errorf("recordSchemaHash on read error mutated watermarks: %+v", w) + } + if ce.count() == 0 { + t.Error("recordSchemaHash read error did not surface a WARN via onError") + } +} diff --git a/internal/adapters/opencode/payloads.go b/internal/adapters/opencode/payloads.go new file mode 100644 index 0000000..9e3f16d --- /dev/null +++ b/internal/adapters/opencode/payloads.go @@ -0,0 +1,47 @@ +package opencode + +import "net/url" + +// This file is the SINGLE SOURCE OF TRUTH for the opencode PayloadRef +// LocationURI grammar (SOW-0005 chunk D). It mirrors codex/claude_code, which +// keep URI construction in their own payloads.go rather than scattering it +// across the mapper. The mapper (mapper_turn.go) is pure and DB-agnostic; its +// built-in default (defaultPayloadURI) delegates here so there is exactly one +// place that knows the grammar. 
+// +// There is intentionally NO resolver/parser here. Nothing in the project reads +// an opencode-sqlite:// URI yet: the future /api/payloads resolver is a separate +// Phase-2 SOW. Building a parser now would be dead code (AGENTS.md +// "Runtime artifact discipline" / no half-built features). + +// payloadURIScheme is the scheme for an opencode payload reference. The body is +// not copied into ai-viewer's database; the reference records WHERE to read it +// (which part row, which JSON field) so the future resolver can fetch it +// read-only on demand. Hostless + pathless: the owning database is resolved from +// the payload_ref's source_id, not embedded in the URI. +const payloadURIScheme = "opencode-sqlite" + +// buildPayloadURI renders the canonical PayloadRef LocationURI for a body that +// lives in an opencode `part` row's `data` JSON. The grammar is: +// +// - scheme `opencode-sqlite` (no host, no path); +// - query params `part_id=<part-id>` and `field=<field-path>`; +// - both values URL-encoded via net/url so a part id or field path containing +// a reserved character (`&`, `=`, `?`, `#`, space, …) cannot corrupt the +// query or be misread by the resolver. +// +// Produces exactly: +// +// opencode-sqlite://?part_id=<part-id>&field=<field-path> +// +// `field` is a dotted path into the part's decoded `data` (e.g. "text", +// "state.input", "state.output"); the resolver SELECTs the owning source's +// `part.data` for `part_id` read-only and projects `field`. For the values +// opencode actually emits (Sonyflake part ids like `prt_...` and the fixed field +// names above) every character is URL-unreserved, so the encoded form is +// byte-identical to the pre-chunk-D literal concatenation — existing mapper +// goldens are unchanged. 
+func buildPayloadURI(partID, field string) string { + return payloadURIScheme + "://?part_id=" + url.QueryEscape(partID) + + "&field=" + url.QueryEscape(field) +} diff --git a/internal/adapters/opencode/payloads_test.go b/internal/adapters/opencode/payloads_test.go new file mode 100644 index 0000000..3f592bc --- /dev/null +++ b/internal/adapters/opencode/payloads_test.go @@ -0,0 +1,83 @@ +package opencode + +import ( + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// TestBuildPayloadURI pins the canonical opencode-sqlite:// grammar: scheme, +// part_id + field query params, in that order, with the literal form for +// unreserved values. +func TestBuildPayloadURI(t *testing.T) { + t.Parallel() + got := buildPayloadURI("prt_123", "text") + want := "opencode-sqlite://?part_id=prt_123&field=text" + if got != want { + t.Fatalf("buildPayloadURI = %q, want %q", got, want) + } +} + +// TestBuildPayloadURI_EncodesReservedChars verifies a part id or field path +// containing reserved characters (&, =, space, /) is URL-encoded so it cannot +// corrupt the query string or be misread by the future resolver. +func TestBuildPayloadURI_EncodesReservedChars(t *testing.T) { + t.Parallel() + got := buildPayloadURI("prt &=?", "state.output/v2") + // & and = and space and ? in the id must be percent-encoded; the '.' in the + // field is unreserved and stays literal, the '/' is encoded by QueryEscape. + want := "opencode-sqlite://?part_id=prt+%26%3D%3F&field=state.output%2Fv2" + if got != want { + t.Fatalf("buildPayloadURI(reserved) = %q, want %q", got, want) + } +} + +// TestDefaultPayloadURI_DelegatesToBuilder confirms the mapper's built-in default +// is byte-identical to buildPayloadURI (single source of truth) — so the chunk-B +// mapper goldens are unchanged after the relocation. 
+func TestDefaultPayloadURI_DelegatesToBuilder(t *testing.T) { + t.Parallel() + for _, tc := range []struct{ id, field string }{ + {"prt_9", "state.output"}, + {"prt_2", "text"}, + {"prt_123", "state.input"}, + } { + if got, want := defaultPayloadURI(tc.id, tc.field), buildPayloadURI(tc.id, tc.field); got != want { + t.Errorf("defaultPayloadURI(%q,%q) = %q, want %q (must delegate)", tc.id, tc.field, got, want) + } + } +} + +// TestMapSession_PayloadRefUsesBuilder is the integration check that the mapper +// emits PayloadRef LocationURIs through the builder: a reasoning part yields the +// exact opencode-sqlite:// form, proving the relocated builder is wired and the +// op→payload linkage is intact. This is the byte-identical contract the chunk-B +// goldens depend on. +func TestMapSession_PayloadRefUsesBuilder(t *testing.T) { + t.Parallel() + s := rootSession("ses_p", 0) + end := int64(2200) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), + reasoningPart("prt_2", 2000, &end, false), + ), + } + evs, err := mapSession(testSourceID, s, msgs) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + var found bool + for _, ev := range evs { + if p, ok := ev.(canonical.PayloadRefEvent); ok && p.PayloadKind == "llm_reasoning" { + found = true + want := buildPayloadURI("prt_2", "text") + if p.LocationURI != want { + t.Fatalf("reasoning PayloadRef URI = %q, want %q", p.LocationURI, want) + } + } + } + if !found { + t.Fatal("no llm_reasoning PayloadRef emitted") + } +} diff --git a/internal/adapters/opencode/review_fixes_test.go b/internal/adapters/opencode/review_fixes_test.go new file mode 100644 index 0000000..aee1dfb --- /dev/null +++ b/internal/adapters/opencode/review_fixes_test.go @@ -0,0 +1,384 @@ +package opencode + +import ( + "encoding/json" + "strings" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 external-review 
fixes that are expressible at the +// PURE-MAPPER layer (no DB): the live-turn finalize predicate (P1.3), the +// step-start force-close (P2.5), and the load-bearing decode-failure warnings +// (P2.6 — malformed session.model JSON + malformed task metadata). The +// DB-level fixes (P1.1 checkpoint-after-emit, P2.4 nested root, P2.7 +// session_message warn) are pinned in the tailer/store test files. + +// --- P1.3: live in-progress turns are NOT finalized --------------------------- + +// TestMapper_RunningTurnNotFinalized pins P1.3: an assistant message with NO +// data.time.completed, NO data.error, and NO step-finish part is a RUNNING turn — +// it emits TurnStarted but NOT TurnFinalized. opencode writes the message row live +// while the turn is still in progress; finalizing it would wrongly mark it +// completed. (A later poll re-emits + finalizes once it completes; idempotent.) +func TestMapper_RunningTurnNotFinalized(t *testing.T) { + t.Parallel() + s := rootSession("ses_run", 0) + // No completed ts, finish empty, and only a step-START (no step-finish). + msg := asgMsg("msg_a", 2000, nil, "anthropic", "claude-x", tokenCounts{Input: 10, Output: 5}, 0.01, "", "") + ss := stepStartAt("prt_ss", 2100) + ev := run(t, s, []messageWithParts{mwp(msg, ss)}) + + if got := countKind(ev, canonical.EvTurnStarted); got != 1 { + t.Errorf("TurnStarted = %d, want 1", got) + } + if got := countKind(ev, canonical.EvTurnFinalized); got != 0 { + t.Errorf("TurnFinalized = %d, want 0 (running turn: no completed ts, no error, no step-finish)", got) + } + // The open LLM op also stays running (no OpFinalized) per Edge #4. + if got := countKind(ev, canonical.EvOpFinalized); got != 0 { + t.Errorf("OpFinalized = %d, want 0 (the single open step has no finish)", got) + } +} + +// TestMapper_TurnFinalizedWhenTerminal pins the three terminal signals that DO +// finalize a turn (P1.3): a completed ts, OR a step-finish part, OR an error. 
+func TestMapper_TurnFinalizedWhenTerminal(t *testing.T) { + t.Parallel() + completed := int64(3000) + + cases := []struct { + name string + msg messageRow + parts []partRow + }{ + { + name: "completed ts set", + msg: asgMsg("msg_a", 2000, &completed, "anthropic", "claude-x", tokenCounts{Input: 10, Output: 5}, 0.01, "stop", ""), + parts: []partRow{stepStartAt("prt_ss", 2100)}, + }, + { + name: "has step-finish part", + msg: asgMsg("msg_a", 2000, nil, "anthropic", "claude-x", tokenCounts{Input: 10, Output: 5}, 0.01, "stop", ""), + parts: []partRow{stepStartAt("prt_ss", 2100), stepFinish("prt_sf", 10, 5, 0, 0, 0, 0.01)}, + }, + { + name: "has error", + msg: asgMsg("msg_a", 2000, nil, "anthropic", "claude-x", tokenCounts{Input: 10, Output: 5}, 0.01, "", "Overloaded"), + parts: []partRow{stepStartAt("prt_ss", 2100)}, + }, + } + for _, tc := range cases { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + ev := run(t, rootSession("ses_done", 0), []messageWithParts{mwp(tc.msg, tc.parts...)}) + if got := countKind(ev, canonical.EvTurnFinalized); got != 1 { + t.Errorf("TurnFinalized = %d, want 1 (terminal turn)", got) + } + }) + } +} + +// --- P2.5: step-start force-closes the previous open LLM op ------------------- + +// TestMapper_TwoStepStartsForceCloseFirst pins P2.5 (spec Edge #5): two step-start +// parts with NO step-finish between them → the FIRST LLM op is force-closed with +// Status="cancelled" and EndTs = the second step-start's start ts; the SECOND op +// stays running (no finalize) because the turn ends with it still open. +func TestMapper_TwoStepStartsForceCloseFirst(t *testing.T) { + t.Parallel() + s := rootSession("ses_orphan", 0) + // Two step-starts at distinct times, no step-finish anywhere. completed set so + // the TURN finalizes (isolating the op-level force-close from the turn gate). 
+ completed := int64(5000) + msg := asgMsg("msg_a", 2000, &completed, "anthropic", "claude-x", tokenCounts{}, 0, "stop", "") + ss1 := stepStartAt("prt_ss1", 2100) + ss2 := stepStartAt("prt_ss2", 3300) + ev := run(t, s, []messageWithParts{mwp(msg, ss1, ss2)}) + + starts := llmOps(ev) + if len(starts) != 2 { + t.Fatalf("llm OpStarted count = %d, want 2", len(starts)) + } + fins := opFinals(ev) + if len(fins) != 1 { + t.Fatalf("OpFinalized count = %d, want 1 (only the force-closed first op)", len(fins)) + } + // The single finalize is the FIRST op (seq = first start's seq), cancelled, + // EndTs = second step-start's start (3300 ms → µs). + if fins[0].Seq != starts[0].Seq { + t.Errorf("force-closed op Seq = %d, want first op Seq %d", fins[0].Seq, starts[0].Seq) + } + if fins[0].Status != "cancelled" { + t.Errorf("force-closed op Status = %q, want cancelled", fins[0].Status) + } + if fins[0].EndTs != msToMicros(3300) { + t.Errorf("force-closed op EndTs = %d, want %d (second step-start start)", fins[0].EndTs, msToMicros(3300)) + } + // The SECOND op (the one still open at turn end) must have NO finalize. + for _, f := range fins { + if f.Seq == starts[1].Seq { + t.Errorf("second op Seq %d was finalized; it must stay running", starts[1].Seq) + } + } +} + +// TestMapper_NormalStepPairNotCancelled guards against over-firing P2.5: a normal +// step-start → step-finish → step-start → step-finish sequence finalizes BOTH ops +// "completed" with NO cancelled status (the first op closed normally before the +// second start, so Edge #5 must not trigger). 
+func TestMapper_NormalStepPairNotCancelled(t *testing.T) { + t.Parallel() + completed := int64(9000) + msg := asgMsg("msg_a", 2000, &completed, "anthropic", "claude-x", tokenCounts{}, 0, "stop", "") + parts := []partRow{ + stepStartAt("prt_ss1", 2100), + stepFinish("prt_sf1", 100, 20, 0, 0, 0, 0.01), + stepStartAt("prt_ss2", 4000), + stepFinish("prt_sf2", 250, 50, 0, 0, 0, 0.02), + } + ev := run(t, rootSession("ses_pairs", 0), []messageWithParts{mwp(msg, parts...)}) + for _, f := range opFinals(ev) { + if f.Status == "cancelled" { + t.Errorf("op seq %d finalized cancelled; a normal step pair must close completed", f.Seq) + } + } + if got := len(opFinals(ev)); got != 2 { + t.Errorf("OpFinalized = %d, want 2 (both steps closed normally)", got) + } +} + +// --- P2.6: load-bearing decode failures surface a WARN ------------------------ + +// TestMapper_MalformedSessionModelWarns pins P2.6: a PRESENT-but-malformed +// session.model JSON degrades Model/provider to empty AND surfaces one WARN via +// the injected onWarn (rather than silently swallowing). The session is NOT +// aborted — the SessionStarted still emits. 
+func TestMapper_MalformedSessionModelWarns(t *testing.T) { + t.Parallel() + s := rootSession("ses_badmodel", 0) + s.Model = []byte(`{"id":`) // truncated JSON: present but unparseable + + var warns []error + completed := int64(3000) + msg := asgMsg("msg_a", 2000, &completed, "anthropic", "claude-x", tokenCounts{}, 0, "stop", "") + evs, err := mapSession(testSourceID, s, []messageWithParts{mwp(msg, stepStartAt("prt_ss", 2100), stepFinish("prt_sf", 1, 1, 0, 0, 0, 0))}, + WithOnWarn(func(e error) { warns = append(warns, e) })) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + if len(warns) == 0 { + t.Fatal("malformed session.model produced no WARN (silent failure)") + } + ss := firstStarted(t, evs) + if ss.Model != "" { + t.Errorf("Model = %q, want empty (malformed model degraded)", ss.Model) + } + if got := countKind(evs, canonical.EvSessionStarted); got != 1 { + t.Errorf("SessionStarted = %d, want 1 (session not aborted on malformed model)", got) + } +} + +// TestMapper_MalformedTaskMetadataWarns pins P2.6: a tool='task' part whose +// state.metadata is PRESENT but malformed surfaces one WARN (a possible sub-agent +// linkage was dropped) yet still emits the tool op for the task invocation. +func TestMapper_MalformedTaskMetadataWarns(t *testing.T) { + t.Parallel() + s := rootSession("ses_badmeta", 0) + completed := int64(5000) + msg := asgMsg("msg_a", 2000, &completed, "anthropic", "claude-x", tokenCounts{}, 0, "stop", "") + end := int64(4000) + task := taskPartBadMetadata("prt_task", 3000, &end) + + var warns []error + evs, err := mapSession(testSourceID, s, + []messageWithParts{mwp(msg, stepStartAt("prt_ss", 2100), task, stepFinish("prt_sf", 1, 1, 0, 0, 0, 0))}, + WithOnWarn(func(e error) { warns = append(warns, e) })) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + if len(warns) == 0 { + t.Fatal("malformed task metadata produced no WARN (silent linkage drop)") + } + // No session op (the child id could not be resolved) but the tool op survives. 
+ if got := countKindOpKind(evs, canonical.OpSession); got != 0 { + t.Errorf("session ops = %d, want 0 (metadata unparseable → no child id)", got) + } + sawTask := false + for _, op := range toolOps(evs) { + if op.Name == "task" { + sawTask = true + } + } + if !sawTask { + t.Error("tool op name=task missing; the invocation must still be recorded") + } +} + +// --- P2.4: resolveRootID chain walk edge cases (cycle / missing ancestor) ----- + +// TestResolveRootID_Edges pins resolveRootID's degrade paths (SOW-0005 P2.4): a +// root session resolves to itself; a clean 2-level chain resolves to the root; a +// MISSING ancestor row falls back to the furthest resolvable id + one WARN; a +// CYCLE is broken + one WARN. These are the branches the golden fixture (a clean +// 3-level tree) does not exercise. +func TestResolveRootID_Edges(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // root <- child <- grand (clean chain); orphan -> missing parent; a 2-cycle. 
+ insertSession(t, rw, "ses_root", "", 1, 1, 0) + insertSession(t, rw, "ses_child", "ses_root", 2, 2, 0) + insertSession(t, rw, "ses_grand", "ses_child", 3, 3, 0) + insertSession(t, rw, "ses_orphan", "ses_ghost", 4, 4, 0) // parent ses_ghost does not exist + insertSession(t, rw, "ses_cycA", "ses_cycB", 5, 5, 0) + insertSession(t, rw, "ses_cycB", "ses_cycA", 6, 6, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db := openRO(t, path) + + t.Run("root resolves to self with no query", func(t *testing.T) { + var ce collectErrs + if got := resolveRootID(ctxBG(), db, "ses_root", "", ce.onError); got != "ses_root" { + t.Errorf("root = %q, want ses_root", got) + } + if ce.count() != 0 { + t.Errorf("root resolution warned %d times, want 0", ce.count()) + } + }) + t.Run("clean chain resolves to top", func(t *testing.T) { + var ce collectErrs + if got := resolveRootID(ctxBG(), db, "ses_grand", "ses_child", ce.onError); got != "ses_root" { + t.Errorf("grand root = %q, want ses_root", got) + } + if ce.count() != 0 { + t.Errorf("clean chain warned %d times, want 0", ce.count()) + } + }) + t.Run("missing ancestor falls back + warns", func(t *testing.T) { + var ce collectErrs + // ses_orphan's parent ses_ghost is absent → fall back to ses_ghost (the + // furthest resolvable ancestor) and warn. 
+ if got := resolveRootID(ctxBG(), db, "ses_orphan", "ses_ghost", ce.onError); got != "ses_ghost" { + t.Errorf("orphan root = %q, want ses_ghost (furthest resolvable)", got) + } + if ce.count() != 1 { + t.Errorf("missing-ancestor warned %d times, want 1", ce.count()) + } + }) + t.Run("cycle is broken + warns", func(t *testing.T) { + var ce collectErrs + got := resolveRootID(ctxBG(), db, "ses_cycA", "ses_cycB", ce.onError) + if got != "ses_cycA" && got != "ses_cycB" { + t.Errorf("cycle root = %q, want one of the cycle ids (broken, not looping)", got) + } + if ce.count() != 1 { + t.Errorf("cycle warned %d times, want 1", ce.count()) + } + }) +} + +// --- P2.6: corrupt numeric cell surfaces a WARN, degrades to 0 ---------------- + +// TestLoadSession_CorruptNumericWarns pins the store-load half of P2.6: a session +// row whose numeric column holds non-numeric text (corrupt cell) degrades that +// field to 0 AND surfaces one WARN via onWarn — not silently swallowed. SQLite's +// flexible typing lets the fixture store a string in an INTEGER column. +func TestLoadSession_CorruptNumericWarns(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // Insert a session with a NON-NUMERIC tokens_input (corrupt). All other + // required columns are valid so the row loads; only the bad cell degrades. 
+ if _, err := rw.Exec( + `INSERT INTO session (id, project_id, slug, directory, title, version, tokens_input, time_created, time_updated) + VALUES ('ses_corrupt','prj','slug','/w','T','9.9.9','not-a-number',100,100)`); err != nil { + t.Fatalf("insert corrupt session: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + var warns []error + s, ok, err := loadSession(ctxBG(), db, schema, "ses_corrupt", func(e error) { warns = append(warns, e) }) + if err != nil { + t.Fatalf("loadSession: %v", err) + } + if !ok { + t.Fatal("loadSession(ses_corrupt) ok=false, want true") + } + if s.TokensInput != 0 { + t.Errorf("TokensInput = %d, want 0 (corrupt cell degraded)", s.TokensInput) + } + if len(warns) != 1 { + t.Fatalf("corrupt cell produced %d warnings, want 1", len(warns)) + } + if !strings.Contains(warns[0].Error(), "tokens_input") || !strings.Contains(warns[0].Error(), "corrupt numeric") { + t.Errorf("warn = %q, want one naming the corrupt column", warns[0].Error()) + } +} + +// TestParseCheckedAndPeek covers the small pure helpers' corrupt/malformed +// branches that the higher-level tests don't hit directly: parseInt64Checked / +// parseFloat64Checked on empty (ok), valid (ok), and corrupt (not ok); and +// peekPartType on empty/malformed/known bodies. 
+func TestParseCheckedAndPeek(t *testing.T) { + t.Parallel() + if v, ok := parseInt64Checked(""); v != 0 || !ok { + t.Errorf("parseInt64Checked(empty) = (%d,%v), want (0,true)", v, ok) + } + if v, ok := parseInt64Checked("42"); v != 42 || !ok { + t.Errorf("parseInt64Checked(42) = (%d,%v), want (42,true)", v, ok) + } + if v, ok := parseInt64Checked("nope"); v != 0 || ok { + t.Errorf("parseInt64Checked(nope) = (%d,%v), want (0,false)", v, ok) + } + if v, ok := parseFloat64Checked(""); v != 0 || !ok { + t.Errorf("parseFloat64Checked(empty) = (%v,%v), want (0,true)", v, ok) + } + if v, ok := parseFloat64Checked("1.5"); v != 1.5 || !ok { + t.Errorf("parseFloat64Checked(1.5) = (%v,%v), want (1.5,true)", v, ok) + } + if v, ok := parseFloat64Checked("nope"); v != 0 || ok { + t.Errorf("parseFloat64Checked(nope) = (%v,%v), want (0,false)", v, ok) + } + if got := peekPartType(nil); got != partUnknown { + t.Errorf("peekPartType(nil) = %q, want unknown", got) + } + if got := peekPartType([]byte(`{bad json`)); got != partUnknown { + t.Errorf("peekPartType(malformed) = %q, want unknown", got) + } + if got := peekPartType([]byte(`{"type":"step-finish"}`)); got != partStepFinish { + t.Errorf("peekPartType(step-finish) = %q, want step-finish", got) + } +} + +// --- local synthetic-row builders for the review-fix scenarios ---------------- + +// stepStartAt builds a step-start partRow with an explicit time_created (ms), so a +// force-close EndTs (the NEXT step-start's start) is assertable. The shared +// stepStart helper carries no time. +func stepStartAt(id string, createdMs int64) partRow { + raw, _ := json.Marshal(map[string]any{"type": "step-start"}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", TimeCreatedMs: createdMs, TimeUpdatedMs: createdMs, Data: raw} +} + +// taskPartBadMetadata builds a tool='task' part whose state.metadata is a JSON +// STRING (not the {sessionId:...} object the decoder expects), so +// subAgentSessionIDChecked reports malformed. 
The part has a completed state so the tool op +// finalizes. +func taskPartBadMetadata(id string, startMs int64, endMs *int64) partRow { + state := map[string]any{ + "status": "completed", + "input": map[string]any{"prompt": "x"}, + "output": "done", + "time": map[string]any{"start": startMs, "end": *endMs}, + "metadata": "not-an-object", // present but the wrong shape → decode error + } + raw, _ := json.Marshal(map[string]any{"type": "tool", "callID": "call_" + id, "tool": "task", "state": state}) + return partRow{ID: id, MessageID: "msg_a", SessionID: "ses_x", TimeCreatedMs: startMs, TimeUpdatedMs: startMs, Data: raw} +} diff --git a/internal/adapters/opencode/review_round2_store_test.go b/internal/adapters/opencode/review_round2_store_test.go new file mode 100644 index 0000000..7ba329e --- /dev/null +++ b/internal/adapters/opencode/review_round2_store_test.go @@ -0,0 +1,290 @@ +package opencode + +import ( + "context" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 ROUND-2 DB-level fixes: P2-B (one part query per +// session, not N+1), P2-C (migrationsTablePresent propagates a real fault rather +// than reporting "absent"), and P2-E (a session with non-NULL time_compacting is +// skipped this cycle and re-emits once the column clears). The mapper-layer +// round-2 fixes live in review_round2_test.go; P1-A in cursor_regression_test.go. + +// --- P2-B: parts loaded in ONE query per session ------------------------------ + +// TestP2B_PartsLoadedInOneQuery pins P2-B: loading a multi-message session's tree +// issues exactly ONE part SELECT (WHERE session_id = ?), NOT one per message. It +// drives loadSessionTree through the counting driver over a 3-message session and +// asserts the parts are correctly grouped per message AND the part query ran once. +// +// NOT t.Parallel(): the counting driver shares one global queryLog. 
+func TestP2B_PartsLoadedInOneQuery(t *testing.T) { + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + // Three messages, each with two parts, so an N+1 loader would run 3 part queries. + for m := 1; m <= 3; m++ { + mid := fmtID("msg", m) + insertAssistantMessage(t, rw, mid, "ses_a", int64(m*10), int64(m*10), 1, 1) + insertPart(t, rw, fmtID("prt", m*10+1), mid, "ses_a", int64(m*10+1), int64(m*10+1), stepStartBody()) + insertPart(t, rw, fmtID("prt", m*10+2), mid, "ses_a", int64(m*10+2), int64(m*10+2), textBody("t")) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + cdb, log := openCounting(t, path) + cschema, err := introspectAll(ctxBG(), cdb) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + log.reset() + + tree, err := loadSessionTree(ctxBG(), cdb, cschema, "ses_a", func(error) {}) + if err != nil { + t.Fatalf("loadSessionTree: %v", err) + } + if len(tree) != 3 { + t.Fatalf("tree messages = %d, want 3", len(tree)) + } + // Parts grouped correctly per message, in id order. + for i := range tree { + if len(tree[i].Parts) != 2 { + t.Errorf("message %s has %d parts, want 2", tree[i].Message.ID, len(tree[i].Parts)) + } + for _, p := range tree[i].Parts { + if p.MessageID != tree[i].Message.ID { + t.Errorf("part %s grouped under wrong message %s (want %s)", p.ID, tree[i].Message.ID, p.MessageID) + } + } + } + // The part SELECT (FROM "part") must have run EXACTLY ONCE, not once per message. 
+ if n := log.countContaining(`FROM "part"`); n != 1 { + t.Errorf("part query ran %d times for a 3-message session, want exactly 1 (P2-B: no N+1)", n) + } +} + +// --- P2-C: migrationsTablePresent propagates a real fault --------------------- + +// TestP2C_MigrationsTablePresentPropagatesError pins P2-C: a genuine query fault +// (here: a closed DB) makes migrationsTablePresent return an ERROR, not +// (false, nil) — the prior version folded every error into "absent", hiding +// corruption/ctx-cancel/closed-DB. A genuinely absent table still returns +// (false, nil). +func TestP2C_MigrationsTablePresentPropagatesError(t *testing.T) { + t.Parallel() + // Genuinely-absent table (a current schema synthetic DB has no __drizzle_migrations). + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db := openRO(t, path) + present, err := migrationsTablePresent(ctxBG(), db) + if err != nil { + t.Fatalf("absent table: err = %v, want nil (soft-absent)", err) + } + if present { + t.Fatal("absent table reported present") + } + + // A genuine fault: close the DB, then query → error must propagate. + db2 := openRO(t, path) + if cerr := db2.Close(); cerr != nil { + t.Fatalf("close db2: %v", cerr) + } + _, err = migrationsTablePresent(ctxBG(), db2) + if err == nil { + t.Fatal("query over a CLOSED DB returned nil error; want the fault propagated (P2-C, NOT treated as absent)") + } +} + +// TestP2C_ReadMigrationsPropagatesError pins that readMigrations surfaces the +// closed-DB fault (not the soft errNoMigrationsTable sentinel) so callers see the +// real failure. 
+func TestP2C_ReadMigrationsPropagatesError(t *testing.T) { + t.Parallel() + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db := openRO(t, path) + if cerr := db.Close(); cerr != nil { + t.Fatalf("close db: %v", cerr) + } + _, _, err := readMigrations(ctxBG(), db) + if err == nil { + t.Fatal("readMigrations over a closed DB = nil error; want the fault propagated (P2-C)") + } + if err == errNoMigrationsTable { + t.Fatal("readMigrations over a closed DB returned the soft sentinel; want the real fault (P2-C)") + } +} + +// --- P2-E: time_compacting pauses, then re-emits when it clears ---------------- + +// TestP2E_CompactingSessionSkippedThenEmits pins P2-E (adapter-opencode.md Edge +// #8): a session whose time_compacting is non-NULL is SKIPPED (no events) this +// cycle; once the column clears (and its time_updated bumps), a later cycle emits +// it. It drives reloadAndEmit directly across the two states. +func TestP2E_CompactingSessionSkippedThenEmits(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // A session mid-compaction: time_compacting set. + insertSession(t, rw, "ses_c", "", 100, 100, 0) + if _, err := rw.Exec(`UPDATE session SET time_compacting = 150 WHERE id = ?`, "ses_c"); err != nil { + t.Fatalf("set time_compacting: %v", err) + } + insertAssistantMessage(t, rw, "msg_a", "ses_c", 110, 110, 5, 2) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + + // Cycle 1: compaction in progress → NO events emitted for the session. 
+ out := make(chan canonical.Event, 64) + if err := reloadAndEmit(ctxBG(), db, schema, "opencode:test", []string{"ses_c"}, out, silentLogger(), func(error) {}); err != nil { + t.Fatalf("reloadAndEmit (compacting): %v", err) + } + got := drainAll(out) + if len(got) != 0 { + t.Fatalf("compacting session emitted %d events, want 0 (P2-E pause)", len(got)) + } + + // time_compacting clears (compaction finished); opencode bumps time_updated. + rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + if _, err := rw2.Exec(`UPDATE session SET time_compacting = NULL, time_updated = 200 WHERE id = ?`, "ses_c"); err != nil { + _ = rw2.Close() + t.Fatalf("clear time_compacting: %v", err) + } + if err := rw2.Close(); err != nil { + t.Fatalf("close rw2: %v", err) + } + + // Cycle 2: compaction cleared → the session's tree now emits. + out2 := make(chan canonical.Event, 64) + if err := reloadAndEmit(ctxBG(), db, schema, "opencode:test", []string{"ses_c"}, out2, silentLogger(), func(error) {}); err != nil { + t.Fatalf("reloadAndEmit (cleared): %v", err) + } + got2 := drainAll(out2) + if n := countKind(got2, canonical.EvSessionStarted); n != 1 { + t.Fatalf("cleared session emitted %d SessionStarted, want 1 (P2-E re-emit)", n) + } +} + +// TestP2E_LoadSessionReadsCompacting pins that loadSession populates +// TimeCompactingMs from the column so the skip predicate sees it. 
+func TestP2E_LoadSessionReadsCompacting(t *testing.T) { + t.Parallel() + path, rw := newEmptyDB(t, t.TempDir(), "opencode.db") + insertSession(t, rw, "ses_c", "", 100, 100, 0) + if _, err := rw.Exec(`UPDATE session SET time_compacting = 777 WHERE id = ?`, "ses_c"); err != nil { + t.Fatalf("set time_compacting: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + s, ok, err := loadSession(context.Background(), db, schema, "ses_c", nil) + if err != nil || !ok { + t.Fatalf("loadSession: ok=%v err=%v", ok, err) + } + if s.TimeCompactingMs != 777 { + t.Errorf("TimeCompactingMs = %d, want 777", s.TimeCompactingMs) + } +} + +// --- P1-C golden invariant (keyed on canonical FIELDS, not golden text) ------- + +// TestGoldenInvariant_HFailedTool re-scans the h_failed_tool fixture through the +// public adapter and asserts the P1-C invariant on canonical-event FIELDS, so a +// future -update-golden cannot silently launder the "error"→"failed" regression +// past review: the bash tool's OpFinalized carries the canonical status "failed" +// (never "error"), the opencode detail rides in ErrorClass+ErrorMessage, and the +// TURN stays "completed" (a failed tool op does not fail the turn). +func TestGoldenInvariant_HFailedTool(t *testing.T) { + t.Parallel() + ev := scenarioEvents(t, "h_failed_tool") + + var toolFin *canonical.OpFinalizedEvent + for i := range ev { + f, ok := ev[i].(canonical.OpFinalizedEvent) + if !ok { + continue + } + if f.Status == "error" { + t.Fatalf("op_finalized carries non-canonical status %q (P1-C: must be 'failed')", f.Status) + } + // The tool op is Seq 2 (Seq 1 is the LLM op); the failed one is the tool. 
+ if f.Seq == 2 { + fc := f + toolFin = &fc + } + } + if toolFin == nil { + t.Fatal("no tool op_finalized (seq 2) in h_failed_tool") + } + if toolFin.Status != "failed" { + t.Errorf("tool op status = %q, want failed (P1-C)", toolFin.Status) + } + if toolFin.ErrorClass != defaultErrorClass { + t.Errorf("tool op ErrorClass = %q, want %q (P1-C class label)", toolFin.ErrorClass, defaultErrorClass) + } + if toolFin.ErrorMessage == "" { + t.Error("tool op ErrorMessage empty, want the opencode state.error detail (P1-C)") + } + // The turn itself is completed (the tool failure is op-scoped, not turn-scoped). + tf := turnFinalForSeq(t, ev, 1) + if tf.Status != "completed" { + t.Errorf("turn status = %q, want completed (a failed tool op does not fail the turn)", tf.Status) + } +} + +// --- P3-C: SourceProgress emitted from ONE layer only ------------------------- + +// TestP3C_SingleBatchEmitsOneSourceProgress pins P3-C: a productive single-batch +// pollOnce emits EXACTLY ONE SourceProgress checkpoint — the batch processor's, +// after its sessions are emitted — not two (the trailing post-processChanges emit +// that used to double it was removed). A single small session fits in one batch, +// so exactly one checkpoint fires. 
+func TestP3C_SingleBatchEmitsOneSourceProgress(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", 2, 2, 5, 2) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + out := make(chan canonical.Event, 256) + cur := newCursor() + st := newPollState(false) + advanced, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("pollOnce: %v", err) + } + if !advanced { + t.Fatal("pollOnce over new rows reported advanced=false") + } + got := drainAll(out) + if n := countKind(got, canonical.EvSourceProgress); n != 1 { + t.Errorf("single-batch pollOnce emitted %d SourceProgress, want EXACTLY 1 (P3-C: one checkpoint layer)", n) + } +} + +// NOTE (SOW-0005 round-3 P3-1): the former TestP2B_OldSchemaPartFallbackOneQuery +// was removed along with the loadPartsByMessageIDs / selectPartsByMessageIDs +// fallback it exercised. session_id is a REQUIRED part column (requiredColumns), +// so introspectAll makes a part table lacking it FATAL upstream — the fallback was +// unreachable in production, and the test had to bypass introspection to reach it. +// The current P2-B test above (TestP2B_PartsLoadedInOneQuery) covers the live +// single-query path on the real (session_id-present) schema. 
diff --git a/internal/adapters/opencode/review_round2_test.go b/internal/adapters/opencode/review_round2_test.go new file mode 100644 index 0000000..573a4a8 --- /dev/null +++ b/internal/adapters/opencode/review_round2_test.go @@ -0,0 +1,339 @@ +package opencode + +import ( + "encoding/json" + "math" + "strings" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 ROUND-2 external-review fixes that are expressible +// at the PURE-MAPPER layer (no DB): P1-B (last-turn-error session finalize, not a +// sticky OR), P1-C (tool error → canonical "failed" + ErrorClass/Message), P2-A +// (error PRESENCE is terminal, not a non-empty name), P2-D (patch re-emit carries +// the full op identity), and P2-F (overflow clamps + WARN). The DB-level fixes +// (P1-A cursor split, P2-B N+1 parts, P2-C migrations err, P2-E time_compacting, +// P3-C SourceProgress single-emit) are pinned in the tailer/store test files. + +// asgMsgErr builds an assistant message with a custom session id and an OPTIONAL +// error object; errPtr nil means no error, a present (possibly empty-Name) error +// exercises the error-PRESENCE path. It mirrors asgMsg but lets the test control +// the session id (for multi-turn P1-B) and inject a raw error object. 
+func asgMsgErr(id, sessionID string, createdMs int64, completedMs *int64, errObj map[string]any) messageRow { + d := map[string]any{ + "role": "assistant", + "providerID": "the-alias", + "modelID": "the-model", + "agent": "test-agent", + "tokens": tokenCounts{}, + "time": map[string]any{"created": createdMs}, + "finish": "stop", + } + if completedMs != nil { + d["time"] = map[string]any{"created": createdMs, "completed": *completedMs} + } + if errObj != nil { + d["error"] = errObj + } + raw, _ := json.Marshal(d) + return messageRow{ID: id, SessionID: sessionID, TimeCreatedMs: createdMs, TimeUpdatedMs: createdMs, Data: raw} +} + +// sessionFinal returns the single SessionFinalizedEvent in the stream, or nil +// when the session stayed running (no finalize). +func sessionFinal(evs []canonical.Event) *canonical.SessionFinalizedEvent { + for i := range evs { + if f, ok := evs[i].(canonical.SessionFinalizedEvent); ok { + return &f + } + } + return nil +} + +// --- P1-B: failure is the LAST assistant turn's state, not a sticky OR -------- + +// TestP1B_SessionRecoversWhenLastTurnSucceeds pins P1-B: a session whose turn 1 +// errored but whose turn 2 succeeded is NOT finalized failed — the sticky +// failError is CLEARED by the recovering turn. (Both turns are terminal: turn 1 +// via its error, turn 2 via a completed ts.) +func TestP1B_SessionRecoversWhenLastTurnSucceeds(t *testing.T) { + t.Parallel() + s := rootSession("ses_x", 0) // not archived → terminal decided by last turn + comp := int64(3000) + turn1 := asgMsgErr("msg_1", "ses_x", 2000, &comp, map[string]any{"name": "ProviderError"}) + turn2 := asgMsgErr("msg_2", "ses_x", 4000, ptr(5000), nil) // recovered + evs := run(t, s, []messageWithParts{mwp(turn1), mwp(turn2)}) + + if fin := sessionFinal(evs); fin != nil { + t.Fatalf("session finalized %s after a recovering last turn; want NO finalize (P1-B)", fin.Status) + } + // Both turns still finalize at the turn level (turn 1 failed, turn 2 completed). 
+ tf := turnFinals(evs) + if len(tf) != 2 { + t.Fatalf("turn finals = %d, want 2", len(tf)) + } +} + +// TestP1B_SessionFailsWhenLastTurnErrors pins the converse: a session whose LAST +// turn errored IS finalized failed, carrying that turn's error class. +func TestP1B_SessionFailsWhenLastTurnErrors(t *testing.T) { + t.Parallel() + s := rootSession("ses_x", 0) + turn1 := asgMsgErr("msg_1", "ses_x", 2000, ptr(3000), nil) // clean + turn2 := asgMsgErr("msg_2", "ses_x", 4000, ptr(5000), map[string]any{"name": "FatalError"}) + evs := run(t, s, []messageWithParts{mwp(turn1), mwp(turn2)}) + + fin := sessionFinal(evs) + if fin == nil { + t.Fatal("session whose last turn errored was not finalized failed (P1-B)") + } + if fin.Status != canonical.StatusFailed { + t.Fatalf("session status = %q, want failed", fin.Status) + } + if fin.ErrorClass != "FatalError" { + t.Fatalf("session ErrorClass = %q, want FatalError (the LAST turn's error)", fin.ErrorClass) + } +} + +// TestP1B_ArchivedWinsOverError pins that archival still wins over a last-turn +// error: an archived session is completed regardless of its last turn. +func TestP1B_ArchivedWinsOverError(t *testing.T) { + t.Parallel() + s := rootSession("ses_x", 9000) // archived + turn := asgMsgErr("msg_1", "ses_x", 2000, ptr(3000), map[string]any{"name": "Whatever"}) + evs := run(t, s, []messageWithParts{mwp(turn)}) + + fin := sessionFinal(evs) + if fin == nil || fin.Status != canonical.StatusCompleted { + t.Fatalf("archived session must finalize completed regardless of a last-turn error, got %+v", fin) + } +} + +// --- P2-A: error PRESENCE is terminal/failed, even with an empty name --------- + +// TestP2A_EmptyNameErrorIsTerminalFailed pins P2-A: an error OBJECT with an EMPTY +// name still makes the turn terminal+failed (error presence, not name), and the +// ErrorClass defaults to the safe class label rather than blanking. 
+func TestP2A_EmptyNameErrorIsTerminalFailed(t *testing.T) { + t.Parallel() + s := rootSession("ses_x", 0) + // Error object present but name is "" — and NO completed ts, NO step-finish, so + // ONLY error-presence can make this turn terminal. + turn := asgMsgErr("msg_1", "ses_x", 2000, nil, map[string]any{"data": map[string]any{"detail": "x"}}) + evs := run(t, s, []messageWithParts{mwp(turn)}) + + tf := turnFinals(evs) + if len(tf) != 1 { + t.Fatalf("turn finals = %d, want 1 (empty-name error is terminal — P2-A)", len(tf)) + } + if tf[0].Status != "failed" { + t.Fatalf("turn status = %q, want failed (error presence — P2-A)", tf[0].Status) + } + if tf[0].ErrorClass != defaultErrorClass { + t.Fatalf("turn ErrorClass = %q, want %q default (P2-A)", tf[0].ErrorClass, defaultErrorClass) + } + // The session also finalizes failed with the default class. + fin := sessionFinal(evs) + if fin == nil || fin.Status != canonical.StatusFailed || fin.ErrorClass != defaultErrorClass { + t.Fatalf("session finalize = %+v, want failed + %q (P2-A)", fin, defaultErrorClass) + } +} + +// --- P2-D: patch-enrichment re-emit carries the full op identity -------------- + +// TestP2D_PatchReEmitCarriesIdentity pins P2-D: when a patch part lands inside a +// step, the LLM op's re-emit (OpStarted carrying the patch extras before the +// finalize) must carry the SAME Name/Model/Provider/ProviderAlias as the original +// OpStarted — so the ingest writer's UNCONDITIONAL ops.name update does not blank +// the name. There are then TWO OpStarted for the LLM op's seq, both fully +// identified, and the second carries the patch extras. 
+func TestP2D_PatchReEmitCarriesIdentity(t *testing.T) { + t.Parallel() + s := rootSession("ses_x", 0) + patch, _ := json.Marshal(map[string]any{"type": "patch", "hash": "abc123", "files": []string{"/a/b.go"}}) + patchPart := partRow{ID: "prt_patch", MessageID: "msg_a", SessionID: "ses_x", Data: patch} + msg := asgMsg("msg_a", 1500, ptr(3000), "the-alias", "the-model", tokenCounts{Input: 10}, 0.01, "stop", "") + evs := run(t, s, []messageWithParts{mwp(msg, + stepStart("prt_ss"), + patchPart, + stepFinish("prt_sf", 10, 5, 0, 0, 0, 0.01), + )}) + + // Collect the LLM OpStarted events (kind=llm). There must be TWO at the same + // seq: the original and the patch-enrichment re-emit. + var llmStarts []canonical.OpStartedEvent + for _, st := range opStarts(evs) { + if st.Kind == canonical.OpLLM { + llmStarts = append(llmStarts, st) + } + } + if len(llmStarts) != 2 { + t.Fatalf("LLM OpStarted count = %d, want 2 (original + patch re-emit — P2-D)", len(llmStarts)) + } + for i, st := range llmStarts { + if st.Name != "the-model" || st.Model != "the-model" { + t.Errorf("LLM OpStarted[%d] Name/Model = %q/%q, want the-model (P2-D: re-emit must keep identity)", i, st.Name, st.Model) + } + if st.Provider != "the-alias" || st.ProviderAlias != "the-alias" { + t.Errorf("LLM OpStarted[%d] Provider/Alias = %q/%q, want the-alias (P2-D)", i, st.Provider, st.ProviderAlias) + } + if st.Seq != llmStarts[0].Seq { + t.Errorf("LLM OpStarted[%d] Seq = %d, want %d (same op re-emit)", i, st.Seq, llmStarts[0].Seq) + } + } + // The re-emit (second) must carry the patch extras. 
+ if llmStarts[1].Extras["patch_hash"] != "abc123" { + t.Errorf("patch re-emit Extras[patch_hash] = %v, want abc123", llmStarts[1].Extras["patch_hash"]) + } +} + +// --- P2-F: overflow on crafted token values clamps + WARNs -------------------- + +// TestP2F_HugeTokenDeltaClampsAndWarns pins P2-F at the step-finish token level: +// two step-finish snapshots whose CUMULATIVE inputs, subtracted, would overflow +// int64 are clamped (not wrapped) and surface a WARN. The crafted sequence is a +// huge positive cumulative after a negative one (a corrupt DB value), so the +// delta a-b overflows positive → clamps to MaxInt64. +func TestP2F_HugeTokenDeltaClampsAndWarns(t *testing.T) { + t.Parallel() + var warns []error + cums := []tokenCounts{ + {Input: math.MinInt64 + 1}, // a corrupt negative cumulative + {Input: math.MaxInt64}, // next cumulative: MaxInt64 - (MinInt64+1) overflows positive + } + got := computeStepDeltas(cums, func(e error) { warns = append(warns, e) }) + if len(got) != 2 { + t.Fatalf("deltas len = %d, want 2", len(got)) + } + // The second delta must clamp to MaxInt64, NOT wrap to a negative/small value. + if got[1].Input != math.MaxInt64 { + t.Errorf("overflowing delta = %d, want MaxInt64 (clamp, no wrap — P2-F)", got[1].Input) + } + if len(warns) == 0 { + t.Error("overflowing token delta did not surface a WARN (P2-F)") + } + foundInput := false + for _, w := range warns { + if strings.Contains(w.Error(), "tokens.input") && strings.Contains(w.Error(), "overflow") { + foundInput = true + } + } + if !foundInput { + t.Errorf("WARN set %v missing a tokens.input overflow message", warns) + } + + // Negative overflow: a very negative cumulative after a large positive one. The + // subtraction underflows; the result must clamp to 0 (negative token counts are + // meaningless), still with a WARN. 
+ var negWarns []error + neg := computeStepDeltas([]tokenCounts{ + {Input: math.MaxInt64}, // first delta = MaxInt64 + {Input: math.MinInt64 + 1}, // MinInt64+1 - MaxInt64 underflows negative + }, func(e error) { negWarns = append(negWarns, e) }) + if neg[1].Input != 0 { + t.Errorf("underflowing delta = %d, want 0 (clamp, no wrap — P2-F)", neg[1].Input) + } + if len(negWarns) == 0 { + t.Error("underflowing token delta did not surface a WARN (P2-F)") + } +} + +// TestP2F_MsToMicrosSaturates pins P2-F at the timestamp level: a crafted huge ms +// value saturates at math.MaxInt64 rather than WRAPPING (a wrapped timestamp goes +// negative and reorders events nonsensically). +func TestP2F_MsToMicrosSaturates(t *testing.T) { + t.Parallel() + if got := msToMicros(math.MaxInt64); got != math.MaxInt64 { + t.Errorf("msToMicros(MaxInt64) = %d, want MaxInt64 (saturate, no wrap — P2-F)", got) + } + // A normal value still converts ×1000. + if got := msToMicros(1500); got != 1_500_000 { + t.Errorf("msToMicros(1500) = %d, want 1500000", got) + } + // A huge-but-in-range value just below the saturation threshold still ×1000. + safe := int64(math.MaxInt64/1000) - 1 + if got := msToMicros(safe); got != safe*1000 { + t.Errorf("msToMicros(%d) = %d, want %d", safe, got, safe*1000) + } +} + +// ptr returns a pointer to an int64 literal for the optional completed-ts args. +func ptr(v int64) *int64 { return &v } + +// --- round-3 P2-1: ms→µs clamp now WARNs; ctx_used add saturates + WARNs ------- + +// TestP2_1_MsToMicrosWarnsOnClamp pins round-3 P2-1: a session whose time_created +// is a crafted huge ms value clamps to MaxInt64 (as before, P2-F) AND now surfaces +// a WARN via the wired onWarn (it was a SILENT saturation before P2-1). The pure +// msToMicros (no warn channel) still saturates silently — covered by +// TestP2F_MsToMicrosSaturates. 
+func TestP2_1_MsToMicrosWarnsOnClamp(t *testing.T) { + t.Parallel() + s := rootSession("ses_clamp", 0) + s.TimeCreatedMs = math.MaxInt64 // *1000 would overflow → clamp + WARN + + var warns []error + evs, err := mapSession(testSourceID, s, nil, WithOnWarn(func(e error) { warns = append(warns, e) })) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + // SessionStarted's Ts must be the saturated MaxInt64, never a wrapped negative. + ss, ok := evs[0].(canonical.SessionStartedEvent) + if !ok { + t.Fatalf("first event = %T, want SessionStartedEvent", evs[0]) + } + if ss.Ts != math.MaxInt64 { + t.Errorf("clamped SessionStarted.Ts = %d, want MaxInt64 (saturate, no wrap)", ss.Ts) + } + foundTs := false + for _, w := range warns { + if strings.Contains(w.Error(), "session.time_created") && strings.Contains(w.Error(), "overflow") { + foundTs = true + } + } + if !foundTs { + t.Errorf("clamped timestamp did not surface a WARN naming the field; warns=%v", warns) + } + + // A NON-overflowing timestamp emits NO clamp WARN (no false positives). + var clean []error + _, _ = mapSession(testSourceID, rootSession("ses_ok", 0), nil, WithOnWarn(func(e error) { clean = append(clean, e) })) + for _, w := range clean { + if strings.Contains(w.Error(), "overflow") { + t.Errorf("a normal session timestamp wrongly warned of overflow: %v", w) + } + } +} + +// TestP2_1_CtxUsedAddSaturatesAndWarns pins round-3 P2-1's second half: the +// ctx_used = tokens.input + tokens.cache.read ADDITION saturates at MaxInt64 with +// a WARN on a crafted overflowing pair, instead of wrapping to a negative +// ctx_used. addClampWarn is the arithmetic; this asserts both the clamp and the +// WARN, plus that a normal pair adds cleanly with no WARN. +func TestP2_1_CtxUsedAddSaturatesAndWarns(t *testing.T) { + t.Parallel() + var warns []error + onWarn := func(e error) { warns = append(warns, e) } + + // MaxInt64 + 1 overflows positive → clamp to MaxInt64 + WARN. 
+ if got := addClampWarn(math.MaxInt64, 1, "ctx_used", onWarn); got != math.MaxInt64 { + t.Errorf("addClampWarn(MaxInt64,1) = %d, want MaxInt64 (saturate, no wrap)", got) + } + if len(warns) == 0 { + t.Fatal("overflowing ctx_used add did not surface a WARN (P2-1)") + } + if !strings.Contains(warns[0].Error(), "ctx_used") || !strings.Contains(warns[0].Error(), "overflow") { + t.Errorf("WARN = %q, want one naming ctx_used overflow", warns[0].Error()) + } + + // A normal pair adds cleanly with no WARN. + var clean []error + if got := addClampWarn(100, 250, "ctx_used", func(e error) { clean = append(clean, e) }); got != 350 { + t.Errorf("addClampWarn(100,250) = %d, want 350", got) + } + if len(clean) != 0 { + t.Errorf("normal ctx_used add wrongly warned: %v", clean) + } +} diff --git a/internal/adapters/opencode/review_round3_test.go b/internal/adapters/opencode/review_round3_test.go new file mode 100644 index 0000000..cd7b39f --- /dev/null +++ b/internal/adapters/opencode/review_round3_test.go @@ -0,0 +1,453 @@ +package opencode + +import ( + "strings" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 ROUND-3 external-review fixes that live at the +// DB/poll-loop layer: P1-1 (boundary-millisecond re-scan catches an in-place +// update of an already-seen low-id row at the cursor boundary), P1-2 (the session +// row read + time_compacting check + tree load share ONE read-only transaction), +// P2-2 (malformed message/part data routes through onError → /api/health), and +// P2-3 (the defensive full-tree-size WARN). The mapper-layer P2-1 fix lives in +// review_round2_test.go; the CLI P3-2 fix in cmd/ai-viewer-ingest/sources_test.go. 
+ +// --- P1-1: same-ms in-place update at the boundary is re-emitted -------------- + +// TestP1_1_BoundaryUpdateReEmitted pins the P1-1 fix (completed by round-4 P1): a +// cursor sits at (T, highID) for a table; an already-seen LOW-id row lives at the +// SAME ms T (the canonical "in-place update re-stamped to the same millisecond" +// case). The cheap MAX(id) path is silent (no id past the high-water), the gated +// MAX(time_updated) > gate is silent (boundary ms unchanged), and the forward +// delta's strict tie-break (time_updated = T AND id > highID) excludes the low-id +// row — so without the fix the row's session is lost forever. The boundary re-scan +// re-emits that session on a WARM/resumed boundary (boundaryReal==true) when the +// gate is open under EITHER trigger: +// - a WAL-driven probe (a WAL event since the last probe), OR +// - a SAFETY-NET probe (the 60 s net is due, NO WAL event) — covering a +// DROPPED/absent WAL hint (round-4 P1). +// +// The cold-Tail snapshot (boundaryReal==false, round-7 P2-1's single cold guard) +// must NOT re-emit on ANY path — the HEAD-snapshot replay guard. +func TestP1_1_BoundaryUpdateReEmitted(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // One session whose whole tree sits at ms=100. Its part id is LOW ("prt_low"). + insertSession(t, rw, "ses_b", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_b", "ses_b", 100, 100, 5, 2) + insertPart(t, rw, "prt_low", "msg_b", "ses_b", 100, 100, stepFinishBody(5, 2, 0.01)) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // A cursor that has ALREADY paged past ms=100 at a HIGHER tie-break id than the + // low-id row, with the monotonic high-water also above the low row — exactly the + // state in which the low-id boundary row is invisible to both detectors and the + // forward delta. MaxTimeUpdatedMs == 100 (the boundary T) on every tracked table. 
+ freshCursor := func() Cursor { + c := newCursor() + for _, table := range trackedTables { + c = c.withTable(table, TableWatermark{ + MaxIDSeen: "zzz_high", // > any planted id → cheap MAX(id) silent + MaxTimeUpdatedMs: 100, // boundary T + MaxTimeUpdatedID: "zzz_high", // > prt_low → forward tie-break excludes it + }) + } + return c + } + + // (a) Cold-Tail snapshot: a brand-new pollState (boundaryReal==false) with NO WAL + // event. The gate opens via the immediately-due safety net, but the boundary re-scan + // must NOT fire — this is the HEAD-snapshot reconciliation, and replaying the + // snapshot's boundary session there would be spurious (round-7 P2-1: boundaryReal is + // the single cold guard, gating BOTH the changed and gate-open paths). + cur := freshCursor() + stCold := newPollState(false) // boundaryReal=false; lastProbe zero ⇒ net immediately due + out0 := make(chan canonical.Event, 64) + if _, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &stCold, out0, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (cold first probe): %v", err) + } + if got := drainAll(out0); hasSession(got, "ses_b") { + t.Fatalf("boundary session re-emitted on the COLD snapshot (boundaryReal must guard it on all paths); got %d events", len(got)) + } + + // (b) SAFETY-NET probe after a prior cycle: the net is due with NO WAL event. The + // cursor at (T, highID) is a WARM/resumed boundary (a real prior paged position → + // boundaryReal=true), so round-4 P1's safety-net boundary re-scan fires and a + // same-ms in-place update that arrived with a missed WAL hint is still surfaced. + // (Round-7 P2-1: boundaryReal is now the single cold guard; a warm boundary like + // this one sets it true, where the old code relied on the removed priorProbe flag.) 
+ cur = freshCursor() + stNet := newPollState(true) + stNet.markProbe(time.Now().Add(-2 * timeUpdatedSafetyNet)) // net due; no WAL + outNet := make(chan canonical.Event, 256) + if _, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &stNet, outNet, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (safety-net probe): %v", err) + } + if got := drainAll(outNet); !hasSession(got, "ses_b") { + t.Fatalf("boundary re-scan did not re-emit ses_b on the safety-net probe (round-4 P1); got %d events", len(got)) + } + + // (c) WAL-driven probe with no detector advance: the boundary re-scan fires and + // re-emits ses_b's tree (the round-3 immediate path). The cursor at (T, highID) is + // a WARM/resumed boundary (boundaryReal=true); round-7 P2-1 gates this gate-open + // path on boundaryReal too, so a warm boundary is required for the WAL-driven + // re-emit (a cold Tail's first WAL-driven probe must NOT replay — see the cold case). + cur = freshCursor() + st2 := newPollState(true) + now := time.Now() + st2.markProbe(now.Add(-2 * timeUpdatedSafetyNet)) + st2.markWALEvent(now) // lastWALEvent.After(lastProbe) → gate open via WAL + out := make(chan canonical.Event, 256) + if _, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st2, out, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (WAL-driven): %v", err) + } + got := drainAll(out) + if !hasSession(got, "ses_b") { + t.Fatalf("boundary re-scan did not re-emit the same-ms in-place-updated session ses_b; got %d events", len(got)) + } + if n := countKind(got, canonical.EvSessionStarted); n != 1 { + t.Errorf("boundary re-emit SessionStarted count = %d, want 1", n) + } + // The cursor must NOT have advanced (the boundary rows are already at the + // watermark; the re-scan only re-emits). 
+ if cur.Tables["part"].MaxTimeUpdatedID != "zzz_high" { + t.Errorf("boundary re-scan advanced the cursor (MaxTimeUpdatedID=%q); it must not", cur.Tables["part"].MaxTimeUpdatedID) + } +} + +// TestP1_1_CompactingClearsAtBoundaryReSurfacesOnSafetyNet pins the round-4 P1 +// completeness case the brief calls out: a session that was paused mid-compaction +// (skipped, no events) has its time_compacting CLEARED by an in-place UPDATE that +// lands at exactly the cursor's boundary ms T. That update moves neither MAX(id) +// (no insert) nor MAX(time_updated) (boundary value unchanged), and the forward +// delta's strict tie-break excludes the session row (id <= boundary highID) — so +// without the safety-net boundary re-scan the now-clean session is stranded. With +// a warm boundary (boundaryReal==true) and the net due (NO WAL event), the boundary re-scan re-surfaces +// it and the session emits its tree (it is no longer skipped). +func TestP1_1_CompactingClearsAtBoundaryReSurfacesOnSafetyNet(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // A session whose whole tree sits at ms=100, time_compacting already CLEARED + // (NULL) — i.e. compaction finished, re-stamped to the same boundary ms. + insertSession(t, rw, "ses_comp", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_comp", "ses_comp", 100, 100, 5, 2) + insertPart(t, rw, "prt_comp", "msg_comp", "ses_comp", 100, 100, stepFinishBody(5, 2, 0.01)) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // Cursor already at the boundary (100, highID) on every table, with the monotonic + // high-water above the session/message/part ids → both detectors silent, forward + // delta tie-break excludes the rows.
+ cur := newCursor() + for _, table := range trackedTables { + cur = cur.withTable(table, TableWatermark{MaxIDSeen: "zzz_high", MaxTimeUpdatedMs: 100, MaxTimeUpdatedID: "zzz_high"}) + } + + // WARM/resumed boundary (cursor at a real paged position → boundaryReal=true); the + // net is due with NO WAL event, so the safety-net boundary re-scan fires (round-4 P1). + // Round-7 P2-1: boundaryReal is the single cold guard, so a warm boundary is what + // arms this safety-net re-emit (the old code keyed it off the now-removed priorProbe). + st := newPollState(true) + st.markProbe(time.Now().Add(-2 * timeUpdatedSafetyNet)) // net due; no WAL + out := make(chan canonical.Event, 256) + if _, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (safety-net, compaction cleared at boundary): %v", err) + } + got := drainAll(out) + if !hasSession(got, "ses_comp") { + t.Fatalf("a session whose compaction cleared at the boundary ms was not re-surfaced by the safety-net boundary re-scan; got %d events", len(got)) + } + if n := countKind(got, canonical.EvSessionStarted); n != 1 { + t.Errorf("re-surfaced compaction session SessionStarted count = %d, want 1", n) + } +} + +// TestP1_1_BoundaryReScanSkipsColdAndEmptyTables pins two guards: a table with a +// zero boundary watermark (cold start) and a probe with no WAL event both yield no +// boundary re-emit, so an empty/idle DB never spuriously fires. +func TestP1_1_BoundaryReScanSkipsColdAndEmptyTables(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_c", "", 100, 100, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // A zero cursor: every table's MaxTimeUpdatedMs == 0 → boundary re-scan skips + // every table even on a WAL-driven probe. 
+ affected, err := boundaryAffectedSessions(ctxBG(), db, schema, newCursor(), func(error) {}) + if err != nil { + t.Fatalf("boundaryAffectedSessions: %v", err) + } + if len(affected) != 0 { + t.Errorf("cold-cursor boundary re-scan returned %v, want none", affected) + } +} + +// TestP1_1_BoundaryAffectedSessionsAcrossTables exercises the boundary re-scan's +// per-table derivation: a message row and a part row both at the boundary ms +// contribute their session via the message and part handlers (the part path also +// runs resolvePartSession). It also covers the error path on a closed DB. +func TestP1_1_BoundaryAffectedSessionsAcrossTables(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_x", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_x", "ses_x", 100, 100, 5, 2) + insertPart(t, rw, "prt_x", "msg_x", "ses_x", 100, 100, textBody("t")) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // Cursor boundary at ms=100 on every table → all three (session/message/part) + // boundary buckets contain ses_x's rows. The derived set is exactly {ses_x}. + cur := newCursor() + for _, table := range trackedTables { + cur = cur.withTable(table, TableWatermark{MaxIDSeen: "zzz", MaxTimeUpdatedMs: 100, MaxTimeUpdatedID: "zzz"}) + } + affected, err := boundaryAffectedSessions(ctxBG(), db, schema, cur, func(error) {}) + if err != nil { + t.Fatalf("boundaryAffectedSessions: %v", err) + } + if len(affected) != 1 || affected[0] != "ses_x" { + t.Fatalf("boundary affected = %v, want [ses_x]", affected) + } + + // Error path: a closed DB makes the per-table bucket query fail; the error is + // surfaced (not swallowed). 
+ closed := openRO(t, path) + if err := closed.Close(); err != nil { + t.Fatalf("close: %v", err) + } + if _, err := boundaryAffectedSessions(ctxBG(), closed, schema, cur, func(error) {}); err == nil { + t.Error("boundaryAffectedSessions on a closed DB returned nil error, want a surfaced failure") + } + + // emitBoundarySessions over a cold (zero) cursor returns emitted=false with no + // events and no error (the no-affected early return). + out := make(chan canonical.Event, 8) + emitted, eerr := emitBoundarySessions(ctxBG(), db, schema, newCursor(), "opencode:test", out, silentLogger(), func(error) {}) + if eerr != nil { + t.Fatalf("emitBoundarySessions(cold): %v", eerr) + } + if emitted { + t.Error("emitBoundarySessions(cold) reported emitted=true, want false") + } +} + +// --- P1-2: single read-only transaction for the whole per-session read -------- + +// TestP1_2_CompactingSkippedAtomically pins P1-2's observable contract: a session +// whose time_compacting is non-NULL is skipped (no tree emit) — and the check + +// the (skipped) tree read happen on one snapshot. A direct mid-read race is not +// portably forceable; this asserts the atomic-skip behaviour and that a CLEAN +// session loads its tree on the same path. +func TestP1_2_CompactingSkippedAtomically(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // A compacting session (time_compacting set) and a clean one. + insertSession(t, rw, "ses_busy", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_busy", "ses_busy", 110, 110, 5, 2) + if _, err := rw.Exec(`UPDATE session SET time_compacting = 555 WHERE id = ?`, "ses_busy"); err != nil { + t.Fatalf("set time_compacting: %v", err) + } + insertSession(t, rw, "ses_ok", "", 200, 200, 0) + insertAssistantMessage(t, rw, "msg_ok", "ses_ok", 210, 210, 7, 3) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // Compacting session: skipped, NO events. 
+ evs, skipped, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_busy", silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("loadAndMapSession(ses_busy): %v", err) + } + if !skipped { + t.Error("a compacting session was not skipped (P1-2/P2-E)") + } + if len(evs) != 0 { + t.Errorf("a compacting session emitted %d events, want 0", len(evs)) + } + + // Clean session: loaded + emitted on the same single-tx path. + evs2, skipped2, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_ok", silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("loadAndMapSession(ses_ok): %v", err) + } + if skipped2 { + t.Error("a clean session was wrongly skipped") + } + if n := countKind(evs2, canonical.EvSessionStarted); n != 1 { + t.Errorf("clean session SessionStarted count = %d, want 1", n) + } +} + +// TestP1_2_TreeLoadRunsInCallerTx proves loadSessionTree no longer opens its own +// transaction: it accepts a roQuerier and runs entirely within the transaction the +// caller (loadAndMapSession) owns, so the session row, the compaction check, the +// root resolution and the tree all share one consistent snapshot. The test passes +// an explicit tx and asserts the tree loads; a tx ROLLBACK afterwards must succeed +// (loadSessionTree did not commit it — the caller owns the lifecycle). 
+func TestP1_2_TreeLoadRunsInCallerTx(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_t", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_t", "ses_t", 110, 110, 5, 2) + insertPart(t, rw, "prt_t", "msg_t", "ses_t", 110, 110, textBody("hi")) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + tx, err := beginRO(ctxBG(), db) + if err != nil { + t.Fatalf("beginRO: %v", err) + } + tree, err := loadSessionTree(ctxBG(), tx, schema, "ses_t", func(error) {}) + if err != nil { + t.Fatalf("loadSessionTree(tx): %v", err) + } + if len(tree) != 1 || len(tree[0].Parts) != 1 { + t.Fatalf("tree shape = %d msgs / parts, want 1/1", len(tree)) + } + // The caller still owns the tx: loadSessionTree did not commit, so an explicit + // Rollback here succeeds (it would error "already committed" otherwise). + if err := tx.Rollback(); err != nil { + t.Errorf("tx.Rollback after loadSessionTree: %v (loadSessionTree must NOT own the tx lifecycle)", err) + } +} + +// --- P2-2: malformed message/part data routes through onError (health) -------- + +// TestP2_2_MalformedDataRoutesToOnError pins P2-2: a session with a malformed +// message.data blob (NOT-NULL but undecodable — a corruption signal) routes the +// failure through the adapter's onError callback (which the ingester turns into a +// SourceErrorEvent → /api/health) IN ADDITION to the session-scoped WRN LogEntry. +func TestP2_2_MalformedDataRoutesToOnError(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_m", "", 100, 100, 0) + // A message whose data is not valid JSON. 
+ insertMessageRaw(t, rw, "msg_m", "ses_m", 110, 110, "{not json") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + var onErr []error + evs, skipped, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_m", silentLogger(), + func(e error) { onErr = append(onErr, e) }) + if err != nil { + t.Fatalf("loadAndMapSession: %v", err) + } + if skipped { + t.Fatal("session wrongly skipped") + } + // onError fired (so /api/health degrades) and names the malformed message. + foundMsg := false + for _, e := range onErr { + if strings.Contains(e.Error(), "undecodable message.data") && strings.Contains(e.Error(), "msg_m") { + foundMsg = true + } + } + if !foundMsg { + t.Errorf("malformed message did not route through onError; got %v", onErr) + } + // The session WRN LogEntry is still emitted (detail view). + wrn := 0 + for _, ev := range evs { + if l, ok := ev.(canonical.LogEntryEvent); ok && l.Severity == "WRN" { + wrn++ + } + } + if wrn != 1 { + t.Errorf("session WRN LogEntry count = %d, want 1 (kept alongside onError)", wrn) + } +} + +// TestP2_2_MalformedPartRoutesToOnError is the part-level companion: an assistant +// turn with one malformed part blob routes through onError AND keeps the WRN. 
+func TestP2_2_MalformedPartRoutesToOnError(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_p", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_p", "ses_p", 110, 110, 5, 2) + insertPart(t, rw, "prt_bad", "msg_p", "ses_p", 110, 110, "{not json") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + var onErr []error + if _, _, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_p", silentLogger(), + func(e error) { onErr = append(onErr, e) }); err != nil { + t.Fatalf("loadAndMapSession: %v", err) + } + found := false + for _, e := range onErr { + if strings.Contains(e.Error(), "undecodable part.data") && strings.Contains(e.Error(), "prt_bad") { + found = true + } + } + if !found { + t.Errorf("malformed part did not route through onError; got %v", onErr) + } +} + +// --- P2-3: defensive full-tree-size WARN -------------------------------------- + +// TestP2_3_OversizedSessionWarns pins P2-3: warnIfSessionTooLarge emits ONE +// structured WARN via onWarn when a session's message or part count exceeds its +// bound, and stays silent for a normal-sized session. The threshold consts are +// 100k (too large to materialize in a test), so this drives the bound predicate +// directly with synthetic counts — the same predicate loadSessionTree calls. +func TestP2_3_OversizedSessionWarns(t *testing.T) { + t.Parallel() + + // Over the MESSAGE bound → exactly one WARN naming messages. + tooManyMsgs := make([]messageRow, maxSessionMessagesWarn+1) + var warns []error + warnIfSessionTooLarge("ses_big", tooManyMsgs, nil, func(e error) { warns = append(warns, e) }) + if len(warns) != 1 || !strings.Contains(warns[0].Error(), "messages") { + t.Fatalf("oversized-messages WARN = %v, want exactly one naming messages", warns) + } + + // Over the PART bound → exactly one WARN naming parts. 
+ bigParts := map[string][]partRow{"msg_x": make([]partRow, maxSessionPartsWarn+1)} + var pwarns []error + warnIfSessionTooLarge("ses_big", nil, bigParts, func(e error) { pwarns = append(pwarns, e) }) + if len(pwarns) != 1 || !strings.Contains(pwarns[0].Error(), "parts") { + t.Fatalf("oversized-parts WARN = %v, want exactly one naming parts", pwarns) + } + + // A normal session → NO WARN. + var none []error + warnIfSessionTooLarge("ses_small", + make([]messageRow, 3), + map[string][]partRow{"m": make([]partRow, 5)}, + func(e error) { none = append(none, e) }) + if len(none) != 0 { + t.Errorf("normal-sized session warned: %v", none) + } + + // A nil onWarn is a no-op (the pure no-DB path), not a panic. + warnIfSessionTooLarge("ses_big", tooManyMsgs, nil, nil) +} diff --git a/internal/adapters/opencode/review_round4_test.go b/internal/adapters/opencode/review_round4_test.go new file mode 100644 index 0000000..3b67011 --- /dev/null +++ b/internal/adapters/opencode/review_round4_test.go @@ -0,0 +1,249 @@ +package opencode + +import ( + "database/sql" + "math" + "strings" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 ROUND-4 external-review fixes: +// - P2-1: the DELTA-ROW scanners surface a corrupt OPTIONAL numeric cell via a +// WARN and degrade to 0 (parity with the non-delta loadSession path), while a +// corrupt REQUIRED watermark cell (id/time_updated) ERRORS the page so the +// cursor never advances to a poisoned watermark. +// - P2-2: a part/op/tool emitter fed a crafted huge timestamp clamps to +// math.MaxInt64 AND surfaces a WARN (no silent saturation on an emitted Ts). +// +// (P1 lives in review_round3_test.go; the file-part LogEntry P2-3 in mapper_test.go +// / mapper_branch_test.go; the DSN P3-2 in conn_dsn_test.go; the ProbeStatus P3-1 +// in cmd/ai-viewer-ingest/sources_test.go.) 
+ +// --- P2-1: corrupt delta-row cells warn (optional) / error (required) --------- + +// insertSessionCorruptCol inserts a session row whose ONE named numeric column +// carries a non-numeric TEXT literal (a corrupt cell), exercising the delta +// scanner's parse-failure path. SQLite's flexible typing stores an unconvertible +// string verbatim even in an INTEGER/REAL-affinity column, so parseInt64Checked / +// parseFloat64Checked fail on it — the same shape a corrupt opencode DB would have. +// The base required columns are always set to valid values; `col` is set to the +// corrupt text. col must NOT be one of the base columns below (so it is never +// duplicated). For corrupting a base required column (time_updated), the base value +// for that column is overridden via the corrupt argument list instead. +func insertSessionCorruptCol(t *testing.T, path, id, col, badText string) { + t.Helper() + rw, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { _ = rw.Close() }() + // Base columns whose VALUES we control per-column so `col` is never duplicated. + cols := []string{"id", "project_id", "slug", "directory", "title", "version", "time_created", "time_updated"} + vals := []any{id, "prj", "slug", "/d", "T", "9", int64(100), int64(100)} + // If the corrupt column is one of the base columns, overwrite its value in place; + // otherwise append it. 
+ replaced := false + for i, c := range cols { + if c == col { + vals[i] = badText + replaced = true + break + } + } + if !replaced { + cols = append(cols, col) + vals = append(vals, badText) + } + placeholders := strings.TrimSuffix(strings.Repeat("?,", len(cols)), ",") + stmt := `INSERT INTO session (` + strings.Join(cols, ", ") + `) VALUES (` + placeholders + `)` + if _, err := rw.Exec(stmt, vals...); err != nil { + t.Fatalf("insert corrupt %s: %v", col, err) + } +} + +// TestP2_1_CorruptOptionalCellWarnsDegradesToZero pins P2-1's optional-column path: +// a delta row whose OPTIONAL numeric cell (cost) is corrupt surfaces a WARN naming +// the table/column AND still resolves the row's session (degrades to 0, does not +// abort the page) — parity with the non-delta loadSession path. +func TestP2_1_CorruptOptionalCellWarnsDegradesToZero(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + insertSessionCorruptCol(t, path, "ses_cost", "cost", "not-a-number") + db, schema := introspect(t, path) + + var warns []error + affected := newAffectedSet() + // The WARN is buffered in sink during the page tx and flushed via the + // scanTableDelta onError AFTER the tx closes (round-5 P2-1). + sink := &warnSink{} + onRow := deltaRowHandler("session", schema["session"], affected, sink.collect) + if _, err := scanTableDelta(ctxBG(), db, schema["session"], TableWatermark{}, onRow, sink, func(e error) { warns = append(warns, e) }); err != nil { + t.Fatalf("scanTableDelta: corrupt OPTIONAL cell must NOT abort the page, got %v", err) + } + // Session still derived (row processed, cost degraded to 0). + if ids := affected.ids(); len(ids) != 1 || ids[0] != "ses_cost" { + t.Fatalf("affected = %v, want [ses_cost] (corrupt optional cell degrades, not skips)", affected.ids()) + } + // Exactly the corrupt-cost WARN surfaced. 
+ foundCost := false + for _, w := range warns { + if strings.Contains(w.Error(), "corrupt numeric cell") && strings.Contains(w.Error(), "column=cost") { + foundCost = true + } + } + if !foundCost { + t.Errorf("corrupt cost cell did not WARN; got %v", warns) + } +} + +// TestP2_1_CorruptRequiredCellErrorsNoCursorAdvance pins P2-1's required-column +// path: a delta row whose REQUIRED watermark cell (time_updated) is corrupt ERRORS +// the page rather than coercing to 0 — so the cursor cannot advance to a poisoned +// (0) watermark. The error is surfaced; the watermark stays at the input. +func TestP2_1_CorruptRequiredCellErrorsNoCursorAdvance(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + // A session row whose time_updated is a non-numeric text literal (corrupt). + insertSessionCorruptCol(t, path, "ses_bad_tuid", "time_updated", "garbage") + db, schema := introspect(t, path) + + affected := newAffectedSet() + sink := &warnSink{} + onRow := deltaRowHandler("session", schema["session"], affected, sink.collect) + from := TableWatermark{MaxTimeUpdatedMs: 50, MaxTimeUpdatedID: "aaa", MaxIDSeen: "aaa"} + delta, err := scanTableDelta(ctxBG(), db, schema["session"], from, onRow, sink, func(error) {}) + if err == nil { + t.Fatal("corrupt REQUIRED time_updated cell must ERROR the page (no poisoned-0 watermark)") + } + if !strings.Contains(err.Error(), "poisoned watermark") { + t.Errorf("error = %v, want a poisoned-watermark refusal", err) + } + // The watermark must NOT have advanced to a 0/garbage value — it stays at the + // input position (the page aborted before recording any row). 
+ if delta.watermark.MaxTimeUpdatedMs != from.MaxTimeUpdatedMs || delta.watermark.MaxTimeUpdatedID != from.MaxTimeUpdatedID { + t.Errorf("watermark advanced past the corrupt row: got (%d,%q), want input (%d,%q)", + delta.watermark.MaxTimeUpdatedMs, delta.watermark.MaxTimeUpdatedID, from.MaxTimeUpdatedMs, from.MaxTimeUpdatedID) + } +} + +// TestP2_1_RequiredAccessorsGuard pins the required-watermark accessors directly: +// i64Required/strRequired (the guard the delta AND boundary scanners share via +// requiredWatermark) return an error on a NULL/absent/corrupt required cell rather +// than a coerced 0/"" — the single chokepoint that prevents a poisoned watermark +// on EITHER scan path. +func TestP2_1_RequiredAccessorsGuard(t *testing.T) { + t.Parallel() + idx := columnIndex{"id": 0, "time_updated": 1} + + // Corrupt (non-numeric) required time_updated → error. + dBad := &scanDest{holders: []sql.NullString{{String: "x1", Valid: true}, {String: "not-num", Valid: true}}, table: "session"} + if _, err := dBad.i64Required(idx, "time_updated"); err == nil { + t.Error("i64Required on a corrupt cell returned nil error (must refuse the poisoned watermark)") + } + // NULL required time_updated → error (never silently 0). + dNull := &scanDest{holders: []sql.NullString{{String: "x1", Valid: true}, {Valid: false}}, table: "session"} + if _, err := dNull.i64Required(idx, "time_updated"); err == nil { + t.Error("i64Required on a NULL required cell returned nil error") + } + // Empty required id → error. + dEmptyID := &scanDest{holders: []sql.NullString{{String: "", Valid: true}, {String: "100", Valid: true}}, table: "session"} + if _, err := dEmptyID.strRequired(idx, "id"); err == nil { + t.Error("strRequired on an empty id returned nil error") + } + // Valid cells → no error. 
+ dOK := &scanDest{holders: []sql.NullString{{String: "ses_1", Valid: true}, {String: "12345", Valid: true}}, table: "session"} + id, err := dOK.strRequired(idx, "id") + if err != nil || id != "ses_1" { + t.Errorf("strRequired(valid) = (%q,%v), want (ses_1,nil)", id, err) + } + v, err := dOK.i64Required(idx, "time_updated") + if err != nil || v != 12345 { + t.Errorf("i64Required(valid) = (%d,%v), want (12345,nil)", v, err) + } +} + +// --- P2-2: emitter timestamps clamp AND warn ---------------------------------- + +// hugeMs is an opencode-ms value whose *1000 overflows int64, forcing the ms→µs +// conversion to saturate. msToMicrosWarn must surface a WARN at every EMITTED +// event's Ts (round-4 P2-2), not just session start/finalize. +const hugeMs = math.MaxInt64/1000 + 1 + +// TestP2_2_StepStartTimestampClampsAndWarns pins that a step-start (LLM op) part +// whose time_created is the crafted hugeMs clamps the emitted OpStarted Ts to +// MaxInt64 AND surfaces a WARN through the mapper's warn channel. 
+func TestP2_2_StepStartTimestampClampsAndWarns(t *testing.T) { + s := rootSession("ses_x", 0) + var warns []error + step := stepStart("prt_1") + step.TimeCreatedMs = hugeMs + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), step), + } + evs, err := mapSession(testSourceID, s, msgs, WithOnWarn(func(e error) { warns = append(warns, e) })) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + var llmStartTs int64 = -1 + for _, ev := range evs { + if op, ok := ev.(canonical.OpStartedEvent); ok && op.Kind == canonical.OpLLM { + llmStartTs = op.Ts + } + } + if llmStartTs != math.MaxInt64 { + t.Fatalf("LLM OpStarted Ts = %d, want clamp to MaxInt64", llmStartTs) + } + if !anyWarnContains(warns, "overflow") || !anyWarnContains(warns, "step-start") { + t.Errorf("step-start huge timestamp did not WARN with overflow+field context; got %v", warns) + } +} + +// TestP2_2_ToolEndTimestampClampsAndWarns pins the same no-silent-clamp contract on +// a TOOL op's end timestamp (state.time.end), which closeLLMOp/emitToolOp convert +// via the warning-capable helper. The emitted OpFinalized EndTs clamps AND warns. 
+func TestP2_2_ToolEndTimestampClampsAndWarns(t *testing.T) { + s := rootSession("ses_x", 0) + var warns []error + endMs := int64(hugeMs) + tool := toolPart("prt_2", "read", "completed", 200, &endMs, nil) + msgs := []messageWithParts{ + mwp(asgMsg("msg_a", 1500, nil, "the-alias", "the-model", tokenCounts{}, 0, "", ""), + stepStart("prt_1"), tool), + } + evs, err := mapSession(testSourceID, s, msgs, WithOnWarn(func(e error) { warns = append(warns, e) })) + if err != nil { + t.Fatalf("mapSession: %v", err) + } + var toolEnd int64 = -1 + for _, ev := range evs { + if op, ok := ev.(canonical.OpFinalizedEvent); ok { + toolEnd = op.EndTs + } + } + if toolEnd != math.MaxInt64 { + t.Fatalf("tool OpFinalized EndTs = %d, want clamp to MaxInt64", toolEnd) + } + if !anyWarnContains(warns, "overflow") || !anyWarnContains(warns, "tool") { + t.Errorf("tool huge end timestamp did not WARN with overflow+field context; got %v", warns) + } +} + +// anyWarnContains reports whether any error in warns contains substr. +func anyWarnContains(warns []error, substr string) bool { + for _, w := range warns { + if strings.Contains(w.Error(), substr) { + return true + } + } + return false +} diff --git a/internal/adapters/opencode/review_round5_store_test.go b/internal/adapters/opencode/review_round5_store_test.go new file mode 100644 index 0000000..6d2849d --- /dev/null +++ b/internal/adapters/opencode/review_round5_store_test.go @@ -0,0 +1,176 @@ +package opencode + +import ( + "database/sql" + "strings" + "testing" +) + +// This file pins the SOW-0005 ROUND-5 STORE-layer fixes: +// - P2-2: the REQUIRED owning-id columns (message.session_id, part.message_id, +// part.session_id, session_message.session_id) ERROR the delta page on an +// empty/corrupt value rather than deriving an empty affected session id (which +// affectedSet.add("") silently drops while the row "succeeds", advancing the +// cursor past an un-emitted change — a permanent, health-invisible gap). 
A +// valid row is unaffected, and affectedSet NEVER receives "". +// +// (P2-1's no-emission-while-a-source-DB-read-tx-is-open fix is pinned in +// review_round5_txclose_test.go.) +// +// SQLite TEXT NOT NULL rejects a NULL but ACCEPTS an empty string ''. opencode's +// owning-id columns are TEXT NOT NULL, so the realistic corruption shape is an +// empty cell, which these tests insert directly. + +// insertRowEmptyOwner inserts ONE row into the named table whose owning-id column +// `ownerCol` is the empty string (the corruption shape), with all other required +// columns set to valid values. It uses a fresh read-write handle (the test-fixture +// writer; production never opens opencode.db read-write). +func insertRowEmptyOwner(t *testing.T, path, table, ownerCol string) { + t.Helper() + rw, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { _ = rw.Close() }() + + var stmt string + var args []any + switch table { + case "message": + // session_id is the only owning id; empty it. 
+ sid := "ses_ok" + if ownerCol == "session_id" { + sid = "" + } + stmt = `INSERT INTO message (id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?)` + args = []any{"msg_bad", sid, int64(100), int64(100), `{"role":"assistant"}`} + case "part": + mid, sid := "msg_ok", "ses_ok" + if ownerCol == "message_id" { + mid = "" + } + if ownerCol == "session_id" { + sid = "" + } + stmt = `INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?,?)` + args = []any{"prt_bad", mid, sid, int64(100), int64(100), `{"type":"step-start"}`} + case "session_message": + sid := "ses_ok" + if ownerCol == "session_id" { + sid = "" + } + stmt = `INSERT INTO session_message (id, session_id, type, time_created, time_updated, data) VALUES (?,?,?,?,?,?)` + args = []any{"evt_bad", sid, "agent-switched", int64(100), int64(100), `{}`} + default: + t.Fatalf("unsupported table %q", table) + } + if _, err := rw.Exec(stmt, args...); err != nil { + t.Fatalf("insert %s empty %s: %v", table, ownerCol, err) + } +} + +// TestP2_2_EmptyOwningIDErrorsNoCursorAdvance pins, per (table, owning column), +// that a delta row with an EMPTY owning id ERRORS the page (no cursor advance, no +// affected session, error surfaced) rather than silently dropping the change. 
+func TestP2_2_EmptyOwningIDErrorsNoCursorAdvance(t *testing.T) { + cases := []struct { + table string + ownerCol string + }{ + {"message", "session_id"}, + {"part", "message_id"}, + {"part", "session_id"}, + {"session_message", "session_id"}, + } + for _, tc := range cases { + t.Run(tc.table+"/"+tc.ownerCol, func(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + insertRowEmptyOwner(t, path, tc.table, tc.ownerCol) + db, schema := introspect(t, path) + + affected := newAffectedSet() + sink := &warnSink{} + onRow := deltaRowHandler(tc.table, schema[tc.table], affected, sink.collect) + from := TableWatermark{MaxTimeUpdatedMs: 50, MaxTimeUpdatedID: "aaa", MaxIDSeen: "aaa"} + delta, err := scanTableDelta(ctxBG(), db, schema[tc.table], from, onRow, sink, func(error) {}) + if err == nil { + t.Fatalf("empty %s.%s must ERROR the page (no silent cursor gap)", tc.table, tc.ownerCol) + } + // The refusal must name the missing required column. + if !strings.Contains(err.Error(), "required column") || !strings.Contains(err.Error(), tc.ownerCol) { + t.Errorf("error = %v, want a required-column refusal naming %q", err, tc.ownerCol) + } + // The cursor must NOT have advanced past the corrupt row. + if delta.watermark.MaxTimeUpdatedMs != from.MaxTimeUpdatedMs || delta.watermark.MaxTimeUpdatedID != from.MaxTimeUpdatedID { + t.Errorf("watermark advanced past the corrupt row: got (%d,%q), want input (%d,%q)", + delta.watermark.MaxTimeUpdatedMs, delta.watermark.MaxTimeUpdatedID, from.MaxTimeUpdatedMs, from.MaxTimeUpdatedID) + } + // affectedSet NEVER received "" (nor any id): the row aborted before add. 
+ if ids := affected.ids(); len(ids) != 0 { + t.Errorf("affected = %v, want empty (corrupt owning id must not derive a session)", ids) + } + }) + } +} + +// TestP2_2_ValidOwningIDUnaffected is the control: a row with valid owning ids +// scans cleanly, derives its affected session, and advances the watermark — the +// P2-2 guard does not regress the happy path. +func TestP2_2_ValidOwningIDUnaffected(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertAssistantMessage(t, rw, "msg_ok", "ses_ok", 100, 100, 10, 5) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + affected := newAffectedSet() + sink := &warnSink{} + onRow := deltaRowHandler("message", schema["message"], affected, sink.collect) + delta, err := scanTableDelta(ctxBG(), db, schema["message"], TableWatermark{}, onRow, sink, func(error) {}) + if err != nil { + t.Fatalf("valid row must scan cleanly, got %v", err) + } + if ids := affected.ids(); len(ids) != 1 || ids[0] != "ses_ok" { + t.Fatalf("affected = %v, want [ses_ok]", affected.ids()) + } + if delta.watermark.MaxTimeUpdatedID != "msg_ok" { + t.Errorf("watermark id = %q, want msg_ok (valid row advances)", delta.watermark.MaxTimeUpdatedID) + } +} + +// TestP2_2_RequiredOwnerAccessorGuard pins the requiredOwner accessor directly: an +// empty/NULL/absent owning cell returns an error; a valid one returns the value. +// This is the chokepoint the message/part/session_message delta scanners share. +func TestP2_2_RequiredOwnerAccessorGuard(t *testing.T) { + t.Parallel() + idx := columnIndex{"id": 0, "session_id": 1} + + // Empty owning id → error. 
+ dEmpty := &scanDest{holders: []sql.NullString{{String: "x1", Valid: true}, {String: "", Valid: true}}, table: "message"} + if _, err := requiredOwner(dEmpty, idx, "session_id"); err == nil { + t.Error("requiredOwner on an empty session_id returned nil error (must refuse the silent gap)") + } + // NULL owning id → error. + dNull := &scanDest{holders: []sql.NullString{{String: "x1", Valid: true}, {Valid: false}}, table: "message"} + if _, err := requiredOwner(dNull, idx, "session_id"); err == nil { + t.Error("requiredOwner on a NULL session_id returned nil error") + } + // Absent column (not in index) → error. + if _, err := requiredOwner(dEmpty, columnIndex{"id": 0}, "session_id"); err == nil { + t.Error("requiredOwner on an absent session_id returned nil error") + } + // Valid owning id → value, no error. + dOK := &scanDest{holders: []sql.NullString{{String: "x1", Valid: true}, {String: "ses_1", Valid: true}}, table: "message"} + sid, err := requiredOwner(dOK, idx, "session_id") + if err != nil || sid != "ses_1" { + t.Errorf("requiredOwner(valid) = (%q,%v), want (ses_1,nil)", sid, err) + } +} diff --git a/internal/adapters/opencode/review_round5_test.go b/internal/adapters/opencode/review_round5_test.go new file mode 100644 index 0000000..c524507 --- /dev/null +++ b/internal/adapters/opencode/review_round5_test.go @@ -0,0 +1,146 @@ +package opencode + +import ( + "encoding/json" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 ROUND-5 external-review fixes that live in the PURE +// mapper layer: +// - P3-1: a failed SessionFinalizedEvent carries ErrorMessage from +// data.error.data.message (opencode's AssistantError serializes as +// {name, data:{message,...}}); a message-less / malformed / absent error data +// degrades to an empty ErrorMessage WITHOUT aborting the session. 
+// +// (P2-1's no-emission-under-open-tx and P2-2's required-ownership-id-columns fixes +// are store-layer concerns pinned in review_round5_store_test.go; P3-1's golden is +// the i_failed_assistant scenario + TestGoldenInvariant_IFailedAssistant.) + +// asgMsgErrData builds an assistant messageRow whose data.error is the full +// opencode tagged shape {"name":..,"data":..}. errData is marshalled verbatim as +// the error's `data` body so a test can exercise the {message}, message-less, and +// malformed branches. A nil errData omits the data key entirely (error with only +// a name). completedMs is set so the turn is terminal (and the session finalizes). +func asgMsgErrData(t *testing.T, id, name string, errData any) messageRow { + t.Helper() + completed := int64(2000) + errObj := map[string]any{"name": name} + if errData != nil { + errObj["data"] = errData + } + d := map[string]any{ + "role": "assistant", + "providerID": "the-alias", + "modelID": "the-model", + "agent": "test-agent", + "cost": 0.1, + "tokens": tokenCounts{Input: 10}, + "time": map[string]any{"created": 1500, "completed": completed}, + "finish": "error", + "error": errObj, + } + raw, err := json.Marshal(d) + if err != nil { + t.Fatalf("marshal assistant error msg: %v", err) + } + return messageRow{ID: id, SessionID: "ses_x", TimeCreatedMs: 1500, TimeUpdatedMs: 1500, Data: raw} +} + +// TestP3_1_SessionErrorMessageFromData pins that a failed session terminal carries +// BOTH ErrorClass (data.error.name) AND ErrorMessage (data.error.data.message). 
+func TestP3_1_SessionErrorMessageFromData(t *testing.T) { + s := rootSession("ses_err", 0) + msgs := []messageWithParts{ + mwp(asgMsgErrData(t, "msg_a", "MessageAbortedError", + map[string]any{"message": "request was aborted by the user"})), + } + evs := run(t, s, msgs) + fins := finalizes(evs) + if len(fins) != 1 { + t.Fatalf("SessionFinalized count = %d, want 1", len(fins)) + } + if fins[0].Status != canonical.StatusFailed { + t.Fatalf("Status = %q, want failed", fins[0].Status) + } + if fins[0].ErrorClass != "MessageAbortedError" { + t.Errorf("ErrorClass = %q, want MessageAbortedError", fins[0].ErrorClass) + } + if fins[0].ErrorMessage != "request was aborted by the user" { + t.Errorf("ErrorMessage = %q, want the data.message string (P3-1)", fins[0].ErrorMessage) + } +} + +// TestP3_1_SessionErrorMessageDegrades pins that the message-less / malformed / +// absent error-data shapes leave ErrorMessage EMPTY while still finalizing failed +// with the correct ErrorClass — the decode is best-effort and never aborts. +func TestP3_1_SessionErrorMessageDegrades(t *testing.T) { + cases := []struct { + name string + errName string + errData any + wantMsg string + wantClas string + }{ + // MessageOutputLengthError: the ONE shipping variant whose data carries no + // message (data: {}). ErrorMessage stays empty; ErrorClass is the name. + {"message_less_variant", "MessageOutputLengthError", map[string]any{}, "", "MessageOutputLengthError"}, + // data present but with no message key (the body still unmarshals into the + // {message string} probe, leaving message at its zero value) → empty. + {"data_without_message", "UnknownError", map[string]any{"ref": "r1"}, "", "UnknownError"}, + // error with only a name, no data key at all → empty message. + {"no_data_key", "ProviderError", nil, "", "ProviderError"}, + // empty-name error object: error PRESENCE still makes it failed (P2-A), the + // class defaults; the data.message still flows through. 
+ {"empty_name_keeps_message", "", map[string]any{"message": "boom"}, "boom", defaultErrorClass}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + s := rootSession("ses_err", 0) + msgs := []messageWithParts{mwp(asgMsgErrData(t, "msg_a", tc.errName, tc.errData))} + evs := run(t, s, msgs) + fins := finalizes(evs) + if len(fins) != 1 { + t.Fatalf("SessionFinalized count = %d, want 1", len(fins)) + } + if fins[0].Status != canonical.StatusFailed { + t.Fatalf("Status = %q, want failed", fins[0].Status) + } + if fins[0].ErrorClass != tc.wantClas { + t.Errorf("ErrorClass = %q, want %q", fins[0].ErrorClass, tc.wantClas) + } + if fins[0].ErrorMessage != tc.wantMsg { + t.Errorf("ErrorMessage = %q, want %q", fins[0].ErrorMessage, tc.wantMsg) + } + }) + } +} + +// TestP3_1_ErrorMessageHelper pins the pure helper's branches directly, including +// the malformed-JSON path that the mapSession-level tests cannot reach (the +// assistant decode would reject a malformed whole-message body before errorMessage +// is consulted, but a corrupt error.data sub-blob is possible). 
+func TestP3_1_ErrorMessageHelper(t *testing.T) { + cases := []struct { + name string + err *assistantError + want string + }{ + {"message_present", &assistantError{Name: "X", Data: json.RawMessage(`{"message":"hello"}`)}, "hello"}, + {"nil_data", &assistantError{Name: "X", Data: nil}, ""}, + {"empty_data", &assistantError{Name: "X", Data: json.RawMessage(``)}, ""}, + {"whitespace_data", &assistantError{Name: "X", Data: json.RawMessage(" \n ")}, ""}, + {"null_data", &assistantError{Name: "X", Data: json.RawMessage(`null`)}, ""}, + {"no_message_key", &assistantError{Name: "X", Data: json.RawMessage(`{"ref":"r"}`)}, ""}, + {"malformed_data", &assistantError{Name: "X", Data: json.RawMessage(`{not json`)}, ""}, + {"non_object_data", &assistantError{Name: "X", Data: json.RawMessage(`"a string"`)}, ""}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + if got := errorMessage(tc.err); got != tc.want { + t.Errorf("errorMessage = %q, want %q", got, tc.want) + } + }) + } +} diff --git a/internal/adapters/opencode/review_round5_txclose_test.go b/internal/adapters/opencode/review_round5_txclose_test.go new file mode 100644 index 0000000..19399a6 --- /dev/null +++ b/internal/adapters/opencode/review_round5_txclose_test.go @@ -0,0 +1,186 @@ +package opencode + +import ( + "context" + "database/sql" + "errors" + "sync/atomic" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins SOW-0005 ROUND-5 P2-1: NO warning/error/content emission happens +// while a source-DB read transaction is open, so a backpressured onError can never +// block with the WAL snapshot pinned on the live opencode database. +// +// The deterministic discriminator: open the DB read-only with a pool of exactly +// ONE connection (withMaxOpenConns(1)). While a read tx is OPEN it holds that one +// connection, so a query issued from inside the warn callback would have to WAIT +// for a free connection — and times out against a short context. 
After the tx is +// committed/rolled back the connection is free, so the same probe query succeeds +// immediately. The tests assert the warn callback fires AND its probe query +// SUCCEEDS, proving the tx was already closed when the callback ran. (If the fix +// regressed and the callback fired under the open tx, the probe would ctx-timeout +// and the test would FAIL rather than hang, thanks to the bounded probe context.) + +// --- warnSink unit contract --------------------------------------------------- + +// TestP2_1_WarnSinkBuffersFlushesResets pins the buffer's core contract: collect +// appends (non-blocking), flush emits in collection order through onError and +// resets so the sink is reusable for the next tx scope, and a nil onError drops. +func TestP2_1_WarnSinkBuffersFlushesResets(t *testing.T) { + t.Parallel() + s := &warnSink{} + s.collect(nil) // nil is ignored + if s.len() != 0 { + t.Fatalf("len after nil collect = %d, want 0", s.len()) + } + e1, e2 := errors.New("w1"), errors.New("w2") + s.collect(e1) + s.collect(e2) + if s.len() != 2 { + t.Fatalf("len = %d, want 2 (buffered, not emitted)", s.len()) + } + var got []error + if n := s.flush(func(e error) { got = append(got, e) }); n != 2 { + t.Fatalf("flush returned %d, want 2", n) + } + if len(got) != 2 || got[0] != e1 || got[1] != e2 { + t.Fatalf("flush order = %v, want [w1 w2]", got) + } + if s.len() != 0 { + t.Fatalf("len after flush = %d, want 0 (reset for reuse)", s.len()) + } + // Reuse: collect again, flush with nil onError drops without panicking. + s.collect(errors.New("w3")) + if n := s.flush(nil); n != 1 || s.len() != 0 { + t.Fatalf("flush(nil) = %d, len = %d, want 1 dropped + reset", n, s.len()) + } +} + +// txOpenProbe returns an onError callback that, on its FIRST invocation, probes +// whether the single-connection pool has a FREE connection (i.e. the read tx is +// already closed): it runs `SELECT 1` under a bounded context. 
txClosed is set +// true iff the probe succeeded (connection free → tx committed/rolled back before +// the callback fired); fired records that the callback ran at all. A second pool +// connection is impossible (MaxOpenConns(1)), so a still-open tx makes the probe +// ctx-timeout → txClosed stays false. +func txOpenProbe(db *sql.DB) (onError func(error), fired *atomic.Bool, txClosed *atomic.Bool) { + fired = &atomic.Bool{} + txClosed = &atomic.Bool{} + onError = func(error) { + if fired.Swap(true) { + return // probe once + } + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second) + defer cancel() + var one int + if err := db.QueryRowContext(ctx, "SELECT 1").Scan(&one); err == nil && one == 1 { + txClosed.Store(true) + } + } + return onError, fired, txClosed +} + +// openROConns opens the built DB path strictly read-only (the adapter's helper) +// with a pool of exactly n connections, registering cleanup. n=1 is the P2-1 +// discriminator: a query inside the warn callback can only get a connection once +// the read tx has released it. +func openROConns(t *testing.T, path string, n int) *sql.DB { + t.Helper() + db, err := openReadOnly(context.Background(), path, withMaxOpenConns(n)) + if err != nil { + t.Fatalf("openReadOnly %s: %v", path, err) + } + t.Cleanup(func() { _ = db.Close() }) + return db +} + +// TestP2_1_DeltaWarnEmittedAfterTxClosed pins the DELTA-scan path (scanOnePage): +// a corrupt OPTIONAL cell raises a WARN, and that WARN is delivered through +// onError only AFTER the page tx is committed (the connection is free, so the +// probe query inside the callback succeeds). +func TestP2_1_DeltaWarnEmittedAfterTxClosed(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + // A corrupt OPTIONAL cell (cost) → a degrade-to-0 WARN inside the page tx. 
+ insertSessionCorruptCol(t, path, "ses_warn", "cost", "not-a-number") + + db := openROConns(t, path, 1) // single-connection pool: the discriminator + schema, err := introspectAll(ctxBG(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + + onError, fired, txClosed := txOpenProbe(db) + sink := &warnSink{} + affected := newAffectedSet() + onRow := deltaRowHandler("session", schema["session"], affected, sink.collect) + if _, err := scanTableDelta(ctxBG(), db, schema["session"], TableWatermark{}, onRow, sink, onError); err != nil { + t.Fatalf("scanTableDelta (corrupt optional must not abort): %v", err) + } + if !fired.Load() { + t.Fatal("expected a corrupt-cell WARN to fire through onError") + } + if !txClosed.Load() { + t.Fatal("WARN fired while the page read tx was STILL OPEN (P2-1 violated): the probe query could not get the single pool connection") + } +} + +// TestP2_1_TreeLoadWarnEmittedAfterTxClosed pins the TREE-LOAD path +// (loadAndMapSession): a corrupt OPTIONAL session cell raises a WARN inside the +// single per-session read tx; the WARN reaches onError only after that tx is +// committed, and the mapped content events are returned (emitted by the caller) +// strictly after. The single-connection probe proves the tx was closed at flush. +func TestP2_1_TreeLoadWarnEmittedAfterTxClosed(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + // A session whose OPTIONAL cost cell is corrupt (warns inside loadSession) plus a + // minimal assistant turn so the tree maps to content events. 
+ insertSessionCorruptCol(t, path, "ses_tree", "cost", "garbage") + rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("open rw2: %v", err) + } + insertAssistantMessage(t, rw2, "msg_t1", "ses_tree", 200, 900, 10, 5) + insertPart(t, rw2, "prt_t1", "msg_t1", "ses_tree", 210, 210, stepStartBody()) + insertPart(t, rw2, "prt_t2", "msg_t1", "ses_tree", 900, 900, stepFinishBody(10, 5, 0.01)) + if err := rw2.Close(); err != nil { + t.Fatalf("close rw2: %v", err) + } + + db := openROConns(t, path, 1) + schema, err := introspectAll(ctxBG(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + + onError, fired, txClosed := txOpenProbe(db) + evs, skipped, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_tree", silentLogger(), onError) + if err != nil { + t.Fatalf("loadAndMapSession: %v", err) + } + if skipped { + t.Fatal("session unexpectedly skipped") + } + if !fired.Load() { + t.Fatal("expected a corrupt-cell WARN to fire through onError during tree load") + } + if !txClosed.Load() { + t.Fatal("WARN fired while the per-session read tx was STILL OPEN (P2-1 violated)") + } + // Content events were produced (mapped AFTER the tx closed, emitted by caller). + if countKind(evs, canonical.EvSessionStarted) != 1 { + t.Fatalf("want 1 SessionStarted in mapped events, got %d events", len(evs)) + } +} diff --git a/internal/adapters/opencode/review_round6_test.go b/internal/adapters/opencode/review_round6_test.go new file mode 100644 index 0000000..d536384 --- /dev/null +++ b/internal/adapters/opencode/review_round6_test.go @@ -0,0 +1,160 @@ +package opencode + +import ( + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the SOW-0005 ROUND-6 external-review P1 fix: the boundary-ms +// re-scan must run BEFORE the forward delta (processChanges) on EVERY gated probe, +// against the PRE-ADVANCE cursor, so a co-occurring forward change can never strand +// a same-ms in-place update of a low-id row. 
(The round-3/4 P1 tests live in +// review_round3_test.go; this is the deeper "co-occurring forward change" case the +// round-3/4 code missed because it ran the re-scan only on the changed==false path.) +// +// The other round-6 fixes are pinned in their natural homes: P2-1 (tool_response +// PayloadRef only when state.output non-empty) in mapper_test.go, P3-1 (retry log +// error.name) in mapper_test.go, P3-2 (resolvePartSession simplification) in +// tailer_branch_test.go, P3-3 (j_file_attachment golden) in testdata + the golden +// suite + golden_invariants_test.go. + +// TestP1_R6_CoOccurringForwardChangeDoesNotStrandBoundaryUpdate is the EXACT codex +// round-6 case. Two sessions sit at the SAME table's boundary ms T: +// - ses_a: an in-place UPDATE re-stamped to ms T with a LOW part id (id ≤ the +// cursor's MaxTimeUpdatedID). The forward delta's strict `> :tuid` tie-break +// (time_updated = T AND id > highID) EXCLUDES it; only the boundary re-scan sees +// it. +// - ses_b: an in-place UPDATE re-stamped to ms T2 > T (a NORMAL forward change +// that advances MAX(time_updated)). The gated MAX(time_updated) probe catches it +// → detectChange returns changed=true, probed=true. +// +// Old (round-3/4) behaviour: changed==true → the boundary re-scan was SKIPPED, and +// processChanges advanced the cursor to (T2, …) — leaving ses_a's row permanently +// below the new watermark, never seen (a zero-gaps violation). Round-6: the boundary +// re-scan runs FIRST against the pre-advance cursor (boundary T), re-emits ses_a, +// THEN processChanges emits ses_b and advances the cursor to T2. BOTH are emitted and +// ses_a is not stranded. 
+func TestP1_R6_CoOccurringForwardChangeDoesNotStrandBoundaryUpdate(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + + const ( + boundaryMs = int64(100) // T — the cursor's boundary ms + forwardMs = int64(200) // T2 > T — the forward change's new ms + ) + + // ses_a: whole tree at ms T=100, part id is LOW ("prt_aaa_low") so the forward + // delta tie-break (id > highID) excludes it — only the boundary re-scan catches + // this same-ms in-place update. + insertSession(t, rw, "ses_a", "", boundaryMs, boundaryMs, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", boundaryMs, boundaryMs, 5, 2) + insertPart(t, rw, "prt_aaa_low", "msg_a", "ses_a", boundaryMs, boundaryMs, stepFinishBody(5, 2, 0.01)) + + // ses_b: a NORMAL forward change — an in-place UPDATE re-stamped to ms T2=200, + // which advances MAX(time_updated) past the cursor boundary T. Its part id is + // existing (NOT greater than MaxIDSeen), so the cheap MAX(id) path stays silent + // and the GATED MAX(time_updated) probe is what fires (probed=true) — exactly the + // state in which the boundary re-scan gate is open. + insertSession(t, rw, "ses_b", "", forwardMs, forwardMs, 0) + insertAssistantMessage(t, rw, "msg_b", "ses_b", forwardMs, forwardMs, 7, 3) + insertPart(t, rw, "prt_bbb_fwd", "msg_b", "ses_b", forwardMs, forwardMs, stepFinishBody(7, 3, 0.02)) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // Cursor at (T=100, "zzz_high") on every table: MaxIDSeen "zzz_high" is greater + // than BOTH planted part ids, so the cheap MAX(id) path is silent for both (no + // INSERT); the boundary ms is T=100; the tie-break id "zzz_high" is greater than + // prt_aaa_low, so the forward delta excludes ses_a but includes ses_b (T2 > T). 
+ cur := newCursor() + for _, table := range trackedTables { + cur = cur.withTable(table, TableWatermark{ + MaxIDSeen: "zzz_high", + MaxTimeUpdatedMs: boundaryMs, + MaxTimeUpdatedID: "zzz_high", + }) + } + + // WARM start: this cursor at (T, highID) came from REAL prior paging (a Scan + // cursor / a Tail that has paged), so the boundary bucket was already emitted and + // boundaryReal starts true — the codex stranding case is a warm cursor. (A cold + // HEAD-snapshot Tail is the separate TestP1_R6_ColdFirstProbe… case below.) + // Gate open via the SAFETY NET (no WAL event needed): the 60 s net is due. This is + // the harder trigger (no WAL hint); the unified trigger must fire the boundary + // re-scan here too (boundaryReal=true is the single cold guard — round-7 P2-1). + st := newPollState(true) + st.markProbe(time.Now().Add(-2 * timeUpdatedSafetyNet)) // net due; no WAL event + + out := make(chan canonical.Event, 512) + active, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("pollOnce: %v", err) + } + got := drainAll(out) + + // BOTH sessions must be emitted: ses_b via the forward delta, ses_a via the + // boundary re-scan. The round-3/4 code emitted ONLY ses_b (ses_a stranded). + if !hasSession(got, "ses_b") { + t.Errorf("forward-change session ses_b was not emitted (forward delta must emit it)") + } + if !hasSession(got, "ses_a") { + t.Fatalf("STRANDED: same-ms in-place-updated session ses_a was not emitted — a co-occurring forward change advanced the cursor past boundary ms %d before the boundary re-scan ran (round-6 P1 regression)", boundaryMs) + } + + // The cursor advanced to the forward change's ms T2 (the forward delta's job); the + // boundary re-scan itself never advances the cursor. 
+ if cur.Tables["part"].MaxTimeUpdatedMs != forwardMs { + t.Errorf("cursor part MaxTimeUpdatedMs = %d, want %d (forward delta advances to T2)", cur.Tables["part"].MaxTimeUpdatedMs, forwardMs) + } + if !active { + t.Error("pollOnce reported active=false; a forward change + boundary re-emit both ran, want active=true") + } +} + +// TestP1_R6_ColdFirstProbeStillGuardsBoundaryReplay re-pins the cold-Tail replay +// guard under the unified trigger: the boundary re-scan runs before the forward +// delta, but a fresh COLD Tail (boundaryReal==false, no preceding Scan) must STILL +// NOT replay the boundary bucket (it is a HEAD-snapshot reconciliation, not a real +// in-place update). boundaryReal is the single cold guard (round-7 P2-1): it gates +// the re-scan on EVERY path, so even with the gate open the cold snapshot boundary +// is not replayed until the cursor first advances. +func TestP1_R6_ColdFirstProbeStillGuardsBoundaryReplay(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // A session whose tree sits at the boundary ms with a LOW part id (the boundary + // bucket the guard must NOT replay on the cold first probe). + insertSession(t, rw, "ses_boundary", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_b", "ses_boundary", 100, 100, 5, 2) + insertPart(t, rw, "prt_aaa_low", "msg_b", "ses_boundary", 100, 100, stepFinishBody(5, 2, 0.01)) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + cur := newCursor() + for _, table := range trackedTables { + cur = cur.withTable(table, TableWatermark{ + MaxIDSeen: "zzz_high", + MaxTimeUpdatedMs: 100, + MaxTimeUpdatedID: "zzz_high", + }) + } + + // Cold Tail: fresh pollState (boundaryReal==false), no WAL event. 
The net is + // immediately due so the gate is open (and the cheap MAX(id) path is silent here), + // but the boundary re-scan must not run — boundaryReal==false suppresses it on the + // gate-open path (the HEAD-snapshot replay guard; round-7 P2-1 single cold guard). + st := newPollState(false) + out := make(chan canonical.Event, 256) + if _, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (cold first probe): %v", err) + } + if got := drainAll(out); hasSession(got, "ses_boundary") { + t.Fatalf("cold first probe replayed the boundary bucket (round-6 must keep the round-4 cold guard); ses_boundary must NOT be emitted") + } +} diff --git a/internal/adapters/opencode/review_round7_test.go b/internal/adapters/opencode/review_round7_test.go new file mode 100644 index 0000000..f5bff6c --- /dev/null +++ b/internal/adapters/opencode/review_round7_test.go @@ -0,0 +1,616 @@ +package opencode + +import ( + "database/sql" + "fmt" + "math/rand" + "os" + "strings" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// nullStr builds a valid (non-NULL) sql.NullString for the scanDest unit tests. +func nullStr(s string) sql.NullString { return sql.NullString{String: s, Valid: true} } + +// This file pins the SOW-0005 ROUND-7 external-review fixes: +// - P1-1: the UNIFIED boundary re-scan trigger closes the cheap-MAX(id) +// co-occurrence class — a true INSERT (cheap path, probed==false) co-occurring +// with a same-ms in-place update of a low-id row must re-emit BOTH. Pinned by the +// exact codex case AND a same-ms property/stress test (the guard against a 5th case). +// - P1-2: reloadAndEmit propagates a transient (non-session-gone, non-compaction) +// error so the cursor is NOT advanced past un-emitted content; a genuine +// session-gone skips and the cursor advances. 
(The reloadAndEmit-level +// propagation is also pinned in tailer_pollcycle_test.go; here it is pinned at +// the commitBatch/processChanges cursor-advance boundary.) +// - P2-1: the boundaryReal cold guard is applied to EVERY re-scan trigger — a cold +// Tail's first WAL-driven OR safety-net probe does NOT replay the HEAD-snapshot +// boundary bucket. +// - P2-2: watchWAL's goroutine is awaited by closeWatch before it returns (no +// send-on-closed-channel race; the goroutine is provably dead). +// - P2-3: the FULL-TREE scanners surface a WARN (not a silent out[""] drop) on a +// corrupt/empty required part.message_id / part.session_id. + +// --- P1-1: cheap-MAX(id) INSERT co-occurring with a same-ms boundary update ----- + +// TestP1_R7_CheapPathInsertCoOccurringBoundaryUpdate is the EXACT codex round-7 +// case — the 4th same-ms variant. The cursor sits at (T, highID). Two changes +// co-occur in ONE poll: +// - ses_a: an in-place UPDATE of a LOW-id part re-stamped to ms T (the boundary). +// The forward delta's strict tie-break (time_updated = T AND id > highID) +// EXCLUDES it; only the boundary re-scan can catch it. +// - ses_b: a TRUE INSERT whose part id sorts ABOVE the cursor's MaxIDSeen, so the +// CHEAP MAX(id) path fires (changed==true, probed==false) and SHORT-CIRCUITS +// before the gated MAX(time_updated) probe. +// +// Pre-round-7 (probed-gated trigger): the cheap path returned probed==false → +// gateOpen==false → the boundary re-scan was SKIPPED, processChanges advanced the +// cursor PAST T for the INSERT, and ses_a fell permanently below the new watermark, +// never seen (zero-gaps violation). Round-7 P1-1: the trigger arms on changed==true +// regardless of path, so ses_a's boundary bucket is re-scanned FIRST (pre-advance), +// BOTH sessions are emitted, and the cursor advances to the INSERT's position. 
+func TestP1_R7_CheapPathInsertCoOccurringBoundaryUpdate(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + + const boundaryMs = int64(100) // T: the cursor boundary ms + + // ses_a: its tree sits AT the boundary ms T=100 with a LOW part id ("prt_aaa_low"), + // below the cursor's MaxIDSeen/MaxTimeUpdatedID, so the cheap MAX(id) path is silent + // for it and the forward delta tie-break excludes it; only the boundary re-scan sees + // it. (Models a same-ms in-place UPDATE of an already-emitted row.) + insertSession(t, rw, "ses_a", "", boundaryMs, boundaryMs, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", boundaryMs, boundaryMs, 5, 2) + insertPart(t, rw, "prt_aaa_low", "msg_a", "ses_a", boundaryMs, boundaryMs, stepFinishBody(5, 2, 0.01)) + + // ses_b: a TRUE INSERT. Its part id "zzz_insert_high" sorts STRICTLY ABOVE the + // cursor's MaxIDSeen ("zzz_high"), so the CHEAP MAX(id) path fires + // (changed==true, probed==false). Its time_updated is ALSO T (same ms), so this is + // the same-ms co-occurrence the fix targets; the forward delta INCLUDES it + // (id "zzz_insert_high" > tie-break "zzz_high"). + insertSession(t, rw, "ses_b", "", boundaryMs, boundaryMs, 0) + insertAssistantMessage(t, rw, "msg_b", "ses_b", boundaryMs, boundaryMs, 7, 3) + insertPart(t, rw, "zzz_insert_high", "msg_b", "ses_b", boundaryMs, boundaryMs, stepFinishBody(7, 3, 0.02)) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // Cursor at (T=100, "zzz_high") on every table. MaxIDSeen "zzz_high" is BELOW the + // inserted part id "zzz_insert_high" (so the cheap MAX(id) path fires on the INSERT) + // but ABOVE prt_aaa_low (so the boundary update is invisible to the cheap path and the + // forward delta tie-break excludes it). Boundary ms is T=100. 
+ cur := newCursor() + for _, table := range trackedTables { + cur = cur.withTable(table, TableWatermark{ + MaxIDSeen: "zzz_high", + MaxTimeUpdatedMs: boundaryMs, + MaxTimeUpdatedID: "zzz_high", + }) + } + + // WARM boundary (boundaryReal=true): the cursor at (T, highID) is a real prior + // paged position. CLOSE the probe gate (a recent probe, NO WAL event) so the ONLY + // thing that can arm the boundary re-scan is changed==true via the CHEAP path — + // proving round-7 P1-1 (the re-scan must fire on the cheap path, NOT only when the + // gated probe ran). With the old probed-gated trigger this poll would NOT re-scan. + st := newPollState(true) + st.markProbe(time.Now()) // gate CLOSED: no WAL event, net not due + + out := make(chan canonical.Event, 512) + active, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("pollOnce: %v", err) + } + got := drainAll(out) + + // The INSERT (ses_b) is emitted by the forward delta. + if !hasSession(got, "ses_b") { + t.Errorf("the co-occurring INSERT session ses_b was not emitted (forward delta must emit it)") + } + // The same-ms boundary update (ses_a) is emitted by the boundary re-scan — the + // round-7 P1-1 fix. With the old probed-gated trigger ses_a was STRANDED here + // (the cheap MAX(id) path short-circuited probed=false → no re-scan). 
+ if !hasSession(got, "ses_a") { + t.Fatalf("STRANDED (round-7 P1-1 regression): the same-ms in-place boundary update ses_a was NOT emitted; the cheap MAX(id) INSERT path skipped the boundary re-scan and the cursor advanced past ms %d", boundaryMs) + } + if !active { + t.Error("pollOnce reported active=false; both an INSERT and a boundary re-emit ran, want active=true") + } +} + +// --- P1-1: same-ms property/stress test (the guard against a 5th case) ---------- + +// ssState tracks, per session, the ms its latest mutation was stamped with and +// whether the session was mutated at all, so the property check can assert that +// every mutated session was emitted (zero gaps). +type ssState struct { + latestUpdatedMs int64 + mutated bool +} + +// TestP1_R7_SameMsStress is the property/stress guard against a 5th same-ms case. +// It seeds a synthetic DB, then across multiple poll cycles applies RANDOM (but +// DETERMINISTIC: one math/rand source with a fixed seed, consumed cycle by cycle) +// interleavings of: +// - in-place UPDATEs of an arbitrary EXISTING low-id row re-stamped to the CURRENT +// boundary ms T (the same-ms boundary case: invisible to the cheap MAX(id) path +// and excluded by the forward delta tie-break, so only the boundary re-scan can +// catch it); +// - a CO-OCCURRING true INSERT at ms T+1 (STRICTLY higher), which fires the cheap +// MAX(id) path (changed==true, probed==false) AND advances the cursor PAST T, +// so the in-place update at T falls below the new watermark unless the boundary +// re-scan runs against the pre-advance T (the exact round-7 P1-1 strand); +// - a "missed-WAL" cycle that relies on the 60 s safety net (no WAL event marked, +// the net forced due) instead of the WAL hint. +// +// After draining, it asserts EVERY mutated session's LATEST state was emitted (zero +// gaps) and the cursor never regressed. 
The structure GUARANTEES the cheap-path +// co-occurrence strand on cycles that do both an in-place update and an INSERT, so a +// 5th same-ms variant (or a regression of this fix) is caught. Verified to FAIL +// against the pre-round-7 probed-gated trigger. +// +// Run with -count=5 (per the SOW gate) to shake out nondeterminism; the seed is +// derived from a fixed constant so each run is reproducible. +func TestP1_R7_SameMsStress(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + + // Seed N sessions, each a single assistant turn with a step-finish part, all at a + // shared starting ms so the boundary bucket is non-trivial from the first poll. + const ( + seedN = 6 + startMs = int64(1000) + numCycle = 16 + ) + for i := 0; i < seedN; i++ { + sid := fmt.Sprintf("ses_%03d", i) + mid := fmt.Sprintf("msg_%03d", i) + insertSession(t, rw, sid, "", startMs, startMs, 0) + insertAssistantMessage(t, rw, mid, sid, startMs, startMs, 5, 2) + insertPart(t, rw, fmt.Sprintf("prt_%03d", i), mid, sid, startMs, startMs, stepFinishBody(5, 2, 0.01)) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + + // A SEPARATE writable handle simulates opencode's live writer across cycles. + rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + defer func() { _ = rw2.Close() }() + + // Start WARM (boundaryReal=true) from the seed HEAD: a real Scan would have emitted + // the seed, so this models the resumed Tail. Cursor = the seed maxima. 
+ cur := newCursor() + for _, table := range trackedTables { + mid, _ := maxID(ctxBG(), db, table) + mtu, _ := maxTimeUpdated(ctxBG(), db, table) + cur = cur.withTable(table, TableWatermark{MaxIDSeen: mid, MaxTimeUpdatedMs: mtu, MaxTimeUpdatedID: mid}) + } + st := newPollState(true) + + // Track which sessions were mutated and the ms their latest state lands at, so the + // final assertion can verify the LAST state was emitted (zero gaps). + expect := map[string]*ssState{} + + rng := rand.New(rand.NewSource(0xC0DE57)) //nolint:gosec // deterministic test PRNG, not security-sensitive + insertSeq := 0 // strictly-increasing id/ms counter for new INSERTs + curBoundaryMs := startMs // the ms in-place updates target (the cursor boundary) + var lastCursor = cur + + out := make(chan canonical.Event, 8192) + + for c := 0; c < numCycle; c++ { + // Every cycle does an in-place update at the CURRENT boundary T (the same-ms + // case). Most cycles ALSO do a co-occurring INSERT at T+1 — the strand setup: + // the INSERT advances the cursor past T, so the in-place update at T is below + // the new watermark unless the boundary re-scan runs against pre-advance T. + doInsert := rng.Intn(4) != 0 // ~3/4 cycles co-occur an INSERT (the strand case) + missedWAL := rng.Intn(3) == 0 + + // In-place UPDATE of an arbitrary existing seed session's part, re-stamped to T. + victim := fmt.Sprintf("ses_%03d", rng.Intn(seedN)) + if _, uerr := rw2.Exec(`UPDATE part SET time_updated = ? WHERE session_id = ?`, curBoundaryMs, victim); uerr != nil { + t.Fatalf("cycle %d in-place update of %s: %v", c, victim, uerr) + } + if e, ok := expect[victim]; ok { + e.latestUpdatedMs = curBoundaryMs + } else { + expect[victim] = &ssState{latestUpdatedMs: curBoundaryMs, mutated: true} + } + + if doInsert { + // A NEW session at ms T+1 with a strictly-higher id (the cheap MAX(id) path), + // advancing the cursor PAST the in-place update's ms T. 
+ insMs := curBoundaryMs + 1 + sid := fmt.Sprintf("ses_n%03d", insertSeq) + mid := fmt.Sprintf("msg_n%03d", insertSeq) + pid := fmt.Sprintf("zzz_ins_%06d", insertSeq) // sorts above every existing id + insertSession(t, rw2, sid, "", insMs, insMs, 0) + insertAssistantMessage(t, rw2, mid, sid, insMs, insMs, 9, 4) + insertPart(t, rw2, pid, mid, sid, insMs, insMs, stepFinishBody(9, 4, 0.03)) + expect[sid] = &ssState{latestUpdatedMs: insMs, mutated: true} + insertSeq++ + } + + // Open the gate: a WAL event (normal) or, ~1/3 of the time, ONLY the 60 s net + // (a missed/dropped WAL hint — the safety-net path). + if missedWAL { + st.lastWALEvent = time.Time{} + st.markProbe(time.Now().Add(-2 * timeUpdatedSafetyNet)) // net due, no WAL + } else { + st.markWALEvent(time.Now()) + } + + if _, perr := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}); perr != nil { + t.Fatalf("cycle %d pollOnce: %v", c, perr) + } + + // The cursor must never regress on any table. + for _, table := range trackedTables { + if cmpWatermark(cur.Tables[table], lastCursor.Tables[table]) < 0 { + t.Fatalf("cycle %d: cursor REGRESSED on table %q: %+v < %+v", c, table, cur.Tables[table], lastCursor.Tables[table]) + } + } + lastCursor = cur + + // The boundary follows the forward INSERTs: after a co-occurring INSERT at T+1 + // the cursor's MaxTimeUpdatedMs is now T+1, so the NEXT cycle's in-place update + // targets the new boundary. (On a no-INSERT cycle the boundary stays at T.) + if doInsert { + curBoundaryMs++ + } + } + + got := drainAll(out) + + // ZERO GAPS: every mutated session must have been emitted at least once (its latest + // tree). A SessionStarted is emitted on every full-tree (re)emit, so its presence + // proves the session's latest state reached the output. A stranded same-ms update + // (the bug class) leaves a mutated session ABSENT from the output. 
+ emitted := map[string]bool{} + for _, ev := range got { + if s, ok := ev.(canonical.SessionStartedEvent); ok { + emitted[s.NativeID] = true + } + } + for sid, stt := range expect { + if stt.mutated && !emitted[sid] { + t.Errorf("ZERO-GAPS VIOLATION: mutated session %s (latest ms %d) was never emitted across %d cycles — a same-ms update was stranded by a co-occurring INSERT advancing the cursor", sid, stt.latestUpdatedMs, numCycle) + } + } +} + +// --- P1-2: cursor not advanced on a transient error at the batch boundary ------- + +// TestP1_2_R7_TransientErrorDoesNotAdvanceCursor pins the P1-2 fix at the cursor +// boundary: a transient (non-session-gone) reload error during processChanges must +// leave the committed cursor UNADVANCED, so the same rows are retried next cycle. +// We force the transient error by closing the DB after introspection: the delta +// page scan itself errors, processChanges returns the pre-run cursor and an error, +// and the cursor is NOT advanced. +func TestP1_2_R7_TransientErrorDoesNotAdvanceCursor(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", 110, 110, 5, 2) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + if err := db.Close(); err != nil { // closed → every query errors (transient) + t.Fatalf("close ro db: %v", err) + } + + before := newCursor() + out := make(chan canonical.Event, 16) + next, advanced, err := processChanges(ctxBG(), db, schema, before, "opencode:test", out, silentLogger(), func(error) {}) + if err == nil { + t.Fatal("processChanges over a closed DB must return an error (transient), not swallow it") + } + if advanced { + t.Error("processChanges reported advanced=true despite a transient error — the cursor must not advance past un-emitted content (round-7 P1-2)") + } + // The returned cursor is the pre-run 
cursor (no table watermark advanced). + for _, table := range trackedTables { + if cmpWatermark(next.Tables[table], before.Tables[table]) != 0 { + t.Errorf("table %q cursor advanced on a transient error: %+v != %+v (committed cursor must stay put)", table, next.Tables[table], before.Tables[table]) + } + } +} + +// TestP1_2_R7_SessionGoneAdvances pins the OTHER side of the P1-2 policy: a +// genuinely GONE session (its row absent — deleted between the delta and the load) +// is skip-and-continue, so reloadAndEmit returns nil (the cursor MAY advance). This +// is the one load failure that is legitimately non-fatal. We delete the session row +// but leave the message row (an orphan) so the delta derives the session id but the +// tree load finds no session row → errSessionGone path → skipped, no error returned. +func TestP1_2_R7_SessionGoneAdvances(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_gone", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_orphan", "ses_gone", 110, 110, 5, 2) + // Delete the session row, leaving the message orphaned: the affected-session + // derivation still yields "ses_gone", but loadSession finds no row → errSessionGone. + if _, err := rw.Exec(`DELETE FROM session WHERE id = ?`, "ses_gone"); err != nil { + t.Fatalf("delete session row: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + var onErr []error + out := make(chan canonical.Event, 16) + err := reloadAndEmit(ctxBG(), db, schema, "opencode:test", []string{"ses_gone"}, out, silentLogger(), + func(e error) { onErr = append(onErr, e) }) + if err != nil { + t.Fatalf("reloadAndEmit must SKIP a gone session (not propagate), got error: %v", err) + } + // The gone session is surfaced once as errSessionGone via onError. 
+ found := false + for _, e := range onErr { + if strings.Contains(e.Error(), "ses_gone") && strings.Contains(e.Error(), "not found") { + found = true + } + } + if !found { + t.Errorf("a gone session must surface one errSessionGone via onError; got %v", onErr) + } +} + +// --- P2-1: cold-Tail boundaryReal guard on the WAL-driven AND safety-net paths -- + +// TestP2_1_R7_ColdTailGateOpenDoesNotReplayBoundary pins P2-1: a COLD Tail +// (boundaryReal==false, HEAD snapshot) must NOT replay its snapshot boundary bucket +// on ANY gate-open path — neither a WAL-driven first probe NOR a safety-net first +// probe (changed==false, gate open). Pre-round-7 the changed==false path was guarded +// only by the now-removed priorProbe flag, so a cold Tail whose first poll was +// WAL-driven (or whose first safety-net probe had priorProbe already set) replayed the +// HEAD-snapshot boundary. boundaryReal (the single cold guard) now suppresses it on +// every path. +func TestP2_1_R7_ColdTailGateOpenDoesNotReplayBoundary(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // A pre-existing session whose tree sits at the snapshot boundary ms — the bucket + // a cold Tail must NOT replay. + insertSession(t, rw, "ses_snapshot", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_s", "ses_snapshot", 100, 100, 5, 2) + insertPart(t, rw, "prt_low", "msg_s", "ses_snapshot", 100, 100, stepFinishBody(5, 2, 0.01)) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // A cold HEAD-snapshot cursor at the boundary (100, highID) on every table. + freshCursor := func() Cursor { + c := newCursor() + for _, table := range trackedTables { + c = c.withTable(table, TableWatermark{MaxIDSeen: "zzz_high", MaxTimeUpdatedMs: 100, MaxTimeUpdatedID: "zzz_high"}) + } + return c + } + + // (a) Cold Tail, first poll is WAL-DRIVEN (a WAL event fired before any probe). 
+ // boundaryReal==false must suppress the re-scan: the snapshot boundary must NOT + // be replayed even though the gate is open via the WAL path. + curWAL := freshCursor() + stWAL := newPollState(false) // COLD: boundaryReal=false + now := time.Now() + stWAL.markProbe(now.Add(-2 * timeUpdatedSafetyNet)) + stWAL.markWALEvent(now) // WAL event after the last probe → gate open via WAL + outWAL := make(chan canonical.Event, 64) + if _, err := pollOnce(ctxBG(), db, schema, &curWAL, "opencode:test", &stWAL, outWAL, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (cold WAL-driven): %v", err) + } + if got := drainAll(outWAL); hasSession(got, "ses_snapshot") { + t.Fatalf("COLD Tail replayed the snapshot boundary on a WAL-driven first probe (round-7 P2-1); ses_snapshot must NOT be emitted") + } + + // (b) Cold Tail, first poll is a SAFETY-NET probe with a prior probe already marked + // (the round-7 P2-1 hole: the old priorProbe guard would have ALLOWED the re-scan + // here). boundaryReal==false must still suppress it. + curNet := freshCursor() + stNet := newPollState(false) // COLD: boundaryReal=false + stNet.markProbe(time.Now().Add(-2 * timeUpdatedSafetyNet)) + stNet.markProbe(time.Now().Add(-2 * timeUpdatedSafetyNet)) // a SECOND prior probe; net still due, no WAL + outNet := make(chan canonical.Event, 64) + if _, err := pollOnce(ctxBG(), db, schema, &curNet, "opencode:test", &stNet, outNet, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (cold safety-net): %v", err) + } + if got := drainAll(outNet); hasSession(got, "ses_snapshot") { + t.Fatalf("COLD Tail replayed the snapshot boundary on a safety-net first probe (round-7 P2-1 hole); ses_snapshot must NOT be emitted") + } + + // (c) After the cursor first ADVANCES (a forward INSERT), boundaryReal flips true and + // the boundary re-scan activates — proving the guard only suppresses the COLD window, + // not forever. Insert a forward row and re-poll on the same (now-warm) state. 
+ rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + defer func() { _ = rw2.Close() }() + insertSession(t, rw2, "ses_fwd", "", 300, 300, 0) + insertAssistantMessage(t, rw2, "msg_fwd", "ses_fwd", 300, 300, 6, 2) + insertPart(t, rw2, "zzz_fwd_high", "msg_fwd", "ses_fwd", 300, 300, stepFinishBody(6, 2, 0.02)) + + // Reuse stWAL: its cursor must advance on the forward INSERT, flipping boundaryReal. + stWAL.markWALEvent(time.Now()) + outFwd := make(chan canonical.Event, 128) + if _, err := pollOnce(ctxBG(), db, schema, &curWAL, "opencode:test", &stWAL, outFwd, silentLogger(), func(error) {}); err != nil { + t.Fatalf("pollOnce (forward advance): %v", err) + } + if !stWAL.boundaryReal { + t.Error("boundaryReal did not flip true after the cursor advanced on a forward INSERT") + } + if got := drainAll(outFwd); !hasSession(got, "ses_fwd") { + t.Error("the forward INSERT ses_fwd was not emitted") + } +} + +// --- P2-2: watchWAL goroutine awaited before closeWatch returns ----------------- + +// TestP2_2_R7_CloseWatchAwaitsGoroutine pins the P2-2 fix: closeWatch returns ONLY +// after the watcher goroutine has exited, so no send to out/onError can happen after +// closeWatch returns (the send-on-closed-channel race the adapter contract forbids). +// The goroutine's `defer close(hint)` runs (LIFO) BEFORE its `defer wg.Done()`, and +// closeWatch's wg.Wait() blocks until wg.Done() — so once closeWatch returns, the +// hint channel is provably closed (the goroutine is dead). Run under -race. +func TestP2_2_R7_CloseWatchAwaitsGoroutine(t *testing.T) { + t.Parallel() + dir := t.TempDir() + dbPath := dir + "/opencode.db" + // Create the DB file + an empty WAL companion so the watch establishes successfully. 
+ if err := writeFileBytes(dbPath, []byte("x")); err != nil { + t.Fatalf("write db: %v", err) + } + if err := writeFileBytes(dbPath+"-wal", []byte{}); err != nil { + t.Fatalf("write wal: %v", err) + } + + var ce collectErrs + hint, closeWatch := watchWAL(dbPath, ce.onError) + + // Trigger a few WAL writes so the goroutine is actively processing events. + for i := 0; i < 3; i++ { + if err := appendFileBytes(dbPath+"-wal", []byte("frame")); err != nil { + t.Fatalf("append wal: %v", err) + } + } + // Drain one pending hint, waiting at most one second, so the goroutine is back + // in its select. + select { + case <-hint: + case <-time.After(time.Second): + } + + // closeWatch must block until the goroutine exits. After it returns, the hint + // channel is closed (the goroutine's deferred close(hint) ran before wg.Done()). + closeWatch() + + select { + case _, ok := <-hint: + if ok { + // A buffered pending hint may drain as ok=true ONCE; the next recv must be + // the closed-channel zero. + if _, ok2 := <-hint; ok2 { + t.Fatal("hint channel still open after closeWatch returned; the watcher goroutine was not awaited (round-7 P2-2)") + } + } + default: + t.Fatal("hint channel not closed after closeWatch returned; the goroutine did not exit before closeWatch (round-7 P2-2)") + } + + // closeWatch is idempotent (sync.Once): a second call must return without panicking. + closeWatch() +} + +// --- P2-3: full-tree scanners surface a WARN on a corrupt required owner id ------ + +// TestP2_3_R7_FullTreeCorruptPartOwnerWarns pins P2-3: a full-tree reload of a +// session whose part has an EMPTY required message_id (or session_id) surfaces a +// structured WARN via onWarn (the post-tx warnSink path) and SKIPS the row; it is +// NOT silently attached to out[""] and dropped. The DELTA scanners already abort on +// this (round-5 P2-2); round-7 P2-3 extends the same discipline to the FULL-TREE +// load path (scanPartRows). 
+func TestP2_3_R7_FullTreeCorruptPartOwnerWarns(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_p", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_p", "ses_p", 110, 110, 5, 2) + // A part with a VALID body but an EMPTY message_id (corrupt required ownership id). + // It must NOT land under out[""] silently; it must surface a WARN and be skipped. + if _, err := rw.Exec(`INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?,?)`, + "prt_bad_owner", "", "ses_p", 110, 110, stepFinishBody(5, 2, 0.01)); err != nil { + t.Fatalf("insert corrupt-owner part: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + var warns []error + evs, skipped, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_p", silentLogger(), + func(e error) { warns = append(warns, e) }) + if err != nil { + t.Fatalf("loadAndMapSession: %v", err) + } + if skipped { + t.Fatal("session wrongly skipped") + } + // The corrupt-owner part surfaced a WARN naming the table/column (not a silent drop). + found := false + for _, w := range warns { + if strings.Contains(w.Error(), "required ownership column") && + strings.Contains(w.Error(), "message_id") && strings.Contains(w.Error(), "table=part") { + found = true + } + } + if !found { + t.Errorf("a corrupt part.message_id on the full-tree path did not surface a WARN (round-7 P2-3); got %v", warns) + } + // The session still loaded (the one good message is emitted); the corrupt part was + // skipped, not attached under out[""]. 
+ if !hasSession(evs, "ses_p") { + t.Error("session ses_p was not emitted after skipping its corrupt part") + } +} + +// TestP2_3_R7_OwnerOrWarnUnit pins the ownerOrWarn accessor directly: a present +// non-empty value returns (v, true) with no warn; an empty/absent value returns +// ("", false) with exactly one WARN; a nil onWarn is a no-op (no panic). +func TestP2_3_R7_OwnerOrWarnUnit(t *testing.T) { + t.Parallel() + idx := columnIndex{"message_id": 0, "session_id": 1} + + // Present + non-empty → (v, true), no warn. + var warns []error + dOK := (&scanDest{holders: []sql.NullString{nullStr("msg_1"), nullStr("ses_1")}}).withWarn("part", func(e error) { warns = append(warns, e) }) + if v, ok := dOK.ownerOrWarn(idx, "message_id"); !ok || v != "msg_1" { + t.Errorf("ownerOrWarn(present) = (%q,%v), want (msg_1,true)", v, ok) + } + if len(warns) != 0 { + t.Errorf("ownerOrWarn(present) warned: %v", warns) + } + + // Empty → ("", false), exactly one WARN. + warns = nil + dEmpty := (&scanDest{holders: []sql.NullString{nullStr(""), nullStr("ses_1")}}).withWarn("part", func(e error) { warns = append(warns, e) }) + if v, ok := dEmpty.ownerOrWarn(idx, "message_id"); ok || v != "" { + t.Errorf("ownerOrWarn(empty) = (%q,%v), want (\"\",false)", v, ok) + } + if len(warns) != 1 || !strings.Contains(warns[0].Error(), "message_id") { + t.Errorf("ownerOrWarn(empty) WARN = %v, want exactly one naming message_id", warns) + } + + // nil onWarn → no panic, still returns false. 
+ dNil := &scanDest{holders: []sql.NullString{nullStr(""), nullStr("ses_1")}} + if _, ok := dNil.ownerOrWarn(idx, "message_id"); ok { + t.Error("ownerOrWarn(empty, nil onWarn) returned ok=true, want false") + } +} + +// --- tiny file helpers for the WAL watch test ----------------------------------- + +func writeFileBytes(path string, b []byte) error { return os.WriteFile(path, b, 0o600) } + +func appendFileBytes(path string, b []byte) error { + f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0o600) + if err != nil { + return err + } + defer func() { _ = f.Close() }() + _, werr := f.Write(b) + return werr +} diff --git a/internal/adapters/opencode/schema_test.go b/internal/adapters/opencode/schema_test.go new file mode 100644 index 0000000..b2ae453 --- /dev/null +++ b/internal/adapters/opencode/schema_test.go @@ -0,0 +1,318 @@ +package opencode + +import ( + "context" + "database/sql" + "path/filepath" + "strings" + "testing" +) + +// seedOldSchemaDB builds a synthetic opencode database mimicking a +// pre-20260510033149_session_usage schema: the session table LACKS the +// cost/tokens_* columns (and the later path/agent/model/time_archived +// columns). It keeps the required id/time_created/time_updated columns so the +// table is still readable. This is the AC#5 schema-drift fixture, built +// throwaway in dir, never copied from the operator's database. +func seedOldSchemaDB(t *testing.T, dir string) string { + t.Helper() + path := filepath.Join(dir, "opencode-old.db") + rwDSN := "file:" + escapeURIPath(filepath.ToSlash(path)) + "?_pragma=busy_timeout(5000)" + rw, err := sql.Open(driverName, rwDSN) + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { _ = rw.Close() }() + stmts := []string{ + // Old session: no cost/tokens_*, no path/agent/model/time_archived. 
+ `CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL)`, + `CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + } + for _, s := range stmts { + if _, err := rw.Exec(s); err != nil { + t.Fatalf("seed old schema: %v\nstmt: %s", err, s) + } + } + return path +} + +// TestIntrospectAll_CurrentSchema asserts a current-schema database yields no +// missing columns on any table and a SELECT list covering every wanted column. +func TestIntrospectAll_CurrentSchema(t *testing.T) { + t.Parallel() + path := seedSyntheticDB(t, t.TempDir()) + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + + set, err := introspectAll(context.Background(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + for _, table := range trackedTables { + s, ok := set[table] + if !ok { + t.Fatalf("table %q absent from schemaSet", table) + } + if len(s.Missing) != 0 { + t.Errorf("table %q reports missing columns on current schema: %v", table, s.Missing) + } + if len(s.Present) != len(wantedColumns[table]) { + t.Errorf("table %q present=%v, want all %v", table, s.Present, wantedColumns[table]) + } + } +} + +// TestIntrospectAll_OldSchema is the AC#5 dynamic-SELECT proof. 
Against the +// pre-session_usage schema the session table is missing cost/tokens_* (and +// later optional columns); introspectAll must succeed (required columns +// present), the session tableSchema must list those as Missing, and the built +// SELECT must NOT name any missing column. +func TestIntrospectAll_OldSchema(t *testing.T) { + t.Parallel() + path := seedOldSchemaDB(t, t.TempDir()) + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + + set, err := introspectAll(context.Background(), db) + if err != nil { + t.Fatalf("introspectAll on old schema: %v", err) + } + sess := set["session"] + + // The session_usage columns must be detected as missing. + wantMissing := []string{ + "agent", "cost", "model", "time_archived", "tokens_cache_read", + "tokens_cache_write", "tokens_input", "tokens_output", "tokens_reasoning", + } + missingSet := map[string]bool{} + for _, m := range sess.Missing { + missingSet[m] = true + } + for _, w := range wantMissing { + if !missingSet[w] { + t.Errorf("expected missing column %q not reported (missing=%v)", w, sess.Missing) + } + } + + // The dynamic SELECT must omit every missing column and must never use *. + sel := sess.buildSelect() + if strings.Contains(sel, "*") { + t.Errorf("SELECT must name columns explicitly, never *: %q", sel) + } + for _, m := range sess.Missing { + if strings.Contains(sel, quoteIdent(m)) { + t.Errorf("SELECT references missing column %q: %q", m, sel) + } + } + // Required columns must be present in the SELECT. + for _, r := range requiredColumns["session"] { + if !strings.Contains(sel, quoteIdent(r)) { + t.Errorf("SELECT omits required column %q: %q", r, sel) + } + } + // Sanity: the SELECT pages and orders along the watermark key. 
+ if !strings.Contains(sel, "ORDER BY time_updated, id LIMIT 1000") { + t.Errorf("SELECT missing watermark ordering/paging: %q", sel) + } +} + +// TestIntrospectAll_MissingRequiredFails asserts that a table missing a +// required column (here: message without its data body) is rejected, because +// such a schema cannot be read safely and must surface a fatal error rather +// than emit empty rows. +func TestIntrospectAll_MissingRequiredFails(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path := filepath.Join(dir, "broken.db") + rwDSN := "file:" + escapeURIPath(filepath.ToSlash(path)) + "?_pragma=busy_timeout(5000)" + rw, err := sql.Open(driverName, rwDSN) + if err != nil { + t.Fatalf("open rw: %v", err) + } + defer func() { _ = rw.Close() }() + // session/part/session_message fine; message lacks the required data column. + stmts := []string{ + `CREATE TABLE session (id TEXT PRIMARY KEY, project_id TEXT NOT NULL, slug TEXT NOT NULL, + directory TEXT NOT NULL, title TEXT NOT NULL, version TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL)`, + `CREATE TABLE message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL)`, + `CREATE TABLE part (id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + `CREATE TABLE session_message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, data TEXT NOT NULL)`, + } + for _, s := range stmts { + if _, err := rw.Exec(s); err != nil { + t.Fatalf("seed broken: %v", err) + } + } + _ = rw.Close() + + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + + if _, err := introspectAll(context.Background(), db); err == nil { + t.Fatal("introspectAll on table missing a 
required column: want error") + } +} + +// TestIntrospectTable_UnknownTableRejected asserts the helper refuses a table +// name outside the wantedColumns set (programmer error guard). +func TestIntrospectTable_UnknownTableRejected(t *testing.T) { + t.Parallel() + path := seedSyntheticDB(t, t.TempDir()) + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + if _, err := introspectTable(context.Background(), db, "not_a_table"); err == nil { + t.Fatal("introspectTable(unknown): want error") + } +} + +// TestTableSchema_BuildSelectEmpty covers the defensive empty-columns path. +func TestTableSchema_BuildSelectEmpty(t *testing.T) { + t.Parallel() + s := tableSchema{Table: "session"} + if got := s.buildSelect(); !strings.Contains(got, "WHERE 0") { + t.Errorf("empty-column SELECT = %q, want a no-row query", got) + } +} + +// TestQuoteIdent covers the identifier quoting and embedded-quote escaping. +func TestQuoteIdent(t *testing.T) { + t.Parallel() + if got := quoteIdent("time_updated"); got != `"time_updated"` { + t.Errorf("quoteIdent = %q", got) + } + if got := quoteIdent(`we"ird`); got != `"we""ird"` { + t.Errorf("quoteIdent escape = %q", got) + } +} + +// TestBuildSelect_ScansIntoRowStructs proves the dynamic SELECT and the typed +// row structs fit together end-to-end: the message SELECT built from the live +// schema is executed against the seeded synthetic DB with a zero watermark +// (time_updated > -1 selects everything), scanned into a messageRow, and its +// data column decoded via decodeMessageData. This pins the column order the +// SELECT emits against the struct the later delta-query layer scans into, so a +// future column reordering cannot silently misalign the scan. 
+func TestBuildSelect_ScansIntoRowStructs(t *testing.T) { + t.Parallel() + path := seedSyntheticDB(t, t.TempDir()) + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + + set, err := introspectAll(context.Background(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + + // message has a fixed 5-column shape on every schema; scan it whole. + msgSel := set["message"].buildSelect() + var m messageRow + row := db.QueryRowContext(context.Background(), msgSel, int64(-1), int64(-1), "") + if err := row.Scan(&m.ID, &m.SessionID, &m.TimeCreatedMs, &m.TimeUpdatedMs, &m.Data); err != nil { + t.Fatalf("scan messageRow via buildSelect: %v", err) + } + if m.ID != "msg_aaa" || m.SessionID != "ses_aaa" { + t.Errorf("scanned message wrong: %+v", m) + } + md, err := decodeMessageData(m.Data) + if err != nil { + t.Fatalf("decode scanned message data: %v", err) + } + if md.role() != roleAssistant { + t.Errorf("scanned message role = %v, want assistant", md.role()) + } + + // session present-column SELECT scans into a sessionRow's matching prefix + // (id, project_id, slug, directory, title, version, time_created, + // time_updated on the current synthetic schema, which omits parent_id/agent/ + // model/... only when NULL — here all wanted columns exist). Scan just the + // always-present identity columns to prove the row struct + SELECT align. + sessSel := set["session"].buildSelect() + if sessSel == "" { + t.Fatal("empty session SELECT") + } + // Build a column->index map from the SELECT's Present order to read id only. 
+ present := set["session"].Present + dest := make([]any, len(present)) + holders := make([]sql.NullString, len(present)) + for i := range present { + dest[i] = &holders[i] + } + r2 := db.QueryRowContext(context.Background(), sessSel, int64(-1), int64(-1), "") + if err := r2.Scan(dest...); err != nil { + t.Fatalf("scan session via buildSelect: %v", err) + } + var s sessionRow + for i, c := range present { + if c == "id" { + s.ID = holders[i].String + } + } + if s.ID != "ses_aaa" { + t.Errorf("scanned session id = %q, want ses_aaa", s.ID) + } + + // part and session_message share a fixed-shape SELECT; scan each whole row + // into its typed struct to prove the SELECT column order matches the + // container the delta-query layer (Chunk C) will scan into, and decode the + // part body to confirm the data column round-trips. + partSel := set["part"].buildSelect() + var p partRow + pr := db.QueryRowContext(context.Background(), partSel, int64(-1), int64(-1), "") + if err := pr.Scan(&p.ID, &p.MessageID, &p.SessionID, &p.TimeCreatedMs, &p.TimeUpdatedMs, &p.Data); err != nil { + t.Fatalf("scan partRow via buildSelect: %v", err) + } + if p.ID != "prt_aaa" || p.MessageID != "msg_aaa" { + t.Errorf("scanned part wrong: %+v", p) + } + pd, err := decodePartData(p.Data) + if err != nil { + t.Fatalf("decode scanned part data: %v", err) + } + if pd.kind() != partText { + t.Errorf("scanned part kind = %v, want text", pd.kind()) + } + + smSel := set["session_message"].buildSelect() + var sm sessionMessageRow + smr := db.QueryRowContext(context.Background(), smSel, int64(-1), int64(-1), "") + if err := smr.Scan(&sm.ID, &sm.SessionID, &sm.Type, &sm.TimeCreatedMs, &sm.TimeUpdatedMs, &sm.Data); err != nil { + t.Fatalf("scan sessionMessageRow via buildSelect: %v", err) + } + if sm.ID != "evt_aaa" || sm.Type != "model-switched" { + t.Errorf("scanned session_message wrong: %+v", sm) + } +} diff --git a/internal/adapters/opencode/store.go b/internal/adapters/opencode/store.go new file mode 
100644 index 0000000..3d63882 --- /dev/null +++ b/internal/adapters/opencode/store.go @@ -0,0 +1,242 @@ +package opencode + +import ( + "context" + "database/sql" + "fmt" + "sort" + "strings" +) + +// This file lays the schema-introspection foundation the opencode adapter +// needs before any delta query can run. opencode's schema evolves across ~30 +// historic migrations, so older databases lack columns a newer one has +// (cost/tokens_* on session were added by 20260510033149_session_usage; +// path/agent/model later still). The adapter therefore NEVER issues +// "SELECT *": it probes each table with PRAGMA table_info, intersects the +// live columns with the set it knows how to read, and builds a SELECT list +// naming only columns that actually exist (adapter-opencode.md §"Read +// Strategy", §"Edge Cases" #1; SOW-0005 AC#5). +// +// The delta-query bodies and the poll loop are Chunk C. This file delivers +// only: the per-table wanted-column lists, the table_info probe, the +// dynamic-SELECT builder, and the missing-column detection a later chunk +// wires to a one-shot INF LogEntry. + +// wantedColumns is the ordered set of columns the adapter reads from each +// tracked table, oldest-schema column first. Every name here is a column the +// mapper (later chunks) consumes; the dynamic SELECT names the INTERSECTION +// of this list with the columns PRAGMA table_info reports, so a column absent +// on an older schema is simply omitted and the mapper sees a zero value. The +// lists are verified against the live database schema (read-only probe, +// 2026-05-30) and the migration journal in adapter-opencode.md §"session" / +// §"message" / §"part" / §"session_message". 
+var wantedColumns = map[string][]string{ + "session": { + "id", "project_id", "parent_id", "slug", "directory", "title", + "version", "agent", "model", "cost", "tokens_input", "tokens_output", + "tokens_reasoning", "tokens_cache_read", "tokens_cache_write", + "time_created", "time_updated", "time_archived", "time_compacting", + }, + "message": { + "id", "session_id", "time_created", "time_updated", "data", + }, + "part": { + "id", "message_id", "session_id", "time_created", "time_updated", "data", + }, + "session_message": { + "id", "session_id", "type", "time_created", "time_updated", "data", + }, +} + +// requiredColumns is the subset of wantedColumns whose absence makes a table +// unreadable: the primary key, the watermark column, and the payload/body. +// Their loss is not column drift the adapter can paper over with a zero value +// — it means the schema is incompatible and the caller must surface a fatal +// error (later chunks) rather than silently emit empty rows. The id and +// time_updated columns underpin the cursor; data carries the message/part +// body; session_message additionally needs type to discriminate. +var requiredColumns = map[string][]string{ + "session": {"id", "time_created", "time_updated"}, + "message": {"id", "session_id", "time_updated", "data"}, + "part": {"id", "message_id", "session_id", "time_updated", "data"}, + "session_message": {"id", "session_id", "type", "time_updated", "data"}, +} + +// tableSchema is the result of introspecting one table: the columns the +// adapter wants AND found (Present, in wantedColumns order), the columns it +// wants but did NOT find (Missing, sorted), and the raw set of live column +// names for diagnostics. A later chunk turns Missing into one INF LogEntry +// per (table, column) on first occurrence; this chunk only computes it. +type tableSchema struct { + // Table is the table name this schema describes. 
+ Table string + // Present lists the wanted columns that exist in the live table, in + // wantedColumns order. This is exactly the dynamic SELECT list. + Present []string + // Missing lists the wanted columns absent from the live table, sorted. + // Empty on an up-to-date schema. + Missing []string + // live is the set of column names PRAGMA table_info reported, for + // membership checks and diagnostics. + live map[string]struct{} +} + +// has reports whether the live table has the named column. +func (s tableSchema) has(col string) bool { + _, ok := s.live[col] + return ok +} + +// missingRequired returns the required columns (PK / watermark / body) absent +// from the live table. A non-empty result means the table is unreadable and +// the caller must fail rather than emit misleading zero-valued rows. Empty on +// any schema new enough to read. +func (s tableSchema) missingRequired() []string { + req := requiredColumns[s.Table] + var out []string + for _, c := range req { + if !s.has(c) { + out = append(out, c) + } + } + return out +} + +// introspectTable runs PRAGMA table_info(<table>) on a read-only connection +// and computes the tableSchema: which wanted columns are present, which are +// missing, and the raw live column set. The table name comes from the fixed +// trackedTables/wantedColumns sets (never operator input), so the +// non-parameterisable PRAGMA argument is safe to interpolate. +// +// PRAGMA table_info returns rows (cid, name, type, notnull, dflt_value, pk). +// Only name is needed here. A wanted table that does not exist in the live +// database (PRAGMA returns no rows) yields a tableSchema whose Present is empty +// and Missing is the full wanted list; the caller's missingRequired check then +// reports the table unreadable. A name outside wantedColumns is rejected with +// an error instead. +func introspectTable(ctx context.Context, db *sql.DB, table string) (tableSchema, error) { + wanted, ok := wantedColumns[table] + if !ok { + return tableSchema{}, fmt.Errorf("opencode: introspect unknown table %q", table) + } + + // table is from the fixed wantedColumns map, not operator input. PRAGMA + // arguments cannot be bound as query parameters, so it is interpolated; + // quoting it as an identifier defends against any future drift in the + // source of the name. 
+ rows, err := db.QueryContext(ctx, `PRAGMA table_info(`+quoteIdent(table)+`)`) + if err != nil { + return tableSchema{}, fmt.Errorf("opencode: table_info(%s): %w", table, err) + } + defer func() { _ = rows.Close() }() + + live := map[string]struct{}{} + for rows.Next() { + var ( + cid int + name string + ctype string + notnull int + dflt sql.NullString + pk int + ) + if err := rows.Scan(&cid, &name, &ctype, &notnull, &dflt, &pk); err != nil { + return tableSchema{}, fmt.Errorf("opencode: scan table_info(%s): %w", table, err) + } + live[name] = struct{}{} + } + if err := rows.Err(); err != nil { + return tableSchema{}, fmt.Errorf("opencode: iterate table_info(%s): %w", table, err) + } + + s := tableSchema{Table: table, live: live} + for _, col := range wanted { + if _, found := live[col]; found { + s.Present = append(s.Present, col) + } else { + s.Missing = append(s.Missing, col) + } + } + sort.Strings(s.Missing) + return s, nil +} + +// schemaSet is the introspection result for every tracked table, keyed by +// table name. A later chunk holds this on the adapter and consults it before +// each delta query; this chunk only builds it. +type schemaSet map[string]tableSchema + +// introspectAll probes every tracked table and returns the schemaSet. It +// fails fast if any table is missing a required column (PK / watermark / +// body), because such a table cannot be read safely — the caller surfaces the +// error rather than emitting empty rows. Column drift that is NOT required +// (e.g. an old session row missing cost/tokens_*) is recorded in each +// tableSchema's Missing and tolerated. 
+func introspectAll(ctx context.Context, db *sql.DB) (schemaSet, error) { + out := make(schemaSet, len(trackedTables)) + for _, table := range trackedTables { + s, err := introspectTable(ctx, db, table) + if err != nil { + return nil, err + } + if missing := s.missingRequired(); len(missing) > 0 { + return nil, fmt.Errorf("opencode: table %q missing required column(s) %v (schema too old or incompatible)", table, missing) + } + out[table] = s + } + return out, nil +} + +// buildSelect renders a delta-read SELECT for the table naming ONLY the +// columns present in the live schema (never SELECT *), ordered by the +// composite watermark key (time_updated, id) the cursor advances along, with +// the standard 1000-row page LIMIT that keeps each read transaction short +// (adapter-opencode.md §"Cursor"). The watermark VALUES are intentionally LEFT +// to the caller via three bind placeholders so the delta-query layer (Chunk C) +// can supply them; this chunk emits the column list, watermark predicate, +// ordering, and paging skeleton so the dynamic-SELECT behaviour (AC#5) is +// testable now. +// +// The returned statement has three positional parameters: max_time_updated, +// max_time_updated again, and max_id, in that order, matching: +// +// WHERE time_updated > ? OR (time_updated = ? AND id > ?) +// +// quoteIdent guards every selected column and the table name; both come +// from the fixed wantedColumns map and the live-schema intersection, never +// operator input. +func (s tableSchema) buildSelect() string { + cols := s.Present + if len(cols) == 0 { + // Defensive: introspectAll rejects a table with no readable columns, + // so this path is unreachable in production. Return a syntactically + // valid no-row query rather than an empty string a caller might run. 
+ return "SELECT 1 WHERE 0" + } + quoted := make([]string, len(cols)) + for i, c := range cols { + quoted[i] = quoteIdent(c) + } + var b strings.Builder + b.WriteString("SELECT ") + b.WriteString(strings.Join(quoted, ", ")) + b.WriteString(" FROM ") + b.WriteString(quoteIdent(s.Table)) + b.WriteString(" WHERE time_updated > ? OR (time_updated = ? AND id > ?)") + b.WriteString(" ORDER BY time_updated, id LIMIT 1000") + return b.String() +} + +// NOTE: there is intentionally NO id-only delta SELECT. time_updated is a +// REQUIRED column for every tracked table (requiredColumns), enforced by +// introspectAll which fails fast when it is absent. Every schema that reaches a +// delta query therefore has time_updated, so the composite-key buildSelect above +// is the ONLY delta SELECT — the old pre-Timestamps-mixin id-only fallback +// (buildSelectByID) was unreachable dead code and was removed (SOW-0005 P3.1). + +// quoteIdent wraps a SQL identifier in double quotes, escaping any embedded +// double quote per SQLite identifier rules. All identifiers passed here are +// from the adapter's fixed column/table sets, never operator input; the +// quoting is defence-in-depth so a future column added to wantedColumns that +// happens to be a SQL keyword still parses. +func quoteIdent(id string) string { + return `"` + strings.ReplaceAll(id, `"`, `""`) + `"` +} diff --git a/internal/adapters/opencode/store_load.go b/internal/adapters/opencode/store_load.go new file mode 100644 index 0000000..1cc0da8 --- /dev/null +++ b/internal/adapters/opencode/store_load.go @@ -0,0 +1,279 @@ +package opencode + +import ( + "context" + "database/sql" + "errors" + "fmt" +) + +// This file is the TREE-LOAD layer (SOW-0005 chunk C): given an affected session +// id, it loads the whole session tree — the session row, its messages ordered by +// (time_created, id), and each message's parts ordered by (id) — and assembles +// the []messageWithParts the pure mapper consumes. 
Full-tree reload is mandatory +// (not partial): mapSession computes per-turn cumulative-token deltas across the +// ordered message list, so a partial reload would miscompute deltas +// (adapter-opencode.md §"Read Strategy" → "Full-session-tree load + map"). It +// also holds the per-table delta-row scanners (sessionRow/messageRow/partRow/ +// sessionMessageRow) that scanTableDelta (store_query.go) drives. Every read uses +// the schema's PRESENT columns only (never SELECT *), so an old schema missing a +// column degrades to a zero value rather than failing. + +// errSessionGone marks an affected session whose session row could not be loaded +// (deleted between the delta page and the tree load, or a part/message orphaned +// from its session). The poll loop skips it with one structured error and +// continues with the remaining sessions (adapter-opencode.md §"Read Strategy"). +var errSessionGone = errors.New("opencode: session row not found") + +// maxSessionMessagesWarn / maxSessionPartsWarn are DEFENSIVE upper bounds on a +// single session's in-memory tree (SOW-0005 round-3 P2-3). The full ordered tree +// MUST be loaded at once — the mapper synthesizes per-turn token deltas by +// subtracting successive cumulative snapshots, so there is no correct streaming +// decomposition. These caps do NOT truncate; they only mark the threshold above +// which the loader emits ONE structured WARN via onWarn so a pathological or +// corrupt session surfaces (in the logs and, via the adapter's onError → +// SourceError, in /api/health) instead of silently spiking memory. Set +// generously: real opencode sessions are far below 100k of either. +const ( + maxSessionMessagesWarn = 100_000 + maxSessionPartsWarn = 100_000 +) + +// warnIfSessionTooLarge emits ONE structured WARN via onWarn when a session's +// loaded message or part count exceeds its defensive bound (SOW-0005 round-3 +// P2-3). 
It NEVER truncates — the caller still processes the whole tree — it only +// SURFACES the anomaly. onWarn may be nil (the pure no-DB path), in which case the +// check is a no-op. The part count is summed across the per-message map. +func warnIfSessionTooLarge(sessionID string, msgs []messageRow, partsByMessage map[string][]partRow, onWarn func(error)) { + if onWarn == nil { + return + } + if len(msgs) > maxSessionMessagesWarn { + onWarn(fmt.Errorf("opencode: session %s has %d messages (> %d); processing in full — possible pathological/corrupt session (P2-3)", sessionID, len(msgs), maxSessionMessagesWarn)) + } + parts := 0 + for _, ps := range partsByMessage { + parts += len(ps) + } + if parts > maxSessionPartsWarn { + onWarn(fmt.Errorf("opencode: session %s has %d parts (> %d); processing in full — possible pathological/corrupt session (P2-3)", sessionID, parts, maxSessionPartsWarn)) + } +} + +// The dynamic scan-destination decoder (columnIndex + scanDest and its typed +// accessors str/i64/f64/i64Required/strRequired/ownerOrWarn/bytes) lives in +// store_scandest.go (split to keep each file ≤400 lines). The per-table delta-row +// scanners (scanSessionRow/scanMessageRow/scanPartRow/scanSessionMessageRow) live +// in store_scan.go. + +// --- full-session-tree load --------------------------------------------------- +// +// resolveRootID (the parent_id chain walk that gives a nested sub-agent its TRUE +// tree root, SOW-0005 P2.4) lives in store_root.go (split to keep this file ≤400 +// lines). + +// roQuerier is the read-only query surface both *sql.DB and *sql.Tx satisfy. The +// tree-load helpers take it so the SAME code path runs either against the pool +// directly (test entrypoints) or inside ONE shared read-only transaction +// (loadAndMapSession's single consistent snapshot — SOW-0005 round-3 P1-2). 
+type roQuerier interface { + QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error) + QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row +} + +// loadSession loads one session row by id via the present-column SELECT. The +// bool is false (with no error) when the row does not exist (the affected +// session was deleted between the delta and the load); the caller skips it. +// onWarn surfaces a corrupt numeric cell (SOW-0005 P2.6); it may be nil. q is any +// roQuerier (the pool or the shared snapshot tx — P1-2). +func loadSession(ctx context.Context, q roQuerier, schema schemaSet, id string, onWarn func(error)) (sessionRow, bool, error) { + s := schema["session"] + idx := newColumnIndex(s) + query := selectByIDList(s) + d := newScanDest(len(s.Present)).withWarn("session", onWarn) + err := q.QueryRowContext(ctx, query, id).Scan(d.ptrs...) + if errors.Is(err, sql.ErrNoRows) { + return sessionRow{}, false, nil + } + if err != nil { + return sessionRow{}, false, fmt.Errorf("opencode: load session %s: %w", id, err) + } + return sessionRow{ + ID: d.str(idx, "id"), + ProjectID: d.str(idx, "project_id"), + ParentID: d.str(idx, "parent_id"), + Slug: d.str(idx, "slug"), + Directory: d.str(idx, "directory"), + Title: d.str(idx, "title"), + Version: d.str(idx, "version"), + Agent: d.str(idx, "agent"), + Model: d.bytes(idx, "model"), + Cost: d.f64(idx, "cost"), + TokensInput: d.i64(idx, "tokens_input"), + TokensOutput: d.i64(idx, "tokens_output"), + TokensReason: d.i64(idx, "tokens_reasoning"), + TokensCacheRd: d.i64(idx, "tokens_cache_read"), + TokensCacheWr: d.i64(idx, "tokens_cache_write"), + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: d.i64(idx, "time_updated"), + TimeArchivedMs: d.i64(idx, "time_archived"), + TimeCompactingMs: d.i64(idx, "time_compacting"), + }, true, nil +} + +// loadSessionTree loads a session's full ordered message+part tree as +// []messageWithParts: messages ordered by (time_created, id), 
each with its +// parts ordered by (id). The whole tree is required so the mapper's per-turn +// token deltas are correct (adapter-opencode.md §"Read Strategy"). +// +// q is any roQuerier. The PRODUCTION path (loadAndMapSession) passes the SAME +// read-only transaction that already read the session row and checked +// time_compacting, so the session metadata, the compaction check, and the tree +// share ONE consistent snapshot (SOW-0005 round-3 P1-2: no compaction-race +// TOCTOU). Test entrypoints may pass the pool directly. The tree-load itself does +// NOT begin/commit a transaction — its caller owns the snapshot lifecycle. +// +// Parts are loaded with ONE query for the whole session (SOW-0005 round-2 P2-B), +// NOT one query per message: the part table denormalizes session_id, so a single +// WHERE session_id = ? ORDER BY (message_id, id) returns every part already +// grouped by message; the rows are then partitioned in memory and attached to +// each message in its (time_created, id) order. session_id is a REQUIRED part +// column (introspectAll), so there is no old-schema message_id-IN fallback +// (SOW-0005 round-3 P3-1 removed the unreachable one). +// +// As a defensive safety signal, the loaded message and part counts are bounded by +// maxSessionMessagesWarn / maxSessionPartsWarn: a session exceeding either emits +// ONE structured WARN via onWarn and is STILL processed in full — the whole +// ordered tree is mandatory for the token-delta synthesis, so truncating would +// corrupt the deltas (SOW-0005 round-3 P2-3). 
+func loadSessionTree(ctx context.Context, q roQuerier, schema schemaSet, sessionID string, onWarn func(error)) ([]messageWithParts, error) { + msgs, err := loadMessages(ctx, q, schema["message"], sessionID, onWarn) + if err != nil { + return nil, err + } + partsByMessage, err := loadSessionParts(ctx, q, schema["part"], sessionID, onWarn) + if err != nil { + return nil, err + } + warnIfSessionTooLarge(sessionID, msgs, partsByMessage, onWarn) + out := make([]messageWithParts, 0, len(msgs)) + for i := range msgs { + out = append(out, messageWithParts{Message: msgs[i], Parts: partsByMessage[msgs[i].ID]}) + } + return out, nil +} + +// loadMessages reads a session's messages ordered by (time_created, id). The +// order column names come from the live schema; they are required columns +// (introspectAll guarantees session_id/time_updated/data present), and +// time_created is in wantedColumns for message on every schema. q is any +// roQuerier (the shared snapshot tx in production). +func loadMessages(ctx context.Context, qr roQuerier, s tableSchema, sessionID string, onWarn func(error)) ([]messageRow, error) { + idx := newColumnIndex(s) + q := selectByColumn(s, "session_id", messageOrderBy(s)) + rows, err := qr.QueryContext(ctx, q, sessionID) + if err != nil { + return nil, fmt.Errorf("opencode: load messages for %s: %w", sessionID, err) + } + defer func() { _ = rows.Close() }() + + var out []messageRow + for rows.Next() { + if err := ctx.Err(); err != nil { + return nil, err + } + d := newScanDest(len(s.Present)).withWarn("message", onWarn) + if err := rows.Scan(d.ptrs...); err != nil { + return nil, fmt.Errorf("opencode: scan message: %w", err) + } + // id is the key the mapper uses to attach this message's parts + // (partsByMessage[id]); session_id is a required ownership column. A + // corrupt/empty value would mis-key the message's parts and is surfaced + + // skipped, not silently zeroed (SOW-0005 round-7 P2-3). 
+ id, ok := d.ownerOrWarn(idx, "id") + if !ok { + continue + } + sid, ok := d.ownerOrWarn(idx, "session_id") + if !ok { + continue + } + out = append(out, messageRow{ + ID: id, + SessionID: sid, + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: d.i64(idx, "time_updated"), + Data: d.bytes(idx, "data"), + }) + } + if err := rows.Err(); err != nil { + return nil, fmt.Errorf("opencode: iterate messages: %w", err) + } + return out, nil +} + +// loadSessionParts reads ALL of a session's parts in ONE indexed query over the +// denormalized session_id (SOW-0005 round-2 P2-B — replaces the per-message N+1 +// loop), ordered (message_id, id) so each message's parts arrive contiguous and in +// creation order. scanPartRows partitions them by message_id. session_id is a +// REQUIRED part column (introspectAll fails fast without it), so a part table that +// reaches this function ALWAYS has it — the round-2 P2-B old-schema +// message_id-IN fallback was unreachable and was removed (SOW-0005 round-3 P3-1). +// Returns a map keyed by message_id; a message with no parts is simply absent +// (nil slice on lookup). q is any roQuerier (the shared snapshot tx in production). +func loadSessionParts(ctx context.Context, qr roQuerier, s tableSchema, sessionID string, onWarn func(error)) (map[string][]partRow, error) { + q := selectByColumn(s, "session_id", quoteIdent("message_id")+", "+quoteIdent("id")) + rows, err := qr.QueryContext(ctx, q, sessionID) + if err != nil { + return nil, fmt.Errorf("opencode: load parts for session %s: %w", sessionID, err) + } + return scanPartRows(ctx, rows, s, onWarn, "session "+sessionID) +} + +// scanPartRows scans a part result set and partitions the rows into a map keyed by +// message_id, preserving the query's (message_id, id) order within each group. It +// owns closing rows. label is used only in error context. 
+func scanPartRows(ctx context.Context, rows *sql.Rows, s tableSchema, onWarn func(error), label string) (map[string][]partRow, error) { + defer func() { _ = rows.Close() }() + idx := newColumnIndex(s) + out := map[string][]partRow{} + for rows.Next() { + if err := ctx.Err(); err != nil { + return nil, err + } + d := newScanDest(len(s.Present)).withWarn("part", onWarn) + if err := rows.Scan(d.ptrs...); err != nil { + return nil, fmt.Errorf("opencode: scan part (%s): %w", label, err) + } + // message_id is the partition key (out[message_id]) and session_id is a + // required ownership column; a corrupt/empty value would land the part under + // out[""] and be silently dropped when the mapper looks parts up by a real + // message id. Surface a WARN and SKIP the row instead (SOW-0005 round-7 P2-3). + mid, ok := d.ownerOrWarn(idx, "message_id") + if !ok { + continue + } + sid, ok := d.ownerOrWarn(idx, "session_id") + if !ok { + continue + } + p := partRow{ + ID: d.str(idx, "id"), + MessageID: mid, + SessionID: sid, + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: d.i64(idx, "time_updated"), + Data: d.bytes(idx, "data"), + } + out[p.MessageID] = append(out[p.MessageID], p) + } + if err := rows.Err(); err != nil { + return nil, fmt.Errorf("opencode: iterate parts (%s): %w", label, err) + } + return out, nil +} + +// The present-column SELECT builders (presentColsSQL/selectByIDList/ +// selectByColumn/messageOrderBy) and the numeric parse helpers +// (parseInt64*/parseFloat64*) live in store_load_sql.go (split to keep this file +// ≤400 lines). 
diff --git a/internal/adapters/opencode/store_load_sql.go b/internal/adapters/opencode/store_load_sql.go new file mode 100644 index 0000000..79aa2d0 --- /dev/null +++ b/internal/adapters/opencode/store_load_sql.go @@ -0,0 +1,94 @@ +package opencode + +import ( + "strconv" + "strings" +) + +// This file holds the present-column SELECT builders for point/ordered loads and +// the numeric parse helpers the row scanners use. Split out of store_load.go to +// keep each file ≤400 lines. Every identifier passed to a builder comes from the +// fixed schema (never operator input); quoteIdent (store.go) defends against any +// future column that is a SQL keyword. + +// presentColsSQL renders the quoted present-column list for a table schema, the +// shared prefix of every load SELECT (never SELECT *). +func presentColsSQL(s tableSchema) string { + cols := s.Present + quoted := make([]string, len(cols)) + for i, c := range cols { + quoted[i] = quoteIdent(c) + } + return strings.Join(quoted, ", ") +} + +// selectByIDList builds "SELECT <cols> FROM <table> WHERE id = ?" for a point +// load by primary key. Identifiers come from the fixed schema, never user input. +func selectByIDList(s tableSchema) string { + return "SELECT " + presentColsSQL(s) + " FROM " + quoteIdent(s.Table) + " WHERE id = ?" +} + +// selectByColumn builds "SELECT <cols> FROM <table> WHERE <col> = ? ORDER BY +// <orderBy>" for an ordered child load. col and orderBy are fixed schema +// identifiers (session_id/message_id; the order key), never user input; orderBy +// is already a comma-separated quoted key. +func selectByColumn(s tableSchema, col, orderBy string) string { + return "SELECT " + presentColsSQL(s) + " FROM " + quoteIdent(s.Table) + + " WHERE " + quoteIdent(col) + " = ? ORDER BY " + orderBy +} + +// messageOrderBy returns the message ordering key: "time_created", "id" when the +// schema has time_created (every observed schema does), else "id" alone. 
The +// mapper requires assistant messages in (time_created, id) order +// (adapter-opencode.md §"Turn synthesis"). +func messageOrderBy(s tableSchema) string { + if s.has("time_created") { + return quoteIdent("time_created") + ", " + quoteIdent("id") + } + return quoteIdent("id") +} + +// parseInt64 parses a decimal integer column value sqlite returned as text, +// returning 0 for a non-numeric value (defensive — opencode integer columns are +// always numeric, but a corrupt cell must not panic the loader). +func parseInt64(s string) int64 { + v, _ := parseInt64Checked(s) + return v +} + +// parseInt64Checked parses a decimal integer column value, returning (0, false) +// for a non-numeric value so the caller can surface a corruption WARN +// (SOW-0005 P2.6). An empty/whitespace string is NOT corruption (it maps to 0, +// true) — a NULL never reaches here (the caller gates on Valid). +func parseInt64Checked(s string) (int64, bool) { + t := strings.TrimSpace(s) + if t == "" { + return 0, true + } + v, err := strconv.ParseInt(t, 10, 64) + if err != nil { + return 0, false + } + return v, true +} + +// parseFloat64 parses a real column value sqlite returned as text, returning 0 +// for a non-numeric value. +func parseFloat64(s string) float64 { + v, _ := parseFloat64Checked(s) + return v +} + +// parseFloat64Checked parses a real column value, returning (0, false) for a +// non-numeric value so the caller can surface a corruption WARN (SOW-0005 P2.6). 
+func parseFloat64Checked(s string) (float64, bool) { + t := strings.TrimSpace(s) + if t == "" { + return 0, true + } + v, err := strconv.ParseFloat(t, 64) + if err != nil { + return 0, false + } + return v, true +} diff --git a/internal/adapters/opencode/store_load_test.go b/internal/adapters/opencode/store_load_test.go new file mode 100644 index 0000000..2ce9e05 --- /dev/null +++ b/internal/adapters/opencode/store_load_test.go @@ -0,0 +1,118 @@ +package opencode + +import ( + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the tree-load layer: loadSession (found / not-found), +// loadSessionTree ordering (messages by (time_created,id), parts by id), and a +// zero-message session loading cleanly — the []messageWithParts contract the +// pure mapper consumes. + +// TestLoadSession_FoundAndMissing asserts loadSession returns the row when it +// exists and (zero, false, nil) when it does not. +func TestLoadSession_FoundAndMissing(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "ses_parent", 100, 150, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + s, ok, err := loadSession(ctxBG(), db, schema, "ses_a", func(error) {}) + if err != nil { + t.Fatalf("loadSession: %v", err) + } + if !ok { + t.Fatal("loadSession(ses_a) ok=false, want true") + } + if s.ID != "ses_a" || s.ParentID != "ses_parent" || s.TimeCreatedMs != 100 { + t.Errorf("loaded session = %+v, want id=ses_a parent=ses_parent created=100", s) + } + + _, ok, err = loadSession(ctxBG(), db, schema, "ses_nope", func(error) {}) + if err != nil { + t.Fatalf("loadSession(missing): %v", err) + } + if ok { + t.Error("loadSession(missing) ok=true, want false") + } +} + +// TestLoadSessionTree_Ordering builds a session with two messages (inserted +// out of time order) and parts (inserted out of id order), then asserts +// 
loadSessionTree returns messages by (time_created,id) and parts by id. +func TestLoadSessionTree_Ordering(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + // Insert msg_2 (later) BEFORE msg_1 (earlier) so insertion order != time order. + insertAssistantMessage(t, rw, "msg_2", "ses_a", 200, 200, 1, 1) + insertAssistantMessage(t, rw, "msg_1", "ses_a", 100, 100, 1, 1) + // Parts of msg_1 inserted out of id order: prt_b then prt_a. + insertPart(t, rw, "prt_b", "msg_1", "ses_a", 110, 110, textBody("second")) + insertPart(t, rw, "prt_a", "msg_1", "ses_a", 105, 105, stepStartBody()) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + tree, err := loadSessionTree(ctxBG(), db, schema, "ses_a", func(error) {}) + if err != nil { + t.Fatalf("loadSessionTree: %v", err) + } + if len(tree) != 2 { + t.Fatalf("tree has %d messages, want 2", len(tree)) + } + // Messages ordered by (time_created, id): msg_1 (100) then msg_2 (200). + if tree[0].Message.ID != "msg_1" || tree[1].Message.ID != "msg_2" { + t.Errorf("message order = [%s %s], want [msg_1 msg_2]", tree[0].Message.ID, tree[1].Message.ID) + } + // Parts of msg_1 ordered by id: prt_a then prt_b. + parts := tree[0].Parts + if len(parts) != 2 || parts[0].ID != "prt_a" || parts[1].ID != "prt_b" { + t.Errorf("part order = %v, want [prt_a prt_b]", partIDs(parts)) + } + // msg_2 has no parts. + if len(tree[1].Parts) != 0 { + t.Errorf("msg_2 parts = %d, want 0", len(tree[1].Parts)) + } +} + +// TestLoadSessionTree_ZeroMessages asserts a session with no messages loads as an +// empty tree (not an error), so mapSession emits just the SessionStarted. 
+func TestLoadSessionTree_ZeroMessages(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_empty", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + tree, err := loadSessionTree(ctxBG(), db, schema, "ses_empty", func(error) {}) + if err != nil { + t.Fatalf("loadSessionTree(empty): %v", err) + } + if len(tree) != 0 { + t.Errorf("empty-session tree has %d messages, want 0", len(tree)) + } + + // And loadAndMapSession over it yields exactly one SessionStarted, no more. + evs, skipped, err := loadAndMapSession(ctxBG(), db, schema, "opencode:test", "ses_empty", silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("loadAndMapSession(empty): %v", err) + } + if skipped { + t.Fatal("loadAndMapSession(empty) reported skipped; want emit") + } + if n := countKind(evs, canonical.EvSessionStarted); n != 1 { + t.Errorf("empty session emitted %d SessionStarted, want 1", n) + } +} diff --git a/internal/adapters/opencode/store_query.go b/internal/adapters/opencode/store_query.go new file mode 100644 index 0000000..4cdfe06 --- /dev/null +++ b/internal/adapters/opencode/store_query.go @@ -0,0 +1,254 @@ +package opencode + +import ( + "context" + "database/sql" + "fmt" +) + +// This file is the SQL DELTA-QUERY layer (SOW-0005 chunk C): one paged delta +// query per tracked table, the cheap PK-indexed MAX(id) change check, the gated +// (expensive) MAX(time_updated) probe, and the affected-session derivation that +// turns a batch of changed rows into the SET of session ids whose full tree the +// tailer reloads. Every query runs in its own SHORT read-only transaction +// (BEGIN DEFERRED) so the live opencode writer's WAL is never pinned +// (adapter-opencode.md §"Read Strategy" → "Delta query…"; the page SQL is the +// SOW-recorded template). The tree LOAD lives in store_load.go; the poll loops +// in tailer.go. 
This layer performs SQL but no event mapping — it hands typed +// rows to the pure mapper via the loader. + +// deltaPageLimit is the per-page row cap. It matches buildSelect's hardcoded +// LIMIT 1000 (store.go) and the SOW-recorded page size; a page shorter than this +// is the last page for a table. Kept as a named constant so the paging loop's +// short-page test reads against the same value the SELECT embeds. +const deltaPageLimit = 1000 + +// rowKey is the watermark pair (id, time_updated) the per-table delta scan +// reports for each row so the paging loop can advance the cursor without +// re-scanning the row. On an old schema lacking time_updated the scan reports +// timeUpdatedMs = 0 and only id advances. +type rowKey struct { + id string + timeUpdatedMs int64 +} + +// tableDelta is the result of paging one table forward from a watermark: the +// advanced watermark (the max (time_updated, id) seen) and the number of rows +// read across all pages (used by the backfill loop to checkpoint progress). +type tableDelta struct { + watermark TableWatermark + rowCount int +} + +// beginRO opens a short read-only deferred transaction. database/sql maps +// ReadOnly:true to BEGIN DEFERRED for modernc.org/sqlite, taking the snapshot on +// the first statement and never pinning the WAL for writes. The caller MUST +// commit/rollback promptly (one page per tx) to keep the snapshot window <1 s. +func beginRO(ctx context.Context, db *sql.DB) (*sql.Tx, error) { + tx, err := db.BeginTx(ctx, &sql.TxOptions{ReadOnly: true}) + if err != nil { + return nil, fmt.Errorf("opencode: begin ro tx: %w", err) + } + return tx, nil +} + +// maxID returns MAX(id) for the table via the PK-indexed b-tree (the cheap +// primary change check — ~µs, adapter-opencode.md §"Performance"). An empty +// table yields "" (no rows). The table name is a fixed trackedTables entry, so +// it is safe to interpolate (quoted as an identifier defensively). 
+func maxID(ctx context.Context, db *sql.DB, table string) (string, error) { + var id sql.NullString + q := `SELECT MAX(id) FROM ` + quoteIdent(table) // #nosec G202 -- table is a fixed trackedTables identifier via quoteIdent, never user input + if err := db.QueryRowContext(ctx, q).Scan(&id); err != nil { + return "", fmt.Errorf("opencode: max(id) %s: %w", table, err) + } + return id.String, nil +} + +// maxTimeUpdated returns MAX(time_updated) for the table. This is the EXPENSIVE, +// UNINDEXED probe (a full scan — 400–800 ms on the 585k-row part table), so the +// tailer issues it ONLY when its gate is open (shouldProbeTimeUpdated). A table +// on an old schema lacking the column is reported by the caller before this runs +// (the caller checks tableSchema.has); this query assumes the column exists. +// Returns 0 for an empty table. +func maxTimeUpdated(ctx context.Context, db *sql.DB, table string) (int64, error) { + var v sql.NullInt64 + q := `SELECT MAX(time_updated) FROM ` + quoteIdent(table) // #nosec G202 -- table is a fixed trackedTables identifier via quoteIdent, never user input + if err := db.QueryRowContext(ctx, q).Scan(&v); err != nil { + return 0, fmt.Errorf("opencode: max(time_updated) %s: %w", table, err) + } + return v.Int64, nil +} + +// scanTableDelta pages one table forward from `from`, invoking onRow for every +// changed row. onRow scans the table-specific columns AND returns the row's +// (id, time_updated) so the paging loop advances the watermark without a second +// Scan of the cursor. It pages until a short page ( 0 { + wm = page.watermark + } + if page.n < deltaPageLimit { + break // short page → caught up + } + } + return tableDelta{watermark: wm, rowCount: total}, nil +} + +// pageResult is one page's outcome: rows read and the max (time_updated, id) +// observed within the page. 
+type pageResult struct { + n int + watermark TableWatermark +} + +// scanOnePage runs one page of the composite-key delta query inside a fresh +// read-only tx, invoking onRow per row and tracking the page's max watermark. The +// tx is committed before returning so the WAL is released between pages (the +// snapshot advances between pages, which is correct for a tailing reader). The +// bind is always the 3-param (time_updated, time_updated, id) form — time_updated +// is a required column on every tracked table (introspectAll), so there is no +// id-only variant. +// +// No warning/error EMISSION happens while the tx is open (SOW-0005 round-5 P2-1): +// onRow writes any corrupt-cell / unknown-type WARN into sink (a non-blocking +// slice append), the tx is committed/rolled back FIRST (explicitly, not via the +// deferred rollback — so the snapshot is provably released), and only THEN are the +// buffered warnings flushed through the live onError. A FATAL row error (a corrupt +// REQUIRED watermark/owning-id cell — round-4 P2-1 / round-5 P2-2) is RETURNED, not +// emitted inside; the tx is rolled back and the sink flushed before it propagates, +// so neither the warnings NOR the fatal error reach the (possibly backpressured) +// out channel with the WAL snapshot still pinned. sink is reset by flush, ready +// for the next page. 
+func scanOnePage(ctx context.Context, db *sql.DB, query string, from TableWatermark, onRow func(rows *sql.Rows) (rowKey, error), sink *warnSink, onError func(error)) (pageResult, error) { + tx, err := beginRO(ctx, db) + if err != nil { + return pageResult{}, err + } + + rows, err := tx.QueryContext(ctx, query, from.MaxTimeUpdatedMs, from.MaxTimeUpdatedMs, from.MaxTimeUpdatedID) + if err != nil { + _ = tx.Rollback() // close the tx before any (post-tx) error surfacing + sink.flush(onError) + return pageResult{}, fmt.Errorf("opencode: delta query: %w", err) + } + + res := pageResult{watermark: from} + scanErr := iterDeltaPage(ctx, rows, &res, onRow) + _ = rows.Close() + if scanErr == nil { + scanErr = rows.Err() + } + // Close the tx (releasing the WAL snapshot) BEFORE flushing buffered warnings + // or surfacing a fatal row error — so a backpressured onError can never block + // with the snapshot held (P2-1). On a scan/iterate error we roll back; on a + // clean page we commit. + if scanErr != nil { + _ = tx.Rollback() + sink.flush(onError) + return res, fmt.Errorf("opencode: delta page: %w", scanErr) + } + commitErr := tx.Commit() + sink.flush(onError) + if commitErr != nil { + return res, fmt.Errorf("opencode: commit ro tx: %w", commitErr) + } + return res, nil +} + +// iterDeltaPage walks one page's rows, delegating each row's column scan to +// onRow and advancing the watermark from each row's (id, time_updated). The +// PAGING POSITION (MaxTimeUpdatedMs, MaxTimeUpdatedID) is set to the last-paged +// row (rows arrive in (time_updated, id) order, so the last row is the max +// position). MaxIDSeen is raised MONOTONICALLY to the greatest id seen — never +// regressing — so an in-place UPDATE of an OLD row (which sorts LAST by +// time_updated but carries a small id) advances the paging position WITHOUT +// pulling the cheap-detect high-water backwards (SOW-0005 round-2 P1-A). 
time_ +// updated is always present on a tracked table, so both position fields advance. +func iterDeltaPage(ctx context.Context, rows *sql.Rows, res *pageResult, onRow func(rows *sql.Rows) (rowKey, error)) error { + for rows.Next() { + if err := ctx.Err(); err != nil { + return err + } + key, err := onRow(rows) + if err != nil { + return err + } + res.n++ + res.watermark.MaxTimeUpdatedMs = key.timeUpdatedMs + res.watermark.MaxTimeUpdatedID = key.id + res.watermark = res.watermark.advanceMaxIDSeen(key.id) + } + return nil +} + +// affectedSet accumulates, de-duplicated and in first-seen order, the session +// ids whose full tree must be reloaded after a change cycle. First-seen order +// keeps reload (and thus emission) deterministic for a given delta batch. +type affectedSet struct { + seen map[string]struct{} + order []string +} + +// newAffectedSet returns an empty set. +func newAffectedSet() *affectedSet { + return &affectedSet{seen: map[string]struct{}{}} +} + +// add records a session id (ignoring empties and duplicates). +func (a *affectedSet) add(id string) { + if id == "" { + return + } + if _, ok := a.seen[id]; ok { + return + } + a.seen[id] = struct{}{} + a.order = append(a.order, id) +} + +// ids returns the affected session ids in first-seen order. +func (a *affectedSet) ids() []string { return a.order } + +// resolvePartSession returns the owning session id for a changed part row. The +// part table denormalizes session_id (adapter-opencode.md §"part"), and session_id +// is a REQUIRED part column (requiredColumns["part"], store.go) — introspectAll makes +// its absence FATAL upstream, so a part table that reaches this layer ALWAYS has it. +// The round-2 P2-B old-schema message-lookup fallback (message_id → session_id via a +// pool query, consulting an in-run message→session map first) was therefore +// UNREACHABLE in production and was removed (SOW-0005 round-6 P3-2; same class as the +// round-3 P3-1 dead-fallback removal). 
The delta scanner (scanPartRow → requiredOwner, +// round-5 P2-2) already ERRORS the page on an empty/corrupt session_id before this is +// reached, so p.SessionID is non-empty here; the empty guard remains as defence in +// depth (it returns an error rather than deriving an empty affected session, which +// affectedSet.add would silently drop while the row "succeeded" → a cursor gap). +func resolvePartSession(p partRow) (string, error) { + if p.SessionID == "" { + return "", fmt.Errorf("opencode: part %s has empty session_id (required column); refusing to derive an empty affected session", p.ID) + } + return p.SessionID, nil +} diff --git a/internal/adapters/opencode/store_query_test.go b/internal/adapters/opencode/store_query_test.go new file mode 100644 index 0000000..e3570a1 --- /dev/null +++ b/internal/adapters/opencode/store_query_test.go @@ -0,0 +1,311 @@ +package opencode + +import ( + "database/sql" + "sort" + "strings" + "testing" +) + +// This file pins the delta-query layer: paged delta SELECTs (watermark advance, +// the LIMIT-1000 boundary, the time_updated tie-break) and the affected-session +// derivation across all four tables. time_updated is a required column, so there +// is no id-only delta fallback (SOW-0005 P3.1). + +// scanMessagesFrom pages the message table from a watermark using scanTableDelta +// and returns the changed messageRows + the advanced watermark + row count. 
+func scanMessagesFrom(t *testing.T, db *sql.DB, schema schemaSet, from TableWatermark) ([]messageRow, tableDelta) { + t.Helper() + s := schema["message"] + idx := newColumnIndex(s) + n := len(s.Present) + var got []messageRow + scan, row := scanMessageRow(idx, n, nil) + delta, err := scanTableDelta(ctxBG(), db, s, from, func(rows *sql.Rows) (rowKey, error) { + k, err := scan(rows) + if err != nil { + return k, err + } + got = append(got, *row) + return k, nil + }, &warnSink{}, nil) + if err != nil { + t.Fatalf("scanTableDelta(message): %v", err) + } + return got, delta +} + +// TestDeltaQuery_WatermarkFilters asserts rows past the watermark are returned +// and rows at/under it are not, and that the advanced watermark equals the max +// (time_updated, id) seen. +func TestDeltaQuery_WatermarkFilters(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_1", "ses_a", 100, 100, 10, 5) + insertAssistantMessage(t, rw, "msg_2", "ses_a", 200, 200, 10, 5) + insertAssistantMessage(t, rw, "msg_3", "ses_a", 300, 300, 10, 5) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + + // From zero: all three. + got, delta := scanMessagesFrom(t, db, schema, TableWatermark{}) + if len(got) != 3 { + t.Fatalf("from zero: got %d messages, want 3", len(got)) + } + if delta.watermark.MaxTimeUpdatedMs != 300 || delta.watermark.MaxTimeUpdatedID != "msg_3" { + t.Errorf("advanced paging position = %+v, want {300, msg_3}", delta.watermark) + } + // The monotonic high-water also reaches the greatest id paged (P1-A). + if delta.watermark.MaxIDSeen != "msg_3" { + t.Errorf("advanced MaxIDSeen = %q, want msg_3", delta.watermark.MaxIDSeen) + } + if delta.rowCount != 3 { + t.Errorf("rowCount = %d, want 3", delta.rowCount) + } + + // From msg_2's watermark: only msg_3. 
+ got2, _ := scanMessagesFrom(t, db, schema, TableWatermark{MaxTimeUpdatedMs: 200, MaxTimeUpdatedID: "msg_2"}) + if len(got2) != 1 || got2[0].ID != "msg_3" { + t.Fatalf("from {200,msg_2}: got %+v, want [msg_3]", ids(got2)) + } +} + +// TestDeltaQuery_TieBreak asserts the (time_updated = :u AND id > :id) tiebreak: +// two rows share a time_updated; from {tu, firstID} only the higher id returns. +func TestDeltaQuery_TieBreak(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 100, 100, 0) + // Same time_updated=500, different ids: the Drizzle single-transaction case. + insertAssistantMessage(t, rw, "msg_a", "ses_a", 500, 500, 1, 1) + insertAssistantMessage(t, rw, "msg_b", "ses_a", 500, 500, 1, 1) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // From {500, msg_a}: the tiebreak must return only msg_b (id > msg_a at the + // same time_updated), NOT re-return msg_a. + got, _ := scanMessagesFrom(t, db, schema, TableWatermark{MaxTimeUpdatedMs: 500, MaxTimeUpdatedID: "msg_a"}) + if len(got) != 1 || got[0].ID != "msg_b" { + t.Fatalf("tiebreak from {500,msg_a}: got %v, want [msg_b]", ids(got)) + } + + // From {499, ""} (a position just before the shared time_updated): both rows return. + gotBoth, _ := scanMessagesFrom(t, db, schema, TableWatermark{MaxTimeUpdatedMs: 499, MaxTimeUpdatedID: ""}) + if len(gotBoth) != 2 { + t.Fatalf("from {499,\"\"}: got %d, want 2", len(gotBoth)) + } +} + +// TestDeltaQuery_PagesBeyond1000 inserts more than the LIMIT-1000 page size and +// asserts every row is returned across pages and the watermark equals the max. 
+func TestDeltaQuery_PagesBeyond1000(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + const total = 2500 + tx, err := rw.Begin() + if err != nil { + t.Fatalf("begin bulk: %v", err) + } + stmt, err := tx.Prepare(`INSERT INTO message (id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?)`) + if err != nil { + t.Fatalf("prepare bulk: %v", err) + } + for i := 1; i <= total; i++ { + // Monotonic time_updated AND lexicographic id so order is unambiguous. + if _, err := stmt.Exec(fmtID("msg", i), "ses_a", int64(i), int64(i), `{"role":"assistant"}`); err != nil { + t.Fatalf("bulk insert %d: %v", i, err) + } + } + _ = stmt.Close() + if err := tx.Commit(); err != nil { + t.Fatalf("commit bulk: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + got, delta := scanMessagesFrom(t, db, schema, TableWatermark{}) + if len(got) != total { + t.Fatalf("paged %d messages across pages, want %d", len(got), total) + } + if delta.rowCount != total { + t.Errorf("rowCount = %d, want %d", delta.rowCount, total) + } + if delta.watermark.MaxTimeUpdatedID != fmtID("msg", total) || delta.watermark.MaxTimeUpdatedMs != int64(total) || delta.watermark.MaxIDSeen != fmtID("msg", total) { + t.Errorf("final watermark = %+v, want paging {%d, %s} + MaxIDSeen %s", delta.watermark, total, fmtID("msg", total), fmtID("msg", total)) + } + // Rows must be globally ordered by (time_updated, id) across page seams. + for i := 1; i < len(got); i++ { + if got[i-1].TimeUpdatedMs > got[i].TimeUpdatedMs { + t.Fatalf("rows not ordered across pages at %d: %d > %d", i, got[i-1].TimeUpdatedMs, got[i].TimeUpdatedMs) + } + } +} + +// NOTE: the old TestDeltaQuery_OldSchemaIDFallback was removed with the +// buildSelectByID fallback it exercised (SOW-0005 P3.1). 
time_updated is a +// required column on every tracked table (introspectAll fails fast without it), +// so the id-only delta path was unreachable in production and its +// introspectAll-bypassing isolation test pinned dead code. + +// TestAffectedSessions_AllTables asserts the affected-session derivation across +// every table: a changed session, message, part (denormalized session_id), and +// session_message each resolve to the right session id, and a session touched by +// multiple tables dedupes to one. +func TestAffectedSessions_AllTables(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + insertSession(t, rw, "ses_b", "", 1, 1, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", 2, 2, 1, 1) + insertPart(t, rw, "prt_a", "msg_a", "ses_a", 3, 3, textBody("x")) // touches ses_a again (dedupe) + insertPart(t, rw, "prt_b", "msg_b", "ses_b", 4, 4, textBody("y")) + _, err := rw.Exec(`INSERT INTO session_message (id, session_id, type, time_created, time_updated, data) + VALUES ('evt_b','ses_b','model-switched',5,5,'{}')`) + if err != nil { + t.Fatalf("insert session_message: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + next, advanced, err := collectDeltasOnly(t, db, schema, newCursor()) + if err != nil { + t.Fatalf("collectDeltasOnly: %v", err) + } + if !advanced { + t.Fatal("expected advanced=true with new rows") + } + want := map[string]bool{"ses_a": true, "ses_b": true} + got := map[string]bool{} + for _, id := range next { + got[id] = true + } + for id := range want { + if !got[id] { + t.Errorf("affected set missing %q (got %v)", id, sortedIDsTest(next)) + } + } + if len(next) != 2 { + t.Errorf("affected set = %v, want exactly 2 (deduped)", sortedIDsTest(next)) + } +} + +// collectDeltasOnly pages every tracked table forward from cur via the per-table +// deltaRowHandler, accumulating the 
combined affected-session set (first-seen +// order) and whether any watermark advanced — the affected-derivation half of the +// change pipeline, isolated for the affected-set test. It mirrors what +// batchProcessor.pageBatch does per table but across ALL tables into one set +// (the test only inspects the derived session set, not emission/checkpointing). +func collectDeltasOnly(t *testing.T, db *sql.DB, schema schemaSet, cur Cursor) ([]string, bool, error) { + t.Helper() + affected := newAffectedSet() + advanced := false + for _, table := range trackedTables { + s := schema[table] + from := cur.Tables[table] + sink := &warnSink{} + onRow := deltaRowHandler(table, s, affected, sink.collect) + delta, err := scanTableDelta(ctxBG(), db, s, from, onRow, sink, func(error) {}) + if err != nil { + return affected.ids(), advanced, err + } + if delta.rowCount > 0 && watermarkAdvanced(from, delta.watermark) { + advanced = true + } + } + return affected.ids(), advanced, nil +} + +// ids extracts message ids for assertion messages. +func ids(msgs []messageRow) []string { + out := make([]string, len(msgs)) + for i, m := range msgs { + out[i] = m.ID + } + return out +} + +// partIDs extracts part ids for assertion messages. +func partIDs(parts []partRow) []string { + out := make([]string, len(parts)) + for i, p := range parts { + out[i] = p.ID + } + return out +} + +// sortedIDsTest returns a sorted copy of session ids for stable assertion +// messages. +func sortedIDsTest(in []string) []string { + out := make([]string, len(in)) + copy(out, in) + sort.Strings(out) + return out +} + +// TestSessionMessage_UnknownTypeWarns is the P2.7 / spec Edge #1 proof: scanning a +// session_message delta emits exactly one structured WARN for an UNRECOGNIZED +// type and NONE for a known type, while BOTH rows still drive the affected-session +// set (the warn never blocks the cycle). 
The known type ("model-switched") and the +// unknown ("planned-future-thing") are scanned in one delta pass via the +// session_message deltaRowHandler with a warn-capturing onError. +func TestSessionMessage_UnknownTypeWarns(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // Two session_message rows: a KNOWN type (no warn) and an UNKNOWN one (warn). + if _, err := rw.Exec(`INSERT INTO session_message (id, session_id, type, time_created, time_updated, data) + VALUES ('evt_known','ses_a','model-switched',1,1,'{}')`); err != nil { + t.Fatalf("insert known: %v", err) + } + if _, err := rw.Exec(`INSERT INTO session_message (id, session_id, type, time_created, time_updated, data) + VALUES ('evt_unknown','ses_b','planned-future-thing',2,2,'{}')`); err != nil { + t.Fatalf("insert unknown: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + var ce collectErrs + affected := newAffectedSet() + s := schema["session_message"] + sink := &warnSink{} + onRow := deltaRowHandler("session_message", s, affected, sink.collect) + if _, err := scanTableDelta(ctxBG(), db, s, TableWatermark{}, onRow, sink, ce.onError); err != nil { + t.Fatalf("scanTableDelta(session_message): %v", err) + } + + // Exactly one WARN, naming the unknown type; none for the known type. + if ce.count() != 1 { + t.Fatalf("session_message scan produced %d warnings, want exactly 1 (only the unknown type)", ce.count()) + } + ce.mu.Lock() + msg := ce.errs[0].Error() + ce.mu.Unlock() + if !strings.Contains(msg, "planned-future-thing") || !strings.Contains(msg, "unknown session_message type") { + t.Errorf("warn = %q, want one naming the unknown type", msg) + } + // Both rows still drove the affected set (the warn does not skip the session). 
+ got := map[string]bool{} + for _, id := range affected.ids() { + got[id] = true + } + if !got["ses_a"] || !got["ses_b"] { + t.Errorf("affected set = %v, want both ses_a and ses_b", affected.ids()) + } +} diff --git a/internal/adapters/opencode/store_root.go b/internal/adapters/opencode/store_root.go new file mode 100644 index 0000000..346b64b --- /dev/null +++ b/internal/adapters/opencode/store_root.go @@ -0,0 +1,64 @@ +package opencode + +import ( + "context" + "database/sql" + "errors" + "fmt" +) + +// This file holds resolveRootID — the parent_id chain walk that resolves a +// session's TRUE tree root (SOW-0005 P2.4). Split out of store_load.go to keep +// each file ≤400 lines. + +// rootChainCap bounds the parent_id chain walk in resolveRootID: a defensive +// depth cap so a cyclic or pathological parent chain can never loop forever. The +// deepest observed opencode sub-agent nesting is a few levels; 32 is far beyond +// any real tree (adapter-opencode.md §"Read Strategy" → nested root). +const rootChainCap = 32 + +// resolveRootID walks the session's parent_id chain to the TOPMOST ancestor (the +// true root of the whole session tree), so a nested sub-agent's RootNativeID is +// the tree root rather than its direct parent (SOW-0005 P2.4). For a root session +// (no parent) it returns the session's own id. The walk is read-only, depth-capped +// (rootChainCap) with a seen-set cycle guard. If the chain cannot be fully +// resolved — a missing ancestor row, a cycle, or the cap is hit — it FALLS BACK to +// the last known ancestor (the deepest id it did resolve, i.e. the direct parent +// for a one-step failure) and surfaces one WARN via onError, never blocking the +// session. Only the id+parent_id columns are read (the cheapest possible probe). +// q is any roQuerier: loadAndMapSession passes the SAME read-only transaction that +// read the session row + tree, so root resolution shares that one consistent +// snapshot (SOW-0005 round-3 P1-2). 
Test callers may pass the pool. +func resolveRootID(ctx context.Context, q roQuerier, id, parentID string, onError func(error)) string { + if parentID == "" { + return id // already the root + } + seen := map[string]struct{}{id: {}} + cur := parentID + for depth := 0; depth < rootChainCap; depth++ { + if _, dup := seen[cur]; dup { + onError(fmt.Errorf("opencode: parent_id cycle resolving root for session %s (stopping at %s)", id, cur)) + return cur + } + seen[cur] = struct{}{} + + var parent sql.NullString + err := q.QueryRowContext(ctx, `SELECT parent_id FROM session WHERE id = ?`, cur).Scan(&parent) + if errors.Is(err, sql.ErrNoRows) { + // Ancestor row not present (yet) — cur is the furthest resolvable + // ancestor. Fall back to it (the direct parent on a one-step failure). + onError(fmt.Errorf("opencode: parent session %s of %s not found; using it as root", cur, id)) + return cur + } + if err != nil { + onError(fmt.Errorf("opencode: resolve root for session %s at %s: %w", id, cur, err)) + return cur + } + if !parent.Valid || parent.String == "" { + return cur // cur is the root (no further parent) + } + cur = parent.String + } + onError(fmt.Errorf("opencode: parent_id chain for session %s exceeded depth %d; using %s as root", id, rootChainCap, cur)) + return cur +} diff --git a/internal/adapters/opencode/store_scan.go b/internal/adapters/opencode/store_scan.go new file mode 100644 index 0000000..eac1050 --- /dev/null +++ b/internal/adapters/opencode/store_scan.go @@ -0,0 +1,201 @@ +package opencode + +import ( + "database/sql" + "fmt" +) + +// This file holds the per-table DELTA-ROW SCANNERS (driven by scanTableDelta in +// store_query.go AND the boundary-bucket re-scan in tailer_boundary.go): one +// closure per tracked table that scans a delta row into its typed struct via the +// present-column index and reports the row's (id, time_updated) watermark key. 
+// Split out of store_load.go to keep each file ≤400 lines (SOW-0005 round-2; the +// P2-B single-query part loader grew store_load.go). Every read uses the schema's +// PRESENT columns only (never SELECT *), so an old schema missing an optional +// column degrades to a zero value rather than failing. +// +// Each scanner takes an onWarn callback (SOW-0005 round-4 P2-1): the optional +// numeric cells (cost/tokens/time_created/time_archived/time_compacting) surface a +// corrupt value as a WARN and degrade to 0 (parity with the non-delta loadSession +// path); the REQUIRED cursor-watermark columns (id, time_updated) instead return an +// ERROR via i64Required/strRequired so a corrupt cell can never advance the cursor +// to a poisoned watermark (the error aborts the page; the cursor stays at the last +// good position). + +// scanSessionRow reads one session delta row into a sessionRow via the present +// columns and reports its watermark key. Missing optional columns (old schema) +// stay zero (with a WARN on a corrupt non-NULL cell); a corrupt REQUIRED id/ +// time_updated returns an error rather than a poisoned-0 watermark (round-4 P2-1). 
+func scanSessionRow(idx columnIndex, n int, onWarn func(error)) (func(*sql.Rows) (rowKey, error), *sessionRow) { + var out sessionRow + fn := func(rows *sql.Rows) (rowKey, error) { + d := newScanDest(n).withWarn("session", onWarn) + if err := rows.Scan(d.ptrs...); err != nil { + return rowKey{}, fmt.Errorf("opencode: scan session row: %w", err) + } + id, tuid, err := requiredWatermark(d, idx) + if err != nil { + return rowKey{}, err + } + out = sessionRow{ + ID: id, + ProjectID: d.str(idx, "project_id"), + ParentID: d.str(idx, "parent_id"), + Slug: d.str(idx, "slug"), + Directory: d.str(idx, "directory"), + Title: d.str(idx, "title"), + Version: d.str(idx, "version"), + Agent: d.str(idx, "agent"), + Model: d.bytes(idx, "model"), + Cost: d.f64(idx, "cost"), + TokensInput: d.i64(idx, "tokens_input"), + TokensOutput: d.i64(idx, "tokens_output"), + TokensReason: d.i64(idx, "tokens_reasoning"), + TokensCacheRd: d.i64(idx, "tokens_cache_read"), + TokensCacheWr: d.i64(idx, "tokens_cache_write"), + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: tuid, + TimeArchivedMs: d.i64(idx, "time_archived"), + TimeCompactingMs: d.i64(idx, "time_compacting"), + } + return rowKey{id: out.ID, timeUpdatedMs: out.TimeUpdatedMs}, nil + } + return fn, &out +} + +// scanMessageRow reads one message delta row into a messageRow. session_id is a +// REQUIRED owning-id column (round-5 P2-2): an empty/corrupt value ERRORS the page +// rather than deriving an empty affected session (which affectedSet.add silently +// drops while the row "succeeds", advancing the cursor past an un-emitted change). 
+func scanMessageRow(idx columnIndex, n int, onWarn func(error)) (func(*sql.Rows) (rowKey, error), *messageRow) { + var out messageRow + fn := func(rows *sql.Rows) (rowKey, error) { + d := newScanDest(n).withWarn("message", onWarn) + if err := rows.Scan(d.ptrs...); err != nil { + return rowKey{}, fmt.Errorf("opencode: scan message row: %w", err) + } + id, tuid, err := requiredWatermark(d, idx) + if err != nil { + return rowKey{}, err + } + sid, err := requiredOwner(d, idx, "session_id") + if err != nil { + return rowKey{}, err + } + out = messageRow{ + ID: id, + SessionID: sid, + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: tuid, + Data: d.bytes(idx, "data"), + } + return rowKey{id: out.ID, timeUpdatedMs: out.TimeUpdatedMs}, nil + } + return fn, &out +} + +// scanPartRow reads one part delta row into a partRow. message_id AND session_id +// are REQUIRED owning-id columns (round-5 P2-2): a part's affected session is +// derived from its denormalized session_id (resolvePartSession), so an empty/ +// corrupt session_id would silently drop the change (affectedSet.add("")) while +// the row "succeeds" → cursor gap. message_id is the other owning id (the +// old-schema fallback resolver and msgSession key), so it is required too. Either +// being empty/corrupt ERRORS the page so the cursor never advances past the row. 
+func scanPartRow(idx columnIndex, n int, onWarn func(error)) (func(*sql.Rows) (rowKey, error), *partRow) { + var out partRow + fn := func(rows *sql.Rows) (rowKey, error) { + d := newScanDest(n).withWarn("part", onWarn) + if err := rows.Scan(d.ptrs...); err != nil { + return rowKey{}, fmt.Errorf("opencode: scan part row: %w", err) + } + id, tuid, err := requiredWatermark(d, idx) + if err != nil { + return rowKey{}, err + } + mid, err := requiredOwner(d, idx, "message_id") + if err != nil { + return rowKey{}, err + } + sid, err := requiredOwner(d, idx, "session_id") + if err != nil { + return rowKey{}, err + } + out = partRow{ + ID: id, + MessageID: mid, + SessionID: sid, + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: tuid, + Data: d.bytes(idx, "data"), + } + return rowKey{id: out.ID, timeUpdatedMs: out.TimeUpdatedMs}, nil + } + return fn, &out +} + +// scanSessionMessageRow reads one session_message delta row into a +// sessionMessageRow. session_id is a REQUIRED owning-id column (round-5 P2-2): an +// empty/corrupt value ERRORS the page rather than deriving an empty affected +// session. `type` is NOT an owning id — it stays an optional d.str read; a missing +// type keeps its existing unknown-type WARN behaviour (deltaRowHandler), never a +// fatal error (only the owning IDs are fatal-on-corrupt). 
+func scanSessionMessageRow(idx columnIndex, n int, onWarn func(error)) (func(*sql.Rows) (rowKey, error), *sessionMessageRow) { + var out sessionMessageRow + fn := func(rows *sql.Rows) (rowKey, error) { + d := newScanDest(n).withWarn("session_message", onWarn) + if err := rows.Scan(d.ptrs...); err != nil { + return rowKey{}, fmt.Errorf("opencode: scan session_message row: %w", err) + } + id, tuid, err := requiredWatermark(d, idx) + if err != nil { + return rowKey{}, err + } + sid, err := requiredOwner(d, idx, "session_id") + if err != nil { + return rowKey{}, err + } + out = sessionMessageRow{ + ID: id, + SessionID: sid, + Type: d.str(idx, "type"), + TimeCreatedMs: d.i64(idx, "time_created"), + TimeUpdatedMs: tuid, + Data: d.bytes(idx, "data"), + } + return rowKey{id: out.ID, timeUpdatedMs: out.TimeUpdatedMs}, nil + } + return fn, &out +} + +// requiredWatermark reads the two REQUIRED cursor-watermark columns (id, +// time_updated) for a delta row, returning an error rather than a coerced-0 value +// when either is absent/NULL/corrupt (SOW-0005 round-4 P2-1). Erroring aborts the +// page so a poisoned watermark is never persisted; the cursor stays at the last +// good position and the transient error is surfaced (non-fatal) via the poll loop. +func requiredWatermark(d *scanDest, idx columnIndex) (id string, timeUpdatedMs int64, err error) { + id, err = d.strRequired(idx, "id") + if err != nil { + return "", 0, err + } + timeUpdatedMs, err = d.i64Required(idx, "time_updated") + if err != nil { + return "", 0, err + } + return id, timeUpdatedMs, nil +} + +// requiredOwner reads a REQUIRED OWNING-ID column (message.session_id, +// part.message_id, part.session_id, session_message.session_id) for a delta row, +// returning an error rather than the empty string when the cell is absent/NULL/ +// empty (SOW-0005 round-5 P2-2). 
The owning id derives the AFFECTED session that +// the tailer reloads; an empty value would be silently swallowed by +// affectedSet.add("") while the row handler SUCCEEDED, so the cursor would advance +// PAST a change that emitted no content — a permanent, health-invisible gap. +// Erroring aborts the page (via the scanner closure → scanOnePage), so the cursor +// stays at the last good watermark and the transient error is surfaced (non-fatal) +// via the poll loop. The column is in requiredColumns, so it is always PRESENT +// (introspectAll makes its absence fatal upstream); the only failure mode reaching +// here is a corrupt/empty cell value. strRequired carries the table/column context. +func requiredOwner(d *scanDest, idx columnIndex, col string) (string, error) { + return d.strRequired(idx, col) +} diff --git a/internal/adapters/opencode/store_scandest.go b/internal/adapters/opencode/store_scandest.go new file mode 100644 index 0000000..695ed9d --- /dev/null +++ b/internal/adapters/opencode/store_scandest.go @@ -0,0 +1,176 @@ +package opencode + +import ( + "database/sql" + "fmt" +) + +// This file holds the DYNAMIC SCAN-DESTINATION DECODER (columnIndex + scanDest): +// the single dynamic Scan path every tracked-table row scan uses, regardless of +// which OPTIONAL columns the live schema omits. Split out of store_load.go to keep +// each file ≤400 lines (SOW-0005 round-7 P2-3 added the ownerOrWarn accessor). The +// tree-load scanners (store_load.go) and the delta-row scanners (store_scan.go) +// both consume these helpers; the SELECT builders / numeric parse helpers live in +// store_load_sql.go. + +// columnIndex maps a present-column name to its position in a dynamic SELECT, so +// a per-table scan reads each typed field from the right scan destination +// regardless of which optional columns the live schema omits. Built once per +// table from tableSchema.Present. 
+type columnIndex map[string]int + +// newColumnIndex builds the present-column → position map for a table schema. +func newColumnIndex(s tableSchema) columnIndex { + idx := make(columnIndex, len(s.Present)) + for i, c := range s.Present { + idx[c] = i + } + return idx +} + +// scanDest allocates one sql.NullString-backed destination slice +// sized to the present columns, plus a typed accessor method set. The scan +// reads every column as a nullable holder, then the per-row decoder copies the +// present ones into the typed struct via the columnIndex. This keeps a single +// dynamic Scan path for every table shape. +// +// onWarn (optional) surfaces a CORRUPT numeric cell — a non-NULL value that is +// not parseable as the column's numeric type — with table/column context, so a +// corrupt time/cost/token cell degrades to 0 WITHOUT being silently swallowed +// (SOW-0005 P2.6). The tree-load scanners (loadSession/loadMessages/loadParts — +// the content path) set it, and the delta-row scanners (store_scan.go) attach it +// via withWarn as well (round-4 P2-1). A nil onWarn makes warnCorrupt a no-op. +type scanDest struct { + holders []sql.NullString + ptrs []any + table string + onWarn func(error) +} + +// newScanDest sizes the holders/pointers to n columns. +func newScanDest(n int) *scanDest { + d := &scanDest{holders: make([]sql.NullString, n), ptrs: make([]any, n)} + for i := range d.holders { + d.ptrs[i] = &d.holders[i] + } + return d +} + +// withWarn attaches a table label + onWarn callback so i64/f64 can surface a +// corrupt numeric cell. Returns the receiver for chaining at the scan site. +func (d *scanDest) withWarn(table string, onWarn func(error)) *scanDest { + d.table = table + d.onWarn = onWarn + return d +} + +// str returns the present column's string value, or "" when the column is absent +// from the live schema (old-schema drift) or SQL NULL. 
+func (d *scanDest) str(idx columnIndex, col string) string { + if i, ok := idx[col]; ok && d.holders[i].Valid { + return d.holders[i].String + } + return "" +} + +// i64 returns the present column's int64 value, or 0 when absent/NULL. opencode +// integer columns scan cleanly through a NullString (sqlite returns the decimal +// text); strconv keeps the path uniform with the string columns. A non-NULL +// value that fails to parse is CORRUPT — it degrades to 0 and is surfaced via the +// attached onWarn (SOW-0005 P2.6) rather than silently swallowed. +func (d *scanDest) i64(idx columnIndex, col string) int64 { + if i, ok := idx[col]; ok && d.holders[i].Valid { + v, ok := parseInt64Checked(d.holders[i].String) + if !ok { + d.warnCorrupt(col, d.holders[i].String) + } + return v + } + return 0 +} + +// f64 returns the present column's float64 value, or 0 when absent/NULL. A +// non-NULL value that fails to parse is surfaced via onWarn (SOW-0005 P2.6). +func (d *scanDest) f64(idx columnIndex, col string) float64 { + if i, ok := idx[col]; ok && d.holders[i].Valid { + v, ok := parseFloat64Checked(d.holders[i].String) + if !ok { + d.warnCorrupt(col, d.holders[i].String) + } + return v + } + return 0 +} + +// warnCorrupt surfaces a corrupt numeric cell via onWarn when one is attached. +// The raw value is intentionally NOT logged (it could be sensitive); only the +// table/column and a fixed message are reported. +func (d *scanDest) warnCorrupt(col, _ string) { + if d.onWarn != nil { + d.onWarn(fmt.Errorf("opencode: corrupt numeric cell (table=%s column=%s); using 0", d.table, col)) + } +} + +// i64Required reads a REQUIRED int64 column (a cursor-watermark column — +// time_updated) and returns an ERROR rather than coercing to 0 when the cell is +// NULL/absent or present-but-unparseable (SOW-0005 round-4 P2-1). The delta +// scanners feed this into the watermark key, so a corrupt value coerced to 0 would +// persist a POISONED cursor (the watermark could regress to 0). 
Erroring the row +// instead aborts the page so the cursor stays at the last good watermark — a +// corrupt required cell never advances the durable resume state. The raw value is +// NOT included in the error (it could be sensitive); only the table/column. +func (d *scanDest) i64Required(idx columnIndex, col string) (int64, error) { + i, ok := idx[col] + if !ok || !d.holders[i].Valid { + return 0, fmt.Errorf("opencode: required column %q absent/NULL (table=%s); refusing to advance cursor on a missing watermark", col, d.table) + } + v, parsed := parseInt64Checked(d.holders[i].String) + if !parsed { + return 0, fmt.Errorf("opencode: corrupt required numeric cell (table=%s column=%s); refusing to advance cursor on a poisoned watermark", d.table, col) + } + return v, nil +} + +// strRequired reads a REQUIRED string column (the cursor-watermark id) and returns +// an ERROR when the cell is NULL/absent or empty (SOW-0005 round-4 P2-1). An empty +// id cannot form a valid watermark tie-break, so it must not advance the cursor. +func (d *scanDest) strRequired(idx columnIndex, col string) (string, error) { + i, ok := idx[col] + if !ok || !d.holders[i].Valid || d.holders[i].String == "" { + return "", fmt.Errorf("opencode: required column %q absent/NULL/empty (table=%s); refusing to advance cursor on a missing watermark id", col, d.table) + } + return d.holders[i].String, nil +} + +// ownerOrWarn reads a REQUIRED OWNERSHIP/id column on the FULL-TREE load path +// (message.id / message.session_id / part.message_id / part.session_id) and +// returns (value, true) when present and non-empty. 
When the cell is absent/NULL/ +// empty it surfaces ONE structured WARN via the attached onWarn (the same table/ +// column-context, no-raw-value discipline as warnCorrupt — buffered into the +// warnSink and flushed AFTER the read tx closes, P2-1) and returns ("", false) so +// the caller SKIPS the row rather than attaching it to the out[""] partition where +// it would be silently dropped (SOW-0005 round-7 P2-3). It mirrors the delta-path +// requiredOwner semantics (store_scan.go), but WARN-and-skip rather than +// error-and-abort: the full-tree reload is a content re-emit, not a cursor advance, +// so one corrupt historical row is surfaced and dropped — not fatal to the whole +// session reload (which would strand every other part). onWarn may be nil on the +// pure no-DB test entrypoints, in which case the row is skipped silently. +func (d *scanDest) ownerOrWarn(idx columnIndex, col string) (string, bool) { + i, ok := idx[col] + if !ok || !d.holders[i].Valid || d.holders[i].String == "" { + if d.onWarn != nil { + d.onWarn(fmt.Errorf("opencode: required ownership column %q absent/NULL/empty (table=%s); skipping row (would otherwise be dropped under an empty owner key)", col, d.table)) + } + return "", false + } + return d.holders[i].String, true +} + +// bytes returns the present column's raw value as bytes, or nil when absent/NULL +// (used for the JSON data/model columns the mapper decodes). 
+func (d *scanDest) bytes(idx columnIndex, col string) []byte { + if i, ok := idx[col]; ok && d.holders[i].Valid { + return []byte(d.holders[i].String) + } + return nil +} diff --git a/internal/adapters/opencode/store_testhelpers_test.go b/internal/adapters/opencode/store_testhelpers_test.go new file mode 100644 index 0000000..635487c --- /dev/null +++ b/internal/adapters/opencode/store_testhelpers_test.go @@ -0,0 +1,375 @@ +package opencode + +import ( + "context" + "database/sql" + "database/sql/driver" + "encoding/json" + "fmt" + "io" + "log/slog" + "path/filepath" + "strings" + "sync" + "sync/atomic" + "testing" + + _ "modernc.org/sqlite" +) + +// This file holds the synthetic-DB builders and the query-counting driver the +// store_query / store_load / tailer tests share. Every DB is built throwaway in +// t.TempDir() via a SEPARATE read-write connection (production NEVER opens +// opencode.db read-write); the adapter under test reopens the path via the +// read-only openReadOnly helper. Content is synthetic, schema-shaped, never the +// operator's data (SOW-0005 R5; adapter-opencode.md §"Sensitive content"). + +// ocSchemaStmts is the CREATE-TABLE set for a CURRENT-schema synthetic opencode +// DB (the four tracked tables with every wanted column). Mirrors the live shape +// verified in adapter-opencode.md; used by the delta/tailer tests that need real +// JSON bodies the mapper can project to events. 
+var ocSchemaStmts = []string{ + `CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER)`, + `CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, + `CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL)`, +} + +// rwDSNFor builds a writable file: DSN for a path (test-only; production never +// opens opencode.db this way). +func rwDSNFor(path string) string { + return "file:" + escapeURIPath(filepath.ToSlash(path)) + "?_pragma=busy_timeout(5000)" +} + +// newEmptyDB creates an empty current-schema opencode DB at dir/name and returns +// its path plus an open read-write *sql.DB the caller uses to insert rows. The +// caller MUST close the rw handle before opening the path read-only so the WAL +// is flushed. +func newEmptyDB(t *testing.T, dir, name string, extra ...string) (string, *sql.DB) { + t.Helper() + path := filepath.Join(dir, name) + rw, err := sql.Open(driverName, rwDSNFor(path)) + if err != nil { + t.Fatalf("open rw: %v", err) + } + for _, s := range append(append([]string{}, ocSchemaStmts...), extra...) 
{ + if _, err := rw.Exec(s); err != nil { + _ = rw.Close() + t.Fatalf("create schema: %v\nstmt: %s", err, s) + } + } + return path, rw +} + +// insertSession inserts a session row with the given id/parent/times. +func insertSession(t *testing.T, rw *sql.DB, id, parent string, createdMs, updatedMs, archivedMs int64) { + t.Helper() + model, _ := json.Marshal(map[string]any{"id": "the-model", "providerID": "the-alias"}) + var arch any + if archivedMs > 0 { + arch = archivedMs + } + _, err := rw.Exec( + `INSERT INTO session (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) + VALUES (?,?,?,?,?,?,?,?,?,?,?,?)`, + id, "prj_1", parent, "slug", "/work/dir", "Title", "9.9.9", "test-agent", string(model), createdMs, updatedMs, arch) + if err != nil { + t.Fatalf("insert session %s: %v", id, err) + } +} + +// insertAssistantMessage inserts an assistant message with a JSON body carrying +// tokens/cost/finish (the mapper reads it). The body is schema-shaped synthetic. +func insertAssistantMessage(t *testing.T, rw *sql.DB, id, sessionID string, createdMs, updatedMs int64, inTok, outTok int64) { + t.Helper() + body, _ := json.Marshal(map[string]any{ + "role": "assistant", + "providerID": "the-alias", + "modelID": "the-model", + "agent": "test-agent", + "cost": 0.01, + "tokens": map[string]any{"input": inTok, "output": outTok, "cache": map[string]any{"read": 0, "write": 0}}, + "time": map[string]any{"created": createdMs, "completed": updatedMs}, + "finish": "stop", + }) + insertMessageRaw(t, rw, id, sessionID, createdMs, updatedMs, string(body)) +} + +// insertMessageRaw inserts a message row with a verbatim data body. 
+func insertMessageRaw(t *testing.T, rw *sql.DB, id, sessionID string, createdMs, updatedMs int64, body string) { + t.Helper() + _, err := rw.Exec( + `INSERT INTO message (id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?)`, + id, sessionID, createdMs, updatedMs, body) + if err != nil { + t.Fatalf("insert message %s: %v", id, err) + } +} + +// insertPart inserts a part row with a verbatim data body and the given times. +// sessionID may be "" for the old-schema-without-session_id fixtures (the column +// is NOT NULL on the current schema, so callers pass a value there). +func insertPart(t *testing.T, rw *sql.DB, id, messageID, sessionID string, createdMs, updatedMs int64, body string) { + t.Helper() + _, err := rw.Exec( + `INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?,?)`, + id, messageID, sessionID, createdMs, updatedMs, body) + if err != nil { + t.Fatalf("insert part %s: %v", id, err) + } +} + +// stepStartBody / stepFinishBody / textBody build common part JSON bodies. +func stepStartBody() string { + b, _ := json.Marshal(map[string]any{"type": "step-start"}) + return string(b) +} + +func stepFinishBody(inCum, outCum int64, cost float64) string { + b, _ := json.Marshal(map[string]any{ + "type": "step-finish", "reason": "stop", "cost": cost, + "tokens": map[string]any{"input": inCum, "output": outCum, "cache": map[string]any{"read": 0, "write": 0}}, + }) + return string(b) +} + +func textBody(text string) string { + b, _ := json.Marshal(map[string]any{"type": "text", "text": text}) + return string(b) +} + +// openRWAgain reopens a built DB path read-write so a test can insert MORE rows +// after the read-only adapter loop is already running (simulating opencode's live +// writer). The handle uses WAL so the read-only reader sees committed rows. This +// is the ONLY writable handle pattern the tests use; production never opens +// opencode.db read-write. 
+func openRWAgain(t *testing.T, path string) (*sql.DB, error) { + t.Helper() + dsn := "file:" + escapeURIPath(filepath.ToSlash(path)) + "?_pragma=busy_timeout(5000)&_pragma=journal_mode(WAL)" + rw, err := sql.Open(driverName, dsn) + if err != nil { + return nil, err + } + rw.SetMaxOpenConns(1) + if err := rw.PingContext(context.Background()); err != nil { + _ = rw.Close() + return nil, err + } + return rw, nil +} + +// openRO reopens a built DB path read-only via the adapter's helper, registering +// cleanup. It is the ONLY way the tests acquire a connection to the DB under +// test (the read-only contract). +func openRO(t *testing.T, path string) *sql.DB { + t.Helper() + db, err := openReadOnly(context.Background(), path) + if err != nil { + t.Fatalf("openReadOnly %s: %v", path, err) + } + t.Cleanup(func() { _ = db.Close() }) + return db +} + +// introspect reopens a built DB read-only and introspects its schema, returning +// both the *sql.DB and the schemaSet (the common preamble of the store tests). +func introspect(t *testing.T, path string) (*sql.DB, schemaSet) { + t.Helper() + db := openRO(t, path) + set, err := introspectAll(context.Background(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + return db, set +} + +// --- query-counting driver ---------------------------------------------------- +// +// countingDriver wraps the registered modernc.org/sqlite driver to record the +// text of every executed query. It proves AC#6's stronger property: the literal +// MAX(time_updated) query is NOT executed across idle polls. It is registered +// once under a test-only name; tests open through that name and inspect the +// recorded SQL. + +// queryLog records executed SQL strings, concurrency-safe (the tail loop's +// connection pool may issue from a background goroutine). 
+type queryLog struct { + mu sync.Mutex + queries []string +} + +func (l *queryLog) record(q string) { + l.mu.Lock() + l.queries = append(l.queries, q) + l.mu.Unlock() +} + +// countContaining returns how many recorded queries contain substr (case-sensitive). +func (l *queryLog) countContaining(substr string) int { + l.mu.Lock() + defer l.mu.Unlock() + n := 0 + for _, q := range l.queries { + if strings.Contains(q, substr) { + n++ + } + } + return n +} + +// reset clears the recorded queries (so a test counts only the SQL issued after +// a priming phase). The driver log is shared across tests via sql.Register's +// once-only registration, so a test that asserts counts MUST reset first and +// must not run in parallel with other counting-driver tests (they would race on +// the shared log). The counting tests use distinct DBs and reset immediately +// before their measured window, and the substrings they match +// (MAX(time_updated) / MAX(id)) are issued only by their own detectChange calls. +func (l *queryLog) reset() { + l.mu.Lock() + l.queries = nil + l.mu.Unlock() +} + +// countingDriver is a driver.Driver wrapping an inner driver.Driver, logging +// every query its connections execute into log. +type countingDriver struct { + inner driver.Driver + log *queryLog +} + +func (d *countingDriver) Open(name string) (driver.Conn, error) { + c, err := d.inner.Open(name) + if err != nil { + return nil, err + } + return &countingConn{Conn: c, log: d.log}, nil +} + +// countingConn wraps a driver.Conn, recording the SQL passed to QueryContext, +// ExecContext and PrepareContext. modernc's conn implements QueryerContext + +// ExecerContext + the *Context preparers; we forward to those when present. 
+type countingConn struct { + driver.Conn + log *queryLog +} + +func (c *countingConn) QueryContext(ctx context.Context, query string, args []driver.NamedValue) (driver.Rows, error) { + c.log.record(query) + if qc, ok := c.Conn.(driver.QueryerContext); ok { + return qc.QueryContext(ctx, query, args) + } + return nil, driver.ErrSkip +} + +func (c *countingConn) ExecContext(ctx context.Context, query string, args []driver.NamedValue) (driver.Result, error) { + c.log.record(query) + if ec, ok := c.Conn.(driver.ExecerContext); ok { + return ec.ExecContext(ctx, query, args) + } + return nil, driver.ErrSkip +} + +func (c *countingConn) PrepareContext(ctx context.Context, query string) (driver.Stmt, error) { + c.log.record(query) + if pc, ok := c.Conn.(driver.ConnPrepareContext); ok { + return pc.PrepareContext(ctx, query) + } + return c.Prepare(query) +} + +func (c *countingConn) BeginTx(ctx context.Context, opts driver.TxOptions) (driver.Tx, error) { + if bt, ok := c.Conn.(driver.ConnBeginTx); ok { + return bt.BeginTx(ctx, opts) + } + return c.Conn.Begin() //nolint:staticcheck // fallback for a driver without ConnBeginTx +} + +var ( + countingDriverOnce sync.Once + countingDriverLog = &queryLog{} + countingDriverReg atomic.Bool +) + +// countingDriverName is the registered name tests open through to capture SQL. +const countingDriverName = "sqlite-counting" + +// registerCountingDriver registers the counting driver once (idempotent across +// tests) and returns the shared query log. It obtains the inner modernc driver +// by opening a throwaway DB through the standard "sqlite" name and reading its +// .Driver(), so it never depends on an unexported modernc type. 
+func registerCountingDriver(t *testing.T) *queryLog { + t.Helper() + countingDriverOnce.Do(func() { + probe, err := sql.Open(driverName, "file::memory:") + if err != nil { + t.Fatalf("probe open for inner driver: %v", err) + } + // Close the probe on every exit path (including the t.Fatalf below, which + // runtime.Goexits without returning) so the throwaway handle never leaks. + defer func() { _ = probe.Close() }() + inner := probe.Driver() + sql.Register(countingDriverName, &countingDriver{inner: inner, log: countingDriverLog}) + countingDriverReg.Store(true) + }) + if !countingDriverReg.Load() { + t.Fatal("counting driver not registered") + } + return countingDriverLog +} + +// openCounting opens path read-only through the counting driver (same DSN the +// adapter builds), so executed SQL is recorded. The returned *sql.DB is +// cleaned up via t.Cleanup. A small pool keeps the recorded SQL deterministic. +func openCounting(t *testing.T, path string) (*sql.DB, *queryLog) { + t.Helper() + log := registerCountingDriver(t) + dsn, err := buildReadOnlyDSN(path) + if err != nil { + t.Fatalf("buildReadOnlyDSN: %v", err) + } + db, err := sql.Open(countingDriverName, dsn) + if err != nil { + t.Fatalf("open counting: %v", err) + } + db.SetMaxOpenConns(1) + if err := db.PingContext(context.Background()); err != nil { + _ = db.Close() + t.Fatalf("ping counting: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + return db, log +} + +// ctxBG is a tiny alias to keep test call sites short. +func ctxBG() context.Context { return context.Background() } + +// silentLogger is a discard slog.Logger for scanLoop/tailLoop call sites where +// the missing-column INF output is NOT under test (the drift assertion uses its +// own record-capturing handler instead). 
+func silentLogger() *slog.Logger { return slog.New(slog.NewTextHandler(io.Discard, nil)) } + +// fmtID zero-pads an integer into a 12-wide lexicographically-sortable suffix so +// synthetic ids sort in creation order like real Sonyflake ids. +func fmtID(prefix string, n int) string { + return fmt.Sprintf("%s_%012d", prefix, n) +} diff --git a/internal/adapters/opencode/tailer.go b/internal/adapters/opencode/tailer.go new file mode 100644 index 0000000..6eb6184 --- /dev/null +++ b/internal/adapters/opencode/tailer.go @@ -0,0 +1,389 @@ +package opencode + +import ( + "context" + "database/sql" + "errors" + "fmt" + "log/slog" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file is the POLL-LOOP TAILER (SOW-0005 chunk C): the historical-backfill +// scan loop and the realtime poll loop, plus the pure MAX(time_updated) gating +// predicate and the WAL-fsnotify wakeup hint. It mirrors codex's free-function +// tailer shape (functions that take explicit params, not methods on an Adapter +// struct — chunk D's adapter.go calls these). The DB is ALWAYS opened via the +// chunk-A openReadOnly helper (defer close); the delta-query layer (store_query +// .go) + tree load (store_load.go) do the SQL, the pure mapper turns rows into +// events. The adapter MUST NOT close `out` (the ingester owns it), and every I/O +// respects ctx cancellation (adapter-opencode.md §"Watch Strategy" → "Poll-loop +// state machine…"; canonical/adapter.go). + +const ( + // idlePollInterval is the steady-state poll cadence when the previous cycle + // produced no change (adapter-opencode.md §"Watch Strategy"; SOW Open + // Decision #2). + idlePollInterval = 2 * time.Second + // activePollInterval is the cadence after a cycle that produced a change. + activePollInterval = 500 * time.Millisecond + // walFloorInterval is the floor cadence for the walFloorWindow after a WAL + // fsnotify event. 
+ walFloorInterval = 250 * time.Millisecond + // walFloorWindow is how long the 250 ms floor stays active after a WAL event. + walFloorWindow = 5 * time.Second + // timeUpdatedSafetyNet is the maximum interval between MAX(time_updated) + // probes — the safety net that catches in-place mutations even with no WAL + // fsnotify signal (adapter-opencode.md §"Performance"; AC#6). + timeUpdatedSafetyNet = 60 * time.Second + // progressEveryRows checkpoints SourceProgress every N rows paged during the + // backfill so a restart resumes mid-scan (adapter-opencode.md §"Performance"). + progressEveryRows = 1000 +) + +// emitProgress publishes a SourceProgressEvent carrying the current cursor, +// sent ctx-aware. Shape copied verbatim from codex/scanner.go emitProgress +// (SourceSeq:0, Ts: time.Now().UnixMicro()). +func emitProgress(ctx context.Context, sourceID string, cur Cursor, out chan<- canonical.Event) error { + if err := ctx.Err(); err != nil { + return err + } + ev := canonical.SourceProgressEvent{ + EventBase: canonical.EventBase{ + SourceID: sourceID, + SourceSeq: 0, + Ts: time.Now().UnixMicro(), + }, + Cursor: cur.String(), + } + select { + case <-ctx.Done(): + return ctx.Err() + case out <- ev: + return nil + } +} + +// emitEvents sends a mapped session's events ctx-aware. Returns ctx.Err() on +// cancellation so the caller stops promptly without deadlocking on the channel. +func emitEvents(ctx context.Context, evs []canonical.Event, out chan<- canonical.Event) error { + for _, ev := range evs { + select { + case <-ctx.Done(): + return ctx.Err() + case out <- ev: + } + } + return nil +} + +// scanLoop is the historical backfill. 
It opens the DB read-only, introspects +// the schema once (recording the hash into the cursor), pages every tracked +// table forward from `since`, derives the affected session ids, loads each +// session's full tree, maps it, and emits the events, checkpointing +// SourceProgress once per ~progressEveryRows-row batch (after that batch's +// sessions are emitted; the last batch's checkpoint carries the final cursor) so a +// restart resumes mid-backfill. A cold start (`since` zero) walks the entire DB. +// Returns the final advanced cursor. +// +// A missing DB file surfaces one structured error via onError and returns +// (since, nil) so the daemon keeps serving other sources (mirrors codex's +// missing-root handling). ctx cancellation returns ctx.Err() promptly. +func scanLoop(ctx context.Context, dbPath, sourceID string, since Cursor, out chan<- canonical.Event, logger *slog.Logger, onError func(error)) (Cursor, error) { + logger = orDefaultLogger(logger) + onError = orNoop(onError) + db, err := openReadOnly(ctx, dbPath, withMaxOpenConns(2)) + if err != nil { + // A missing/unreadable file is non-fatal: report once, keep the daemon + // running for other sources. + onError(fmt.Errorf("opencode: scan open %s (ro): %w", dbPath, err)) + return since, nil + } + defer func() { _ = db.Close() }() + + schema, err := introspectAll(ctx, db) + if err != nil { + // An incompatible schema (a required column missing) is fatal for this + // source — surface it so /api/health shows the failure rather than + // silently emitting nothing. + return since, fmt.Errorf("opencode: scan introspect %s: %w", dbPath, err) + } + // Surface optional column drift once per (table, column). Scan and Tail each + // log this set once on every (re)start; that per-phase duplication is + // acceptable for this rare old-schema path (introspection runs twice — once + // per phase — by design, see adapter.go Scan→Tail hand-off).
+ logMissingColumns(logger, schema) + + cur := recordSchemaHash(ctx, db, coerceScanCursor(since), onError) + // SourceProgress is checkpointed by the batch processor (commitBatch), + // per-batch, AFTER each batch's affected sessions are emitted — the + // checkpoint-after-emit invariant. The trailing emitProgress that used to fire + // here was a SECOND emit of the same final cursor and is removed (SOW-0005 + // round-2 P3-C: one checkpoint layer only). A backfill that pages any rows ends + // on a batch that advanced, so at least one checkpoint always fires. + cur, _, err = processChanges(ctx, db, schema, cur, sourceID, out, logger, onError) + if err != nil { + return cur, err + } + return cur, nil +} + +// tailLoop is the realtime follow. It opens the DB read-only, introspects once, +// sets up a best-effort fsnotify watch on the WAL companion path as a wakeup +// hint, then polls on a timer whose cadence follows the idle/active/WAL-floor +// state machine. Each poll runs the cheap PK-indexed MAX(id) check per table; +// when the gate is open it ALSO runs the expensive MAX(time_updated) probe; on +// any indicated change it runs the delta+reload+emit path and advances the +// cursor. Returns nil on ctx cancellation. +// +// The WAL watch is best-effort: a missing WAL file or watcher error is logged +// once via onError and the loop falls back to pure timer polling (the 60 s +// safety net still guarantees in-place mutations are eventually seen). A watcher +// error never terminates the loop. +func tailLoop(ctx context.Context, dbPath, sourceID string, cur Cursor, warmStart bool, out chan<- canonical.Event, logger *slog.Logger, onError func(error)) error { + logger = orDefaultLogger(logger) + onError = orNoop(onError) + db, err := openReadOnly(ctx, dbPath, withMaxOpenConns(2)) + if err != nil { + // Non-fatal: report once and return cleanly so the daemon keeps serving + // other sources (mirrors codex's missing-root handling). 
+ onError(fmt.Errorf("opencode: tail open %s (ro): %w", dbPath, err)) + return nil + } + defer func() { _ = db.Close() }() + + schema, err := introspectAll(ctx, db) + if err != nil { + return fmt.Errorf("opencode: tail introspect %s: %w", dbPath, err) + } + // Surface optional column drift once per (table, column). Scan logs the same + // set once too; the per-phase duplication on this rare old-schema path is + // acceptable (introspection runs once per phase by design). + logMissingColumns(logger, schema) + cur = recordSchemaHash(ctx, db, coerceScanCursor(cur), onError) + + walEvents, closeWatch := watchWAL(dbPath, onError) + defer closeWatch() + + st := newPollState(warmStart) + timer := time.NewTimer(st.nextInterval(time.Now())) + defer timer.Stop() + + for { + select { + case <-ctx.Done(): + return nil + case _, ok := <-walEvents: + if !ok { + walEvents = nil // watcher closed; fall back to pure timer polling + continue + } + st.markWALEvent(time.Now()) + resetTimer(timer, st.nextInterval(time.Now())) + case <-timer.C: + advanced, perr := pollOnce(ctx, db, schema, &cur, sourceID, &st, out, logger, onError) + if perr != nil { + if errors.Is(perr, context.Canceled) || errors.Is(perr, context.DeadlineExceeded) { + return nil + } + // A transient query error is non-fatal: report and keep polling. + onError(perr) + } + st.markCycle(advanced, time.Now()) + resetTimer(timer, st.nextInterval(time.Now())) + } + } +} + +// pollOnce runs one poll cycle: the cheap MAX(id) change check per table, the +// gated MAX(time_updated) probe, then — IN ORDER — (1) the boundary-ms re-scan +// against the PRE-ADVANCE cursor when the gate is open (round-6 P1: before the +// forward delta, so a co-occurring forward change cannot strand a same-ms in-place +// update), and (2) the forward delta+reload+emit path when a change was detected +// (which advances the cursor). 
SourceProgress is checkpointed by the batch processor +// (commitBatch), per batch, AFTER that batch's sessions are emitted; the trailing +// emitProgress that used to fire here was a SECOND emit of the same cursor and is +// removed (SOW-0005 round-2 P3-C: one checkpoint layer only). Returns whether the +// cycle produced a change (so the loop switches to the active cadence). +func pollOnce(ctx context.Context, db *sql.DB, schema schemaSet, cur *Cursor, sourceID string, st *pollState, out chan<- canonical.Event, logger *slog.Logger, onError func(error)) (bool, error) { + now := time.Now() + // Capture, BEFORE markProbe mutates lastProbe, whether THIS cycle's probe gate is + // open (a WAL write since the last probe, OR the 60 s safety net elapsed). This is + // the SAME predicate detectChange uses to decide whether to issue the expensive + // MAX(time_updated) probe — read it here so the boundary re-scan can fire on the + // gate-open path even when detectChange short-circuits on the cheap MAX(id) path + // (round-7 P1-1: the re-scan must NOT key off detectChange's `probed` output). + probeGateOpen := shouldProbeTimeUpdated(now, st.lastWALEvent, st.lastProbe, timeUpdatedSafetyNet) + changed, probed, err := detectChange(ctx, db, schema, *cur, st, now) + if err != nil { + return false, err + } + if probed { + st.markProbe(now) + } + active := false + // Boundary-millisecond re-scan (SOW-0005 round-3 P1-1 → round-7 P1-1): re-emit any + // session touched by an in-place UPDATE at exactly the cursor's boundary ms T — the + // case the cheap MAX(id) path, the gated MAX(time_updated) > gate, and the forward + // delta's strict `> :tuid` tie-break all miss. 
+ // + // UNIFIED TRIGGER (round-7 P1-1, closes the same-ms class): run BEFORE processChanges, + // against the PRE-ADVANCE cursor (*cur is still the pre-advance watermark here — + // processChanges below advances it), whenever — for a warm/real boundary — EITHER + // (a) detectChange reported changed==true on ANY path (the cheap MAX(id) INSERT path + // OR the gated MAX(time_updated) probe), OR + // (b) the probe gate is open this cycle (probeGateOpen: a WAL event since the last + // probe, or the 60 s net elapsed). + // The re-scan deliberately does NOT key off detectChange's `probed` output: the cheap + // MAX(id) path returns changed==true, probed==false and SHORT-CIRCUITS before the gated + // probe, so a true INSERT (MaxIDSeen advances) co-occurring with a same-ms in-place + // UPDATE of a low-id row (row A, excluded from the forward delta by the strict + // `id > MaxTimeUpdatedID` tie-break) would otherwise be stranded: the old `probed`-gated + // trigger skipped the re-scan, processChanges advanced the cursor PAST T for the INSERT, + // and row A fell permanently below the new watermark, never seen. Arming on changed==true + // regardless of path re-emits row A's session FIRST, against the pre-advance T; the + // forward delta then emits the INSERT and advances the cursor. + // + // boundaryReal is the SINGLE cold-`Tail` guard (round-7 P2-1), applied to the WHOLE + // trigger — both the changed and gate-open paths. On a PRISTINE cold snapshot + // (follow-from-now) the cursor's boundary T is the snapshot HEAD and its bucket was + // NEVER emitted; re-scanning it would REPLAY a pre-existing session the cold Tail must + // not emit. boundaryReal is true only when the boundary T is a position whose bucket was + // already emitted — a WARM Tail (resumed from a Scan cursor) or after this Tail's cursor + // has advanced at least once (the new boundary is the just-emitted forward position, so + // re-scanning it is idempotent). 
Gating the gate-open path too (not just the changed + // path) closes the round-7 P2-1 hole where a cold Tail's first WAL-driven or safety-net + // probe (changed==false but gate open) replayed the snapshot boundary bucket. It also + // lets the old, partial `priorProbe` cold guard be removed entirely. + // + // AC#6 idle property is preserved: a steady-state idle DB within the 60 s net with no + // WAL event has changed==false AND probeGateOpen==false, so this never runs on an idle + // poll. The cursor is NOT advanced by the re-scan; re-emission is idempotent, and a + // session caught by BOTH this and the forward delta below is re-emitted once per path + // with no churn beyond the idempotent boundary bucket. + if st.boundaryReal && (changed || probeGateOpen) { + emitted, berr := emitBoundarySessions(ctx, db, schema, *cur, sourceID, out, logger, onError) + if berr != nil { + return active, berr + } + active = active || emitted + } + // Forward delta: advance the cursor for genuinely new / `> :tuid` rows. Runs AFTER + // the boundary re-scan so a co-occurring forward change (row B) advancing the + // watermark past T can never strand the pre-advance boundary update (row A). + if changed { + next, advanced, perr := processChanges(ctx, db, schema, *cur, sourceID, out, logger, onError) + *cur = next + active = active || advanced + if advanced { + // The cursor has moved to a forward position whose bucket was just emitted; + // the boundary is now a "real" already-emitted position, so subsequent + // gated probes may re-scan it on the changed==true path too (the cold + // HEAD-snapshot, if any, is now behind us). 
+ st.boundaryReal = true + } + if perr != nil { + return active, perr + } + } + return active, nil +} + +// detectChange reports whether any tracked table shows new/changed rows since +// the cursor's watermark, using the cheap PK-indexed MAX(id) on every poll and +// the expensive unindexed MAX(time_updated) ONLY when the gate is open +// (shouldProbeTimeUpdated). The second return reports whether the probe ran (so +// the caller records lastProbe). A table on an old schema without time_updated +// is checked by MAX(id) alone. +// +// The cheap path compares MAX(id) against MaxIDSeen — the MONOTONIC high-water, +// NOT the (time_updated, id) paging position (SOW-0005 round-2 P1-A). The +// pre-P1-A code compared against the paging-position id, which an in-place +// UPDATE of an OLD row regressed to a small value, leaving MAX(id) permanently +// greater so this "cheap" path falsely reported a change on every idle poll → +// the expensive (time_updated, id) full scan ran forever. Comparing against the +// never-regressing MaxIDSeen makes an INSERT the only thing this path fires on; +// in-place mutations of existing rows are caught only by the gated +// MAX(time_updated) probe below, exactly as AC#6 intends. +func detectChange(ctx context.Context, db *sql.DB, schema schemaSet, cur Cursor, st *pollState, now time.Time) (changed, probed bool, err error) { + // Cheap path: MAX(id) per table vs the monotonic high-water. + for _, table := range trackedTables { + mid, mErr := maxID(ctx, db, table) + if mErr != nil { + return false, false, mErr + } + if mid > cur.Tables[table].MaxIDSeen { + return true, false, nil + } + } + // Gated expensive path: MAX(time_updated) per table, only when the gate opens. 
+ if !shouldProbeTimeUpdated(now, st.lastWALEvent, st.lastProbe, timeUpdatedSafetyNet) { + return false, false, nil + } + for _, table := range trackedTables { + if !schema[table].has("time_updated") { + continue + } + mtu, mErr := maxTimeUpdated(ctx, db, table) + if mErr != nil { + return false, true, mErr + } + if mtu > cur.Tables[table].MaxTimeUpdatedMs { + return true, true, nil + } + } + return false, true, nil +} + +// shouldProbeTimeUpdated is the PURE gating predicate for the expensive +// MAX(time_updated) probe (AC#6, load-bearing). The probe is issued ONLY when +// (a) a WAL-mtime fsnotify event has fired since the last probe, OR (b) the +// safetyNet interval has elapsed since the last probe. During steady-state idle +// (no WAL event, within the net window) it returns false on every poll, so the +// unindexed full scan never runs — the property the AC#6 test pins. +func shouldProbeTimeUpdated(now, lastWALEvent, lastProbe time.Time, safetyNet time.Duration) bool { + if lastWALEvent.After(lastProbe) { + return true + } + return now.Sub(lastProbe) >= safetyNet +} + +// orNoop returns a no-op onError when the supplied one is nil so adapter code +// can call it unconditionally (mirrors codex/claude_code). +func orNoop(onError func(error)) func(error) { + if onError == nil { + return func(error) {} + } + return onError +} + +// orDefaultLogger guards a nil logger so a direct test caller passing nil does +// not panic. Production always passes a.logger (non-nil after New), so this is +// defence-in-depth, not a hot path. +func orDefaultLogger(logger *slog.Logger) *slog.Logger { + if logger == nil { + return slog.Default() + } + return logger +} + +// logMissingColumns emits exactly one INF per wanted-but-absent OPTIONAL column +// across the introspected tables, satisfying AC#5 / adapter-opencode.md +// §"Edge Cases" #1. 
Required-column loss is fatal upstream (introspectAll), so +// every column reaching here is an optional one the dynamic SELECT silently +// omitted; this surfaces the drift so an operator sees WHY a column reads zero +// on an old opencode database. Iteration is deterministic: tables in +// trackedTables order, columns already sorted by introspectTable +// (sort.Strings(s.Missing)). +func logMissingColumns(logger *slog.Logger, schema schemaSet) { + for _, table := range trackedTables { + for _, col := range schema[table].Missing { + logger.Info("opencode: optional column absent on this database schema; omitted from projection (old opencode version)", + "table", table, "column", col) + } + } +} + +// The WAL fsnotify wakeup-hint machinery (watchWAL/closedHintChan) and the +// resetTimer idiom live in tailer_wal.go (split to keep each file ≤400 lines). diff --git a/internal/adapters/opencode/tailer_batch.go b/internal/adapters/opencode/tailer_batch.go new file mode 100644 index 0000000..69b4a08 --- /dev/null +++ b/internal/adapters/opencode/tailer_batch.go @@ -0,0 +1,165 @@ +package opencode + +import ( + "context" + "database/sql" + "log/slog" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the BATCHED delta→emit→checkpoint processor (SOW-0005 P1.1, +// data-loss fix). It is split out of tailer_changes.go to keep each file ≤400 +// lines. The checkpoint-after-emit invariant is documented on processChanges +// (tailer_changes.go): a SourceProgress checkpoint carrying cursor W is emitted +// ONLY after every session affected by rows ≤ W has been reloaded, mapped, and +// emitted. +// +// A batch spans the tracked tables with ONE shared row budget (≈progressEveryRows +// rows TOTAL, not per-table) so a session touched by several tables in the same +// run is reloaded+emitted ONCE per batch (cross-table dedupe), exactly as the +// pre-P1.1 single-pass did — only the checkpoint timing changed. 
A part's owning +// session is its REQUIRED denormalized session_id (resolvePartSession), so the +// batch no longer threads a message→session map (the old-schema fallback that +// needed one was unreachable and removed — SOW-0005 round-6 P3-2). + +// batchProcessor drives the batched delta→emit→checkpoint loop for one run. It +// owns the running committed cursor (advanced only after a batch's sessions are +// emitted AND checkpointed) and the per-table scan position (the watermark each +// table has been paged to so far, which may be ahead of committed mid-batch). +type batchProcessor struct { + db *sql.DB + schema schemaSet + sourceID string + out chan<- canonical.Event + logger *slog.Logger + onError func(error) + + // committed is the last cursor whose every affected session has been emitted. + // It is what a restart safely resumes from; it advances ONLY in commitBatch, + // after reloadAndEmit succeeds. + committed Cursor + // scanned tracks, per table, the watermark paging has reached. It starts at + // committed and runs ahead of it within a batch; commitBatch promotes it into + // committed once the batch's sessions are emitted. + scanned map[string]TableWatermark + // done marks tables fully paged (a short page was seen), so later batches skip + // them. + done map[string]bool + // advanced reports whether any watermark advanced across the whole run. + advanced bool +} + +// run pages every tracked table forward in bounded cross-table batches, emitting +// and checkpointing each batch before the next. It loops until every table is +// fully paged. 
+func (bp *batchProcessor) run(ctx context.Context) error { + bp.scanned = map[string]TableWatermark{} + bp.done = map[string]bool{} + for _, table := range trackedTables { + bp.scanned[table] = bp.committed.Tables[table] + } + for !bp.allDone() { + if err := ctx.Err(); err != nil { + return err + } + batch, err := bp.collectBatch(ctx) + if err != nil { + return err + } + if batch.rowCount == 0 { + return nil // nothing left across any table + } + if err := bp.commitBatch(ctx, batch); err != nil { + return err + } + } + return nil +} + +// allDone reports whether every tracked table has been fully paged. +func (bp *batchProcessor) allDone() bool { + for _, table := range trackedTables { + if !bp.done[table] { + return false + } + } + return true +} + +// batchResult is one cross-table batch's outcome: the affected session ids +// (first-seen order, deduped across the tables in this batch) and the row count +// (0 ⇒ nothing left). The advanced per-table watermarks live in bp.scanned. +type batchResult struct { + affected []string + rowCount int +} + +// collectBatch pages the tracked tables in order, accumulating rows into ONE +// shared affected set until the shared budget (progressEveryRows rows total) is +// reached or every table is exhausted. A table that returns a short page is +// marked done so subsequent batches skip it. The per-table watermark advances in +// bp.scanned as pages are read; commitBatch promotes it into committed. +func (bp *batchProcessor) collectBatch(ctx context.Context) (batchResult, error) { + affected := newAffectedSet() + total := 0 + for _, table := range trackedTables { + if bp.done[table] { + continue + } + s := bp.schema[table] + // Warnings raised inside a page tx (corrupt-cell / unknown-type WARN) are + // buffered in sink and flushed by scanOnePage AFTER each page's tx closes + // (SOW-0005 round-5 P2-1) — never emitted with the WAL snapshot pinned. 
+ sink := &warnSink{} + onRow := deltaRowHandler(table, s, affected, sink.collect) + query := s.buildSelect() + for total < progressEveryRows { + if err := ctx.Err(); err != nil { + return batchResult{affected: affected.ids(), rowCount: total}, err + } + page, err := scanOnePage(ctx, bp.db, query, bp.scanned[table], onRow, sink, bp.onError) + if err != nil { + return batchResult{affected: affected.ids(), rowCount: total}, err + } + total += page.n + if page.n > 0 { + bp.scanned[table] = page.watermark + } + if page.n < deltaPageLimit { + bp.done[table] = true + break // table caught up + } + } + if total >= progressEveryRows { + break // budget spent; remaining tables wait for the next batch + } + } + return batchResult{affected: affected.ids(), rowCount: total}, nil +} + +// commitBatch reloads+emits the batch's affected sessions, THEN advances the +// committed cursor to the scanned watermark for every table that moved and +// checkpoints — the checkpoint-after-emit invariant. The watermark is promoted +// only when it genuinely moved (watermarkAdvanced), so a re-observed batch never +// spuriously flips `advanced`; a checkpoint is emitted only when something +// advanced. 
+func (bp *batchProcessor) commitBatch(ctx context.Context, batch batchResult) error { + if len(batch.affected) > 0 { + if err := reloadAndEmit(ctx, bp.db, bp.schema, bp.sourceID, batch.affected, bp.out, bp.logger, bp.onError); err != nil { + return err + } + } + moved := false + for _, table := range trackedTables { + if watermarkAdvanced(bp.committed.Tables[table], bp.scanned[table]) { + bp.committed = bp.committed.withTable(table, bp.scanned[table]) + moved = true + } + } + if moved { + bp.advanced = true + return emitProgress(ctx, bp.sourceID, bp.committed, bp.out) + } + return nil +} diff --git a/internal/adapters/opencode/tailer_boundary.go b/internal/adapters/opencode/tailer_boundary.go new file mode 100644 index 0000000..8d16012 --- /dev/null +++ b/internal/adapters/opencode/tailer_boundary.go @@ -0,0 +1,157 @@ +package opencode + +import ( + "context" + "database/sql" + "fmt" + "log/slog" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the BOUNDARY-MILLISECOND re-scan (SOW-0005 round-3 P1-1). It is +// split out of tailer.go / tailer_changes.go to keep each file ≤400 lines. +// +// The problem: an already-seen LOW-id row updated IN PLACE at exactly the cursor's +// boundary millisecond T moves neither MAX(id) (no insert) nor MAX(time_updated) +// (the bucket value is unchanged), and the forward delta's strict tie-break +// (time_updated = T AND id > highID) excludes a low-id row — so it would be skipped +// forever if it were the session's ONLY change. The fix re-scans the FULL boundary +// bucket (every row with time_updated = T, regardless of id) whenever pollOnce arms +// it (a detected change or an open probe gate, and only on a real, already-emitted +// boundary), collects the owning session ids, and re-emits their trees idempotently. +// The cursor is NOT advanced (the boundary rows are already at the watermark).
+ +// boundarySelect builds the boundary-bucket query for a table: the present-column +// SELECT filtered to a single time_updated value (the cursor's MaxTimeUpdatedMs), +// ordered (time_updated, id) for determinism, with NO LIMIT — the boundary bucket +// is the tiny set of rows sharing one millisecond, never a paged scan. It reuses +// the same present-column projection deltaRowHandler scans, so the existing +// per-table session-derivation closures work unchanged. quoteIdent guards every +// identifier (all from the fixed schema, never operator input). +func boundarySelect(s tableSchema) string { + if len(s.Present) == 0 { + // Defensive: introspectAll rejects a table with no readable columns. + return "SELECT 1 WHERE 0" + } + return "SELECT " + presentColsSQL(s) + + " FROM " + quoteIdent(s.Table) + + " WHERE time_updated = ?" + + " ORDER BY time_updated, id" +} + +// boundaryAffectedSessions re-scans the boundary millisecond bucket of every +// tracked table whose cursor watermark sits at a non-zero MaxTimeUpdatedMs and +// returns the SET of owning session ids (first-seen order, deduped across tables). +// It catches an in-place UPDATE of an already-seen row at exactly the boundary ms +// (SOW-0005 round-3 P1-1) that the forward delta's strict `> :tuid` tie-break and +// the `MAX(*) >` gates both miss. A table with MaxTimeUpdatedMs == 0 (cold start) +// or without time_updated is skipped — there is no boundary to re-check. ctx +// cancellation aborts promptly. +// +// This is READ-ONLY and does NOT touch the cursor: the boundary rows are already +// AT the watermark, so re-emitting their session trees is the only effect (the +// ingester absorbs the re-emission idempotently). A part's owning session is its +// REQUIRED denormalized session_id (resolvePartSession), so the boundary re-scan no +// longer threads a message→session map (SOW-0005 round-6 P3-2 removed the unreachable +// old-schema fallback that needed one). 
Tables are still scanned in trackedTables +// order for deterministic first-seen affected ordering. +func boundaryAffectedSessions(ctx context.Context, db *sql.DB, schema schemaSet, cur Cursor, onError func(error)) ([]string, error) { + affected := newAffectedSet() + for _, table := range trackedTables { + s := schema[table] + if !s.has("time_updated") { + continue + } + ms := cur.Tables[table].MaxTimeUpdatedMs + if ms == 0 { + continue // no boundary watermark yet (cold start) + } + if err := scanBoundaryBucket(ctx, db, table, s, ms, affected, onError); err != nil { + return affected.ids(), err + } + } + return affected.ids(), nil +} + +// scanBoundaryBucket runs one table's boundary-bucket query inside a short +// read-only transaction (the same WAL-friendly snapshot discipline as the forward +// delta pages) and feeds every row through the table's deltaRowHandler so the +// owning session id lands in `affected`. The watermark the handler reports is +// discarded — the cursor is not advanced by a boundary re-scan. +// +// No warning/error EMISSION happens while the boundary tx is open (SOW-0005 +// round-5 P2-1): onRow buffers any per-row WARN into sink, the tx is committed/ +// rolled back FIRST (explicitly), and the buffered warnings are flushed through +// onError only after the snapshot is released. A fatal row error (a corrupt +// REQUIRED watermark/owning-id cell) is likewise surfaced after the tx closes. 
+func scanBoundaryBucket(ctx context.Context, db *sql.DB, table string, s tableSchema, ms int64, affected *affectedSet, onError func(error)) error { + tx, err := beginRO(ctx, db) + if err != nil { + return err + } + + sink := &warnSink{} + onRow := deltaRowHandler(table, s, affected, sink.collect) + rows, err := tx.QueryContext(ctx, boundarySelect(s), ms) + if err != nil { + _ = tx.Rollback() + sink.flush(onError) + return fmt.Errorf("opencode: boundary re-scan %s: %w", table, err) + } + scanErr := iterBoundaryRows(ctx, rows, onRow) + _ = rows.Close() + if scanErr == nil { + scanErr = rows.Err() + } + // Close the tx (release the WAL snapshot) BEFORE flushing warnings or surfacing + // a fatal row error, so a backpressured onError can never block with the + // snapshot held (P2-1). + if scanErr != nil { + _ = tx.Rollback() + sink.flush(onError) + return fmt.Errorf("opencode: boundary rows %s: %w", table, scanErr) + } + commitErr := tx.Commit() + sink.flush(onError) + if commitErr != nil { + return fmt.Errorf("opencode: commit boundary tx %s: %w", table, commitErr) + } + return nil +} + +// iterBoundaryRows feeds every boundary-bucket row through onRow, stopping on the +// first row error or ctx cancellation. It does NOT close rows (the caller owns the +// rows + tx lifecycle so it can close the tx before flushing warnings — P2-1). +func iterBoundaryRows(ctx context.Context, rows *sql.Rows, onRow func(rows *sql.Rows) (rowKey, error)) error { + for rows.Next() { + if err := ctx.Err(); err != nil { + return err + } + if _, err := onRow(rows); err != nil { + return err + } + } + return nil +} + +// emitBoundarySessions re-loads and emits the boundary-affected sessions' trees +// (idempotent re-emission). It is called from pollOnce on a gate-open probe, in +// ADDITION to the forward delta path: the forward delta advances the cursor for +// genuinely new/`> :tuid` rows, while this catches the same-ms in-place update of +// an already-seen low-id row. 
A session caught by both is simply re-emitted once +// per path; the ingester's idempotent upserts absorb it. Returns whether any +// boundary session was emitted (so the caller can fold it into the active-cadence +// decision) and any fatal/ctx error. +func emitBoundarySessions(ctx context.Context, db *sql.DB, schema schemaSet, cur Cursor, sourceID string, out chan<- canonical.Event, logger *slog.Logger, onError func(error)) (bool, error) { + affected, err := boundaryAffectedSessions(ctx, db, schema, cur, onError) + if err != nil { + return false, err + } + if len(affected) == 0 { + return false, nil + } + if err := reloadAndEmit(ctx, db, schema, sourceID, affected, out, logger, onError); err != nil { + return false, err + } + return true, nil +} diff --git a/internal/adapters/opencode/tailer_branch_test.go b/internal/adapters/opencode/tailer_branch_test.go new file mode 100644 index 0000000..d36a112 --- /dev/null +++ b/internal/adapters/opencode/tailer_branch_test.go @@ -0,0 +1,306 @@ +package opencode + +import ( + "context" + "errors" + "strings" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file covers the error/edge branches of the chunk-C delta + tailer layer +// that the happy-path tests don't reach: errSessionGone skip, the part→session +// resolver (required denormalized session_id, empty-value refusal), ctx-cancel in +// emit/reload, the >progressEveryRows checkpoint, the time_updated change path, and +// the small pure helpers (isContextErr, coerceScanCursor schema-hash, messageOrderBy). + +// TestReloadAndEmit_SessionGoneSkipped feeds an affected id with no session row +// and asserts reloadAndEmit skips it with one errSessionGone error and keeps +// going for the others.
+func TestReloadAndEmit_SessionGoneSkipped(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_real", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + out := make(chan canonical.Event, 256) + var ce collectErrs + // "ses_ghost" has no session row → errSessionGone; "ses_real" emits normally. + err := reloadAndEmit(ctxBG(), db, schema, "opencode:test", []string{"ses_ghost", "ses_real"}, out, silentLogger(), ce.onError) + if err != nil { + t.Fatalf("reloadAndEmit: %v", err) + } + got := drainAll(out) + if n := countKind(got, canonical.EvSessionStarted); n != 1 { + t.Errorf("SessionStarted count = %d, want 1 (only ses_real)", n) + } + if ce.count() != 1 { + t.Fatalf("onError count = %d, want 1 (errSessionGone for ses_ghost)", ce.count()) + } + if !errors.Is(ce.errs[0], errSessionGone) { + t.Errorf("error = %v, want errSessionGone", ce.errs[0]) + } +} + +// TestReloadAndEmit_CtxCancel asserts reloadAndEmit returns ctx.Err() promptly +// when the context is already cancelled, without emitting. 
+func TestReloadAndEmit_CtxCancel(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_real", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + ctx, cancel := context.WithCancel(context.Background()) + cancel() + out := make(chan canonical.Event) // unbuffered; a non-ctx-aware emit would hang + var ce collectErrs + err := reloadAndEmit(ctx, db, schema, "opencode:test", []string{"ses_real"}, out, silentLogger(), ce.onError) + if !isContextErr(err) { + t.Fatalf("reloadAndEmit(cancelled) = %v, want context error", err) + } +} + +// TestResolvePartSession pins the SIMPLIFIED resolver (SOW-0005 round-6 P3-2): the +// part table's session_id is a REQUIRED column (requiredColumns["part"]) — its +// absence is fatal upstream in introspectAll — so the resolver reads the denormalized +// value directly. There is no message-lookup fallback (the old-schema branch was +// unreachable and removed, same class as the round-3 P3-1 dead-fallback removal). The +// resolver is now a PURE function of the partRow (no ctx/db/map), so it needs no DB. +func TestResolvePartSession(t *testing.T) { + t.Parallel() + + // Denormalized (required) session_id present → returned directly. + p := partRow{ID: "prt_1", MessageID: "msg_a", SessionID: "ses_a"} + if sid, err := resolvePartSession(p); err != nil || sid != "ses_a" { + t.Fatalf("denormalized: sid=%q err=%v, want ses_a/nil", sid, err) + } + + // Empty session_id → ERROR (defence in depth; the delta scanner's requiredOwner + // already errors the page before this, but the resolver must never derive an + // empty affected session that affectedSet.add would silently drop → cursor gap). 
+ pEmpty := partRow{ID: "prt_empty", MessageID: "msg_a", SessionID: ""} + sid, err := resolvePartSession(pEmpty) + if err == nil { + t.Fatalf("empty session_id must ERROR (got sid=%q, nil err); a silent empty would gap the cursor", sid) + } + if !strings.Contains(err.Error(), "empty session_id") { + t.Errorf("error = %v, want an empty-session_id refusal naming the part", err) + } +} + +// TestProcessChanges_NoChange asserts processChanges over a cursor already at the +// DB maxima returns advanced=false and emits nothing. +func TestProcessChanges_NoChange(t *testing.T) { + t.Parallel() + path := seedBackfillDB(t, t.TempDir(), 2) + db, schema := introspect(t, path) + + // Prime the cursor to the current maxima. + cur := newCursor() + for _, table := range trackedTables { + mid, _ := maxID(ctxBG(), db, table) + mtu, _ := maxTimeUpdated(ctxBG(), db, table) + cur = cur.withTable(table, TableWatermark{MaxIDSeen: mid, MaxTimeUpdatedMs: mtu, MaxTimeUpdatedID: mid}) + } + + out := make(chan canonical.Event, 256) + next, advanced, err := processChanges(ctxBG(), db, schema, cur, "opencode:test", out, silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("processChanges: %v", err) + } + if advanced { + t.Error("processChanges over up-to-date cursor reported advanced=true") + } + if got := drainAll(out); len(got) != 0 { + t.Errorf("no-change processChanges emitted %d events, want 0", len(got)) + } + _ = next +} + +// TestProcessChanges_BatchedCheckpoint asserts processChanges emits at least one +// SourceProgress checkpoint when more than progressEveryRows rows are paged (so a +// backfill resumes mid-scan) AND that every checkpoint is preceded by the content +// it covers — the checkpoint-after-emit invariant (SOW-0005 P1.1). 
Because all +// rows belong to one session, the batched loop re-emits that session's tree each +// batch, then checkpoints; the LAST event in the stream must therefore be a +// SourceProgress (the final batch checkpoint), never a checkpoint with trailing +// uncommitted content. +func TestProcessChanges_BatchedCheckpoint(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + // Insert > progressEveryRows messages so the loop runs more than one batch. + tx, _ := rw.Begin() + stmt, _ := tx.Prepare(`INSERT INTO message (id, session_id, time_created, time_updated, data) VALUES (?,?,?,?,?)`) + for i := 1; i <= progressEveryRows+50; i++ { + if _, err := stmt.Exec(fmtID("msg", i), "ses_a", int64(i), int64(i), `{"role":"user"}`); err != nil { + t.Fatalf("bulk insert: %v", err) + } + } + _ = stmt.Close() + if err := tx.Commit(); err != nil { + t.Fatalf("commit: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + out := make(chan canonical.Event, 8192) + var ce collectErrs + _, advanced, err := processChanges(ctxBG(), db, schema, newCursor(), "opencode:test", out, silentLogger(), ce.onError) + if err != nil { + t.Fatalf("processChanges: %v", err) + } + if !advanced { + t.Error("processChanges with new rows reported advanced=false") + } + got := drainAll(out) + if n := countKind(got, canonical.EvSourceProgress); n < 1 { + t.Errorf("processChanges paged > %d rows but emitted %d SourceProgress, want >= 1", progressEveryRows, n) + } + // Checkpoint-after-emit: the final emitted event is the last batch's + // SourceProgress, so the run never ends with content past the last checkpoint. 
+ if len(got) == 0 || got[len(got)-1].EventKind() != canonical.EvSourceProgress { + t.Errorf("last event = %v, want a SourceProgress checkpoint (checkpoint-after-emit)", lastKind(got)) + } +} + +// lastKind returns the kind of the last event (or "" for an empty slice) for +// assertion messages. +func lastKind(evs []canonical.Event) canonical.EventKind { + if len(evs) == 0 { + return "" + } + return evs[len(evs)-1].EventKind() +} + +// TestDetectChange_TimeUpdatedPath asserts an in-place mutation (time_updated +// bumped, id already covered by MaxIDSeen) is caught only via the gated +// MAX(time_updated) probe, never the cheap MAX(id) path (SOW-0005 round-2 P1-A). +func TestDetectChange_TimeUpdatedPath(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 100, 100, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + // Cursor whose monotonic high-water (MaxIDSeen) already saw the current MAX(id) + // but whose time_updated paging position is LOWER, simulating an in-place row + // update (same id, bumped time_updated) we have not yet paged. The cheap MAX(id) + // check must NOT fire (MaxIDSeen already covers the id); only the gated + // MAX(time_updated) probe catches the mutation (SOW-0005 round-2 P1-A). 
+ cur := newCursor() + mid, _ := maxID(ctxBG(), db, "session") + cur = cur.withTable("session", TableWatermark{MaxIDSeen: mid, MaxTimeUpdatedMs: 50, MaxTimeUpdatedID: mid}) + + st := newPollState(false) // zero lastProbe ⇒ gate open (net immediately due) + changed, probed, err := detectChange(ctxBG(), db, schema, cur, &st, time.Unix(1_700_000_000, 0)) + if err != nil { + t.Fatalf("detectChange: %v", err) + } + if !probed { + t.Error("expected the gated probe to run (gate open)") + } + if !changed { + t.Error("in-place mutation (time_updated > watermark) not detected via MAX(time_updated)") + } +} + +// TestCoerceScanCursor_PureShaping asserts coerceScanCursor only normalises the +// Tables map and Version and does NOT record a schema hash (the hash is recorded +// separately by recordSchemaHash after __drizzle_migrations is read — SOW-0005 +// chunk D, replacing chunk C's present-column placeholder). +func TestCoerceScanCursor_PureShaping(t *testing.T) { + t.Parallel() + c := coerceScanCursor(Cursor{}) + if c.Tables == nil { + t.Error("coerceScanCursor did not initialise the Tables map") + } + if c.Version != cursorVersion { + t.Errorf("coerceScanCursor Version = %d, want %d", c.Version, cursorVersion) + } + if c.SchemaHash != "" { + t.Errorf("coerceScanCursor recorded a schema hash %q; the hash is recordSchemaHash's job", c.SchemaHash) + } +} + +// TestPureHelpers covers isContextErr, messageOrderBy, and parseInt64/parseFloat64 +// — the small pure-helper branches. +func TestPureHelpers(t *testing.T) { + t.Parallel() + if !isContextErr(context.Canceled) || !isContextErr(context.DeadlineExceeded) { + t.Error("isContextErr should be true for canceled/deadline") + } + if isContextErr(errors.New("other")) || isContextErr(nil) { + t.Error("isContextErr should be false for non-context / nil") + } + + // messageOrderBy: with time_created → composite; without → id only. 
+ withTC := tableSchema{Table: "message", Present: []string{"id", "time_created"}, live: map[string]struct{}{"time_created": {}}} + if ob := messageOrderBy(withTC); !strings.Contains(ob, "time_created") { + t.Errorf("messageOrderBy(with time_created) = %q, want composite", ob) + } + noTC := tableSchema{Table: "message", Present: []string{"id"}, live: map[string]struct{}{}} + if ob := messageOrderBy(noTC); strings.Contains(ob, "time_created") { + t.Errorf("messageOrderBy(no time_created) = %q, want id only", ob) + } + + if parseInt64("123") != 123 || parseInt64("nope") != 0 { + t.Error("parseInt64 wrong") + } + if parseFloat64("1.5") != 1.5 || parseFloat64("nope") != 0 { + t.Error("parseFloat64 wrong") + } + // NOTE: buildSelectByID and its assertion were removed with the id-only delta + // fallback (SOW-0005 P3.1) — time_updated is a required column, so the + // composite-key buildSelect is the only delta SELECT. +} + +// TestEmitHelpers_CtxCancel covers the ctx-cancel branches of emitProgress and +// emitEvents. +func TestEmitHelpers_CtxCancel(t *testing.T) { + t.Parallel() + ctx, cancel := context.WithCancel(context.Background()) + cancel() + out := make(chan canonical.Event) // unbuffered + if err := emitProgress(ctx, "s", newCursor(), out); !isContextErr(err) { + t.Errorf("emitProgress(cancelled) = %v, want context error", err) + } + ev := []canonical.Event{canonical.SourceProgressEvent{}} + if err := emitEvents(ctx, ev, out); !isContextErr(err) { + t.Errorf("emitEvents(cancelled) = %v, want context error", err) + } + // emitEvents over an empty slice is a no-op nil. + if err := emitEvents(context.Background(), nil, out); err != nil { + t.Errorf("emitEvents(nil) = %v, want nil", err) + } +} + +// TestOrNoop covers the nil-onError substitution. 
+func TestOrNoop(t *testing.T) { + t.Parallel() + if orNoop(nil) == nil { + t.Fatal("orNoop(nil) returned nil") + } + orNoop(nil)(errors.New("must not panic")) + called := false + orNoop(func(error) { called = true })(errors.New("x")) + if !called { + t.Error("orNoop did not pass through the provided func") + } +} diff --git a/internal/adapters/opencode/tailer_changes.go b/internal/adapters/opencode/tailer_changes.go new file mode 100644 index 0000000..bb197d4 --- /dev/null +++ b/internal/adapters/opencode/tailer_changes.go @@ -0,0 +1,307 @@ +package opencode + +import ( + "context" + "database/sql" + "errors" + "fmt" + "log/slog" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file holds the SHARED change-processing helper both poll loops use +// (processChanges) and the poll-loop cadence STATE MACHINE (pollState). It is +// split out of tailer.go to keep each file ≤400 lines. processChanges is the one +// "delta → affected sessions → reload → map → emit → checkpoint" pipeline: +// scanLoop runs it once over the whole backfill; tailLoop runs it per change +// cycle. +// +// CHECKPOINT-AFTER-EMIT INVARIANT (SOW-0005 P1.1, data-loss fix): a +// SourceProgress checkpoint carrying cursor W is emitted ONLY after every session +// affected by rows ≤ W in this run has been reloaded, mapped, and emitted. The +// pipeline therefore runs in BOUNDED BATCHES: each batch pages ≤ progressEveryRows +// delta rows forward (advancing the per-table watermark), reloadAndEmits that +// batch's affected sessions, and ONLY THEN checkpoints the batch cursor. A crash +// or ctx-cancel mid-batch returns the LAST fully-committed cursor (the previous +// batch's), never the in-progress batch's scanned watermark — so a restart from +// the persisted cursor can never resume PAST rows whose canonical events were +// never emitted (adapter-opencode.md §"Read Strategy" → "checkpoint-after-emit"). 
+ +// processChanges pages every tracked table forward from `cur` in bounded batches, +// reloading+emitting each batch's affected sessions BEFORE checkpointing that +// batch's cursor, and returns the advanced cursor + whether anything advanced. It +// is used by BOTH scanLoop (whole backfill) and tailLoop (one cycle). The +// returned cursor is always a checkpoint-safe one: every session for a row it +// covers has been emitted. +// +// Re-emitting an unchanged/partly-changed session is harmless: the ingester's +// idempotent upserts + the post-SOW-0004 idempotent catalog absorb it +// (adapter-opencode.md §"Read Strategy" → "Full-session-tree load + map"). +func processChanges(ctx context.Context, db *sql.DB, schema schemaSet, cur Cursor, sourceID string, out chan<- canonical.Event, logger *slog.Logger, onError func(error)) (Cursor, bool, error) { + bp := &batchProcessor{ + db: db, + schema: schema, + sourceID: sourceID, + out: out, + logger: orDefaultLogger(logger), + onError: onError, + committed: cur, // last cursor whose sessions are fully emitted + } + if err := bp.run(ctx); err != nil { + // On error/cancel the committed cursor is the last batch whose content was + // fully emitted — never the in-progress batch's scanned watermark. + return bp.committed, bp.advanced, err + } + return bp.committed, bp.advanced, nil +} + +// deltaRowHandler returns the per-row scan callback for one table: it scans the +// row into its typed struct, records the owning session id into the affected +// set, and reports the row's watermark key. The returned closure matches +// scanTableDelta's onRow type. onError is threaded into the per-table scanner +// (round-4 P2-1): a corrupt OPTIONAL numeric cell surfaces a WARN and degrades to +// 0, while a corrupt REQUIRED watermark/owning-id cell (id/time_updated/session_id) +// returns an ERROR that aborts the page so the cursor never advances to a poisoned +// watermark or an empty affected session. 
onError ALSO surfaces non-fatal per-row +// anomalies (a session_message with an unknown type — adapter-opencode.md §"Edge +// Cases" #1/§"session_message") without aborting the cycle. +// +// A part's owning session is its REQUIRED denormalized session_id (resolvePartSession; +// session_id is in requiredColumns["part"], so the old-schema message-lookup fallback +// was unreachable and removed — SOW-0005 round-6 P3-2). The message→session map the +// fallback once consulted is therefore gone too. +func deltaRowHandler(table string, s tableSchema, affected *affectedSet, onError func(error)) func(*sql.Rows) (rowKey, error) { + idx := newColumnIndex(s) + n := len(s.Present) + switch table { + case "session": + scan, row := scanSessionRow(idx, n, onError) + return func(rows *sql.Rows) (rowKey, error) { + k, err := scan(rows) + if err != nil { + return k, err + } + affected.add(row.ID) + return k, nil + } + case "message": + scan, row := scanMessageRow(idx, n, onError) + return func(rows *sql.Rows) (rowKey, error) { + k, err := scan(rows) + if err != nil { + return k, err + } + affected.add(row.SessionID) + return k, nil + } + case "part": + scan, row := scanPartRow(idx, n, onError) + return func(rows *sql.Rows) (rowKey, error) { + k, err := scan(rows) + if err != nil { + return k, err + } + sid, rerr := resolvePartSession(*row) + if rerr != nil { + return k, rerr + } + affected.add(sid) + return k, nil + } + default: // session_message + scan, row := scanSessionMessageRow(idx, n, onError) + hasType := s.has("type") + return func(rows *sql.Rows) (rowKey, error) { + k, err := scan(rows) + if err != nil { + return k, err + } + affected.add(row.SessionID) + // Spec Edge #1 (§"session_message"): warn on an unrecognized + // session_message.type so a new opencode event variant is visible + // rather than silently absorbed. Introspection-aware: if the schema + // lacks the type column, skip silently (row.Type is ""). 
+ if hasType { + warnUnknownSessionMessageType(row.ID, row.Type, onError) + } + return k, nil + } + } +} + +// knownSessionMessageTypes is the set of session_message.type discriminators the +// adapter recognizes today (adapter-opencode.md §"session_message": only +// agent-switched / model-switched ship currently; the upstream union is wider but +// unpopulated). Any other value is forward-compatibility data surfaced via a WARN. +var knownSessionMessageTypes = map[string]struct{}{ + "agent-switched": {}, + "model-switched": {}, +} + +// warnUnknownSessionMessageType emits one structured WARN via onError for a +// session_message row whose type is not recognized (spec Edge #1). An empty type +// (older schema with the column NULL, or absent) is not flagged — only a present +// but unrecognized value. The row's session_id still drives the affected set +// (the caller already added it), so the tree is reloaded regardless. +func warnUnknownSessionMessageType(id, typ string, onError func(error)) { + if typ == "" { + return + } + if _, ok := knownSessionMessageTypes[typ]; ok { + return + } + onError(fmt.Errorf("opencode: unknown session_message type %q (table=session_message id=%s); skipping unrecognized event variant", typ, id)) +} + +// reloadAndEmit loads each affected session's full tree, maps it via the pure +// mapper, and emits the events. ctx cancellation stops promptly. +// +// Error policy (SOW-0005 round-7 P1-2): only TWO outcomes are non-fatal +// skip-and-continue — (1) the `time_compacting` pause (`skipped == true`), which +// re-surfaces in a later delta when the column clears (Edge Cases #8); and (2) a +// session whose row is GONE (`loadAndMapSession` returns nil events with no error +// — deleted between the delta page and the load, or an orphaned part/message), +// surfaced once as `errSessionGone`. 
ANY OTHER error (a transient tree-load/read +// error, a commit failure, a corrupt-tree decode error) PROPAGATES so the caller +// (commitBatch) does NOT promote the cursor: the same rows are retried next cycle. +// The earlier code swallowed every non-context error (logged + continued) and let +// commitBatch advance the cursor anyway, persisting a watermark beyond rows whose +// content was never emitted — a permanent, health-invisible content loss. Letting +// the error propagate keeps the checkpoint-after-emit invariant: a cursor is +// promoted only after every affected session's content was successfully emitted. +func reloadAndEmit(ctx context.Context, db *sql.DB, schema schemaSet, sourceID string, affected []string, out chan<- canonical.Event, logger *slog.Logger, onError func(error)) error { + for _, sid := range affected { + if err := ctx.Err(); err != nil { + return err + } + evs, skipped, err := loadAndMapSession(ctx, db, schema, sourceID, sid, logger, onError) + if err != nil { + // Every load/map/commit error (context or not) propagates: a context error + // stops the run, and any other transient error must NOT let the cursor + // advance past rows whose content was not emitted (round-7 P1-2). It is + // surfaced via onError at the propagation boundary (commitBatch's caller). + return err + } + if skipped { + // Session paused mid-compaction (time_compacting non-NULL). It is NOT + // emitted this cycle and re-surfaces in a later delta when the column + // clears (its time_updated bumps) — adapter-opencode.md §"Edge Cases" #8. + continue + } + if evs == nil { + // Session row legitimately gone — the only load failure treated as + // skip-and-continue. Surfaced once as a structured error; the cursor may + // advance past it (the row is not coming back). 
+ onError(fmt.Errorf("opencode: affected session %s: %w", sid, errSessionGone)) + continue + } + if eerr := emitEvents(ctx, evs, out); eerr != nil { + return eerr + } + } + return nil +} + +// rollbackFlush closes the per-session read tx (rolling back the read-only +// snapshot) and THEN flushes the buffered warnings through onError (SOW-0005 +// round-5 P2-1) — the early-return chokepoint for loadAndMapSession so no warning +// is emitted with the snapshot held. Rollback-after-an-eventual-commit is a no-op +// (database/sql), and the deferred rollback in the caller is harmless after this. +func rollbackFlush(tx *sql.Tx, sink *warnSink, onError func(error)) { + _ = tx.Rollback() + sink.flush(onError) +} + +// loadAndMapSession loads one session's row + full ordered tree and maps it, +// returning (events, skipped, error). skipped=true means the session is paused +// mid-compaction (time_compacting non-NULL) and must NOT be emitted this cycle +// (SOW-0005 round-2 P2-E / adapter-opencode.md §"Edge Cases" #8); the caller +// continues without emitting and the session re-surfaces when the column clears. +// A nil event slice with skipped=false (and nil error) means the session row was +// not found (the caller surfaces errSessionGone). The mapper uses the +// deterministic default PayloadRef URI builder (the production tailer path injects +// no builder; defaultPayloadURI is the single source of truth). It also resolves +// the session's TRUE tree root by walking the parent_id chain (resolveRootID, +// SOW-0005 P2.4) and injects it so a nested sub-agent's RootNativeID is the whole +// tree's root rather than its direct parent. 
+func loadAndMapSession(ctx context.Context, db *sql.DB, schema schemaSet, sourceID, sessionID string, logger *slog.Logger, onError func(error)) (evs []canonical.Event, skipped bool, err error) { + // ONE read-only transaction for the WHOLE per-session read (SOW-0005 round-3 + // P1-2): the session row, the time_compacting check, the parent-chain root + // resolution, and the message+part tree all share a single consistent snapshot. + // Opening a second tx for the tree after checking time_compacting in a first one + // was a compaction-race TOCTOU — opencode could begin compaction between the two + // reads and the adapter would emit a partial/mutating tree despite the Edge #8 + // skip rule, and the metadata would come from a different snapshot than the tree. + tx, err := beginRO(ctx, db) + if err != nil { + return nil, false, err + } + // No warning EMISSION while this snapshot is open (SOW-0005 round-5 P2-1): the + // loaders (loadSession/loadSessionTree corrupt-cell + oversized-session WARNs) + // and resolveRootID (parent-chain WARNs) write into sink — a non-blocking slice + // append — instead of the live onError, which would send on the (possibly + // backpressured) out channel and pin the WAL snapshot. The tx is closed FIRST + // (explicit rollback/commit), THEN sink is flushed through onError, THEN the + // pure mapper runs and the caller emits the content events — so NEITHER a + // warning NOR a content event is emitted with the snapshot held. The deferred + // rollback is a panic-safety net only (a no-op after the explicit close). 
+ defer func() { _ = tx.Rollback() }() + sink := &warnSink{} + + s, ok, err := loadSession(ctx, tx, schema, sessionID, sink.collect) + if err != nil { + rollbackFlush(tx, sink, onError) + return nil, false, err + } + if !ok { + rollbackFlush(tx, sink, onError) + return nil, false, nil + } + if s.TimeCompactingMs > 0 { + // Pause: compaction is reshaping this session's message/part rows, so reading + // now would emit partial/stale content. Skip emitting this cycle; the session + // re-appears in a later delta when time_compacting clears (P2-E). The check and + // the (skipped) tree read are now atomic on this one snapshot (P1-2). + rollbackFlush(tx, sink, onError) + orDefaultLogger(logger).Info("opencode: session compaction in progress; skipping tree emit this cycle (re-emits when time_compacting clears)", + "session_id", sessionID) + return nil, true, nil + } + tree, err := loadSessionTree(ctx, tx, schema, sessionID, sink.collect) + if err != nil { + rollbackFlush(tx, sink, onError) + return nil, false, err + } + root := resolveRootID(ctx, tx, s.ID, s.ParentID, sink.collect) + // Commit the read-only snapshot before mapping (mapping is pure CPU work; holding + // the snapshot across it would pin the WAL needlessly). A commit failure on a + // read-only tx is surfaced rather than silently dropped. + commitErr := tx.Commit() + // Flush the buffered loader/root warnings now that the snapshot is released + // (P2-1) — before mapping/emitting, so the ordering is: tx closed → warnings → + // content events (the caller emits evs). + sink.flush(onError) + if commitErr != nil { + return nil, false, fmt.Errorf("opencode: commit session-read tx for %s: %w", sessionID, commitErr) + } + // The mapper runs AFTER the tx is closed, so its own WARNs (mwarn) may go + // straight to the live onError — any channel send now blocks without the + // snapshot held. 
+ evs, err = mapSession(sourceID, s, tree, WithRootNativeID(root), WithOnWarn(onError)) + return evs, false, err +} + +// watermarkAdvanced reports whether b is strictly after a on the composite +// (time_updated, id) key — the same order the delta query and cursor.After use. +// Guards against re-recording an unchanged watermark (which would set advanced +// when nothing moved). +func watermarkAdvanced(a, b TableWatermark) bool { + return cmpWatermark(b, a) > 0 +} + +// isContextErr reports whether err is a context cancellation/deadline (so the +// reload loop returns it rather than swallowing it via onError). +func isContextErr(err error) bool { + return errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) +} diff --git a/internal/adapters/opencode/tailer_counting_test.go b/internal/adapters/opencode/tailer_counting_test.go new file mode 100644 index 0000000..1ce14a1 --- /dev/null +++ b/internal/adapters/opencode/tailer_counting_test.go @@ -0,0 +1,112 @@ +package opencode + +import ( + "testing" + "time" +) + +// This file is the AC#6 secondary proof via the query-counting driver: across +// several IDLE poll cycles (no WAL event, within the safety-net window) the +// literal MAX(time_updated) SQL is NEVER executed, while the cheap MAX(id) check +// runs every cycle. The pure-gate test (tailer_gate_test.go) is the primary +// property proof; this pins it against real executed SQL. + +// TestDetectChange_NoIdleMaxTimeUpdated drives detectChange across several idle +// polls through the counting driver and asserts zero MAX(time_updated) queries +// were executed, while MAX(id) ran on every poll. +// +// NOT t.Parallel(): the counting driver shares one global queryLog (sql.Register +// is once-only), so the two counting tests run serially to keep their reset+ +// measure windows exclusive. 
+func TestDetectChange_NoIdleMaxTimeUpdated(t *testing.T) { + dir := t.TempDir() + path := seedBackfillDB(t, dir, 2) + + db, log := openCounting(t, path) + schema, err := introspectAll(ctxBG(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + + // Cursor already at the DB maxima → no change is detected (steady state). We + // compute it from the live maxima so MAX(id) reports "no new rows". + cur := newCursor() + for _, table := range trackedTables { + mid, _ := maxID(ctxBG(), db, table) + mtu, _ := maxTimeUpdated(ctxBG(), db, table) + cur = cur.withTable(table, TableWatermark{MaxIDSeen: mid, MaxTimeUpdatedMs: mtu, MaxTimeUpdatedID: mid}) + } + + // Reset the recorded SQL AFTER the watermark-priming queries above so we only + // count the idle-poll queries. + log.reset() + + // Idle poll state: a recent probe and NO WAL event since, so the gate is + // CLOSED for the whole run (the window is far from the 60 s net). + now := time.Unix(1_700_000_000, 0) + st := newPollState(false) + st.markProbe(now) // lastProbe = now → net not yet due + // lastWALEvent stays zero (before lastProbe) → no WAL-driven probe. + + const idlePolls = 5 + for i := 0; i < idlePolls; i++ { + pollNow := now.Add(time.Duration(i) * activePollInterval) // all within the net window + changed, probed, derr := detectChange(ctxBG(), db, schema, cur, &st, pollNow) + if derr != nil { + t.Fatalf("detectChange poll %d: %v", i, derr) + } + if changed { + t.Fatalf("poll %d reported a change on a steady-state DB", i) + } + if probed { + t.Fatalf("poll %d ran the gated probe during idle (gate should be closed)", i) + } + } + + // The literal MAX(time_updated) must NOT appear across the idle polls. + if n := log.countContaining("MAX(time_updated)"); n != 0 { + t.Errorf("idle polls executed MAX(time_updated) %d times, want 0 (AC#6)", n) + } + // The cheap MAX(id) must have run (one per table per poll). 
+ if n := log.countContaining("MAX(id)"); n < idlePolls { + t.Errorf("idle polls executed MAX(id) %d times, want >= %d (cheap path runs every poll)", n, idlePolls) + } +} + +// TestDetectChange_GateOpenRunsProbe asserts that when the gate IS open (a WAL +// event since the last probe), detectChange DOES execute MAX(time_updated). +// +// NOT t.Parallel(): see TestDetectChange_NoIdleMaxTimeUpdated (shared queryLog). +func TestDetectChange_GateOpenRunsProbe(t *testing.T) { + dir := t.TempDir() + path := seedBackfillDB(t, dir, 1) + + db, log := openCounting(t, path) + schema, err := introspectAll(ctxBG(), db) + if err != nil { + t.Fatalf("introspectAll: %v", err) + } + cur := newCursor() + for _, table := range trackedTables { + mid, _ := maxID(ctxBG(), db, table) + mtu, _ := maxTimeUpdated(ctxBG(), db, table) + cur = cur.withTable(table, TableWatermark{MaxIDSeen: mid, MaxTimeUpdatedMs: mtu, MaxTimeUpdatedID: mid}) + } + log.reset() + + now := time.Unix(1_700_000_000, 0) + st := newPollState(false) + st.markProbe(now) + st.markWALEvent(now.Add(time.Second)) // WAL event AFTER the last probe → gate open + + _, probed, derr := detectChange(ctxBG(), db, schema, cur, &st, now.Add(2*time.Second)) + if derr != nil { + t.Fatalf("detectChange: %v", derr) + } + if !probed { + t.Fatal("gate open (WAL event) but probe did not run") + } + if n := log.countContaining("MAX(time_updated)"); n == 0 { + t.Error("gate open but MAX(time_updated) never executed") + } +} diff --git a/internal/adapters/opencode/tailer_gate_test.go b/internal/adapters/opencode/tailer_gate_test.go new file mode 100644 index 0000000..ec3134b --- /dev/null +++ b/internal/adapters/opencode/tailer_gate_test.go @@ -0,0 +1,173 @@ +package opencode + +import ( + "testing" + "time" +) + +// This file is the AC#6 load-bearing proof: the pure MAX(time_updated) gating +// predicate (shouldProbeTimeUpdated) returns false during steady-state idle and +// true only after a WAL event or once the safety net elapses. 
It also pins the +// cadence state machine (pollState.nextInterval). The literal-SQL no-idle-probe +// assertion via the counting driver lives in tailer_counting_test.go (it needs a real DB). + +// TestShouldProbeTimeUpdated is the direct, deterministic AC#6 gate test. It +// asserts the predicate's exact truth table so a regression that re-opens the +// expensive probe on idle polls fails here. +func TestShouldProbeTimeUpdated(t *testing.T) { + t.Parallel() + base := time.Unix(1_700_000_000, 0) + net := 60 * time.Second + + cases := []struct { + name string + now time.Time + lastWALEvent time.Time + lastProbe time.Time + want bool + }{ + { + name: "steady idle within net: no probe", + now: base.Add(10 * time.Second), + lastWALEvent: time.Time{}, // no WAL event ever + lastProbe: base, + want: false, + }, + { + name: "wal event after last probe: probe", + now: base.Add(1 * time.Second), + lastWALEvent: base.Add(500 * time.Millisecond), + lastProbe: base, + want: true, + }, + { + name: "wal event before last probe (already consumed): no probe", + now: base.Add(2 * time.Second), + lastWALEvent: base, + lastProbe: base.Add(1 * time.Second), + want: false, + }, + { + name: "safety net exactly elapsed: probe", + now: base.Add(net), + lastWALEvent: time.Time{}, + lastProbe: base, + want: true, + }, + { + name: "safety net just under: no probe", + now: base.Add(net - time.Millisecond), + lastWALEvent: time.Time{}, + lastProbe: base, + want: false, + }, + { + name: "safety net well past: probe", + now: base.Add(5 * net), + lastWALEvent: time.Time{}, + lastProbe: base, + want: true, + }, + } + for _, tc := range cases { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + got := shouldProbeTimeUpdated(tc.now, tc.lastWALEvent, tc.lastProbe, net) + if got != tc.want { + t.Errorf("shouldProbeTimeUpdated(now=%v, wal=%v, probe=%v) = %v, want %v", + tc.now.Sub(base), tc.lastWALEvent.Sub(base), tc.lastProbe.Sub(base), got, tc.want) + } + }) + } +} + +//
TestShouldProbeTimeUpdated_ManyIdlePolls simulates a run of idle polls (no WAL +// event) at the active cadence and asserts the gate stays CLOSED on every poll +// until the 60 s net elapses — the property that keeps the unindexed full scan +// off the idle hot path. +func TestShouldProbeTimeUpdated_ManyIdlePolls(t *testing.T) { + t.Parallel() + net := 60 * time.Second + start := time.Unix(1_700_000_000, 0) + lastProbe := start + var noWAL time.Time + + probes := 0 + // Poll every 500 ms (active cadence) for 50 s — all within the net window. + for elapsed := time.Duration(0); elapsed < 50*time.Second; elapsed += 500 * time.Millisecond { + now := start.Add(elapsed) + if shouldProbeTimeUpdated(now, noWAL, lastProbe, net) { + probes++ + lastProbe = now + } + } + if probes != 0 { + t.Errorf("idle polls within the safety-net window issued %d MAX(time_updated) probes, want 0", probes) + } +} + +// TestPollStateNextInterval pins the cadence state machine: idle 2 s, active +// 500 ms, and the 250 ms floor while the WAL-event window is open. 
+func TestPollStateNextInterval(t *testing.T) { + t.Parallel() + now := time.Unix(1_700_000_000, 0) + + t.Run("idle is 2s", func(t *testing.T) { + t.Parallel() + st := newPollState(false) + if d := st.nextInterval(now); d != idlePollInterval { + t.Errorf("idle interval = %v, want %v", d, idlePollInterval) + } + }) + + t.Run("active is 500ms", func(t *testing.T) { + t.Parallel() + st := newPollState(false) + st.markCycle(true, now) + if d := st.nextInterval(now); d != activePollInterval { + t.Errorf("active interval = %v, want %v", d, activePollInterval) + } + }) + + t.Run("wal floor overrides idle within window", func(t *testing.T) { + t.Parallel() + st := newPollState(false) + st.markWALEvent(now) + if d := st.nextInterval(now.Add(1 * time.Second)); d != walFloorInterval { + t.Errorf("interval within WAL window = %v, want %v (floor)", d, walFloorInterval) + } + }) + + t.Run("wal floor expires after window", func(t *testing.T) { + t.Parallel() + st := newPollState(false) + st.markWALEvent(now) + // After the 5 s window, idle cadence resumes. + if d := st.nextInterval(now.Add(walFloorWindow + time.Second)); d != idlePollInterval { + t.Errorf("interval after WAL window = %v, want %v (idle)", d, idlePollInterval) + } + }) + + t.Run("active still floored within wal window", func(t *testing.T) { + t.Parallel() + st := newPollState(false) + st.markCycle(true, now) + st.markWALEvent(now) + if d := st.nextInterval(now.Add(time.Second)); d != walFloorInterval { + t.Errorf("active+WAL interval = %v, want %v (floor below active)", d, walFloorInterval) + } + }) +} + +// TestPollStateFirstProbeGateOpen asserts the INITIAL state opens the probe gate +// on the first poll (lastProbe zero ⇒ the net is immediately due), so a tail that +// starts after in-place mutations reconciles them on the first cycle. 
+func TestPollStateFirstProbeGateOpen(t *testing.T) { + t.Parallel() + st := newPollState(false) + now := time.Unix(1_700_000_000, 0) + if !shouldProbeTimeUpdated(now, st.lastWALEvent, st.lastProbe, timeUpdatedSafetyNet) { + t.Error("first poll gate should be OPEN (zero lastProbe ⇒ net immediately due)") + } +} diff --git a/internal/adapters/opencode/tailer_pollcycle_test.go b/internal/adapters/opencode/tailer_pollcycle_test.go new file mode 100644 index 0000000..acf6abe --- /dev/null +++ b/internal/adapters/opencode/tailer_pollcycle_test.go @@ -0,0 +1,185 @@ +package opencode + +import ( + "context" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file covers the integrated poll-cycle path (pollOnce → detectChange → +// processChanges → emitProgress) and the scanLoop introspect-fatal branch, +// driving real DBs rather than the loop goroutine so the assertions are +// deterministic (no timers). + +// TestPollOnce_ProductiveCycle drives pollOnce directly against a DB with a new +// session past the cursor and asserts it emits the session's events + a +// SourceProgress and reports advanced=true. 
+func TestPollOnce_ProductiveCycle(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 100, 100, 0) + insertAssistantMessage(t, rw, "msg_a", "ses_a", 110, 110, 5, 2) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + + out := make(chan canonical.Event, 256) + cur := newCursor() + st := newPollState(false) + advanced, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out, silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("pollOnce: %v", err) + } + if !advanced { + t.Fatal("pollOnce over a DB with new rows reported advanced=false") + } + got := drainAll(out) + if n := countKind(got, canonical.EvSessionStarted); n != 1 { + t.Errorf("SessionStarted count = %d, want 1", n) + } + if n := countKind(got, canonical.EvSourceProgress); n < 1 { + t.Errorf("productive pollOnce emitted %d SourceProgress, want >= 1", n) + } + // The cursor must have advanced past the inserted rows: the monotonic + // high-water (which gates the second no-op poll's cheap MAX(id) check) reaches + // the inserted session id (SOW-0005 round-2 P1-A). + if cur.Tables["session"].MaxIDSeen != "ses_a" { + t.Errorf("cursor session MaxIDSeen = %q, want ses_a", cur.Tables["session"].MaxIDSeen) + } + + // A SECOND pollOnce over the now-current cursor is a no-op (advanced=false, + // nothing emitted) — proves the cheap MAX(id) gate closes after catch-up. + // + // Close the probe gate first (a recent probe, no WAL event) so this poll is a + // GENUINELY idle cycle: changed==false (cheap MAX(id) silent) AND the gate is + // closed, so neither the forward delta nor the round-7 boundary re-scan fires. + // (The first poll advanced via the cheap MAX(id) path, which short-circuits before + // the gated probe, so markProbe never ran and lastProbe is still zero — leaving the + // safety-net gate immediately due. 
Without closing it here, the boundary re-scan + // would idempotently re-emit ses_a's boundary bucket on the gate-open path, which is + // correct behaviour but not what THIS test pins — see TestP1_R7_* for that path.) + st.markProbe(time.Now()) + out2 := make(chan canonical.Event, 16) + adv2, err := pollOnce(ctxBG(), db, schema, &cur, "opencode:test", &st, out2, silentLogger(), func(error) {}) + if err != nil { + t.Fatalf("pollOnce (second): %v", err) + } + if adv2 { + t.Error("second pollOnce over an up-to-date cursor reported advanced=true") + } + if got := drainAll(out2); len(got) != 0 { + t.Errorf("second pollOnce emitted %d events, want 0", len(got)) + } +} + +// TestScanLoop_IntrospectFatal asserts scanLoop returns a fatal error (not a +// benign skip) when a tracked table is missing a required column — an +// incompatible schema must surface, not silently emit nothing. +func TestScanLoop_IntrospectFatal(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // Drop the message.data column dependency by replacing the message table with + // one lacking the required `data` column. 
+ if _, err := rw.Exec(`DROP TABLE message`); err != nil { + t.Fatalf("drop message: %v", err) + } + if _, err := rw.Exec(`CREATE TABLE message (id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL)`); err != nil { + t.Fatalf("recreate message: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + out := make(chan canonical.Event, 16) + var ce collectErrs + _, err := scanLoop(ctxBG(), path, "opencode:"+path, newCursor(), out, silentLogger(), ce.onError) + if err == nil { + t.Fatal("scanLoop over a schema missing a required column: want fatal error") + } +} + +// TestReloadAndEmit_GenericErrorPropagates pins the SOW-0005 round-7 P1-2 fix: a +// non-context, non-session-gone load error for one affected session PROPAGATES +// (returns) rather than being swallowed (logged + continue). The earlier code +// called onError(err); continue, which let commitBatch advance the cursor PAST +// rows whose content was never emitted — a permanent, health-invisible loss. +// Propagating the error keeps the checkpoint-after-emit invariant: the cursor is +// promoted only after every affected session's content is emitted. +// +// A closed DB makes loadSession itself error (generic, non-context), so +// reloadAndEmit must return that error. 
+func TestReloadAndEmit_GenericErrorPropagates(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + if err := db.Close(); err != nil { // closed → loadSession errors (generic) + t.Fatalf("close ro db: %v", err) + } + + out := make(chan canonical.Event, 16) + var ce collectErrs + err := reloadAndEmit(ctxBG(), db, schema, "opencode:test", []string{"ses_a"}, out, silentLogger(), ce.onError) + if err == nil { + t.Fatal("reloadAndEmit must PROPAGATE a generic load error (round-7 P1-2), not swallow it; got nil") + } +} + +// TestOrDefaultLogger covers the nil-guard scanLoop/tailLoop apply to the logger +// param so a direct test caller passing nil does not panic: nil yields a non-nil +// logger (slog.Default()), and a supplied logger is returned unchanged. +func TestOrDefaultLogger(t *testing.T) { + t.Parallel() + if got := orDefaultLogger(nil); got == nil { + t.Fatal("orDefaultLogger(nil) = nil, want a non-nil default logger") + } + custom := silentLogger() + if got := orDefaultLogger(custom); got != custom { + t.Errorf("orDefaultLogger(custom) returned a different logger, want the same instance") + } +} + +// TestTailLoop_WALHintWakesCycle exercises the tailLoop WAL-hint branch end to +// end: with a live WAL companion, a write to it (plus a new row) wakes a cycle +// faster than the idle cadence and the new session surfaces. +func TestTailLoop_WALHintWakesCycle(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_seed", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + // Reopen WAL-mode so a -wal companion exists for the watch. 
+ rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw (wal): %v", err) + } + defer func() { _ = rw2.Close() }() + + ctx, cancel := context.WithCancel(context.Background()) + out := make(chan canonical.Event, 4096) + var ce collectErrs + done := make(chan struct{}) + go func() { + defer close(done) + _ = tailLoop(ctx, path, "opencode:"+path, newCursor(), false, out, silentLogger(), ce.onError) + }() + defer func() { cancel(); <-done }() + + // Insert a new session; the WAL write fires the fsnotify hint. + insertSession(t, rw2, "ses_wal", "", 100, 100, 0) + if _, ok := waitForSession(out, "ses_wal", 8*time.Second); !ok { + t.Fatal("tailLoop did not surface a new session with a live WAL companion") + } +} diff --git a/internal/adapters/opencode/tailer_pollstate.go b/internal/adapters/opencode/tailer_pollstate.go new file mode 100644 index 0000000..15ca060 --- /dev/null +++ b/internal/adapters/opencode/tailer_pollstate.go @@ -0,0 +1,138 @@ +package opencode + +import ( + "context" + "database/sql" + "fmt" + "time" +) + +// This file holds the realtime poll-loop CADENCE STATE MACHINE (pollState) and the +// cursor-shaping helpers (coerceScanCursor / recordSchemaHash). Split out of +// tailer_changes.go to keep each file ≤400 lines (SOW-0005 round-5: the P2-1 +// post-tx warning flush grew loadAndMapSession). These are pure cadence/cursor +// concerns, distinct from the delta→emit→checkpoint pipeline in tailer_changes.go. + +// --- poll-loop cadence state machine ------------------------------------------ + +// pollState threads the realtime poll loop's cadence inputs: whether the last +// cycle was active (produced a change), the last WAL fsnotify event time, and +// the last MAX(time_updated) probe time. It is consulted by the gating predicate +// and the next-interval computation. Not safe for concurrent use; the tail loop +// owns one and mutates it single-threaded. 
+type pollState struct { + active bool + lastWALEvent time.Time + lastProbe time.Time + walFloorTill time.Time + // boundaryReal reports whether the cursor's current boundary ms is a position + // whose bucket was ALREADY emitted — so the round-7 P1-1 unified boundary re-scan + // (which runs before the forward delta whenever changed==true on ANY path OR the + // probe gate is open) may run without replaying a never-emitted cold-Tail HEAD + // snapshot. It is the SINGLE cold-`Tail` guard (round-7 P2-1): it gates the WHOLE + // trigger — both the changed path and the gate-open path — replacing the old, + // partial `priorProbe` flag that guarded only the changed==false path (and could + // be defeated by a cold Tail's first WAL-driven or safety-net probe). It starts + // true for a WARM Tail (resumed from a Scan cursor: Scan already emitted the + // boundary) and false for a COLD Tail (HEAD snapshot: follow-from-now, boundary + // never emitted); pollOnce flips it true once the cursor first advances (the new + // boundary is the just-emitted forward position, so re-scanning it is idempotent). + boundaryReal bool +} + +// newPollState returns the initial state: idle, no WAL event yet, and a zero +// lastProbe so the FIRST poll's gate is open (the safety net is immediately due, +// guaranteeing the initial cycle reconciles in-place mutations that arrived +// before tailing started). +// +// warmStart seeds boundaryReal (SOW-0005 round-6 P1; round-7 P2-1 makes it the +// single cold-Tail guard): a WARM Tail resumed from a Scan cursor inherits a +// boundary whose bucket Scan already emitted, so the boundary re-scan may run from +// the first poll; a COLD Tail (HEAD snapshot, follow-from-now) starts +// boundaryReal=false so it does NOT replay the never-emitted snapshot boundary on +// ANY path (the changed path, the gated-probe path, or the WAL-driven/safety-net +// gate-open path) until the cursor first advances — pollOnce flips boundaryReal +// true at that point. 
+func newPollState(warmStart bool) pollState { + return pollState{boundaryReal: warmStart} +} + +// markWALEvent records a WAL fsnotify event at t, opening the 250 ms floor window +// for walFloorWindow and the probe gate (lastWALEvent advances past lastProbe). +func (s *pollState) markWALEvent(t time.Time) { + s.lastWALEvent = t + s.walFloorTill = t.Add(walFloorWindow) +} + +// markProbe records that a MAX(time_updated) probe ran at t (advancing lastProbe +// re-closes the gate until the next WAL event or the next 60 s net tick). +func (s *pollState) markProbe(t time.Time) { + s.lastProbe = t +} + +// markCycle records the outcome of a poll cycle at t: active when it produced a +// change (switches to the 500 ms cadence), idle otherwise (2 s). +func (s *pollState) markCycle(advanced bool, _ time.Time) { s.active = advanced } + +// nextInterval returns the wait before the next poll: the active/idle base +// interval, floored to walFloorInterval while the WAL-event window is open. +func (s *pollState) nextInterval(now time.Time) time.Duration { + base := idlePollInterval + if s.active { + base = activePollInterval + } + if now.Before(s.walFloorTill) && base > walFloorInterval { + return walFloorInterval + } + return base +} + +// --- cursor shaping ----------------------------------------------------------- + +// coerceScanCursor normalises a cursor for use: a nil/zero Tables map is +// initialised and the version is set. The watermarks are NOT reset (column drift +// is handled per-column; only a depended-on column vanishing forces a re-ingest). +// The schema hash is recorded SEPARATELY by the poll loops via withSchemaHash +// after reading __drizzle_migrations (recordSchemaHash) — it is the REAL +// migration-name digest (schemaHash in migrations.go), replacing chunk C's +// interim present-column-shape fingerprint. Keeping the hash out of this function +// keeps it a pure cursor-shaping helper. Returns a ready-to-page cursor. 
+func coerceScanCursor(c Cursor) Cursor { + if c.Tables == nil { + c = newCursor() + } + if c.Version == 0 { + c.Version = cursorVersion + } + return c +} + +// recordSchemaHash reads __drizzle_migrations once and stamps the REAL +// migration-name schema hash (schemaHash) onto the cursor (adapter-opencode.md +// §"Cursor"). Called by scanLoop and tailLoop right after introspectAll, while +// the read-only DB is open. +// +// Mismatch behaviour (spec adapter-opencode.md §"Cursor"): when the incoming +// cursor already carries a different hash (opencode applied a migration between +// runs), the change is logged as a structured WARN via onError, the hash is +// re-read, and the loop CONTINUES without resetting watermarks — column drift is +// handled per-column by the dynamic SELECT, so a benign migration (a new column +// the adapter does not read, a data migration) never forces a re-ingest. A +// genuine read error (corrupt journal) is non-fatal here: the prior hash is kept +// and onError is notified, so the backfill/poll still proceeds. +func recordSchemaHash(ctx context.Context, db *sql.DB, c Cursor, onError func(error)) Cursor { + hash, err := readSchemaHash(ctx, db) + if err != nil { + onError(fmt.Errorf("opencode: read schema hash (keeping prior, continuing): %w", err)) + return c + } + if hash == "" { + // No __drizzle_migrations (foreign/old DB): leave any prior hash as-is; + // there is nothing authoritative to record. 
+ return c + } + if c.SchemaHash != "" && c.SchemaHash != hash { + onError(fmt.Errorf("opencode: schema hash changed (migration applied); re-reading, watermarks preserved: %.12s… → %.12s…", c.SchemaHash, hash)) + } + return c.withSchemaHash(hash) +} diff --git a/internal/adapters/opencode/tailer_resume_test.go b/internal/adapters/opencode/tailer_resume_test.go new file mode 100644 index 0000000..5bf2294 --- /dev/null +++ b/internal/adapters/opencode/tailer_resume_test.go @@ -0,0 +1,284 @@ +package opencode + +import ( + "context" + "database/sql" + "sort" + "strconv" + "strings" + "testing" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the resume invariant (AC#6 durability): a scanLoop over the +// first half of a fixture, its cursor persisted+reparsed, then a scanLoop over +// the rest from that cursor, yields the SAME set of canonical content events as +// a single cold scanLoop over the whole fixture — zero duplicates, zero gaps +// (modulo SourceProgress checkpoints, which are not content). + +// eventFingerprint renders a content event into a stable string for set +// comparison. SourceProgress is excluded by the caller (it is a checkpoint, not +// content). The fingerprint captures the kind + the identifying fields so a +// duplicate or a missing event changes the multiset. 
+func eventFingerprint(ev canonical.Event) string { + switch e := ev.(type) { + case canonical.SessionStartedEvent: + return "session_started|" + e.NativeID + "|" + string(e.Kind) + case canonical.SessionFinalizedEvent: + return "session_finalized|" + e.NativeID + "|" + string(e.Status) + case canonical.TurnStartedEvent: + return fp("turn_started", e.SessionNativeID, e.Seq) + case canonical.TurnFinalizedEvent: + return fp("turn_finalized", e.SessionNativeID, e.Seq) + case canonical.OpStartedEvent: + return fp("op_started", e.SessionNativeID, e.TurnSeq) + "|" + string(e.Kind) + "|" + strconv.Itoa(e.Seq) + case canonical.OpFinalizedEvent: + return fp("op_finalized", e.SessionNativeID, e.TurnSeq) + "|" + strconv.Itoa(e.Seq) + case canonical.PayloadRefEvent: + return fp("payload_ref", e.SessionNativeID, e.TurnSeq) + "|" + strconv.Itoa(e.OpSeq) + "|" + e.LocationURI + case canonical.LogEntryEvent: + return "log|" + e.SessionNativeID + "|" + e.Severity + "|" + e.Message + default: + return string(ev.EventKind()) + } +} + +func fp(kind, sid string, seq int) string { return kind + "|" + sid + "|" + strconv.Itoa(seq) } + +// contentFingerprints returns the sorted content-event fingerprints (excluding +// SourceProgress) of an event slice — the multiset to compare across runs. +func contentFingerprints(evs []canonical.Event) []string { + var out []string + for _, ev := range evs { + if ev.EventKind() == canonical.EvSourceProgress { + continue + } + out = append(out, eventFingerprint(ev)) + } + sort.Strings(out) + return out +} + +// TestScanLoop_ResumeZeroDupesZeroGaps is the resume proof. It builds a fixture, +// scans the WHOLE thing cold (baseline), then on a FRESH copy scans the first +// half (by inserting half, scanning, persisting the cursor), inserts the rest, +// reparses the cursor, and scans again — asserting the union of content events +// from the two-part run equals the cold-run baseline. 
+func TestScanLoop_ResumeZeroDupesZeroGaps(t *testing.T) { + t.Parallel() + + // Baseline: one DB with all 6 sessions, scanned cold. + baselinePath := seedBackfillDB(t, t.TempDir(), 6) + baseOut := make(chan canonical.Event, 8192) + var ce0 collectErrs + if _, err := scanLoop(ctxBG(), baselinePath, "opencode:x", newCursor(), baseOut, silentLogger(), ce0.onError); err != nil { + t.Fatalf("baseline scanLoop: %v", err) + } + baseline := contentFingerprints(drainAll(baseOut)) + + // Two-part run: build a SEPARATE DB, seed half, scan, persist+reparse cursor, + // seed the rest, scan from the reparsed cursor. + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + seedSessionsInto(t, rw, 1, 3) // sessions 1..3 + if err := rw.Close(); err != nil { + t.Fatalf("close rw (first half): %v", err) + } + + out1 := make(chan canonical.Event, 8192) + var ce1 collectErrs + cur1, err := scanLoop(ctxBG(), path, "opencode:x", newCursor(), out1, silentLogger(), ce1.onError) + if err != nil { + t.Fatalf("first-half scanLoop: %v", err) + } + part1 := contentFingerprints(drainAll(out1)) + + // Persist + reparse the cursor (the durable round-trip). + stored := cur1.String() + reparsed, err := ParseCursor(stored) + if err != nil { + t.Fatalf("ParseCursor(%q): %v", stored, err) + } + + // Seed the rest (sessions 4..6) into the SAME DB. 
+ rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw (second half): %v", err) + } + seedSessionsInto(t, rw2, 4, 6) + if err := rw2.Close(); err != nil { + t.Fatalf("close rw (second half): %v", err) + } + + out2 := make(chan canonical.Event, 8192) + var ce2 collectErrs + if _, err := scanLoop(ctxBG(), path, "opencode:x", reparsed, out2, silentLogger(), ce2.onError); err != nil { + t.Fatalf("second-half scanLoop: %v", err) + } + part2 := contentFingerprints(drainAll(out2)) + + // Union of the two parts must equal the cold baseline: zero gaps (every + // baseline event present) and zero dupes (no event present more than the + // baseline multiplicity). + union := append(append([]string{}, part1...), part2...) + sort.Strings(union) + + if diff := multisetDiff(baseline, union); diff != "" { + t.Fatalf("resume content events differ from cold baseline:\n%s", diff) + } +} + +// TestProcessChanges_CheckpointAfterEmit_NoLoss is the P1.1 data-loss regression. +// It forces MORE THAN ONE batch (>progressEveryRows session rows) and CANCELS the +// context right after the FIRST batch's SourceProgress checkpoint. It then asserts +// the checkpoint-after-emit invariant: the returned (persisted) cursor covers +// ONLY sessions whose content was emitted, so a RESUME from it re-emits every +// not-yet-emitted session — the union of run-1 + resume is the COMPLETE session +// set, zero skipped. The pre-P1.1 code advanced the watermark mid-paging BEFORE +// emitting, so a cancel here would have persisted a cursor past un-emitted +// sessions → permanent loss; this test fails against that behaviour. 
+func TestProcessChanges_CheckpointAfterEmit_NoLoss(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + + // Seed > progressEveryRows session rows (each a bare session — no messages or + // parts) so the FIRST batch's shared budget is spent entirely within the + // session table, leaving the last session for a SECOND batch. + const total = progressEveryRows + 5 + tx, _ := rw.Begin() + stmt, _ := tx.Prepare(`INSERT INTO session (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) + VALUES (?,?,?,?,?,?,?,?,?,?,?,?)`) + for i := 1; i <= total; i++ { + if _, err := stmt.Exec(fmtID("ses", i), "prj", "", "slug", "/w", "T", "9.9.9", "a", "", int64(i), int64(i), nil); err != nil { + t.Fatalf("bulk insert session %d: %v", i, err) + } + } + _ = stmt.Close() + if err := tx.Commit(); err != nil { + t.Fatalf("commit: %v", err) + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + db, schema := introspect(t, path) + + // run-1: a consumer that records SessionStarted ids and cancels the context as + // soon as it sees the FIRST SourceProgress (the first batch's checkpoint), so + // the run stops AFTER batch-1 committed but BEFORE batch-2 emits. 
+ ctx, cancel := context.WithCancel(ctxBG()) + defer cancel() + out := make(chan canonical.Event, 8192) + run1Seen := map[string]bool{} + doneConsume := make(chan struct{}) + go func() { + defer close(doneConsume) + sawProgress := false + for ev := range out { + if s, ok := ev.(canonical.SessionStartedEvent); ok { + run1Seen[s.NativeID] = true + } + if ev.EventKind() == canonical.EvSourceProgress && !sawProgress { + sawProgress = true + cancel() // stop the producer right after the first checkpoint + } + } + }() + + cur1, _, _ := processChanges(ctx, db, schema, newCursor(), "opencode:x", out, silentLogger(), func(error) {}) + close(out) + <-doneConsume + + // run-1 must NOT have emitted every session (the cancel cut it short) — else + // the test is not exercising the resume path. + if len(run1Seen) >= total { + t.Fatalf("run-1 emitted all %d sessions; cancel did not cut the run short", total) + } + + // Persist + reparse the cursor (the durable round-trip a daemon restart does). + reparsed, err := ParseCursor(cur1.String()) + if err != nil { + t.Fatalf("ParseCursor: %v", err) + } + + // Resume from the persisted cursor with a fresh (uncancelled) context. + out2 := make(chan canonical.Event, 8192) + if _, _, err := processChanges(ctxBG(), db, schema, reparsed, "opencode:x", out2, silentLogger(), func(error) {}); err != nil { + t.Fatalf("resume processChanges: %v", err) + } + for _, ev := range drainAll(out2) { + if s, ok := ev.(canonical.SessionStartedEvent); ok { + run1Seen[s.NativeID] = true + } + } + + // Zero loss: the UNION of run-1 + resume must cover EVERY session. A session + // missing here would be one whose content was never emitted yet whose row the + // persisted cursor had advanced past — the data-loss bug P1.1 fixes. 
+ if len(run1Seen) != total { + t.Fatalf("union of run-1 + resume covered %d sessions, want all %d (zero loss)", len(run1Seen), total) + } +} + +// seedSessionsInto inserts sessions [lo, hi] (1-based, inclusive) into rw with +// the SAME per-session structure and ids seedBackfillDB uses (ses_/msg_, +// tokens 10*i/5*i, one step-start/step-finish/text triple), so the content +// fingerprints match the cold baseline regardless of absolute timestamps. Times +// are derived from the index so they are globally monotonic. +func seedSessionsInto(t *testing.T, rw *sql.DB, lo, hi int) { + t.Helper() + for i := lo; i <= hi; i++ { + sid := fmtID("ses", i) + mid := fmtID("msg", i) + ts := int64(1000 + i*10) + insertSession(t, rw, sid, "", ts, ts, 0) + insertAssistantMessage(t, rw, mid, sid, ts+1, ts+1, int64(10*i), int64(5*i)) + insertPart(t, rw, fmtID("prt_ss", i), mid, sid, ts+2, ts+2, stepStartBody()) + insertPart(t, rw, fmtID("prt_sf", i), mid, sid, ts+3, ts+3, stepFinishBody(int64(10*i), int64(5*i), 0.01)) + insertPart(t, rw, fmtID("prt_tx", i), mid, sid, ts+4, ts+4, textBody("answer")) + } +} + +// multisetDiff returns "" when sorted multisets a and b are equal, else a +// human-readable description of the first difference. +func multisetDiff(a, b []string) string { + if len(a) != len(b) { + var sb strings.Builder + sb.WriteString("length: baseline=") + sb.WriteString(strconv.Itoa(len(a))) + sb.WriteString(" resume=") + sb.WriteString(strconv.Itoa(len(b))) + sb.WriteString("\n") + sb.WriteString(firstMismatch(a, b)) + return sb.String() + } + for i := range a { + if a[i] != b[i] { + return "at " + strconv.Itoa(i) + ": baseline=" + a[i] + " resume=" + b[i] + } + } + return "" +} + +// firstMismatch reports the first element present in one slice but not at the +// same position in the other (both are sorted). 
+func firstMismatch(a, b []string) string { + n := len(a) + if len(b) < n { + n = len(b) + } + for i := 0; i < n; i++ { + if a[i] != b[i] { + return "first diff @ " + strconv.Itoa(i) + ": baseline=" + a[i] + " resume=" + b[i] + } + } + if len(a) > n { + return "baseline has extra: " + a[n] + } + if len(b) > n { + return "resume has extra: " + b[n] + } + return "" +} diff --git a/internal/adapters/opencode/tailer_test.go b/internal/adapters/opencode/tailer_test.go new file mode 100644 index 0000000..e78643a --- /dev/null +++ b/internal/adapters/opencode/tailer_test.go @@ -0,0 +1,299 @@ +package opencode + +import ( + "context" + "path/filepath" + "sync" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file pins the poll-loop tailer: scanLoop backfill (events + progress + +// final watermarks), ctx-cancel, missing DB; tailLoop one-cycle pickup, +// ctx-cancel, missing-WAL; the counting-driver no-idle-MAX(time_updated) +// property; and the resume zero-dupes/zero-gaps invariant. + +// collectErrs is a concurrency-safe onError sink. +type collectErrs struct { + mu sync.Mutex + errs []error +} + +func (c *collectErrs) onError(e error) { + c.mu.Lock() + c.errs = append(c.errs, e) + c.mu.Unlock() +} + +func (c *collectErrs) count() int { + c.mu.Lock() + defer c.mu.Unlock() + return len(c.errs) +} + +// drainAll reads every event currently buffered on out (non-blocking) into a +// slice. Used after a bounded scanLoop completes. +func drainAll(out chan canonical.Event) []canonical.Event { + var got []canonical.Event + for { + select { + case ev := <-out: + got = append(got, ev) + default: + return got + } + } +} + +// seedBackfillDB builds a DB with n root sessions, each with one assistant +// message carrying one step-start/step-finish/text part triple, and returns the +// path. Times are monotonic across sessions so watermarks are unambiguous. 
+func seedBackfillDB(t *testing.T, dir string, n int) string { + t.Helper() + path, rw := newEmptyDB(t, dir, "opencode.db") + ts := int64(1000) + for i := 1; i <= n; i++ { + sid := fmtID("ses", i) + mid := fmtID("msg", i) + insertSession(t, rw, sid, "", ts, ts, 0) + ts++ + insertAssistantMessage(t, rw, mid, sid, ts, ts, int64(10*i), int64(5*i)) + insertPart(t, rw, fmtID("prt_ss", i), mid, sid, ts, ts, stepStartBody()) + ts++ + insertPart(t, rw, fmtID("prt_sf", i), mid, sid, ts, ts, stepFinishBody(int64(10*i), int64(5*i), 0.01)) + ts++ + insertPart(t, rw, fmtID("prt_tx", i), mid, sid, ts, ts, textBody("answer")) + ts++ + } + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + return path +} + +// TestScanLoop_BackfillEmitsAll asserts a cold scanLoop over N sessions emits a +// SessionStarted per session, at least one SourceProgress, and returns a cursor +// whose per-table watermarks equal the DB maxima. +func TestScanLoop_BackfillEmitsAll(t *testing.T) { + t.Parallel() + const n = 3 + path := seedBackfillDB(t, t.TempDir(), n) + + out := make(chan canonical.Event, 4096) + var ce collectErrs + cur, err := scanLoop(ctxBG(), path, "opencode:"+path, newCursor(), out, silentLogger(), ce.onError) + if err != nil { + t.Fatalf("scanLoop: %v", err) + } + got := drainAll(out) + + if c := countKind(got, canonical.EvSessionStarted); c != n { + t.Errorf("SessionStarted count = %d, want %d", c, n) + } + if c := countKind(got, canonical.EvSourceProgress); c < 1 { + t.Errorf("SourceProgress count = %d, want >= 1", c) + } + if c := countKind(got, canonical.EvTurnFinalized); c != n { + t.Errorf("TurnFinalized count = %d, want %d (one assistant turn per session)", c, n) + } + if ce.count() != 0 { + t.Errorf("backfill surfaced %d errors, want 0", ce.count()) + } + + // Final cursor watermarks equal the DB maxima for each table. 
+ db, _ := introspect(t, path) + for _, table := range trackedTables { + wantMaxID, _ := maxID(ctxBG(), db, table) + wantMaxTU, _ := maxTimeUpdated(ctxBG(), db, table) + w := cur.Tables[table] + // After a full backfill the monotonic high-water AND the (time_updated, id) + // paging-position id both reach the DB's MAX(id) (the fixture is monotonic, + // so the last-paged row carries the greatest id) (SOW-0005 round-2 P1-A). + if w.MaxIDSeen != wantMaxID { + t.Errorf("table %q cursor MaxIDSeen = %q, want DB max %q", table, w.MaxIDSeen, wantMaxID) + } + if w.MaxTimeUpdatedID != wantMaxID { + t.Errorf("table %q cursor MaxTimeUpdatedID = %q, want DB max %q", table, w.MaxTimeUpdatedID, wantMaxID) + } + if w.MaxTimeUpdatedMs != wantMaxTU { + t.Errorf("table %q cursor MaxTimeUpdatedMs = %d, want DB max %d", table, w.MaxTimeUpdatedMs, wantMaxTU) + } + } +} + +// TestScanLoop_MissingDBBenign asserts a missing DB file surfaces one error and +// returns (since, nil) so the daemon keeps serving other sources. +func TestScanLoop_MissingDBBenign(t *testing.T) { + t.Parallel() + missing := filepath.Join(t.TempDir(), "no-such.db") + out := make(chan canonical.Event, 4) + var ce collectErrs + cur, err := scanLoop(ctxBG(), missing, "opencode:"+missing, newCursor(), out, silentLogger(), ce.onError) + if err != nil { + t.Fatalf("scanLoop(missing) = %v, want nil", err) + } + if ce.count() == 0 { + t.Error("missing DB should surface one error") + } + if cur.hasProgress() { + t.Error("missing DB cursor should have no progress") + } +} + +// TestScanLoop_CtxCancelMidScan asserts a cancelled ctx returns ctx.Err() and +// does not deadlock on an UNbuffered channel (the send must observe ctx.Done()). +func TestScanLoop_CtxCancelMidScan(t *testing.T) { + t.Parallel() + path := seedBackfillDB(t, t.TempDir(), 50) + + ctx, cancel := context.WithCancel(context.Background()) + cancel() // cancel up front: the first ctx-aware send/Err must bail. 
+ + out := make(chan canonical.Event) // unbuffered: a non-ctx-aware send would hang + var ce collectErrs + done := make(chan error, 1) + go func() { + _, err := scanLoop(ctx, path, "opencode:"+path, newCursor(), out, silentLogger(), ce.onError) + done <- err + }() + select { + case err := <-done: + if err != nil && !isContextErr(err) { + t.Fatalf("scanLoop(cancelled) = %v, want nil or context error", err) + } + case <-time.After(5 * time.Second): + t.Fatal("scanLoop did not return after ctx cancel (deadlock on channel?)") + } +} + +// TestTailLoop_PicksUpNewSession runs tailLoop with a drained channel, inserts a +// new session+turn AFTER the loop starts, and asserts the new session's events + +// a SourceProgress are emitted. A missing-WAL DB still tails via the timer. +func TestTailLoop_PicksUpNewSession(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + // Seed one session so the initial cursor is non-empty; close so the WAL + // flushes and (importantly) there is NO opencode.db-wal companion → the tail + // must fall back to pure timer polling. + insertSession(t, rw, "ses_seed", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + + // Cold cursor → the first cycle backfills the seed; we then add a new session. + ctx, cancel := context.WithCancel(context.Background()) + out := make(chan canonical.Event, 4096) + var ce collectErrs + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer wg.Done() + _ = tailLoop(ctx, path, "opencode:"+path, newCursor(), false, out, silentLogger(), ce.onError) + }() + // ONE combined teardown: cancel FIRST, then wait. A separate `defer cancel()` + // + `defer wg.Wait()` would run LIFO (wait before cancel) and deadlock — the + // loop never gets cancelled while wg.Wait blocks on it. + defer func() { cancel(); wg.Wait() }() + + // Reopen rw to add a new session AFTER the loop is polling. 
+ rw2, err := openRWAgain(t, path) + if err != nil { + t.Fatalf("reopen rw: %v", err) + } + defer func() { _ = rw2.Close() }() + insertSession(t, rw2, "ses_new", "", 100, 100, 0) + insertAssistantMessage(t, rw2, "msg_new", "ses_new", 110, 110, 7, 3) + + // Within a few idle polls (2 s cadence + 60 s net not needed: MAX(id) catches + // the INSERT) the new session must surface. + if _, ok := waitForSession(out, "ses_new", 8*time.Second); !ok { + t.Fatal("tailLoop did not emit the new session within the deadline") + } + // The SourceProgress checkpoint is emitted AFTER the session's events in the + // same productive cycle (pollOnce → emitProgress), so drain forward for it + // rather than asserting on the slice captured up to the SessionStarted. + if _, ok := waitForEventKind(out, canonical.EvSourceProgress, 5*time.Second); !ok { + t.Error("tail cycle did not emit a SourceProgress checkpoint after the new session") + } +} + +// waitForEventKind drains out until an event of the given kind appears or the +// deadline elapses. +func waitForEventKind(out chan canonical.Event, kind canonical.EventKind, d time.Duration) ([]canonical.Event, bool) { + deadline := time.After(d) + var got []canonical.Event + for { + select { + case ev := <-out: + got = append(got, ev) + if ev.EventKind() == kind { + return got, true + } + case <-deadline: + return got, false + } + } +} + +// TestTailLoop_CtxCancelReturnsNil asserts tailLoop returns nil promptly on ctx +// cancel. +func TestTailLoop_CtxCancelReturnsNil(t *testing.T) { + t.Parallel() + path := seedBackfillDB(t, t.TempDir(), 1) + ctx, cancel := context.WithCancel(context.Background()) + out := make(chan canonical.Event, 4096) + var ce collectErrs + done := make(chan error, 1) + go func() { + done <- tailLoop(ctx, path, "opencode:"+path, newCursor(), false, out, silentLogger(), ce.onError) + }() + // Let it establish + run one cycle, then cancel. 
+ time.Sleep(200 * time.Millisecond) + cancel() + select { + case err := <-done: + if err != nil { + t.Fatalf("tailLoop(cancelled) = %v, want nil", err) + } + case <-time.After(5 * time.Second): + t.Fatal("tailLoop did not return after ctx cancel") + } +} + +// TestTailLoop_MissingDBBenign asserts a missing DB surfaces one error and +// returns nil (the daemon keeps running for other sources). +func TestTailLoop_MissingDBBenign(t *testing.T) { + t.Parallel() + missing := filepath.Join(t.TempDir(), "no-such.db") + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second) + defer cancel() + out := make(chan canonical.Event, 4) + var ce collectErrs + if err := tailLoop(ctx, missing, "opencode:"+missing, newCursor(), false, out, silentLogger(), ce.onError); err != nil { + t.Fatalf("tailLoop(missing) = %v, want nil", err) + } + if ce.count() == 0 { + t.Error("missing DB should surface one error") + } +} + +// waitForSession drains out until a SessionStarted with the given native id +// appears or the deadline elapses. +func waitForSession(out chan canonical.Event, nativeID string, d time.Duration) ([]canonical.Event, bool) { + deadline := time.After(d) + var got []canonical.Event + for { + select { + case ev := <-out: + got = append(got, ev) + if s, ok := ev.(canonical.SessionStartedEvent); ok && s.NativeID == nativeID { + return got, true + } + case <-deadline: + return got, false + } + } +} diff --git a/internal/adapters/opencode/tailer_wal.go b/internal/adapters/opencode/tailer_wal.go new file mode 100644 index 0000000..0bb1000 --- /dev/null +++ b/internal/adapters/opencode/tailer_wal.go @@ -0,0 +1,120 @@ +package opencode + +import ( + "fmt" + "path/filepath" + "sync" + "time" + + "github.com/fsnotify/fsnotify" +) + +// This file holds the WAL fsnotify WAKEUP-HINT machinery and the timer-reset +// idiom the realtime poll loop (tailLoop in tailer.go) uses. 
Split out of +// tailer.go to keep each file ≤400 lines (SOW-0005 round-2; the P1-A/P3-C comment +// expansions pushed tailer.go over budget). None of this is authoritative change +// detection: the hint only nudges the cadence; the MAX(id)/MAX(time_updated) +// probes (tailer.go) decide what actually changed. + +// watchWAL sets up a best-effort fsnotify watch on the opencode.db-wal companion +// path and returns a channel that fires (a bare struct{}) on each +// Write/Chmod/Create event, plus a close func. It is a WAKEUP HINT ONLY: a +// watcher-creation or Add failure is reported once via onError and yields a +// closed channel so the caller falls back to pure timer polling. The watch is on +// the PARENT directory (the WAL file may not exist yet, and watching a +// not-yet-existing file fails); events are filtered to the WAL basename. +func watchWAL(dbPath string, onError func(error)) (<-chan struct{}, func()) { + walPath := dbPath + "-wal" + dir := filepath.Dir(walPath) + walBase := filepath.Base(walPath) + + watcher, err := fsnotify.NewWatcher() + if err != nil { + onError(fmt.Errorf("opencode: wal watcher (falling back to timer polling): %w", err)) + return closedHintChan(), func() {} + } + if aerr := watcher.Add(dir); aerr != nil { + onError(fmt.Errorf("opencode: watch wal dir %s (falling back to timer polling): %w", dir, aerr)) + _ = watcher.Close() + return closedHintChan(), func() {} + } + + hint := make(chan struct{}, 1) + done := make(chan struct{}) + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer wg.Done() + defer close(hint) + for { + select { + case <-done: + return + case ev, ok := <-watcher.Events: + if !ok { + return + } + if filepath.Base(ev.Name) != walBase { + continue + } + if ev.Op&(fsnotify.Write|fsnotify.Chmod|fsnotify.Create) == 0 { + continue + } + // Non-blocking notify: a pending hint already wakes the next poll. 
+ select { + case hint <- struct{}{}: + default: + } + case werr, ok := <-watcher.Errors: + if !ok { + return + } + // A watcher error never terminates the tail loop; report and + // keep the watch (or let the timer net carry it). + onError(fmt.Errorf("opencode: wal watcher error: %w", werr)) + } + } + }() + + // closeWatch stops the goroutine and WAITS for it to exit before returning + // (SOW-0005 round-7 P2-2). The watcher goroutine may call onError (a watcher + // error), i.e. a send on the adapter's out channel; if closeWatch returned + // before the goroutine exited, the source goroutine could then close(events) + // while this goroutine was still sending on it → a send-on-closed-channel + // panic. Signalling `done` AND closing the watcher unblocks the select; the + // WaitGroup makes the goroutine's exit a happens-before of closeWatch's return + // (Tail's `defer closeWatch()` therefore guarantees the watcher goroutine is + // dead before Tail returns and the source closes events). It is idempotent: + // `close(done)` is guarded by sync.Once so a double call (e.g. an explicit call + // plus the deferred one) does not panic. + var once sync.Once + closeWatch := func() { + once.Do(func() { + close(done) + _ = watcher.Close() + }) + wg.Wait() + } + return hint, closeWatch +} + +// closedHintChan returns an already-closed hint channel, used when the WAL watch +// cannot be established so the tail loop falls back to pure timer polling. +func closedHintChan() <-chan struct{} { + ch := make(chan struct{}) + close(ch) + return ch +} + +// resetTimer safely resets a timer to fire after d, draining any pending fire so +// the next select sees only the new deadline (the standard time.Timer reset +// idiom). 
+func resetTimer(t *time.Timer, d time.Duration) { + if !t.Stop() { + select { + case <-t.C: + default: + } + } + t.Reset(d) +} diff --git a/internal/adapters/opencode/tailer_wal_test.go b/internal/adapters/opencode/tailer_wal_test.go new file mode 100644 index 0000000..e3606c2 --- /dev/null +++ b/internal/adapters/opencode/tailer_wal_test.go @@ -0,0 +1,149 @@ +package opencode + +import ( + "os" + "path/filepath" + "testing" + "time" + + "github.com/netdata/ai-viewer/internal/canonical" +) + +// This file covers the WAL fsnotify hint (success + missing-dir fallback) and the +// SQL error paths of the delta/load layer (a closed DB surfaces errors rather +// than panicking) — the branches the happy-path and pure-helper tests don't hit. + +// TestWatchWAL_FiresOnWrite asserts the WAL watch delivers a hint when the +// companion opencode.db-wal file is written after the watch is established. +func TestWatchWAL_FiresOnWrite(t *testing.T) { + t.Parallel() + dir := t.TempDir() + dbPath := filepath.Join(dir, "opencode.db") + walPath := dbPath + "-wal" + // Create the DB file (so its dir exists) and an initial empty WAL companion. + if err := os.WriteFile(dbPath, []byte("x"), 0o600); err != nil { + t.Fatalf("write db: %v", err) + } + if err := os.WriteFile(walPath, []byte{}, 0o600); err != nil { + t.Fatalf("write wal: %v", err) + } + + var ce collectErrs + hint, closeWatch := watchWAL(dbPath, ce.onError) + defer closeWatch() + + // Append to the WAL to fire a Write event on opencode.db-wal. + time.Sleep(100 * time.Millisecond) // let the watch establish + f, err := os.OpenFile(walPath, os.O_APPEND|os.O_WRONLY, 0o600) + if err != nil { + t.Fatalf("open wal for append: %v", err) + } + _, _ = f.WriteString("frame") + _ = f.Close() + + select { + case <-hint: + // Got the wakeup hint. 
+ case <-time.After(5 * time.Second): + t.Fatal("WAL write did not deliver a wakeup hint") + } + if ce.count() != 0 { + t.Errorf("watchWAL on a present WAL surfaced %d errors, want 0", ce.count()) + } +} + +// TestWatchWAL_MissingDirFallsBack asserts that when the WAL's parent directory +// does not exist, watchWAL reports one error and returns a closed channel (the +// caller falls back to pure timer polling) without panicking. +func TestWatchWAL_MissingDirFallsBack(t *testing.T) { + t.Parallel() + missingDir := filepath.Join(t.TempDir(), "no-such-dir") + dbPath := filepath.Join(missingDir, "opencode.db") + + var ce collectErrs + hint, closeWatch := watchWAL(dbPath, ce.onError) + defer closeWatch() + + // The channel must be closed (drains immediately) so the select in tailLoop + // nils it and falls back to the timer. + select { + case _, ok := <-hint: + if ok { + t.Error("expected a closed hint channel for a missing WAL dir") + } + case <-time.After(2 * time.Second): + t.Error("missing-dir hint channel did not drain as closed") + } + if ce.count() == 0 { + t.Error("watchWAL on a missing dir should surface one error") + } +} + +// TestClosedHintChan asserts the helper returns an already-closed channel. +func TestClosedHintChan(t *testing.T) { + t.Parallel() + ch := closedHintChan() + select { + case _, ok := <-ch: + if ok { + t.Error("closedHintChan returned an open channel") + } + default: + t.Error("closedHintChan channel did not drain immediately (not closed)") + } +} + +// TestQueryLayer_ClosedDBErrors asserts the cheap probes and the delta scan +// surface an error (not a panic) when the DB handle is closed mid-flight — the +// SQL error branches the happy path skips. 
+func TestQueryLayer_ClosedDBErrors(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + // Close the read-only handle so subsequent queries error. + if err := db.Close(); err != nil { + t.Fatalf("close ro db: %v", err) + } + + if _, err := maxID(ctxBG(), db, "session"); err == nil { + t.Error("maxID on a closed DB: want error") + } + if _, err := maxTimeUpdated(ctxBG(), db, "session"); err == nil { + t.Error("maxTimeUpdated on a closed DB: want error") + } + scan, _ := scanSessionRow(newColumnIndex(schema["session"]), len(schema["session"].Present), nil) + if _, err := scanTableDelta(ctxBG(), db, schema["session"], TableWatermark{}, scan, &warnSink{}, nil); err == nil { + t.Error("scanTableDelta on a closed DB: want error") + } + if _, _, err := loadSession(ctxBG(), db, schema, "ses_a", func(error) {}); err == nil { + t.Error("loadSession on a closed DB: want error") + } + if _, err := loadSessionTree(ctxBG(), db, schema, "ses_a", func(error) {}); err == nil { + t.Error("loadSessionTree on a closed DB: want error") + } +} + +// TestProcessChanges_CollectError asserts processChanges propagates a delta-scan +// error (closed DB) rather than swallowing it. 
+func TestProcessChanges_CollectError(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path, rw := newEmptyDB(t, dir, "opencode.db") + insertSession(t, rw, "ses_a", "", 1, 1, 0) + if err := rw.Close(); err != nil { + t.Fatalf("close rw: %v", err) + } + db, schema := introspect(t, path) + if err := db.Close(); err != nil { + t.Fatalf("close ro db: %v", err) + } + out := make(chan canonical.Event, 16) + if _, _, err := processChanges(ctxBG(), db, schema, newCursor(), "opencode:test", out, silentLogger(), func(error) {}); err == nil { + t.Error("processChanges over a closed DB: want error") + } +} diff --git a/internal/adapters/opencode/types.go b/internal/adapters/opencode/types.go new file mode 100644 index 0000000..ecb7d2c --- /dev/null +++ b/internal/adapters/opencode/types.go @@ -0,0 +1,391 @@ +package opencode + +import ( + "bytes" + "encoding/json" + "errors" +) + +// errEmptyData marks a message.data or part.data blob that is empty or +// whitespace-only. The columns are NOT NULL in opencode's schema, so an empty +// body is a corruption signal the caller surfaces as one structured error and +// skips, rather than decoding into a misleading zero value. +var errEmptyData = errors.New("opencode: empty data blob") + +// This file defines the typed row structs for the four opencode tables and +// the discriminated JSON bodies carried in message.data and part.data. Only +// the load-bearing fields the later mapper consumes are decoded; every struct +// tolerates unknown sibling fields (encoding/json drops them) and unknown +// discriminator values (decode helpers return an "unknown" marker rather than +// erroring), so a newer opencode schema never hard-fails ingest +// (adapter-opencode.md §"Edge Cases" #1, #11; §"session_message"). +// +// Times in these structs are opencode's native MILLISECONDS since the epoch. +// The mapper multiplies by 1000 to reach canonical microseconds; this file +// never converts (adapter-opencode.md §"Edge Cases" #6). 
+ +// sessionRow is one row of the session table. Columns added by later +// migrations (path/agent/model/cost/tokens_*/time_archived/...) are pointers +// or zero-valued so an older-schema row that lacks them decodes cleanly; the +// dynamic SELECT (schema.go) only ever names columns that exist. +type sessionRow struct { + ID string + ProjectID string + ParentID string // empty => root session; set => sub-agent + Slug string + Directory string // sensitive: working dir at session start + Title string // sensitive: operator-facing title + Version string // opencode CLI version that wrote the row + Agent string // agent name (e.g. "code-reviewer"); may be empty + Model []byte // raw JSON {"id","providerID","variant?"} or nil + Cost float64 + TokensInput int64 + TokensOutput int64 + TokensReason int64 + TokensCacheRd int64 + TokensCacheWr int64 + TimeCreatedMs int64 + TimeUpdatedMs int64 + TimeArchivedMs int64 // 0 => not archived; set => SessionFinalized completed + // TimeCompactingMs is non-zero WHILE a compaction is running on this session + // (opencode sets session.time_compacting). The tailer PAUSES emitting a + // session's tree while this is non-zero — compaction reshapes message/part rows, + // so reading mid-compaction would emit partial/stale content. It re-emits once + // the column clears (its time_updated bumps, re-surfacing the session in a later + // delta) (adapter-opencode.md §"Edge Cases" #8; SOW-0005 round-2 P2-E). + TimeCompactingMs int64 +} + +// messageRow is one row of the message table. data is the raw discriminated +// union (user | assistant) decoded by decodeMessageData. +type messageRow struct { + ID string + SessionID string + TimeCreatedMs int64 + TimeUpdatedMs int64 + Data []byte +} + +// partRow is one row of the part table. data is the raw discriminated union +// (12 variants on $.type) decoded by decodePartData. 
+type partRow struct { + ID string + MessageID string + SessionID string + TimeCreatedMs int64 + TimeUpdatedMs int64 + Data []byte +} + +// isStepFinish reports whether the part's body is a step-finish part, by peeking +// ONLY at its $.type discriminator (a cheap decode of one field, not the whole +// body). The turn-finalize predicate (turnIsTerminal) needs to know a message +// carries ≥1 step-finish part even when the part's tokens fail to decode, so this +// keys on type presence, not on the full partData decode succeeding. A malformed +// body yields false (it has no recognizable type). +func (p partRow) isStepFinish() bool { + return peekPartType(p.Data) == partStepFinish +} + +// peekPartType decodes ONLY the $.type field of a part.data blob, returning the +// classified partType (partUnknown for a malformed/typeless body). It is the +// minimal-cost type probe the turn-finalize predicate uses, avoiding a second +// full decodePartData on every part. +func peekPartType(raw []byte) partType { + if len(bytes.TrimSpace(raw)) == 0 { + return partUnknown + } + var d struct { + Type string `json:"type"` + } + if json.Unmarshal(raw, &d) != nil { + return partUnknown + } + return classifyPartType(d.Type) +} + +// sessionMessageRow is one row of the session_message sidecar (agent/model +// switches). type is the discriminator; data is its raw body. +type sessionMessageRow struct { + ID string + SessionID string + Type string // "agent-switched" | "model-switched" | future + TimeCreatedMs int64 + TimeUpdatedMs int64 + Data []byte +} + +// modelRef is the session.model JSON ({id, providerID, variant?}) and the +// user message's nested model object. Decoded best-effort; absent fields +// stay empty. 
+type modelRef struct { + ID string `json:"id"` + ModelID string `json:"modelID"` // user-message variant + ProviderID string `json:"providerID"` + Variant string `json:"variant"` +} + +// modelID returns the model identifier, preferring the session-style "id" +// field and falling back to the user-message "modelID" field. +func (m modelRef) modelID() string { + if m.ID != "" { + return m.ID + } + return m.ModelID +} + +// --- message.data discriminated union (role: user | assistant) --------------- + +// messageRole classifies a decoded message.data body. roleUnknown is returned +// for any unrecognised role so the mapper can skip-with-WARN rather than crash +// (forward compatibility). +type messageRole string + +const ( + roleUser messageRole = "user" + roleAssistant messageRole = "assistant" + roleUnknown messageRole = "unknown" +) + +// tokenCounts is the nested tokens object on an assistant message and on a +// step-finish part. NOTE: on step-finish parts these values are CUMULATIVE +// within a message, so the mapper computes per-op deltas (adapter-opencode.md +// §"Tool calls and Models", §"Canonical Model Gaps" #3). This struct only +// carries the values; the cumulative-vs-delta arithmetic lives in the mapper. +type tokenCounts struct { + Total int64 `json:"total"` + Input int64 `json:"input"` + Output int64 `json:"output"` + Reasoning int64 `json:"reasoning"` + Cache cacheTokens `json:"cache"` +} + +// cacheTokens is the nested cache read/write split opencode tracks and the +// canonical model now carries (SOW-0002). +type cacheTokens struct { + Read int64 `json:"read"` + Write int64 `json:"write"` +} + +// messageTime is the {created, completed?} block. Completed is a pointer so +// the mapper distinguishes "still running" (nil) from "completed at ms 0". +type messageTime struct { + Created int64 `json:"created"` + Completed *int64 `json:"completed"` +} + +// assistantError is the tagged AssistantError union. 
Only Name is load-bearing +// (it becomes the canonical ErrorClass when a session is finalized failed); +// the rest of the body stays in Raw for the mapper's Extras path. +type assistantError struct { + Name string `json:"name"` + Data json.RawMessage `json:"data"` +} + +// messageData is the decoded message.data body covering BOTH user and +// assistant variants. Unused-by-this-variant fields stay zero. The mapper +// reads Role first, then the fields relevant to that role. Unknown sibling +// keys are dropped by encoding/json. +type messageData struct { + Role string `json:"role"` + ParentID string `json:"parentID"` // assistant: the user msg that triggered the turn + Agent string `json:"agent"` + ModelID string `json:"modelID"` // assistant + ProviderID string `json:"providerID"` // assistant (user-defined alias, e.g. "my-provider-alias") + Mode string `json:"mode"` // assistant (deprecated alias of agent) + Cost float64 `json:"cost"` // assistant + Tokens tokenCounts `json:"tokens"` // assistant (turn rollup) + Time messageTime `json:"time"` + Finish string `json:"finish"` // assistant: "stop" | "tool-calls" | ... + Model *modelRef `json:"model"` // user-variant nested model object + Error *assistantError `json:"error"` // assistant: failure marker +} + +// role returns the typed role of the message, mapping unrecognised values to +// roleUnknown for forward-compatible skipping. +func (d messageData) role() messageRole { + switch d.Role { + case "user": + return roleUser + case "assistant": + return roleAssistant + default: + return roleUnknown + } +} + +// decodeMessageData parses a message.data blob. A malformed body returns a +// zero messageData with role roleUnknown and the decode error, so callers can +// surface one structured error and skip the row rather than abort the table. 
+func decodeMessageData(raw []byte) (messageData, error) { + var d messageData + if len(bytes.TrimSpace(raw)) == 0 { + return d, errEmptyData + } + if err := json.Unmarshal(raw, &d); err != nil { + return messageData{}, err + } + return d, nil +} + +// --- part.data discriminated union ($.type, 12 variants) --------------------- + +// partType is the part.data $.type discriminator. partUnknown is returned for +// any value not in the known set, so the mapper skips-with-WARN. +type partType string + +const ( + partStepStart partType = "step-start" + partStepFinish partType = "step-finish" + partText partType = "text" + partReasoning partType = "reasoning" + partTool partType = "tool" + partPatch partType = "patch" + partCompaction partType = "compaction" + partRetry partType = "retry" + partFile partType = "file" + partSnapshot partType = "snapshot" + partSubtask partType = "subtask" + partAgent partType = "agent" + partUnknown partType = "unknown" +) + +// knownPartTypes is the set of recognised part.data $.type values +// (adapter-opencode.md §"part" distribution table; message-v2.ts:352-378). +// A type absent here is forward-compatibility data: the mapper skips it with +// one structured WARN. +var knownPartTypes = map[partType]struct{}{ + partStepStart: {}, + partStepFinish: {}, + partText: {}, + partReasoning: {}, + partTool: {}, + partPatch: {}, + partCompaction: {}, + partRetry: {}, + partFile: {}, + partSnapshot: {}, + partSubtask: {}, + partAgent: {}, +} + +// classifyPartType returns the typed discriminator for a $.type string, +// mapping unknown values to partUnknown. +func classifyPartType(t string) partType { + pt := partType(t) + if _, ok := knownPartTypes[pt]; ok { + return pt + } + return partUnknown +} + +// partTimes is the {start, end?} block shared by reasoning parts and tool +// state. End is a pointer so the mapper distinguishes "still running" (nil) +// from "ended at ms 0". 
+type partTimes struct { + Start int64 `json:"start"` + End *int64 `json:"end"` +} + +// partError is the tagged error object on a `retry` part: opencode's RetryPart +// carries `error: ApiError`, an `{name, data}` tagged union (anomalyco/opencode +// — packages/sdk RetryPart/ApiError). Only Name is load-bearing for the retry +// LogEntry (adapter-opencode.md §"Per-table emit rules": `retry` → WRN with the +// attempt AND error.name). The data body is dropped — it can carry sensitive +// response content the adapter never copies. An absent error decodes to a zero +// partError (Name == ""). +type partError struct { + Name string `json:"name"` +} + +// toolState is the part.data.state tagged union for a tool part +// (message-v2.ts:248-308). Only the load-bearing fields are typed; Input and +// Metadata stay raw for the mapper (bytes_in approximation, sub-agent +// sessionId extraction). Status is the discriminator +// (pending|running|completed|error). +type toolState struct { + Status string `json:"status"` + Input json.RawMessage `json:"input"` + Output string `json:"output"` + Title string `json:"title"` + Error string `json:"error"` + Time partTimes `json:"time"` + Metadata json.RawMessage `json:"metadata"` +} + +// subAgentSessionID extracts state.metadata.sessionId — set on a tool part +// where tool=="task" to name the spawned child session (adapter-opencode.md +// §"Sub-Agent Linkage"). Returns "" when absent or malformed. Retained for the +// fuzz contract (it must never panic); callers that need to distinguish a +// malformed-but-present blob use subAgentSessionIDChecked. +func (s toolState) subAgentSessionID() string { + id, _ := s.subAgentSessionIDChecked() + return id +} + +// subAgentSessionIDChecked extracts state.metadata.sessionId AND reports whether +// the metadata was PRESENT but failed to decode (malformed). 
An absent/null +// metadata yields ("", false) — nothing to warn about; a present-but-unparseable +// blob yields ("", true) so the caller can surface a structured WARN rather than +// silently dropping a possible sub-agent linkage (SOW-0005 P2.6). +func (s toolState) subAgentSessionIDChecked() (id string, malformed bool) { + body := bytes.TrimSpace(s.Metadata) + if len(body) == 0 || bytes.Equal(body, []byte("null")) { + return "", false + } + var m struct { + SessionID string `json:"sessionId"` + } + if json.Unmarshal(body, &m) != nil { + return "", true + } + return m.SessionID, false +} + +// partData is the decoded part.data body covering every variant. Type is +// always set (classified); the remaining fields are populated only for the +// variants that carry them. Unknown sibling keys are dropped by +// encoding/json. The mapper switches on Type(). +type partData struct { + RawType string `json:"type"` + // step-finish: + Reason string `json:"reason"` + Cost float64 `json:"cost"` + Tokens tokenCounts `json:"tokens"` + // text / reasoning: + Text string `json:"text"` + Time partTimes `json:"time"` + // tool: + CallID string `json:"callID"` + Tool string `json:"tool"` + State *toolState `json:"state"` + // patch: + Hash string `json:"hash"` + Files []string `json:"files"` // sensitive: absolute paths + // compaction: + Auto bool `json:"auto"` + // retry: + Attempt int `json:"attempt"` + Error partError `json:"error"` // retry: the ApiError that triggered the attempt + // file: + MIME string `json:"mime"` + Filename string `json:"filename"` + URL string `json:"url"` +} + +// kind returns the typed discriminator for the part body. +func (d partData) kind() partType { return classifyPartType(d.RawType) } + +// decodePartData parses a part.data blob. A malformed body returns a zero +// partData with kind partUnknown and the decode error, so callers surface one +// structured error and skip the row rather than abort the table. 
+func decodePartData(raw []byte) (partData, error) { + var d partData + if len(bytes.TrimSpace(raw)) == 0 { + return d, errEmptyData + } + if err := json.Unmarshal(raw, &d); err != nil { + return partData{}, err + } + return d, nil +} diff --git a/internal/adapters/opencode/types_test.go b/internal/adapters/opencode/types_test.go new file mode 100644 index 0000000..25f18ea --- /dev/null +++ b/internal/adapters/opencode/types_test.go @@ -0,0 +1,291 @@ +package opencode + +import ( + "errors" + "testing" +) + +// TestDecodeMessageData_Assistant decodes a synthetic assistant message body +// and checks the load-bearing fields the mapper consumes (role, provider +// alias, model, cumulative token block, completed time, finish reason). +func TestDecodeMessageData_Assistant(t *testing.T) { + t.Parallel() + raw := []byte(`{ + "id":"msg_x","sessionID":"ses_x","role":"assistant","parentID":"msg_u", + "agent":"code-reviewer","modelID":"synth-model","providerID":"synthetic-alias", + "mode":"code-reviewer","cost":0.5, + "tokens":{"total":410,"input":250,"output":77,"reasoning":16,"cache":{"read":100,"write":0}}, + "time":{"created":1700000000000,"completed":1700000005000}, + "finish":"stop" + }`) + d, err := decodeMessageData(raw) + if err != nil { + t.Fatalf("decodeMessageData: %v", err) + } + if d.role() != roleAssistant { + t.Errorf("role = %v, want assistant", d.role()) + } + if d.ProviderID != "synthetic-alias" { + t.Errorf("providerID = %q", d.ProviderID) + } + if d.ModelID != "synth-model" { + t.Errorf("modelID = %q", d.ModelID) + } + if d.ParentID != "msg_u" { + t.Errorf("parentID = %q", d.ParentID) + } + if d.Tokens.Input != 250 || d.Tokens.Cache.Read != 100 { + t.Errorf("tokens = %+v", d.Tokens) + } + if d.Time.Completed == nil || *d.Time.Completed != 1700000005000 { + t.Errorf("completed time = %v", d.Time.Completed) + } + if d.Finish != "stop" { + t.Errorf("finish = %q", d.Finish) + } + if d.Error != nil { + t.Errorf("unexpected error block: %+v", d.Error) + } +} + 
+// TestDecodeMessageData_User decodes a synthetic user message body and checks +// the nested model object resolves via modelID(). +func TestDecodeMessageData_User(t *testing.T) { + t.Parallel() + raw := []byte(`{ + "id":"msg_u","sessionID":"ses_x","role":"user", + "time":{"created":1700000000000}, + "agent":"code-reviewer", + "model":{"providerID":"synthetic-alias","modelID":"synth-model","variant":"default"} + }`) + d, err := decodeMessageData(raw) + if err != nil { + t.Fatalf("decodeMessageData: %v", err) + } + if d.role() != roleUser { + t.Errorf("role = %v, want user", d.role()) + } + if d.Time.Completed != nil { + t.Errorf("user message must have no completed time, got %v", d.Time.Completed) + } + if d.Model == nil || d.Model.modelID() != "synth-model" { + t.Errorf("nested model = %+v", d.Model) + } +} + +// TestDecodeMessageData_AssistantError decodes a failed assistant message and +// confirms the error name (future ErrorClass) is captured. +func TestDecodeMessageData_AssistantError(t *testing.T) { + t.Parallel() + raw := []byte(`{"role":"assistant","error":{"name":"ProviderAuthError","data":{"detail":"synthetic"}}}`) + d, err := decodeMessageData(raw) + if err != nil { + t.Fatalf("decodeMessageData: %v", err) + } + if d.Error == nil || d.Error.Name != "ProviderAuthError" { + t.Fatalf("error block = %+v", d.Error) + } +} + +// TestDecodeMessageData_UnknownRole asserts an unrecognised role decodes +// without error and reports roleUnknown so the mapper can skip-with-WARN. +func TestDecodeMessageData_UnknownRole(t *testing.T) { + t.Parallel() + d, err := decodeMessageData([]byte(`{"role":"system","note":"future variant"}`)) + if err != nil { + t.Fatalf("decodeMessageData: %v", err) + } + if d.role() != roleUnknown { + t.Errorf("role = %v, want unknown", d.role()) + } +} + +// TestDecodeMessageData_Errors covers the empty body and malformed JSON +// rejections. 
+func TestDecodeMessageData_Errors(t *testing.T) { + t.Parallel() + if _, err := decodeMessageData(nil); !errors.Is(err, errEmptyData) { + t.Errorf("nil body: err = %v, want errEmptyData", err) + } + if _, err := decodeMessageData([]byte(" ")); !errors.Is(err, errEmptyData) { + t.Errorf("blank body: err = %v, want errEmptyData", err) + } + if _, err := decodeMessageData([]byte("{bad")); err == nil { + t.Error("malformed body: want error") + } +} + +// TestDecodePartData_AllKnownVariants decodes one synthetic body per known +// $.type and checks the discriminator classifies correctly plus a load-bearing +// field per variant. +func TestDecodePartData_AllKnownVariants(t *testing.T) { + t.Parallel() + cases := []struct { + name string + raw string + want partType + chk func(t *testing.T, d partData) + }{ + { + name: "step-start", raw: `{"type":"step-start"}`, want: partStepStart, + chk: func(t *testing.T, d partData) {}, + }, + { + name: "step-finish", raw: `{"type":"step-finish","reason":"stop","cost":0.1,"tokens":{"input":250,"output":7,"cache":{"read":10,"write":0}}}`, want: partStepFinish, + chk: func(t *testing.T, d partData) { + if d.Tokens.Input != 250 { + t.Errorf("step-finish tokens.input = %d, want 250 (cumulative; mapper deltas later)", d.Tokens.Input) + } + }, + }, + { + name: "text", raw: `{"type":"text","text":"synthetic","time":{"start":1,"end":2}}`, want: partText, + chk: func(t *testing.T, d partData) { + if d.Text != "synthetic" { + t.Errorf("text = %q", d.Text) + } + }, + }, + { + name: "reasoning", raw: `{"type":"reasoning","text":"thinking","time":{"start":1}}`, want: partReasoning, + chk: func(t *testing.T, d partData) { + if d.Time.End != nil { + t.Errorf("reasoning end must be nil (running), got %v", d.Time.End) + } + }, + }, + { + name: "tool", raw: `{"type":"tool","callID":"c1","tool":"github_get_file_contents","state":{"status":"completed","input":{"path":"x"},"output":"ok","time":{"start":1,"end":2}}}`, want: partTool, + chk: func(t 
*testing.T, d partData) { + if d.Tool != "github_get_file_contents" || d.State == nil || d.State.Status != "completed" { + t.Errorf("tool = %q state = %+v", d.Tool, d.State) + } + }, + }, + { + name: "tool-task-subagent", raw: `{"type":"tool","tool":"task","state":{"status":"completed","metadata":{"sessionId":"ses_child"},"time":{"start":1,"end":2}}}`, want: partTool, + chk: func(t *testing.T, d partData) { + if d.State == nil || d.State.subAgentSessionID() != "ses_child" { + t.Errorf("sub-agent sessionId not extracted: %+v", d.State) + } + }, + }, + { + name: "patch", raw: `{"type":"patch","hash":"abc","files":["/work/example/a.go","/work/example/b.go"]}`, want: partPatch, + chk: func(t *testing.T, d partData) { + if len(d.Files) != 2 || d.Hash != "abc" { + t.Errorf("patch files = %v hash = %q", d.Files, d.Hash) + } + }, + }, + { + name: "compaction", raw: `{"type":"compaction","auto":true}`, want: partCompaction, + chk: func(t *testing.T, d partData) { + if !d.Auto { + t.Error("compaction auto = false, want true") + } + }, + }, + { + name: "retry", raw: `{"type":"retry","attempt":3}`, want: partRetry, + chk: func(t *testing.T, d partData) { + if d.Attempt != 3 { + t.Errorf("retry attempt = %d, want 3", d.Attempt) + } + }, + }, + { + name: "file", raw: `{"type":"file","mime":"image/png","filename":"shot.png","url":"opencode-sqlite://x"}`, want: partFile, + chk: func(t *testing.T, d partData) { + if d.MIME != "image/png" || d.URL == "" { + t.Errorf("file mime = %q url = %q", d.MIME, d.URL) + } + }, + }, + {name: "snapshot", raw: `{"type":"snapshot","snapshot":"h"}`, want: partSnapshot, chk: func(t *testing.T, d partData) {}}, + {name: "subtask", raw: `{"type":"subtask","prompt":"p","agent":"general"}`, want: partSubtask, chk: func(t *testing.T, d partData) {}}, + {name: "agent", raw: `{"type":"agent","name":"general"}`, want: partAgent, chk: func(t *testing.T, d partData) {}}, + } + for _, tc := range cases { + tc := tc + t.Run(tc.name, func(t *testing.T) { + 
t.Parallel() + d, err := decodePartData([]byte(tc.raw)) + if err != nil { + t.Fatalf("decodePartData: %v", err) + } + if d.kind() != tc.want { + t.Fatalf("kind = %v, want %v", d.kind(), tc.want) + } + tc.chk(t, d) + }) + } +} + +// TestDecodePartData_UnknownTypeAndColumns asserts an unrecognised $.type +// decodes without error and classifies as partUnknown (forward-compat skip), and +// that unknown sibling columns/keys do not hard-fail the decode (tolerance requirement). +func TestDecodePartData_UnknownTypeAndColumns(t *testing.T) { + t.Parallel() + // Unknown $.type plus an unknown nested key. + d, err := decodePartData([]byte(`{"type":"future-variant","brandNewField":{"x":1},"text":"still readable"}`)) + if err != nil { + t.Fatalf("decodePartData: %v", err) + } + if d.kind() != partUnknown { + t.Errorf("kind = %v, want unknown", d.kind()) + } + // A known type carrying extra unknown columns must still decode the known + // fields (encoding/json drops the unknown ones). + d2, err := decodePartData([]byte(`{"type":"text","text":"ok","unknownColumnFromNewerSchema":42}`)) + if err != nil { + t.Fatalf("decodePartData with extra column: %v", err) + } + if d2.kind() != partText || d2.Text != "ok" { + t.Errorf("extra-column decode lost known fields: %+v", d2) + } +} + +// TestDecodePartData_Errors covers empty and malformed bodies. +func TestDecodePartData_Errors(t *testing.T) { + t.Parallel() + if _, err := decodePartData(nil); !errors.Is(err, errEmptyData) { + t.Errorf("nil body: err = %v, want errEmptyData", err) + } + if _, err := decodePartData([]byte("{bad")); err == nil { + t.Error("malformed body: want error") + } +} + +// TestToolState_SubAgentSessionID asserts that absent, null, and malformed +// metadata all return "" and that valid metadata yields its sessionId. 
+func TestToolState_SubAgentSessionID(t *testing.T) { + t.Parallel() + if got := (toolState{}).subAgentSessionID(); got != "" { + t.Errorf("absent metadata: got %q, want empty", got) + } + if got := (toolState{Metadata: []byte("null")}).subAgentSessionID(); got != "" { + t.Errorf("null metadata: got %q, want empty", got) + } + if got := (toolState{Metadata: []byte("{bad")}).subAgentSessionID(); got != "" { + t.Errorf("malformed metadata: got %q, want empty", got) + } + if got := (toolState{Metadata: []byte(`{"sessionId":"ses_c"}`)}).subAgentSessionID(); got != "ses_c" { + t.Errorf("valid metadata: got %q, want ses_c", got) + } +} + +// TestModelRef_ModelID covers both the session-style "id" and the +// assistant-user-message "modelID" resolution. +func TestModelRef_ModelID(t *testing.T) { + t.Parallel() + if got := (modelRef{ID: "from-id"}).modelID(); got != "from-id" { + t.Errorf("id form = %q", got) + } + if got := (modelRef{ModelID: "from-modelID"}).modelID(); got != "from-modelID" { + t.Errorf("modelID form = %q", got) + } + if got := (modelRef{}).modelID(); got != "" { + t.Errorf("empty form = %q", got) + } +} diff --git a/internal/adapters/opencode/warnsink.go b/internal/adapters/opencode/warnsink.go new file mode 100644 index 0000000..da50f7d --- /dev/null +++ b/internal/adapters/opencode/warnsink.go @@ -0,0 +1,61 @@ +package opencode + +// This file holds warnSink — the in-memory warning buffer that defers all +// warning/error EMISSION until AFTER a source-DB read transaction is committed or +// rolled back (SOW-0005 round-5 P2-1). 
+// +// The problem: the delta scan (store_query.go scanOnePage / tailer_boundary.go +// scanBoundaryBucket) and the full-session-tree load (tailer_changes.go +// loadAndMapSession) used to pass the adapter's live onError straight into the +// row scanners and tree loaders, which invoke it SYNCHRONOUSLY for a corrupt-cell +// WARN, an unknown-session_message-type WARN, an oversized-session WARN, or a +// root-chain WARN — WHILE the read transaction is still open. onError ultimately +// sends on the adapter's out channel (adapter.go OnError → SourceErrorEvent). If a +// slow ingester backpressures that channel, the send BLOCKS, and the read tx is +// held open across the block — pinning the WAL snapshot on the live multi-GB +// opencode database and delaying opencode's own checkpoint. +// +// The fix: during a read tx, scanners/loaders write warnings into a warnSink +// (a plain slice append, never blocking); the tx-owning function commits/rolls +// back the tx FIRST, then flushes the buffered warnings through the real onError. +// A blocking onError can then only stall AFTER the snapshot is released. Content +// events are likewise emitted only after the tx closes (loadAndMapSession maps + +// the caller emits post-commit). A warnSink is single-goroutine (the poll loop +// owns one per tx scope); it needs no synchronisation. + +// warnSink buffers warnings/errors raised while a source-DB read transaction is +// open, for flushing after the tx closes (P2-1). The zero value is ready to use. +type warnSink struct { + errs []error +} + +// collect appends one warning/error to the buffer. It is the func(error) the +// scanners and tree loaders receive IN PLACE of the live onError while a read tx +// is open; it only appends (never blocks, never touches the out channel), so it is +// safe to call with the snapshot held. A nil error is ignored. 
+func (s *warnSink) collect(err error) { + if err != nil { + s.errs = append(s.errs, err) + } +} + +// flush emits every buffered warning through onError (in collection order) and +// resets the buffer, so the same sink can be reused for the next tx scope (e.g. +// the next delta page). It MUST be called only AFTER the read tx is committed or +// rolled back (P2-1), so a backpressured onError can no longer pin the WAL +// snapshot. A nil onError drops the buffer (defensive; production always wires a +// real onError via orNoop). Returns the number of warnings flushed. +func (s *warnSink) flush(onError func(error)) int { + n := len(s.errs) + if onError != nil { + for _, e := range s.errs { + onError(e) + } + } + s.errs = s.errs[:0] + return n +} + +// len reports how many warnings are currently buffered (used by tests asserting +// the tx-closed-before-flush ordering). +func (s *warnSink) len() int { return len(s.errs) } diff --git a/testdata/opencode/a_happy/expected.jsonl b/testdata/opencode/a_happy/expected.jsonl new file mode 100644 index 0000000..2f8a1cc --- /dev/null +++ b/testdata/opencode/a_happy/expected.jsonl @@ -0,0 +1,12 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_happy01","RootNativeID":"ses_happy01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"calm-otter","title":"Happy session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_happy01","Seq":1}} 
+{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_happy01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2200000,"SessionNativeID":"ses_happy01","TurnSeq":1,"Seq":2,"ParentOpSeq":1,"Kind":"reasoning","Name":"","ToolNamespace":"","Model":"","Provider":"","ProviderAlias":"","ReasoningKind":"raw","ChildSessionNativeID":"","Extras":null}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":2200000,"SessionNativeID":"ses_happy01","TurnSeq":1,"OpSeq":2,"PayloadKind":"llm_reasoning","Format":"text","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_a02\u0026field=text","OriginalBytes":19,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":2300000,"SessionNativeID":"ses_happy01","TurnSeq":1,"Seq":2,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":2300000,"TokensIn":0,"TokensOut":0,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":0,"CtxMax":0}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":6,"Ts":2400000,"SessionNativeID":"ses_happy01","TurnSeq":1,"OpSeq":1,"PayloadKind":"llm_response","Format":"text","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_a03\u0026field=text","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} 
+{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":7,"Ts":2500000,"SessionNativeID":"ses_happy01","TurnSeq":1,"Seq":3,"ParentOpSeq":1,"Kind":"tool","Name":"read","ToolNamespace":"","Model":"","Provider":"","ProviderAlias":"","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":8,"Ts":2500000,"SessionNativeID":"ses_happy01","TurnSeq":1,"OpSeq":3,"PayloadKind":"tool_response","Format":"json","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_a04\u0026field=state.output","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":9,"Ts":2600000,"SessionNativeID":"ses_happy01","TurnSeq":1,"Seq":3,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":2600000,"TokensIn":0,"TokensOut":0,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0,"BytesIn":29,"BytesOut":12,"CharsIn":0,"CharsOut":0,"CtxUsed":0,"CtxMax":0}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":10,"Ts":9000000,"SessionNativeID":"ses_happy01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":9000000,"TokensIn":500,"TokensOut":80,"TokensCacheRead":100,"TokensCacheWrite":0,"CostUSD":0.02,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":600,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":11,"Ts":9000000,"SessionNativeID":"ses_happy01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":9000000,"TokensIn":500,"TokensOut":80,"TokensCacheRead":100,"TokensCacheWrite":0,"CostUSD":0.02}} diff --git a/testdata/opencode/a_happy/fixture.sql b/testdata/opencode/a_happy/fixture.sql new file mode 100644 index 0000000..0ed7795 --- /dev/null +++ b/testdata/opencode/a_happy/fixture.sql @@ -0,0 +1,56 @@ +-- a_happy: the baseline opencode session tree. 
+-- One ROOT session, one assistant turn whose parts exercise the full op chain: +-- step-start (opens the LLM op) -> reasoning op (+ llm_reasoning PayloadRef) +-- -> text part (llm_response PayloadRef on the LLM op) -> tool op (read, with a +-- tool_response PayloadRef) -> step-finish (closes the LLM op). +-- Pins: SessionStarted(root) -> TurnStarted -> the op/payload tree in part order +-- -> TurnFinalized. No archive + no data.error => NO SessionFinalized (opencode +-- never finalizes a running session; adapter-opencode.md "Canonical Model Gaps" #5). +-- All ids/timestamps are synthetic and invented (SOW-0005 R5). + +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_happy01', 'prj_x', '', 'calm-otter', '/work/proj', 'Happy session', '1.0.0', 'general', + 
'{"id":"claude-x","providerID":"anthropic"}', 1000, 9000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_a1', 'ses_happy01', 2000, 9000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.02,"tokens":{"input":500,"output":80,"reasoning":0,"cache":{"read":100,"write":0}},"time":{"created":2000,"completed":9000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_a01', 'msg_a1', 'ses_happy01', 2100, 2100, '{"type":"step-start"}'), + ('prt_a02', 'msg_a1', 'ses_happy01', 2200, 2300, '{"type":"reasoning","text":"thinking it through","time":{"start":2200,"end":2300}}'), + ('prt_a03', 'msg_a1', 'ses_happy01', 2400, 2400, '{"type":"text","text":"the answer"}'), + ('prt_a04', 'msg_a1', 'ses_happy01', 2500, 2600, '{"type":"tool","callID":"call_a4","tool":"read","state":{"status":"completed","input":{"path":"/work/proj/main.go"},"output":"package main","time":{"start":2500,"end":2600}}}'), + ('prt_a05', 'msg_a1', 'ses_happy01', 9000, 9000, '{"type":"step-finish","reason":"stop","cost":0.02,"tokens":{"input":500,"output":80,"reasoning":0,"cache":{"read":100,"write":0}}}'); diff --git a/testdata/opencode/b_subagent_task/expected.jsonl b/testdata/opencode/b_subagent_task/expected.jsonl new file mode 100644 index 0000000..589e42c --- /dev/null +++ b/testdata/opencode/b_subagent_task/expected.jsonl @@ -0,0 +1,15 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_parent01","RootNativeID":"ses_parent01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"lead-fox","title":"Parent session","version":"1.0.0"}}} 
+{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_parent01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_parent01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":3000000,"SessionNativeID":"ses_parent01","TurnSeq":1,"Seq":2,"ParentOpSeq":1,"Kind":"session","Name":"","ToolNamespace":"","Model":"","Provider":"","ProviderAlias":"","ReasoningKind":"","ChildSessionNativeID":"ses_child01","Extras":null}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":3000000,"SessionNativeID":"ses_parent01","TurnSeq":1,"Seq":3,"ParentOpSeq":1,"Kind":"tool","Name":"task","ToolNamespace":"","Model":"","Provider":"","ProviderAlias":"","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":3000000,"SessionNativeID":"ses_parent01","TurnSeq":1,"OpSeq":3,"PayloadKind":"tool_response","Format":"json","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_p02\u0026field=state.output","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":6,"Ts":7000000,"SessionNativeID":"ses_parent01","TurnSeq":1,"Seq":3,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":7000000,"TokensIn":0,"TokensOut":0,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0,"BytesIn":24,"BytesOut":10,"CharsIn":0,"CharsOut":0,"CtxUsed":0,"CtxMax":0}} 
+{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":7,"Ts":8000000,"SessionNativeID":"ses_parent01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":8000000,"TokensIn":200,"TokensOut":50,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.03,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":200,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":8,"Ts":8000000,"SessionNativeID":"ses_parent01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":8000000,"TokensIn":200,"TokensOut":50,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.03}} +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":5000000,"NativeID":"ses_child01","RootNativeID":"ses_parent01","ParentNativeID":"ses_parent01","ParentOpKey":"","Kind":"sub_agent","AgentName":"reviewer","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"aide-fox","title":"Child session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":5100000,"SessionNativeID":"ses_child01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":5200000,"SessionNativeID":"ses_child01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":5300000,"SessionNativeID":"ses_child01","TurnSeq":1,"OpSeq":1,"PayloadKind":"llm_response","Format":"text","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_c02\u0026field=text","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} 
+{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":7000000,"SessionNativeID":"ses_child01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":7000000,"TokensIn":90,"TokensOut":20,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":90,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":7000000,"SessionNativeID":"ses_child01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":7000000,"TokensIn":90,"TokensOut":20,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} diff --git a/testdata/opencode/b_subagent_task/fixture.sql b/testdata/opencode/b_subagent_task/fixture.sql new file mode 100644 index 0000000..57c0619 --- /dev/null +++ b/testdata/opencode/b_subagent_task/fixture.sql @@ -0,0 +1,63 @@ +-- b_subagent_task: sub-agent linkage via BOTH mechanisms (AC#4). +-- 1. session.parent_id: ses_child01.parent_id = ses_parent01 => the child maps +-- to Kind=sub_agent with ParentNativeID=ses_parent01 and RootNativeID=ses_parent01. +-- 2. tool='task' with state.metadata.sessionId: the parent's task part names +-- ses_child01, so the mapper emits BOTH a session Op (kind=session, +-- ChildSessionNativeID=ses_child01, the topology parent) AND the tool Op +-- (kind=tool, name=task) in the same turn (adapter-opencode.md rule for +-- "tool where tool='task'"). +-- The parent session row's time_updated (5000) is LOWER than the child's (7000) +-- so the session-table delta yields the parent tree first, then the child tree +-- (deterministic affected-session order). All ids/timestamps synthetic. 
+ +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_parent01', 'prj_x', '', 'lead-fox', '/work/proj', 'Parent session', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 5000, NULL), + ('ses_child01', 'prj_x', 'ses_parent01', 'aide-fox', '/work/proj', 'Child session', '1.0.0', 'reviewer', + '{"id":"claude-x","providerID":"anthropic"}', 5000, 7000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_p1', 'ses_parent01', 2000, 8000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.03,"tokens":{"input":200,"output":50,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":8000},"finish":"stop"}'), + ('msg_c1', 'ses_child01', 
5100, 7000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"reviewer","cost":0.01,"tokens":{"input":90,"output":20,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":5100,"completed":7000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_p01', 'msg_p1', 'ses_parent01', 2100, 2100, '{"type":"step-start"}'), + ('prt_p02', 'msg_p1', 'ses_parent01', 3000, 7000, '{"type":"tool","callID":"call_p2","tool":"task","state":{"status":"completed","input":{"prompt":"review this"},"output":"child done","metadata":{"sessionId":"ses_child01"},"time":{"start":3000,"end":7000}}}'), + ('prt_p03', 'msg_p1', 'ses_parent01', 8000, 8000, '{"type":"step-finish","reason":"stop","cost":0.03,"tokens":{"input":200,"output":50,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_c01', 'msg_c1', 'ses_child01', 5200, 5200, '{"type":"step-start"}'), + ('prt_c02', 'msg_c1', 'ses_child01', 5300, 5300, '{"type":"text","text":"child reply"}'), + ('prt_c03', 'msg_c1', 'ses_child01', 7000, 7000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":90,"output":20,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/c_multi_provider/expected.jsonl b/testdata/opencode/c_multi_provider/expected.jsonl new file mode 100644 index 0000000..147f47b --- /dev/null +++ b/testdata/opencode/c_multi_provider/expected.jsonl @@ -0,0 +1,11 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_multi01","RootNativeID":"ses_multi01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"dual-vendor","title":"Two providers","version":"1.0.0"}}} 
+{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_multi01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_multi01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2200000,"SessionNativeID":"ses_multi01","TurnSeq":1,"OpSeq":1,"PayloadKind":"llm_response","Format":"text","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_m1b\u0026field=text","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":4000000,"SessionNativeID":"ses_multi01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":4000000,"TokensIn":100,"TokensOut":30,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":100,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":4000000,"SessionNativeID":"ses_multi01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":4000000,"TokensIn":100,"TokensOut":30,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":6,"Ts":5000000,"SessionNativeID":"ses_multi01","Seq":2}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":7,"Ts":5100000,"SessionNativeID":"ses_multi01","TurnSeq":2,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"gpt-y","ToolNamespace":"","Model":"gpt-y","Provider":"openai","ProviderAlias":"openai","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} 
+{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":8,"Ts":5200000,"SessionNativeID":"ses_multi01","TurnSeq":2,"OpSeq":1,"PayloadKind":"llm_response","Format":"text","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_m2b\u0026field=text","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":9,"Ts":8000000,"SessionNativeID":"ses_multi01","TurnSeq":2,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":8000000,"TokensIn":300,"TokensOut":80,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.02,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":300,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":10,"Ts":8000000,"SessionNativeID":"ses_multi01","Seq":2,"Status":"completed","ErrorClass":"","EndTs":8000000,"TokensIn":200,"TokensOut":50,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.02}} diff --git a/testdata/opencode/c_multi_provider/fixture.sql b/testdata/opencode/c_multi_provider/fixture.sql new file mode 100644 index 0000000..9ad4898 --- /dev/null +++ b/testdata/opencode/c_multi_provider/fixture.sql @@ -0,0 +1,63 @@ +-- c_multi_provider: one session, two assistant turns using DIFFERENT providers +-- (AC#7). Turn 1 is providerID=anthropic/modelID=claude-x; turn 2 is +-- providerID=openai/modelID=gpt-y. Each turn's LLM op must carry its own +-- ProviderAlias verbatim and a canonical Provider (both are in +-- knownProviderAliases, so Provider==alias here). Both providers surface so the +-- downstream catalog seeds two provider rows. +-- +-- This fixture ALSO pins the two-level cumulative-token model: +-- * per-op tokens reset per message (computeStepDeltas): turn1 op = 100/30, +-- turn2 op = 300/80 (each is the single step-finish's own cumulative). 
+-- * per-turn tokens are the message-level delta across the session: turn1 = +-- 100/30 (first turn), turn2 = 200/50 (300-100, 80-30). +-- All ids/timestamps synthetic. + +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_multi01', 'prj_x', '', 'dual-vendor', '/work/proj', 'Two providers', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 9000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_m1', 'ses_multi01', 2000, 4000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.01,"tokens":{"input":100,"output":30,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":4000},"finish":"stop"}'), + ('msg_m2', 'ses_multi01', 
5000, 8000, + '{"role":"assistant","providerID":"openai","modelID":"gpt-y","agent":"general","cost":0.02,"tokens":{"input":300,"output":80,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":5000,"completed":8000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_m1a', 'msg_m1', 'ses_multi01', 2100, 2100, '{"type":"step-start"}'), + ('prt_m1b', 'msg_m1', 'ses_multi01', 2200, 2200, '{"type":"text","text":"anthropic reply"}'), + ('prt_m1c', 'msg_m1', 'ses_multi01', 4000, 4000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":100,"output":30,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_m2a', 'msg_m2', 'ses_multi01', 5100, 5100, '{"type":"step-start"}'), + ('prt_m2b', 'msg_m2', 'ses_multi01', 5200, 5200, '{"type":"text","text":"openai reply"}'), + ('prt_m2c', 'msg_m2', 'ses_multi01', 8000, 8000, '{"type":"step-finish","reason":"stop","cost":0.02,"tokens":{"input":300,"output":80,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/d_schema_drift/expected.jsonl b/testdata/opencode/d_schema_drift/expected.jsonl new file mode 100644 index 0000000..eba11a0 --- /dev/null +++ b/testdata/opencode/d_schema_drift/expected.jsonl @@ -0,0 +1,6 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_drift01","RootNativeID":"ses_drift01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"","Model":"","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","slug":"old-badger","title":"Old schema session","version":"0.9.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_drift01","Seq":1}} 
+{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_drift01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2200000,"SessionNativeID":"ses_drift01","TurnSeq":1,"OpSeq":1,"PayloadKind":"llm_response","Format":"text","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_d1b\u0026field=text","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":6000000,"SessionNativeID":"ses_drift01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":6000000,"TokensIn":60,"TokensOut":15,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":60,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":6000000,"SessionNativeID":"ses_drift01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":6000000,"TokensIn":60,"TokensOut":15,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} diff --git a/testdata/opencode/d_schema_drift/fixture.sql b/testdata/opencode/d_schema_drift/fixture.sql new file mode 100644 index 0000000..12da7b6 --- /dev/null +++ b/testdata/opencode/d_schema_drift/fixture.sql @@ -0,0 +1,64 @@ +-- d_schema_drift: a pre-20260510033149_session_usage opencode schema. The +-- session table LACKS the optional columns added by later migrations: +-- agent, model, cost, tokens_input, tokens_output, tokens_reasoning, +-- tokens_cache_read, tokens_cache_write, time_archived, time_compacting +-- (10 columns; time_compacting added to the wanted set in SOW-0005 round-2 P2-E). 
+-- It keeps the REQUIRED id/time_created/time_updated (+ the always-present +-- project_id/parent_id/slug/directory/title/version), so introspectAll must +-- ACCEPT it (graceful degrade — adapter-opencode.md "Edge Cases" #1, AC#5), the +-- dynamic SELECT must OMIT the missing columns (never SELECT *), and the +-- emitted events must carry empty/zero session-level values for them: +-- SessionStarted.Model="" (session.model absent), .AgentName="" (session.agent +-- absent), and Extras WITHOUT providerID/variant (both come from session.model). +-- The LLM-op/turn token + provider values are UNAFFECTED: they come from the +-- message.data JSON body (modelID/providerID/tokens), which the drift does not +-- touch — so the op still carries Provider=anthropic/Model=claude-x and the turn +-- still carries its tokens. +-- +-- The spec/AC#5 promise of "one INF log per missing optional column" IS wired in +-- production: tailer.go logMissingColumns iterates tableSchema.Missing right +-- after introspectAll in both scanLoop and tailLoop, emitting one INFO per +-- (table, column) via the Adapter.logger threaded from adapter.go Scan/Tail. +-- This fixture pins both halves: the graceful DEGRADE (accept + omit-columns + +-- zero values) via golden_invariants_test.go:TestGoldenInvariant_DSchemaDrift, +-- and the INF emission via TestGoldenInvariant_DSchemaDrift_MissingColumnsLoggedINF +-- (a record-capturing slog.Handler asserts the logged (table,column) set equals +-- the introspected Missing set). The INF set is a log, not a canonical event, so +-- it is correctly absent from expected.jsonl. All ids/timestamps synthetic. 
+ +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, time_created, time_updated) +VALUES + ('ses_drift01', 'prj_x', '', 'old-badger', '/work/proj', 'Old schema session', '0.9.0', 1000, 6000); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_d1', 'ses_drift01', 2000, 6000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.01,"tokens":{"input":60,"output":15,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":6000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_d1a', 'msg_d1', 'ses_drift01', 2100, 2100, '{"type":"step-start"}'), + ('prt_d1b', 'msg_d1', 'ses_drift01', 2200, 2200, '{"type":"text","text":"old reply"}'), + ('prt_d1c', 'msg_d1', 'ses_drift01', 6000, 6000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":60,"output":15,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/e_cumulative_tokens/expected.jsonl b/testdata/opencode/e_cumulative_tokens/expected.jsonl new file mode 100644 index 0000000..55bcaaf --- /dev/null +++ 
b/testdata/opencode/e_cumulative_tokens/expected.jsonl @@ -0,0 +1,11 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_cumul01","RootNativeID":"ses_cumul01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"token-mole","title":"Cumulative tokens","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_cumul01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2200000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":2200000,"TokensIn":100,"TokensOut":20,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":100,"CtxMax":0}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":3000000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":2,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} 
+{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":3100000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":2,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":3100000,"TokensIn":150,"TokensOut":30,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":250,"CtxMax":0}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":6,"Ts":4000000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":3,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":7,"Ts":4100000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":3,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":4100000,"TokensIn":160,"TokensOut":40,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":410,"CtxMax":0}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":8,"Ts":5000000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":4,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":9,"Ts":9000000,"SessionNativeID":"ses_cumul01","TurnSeq":1,"Seq":4,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":9000000,"TokensIn":0,"TokensOut":0,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":400,"CtxMax":0}} 
+{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":10,"Ts":9000000,"SessionNativeID":"ses_cumul01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":9000000,"TokensIn":400,"TokensOut":80,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.04}} diff --git a/testdata/opencode/e_cumulative_tokens/fixture.sql b/testdata/opencode/e_cumulative_tokens/fixture.sql new file mode 100644 index 0000000..4dae028 --- /dev/null +++ b/testdata/opencode/e_cumulative_tokens/fixture.sql @@ -0,0 +1,63 @@ +-- e_cumulative_tokens: the cumulative->delta token regression (AC#3, the top +-- silent-defect guard). ONE assistant message, FOUR step-start/step-finish pairs +-- whose step-finish tokens are CUMULATIVE within the message: +-- input: 100, 250, 410, 400 output: 20, 50, 90, 80 +-- The mapper's computeStepDeltas must emit per-LLM-op DELTAS: +-- op1 in=100 out=20 (first step = its own cumulative) +-- op2 in=150 out=30 (250-100, 50-20) +-- op3 in=160 out=40 (410-250, 90-50) +-- op4 in=0 out=0 (400<410 and 80<90 => negative delta CLAMPED to 0) +-- A regression to raw-value emission (100/250/410/400) would fail the golden. +-- The per-turn tokens are the message-level cumulative (input 400 / output 80); +-- this single turn is the first, so its delta equals its own cumulative. +-- All ids/timestamps synthetic. 
+ +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_cumul01', 'prj_x', '', 'token-mole', '/work/proj', 'Cumulative tokens', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 9000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_e1', 'ses_cumul01', 2000, 9000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.04,"tokens":{"input":400,"output":80,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":9000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_e01', 'msg_e1', 'ses_cumul01', 2100, 2100, '{"type":"step-start"}'), + ('prt_e02', 'msg_e1', 
'ses_cumul01', 2200, 2200, '{"type":"step-finish","reason":"tool-calls","cost":0.01,"tokens":{"input":100,"output":20,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_e03', 'msg_e1', 'ses_cumul01', 3000, 3000, '{"type":"step-start"}'), + ('prt_e04', 'msg_e1', 'ses_cumul01', 3100, 3100, '{"type":"step-finish","reason":"tool-calls","cost":0.01,"tokens":{"input":250,"output":50,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_e05', 'msg_e1', 'ses_cumul01', 4000, 4000, '{"type":"step-start"}'), + ('prt_e06', 'msg_e1', 'ses_cumul01', 4100, 4100, '{"type":"step-finish","reason":"tool-calls","cost":0.01,"tokens":{"input":410,"output":90,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_e07', 'msg_e1', 'ses_cumul01', 5000, 5000, '{"type":"step-start"}'), + ('prt_e08', 'msg_e1', 'ses_cumul01', 9000, 9000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":400,"output":80,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/g_nested_subagent/expected.jsonl b/testdata/opencode/g_nested_subagent/expected.jsonl new file mode 100644 index 0000000..b72d603 --- /dev/null +++ b/testdata/opencode/g_nested_subagent/expected.jsonl @@ -0,0 +1,15 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_groot","RootNativeID":"ses_groot","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"tall-tree","title":"Root session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":1100000,"SessionNativeID":"ses_groot","Seq":1}} 
+{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":1200000,"SessionNativeID":"ses_groot","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":3000000,"SessionNativeID":"ses_groot","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":3000000,"TokensIn":100,"TokensOut":20,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":100,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":3000000,"SessionNativeID":"ses_groot","Seq":1,"Status":"completed","ErrorClass":"","EndTs":3000000,"TokensIn":100,"TokensOut":20,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":3000000,"NativeID":"ses_gchild","RootNativeID":"ses_groot","ParentNativeID":"ses_groot","ParentOpKey":"","Kind":"sub_agent","AgentName":"reviewer","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"mid-branch","title":"Child session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":3100000,"SessionNativeID":"ses_gchild","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":3200000,"SessionNativeID":"ses_gchild","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} 
+{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":5000000,"SessionNativeID":"ses_gchild","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":5000000,"TokensIn":90,"TokensOut":18,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":90,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":5000000,"SessionNativeID":"ses_gchild","Seq":1,"Status":"completed","ErrorClass":"","EndTs":5000000,"TokensIn":90,"TokensOut":18,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":5000000,"NativeID":"ses_ggrand","RootNativeID":"ses_groot","ParentNativeID":"ses_gchild","ParentOpKey":"","Kind":"sub_agent","AgentName":"reviewer","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"leaf-node","title":"Grandchild session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":5100000,"SessionNativeID":"ses_ggrand","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":5200000,"SessionNativeID":"ses_ggrand","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":7000000,"SessionNativeID":"ses_ggrand","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":7000000,"TokensIn":80,"TokensOut":16,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":80,"CtxMax":0}} 
+{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":7000000,"SessionNativeID":"ses_ggrand","Seq":1,"Status":"completed","ErrorClass":"","EndTs":7000000,"TokensIn":80,"TokensOut":16,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} diff --git a/testdata/opencode/g_nested_subagent/fixture.sql b/testdata/opencode/g_nested_subagent/fixture.sql new file mode 100644 index 0000000..f918c1f --- /dev/null +++ b/testdata/opencode/g_nested_subagent/fixture.sql @@ -0,0 +1,67 @@ +-- g_nested_subagent: a THREE-level session tree (SOW-0005 P2.4 RootNativeID). +-- ses_groot (root, parent_id = '') +-- └─ ses_gchild (parent_id = ses_groot) +-- └─ ses_ggrand (parent_id = ses_gchild) +-- The grandchild's RootNativeID must be the TRUE tree root (ses_groot), NOT its +-- direct parent (ses_gchild): the loader walks the parent_id chain to the top +-- (resolveRootID) and injects it into the mapper. Each session has one minimal +-- COMPLETED assistant turn (step-start + step-finish, completed ts) so the turn +-- finalizes (P1.3) and the golden is non-empty. All ids/timestamps synthetic +-- (SOW-0005 R5). time_updated ordering (groot 3000 < gchild 5000 < ggrand 7000) +-- makes the session-table delta emit root → child → grandchild deterministically.
+ +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_groot', 'prj_x', '', 'tall-tree', '/work/proj', 'Root session', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 3000, NULL), + ('ses_gchild', 'prj_x', 'ses_groot', 'mid-branch', '/work/proj', 'Child session', '1.0.0', 'reviewer', + '{"id":"claude-x","providerID":"anthropic"}', 3000, 5000, NULL), + ('ses_ggrand', 'prj_x', 'ses_gchild', 'leaf-node', '/work/proj', 'Grandchild session', '1.0.0', 'reviewer', + '{"id":"claude-x","providerID":"anthropic"}', 5000, 7000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_gr', 'ses_groot', 1100, 3000, + 
'{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.01,"tokens":{"input":100,"output":20,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":1100,"completed":3000},"finish":"stop"}'), + ('msg_gc', 'ses_gchild', 3100, 5000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"reviewer","cost":0.01,"tokens":{"input":90,"output":18,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":3100,"completed":5000},"finish":"stop"}'), + ('msg_gg', 'ses_ggrand', 5100, 7000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"reviewer","cost":0.01,"tokens":{"input":80,"output":16,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":5100,"completed":7000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_gr1', 'msg_gr', 'ses_groot', 1200, 1200, '{"type":"step-start"}'), + ('prt_gr2', 'msg_gr', 'ses_groot', 3000, 3000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":100,"output":20,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_gc1', 'msg_gc', 'ses_gchild', 3200, 3200, '{"type":"step-start"}'), + ('prt_gc2', 'msg_gc', 'ses_gchild', 5000, 5000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":90,"output":18,"reasoning":0,"cache":{"read":0,"write":0}}}'), + ('prt_gg1', 'msg_gg', 'ses_ggrand', 5200, 5200, '{"type":"step-start"}'), + ('prt_gg2', 'msg_gg', 'ses_ggrand', 7000, 7000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":80,"output":16,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/h_failed_tool/expected.jsonl b/testdata/opencode/h_failed_tool/expected.jsonl new file mode 100644 index 0000000..3322ea9 --- /dev/null +++ b/testdata/opencode/h_failed_tool/expected.jsonl @@ -0,0 +1,7 @@ 
+{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_fail01","RootNativeID":"ses_fail01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"cross-lynx","title":"Failed-tool session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_fail01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_fail01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2500000,"SessionNativeID":"ses_fail01","TurnSeq":1,"Seq":2,"ParentOpSeq":1,"Kind":"tool","Name":"bash","ToolNamespace":"","Model":"","Provider":"","ProviderAlias":"","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":2600000,"SessionNativeID":"ses_fail01","TurnSeq":1,"Seq":2,"Status":"failed","ErrorClass":"error","ErrorMessage":"command failed: exit status 2","EndTs":2600000,"TokensIn":0,"TokensOut":0,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0,"BytesIn":24,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":0,"CtxMax":0}} 
+{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":9000000,"SessionNativeID":"ses_fail01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":9000000,"TokensIn":300,"TokensOut":40,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":300,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":6,"Ts":9000000,"SessionNativeID":"ses_fail01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":9000000,"TokensIn":300,"TokensOut":40,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} diff --git a/testdata/opencode/h_failed_tool/fixture.sql b/testdata/opencode/h_failed_tool/fixture.sql new file mode 100644 index 0000000..d6f6a59 --- /dev/null +++ b/testdata/opencode/h_failed_tool/fixture.sql @@ -0,0 +1,57 @@ +-- h_failed_tool: pins SOW-0005 round-2 P1-C: an opencode tool whose +-- state.status == "error" must finalize as the CANONICAL op status "failed" +-- (NOT the non-canonical "error"), carrying the opencode detail in ErrorClass +-- (a class label) + ErrorMessage (state.error). Canonical op statuses are +-- running|completed|failed|cancelled|truncated (canonical-events.md:196). +-- +-- One ROOT session, one assistant turn: step-start (opens the LLM op) -> a +-- bash tool that ERRORED (state.status="error", +-- state.error="command failed: exit status 2") +-- -> step-finish (closes the LLM op). The turn itself carries NO data.error and +-- has a completed ts, so it finalizes COMPLETED; only the TOOL op is failed. +-- This isolates the op-status mapping from the turn/session failed path. +-- All ids/timestamps are synthetic (SOW-0005 R5).
+ +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_fail01', 'prj_x', '', 'cross-lynx', '/work/proj', 'Failed-tool session', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 9000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_f1', 'ses_fail01', 2000, 9000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.01,"tokens":{"input":300,"output":40,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":9000},"finish":"stop"}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_f01', 'msg_f1', 'ses_fail01', 2100, 2100, '{"type":"step-start"}'), + ('prt_f02', 'msg_f1', 
'ses_fail01', 2500, 2600, '{"type":"tool","callID":"call_f2","tool":"bash","state":{"status":"error","input":{"command":"make build"},"error":"command failed: exit status 2","time":{"start":2500,"end":2600}}}'), + ('prt_f03', 'msg_f1', 'ses_fail01', 9000, 9000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":300,"output":40,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/i_failed_assistant/expected.jsonl b/testdata/opencode/i_failed_assistant/expected.jsonl new file mode 100644 index 0000000..3b01de7 --- /dev/null +++ b/testdata/opencode/i_failed_assistant/expected.jsonl @@ -0,0 +1,9 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_err01","RootNativeID":"ses_err01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"amber-wolf","title":"Aborted session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_err01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_err01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2500000,"SessionNativeID":"ses_err01","TurnSeq":1,"Seq":2,"ParentOpSeq":1,"Kind":"tool","Name":"read","ToolNamespace":"","Model":"","Provider":"","ProviderAlias":"","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} 
+{"kind":"payload_ref","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":2500000,"SessionNativeID":"ses_err01","TurnSeq":1,"OpSeq":2,"PayloadKind":"tool_response","Format":"json","Compression":"","LocationURI":"opencode-sqlite://?part_id=prt_e02\u0026field=state.output","OriginalBytes":-1,"StoredBytes":0,"SHA256":""}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":2600000,"SessionNativeID":"ses_err01","TurnSeq":1,"Seq":2,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":2600000,"TokensIn":0,"TokensOut":0,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0,"BytesIn":29,"BytesOut":12,"CharsIn":0,"CharsOut":0,"CtxUsed":0,"CtxMax":0}} +{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":6,"Ts":9000000,"SessionNativeID":"ses_err01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":9000000,"TokensIn":300,"TokensOut":40,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":300,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":7,"Ts":9000000,"SessionNativeID":"ses_err01","Seq":1,"Status":"failed","ErrorClass":"MessageAbortedError","EndTs":9000000,"TokensIn":300,"TokensOut":40,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} +{"kind":"session_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":8,"Ts":9000000,"NativeID":"ses_err01","Status":"failed","ErrorClass":"MessageAbortedError","ErrorMessage":"request was aborted by the user","EndTs":9000000}} diff --git a/testdata/opencode/i_failed_assistant/fixture.sql b/testdata/opencode/i_failed_assistant/fixture.sql new file mode 100644 index 0000000..a7fd434 --- /dev/null +++ b/testdata/opencode/i_failed_assistant/fixture.sql @@ -0,0 +1,62 @@ +-- i_failed_assistant: pins SOW-0005 round-5 P3-1 — a session whose LAST assistant +-- message carries 
data.error finalizes as SessionFinalized(Status="failed") with +-- ErrorClass from error.name AND ErrorMessage from error.data.message. +-- +-- opencode's AssistantError is a tagged union (NamedError.create → +-- {"name":,"data":}); every shipping variant except +-- MessageOutputLengthError carries a `message` string in `data`. Here the +-- assistant message errored with a MessageAbortedError whose data.message is +-- "request was aborted by the user" (the most common variant on the reference DB). +-- +-- One ROOT session, one assistant turn: step-start (opens the LLM op) -> a +-- completed read tool -> step-finish (closes the LLM op). The assistant message +-- itself carries data.error, so the turn finalizes FAILED (ErrorClass= +-- MessageAbortedError; TurnFinalizedEvent has no ErrorMessage field) AND the +-- session finalizes FAILED (ErrorClass + ErrorMessage). No time_archived, so the +-- failed path (not the archived/completed path) decides the terminal. +-- All ids/timestamps/messages are synthetic and invented (SOW-0005 R5). 
+ +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_err01', 'prj_x', '', 'amber-wolf', '/work/proj', 'Aborted session', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 9000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_e1', 'ses_err01', 2000, 9000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.01,"tokens":{"input":300,"output":40,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":9000},"finish":"stop","error":{"name":"MessageAbortedError","data":{"message":"request was aborted by the user"}}}'); + +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_e01', 
'msg_e1', 'ses_err01', 2100, 2100, '{"type":"step-start"}'), + ('prt_e02', 'msg_e1', 'ses_err01', 2500, 2600, '{"type":"tool","callID":"call_e2","tool":"read","state":{"status":"completed","input":{"path":"/work/proj/main.go"},"output":"package main","time":{"start":2500,"end":2600}}}'), + ('prt_e03', 'msg_e1', 'ses_err01', 9000, 9000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":300,"output":40,"reasoning":0,"cache":{"read":0,"write":0}}}'); diff --git a/testdata/opencode/j_file_attachment/expected.jsonl b/testdata/opencode/j_file_attachment/expected.jsonl new file mode 100644 index 0000000..5beaea1 --- /dev/null +++ b/testdata/opencode/j_file_attachment/expected.jsonl @@ -0,0 +1,6 @@ +{"kind":"session_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":0,"Ts":1000000,"NativeID":"ses_file01","RootNativeID":"ses_file01","ParentNativeID":"","ParentOpKey":"","Kind":"root","AgentName":"general","Model":"claude-x","Cwd":"/work/proj","CallPath":"","Extras":{"directory":"/work/proj","project_id":"prj_x","providerID":"anthropic","slug":"tidy-heron","title":"File-attachment session","version":"1.0.0"}}} +{"kind":"turn_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":1,"Ts":2000000,"SessionNativeID":"ses_file01","Seq":1}} +{"kind":"op_started","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":2,"Ts":2100000,"SessionNativeID":"ses_file01","TurnSeq":1,"Seq":1,"ParentOpSeq":-1,"Kind":"llm","Name":"claude-x","ToolNamespace":"","Model":"claude-x","Provider":"anthropic","ProviderAlias":"anthropic","ReasoningKind":"","ChildSessionNativeID":"","Extras":null}} +{"kind":"log_entry","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":3,"Ts":2300000,"SessionNativeID":"ses_file01","TurnSeq":1,"OpSeq":1,"Severity":"INF","Source":"opencode","Message":"file attachment","Extras":{"filename":"diagram.png","mime":"image/png","url":"https://cdn.example.invalid/diagram.png"}}} 
+{"kind":"op_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":4,"Ts":9000000,"SessionNativeID":"ses_file01","TurnSeq":1,"Seq":1,"Status":"completed","ErrorClass":"","ErrorMessage":"","EndTs":9000000,"TokensIn":200,"TokensOut":30,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01,"BytesIn":0,"BytesOut":0,"CharsIn":0,"CharsOut":0,"CtxUsed":200,"CtxMax":0}} +{"kind":"turn_finalized","payload":{"SourceID":"opencode:\u003cROOT\u003e","SourceSeq":5,"Ts":9000000,"SessionNativeID":"ses_file01","Seq":1,"Status":"completed","ErrorClass":"","EndTs":9000000,"TokensIn":200,"TokensOut":30,"TokensCacheRead":0,"TokensCacheWrite":0,"CostUSD":0.01}} diff --git a/testdata/opencode/j_file_attachment/fixture.sql b/testdata/opencode/j_file_attachment/fixture.sql new file mode 100644 index 0000000..f41636f --- /dev/null +++ b/testdata/opencode/j_file_attachment/fixture.sql @@ -0,0 +1,61 @@ +-- j_file_attachment: pins SOW-0005 round-4 P2-3 + round-6 P3-3 — a file part is a +-- user file ATTACHMENT, surfaced as an INF LogEntry carrying {filename, url, mime} +-- in its extras, NOT a PayloadRef. The canonical PayloadRefEvent.PayloadKind set +-- (internal/canonical/events.go) has no attachment kind, so the adapter must emit +-- NO PayloadRef for a file part (the removed "user_attachment" kind was a +-- canonical-contract violation). This is the end-to-end golden the unit tests +-- (mapper_test.go TestMapSession_FilePartLogEntry) lacked: the full load→map→golden +-- pipeline, asserting the LogEntry flows through and no non-canonical PayloadKind +-- appears anywhere in the stream (golden_invariants_test.go). +-- +-- One ROOT session, one assistant turn: step-start (opens the LLM op) -> a file +-- part (the attachment) -> step-finish (closes the LLM op). The turn has a completed +-- ts and ≥1 step-finish, so it finalizes COMPLETED; NO archive + NO data.error => +-- NO SessionFinalized (running session). 
The file part is scoped to the turn and the +-- open LLM op (OpSeq=1). The URL uses the example.invalid host (no operator data). +-- All ids/timestamps are synthetic and invented (SOW-0005 R5). + +CREATE TABLE session ( + id TEXT PRIMARY KEY, project_id TEXT NOT NULL, parent_id TEXT, + slug TEXT NOT NULL, directory TEXT NOT NULL, title TEXT NOT NULL, + version TEXT NOT NULL, agent TEXT, model TEXT, + cost REAL NOT NULL DEFAULT 0, + tokens_input INTEGER NOT NULL DEFAULT 0, + tokens_output INTEGER NOT NULL DEFAULT 0, + tokens_reasoning INTEGER NOT NULL DEFAULT 0, + tokens_cache_read INTEGER NOT NULL DEFAULT 0, + tokens_cache_write INTEGER NOT NULL DEFAULT 0, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + time_archived INTEGER, time_compacting INTEGER); + +CREATE TABLE message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE part ( + id TEXT PRIMARY KEY, message_id TEXT NOT NULL, session_id TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +CREATE TABLE session_message ( + id TEXT PRIMARY KEY, session_id TEXT NOT NULL, type TEXT NOT NULL, + time_created INTEGER NOT NULL, time_updated INTEGER NOT NULL, + data TEXT NOT NULL); + +INSERT INTO session + (id, project_id, parent_id, slug, directory, title, version, agent, model, time_created, time_updated, time_archived) +VALUES + ('ses_file01', 'prj_x', '', 'tidy-heron', '/work/proj', 'File-attachment session', '1.0.0', 'general', + '{"id":"claude-x","providerID":"anthropic"}', 1000, 9000, NULL); + +INSERT INTO message (id, session_id, time_created, time_updated, data) +VALUES + ('msg_j1', 'ses_file01', 2000, 9000, + '{"role":"assistant","providerID":"anthropic","modelID":"claude-x","agent":"general","cost":0.01,"tokens":{"input":200,"output":30,"reasoning":0,"cache":{"read":0,"write":0}},"time":{"created":2000,"completed":9000},"finish":"stop"}'); 
+ +INSERT INTO part (id, message_id, session_id, time_created, time_updated, data) +VALUES + ('prt_j01', 'msg_j1', 'ses_file01', 2100, 2100, '{"type":"step-start"}'), + ('prt_j02', 'msg_j1', 'ses_file01', 2300, 2300, '{"type":"file","mime":"image/png","filename":"diagram.png","url":"https://cdn.example.invalid/diagram.png"}'), + ('prt_j03', 'msg_j1', 'ses_file01', 9000, 9000, '{"type":"step-finish","reason":"stop","cost":0.01,"tokens":{"input":200,"output":30,"reasoning":0,"cache":{"read":0,"write":0}}}');