feat(agent): ProcedureWatcher flags stalled in-conduct procedures by xmap · Pull Request #304 · xmap/cora

xmap · 2026-06-22T07:11:09Z

Summary

CORA's 8th seeded agent and first liveness automation on the Operation BC. A deterministic, flag-only, composition-root periodic watcher: each tick it lists in-conduct procedures (Running / Held) and records one Decision(context=ProcedureProgress, choice=Stall) per stall episode for any that has sat past an operator window without progressing. It issues no command (it surfaces the stall so a human acts before an experiment hangs unnoticed mid-procedure). Procedure is a distinct aggregate from Run, so this is a liveness gap RunSupervisor does not cover. Off by default; gates on Actor.active.
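The tick described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ProcedureRow`, `DecisionLog`, `record_once`, and `tick` are hypothetical names, the episode key `(procedure_id, last_status_changed_at)` is an assumption about how "one Decision per stall episode" might be made idempotent, and the activity-recency check is deliberately left out to keep the shape visible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ProcedureRow:
    """Hypothetical stand-in for a procedure summary projection row."""
    procedure_id: str
    status: str
    last_status_changed_at: datetime


@dataclass
class DecisionLog:
    """Hypothetical stand-in for the Decision write path."""
    recorded: list = field(default_factory=list)
    _seen: set = field(default_factory=set)

    def record_once(self, row: ProcedureRow) -> bool:
        # Assumption: an episode is identified by the procedure and the
        # status timestamp it stalled on, so re-ticking does not re-flag.
        episode = (row.procedure_id, row.last_status_changed_at)
        if episode in self._seen:
            return False
        self._seen.add(episode)
        self.recorded.append((row.procedure_id, "ProcedureProgress", "Stall"))
        return True


def tick(rows, log: DecisionLog, *, now: datetime, window: timedelta,
         actor_active: bool) -> int:
    """Flag-only pass: records decisions, issues no command."""
    if not actor_active:  # kill switch: a deactivated agent writes nothing
        return 0
    flagged = 0
    for row in rows:
        if row.status not in ("Running", "Held"):  # only in-conduct procedures
            continue
        # Assumption: "past the window" means strictly greater than it.
        if now - row.last_status_changed_at > window:
            flagged += log.record_once(row)
    return flagged
```

Running the same tick twice over an unchanged stalled procedure records a single decision, which is the idempotency property the unit tests in this PR are described as covering.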

Unblocked by #276 (Held/Resumed + last_status_changed_at materialized on every procedure transition).

The load-bearing guard: the anti-false-flag fold

Appending activity steps does not advance proj_operation_procedure_summary.last_status_changed_at (the projection NO-OPs ProcedureActivitiesLogbookOpened against it; activity is orthogonal to lifecycle). A Running procedure that is actively logging steps would therefore look frozen by its status timestamp alone, and keying on that timestamp would false-flag an actively progressing conduct. A foolable watchdog is worse than none. So a Running candidate already past its status-timestamp window gets one read of the latest activity recorded_at before it is flagged; Held is not folded (a paused conduct accepts no activity). This mirrors ClearanceWatcher folding ReviewStep.decided_at for UnderReview.
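The fold can be illustrated with a small decision function. This is a sketch under stated assumptions: `should_flag` and `read_latest_activity` are hypothetical names, the comparison treats "past the window" as strictly greater, and a Running procedure with no recorded activity at all is assumed to fall back to its status timestamp. The "cannot-tell defer" case mentioned in the tests is not modeled here.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def should_flag(
    status: str,
    last_status_changed_at: datetime,
    read_latest_activity: Callable[[], Optional[datetime]],
    *,
    now: datetime,
    window: timedelta,
) -> bool:
    """Decide whether an in-conduct procedure counts as stalled."""
    if now - last_status_changed_at <= window:
        return False  # still inside the window: no activity read needed
    if status == "Held":
        return True  # a paused conduct accepts no activity, so nothing to fold
    if status != "Running":
        return False  # defensive: anything else is not in conduct
    # Running and past the status window: spend exactly one activity read.
    latest = read_latest_activity()
    if latest is None:
        return True  # assumption: no activity means the status timestamp stands
    return now - latest > window
```

Passing the lookup as a callable keeps the read lazy: candidates inside the window and Held candidates never touch the activity table.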

Activity recency had no existing read path (activity rows land in the write-only ActivityStore side table; the aggregate stream carries only the one-time ProcedureActivitiesLogbookOpened marker). So this adds a BC-local ProcedureActivityLookup read port (+ Postgres adapter + in-memory stub), keyed on recorded_at (the CORA write-time trust anchor, not the spoofable sampled_at), riding a new (procedure_id, recorded_at DESC) index. Mirrors the RunChannelLookup pattern.
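A read port of this kind, with an in-memory stub, might look like the sketch below. The method name `latest_recorded_at`, the `ActivityRow` shape, and the synchronous signature are assumptions for illustration; the actual port in this PR may differ (for example, it may be async or return a DTO).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Protocol


@dataclass(frozen=True)
class ActivityRow:
    """Hypothetical activity row with both timestamps."""
    procedure_id: str
    recorded_at: datetime  # write-time, set by the system
    sampled_at: datetime   # client-supplied, so not trusted for recency


class ProcedureActivityLookup(Protocol):
    """Read port: latest activity write time for one procedure."""

    def latest_recorded_at(self, procedure_id: str) -> Optional[datetime]: ...


class InMemoryProcedureActivityLookup:
    """Stub adapter for tests; structurally satisfies the port above."""

    def __init__(self, rows: Iterable[ActivityRow] = ()) -> None:
        self._rows = list(rows)

    def latest_recorded_at(self, procedure_id: str) -> Optional[datetime]:
        # Only recorded_at is consulted; sampled_at is ignored on purpose.
        times = [r.recorded_at for r in self._rows
                 if r.procedure_id == procedure_id]
        return max(times, default=None)
```

A Postgres adapter would answer the same question with a single-row query ordered by recorded_at descending, which is the access pattern a (procedure_id, recorded_at DESC) index serves.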

naming-r3

context ProcedureProgress (lifecycle dimension, family-clean with ClearanceProgress; not the state-noun ProcedureStall, not the existing conduct context ProcedureExecution)
choice Stall (Flag and Stale are taken by sibling contexts, and choice tokens must be globally unique in the DecisionChoice projection)
agent kind ProcedureWatcher (two-word aggregate-named, like ClearanceWatcher / CalibrationWatcher)
agent id in a new 0c0c block
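The global-uniqueness constraint on choice tokens lends itself to a disjointness check of the following shape. This is only a sketch: `overlapping_tokens` is a hypothetical helper, and `SiblingA` / `SiblingB` are placeholder context names, since the description does not say which sibling contexts own Flag and Stale.

```python
def overlapping_tokens(vocabularies: dict) -> dict:
    """Return every choice token claimed by more than one context."""
    owners: dict = {}
    for context, tokens in vocabularies.items():
        for token in tokens:
            owners.setdefault(token, []).append(context)
    return {token: sorted(contexts)
            for token, contexts in owners.items()
            if len(contexts) > 1}


# Placeholder sibling contexts; only the token names come from the PR text.
vocabularies = {
    "SiblingA": frozenset({"Flag"}),
    "SiblingB": frozenset({"Stale"}),
    "ProcedureProgress": frozenset({"Stall"}),
}
```

With Stall as the new token the check returns an empty mapping; reusing Flag or Stale for ProcedureProgress would report the collision along with both owning contexts.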

Tests / verification

New unit tests: vocab disjointness (unions every closed sibling set), the in-memory lookup stub, the watcher runtime (is_stalled boundary, the Running fold both ways, Held no-fold, cannot-tell defer, defensive status guard, idempotency, actor-absent / disabled no-ops), the seed shape, and the default-lookup selector.
New integration test for the Postgres adapter (real DB).
Local: ruff, pyright, tach, architecture (26,895), full unit tier (10,463), mkdocs --strict all green.

Deferred

An act rung (auto-Hold / warn-at-resume); a count field on the recency DTO; promoting ProcedureActivityLookup to infrastructure/ports (rule-of-three, on a real second cross-BC consumer).

🤖 Generated with Claude Code

CORA's 8th seeded agent and first liveness automation on the Operation BC. A deterministic, flag-only, composition-root periodic watcher: each tick it lists in-conduct procedures (Running / Held) and records one Decision(context=ProcedureProgress, choice=Stall) per stall episode for any that has sat past an operator window without progressing. It issues no command (it surfaces the stall so a human acts before an experiment hangs unnoticed mid-procedure). Procedure is a distinct aggregate from Run, so this is a liveness gap RunSupervisor does not cover.

The load-bearing guard is the anti-false-flag fold. Appending activity steps does not advance proj_operation_procedure_summary.last_status_changed_at (the projection NO-OPs ProcedureActivitiesLogbookOpened against it; activity is orthogonal to lifecycle), so a Running procedure actively logging steps would look frozen by its status timestamp alone. Keying on it without folding in activity recency would false-flag an actively-progressing conduct, a foolable watchdog that is worse than none. So a Running candidate already past its status-timestamp window gets one read of the latest activity recorded_at before it is flagged; Held is not folded (a paused conduct accepts no activity). This mirrors ClearanceWatcher folding ReviewStep.decided_at for UnderReview.

Activity recency had no existing read path: activity rows land in the write-only ActivityStore side table and the aggregate stream carries only the one-time ProcedureActivitiesLogbookOpened marker. So this adds a BC-local ProcedureActivityLookup read port (+ Postgres adapter + in-memory stub), keyed on recorded_at (the CORA write-time trust anchor, not the spoofable sampled_at), riding a new (procedure_id, recorded_at DESC) index. Mirrors the RunChannelLookup pattern.

naming-r3: context ProcedureProgress (the lifecycle dimension, family-clean with ClearanceProgress; not the state-noun ProcedureStall, not the existing conduct context ProcedureExecution); choice Stall (Flag and Stale are taken by sibling contexts, and choice tokens must be globally unique in the DecisionChoice projection); agent kind ProcedureWatcher (two-word aggregate-named, like ClearanceWatcher / CalibrationWatcher). Agent id in a new 0c0c block.

Off by default, gates on Actor.active.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-06-22T07:21:44Z

Coverage report


File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
apps/api/src/cora/agent
__init__.py
seed_procedure_watcher.py
apps/api/src/cora/api
_procedure_watcher.py					132, 287
main.py
apps/api/src/cora/decision/aggregates/decision
state.py
apps/api/src/cora/infrastructure
config.py
apps/api/src/cora/operation/adapters
postgres_procedure_activity_lookup.py
apps/api/src/cora/operation/ports
__init__.py
procedure_activity_lookup.py
Project Total

This report was generated by python-coverage-comment-action

Folds the two in-scope gaps the fleet review found:

- Paginated drain was untested (the fake hardcoded next_cursor=None), so the `cursor = page.next_cursor` continuation never ran; a mutant dropping it survived. Adds test_tick_drains_paginated_procedures: a stale procedure on page 2 is reached only if the cursor advances.
- The Actor.active kill switch was only tested via the actor-absent arm. Adds test_tick_is_noop_when_watcher_actor_deactivated: seed then deactivate the agent Actor and assert the tick writes nothing (pins the `not actor.active` disjunct).

Also corrects the seed docstring: the watcher issues no write command, but it does issue an authz-gated ListProcedures read each tick, so under a real Authorize policy the agent principal still needs that read grant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
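The pagination gap is easy to picture with a cursor-drain loop and a two-page fake. This is a minimal sketch, not the code under test: `Page`, `drain`, and `fake_list_page` are hypothetical names, and only the `cursor = page.next_cursor` continuation is taken from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical page of procedure ids with a continuation cursor."""
    items: List[str]
    next_cursor: Optional[str]


def drain(list_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every item across all pages, following next_cursor."""
    cursor: Optional[str] = None
    while True:
        page = list_page(cursor)
        yield from page.items
        if page.next_cursor is None:
            return
        cursor = page.next_cursor  # dropping this line re-reads page 1 forever


def fake_list_page(cursor: Optional[str]) -> Page:
    """Two-page fake: the stale procedure lives only on page 2."""
    pages = {
        None: Page(["proc-1"], "cursor-2"),
        "cursor-2": Page(["proc-2-stale"], None),
    }
    return pages[cursor]
```

A fake that always returns `next_cursor=None` exercises only the first iteration, which is why a mutant that removes the continuation can survive; a fake with a second page makes the stale item reachable only when the cursor actually advances.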

xmap enabled auto-merge (squash) June 22, 2026 07:38

xmap merged commit f33ed5c into main Jun 22, 2026
16 checks passed

xmap deleted the worktree-procedure-watcher branch June 22, 2026 07:48

xmap merged 2 commits into main from worktree-procedure-watcher

1 participant
