feat(agent): ProcedureWatcher flags stalled in-conduct procedures #304
Merged
CORA's 8th seeded agent and first liveness automation on the Operation BC. A deterministic, flag-only, composition-root periodic watcher: each tick it lists in-conduct procedures (Running / Held) and records one Decision(context=ProcedureProgress, choice=Stall) per stall episode for any that has sat past an operator window without progressing. It issues no command (it surfaces the stall so a human acts before an experiment hangs unnoticed mid-procedure). Procedure is a distinct aggregate from Run, so this is a liveness gap RunSupervisor does not cover.

The load-bearing guard is the anti-false-flag fold. Appending activity steps does not advance proj_operation_procedure_summary.last_status_changed_at (the projection NO-OPs ProcedureActivitiesLogbookOpened against it; activity is orthogonal to lifecycle), so a Running procedure actively logging steps would look frozen by its status timestamp alone. Keying on it without folding in activity recency would false-flag an actively-progressing conduct, a foolable watchdog that is worse than none. So a Running candidate already past its status-timestamp window gets one read of the latest activity recorded_at before it is flagged; Held is not folded (a paused conduct accepts no activity). This mirrors ClearanceWatcher folding ReviewStep.decided_at for UnderReview.

Activity recency had no existing read path: activity rows land in the write-only ActivityStore side table and the aggregate stream carries only the one-time ProcedureActivitiesLogbookOpened marker. So this adds a BC-local ProcedureActivityLookup read port (+ Postgres adapter + in-memory stub), keyed on recorded_at (the CORA write-time trust anchor, not the spoofable sampled_at), riding a new (procedure_id, recorded_at DESC) index. Mirrors the RunChannelLookup pattern.
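The fold described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the method name `latest_recorded_at`, the `ProcedureSummary` dataclass, the `is_stalled` helper, and the strict `>` comparison against the window are all assumptions made for the example. Only the port name `ProcedureActivityLookup`, the `Running` / `Held` statuses, and the `last_status_changed_at` / `recorded_at` fields come from the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Protocol


class ProcedureActivityLookup(Protocol):
    """Read port for activity recency (method name is hypothetical)."""

    def latest_recorded_at(self, procedure_id: str) -> Optional[datetime]: ...


class InMemoryActivityLookup:
    """In-memory stub; counts reads so tests can pin when the fold runs."""

    def __init__(self, rows: Dict[str, List[datetime]]) -> None:
        self._rows = rows
        self.reads = 0

    def latest_recorded_at(self, procedure_id: str) -> Optional[datetime]:
        self.reads += 1
        return max(self._rows.get(procedure_id, []), default=None)


@dataclass(frozen=True)
class ProcedureSummary:
    """Hypothetical stand-in for one projection row."""

    procedure_id: str
    status: str  # "Running" or "Held"
    last_status_changed_at: datetime


def is_stalled(
    summary: ProcedureSummary,
    now: datetime,
    window: timedelta,
    lookup: ProcedureActivityLookup,
) -> bool:
    # Cheap check first: inside the status-timestamp window, never a candidate.
    if now - summary.last_status_changed_at <= window:
        return False
    # Held is not folded: a paused conduct accepts no activity.
    if summary.status != "Running":
        return True
    # Running candidate: one read of the latest activity recorded_at.
    latest = lookup.latest_recorded_at(summary.procedure_id)
    if latest is None:
        return True
    last_progress = max(summary.last_status_changed_at, latest)
    return now - last_progress > window
```

Ordering the checks this way keeps the extra read off the common path: only a Running procedure that already looks stale by its status timestamp costs a lookup.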
naming-r3: context ProcedureProgress (the lifecycle dimension, family-clean with ClearanceProgress; not the state-noun ProcedureStall, not the existing conduct context ProcedureExecution); choice Stall (Flag and Stale are taken by sibling contexts, and choice tokens must be globally unique in the DecisionChoice projection); agent kind ProcedureWatcher (two-word aggregate-named, like ClearanceWatcher / CalibrationWatcher). Agent id in a new 0c0c block. Off by default, gates on Actor.active.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Folds the two in-scope gaps the fleet review found:

- Paginated drain was untested (the fake hardcoded next_cursor=None), so the `cursor = page.next_cursor` continuation never ran; a mutant dropping it survived. Adds test_tick_drains_paginated_procedures: a stale procedure on page 2 is reached only if the cursor advances.
- The Actor.active kill switch was only tested via the actor-absent arm. Adds test_tick_is_noop_when_watcher_actor_deactivated: seed then deactivate the agent Actor and assert the tick writes nothing (pins the `not actor.active` disjunct).

Also corrects the seed docstring: the watcher issues no write command, but it does issue an authz-gated ListProcedures read each tick, so under a real Authorize policy the agent principal still needs that read grant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
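To make the two gaps concrete, here is a minimal sketch of a tick loop with both the kill switch and the cursor continuation, plus a two-page fake of the kind the new test needs. The names `Page`, `Actor`, `tick`, and `two_page_fake` are hypothetical, and returning a list of flagged ids stands in for recording Decisions; the real watcher's signatures are not shown in this PR text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical page of procedure ids with an opaque continuation cursor."""

    items: List[str]
    next_cursor: Optional[str]


@dataclass
class Actor:
    active: bool


def tick(
    actor: Optional[Actor],
    list_procedures: Callable[[Optional[str]], Page],
    is_stalled: Callable[[str], bool],
) -> List[str]:
    # Kill switch: both the actor-absent arm and the deactivated arm are no-ops.
    if actor is None or not actor.active:
        return []
    flagged: List[str] = []
    cursor: Optional[str] = None
    while True:
        page = list_procedures(cursor)
        flagged.extend(p for p in page.items if is_stalled(p))
        # The continuation under test: dropping this line would re-read page 1.
        cursor = page.next_cursor
        if cursor is None:
            return flagged


def two_page_fake(cursor: Optional[str]) -> Page:
    # Unlike a fake that hardcodes next_cursor=None, this one has a second page.
    pages = {
        None: Page(items=["fresh-1"], next_cursor="c2"),
        "c2": Page(items=["stale-2"], next_cursor=None),
    }
    return pages[cursor]
```

With this fake, the stale procedure sits on page 2, so it is flagged only when the cursor advances, and a deactivated actor short-circuits before any listing happens.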
Summary
CORA's 8th seeded agent and first liveness automation on the Operation BC. A deterministic, flag-only, composition-root periodic watcher: each tick it lists in-conduct procedures (`Running` / `Held`) and records one `Decision(context=ProcedureProgress, choice=Stall)` per stall episode for any that has sat past an operator window without progressing. It issues no command (it surfaces the stall so a human acts before an experiment hangs unnoticed mid-procedure). Procedure is a distinct aggregate from Run, so this is a liveness gap `RunSupervisor` does not cover. Off by default; gates on `Actor.active`.

Unblocked by #276 (Held/Resumed + `last_status_changed_at` materialized on every procedure transition).

The load-bearing guard: the anti-false-flag fold

Appending activity steps does not advance `proj_operation_procedure_summary.last_status_changed_at` (the projection NO-OPs `ProcedureActivitiesLogbookOpened` against it; activity is orthogonal to lifecycle). So a `Running` procedure actively logging steps would look frozen by its status timestamp alone, and keying on that would false-flag an actively-progressing conduct, a foolable watchdog that is worse than none. A `Running` candidate already past its status-timestamp window therefore gets one read of the latest activity `recorded_at` before it is flagged; `Held` is not folded (a paused conduct accepts no activity). Mirrors `ClearanceWatcher` folding `ReviewStep.decided_at` for `UnderReview`.

Activity recency had no existing read path (activity rows land in the write-only `ActivityStore` side table; the aggregate stream carries only the one-time `ProcedureActivitiesLogbookOpened` marker). So this adds a BC-local `ProcedureActivityLookup` read port (+ Postgres adapter + in-memory stub), keyed on `recorded_at` (the CORA write-time trust anchor, not the spoofable `sampled_at`), riding a new `(procedure_id, recorded_at DESC)` index. Mirrors the `RunChannelLookup` pattern.

naming-r3

- Context: `ProcedureProgress` (lifecycle dimension, family-clean with `ClearanceProgress`; not the state-noun `ProcedureStall`, not the existing conduct context `ProcedureExecution`)
- Choice: `Stall` (`Flag` and `Stale` are taken by sibling contexts, and choice tokens must be globally unique in the `DecisionChoice` projection)
- Agent kind: `ProcedureWatcher` (two-word aggregate-named, like `ClearanceWatcher` / `CalibrationWatcher`)
- Agent id: in a new `0c0c` block

Tests / verification

Deferred

An act rung (auto-Hold / warn-at-resume); a count field on the recency DTO; promoting `ProcedureActivityLookup` to `infrastructure/ports` (rule-of-three, on a real second cross-BC consumer).

🤖 Generated with Claude Code