feat(api): RunSupervisor advise rung for the shadow observation-signal rules by xmap · Pull Request #294 · xmap/cora

xmap · 2026-06-21T20:29:49Z

What

Promotes the RunSupervisor's three shadow observe-only rules one rung on the autonomy ladder: observe -> advise. When run_supervisor_advise_enabled is on (default off), each rule records one Decision per breach edge for a human, and still issues no command.

| Rule | Advise Decision |
| --- | --- |
| run-age run-liveness backstop (#273) | `SupervisionQuieted` |
| Rule R beam-aware rate-dropout (#288) | `SupervisionStalled` |
| Rule Q quality-below-limit (#288) | `SupervisionBreached` |

Three commits: (A) the Decision-BC vocab (7 -> 10 choices + vocab test), (B) the supervisor emission + config, (C) gate-review test additions.

Why

The shadow rules log would_flag but leave no durable record a human can triage. The advise rung records a Decision(context=RunSupervision, choice=...) per breach episode under the supervisor's identity + Authorize path, while keeping the act rung (auto-Hold) deferred. It climbs exactly one rung — no command is issued from these rules.

Trust posture (verified by the gate review)

- Off by default, a further opt-in above each rule's own enable + run_supervisor_enabled.
- Decision-only, never a command at the advise rung.
- Edge-triggered: one Decision per breach episode (off the already-walled per-rule memory); a standing breach across ticks does not re-emit; cannot-tell paths still defer (no Decision).
- Beam-free emitter for the liveness rule (it runs before the beam read); no shared-memory bleed into the beam-Hold FSM.
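
The opt-in layering and the edge trigger described above can be sketched roughly as follows. This is a minimal illustration, not the code in `_run_supervisor.py`: `AdviseConfig`, `RuleMemory`, `Advisor`, `rule_enabled`, and `tick` are hypothetical names, and the sketch assumes the per-rule memory is a single in-breach flag that is updated on every tick.

```python
from dataclasses import dataclass, field


@dataclass
class AdviseConfig:
    # Hypothetical mirror of the three opt-in layers named in this PR.
    run_supervisor_enabled: bool = False
    rule_enabled: bool = False                   # stands in for each rule's own enable
    run_supervisor_advise_enabled: bool = False  # default off


@dataclass
class RuleMemory:
    # Assumed shape of the per-rule memory: True while a breach episode lasts.
    in_breach: bool = False


@dataclass
class Advisor:
    config: AdviseConfig
    choice: str
    memory: RuleMemory = field(default_factory=RuleMemory)
    decisions: list = field(default_factory=list)
    commands: list = field(default_factory=list)  # never appended to at the advise rung

    def tick(self, would_flag: bool) -> None:
        # A rising edge is a breach on this tick that was not one on the last.
        rising_edge = would_flag and not self.memory.in_breach
        self.memory.in_breach = would_flag
        cfg = self.config
        advise_on = (
            cfg.run_supervisor_enabled
            and cfg.rule_enabled
            and cfg.run_supervisor_advise_enabled
        )
        if rising_edge and advise_on:
            # Decision only: one record per breach episode, no command issued.
            self.decisions.append({"context": "RunSupervision", "choice": self.choice})


advisor = Advisor(
    AdviseConfig(
        run_supervisor_enabled=True,
        rule_enabled=True,
        run_supervisor_advise_enabled=True,
    ),
    choice="SupervisionBreached",
)
advisor.tick(True)
advisor.tick(True)  # standing breach: no second Decision
print(len(advisor.decisions))  # 1
```

In this sketch a new Decision is recorded only after the breach clears and then recurs. How the real supervisor treats a cannot-tell tick inside an episode is not modeled here.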

Naming

SupervisionBreached is the naming-r3 rename of the originally-proposed SupervisionDoubted — "Doubted" read as the supervisor's epistemic state; "Breached" names the objective limit-crossing, family-uniform with SupervisionDeferred / Conflicted / Stalled. Both design memos were updated in lockstep.

Gate review

Focused 3-lens review on the diff: correctness/trust = ship (all 5 trust invariants sound), cross-BC/vocab = ship, test/fitness = changes-needed (a test-coverage gap, no correctness bug). Addressed in commit C: liveness edge-trigger test, plus two cannot-tell-under-advise tests (no Decision when the channel has no observation, and when the rule is disabled). A reviewer's worry that a value=None Decision could emit was verified false — the decider returns would_flag=False on a None reading; commit C pins that.
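
The None-reading behaviour that commit C pins can be illustrated with a small decider sketch. `Verdict`, `decide_quality`, and the `reason` strings are hypothetical; only `would_flag`, `snr_limit`, and the "below limit" direction of Rule Q come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verdict:
    would_flag: bool
    reason: str  # illustrative label, not part of the real decider


def decide_quality(value: Optional[float], snr_limit: Optional[float]) -> Verdict:
    """Hypothetical Rule Q decider: flag only a known reading below a set limit."""
    if snr_limit is None:
        # Rule disabled: advise must respect the rule's own enable.
        return Verdict(False, "rule-disabled")
    if value is None:
        # Cannot tell: no observation on the channel, so defer rather than flag.
        return Verdict(False, "cannot-tell")
    return Verdict(value < snr_limit, "evaluated")


print(decide_quality(None, 5.0).would_flag)  # False
print(decide_quality(3.0, 5.0).would_flag)   # True
```

Because both early returns yield `would_flag=False`, an edge-triggered emitter downstream never sees a breach edge on those paths, so no Decision can be recorded for them.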

Deferred

The act rung (reversible auto-Hold on a confirmed breach) + the act-mode sim composition guard, per the design lock.

Test plan

Unit: each disposition emits exactly one Decision under advise-on with no command; advise-off records nothing; edge-triggering for all three rules; cannot-tell gates. Decision-BC vocab parity (closed-set == Literal == 10). Full suite + architecture fitness green on every code commit.

🤖 Generated with Claude Code

Slice A of the observation-signal advise rung.

Adds SupervisionQuieted (run-age liveness backstop), SupervisionStalled (Rule R rate-dropout), and SupervisionBreached (Rule Q quality-below-limit) to the RunSupervisionChoice Literal + RUN_SUPERVISION_CHOICES frozenset (7 -> 10), with the vocab test updated to the 10-value set + a work-noun guard on the new dispositions.

WHY: promoting the shipped shadow observation-signal + run-liveness rules one rung (observe -> advise) means the supervisor records one Decision per breach edge for a human; that Decision's choice must exist in the closed set first. Decision-only dispositions (never a command).

SupervisionBreached is the naming-r3 rename of the originally-proposed SupervisionDoubted: "Doubted" read as the supervisor's epistemic state; "Breached" names the objective limit-crossing, family-uniform with Deferred / Conflicted / Stalled.

This slice adds vocabulary only; the supervisor emission lands next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
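
A minimal sketch of the closed-set parity check this slice relies on. `RunSupervisionChoice` and `RUN_SUPERVISION_CHOICES` are the names used in this PR; the sketch lists only the five dispositions named in the description (the real set has 10), assumes the members are plain strings, and uses a hypothetical helper `vocab_is_in_parity`.

```python
from typing import Literal, get_args

# Only the dispositions named in this PR; the real Literal has 10 members.
RunSupervisionChoice = Literal[
    "SupervisionDeferred",
    "SupervisionConflicted",
    "SupervisionQuieted",
    "SupervisionStalled",
    "SupervisionBreached",
]

RUN_SUPERVISION_CHOICES = frozenset(
    {
        "SupervisionDeferred",
        "SupervisionConflicted",
        "SupervisionQuieted",
        "SupervisionStalled",
        "SupervisionBreached",
    }
)


def vocab_is_in_parity() -> bool:
    """True when the type-level Literal and the runtime closed set agree."""
    return frozenset(get_args(RunSupervisionChoice)) == RUN_SUPERVISION_CHOICES


print(vocab_is_in_parity())  # True
```

Keeping both forms lets a type checker reject unknown choices statically while the frozenset guards values that arrive at runtime; the parity test catches a member added to one but not the other.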

Slice B of the observation-signal advise rung.

Adds run_supervisor_advise_enabled (default off, a further opt-in above each rule's own enable) and, when on, emits exactly one Decision per breach EDGE from the three shadow rules -- still issuing NO command (advise rung):

- run-liveness backstop -> SupervisionQuieted
- Rule R rate-dropout -> SupervisionStalled
- Rule Q quality breach -> SupervisionBreached

WHY: the shadow rules (#288 / #273) log would_flag but leave no durable record a human can triage. The advise rung climbs exactly one step (observe -> advise), recording one RunSupervision Decision per breach episode for a human while keeping the act rung (auto-Hold) deferred.

Emission is edge-triggered off the already-walled per-rule memory (one Decision per episode; nothing on a standing breach across ticks), beam-free (the liveness rule runs before the beam read), and reuses the existing DecisionRegistered shape under the RunSupervisor identity + Authorize path. Shadow logging is unchanged; advise only adds the Decision. cannot-tell still defers (no Decision).

Tests cover advise-off (no Decision), each disposition under advise-on (one Decision, no command), and edge-triggering (one Decision across two ticks of a standing breach).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Gate-review follow-ups (the advise diff drew 2 ship + 1 changes_needed, the last purely a test-coverage gap; the correctness/trust lens passed clean).

Adds three tests:

- advise liveness is edge-triggered: two ticks of a standing stale Run record only ONE SupervisionQuieted Decision (parity with the quality + stall edge-trigger tests).
- advise records no Decision when the quality channel has no observation (cannot-tell -> defer; pins that the value-None path never emits, which a reviewer worried about -- the decider returns would_flag=False on None).
- advise records no Decision when the rule is disabled (snr_limit None): advise respects each rule's own enable, not just the global advise flag.

Test-only; no production change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-06-21T20:39:46Z

Coverage report

File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
apps/api/src/cora/api
_run_supervisor.py					1010
apps/api/src/cora/decision/aggregates/decision
state.py
apps/api/src/cora/infrastructure
config.py
Project Total

This report was generated by python-coverage-comment-action

The diff-coverage gate (hard 90% on changed lines) flagged _run_supervisor.py at 88.9%: the new _record_supervision_advice except ConcurrencyError branch (lines 490-491) was uncovered.

Adds an idempotency test that re-derives the same advise Decision id (via a FixedIdGenerator repeating the id) so the second append collides and is swallowed -- mirrors the existing test_record_decision_is_idempotent_on_repeated_id for the beam-Hold path.

Test-only; covers the cross-restart re-emission no-op.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
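
The idempotency path this test covers can be sketched as below. `ConcurrencyError`, `FixedIdGenerator`, and `_record_supervision_advice` are names from this PR, but their shapes here are assumptions; `InMemoryDecisionStore`, `next_id`, and the boolean return value are hypothetical and exist only to make the sketch self-contained.

```python
class ConcurrencyError(Exception):
    """Raised when an append collides with an already-recorded id (assumed shape)."""


class InMemoryDecisionStore:
    # Hypothetical stand-in for the real Decision store.
    def __init__(self) -> None:
        self.by_id: dict = {}

    def append(self, decision_id: str, decision: dict) -> None:
        if decision_id in self.by_id:
            raise ConcurrencyError(decision_id)
        self.by_id[decision_id] = decision


class FixedIdGenerator:
    # Always returns the same id, so a second record attempt collides.
    def __init__(self, fixed_id: str) -> None:
        self._fixed_id = fixed_id

    def next_id(self) -> str:
        return self._fixed_id


def record_supervision_advice(store, ids, choice: str) -> bool:
    """Record one advise Decision; a repeated id is swallowed as a no-op."""
    decision_id = ids.next_id()
    try:
        store.append(decision_id, {"context": "RunSupervision", "choice": choice})
    except ConcurrencyError:
        # Same Decision re-derived (e.g. after a restart): nothing to do.
        return False
    return True


store = InMemoryDecisionStore()
ids = FixedIdGenerator("decision-1")
print(record_supervision_advice(store, ids, "SupervisionQuieted"))  # True
print(record_supervision_advice(store, ids, "SupervisionQuieted"))  # False
print(len(store.by_id))  # 1
```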

xmap and others added 3 commits June 21, 2026 22:52

xmap merged commit c71ef08 into main Jun 22, 2026
16 checks passed

xmap deleted the worktree-supervisor-advise-rung branch June 22, 2026 04:18

xmap merged 4 commits into main from worktree-supervisor-advise-rung
