feat(api): RunSupervisor advise rung for the shadow observation-signal rules #294
Merged
Slice A of the observation-signal advise rung. Adds SupervisionQuieted (run-age liveness backstop), SupervisionStalled (Rule R rate-dropout), and SupervisionBreached (Rule Q quality-below-limit) to the RunSupervisionChoice Literal + RUN_SUPERVISION_CHOICES frozenset (7 -> 10), with the vocab test updated to the 10-value set + a work-noun guard on the new dispositions.

WHY: promoting the shipped shadow observation-signal + run-liveness rules one rung (observe -> advise) means the supervisor records one Decision per breach edge for a human; that Decision's choice must exist in the closed set first. Decision-only dispositions (never a command).

SupervisionBreached is the naming-r3 rename of the originally-proposed SupervisionDoubted: "Doubted" read as the supervisor's epistemic state; "Breached" names the objective limit-crossing, family-uniform with Deferred / Conflicted / Stalled.

This slice adds vocabulary only; the supervisor emission lands next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
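The closed-set parity this commit describes (Literal == frozenset) could be checked along the following lines. This is a minimal sketch, not the repository's code: it lists only dispositions named in this PR (the real set has 10 members, and SupervisionConflicted is assumed from the "Deferred / Conflicted / Stalled" family mention), and `vocab_is_in_parity` is a hypothetical helper name.

```python
from typing import Literal, get_args

# Illustrative subset only: the real closed set has 10 values. Members not
# named in this PR are left out rather than guessed.
RunSupervisionChoice = Literal[
    "SupervisionDeferred",
    "SupervisionConflicted",  # assumed full name of "Conflicted"
    "SupervisionQuieted",     # run-age liveness backstop
    "SupervisionStalled",     # Rule R rate-dropout
    "SupervisionBreached",    # Rule Q quality-below-limit
]

# Runtime twin of the Literal, usable for membership checks.
RUN_SUPERVISION_CHOICES: frozenset[str] = frozenset(
    {
        "SupervisionDeferred",
        "SupervisionConflicted",
        "SupervisionQuieted",
        "SupervisionStalled",
        "SupervisionBreached",
    }
)


def vocab_is_in_parity() -> bool:
    """Hypothetical helper: True when the Literal and the frozenset agree."""
    return frozenset(get_args(RunSupervisionChoice)) == RUN_SUPERVISION_CHOICES
```

A vocab test in this style fails as soon as a value is added to one declaration but not the other, which is the point of keeping the set closed.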
Slice B of the observation-signal advise rung. Adds run_supervisor_advise_enabled (default off, a further opt-in above each rule's own enable) and, when on, emits exactly one Decision per breach EDGE from the three shadow rules -- still issuing NO command (advise rung):
- run-liveness backstop -> SupervisionQuieted
- Rule R rate-dropout -> SupervisionStalled
- Rule Q quality breach -> SupervisionBreached

WHY: the shadow rules (#288 / #273) log would_flag but leave no durable record a human can triage. The advise rung climbs exactly one step (observe -> advise), recording one RunSupervision Decision per breach episode for a human while keeping the act rung (auto-Hold) deferred.

Emission is edge-triggered off the already-walled per-rule memory (one Decision per episode; nothing on a standing breach across ticks), beam-free (the liveness rule runs before the beam read), and reuses the existing DecisionRegistered shape under the RunSupervisor identity + Authorize path. Shadow logging is unchanged; advise only adds the Decision. cannot-tell still defers (no Decision).

Tests cover advise-off (no Decision), each disposition under advise-on (one Decision, no command), and edge-triggering (one Decision across two ticks of a standing breach).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
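The edge-triggering described in this commit can be illustrated with a small sketch. All names here (`EdgeMemory`, `AdviseRecorder`, `on_tick`) are hypothetical stand-ins, not the supervisor's actual classes, and the re-arm on a cleared breach is an assumption drawn from the "one Decision per episode" wording.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeMemory:
    """Hypothetical per-rule memory: the runs currently in breach."""

    in_breach: set[str] = field(default_factory=set)

    def rising_edge(self, run_id: str, would_flag: bool) -> bool:
        """Return True only on the tick where a breach episode starts."""
        if not would_flag:
            self.in_breach.discard(run_id)  # episode over (assumed re-arm)
            return False
        if run_id in self.in_breach:
            return False  # standing breach: nothing new to record
        self.in_breach.add(run_id)
        return True


@dataclass
class AdviseRecorder:
    """Hypothetical recorder: stores Decisions, never issues a command."""

    advise_enabled: bool = False  # default off, as in the PR
    decisions: list[tuple[str, str]] = field(default_factory=list)

    def on_tick(self, memory: EdgeMemory, run_id: str,
                would_flag: bool, choice: str) -> None:
        # The memory is updated whether or not advise is on, so the
        # shadow (observe) behaviour is the same in both modes.
        edge = memory.rising_edge(run_id, would_flag)
        if edge and self.advise_enabled:
            self.decisions.append((run_id, choice))
```

Under these assumptions, two consecutive ticks with `would_flag=True` for the same run append a single entry to `decisions`, and with `advise_enabled=False` nothing is appended at all.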
Gate-review follow-ups (the advise diff drew 2 ship + 1 changes_needed, the
last purely a test-coverage gap; the correctness/trust lens passed clean).
Adds three tests:
- advise liveness is edge-triggered: two ticks of a standing stale Run
record only ONE SupervisionQuieted Decision (parity with the quality +
stall edge-trigger tests).
- advise records no Decision when the quality channel has no observation
(cannot-tell -> defer; pins that the value-None path never emits, which a
reviewer worried about -- the decider returns would_flag=False on None).
- advise records no Decision when the rule is disabled (snr_limit None):
advise respects each rule's own enable, not just the global advise flag.
Test-only; no production change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
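The two cannot-tell cases pinned by these tests can be sketched as a small decider. This is an illustration under stated assumptions: `decide_quality` and `QualityVerdict` are hypothetical names, and the strict `<` comparison is a guess at what "quality-below-limit" means; only the `snr_limit` parameter, the `would_flag` field, and the None behaviour come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QualityVerdict:
    would_flag: bool
    reason: str  # hypothetical field, for illustration only


def decide_quality(value: Optional[float],
                   snr_limit: Optional[float]) -> QualityVerdict:
    """Hypothetical decider for the quality rule."""
    if snr_limit is None:
        # Rule disabled: the global advise flag does not override this.
        return QualityVerdict(False, "rule-disabled")
    if value is None:
        # No observation on the channel: cannot-tell, so defer.
        return QualityVerdict(False, "cannot-tell")
    # Assumed comparison: flag when the reading is strictly below the limit.
    return QualityVerdict(value < snr_limit, "observed")
```

Because both None paths return `would_flag=False`, an emitter that records a Decision only on a flagged edge never sees a breach in either case.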
The diff-coverage gate (hard 90% on changed lines) flagged _run_supervisor.py at 88.9%: the new _record_supervision_advice except ConcurrencyError branch (lines 490-491) was uncovered. Adds an idempotency test that re-derives the same advise Decision id (via a FixedIdGenerator repeating the id) so the second append collides and is swallowed -- mirrors the existing test_record_decision_is_idempotent_on_repeated_id for the beam-Hold path. Test-only; covers the cross-restart re-emission no-op. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
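The swallowed-collision behaviour covered by this test can be sketched as follows. The store, the locally defined `ConcurrencyError`, and `record_advice_once` are hypothetical stand-ins for the real append path and `_record_supervision_advice`; the sketch only shows the idempotency pattern the commit message describes.

```python
class ConcurrencyError(Exception):
    """Stand-in for the store's conflict error (defined here for the sketch)."""


class InMemoryDecisionStore:
    """Hypothetical append-only store keyed by Decision id."""

    def __init__(self) -> None:
        self._by_id: dict[str, str] = {}

    def append(self, decision_id: str, choice: str) -> None:
        if decision_id in self._by_id:
            raise ConcurrencyError(decision_id)
        self._by_id[decision_id] = choice

    def __len__(self) -> int:
        return len(self._by_id)


def record_advice_once(store: InMemoryDecisionStore,
                       decision_id: str, choice: str) -> bool:
    """Append the Decision; treat a repeated id as an already-done no-op."""
    try:
        store.append(decision_id, choice)
    except ConcurrencyError:
        return False  # same id re-derived (e.g. after a restart): swallow
    return True
```

Re-deriving the same id and calling `record_advice_once` twice leaves a single stored Decision, which is what a test with a repeating id generator would assert.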
What
Promotes the RunSupervisor's three shadow observe-only rules one rung on the autonomy ladder: observe -> advise. When run_supervisor_advise_enabled is on (default off), each rule records one Decision per breach edge for a human, and still issues no command.
- run-liveness backstop -> SupervisionQuieted
- Rule R rate-dropout -> SupervisionStalled
- Rule Q quality breach -> SupervisionBreached
Three commits: (A) the Decision-BC vocab (7 -> 10 choices + vocab test), (B) the supervisor emission + config, (C) gate-review test additions.
Why
The shadow rules log would_flag but leave no durable record a human can triage. The advise rung records a Decision(context=RunSupervision, choice=...) per breach episode under the supervisor's identity + Authorize path, while keeping the act rung (auto-Hold) deferred. It climbs exactly one rung: no command is issued from these rules.
Trust posture (verified by the gate review)
run_supervisor_enabled.
cannot-tell paths still defer (no Decision).
Naming
SupervisionBreached is the naming-r3 rename of the originally-proposed SupervisionDoubted: "Doubted" read as the supervisor's epistemic state; "Breached" names the objective limit-crossing, family-uniform with SupervisionDeferred / Conflicted / Stalled. Both design memos were updated in lockstep.
Gate review
Focused 3-lens review on the diff: correctness/trust = ship (all 5 trust invariants sound), cross-BC/vocab = ship, test/fitness = changes-needed (a test-coverage gap, no correctness bug). Addressed in commit C: liveness edge-trigger test, plus two cannot-tell-under-advise tests (no Decision when the channel has no observation, and when the rule is disabled). A reviewer's worry that a value=None Decision could emit was verified false: the decider returns would_flag=False on a None reading; commit C pins that.
Deferred
The act rung (reversible auto-Hold on a confirmed breach) + the act-mode sim composition guard, per the design lock.
Test plan
Unit: each disposition emits exactly one Decision under advise-on with no command; advise-off records nothing; edge-triggering for all three rules; cannot-tell gates. Decision-BC vocab parity (closed-set == Literal == 10). Full suite + architecture fitness green on every code commit.
🤖 Generated with Claude Code