feat(supervisor): S2 driver — MUL gate → owner advance (drive_mul_advance)#578
Conversation
…ance) S2->S4 composition on the light crate. New in kanban_actor: - mul_target(phase, qualia, mantissa): pure lowering — gate_decision_i4 -> KanbanColumn::advance_on_gate (Flow->forward, Block->Prune-where-legal, Hold->None). Integer i4, no f64/NaN. - drive_mul_advance(actor, qualia, mantissa): read the owner's phase, run the gate, and on a non-Hold gate cast KanbanMsg::Advance to the owning actor (the owner advances itself). Returns the move, or None on Hold. +1 test (5 total green): s2_driver_gate_advances_then_holds (Flow qualia + mantissa>0 -> Planning->CognitiveWork; neutral + 0 -> Hold -> no advance). clippy + fmt clean; light build, no disk/symbiont gate. Actor-side S2 consumer (the mul_phase_step node wrapper stays the single-node convenience). Remaining S2: the per-row cognitive-shader-driver owner loop over the qualia column (needs MailboxSoaView::qualia() + the shader-driver build). OUT-leg now real+tested on the light crate: S4 actor + delivery edge + S2 driver. Plan S2 annotated; AGENT_LOG cont.33. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
|
Warning Review limit reached
More reviews will be available in 1 minute and 41 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 967a72fb29
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let phase = ractor::call!(actor, |reply| KanbanMsg::Phase { reply }) | ||
| .map_err(|e| KanbanRouteError::Rpc(e.to_string()))?; | ||
| match mul_target(phase, qualia, mantissa) { |
There was a problem hiding this comment.
Make MUL advance atomic with phase read
When two S2 drivers, or a driver and deliver_kanban_step, target the same actor concurrently, this separate Phase RPC can become stale before the Advance RPC below is processed. For example, two Flow drivers can both read Planning, both compute CognitiveWork, then the second is rejected as CognitiveWork -> CognitiveWork even though the gate/DAG did not drift. The gate decision and phase transition need to run in one actor message (or otherwise be serialized with the owner state) to preserve the mailbox-as-owner guarantee under concurrent sends.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — real TOCTOU race, fixed. The two-RPC shape (Phase read, then Advance) let two concurrent Flow drivers both read Planning → both compute CognitiveWork → the second spuriously rejected.
Fix: a new KanbanMsg::MulAdvance { qualia, mantissa, reply } runs the gate (gate_decision_i4 → advance_on_gate) and the transition inside ONE serialized mailbox message, so the gate reads state.phase() at the instant of mutation — exactly the mailbox-as-owner atomicity you flagged. drive_mul_advance is now a single MulAdvance RPC (no external read-then-write gap).
Added a regression test concurrent_mul_drivers_serialize_no_spurious_rejection: two concurrent Flow drivers now chain Planning → CognitiveWork → Evaluation — both Ok(Some), neither a spurious Illegal. 6 tests green, clippy + fmt clean. (commit on the branch.)
Generated by Claude Code
…codex #578) The two-RPC drive_mul_advance (Phase read, then Advance) had a TOCTOU race: two concurrent Flow drivers could both read Planning, both compute CognitiveWork, and the second be spuriously rejected as CognitiveWork -> CognitiveWork even though the gate/DAG didn't drift. Fix: a new KanbanMsg::MulAdvance { qualia, mantissa, reply } runs the gate (gate_decision_i4 -> advance_on_gate) AND the transition inside ONE serialized mailbox message, so the gate reads the owner's phase at the instant of mutation (mailbox-as-owner atomicity). drive_mul_advance is now a single MulAdvance RPC. +1 test (6 total green): concurrent_mul_drivers_serialize_no_spurious_rejection — two concurrent Flow drivers chain Planning -> CognitiveWork -> Evaluation, both Ok, neither a spurious Illegal. clippy + fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
… advance
The actor-side half of S3 (E-SUBSTRATE-IS-THE-SCHEDULER), mirroring the S2
atomic MulAdvance pattern in lance-graph-supervisor::kanban_actor:
- KanbanMsg::Tick { at, reply } — atomic in-actor realization of the contract's
NextPhaseScheduler: a substrate version tick advances the owner along the
Rubicon forward arc (phase().next_phases().first()) in ONE serialized message,
reading the phase at the instant of mutation. Absorbing column -> None: the
no-op tick is suppressed (forward arc is legal by construction, so the
infallible advance_phase is used; no error path). This applies the codex #578
atomicity lesson to the IN-leg.
- drive_version_tick(actor, at) — thin async wrapper over Tick.
- drive_scheduled_tick(scheduler, view, at, exec, actor) — generic consumer that
drives the EXISTING VersionScheduler trait ("propose, don't dispose": the
scheduler proposes from a view, the owner disposes via Advance, None
suppresses). For custom policies that read a richer view than the owner
computes internally; documented as advisory (proposal computed outside the
owner message, so it may relay a typed Illegal rather than corrupt).
+3 tests (12 passed/0 failed under --features supervisor --lib): forward-arc
chain then suppress at absorbing; two concurrent ticks serialize along the arc
(no stale-phase collision); generic consumer drives NextPhaseScheduler
propose->dispose + suppresses an absorbing proposal. clippy + fmt clean; light
build (no lance/datafusion, no disk gate).
Remaining S3 (lance/disk-gated): wire the LIVE
LanceVersionScheduler::drive_at_latest over a real VersionedGraph::versions() to
feed `at`. The apply + no-op-suppress loop is done; only the live versions()
poll remains.
Board hygiene: plan S3 status annotated; AGENT_LOG cont.34 prepended.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
S2 driver — MUL gate → owner advance (
drive_mul_advance)The S2→S4 composition, on the same light
lance-graph-supervisorcrate. The MUL gate decides; the owner advances itself.What landed (
kanban_actor)mul_target(phase, qualia, mantissa) -> Option<KanbanColumn>— pure lowering:gate_decision_i4→KanbanColumn::advance_on_gate(Flow → forward, Block → Prune-where-legal, Hold →None). Integer i4 — no f64/NaN.drive_mul_advance(actor, qualia, mantissa)— read the owner's phase, run the gate, and on a non-Hold gatecastKanbanMsg::Advanceto the owning actor. Returns theKanbanMove, orNoneon Hold.Verified
--features supervisor(4 prior +s2_driver_gate_advances_then_holds: Flow qualia + mantissa>0 → Planning→CognitiveWork; neutral + 0 → Hold → no advance, phase unchanged).Net — the OUT-leg is now real, tested code on the light crate
Remaining: S2 per-row
cognitive-shader-driverloop over thequaliacolumn (needsMailboxSoaView::qualia()+ the shader-driver build), and S3 (the lanceLanceVersionSchedulerconsumer) + run-NaN — all need the heavier lance builds. This PR is the actor-side S2 consumer; the nodemul_phase_stepwrapper stays the single-node convenience.Changes
crates/lance-graph-supervisor/src/kanban_actor.rs—mul_target+drive_mul_advance+ 1 test.claude/plans/capstone-out-leg-wiring-v1.md— S2 status.claude/board/AGENT_LOG.md— cont.³³🤖 Generated with Claude Code
Generated by Claude Code