feat(rollout): non-ACP session-factory CONNECT seam #753
Add the generic kernel seam that lets a registered agent ride the non-ACP Session path instead of ACP. ACP stays the default; this is additive.

- registry.py: AgentConfig.session_factory + "session-factory" in VALID_PROTOCOLS.
- agents/protocol.py: declare Session.on_change (the assignable streaming hook the kernel already sets on every session, ACP and non-ACP) and document the optional close() lifecycle hook + the steps event-dict shape, so the Session contract reflects what the kernel actually drives.
- rollout/_session_factory.py (new): the 3 cohesive helpers (is_session_factory_agent / resolve_session_factory / connect_session_factory), typed, kept out of the 2k-line engine.
- rollout/__init__.py: one `_open_session` dispatch shared by the primary connect and the role-swap reconnect (no duplicated if/else); an explicit `_is_session_path` flag the post-connect methods branch on (instead of overloading `_acp_client is None`); `_execute_session_prompts` merged into the existing try/except AgentPromptTimeoutError + _commit_acp_execution; non-ACP branches of `_attach_trajectory_writer` and `disconnect`; and a shared `_commit_partial_capture` core feeding both the ACP and session partial-capture methods (which also fixes a divergence: the session path now honours _terminal_timeout like the ACP path).

Verified on Daytona (x86_64) with omnigent-pi + deepseek-chat: reward 1.0 on hello-world and the real citation-check task (all 9 verifier tests), trajectory_source="session". 102 rollout/trajectory unit tests + 611 agent/protocol/registry tests pass (the 1 unrelated agent-router-CLI failure pre-exists on feat/0.7-on-release).

NOTE: the non-ACP path must run with usage tracking on (auto/required) so in-sandbox model calls route through the litellm proxy. With usage off, the 0.7 zero-activity guard nulls the reward.
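The zero-activity guard in the note above can be pictured roughly as follows. This is a minimal sketch, not BenchFlow's implementation: `RunUsage` and `apply_zero_activity_guard` are hypothetical names, the token and tool-call counts are illustrative, and the condition (`total_tokens == 0` and `n_tool_calls == 0`) is the one quoted in the PR description. It assumes that a run with usage tracking off records no activity at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunUsage:
    """Hypothetical container for the activity counters a run recorded."""

    total_tokens: int = 0
    n_tool_calls: int = 0


def apply_zero_activity_guard(
    reward: Optional[float], usage: RunUsage
) -> Optional[float]:
    """Null the reward when the run shows no observable activity.

    Assumption: mirrors only the condition quoted in this PR
    (total_tokens == 0 AND n_tool_calls == 0); the real guard may check more.
    """
    if usage.total_tokens == 0 and usage.n_tool_calls == 0:
        return None  # treated as a silent provider failure
    return reward


# Usage tracking on: the proxy captured activity, so the reward survives.
tracked = apply_zero_activity_guard(1.0, RunUsage(total_tokens=1834, n_tool_calls=6))

# Usage tracking off: nothing was captured, so the same reward is nulled.
untracked = apply_zero_activity_guard(1.0, RunUsage())
```

Under these assumptions a passing verifier result is still discarded when no usage was captured, which is why the session path needs the proxy in the loop.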
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce89918700
    # Assignable streaming hook (see the class docstring). Declared here because
    # the kernel sets it on every session it drives — part of the real contract,
    # not an ACP-only detail.
    on_change: Callable[[Session], None] | None
Expose on_change on ACPSessionAdapter
Adding on_change as a required Session member makes the existing ACP adapter stop satisfying the runtime-checkable protocol: ACPSessionAdapter.__init__ only sets _client and _ask_user_handler, so isinstance(adapter, Session) is false for ACP sessions. This breaks callers/tests that rely on the documented adapter honoring the Session contract; add a delegating on_change property or initialize the attribute on the adapter as well.
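The failure mode can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the project's code: `Session` here is a simplified protocol with a hypothetical `prompt` method, and the two adapter classes only mimic the shape described in this comment. It shows that a `runtime_checkable` protocol with a data member makes `isinstance` require that attribute on the instance.

```python
from typing import Callable, Optional, Protocol, runtime_checkable


@runtime_checkable
class Session(Protocol):
    """Simplified stand-in for the real Session protocol (assumption)."""

    on_change: Optional[Callable[["Session"], None]]

    def prompt(self, text: str) -> str: ...  # hypothetical method


class AdapterWithoutHook:
    """Mimics an adapter whose __init__ never sets on_change."""

    def __init__(self, client: object) -> None:
        self._client = client

    def prompt(self, text: str) -> str:
        return text


class AdapterWithHook(AdapterWithoutHook):
    """Same adapter, with the attribute initialised so the check passes."""

    def __init__(self, client: object) -> None:
        super().__init__(client)
        self.on_change: Optional[Callable[[Session], None]] = None


# isinstance() on a runtime-checkable protocol checks that every member exists.
missing = isinstance(AdapterWithoutHook(client=None), Session)
present = isinstance(AdapterWithHook(client=None), Session)
```

Here `missing` is `False` and `present` is `True`. Initialising the attribute to `None` is enough for the check; a delegating property would satisfy it as well.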
    return getattr(agent_cfg, "protocol", "acp") == "session-factory" and bool(
        getattr(agent_cfg, "session_factory", "")
    )
Route missing factories through the validator
When a config declares protocol == "session-factory" but has an empty session_factory, this predicate returns false, so _open_session silently takes the ACP path instead of calling resolve_session_factory and failing with the intended configuration error. This is especially easy to hit for runtime-registered agents because register_agent() currently has no way to pass the new session_factory field, causing them to be stored with the empty default and then launched as ACP.
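One possible shape for the fix, sketched under assumptions: the predicate decides on `protocol` alone and the resolver owns validation. The function names match the helpers named in this PR, but the bodies, the exception types, the pared-down `AgentConfig`, and the `pick_path` dispatch are illustrative only. The `module:callable` entrypoint format is taken from the PR overview, and `json:loads` merely stands in for a real factory.

```python
import importlib
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AgentConfig:
    """Pared-down stand-in with only the two fields discussed (assumption)."""

    protocol: str = "acp"
    session_factory: str = ""


def is_session_factory_agent(agent_cfg: Any) -> bool:
    # Decide on the declared protocol alone, so an empty session_factory
    # cannot silently fall through to the ACP path.
    return getattr(agent_cfg, "protocol", "acp") == "session-factory"


def resolve_session_factory(agent_cfg: Any) -> Callable[..., Any]:
    # Validation lives here: a missing or malformed entrypoint is a config error.
    spec = getattr(agent_cfg, "session_factory", "") or ""
    module_name, sep, attr = spec.partition(":")
    if not (module_name and sep and attr):
        raise ValueError(
            "protocol='session-factory' needs session_factory='module:callable', "
            f"got {spec!r}"
        )
    factory = getattr(importlib.import_module(module_name), attr)
    if not callable(factory):
        raise TypeError(f"{spec!r} does not resolve to a callable")
    return factory


def pick_path(agent_cfg: Any) -> str:
    """Hypothetical dispatch: report which connect path would be taken."""
    if is_session_factory_agent(agent_cfg):
        resolve_session_factory(agent_cfg)  # raises on a bad config
        return "session"
    return "acp"
```

With this split, `pick_path(AgentConfig())` returns `"acp"`, a config with `session_factory="json:loads"` returns `"session"`, and a `session-factory` config with an empty entrypoint raises instead of launching as ACP.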
What

Adds a generic, additive kernel seam so a registered agent can be driven over BenchFlow's transport-agnostic `Session` protocol instead of ACP. ACP stays the default in every path; nothing changes unless an agent's config declares `protocol="session-factory"`.

This is the framework half of the Omnigent integration (the agent itself lives out-of-core in benchflow-ai/agents#7). The non-ACP `Session`/`Agent` protocol plane already exists on this branch; this only wires the kernel to resolve a `session_factory` entrypoint and drive the returned `Session`.

Pieces
- `registry.py`: `AgentConfig.session_factory` field + `"session-factory"` in `VALID_PROTOCOLS`.
- `agents/protocol.py`: declare `Session.on_change` (the assignable streaming hook the kernel already sets on every session, ACP and non-ACP) and document the optional `close()` lifecycle hook + the `steps` event-dict shape, so the contract reflects what the kernel actually drives.
- `rollout/_session_factory.py` (new): the three cohesive helpers (`is_session_factory_agent` / `resolve_session_factory` / `connect_session_factory`), typed, kept out of the 2k-line engine.
- `rollout/__init__.py`: one `_open_session` dispatch shared by both the primary connect and the role-swap reconnect (no duplicated if/else); an explicit `_is_session_path` flag the post-connect methods branch on (instead of overloading `_acp_client is None`, which also means "not connected"); `_execute_session_prompts` merged into the existing `try/except AgentPromptTimeoutError` + `_commit_acp_execution`; non-ACP branches of `_attach_trajectory_writer` and `disconnect`; and a shared `_commit_partial_capture` core feeding both partial-capture methods (which also fixes a latent divergence: the session path now honours `_terminal_timeout` like the ACP path).

Verified
End-to-end on Daytona (x86_64) with `omnigent-pi` + `deepseek/deepseek-chat`: `trajectory_source: session`, `error: None`.

Tests: 102 rollout/trajectory unit tests pass and 611 agent/protocol/registry tests pass (1 skipped). The single failure in `test_agent_router_cli_e2e.py` (a `bench agent run` deprecation-string assertion) pre-exists on `feat/0.7-on-release`; it reproduces with this change stashed and is unrelated to the rollout/protocol seam.

This change went through a structural-quality review; the dispatch dedup, the explicit `_is_session_path` flag, the shared partial-capture core, and the honest `Session.on_change` declaration are the review-driven shape.

Operational note
The non-ACP path must run with usage tracking on (`auto`, the default, or `required`), not `off`. omnigent's model calls run inside the sandbox, so they must route through BenchFlow's litellm usage proxy to be captured. With `usage_tracking="off"`, the 0.7 zero-activity guard (`total_tokens==0 AND n_tool_calls==0`) treats the run as a silent provider failure and nulls the reward.

Note
Medium Risk
Touches core rollout connect/execute/disconnect and trajectory capture; behavior is gated on explicit `session-factory` config, but mistakes in branching could affect multi-scene rollouts and partial trajectories.

Overview
Introduces a generic kernel seam so agents registered with `protocol="session-factory"` and a `module:callable` `session_factory` entrypoint are connected and executed through the transport-agnostic `Session` protocol instead of ACP. ACP remains the default for all existing agents.

Registry & contract: `AgentConfig` gains `session_factory` and `"session-factory"` in `VALID_PROTOCOLS`. The `Session` protocol now explicitly documents and declares `on_change` (trajectory streaming) and optional `close()`.

Rollout engine: New `rollout/_session_factory.py` resolves the factory, passes resolved `agent_env` into `connect`, and returns a live `Session`. `connect`/`connect_as` funnel through `_open_session`, with `_is_session_path` distinguishing non-ACP from “not connected.” Non-ACP runs use `_execute_session_prompts`, a session-native trajectory sink on `steps`, duck-typed `close` on disconnect, and `partial_session`/`session` trajectory sources. `_commit_partial_capture` unifies ACP and session partial-capture (including `_terminal_timeout` on the session path).

Reviewed by Cursor Bugbot for commit ce89918.