Skip to content

feat(rollout): non-ACP session-factory CONNECT seam#753

Open
xdotli wants to merge 1 commit into
feat/0.7-on-releasefrom
feat/0.7-omnigent-session-factory
Open

feat(rollout): non-ACP session-factory CONNECT seam#753
xdotli wants to merge 1 commit into
feat/0.7-on-releasefrom
feat/0.7-omnigent-session-factory

Conversation

@xdotli

@xdotli xdotli commented Jun 14, 2026

Copy link
Copy Markdown
Member

What

Adds a generic, additive kernel seam so a registered agent can be driven over BenchFlow's transport-agnostic Session protocol instead of ACP. ACP stays the default in every path; nothing changes unless an agent's config declares protocol="session-factory".

This is the framework half of the Omnigent integration (the agent itself lives out-of-core in benchflow-ai/agents#7). The non-ACP Session/Agent protocol plane already exists on this branch; this only wires the kernel to resolve a session_factory entrypoint and drive the returned Session.

Pieces

  • registry.pyAgentConfig.session_factory field + "session-factory" in VALID_PROTOCOLS.
  • agents/protocol.py — declare Session.on_change (the assignable streaming hook the kernel already sets on every session, ACP and non-ACP) and document the optional close() lifecycle hook + the steps event-dict shape, so the contract reflects what the kernel actually drives.
  • rollout/_session_factory.py (new) — the three cohesive helpers (is_session_factory_agent / resolve_session_factory / connect_session_factory), typed, kept out of the 2k-line engine.
  • rollout/__init__.py — one _open_session dispatch shared by both the primary connect and the role-swap reconnect (no duplicated if/else); an explicit _is_session_path flag the post-connect methods branch on (instead of overloading _acp_client is None, which also means "not connected"); _execute_session_prompts merged into the existing try/except AgentPromptTimeoutError + _commit_acp_execution; non-ACP branches of _attach_trajectory_writer and disconnect; and a shared _commit_partial_capture core feeding both partial-capture methods (which also fixes a latent divergence — the session path now honours _terminal_timeout like the ACP path).

Verified

End-to-end on Daytona (x86_64) with omnigent-pi + deepseek/deepseek-chat:

Task Result
hello-world reward 1.0, trajectory_source: session, error: None
citation-check (real) reward 1.0, 9/9 verifier tests

Tests: 102 rollout/trajectory unit tests pass and 611 agent/protocol/registry tests pass (1 skipped). The single failure in test_agent_router_cli_e2e.py (a bench agent run deprecation-string assertion) pre-exists on feat/0.7-on-release — it reproduces with this change stashed and is unrelated to the rollout/protocol seam.

This change went through a structural-quality review; the dispatch dedup, the explicit _is_session_path flag, the shared partial-capture core, and the honest Session.on_change declaration are the review-driven shape.

Operational note

The non-ACP path must run with usage tracking on (auto, the default, or required) — not off. omnigent's model calls run inside the sandbox, so they must route through BenchFlow's litellm usage proxy to be captured. With usage_tracking="off", the 0.7 zero-activity guard (total_tokens==0 AND n_tool_calls==0) treats the run as a silent provider failure and nulls the reward.


Note

Medium Risk
Touches core rollout connect/execute/disconnect and trajectory capture; behavior is gated on explicit session-factory config, but mistakes in branching could affect multi-scene rollouts and partial trajectories.

Overview
Introduces a generic kernel seam so agents registered with protocol="session-factory" and a module:callable session_factory entrypoint are connected and executed through the transport-agnostic Session protocol instead of ACP. ACP remains the default for all existing agents.

Registry & contract: AgentConfig gains session_factory and "session-factory" in VALID_PROTOCOLS. The Session protocol now explicitly documents and declares on_change (trajectory streaming) and optional close().

Rollout engine: New rollout/_session_factory.py resolves the factory, passes resolved agent_env into connect, and returns a live Session. connect / connect_as funnel through _open_session, with _is_session_path distinguishing non-ACP from “not connected.” Non-ACP runs use _execute_session_prompts, a session-native trajectory sink on steps, duck-typed close on disconnect, and partial_session / session trajectory sources. _commit_partial_capture unifies ACP and session partial-capture (including _terminal_timeout on the session path).

Reviewed by Cursor Bugbot for commit ce89918. Bugbot is set up for automated code reviews on this repo. Configure here.

Add the generic kernel seam that lets a registered agent ride the non-ACP
Session path instead of ACP. ACP stays the default; this is additive.

- registry.py: AgentConfig.session_factory + "session-factory" in VALID_PROTOCOLS.
- agents/protocol.py: declare Session.on_change (the assignable streaming hook
  the kernel already sets on every session, ACP and non-ACP) and document the
  optional close() lifecycle hook + the steps event-dict shape — so the Session
  contract reflects what the kernel actually drives.
- rollout/_session_factory.py (new): the 3 cohesive helpers
  (is_session_factory_agent / resolve_session_factory / connect_session_factory),
  typed, kept out of the 2k-line engine.
- rollout/__init__.py: one `_open_session` dispatch shared by the primary connect
  and the role-swap reconnect (no duplicated if/else); an explicit
  `_is_session_path` flag the post-connect methods branch on (instead of
  overloading `_acp_client is None`); `_execute_session_prompts` merged into the
  existing try/except AgentPromptTimeoutError + _commit_acp_execution; non-ACP
  branches of `_attach_trajectory_writer` and `disconnect`; and a shared
  `_commit_partial_capture` core feeding both the ACP and session partial-capture
  methods (which also fixes a divergence — the session path now honours
  _terminal_timeout like the ACP path).

Verified on Daytona (x86_64) with omnigent-pi + deepseek-chat: reward 1.0 on
hello-world and the real citation-check task (all 9 verifier tests),
trajectory_source="session". 102 rollout/trajectory unit tests + 611 agent/
protocol/registry tests pass (the 1 unrelated agent-router-CLI failure
pre-exists on feat/0.7-on-release). NOTE: the non-ACP path must run with usage
tracking on (auto/required) so in-sandbox model calls route through the litellm
proxy — with usage off, the 0.7 zero-activity guard nulls the reward.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce89918700

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

# Assignable streaming hook (see the class docstring). Declared here because
# the kernel sets it on every session it drives — part of the real contract,
# not an ACP-only detail.
on_change: Callable[[Session], None] | None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Expose on_change on ACPSessionAdapter

Adding on_change as a required Session member makes the existing ACP adapter stop satisfying the runtime-checkable protocol: ACPSessionAdapter.__init__ only sets _client and _ask_user_handler, so isinstance(adapter, Session) is false for ACP sessions. This breaks callers/tests that rely on the documented adapter honoring the Session contract; add a delegating on_change property or initialize the attribute on the adapter as well.

Useful? React with 👍 / 👎.

Comment on lines +33 to +35
return getattr(agent_cfg, "protocol", "acp") == "session-factory" and bool(
getattr(agent_cfg, "session_factory", "")
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Route missing factories through the validator

When a config declares protocol == "session-factory" but has an empty session_factory, this predicate returns false, so _open_session silently takes the ACP path instead of calling resolve_session_factory and failing with the intended configuration error. This is especially easy to hit for runtime-registered agents because register_agent() currently has no way to pass the new session_factory field, causing them to be stored with the empty default and then launched as ACP.

Useful? React with 👍 / 👎.

@cursor

cursor Bot commented Jun 14, 2026

Copy link
Copy Markdown

Bugbot couldn't run - usage limit reached

Bugbot is counted against Cursor usage for this user or team, and this run hit a usage or spend limit.

A user or team admin can review and increase usage limits in the Cursor dashboard.

(requestId: serverGenReqId_56d6b127-96dd-464e-bee3-66d2aa8a5dcf)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant