Skip to content

Wire worker_tools through tool_coordinate (proposer + critics)#23

Merged
fxspeiser merged 1 commit into
mainfrom
feature/coordinate-worker-tools
May 26, 2026
Merged

Wire worker_tools through tool_coordinate (proposer + critics)#23
fxspeiser merged 1 commit into
mainfrom
feature/coordinate-worker-tools

Conversation

@fxspeiser
Copy link
Copy Markdown
Owner

Summary

Closes the follow-up logged in PR #21: `tool_coordinate` now supports the bounded inner-ReAct loop on the roles where evidence gathering is meaningful.

Design — tools run BEFORE the structured emission. A `<tool_call>` tag in the middle of a role envelope would break schema validation, so `_request_structured_with_tools` interleaves the hop loop with schema parsing:

  • tool_call tag present + within budget → dispatch, wrap, re-prompt
  • tool_call tag present + budget exhausted → refusal + one final emission round
  • no tool_call tag → parse + schema-validate (with the existing retry-once-on-failure semantics)

Roles:

  • Proposer + Critics get `worker_tools` (they benefit from fetching evidence to ground their position).
  • Synthesizer is intentionally EXCLUDED — purely combinatorial. Surfaced explicitly as `worker_tools.applies_to: ["proposer", "critic"]` on the response so operators see this.

`_request_structured` itself grows `worker_tools` + `session_id` kwargs and delegates to the new variant only when the allowlist-filtered tools list is non-empty — every other call site (debate / orchestrate / claims extractor / etc.) is unaffected.

Test plan

  • New `scripts/test_coordinate_worker_tools.py` covers `_request_structured_with_tools` happy + budget-exhaustion + identity + retry, plus end-to-end coordinate (proposer + critic fetch; synth does NOT) and the metadata surface
  • Full suite (35 scripts) passes locally

🤖 Generated with Claude Code

Closes the follow-up logged in PR #21: coordinate now supports the
bounded inner-ReAct loop, but only on the roles where evidence
gathering is meaningful.

Design choice: tool calls happen BEFORE the structured JSON emission.
A `<tool_call>` tag in the middle of a role envelope would break
schema validation, so `_request_structured_with_tools` interleaves
the hop loop with schema parsing — tool calls each cost one hop;
the response without a tool_call tag is parsed against the schema
(with the existing retry-once-on-validation-failure semantics).

Roles:
- PROPOSER + CRITICS get worker_tools when enabled. They benefit from
  fetching evidence to ground their position.
- SYNTHESIZER is intentionally EXCLUDED. Its job is purely
  combinatorial — combining proposer + critique output. Letting it
  call tools opens scope creep and external-call cost on what should
  be a pure reduction step.

Implementation:
- `_request_structured` grows kwargs `worker_tools` + `session_id`.
  When `worker_tools` is non-empty (after allowlist filtering) it
  delegates to `_request_structured_with_tools`.
- `_request_structured_with_tools` interleaves the hop loop with
  schema parsing:
  * tool_call tag present + within budget -> dispatch, wrap, re-prompt
  * tool_call tag present + budget exhausted -> refusal + one final
    emission round; that final attempt is parsed once, no retry
  * no tool_call tag -> attempt JSON parse + schema validation; on
    failure, re-prompt once with the validation errors (the standard
    `_request_structured` retry path)
- The system message hint enforces "tool_call envelopes only BEFORE
  the final JSON object" so the worker doesn't try to mix them.
- `tool_coordinate` accepts `worker_tools: [...]` arg, filters against
  the hard allowlist, and surfaces `worker_tools: {accepted, rejected,
  hop_budget, applies_to: ["proposer", "critic"]}` back on the
  response — operators see explicitly that synth is excluded.
- Tool calls roll up under the same session_id (cost, breakers, fetch
  egress budget all apply to inner calls).

Schema:
- `coordinate.input.worker_tools` added with `enum: ["fetch","verify"]`
  and a note that synth is excluded.

Tests (scripts/test_coordinate_worker_tools.py):
- `_request_structured_with_tools` happy path (one tool_call -> valid
  envelope)
- Hop budget exhaustion inside the structured loop -> 3rd request
  refused, worker still produces a valid envelope on the final round
- worker_tools empty -> identity behavior (no tool-call hint injected;
  no `inner_tool_calls` on the answer)
- Schema validation retry preserved on the tool-use path
- End-to-end coordinate: proposer + both critics each fetch once; synth
  dispatch does NOT carry the tool-call hint in its system prompt;
  inner_tool_calls recorded on per-role answers
- Legacy coordinate (no worker_tools arg): no metadata on the response,
  no extra prompt scaffolding, no fetches

Full suite (35 scripts) passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@fxspeiser fxspeiser merged commit c8f352d into main May 26, 2026
1 check passed
@fxspeiser fxspeiser deleted the feature/coordinate-worker-tools branch May 26, 2026 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant