Bump confer / triangulate token budgets for OpenAI + Gemini by fxspeiser · Pull Request #27 · fxspeiser/crosscheck-agent

fxspeiser · 2026-05-31T20:49:57Z

Summary

Follow-up to PR #26. Same MAX_TOKENS truncation pattern is hitting `triangulate` on gpt-5 and both `confer` + `triangulate` on gemini-2.5-pro.

New `_PROVIDER_TOKEN_BUDGETS` entries (all 6144):

- `openai.triangulate` (was 2048 reasoning-default)
- `gemini.confer` (was 1500; see follow-up below)
- `gemini.triangulate` (was 1500; see follow-up below)

Follow-up worth tracking: gemini-2.5-pro is NOT currently tagged in `PROVIDER_CAPS["gemini"]["reasoning_prefixes"]`, so non-overridden purposes (`debate`, `synth`, `audit`) fall through to the smaller non-reasoning ceilings (1500 / 1024 / 768) instead of the 2048 reasoning default. This PR's per-provider override short-circuits that for confer + triangulate, but if cap-hitting shows up on gemini debate/synth/audit, the right fix is adding `"reasoning_prefixes": ("gemini-2.5-pro",)` to the gemini caps entry.
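The lookup order described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the repository's code: the dotted-string key format, the `budget_for_purpose` and `is_reasoning_model` bodies, and the `("gpt-5",)` prefix tuple are assumptions; only the numeric values and the names `_PROVIDER_TOKEN_BUDGETS` and `PROVIDER_CAPS` come from the PR text.

```python
# Assumed shapes; the real structures in crosscheck-agent may differ.
PROVIDER_CAPS = {
    "openai": {"reasoning_prefixes": ("gpt-5",)},  # assumed prefix value
    "gemini": {"reasoning_prefixes": ()},          # gemini-2.5-pro not tagged yet
}

# Per-provider overrides added by this PR (values from the PR text).
_PROVIDER_TOKEN_BUDGETS = {
    "openai.triangulate": 6144,
    "gemini.confer": 6144,
    "gemini.triangulate": 6144,
}

# Non-reasoning ceilings quoted in the PR text.
NON_REASONING_CEILINGS = {
    "confer": 1500,
    "triangulate": 1500,
    "debate": 1500,
    "synth": 1024,
    "audit": 768,
}
REASONING_DEFAULT = 2048


def is_reasoning_model(provider: str, model: str) -> bool:
    """True if the model name starts with one of the provider's tagged prefixes."""
    prefixes = PROVIDER_CAPS.get(provider, {}).get("reasoning_prefixes", ())
    return model.startswith(tuple(prefixes))


def budget_for_purpose(provider: str, model: str, purpose: str) -> int:
    """Resolve a budget: explicit override, then reasoning default, then ceiling."""
    override = _PROVIDER_TOKEN_BUDGETS.get(f"{provider}.{purpose}")
    if override is not None:
        return override
    if is_reasoning_model(provider, model):
        return REASONING_DEFAULT
    return NON_REASONING_CEILINGS.get(purpose, REASONING_DEFAULT)


print(budget_for_purpose("gemini", "gemini-2.5-pro", "confer"))  # override applies
print(budget_for_purpose("gemini", "gemini-2.5-pro", "debate"))  # falls through
```

Under these assumptions the override is checked first, which is why confer and triangulate are unaffected by the missing reasoning tag while debate, synth, and audit still resolve to the smaller ceilings.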

Test plan

- Updated `scripts/test_provider_token_budgets.py` asserts the new values + explicitly documents the gemini.debate fall-through gap
- Full suite (38 scripts) passes locally

🤖 Generated with Claude Code

Follow-up to PR #26. The same MAX_TOKENS truncation pattern is hitting triangulate on gpt-5 and both confer + triangulate on gemini-2.5-pro: reasoning tokens count against `max_completion_tokens`, so the visible answer gets cut mid-emission.

New _PROVIDER_TOKEN_BUDGETS entries (all 6144):
- openai.triangulate (was 2048 reasoning-default)
- gemini.confer (was 1500 non-reasoning ceiling)
- gemini.triangulate (was 1500 non-reasoning ceiling)

Notable side-finding (logged as a follow-up, NOT fixed here): gemini-2.5-pro is NOT currently tagged in `PROVIDER_CAPS["gemini"]["reasoning_prefixes"]`, so non-overridden purposes (debate, synth, audit) fall through to the SMALLER non-reasoning ceilings (1500 / 1024 / 768) instead of the reasoning default of 2048. The provider-specific override in this PR short-circuits that for confer + triangulate, but if the user starts hitting caps on gemini debate / synth / audit, the right fix is adding `"reasoning_prefixes": ("gemini-2.5-pro",)` to the gemini PROVIDER_CAPS entry. Out of scope for this PR.

Tests:
- openai.triangulate = 6144
- gemini.confer = 6144, gemini.triangulate = 6144
- gemini.debate falls through to 1500 (the non-reasoning ceiling, documenting the follow-up gap explicitly)

Full suite (38 scripts) passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
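The truncation mechanism named in this commit message comes down to simple arithmetic, sketched below. The helper names and the token counts (1600 reasoning tokens, a 900-token answer) are hypothetical and chosen only for illustration; the two caps, 2048 and 6144, are the values quoted in the PR.

```python
def visible_tokens_available(max_completion_tokens: int, reasoning_tokens: int) -> int:
    """Tokens left for the visible answer once reasoning has used its share of the cap."""
    return max(0, max_completion_tokens - reasoning_tokens)


def is_truncated(max_completion_tokens: int, reasoning_tokens: int, answer_tokens: int) -> bool:
    """True if the answer needs more tokens than remain under the cap."""
    return answer_tokens > visible_tokens_available(max_completion_tokens, reasoning_tokens)


# Hypothetical workload: 1600 reasoning tokens followed by a 900-token answer.
REASONING, ANSWER = 1600, 900

for cap in (2048, 6144):
    left = visible_tokens_available(cap, REASONING)
    print(f"cap={cap}: {left} tokens left, truncated={is_truncated(cap, REASONING, ANSWER)}")
```

With these illustrative numbers the 2048 cap leaves 448 tokens for a 900-token answer, so the answer is cut off, while the 6144 cap leaves ample room.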

Closes the follow-up flagged in PR #27. gemini-2.5-pro burns reasoning tokens just like gpt-5 and claude-opus-4-7; it should have been in PROVIDER_CAPS["gemini"]["reasoning_prefixes"] from day one.

Effects of the tag:
- `_is_reasoning_model("gemini", "gemini-2.5-pro")` now returns True.
- `_budget_for_purpose` picks the reasoning-class ceilings (2048) for every non-overridden purpose instead of falling through to the smaller non-reasoning ceilings:
    debate: 1500 -> 2048
    synth: 1024 -> 2048
    audit: 768 -> 2048
    moderator/orchestrate/plan/review/coordinate: already 2048
  confer + triangulate keep the PR #27 override at 6144.
- Prompt adapters now strip "think step by step" / "think out loud" preambles before sending to gemini-2.5-pro (already shipped in PR #19; matches behavior for the other reasoning families).

Future-proof: the prefix is "gemini-2.5-pro" so a hypothetical gemini-1.5-flash / gemini-2.0-flash variant correctly falls through to the non-reasoning ceilings (verified in the test).

No change to the Gemini API call itself: `generationConfig.temperature` + `maxOutputTokens` are both accepted by gemini-2.5-pro.

Tests (scripts/test_provider_token_budgets.py):
- `_is_reasoning_model("gemini", "gemini-2.5-pro")` is True
- gemini debate / synth / audit all = 2048 (reasoning default)
- gemini confer / triangulate stay = 6144 (PR #27 override)
- A hypothetical non-reasoning gemini still falls through correctly

Full suite (38 scripts) passes.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
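The prefix check behind the tag might look something like the sketch below. It assumes the match is a plain `str.startswith` test against the tuple, which the key name `reasoning_prefixes` suggests but the text does not confirm; the function names and the dated model name `gemini-2.5-pro-preview` are hypothetical.

```python
# The tag described above; matching via startswith is an assumption.
GEMINI_REASONING_PREFIXES = ("gemini-2.5-pro",)

REASONING_DEFAULT = 2048
# Non-reasoning ceilings quoted in the commit message.
NON_REASONING_CEILINGS = {"debate": 1500, "synth": 1024, "audit": 768}


def is_reasoning_gemini(model: str) -> bool:
    """Prefix match, so suffixed variants of a tagged model also count as reasoning."""
    return model.startswith(GEMINI_REASONING_PREFIXES)


def ceiling_for(model: str, purpose: str) -> int:
    """Ceiling for a purpose with no per-provider override."""
    if is_reasoning_gemini(model):
        return REASONING_DEFAULT
    return NON_REASONING_CEILINGS.get(purpose, REASONING_DEFAULT)


for model in ("gemini-2.5-pro", "gemini-2.5-pro-preview", "gemini-2.0-flash"):
    budgets = {p: ceiling_for(model, p) for p in ("debate", "synth", "audit")}
    print(model, is_reasoning_gemini(model), budgets)
```

Because the check is a prefix match in this sketch, the flash-family names do not match and keep the smaller ceilings, which is the fall-through behavior the commit message says the test verifies.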

fxspeiser merged commit 66e3bed into main May 31, 2026
1 check passed

fxspeiser deleted the feature/budget-bump-gpt-gemini branch May 31, 2026 20:50

fxspeiser mentioned this pull request Jun 1, 2026

Tag gemini-2.5-pro as reasoning-class #28

Merged

2 tasks

fxspeiser merged 1 commit into main from feature/budget-bump-gpt-gemini
