fix: clamp Bedrock thinking effort to the LiteLLM-accepted ceiling (#737) by ElegantLin · Pull Request #739 · benchflow-ai/benchflow

ElegantLin · 2026-06-13T18:00:50Z

Stacked on #736 (base = test/599-bedrock-effort-parity) — it reuses the _wire_output_config_effort end-to-end helper introduced there. Will rebase onto main once #736 merges.

Summary

BenchFlow accepts BENCHFLOW_BEDROCK_THINKING_EFFORT values xhigh/max, but LiteLLM 1.88.0rc1's Bedrock Converse transform rejects them for adaptive Claude 4.8+ with a BadRequestError (issue #737). So the #598 "MAX thinking" config errored at request time instead of running at the highest supported effort.

Fix

Clamp a requested effort to the LiteLLM-accepted ceiling (high) in both injection paths, matching the existing garbage→high default so a MAX-thinking run proceeds at the real maximum rather than crashing:

- Route config (litellm_config._bedrock_thinking_effort): bakes the clamped value into config.yaml.
- Standalone proxy patch (litellm_bedrock_patch): clamps the os.environ override too. It is deployed into the sandbox alone (cannot import benchflow), so it carries its own in-sync copy of the ladder; a comment in each points at the other.
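
For readers skimming the PR, the clamp described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `EFFORT_LADDER`, `clamp_effort`, and `effort_from_env` are hypothetical names, the ladder contents below `high` are assumed, and the treatment of an unset variable and the strip/lowercase normalization are assumptions too. Only the `xhigh`/`max` → `high` clamp and the garbage → `high` default come from the description.

```python
import os

# Assumed ladder, lowest to highest. The real ladder lives in the PR's code.
EFFORT_LADDER = ("low", "medium", "high", "xhigh", "max")
# Ceiling the PR description says LiteLLM accepts for Bedrock Converse.
ACCEPTED_CEILING = "high"
# Fallback for unrecognized values (the "garbage -> high" default).
DEFAULT_EFFORT = "high"


def clamp_effort(requested):
    """Return a ladder value no higher than ACCEPTED_CEILING."""
    # Normalization is an assumption made for this sketch.
    value = (requested or "").strip().lower()
    if value not in EFFORT_LADDER:
        return DEFAULT_EFFORT
    ceiling_index = EFFORT_LADDER.index(ACCEPTED_CEILING)
    return EFFORT_LADDER[min(EFFORT_LADDER.index(value), ceiling_index)]


def effort_from_env(environ=None):
    """Read the override from a mapping (os.environ by default) and clamp it."""
    env = os.environ if environ is None else environ
    return clamp_effort(env.get("BENCHFLOW_BEDROCK_THINKING_EFFORT"))
```

Because the standalone patch cannot import benchflow, a helper like this would have to exist twice, once per injection path, which is why the description calls out keeping the two ladder copies in sync.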

Test plan

- Route param clamp: xhigh/max → high; updated the two pre-existing tests that encoded the bug (max threaded through unchanged) to assert the clamp.
- Proxy-env path: a run-level max override clamps in the real wire payload without raising.
- Drift guard: every requestable effort, after the clamp, is accepted by the REAL litellm Converse transform; if a future litellm changes its accepted set, this fails loudly instead of shipping a rejected value mid-run.
- Both clamps verified fail-then-pass when individually neutered.
- 87 passed across the bedrock/litellm slice; unrelated agent-CLI reasoning_effort tests unaffected; ruff + format + ty clean.
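
The drift guard and the fail-then-pass check can be pictured with a small sketch. Everything here is hypothetical: `find_rejected` and `stand_in_accepts` are illustration-only names, and `stand_in_accepts` is a placeholder for the real litellm Converse transform that the actual test exercises. The accepted set it encodes is an assumption based on the description, not something read from litellm.

```python
# Assumed set of requestable efforts, lowest to highest.
REQUESTABLE = ("low", "medium", "high", "xhigh", "max")


def clamp(effort):
    """Toy clamp: anything above "high" becomes "high"."""
    return "high" if effort in ("xhigh", "max") else effort


def stand_in_accepts(effort):
    """Placeholder validator; the real test would call litellm instead."""
    return effort in {"low", "medium", "high"}


def find_rejected(efforts, clamp_fn, accepts):
    """Return (requested, sent) pairs whose clamped value is rejected."""
    rejected = []
    for requested in efforts:
        sent = clamp_fn(requested)
        if not accepts(sent):
            rejected.append((requested, sent))
    return rejected


# With the clamp in place, nothing should be rejected.
with_clamp = find_rejected(REQUESTABLE, clamp, stand_in_accepts)

# "Neutering" the clamp (identity function) should make the guard fail.
without_clamp = find_rejected(REQUESTABLE, lambda e: e, stand_in_accepts)
```

Passing the validator in as a parameter keeps the guard itself independent of the accepted set, so a change in what the validator accepts shows up as a non-empty result rather than a silent pass.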

Closes #737

🤖 Generated with Claude Code

BenchFlow accepts BENCHFLOW_BEDROCK_THINKING_EFFORT values `xhigh`/`max`, but LiteLLM 1.88.0rc1's Bedrock Converse transform rejects them for adaptive Claude 4.8+ with a BadRequestError. So the #598 "MAX thinking" config errored at request time instead of running at the highest supported effort.

Clamp a requested effort to the accepted ceiling (`high`) in both injection paths: the route config (litellm_config) and the standalone proxy patch (litellm_bedrock_patch, which cannot import benchflow, so it carries its own in-sync copy of the ladder). This matches the existing garbage->high default: a MAX-thinking run now proceeds at the real maximum rather than crashing.

Tests:
- clamp asserted on the route param (xhigh/max -> high) and updated the two pre-existing tests that encoded the bug (effort `max` threaded through);
- proxy-env override `max` clamps in the real wire payload without raising;
- drift guard: every requestable effort, after the clamp, is accepted by the REAL litellm Converse transform; if a future litellm changes its accepted set this fails loudly instead of shipping a rejected value mid-run.

Both clamps verified to fail-then-pass when individually neutered.

Closes #737

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ElegantLin requested a review from bingran-you June 13, 2026 18:07

bingran-you mentioned this pull request Jun 14, 2026

Bedrock 4.8 thinking effort 'max'/'xhigh' are accepted by BenchFlow but rejected by LiteLLM → request-time BadRequestError #737

Open

ElegantLin wants to merge 1 commit into test/599-bedrock-effort-parity from fix/737-bedrock-effort-clamp
