fix: clamp Bedrock thinking effort to the LiteLLM-accepted ceiling (#737)#739
Open
ElegantLin wants to merge 1 commit into
Open
fix: clamp Bedrock thinking effort to the LiteLLM-accepted ceiling (#737)#739ElegantLin wants to merge 1 commit into
ElegantLin wants to merge 1 commit into
Conversation
) BenchFlow accepts BENCHFLOW_BEDROCK_THINKING_EFFORT values `xhigh`/`max`, but LiteLLM 1.88.0rc1's Bedrock Converse transform rejects them for adaptive Claude 4.8+ with a BadRequestError. So the #598 "MAX thinking" config errored at request time instead of running at the highest supported effort. Clamp a requested effort to the accepted ceiling (`high`) in both injection paths — the route config (litellm_config) and the standalone proxy patch (litellm_bedrock_patch, which cannot import benchflow, so it carries its own in-sync copy of the ladder). This matches the existing garbage->high default: a MAX-thinking run now proceeds at the real maximum rather than crashing. Tests: - clamp asserted on the route param (xhigh/max -> high) and updated the two pre-existing tests that encoded the bug (effort `max` threaded through); - proxy-env override `max` clamps in the real wire payload without raising; - drift guard: every requestable effort, after the clamp, is accepted by the REAL litellm Converse transform — if a future litellm changes its accepted set this fails loudly instead of shipping a rejected value mid-run. Both clamps verified to fail-then-pass when individually neutered. Closes #737 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
BenchFlow accepts
BENCHFLOW_BEDROCK_THINKING_EFFORTvaluesxhigh/max, but LiteLLM 1.88.0rc1's Bedrock Converse transform rejects them for adaptive Claude 4.8+ with aBadRequestError(issue #737). So the #598 "MAX thinking" config errored at request time instead of running at the highest supported effort.Fix
Clamp a requested effort to the LiteLLM-accepted ceiling (
high) in both injection paths, matching the existing garbage→highdefault so a MAX-thinking run proceeds at the real maximum rather than crashing:litellm_config._bedrock_thinking_effort) — bakes the clamped value intoconfig.yaml.litellm_bedrock_patch) — clamps the os.environ override too. It is deployed into the sandbox alone (cannot import benchflow), so it carries its own in-sync copy of the ladder; a comment in each points at the other.Test plan
xhigh/max→high; updated the two pre-existing tests that encoded the bug (maxthreaded through unchanged) to assert the clamp.maxoverride clamps in the real wire payload without raising.reasoning_efforttests unaffected; ruff + format + ty clean.Closes #737
🤖 Generated with Claude Code