fix: surface Anthropic stop_reason to detect truncation (#5148) #5149
Open
devin-ai-integration[bot] wants to merge 2 commits into main from
Conversation
- Add stop_reason field to LLMCallCompletedEvent
- Update BaseLLM._emit_call_completed_event to accept and pass stop_reason
- Add _warn_if_truncated helper to AnthropicCompletion that logs a warning when stop_reason='max_tokens'
- Apply fix to all 6 Anthropic completion methods (sync and async): _handle_completion, _handle_streaming_completion, _handle_tool_use_conversation, _ahandle_completion, _ahandle_streaming_completion, _ahandle_tool_use_conversation
- Add 7 tests covering truncation warning, event field, and tool use paths

Co-Authored-By: João <joao@crewai.com>
… values

MagicMock objects (and other non-Anthropic responses) can return non-string values for getattr(response, 'stop_reason', None). Add a typed extraction helper that returns None unless the value is actually a string, preventing Pydantic validation errors in LLMCallCompletedEvent.

Co-Authored-By: João <joao@crewai.com>
Summary
Fixes #5148. Anthropic's `Message` response includes a `stop_reason` field that indicates why the API stopped generating (e.g. `"max_tokens"` means the output was truncated). Previously, `AnthropicCompletion` silently discarded this field, making it impossible for users to detect truncation via hooks or events.

Changes:

- Add `stop_reason: str | None = None` field to `LLMCallCompletedEvent`
- Add `stop_reason` parameter to `BaseLLM._emit_call_completed_event`
- Add `_extract_stop_reason()` static method on `AnthropicCompletion` that safely extracts `stop_reason` as `str | None` (guards against non-string values, e.g. from MagicMock in tests)
- Add `_warn_if_truncated()` helper on `AnthropicCompletion` that logs a warning when `stop_reason == "max_tokens"`
- Thread `stop_reason` through all 6 Anthropic completion methods (sync + async × regular/streaming/tool-use)

Non-Anthropic providers are unaffected; they continue to emit `stop_reason=None` by default.

Review & Testing Checklist for Human
- Verify coverage of all completion paths: `_handle_completion`, `_handle_streaming_completion`, `_handle_tool_use_conversation`, and their async counterparts. Confirm no code path was missed, especially around early returns (structured output, tool use blocks).
- In `_handle_streaming_completion`/`_ahandle_streaming_completion`, `stop_reason` is read from `stream.get_final_message()`. Verify that the reconstructed `final_message` actually carries the `stop_reason` from the stream (it should, per Anthropic SDK behavior).
- Confirm `from_agent.role` access is safe: `_warn_if_truncated` accesses `from_agent.role` when `from_agent` is truthy. This should always work since `from_agent` is typed as `Agent | None`, but confirm no caller passes a non-Agent truthy value.
- … `finish_reason` field; consider whether a follow-up is needed.

Suggested manual test: Set `max_tokens` to a very small value (e.g. 50) on an Anthropic model, run a crew, and verify:

- a warning is logged with `stop_reason='max_tokens'`
- `LLMCallCompletedEvent` carries `stop_reason="max_tokens"` (observable via a custom event handler)

Notes
- The `stop_reason` field on `LLMCallCompletedEvent` and the `BaseLLM` parameter are additive and backwards-compatible (default `None`).
- `_extract_stop_reason` uses an `isinstance(raw, str)` guard so that non-string attribute values (e.g. auto-created MagicMock attributes in existing tests) safely become `None` rather than causing Pydantic validation errors.

Link to Devin session: https://app.devin.ai/sessions/7214a66c41b94b07803ad5faacf12270