fix: don't honor result_as_answer when tool execution errors #5157
devin-ai-integration[bot] wants to merge 2 commits into main
When a tool with result_as_answer=True raises an exception, the agent now continues reasoning about the error instead of treating the error message as the final answer. Fixes #5156.

Co-Authored-By: João <joao@crewai.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Address Cursor Bugbot review: the add_image exception handlers in use() and ause() were missing the error flag, allowing result_as_answer to be incorrectly honored when those paths errored.

Co-Authored-By: João <joao@crewai.com>
What counts as a tool execution error? In tools/venv_tool.py:

    import asyncio

    # Use a relative path for better portability
    # This assumes the sandbox_venv is in the root of your project
    SANDBOX_PYTHON_PATH = "/opt/sandbox_venv/bin/python"

    class CodeInterpreterInput(BaseModel):

    class VenvCodeInterpreterTool(BaseTool):

Summary
Fixes #5156. When a tool with `result_as_answer=True` raises an exception, the error message was being treated as the agent's final answer, preventing the agent from reflecting on the failure and retrying.

The fix adds error tracking across all tool execution code paths so that `result_as_answer` is only honored on successful tool executions:

- `tool_usage.py`: Added `_last_execution_errored` flag, set in all error branches (`ToolUsageError`, tool selection failure, runtime exception in `_use`/`_ause`)
- `tool_utils.py`: Both `execute_tool_and_check_finality` and `aexecute_tool_and_check_finality` check the flag before returning `result_as_answer=True`
- `crew_agent_executor.py`: Propagates `error_occurred` through the execution result dict; `_append_tool_result_and_check_finality` gates on it
- `agent_utils.py`: Uses the existing `error_event_emitted` to gate `result_as_answer`
- `experimental/agent_executor.py`: Same pattern applied to the sequential loop, parallel results loop, and parallel error fallback

Review & Testing Checklist for Human

- `step_executor.py` coverage: This file was not modified. Confirm that its native tool path delegates to one of the fixed executors and doesn't have its own independent `result_as_answer` check that bypasses the fix.
- `_last_execution_errored` reliability: The flag is a mutable instance attribute on `ToolUsage`, reset at the top of `use()`/`ause()` and read immediately afterwards by `tool_utils.py`. Confirm no intermediate call can reset it before it's read.
- Run an agent with a `result_as_answer=True` tool that intentionally fails, and confirm the agent continues reasoning rather than returning the error as its final answer.
- An error path sets `"original_tool": None` alongside `"error_occurred": True`; the `result_as_answer` guard is technically unreachable here since `original_tool` is falsy. Confirm this is acceptable.

Notes

- Regression tests cover the `ToolUsage` flag, `execute_tool_and_check_finality` (both error and success), and native tool execution in `AgentExecutor` (both error and success).
- Error state is tracked via a `_last_execution_errored` flag on `ToolUsage` (for the text/ReAct pattern), and an `error_occurred` dict key / `error_event_emitted` local variable (for native tool calling). This follows existing conventions in each module rather than introducing a new abstraction.

Link to Devin session: https://app.devin.ai/sessions/a7393abd35bf4141bf23fe9e1b86b364
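
For readers following the text/ReAct path described above, here is a minimal sketch of the flag pattern. It is not the code from this PR: `SketchTool`, `SketchToolUsage`, `SketchToolResult`, and `execute_and_check_finality` are hypothetical stand-ins, and only the `_last_execution_errored` and `result_as_answer` names come from the description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SketchTool:
    """Hypothetical stand-in for a tool; not the crewAI BaseTool."""
    name: str
    func: Callable[[str], str]
    result_as_answer: bool = False


class SketchToolUsage:
    """Hypothetical stand-in for ToolUsage, reduced to the error flag."""

    def __init__(self) -> None:
        self._last_execution_errored = False

    def use(self, tool: SketchTool, tool_input: str) -> str:
        # Reset at the top of every call so a stale flag cannot leak through.
        self._last_execution_errored = False
        try:
            return tool.func(tool_input)
        except Exception as exc:
            # The error branch sets the flag before returning the message.
            self._last_execution_errored = True
            return f"Tool '{tool.name}' failed: {exc}"


@dataclass
class SketchToolResult:
    result: str
    result_as_answer: bool


def execute_and_check_finality(
    usage: SketchToolUsage, tool: SketchTool, tool_input: str
) -> SketchToolResult:
    output = usage.use(tool, tool_input)
    # Read the flag right after use(); only a successful run may be final.
    is_final = tool.result_as_answer and not usage._last_execution_errored
    return SketchToolResult(result=output, result_as_answer=is_final)


def _fail(_: str) -> str:
    raise RuntimeError("sandbox unavailable")


usage = SketchToolUsage()
ok = execute_and_check_finality(usage, SketchTool("echo", str.upper, True), "done")
bad = execute_and_check_finality(usage, SketchTool("broken", _fail, True), "x")
print(ok.result_as_answer, bad.result_as_answer)
```

In this sketch the successful call stays final while the failing call hands its error text back to the agent as an ordinary observation. Because the flag lives on the instance, the pattern only holds if nothing calls `use()` again between the execution and the read, which is the reliability concern raised in the checklist.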
Note
Medium Risk
Changes tool-execution finality logic across multiple executors and hook wrappers; behavior around `result_as_answer` now depends on new error-tracking flags, which could alter when agents short-circuit after tool calls.

Overview
Prevents tools marked `result_as_answer=True` from prematurely short-circuiting the agent when the tool execution fails, allowing the model to see the error and continue reasoning/retrying.

This propagates explicit error state through native tool execution results (including parallel paths) in `CrewAgentExecutor` and the experimental `AgentExecutor`, and adds `_last_execution_errored` tracking in `ToolUsage` so `tool_utils.execute_*_tool_and_check_finality` only returns `result_as_answer` on successful runs. Adds regression tests covering both success/error cases for native tool execution and `ToolUsage`/`execute_tool_and_check_finality` behavior.

Written by Cursor Bugbot for commit f5dc745.
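
The native tool-calling path can be illustrated in the same spirit. The sketch below assumes a result dict carrying the `error_occurred` and `original_tool` keys named in the PR; the helper functions (`run_native_tool`, `is_final_answer`, `first_final_result`) and the `SketchTool` class are hypothetical and do not mirror the real executor code.

```python
from typing import Any, Callable, Optional


class SketchTool:
    """Hypothetical tool stand-in; only the attribute the gate reads."""

    def __init__(self, result_as_answer: bool) -> None:
        self.result_as_answer = result_as_answer


def run_native_tool(
    tool: Optional[SketchTool], call: Callable[[], str]
) -> dict[str, Any]:
    """Run one tool call and carry the error state in the result dict."""
    try:
        return {"result": call(), "original_tool": tool, "error_occurred": False}
    except Exception as exc:
        return {
            "result": f"Error: {exc}",
            "original_tool": tool,
            "error_occurred": True,
        }


def is_final_answer(execution_result: dict[str, Any]) -> bool:
    """Honor result_as_answer only for a known tool that ran without error."""
    tool = execution_result.get("original_tool")
    errored = execution_result.get("error_occurred", False)
    return bool(tool and tool.result_as_answer and not errored)


def first_final_result(results: list[dict[str, Any]]) -> Optional[str]:
    """Scan results (e.g. from parallel calls) for the first final one."""
    for execution_result in results:
        if is_final_answer(execution_result):
            return execution_result["result"]
    return None


def _boom() -> str:
    raise ValueError("bad input")


results = [
    run_native_tool(SketchTool(result_as_answer=True), _boom),
    run_native_tool(SketchTool(result_as_answer=True), lambda: "42"),
]
final = first_final_result(results)

# A fallback entry with no tool attached can never be final: the falsy
# original_tool short-circuits the check before error_occurred is consulted.
fallback = {"result": "Error: lost", "original_tool": None, "error_occurred": True}
fallback_is_final = is_final_answer(fallback)
```

Carrying the error state inside each result dict, rather than on shared instance state, keeps results from parallel calls independent of one another in this sketch.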