fix: retry with resume message when model returns empty response #5006

gurjot-05 wants to merge 6 commits into google:main
Conversation
Response from ADK Triaging Agent

Hello @gurjot-05, thank you for your contribution! It looks like the Contributor License Agreement (CLA) check has failed. Before we can merge this PR, you will need to sign the CLA. You can do so by following the instructions at https://cla.developers.google.com/. Signing the CLA is a one-time process and is required for all contributions. Thanks!
Some models (notably Gemini 2.5) intermittently return empty content
(parts: [], candidatesTokenCount: 0, finishReason: STOP) after
processing tool results. This is especially common under concurrent
load and with streaming + thinking enabled.
ADK's is_final_response() treats this as a valid completed turn because
it only checks for the absence of function calls, not the presence of
actual content. The agent loop stops and the user sees nothing.
This fix adds retry logic in BaseLlmFlow.run_async():
1. _has_meaningful_content() helper detects empty/thought-only events
2. When an empty final response is detected from the current agent,
a resume message ("Your previous response was empty. Please resume
execution from where you left off.") is injected into the session
as a user event before re-prompting the model
3. Maximum 2 retries to prevent infinite loops
4. Author check (last_event.author == agent.name) prevents false
positives on legitimate empty events from agent transfers
Unlike a silent re-prompt, the injected message gives the model
context about why it is being called again, improving recovery rate.
Fixes google#3525
…cases

The original fix only retried when is_final_response() was True with empty content. This missed two scenarios observed in production:

1. Streaming + thinking: the model streams thought chunks (partial=True) then stops with no text — the LiteLLM adapter dropped the response entirely, and the loop broke on last_event.partial without retry.
2. No events at all: the model returned content=None, which was filtered by _postprocess_async, leaving last_event=None — the loop broke immediately.

Changes:
- lite_llm.py: add a fallback after the streaming loop to yield an explicit empty non-partial LlmResponse when nothing was finalized, so downstream retry logic can detect and handle it.
- base_llm_flow.py: restructure run_async() to check for empty responses (None, partial + empty, final + empty) before normal termination, enabling retry across all three scenarios.
- Update the existing test for the new retry-on-None behavior.
- Add 12 comprehensive scenario tests covering all cases.
The resume nudge event was being yielded from run_async(), which sent it through the SSE stream to the frontend. Users saw "Your previous response was empty" as a visible chat message.

Fix: use session_service.append_event() to write the resume message directly to the session history. The model sees it on the next call (for better recovery), but it never reaches the UI/SSE stream.
The retry counter was per-invocation, not per-failure-burst. If a model returned empty responses at different points during the same invocation, earlier (recovered) empties consumed the budget, so a later empty response would exhaust the counter and halt silently.

Fix: reset empty_response_count to 0 after any successful (non-empty) response. Also add a warning log when retries are exhausted so the halt is not silent.
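The per-burst budget described above can be sketched as a small loop over the model's responses. This is an illustrative simulation (the function name and the use of `None` for an empty response are assumptions for the sketch, not the actual ADK code):

```python
import logging

_MAX_EMPTY_RESPONSE_RETRIES = 2
logger = logging.getLogger(__name__)


def drain(responses):
    """Walk a sequence of model responses, where None stands in for an
    empty response.

    The retry budget applies per burst of consecutive empties: any
    non-empty response resets the counter, so earlier recovered
    failures do not consume the budget for later ones.
    """
    delivered = []
    empty_response_count = 0
    for resp in responses:
        if resp is None:
            empty_response_count += 1
            if empty_response_count > _MAX_EMPTY_RESPONSE_RETRIES:
                # Do not halt silently: leave a trace when giving up.
                logger.warning("Retries exhausted on empty responses.")
                break
            continue  # retry: re-prompt the model
        empty_response_count = 0  # reset after any successful response
        delivered.append(resp)
    return delivered
```

With this shape, two bursts of two empties each stay within budget, while three consecutive empties exhaust it.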
…nses

The empty response retry was too aggressive — it triggered on:

1. Sub-agents (AgentTool, ParallelAgent) that legitimately return no content
2. First LLM calls with no prior tool execution

Fixes:
- Add a null guard for last_event in the is_final_response check (NoneType crash)
- Only retry after at least one tool call in the invocation, since the bug only manifests when models return empty after processing tool results
- Remove append_event for the resume message (caused session state corruption in pause/resume flows and leaked to UI)
- Retry silently instead (proven 100% recovery rate in production tests)
- Update scenario tests to include a tool call before the empty response
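Taken together, the gating conditions above amount to a loop of this shape. This is a simplified model using plain dict events and hypothetical names (`run_turn`, `call_model`); the real control flow lives in base_llm_flow.py:

```python
async def run_turn(call_model, agent_name, max_retries=2):
    """Simplified retry loop: call the model, and silently retry when
    the final event is empty -- but only if at least one tool call
    happened this invocation (the bug only appears after tool results)
    and the empty event was authored by this agent.
    """
    empty_retries = 0
    tool_call_seen = False
    while True:
        last_event = None
        async for event in call_model():
            last_event = event
            if event.get("function_call"):
                tool_call_seen = True
            yield event
        empty = last_event is None or not last_event.get("text")
        should_retry = (
            empty
            and tool_call_seen
            and (last_event is None or last_event.get("author") == agent_name)
            and empty_retries < max_retries
        )
        if not should_retry:
            return
        empty_retries += 1  # silent retry: no resume message injected
```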
Hi @gurjot-05, thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.
Hi @GWeale, can you please review this?
Thank you, @rohityan. I look forward to your feedback. Let me know if there’s anything else I can provide in the meantime.
@gurjot-05, please rebase the branch. |
Bug
Some models (notably Gemini 2.5 Pro/Flash) intermittently return empty content (parts: [], candidatesTokenCount: 0, finishReason: STOP) after processing tool results. This is especially common under concurrent load and with streaming + thinking enabled.

ADK's is_final_response() treats this as a valid completed turn because it only checks for the absence of function calls — not the presence of actual content. The agent loop stops and the user sees nothing.

Observed with:
- vertex_ai — after tool execution in orchestrator agents

Example session showing the bug:
Related: #3525
Root Cause
There are three distinct failure modes, all leading to the same silent halt:
Mode 1: Non-streaming empty response

In BaseLlmFlow.run_async(): an event with parts: [] passes is_final_response() — no function calls, no function responses, not partial — so the loop breaks silently.

Mode 2: Streaming + thinking, no final response yielded
With streaming + thinking enabled, the LiteLLM adapter yields thought chunks as partial=True. When finish_reason=stop arrives with no text content and no accumulated text/reasoning:

- lite_llm.py finalization requires (text or reasoning_parts), which is False
- last_event is either None or partial=True from a thought chunk
- the loop breaks on not last_event or last_event.partial — before any retry logic

Mode 3: Model returns content=None
When the model returns content=None, _postprocess_async filters it out (returns without yielding if no content and no error). last_event stays None, and the loop breaks.

Fix
Two-layer fix addressing the root cause and adding defense-in-depth:
Layer 1: lite_llm.py — Ensure streaming always yields a final response
Added a fallback after the streaming loop: if the model produced no meaningful output at all (no text, no reasoning, no tool calls), yield an explicit empty non-partial LlmResponse so downstream retry logic can detect and handle it.

This converts Mode 2 into Mode 1, making it catchable by the retry logic.
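The Layer 1 fallback amounts to something like the following. This is a sketch with a simplified stand-in for the response type, not the actual lite_llm.py code:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LlmResponse:
    # Simplified stand-in for the real ADK LlmResponse type.
    text: Optional[str] = None
    partial: bool = False


async def stream_with_fallback(chunks: List[str]):
    """Stream partial chunks, then finalize.

    If the model produced no meaningful output at all, yield an
    explicit empty non-partial response instead of nothing, so the
    caller's retry logic can see that the turn ended empty
    (converting Mode 2 failures into catchable Mode 1 failures).
    """
    accumulated = ""
    for chunk in chunks:
        accumulated += chunk
        yield LlmResponse(text=chunk, partial=True)
    if accumulated:
        yield LlmResponse(text=accumulated, partial=False)
    else:
        yield LlmResponse(text=None, partial=False)  # explicit empty final
```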
Layer 2: base_llm_flow.py — Unified retry for all empty response modes
Restructured run_async() to check for empty responses before normal termination conditions. The retry now handles all three cases:

- not last_event — no events yielded at all (Mode 3)
- last_event.partial with no meaningful content — streaming thought-only chunks, no final response (Mode 2, defense-in-depth)
- last_event.is_final_response() with no meaningful content — non-streaming empty response (Mode 1)

When an empty response is detected:
- _MAX_EMPTY_RESPONSE_RETRIES = 2 prevents infinite loops

False positive prevention

- No retry when _has_meaningful_content() returns True
- Author check (last_event.author == agent.name) prevents retrying events from other agents

Tests
Existing tests: 384 passed (0 failures)
PR tests (test_empty_response_retry.py): 12 passed
Comprehensive scenario tests (test_empty_response_all_scenarios.py): 12 passed
Test plan
pytest tests/unittests/flows/llm_flows/ — 384 passed, 0 failed