google.LLM in FallbackAdapter fails with HTTP 400 when ChatContext contains function calls from a non-Google model #6135

@igui

Summary

When google.LLM (Gemini 3 / Gemini 2.5) is placed as a fallback inside a FallbackAdapter after a non-Google LLM (e.g. OpenAI), any second turn that includes a function call from the primary LLM causes Gemini to return HTTP 400 INVALID_ARGUMENT:

Function call is missing a thought_signature in functionCall parts

Steps to reproduce

import asyncio

from livekit.agents.llm import FallbackAdapter, ChatContext, FunctionCall, FunctionCallOutput
from livekit.plugins import google

# History with a function call produced by a non-Google model (so no thought signature exists).
ctx = ChatContext.empty()
ctx.add_message(role="user", content="I feel pain")
ctx.insert(FunctionCall(call_id="call_openai_abc123", name="manage_pain", arguments="{}"))
ctx.insert(FunctionCallOutput(call_id="call_openai_abc123", name="manage_pain", output="ok", is_error=False))

# <openai_llm_that_fails> is a placeholder for any primary LLM whose request fails.
fallback = FallbackAdapter([<openai_llm_that_fails>, google.LLM(model="gemini-3-flash-preview")])

async def main():
    async with fallback.chat(chat_ctx=ctx, tools=[]) as stream:
        async for chunk in stream:
            ...  # raises APIConnectionError: all LLMs failed

asyncio.run(main())

Root cause

The Google plugin guards thought-signature injection with:

if thought_signatures and (sig := thought_signatures.get(msg.call_id)):
    fc_part["thought_signature"] = sig

When google.LLM is a FallbackAdapter fallback, its _thought_signatures dict is empty (no Gemini turn has happened yet). The guard short-circuits on the falsy empty dict, so no thought_signature is injected for the OpenAI-produced function call, and Gemini rejects the request.

Even pre-populating _thought_signatures with an unrelated key doesn't help: .get(unknown_call_id) returns None, which fails the walrus condition.
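
To make the two failure modes concrete, here is a minimal standalone sketch of the guard. It is not the plugin's actual code: build_fc_part and the shape of the part dict are made up for illustration, and only the if condition is copied from the snippet above.

```python
from typing import Optional


def build_fc_part(call_id: str, thought_signatures: Optional[dict]) -> dict:
    """Hypothetical stand-in for the plugin's conversion of one function call."""
    fc_part = {"function_call": {"id": call_id}}
    # Same guard as quoted above: both the dict and the looked-up value must be truthy.
    if thought_signatures and (sig := thought_signatures.get(call_id)):
        fc_part["thought_signature"] = sig
    return fc_part


# Case 1: fallback LLM has never run, so the dict is empty (falsy) and the guard short-circuits.
empty_case = build_fc_part("call_openai_abc123", {})

# Case 2: dict is non-empty but has no entry for this call_id, so .get() returns None.
unrelated_case = build_fc_part("call_openai_abc123", {"call_gemini_1": b"real-sig"})

# Case 3: the call was produced by Gemini earlier, so a real signature is attached.
known_case = build_fc_part("call_gemini_1", {"call_gemini_1": b"real-sig"})

print("thought_signature" in empty_case)      # False
print("thought_signature" in unrelated_case)  # False
print("thought_signature" in known_case)      # True
```

In both of the first two cases the functionCall part goes out without a thought_signature, which is what Gemini rejects.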

Current workaround

Google documents b"skip_thought_signature_validator" as a sentinel for this scenario:

There are rare cases where you need to provide functionCall parts that were not generated by the API and therefore don't have an associated thought signature (for example, when transferring history from a model that does not include thought signatures). You can set thought_signature to skip_thought_signature_validator, but, this should be a last resort as it will negatively impact model performance.

As a stopgap I subclassed dict to override __bool__ (always True) and get() (returns the sentinel for unknown call IDs), then replaced _thought_signatures on a google.LLM subclass.
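
Roughly, the dict subclass looks like the sketch below. The class name SentinelSignatures is only illustrative, and how the instance gets assigned to _thought_signatures depends on the plugin internals (a private attribute that may change between versions), so that part is left out.

```python
SKIP_SENTINEL = b"skip_thought_signature_validator"


class SentinelSignatures(dict):
    """Dict that passes the plugin's truthiness guard and never misses a lookup."""

    def __bool__(self) -> bool:
        # An empty dict is falsy; force True so the `if thought_signatures and ...` guard proceeds.
        return True

    def get(self, key, default=None):
        # Real signatures are returned unchanged; unknown call IDs get the sentinel.
        # The caller-supplied default is intentionally ignored.
        return super().get(key, SKIP_SENTINEL)


sigs = SentinelSignatures()
sigs["call_gemini_1"] = b"real-sig"

print(bool(SentinelSignatures()))      # True, even when empty
print(sigs.get("call_gemini_1"))       # b'real-sig'
print(sigs.get("call_openai_abc123"))  # b'skip_thought_signature_validator'
```

Only .get() and truthiness are overridden, so this relies on the plugin reading signatures exclusively through that guard; indexing with [] on an unknown key would still raise KeyError.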

This unblocks the fallback but can trigger the documented performance degradation: with some models the agent occasionally repeats itself around function calls (e.g. "Thank you for your call. Thank you for your call. See you soon!" when a wrap-up tool is called). I last saw this a couple of months ago with Gemini 3.1 preview, but I can't confirm it on later models.

Proposed fix

When converting ChatContext to API messages, for any function_call part whose call_id is absent from _thought_signatures, inject the sentinel rather than silently omitting the field. This is a no-op for sessions where Gemini is the primary LLM (all call_ids are present with real signatures) and fixes the fallback case transparently. However, one might want to expose it as a parameter, like "fallback_thought_signatures".
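
A possible shape for the fix, as a standalone sketch rather than a patch against the plugin: attach_thought_signature and its fallback_signature argument are hypothetical names, and the part dict is simplified. Passing fallback_signature=None would keep today's behavior of omitting the field.

```python
from typing import Mapping, Optional

SKIP_SENTINEL = b"skip_thought_signature_validator"


def attach_thought_signature(
    fc_part: dict,
    call_id: str,
    thought_signatures: Optional[Mapping[str, bytes]],
    fallback_signature: Optional[bytes] = SKIP_SENTINEL,
) -> dict:
    """Attach the real signature if known, otherwise the configured fallback."""
    # Treat a missing or empty mapping the same as "no signature recorded".
    sig = (thought_signatures or {}).get(call_id)
    if sig is None:
        sig = fallback_signature
    if sig is not None:
        fc_part["thought_signature"] = sig
    return fc_part


# Gemini-primary session: the real signature wins, so the fallback is a no-op.
primary = attach_thought_signature({}, "call_gemini_1", {"call_gemini_1": b"real-sig"})

# Fallback session: the call came from another model, so the sentinel is injected.
fallback = attach_thought_signature({}, "call_openai_abc123", {})

# Opt-out: with fallback_signature=None the field is omitted, as it is today.
opted_out = attach_thought_signature({}, "call_openai_abc123", {}, fallback_signature=None)

print(primary)    # {'thought_signature': b'real-sig'}
print(fallback)   # {'thought_signature': b'skip_thought_signature_validator'}
print(opted_out)  # {}
```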
