fix: eliminate process.env race condition in GoogleLlm and AiSdkLlm #658

Open
brianmhunt wants to merge 3 commits into IQAIcom:main from brianmhunt:fix/google-runtime-race-condition

Conversation

@brianmhunt

Summary

  • Add GoogleLlmConfig interface to GoogleLlm allowing explicit, request-scoped configuration (apiKey, vertexai/project/location, or a pre-built GoogleGenAI client) that bypasses process.env entirely
  • Add AiSdkLlmOptions with googleGenaiClient to AiSdkLlm, passed through to GeminiContextCacheManager for the same purpose
  • Convert private _field properties to native JS #private fields and extract a shared #buildClient() method to DRY up client construction across apiClient and liveApiClient
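A sketch of the configuration surface described above, using the field names listed in the summary (apiKey, vertexai, project, location, plus a pre-built client). The resolution logic and the GoogleGenAIStub type are illustrative stand-ins, not the actual ADK implementation:

```typescript
// Stand-in for a GoogleGenAI instance; only used to make the sketch runnable.
interface GoogleGenAIStub {
  backend: "vertex" | "gemini";
}

interface GoogleLlmConfig {
  apiKey?: string;
  vertexai?: boolean;
  project?: string;
  location?: string;
  client?: GoogleGenAIStub; // a pre-built client wins over everything else
}

// Resolution order sketched from the PR description: explicit config first,
// env fallback only as the legacy path.
function buildClient(config?: GoogleLlmConfig): GoogleGenAIStub {
  if (config?.client) return config.client;
  if (config?.vertexai && config.project && config.location) {
    return { backend: "vertex" };
  }
  if (config?.apiKey) return { backend: "gemini" };
  // Env fallback (preserved for backward compatibility).
  if (process.env.GOOGLE_GENAI_USE_VERTEXAI === "true") {
    return { backend: "vertex" };
  }
  return { backend: "gemini" };
}
```

Because the explicit branches never touch process.env, two concurrent requests with different configs cannot observe each other's settings.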

Motivation

GoogleLlm, AiSdkLlm, and GeminiContextCacheManager all lazily read process.env at request time to construct Google clients. In a multi-tenant server, concurrent requests needing different Google backends (Vertex AI vs API key) race on the shared mutable process.env, making request-scoped provider selection unsafe.
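The hazard can be modeled deterministically: each "request" first writes its backend choice to shared state, then reads it back after a suspension point, the way an async scheduler would interleave them. The env object here is a stand-in for process.env, and the whole function is illustrative:

```typescript
// Minimal model of the race: request A writes, request B writes, then both
// read — so A observes B's value.
function runInterleaved(): [string, string] {
  const env: { backend?: string } = {}; // stands in for process.env
  env.backend = "vertex"; // request A writes its choice
  const readA = () => env.backend!; // A will read after a simulated await
  env.backend = "apikey"; // request B runs before A resumes
  const readB = () => env.backend!;
  // Both requests resume and read the shared state.
  return [readA(), readB()];
}
// runInterleaved() → ["apikey", "apikey"]: request A got the wrong backend.
```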

All changes are fully backward compatible — env fallback is preserved when no config is provided.

Test plan

  • Explicit apiKey config bypasses env
  • Explicit vertexai config bypasses env
  • Pre-built client injection used directly (for both apiClient and liveApiClient)
  • apiBackend resolves from config when provided, env when not
  • Env fallback still works when no config provided
  • Race condition test: N concurrent GoogleLlm instances with different configs each resolve their own client independently
  • AiSdkLlm googleGenaiClient option passed through to GeminiContextCacheManager
  • Full test suite passes (538 tests)
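The env-bypass bullets above could be checked along these lines. resolveSource is a hypothetical helper that reports where credentials came from; the real tests live in the PR's test files:

```typescript
interface Config {
  apiKey?: string;
  vertexai?: boolean;
  project?: string;
  location?: string;
}

// Reports whether credentials would come from the explicit config or the env.
function resolveSource(config?: Config): "config" | "env" {
  if (
    config?.apiKey ||
    (config?.vertexai && config.project && config.location)
  ) {
    return "config";
  }
  return "env";
}

// Poison the env; explicit configs must still win.
process.env.GOOGLE_API_KEY = "env-key-should-be-ignored";
const fromApiKey = resolveSource({ apiKey: "tenant-key" });
const fromVertex = resolveSource({
  vertexai: true,
  project: "p",
  location: "us-central1",
});
const fromEnv = resolveSource(undefined);
```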

🤖 Generated with Claude Code

GoogleLlm, AiSdkLlm, and GeminiContextCacheManager lazily read
process.env to build Google clients. In a multi-tenant server,
concurrent requests needing different backends (Vertex AI vs API key)
race on the shared mutable process.env.

Add GoogleLlmConfig to allow explicit, request-scoped configuration
(apiKey, vertexai/project/location, or a pre-built client) that
bypasses env vars entirely. Env fallback is preserved when no config
is provided, so the change is fully backward compatible.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@changeset-bot

changeset-bot bot commented Mar 25, 2026

🦋 Changeset detected

Latest commit: a800cc3

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name            Type
@iqai/adk       Minor
@iqai/adk-cli   Patch
@iqai/mcp-docs  Patch


@vercel
Contributor

vercel bot commented Mar 25, 2026

@brianmhunt is attempting to deploy a commit to the IQ Team on Vercel.

A member of the Team first needs to authorize it.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical race condition in multi-tenant server environments where GoogleLlm, AiSdkLlm, and GeminiContextCacheManager previously relied on lazily reading process.env for Google client configuration. By introducing explicit configuration options, the changes enable request-scoped provider selection, eliminating the unsafe concurrent modification of environment variables, while ensuring full backward compatibility with existing environment variable-based configurations.

Highlights

  • Explicit Configuration for GoogleLlm: Introduced the GoogleLlmConfig interface to GoogleLlm, allowing explicit, request-scoped configuration for apiKey, Vertex AI project/location, or a pre-built GoogleGenAI client, effectively bypassing process.env.
  • AiSdkLlm Client Injection: Added AiSdkLlmOptions with a googleGenaiClient property to AiSdkLlm, enabling a pre-built Google GenAI client to be passed through to GeminiContextCacheManager for context caching, thus avoiding environment variable race conditions.
  • Code Refactoring and Encapsulation: Refactored GoogleLlm by converting private _field properties to JavaScript's #private fields and extracted a shared #buildClient() method to centralize and de-duplicate client construction logic.

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request refactors the AiSdkLlm and GoogleLlm classes to introduce explicit configuration options, primarily to address potential race conditions in multi-tenant server environments. This is achieved by allowing pre-built GoogleGenAI clients or specific API configurations to be injected directly rather than relying solely on environment variables. Key changes include adding AiSdkLlmOptions and GoogleLlmConfig interfaces, updating constructors to accept these options, and centralizing client building logic in GoogleLlm to prioritize explicit configurations. All internal private fields have been converted to use the # private class field syntax. Corresponding test files have been updated with new test cases to validate these explicit configuration options and ensure backward compatibility and race condition safety. I have no feedback to provide.


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83c11e3f42


Comment thread: packages/adk/src/models/google-llm.ts (Outdated)
Comment on lines +311 to +314
    if (this.#config?.vertexai === true) {
      this.#apiBackend = GoogleLLMVariant.VERTEX_AI;
    } else if (this.#config) {
      this.#apiBackend = GoogleLLMVariant.GEMINI_API;


P1: Resolve backend from effective client settings

apiBackend treats any config with vertexai: true as VERTEX_AI, but #buildClient only creates a Vertex client when both project and location are present. With inputs like { vertexai: true, apiKey: "..." }, the instance ends up using a Gemini API-key client while still reporting VERTEX_AI, so preprocessRequest skips Gemini-specific sanitization (labels/displayName) and requests can fail against the API-key backend.


Comment thread: packages/adk/src/models/google-llm.ts (Outdated)
Comment on lines +393 to +397

    // 3. Env fallback (existing behaviour)
    const useVertexAI = process.env.GOOGLE_GENAI_USE_VERTEXAI === "true";
    const apiKey = process.env.GOOGLE_API_KEY;
    const project = process.env.GOOGLE_CLOUD_PROJECT;
    const location = process.env.GOOGLE_CLOUD_LOCATION;


P2: Fail on partial explicit config before reading process.env

When a GoogleLlmConfig object is provided but missing required fields, #buildClient falls through to environment-variable selection. That silently reintroduces the shared-process.env behavior this change is intended to avoid (for example, { vertexai: true, project: "p" } can unexpectedly pick env credentials/backend instead of failing fast), which can route a request to the wrong backend in multi-tenant usage.

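A minimal sketch of the fail-fast validation this review point asks for, assuming the config field names from the PR description; validateConfig itself is a hypothetical helper, not the actual ADK code:

```typescript
interface GoogleLlmConfig {
  apiKey?: string;
  vertexai?: boolean;
  project?: string;
  location?: string;
  client?: unknown; // a pre-built client needs no further validation
}

// Throws on a provided-but-incomplete config instead of silently falling
// back to process.env (which would reintroduce the race condition).
function validateConfig(config?: GoogleLlmConfig): void {
  if (!config) return; // no config at all: env fallback is intentional
  if (config.client) return;
  if (config.vertexai) {
    if (!config.project || !config.location) {
      throw new Error(
        "GoogleLlmConfig: vertexai requires both project and location",
      );
    }
    return;
  }
  if (!config.apiKey) {
    throw new Error(
      "GoogleLlmConfig: apiKey is required when vertexai is not set",
    );
  }
}
```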

Contributor


Interesting, I see codex picked them. Yep, I think these are significant.

brianmhunt and others added 2 commits March 25, 2026 10:57
Address review feedback:
- When GoogleLlmConfig is provided but incomplete (e.g. vertexai
  without project/location), throw instead of silently falling
  through to process.env (which would reintroduce the race condition)
- Align apiBackend getter with #buildClient: require vertexai +
  project + location to report VERTEX_AI, not just vertexai flag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Contributor

vercel bot commented Mar 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project  Deployment  Actions  Updated (UTC)
adk-web  Ready       Preview  Mar 31, 2026 4:43pm


@Timonwa
Contributor

Timonwa commented Mar 31, 2026

Hi @brianmhunt, thank you for contributing to ADK-TS. 🎉 Your PR will be reviewed soon by one of our developers.

Contributor

@MarvelNwachukwu left a comment


Hey @brianmhunt, two things came up.

  1. Pre-built Vertex clients get misclassified

If someone passes { client: vertexClient } without also setting vertexai: true + project + location, the apiBackend getter falls through to GEMINI_API. The #buildClient() path does the right thing and returns the injected client, but apiBackend has no idea it's actually a Vertex client.

That's not just cosmetic. preprocessRequest() checks apiBackend to decide whether to strip labels and display names, and liveApiVersion uses it to pick between v1alpha and v1beta1. So you end up with a Vertex client getting the Gemini API treatment, which will quietly break things.

Probably the simplest fix is adding a backend field to GoogleLlmConfig so callers can declare what their pre-built client actually is. Or have the apiBackend getter refuse to guess when it sees { client: ... } without an explicit backend.

  2. The AiSdkLlm cache manager test is a false positive

The test at ai-sdk.test.ts:340 consumes the generator and then asserts GeminiContextCacheManager was constructed with the right args. But generateContentAsyncImpl has a catch block (line 305) that swallows non-rate-limit errors and yields an error response instead of throwing. So if the mock isn't wired up right and handleContextCaching throws a TypeError, the test still passes because the generator just quietly yields an error and moves on. The assertion on the constructor call succeeds, but the code path it's meant to verify never actually ran.

Should either assert that the yielded response isn't an error, or check that handleContextCaching was actually called successfully.

One more thing: the changeset should be patch, not minor.
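The first point's suggested fix could be sketched like this; backend is a hypothetical field name and the types are illustrative stand-ins, not the actual ADK API:

```typescript
type Backend = "VERTEX_AI" | "GEMINI_API";

interface GoogleLlmConfig {
  client?: { readonly tag: string }; // stand-in for a GoogleGenAI instance
  backend?: Backend; // declares what the injected client actually talks to
  vertexai?: boolean;
  project?: string;
  location?: string;
  apiKey?: string;
}

function resolveApiBackend(config: GoogleLlmConfig): Backend {
  if (config.client) {
    if (!config.backend) {
      // Refuse to guess: a misclassified Vertex client would get the wrong
      // request preprocessing (labels/displayName) and live API version.
      throw new Error(
        "GoogleLlmConfig: backend must be declared when injecting a client",
      );
    }
    return config.backend;
  }
  return config.vertexai && config.project && config.location
    ? "VERTEX_AI"
    : "GEMINI_API";
}
```

With an explicit backend, preprocessRequest and liveApiVersion can trust the getter instead of inferring from incomplete config fields.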
