docs: add cloud Gemini agent setup recipes by chiveswang · Pull Request #30 · Open-LLM-VTuber/open-llm-vtuber.github.io

chiveswang · 2026-06-07T03:50:21Z

Summary

update the Gemini example model from the old experimental Flash model to gemini-2.5-flash
document using cloud Gemini Flash / Flash-Lite models to reduce local hardware load
add guidance for hosted Gemma 4 endpoints and when to use openai_compatible_llm
add a reusable setup prompt for Claude, Codex, or another agent to safely edit local conf.yaml

Validation

npm run build

Notes

No API keys or private endpoint values are included.
The rate-limit guidance points users to AI Studio because Gemini RPM/TPM/RPD limits vary by project, model, billing tier, and account state.

Summary by CodeRabbit

Documentation
- Updated Gemini model configuration examples to the latest models
- Added guidance on using Gemini cloud models to reduce local hardware requirements
- Included recommended Gemini Flash model variants
- Added information about rate limits and handling for Gemini APIs
- Enhanced configuration instructions for setting up Gemini LLM

coderabbitai · 2026-06-07T03:50:27Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 12739781-393c-46bd-ad9b-e2ee25b99b82

📥 Commits

Reviewing files that changed from the base of the PR and between b6a1634 and e5184bd.

📒 Files selected for processing (2)

docs/user-guide/backend/llm.md
i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md

✅ Files skipped from review due to trivial changes (2)

docs/user-guide/backend/llm.md
i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md

📝 Walkthrough

Walkthrough

This PR updates documentation for the Gemini LLM backend configuration. The gemini_llm example is updated to use the newer gemini-2.5-flash model, and new guidance is added explaining how to offload computation to Gemini cloud services to reduce local hardware requirements. An agent-friendly configuration prompt is included for safe automated editing of conf.yaml.

Changes

Gemini Cloud Model Configuration Documentation

Layer / File(s)	Summary
Gemini model version update `docs/user-guide/backend/llm.md`, `i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md`	Updated the `gemini_llm` configuration example to reference `gemini-2.5-flash` instead of `gemini-2.0-flash-exp` in both the main docs and i18n variant.
Cloud Gemini usage guidance and agent configuration `docs/user-guide/backend/llm.md`, `i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md`	Added new sections documenting how to use cloud Gemini models to reduce local compute load, including a list of recommended Gemini Flash and Flash-Lite variants, guidance for Gemma 4 via OpenAI-compatible configuration, a tip about Google API rate limits (RPM/TPM/RPD) and 429 error handling, and a detailed prompt template for Claude/Codex agents to safely edit local `conf.yaml` for Gemini cloud configuration without committing API keys or private URLs.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit hops through the cloud so bright,
With Gemini Flash shining in the light!
From 2-point-oh to 2-point-five we spring,
Less compute needed—oh, what a thing! ✨🐰

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically summarizes the main change: adding documentation for cloud Gemini agent setup recipes, which aligns with updating Gemini model examples and adding setup guidance.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chiveswang · 2026-06-08T14:01:12Z

@coderabbitai review

coderabbitai · 2026-06-08T14:01:19Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/user-guide/backend/llm.md`:
- Around line 143-157: Update the example Gemini model IDs to the current Flash
/ Flash‑Lite variants: replace occurrences of "gemini-2.5-flash-lite" with
"gemini-3.1-flash-lite" and "gemini-2.5-flash" with "gemini-3.5-flash" in the
gemini_llm section, leaving the Gemma 4 guidance and references to
openai_compatible_llm, ollama_llm, lmstudio_llm, and llama_cpp_llm unchanged.
- Line 139: Update the docs entry that specifies the model string
"gemini-2.5-flash" by adding a brief deprecation/shutdown note stating that
"gemini-2.5-flash" is deprecated and scheduled for shutdown on Oct 16, 2026, and
recommend switching to "gemini-3.5-flash"; place this warning adjacent to the
model declaration (the line containing model: "gemini-2.5-flash") so readers see
the replacement recommendation and shutdown date immediately.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd06ec65-c9b1-475c-8311-42f2f5afa174

📥 Commits

Reviewing files that changed from the base of the PR and between 622a307 and b6a1634.

📒 Files selected for processing (2)

docs/user-guide/backend/llm.md
i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md

coderabbitai · 2026-06-08T14:08:24Z

+#### 使用 Gemini 云端模型降低本地运行需求
+
+`gemini_llm` 使用 Google Gemini API 的 [OpenAI 兼容端点](https://ai.google.dev/gemini-api/docs/openai)。如果你希望减少本地 GPU/CPU 负担，可以把主要对话模型切到云端 Gemini，只在本地运行 Live2D、TTS、ASR 或其他你需要保留在本机的组件。
+
+常见选择：
+
+- `gemini-2.5-flash-lite`：适合高频、轻量、低延迟对话。
+- `gemini-2.5-flash`：适合需要更强理解与推理能力的常规对话。
+- 其他 Gemini Flash / Flash-Lite 模型：请以 [Gemini 模型列表](https://ai.google.dev/gemini-api/docs/models) 和 AI Studio 中可用的模型名称为准。
+
+如果你想使用托管的 Gemma 4，请先确认服务提供的实际端点和模型名称。Gemma 4 是 Google 的开放模型系列，可以通过 Google AI Studio、Vertex AI 或其他托管服务使用；如果该服务提供 OpenAI 兼容接口，请优先把它配置到 `openai_compatible_llm`，而不是假设它一定属于 `gemini_llm`。Gemma 4 的本地运行路线仍然适合放在 `ollama_llm`、`lmstudio_llm` 或 `llama_cpp_llm`。
+
+:::tip
+Gemini 的 RPM、TPM、RPD 限制会随模型、项目、计费层级和账号状态变化，并且是按 Google Cloud/AI Studio 项目计算，不是按单个 API key 计算。请在 AI Studio 查看当前项目的实际限制，并参考 [Gemini API rate limits](https://ai.google.dev/gemini-api/docs/rate-limits)。如果遇到 429，请降低并发、缩短上下文、减少连续重试，或换用更高配额的项目。
+:::


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does the Google Gemini API provide an OpenAI-compatible endpoint as of June 2026? What are the currently available Gemini Flash and Flash-Lite model variants?

💡 Result:

Yes. Google’s Gemini API provides an OpenAI-compatible REST endpoint for use with the OpenAI client libraries by pointing them at a Gemini “OpenAI compatibility” base URL. The docs show using the OpenAI library with base_url="https://generativelanguage.googleapis.com/v1beta/openai/" and calling the OpenAI-style path /chat/completions [1]. Currently available Gemini Flash model variants (Gemini API model IDs): - gemini-3.5-flash (stable/GA) [2] - gemini-3-flash-preview (listed as the preview version in the Gemini 3.5 Flash docs) [2] Currently available Gemini Flash-Lite model variants (Gemini API model IDs): - gemini-3.1-flash-lite (stable/GA) [3] Notes on “Flash-Lite” preview/discontinued variants: - The Gemini API release notes state that gemini-3.1-flash-lite-preview was shut down and that the GA replacement is gemini-3.1-flash-lite [4]. - The Gemini 2.0 Flash-Lite family is shown as shut down/discontinued by June 1, 2026 in the Gemini API changelog, and recommends using gemini-3.1-flash-lite instead [4].

Citations:

1: https://ai.google.dev/gemini-api/docs/openai

2: https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flash

3: https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-lite

4: https://ai.google.dev/gemini-api/docs/changelog

Update Gemini model names to match current Flash / Flash-Lite variants

Gemini API does provide an OpenAI-compatible endpoint (the docs endpoint link you cite is correct). However, the model IDs shown (gemini-2.5-flash-lite, gemini-2.5-flash) don’t match the currently available Flash/Flash-Lite variants in June 2026, which are gemini-3.1-flash-lite and gemini-3.5-flash (with some older Flash-Lite variants noted as shut down/discontinued). Update the examples accordingly; keep the Gemma 4 guidance as written (use openai_compatible_llm when the provider exposes an OpenAI-compatible interface).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/user-guide/backend/llm.md` around lines 143 - 157, Update the example Gemini model IDs to the current Flash / Flash‑Lite variants: replace occurrences of "gemini-2.5-flash-lite" with "gemini-3.1-flash-lite" and "gemini-2.5-flash" with "gemini-3.5-flash" in the gemini_llm section, leaving the Gemma 4 guidance and references to openai_compatible_llm, ollama_llm, lmstudio_llm, and llama_cpp_llm unchanged.

docs: add cloud Gemini agent setup recipes

b6a1634

coderabbitai Bot reviewed Jun 8, 2026

View reviewed changes

docs: refresh gemini model recommendations

e5184bd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: add cloud Gemini agent setup recipes#30

docs: add cloud Gemini agent setup recipes#30
chiveswang wants to merge 2 commits into
Open-LLM-VTuber:mainfrom
chiveswang:docs/cloud-gemini-agent-recipes

chiveswang commented Jun 7, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 7, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

chiveswang commented Jun 8, 2026

Uh oh!

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

chiveswang commented Jun 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

chiveswang commented Jun 8, 2026

Uh oh!

coderabbitai Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

chiveswang commented Jun 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 7, 2026 •

edited

Loading

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading