Skip to content

docs: add cloud Gemini agent setup recipes#30

Open
chiveswang wants to merge 2 commits into
Open-LLM-VTuber:mainfrom
chiveswang:docs/cloud-gemini-agent-recipes
Open

docs: add cloud Gemini agent setup recipes#30
chiveswang wants to merge 2 commits into
Open-LLM-VTuber:mainfrom
chiveswang:docs/cloud-gemini-agent-recipes

Conversation

@chiveswang

@chiveswang chiveswang commented Jun 7, 2026

Copy link
Copy Markdown

Summary

  • update the Gemini example model from the old experimental Flash model to gemini-2.5-flash
  • document using cloud Gemini Flash / Flash-Lite models to reduce local hardware load
  • add guidance for hosted Gemma 4 endpoints and when to use openai_compatible_llm
  • add a reusable setup prompt for Claude, Codex, or another agent to safely edit local conf.yaml

Validation

  • npm run build

Notes

  • No API keys or private endpoint values are included.
  • The rate-limit guidance points users to AI Studio because Gemini RPM/TPM/RPD limits vary by project, model, billing tier, and account state.

Summary by CodeRabbit

  • Documentation
    • Updated Gemini model configuration examples to the latest models
    • Added guidance on using Gemini cloud models to reduce local hardware requirements
    • Included recommended Gemini Flash model variants
    • Added information about rate limits and handling for Gemini APIs
    • Enhanced configuration instructions for setting up Gemini LLM

@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 12739781-393c-46bd-ad9b-e2ee25b99b82

📥 Commits

Reviewing files that changed from the base of the PR and between b6a1634 and e5184bd.

📒 Files selected for processing (2)
  • docs/user-guide/backend/llm.md
  • i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md
✅ Files skipped from review due to trivial changes (2)
  • docs/user-guide/backend/llm.md
  • i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md

📝 Walkthrough

Walkthrough

This PR updates documentation for the Gemini LLM backend configuration. The gemini_llm example is updated to use the newer gemini-2.5-flash model, and new guidance is added explaining how to offload computation to Gemini cloud services to reduce local hardware requirements. An agent-friendly configuration prompt is included for safe automated editing of conf.yaml.

Changes

Gemini Cloud Model Configuration Documentation

Layer / File(s) Summary
Gemini model version update
docs/user-guide/backend/llm.md, i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md
Updated the gemini_llm configuration example to reference gemini-2.5-flash instead of gemini-2.0-flash-exp in both the main docs and i18n variant.
Cloud Gemini usage guidance and agent configuration
docs/user-guide/backend/llm.md, i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md
Added new sections documenting how to use cloud Gemini models to reduce local compute load, including a list of recommended Gemini Flash and Flash-Lite variants, guidance for Gemma 4 via OpenAI-compatible configuration, a tip about Google API rate limits (RPM/TPM/RPD) and 429 error handling, and a detailed prompt template for Claude/Codex agents to safely edit local conf.yaml for Gemini cloud configuration without committing API keys or private URLs.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit hops through the cloud so bright,
With Gemini Flash shining in the light!
From 2-point-oh to 2-point-five we spring,
Less compute needed—oh, what a thing! ✨🐰

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main change: adding documentation for cloud Gemini agent setup recipes, which aligns with updating Gemini model examples and adding setup guidance.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chiveswang

Copy link
Copy Markdown
Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/user-guide/backend/llm.md`:
- Around line 143-157: Update the example Gemini model IDs to the current Flash
/ Flash‑Lite variants: replace occurrences of "gemini-2.5-flash-lite" with
"gemini-3.1-flash-lite" and "gemini-2.5-flash" with "gemini-3.5-flash" in the
gemini_llm section, leaving the Gemma 4 guidance and references to
openai_compatible_llm, ollama_llm, lmstudio_llm, and llama_cpp_llm unchanged.
- Line 139: Update the docs entry that specifies the model string
"gemini-2.5-flash" by adding a brief deprecation/shutdown note stating that
"gemini-2.5-flash" is deprecated and scheduled for shutdown on Oct 16, 2026, and
recommend switching to "gemini-3.5-flash"; place this warning adjacent to the
model declaration (the line containing model: "gemini-2.5-flash") so readers see
the replacement recommendation and shutdown date immediately.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd06ec65-c9b1-475c-8311-42f2f5afa174

📥 Commits

Reviewing files that changed from the base of the PR and between 622a307 and b6a1634.

📒 Files selected for processing (2)
  • docs/user-guide/backend/llm.md
  • i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/llm.md

Comment thread docs/user-guide/backend/llm.md Outdated
Comment on lines +143 to +157
#### 使用 Gemini 云端模型降低本地运行需求

`gemini_llm` 使用 Google Gemini API 的 [OpenAI 兼容端点](https://ai.google.dev/gemini-api/docs/openai)。如果你希望减少本地 GPU/CPU 负担,可以把主要对话模型切到云端 Gemini,只在本地运行 Live2D、TTS、ASR 或其他你需要保留在本机的组件。

常见选择:

- `gemini-2.5-flash-lite`:适合高频、轻量、低延迟对话。
- `gemini-2.5-flash`:适合需要更强理解与推理能力的常规对话。
- 其他 Gemini Flash / Flash-Lite 模型:请以 [Gemini 模型列表](https://ai.google.dev/gemini-api/docs/models) 和 AI Studio 中可用的模型名称为准。

如果你想使用托管的 Gemma 4,请先确认服务提供的实际端点和模型名称。Gemma 4 是 Google 的开放模型系列,可以通过 Google AI Studio、Vertex AI 或其他托管服务使用;如果该服务提供 OpenAI 兼容接口,请优先把它配置到 `openai_compatible_llm`,而不是假设它一定属于 `gemini_llm`。Gemma 4 的本地运行路线仍然适合放在 `ollama_llm`、`lmstudio_llm` 或 `llama_cpp_llm`。

:::tip
Gemini 的 RPM、TPM、RPD 限制会随模型、项目、计费层级和账号状态变化,并且是按 Google Cloud/AI Studio 项目计算,不是按单个 API key 计算。请在 AI Studio 查看当前项目的实际限制,并参考 [Gemini API rate limits](https://ai.google.dev/gemini-api/docs/rate-limits)。如果遇到 429,请降低并发、缩短上下文、减少连续重试,或换用更高配额的项目。
:::

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does the Google Gemini API provide an OpenAI-compatible endpoint as of June 2026? What are the currently available Gemini Flash and Flash-Lite model variants?

💡 Result:

Yes. Google’s Gemini API provides an OpenAI-compatible REST endpoint for use with the OpenAI client libraries by pointing them at a Gemini “OpenAI compatibility” base URL. The docs show using the OpenAI library with base_url="https://generativelanguage.googleapis.com/v1beta/openai/" and calling the OpenAI-style path /chat/completions [1]. Currently available Gemini Flash model variants (Gemini API model IDs): - gemini-3.5-flash (stable/GA) [2] - gemini-3-flash-preview (listed as the preview version in the Gemini 3.5 Flash docs) [2] Currently available Gemini Flash-Lite model variants (Gemini API model IDs): - gemini-3.1-flash-lite (stable/GA) [3] Notes on “Flash-Lite” preview/discontinued variants: - The Gemini API release notes state that gemini-3.1-flash-lite-preview was shut down and that the GA replacement is gemini-3.1-flash-lite [4]. - The Gemini 2.0 Flash-Lite family is shown as shut down/discontinued by June 1, 2026 in the Gemini API changelog, and recommends using gemini-3.1-flash-lite instead [4].

Citations:


Update Gemini model names to match current Flash / Flash-Lite variants

Gemini API does provide an OpenAI-compatible endpoint (the docs endpoint link you cite is correct). However, the model IDs shown (gemini-2.5-flash-lite, gemini-2.5-flash) don’t match the currently available Flash/Flash-Lite variants in June 2026, which are gemini-3.1-flash-lite and gemini-3.5-flash (with some older Flash-Lite variants noted as shut down/discontinued). Update the examples accordingly; keep the Gemma 4 guidance as written (use openai_compatible_llm when the provider exposes an OpenAI-compatible interface).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/user-guide/backend/llm.md` around lines 143 - 157, Update the example
Gemini model IDs to the current Flash / Flash‑Lite variants: replace occurrences
of "gemini-2.5-flash-lite" with "gemini-3.1-flash-lite" and "gemini-2.5-flash"
with "gemini-3.5-flash" in the gemini_llm section, leaving the Gemma 4 guidance
and references to openai_compatible_llm, ollama_llm, lmstudio_llm, and
llama_cpp_llm unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant