
Google Vertex AI

Access Claude, Gemini, and Model-as-a-Service (MaaS) models through Google Cloud's Vertex AI platform.

Configuration

Vertex AI uses Google Cloud OAuth2 authentication with service accounts.

Service Account (Recommended)

Environment Variables:

GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
GOOGLE_CLOUD_PROJECT="your-project-id"
GOOGLE_CLOUD_REGION="global"

Provider Options:

ReqLLM.generate_text(
  "google_vertex:claude-sonnet-4-5@20250929",
  "Hello",
  provider_options: [
    service_account_json: "/path/to/service-account.json",
    project_id: "your-project-id",
    region: "global"
  ]
)

Model Specs

For the full model-spec workflow, see Model Specs.

Use exact Vertex model IDs from LLMDB.xyz when possible. For MaaS and other OpenAI-compatible Vertex models that are not in the registry yet, build a full explicit model spec with ReqLLM.model!/1. Some MaaS model IDs also need extra.family when the family cannot be inferred from the ID alone.

Provider Options

Passed via the :provider_options keyword:

service_account_json

  • Type: String (file path)
  • Purpose: Path to Google Cloud service account JSON file
  • Fallback: GOOGLE_APPLICATION_CREDENTIALS env var
  • Example: provider_options: [service_account_json: "/path/to/credentials.json"]

access_token

  • Type: String
  • Purpose: Use an existing OAuth2 access token generated outside ReqLLM (e.g., via Goth or gcloud)
  • Behavior: Bypasses the service account JSON flow and internal token management
  • Example: provider_options: [access_token: "your-access-token"]
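If you already manage tokens with a library such as Goth, you can fetch one and pass it through directly. A sketch, assuming a Goth server named MyApp.Goth is already running in your supervision tree (the name is illustrative):

```elixir
# Fetch a current OAuth2 token from an existing Goth server (hypothetical
# server name) and hand it to ReqLLM, bypassing the service-account flow.
{:ok, %{token: token}} = Goth.fetch(MyApp.Goth)

ReqLLM.generate_text(
  "google_vertex:claude-sonnet-4-5@20250929",
  "Hello",
  provider_options: [
    access_token: token,
    project_id: "your-project-id"
  ]
)
```

With access_token set, ReqLLM does not refresh the token itself; refreshing remains your (or Goth's) responsibility.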

project_id

  • Type: String
  • Purpose: Google Cloud project ID
  • Fallback: GOOGLE_CLOUD_PROJECT env var
  • Example: provider_options: [project_id: "my-project-123"]
  • Required: Yes

region

  • Type: String
  • Default: "global"
  • Purpose: GCP region for Vertex AI endpoint
  • Example: provider_options: [region: "us-central1"]
  • Note: Use "global" for newest models, specific regions for regional deployment

additional_model_request_fields

  • Type: Map
  • Purpose: Model-specific request fields (e.g., thinking configuration)
  • Example:
    provider_options: [
      additional_model_request_fields: %{
        thinking: %{type: "enabled", budget_tokens: 4096}
      }
    ]

labels

  • Type: Map of strings to strings
  • Purpose: Custom metadata labels attached to the request. Used by Google Cloud for billing and reporting — labels are filterable in billing reports and BigQuery exports.
  • Constraints: Up to 64 labels per request; keys 1–63 chars starting with a lowercase letter; keys and values may only contain lowercase letters, numbers, underscores, and dashes.
  • Availability: Vertex AI only — the direct Gemini API (generativelanguage.googleapis.com) does not support this field.
  • Example:
    provider_options: [
      labels: %{
        "team" => "engineering",
        "environment" => "production",
        "use_case" => "contract_analysis"
      }
    ]
  • Reference: Custom metadata labels

Claude-Specific Options

Vertex AI supports the same Claude options as the native Anthropic provider:

anthropic_top_k

  • Type: 1..40
  • Purpose: Sample from top K options per token
  • Example: provider_options: [anthropic_top_k: 20]

stop_sequences

  • Type: List of strings
  • Purpose: Custom stop sequences
  • Example: provider_options: [stop_sequences: ["END", "STOP"]]

anthropic_metadata

  • Type: Map
  • Purpose: Request metadata for tracking
  • Example: provider_options: [anthropic_metadata: %{user_id: "123"}]

thinking

  • Type: Map
  • Purpose: Enable extended thinking/reasoning
  • Example: provider_options: [thinking: %{type: "enabled", budget_tokens: 4096}]
  • Access: ReqLLM.Response.thinking(response)
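Putting the option and the accessor together, a round-trip sketch (model ID and prompt are illustrative):

```elixir
# Enable extended thinking with a token budget, then read both the
# reasoning trace and the final answer back from the response.
{:ok, response} =
  ReqLLM.generate_text(
    "google_vertex:claude-sonnet-4-5@20250929",
    "Is 1987 a prime number? Explain your reasoning.",
    provider_options: [
      thinking: %{type: "enabled", budget_tokens: 4096}
    ]
  )

ReqLLM.Response.thinking(response)  # extended reasoning text
ReqLLM.Response.text(response)      # final answer
```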

anthropic_prompt_cache

  • Type: Boolean
  • Purpose: Enable prompt caching
  • Example: provider_options: [anthropic_prompt_cache: true]

anthropic_prompt_cache_ttl

  • Type: String (e.g., "1h")
  • Purpose: Cache TTL (defaults to ~5 minutes if omitted)
  • Example: provider_options: [anthropic_prompt_cache_ttl: "1h"]
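The two caching options are typically combined: anthropic_prompt_cache turns caching on and anthropic_prompt_cache_ttl extends how long the cached prefix lives. A sketch (model ID and prompt are illustrative):

```elixir
# Cache the prompt prefix for one hour instead of the default ~5 minutes.
ReqLLM.generate_text(
  "google_vertex:claude-sonnet-4-5@20250929",
  "Hello",
  provider_options: [
    anthropic_prompt_cache: true,
    anthropic_prompt_cache_ttl: "1h"
  ]
)
```

The TTL only matters when caching is enabled; setting anthropic_prompt_cache_ttl alone has no effect.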

Supported Models

Claude 4.5 Family

  • Haiku 4.5: google_vertex:claude-haiku-4-5@20251001

    • Fast, cost-effective
    • Full tool calling and reasoning support
  • Sonnet 4.5: google_vertex:claude-sonnet-4-5@20250929

    • Balanced performance and capability
    • Extended thinking support
  • Opus 4.1: google_vertex:claude-opus-4-1@20250805

    • Highest capability
    • Advanced reasoning

Claude 4.0 & Earlier

  • Sonnet 4.0: google_vertex:claude-sonnet-4@20250514
  • Opus 4.0: google_vertex:claude-opus-4@20250514
  • Sonnet 3.7: google_vertex:claude-3-7-sonnet@20250219
  • Sonnet 3.5 v2: google_vertex:claude-3-5-sonnet@20241022
  • Haiku 3.5: google_vertex:claude-3-5-haiku@20241022

Model ID Format

Vertex uses the @ symbol for versioning:

  • Format: claude-{tier}-{version}@{date}
  • Example: claude-sonnet-4-5@20250929

Wire Format Notes

  • Authentication: OAuth2 with service account tokens (auto-refreshed)
  • Endpoint: Model-specific paths under aiplatform.googleapis.com
  • API: Uses Anthropic's raw message format (compatible with native API)
  • Streaming: Standard Server-Sent Events (SSE)
  • Region routing: Global endpoint for newest models, regional for specific deployments

All of these differences are handled automatically by ReqLLM.

Resources