Access Claude, Gemini, and MaaS models through Google Cloud's Vertex AI platform.
Vertex AI uses Google Cloud OAuth2 authentication with service accounts.
Environment Variables:
GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
GOOGLE_CLOUD_PROJECT="your-project-id"
GOOGLE_CLOUD_REGION="global"

Provider Options:
ReqLLM.generate_text(
"google_vertex:claude-sonnet-4-5@20250929",
"Hello",
provider_options: [
service_account_json: "/path/to/service-account.json",
project_id: "your-project-id",
region: "global"
]
)

For the full model-spec workflow, see Model Specs.
Use exact Vertex model IDs from LLMDB.xyz when possible. For MaaS and other OpenAI-compatible Vertex models that are not in the registry yet, build a full explicit model spec with ReqLLM.model!/1. Some MaaS model IDs also need extra.family when the family cannot be inferred from the ID alone.
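As a sketch of that workflow: the model ID below is a hypothetical placeholder, and the exact options ReqLLM.model!/1 accepts for registry-absent models (including how extra.family is supplied) should be taken from the Model Specs guide rather than from this example.

```elixir
# Hypothetical MaaS model ID, for illustration only. When the family
# cannot be inferred from the ID, the Model Specs guide describes how
# to supply it explicitly (e.g., via extra.family).
model = ReqLLM.model!("google_vertex:example-maas-model-id")

ReqLLM.generate_text(model, "Hello",
  provider_options: [project_id: "your-project-id"]
)
```

This assumes generate_text also accepts the built model struct in place of a spec string.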
Passed via the :provider_options keyword:

service_account_json:
- Type: String (file path)
- Purpose: Path to Google Cloud service account JSON file
- Fallback: GOOGLE_APPLICATION_CREDENTIALS env var
- Example: provider_options: [service_account_json: "/path/to/credentials.json"]
access_token:
- Type: String
- Purpose: Use an existing OAuth2 access token generated outside ReqLLM (e.g., via Goth or gcloud)
- Behavior: Bypasses the service account JSON flow and internal token management
- Example: provider_options: [access_token: "your-access-token"]
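For instance, a token minted by Goth can be handed to ReqLLM directly. This is a sketch: MyApp.Goth is a placeholder for whatever name Goth is started under in your supervision tree, and it assumes the Goth library is a dependency of your app.

```elixir
# Fetch a cached, auto-refreshed OAuth2 token from Goth and bypass
# ReqLLM's internal token management.
{:ok, %Goth.Token{token: token}} = Goth.fetch(MyApp.Goth)

ReqLLM.generate_text(
  "google_vertex:claude-sonnet-4-5@20250929",
  "Hello",
  provider_options: [
    access_token: token,
    project_id: "your-project-id"
  ]
)
```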
project_id:
- Type: String
- Required: Yes
- Purpose: Google Cloud project ID
- Fallback: GOOGLE_CLOUD_PROJECT env var
- Example: provider_options: [project_id: "my-project-123"]
region:
- Type: String
- Default: "global"
- Purpose: GCP region for the Vertex AI endpoint
- Example: provider_options: [region: "us-central1"]
- Note: Use "global" for the newest models, specific regions for regional deployments
additional_model_request_fields:
- Type: Map
- Purpose: Model-specific request fields (e.g., thinking configuration)
- Example: provider_options: [additional_model_request_fields: %{thinking: %{type: "enabled", budget_tokens: 4096}}]
labels:
- Type: Map of strings to strings
- Purpose: Custom metadata labels attached to the request. Used by Google Cloud for billing and reporting; labels are filterable in billing reports and BigQuery exports.
- Constraints: Up to 64 labels per request; keys 1–63 characters, starting with a lowercase letter; keys and values may only contain lowercase letters, numbers, underscores, and dashes.
- Availability: Vertex AI only; the direct Gemini API (generativelanguage.googleapis.com) does not support this field.
- Example: provider_options: [labels: %{"team" => "engineering", "environment" => "production", "use_case" => "contract_analysis"}]
- Reference: Custom metadata labels
Vertex AI supports the same Claude options as native Anthropic:
anthropic_top_k:
- Type: Integer (1..40)
- Purpose: Sample from the top K options per token
- Example: provider_options: [anthropic_top_k: 20]
stop_sequences:
- Type: List of strings
- Purpose: Custom stop sequences
- Example: provider_options: [stop_sequences: ["END", "STOP"]]
anthropic_metadata:
- Type: Map
- Purpose: Request metadata for tracking
- Example: provider_options: [anthropic_metadata: %{user_id: "123"}]
thinking:
- Type: Map
- Purpose: Enable extended thinking/reasoning
- Example: provider_options: [thinking: %{type: "enabled", budget_tokens: 4096}]
- Access: ReqLLM.Response.thinking(response)
anthropic_prompt_cache:
- Type: Boolean
- Purpose: Enable prompt caching
- Example: provider_options: [anthropic_prompt_cache: true]
anthropic_prompt_cache_ttl:
- Type: String (e.g., "1h")
- Purpose: Cache TTL (defaults to roughly 5 minutes if omitted)
- Example: provider_options: [anthropic_prompt_cache_ttl: "1h"]
- Haiku 4.5: google_vertex:claude-haiku-4-5@20251001
  - Fast, cost-effective
  - Full tool calling and reasoning support
- Sonnet 4.5: google_vertex:claude-sonnet-4-5@20250929
  - Balanced performance and capability
  - Extended thinking support
- Opus 4.1: google_vertex:claude-opus-4-1@20250805
  - Highest capability
  - Advanced reasoning
- Sonnet 4.0: google_vertex:claude-sonnet-4@20250514
- Opus 4.0: google_vertex:claude-opus-4@20250514
- Sonnet 3.7: google_vertex:claude-3-7-sonnet@20250219
- Sonnet 3.5 v2: google_vertex:claude-3-5-sonnet@20241022
- Haiku 3.5: google_vertex:claude-3-5-haiku@20241022
Vertex uses the @ symbol for versioning:
- Format: claude-{tier}-{version}@{date}
- Example: claude-sonnet-4-5@20250929
- Authentication: OAuth2 with service account tokens (auto-refreshed)
- Endpoint: Model-specific paths under aiplatform.googleapis.com
- API: Uses Anthropic's raw message format (compatible with the native API)
- Streaming: Standard Server-Sent Events (SSE)
- Region routing: Global endpoint for newest models, regional for specific deployments
All of these differences are handled automatically by ReqLLM.
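Putting several of the options above together, a minimal end-to-end call might look like the sketch below. The project ID and prompt are placeholders; it assumes GOOGLE_APPLICATION_CREDENTIALS points at a valid service account file and that generate_text returns the conventional {:ok, response} tuple.

```elixir
{:ok, response} =
  ReqLLM.generate_text(
    "google_vertex:claude-sonnet-4-5@20250929",
    "Summarize the key risks in this contract.",
    provider_options: [
      project_id: "your-project-id",
      region: "global",
      thinking: %{type: "enabled", budget_tokens: 4096},
      labels: %{"team" => "engineering", "use_case" => "contract_analysis"}
    ]
  )

# Extended thinking output, if the model produced any
ReqLLM.Response.thinking(response)
```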