TanStack · tombeckenham · Jun 11, 2026
diff --git a/.changeset/gemini-veo-video-adapter.md b/.changeset/gemini-veo-video-adapter.md
@@ -0,0 +1,42 @@
+---
+'@tanstack/ai': minor
+'@tanstack/ai-gemini': minor
+---
+
+Add a Google Veo video adapter (`geminiVideo` / `createGeminiVideo`) and the
+per-model typed-duration video contract it is built on (#534, #634).
+
+**`@tanstack/ai`** (additive, non-breaking): `VideoAdapter` /
+`BaseVideoAdapter` gain a `TModelDurationByName` generic (defaulting to
+`Record<string, number>`, preserving today's `duration?: number` typing for
+adapters without a map) plus two introspection methods with safe defaults:
+
+- `availableDurations()` — a `DurationOptions` tagged union
+  (`discrete | range | mixed | none`) describing the durations the current
+  model accepts. Default: `{ kind: 'none' }`.
+- `snapDuration(seconds)` — coerce raw seconds to the closest valid duration
+  (`snapToDurationOption` is exported for adapter authors). Default:
+  `undefined`.
+
+`generateVideo({ duration })` is now typed per model via
+`VideoDurationForAdapter<TAdapter>`.
+
+**`@tanstack/ai-gemini`**: new Veo adapter over the long-running
+`:predictLongRunning` operation, supporting `veo-3.1-generate-preview`,
+`veo-3.1-fast-generate-preview`, `veo-3.0-generate-001`,
+`veo-3.0-fast-generate-001`, and `veo-2.0-generate-001`:
+
+- `geminiVideo('veo-3.0-generate-001')` → `duration?: 4 | 6 | 8`
+  (Veo 2: `5 | 6 | 8`); `adapter.snapDuration(7)` → `6`.
+- Multimodal prompts: the first un-roled / `'start_frame'` image part
+  becomes the input image, `'end_frame'` → `lastFrame`, `'reference'` /
+  `'character'` → `referenceImages`.
+- `size` takes Veo aspect ratios (`'16:9' | '9:16'`); everything else from
+  the SDK's `GenerateVideosConfig` (e.g. `resolution`, `generateAudio`,
+  `negativePrompt`) is available through `modelOptions`.
+- Responsible-AI filtering is surfaced as a failed job with the filter
+  reasons.
+
+Note: Veo result URLs are served by the Gemini Files API and require the
+Google API key to download (`x-goog-api-key` header or `key` query
+parameter).
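
The `availableDurations()` / `snapDuration()` contract described in the changeset above can be pictured with a small standalone sketch. This is not the `snapToDurationOption` implementation from `@tanstack/ai`. The `snapSketch` name, the `min` / `max` fields on the `range` variant, and the omission of the `mixed` variant are all assumptions made for illustration. Only the `discrete` shape (`{ kind: 'discrete', values: [...] }`) and the `snapDuration(7)` → `6` result come from this diff. The lower-value tie-break is chosen to match that one documented result and may not be the library's general rule.

```typescript
// Hypothetical, simplified copy of the tagged union; the real
// `DurationOptions` type may carry different fields and also has a
// 'mixed' variant that this sketch leaves out.
type DurationOptions =
  | { kind: 'discrete'; values: ReadonlyArray<number> }
  | { kind: 'range'; min: number; max: number } // field names assumed
  | { kind: 'none' }

// Hypothetical helper: coerce raw seconds to a value the model accepts.
function snapSketch(
  options: DurationOptions,
  seconds: number,
): number | undefined {
  switch (options.kind) {
    case 'none':
      // No duration map declared: nothing to snap to.
      return undefined
    case 'range':
      // Clamp into the inclusive [min, max] interval.
      return Math.min(options.max, Math.max(options.min, seconds))
    case 'discrete': {
      let best: number | undefined
      // Ascending order plus a strict `<` keeps the lower value on a tie,
      // which matches the documented snapDuration(7) -> 6 for [4, 6, 8].
      for (const value of [...options.values].sort((a, b) => a - b)) {
        if (
          best === undefined ||
          Math.abs(value - seconds) < Math.abs(best - seconds)
        ) {
          best = value
        }
      }
      return best
    }
  }
}

const veo3: DurationOptions = { kind: 'discrete', values: [4, 6, 8] }
console.log(snapSketch(veo3, 7)) // 6
```

A UI slider could pass its raw value through a helper like this before calling `generateVideo`, and fall back to omitting `duration` when the result is `undefined`.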
diff --git a/docs/media/video-generation.md b/docs/media/video-generation.md
@@ -2,11 +2,13 @@
 title: Video Generation
 id: video-generation
 order: 6
-description: "Generate video from text prompts with OpenAI Sora using TanStack AI's experimental generateVideo() jobs/polling API."
+description: "Generate video from text prompts with OpenAI Sora or Google Veo using TanStack AI's experimental generateVideo() jobs/polling API."
 keywords:
   - tanstack ai
   - video generation
   - sora
+  - veo
+  - gemini
   - generateVideo
   - jobs api
   - experimental
@@ -36,6 +38,7 @@ TanStack AI provides experimental support for video generation through dedicated
 
 Currently supported:
 - **OpenAI**: Sora-2 and Sora-2-Pro models (when available)
+- **Google Gemini**: Veo 3.1, Veo 3, and Veo 2 models (via the long-running operations API)
 
 ## Basic Usage
 
@@ -417,9 +420,9 @@ adapter uses to route the input to the provider-specific field:
 
 | Role            | Maps to                                                       |
 | --------------- | ------------------------------------------------------------- |
-| `'start_frame'` | fal `start_image_url` (positional default for the first input)         |
-| `'end_frame'`   | fal `end_image_url` (Veo `lastFrame` planned — no Veo adapter yet)      |
-| `'reference'`   | fal `reference_image_urls` (Veo `referenceImages` planned)              |
+| `'start_frame'` | fal `start_image_url`, Veo input `image` (positional default for the first input) |
+| `'end_frame'`   | fal `end_image_url`, Veo `lastFrame`                          |
+| `'reference'`   | fal `reference_image_urls`, Veo `referenceImages`             |
 | `'character'`   | Same as `'reference'` — character consistency images                    |
 
 ```typescript
@@ -445,7 +448,7 @@ await generateVideo({
 | ------------ | -------------------------------------------------------------------------------------------------------- |
 | **OpenAI**   | Sora-2 / Sora-2-Pro → the image part goes to `input_reference`; flattened text is the prompt. Single image only — throws if more than one. |
 | **fal.ai**   | Field names resolve per endpoint from a map generated from the fal SDK's endpoint types — e.g. `role: 'start_frame'` lands on `image_url` for Kling/Veo image-to-video, `first_frame_url` for first-last-frame endpoints, and `start_image_url` otherwise. Defaults: single input → `image_url` (start frame); `role: 'end_frame'` → `end_image_url`; `role: 'reference'` / `'character'` → `reference_image_urls`. Override per-endpoint via `modelOptions` — the media-conditioning fields are typed optional there (even when the endpoint requires them) since they usually arrive as prompt parts. |
-| **Gemini**   | Veo adapter not yet implemented — image prompt parts will be supported when Veo lands.                    |
+| **Gemini**   | Veo → the first un-roled / `'start_frame'` image becomes the input image; `'end_frame'` → `lastFrame`; `'reference'` / `'character'` → `referenceImages` (asset references, Veo 3.1). Throws on multiple starting images. |
 
 Adapters whose underlying API can't accept image inputs throw a clear
 runtime error so calls fail fast.
@@ -488,6 +491,67 @@ const { jobId } = await generateVideo({
 })
 ```
 
+### Google Veo (Gemini) Model Options
+
+Veo runs on Google's long-running operations API. The adapter starts the
+operation, and `getVideoJobStatus` polls it until the video is ready:
+
+```typescript
+import { generateVideo } from '@tanstack/ai'
+import { geminiVideo } from '@tanstack/ai-gemini'
+
+const adapter = geminiVideo('veo-3.1-generate-preview')
+
+const { jobId } = await generateVideo({
+  adapter,
+  prompt: 'A close-up of a luthier carving a guitar neck',
+  size: '16:9', // aspect ratio: '16:9' or '9:16'
+  duration: 8, // typed per model — see below
+  modelOptions: {
+    resolution: '1080p', // '720p' (default) or '1080p'; '4k' is Veo 3.1 only
+    negativePrompt: 'cartoon, low quality',
+    generateAudio: true, // Veo 3+ generates synchronized audio
+  },
+})
+```
+
+#### Typed durations
+
+Each Veo model accepts a fixed set of durations, enforced at compile time on
+the `duration` option:
+
+| Model | `duration` values (seconds) |
+|-------|------------------------------|
+| `veo-3.1-generate-preview` | `4`, `6`, `8` |
+| `veo-3.1-fast-generate-preview` | `4`, `6`, `8` |
+| `veo-3.0-generate-001` | `4`, `6`, `8` |
+| `veo-3.0-fast-generate-001` | `4`, `6`, `8` |
+| `veo-2.0-generate-001` | `5`, `6`, `8` |
+
+If you have raw seconds (for example from a UI slider), coerce them with
+`snapDuration`, or inspect the full set with `availableDurations`:
+
+```typescript
+const adapter = geminiVideo('veo-3.0-generate-001')
+
+adapter.availableDurations() // { kind: 'discrete', values: [4, 6, 8] }
+adapter.snapDuration(7) // 6: 7 is equidistant from 6 and 8, and this tie resolves to 6
+
+await generateVideo({
+  adapter,
+  prompt: 'A timelapse of a city skyline at dusk',
+  duration: adapter.snapDuration(7),
+})
+```
+
+Adapters that haven't declared a per-model duration map keep the plain
+`duration?: number` typing, return `{ kind: 'none' }` from
+`availableDurations()`, and return `undefined` from `snapDuration()`.
+
+> **Note:** The video URL returned for Veo jobs is served by the Gemini
+> Files API and requires your API key to download (send it as an
+> `x-goog-api-key` header or `key` query parameter).
+
 ## Response Types
 
 > **Note:** The interfaces below are the underlying adapter-level types. The `getVideoJobStatus()` helper returns a single merged object, `{ status, progress?, url?, error?, usage? }` — it does not return `jobId` or `expiresAt`.
@@ -586,9 +650,11 @@ Check the [OpenAI documentation](https://platform.openai.com/docs) for current l
 
 ## Environment Variables
 
-The video adapter uses the same environment variable as other OpenAI adapters:
+Each video adapter reads the same environment variables as its provider's
+other adapters:
 
-- `OPENAI_API_KEY`: Your OpenAI API key
+- `OPENAI_API_KEY`: Your OpenAI API key (Sora)
+- `GOOGLE_API_KEY` or `GEMINI_API_KEY`: Your Google API key (Veo)
 
 ## Explicit API Keys