---
title: Chunk Definitions
id: chunk-definitions
---
All streaming responses in TanStack AI consist of a series of `StreamChunk` objects - discrete JSON objects representing different events during the conversation. These chunks enable real-time updates for content generation, tool calls, errors, and completion signals.
This document defines the data structures (chunks) that flow between the TanStack AI server and client during streaming chat operations.
All chunks share a common base structure:
```ts
interface BaseStreamChunk {
  type: StreamChunkType;
  id: string; // Unique identifier for the message/response
  model: string; // Model identifier (e.g., "gpt-5.2", "claude-3-5-sonnet")
  timestamp: number; // Unix timestamp in milliseconds
}
```

The `type` field identifies the kind of chunk:

```ts
type StreamChunkType =
  | 'content' // Text content being generated
  | 'thinking' // Model's reasoning process (when supported)
  | 'tool_call' // Model calling a tool/function
  | 'tool-input-available' // Tool inputs are ready for client execution
  | 'approval-requested' // Tool requires user approval
  | 'tool_result' // Result from tool execution
  | 'done' // Stream completion
  | 'error'; // Error occurred
```

A `content` chunk is emitted when the model generates text content. It is sent incrementally as tokens are generated.
```ts
interface ContentStreamChunk extends BaseStreamChunk {
  type: 'content';
  delta: string; // The incremental content token (new text since last chunk)
  content: string; // Full accumulated content so far
  role?: 'assistant';
}
```

Example:
```json
{
  "type": "content",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567890,
  "delta": "Hello",
  "content": "Hello",
  "role": "assistant"
}
```

Usage:
- Display `delta` for a smooth streaming effect
- Use `content` for the complete message so far
- Multiple content chunks will be sent for a single response (see the sketch below)
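For example, a minimal sketch of consuming content chunks in a client render loop; the `chunks` iterable and `render` callback are hypothetical stand-ins for your transport and UI layer, not TanStack AI APIs:

```ts
// Minimal sketch: render content chunks as they arrive.
// `chunks` and `render` are hypothetical stand-ins, not TanStack AI APIs.
async function renderContent(
  chunks: AsyncIterable<StreamChunk>,
  render: (text: string) => void,
) {
  for await (const chunk of chunks) {
    if (chunk.type === 'content') {
      // `content` already holds the accumulated text, so there is no
      // need to concatenate `delta` values manually.
      render(chunk.content);
    }
  }
}
```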
A `thinking` chunk is emitted when the model exposes its reasoning process (e.g., Claude with extended thinking, o1 models).
```ts
interface ThinkingStreamChunk extends BaseStreamChunk {
  type: 'thinking';
  delta?: string; // The incremental thinking token
  content: string; // Full accumulated thinking content so far
}
```

Example:
```json
{
  "type": "thinking",
  "id": "chatcmpl-abc123",
  "model": "claude-3-5-sonnet",
  "timestamp": 1701234567890,
  "delta": "First, I need to",
  "content": "First, I need to"
}
```

Usage:
- Display in a separate "thinking" UI element
- Thinking is excluded from messages sent back to the model
- Not all models support thinking chunks
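One way to keep thinking out of the main reply is to route the two chunk types into separate buffers; the state shape below is illustrative, not part of TanStack AI:

```ts
// Illustrative UI state: thinking text is kept separate from the
// visible reply so it can be rendered in its own panel.
interface StreamingUiState {
  thinking: string;
  reply: string;
}

function applyChunk(state: StreamingUiState, chunk: StreamChunk): StreamingUiState {
  switch (chunk.type) {
    case 'thinking':
      return { ...state, thinking: chunk.content };
    case 'content':
      return { ...state, reply: chunk.content };
    default:
      return state; // Other chunk types do not affect this state.
  }
}
```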
A `tool_call` chunk is emitted when the model decides to call a tool/function.
```ts
interface ToolCallStreamChunk extends BaseStreamChunk {
  type: 'tool_call';
  toolCall: {
    id: string;
    type: 'function';
    function: {
      name: string;
      arguments: string; // JSON string (may be partial/incremental)
    };
  };
  index: number; // Index of this tool call (for parallel calls)
}
```

Example:
```json
{
  "type": "tool_call",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567890,
  "toolCall": {
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "get_weather",
      "arguments": "{\"location\":\"San Francisco\"}"
    }
  },
  "index": 0
}
```

Usage:
- Multiple chunks may be sent for a single tool call (streaming arguments)
- `arguments` may be incomplete until all chunks for this tool call are received
- `index` allows multiple parallel tool calls (see the accumulator sketch below)
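A sketch of accumulating streamed arguments per tool call, keyed by `index`; the accumulator itself is an assumption for illustration, not a TanStack AI API:

```ts
// Accumulates (possibly partial) argument strings per tool-call index.
// This Map-based accumulator is illustrative only.
const pendingCalls = new Map<number, { id: string; name: string; args: string }>();

function accumulateToolCall(chunk: ToolCallStreamChunk) {
  const existing = pendingCalls.get(chunk.index);
  if (existing) {
    // Later chunks for the same index carry further argument fragments.
    existing.args += chunk.toolCall.function.arguments;
  } else {
    pendingCalls.set(chunk.index, {
      id: chunk.toolCall.id,
      name: chunk.toolCall.function.name,
      args: chunk.toolCall.function.arguments,
    });
  }
}
```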
A `tool-input-available` chunk is emitted when tool inputs are complete and ready for client-side execution.
```ts
interface ToolInputAvailableStreamChunk extends BaseStreamChunk {
  type: 'tool-input-available';
  toolCallId: string; // ID of the tool call
  toolName: string; // Name of the tool to execute
  input: any; // Parsed tool arguments (JSON object)
}
```

Example:
```json
{
  "type": "tool-input-available",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567890,
  "toolCallId": "call_abc123",
  "toolName": "get_weather",
  "input": {
    "location": "San Francisco",
    "unit": "fahrenheit"
  }
}
```

Usage:
- Signals that the client should execute the tool
- Only sent for tools without a server-side `execute` function
- The client calls the `onToolCall` callback with these parameters (see the sketch below)
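A sketch of dispatching this chunk to a client-side implementation; the `clientTools` registry is hypothetical, and the handler shape assumes your `onToolCall` callback receives the fields shown above:

```ts
// Hypothetical registry of client-side tool implementations.
const clientTools: Record<string, (input: unknown) => Promise<unknown>> = {
  // Toy implementation for illustration.
  get_weather: async () => ({ temperature: 72, conditions: 'sunny' }),
};

async function onToolCall(chunk: ToolInputAvailableStreamChunk): Promise<unknown> {
  const tool = clientTools[chunk.toolName];
  if (!tool) throw new Error(`No client tool registered for "${chunk.toolName}"`);
  // Run locally; the result is then reported back as a tool_result.
  return tool(chunk.input);
}
```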
An `approval-requested` chunk is emitted when a tool requires user approval before execution.
```ts
interface ApprovalRequestedStreamChunk extends BaseStreamChunk {
  type: 'approval-requested';
  toolCallId: string; // ID of the tool call
  toolName: string; // Name of the tool requiring approval
  input: any; // Tool arguments for review
  approval: {
    id: string; // Unique approval request ID
    needsApproval: true; // Always true
  };
}
```

Example:
```json
{
  "type": "approval-requested",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567890,
  "toolCallId": "call_abc123",
  "toolName": "send_email",
  "input": {
    "to": "user@example.com",
    "subject": "Hello",
    "body": "Test email"
  },
  "approval": {
    "id": "approval_xyz789",
    "needsApproval": true
  }
}
```

Usage:
- Display an approval UI to the user
- The user responds with an approval decision via `addToolApprovalResponse()` (see the sketch below)
- Tool execution pauses until approval is granted or denied
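A sketch of an approval handler; `confirmWithUser` is a hypothetical UI prompt, and the signature assumed for `addToolApprovalResponse()` is an illustration, not the documented API:

```ts
// Hypothetical UI prompt; replace with your own confirmation dialog.
declare function confirmWithUser(message: string): Promise<boolean>;
// Named in the usage notes above; this signature is an assumption.
declare function addToolApprovalResponse(response: { id: string; approved: boolean }): void;

async function handleApprovalRequest(chunk: ApprovalRequestedStreamChunk) {
  const approved = await confirmWithUser(
    `Allow "${chunk.toolName}" to run with input ${JSON.stringify(chunk.input)}?`,
  );
  // Resumes (or cancels) the paused tool execution.
  addToolApprovalResponse({ id: chunk.approval.id, approved });
}
```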
A `tool_result` chunk is emitted when a tool execution completes (either server-side or client-side).
```ts
interface ToolResultStreamChunk extends BaseStreamChunk {
  type: 'tool_result';
  toolCallId: string; // ID of the tool call that was executed
  content: string; // Result of the tool execution (JSON stringified)
}
```

Example:
```json
{
  "type": "tool_result",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567891,
  "toolCallId": "call_abc123",
  "content": "{\"temperature\":72,\"conditions\":\"sunny\"}"
}
```

Usage:
- Sent after tool execution completes
- Model uses this result to continue the conversation
- May trigger additional model responses
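Because `content` is a JSON string, clients typically parse it before display; a minimal, defensive sketch:

```ts
// `content` is JSON-stringified, but tools may also return plain text,
// so fall back to the raw string when parsing fails.
function parseToolResult(chunk: ToolResultStreamChunk): unknown {
  try {
    return JSON.parse(chunk.content);
  } catch {
    return chunk.content;
  }
}
```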
A `done` chunk is emitted when the stream completes successfully.
```ts
interface DoneStreamChunk extends BaseStreamChunk {
  type: 'done';
  finishReason: 'stop' | 'length' | 'content_filter' | 'tool_calls' | null;
  usage?: TokenUsage;
}

interface TokenUsage {
  // Core token counts (always present when usage is available)
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;

  // Detailed prompt token breakdown
  promptTokensDetails?: {
    cachedTokens?: number; // Tokens from prompt cache hits
    cacheWriteTokens?: number; // Tokens written to cache
    cacheCreationTokens?: number; // Anthropic cache creation tokens
    cacheReadTokens?: number; // Anthropic cache read tokens
    audioTokens?: number; // Audio input tokens
    videoTokens?: number; // Video input tokens
    imageTokens?: number; // Image input tokens
    textTokens?: number; // Text input tokens
  };

  // Detailed completion token breakdown
  completionTokensDetails?: {
    reasoningTokens?: number; // Reasoning/thinking tokens (o1, Claude)
    audioTokens?: number; // Audio output tokens
    videoTokens?: number; // Video output tokens
    imageTokens?: number; // Image output tokens
    textTokens?: number; // Text output tokens
    acceptedPredictionTokens?: number; // Accepted prediction tokens
    rejectedPredictionTokens?: number; // Rejected prediction tokens
  };

  // Provider-specific details
  providerUsageDetails?: Record<string, unknown>;

  // Duration (for some billing models)
  durationSeconds?: number;
}
```

Example (basic usage):
```json
{
  "type": "done",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567892,
  "finishReason": "stop",
  "usage": {
    "promptTokens": 150,
    "completionTokens": 75,
    "totalTokens": 225
  }
}
```

Example (with cached tokens - OpenAI):
```json
{
  "type": "done",
  "id": "chatcmpl-abc123",
  "model": "gpt-4o",
  "timestamp": 1701234567892,
  "finishReason": "stop",
  "usage": {
    "promptTokens": 150,
    "completionTokens": 75,
    "totalTokens": 225,
    "promptTokensDetails": {
      "cachedTokens": 100
    }
  }
}
```

Example (with reasoning tokens - o1):
```json
{
  "type": "done",
  "id": "chatcmpl-abc123",
  "model": "o1-preview",
  "timestamp": 1701234567892,
  "finishReason": "stop",
  "usage": {
    "promptTokens": 150,
    "completionTokens": 500,
    "totalTokens": 650,
    "completionTokensDetails": {
      "reasoningTokens": 425
    }
  }
}
```

Example (Anthropic with cache):
```json
{
  "type": "done",
  "id": "msg_abc123",
  "model": "claude-3-5-sonnet",
  "timestamp": 1701234567892,
  "finishReason": "stop",
  "usage": {
    "promptTokens": 150,
    "completionTokens": 75,
    "totalTokens": 225,
    "promptTokensDetails": {
      "cacheCreationTokens": 50,
      "cacheReadTokens": 100
    }
  }
}
```

Finish Reasons:
- `stop` - Natural completion
- `length` - Reached the max token limit
- `content_filter` - Stopped by content filtering
- `tool_calls` - Stopped to execute tools
- `null` - Unknown or not provided
Usage:
- Marks the end of a successful stream
- Clean up streaming state
- Display token usage (if available)
Token Usage Notes:
- `promptTokensDetails.cachedTokens` - OpenAI prompt caching
- `promptTokensDetails.cacheCreationTokens` / `cacheReadTokens` - Anthropic caching
- `completionTokensDetails.reasoningTokens` - Internal reasoning tokens (o1, Claude thinking)
- `providerUsageDetails` - Provider-specific fields not covered by the standard schema
- For Gemini, modality-specific token counts (audio, video, image, text) are extracted from the response
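A sketch of handling a done chunk, reacting to the finish reason and logging usage; the logging destination is a placeholder:

```ts
// Sketch: finalize the stream and surface token usage if present.
function handleDone(chunk: DoneStreamChunk) {
  if (chunk.finishReason === 'length') {
    console.warn('Response truncated: max token limit reached');
  }
  if (chunk.usage) {
    const cached = chunk.usage.promptTokensDetails?.cachedTokens ?? 0;
    console.log(
      `Tokens: ${chunk.usage.totalTokens} total ` +
        `(${chunk.usage.promptTokens} prompt, ${cached} cached, ` +
        `${chunk.usage.completionTokens} completion)`,
    );
  }
}
```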
An `error` chunk is emitted when an error occurs during streaming.
```ts
interface ErrorStreamChunk extends BaseStreamChunk {
  type: 'error';
  error: {
    message: string; // Human-readable error message
    code?: string; // Optional error code
  };
}
```

Example:
```json
{
  "type": "error",
  "id": "chatcmpl-abc123",
  "model": "gpt-5.2",
  "timestamp": 1701234567893,
  "error": {
    "message": "Rate limit exceeded",
    "code": "rate_limit_exceeded"
  }
}
```

Common Error Codes:
- `rate_limit_exceeded` - API rate limit hit
- `invalid_request` - Malformed request
- `authentication_error` - API key issues
- `timeout` - Request timed out
- `server_error` - Internal server error
Usage:
- Display error to user
- Stream ends after error chunk
- Retry logic should be implemented client-side
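Since retries are the client's responsibility, a minimal exponential-backoff sketch; `startStream` is a hypothetical function that runs one streaming attempt and throws when the stream ends with an error chunk:

```ts
// Hypothetical: runs a single streaming request, rejecting on an error chunk.
declare function startStream(): Promise<void>;

async function streamWithRetry(maxAttempts = 3): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await startStream();
    } catch (err) {
      if (attempt === maxAttempts) throw err;
      // Exponential backoff: 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** (attempt - 1)));
    }
  }
}
```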
Typical chunk sequences:

- **Content Generation:**

  ```
  ContentStreamChunk (delta: "Hello")
  ContentStreamChunk (delta: " world")
  ContentStreamChunk (delta: "!")
  DoneStreamChunk (finishReason: "stop")
  ```

- **With Thinking:**

  ```
  ThinkingStreamChunk (delta: "I need to...")
  ThinkingStreamChunk (delta: " check the weather")
  ContentStreamChunk (delta: "Let me check")
  DoneStreamChunk (finishReason: "stop")
  ```

- **Tool Usage:**

  ```
  ToolCallStreamChunk (name: "get_weather")
  ToolResultStreamChunk (content: "{...}")
  ContentStreamChunk (delta: "The weather is...")
  DoneStreamChunk (finishReason: "stop")
  ```

- **Client Tool with Approval:**

  ```
  ToolCallStreamChunk (name: "send_email")
  ApprovalRequestedStreamChunk (toolName: "send_email")
  [User approves]
  ToolInputAvailableStreamChunk (toolName: "send_email")
  [Client executes]
  ToolResultStreamChunk (content: "{\"sent\":true}")
  ContentStreamChunk (delta: "Email sent successfully")
  DoneStreamChunk (finishReason: "stop")
  ```
When the model calls multiple tools in parallel:
```
ToolCallStreamChunk (index: 0, name: "get_weather")
ToolCallStreamChunk (index: 1, name: "get_time")
ToolResultStreamChunk (toolCallId: "call_1")
ToolResultStreamChunk (toolCallId: "call_2")
ContentStreamChunk (delta: "Based on the data...")
DoneStreamChunk (finishReason: "stop")
```
All chunks are represented as a discriminated union:
```ts
type StreamChunk =
  | ContentStreamChunk
  | ThinkingStreamChunk
  | ToolCallStreamChunk
  | ToolInputAvailableStreamChunk
  | ApprovalRequestedStreamChunk
  | ToolResultStreamChunk
  | DoneStreamChunk
  | ErrorStreamChunk;
```

This enables type-safe handling in TypeScript:
```ts
function handleChunk(chunk: StreamChunk) {
  switch (chunk.type) {
    case 'content':
      console.log(chunk.delta); // TypeScript knows this is ContentStreamChunk
      break;
    case 'thinking':
      console.log(chunk.content); // TypeScript knows this is ThinkingStreamChunk
      break;
    case 'tool_call':
      console.log(chunk.toolCall.function.name); // TypeScript knows the structure
      break;
    // ... other cases
  }
}
```

Related documentation:

- SSE Protocol - How chunks are transmitted via Server-Sent Events
- HTTP Stream Protocol - How chunks are transmitted via HTTP streaming
- Connection Adapters Guide - Client implementation