All `streamText` calls in the codebase now include a hard limit on tool call iterations via `stopWhen: stepCountIs(5)`. This limit is non-negotiable for production: it prevents the infinite tool call loops that can occur in agentic AI systems.
Without a step limit, the following scenario can occur:
```text
User: "Find me headphones under ₹10,000"
        ↓
Agent calls searchProducts tool
        ↓
Tool returns results
        ↓
Agent thinks: "I should filter these by brand"
        ↓
Agent calls searchProducts tool again (with brand filter)
        ↓
Tool returns results
        ↓
Agent thinks: "Let me check if there are similar products"
        ↓
Agent calls searchProducts tool again...
        ↓
[INFINITE LOOP - never terminates]
```
Common causes of runaway tool loops:
- Ambiguous Tool Definitions: Tools with overlapping responsibilities can cause the LLM to call them repeatedly.
- Recursive Tool Results: A tool result that triggers the same condition that called it.
- LLM Hallucination: The model may incorrectly believe it needs to call a tool again with slightly different parameters.
- Multi-Turn Context Confusion: In conversation, the LLM may re-interpret previous tool calls as incomplete.
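The "recursive tool result" failure mode can be shown with a toy, SDK-free simulation. The tool behavior and decision policy below are invented for illustration; a demo cap is added only so the snippet halts:

```typescript
// Toy simulation of a recursive tool result: the tool's output always
// re-satisfies the condition that triggered the call in the first place.
function wantsAnotherSearch(lastToolResult: string | null): boolean {
  // Naive policy: search again whenever the last result mentions "more results"
  return lastToolResult === null || lastToolResult.includes('more results');
}

function searchTool(): string {
  // Every result advertises "more results", re-triggering the policy above
  return 'Found 6 items (more results available)';
}

let last: string | null = null;
let calls = 0;
const DEMO_CAP = 20; // only so this demo halts; a real agent has no such cap

while (wantsAnotherSearch(last) && calls < DEMO_CAP) {
  last = searchTool();
  calls++;
}

console.log(calls); // prints 20: the policy never stops on its own
```

Only the external cap ends the loop; that external cap is exactly what `stopWhen` provides.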
Why an unbounded loop is harmful:
- Cost: Each tool call consumes API credits and compute resources
- Latency: Users wait indefinitely for a response that never completes
- Resource Exhaustion: Can exhaust rate limits, database connections, or memory
- User Experience: Application appears frozen or broken
Vercel AI SDK v6 Pattern:
```ts
import { streamText, stepCountIs } from 'ai';

const result = streamText({
  model: llm,
  messages: conversationMessages,
  stopWhen: stepCountIs(5), // ← HARD LIMIT: max 5 tool calls per turn
  tools: {
    searchProducts: { /* ... */ },
    addToCart: { /* ... */ },
  },
});
```

Deprecated Pattern (AI SDK v4):
```ts
// ❌ DON'T USE - maxSteps was removed in v5+
const result = streamText({
  model: llm,
  messages,
  maxSteps: 5, // Removed in AI SDK v5+
  tools: { /* ... */ },
});
```

The limit of 5 tool calls per turn is based on:
- Cognitive Load: Most user queries require 1-3 tool calls maximum
- Diminishing Returns: After 5 calls, the agent is likely stuck in a loop
- Cost Control: Limits maximum API cost per user request
- Latency Budget: 5 tool calls × ~2s each = 10s max wait time (acceptable)
When the agent reaches 5 tool calls:
- Graceful Termination: The `streamText` call stops generating
- Partial Response: The model provides a summary based on the available tool results
- User Can Continue: The user can ask follow-up questions to get more information
- No Error: The response completes normally, just with limited tool usage
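Conceptually, `stopWhen: stepCountIs(n)` is just a predicate over the number of completed steps. The sketch below is not the SDK's implementation; `runAgentTurn` and its worst-case always-call-a-tool model are invented here to show how the predicate bounds the loop:

```typescript
// A stop condition is a predicate over the current step count
type StopCondition = (ctx: { steps: number }) => boolean;

// stepCountIs(n)-style factory: stop once n steps have completed
const stepCountIs = (n: number): StopCondition => ({ steps }) => steps >= n;

// Simulated agent turn: the model always wants another tool call
// (worst case), so only the stop condition can end the turn.
function runAgentTurn(stopWhen: StopCondition): { steps: number; finished: 'limit' | 'answer' } {
  let steps = 0;
  const modelWantsTool = () => true;

  while (modelWantsTool()) {
    if (stopWhen({ steps })) return { steps, finished: 'limit' };
    steps++; // one tool call executed
  }
  return { steps, finished: 'answer' };
}

console.log(runAgentTurn(stepCountIs(5))); // { steps: 5, finished: 'limit' }
```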
Location: `sendMessage()` function

```ts
const result = streamText({
  model: llm,
  messages: conversationMessages,
  stopWhen: stepCountIs(5), // CRITICAL: prevent infinite tool call loops
  tools: {
    searchProducts: { /* ... */ },
    addToCart: { /* ... */ },
    trackOrder: { /* ... */ },
  },
});
```

Location: `streamAgentResponse()` function
```ts
import { streamText, stepCountIs } from 'ai';
import { google } from '@ai-sdk/google';

export async function streamAgentResponse(messages, options = {}) {
  'use server';

  const {
    model = google('gemini-1.5-flash'),
    maxSteps = 5, // CRITICAL: prevent infinite tool call loops
    onStepComplete = null,
    includeUI = true,
  } = options;

  const result = streamText({
    model,
    messages,
    stopWhen: stepCountIs(maxSteps), // ← applied to streamText
    tools: { /* ... */ },
  });
}
```

Monitor for these patterns in production:
- High Step Counts: If many requests hit the 5-step limit, review tool definitions
- Repeated Tool Calls: Same tool called 3+ times in one turn indicates confusion
- Timeout Errors: Requests timing out may be stuck in loops
- User Complaints: "The assistant keeps searching but never shows results"
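The "repeated tool calls" check above can be implemented as a small counter over the tool call names from one turn. This is a hypothetical helper (the name, threshold default, and shape are illustrative); in practice the SDK's step results would supply the tool names:

```typescript
// Flag any tool called `threshold` or more times in a single turn -
// a strong signal that the agent is stuck refining the same call.
function findSuspiciousTools(toolCalls: string[], threshold = 3): string[] {
  const counts = new Map<string, number>();
  for (const name of toolCalls) {
    counts.set(name, (counts.get(name) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n >= threshold)
    .map(([name]) => name);
}

console.log(
  findSuspiciousTools(['searchProducts', 'searchProducts', 'addToCart', 'searchProducts']),
); // ['searchProducts']
```

Wiring this into structured logging (next section) gives an alertable metric instead of waiting for user complaints.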
Add structured logging to track tool usage:
```ts
const result = streamText({
  model: llm,
  messages,
  stopWhen: stepCountIs(5),
  tools: {
    searchProducts: {
      description: 'Search for products',
      inputSchema: SearchProductsParams,
      execute: async (params) => {
        console.log('[Tool Call] searchProducts', { params });
        // ... implementation
      },
    },
  },
  onFinish: ({ usage, steps }) => {
    console.log('[Stream Complete]', { usage, steps });
  },
});
```

Each tool should do one thing and do it well:
```ts
// ✅ GOOD: Focused tool
searchProducts: {
  description: 'Search products by query and filters',
  execute: async ({ query, maxPrice, category }) => {
    return await hybridProductSearch(query, { maxPrice, category });
  },
},

// ❌ BAD: Multiple responsibilities
searchAndFilterProducts: {
  description: 'Search, filter, sort, and paginate products',
  // Too complex - the LLM may call it multiple times for different operations
},
```

Help the LLM understand when to use each tool:
```ts
// ✅ GOOD: Clear description
searchProducts: {
  description: 'Search for products. Call ONCE with all filters. Returns up to 6 results.',
  // Clear expectations about usage
},

// ❌ BAD: Vague description
searchProducts: {
  description: 'Search for stuff',
  // The LLM may call repeatedly to "refine" the search
},
```

Give the LLM enough information in one call:
```ts
// ✅ GOOD: Rich result
return {
  products: [...],
  total: 42,
  hasMore: true,
  filters: { applied: [...] },
  suggestions: ['headphones', 'earbuds'],
};

// ❌ BAD: Sparse result
return {
  products: [...],
  // The LLM may call again to get "more info"
};
```

When the limit is hit, provide helpful context:
```ts
const result = streamText({
  model: llm,
  messages,
  stopWhen: stepCountIs(5),
  system: `You are a helpful assistant.

IMPORTANT: You can call tools up to 5 times per conversation turn.
If you reach this limit, summarize what you found and ask if the user
wants to continue with more specific queries.`,
});
```

Unit test (Vitest):

```ts
import { streamText, stepCountIs } from 'ai';
import { describe, it, expect } from 'vitest';

describe('streamText configuration', () => {
  it('should have a stopWhen limit to prevent infinite loops', async () => {
    const result = streamText({
      model: testModel,
      messages: [{ role: 'user', content: 'test' }],
      stopWhen: stepCountIs(5),
      tools: { testTool },
    });

    // Consume the stream and verify it completes without hanging
    const response = result.toUIMessageStream();
    expect(response).toBeDefined();
  });
});
```

E2E test (Playwright):

```ts
import { test, expect } from '@playwright/test';

test('agent should not exceed 5 tool calls per turn', async ({ page }) => {
  let toolCallCount = 0;

  // Mock the tool endpoint and count invocations
  await page.route('**/api/tools/*', (route) => {
    toolCallCount++;
    route.fulfill({ json: { result: 'mocked' } });
  });

  // Trigger an agent response
  await page.fill('[data-testid="chat-input"]', 'Find me products...');
  await page.click('[data-testid="send-button"]');

  // Wait for the response to complete
  await page.waitForSelector('[data-testid="response-complete"]');

  // Verify tool calls stayed within the limit
  expect(toolCallCount).toBeLessThanOrEqual(5);
});
```

Before (v4):
```ts
import { streamText } from 'ai';

const result = streamText({
  model,
  messages,
  maxSteps: 5, // ❌ Removed in v5+
  tools,
});
```

After (v6):
```ts
import { streamText, stepCountIs } from 'ai';

const result = streamText({
  model,
  messages,
  stopWhen: stepCountIs(5), // ✅ Correct for v5+
  tools,
});
```

Migration checklist:

- Import `stepCountIs` from `ai`
- Replace `maxSteps: 5` with `stopWhen: stepCountIs(5)`
- Verify TypeScript compiles without errors
- Test that tool calls still work correctly
- Monitor production for requests hitting the limit
| Aspect | Detail |
|---|---|
| Parameter | `stopWhen: stepCountIs(5)` |
| Location | All `streamText` calls with tools |
| Purpose | Prevent infinite tool call loops |
| Limit | 5 tool calls per conversation turn |
| Files | `apps/web/app/chat-dashboard/actions.tsx`, `apps/web/app/actions.js` |
| Status | ✅ Implemented and verified |
Remember: This is a safety-critical feature. Never remove or increase the limit without careful consideration of the failure modes.