Commit ca7e3b6

feat: migrate Bedrock provider to AI SDK (#11243)
* feat: migrate Bedrock provider to AI SDK

Replace the raw AWS SDK (@aws-sdk/client-bedrock-runtime) Bedrock handler with the Vercel AI SDK (@ai-sdk/amazon-bedrock). Reduces provider from 1,633 lines to 575 lines (65% reduction).

Key changes:
- Use streamText()/generateText() instead of ConverseStreamCommand/ConverseCommand
- Use createAmazonBedrock() with native auth (access key, secret, session, profile via credentialProvider, API key, VPC endpoint as baseURL)
- Reasoning config via providerOptions.bedrock.reasoningConfig
- Anthropic beta headers via providerOptions.bedrock.anthropicBeta
- Thinking signature captured from providerMetadata.bedrock.signature on reasoning-delta stream events
- Thinking signature round-tripped via providerOptions.bedrock.signature on reasoning parts in convertToAiSdkMessages()
- Redacted thinking captured from providerMetadata.bedrock.redactedData
- isAiSdkProvider() returns true for reasoning block preservation
- Keep: getModel, ARN parsing, cross-region inference, cost calculation, service tier pricing, 1M context beta

Tests: 83 tests skipped (mock old AWS SDK internals, need rewrite for AI SDK mocking). 106 tests pass. 0 tests fail.

* fix: address review feedback for Bedrock AI SDK migration

- Wire usePromptCache into AI SDK via providerOptions.bedrock.cachePoint on system prompt and last two user messages
- Remove debug logger.info that fires on every stream event with providerMetadata
- Tighten isThrottlingError to match 'rate limit' instead of broad 'rate'/'limit' substrings that false-positive on context length errors
- Use shared handleAiSdkError utility for consistent error handling with status code preservation for retry logic

* fix: bedrock AI SDK migration - fix usage metrics, rewrite tests, remove dead code

- Fix reasoningTokens always 0 (usage.details?.reasoningTokens → usage.reasoningTokens)
- Fix cacheReadInputTokens always 0 (read from usage.inputTokenDetails instead of providerMetadata)
- Fix invokedModelId not extracted for prompt router cost calculation
- Rewrite all 6 skipped bedrock test suites for AI SDK mocking pattern (140 tests pass)
- Remove dead code: bedrock-converse-format.ts, cache-strategy/ (6 files, ~2700 lines)

* chore: remove dead @anthropic-ai/bedrock-sdk dep and stale AWS SDK mocks

* chore: update pnpm-lock.yaml after removing @anthropic-ai/bedrock-sdk

* fix: compute cache point indices from original Anthropic messages before AI SDK conversion

The previous approach naively targeted the last 2 user messages in the post-conversion AI SDK array, but convertToAiSdkMessages() splits user messages containing tool_results into separate tool + user messages, causing cache points to land on the wrong messages (tiny text fragments instead of the intended meaty user turns).

Now we identify the last 2 user messages in the original Anthropic message array (matching the Anthropic provider's caching strategy) and build a parallel-walk mapping to apply cachePoint to the correct corresponding AI SDK message.
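
The parallel-walk mapping described in the last fix can be pictured with a small sketch. Everything below is hypothetical and simplified for illustration: the message shapes, `convertSketch()` (a stand-in for convertToAiSdkMessages()), and `applyCachePointsSketch()` are not the provider's actual code. The sketch also assumes that, when one original user message is split in two, the cache point belongs on the last message produced from it.

```typescript
// Hypothetical, simplified message shapes; the real Anthropic and AI SDK types are richer.
type AnthropicBlock =
	| { type: "text"; text: string }
	| { type: "tool_result"; tool_use_id: string; content: string }
type AnthropicMessage = { role: "user" | "assistant"; content: AnthropicBlock[] }
type SdkMessage = { role: "user" | "assistant" | "tool"; text: string; cachePoint?: boolean }

// Split one original message into its tool_result parts and its text parts.
function splitBlocks(msg: AnthropicMessage): { toolParts: string[]; textParts: string[] } {
	const toolParts: string[] = []
	const textParts: string[] = []
	for (const block of msg.content) {
		if (block.type === "tool_result") toolParts.push(block.content)
		else textParts.push(block.text)
	}
	return { toolParts, textParts }
}

// Stand-in for the conversion step: a user message containing tool_result blocks
// becomes a "tool" message, followed by a "user" message if text blocks remain.
function convertSketch(messages: AnthropicMessage[]): SdkMessage[] {
	const out: SdkMessage[] = []
	for (const msg of messages) {
		const { toolParts, textParts } = splitBlocks(msg)
		if (msg.role === "assistant") {
			out.push({ role: "assistant", text: textParts.join("\n") })
			continue
		}
		if (toolParts.length > 0) out.push({ role: "tool", text: toolParts.join("\n") })
		if (textParts.length > 0 || toolParts.length === 0) out.push({ role: "user", text: textParts.join("\n") })
	}
	return out
}

// Walk the original and converted arrays in step. For each original index in
// `targets`, mark the last converted message that the original produced.
function applyCachePointsSketch(original: AnthropicMessage[], converted: SdkMessage[], targets: Set<number>): void {
	let sdkIndex = 0
	original.forEach((msg, originalIndex) => {
		const { toolParts, textParts } = splitBlocks(msg)
		const produced = msg.role === "user" && toolParts.length > 0 && textParts.length > 0 ? 2 : 1
		if (targets.has(originalIndex)) converted[sdkIndex + produced - 1].cachePoint = true
		sdkIndex += produced
	})
}
```

Choosing targets by index in the original array keeps the selection independent of how many messages the conversion emits per turn, which is the failure mode the fix describes.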

* perf: optimize prompt caching with 3-point message strategy + anchor for 20-block window

Previous approach only cached the last 2 user messages (using 2 of 4 available cache checkpoints for messages). This left significant cache savings on the table for longer conversations.

New strategy uses up to 3 message cache points (+ 1 system = 4 total):
- Last user message: write to cache for next request
- Second-to-last user message: read from cache for current request
- Anchor message at ~1/3 position: ensures the 20-block lookback window from the second-to-last breakpoint hits a stable cache entry, covering all assistant/tool messages in the middle of the conversation

Also extracted the parallel-walk mapping logic into a reusable applyCachePointsToAiSdkMessages() helper method.

Industry benchmarks show 70-95% token cache rates are achievable; this change should significantly improve our 39% baseline for longer multi-turn conversations.

* chore: remove stale bedrock-sdk external, fix arnInfo property name, remove unused exports

---------

Co-authored-by: daniel-lxs <ricciodaniel98@gmail.com>
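
A minimal sketch of the 3-point selection described in the perf change, for illustration only. The function name `selectCachePointIndicesSketch` is hypothetical, and the anchor rule is an assumption: the message only says "~1/3 position", so this sketch picks the user message closest to one third of the way through the conversation, restricted to messages before the second-to-last user message.

```typescript
type Role = "user" | "assistant"

// Hypothetical helper: returns, in ascending order, the indices of up to three
// messages to mark as cache points (the system prompt would take the fourth).
function selectCachePointIndicesSketch(roles: Role[]): number[] {
	const userIndices: number[] = []
	roles.forEach((role, i) => {
		if (role === "user") userIndices.push(i)
	})
	if (userIndices.length === 0) return []

	const picked = new Set<number>()

	// Point 1: last user message (written to cache for the next request).
	picked.add(userIndices[userIndices.length - 1])

	if (userIndices.length >= 2) {
		// Point 2: second-to-last user message (read from cache on this request).
		const secondLast = userIndices[userIndices.length - 2]
		picked.add(secondLast)

		// Point 3 (assumed rule): the earlier user message nearest to the 1/3 mark.
		const target = Math.floor(roles.length / 3)
		const candidates = userIndices.filter((i) => i < secondLast)
		if (candidates.length > 0) {
			const anchor = candidates.reduce((best, i) =>
				Math.abs(i - target) < Math.abs(best - target) ? i : best,
			)
			picked.add(anchor)
		}
	}

	return Array.from(picked).sort((a, b) => a - b)
}
```

For a 12-message conversation that strictly alternates user and assistant turns, starting with a user turn, this sketch returns `[4, 8, 10]`. Short conversations simply yield fewer points.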
1 parent 0e5407a commit ca7e3b6

20 files changed

Lines changed: 1781 additions & 5700 deletions

apps/cli/tsup.config.ts

Lines changed: 0 additions & 1 deletion
@@ -16,7 +16,6 @@ export default defineConfig({
 	external: [
 		// Keep native modules external
 		"@anthropic-ai/sdk",
-		"@anthropic-ai/bedrock-sdk",
 		"@anthropic-ai/vertex-sdk",
 		// Keep @vscode/ripgrep external - we bundle the binary separately
 		"@vscode/ripgrep",

pnpm-lock.yaml

Lines changed: 36 additions & 396 deletions
Some generated files are not rendered by default.

src/api/providers/__tests__/bedrock-custom-arn.spec.ts

Lines changed: 2 additions & 34 deletions
@@ -22,38 +22,6 @@ vitest.mock("../../../utils/logging", () => ({
 	},
 }))
 
-// Mock AWS SDK
-vitest.mock("@aws-sdk/client-bedrock-runtime", () => {
-	const mockModule = {
-		lastCommandInput: null as Record<string, any> | null,
-		mockSend: vitest.fn().mockImplementation(async function () {
-			return {
-				output: new TextEncoder().encode(JSON.stringify({ content: "Test response" })),
-			}
-		}),
-		mockConverseCommand: vitest.fn(function (input) {
-			mockModule.lastCommandInput = input
-			return { input }
-		}),
-		MockBedrockRuntimeClient: class {
-			public config: any
-			public send: any
-
-			constructor(config: { region?: string }) {
-				this.config = config
-				this.send = mockModule.mockSend
-			}
-		},
-	}
-
-	return {
-		BedrockRuntimeClient: mockModule.MockBedrockRuntimeClient,
-		ConverseCommand: mockModule.mockConverseCommand,
-		ConverseStreamCommand: vitest.fn(),
-		__mock: mockModule, // Expose mock internals for testing
-	}
-})
-
 describe("Bedrock ARN Handling", () => {
 	// Helper function to create a handler with specific options
 	const createHandler = (options: Partial<ApiHandlerOptions> = {}) => {
@@ -224,8 +192,8 @@ describe("Bedrock ARN Handling", () => {
 				"arn:aws:bedrock:eu-west-1:123456789012:inference-profile/anthropic.claude-3-sonnet-20240229-v1:0",
 		})
 
-		// Verify the client was created with the ARN region, not the provided region
-		expect((handler as any).client.config.region).toBe("eu-west-1")
+		// Verify the handler's options were updated with the ARN region
+		expect((handler as any).options.awsRegion).toBe("eu-west-1")
 	})
 
 	it("should log region mismatch warning when ARN region differs from provided region", () => {
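
The behavior these tests exercise (the region embedded in a custom ARN takes precedence over the configured region, with a mismatch flagged) can be sketched as follows. This is a minimal illustration, not the handler's parser: `parseArnSketch` and `resolveRegionSketch` are hypothetical names, and the only fact relied on is the general ARN layout `arn:partition:service:region:account-id:resource`.

```typescript
// Hypothetical helpers for illustration; not the provider's actual ARN parsing code.
interface ArnPartsSketch {
	partition: string
	service: string
	region: string
	accountId: string
	resource: string
}

function parseArnSketch(arn: string): ArnPartsSketch | undefined {
	// General ARN layout: arn:partition:service:region:account-id:resource
	const parts = arn.split(":")
	if (parts.length < 6 || parts[0] !== "arn") return undefined
	const [, partition, service, region, accountId, ...rest] = parts
	// The resource itself may contain ":" (e.g. a model version suffix), so rejoin it.
	return { partition, service, region, accountId, resource: rest.join(":") }
}

// Prefer the region embedded in the ARN; report whether it differs from the configured one.
function resolveRegionSketch(arn: string, configuredRegion: string): { region: string; mismatch: boolean } {
	const parsed = parseArnSketch(arn)
	if (!parsed || parsed.region === "") return { region: configuredRegion, mismatch: false }
	return { region: parsed.region, mismatch: parsed.region !== configuredRegion }
}
```

With the ARN from the test above and a configured region of `us-east-1`, the sketch resolves to `eu-west-1` and flags a mismatch, which is the case where a caller would log a warning.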
