Add Wafer as a model provider #664
Open
ianye23301 wants to merge 1 commit into
Wafer (https://wafer.ai) exposes an OpenAI-compatible chat completions endpoint at https://pass.wafer.ai/v1, so it slots into the existing OpenAI-format passthrough; no new request/response translator is needed.

Wiring:
- new endpoint type "wafer" (schema/models.ts, schema/secrets.ts, schema/index.ts, scripts/verify_proxy_models.ts)
- WAFER_API_KEY -> wafer in AISecretTypes
- EndpointProviderToBaseURL.wafer = https://pass.wafer.ai/v1
- 7 entries in model_list.json for the public catalog (GLM-5.1, Kimi-K2.6, Qwen3.5-397B-A17B, Qwen3.6-35B-A3B, qwen3.7-max, deepseek-v4-flash, deepseek-v4-pro), each tagged available_providers: ["wafer"]

Pricing/context taken from https://pass.wafer.ai/v1/models on 2026-05-27.
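The pricing recorded for these entries is quoted per million tokens (Mtok). As a rough illustration of how such figures turn into a per-request estimate, here is a minimal sketch using the GLM-5.1 and deepseek-v4-flash prices from this PR's description. The `TokenPricing` shape, its field names, and `estimateCostUSD` are hypothetical and are not the schema used in model_list.json; the sketch also assumes the first price in each pair is the input price and the second the output price.

```typescript
// Illustrative shape only; these field names are NOT taken from model_list.json.
interface TokenPricing {
  inputUSDPerMtok: number; // assumed: first figure in "$x / $y per Mtok"
  outputUSDPerMtok: number; // assumed: second figure in "$x / $y per Mtok"
}

// Figures copied from this PR's description for two of the seven models.
const waferPricing: Record<string, TokenPricing> = {
  "GLM-5.1": { inputUSDPerMtok: 1.2, outputUSDPerMtok: 3.6 },
  "deepseek-v4-flash": { inputUSDPerMtok: 0.14, outputUSDPerMtok: 0.28 },
};

// Estimate the USD cost of one request from its token counts.
function estimateCostUSD(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const pricing = waferPricing[model];
  if (pricing === undefined) {
    throw new Error(`No pricing recorded for model "${model}"`);
  }
  const totalPerMtok =
    inputTokens * pricing.inputUSDPerMtok +
    outputTokens * pricing.outputUSDPerMtok;
  return totalPerMtok / 1_000_000;
}

// 10,000 prompt tokens and 2,000 completion tokens on GLM-5.1.
const glmCost = estimateCostUSD("GLM-5.1", 10_000, 2_000);
```

Under those assumptions, the GLM-5.1 example works out to roughly $0.012 for input plus $0.0072 for output, about $0.0192 in total.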
ianye23301 is attempting to deploy a commit to the Braintrust Team on Vercel. A member of the Team first needs to authorize it.
Summary
- Adds Wafer (https://wafer.ai) as a model provider, alongside cerebras, groq, together, etc.
- Wafer's API is served at https://pass.wafer.ai/v1 and is OpenAI-compatible (chat completions, streaming, tools, /v1/models), so it slots into the existing OpenAI-format passthrough; no new request/response translator is needed.
- Adds the 7 models in Wafer's public catalog to model_list.json, each tagged available_providers: ["wafer"]. Pricing and context lengths come straight from https://pass.wafer.ai/v1/models on 2026-05-27.
Changes
- packages/proxy/schema/models.ts: add "wafer" to ModelEndpointType.
- packages/proxy/schema/secrets.ts: add "wafer" to the simple-types enum in APISecretSchema.
- packages/proxy/schema/index.ts: register WAFER_API_KEY → wafer in AISecretTypes and EndpointProviderToBaseURL.wafer = "https://pass.wafer.ai/v1".
- packages/proxy/schema/model_list.json: add the 7 Wafer-served public models:
  - GLM-5.1 ($1.20 / $3.60 per Mtok, 203K ctx)
  - Kimi-K2.6 ($0.88 / $3.84, 262K ctx, multimodal)
  - Qwen3.5-397B-A17B ($0.48 / $2.88, 262K ctx, multimodal)
  - Qwen3.6-35B-A3B ($0.15 / $1.00, 262K ctx, multimodal)
  - qwen3.7-max ($5.00 / $15.00, 256K ctx)
  - deepseek-v4-flash ($0.14 / $0.28, 1M ctx)
  - deepseek-v4-pro ($1.74 / $3.48, 1M ctx)
- packages/proxy/scripts/verify_proxy_models.ts: keep the local ModelEndpointType mirror in sync.
How it works at request time
A secret of type: "wafer" with secret: <WAFER_API_KEY> is routed through fetchOpenAI exactly like cerebras/groq/together: EndpointProviderToBaseURL.wafer supplies the base URL, the Authorization: Bearer <key> header is set by the shared code path, and the chat-completions body is forwarded unchanged. End users can call any of the model IDs above through the proxy and get OpenAI-formatted responses back.
Test plan
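The request-time routing described above can also be checked offline, without a live key. The following is a minimal sketch and not the proxy's actual code: `buildUpstreamRequest`, `ProviderSecret`, and `UpstreamRequest` are hypothetical stand-ins for the shared fetchOpenAI path, the `/chat/completions` suffix assumes the standard OpenAI route under the base URL, and the key is a placeholder.

```typescript
// Simplified stand-ins for the proxy's real types; all names here are illustrative.
type OpenAICompatibleProvider = "cerebras" | "groq" | "together" | "wafer";

// Only the Wafer URL comes from this PR; the real map lives in
// packages/proxy/schema/index.ts (EndpointProviderToBaseURL).
const baseURLs: Partial<Record<OpenAICompatibleProvider, string>> = {
  wafer: "https://pass.wafer.ai/v1",
};

interface ProviderSecret {
  type: OpenAICompatibleProvider;
  secret: string;
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the upstream request: look up the base URL by secret type, attach a
// Bearer token, and pass the chat-completions body through without rewriting it.
function buildUpstreamRequest(
  secret: ProviderSecret,
  body: unknown,
): UpstreamRequest {
  const baseURL = baseURLs[secret.type];
  if (baseURL === undefined) {
    throw new Error(`No base URL registered for provider "${secret.type}"`);
  }
  return {
    url: `${baseURL}/chat/completions`, // assumed standard OpenAI route
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${secret.secret}`,
    },
    body: JSON.stringify(body),
  };
}

// Example: a GLM-5.1 request routed with a placeholder key.
const chatBody = {
  model: "GLM-5.1",
  messages: [{ role: "user", content: "ping" }],
};
const upstream = buildUpstreamRequest(
  { type: "wafer", secret: "sk-placeholder" },
  chatBody,
);
```

A check like this only confirms the wiring (base URL, auth header, untouched body); it says nothing about whether Wafer accepts the request, which still needs the live run listed in this test plan.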
- pnpm install && pnpm --filter @braintrust/proxy build: clean (TS build succeeds, including the schema d.ts regeneration).
- pnpm test: same 89 pre-existing failures as main (all "No API keys found" against live providers; no new failures introduced).
- python3 -c 'import json; json.load(open("packages/proxy/schema/model_list.json"))': model list still parses.
- Not yet run: an end-to-end request through the proxy with a real WAFER_API_KEY against e.g. GLM-5.1. Happy to do this once a maintainer points me at the right staging entry point, or it can be done at review time.
Notes
- … "available_regions" value, etc.) before I clean it up.
- A WaferMetadataSchema could be added later (e.g. for an api_base override so on-prem / EU edges can be pointed at directly), but I didn't include one in this PR to keep the diff minimal; Wafer customers on the default region don't need any metadata.