Add Wafer as a model provider #664

Open
ianye23301 wants to merge 1 commit into braintrustdata:main from ianye23301:wafer-provider

@ianye23301

Summary

  • Adds Wafer (https://wafer.ai) as a new endpoint type alongside cerebras, groq, together, etc.
  • Wafer's serverless catalog is exposed at https://pass.wafer.ai/v1 and is OpenAI-compatible (chat completions, streaming, tools, /v1/models), so it slots into the existing OpenAI-format passthrough — no new request/response translator is needed.
  • Registers the 7 publicly-listed Wafer models with available_providers: ["wafer"]. Pricing and context lengths come straight from https://pass.wafer.ai/v1/models on 2026-05-27.

Changes

  • packages/proxy/schema/models.ts — add "wafer" to ModelEndpointType.
  • packages/proxy/schema/secrets.ts — add "wafer" to the simple-types enum in APISecretSchema.
  • packages/proxy/schema/index.ts — register WAFER_API_KEY -> wafer in AISecretTypes and set EndpointProviderToBaseURL.wafer = "https://pass.wafer.ai/v1".
  • packages/proxy/schema/model_list.json — add the 7 Wafer-served public models:
    • GLM-5.1 ($1.20 / $3.60 per Mtok input / output, 203K ctx)
    • Kimi-K2.6 ($0.88 / $3.84, 262K ctx, multimodal)
    • Qwen3.5-397B-A17B ($0.48 / $2.88, 262K ctx, multimodal)
    • Qwen3.6-35B-A3B ($0.15 / $1.00, 262K ctx, multimodal)
    • qwen3.7-max ($5.00 / $15.00, 256K ctx)
    • deepseek-v4-flash ($0.14 / $0.28, 1M ctx)
    • deepseek-v4-pro ($1.74 / $3.48, 1M ctx)
  • packages/proxy/scripts/verify_proxy_models.ts — keep the local ModelEndpointType mirror in sync.
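
For readers who have not opened the schema files, the wiring listed above can be pictured with a simplified stand-in. This is a sketch under assumptions, not the actual contents of packages/proxy/schema: the real ModelEndpointType, AISecretTypes and EndpointProviderToBaseURL cover more providers and may be declared differently, and baseURLForSecretName is a hypothetical helper added only to show how the two maps relate.

```typescript
// Simplified stand-in for the proxy schema. Only the Wafer entries come from
// this PR; the other provider names are listed for context and their wiring
// is omitted here.
const ModelEndpointTypes = ["cerebras", "groq", "together", "wafer"] as const;
type ModelEndpointType = (typeof ModelEndpointTypes)[number];

// Environment-variable name -> endpoint type (shape assumed for illustration).
const AISecretTypes: Record<string, ModelEndpointType> = {
  WAFER_API_KEY: "wafer",
};

// Endpoint type -> default base URL. Only the Wafer URL is taken from the PR.
const EndpointProviderToBaseURL: Partial<Record<ModelEndpointType, string>> = {
  wafer: "https://pass.wafer.ai/v1",
};

// Hypothetical helper: resolve the base URL for a secret's env-variable name.
function baseURLForSecretName(name: string): string | undefined {
  const endpointType = AISecretTypes[name];
  return endpointType ? EndpointProviderToBaseURL[endpointType] : undefined;
}

const waferBaseURL = baseURLForSecretName("WAFER_API_KEY");
```

Keeping the base URL in a per-provider map is what lets an OpenAI-compatible provider be added without a dedicated translator.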

How it works at request time

A secret of type: "wafer" with secret: <WAFER_API_KEY> is routed through fetchOpenAI exactly like cerebras/groq/together: EndpointProviderToBaseURL.wafer supplies the base URL, the Authorization: Bearer <key> header is set by the shared code path, and the chat-completions body is forwarded unchanged. End users can call any of the model IDs above through the proxy and get OpenAI-formatted responses back.
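
The request-time behaviour described above can be sketched as a pure function that assembles the upstream call. This is not the proxy's fetchOpenAI; buildWaferChatRequest, WaferSecret and UpstreamRequest are hypothetical names, and the /chat/completions path is assumed from the OpenAI convention rather than taken from this PR.

```typescript
// Hypothetical shapes for illustration; the proxy's real types may differ.
type WaferSecret = { type: "wafer"; secret: string };

type UpstreamRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

// Base URL taken from the PR description.
const WAFER_BASE_URL = "https://pass.wafer.ai/v1";

// Builds the request the proxy would send upstream: base URL from the
// provider map, bearer token from the secret, body forwarded unchanged.
function buildWaferChatRequest(
  secret: WaferSecret,
  body: Record<string, unknown>,
): UpstreamRequest {
  return {
    // Path assumed from the OpenAI chat-completions convention.
    url: `${WAFER_BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${secret.secret}`,
    },
    body: JSON.stringify(body),
  };
}

// Example: a non-streaming request for one of the models in this PR.
const request = buildWaferChatRequest(
  { type: "wafer", secret: "sk-example" },
  { model: "GLM-5.1", messages: [{ role: "user", content: "ping" }], stream: false },
);
```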

Test plan

  • pnpm install && pnpm --filter @braintrust/proxy build — clean (TS build succeeds, including the schema d.ts regeneration).
  • pnpm test — same 89 pre-existing failures as main (all "No API keys found" against live providers; no new failures introduced).
  • python3 -c 'import json; json.load(open("packages/proxy/schema/model_list.json"))' — model list still parses.
  • Smoke-test end-to-end through a deployed proxy with a real WAFER_API_KEY against e.g. GLM-5.1. Happy to do this once a maintainer points me at the right staging entry point — or it can be done at review time.
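
The JSON parse check in the plan above could be tightened slightly. The sketch below assumes model_list.json is an object keyed by model ID and that each entry carries the available_providers array mentioned in this PR; findUntaggedWaferModels is a hypothetical helper, not part of the repository.

```typescript
// The seven model IDs listed in this PR.
const EXPECTED_WAFER_MODELS = [
  "GLM-5.1",
  "Kimi-K2.6",
  "Qwen3.5-397B-A17B",
  "Qwen3.6-35B-A3B",
  "qwen3.7-max",
  "deepseek-v4-flash",
  "deepseek-v4-pro",
];

// Only the field this check needs; real entries carry more fields.
type ModelEntry = { available_providers?: string[] };

// Returns the expected IDs that are missing from the list or not tagged
// with the "wafer" provider. An empty result means the check passed.
function findUntaggedWaferModels(modelListJson: string): string[] {
  const models = JSON.parse(modelListJson) as Record<string, ModelEntry>;
  return EXPECTED_WAFER_MODELS.filter(
    (id) => !(models[id]?.available_providers ?? []).includes("wafer"),
  );
}

// Example input: one correctly tagged entry and one tagged for another provider.
const sampleModelList = JSON.stringify({
  "GLM-5.1": { available_providers: ["wafer"] },
  "Kimi-K2.6": { available_providers: ["other"] },
});
const untagged = findUntaggedWaferModels(sampleModelList);
```

In practice the input would be the contents of packages/proxy/schema/model_list.json read from disk instead of the inline sample.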

Notes

  • I'm posting this as a draft so you can shape the wiring (naming, where to register, whether you want an "available_regions" value, etc.) before I clean it up.
  • Happy to add a WaferMetadataSchema later (e.g. for an api_base override so on-prem / EU edges can be pointed at directly), but didn't include one in this PR to keep the diff minimal — Wafer customers on the default region don't need any metadata.
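
To make the possible follow-up concrete, here is one way an api_base override might resolve. Everything in it is hypothetical: this PR adds no WaferMetadataSchema, the WaferMetadata type and resolveWaferBaseURL function are illustrative names, and requiring https is a design assumption rather than anything stated by Wafer or the proxy.

```typescript
// Hypothetical metadata shape; not part of this PR.
type WaferMetadata = { api_base?: string };

// Default base URL from the PR description.
const DEFAULT_WAFER_BASE_URL = "https://pass.wafer.ai/v1";

// Returns the override when one is set, otherwise the default region's URL.
function resolveWaferBaseURL(metadata?: WaferMetadata): string {
  const override = metadata?.api_base?.trim();
  if (!override) return DEFAULT_WAFER_BASE_URL;
  // Assumption: overrides must use https.
  if (!override.startsWith("https://")) {
    throw new Error(`api_base must be an https URL, got: ${override}`);
  }
  // Drop trailing slashes so later path joining stays predictable.
  return override.replace(/\/+$/, "");
}

// Example: a placeholder on-prem endpoint (not a real Wafer host).
const onPremBaseURL = resolveWaferBaseURL({ api_base: "https://wafer.example.invalid/v1/" });
```

With no metadata the function falls back to the default URL, which matches the note that customers on the default region need no metadata.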

Wafer (https://wafer.ai) exposes an OpenAI-compatible chat completions
endpoint at https://pass.wafer.ai/v1, so it slots into the existing
OpenAI-format passthrough — no new request/response translator needed.

Wiring:
- new endpoint type "wafer" (schema/models.ts, schema/secrets.ts,
  schema/index.ts, scripts/verify_proxy_models.ts)
- WAFER_API_KEY -> wafer in AISecretTypes
- EndpointProviderToBaseURL.wafer = https://pass.wafer.ai/v1
- 7 entries in model_list.json for the public catalog
  (GLM-5.1, Kimi-K2.6, Qwen3.5-397B-A17B, Qwen3.6-35B-A3B,
  qwen3.7-max, deepseek-v4-flash, deepseek-v4-pro), each tagged
  available_providers: ["wafer"]. Pricing/context taken from
  https://pass.wafer.ai/v1/models on 2026-05-27.
@vercel

vercel Bot commented May 27, 2026

ianye23301 is attempting to deploy a commit to the Braintrust Team on Vercel.

A member of the Team first needs to authorize it.

@ianye23301 ianye23301 marked this pull request as ready for review May 27, 2026 18:41