feat: implement two-level model pooling and clustering #1

Merged
anschmieg merged 1 commit into main from feat/model-pooling on Feb 20, 2026

@anschmieg
Owner

Summary

Implements a two-level model pooling and clustering system.

Level 1 — Pools

Aggregate the same logical model from multiple providers under a single virtual model ID.

Example: a request for claude-sonnet-4 → the gateway tries Kiro first, then falls back to the Claude API if Kiro returns 429 or 503.
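
The failover behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in pool_execution.go: the Member and StatusError types and the executeWithFailover signature are assumptions made for the example, and only the 429/503 trigger comes from the description.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Member is a hypothetical pool member: one provider serving the logical model.
type Member struct {
	Provider string
}

// StatusError is a hypothetical error type carrying the upstream HTTP status.
type StatusError struct{ Code int }

func (e *StatusError) Error() string { return fmt.Sprintf("upstream status %d", e.Code) }

// isFailoverError reports whether err should move the request to the next member.
// Only 429 (rate limited) and 503 (unavailable) qualify, as in the PR description.
func isFailoverError(err error) bool {
	var se *StatusError
	if !errors.As(err, &se) {
		return false
	}
	return se.Code == http.StatusTooManyRequests || se.Code == http.StatusServiceUnavailable
}

// executeWithFailover tries members in slice order and returns the first success.
// A non-failover error is returned immediately; if every member fails with a
// failover error, the last such error is returned.
func executeWithFailover(members []Member, call func(Member) (string, error)) (string, error) {
	lastErr := errors.New("pool has no members")
	for _, m := range members {
		out, err := call(m)
		if err == nil {
			return out, nil
		}
		lastErr = err
		if !isFailoverError(err) {
			return "", err
		}
	}
	return "", lastErr
}
```

With members ordered as Kiro then Claude API, a 429 from Kiro makes the loop continue to the Claude API, while a 400 would be returned to the caller without trying another provider.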

Level 2 — Clusters

Group multiple pools under semantic intent-based names.

Example: coding-high, coding-fast, chat, reasoning, vision.
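
As a rough sketch of how a cluster name might expand into concrete targets, the example below flattens a cluster into the members of its pools. The Target and Registry types are hypothetical and are not the ModelPool or ModelCluster types from config.go. Treating a non-pool entry as a raw model ID follows the commit message, which says clusters may group pools or raw model IDs.

```go
package main

// Target is a hypothetical provider/model pair that a request can be sent to.
type Target struct {
	Provider string
	Model    string
}

// Registry is an illustrative in-memory view of configured pools and clusters.
type Registry struct {
	Pools    map[string][]Target // pool ID -> members
	Clusters map[string][]string // cluster name -> pool IDs or raw model IDs
}

// Resolve expands name into concrete targets. The boolean is false when name
// is neither a pool nor a cluster, so the caller can route it as a plain model.
func (r Registry) Resolve(name string) ([]Target, bool) {
	if members, ok := r.Pools[name]; ok {
		return members, true
	}
	entries, ok := r.Clusters[name]
	if !ok {
		return nil, false
	}
	var out []Target
	for _, e := range entries {
		if members, isPool := r.Pools[e]; isPool {
			out = append(out, members...)
			continue
		}
		// Not a pool: keep it as a raw model ID with no pinned provider.
		out = append(out, Target{Model: e})
	}
	return out, true
}
```

Under these assumptions, a cluster such as coding-high that lists one two-member pool plus one raw model ID resolves to three targets, in configuration order.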

Changes

New Files

  • internal/pool/resolver.go — thread-safe pool/cluster resolver with hot-reload
  • internal/pool/resolver_test.go — 11 unit tests
  • sdk/api/handlers/pool_execution.go — ExecuteWithPoolFailover + ExecuteStreamWithPoolFailover
  • internal/api/handlers/management/model_pools.go — management API handlers
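
One common way to make a resolver safe for concurrent reads while still supporting hot-reload is to guard a snapshot map with a sync.RWMutex and swap the whole map on reload. The sketch below shows that pattern only. The method names Reload, IsPoolOrCluster and ListVirtualModels are taken from the commit message, but their signatures and the map-based configuration here are assumptions, not the contents of resolver.go.

```go
package main

import (
	"sort"
	"sync"
)

// Resolver holds an illustrative snapshot of virtual model IDs.
type Resolver struct {
	mu      sync.RWMutex
	virtual map[string][]string // virtual model ID -> concrete model IDs
}

// NewResolver builds a resolver from an initial configuration.
func NewResolver(cfg map[string][]string) *Resolver {
	r := &Resolver{}
	r.Reload(cfg)
	return r
}

// Reload replaces the current snapshot. The copy is built before taking the
// write lock, so readers are blocked only for the pointer swap.
func (r *Resolver) Reload(cfg map[string][]string) {
	next := make(map[string][]string, len(cfg))
	for id, members := range cfg {
		next[id] = append([]string(nil), members...)
	}
	r.mu.Lock()
	r.virtual = next
	r.mu.Unlock()
}

// IsPoolOrCluster reports whether id names a virtual model.
func (r *Resolver) IsPoolOrCluster(id string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.virtual[id]
	return ok
}

// ListVirtualModels returns all virtual model IDs in sorted order.
func (r *Resolver) ListVirtualModels() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	ids := make([]string, 0, len(r.virtual))
	for id := range r.virtual {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}
```

Because Reload copies its input, a caller that later mutates the configuration map it passed in does not race with in-flight lookups.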

Modified Files

  • internal/config/config.go — ModelPoolConfig, ModelPool, ModelCluster types
  • sdk/api/handlers/handlers.go — PoolResolver field, hot-reload in UpdateConfig, pool failover in all Execute* methods
  • sdk/api/handlers/*/ — Models() includes pool/cluster virtual models
  • internal/api/server.go — management routes registered
  • config.example.yaml — full documentation with examples

Testing

  • 11 pool unit tests (all passing)
  • Full suite: 28/28 packages green, zero failures

Level 1 Pools: aggregate the same model across multiple providers
under a single virtual model ID. Supports priority and round-robin
selection strategies, with cross-member failover on 429/503 errors.
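
For illustration, a round-robin strategy that still allows cross-member failover can be sketched as a rotation of the member list: each request starts at the next member, and the remaining members follow as failover candidates. The roundRobin type and its order method are hypothetical names for this example; how the resolver actually implements the strategy is not shown in this PR description.

```go
package main

import "sync/atomic"

// roundRobin is a hypothetical selector that rotates the starting member.
type roundRobin struct {
	next uint64 // number of calls made so far; updated atomically
}

// order returns members rotated so that successive calls start at successive
// members. The entries after the first act as failover candidates.
func (rr *roundRobin) order(members []string) []string {
	n := len(members)
	if n == 0 {
		return nil
	}
	// AddUint64 returns the incremented value, so subtract 1 to start at index 0.
	start := int((atomic.AddUint64(&rr.next, 1) - 1) % uint64(n))
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, members[(start+i)%n])
	}
	return out
}
```

A priority strategy would, by contrast, return the members in their configured order on every call.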

Level 2 Clusters: group multiple pools (or raw model IDs) under
semantic names (e.g., coding-high, coding-fast, chat, reasoning).

Changes:
- internal/config/config.go: ModelPoolConfig, ModelPool, ModelCluster,
  PoolMember, ClusterMember types; Config.ModelPools field
- internal/pool/resolver.go: Resolver with Resolve, Reload,
  IsPoolOrCluster, ListVirtualModels; thread-safe hot-reload
- internal/pool/resolver_test.go: 11 unit tests covering all scenarios
- sdk/api/handlers/handlers.go: PoolResolver field on BaseAPIHandler;
  VirtualModels() helper; UpdateConfig hook for hot-reload;
  pool failover wired into Execute*, ExecuteCount*, ExecuteStream*
- sdk/api/handlers/pool_execution.go: ExecuteWithPoolFailover and
  ExecuteStreamWithPoolFailover; isPoolFailoverError; member iteration
- sdk/api/handlers/*/: Models() in OpenAI/Claude/Gemini handlers
  now append virtual pool+cluster models to /v1/models listing
- internal/api/handlers/management/model_pools.go: GET/PUT/PATCH/DELETE
  management API endpoints for pools and clusters
- internal/api/server.go: register /v0/management/model-pools routes
- config.example.yaml: full documentation with examples

@sourcery-ai sourcery-ai Bot left a comment

Hi @anschmieg! 👋

Your private repo does not have access to Sourcery.

Please upgrade to continue using Sourcery ✨

@anschmieg anschmieg merged commit 6cd1163 into main Feb 20, 2026
2 checks passed