Cache NyxID /llm/status + /proxy/services catalog with stale-while-revalidate #646

@eanzhao

Description

Background

IResponsesRouteResolver.ResolveRouteValueAsync is called on every /v1/responses request to map vendor/model to a NyxID route. It in turn drives NyxIdLlmCatalogHttpClient to hit NyxID /llm/status and /proxy/services.

Catalog data changes slowly (the provider list changes on the order of hours to days), yet we re-fetch it on every chat turn.

Proposal

In-process catalog cache with stale-while-revalidate semantics:

  • TTL: 60s (fresh), 600s (stale but usable)
  • Background refresh task wakes every 30s
  • On fetch failure during refresh: keep serving the stale entry and emit a metric
  • Invalidate when NyxIdLLMProvider observes a 404 unknown route (the route disappeared upstream)
  • Bound the key set by scope: the catalog is global, not per-caller, so a single cache instance is enough
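
A minimal sketch of the semantics listed above, written in Python only to keep it short and language-neutral. Everything here is hypothetical: `SwrCatalogCache`, `CatalogUnavailable`, the `fetch` callable, and the counter names are illustrative and do not exist in the codebase. It also assumes the 600s stale window is measured from the last successful fetch, which the proposal does not spell out. The 30s background task is not shown; it would simply call `refresh()` on a timer. Locking is omitted, so a real implementation would need to guard the shared state.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class CatalogUnavailable(Exception):
    """Raised when no fresh or stale catalog entry can be served."""


class SwrCatalogCache(Generic[T]):
    """Single-entry cache with a fresh window and a longer stale window."""

    def __init__(
        self,
        fetch: Callable[[], T],           # hypothetical: wraps the NyxID catalog calls
        fresh_ttl: float = 60.0,          # seconds an entry counts as fresh
        stale_ttl: float = 600.0,         # seconds an entry stays usable (assumed: total age)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._fresh_ttl = fresh_ttl
        self._stale_ttl = stale_ttl
        self._clock = clock
        self._value: Optional[T] = None
        self._fetched_at: Optional[float] = None
        # Counters a metrics exporter could read to derive the hit ratio.
        self.hits = 0
        self.stale_hits = 0
        self.misses = 0
        self.refresh_failures = 0

    def refresh(self) -> bool:
        """Re-fetch the catalog; on failure keep the old entry and count it."""
        try:
            value = self._fetch()
        except Exception:
            self.refresh_failures += 1
            return False
        self._value = value
        self._fetched_at = self._clock()
        return True

    def get(self) -> T:
        """Serve fresh or stale data; fetch synchronously only as a last resort."""
        if self._fetched_at is not None:
            age = self._clock() - self._fetched_at
            if age < self._fresh_ttl:
                self.hits += 1
                return self._value  # type: ignore[return-value]
            if age < self._stale_ttl:
                # Stale but usable: the background task is expected to revalidate.
                self.stale_hits += 1
                return self._value  # type: ignore[return-value]
        self.misses += 1
        if not self.refresh():
            raise CatalogUnavailable("no usable catalog entry and fetch failed")
        return self._value  # type: ignore[return-value]

    def invalidate(self) -> None:
        """Drop the entry, e.g. after an upstream 404 unknown route."""
        self._value = None
        self._fetched_at = None
```

With an injected fake clock and a `fetch` that raises, the same object can drive the "NyxID down" chaos case: `refresh()` returns `False`, `refresh_failures` increments, and `get()` keeps returning the last good value until the stale window runs out.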

Constraints

  • Cannot push catalog state into a GAgent: that would re-introduce an actor with no business reason (CLAUDE.md "ReadModel 按需创建", i.e. "create ReadModels on demand"). A plain in-memory cache is fine.
  • Must coexist with cc-switch users who configure new providers: either expose an auth-gated /v1/admin/invalidate-catalog endpoint or rely on the stale window to converge

Related

Acceptance

  • p50 latency on /v1/responses drops by the catalog fetch time (~tens of ms)
  • Catalog cache hit ratio observable in metrics
  • Stale fallback is exercised at least once in a chaos test (NyxID down)
