fix: disabled-key 429 lockout, slow availability page, public-status Redis leak by ding113 · Pull Request #1187 · ding113/claude-code-hub

ding113 · 2026-05-14T15:36:28Z

Summary

Fixes three user-reported bugs:

Disabled API keys returned HTTP 429 "Too many authentication failures" instead of "key disabled". Every request with a disabled/expired key tripped the pre-auth brute-force rate limiter; after 20 attempts the IP/key got locked out and the real 401 was masked forever.
Provider availability dashboard was very slow. Commit 095c607e (fix(metrics): align success-rate outcome semantics #1080) replaced the SARGable status_code IS NOT NULL finalized predicate with a fn_is_message_request_finalized(...) PL/pgSQL call. PostgreSQL never inlines PL/pgSQL, so the partial index idx_message_request_provider_created_at_finalized_active was no longer usable and the query degraded to a sequential scan.
Public status page Redis keys leaked indefinitely. Commit b2497a46 (feat: add redis-projected public status page #1056) added Redis-backed config snapshots written via bare redis.set(...) (no TTL). Every config-version mint (provider / group / system settings change) created a new versioned snapshot key that never expired — public-status:v1:config:* and the internal variant grew without bound on busy operators. Neighbouring projection writers in rebuild-worker.ts were already using a 30-day TTL; only the config publishing path was missed.

Changes

Auth chain (`fix(auth)`)

New resolveApiKeyAuthOutcome in src/repository/key.ts returning a discriminated union (not_found / key_disabled / key_expired / ok). validateApiKeyAndGetUser is kept as a back-compat wrapper. Duplicate-row handling: keys.key has no unique constraint; the classifier prefers an active duplicate, falls back to key_expired when any duplicate is enabled, else key_disabled.
ProxyAuthenticator (proxy auth guard) maps each reason to a distinct 401 (invalid_api_key, key_disabled, key_expired, user_disabled, user_expired). The branch is an exhaustive switch with assertNever — adding a new ApiKeyAuthFailureReason produces a TS error until the new branch is handled.
New AuthFailureKind tag on AuthState; the guard only feeds credentials failures (missing/mismatched/unknown key) into the brute-force rate limiter. Admin-disabled or expired keys/users bypass it entirely. A buildAuthFailure factory forces every failure call site to tag both failureKind and errorResponse at compile time.
The /v1/models chain (available-models.ts) is updated the same way.
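
The rate-limiter gating described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `FailureLimiter` and `respondToFailure` are hypothetical names, and the real guard calls `proxyAuthPolicy.recordFailure`. Only the `credentials` / `account_state` tags and the 20-attempt threshold come from the description.

```typescript
// Hypothetical sketch: names other than the failureKind values are assumptions.
type AuthFailureKind = "credentials" | "account_state";

interface AuthFailure {
  status: number;
  code: string;
  failureKind: AuthFailureKind;
}

// Factory that cannot be called without tagging the failure kind.
function buildAuthFailure(code: string, failureKind: AuthFailureKind): AuthFailure {
  return { status: 401, code, failureKind };
}

// In-memory stand-in for the pre-auth brute-force limiter.
class FailureLimiter {
  private failures = new Map<string, number>();
  private maxFailures: number;

  constructor(maxFailures: number) {
    this.maxFailures = maxFailures;
  }

  recordFailure(clientId: string): void {
    this.failures.set(clientId, (this.failures.get(clientId) ?? 0) + 1);
  }

  isLockedOut(clientId: string): boolean {
    return (this.failures.get(clientId) ?? 0) >= this.maxFailures;
  }
}

// Returns the HTTP status the client would see for one failed attempt.
function respondToFailure(limiter: FailureLimiter, clientId: string, failure: AuthFailure): number {
  if (limiter.isLockedOut(clientId)) return 429;
  // Only credential failures count toward the lockout threshold.
  if (failure.failureKind === "credentials") limiter.recordFailure(clientId);
  return failure.status;
}

// Usage: 25 attempts with a disabled key never reach the limiter.
const limiter = new FailureLimiter(20);
const disabledKey = buildAuthFailure("key_disabled", "account_state");
const statuses: number[] = [];
for (let attempt = 0; attempt < 25; attempt++) {
  statuses.push(respondToFailure(limiter, "203.0.113.7", disabledKey));
}
```

With this shape, the disabled-key regression case from the test plan reduces to checking that every entry in `statuses` is 401.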

Availability dashboard (`perf(availability)`)

buildAvailabilityFinalizedCondition inlines a semantically-equivalent SQL expression so status_code IS NOT NULL becomes the dominant SARGable branch and the planner can use the partial index. The PL/pgSQL function definition is unchanged (still called by the upsert trigger on the write path); a header comment marks the keep-in-sync requirement.
The provider-chain JSONB branch is wrapped in a CASE so jsonb_array_length cannot evaluate against a non-array historical row (PG does not guarantee AND short-circuit). FINALIZED_PROVIDER_CHAIN_REASONS is extracted as a top-level constant; a comment documents the ? JSONB operator's driver assumption.
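
To illustrate the CASE guard, here is a simplified sketch that builds only the `reason` branch of the provider-chain predicate as a plain SQL string. The helper names (`quoteLiteral`, `buildGuardedChainPredicate`) are hypothetical, and the real code uses Drizzle's `sql` template rather than string concatenation. The reason list is copied from the diff quoted later in this PR.

```typescript
// Reasons copied from the quoted diff; everything else is an illustrative assumption.
const FINALIZED_PROVIDER_CHAIN_REASONS = [
  "request_success", "retry_success", "retry_failed", "system_error",
  "resource_not_found", "client_error_non_retryable",
  "concurrent_limit_failed", "hedge_winner", "hedge_loser_cancelled",
  "client_abort",
] as const;

// Quote a string as a SQL literal, doubling embedded single quotes.
function quoteLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// CASE forces the array type check to run before jsonb_array_length is evaluated.
// A NULL or non-array column falls through to ELSE false instead of raising an error.
function buildGuardedChainPredicate(column: string): string {
  const reasons = FINALIZED_PROVIDER_CHAIN_REASONS.map(quoteLiteral).join(", ");
  return (
    `CASE WHEN jsonb_typeof(${column}) = 'array' ` +
    `THEN jsonb_array_length(${column}) > 0 ` +
    `AND (${column} -> -1 ->> 'reason') IN (${reasons}) ` +
    `ELSE false END`
  );
}

const guardedPredicate = buildGuardedChainPredicate('"provider_chain"');
```

The point of the shape is that the `jsonb_typeof` test sits in the `WHEN` clause on its own, so the array-only functions appear solely in the `THEN` branch.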

Public status TTL (`fix(public-status)`)

New PUBLIC_STATUS_CONFIG_TTL_SECONDS = 60 * 60 * 24 * 30 (matches GENERATION_PROJECTION_TTL_SECONDS in rebuild-worker.ts).
All four redis.set(...) writes in publishPublicStatusConfigSnapshot, publishInternalPublicStatusConfigSnapshot, and publishCurrentPublicStatusConfigPointers now use (key, value, "EX", seconds). The Lua script in the pointer publisher's eval path applies SET ... EX <ttl> atomically with the version compare so the TTL refreshes on every successful publish.
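
As a rough illustration of the TTL change, the sketch below writes a versioned snapshot with `EX` against an in-memory stand-in for Redis. `FakeRedis` and `publishSnapshot` are hypothetical, and the assumption that the key suffix is the config version is mine; only the constant name, its value (60 * 60 * 24 * 30 = 2,592,000 seconds), and the `public-status:v1:config:` prefix come from the description.

```typescript
const PUBLIC_STATUS_CONFIG_TTL_SECONDS = 60 * 60 * 24 * 30; // 30 days

interface Entry {
  value: string;
  expiresAt: number | null; // null means "no expiry", as with a bare SET
}

// Hypothetical in-memory writer that mimics `SET key value [EX seconds]`.
class FakeRedis {
  private store = new Map<string, Entry>();
  nowSeconds = 0; // manual clock so expiry can be simulated

  set(key: string, value: string, mode?: "EX", seconds?: number): void {
    const expiresAt = mode === "EX" && seconds !== undefined ? this.nowSeconds + seconds : null;
    this.store.set(key, { value, expiresAt });
  }

  private live(key: string): Entry | null {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (entry.expiresAt !== null && entry.expiresAt <= this.nowSeconds) {
      this.store.delete(key);
      return null;
    }
    return entry;
  }

  get(key: string): string | null {
    return this.live(key)?.value ?? null;
  }

  // Mirrors the Redis TTL command: -2 for a missing key, -1 for a key without expiry.
  ttl(key: string): number {
    const entry = this.live(key);
    if (entry === null) return -2;
    return entry.expiresAt === null ? -1 : entry.expiresAt - this.nowSeconds;
  }
}

// Assumed key layout: prefix from the PR, version suffix is illustrative.
function publishSnapshot(redis: FakeRedis, version: string, payload: object): string {
  const key = `public-status:v1:config:${version}`;
  redis.set(key, JSON.stringify(payload), "EX", PUBLIC_STATUS_CONFIG_TTL_SECONDS);
  return key;
}
```

A bare `set(key, value)` in this model reports a TTL of -1, which is the pre-fix behaviour the manual `redis-cli TTL` check in the test plan is meant to rule out.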

Test plan

New unit tests in tests/unit/proxy/auth-guard-account-state.test.ts (7 cases — disabled key, expired key, disabled user, expired user, unknown key, missing creds, plus a "25 disabled-key attempts never trip 429" regression case)
New unit tests in tests/unit/repository/key-resolve-outcome.test.ts (9 cases including 3 dedicated to multi-row duplicate-key scenarios)
Updated assertions in tests/unit/lib/availability-service.test.ts guard against the slow function-call form returning and assert the new CASE shape
New unit tests in tests/unit/public-status/config-snapshot.test.ts (4 cases — publishPublicStatusConfigSnapshot, publishInternalPublicStatusConfigSnapshot, and both Lua / non-Lua pointer-publish paths each assert EX + PUBLIC_STATUS_CONFIG_TTL_SECONDS)
Pre-push checklist green: bun run build, bun run lint, bun run typecheck, bun run test (5964 tests pass)
Manual: disable a key in admin UI, hit the proxy 25 times in a row, confirm every response is 401 key_disabled and no 429 lockout
Manual: load /dashboard/availability against a populated DB, confirm noticeable speed-up vs. dev
Manual: run redis-cli --scan --pattern 'public-status:v1:config:*' | xargs -I {} redis-cli TTL {} and confirm every key has a positive TTL

🤖 Generated with Claude Code

Greptile Summary

This PR fixes three distinct production bugs: a brute-force rate-limiter incorrectly recording disabled/expired API key failures (causing 429 lockouts), a PL/pgSQL call making the availability dashboard query non-SARGable (sequential scan), and Redis versioned config-snapshot keys accumulating without a TTL.

Auth chain: resolveApiKeyAuthOutcome returns a discriminated union distinguishing not_found / key_disabled / key_expired; the proxy guard's exhaustive switch maps each reason to a distinct 401 and gates rate-limiter recording behind a failureKind !== "account_state" check. Multi-row duplicate-key handling correctly prefers active rows.
Availability dashboard: buildAvailabilityFinalizedCondition inlines the PL/pgSQL predicate as explicit SQL so status_code IS NOT NULL remains SARGable; JSONB array operations are wrapped in CASE to guard against non-array rows.
Public-status TTL: Versioned config snapshot keys now carry a 30-day TTL. Pointer keys are intentionally left without a TTL, but they store references to versioned keys, so if a versioned key expires after 30 idle days, the status page silently goes dark.

File analyses

Path	Confidence	Summary
src/repository/key.ts	5	Adds resolveApiKeyAuthOutcome returning a discriminated union; widens the DB query to include disabled/expired keys and classifies them in JS with correct multi-row deduplication logic.
src/app/v1/_lib/proxy/auth-guard.ts	5	Exhaustive switch on ApiKeyAuthFailureReason with assertNever; buildAuthFailure factory enforces failureKind on every failure path; account-state failures correctly bypass the brute-force rate-limiter.
src/lib/availability/availability-service.ts	5	Inlines the PL/pgSQL function as a SQL expression to restore SARGability; wraps JSONB array ops in a CASE to guard against non-array rows.
src/lib/public-status/config-snapshot.ts	3	Adds 30-day TTL to versioned snapshot keys and correctly withholds TTL from pointer keys, but pointer keys reference versioned snapshot keys that also expire, causing the status page to go dark on deployments idle for 30+ days.
tests/unit/proxy/auth-guard-account-state.test.ts	5	New test file with 7 cases including the regression scenario (25 disabled-key attempts never trigger 429).
tests/unit/repository/key-resolve-outcome.test.ts	5	New test file with 9 cases including 3 multi-row duplicate-key scenarios verifying active > expired > disabled classification.

Confidence Score: 4/5

The auth and availability fixes are solid and well-tested; the public-status TTL fix risks silently darkening the status page on idle deployments.

The auth refactor and availability index fix are correct and comprehensively tested. The public-status TTL logic introduces a scenario where an idle deployment with no config change for 30 days loses its versioned snapshot while the pointer key survives, causing the read path to return null with no fallback — a regression from the pre-PR state where no data ever expired.

src/lib/public-status/config-snapshot.ts — the TTL applied to versioned snapshot keys needs to be substantially longer, or the read path needs a fallback when a versioned key has expired.

Comments Outside Diff (1)

src/lib/public-status/config-snapshot.ts, line 234-241 (link)

Versioned snapshot expiry silences the status page on idle deployments

The pointer keys correctly have no TTL, but each pointer stores a reference to a versioned snapshot key that carries the 30-day TTL. In readCurrentPublicStatusConfigSnapshot, once currentVersion is non-null (the version pointer key has no TTL so it survives indefinitely), the function returns immediately from reading the versioned snapshot — with no fallback if that key has expired. On any deployment that goes 30 days without a config change, the pointer is alive but the snapshot it references has silently expired, causing the status page to return null and go dark.

The legacy pointer path has the same problem: buildPublicStatusConfigSnapshotKey() (no TTL) stores a reference to the versioned key path, and that versioned key also expires after 30 days.

A straightforward fix is to use a much longer TTL (e.g., one year). Config accumulation remains bounded while idle deployments do not silently lose snapshot data.
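
The alternative mentioned in the confidence notes, a read-path fallback, could look roughly like the sketch below. Everything here is hypothetical (`POINTER_KEY`, `SnapshotStore`, `readCurrentSnapshot`, the `rebuild` callback); it only models the pointer-to-versioned-snapshot layout the comment describes and is not the code of `readCurrentPublicStatusConfigSnapshot`.

```typescript
// Hypothetical key name; the real pointer key is built by helper functions in the PR.
const POINTER_KEY = "example:current-pointer";

interface SnapshotStore {
  get(key: string): string | null;
  set(key: string, value: string): void;
}

interface ReadResult {
  snapshot: string;
  source: "stored" | "rebuilt";
}

// Follow the pointer; if it is missing or dangling, rebuild and republish
// instead of returning null to the status page.
function readCurrentSnapshot(
  store: SnapshotStore,
  rebuild: () => { key: string; snapshot: string },
): ReadResult {
  const snapshotKey = store.get(POINTER_KEY);
  if (snapshotKey !== null) {
    const stored = store.get(snapshotKey);
    if (stored !== null) return { snapshot: stored, source: "stored" };
  }
  const fresh = rebuild();
  // In a real store the versioned key would be written with its TTL here.
  store.set(fresh.key, fresh.snapshot);
  store.set(POINTER_KEY, fresh.key);
  return { snapshot: fresh.snapshot, source: "rebuilt" };
}

// Map-backed store for experimentation; `remove` simulates a key expiring.
function createMemoryStore(): SnapshotStore & { remove(key: string): void } {
  const data = new Map<string, string>();
  return {
    get: (key) => data.get(key) ?? null,
    set: (key, value) => { data.set(key, value); },
    remove: (key) => { data.delete(key); },
  };
}
```

Compared with simply lengthening the TTL, this trades a more complex read path for snapshot keys that can still expire on a short schedule.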

Reviews (5): Last reviewed commit: "fix(public-status): drop TTL from curren..."

Disabled keys hit the proxy pre-auth rate limiter on every request, incremented the failure counter, and locked the IP/key out with HTTP 429 "Too many authentication failures" after 20 attempts. The legitimate "key disabled" 401 was masked forever once the lockout tripped.

Introduce `resolveApiKeyAuthOutcome` returning a discriminated union (`not_found` / `key_disabled` / `key_expired` / `ok`), map each reason to its own 401 error, and only feed `credentials` failures to the rate-limiter; admin-disabled or expired keys/users now bypass it entirely. `validateApiKeyAndGetUser` stays as a backwards-compatible wrapper. The /v1/models chain is updated the same way.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The availability dashboard query was slow because the WHERE clause called the PL/pgSQL function `fn_is_message_request_finalized(...)`. PostgreSQL never inlines PL/pgSQL functions, so the predicate became opaque to the planner and the partial index `idx_message_request_provider_created_at_finalized_active` (predicate `status_code IS NOT NULL AND deleted_at IS NULL`) was no longer usable. The dashboard fell back to a sequential scan that re-evaluated the function per row.

Inline a semantically-equivalent SQL expression so `status_code IS NOT NULL` becomes the dominant SARGable branch. The PL/pgSQL function definition is unchanged (still called by the upsert trigger on the write path); a header comment marks the keep-in-sync requirement. Tests assert on the inlined form to guard against the slow function-call form returning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-14T15:38:19Z

Walkthrough

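This PR introduces a structured API key authentication outcome (resolveApiKeyAuthOutcome) and uses the failure type in the proxy guard and the models endpoint to decide whether a failure counts toward brute-force rate limiting; inlines the availability finalized check as an indexable SQL predicate; and adds a Redis TTL to public-status snapshot writes, updating the related tests and interface hints.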

Changes

API key authentication outcome API and rate-limiter integration

Layer / File(s)	Summary
Core definition and implementation of the auth outcome API `src/repository/key.ts`, `src/repository/index.ts`	Adds `ApiKeyAuthFailureReason` and `ApiKeyAuthOutcome`; implements `resolveApiKeyAuthOutcome(keyString)` (Vacuum Filter → Redis → DB chain, with enabled/expired/disabled classification applied afterwards) and has `validateApiKeyAndGetUser` call it to stay compatible; exposes the new types and function in the repository exports.
Proxy guard authentication and failure classification `src/app/v1/_lib/proxy/session.ts`, `src/app/v1/_lib/proxy/auth-guard.ts`	Adds the `AuthFailureKind` type and the `AuthState.failureKind` field; `auth-guard` uses `resolveApiKeyAuthOutcome` and sets `failureKind` (`credentials` / `account_state`) from `outcome.reason`; `ensure` calls `proxyAuthPolicy.recordFailure` only when `failureKind !== "account_state"`, so account-state failures are not counted toward rate limiting.
Models endpoint authentication integration `src/app/v1/_lib/models/available-models.ts`	`authenticateRequest` now calls `resolveApiKeyAuthOutcome`, maps `outcome.reason` to distinct 401 error types and messages (using localized messages), and on success destructures `user` and `key` for later validation.
Unit tests for the auth outcome API `tests/unit/repository/key-resolve-outcome.test.ts`, `tests/unit/proxy/auth-guard-account-state.test.ts`	Adds `key-resolve-outcome.test.ts` covering each branch of `resolveApiKeyAuthOutcome` and duplicate-row classification; adds `auth-guard-account-state.test.ts` covering account-state failures, unknown keys, missing credentials, and the scenario where rate limiting is not triggered.
Existing tests adapted to the new auth API `tests/unit/proxy/auth-guard-precheck.test.ts`, `tests/unit/models/available-models-gemini-key.test.ts`, `tests/unit/proxy/available-models.test.ts`	Switches the `@/repository/key` mocks from `validateApiKeyAndGetUser` to `resolveApiKeyAuthOutcome` and adapts them to its `{ ok, reason }` / `{ ok: true, user, key }` shape; injects mocks for `next-intl` / getErrorMessageServer so localized error messages resolve in tests.

Database query optimization

Layer / File(s)	Summary
Inlined finalized-condition implementation `src/lib/availability/availability-service.ts`	Adds `FINALIZED_PROVIDER_CHAIN_REASONS` and `FINALIZED_PROVIDER_CHAIN_REASONS_SQL`, and changes `buildAvailabilityFinalizedCondition()` from calling `fn_is_message_request_finalized(...)` to an inlined equivalent SQL predicate (statusCode / blockedBy / errorMessage / guarded CASE logic on the last providerChain element), keeping the index usable while staying semantically equivalent in the drizzle SQL.
Finalized-condition test updates `tests/unit/lib/availability-service.test.ts`	Updates the assertions to verify that the generated SQL does not contain `fn_is_message_request_finalized` and does contain the inlined fragments (`"status_code" is not null`, `"blocked_by" is not null`, `"provider_chain" -> -1 ->> 'reason'`, and the jsonb type-guard CASE logic).

Public-status Redis TTL

Layer / File(s)	Summary
TTL introduction and write-logic changes `src/lib/public-status/config-snapshot.ts`	Adds the `PUBLIC_STATUS_CONFIG_TTL_SECONDS` constant, extends the `RedisWriter.set` signature, and adds an `EX` TTL to the writes in `publishPublicStatusConfigSnapshot` / `publishInternalPublicStatusConfigSnapshot` / `publishCurrentPublicStatusConfigPointers` (the Lua eval branch also passes the TTL as an ARGV).
Test updates `tests/unit/public-status/config-snapshot.test.ts`	Extends the test module interface to include `PUBLIC_STATUS_CONFIG_TTL_SECONDS`, changes the redis.set mock contract to a variadic form, and adds assertions that versioned snapshot key writes carry `EX` and the TTL while pointer key writes carry no TTL (including eval-path assertions).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

ding113/claude-code-hub#1040: Both PRs modify src/app/v1/_lib/proxy/auth-guard.ts inside ProxyAuthenticator.ensure/validate—changing how candidate API keys and auth outcomes drive failure/success recording and failure-kind/rate-limit behavior.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the PR's three main fixes: the disabled-key 429 lockout, the availability page performance problem, and the public-status Redis leak.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
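Description check	✅ Passed	The PR description is closely related to the changeset and clearly lays out the three user-reported bug fixes, the key changes, and the test plan.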

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5d7449d87

chatgpt-codex-connector · 2026-05-14T15:39:45Z

    .from(keys)
    .innerJoin(users, eq(keys.userId, users.id))
-    .where(
-      and(
-        eq(keys.key, keyString),
-        isNull(keys.deletedAt),
-        eq(keys.isEnabled, true),
-        or(isNull(keys.expiresAt), gt(keys.expiresAt, new Date())),
-        isNull(users.deletedAt)
-      )
-    );
+    .where(and(eq(keys.key, keyString), isNull(keys.deletedAt), isNull(users.deletedAt)));


Make auth lookup deterministic when duplicate key rows exist

Because this query now returns all non-deleted rows for a key string and later uses result[0], authentication outcome depends on arbitrary row order when duplicate keys.key values exist (the schema has no unique constraint on keys.key). A disabled/expired duplicate can be picked before an active duplicate, causing valid credentials to be rejected as key_disabled/key_expired. Before this change, the SQL filter only returned active rows, so this mixed-state false rejection path was not present.
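
One way to make the outcome independent of row order is to classify the whole result set rather than `result[0]`. The sketch below follows the precedence stated in the PR description (an active duplicate wins, then `key_expired` if any duplicate is enabled, otherwise `key_disabled`). The `KeyRow` shape and `classifyKeyRows` name are simplified assumptions, not the repository's actual types.

```typescript
// Simplified, hypothetical row shape; the real rows also carry the joined user.
interface KeyRow {
  id: number;
  isEnabled: boolean;
  expiresAt: Date | null;
}

type KeyOutcome =
  | { ok: true; row: KeyRow }
  | { ok: false; reason: "not_found" | "key_disabled" | "key_expired" };

function classifyKeyRows(rows: KeyRow[], now: Date): KeyOutcome {
  if (rows.length === 0) return { ok: false, reason: "not_found" };

  const isActive = (row: KeyRow): boolean =>
    row.isEnabled && (row.expiresAt === null || row.expiresAt.getTime() > now.getTime());

  // 1. Any active duplicate authenticates, regardless of its position.
  const active = rows.find(isActive);
  if (active) return { ok: true, row: active };

  // 2. No active row, but an enabled one exists: it must have expired.
  if (rows.some((row) => row.isEnabled)) return { ok: false, reason: "key_expired" };

  // 3. Every duplicate is disabled.
  return { ok: false, reason: "key_disabled" };
}
```

Because the function scans all rows, a disabled duplicate that happens to sort first can no longer shadow a valid key.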

github-actions · 2026-05-14T15:42:04Z

🧪 Test results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

gemini-code-assist

Code Review

This pull request refactors the API key authentication logic by introducing resolveApiKeyAuthOutcome, which distinguishes between missing, disabled, and expired keys. This change allows the ProxyAuthenticator to skip rate-limiting for administrative account-state failures, preventing potential lockouts. Additionally, the availability-service.ts was optimized by inlining SQL logic to improve query planning and index utilization. Feedback suggests extracting hardcoded status reasons into a shared constant to mitigate maintenance risks.

gemini-code-assist · 2026-05-14T15:42:04Z

 function buildAvailabilityFinalizedCondition() {
-  return sql`fn_is_message_request_finalized(
-    ${messageRequest.blockedBy},
-    ${messageRequest.statusCode},
-    ${messageRequest.providerChain},
-    ${messageRequest.errorMessage}
+  // The `IS NOT NULL` checks below are individually SARGable. Listing
+  // status_code first encourages the planner to scan the partial index.
+  return sql`(
+    ${messageRequest.statusCode} IS NOT NULL
+    OR ${messageRequest.blockedBy} IS NOT NULL
+    OR COALESCE(${messageRequest.errorMessage}, '') <> ''
+    OR (
+      ${messageRequest.providerChain} IS NOT NULL
+      AND jsonb_typeof(${messageRequest.providerChain}) = 'array'
+      AND jsonb_array_length(${messageRequest.providerChain}) > 0
+      AND jsonb_typeof(${messageRequest.providerChain} -> -1) = 'object'
+      AND (
+        (${messageRequest.providerChain} -> -1 ->> 'reason') IN (
+          'request_success', 'retry_success', 'retry_failed', 'system_error',
+          'resource_not_found', 'client_error_non_retryable',
+          'concurrent_limit_failed', 'hedge_winner', 'hedge_loser_cancelled',
+          'client_abort'
+        )
+        OR (
+          (${messageRequest.providerChain} -> -1 ? 'statusCode')
+          AND jsonb_typeof(${messageRequest.providerChain} -> -1 -> 'statusCode') = 'number'
+        )
+        OR COALESCE(${messageRequest.providerChain} -> -1 ->> 'errorMessage', '') <> ''
+      )
+    )
  )`;
 }


The inlining of the fn_is_message_request_finalized logic into this SQL template is a significant performance improvement as it allows the PostgreSQL planner to utilize partial indexes. However, this creates a maintenance risk as the logic is now duplicated between the TypeScript code and the PL/pgSQL function in the database.

Consider extracting the list of finalized reasons (lines 97-100) into a shared constant at the top of the file to improve readability and ensure consistency if these reasons are used elsewhere in the service.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/app/v1/_lib/models/available-models.ts`:
- Around line 66-67: The thrown error responses in available-models.ts (the
throw c.json calls that return messages like "API 密钥已被禁用" and the other messages
around the same block at lines ~69-73) must stop returning hardcoded user-facing
strings and instead return i18n keys; update the code that throws these errors
(the throw c.json(...) statements) to use defined translation keys (e.g.,
models.key_disabled, models.invalid_key, etc.) and ensure the response payload
carries the key (and optional interpolation data) so the central translation
layer will render the localized text for zh-CN/zh-TW/en/ja/ru before sending to
the client; keep the error structure ({ error: { messageKey: "...", type: "..."
} } or follow the project’s established i18n error contract) and replace all
hardcoded messages in that block (including the ones referenced at 69-73)
accordingly.

In `@src/app/v1/_lib/proxy/auth-guard.ts`:
- Around line 188-191: The 401 error message passed to ProxyResponses.buildError
is hardcoded; replace literal user-facing strings in auth-guard.ts (the
ProxyResponses.buildError calls for the invalid API key / disabled / expired
branches) with i18n keys (e.g. use keys for "invalid_api_key", "key_disabled",
"key_expired") and pass those keys into the response payload so the translation
layer renders the localized text; update the three affected calls (the
buildError invocations around the invalid_api_key, key_disabled, and key_expired
branches) and ensure corresponding entries are added to the i18n resources for
zh-CN, zh-TW, en, ja, and ru.

In `@src/lib/availability/availability-service.ts`:
- Around line 91-106: The SQL predicate using ${messageRequest.providerChain}
relies on short-circuiting and can still call jsonb_array_length/jsonb_typeof on
non-array values causing query errors; change the WHERE fragment to use a CASE
(or a NULLIF/WHEN) guard that first checks
jsonb_typeof(${messageRequest.providerChain}) = 'array' and only then evaluates
jsonb_array_length(...) and the -> -1 expressions so evaluation order is forced,
and add a regression test feeding a non-array provider_chain (object, scalar,
and JSON null) to the code path that builds this query to assert it no longer
throws a 500.

📥 Commits

Reviewing files that changed from the base of the PR and between 724f070 and b5d7449.

📒 Files selected for processing (12)

src/app/v1/_lib/models/available-models.ts
src/app/v1/_lib/proxy/auth-guard.ts
src/app/v1/_lib/proxy/session.ts
src/lib/availability/availability-service.ts
src/repository/index.ts
src/repository/key.ts
tests/unit/lib/availability-service.test.ts
tests/unit/models/available-models-gemini-key.test.ts
tests/unit/proxy/auth-guard-account-state.test.ts
tests/unit/proxy/auth-guard-precheck.test.ts
tests/unit/proxy/available-models.test.ts
tests/unit/repository/key-resolve-outcome.test.ts

greptile-apps · 2026-05-14T15:44:00Z

 export interface AuthState {
  user: User | null;
  key: Key | null;
  apiKey: string | null;
  success: boolean;
  errorResponse?: Response; // 认证失败时的详细错误响应
+  /**
+   * Set when `success` is false. Determines whether the proxy auth guard
+   * records the failure against the IP/key rate-limiter.
+   */
+  failureKind?: AuthFailureKind;
 }


failureKind should be required on failure states

AuthState uses a flat success: boolean shape, so TypeScript does not enforce that failureKind is always present when success is false. The guard check authState.failureKind !== "account_state" silently treats undefined as a credentials failure — the safe default, but only by accident. Any future failure branch that forgets the field will record against the rate-limiter without any compile-time warning. Modeling this as a discriminated union would make the constraint explicit and prevent the footgun entirely.
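
A discriminated-union version of the state might look like the sketch below. The field set is deliberately reduced and partly hypothetical (`userId` and `errorCode` stand in for the real `user`, `key`, and `errorResponse` fields); only `success`, `apiKey`, and `failureKind` are taken from the interface quoted above.

```typescript
type AuthFailureKind = "credentials" | "account_state";

// Simplified stand-in for AuthState: failureKind is mandatory on the failure arm.
type AuthState =
  | { success: true; userId: number; apiKey: string }
  | { success: false; apiKey: string | null; failureKind: AuthFailureKind; errorCode: string };

// The guard can narrow on `success` and then read failureKind without an undefined case.
function shouldRecordFailure(state: AuthState): boolean {
  if (state.success) return false;
  return state.failureKind !== "account_state";
}

const expiredKeyState: AuthState = {
  success: false,
  apiKey: "sk-example",
  failureKind: "account_state",
  errorCode: "key_expired",
};

// Omitting failureKind on a failure state is now a compile error:
// const broken: AuthState = { success: false, apiKey: null, errorCode: "invalid_api_key" };
```

With this shape, a new failure branch that forgets to tag its kind fails type checking instead of silently defaulting to the rate-limited path.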

greptile-apps · 2026-05-14T15:44:03Z

+      // outcome.reason === "key_expired"
+      logger.warn("[ProxyAuthenticator] API key has expired", {
        apiKeyLength: apiKey.length,
-        fromHeader: !!headers.authHeader || !!headers.apiKeyHeader || !!headers.geminiApiKeyHeader,
-        fromQuery: !!headers.geminiApiKeyQuery,
      });
      return {
        user: null,
        key: null,
        apiKey,
        success: false,
+        failureKind: "account_state",
        errorResponse: ProxyResponses.buildError(
          401,
-          "API 密钥无效。提供的密钥不存在、已被删除、已被禁用或已过期。",
-          "invalid_api_key"
+          "API 密钥已过期。请联系管理员续期或更换密钥。",
+          "key_expired"
        ),
      };


Non-exhaustive fallthrough on outcome.reason

The comment // outcome.reason === "key_expired" is correct today because the union has exactly three members, but there is no compile-time exhaustiveness check. If a new ApiKeyAuthFailureReason is ever added (e.g., "key_revoked"), the new variant silently falls into the key_expired handler, sends the wrong error message, and — critically — classifies the failure as account_state when it might need to be credentials. A switch with an assertNever exhaustiveness guard, or an explicit else branch that asserts the type, would surface this at build time instead.
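
For reference, the `switch` plus `assertNever` pattern can be sketched as follows. The mapping function `toFailureDescriptor` is a hypothetical reduction of the guard's branches; the reason names, error codes, and failure kinds mirror those listed in the PR description.

```typescript
type ApiKeyAuthFailureReason = "not_found" | "key_disabled" | "key_expired";
type AuthFailureKind = "credentials" | "account_state";

interface FailureDescriptor {
  code: string;
  failureKind: AuthFailureKind;
}

// Accepting `never` means this call only type-checks when every variant is handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled auth failure reason: ${String(value)}`);
}

function toFailureDescriptor(reason: ApiKeyAuthFailureReason): FailureDescriptor {
  switch (reason) {
    case "not_found":
      return { code: "invalid_api_key", failureKind: "credentials" };
    case "key_disabled":
      return { code: "key_disabled", failureKind: "account_state" };
    case "key_expired":
      return { code: "key_expired", failureKind: "account_state" };
    default:
      // Adding e.g. "key_revoked" to the union makes this line a compile error.
      return assertNever(reason);
  }
}
```

The runtime `throw` is only a backstop for values that bypass the type system; the main benefit is the build-time failure when the union grows.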

greptile-apps · 2026-05-14T15:44:06Z

+        OR (
+          (${messageRequest.providerChain} -> -1 ? 'statusCode')
+          AND jsonb_typeof(${messageRequest.providerChain} -> -1 -> 'statusCode') = 'number'
+        )


Raw ? JSONB operator in Drizzle sql template

(${messageRequest.providerChain} -> -1 ? 'statusCode') uses the PostgreSQL JSONB key-existence operator ?. With the pg driver (which parameterizes via $N) the raw ? passes through to the server unchanged and works correctly. However, some connection-pooling middlewares or alternative drivers (e.g., postgres.js) treat bare ? as a positional-parameter placeholder and will mangle the query. Consider adding a brief comment documenting the driver assumption here, in case the underlying driver is ever swapped.

github-actions

Code Review Summary

This PR fixes two distinct production issues: a 429 lockout caused by disabled API keys and a slow availability dashboard due to a non-inlinable PL/pgSQL predicate. Both fixes are well-reasoned, thoroughly tested, and the auth discriminated-union refactor correctly isolates brute-force signals from admin actions.
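The isolation described above can be pictured with a small sketch. This is not the project's rate limiter: `PreAuthFailureCounter`, its method names, and the string client identifier are hypothetical, and the threshold of 20 is taken from the PR summary. The only idea it illustrates is that failures tagged `account_state` (an admin disabled or expired the key) never count toward the brute-force lockout, while `credentials` failures do.

```typescript
// Hypothetical sketch; the real limiter in claude-code-hub may look different.
type FailureKind = "credentials" | "account_state";

class PreAuthFailureCounter {
  private readonly failures = new Map<string, number>();
  private readonly maxFailures: number;

  constructor(maxFailures: number) {
    this.maxFailures = maxFailures;
  }

  // Only credential failures are treated as a brute-force signal.
  recordFailure(clientId: string, kind: FailureKind): void {
    if (kind !== "credentials") return;
    this.failures.set(clientId, (this.failures.get(clientId) ?? 0) + 1);
  }

  // A locked-out client would receive the 429 instead of the real 401.
  isLockedOut(clientId: string): boolean {
    return (this.failures.get(clientId) ?? 0) >= this.maxFailures;
  }
}
```

With this shape, a client that keeps retrying a disabled key is never locked out and continues to see the specific 401, whereas repeated unknown keys still reach the lockout threshold.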

PR Size: L

Lines changed: 744 (685 additions, 59 deletions)
Files changed: 12
Split suggestion: This PR contains two independent fixes that could ship separately: (1) the auth-chain rate-limiter routing (7 files, ~300 lines), and (2) the availability predicate inlining (2 files, ~100 lines). Splitting would make rollback safer if either fix causes unexpected issues.

Issues Found

Category	Critical	High	Medium	Low
Logic/Bugs	0	1	0	0
Security	0	0	0	0
Error Handling	0	0	0	0
Types	0	1	0	0
Comments/Docs	0	0	0	0
Tests	0	0	0	0
Simplification	0	0	0	0

High Priority Issues (Should Fix)

1. Non-exhaustive `ApiKeyAuthFailureReason` handling in `auth-guard.ts`

File: src/app/v1/_lib/proxy/auth-guard.ts:214
Confidence: 95/100

The sequential if statements handling outcome.reason lack a compile-time exhaustiveness guard. If ApiKeyAuthFailureReason is ever expanded (e.g., adding "key_revoked"), the new variant silently falls through to the key_expired handler. This misclassifies the failure as account_state (bypassing the rate limiter) and returns the wrong error message. The same pattern exists in available-models.ts (line 71) but with less severe consequences since it falls through to invalid_api_key (credentials class).

Suggested fix: Convert to an if/else if chain with a never exhaustiveness guard:

if (!outcome.ok) {
  if (outcome.reason === "not_found") {
    // ... existing handler
  } else if (outcome.reason === "key_disabled") {
    // ... existing handler
  } else if (outcome.reason === "key_expired") {
    // ... existing handler
  } else {
    const _exhaustiveCheck: never = outcome.reason;
    throw new Error(`Unhandled auth failure reason: ${_exhaustiveCheck}`);
  }
}

Review Coverage

Logic and correctness - One exhaustiveness fragility identified
Security (OWASP Top 10) - Clean; auth failure classification is sound
Error handling - Clean; all failure paths log and return precise errors
Type safety - One type-union exhaustiveness gap in auth-guard.ts
Documentation accuracy - Clean; excellent comments on inlined SQL predicate
Test coverage - Adequate; 13 new unit tests cover all auth outcomes + regression
Code clarity - Good; discriminated union refactor improves readability

Automated review by Claude AI

github-actions · 2026-05-14T15:52:06Z


-    if (!authResult) {
-      logger.debug("[ProxyAuthenticator] API key validation failed", {
+      // outcome.reason === "key_expired"


[High] [LOGIC-BUG] Non-exhaustive ApiKeyAuthFailureReason handling

Why this is a problem: The sequential if statements for outcome.reason lack a compile-time exhaustiveness guard. If ApiKeyAuthFailureReason is ever expanded (e.g., adding "key_revoked"), the new variant silently falls through to the key_expired handler below. This returns the wrong error message and misclassifies the failure as account_state, bypassing the brute-force rate limiter.

Suggested fix: Convert to if/else if with a never exhaustiveness guard in the final else branch.
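A minimal sketch of the exhaustiveness guard in `switch` form, for illustration only. The union members come from the review text; the `AuthFailure` shape, the `classifyAuthFailure` name, and the exact mapping of each reason to `credentials` or `account_state` are assumptions rather than the code in auth-guard.ts.

```typescript
// Assumed mirror of the union; the real type lives in src/repository/key.ts.
type ApiKeyAuthFailureReason = "not_found" | "key_disabled" | "key_expired";
type AuthFailureKind = "credentials" | "account_state";

// Hypothetical result shape used only for this sketch.
interface AuthFailure {
  failureKind: AuthFailureKind;
  errorType: string;
}

// Accepting `never` means any unhandled union member is a compile-time error
// at the call site; the throw only guards against unexpected runtime values.
function assertNever(value: never): never {
  throw new Error(`Unhandled auth failure reason: ${String(value)}`);
}

function classifyAuthFailure(reason: ApiKeyAuthFailureReason): AuthFailure {
  switch (reason) {
    case "not_found":
      return { failureKind: "credentials", errorType: "invalid_api_key" };
    case "key_disabled":
      return { failureKind: "account_state", errorType: "key_disabled" };
    case "key_expired":
      return { failureKind: "account_state", errorType: "key_expired" };
    default:
      // After the three cases above, `reason` has been narrowed to `never`.
      return assertNever(reason);
  }
}
```

If a fourth member such as `"key_revoked"` were added to the union, the `default` branch would stop type-checking until a matching `case` is written, which is the build-time signal the review asks for.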

- key.ts: classify duplicate-row matches deterministically. The relaxed WHERE clause can return multiple non-deleted rows for one key string (no unique constraint on keys.key) and result[0] was non-deterministic; prefer an active row, fall back to "any enabled = key_expired", else key_disabled. (chatgpt-codex P1)
- availability-service.ts: wrap the provider_chain jsonb branch in a CASE expression so jsonb_array_length cannot run on a non-array row. PostgreSQL does not guarantee AND short-circuit, so a single non-array historical row would otherwise crash the dashboard query. Extract the finalized provider_chain reason list into FINALIZED_PROVIDER_CHAIN_REASONS and document the JSONB `?` operator's driver assumption. (coderabbit P2, gemini, greptile)
- auth-guard.ts / available-models.ts: convert the outcome.reason branch to an exhaustive switch with assertNever, and introduce a buildAuthFailure factory so every failure path is forced to tag its failureKind at compile time. Adding a new ApiKeyAuthFailureReason now produces a TypeScript error until the new branch is handled. (greptile P2 ×2)
- Tests cover the duplicate-row cases (ok / key_expired / key_disabled across mixed-state rows) and assert the CASE guard appears in the generated SQL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ding113 · 2026-05-14T16:07:51Z

Addressed review feedback in d314561:

Bugs fixed

chatgpt-codex P1 (key.ts): the relaxed WHERE clause could return multiple non-deleted rows for the same key string (no unique constraint on keys.key); result[0] was non-deterministic. Now we prefer an active duplicate, fall back to key_expired when any row is still enabled, and only return key_disabled if every duplicate is disabled. Three new regression tests cover mixed-state row sets.
coderabbit P2 (availability-service.ts): wrapped the provider_chain JSONB branch in a CASE so jsonb_array_length cannot evaluate against a non-array historical row (PG does not guarantee AND short-circuit). Test asserts the CASE shape now appears in the generated SQL.
greptile P2 (auth-guard.ts exhaustiveness): the outcome.reason branch is now an exhaustive switch with assertNever. Adding a new ApiKeyAuthFailureReason will fail to compile until the new branch is handled. Same pattern applied in available-models.ts.
greptile P2 (session.ts failureKind footgun): introduced a buildAuthFailure factory so every failure return is forced (by the function signature) to tag both failureKind and errorResponse. Less invasive than refactoring AuthState to a discriminated union (which would require null-narrowing edits at every consumer).
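The duplicate-row rule from the first item can be sketched as follows. The `KeyRow` shape, the `classifyKeyRows` name, and the treatment of a null expiry as "never expires" are assumptions for illustration; the actual classifier in src/repository/key.ts may be structured differently.

```typescript
// Hypothetical row shape; the real keys table has more columns.
interface KeyRow {
  id: number;
  isEnabled: boolean;
  expiresAt: Date | null; // assumed: null means the key never expires
}

type KeyAuthOutcome =
  | { ok: true; row: KeyRow }
  | { ok: false; reason: "not_found" | "key_disabled" | "key_expired" };

function classifyKeyRows(rows: KeyRow[], now: Date): KeyAuthOutcome {
  if (rows.length === 0) return { ok: false, reason: "not_found" };

  const isUnexpired = (row: KeyRow): boolean =>
    row.expiresAt === null || row.expiresAt.getTime() > now.getTime();

  // 1. Prefer an active duplicate (enabled and not expired).
  const active = rows.find((row) => row.isEnabled && isUnexpired(row));
  if (active) return { ok: true, row: active };

  // 2. No active row: any enabled row must therefore be expired.
  if (rows.some((row) => row.isEnabled)) return { ok: false, reason: "key_expired" };

  // 3. Every duplicate is disabled.
  return { ok: false, reason: "key_disabled" };
}
```

Because the result no longer depends on which row the database happens to return first, the same set of duplicates always maps to the same outcome.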

Nice-to-have

gemini-code-assist: extracted FINALIZED_PROVIDER_CHAIN_REASONS constant.
greptile (JSONB ? operator): added a comment documenting the pg driver assumption.

Deferred (with reason)

coderabbit (i18n on available-models.ts and auth-guard.ts): the pre-existing 401 error messages in these files (e.g. 用户账户已被禁用) were already hardcoded Chinese strings before this PR. My change preserved the file's existing pattern. Introducing i18n for proxy error JSON payloads is a cross-cutting refactor that should land separately rather than be half-applied to a handful of new branches.

Local pre-push: build ✓, typecheck ✓, lint ✓, all 5960 unit tests ✓.

github-actions · 2026-05-14T16:13:40Z

🧪 Test results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

github-actions · 2026-05-14T16:14:52Z

+  const outcome = await resolveApiKeyAuthOutcome(apiKey);
+  if (!outcome.ok) {
+    // Exhaustive switch: see auth-guard.ts for rationale. Adding a new
+    // ApiKeyAuthFailureReason will produce a TypeScript error on the


[HIGH] [TEST-MISSING-CRITICAL] /v1/models key_disabled / key_expired branches have no unit coverage

Why this is a problem: This PR adds new auth behavior in authenticateRequest (distinct 401 key_disabled vs key_expired). There is currently no unit test that asserts these new branches, so a regression back to a generic invalid_api_key (or other behavior) would go undetected. Guideline: 2. **Test Coverage** - All new features must have unit test coverage of at least 80%.

Suggested fix:

// tests/unit/models/available-models-auth-outcome.test.ts
import { describe, expect, it, vi } from "vitest";

vi.mock("@/repository/key", () => ({
  resolveApiKeyAuthOutcome: vi.fn(),
}));

function makeCtx(apiKey: string) {
  return {
    req: {
      path: "/v1/models",
      header: (name: string) => (name.toLowerCase() === "x-api-key" ? apiKey : undefined),
      query: () => undefined,
    },
    json: (body: unknown, status?: number) =>
      new Response(JSON.stringify(body), {
        status: status ?? 200,
        headers: { "content-type": "application/json" },
      }),
  } as any;
}

describe("handleAvailableModels auth outcomes", () => {
  it("returns 401 key_disabled", async () => {
    const { resolveApiKeyAuthOutcome } = await import("@/repository/key");
    vi.mocked(resolveApiKeyAuthOutcome).mockResolvedValueOnce({ ok: false, reason: "key_disabled" });

    const { handleAvailableModels } = await import("@/app/v1/_lib/models/available-models");
    const response = await handleAvailableModels(makeCtx("sk-disabled"));

    expect(response.status).toBe(401);
    const payload = (await response.json()) as { error: { type: string } };
    expect(payload.error.type).toBe("key_disabled");
  });

  it("returns 401 key_expired", async () => {
    const { resolveApiKeyAuthOutcome } = await import("@/repository/key");
    vi.mocked(resolveApiKeyAuthOutcome).mockResolvedValueOnce({ ok: false, reason: "key_expired" });

    const { handleAvailableModels } = await import("@/app/v1/_lib/models/available-models");
    const response = await handleAvailableModels(makeCtx("sk-expired"));

    expect(response.status).toBe(401);
    const payload = (await response.json()) as { error: { type: string } };
    expect(payload.error.type).toBe("key_expired");
  });
});

github-actions · 2026-05-14T16:15:34Z

          401,
          "提供了多个冲突的 API 密钥。请仅使用一种认证方式。",
          "authentication_error"
        ),


[MEDIUM] [STANDARD-VIOLATION] New auth error messages hardcode user-facing text (bypasses i18n)

Why this is a problem: Guideline: 3. **i18n Required** - All user-facing strings must use i18n (5 languages supported). Never hardcode display text. The new responses embed hardcoded message strings (e.g. "API 密钥无效。提供的密钥不存在或已被删除。"), which cannot be localized and will leak a single-language UX into non-zh locales.

Suggested fix:

// src/app/v1/_lib/proxy/auth-guard.ts
const { getLocale, getTranslations } = await import("next-intl/server");
const locale = await getLocale();
const t = await getTranslations({ locale, namespace: "auth" });

return {
  user: null,
  key: null,
  apiKey,
  success: false,
  failureKind: "credentials",
  errorResponse: ProxyResponses.buildError(
    401,
    t("errors.apiKeyNotFoundOrDeleted"),
    "invalid_api_key"
  ),
};

Add the new keys to messages/{locale}/auth.json under errors (e.g. apiKeyNotFoundOrDeleted, apiKeyDisabled, apiKeyExpired) so all supported locales are covered.

github-actions

Code Review Summary

This PR combines two substantial changes (proxy auth error classification and availability dashboard query perf) into a single large review surface.
Two issues need follow-up: the new /v1/models auth outcome branches are not unit-tested, and the new auth error responses add additional hardcoded user-facing strings despite the repo i18n requirement.

PR Size: L

Lines changed: 744
Files changed: 12
Split suggestion: separate the auth-chain changes (key lookup outcome + proxy auth guard + tests) from the availability dashboard perf changes (availability finalized predicate + tests).

Issues Found

Category	High	Medium
Logic/Bugs	0	0
Security	0	0
Error Handling	0	0
Types	0	0
Comments/Docs	0	1
Tests	1	0
Simplification	0	0

Critical Issues (Must Fix)

None.

High Priority Issues (Should Fix)

[HIGH] [TEST-MISSING-CRITICAL] Missing unit coverage for new /v1/models key_disabled / key_expired branches (src/app/v1/_lib/models/available-models.ts:66). Confidence: 90.
[MEDIUM] [STANDARD-VIOLATION] New auth error messages hardcode user-facing strings instead of i18n (src/app/v1/_lib/proxy/auth-guard.ts:190). Confidence: 80.

Review Coverage

Automated review by Codex AI

github-actions

Identified PR #1187 and applied label size/L (744 lines changed across 12 files).
Posted 2 inline review comments on diff lines:
- src/app/v1/_lib/models/available-models.ts:66 — [HIGH] [TEST-MISSING-CRITICAL] missing unit coverage for new key_disabled / key_expired branches.
- src/app/v1/_lib/proxy/auth-guard.ts:190 — [MEDIUM] [STANDARD-VIOLATION] new hardcoded user-facing auth error messages bypass i18n requirement.
Submitted the required PR review summary via gh pr review --comment (includes PR size, split suggestion, issue table, and confidence notes).

publishPublicStatusConfigSnapshot, publishInternalPublicStatusConfigSnapshot, and publishCurrentPublicStatusConfigPointers all wrote keys with bare redis.set(...) and no TTL. Every config-version mint (provider/group/system settings change) created a new versioned snapshot key that never expired; on a busy operator the public-status:v1:config:* and config-internal:* key namespaces grow without bound. Neighbouring projection writers in rebuild-worker.ts already use a 30-day TTL via setWithTtl; only the config publishing path was missed when that pattern was introduced (#1056).

Add PUBLIC_STATUS_CONFIG_TTL_SECONDS (30 days, matching GENERATION_PROJECTION_TTL_SECONDS in rebuild-worker.ts), widen the local RedisWriter type to the (key, value, "EX", seconds) ioredis overload, and apply the TTL to all four call sites, including the Lua script used by the pointer publisher so SET ... EX is atomic with the version compare.

Each successful publish refreshes the TTL on the live pointer keys, so as long as configs are published at least every 30 days the active pointer never expires while stale versioned snapshots get cleaned up naturally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ding113 · 2026-05-14T16:21:29Z

Pushed 26d3b9f0 covering a third user-reported bug — the public-status config snapshots accumulated forever in Redis because the four writes in src/lib/public-status/config-snapshot.ts used bare redis.set(...) without TTL.

Root cause traced to #1056: rebuild-worker.ts correctly used setWithTtl(...) for manifest / series / snapshot projections, but the config publishing path was missed. Fix adds PUBLIC_STATUS_CONFIG_TTL_SECONDS = 30d (matching GENERATION_PROJECTION_TTL_SECONDS), widens the local RedisWriter to accept the (key, value, "EX", seconds) ioredis overload, and applies the TTL to all four sites — including the Lua pointer-update script so SET ... EX is atomic with the version compare. Each successful publish refreshes the TTL on the live pointer keys, so as long as configs publish at least every 30 days the active pointer stays warm while stale versioned snapshots expire naturally.

Bundling this into the existing PR since all three bugs come from the same user report batch and are independent file-scoped fixes. Happy to split into a separate PR if preferred.

PR title and body updated accordingly. Pre-push re-run: build ✓, typecheck ✓, lint ✓, 5964 tests ✓.

github-actions · 2026-05-14T16:26:39Z

🧪 Test results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

…tcomes

CodeRabbit and CI flagged two remaining gaps in the previous fix:

- The new 401 messages for invalid_api_key / key_disabled / key_expired were hardcoded Chinese strings, ignoring the project's i18n guideline (5 locales: zh-CN, zh-TW, en, ja, ru). The pre-existing strings in the same files (e.g. user_disabled, user_expired) were already hardcoded before this PR and remain so (see PR scope note), but the new branches should follow the established pattern.
- /v1/models had no unit coverage for the new key_disabled / key_expired branches in handleAvailableModels.authenticateRequest. A regression back to a generic invalid_api_key would have gone undetected.

Add PROXY_INVALID_API_KEY, PROXY_API_KEY_DISABLED, PROXY_API_KEY_EXPIRED codes to AUTH_ERRORS plus translations across all 5 locales. Wire auth-guard.ts and available-models.ts through getErrorMessageServer + next-intl/server's getLocale, the same pattern the neighbouring rate-limit-guard.ts already uses.

Add tests/unit/models/available-models-auth-outcome.test.ts covering all five 401 branches (key_disabled, key_expired, not_found, user_disabled, user_expired). Existing auth-guard tests mock next-intl/server + getErrorMessageServer so the unit tests can run outside a Next.js request context, and message assertions pin on the ERROR_CODES key (the localized text lives in messages/<locale>/errors.json).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ding113 · 2026-05-14T16:47:32Z

Pushed c983b640 addressing the second-round review findings:

[HIGH] TEST-MISSING-CRITICAL — /v1/models lacks key_disabled / key_expired coverage
Added tests/unit/models/available-models-auth-outcome.test.ts covering all five 401 branches in handleAvailableModels.authenticateRequest: key_disabled, key_expired, not_found, user_disabled, user_expired. A regression back to a generic invalid_api_key would now be caught.

[MEDIUM] STANDARD-VIOLATION — Hardcoded auth error messages
Reversing my earlier defer: the codebase has an established i18n pattern (ERROR_CODES + getErrorMessageServer + next-intl/server's getLocale) and the neighbouring rate-limit-guard.ts already uses it. Added 3 new codes (PROXY_INVALID_API_KEY, PROXY_API_KEY_DISABLED, PROXY_API_KEY_EXPIRED) under AUTH_ERRORS, translations across all 5 locales (zh-CN, zh-TW, en, ja, ru), and wired both auth-guard.ts and available-models.ts through getErrorMessageServer. Scope note: the pre-existing strings in the same files (e.g. user_disabled, user_expired) are still hardcoded — they were that way before this PR; converting them is a separate small refactor that doesn't belong in this bug-fix branch.

The auth-guard test suite mocks next-intl/server + getErrorMessageServer so unit tests can run outside a Next.js request context; message assertions pin on the ERROR_CODES key (the localized text lives in messages/<locale>/errors.json).

Pre-push: build ✓, typecheck ✓, lint ✓, 5969 unit tests ✓.

greptile-apps · 2026-05-14T16:51:44Z

 }

 interface RedisWriter {
+  // ioredis supports both bare `set(key, value)` and the EX-variant
+  // `set(key, value, "EX", seconds)`. Widening the type here lets us pass
+  // an explicit TTL on every write — see PUBLIC_STATUS_CONFIG_TTL_SECONDS.
+  set(key: string, value: string, mode: "EX", seconds: number): Promise<unknown> | unknown;
  set(key: string, value: string): Promise<unknown> | unknown;
  get?(key: string): Promise<string | null> | string | null;


Current-pointer keys will expire on idle deployments

buildPublicStatusConfigSnapshotKey(), buildPublicStatusInternalConfigSnapshotKey(), and buildPublicStatusConfigVersionPointerKey() are current pointer keys — they always point at the latest config and should persist until explicitly overwritten. Applying the same 30-day TTL to them as to the accumulating versioned snapshot keys means that any deployment that goes 30 days without a config change loses its pointer keys, causing readCurrentPublicStatusConfigSnapshot to return null and the public status page to silently show no config data.

The versioned snapshot keys (public-status:v1:config:<version>) are the ones that accumulate and need a TTL; the three pointer keys should either carry no TTL or a much longer one (e.g., 1 year), since they are overwritten atomically on every config publish and never accumulate.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c983b6402a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-14T16:52:26Z

        buildPublicStatusConfigSnapshotKey(),
-        JSON.stringify({ key, configVersion: snapshot.configVersion })
+        JSON.stringify({ key, configVersion: snapshot.configVersion }),
+        "EX",
+        PUBLIC_STATUS_CONFIG_TTL_SECONDS


Keep public-status current pointers from expiring

Applying EX PUBLIC_STATUS_CONFIG_TTL_SECONDS to the config:current pointer means a stable deployment with no config publishes for 30 days will lose its only pointer to the latest snapshot, after which readCurrentPublicStatusConfigSnapshot falls through to null and endpoints like /api/public-site-meta report projection_missing. The leak fix only needs TTL on versioned snapshot keys; expiring the single “current” pointer key introduces a time-based availability regression.


github-actions · 2026-05-14T16:53:17Z

🧪 Test results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

coderabbitai

🧹 Nitpick comments (1)

tests/unit/models/available-models-auth-outcome.test.ts (1)
157-157: ⚡ Quick win

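Replace as never with an explicit type or satisfies to preserve type constraints

The as never casts on lines 157 and 173 bypass TypeScript's type checking, which weakens the test's ability to catch changes to the return type of resolveApiKeyAuthOutcome. Use an explicit type or satisfies ApiKeyAuthOutcome instead, so the mock return values always stay in sync with the actual type definition.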
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/models/available-models-auth-outcome.test.ts` at line 157, The
test currently uses "as never" to cast mocked return values which disables
TypeScript checks; instead update the mocks passed to resolveApiKeyAuthOutcome
to use an explicit type or the "satisfies ApiKeyAuthOutcome" operator so the
returned shape matches ApiKeyAuthOutcome—locate the mock objects around
resolveApiKeyAuthOutcome calls (previously cast with "as never") and replace
those casts with either a typed annotation or "... } satisfies
ApiKeyAuthOutcome" to restore compile-time validation.
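A small sketch of the difference, assuming a simplified `ApiKeyAuthOutcome` (the `userId` field and the `describeOutcome` helper are hypothetical and not taken from the repository). The `satisfies` operator, available since TypeScript 4.9, checks the literal against the union at compile time, whereas `as never` silences the checker entirely.

```typescript
// Assumed, simplified version of the union in src/repository/key.ts.
type ApiKeyAuthOutcome =
  | { ok: true; userId: number }
  | { ok: false; reason: "not_found" | "key_disabled" | "key_expired" };

// Checked against the union: a typo such as reason: "key_disabeld", or a
// renamed member of the union, would fail to compile here.
const disabledOutcome = { ok: false, reason: "key_disabled" } satisfies ApiKeyAuthOutcome;

// Hypothetical consumer, standing in for code that receives the mock value.
function describeOutcome(outcome: ApiKeyAuthOutcome): string {
  return outcome.ok ? "ok" : outcome.reason;
}
```

The same literal cast with `as never` would be accepted by any parameter type, so a later change to the union would not surface in the test file.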


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9b790b2e-362d-4897-9836-7095e18cd202

📥 Commits

Reviewing files that changed from the base of the PR and between 26d3b9f and c983b64.

📒 Files selected for processing (11)

messages/en/errors.json
messages/ja/errors.json
messages/ru/errors.json
messages/zh-CN/errors.json
messages/zh-TW/errors.json
src/app/v1/_lib/models/available-models.ts
src/app/v1/_lib/proxy/auth-guard.ts
src/lib/utils/error-messages.ts
tests/unit/models/available-models-auth-outcome.test.ts
tests/unit/proxy/auth-guard-account-state.test.ts
tests/unit/proxy/auth-guard-precheck.test.ts

✅ Files skipped from review due to trivial changes (5)

messages/en/errors.json
messages/zh-TW/errors.json
messages/zh-CN/errors.json
messages/ja/errors.json
messages/ru/errors.json

🚧 Files skipped from review as they are similar to previous changes (4)

tests/unit/proxy/auth-guard-precheck.test.ts
src/app/v1/_lib/proxy/auth-guard.ts
src/app/v1/_lib/models/available-models.ts
tests/unit/proxy/auth-guard-account-state.test.ts

greptile flagged a P1: applying the same 30-day TTL to the three "current pointer" keys would dark out any deployment that goes longer than the TTL without publishing a new config. Pointer keys don't accumulate (only one entry per pointer name exists, overwritten atomically on every publish), so they MUST persist until explicitly overwritten.

Only the versioned snapshot keys (`public-status:v1:config:<version>` and `:config-internal:<version>`) keep the 30-day TTL; those are the ones that accumulate as new config versions are minted. The pointer publisher's Lua script and JS fallback now write bare `SET` without `EX`.

Tests assert the split: the versioned write carries `EX <ttl>` and the pointer write is a bare two-arg `set(key, value)` with no TTL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
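The split described in this commit message can be sketched with an in-memory stand-in for the Redis client. Everything here is illustrative: `RecordingRedis`, `publishConfigSnapshot`, and the pointer key name `public-status:v1:config:current` are assumptions, and the real code in src/lib/public-status/config-snapshot.ts also goes through a Lua script that this sketch omits.

```typescript
// Records each write so the TTL decision can be inspected.
interface RecordedWrite {
  key: string;
  value: string;
  ttlSeconds: number | null; // null means the key was written without expiry
}

// Hypothetical stand-in exposing the two set() shapes discussed in the review.
class RecordingRedis {
  readonly writes: RecordedWrite[] = [];

  set(key: string, value: string, mode?: "EX", seconds?: number): void {
    const ttlSeconds = mode === "EX" && seconds !== undefined ? seconds : null;
    this.writes.push({ key, value, ttlSeconds });
  }
}

const PUBLIC_STATUS_CONFIG_TTL_SECONDS = 30 * 24 * 60 * 60; // 30 days

function publishConfigSnapshot(
  redis: RecordingRedis,
  configVersion: string,
  snapshotJson: string
): void {
  const versionedKey = `public-status:v1:config:${configVersion}`;

  // Versioned keys accumulate (one per minted config version), so they expire.
  redis.set(versionedKey, snapshotJson, "EX", PUBLIC_STATUS_CONFIG_TTL_SECONDS);

  // The pointer is overwritten in place on every publish, so it gets no TTL.
  // Assumed key name, for illustration only.
  redis.set(
    "public-status:v1:config:current",
    JSON.stringify({ key: versionedKey, configVersion })
  );
}
```

On a real Redis server, a plain `SET` without `KEEPTTL` also discards any TTL previously attached to the key, so switching the pointer write back to a bare `SET` clears expiries left behind by the earlier revision the next time a config is published.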

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5cd75941a4


chatgpt-codex-connector · 2026-05-14T17:09:41Z

  if (redis) {
-    await redis.set(key, JSON.stringify(snapshot));
+    // Versioned snapshot key: TTL'd so old versions get cleaned up.
+    await redis.set(key, JSON.stringify(snapshot), "EX", PUBLIC_STATUS_CONFIG_TTL_SECONDS);


Keep latest public config snapshot from expiring

Applying EX PUBLIC_STATUS_CONFIG_TTL_SECONDS to the versioned config snapshot means the currently referenced snapshot disappears after 30 days if no config republish occurs. In the normal publish flow (publishCurrentPublicStatusConfigProjection), both publish*ConfigSnapshot calls pass setCurrentPointer: false, so reads depend on configVersion + public-status:v1:config:<version>; once that versioned key expires, readCurrentPublicStatusConfigSnapshot returns null and endpoints like /api/public-site-meta fall back to projection_missing until another rebuild/publish path runs.


github-actions · 2026-05-14T17:10:03Z

🧪 Test results

Test type	Status
Code quality	✅
Unit tests	✅
Integration tests	✅
API tests	✅

Overall result: ✅ All tests passed

ding113 and others added 2 commits May 14, 2026 23:35

github-project-automation Bot added this to Claude Code Hub Roadmap May 14, 2026

github-project-automation Bot moved this to Backlog in Claude Code Hub Roadmap May 14, 2026

github-actions Bot added bug Something isn't working area:Rate Limit area:provider area:core labels May 14, 2026

chatgpt-codex-connector Bot reviewed May 14, 2026

github-actions Bot added the size/L Large PR (< 1000 lines) label May 14, 2026

gemini-code-assist Bot reviewed May 14, 2026

coderabbitai Bot requested changes May 14, 2026

Comment thread src/app/v1/_lib/models/available-models.ts Outdated

Comment thread src/app/v1/_lib/proxy/auth-guard.ts Outdated

Comment thread src/lib/availability/availability-service.ts Outdated

greptile-apps Bot reviewed May 14, 2026

github-actions Bot reviewed May 14, 2026

ding113 changed the title from "fix: disabled-key 429 lockout + slow provider availability page" to "fix: disabled-key 429 lockout, slow availability page, public-status Redis leak" May 14, 2026

greptile-apps Bot reviewed May 14, 2026

chatgpt-codex-connector Bot reviewed May 14, 2026

coderabbitai Bot reviewed May 14, 2026

coderabbitai Bot approved these changes May 14, 2026

chatgpt-codex-connector Bot reviewed May 14, 2026

ding113 merged commit b0c9eaf into dev May 14, 2026
12 checks passed

github-project-automation Bot moved this from Backlog to Done in Claude Code Hub Roadmap May 14, 2026

This was referenced May 15, 2026

fix(availability): restore status_code IS NOT NULL terminal filter #1189

Merged

Optimize public status page Redis aggregation and polling performance #1211

Open


Greptile Summary

Confidence Score: 4/5
