fix(availability): restore status_code IS NOT NULL terminal filter by ding113 · Pull Request #1189 · ding113/claude-code-hub

ding113 · 2026-05-15T06:40:54Z

Related Issues & PRs

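Fixes "Availability monitoring causes 100% database CPU usage" #1168: the availability monitoring query drove PostgreSQL CPU to 100%; the root cause is that the finalized-state predicate could not hit the partial index.
Supersedes "Fix slow loading of availability monitoring" #1186: another version of the same fix, which currently has merge conflicts.
Related to "fix: disabled-key 429 lockout, slow availability page, public-status Redis leak" #1187: introduced the inlined finalized-state predicate that this PR reverts.
Related to "fix: availability monitoring counts only finalized requests" #1018: originally established the status_code IS NOT NULL finalized boundary and the partial index.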

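Background

The provider + time-range aggregation query for availability monitoring takes a long time to load. The hot path originally relied on a partial index on message_request: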

idx_message_request_provider_created_at_finalized_active
  (provider_id, created_at DESC)
  WHERE deleted_at IS NULL AND status_code IS NOT NULL

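The "finalized" check was recently changed to inline a copy of the fn_is_message_request_finalized(...) semantics (a row counts as finalized if any of status_code / blocked_by / errorMessage / providerChain is non-null). This causes two problems:

The SQL no longer has status_code IS NOT NULL as a separately provable predicate branch, so Postgres has difficulty proving that it implies the index predicate and often falls back to large-range scans;
It also treats "in-flight" records, which have only providerChain / errorMessage fragments written while statusCode is still NULL, as finalized. fn_compute_message_request_success_rate_outcome(...) then counts these intermediate in-flight states as failure, polluting the availability data.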
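
The second problem can be illustrated with a small in-memory sketch. It is not the project's code: the row shape, the `retry_failed` reason value, and both function names are assumptions made for illustration, and the "any signal" function is a simplified stand-in for the inlined predicate rather than a faithful copy of it.

```typescript
// Hypothetical row shape; the field names mirror the columns named above.
interface MessageRequestRow {
  statusCode: number | null;
  blockedBy: string | null;
  errorMessage: string | null;
  providerChain: { reason?: string }[] | null;
}

// Restored boundary: a row is finalized only once status_code is written.
function isFinalizedByStatusCode(row: MessageRequestRow): boolean {
  return row.statusCode !== null;
}

// Simplified stand-in for the inlined predicate: any one signal is enough.
function isFinalizedByAnySignal(row: MessageRequestRow): boolean {
  return (
    row.statusCode !== null ||
    row.blockedBy !== null ||
    (row.errorMessage ?? "") !== "" ||
    (row.providerChain?.length ?? 0) > 0
  );
}

// An in-flight request: a providerChain fragment exists, status_code does not.
const inFlightRow: MessageRequestRow = {
  statusCode: null,
  blockedBy: null,
  errorMessage: null,
  providerChain: [{ reason: "retry_failed" }], // hypothetical reason value
};

console.log(isFinalizedByStatusCode(inFlightRow)); // false
console.log(isFinalizedByAnySignal(inFlightRow)); // true
```

Under these assumptions, only the broader check lets the in-flight row reach the outcome classification, which is where it would be counted as a failure.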

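Changes

Restore buildAvailabilityFinalizedCondition to status_code IS NOT NULL, realigning it with the partial index, and drop the dependency on the inlined finalized predicate (along with the jsonb / CASE complexity it brought and the FINALIZED_PROVIDER_CHAIN_REASONS_SQL constant).
Success/failure/excluded classification of finalized records still goes through fn_compute_message_request_success_rate_outcome(...); that logic is unchanged.
Unit tests (TDD): first write the assertions "must contain status_code IS NOT NULL, must no longer contain fn_is_message_request_finalized or the inlined blocked_by / provider_chain reason fragments, and classification is still done by the outcome function" so that they fail (red), then return to the production code to turn the tests green.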

Test plan

bun run typecheck
bun run lint
bun run build
bun run test — 5969/5982 passed (13 skipped)
bunx vitest run --config tests/configs/{integration,my-usage,quota,proxy-guard-pipeline,public-status.integration,thinking-signature-rectifier,codex-session-id-completer,include-session-id-in-errors,logs-sessionid-time-filter,usage-logs-sessionid-search}.config.ts all exit 0
bunx vitest run tests/unit/lib/availability-service.test.ts — 19/19 passed
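e2e (tests/configs/e2e.config.ts) did not run successfully locally: it depends on a dev server on port 13500 (pre-existing; the same happens on origin/dev) and is unrelated to this change
Pending regression check: in an environment with a database, run EXPLAIN on the availability aggregation query over a large time range and confirm that the partial index is hit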

🤖 Generated with Claude Code

Greptile Summary

This PR restores the buildAvailabilityFinalizedCondition function to the simple status_code IS NOT NULL predicate, replacing the previously inlined multi-branch OR expression that mirrored fn_is_message_request_finalized. The simplified predicate re-aligns with the partial index idx_message_request_provider_created_at_finalized_active and stops partially-written records (e.g. rows with only providerChain/errorMessage but no status_code) from being pulled into availability aggregation as false failures.

buildAvailabilityFinalizedCondition is reduced to one line (isNotNull(messageRequest.statusCode)), removing ~50 lines of JSONB CASE guards and the FINALIZED_PROVIDER_CHAIN_REASONS_SQL constant.
Tests gain a shared expectStatusCodeOnlyFinalizedBoundary helper that asserts the four contract invariants (index-aligned predicate present; blocked_by, error_message, and provider_chain paths absent) and is applied consistently across all relevant test cases.

Confidence Score: 5/5

Safe to merge — the change narrows the finalized-request filter to a single SARGable predicate that matches the existing partial index, and all 19 unit tests pass.

The production change is a straightforward simplification: a 50-line multi-branch SQL expression is replaced by a single Drizzle call, and every call site remains unchanged. The new predicate is a strict subset of the old one, which is intentional and well-documented. Tests cover the boundary contract with a shared helper applied consistently across four test cases.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/lib/availability/availability-service.ts | Simplifies buildAvailabilityFinalizedCondition to isNotNull(messageRequest.statusCode), aligning the WHERE predicate with the partial index and removing ~50 lines of complex JSONB logic; all call sites are unchanged. |
| tests/unit/lib/availability-service.test.ts | Adds expectStatusCodeOnlyFinalizedBoundary helper and updates four test assertions to match the simplified finalized condition; also adds fn_compute_message_request_success_rate_outcome presence checks. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming message_request rows] --> B{deleted_at IS NULL?}
    B -- No --> SKIP1[Excluded — soft-deleted]
    B -- Yes --> C{status_code IS NOT NULL?\nbuildAvailabilityFinalizedCondition}
    C -- No --> SKIP2[Excluded — in-progress / partial write]
    C -- Yes --> D[Finalized requests CTE]
    D --> E[fn_compute_message_request_success_rate_outcome\nblocked_by, status_code, error_message, provider_chain]
    E --> F{outcome}
    F -- success --> GREEN[greenCount]
    F -- failure --> RED[redCount]
    F -- excluded --> EXCL[Excluded from counts]
    GREEN & RED --> AGG[Bucket aggregation\nprovider_bucket_stats]
    AGG --> RESULT[AvailabilityQueryResult]
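
The flow above can be approximated in memory as follows. This is a sketch under stated assumptions, not the service's implementation: the real filtering and classification happen in SQL, and `AvailabilityRow`, `countOutcomes`, and `toyClassify` are hypothetical names. The classifier is passed in as a parameter because the internals of fn_compute_message_request_success_rate_outcome are not shown in this PR.

```typescript
type Outcome = "success" | "failure" | "excluded";

// Hypothetical row shape with only the fields the two filters need.
interface AvailabilityRow {
  deletedAt: Date | null;
  statusCode: number | null;
}

interface OutcomeCounts {
  greenCount: number;
  redCount: number;
}

// Apply the two filters from the flowchart, then classify and count.
function countOutcomes(
  rows: AvailabilityRow[],
  classify: (row: AvailabilityRow) => Outcome,
): OutcomeCounts {
  const counts: OutcomeCounts = { greenCount: 0, redCount: 0 };
  for (const row of rows) {
    if (row.deletedAt !== null) continue; // soft-deleted
    if (row.statusCode === null) continue; // in-progress / partial write
    const outcome = classify(row);
    if (outcome === "success") counts.greenCount += 1;
    else if (outcome === "failure") counts.redCount += 1;
    // "excluded" rows contribute to neither count
  }
  return counts;
}

// Toy classifier, assumed for illustration only; the real rules live in SQL.
const toyClassify = (row: AvailabilityRow): Outcome =>
  row.statusCode !== null && row.statusCode < 400 ? "success" : "failure";

const demoCounts = countOutcomes(
  [
    { deletedAt: null, statusCode: 200 },
    { deletedAt: null, statusCode: 500 },
    { deletedAt: null, statusCode: null }, // in flight, skipped
    { deletedAt: new Date(0), statusCode: 200 }, // soft-deleted, skipped
  ],
  toyClassify,
);
console.log(demoCounts); // { greenCount: 1, redCount: 1 }
```

The point of the ordering is that the classifier never sees a row whose status_code is still NULL, so an in-flight request cannot be counted as red.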

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
tests/unit/lib/availability-service.test.ts:59-61
The `error_message` regression guard checks for the IS NOT NULL form, but the original code used `COALESCE("error_message", '') <> ''` — a form that would never match this assertion. If the COALESCE pattern were reintroduced, this check would silently pass. Widening the guard to reject any occurrence of `"error_message"` in the finalized predicate closes that gap.

```suggestion
  expect(sqlText).not.toContain(`"blocked_by" is not null`);
  expect(sqlText).not.toContain(`"error_message"`);
  expect(sqlText).not.toContain(`"provider_chain" -> -1 ->> 'reason'`);
```
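
The gap described in this issue can be reproduced with plain string checks. The sketch below does not use the project's test helper; `regressedSql` is a hypothetical rendering of a regressed predicate, and the two guard functions are illustrative names.

```typescript
// Hypothetical SQL text for a regressed predicate that uses the COALESCE form.
const regressedSql =
  `"status_code" is not null or COALESCE("error_message", '') <> ''`;

// The restored, status-code-only predicate.
const restoredSql = `"status_code" is not null`;

// Narrow guard: only notices the IS NOT NULL spelling.
function narrowGuardPasses(sqlText: string): boolean {
  return !sqlText.includes(`"error_message" is not null`);
}

// Wide guard: rejects any mention of the column.
function wideGuardPasses(sqlText: string): boolean {
  return !sqlText.includes(`"error_message"`);
}

console.log(narrowGuardPasses(regressedSql)); // true: the regression slips through
console.log(wideGuardPasses(regressedSql)); // false: the regression is caught
console.log(wideGuardPasses(restoredSql)); // true
```

One caveat: the wide guard is only safe if the asserted text is limited to the finalized predicate. According to the flowchart, the outcome function receives error_message as an argument, so SQL text that also contains that call would presumably fail the wide guard even when the predicate is correct.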

Reviews (2): Last reviewed commit: "test(availability): factor finalized-bou..."

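The provider + time-range aggregation for availability monitoring now uses status_code IS NOT NULL to define the finalized boundary, aligning with the partial index idx_message_request_provider_created_at_finalized_active (deleted_at IS NULL AND status_code IS NOT NULL) so that the hot-path query can reliably hit the index. It also avoids counting "in-flight" records, which have only providerChain / errorMessage fragments written while statusCode is still NULL, in the aggregation (the previous inlined finalized predicate, which replicated the fn_is_message_request_finalized semantics, led the classification function to miscount them as failure). Success/failure/excluded classification of finalized records is still done by fn_compute_message_request_success_rate_outcome(...). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>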

coderabbitai · 2026-05-15T06:41:06Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5f68c3e4-bb14-4a0d-badd-7470a15d03e0

📥 Commits

Reviewing files that changed from the base of the PR and between 21885f2 and 0fd90a0.

📒 Files selected for processing (1)

tests/unit/lib/availability-service.test.ts

📝 Walkthrough

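Rewrites the "finalized" check in availability-service from a complex combination of providerChain/blockedBy/errorMessage logic to rely solely on message_request.status_code being non-null, and updates the SQL assertions in the related unit tests to match the new boundary.

Changes

Finalized-condition logic simplification

| Layer / File(s) | Summary |
| --- | --- |
| Core finalized-condition rewrite `src/lib/availability/availability-service.ts` | Adds an `isNotNull` import, removes the `FINALIZED_PROVIDER_CHAIN_REASONS` constant, and replaces the complex SQL condition in `buildAvailabilityFinalizedCondition`, previously based on `blockedBy`/`errorMessage`/`providerChain`, with a single `isNotNull(messageRequest.statusCode)`; the comments are updated as well. |
| Finalized-condition test assertion updates `tests/unit/lib/availability-service.test.ts` | Adds `expectStatusCodeOnlyFinalizedBoundary` and, across several scenarios (aggregate statistics, excluding in-progress requests, preserving Gemini passthrough, keeping intermediate states out of red, getCurrentProviderStatus short-window aggregation), unifies the finalized-boundary assertions to contain only `"status_code" is not null`, removing the checks for old fragments such as `fn_is_message_request_finalized`, `blocked_by`, and the `provider_chain` reason. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

ding113/claude-code-hub#1018: likewise narrows the "finalized" check to message_request.statusCode IS NOT NULL in availability-service.ts and the related tests.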

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

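| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title is clear and accurate; it states that status_code IS NOT NULL is restored as the finalized filter condition, which corresponds directly to the main change. |
| Description check | ✅ Passed | The PR description details the background, the changes, the test plan, and the related issue links, and is closely tied to the code changes. |
| Linked Issues check | ✅ Passed | The PR resolves the database CPU usage problem in #1168 by restoring the status_code IS NOT NULL predicate to align with the partial index, preventing Postgres from performing wide table scans. |
| Out of Scope Changes check | ✅ Passed | All changes focus on restoring buildAvailabilityFinalizedCondition and the corresponding unit test adjustments; there are no out-of-scope or unrelated changes. |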


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

gemini-code-assist

Code Review

This pull request simplifies the 'finalized request' logic in the availability service by replacing a complex multi-field SQL condition with a single status_code IS NOT NULL check. This change optimizes database performance by aligning with the existing partial index and prevents in-progress requests from being misclassified as failures. Unit tests have been updated to verify the simplified SQL generation and ensure that outcome classification still functions correctly. I have no feedback to provide as there were no review comments to assess.

coderabbitai

🧹 Nitpick comments (1)

tests/unit/lib/availability-service.test.ts (1)

312-325: ⚡ Quick win

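Include error_message in the negative assertions as well.

These test cases currently rule out only the blocked_by and provider_chain paths. If the finalized condition regressed to "status_code" is not null or "error_message" is not null, the tests would still pass, but records with only errorMessage written and statusCode still NULL would again be included in the availability statistics. Consider extracting this group of assertions into a helper and adding a negative assertion for "error_message" is not null.

The assertions could be tightened like this: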

+function expectStatusCodeOnlyFinalizedBoundary(sqlText: string) {
+  expect(sqlText).toContain(`"status_code" is not null`);
+  expect(sqlText).not.toContain(`"blocked_by" is not null`);
+  expect(sqlText).not.toContain(`"error_message" is not null`);
+  expect(sqlText).not.toContain(`"provider_chain" -> -1 ->> 'reason'`);
+}
+
 ...
-    expect(finalizedRequestsSql).toContain(`"status_code" is not null`);
-    expect(finalizedRequestsSql).not.toContain(`"blocked_by" is not null`);
-    expect(finalizedRequestsSql).not.toContain(`"provider_chain" -> -1 ->> 'reason'`);
+    expectStatusCodeOnlyFinalizedBoundary(finalizedRequestsSql);

Also applies to: 494-499, 533-535, 568-575, 742-747

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/lib/availability-service.test.ts` around lines 312 - 325, The
tests assert finalizedRequestsSql and queryText must exclude certain
non-finalized paths but miss `"error_message" is not null`; update the
assertions (and extract them into a reusable helper) so each check for
finalizedRequestsSql includes
expect(finalizedRequestsSql).not.toContain(`"error_message" is not null`)
alongside the existing not.toContain checks for `"blocked_by" is not null` and
`"provider_chain" -> -1 ->> 'reason'`, and reuse that helper across all affected
cases (references: finalizedRequestsSql, queryText,
fn_is_message_request_finalized,
fn_compute_message_request_success_rate_outcome, and the `"status_code" is not
null` positive assertion).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a4f203b-de64-4fd8-9615-6e05e5f09567

📥 Commits

Reviewing files that changed from the base of the PR and between b0c9eaf and 21885f2.

📒 Files selected for processing (2)

src/lib/availability/availability-service.ts
tests/unit/lib/availability-service.test.ts

github-actions · 2026-05-15T06:47:08Z

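🧪 Test results

| Test type | Status |
| --- | --- |
| Code quality | ✅ |
| Unit tests | ✅ |
| Integration tests | ✅ |
| API tests | ✅ |

Overall result: ✅ All tests passed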

github-actions

Code Review Summary

No significant issues identified in this PR. The change correctly restores status_code IS NOT NULL as the terminal-state filter, aligning the availability query with the partial index predicate and eliminating the CPU spike caused by planner-opaque inlined SQL. Tests were TDD'd and validate the new boundary behavior.

PR Size: S

Lines changed: 138 (38 additions, 100 deletions)
Files changed: 2

Review Coverage

Logic and correctness - Clean. The simplified predicate aligns with idx_message_request_provider_created_at_finalized_active and correctly excludes in-flight records that were previously misclassified as failures.
Security (OWASP Top 10) - Clean. No user input reaches raw SQL; all parameters remain properly escaped through Drizzle ORM.
Error handling - Clean. No new catch blocks or error paths introduced.
Type safety - Clean. isNotNull is a standard Drizzle type-safe operator, consistent with existing isNull usage in the same file.
Documentation accuracy - Clean. The new JSDoc accurately explains the rationale, known limitations, and why the function-call approach was abandoned.
Test coverage - Adequate. Tests assert both the new predicate and the absence of the old complex SQL; edge cases (Gemini passthrough, intermediate persistence state) are covered.
Code clarity - Good. Removing ~60 lines of opaque SQL and the FINALIZED_PROVIDER_CHAIN_REASONS constants dramatically improves maintainability.

Automated review by Claude AI

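…ssage Extracts the 5 assertions that "the finalized boundary looks only at status_code IS NOT NULL" into the expectStatusCodeOnlyFinalizedBoundary helper and adds a negative assertion for "error_message" is not null: previously only the blocked_by / provider_chain paths were excluded, so if finalized ever regressed to "status_code" is not null or "error_message" is not null, "in-flight" records with only errorMessage written would be included in the availability statistics again without being detected. Addresses CodeRabbit nitpick on #1189. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>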

github-actions

Code Review Summary

No significant issues identified in this PR.

PR Size: S

Lines changed: 138
Files changed: 2

Review Coverage

Automated review by Codex AI

github-actions · 2026-05-15T07:09:37Z

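🧪 Test results

| Test type | Status |
| --- | --- |
| Code quality | ✅ |
| Unit tests | ✅ |
| Integration tests | ✅ |
| API tests | ✅ |

Overall result: ✅ All tests passed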

github-actions

Identified PR #1189 (fix(availability): restore status_code IS NOT NULL terminal filter) and reviewed the full diff for changed files: src/lib/availability/availability-service.ts and tests/unit/lib/availability-service.test.ts.
Computed PR size as S (38 additions, 100 deletions; 138 total lines changed across 2 files) and applied label size/S to the PR.
Ran a diff-scoped, multi-perspective review (comments, tests, error handling, types, logic/security/perf, simplification) and found no issues that met the reporting threshold (>= 80 confidence) after full-file/context validation.
Posted the required PR review summary via gh pr review --comment (no inline comments, since there were no validated reportable diff-line issues).

github-project-automation Bot added this to Claude Code Hub Roadmap May 15, 2026

github-project-automation Bot moved this to Backlog in Claude Code Hub Roadmap May 15, 2026

github-actions Bot added bug Something isn't working area:statistics labels May 15, 2026

gemini-code-assist Bot reviewed May 15, 2026

coderabbitai Bot reviewed May 15, 2026

coderabbitai Bot approved these changes May 15, 2026

github-actions Bot added the size/S Small PR (< 200 lines) label May 15, 2026

github-actions Bot approved these changes May 15, 2026

github-actions Bot reviewed May 15, 2026

ding113 merged commit 18b8e60 into dev May 15, 2026
10 checks passed

github-project-automation Bot moved this from Backlog to Done in Claude Code Hub Roadmap May 15, 2026

ding113 merged 2 commits into dev from worktree-fix-availability-monitor-index
