fix(availability): restore status_code IS NOT NULL terminal filter #1189

Merged
ding113 merged 2 commits into dev from worktree-fix-availability-monitor-index on May 15, 2026
Conversation

@ding113 ding113 commented May 15, 2026

Related Issues & PRs

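Background

The provider + time-range aggregation query behind availability monitoring takes a long time to load. The hot path originally relied on a partial index on message_request: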

idx_message_request_provider_created_at_finalized_active
  (provider_id, created_at DESC)
  WHERE deleted_at IS NULL AND status_code IS NOT NULL

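The "finalized" check was recently changed to inline the semantics of fn_is_message_request_finalized(...) (a row counts as finalized when any of status_code / blocked_by / errorMessage / providerChain is non-empty). This causes two problems:

  1. The SQL no longer carries status_code IS NOT NULL as a separately provable predicate branch, so Postgres struggles to determine that it implies the index predicate and often degrades to scanning a wide range;
  2. "In-flight" records that have only a providerChain / errorMessage fragment written, while statusCode is still NULL, are also treated as finalized. fn_compute_message_request_success_rate_outcome(...) then counts these intermediate states as failure, polluting the availability data.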
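
The first problem can be illustrated with a toy model of predicate implication. This is only a sketch and not Postgres's actual prover: it assumes the WHERE clause is an AND of OR-groups and treats an index conjunct as proven only when some group consists solely of that conjunct. The types, the function name, and the simplified shape of the inlined predicate are hypothetical.

```typescript
// Toy model (an assumption for illustration, not Postgres's real algorithm):
// a WHERE clause is an AND of groups, and each group is an OR of atoms.
type OrGroup = string[];
type WhereClause = OrGroup[];

// Conjuncts of the partial index predicate quoted above.
const indexPredicate: string[] = ["deleted_at IS NULL", "status_code IS NOT NULL"];

// An index conjunct counts as proven only if some AND-ed group contains
// nothing but that conjunct; a wider OR group lets other rows through.
function impliesIndexPredicate(where: WhereClause, index: string[]): boolean {
  return index.every((conjunct) =>
    where.some((group) => group.length > 0 && group.every((atom) => atom === conjunct)),
  );
}

// Simplified shape of the inlined finalized predicate: any of four signals.
const inlinedWhere: WhereClause = [
  ["deleted_at IS NULL"],
  [
    "status_code IS NOT NULL",
    "blocked_by IS NOT NULL",
    "error_message IS NOT NULL",
    "provider_chain IS NOT NULL",
  ],
];

// Shape after this PR: status_code IS NOT NULL stands alone.
const restoredWhere: WhereClause = [["deleted_at IS NULL"], ["status_code IS NOT NULL"]];

console.log(impliesIndexPredicate(inlinedWhere, indexPredicate)); // false
console.log(impliesIndexPredicate(restoredWhere, indexPredicate)); // true
```

In this model the wider OR group admits rows whose status_code is NULL, which is exactly the set of rows the partial index does not contain.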

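Changes

  • buildAvailabilityFinalizedCondition is restored to status_code IS NOT NULL, realigning with the partial index and dropping the dependency on the inlined finalized predicate (along with the jsonb / CASE complexity it brought and the FINALIZED_PROVIDER_CHAIN_REASONS_SQL constant).
  • Success / failure / excluded classification of finalized records still goes through fn_compute_message_request_success_rate_outcome(...); the logic is unchanged.
  • Unit tests (TDD): first write the assertions "must contain status_code IS NOT NULL, must no longer contain the inlined fn_is_message_request_finalized / blocked_by / provider_chain reason fragments, and classification must still be done by the outcome function" until they fail (red), then return to the production code to turn the tests green.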

Test plan

  • bun run typecheck
  • bun run lint
  • bun run build
  • bun run test — 5969/5982 passed (13 skipped)
  • bunx vitest run --config tests/configs/{integration,my-usage,quota,proxy-guard-pipeline,public-status.integration,thinking-signature-rectifier,codex-session-id-completer,include-session-id-in-errors,logs-sessionid-time-filter,usage-logs-sessionid-search}.config.ts: all exit 0
  • bunx vitest run tests/unit/lib/availability-service.test.ts — 19/19 passed
  • e2e (tests/configs/e2e.config.ts) could not be run locally: it depends on a dev server on port 13500 (pre-existing; the same holds on origin/dev) and is unrelated to this change
  • Pending regression check: in an environment with a database, run EXPLAIN on the availability aggregation query over a large time range and confirm the partial index is hit
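
For the pending EXPLAIN check, a small helper along these lines could scan the text plan for the partial index. It is a sketch only: planUsesIndex is a hypothetical name, and the two plan strings are hand-written stand-ins in the usual EXPLAIN node format, not output captured from this database.

```typescript
// Returns true when an EXPLAIN text plan names the given index in an
// index-scan node ("Index Scan", "Index Only Scan", "Bitmap Index Scan").
// Only node labels are matched; costs and row estimates are ignored.
function planUsesIndex(planText: string, indexName: string): boolean {
  return planText
    .split("\n")
    .some((line) => /Index (Only )?Scan/.test(line) && line.includes(indexName));
}

const indexName = "idx_message_request_provider_created_at_finalized_active";

// Hand-written examples of the two plan shapes of interest (assumed, not captured).
const indexedPlan = `Index Scan using ${indexName} on message_request`;
const sequentialPlan = "Seq Scan on message_request";

console.log(planUsesIndex(indexedPlan, indexName)); // true
console.log(planUsesIndex(sequentialPlan, indexName)); // false
```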

🤖 Generated with Claude Code

Greptile Summary

This PR restores the buildAvailabilityFinalizedCondition function to the simple status_code IS NOT NULL predicate, replacing the previously inlined multi-branch OR expression that mirrored fn_is_message_request_finalized. The simplified predicate re-aligns with the partial index idx_message_request_provider_created_at_finalized_active and stops partially-written records (e.g. rows with only providerChain/errorMessage but no status_code) from being pulled into availability aggregation as false failures.

  • buildAvailabilityFinalizedCondition is reduced to one line (isNotNull(messageRequest.statusCode)), removing ~50 lines of JSONB CASE guards and the FINALIZED_PROVIDER_CHAIN_REASONS_SQL constant.
  • Tests gain a shared expectStatusCodeOnlyFinalizedBoundary helper that asserts the four contract invariants (index-aligned predicate present; blocked_by, error_message, and provider_chain paths absent) and is applied consistently across all relevant test cases.

Confidence Score: 5/5

Safe to merge — the change narrows the finalized-request filter to a single SARGable predicate that matches the existing partial index, and all 19 unit tests pass.

The production change is a straightforward simplification: a 50-line multi-branch SQL expression is replaced by a single Drizzle call, and every call site remains unchanged. The new predicate is a strict subset of the old one, which is intentional and well-documented. Tests cover the boundary contract with a shared helper applied consistently across four test cases.

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| src/lib/availability/availability-service.ts | Simplifies buildAvailabilityFinalizedCondition to isNotNull(messageRequest.statusCode), aligning the WHERE predicate with the partial index and removing ~50 lines of complex JSONB logic; all call sites are unchanged. |
| tests/unit/lib/availability-service.test.ts | Adds expectStatusCodeOnlyFinalizedBoundary helper and updates four test assertions to match the simplified finalized condition; also adds fn_compute_message_request_success_rate_outcome presence checks. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming message_request rows] --> B{deleted_at IS NULL?}
    B -- No --> SKIP1[Excluded — soft-deleted]
    B -- Yes --> C{status_code IS NOT NULL?\nbuildAvailabilityFinalizedCondition}
    C -- No --> SKIP2[Excluded — in-progress / partial write]
    C -- Yes --> D[Finalized requests CTE]
    D --> E[fn_compute_message_request_success_rate_outcome\nblocked_by, status_code, error_message, provider_chain]
    E --> F{outcome}
    F -- success --> GREEN[greenCount]
    F -- failure --> RED[redCount]
    F -- excluded --> EXCL[Excluded from counts]
    GREEN & RED --> AGG[Bucket aggregation\nprovider_bucket_stats]
    AGG --> RESULT[AvailabilityQueryResult]
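
The flow above can be mirrored by a small in-memory model. This is a sketch under stated assumptions: RequestRow, aggregate, and assumedClassifier are hypothetical names, and the 2xx-means-success rule is only a stand-in for fn_compute_message_request_success_rate_outcome, whose real rules live in the database.

```typescript
// Hypothetical in-memory row shape; the real table has more columns.
interface RequestRow {
  deletedAt: Date | null;
  statusCode: number | null;
  errorMessage: string | null;
}

type Outcome = "success" | "failure" | "excluded";

interface Counts {
  greenCount: number;
  redCount: number;
}

// Applies the two filters from the flowchart, then delegates classification.
function aggregate(rows: RequestRow[], classify: (row: RequestRow) => Outcome): Counts {
  const counts: Counts = { greenCount: 0, redCount: 0 };
  for (const row of rows) {
    if (row.deletedAt !== null) continue; // soft-deleted
    if (row.statusCode === null) continue; // in-progress / partial write
    const outcome = classify(row);
    if (outcome === "success") counts.greenCount += 1;
    else if (outcome === "failure") counts.redCount += 1;
    // "excluded" rows contribute to neither count
  }
  return counts;
}

// Assumed classifier for illustration only: 2xx is success, anything else failure.
const assumedClassifier = (row: RequestRow): Outcome =>
  row.statusCode !== null && row.statusCode >= 200 && row.statusCode < 300
    ? "success"
    : "failure";

const sample: RequestRow[] = [
  { deletedAt: null, statusCode: 200, errorMessage: null },
  { deletedAt: null, statusCode: 502, errorMessage: "upstream error" },
  { deletedAt: null, statusCode: null, errorMessage: "partial write" }, // still in flight
  { deletedAt: new Date(0), statusCode: 200, errorMessage: null }, // soft-deleted
];

const result = aggregate(sample, assumedClassifier);
console.log(result); // { greenCount: 1, redCount: 1 }
```

The in-flight row never reaches the classifier, so it cannot be counted as a failure.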
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
tests/unit/lib/availability-service.test.ts:59-61
The `error_message` regression guard checks for the IS NOT NULL form, but the original code used `COALESCE("error_message", '') <> ''` — a form that would never match this assertion. If the COALESCE pattern were reintroduced, this check would silently pass. Widening the guard to reject any occurrence of `"error_message"` in the finalized predicate closes that gap.

```suggestion
  expect(sqlText).not.toContain(`"blocked_by" is not null`);
  expect(sqlText).not.toContain(`"error_message"`);
  expect(sqlText).not.toContain(`"provider_chain" -> -1 ->> 'reason'`);
```
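
The gap described in this issue can be reproduced with plain string checks. The SQL text below is a hypothetical regressed fragment, written in lower case to match the style of the test strings; it is not taken from the repository.

```typescript
// Hypothetical SQL text in which the COALESCE form has crept back in.
const regressedSql = `"status_code" is not null or coalesce("error_message", '') <> ''`;

// Narrow guard: only rejects the literal IS NOT NULL form.
const narrowGuardPasses = !regressedSql.includes(`"error_message" is not null`);

// Wide guard: rejects any mention of the column in the finalized predicate.
const wideGuardPasses = !regressedSql.includes(`"error_message"`);

console.log(narrowGuardPasses); // true  (the regression goes unnoticed)
console.log(wideGuardPasses); // false (the regression is caught)
```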

Reviews (2): Last reviewed commit: "test(availability): factor finalized-bou..."

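The provider + time-range aggregation for availability monitoring now uses
status_code IS NOT NULL as the finalized boundary, aligning with the partial
index idx_message_request_provider_created_at_finalized_active
(deleted_at IS NULL AND status_code IS NOT NULL) so that hot-path queries
hit the index reliably. It also avoids counting "in-flight" records that have
only a providerChain / errorMessage fragment written while statusCode is still
NULL (the earlier inlined finalized predicate, which replicated
fn_is_message_request_finalized semantics, caused the classification function
to miscount them as failure). Success / failure / excluded classification of
finalized records is still performed by
fn_compute_message_request_success_rate_outcome(...).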

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented May 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5f68c3e4-bb14-4a0d-badd-7470a15d03e0

📥 Commits

Reviewing files that changed from the base of the PR and between 21885f2 and 0fd90a0.

📒 Files selected for processing (1)
  • tests/unit/lib/availability-service.test.ts

📝 Walkthrough

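Rewrites the "finalized" check in availability-service from the complex providerChain/blockedBy/errorMessage combination logic to rely solely on message_request.status_code being non-null, and updates the SQL assertions in the related unit tests to match the new boundary.

Changes

Simplified finalized-check logic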

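| Layer / File(s) | Summary |
|---|---|
| Core finalized condition rewrite (src/lib/availability/availability-service.ts) | Adds an isNotNull import, removes the FINALIZED_PROVIDER_CHAIN_REASONS constant, and replaces the complex SQL condition in buildAvailabilityFinalizedCondition, previously based on blockedBy/errorMessage/providerChain, with a single isNotNull(messageRequest.statusCode); the comments are updated accordingly. |
| Finalized-check test assertion updates (tests/unit/lib/availability-service.test.ts) | Adds expectStatusCodeOnlyFinalizedBoundary and, across several scenarios (aggregate statistics, excluding in-progress requests, keeping Gemini passthrough, keeping intermediate states out of red, getCurrentProviderStatus short-window aggregation), unifies the finalized-boundary assertions to contain only "status_code" is not null, removing checks for old fragments such as fn_is_message_request_finalized, blocked_by, and provider_chain reason. |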

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • ding113/claude-code-hub#1018: likewise narrows the "finalized" check to message_request.statusCode IS NOT NULL in availability-service.ts and the related tests
🚥 Pre-merge checks | ✅ 4 | ❌ 1

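❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title is clear and accurate; it explicitly states that status_code IS NOT NULL is restored as the finalized filter, directly matching the main change. |
| Description check | ✅ Passed | The PR description details the background, the changes, the test plan, and related issue links, and closely matches the code changes. |
| Linked Issues check | ✅ Passed | The PR resolves the database CPU usage problem in #1168 by restoring the status_code IS NOT NULL predicate to align with the partial index, stopping Postgres from performing wide table scans. |
| Out of Scope Changes check | ✅ Passed | All changes focus on restoring buildAvailabilityFinalizedCondition and the corresponding unit test adjustments; there are no out-of-scope or unrelated changes. |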

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@github-actions github-actions Bot added the bug (Something isn't working) and area:statistics labels May 15, 2026
@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request simplifies the 'finalized request' logic in the availability service by replacing a complex multi-field SQL condition with a single status_code IS NOT NULL check. This change optimizes database performance by aligning with the existing partial index and prevents in-progress requests from being misclassified as failures. Unit tests have been updated to verify the simplified SQL generation and ensure that outcome classification still functions correctly. I have no feedback to provide as there were no review comments to assess.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
tests/unit/lib/availability-service.test.ts (1)

312-325: ⚡ Quick win

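Include error_message in the negative assertions as well.

These cases currently only exclude the blocked_by / provider_chain paths; if the finalized condition regressed to "status_code" is not null or "error_message" is not null, the tests would still pass, yet records with only errorMessage written and statusCode still NULL would be pulled back into the availability statistics. Consider extracting this group of assertions into a helper and adding a negative assertion for "error_message" is not null.

The assertions could be tightened like this: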
+function expectStatusCodeOnlyFinalizedBoundary(sqlText: string) {
+  expect(sqlText).toContain(`"status_code" is not null`);
+  expect(sqlText).not.toContain(`"blocked_by" is not null`);
+  expect(sqlText).not.toContain(`"error_message" is not null`);
+  expect(sqlText).not.toContain(`"provider_chain" -> -1 ->> 'reason'`);
+}
+
 ...
-    expect(finalizedRequestsSql).toContain(`"status_code" is not null`);
-    expect(finalizedRequestsSql).not.toContain(`"blocked_by" is not null`);
-    expect(finalizedRequestsSql).not.toContain(`"provider_chain" -> -1 ->> 'reason'`);
+    expectStatusCodeOnlyFinalizedBoundary(finalizedRequestsSql);

Also applies to: 494-499, 533-535, 568-575, 742-747

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/lib/availability-service.test.ts` around lines 312 - 325, The
tests assert finalizedRequestsSql and queryText must exclude certain
non-finalized paths but miss `"error_message" is not null`; update the
assertions (and extract them into a reusable helper) so each check for
finalizedRequestsSql includes
expect(finalizedRequestsSql).not.toContain(`"error_message" is not null`)
alongside the existing not.toContain checks for `"blocked_by" is not null` and
`"provider_chain" -> -1 ->> 'reason'`, and reuse that helper across all affected
cases (references: finalizedRequestsSql, queryText,
fn_is_message_request_finalized,
fn_compute_message_request_success_rate_outcome, and the `"status_code" is not
null` positive assertion).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a4f203b-de64-4fd8-9615-6e05e5f09567

📥 Commits

Reviewing files that changed from the base of the PR and between b0c9eaf and 21885f2.

📒 Files selected for processing (2)
  • src/lib/availability/availability-service.ts
  • tests/unit/lib/availability-service.test.ts

@github-actions github-actions Bot added the size/S (Small PR, < 200 lines) label May 15, 2026
@github-actions

🧪 Test results

Test type Status
Code quality
Unit tests
Integration tests
API tests

Overall result: ✅ All tests passed

@github-actions github-actions Bot left a comment

Code Review Summary

No significant issues identified in this PR. The change correctly restores status_code IS NOT NULL as the terminal-state filter, aligning the availability query with the partial index predicate and eliminating the CPU spike caused by planner-opaque inlined SQL. Tests were TDD'd and validate the new boundary behavior.

PR Size: S

  • Lines changed: 138 (38 additions, 100 deletions)
  • Files changed: 2

Review Coverage

  • Logic and correctness - Clean. The simplified predicate aligns with idx_message_request_provider_created_at_finalized_active and correctly excludes in-flight records that were previously misclassified as failures.
  • Security (OWASP Top 10) - Clean. No user input reaches raw SQL; all parameters remain properly escaped through Drizzle ORM.
  • Error handling - Clean. No new catch blocks or error paths introduced.
  • Type safety - Clean. isNotNull is a standard Drizzle type-safe operator, consistent with existing isNull usage in the same file.
  • Documentation accuracy - Clean. The new JSDoc accurately explains the rationale, known limitations, and why the function-call approach was abandoned.
  • Test coverage - Adequate. Tests assert both the new predicate and the absence of the old complex SQL; edge cases (Gemini passthrough, intermediate persistence state) are covered.
  • Code clarity - Good. Removing ~60 lines of opaque SQL and the FINALIZED_PROVIDER_CHAIN_REASONS constants dramatically improves maintainability.

Automated review by Claude AI

…ssage

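Extract the 5 assertions that "the finalized boundary looks only at
status_code IS NOT NULL" into the expectStatusCodeOnlyFinalizedBoundary
helper, and add a negative assertion for "error_message" is not null:
previously only the blocked_by / provider_chain paths were excluded, so if
finalized ever regressed to
"status_code" is not null or "error_message" is not null,
"in-flight" records with only errorMessage written would be pulled back into
the availability statistics unnoticed.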

Addresses CodeRabbit nitpick on #1189.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot left a comment

Code Review Summary

No significant issues identified in this PR.

PR Size: S

  • Lines changed: 138
  • Files changed: 2

Review Coverage

  • Logic and correctness - Clean
  • Security (OWASP Top 10) - Clean
  • Error handling - Clean
  • Type safety - Clean
  • Documentation accuracy - Clean
  • Test coverage - Adequate
  • Code clarity - Good

Automated review by Codex AI

@github-actions

🧪 Test results

Test type Status
Code quality
Unit tests
Integration tests
API tests

Overall result: ✅ All tests passed

@github-actions github-actions Bot left a comment

  • Identified PR #1189 (fix(availability): restore status_code IS NOT NULL terminal filter) and reviewed the full diff for changed files: src/lib/availability/availability-service.ts and tests/unit/lib/availability-service.test.ts.
  • Computed PR size as S (38 additions, 100 deletions; 138 total lines changed across 2 files) and applied label size/S to the PR.
  • Ran a diff-scoped, multi-perspective review (comments, tests, error handling, types, logic/security/perf, simplification) and found no issues that met the reporting threshold (>= 80 confidence) after full-file/context validation.
  • Posted the required PR review summary via gh pr review --comment (no inline comments, since there were no validated reportable diff-line issues).

@ding113 ding113 merged commit 18b8e60 into dev May 15, 2026
10 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Claude Code Hub Roadmap May 15, 2026
Labels

area:statistics, bug (Something isn't working), size/S (Small PR, < 200 lines)

Projects

Status: Done
