
fix(public-status): persist hourly rollups#1156

Open
dofastted wants to merge 2 commits into ding113:dev from dofastted:fix/public-status-rollup-storage

Conversation

@dofastted
Contributor

@dofastted dofastted commented May 5, 2026

Summary

This PR reduces public-status Redis memory growth by moving finalized public status history out of Redis and into persisted hourly database rollups.

Problem

The public-status feature was storing 30 days of hourly history in Redis using long-lived keys (public-status:v1:snapshot:* and public-status:v1:series:*). This caused:

  • Unbounded Redis memory growth
  • Redis becoming the long-term storage for historical data
  • Large payload writes from the rebuild worker

Solution

Move finalized hourly summaries from Redis to PostgreSQL with a 30-day retention window, keeping only the current incomplete hour in Redis with a short TTL.
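
A minimal sketch of the split described above, assuming UTC-aligned hour buckets. The names `floorToUtcHour` and `splitHourWindow` are hypothetical and not taken from the PR; they only illustrate which hours would go to the database and which single hour would stay in Redis.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Truncate a timestamp to the start of its UTC hour.
function floorToUtcHour(date: Date): Date {
  return new Date(Math.floor(date.getTime() / HOUR_MS) * HOUR_MS);
}

interface HourWindow {
  finalizedHourStarts: Date[]; // complete hours: candidates for database rollups
  currentHourStart: Date; // incomplete hour: candidate for a short-TTL Redis entry
}

// Hypothetical helper: list the finalized hours before "now" (oldest first)
// and identify the current, still-incomplete hour.
function splitHourWindow(now: Date, historyHours: number): HourWindow {
  const currentHourStart = floorToUtcHour(now);
  const finalizedHourStarts: Date[] = [];
  for (let offset = historyHours; offset >= 1; offset--) {
    finalizedHourStarts.push(new Date(currentHourStart.getTime() - offset * HOUR_MS));
  }
  return { finalizedHourStarts, currentHourStart };
}
```

With a 30-day window, `historyHours` would be 720, matching the 30 × 24 figure used in the reviews.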

Changes

Core Changes

  • Database Schema: Add public_status_hourly_rollups table with proper indexes for 30-day finalized hourly history
  • Hourly Rollups Logic: Build finalized hourly summaries from request rows and upsert them idempotently
  • Redis Optimization: Store only the current incomplete hour in Redis with short TTL; stop writing long-lived snapshot/series payloads
  • API Compatibility: Keep existing /api/public-status response shape and legacy Redis snapshot fallback for rollout safety
  • Config Integration: Pass internal public-status config snapshot into read path for stable source group names
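
The idempotent upsert mentioned in the list above can be illustrated with an in-memory stand-in for the table. This is only a sketch: the `Map` replaces PostgreSQL, the key columns follow the unique constraint quoted in the Greptile summary, and `RollupRow`, `rollupKey`, and `upsertRollups` are hypothetical names.

```typescript
interface RollupRow {
  bucketStart: string; // ISO start of the UTC hour
  publicGroupSlug: string;
  publicModelKey: string;
  requestTypeBadge: string;
  successCount: number;
  failureCount: number;
}

// Build a key from the columns that make a rollup row unique.
function rollupKey(row: RollupRow): string {
  return [row.bucketStart, row.publicGroupSlug, row.publicModelKey, row.requestTypeBadge].join("|");
}

// Insert new rows and overwrite existing ones with the same key, so that
// replaying the same hour leaves exactly one row per key.
function upsertRollups(store: Map<string, RollupRow>, rows: RollupRow[]): void {
  for (const row of rows) {
    store.set(rollupKey(row), { ...row });
  }
}
```

Running the same batch twice leaves the store unchanged, which is the property that lets the rebuild worker re-process recent hours safely.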

Supporting Changes

  • Database Migration: drizzle/0103_public_status_hourly_rollups.sql with table creation and indexes
  • Tests: Unit tests for hourly-rollups, read-store, rebuild-worker; integration tests for route behavior

Redis Usage Changes

Redis remains in use for:

  • Rebuild locks
  • Rebuild hints
  • Short-lived runtime manifests
  • Short-lived current-hour summaries

Redis is no longer the long-term storage location for 30-day public status history.

Compatibility Notes

This branch is based on upstream/dev and intentionally excludes local fork work, development mock routes, and status page UI changes. The PR only contains the public-status storage root fix, migration metadata, and focused tests.

Migration

This PR includes Drizzle migration 0103 which will be auto-applied if AUTO_MIGRATE=true:

bun run db:generate  # Migration already generated
bun run db:migrate   # Apply manually if needed

Testing

Automated Tests

bunx vitest run tests/unit/public-status/hourly-rollups.test.ts tests/unit/public-status/read-store.test.ts tests/unit/public-status/rebuild-worker.test.ts tests/integration/public-status/route-redis-only.test.ts --reporter=verbose

Verification Commands

bunx biome check src/lib/public-status/hourly-rollups.ts src/lib/public-status/rebuild-worker.ts src/lib/public-status/read-store.ts src/app/api/public-status/route.ts src/drizzle/schema.ts drizzle/0103_public_status_hourly_rollups.sql drizzle/meta/0103_snapshot.json drizzle/meta/_journal.json tests/unit/public-status/hourly-rollups.test.ts tests/unit/public-status/read-store.test.ts tests/unit/public-status/rebuild-worker.test.ts tests/integration/public-status/route-redis-only.test.ts
bun run typecheck
bun run validate:migrations

Checklist

  • Code follows project conventions (Biome linting)
  • TypeScript type checking passes
  • Database migration validated
  • Unit tests added for new functionality
  • Integration tests updated
  • API response shape preserved for backward compatibility

Enhanced by Claude AI

Greptile Summary

This PR moves 30-day public-status history out of Redis and into a new public_status_hourly_rollups PostgreSQL table, keeping only the current incomplete hour in Redis with a short TTL. The rebuild worker is updated to skip already-finalized hours (checking the DB first), and the read path is updated to serve the payload from DB rollups + a Redis current-hour cache before falling back to the legacy manifest/snapshot path.

  • New schema & migration: adds public_status_hourly_rollups with a unique constraint on (bucket_start, public_group_slug, public_model_key, request_type_badge) and supporting indexes; the cleanup DELETE filters on bucket_end but no bucket_end index exists.
  • Rebuild optimization: findHoursNeedingRollupRefresh skips hours that already have all expected rollup rows, only unconditionally re-querying the last 2 finalized hours for stragglers.
  • Config version split: the route reads configVersion from the public snapshot but groups from the internal snapshot; the rebuild worker writes with the internal version, creating a window where DB and Redis lookups return empty results if the two snapshots disagree.
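
The rebuild optimization summarized above can be sketched as a pure selection function. This is an illustration under assumptions, not the PR's `findHoursNeedingRollupRefresh`: the inputs (`rowCountByHour`, `expectedRowsPerHour`) and the name `hoursNeedingRefresh` are hypothetical.

```typescript
// Select the finalized hours that should be re-queried:
// - the most recent `alwaysRefreshLast` hours, to pick up late-arriving rows;
// - any older hour that has fewer persisted rollup rows than expected.
function hoursNeedingRefresh(
  finalizedHourStarts: string[], // ascending ISO hour starts
  rowCountByHour: Map<string, number>, // rows already persisted per hour
  expectedRowsPerHour: number,
  alwaysRefreshLast = 2
): string[] {
  const recentFrom = Math.max(0, finalizedHourStarts.length - alwaysRefreshLast);
  return finalizedHourStarts.filter((hour, index) => {
    if (index >= recentFrom) return true;
    return (rowCountByHour.get(hour) ?? 0) < expectedRowsPerHour;
  });
}
```

On a fully populated table this reduces a rebuild from all 720 hours to the last two.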

Confidence Score: 3/5

The new DB rollup read path silently returns empty results whenever the public and internal config snapshot versions differ, falling back to legacy Redis behavior and making the migration's primary goal unreliable in production.

The config version split between the read path (public snapshot) and the write path (internal snapshot) means the primary new code path silently returns zero rows whenever the two snapshots have different versions. This is a structural issue in a critical hot path that runs on every API request.

src/app/api/public-status/route.ts and src/lib/public-status/read-store.ts — the configVersion threading between public and internal snapshots needs reconciliation before the rollup path can be relied upon.
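
One way to picture the mismatch, and a possible reconciliation, is the sketch below. It assumes the fix is to take both the version and the groups from the same snapshot, along the lines of the `internalConfigSnapshot ?? configSnapshot` wiring mentioned in the CodeRabbit walkthrough. `ConfigSnapshot`, `StoredRollup`, `resolveReadSnapshot`, and `readRollupsForVersion` are hypothetical names.

```typescript
interface ConfigSnapshot {
  configVersion: string;
  groups: string[];
}

interface StoredRollup {
  configVersion: string;
  bucketStart: string;
}

// Prefer the internal snapshot so the version used for reads is the one the
// rebuild worker used for writes; fall back to the public snapshot otherwise.
function resolveReadSnapshot(
  publicSnapshot: ConfigSnapshot | null,
  internalSnapshot: ConfigSnapshot | null
): ConfigSnapshot | null {
  return internalSnapshot ?? publicSnapshot;
}

// Stand-in for the WHERE config_version = ... filter on the rollup table.
function readRollupsForVersion(rows: StoredRollup[], configVersion: string): StoredRollup[] {
  return rows.filter((row) => row.configVersion === configVersion);
}
```

Filtering rows written under the internal version by the public version returns nothing, which is the silent fall-through the review describes.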

Important Files Changed

Filename Overview
src/app/api/public-status/route.ts Reads both public and internal config snapshots; passes public configVersion for DB/Redis lookups but internal configSnapshot for group definitions — version mismatch between the two can cause the entire rollup layer to return empty results
src/lib/public-status/read-store.ts New DB-first read path: queries historical rollup rows from Postgres and current-hour rows from Redis before falling back to legacy manifest/snapshot path; configVersion passed from public snapshot may not match version written by rebuild worker using internal snapshot
src/lib/public-status/hourly-rollups.ts Core new module: computes hourly rollups from request rows, reads/writes them to Postgres and Redis; missing bucket_end index means cleanup runs a sequential scan on every rebuild cycle
src/lib/public-status/rebuild-worker.ts Rebuild orchestration: uses findHoursNeedingRollupRefresh to skip already-finalized hours and only re-scans the last 2 finalized hours, substantially reducing prior full-720-hour overhead
drizzle/0103_public_status_hourly_rollups.sql Adds public_status_hourly_rollups table with correct indexes for read queries; missing index on bucket_end which is used in the cleanup DELETE
src/drizzle/schema.ts Adds Drizzle ORM table definition for public_status_hourly_rollups matching the SQL migration; schema and indexes look consistent
drizzle/meta/_journal.json Migration journal updated with entry 0103; consistent with the SQL migration file
tests/unit/public-status/hourly-rollups.test.ts Unit tests cover rollup building, state derivation, Redis read/write, and cleanup; appear thorough for the happy path
tests/unit/public-status/rebuild-worker.test.ts Tests verify rebuild locking, skipping already-finalized hours, and cleanup invocation
tests/unit/public-status/read-store.test.ts Tests for read-store DB and Redis paths; no test covers the configVersion mismatch scenario between public and internal snapshots
tests/integration/public-status/route-redis-only.test.ts Integration tests for the route with Redis-only mode; backward compatibility with the legacy snapshot path is exercised

Sequence Diagram

sequenceDiagram
    participant Route as GET /api/public-status
    participant RS as read-store
    participant DB as PostgreSQL hourly_rollups
    participant Redis as Redis current-hour cache
    participant RW as rebuild-worker

    Route->>RS: readPublicStatusPayload(configVersion=public_v, configSnapshot=internal)
    RS->>DB: readPublicStatusHourlyRollups(configVersion=public_v)
    DB-->>RS: historical rows (0 if version mismatch)
    RS->>Redis: readCurrentHourPublicStatusSummary(configVersion=public_v)
    Redis-->>RS: current-hour rows (miss if version mismatch)
    alt rows found
        RS-->>Route: buildPublicStatusPayloadFromHourlyRollups()
    else no rows
        RS->>Redis: read manifest/snapshot
        RS-->>Route: legacy payload or rebuilding
    end
    Note over RW: On rebuild cycle
    RW->>DB: findHoursNeedingRollupRefresh(configVersion=internal_v)
    DB-->>RW: missing/recent hours
    RW->>DB: upsertPublicStatusHourlyRollups(configVersion=internal_v)
    RW->>Redis: writeCurrentHourPublicStatusSummary(configVersion=internal_v)
    RW->>DB: cleanupPublicStatusHourlyRollups DELETE WHERE bucket_end <= cutoff

Comments Outside Diff (3)

  1. src/lib/public-status/rebuild-worker.ts, line 299-330 (link)

    P1 Rebuild lock TTL too short for initial 30-day backfill

    REBUILD_LOCK_TTL_MS is 60 s, but processing 720 hourly buckets (30 days × 24) in batches of 6 means 120 sequential Promise.all rounds. If each DB round-trip takes ≥ 0.5 s on the first run, the lock expires while work is still in progress. Another instance then acquires the lock and begins a competing rebuild. Because the inFlightRebuilds map is per-process, it provides no protection across instances: both would query the requests table for all 720 hours simultaneously, multiplying DB load. The Lua release script safely prevents lock collision, but the concurrent re-scan is the real issue. Consider setting the TTL to at least expected-batch-latency-ms × (ROLLUP_HISTORY_HOURS / ROLLUP_WRITE_BATCH_SIZE) plus headroom, or periodically refreshing the lock TTL during the loop.

  2. src/app/api/public-status/route.ts, line 17-43 (link)

    P1 Config version used to query rollups may not match the version used to write them

    The route passes configVersion: configSnapshot?.configVersion (the PUBLIC snapshot's version) into readPublicStatusPayload, which forwards it to both readPublicStatusHourlyRollups (as a WHERE filter) and readCurrentHourPublicStatusSummary (as part of the Redis key). However, the rebuild worker writes rollups and the current-hour Redis key using configSnapshot.configVersion from readCurrentInternalPublicStatusConfigSnapshot (the INTERNAL snapshot's version). If the public and internal config versions ever differ, every historical DB query returns zero rows and the current-hour Redis lookup misses its key — causing the DB path to silently fall through to the Redis manifest path on every request, making the new rollup layer non-functional. The integration test in route-redis-only.test.ts even uses distinct values ("cfg-public" vs "cfg-internal") and only verifies call parameters, not the end-to-end data flow, so this mismatch is not caught by tests.

  3. src/app/api/public-status/route.ts, line 17-43 (link)

    P1 Config version used to query rollups may not match the version used to write them

    The route passes configVersion: configSnapshot?.configVersion (the PUBLIC snapshot's version) into readPublicStatusPayload, which forwards it to both readPublicStatusHourlyRollups (as a WHERE filter) and readCurrentHourPublicStatusSummary (as part of the Redis key). However, the rebuild worker writes rollups and the current-hour Redis key using configSnapshot.configVersion from readCurrentInternalPublicStatusConfigSnapshot (the INTERNAL snapshot's version). If the public and internal config versions ever differ, every historical DB query returns zero rows and the current-hour Redis lookup misses — causing the new rollup layer to be non-functional and silently falling through to the legacy Redis snapshot path on every request.

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
src/lib/public-status/hourly-rollups.ts:398-409
**Cleanup DELETE lacks a `bucket_end` index, causing full table scans**

`cleanupPublicStatusHourlyRollups` issues a `DELETE WHERE bucket_end <= cutoff`, but the migration only creates indexes on `bucket_start`, `(config_version, bucket_start)`, and `(public_group_slug, public_model_key)` — none on `bucket_end`. On a table with 30 days × 24 hours × many model/group combos this cleanup runs as a sequential scan on every rebuild cycle. Adding an index on `bucket_end` in the migration would allow the DELETE to use an index range scan.

### Issue 2 of 2
src/app/api/public-status/route.ts:17-43
**Config version used to query rollups may not match the version used to write them**

The route passes `configVersion: configSnapshot?.configVersion` (the PUBLIC snapshot's version) into `readPublicStatusPayload`, which forwards it to both `readPublicStatusHourlyRollups` (as a WHERE filter) and `readCurrentHourPublicStatusSummary` (as part of the Redis key). However, the rebuild worker writes rollups and the current-hour Redis key using `configSnapshot.configVersion` from `readCurrentInternalPublicStatusConfigSnapshot` (the INTERNAL snapshot's version). If the public and internal config versions ever differ, every historical DB query returns zero rows and the current-hour Redis lookup misses — causing the new rollup layer to be non-functional and silently falling through to the legacy Redis snapshot path on every request.
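
The lock TTL concern raised for rebuild-worker.ts in the comments above comes down to simple arithmetic, sketched here. The figures (720 hours, batches of 6, 0.5 s per round, 60 s TTL) are the ones quoted in that comment; `estimateBackfillMs` and `lockCoversBackfill` are hypothetical helpers, and real batch latency would need to be measured.

```typescript
// Estimate how long a sequential, batched backfill takes.
function estimateBackfillMs(totalHours: number, batchSize: number, batchLatencyMs: number): number {
  const rounds = Math.ceil(totalHours / batchSize);
  return rounds * batchLatencyMs;
}

// A lock only protects the whole backfill if its TTL exceeds the estimate.
function lockCoversBackfill(
  lockTtlMs: number,
  totalHours: number,
  batchSize: number,
  batchLatencyMs: number
): boolean {
  return lockTtlMs > estimateBackfillMs(totalHours, batchSize, batchLatencyMs);
}
```

With 120 rounds at 0.5 s each the estimate is already 60 s, so a 60 s TTL leaves no margin.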

Reviews (2): Last reviewed commit: "fix(public-status): address rollup revie..." | Re-trigger Greptile

@coderabbitai

coderabbitai Bot commented May 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a2710042-2f53-4f50-adb1-421468bc4382

📥 Commits

Reviewing files that changed from the base of the PR and between ced80b6 and c71aeb3.

📒 Files selected for processing (5)
  • src/drizzle/schema.ts
  • src/lib/public-status/hourly-rollups.ts
  • src/lib/public-status/rebuild-worker.ts
  • tests/unit/public-status/hourly-rollups.test.ts
  • tests/unit/public-status/rebuild-worker.test.ts
✅ Files skipped from review due to trivial changes (1)
  • src/lib/public-status/hourly-rollups.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/drizzle/schema.ts
  • tests/unit/public-status/rebuild-worker.test.ts
  • src/lib/public-status/rebuild-worker.ts
  • tests/unit/public-status/hourly-rollups.test.ts

📝 Walkthrough

Walkthrough

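This PR introduces hourly public-status rollups: it adds a database table and indexes, implements hourly rollup computation and persistence plus a Redis current-hour cache, switches the read path and the rebuild flow to use rollup data, and updates the API route to support and prefer the internal config snapshot.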

Changes

Public Status Hourly Rollups

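Layer / File(s) Summary
Schema / Migration
drizzle/0103_public_status_hourly_rollups.sql, drizzle/meta/_journal.json, src/drizzle/schema.ts
Adds the public_status_hourly_rollups table (hour bucket, config/group/model, state, counts and double precision metrics, generated/created/updated timestamps), adds unique and supporting indexes, and appends an entry to the migration journal.
Core Types & Helpers
src/lib/public-status/hourly-rollups.ts (header, helpers)
Adds constants and the PublicStatusHourlyRollupRow interface, plus utility functions for numeric normalization, median, UTC hour alignment, state derivation, and Redis-ready wrapping.
Rollup Build
src/lib/public-status/hourly-rollups.ts (buildPublicStatusHourlyRollupsFromRequests)
Aggregates request rows by hour into one rollup per (group, model, requestTypeBadge): counts, availability, TTFB/TPS medians, and a derived discrete state.
Persistence / Query / Cleanup
src/lib/public-status/hourly-rollups.ts (upsertPublicStatusHourlyRollups, readPublicStatusHourlyRollups, cleanupPublicStatusHourlyRollups)
Implements batched upsert (conflict key: (bucketStart, publicGroupSlug, publicModelKey, requestTypeBadge)), querying by time range and optional configVersion with normalization of non-finite values, and retention-based deletion aligned to UTC midnight.
Current-hour Redis Cache
src/lib/public-status/hourly-rollups.ts (writeCurrentHourPublicStatusSummary, readCurrentHourPublicStatusSummary, buildPublicStatusCurrentHourSummaryKey)
Provides a JSON-serialized cache key and write (EX TTL) for the "current hour", with safe reads and shape validation; timestamps are serialized as ISO strings, and types and null values are restored on parse.
Payload Reconstruction
src/lib/public-status/hourly-rollups.ts (buildPublicStatusPayloadFromHourlyRollups)
Based on the configured groups and the hour range, maps rollup rows onto a fixed-length timeline, performs bounded gap filling, and computes per-bucket and per-model availability, latest metrics, and the generation fingerprint/freshUntil.
Read Path Integration
src/lib/public-status/read-store.ts
readPublicStatusPayload gains a fast path: when a configSnapshot containing groups is passed in, it reads the DB rollups and (if configVersion is present) merges the current-hour Redis summary, returning a payload built from the rollups; if there are no rows, it triggers a rollup-missing rebuild hint. The old fallback path is kept.
Rebuild Worker
src/lib/public-status/rebuild-worker.ts
Changes the rebuild to query requests in hourly slices and to build and persist rollups in batches (chunked build + batched upsert), generates the current-hour summary separately, and cleans up expired rollups; publishPublicStatusProjection is simplified to write a short-TTL versioned manifest and conditionally update currentManifestKey, removing the old temporary/generated snapshot write flow.
API Route Wiring
src/app/api/public-status/route.ts
GET reads the public and internal config snapshots in parallel and sets configSnapshot to internalConfigSnapshot ?? configSnapshot, so that the internal snapshot is passed to the read flow preferentially.
Tests — Unit / Integration
tests/unit/public-status/*, tests/integration/public-status/route-redis-only.test.ts
Adds and updates unit/integration tests: verifies rollup building, payload reconstruction, Redis current-hour read/write (including TTL and key shape), and DB upsert/cleanup calls; rebuild tests now assert hourly aggregation and writing of the current-hour summary; route tests assert that the internal snapshot is forwarded.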

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
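Title check ✅ Passed The title accurately summarizes the main change: migrating hourly status rollups from Redis to the PostgreSQL database.
Description check ✅ Passed The description explains the problem, solution, core changes, compatibility notes, and verification steps in detail, and is fully relevant to the code changes.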
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.


@github-actions github-actions Bot added bug Something isn't working area:statistics area:core size/XL Extra Large PR (> 1000 lines) labels May 5, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a database-backed hourly rollup system for public status reporting, replacing the previous purely Redis-based approach. It adds a new public_status_hourly_rollups table, logic to aggregate request data into hourly buckets, and updates the API route and rebuild worker to utilize these rollups for historical data. Review feedback highlights a significant performance regression in the rebuild worker due to sequential database queries, identifies a loss of granularity in status classification (missing 'degraded' state), and suggests improvements for the median calculation and Drizzle ORM usage.

Comment thread src/lib/public-status/rebuild-worker.ts Outdated
Comment on lines +308 to +326
const finalizedHourStarts: Date[] = [];
for (
  let cursorMs = Date.parse(rollupCoveredFrom);
  cursorMs < Date.parse(coveredTo);
  cursorMs += 60 * 60 * 1000
) {
  finalizedHourStarts.push(new Date(cursorMs));
}
for (const hourStartBatch of chunkArray(finalizedHourStarts, ROLLUP_WRITE_BATCH_SIZE)) {
  await Promise.all(
    hourStartBatch.map((hourStart) =>
      buildAndPersistPublicStatusHourlyRollup({
        configVersion: configSnapshot.configVersion,
        hourStart,
        groups,
      })
    )
  );
}

high

The current implementation introduces a significant performance regression: every time a rebuild is triggered, it issues 720 database queries (one for each hour in the 30-day retention window) in sequential batches.

In the previous implementation, a single query was used to fetch the entire range. Performing 720 individual queries to the message_request table on every worker run (which can be triggered frequently via hints) will put excessive and unnecessary load on the database.

Consider optimizing this by:

  1. Querying the entire range in a single query (or larger daily chunks) and aggregating the results in memory.
  2. Implementing logic to only process hours that are missing from the public_status_hourly_rollups table or are very recent (e.g., the last 2 hours) to account for late-arriving data.
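
The first option above, fetching the range once and aggregating in memory, could look roughly like the following. This is a sketch with hypothetical types: `RequestRow` stands in for whatever columns the real request table exposes, and only success/failure counts are aggregated.

```typescript
const HOUR_MS = 60 * 60 * 1000;

interface RequestRow {
  createdAt: Date;
  ok: boolean;
}

interface HourCounts {
  successCount: number;
  failureCount: number;
}

// Group rows fetched by one range query into per-hour counts, keyed by the
// ISO timestamp of the UTC hour start.
function aggregateByUtcHour(rows: RequestRow[]): Map<string, HourCounts> {
  const byHour = new Map<string, HourCounts>();
  for (const row of rows) {
    const hourStartMs = Math.floor(row.createdAt.getTime() / HOUR_MS) * HOUR_MS;
    const key = new Date(hourStartMs).toISOString();
    const counts = byHour.get(key) ?? { successCount: 0, failureCount: 0 };
    if (row.ok) {
      counts.successCount += 1;
    } else {
      counts.failureCount += 1;
    }
    byHour.set(key, counts);
  }
  return byHour;
}
```

For large windows the range would likely still need chunking (for example per day) to bound memory, which is the trade-off the reviewer hints at.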

Comment thread src/lib/public-status/hourly-rollups.ts Outdated
label: model.label,
vendorIconKey: model.vendorIconKey,
requestTypeBadge: model.requestTypeBadge,
state: sampleCount === 0 ? "no_data" : bucket.successCount > 0 ? "operational" : "failed",

medium

The state calculation logic is overly lenient and results in a regression where the degraded status is lost.

Currently, any hour with at least one success is marked as operational, even if the failure rate is extremely high (e.g., 1 success and 99 failures). Furthermore, the degraded state, which is explicitly handled in the read path (read-store.ts), is omitted here and in the database schema definition.

It is recommended to use a threshold-based approach for status classification (e.g., availabilityPct > 99.9 ? 'operational' : availabilityPct > 95 ? 'degraded' : 'failed') and update the schema to support the degraded value.
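
A threshold-based classifier along the lines suggested above might look like this. The 99.9 and 95 cut-offs are the reviewer's example values, not settings confirmed by the PR, and `deriveState` is a hypothetical name.

```typescript
type RollupState = "no_data" | "operational" | "degraded" | "failed";

// Classify an hour from its success and failure counts using availability
// thresholds instead of "any success means operational".
function deriveState(successCount: number, failureCount: number): RollupState {
  const sampleCount = successCount + failureCount;
  if (sampleCount === 0) return "no_data";
  const availabilityPct = (successCount / sampleCount) * 100;
  if (availabilityPct > 99.9) return "operational";
  if (availabilityPct > 95) return "degraded";
  return "failed";
}
```

Under this rule the reviewer's example of 1 success and 99 failures yields "failed" rather than "operational".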

Comment on lines +52 to +69
function median(values: number[]): number | null {
  if (values.length === 0) {
    return null;
  }

  const sorted = [...values].sort((left, right) => left - right);
  const middle = Math.floor(sorted.length / 2);
  if (sorted.length % 2 === 0) {
    const left = sorted[middle - 1];
    const right = sorted[middle];
    if (typeof left !== "number" || typeof right !== "number") {
      return null;
    }
    return Number(((left + right) / 2).toFixed(4));
  }

  return sorted[middle] ?? null;
}

medium

The median function has inconsistent rounding logic and redundant type checks. It rounds the result to 4 decimal places when the input length is even, but returns the raw value when the length is odd. Additionally, since the input is typed as number[], the typeof checks for left and right are unnecessary.

function median(values: number[]): number | null {
  if (values.length === 0) {
    return null;
  }

  const sorted = [...values].sort((left, right) => left - right);
  const middle = Math.floor(sorted.length / 2);
  
  const result = sorted.length % 2 === 0
    ? (sorted[middle - 1] + sorted[middle]) / 2
    : sorted[middle];

  return result !== undefined ? Number(result.toFixed(4)) : null;
}

Comment thread src/lib/public-status/hourly-rollups.ts Outdated
lt(publicStatusHourlyRollups.bucketStart, input.end),
];
if (input.configVersion) {
conditions.push(sql`${publicStatusHourlyRollups.configVersion} = ${input.configVersion}`);

medium

Using the sql template literal for a simple equality check is less idiomatic in Drizzle than using the built-in eq operator. This improves type safety and readability.

Suggested change
- conditions.push(sql`${publicStatusHourlyRollups.configVersion} = ${input.configVersion}`);
+ conditions.push(eq(publicStatusHourlyRollups.configVersion, input.configVersion));

vendorIconKey: model.vendorIconKey,
requestTypeBadge: model.requestTypeBadge,
state: sampleCount === 0 ? "no_data" : bucket.successCount > 0 ? "operational" : "failed",
successCount: bucket.successCount,

[Medium] [LOGIC-BUG] Timezone/DST bug in retention cutoff calculation

Why this is a problem: The cutoff calculation uses millisecond arithmetic (retentionDays * 24 * 60 * 60 * 1000) which assumes all days have exactly 86400 seconds. This is incorrect during DST transitions, potentially causing data to be retained longer than 30 days or deleted prematurely.

Suggested fix:

// Use date truncation for calendar-aware cutoff
const cutoff = new Date(now.getTime());
cutoff.setUTCDate(cutoff.getUTCDate() - retentionDays);
cutoff.setUTCHours(0, 0, 0, 0); // Align to day boundary for consistency

Contributor

@github-actions github-actions Bot left a comment


Code Review Summary

This PR implements a significant architectural improvement to reduce Redis memory growth by moving finalized public status history from Redis to persistent PostgreSQL hourly rollups. The implementation is well-structured and follows existing code patterns. One minor time calculation issue was identified.

PR Size: XL

  • Lines changed: 6211 (6086 additions + 125 deletions)
  • Files changed: 12
  • Split suggestion: Consider splitting large snapshot JSON files into separate PRs in future migrations to reduce review load.

Issues Found

Category Critical High Medium Low
Logic/Bugs 0 0 1 0
Security 0 0 0 0
Error Handling 0 0 0 0
Types 0 0 0 0
Comments/Docs 0 0 0 0
Tests 0 0 0 0
Simplification 0 0 0 0

Medium Priority Issues (Should Fix)

  1. [LOGIC-BUG] Timezone/DST bug in retention cutoff calculation (hourly-rollups.ts:271)
    • The cutoff calculation uses millisecond arithmetic which doesn't account for DST transitions
    • Use calendar-aware date arithmetic instead

Review Coverage

  • Logic and correctness - Minor DST issue found
  • Security (OWASP Top 10) - Clean, no injection risks or exposed secrets
  • Error handling - Clean, errors properly logged and surfaced
  • Type safety - Clean, proper TypeScript typing throughout
  • Documentation accuracy - Clean, Chinese comments follow existing pattern
  • Test coverage - Adequate unit and integration tests included
  • Code clarity - Good, well-structured with clear function names

Notable Observations (Non-blocking)

  1. Test Coverage: New code has good test coverage with unit tests for hourly-rollups.ts, read-store.ts, and rebuild-worker.ts
  2. Migration: Properly follows CLAUDE.md database migration workflow (schema.ts modified, generated migration file included)
  3. Redis Memory Optimization: Successfully removes 30-day long-lived Redis keys, replacing with database persistence
  4. Backward Compatibility: Maintains existing /api/public-status response shape and Redis fallback

Automated review by Claude AI


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 60eb8bdaf6


Comment on lines +220 to +225
const [historyRows, currentRows] = await Promise.all([
readPublicStatusHourlyRollups({
start: historyStart,
end: currentHourStart,
configVersion: input.configVersion,
}),

P1: Catch rollup read failures before fallback

The DB-first path calls readPublicStatusHourlyRollups without a guard, so any transient Postgres error rejects readPublicStatusPayload and the route returns 500 instead of falling back to the existing Redis snapshot/manifest path. This is a regression from the previous Redis-only read behavior and breaks the intended rollout safety when DB connectivity is degraded.
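
One way to restore the fallback described above is to wrap the DB read in a guard. The following is a minimal sketch, not the PR's code: `StatusPayload`, `PayloadReader`, and `readWithFallback` are hypothetical names, and the real reader signatures are not shown in this excerpt.

```typescript
// Hypothetical shapes; the actual payload and reader types are assumptions.
type StatusPayload = { source: "db" | "redis"; buckets: number[] };
type PayloadReader = () => Promise<StatusPayload | null>;

// Try the DB rollup path first. If it rejects (for example on a transient
// Postgres error) or yields nothing, fall back to the legacy Redis reader
// instead of letting the rejection surface as a 500.
async function readWithFallback(
  readFromDb: PayloadReader,
  readFromRedis: PayloadReader,
  logWarning: (message: string, error: unknown) => void = () => {}
): Promise<StatusPayload | null> {
  try {
    const dbPayload = await readFromDb();
    if (dbPayload !== null) {
      return dbPayload;
    }
  } catch (error) {
    logWarning("public-status rollup read failed; falling back to Redis", error);
  }
  return readFromRedis();
}
```

Logging the error rather than swallowing it keeps degraded DB connectivity visible while the route continues to serve the Redis snapshot.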

const windowStart = new Date(
currentHourStart.getTime() - (input.rangeHours - 1) * 60 * 60 * 1000
);
const bucketCount = input.rangeHours;

P1: Preserve interval semantics in rollup payloads

This payload builder hardcodes one-hour buckets (bucketCount = rangeHours and + 60 * 60 * 1000 stepping), so requests like interval=5 or interval=15 now return hourly timelines and freshness behavior inconsistent with the requested interval. Because the DB rollup path is now preferred, clients using non-60-minute intervals will receive less granular and potentially stale-looking data compared to prior behavior.
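
To illustrate the difference, here is a hedged sketch of deriving the bucket count and step from the requested interval rather than from a fixed hour. `buildBucketStarts` is a hypothetical helper, and the window convention (exclusive end, no partial current bucket) is a simplification that differs from the snippet above. It does not address where sub-hour data would come from once history lives in hourly rollups.

```typescript
const MINUTE_MS = 60 * 1000;

// Returns the start of every bucket in [windowEnd - rangeHours, windowEnd),
// stepping by intervalMinutes instead of a hardcoded 60 minutes.
function buildBucketStarts(windowEnd: Date, rangeHours: number, intervalMinutes: number): Date[] {
  if (intervalMinutes <= 0 || (rangeHours * 60) % intervalMinutes !== 0) {
    throw new RangeError("intervalMinutes must be positive and divide the range evenly");
  }
  const stepMs = intervalMinutes * MINUTE_MS;
  const bucketCount = (rangeHours * 60) / intervalMinutes;
  const firstStartMs = windowEnd.getTime() - bucketCount * stepMs;
  return Array.from({ length: bucketCount }, (_, index) => new Date(firstStartMs + index * stepMs));
}
```

With a 24-hour range, a 15-minute interval yields 96 buckets under this sketch, while a 60-minute interval yields 24.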

Comment thread src/lib/public-status/rebuild-worker.ts Outdated
Comment on lines +308 to +325
const finalizedHourStarts: Date[] = [];
for (
let cursorMs = Date.parse(rollupCoveredFrom);
cursorMs < Date.parse(coveredTo);
cursorMs += 60 * 60 * 1000
) {
finalizedHourStarts.push(new Date(cursorMs));
}
for (const hourStartBatch of chunkArray(finalizedHourStarts, ROLLUP_WRITE_BATCH_SIZE)) {
await Promise.all(
hourStartBatch.map((hourStart) =>
buildAndPersistPublicStatusHourlyRollup({
configVersion: configSnapshot.configVersion,
hourStart,
groups,
})
)
);

P1 Full 720-hour re-scan on every rebuild run

ROLLUP_HISTORY_HOURS = 720. On every rebuild invocation, the loop issues a SELECT from the requests table for each of the 720 hourly windows, then upserts into public_status_hourly_rollups. Historical hours (older than 1 hour) have finalized data and will never change, yet they are re-queried and re-upserted unconditionally. After the initial backfill the work is pure overhead: 720 reads + 720 upserts × each rebuild cycle. Restricting the full-scan to hours not yet present in the rollups table (or reducing the re-scan window to recent hours after the first run) would substantially reduce this ongoing cost.
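
The first option above (only processing hours not yet present) could look roughly like the sketch below. It assumes the caller has already fetched the set of persisted hour starts as ISO strings; `findMissingHourStarts` is a hypothetical helper, not part of the PR.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Given the hour starts already persisted (as ISO strings), return only the
// hours in [windowStart, windowEnd) that still need a rollup.
function findMissingHourStarts(
  windowStart: Date,
  windowEnd: Date,
  persistedHourStarts: ReadonlyArray<string>
): Date[] {
  const persisted = new Set(persistedHourStarts);
  const missing: Date[] = [];
  for (let cursorMs = windowStart.getTime(); cursorMs < windowEnd.getTime(); cursorMs += HOUR_MS) {
    const hourStart = new Date(cursorMs);
    if (!persisted.has(hourStart.toISOString())) {
      missing.push(hourStart);
    }
  }
  return missing;
}
```

One caveat under this assumption: if hours with no traffic produce no rollup rows, they would look missing on every run, so a real implementation would likely need a marker row or a high-water mark for such hours.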

Comment thread src/lib/public-status/hourly-rollups.ts Outdated
@@ -0,0 +1,646 @@
import { and, asc, gte, lt, lte, sql } from "drizzle-orm";

P2 The configVersion filter uses a raw sql template tag instead of Drizzle's eq() helper, which is the conventional pattern for equality conditions in this codebase. Both produce parameterized queries so there is no functional difference, but using eq() is more consistent.

Suggested change (src/lib/public-status/hourly-rollups.ts, line 1)
- import { and, asc, gte, lt, lte, sql } from "drizzle-orm";
+ import { and, asc, eq, gte, lt, lte, sql } from "drizzle-orm";

Comment thread src/lib/public-status/hourly-rollups.ts Outdated
lt(publicStatusHourlyRollups.bucketStart, input.end),
];
if (input.configVersion) {
conditions.push(sql`${publicStatusHourlyRollups.configVersion} = ${input.configVersion}`);

P2 The configVersion equality condition is expressed via a raw sql template rather than the eq() helper. Using eq() is consistent with the rest of the filter conditions.

Suggested change (src/lib/public-status/hourly-rollups.ts, line 351)
- conditions.push(sql`${publicStatusHourlyRollups.configVersion} = ${input.configVersion}`);
+ conditions.push(eq(publicStatusHourlyRollups.configVersion, input.configVersion));

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/public-status/rebuild-worker.ts (1)

274-326: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

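Fix the semantic mismatch in the hourStart argument

Although buildAndPersistPublicStatusHourlyRollup internally calls alignHourStartUtc(input.hourStart) to align the argument (line 289), the cursorMs values the caller passes in the loop (e.g. 09:15Z) are not explicitly aligned to the top of the hour. This is semantically confusing: the parameter name hourStart implies a top-of-hour time, but the value actually passed is a non-aligned time derived from the intervalMinutes boundary.

The internal alignment prevents data corruption, but the caller should still pass an explicitly hour-aligned value to honor the parameter's contract:

Suggested fix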
  const manifestCoveredFrom = new Date(
    Date.parse(coveredTo) - input.rangeHours * 60 * 60 * 1000
  ).toISOString();
-  const rollupCoveredFrom = new Date(
-    Date.parse(coveredTo) - ROLLUP_HISTORY_HOURS * 60 * 60 * 1000
-  ).toISOString();
+  const currentHourStart = alignHourStartUtc(now);
+  const rollupCoveredFromMs = currentHourStart.getTime() - ROLLUP_HISTORY_HOURS * 60 * 60 * 1000;
        const finalizedHourStarts: Date[] = [];
        for (
-          let cursorMs = Date.parse(rollupCoveredFrom);
-          cursorMs < Date.parse(coveredTo);
+          let cursorMs = rollupCoveredFromMs;
+          cursorMs < currentHourStart.getTime();
           cursorMs += 60 * 60 * 1000
        ) {
          finalizedHourStarts.push(new Date(cursorMs));
        }
        }

-        const currentHourStart = alignHourStartUtc(now);
         const currentHourRequests = await queryPublicStatusRequests({

In addition, consider adding a test case to tests/unit/public-status/rebuild-worker.test.ts with intervalMinutes=15 and a non-hour-aligned now (e.g. 2026-04-21T10:17:00.000Z) to cover this edge case.
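
The alignment this comment asks for can be sketched in isolation as follows. `alignToUtcHour` is a hypothetical stand-in for alignHourStartUtc (whose real implementation is not shown here), and `listFinalizedHourStarts` is an illustrative helper, not the PR's code.

```typescript
const HOUR_IN_MS = 60 * 60 * 1000;

// Hypothetical stand-in for alignHourStartUtc: truncates to the top of the UTC hour.
function alignToUtcHour(instant: Date): Date {
  return new Date(Math.floor(instant.getTime() / HOUR_IN_MS) * HOUR_IN_MS);
}

// Finalized hours are the historyHours full hours strictly before the
// current (incomplete) hour, each one aligned to the top of the hour.
function listFinalizedHourStarts(now: Date, historyHours: number): Date[] {
  const currentHourStartMs = alignToUtcHour(now).getTime();
  const hourStarts: Date[] = [];
  for (let offset = historyHours; offset >= 1; offset -= 1) {
    hourStarts.push(new Date(currentHourStartMs - offset * HOUR_IN_MS));
  }
  return hourStarts;
}
```

Because both loop bounds derive from the aligned current hour, every value handed to the rollup builder is a top-of-hour instant even when now is 10:17Z.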

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

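In `@src/lib/public-status/rebuild-worker.ts` around lines 274 - 326, the cursorMs
used to build finalizedHourStarts in the loop may not be hour-aligned (e.g.
09:15Z), which makes the meaning of the hourStart passed to
buildAndPersistPublicStatusHourlyRollup ambiguous; although that function calls
alignHourStartUtc internally, the caller should pass an explicitly aligned
top-of-hour time to keep the semantics consistent. Fix: when building
finalizedHourStarts, align each hourStart with alignHourStartUtc(new
Date(cursorMs)) (or an equivalent) instead of using new Date(cursorMs) directly,
and keep calling buildAndPersistPublicStatusHourlyRollup in batches via
chunkArray as before; also add a unit test
(tests/unit/public-status/rebuild-worker.test.ts) covering intervalMinutes=15
with a non-hour-aligned now (e.g. 2026-04-21T10:17:00.000Z) to verify the behavior.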
🧹 Nitpick comments (1)
src/lib/public-status/rebuild-worker.ts (1)

309-313: 💤 Low value

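Date.parse(coveredTo) is recomputed on every iteration of the loop condition

Date.parse(coveredTo) re-parses the string on every iteration. With an upper bound of only 720 iterations this is hardly a bottleneck, but hoisting the parsed value out of the loop makes the intent clearer and also makes it easier to align with currentHourStart later, as suggested above:

Suggested optimization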
-        for (
-          let cursorMs = Date.parse(rollupCoveredFrom);
-          cursorMs < Date.parse(coveredTo);
-          cursorMs += 60 * 60 * 1000
-        ) {
+        const rollupFromMs = Date.parse(rollupCoveredFrom);
+        const rollupToMs = Date.parse(coveredTo);
+        for (let cursorMs = rollupFromMs; cursorMs < rollupToMs; cursorMs += 60 * 60 * 1000) {
           finalizedHourStarts.push(new Date(cursorMs));
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/public-status/rebuild-worker.ts` around lines 309 - 313, The loop
repeatedly calls Date.parse(coveredTo) each iteration; compute and reuse a
parsed value (e.g., const coveredToMs = Date.parse(coveredTo)) before the for
loop and change the loop condition to use coveredToMs instead of
Date.parse(coveredTo). Also consider aligning the loop bounds to the hourly
boundary used elsewhere by using the existing currentHourStart (or deriving an
aligned boundary) for the upper limit so cursorMs and currentHourStart are on
the same hour boundaries.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/public-status/hourly-rollups.ts`:
- Around line 52-69: The median function returns inconsistent precision: the
even- count branch rounds to 4 decimal places while the odd- count branch
returns the raw number; update median (function median) to normalize precision
for both cases (e.g., round the odd-case value to 4 decimals the same way as the
even case or use a shared roundTo4 helper) so TPS values (from
computeTokensPerSecond / tps) are stored with consistent precision.

In `@src/lib/public-status/rebuild-worker.ts`:
- Around line 308-326: The loop currently rebuilds every hourly bucket in the
retention window by generating finalizedHourStarts and upserting all of them,
causing heavy DB load; change the logic in rebuild-worker.ts so instead of
iterating Date.parse(rollupCoveredFrom) .. coveredTo you compute only the
most-recent finalized hours plus the current hour (e.g., RECENT_FINALIZED_HOURS
= 2..6), by calculating finalizedFromMs = currentHourStartMs -
RECENT_FINALIZED_HOURS * 60*60*1000 and iterating from finalizedFromMs to
currentHourStartMs (and include current hour if desired), then feed those hour
starts to chunkArray and call buildAndPersistPublicStatusHourlyRollup as before
(keep ROLLUP_WRITE_BATCH_SIZE batching). Also update the unit test that asserts
30*24 calls (tests/unit/public-status/rebuild-worker.test.ts) to match the new
RECENT_FINALIZED_HOURS behavior or add a separate backfill-only code path for
full 30-day runs.
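
For the median precision finding in the inline comments above, a shared rounding helper is one possible shape. This is a sketch under assumptions: `roundTo4` and `medianRounded` are hypothetical names, and the empty-input behavior of the real median function is not shown in this excerpt.

```typescript
// Rounds to four decimal places, the precision the review attributes to the even-count branch.
function roundTo4(value: number): number {
  return Math.round(value * 10000) / 10000;
}

// Median that applies the same rounding in both the odd and even branches.
// Returning null for empty input is an assumption made for this sketch.
function medianRounded(values: readonly number[]): number | null {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  const raw = sorted.length % 2 === 1 ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2;
  return roundTo4(raw);
}
```

Routing both branches through one helper means a later precision change only has to be made in one place.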

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b9fe64d6-c5bf-485d-ba21-1affb01966c6

📥 Commits

Reviewing files that changed from the base of the PR and between 0c58703 and 60eb8bd.

📒 Files selected for processing (12)
  • drizzle/0103_public_status_hourly_rollups.sql
  • drizzle/meta/0103_snapshot.json
  • drizzle/meta/_journal.json
  • src/app/api/public-status/route.ts
  • src/drizzle/schema.ts
  • src/lib/public-status/hourly-rollups.ts
  • src/lib/public-status/read-store.ts
  • src/lib/public-status/rebuild-worker.ts
  • tests/integration/public-status/route-redis-only.test.ts
  • tests/unit/public-status/hourly-rollups.test.ts
  • tests/unit/public-status/read-store.test.ts
  • tests/unit/public-status/rebuild-worker.test.ts

Comment thread src/lib/public-status/hourly-rollups.ts
Comment thread src/lib/public-status/rebuild-worker.ts Outdated
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
tests/unit/public-status/rebuild-worker.test.ts (2)

175-178: ⚡ Quick win

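The positive call path of upsertPublicStatusHourlyRollups (non-empty rows) lacks coverage here

The current tests verify the "empty data does not trigger an upsert" optimization by making mockBuildPublicStatusHourlyRollupsFromRequests return [] throughout (lines 175 and 498), but they never verify that rebuild-worker actually calls upsertPublicStatusHourlyRollups when the build function returns non-empty rows. The core goal of this PR is to persist rollup data to PostgreSQL, yet the only code path that writes to the DB is not covered by any rebuild-worker-level test case.

Consider adding a test case in which mockBuildPublicStatusHourlyRollupsFromRequests returns non-empty rows for certain hours, and assert that mockUpsertPublicStatusHourlyRollups is called with the correct rows, for example: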

+  it("calls upsertPublicStatusHourlyRollups with non-empty rollup rows during rebuild", async () => {
+    const mod = await importRebuildWorkerModule();
+
+    const fakeRow = { configVersion: "cfg-1", hourStart: "2026-04-21T09:00:00.000Z" };
+    mockBuildPublicStatusHourlyRollupsFromRequests.mockImplementation(({ hourStart }) => {
+      const h = hourStart instanceof Date ? hourStart : new Date(hourStart);
+      return h.getUTCHours() === 9 ? [fakeRow] : [];
+    });
+
+    mockReadCurrentInternalPublicStatusConfigSnapshot.mockResolvedValue({
+      configVersion: "cfg-1",
+      generatedAt: "2026-04-21T10:00:00.000Z",
+      siteTitle: "Claude Code Hub Status",
+      siteDescription: "Request-derived public status",
+      defaultIntervalMinutes: 5,
+      defaultRangeHours: 24,
+      groups: [],
+    });
+    mockQueryPublicStatusRequests.mockResolvedValue([]);
+    mockRedisSet.mockResolvedValue("OK");
+
+    await mod.rebuildPublicStatusProjection({
+      intervalMinutes: 5,
+      rangeHours: 24,
+      now: new Date("2026-04-21T10:02:00.000Z"),
+    });
+
+    expect(mockUpsertPublicStatusHourlyRollups).toHaveBeenCalledWith(
+      expect.arrayContaining([fakeRow])
+    );
+  });

Also applies to: 485-498

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/public-status/rebuild-worker.test.ts` around lines 175 - 178, Add
a positive-path unit test in rebuild-worker.test.ts that configures
mockBuildPublicStatusHourlyRollupsFromRequests to return non-empty rows for one
or more hours and then asserts that mockUpsertPublicStatusHourlyRollups is
called exactly once (or expected times) with those same rows; make sure to also
assert any related calls (e.g., mockWriteCurrentHourPublicStatusSummary or
mockCleanupPublicStatusHourlyRollups) behave as expected in this scenario, and
restore/clear mocks between tests so the new case doesn't interfere with the
existing empty-row test that returns [].

102-114: 💤 Low value

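alignHourStartUtc is re-implemented in the mock, which risks drifting from the real logic

Lines 104–109 duplicate the full logic of alignHourStartUtc inside the mock. If the real implementation in src/lib/public-status/hourly-rollups.ts changes later (for example its time zone handling or edge cases), the mock will not pick that up, and the test assertions will silently validate the wrong behavior.

Consider replacing the mocked alignHourStartUtc with a pass-through to the real module (a partial mock that keeps the real function), or at least adding a comment stating that it is meant to stay in sync with the real implementation, to reduce the maintenance burden.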

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/public-status/rebuild-worker.test.ts` around lines 102 - 114, The
mock reimplements alignHourStartUtc which risks drifting from the real
implementation; change the mock to reuse the real alignHourStartUtc instead of
copying logic (e.g., in your vi.doMock return, call vi.importActual to obtain
the real module and pass through alignHourStartUtc or only partially mock other
exports), or if you must keep a local stub, add a clear comment above
alignHourStartUtc stating it must mirror src/lib/public-status/hourly-rollups.ts
and include a test to detect divergence; reference the mocked function name
alignHourStartUtc and the mocked module
buildPublicStatusHourlyRollupsFromRequests/cleanupPublicStatusHourlyRollups/upsertPublicStatusHourlyRollups/writeCurrentHourPublicStatusSummary
so you only override those while preserving the real alignHourStartUtc.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d16ac930-14a7-4386-a2ee-766b4c937364

📥 Commits

Reviewing files that changed from the base of the PR and between 60eb8bd and ced80b6.

📒 Files selected for processing (5)
  • src/drizzle/schema.ts
  • src/lib/public-status/hourly-rollups.ts
  • src/lib/public-status/rebuild-worker.ts
  • tests/unit/public-status/hourly-rollups.test.ts
  • tests/unit/public-status/rebuild-worker.test.ts
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/public-status/hourly-rollups.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/public-status/hourly-rollups.ts
  • src/lib/public-status/rebuild-worker.ts

@dofastted dofastted force-pushed the fix/public-status-rollup-storage branch from ced80b6 to c71aeb3 Compare May 5, 2026 08:36
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c71aeb3c4a

}));

return {
rebuildState: input.rows.length > 0 ? "fresh" : "no-data",

P1: Mark stale DB rollup responses before returning

When any historical row exists, this function always emits rebuildState: "fresh", so readPublicStatusPayload returns early and never triggers a rebuild hint even if the current-hour cache is missing or the newest rollup is old. In practice, if the rebuild worker stalls (or current-hour Redis TTL expires), clients can receive hours-old data as ready/fresh indefinitely instead of stale/rebuilding, which hides data freshness regressions and suppresses recovery signaling.
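
A freshness check along the lines this comment suggests might look like the sketch below. The rule itself (a maximum age plus a current-hour cache check) and the names `RebuildState` and `resolveRebuildState` are assumptions made for illustration; the PR's actual state values beyond "fresh" and "no-data" are not shown in this excerpt.

```typescript
type RebuildState = "fresh" | "stale" | "no-data";

// Hypothetical freshness rule: data counts as fresh only if the current-hour
// cache is present and the newest finalized bucket is no older than maxAgeMs.
function resolveRebuildState(
  newestBucketEnd: Date | null,
  hasCurrentHourCache: boolean,
  now: Date,
  maxAgeMs: number
): RebuildState {
  if (newestBucketEnd === null) return "no-data";
  const ageMs = now.getTime() - newestBucketEnd.getTime();
  if (!hasCurrentHourCache || ageMs > maxAgeMs) return "stale";
  return "fresh";
}
```

Returning "stale" rather than "fresh" in these cases would let the read path emit a rebuild hint instead of returning early.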

Labels

area:core · area:statistics · bug (Something isn't working) · size/XL (Extra Large PR, > 1000 lines)

Projects

Status: Backlog

1 participant