fix(dashboard): replay sessions closed by this process on SSE connect, fixing restore-time zombie rows that go missing from the dashboard or linger as stale-active #281
Open
deepcoldy wants to merge 1 commit into
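…ctive #277 follow-up (found in round three of the Codex second review). #277 made /api/events subscribe first on connect and then replay the currently active sessions, which deterministically fixed rows that are "still active after restore"; but it only iterates the active Map. A restore-time zombie (a session whose backing pane died on its own) goes through "announceSessionRow first, then immediately closeSession, which deletes it from the Map". Both events precede the SSE subscription of a dashboard that is reconnecting inside the descriptor→restore window, so both are lost. By the time SSE connects, the zombie is no longer in the active Map, and the active-only replay cannot supply it (it is a closed row at that point). A newly created aggregator never sees it; if the dashboard had previously cached it as active, hydrateSessions only upserts and does not delete absent rows, so a stale active entry remains (a dead session shown as alive). Fix: after replaying active sessions, snapshot-on-connect also replays closed sessions with "closedAt ≥ process start time" (composeRowFromClosed, emitted as session.spawned). Filtering by process start time means only closures from this run are backfilled (including restore-time zombies), not the entire closed history (which the GET /api/sessions hydrate already provides). session.spawned is used because the target row may be unknown; both ends upsert by sessionId, and the closed row's status overwrites any stale active entry. Adds a regression test.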
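The client-side reasoning in the commit message (hydrate only upserts, an update for an unknown row is dropped, a spawned event upserts by sessionId) can be illustrated with a minimal sketch. The `SessionRow`, `DashboardEvent`, and `SessionStore` names below are hypothetical stand-ins for illustration, not the project's actual aggregator or frontend store.

```typescript
// Hypothetical row shape; the real dashboard rows likely carry more fields.
type SessionRow = {
  sessionId: string;
  status: "active" | "closed";
  closedAt?: number;
};

// Hypothetical event shapes modelled on the two event names in the PR text.
type DashboardEvent =
  | { type: "session.spawned"; row: SessionRow }
  | { type: "session.update"; sessionId: string; patch: Partial<Omit<SessionRow, "sessionId">> };

class SessionStore {
  private rows = new Map<string, SessionRow>();

  // Assumed hydrate behaviour: upsert only, never delete rows absent from the payload.
  hydrate(rows: SessionRow[]): void {
    for (const row of rows) this.rows.set(row.sessionId, row);
  }

  apply(event: DashboardEvent): void {
    if (event.type === "session.spawned") {
      // Upsert by sessionId: creates unknown rows and overwrites known ones.
      this.rows.set(event.row.sessionId, event.row);
      return;
    }
    const existing = this.rows.get(event.sessionId);
    if (!existing) return; // an update for an unknown row is dropped
    this.rows.set(event.sessionId, { ...existing, ...event.patch });
  }

  get(sessionId: string): SessionRow | undefined {
    return this.rows.get(sessionId);
  }
}

// Case 1: a dashboard that cached the zombie as active before the daemon restart.
const stale = new SessionStore();
stale.hydrate([{ sessionId: "zombie-1", status: "active" }]);
stale.hydrate([]); // hydrate after reconnect does not mention the zombie
const staleBefore = stale.get("zombie-1")?.status; // still the stale active row
stale.apply({
  type: "session.spawned",
  row: { sessionId: "zombie-1", status: "closed", closedAt: 1_700_000_000_000 },
});
const staleAfter = stale.get("zombie-1")?.status; // overwritten by the closed row

// Case 2: a fresh store that never saw the row drops an update for it.
const fresh = new SessionStore();
fresh.apply({ type: "session.update", sessionId: "zombie-1", patch: { status: "closed" } });
const freshAfterUpdate = fresh.get("zombie-1");
```

Under these assumptions, replaying the closed row as a spawned event covers both the fresh client (row created) and the stale client (row overwritten), which an update event alone would not.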
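Background
Follow-up to #277 (merged), addressing a residual timing hole found in round three of the Codex second review.
#277 made GET /api/events subscribe first on connect and then replay a snapshot of the currently active sessions, which deterministically fixed "restored sessions that are still active after a daemon restart disappear from the dashboard". That replay only iterates the active Map, however, and does not cover zombie sessions that are closed during restore.
Problem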
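During restore, a zombie session (one whose backing pane died on its own) is handled by restoreActiveSessions() as follows: it is registered in the active Map → announceSessionRow() broadcasts session.spawned → the probe immediately reports missing → closeSession() runs (deleting it from the active Map and marking the store row closed).
If a dashboard happens to reconnect inside the daemon's window of "publish the discovery descriptor (before restore) → restore completes":
- The GET /api/sessions hydrate happens before restore: the active list it returns is empty, and the zombie still has status=active at that moment, so it is not among the closed rows either → the dashboard gets nothing for it;
- Both events fire before that dashboard's SSE subscription, and DashboardEventBus has no buffer, so both are lost;
Result: a newly created aggregator never sees the session; if that dashboard process had previously cached it as active, hydrateSessions() only upserts and does not delete absent rows → a stale active entry remains (a dead session shown as alive on the dashboard).
Changes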
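After replaying active sessions, the snapshot-on-connect of /api/events also replays closed sessions whose closedAt ≥ the process start time (composeRowFromClosed, emitted as session.spawned):
- Filtering by process start time backfills only closures from this run (including restore-time zombies) and does not replay the entire closed history, which the GET /api/sessions hydrate already provides.
- session.spawned (rather than session.update) is used: the target row may be unknown to that client, and an update would be dropped as an unknown row; both ends (aggregator / frontend store) upsert by sessionId, so the closed row's status:'closed' overwrites any stale active entry, which fixes stale-active as well.
- Sessions already in activeIds are skipped to avoid duplicating active sessions.
Verification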
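- pnpm exec tsc --noEmit is green.
- Regression test: /api/events still receives the closed session's session.spawned (status==='closed', closedAt is a number).
- pnpm test: 4946 passed, 22 failed; all failures are known env failures (workflow-cli* / whiteboard-cli / preset-export-cli / workflow-c0-isolation, CLI subprocess tests, unrelated to this PR).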
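For readers who want the replay rule from the Changes section in concrete form, here is a minimal sketch of the selection logic only. It assumes hypothetical `SnapshotRow` and `ClosedRecord` shapes and a hypothetical `snapshotOnConnect` helper; the actual handler, composeRowFromClosed, and the store API in this PR may differ.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
type SnapshotRow = { sessionId: string; status: "active" | "closed"; closedAt?: number };
type SpawnedEvent = { type: "session.spawned"; row: SnapshotRow };
type ClosedRecord = { sessionId: string; closedAt: number };

// Builds the events replayed to a client right after it has subscribed.
function snapshotOnConnect(
  active: Map<string, SnapshotRow>,
  closed: ClosedRecord[],
  processStartedAt: number,
): SpawnedEvent[] {
  const events: SpawnedEvent[] = [];
  const activeIds = new Set<string>();

  // Step 1: replay every currently active session (the #277 behaviour).
  for (const row of active.values()) {
    activeIds.add(row.sessionId);
    events.push({ type: "session.spawned", row });
  }

  // Step 2: replay sessions closed by this run, skipping older history
  // and anything already replayed as active.
  for (const rec of closed) {
    if (rec.closedAt < processStartedAt) continue; // left to the hydrate path
    if (activeIds.has(rec.sessionId)) continue; // avoid duplicating an active row
    events.push({
      type: "session.spawned",
      row: { sessionId: rec.sessionId, status: "closed", closedAt: rec.closedAt },
    });
  }
  return events;
}

// Usage with made-up timestamps (milliseconds are assumed, not confirmed by the PR).
const processStartedAt = 1_000;
const activeSessions = new Map<string, SnapshotRow>([
  ["live-1", { sessionId: "live-1", status: "active" }],
]);
const closedRecords: ClosedRecord[] = [
  { sessionId: "old-1", closedAt: 500 }, // closed by a previous run
  { sessionId: "zombie-1", closedAt: 1_200 }, // closed during this run's restore
  { sessionId: "live-1", closedAt: 1_100 }, // hypothetical overlap with an active id
];
const replayed = snapshotOnConnect(activeSessions, closedRecords, processStartedAt);
const replayedIds = replayed.map((e) => `${e.row.sessionId}:${e.row.status}`);
```

In this sketch only live-1 (active) and zombie-1 (closed) are replayed; old-1 is left to the hydrate path and the overlapping closed record is skipped.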