fix(server): resync a lagged consumer in-band instead of silently dropping VT #134
Open
phall1 wants to merge 1 commit into
…pping VT

The last known corruption path from the TUI diagnosis (Rank 1). Each attached client has a per-pane output pump that forwards the actor's `broadcast` of VT chunks. The broadcast buffer is bounded (DEFAULT_OUTPUT_BROADCAST = 256); under a sustained output burst (a full-screen TUI repainting on attach/split, a flood of logs) a pump that falls behind gets `RecvError::Lagged(n)`: the n chunks are gone. The arm only `warn!`ed, so the client's libghostty mirror stayed permanently diverged until some unrelated resize/reattach happened to resync. Silent, unrecoverable screen corruption on exactly the bursty events users hit.

Fix: on `Lagged`, the pump asks the actor to broadcast an in-band resync: a full grid snapshot (`PaneOutput::Resync`) on the *same* ordered broadcast channel. Because it rides the same channel, it lands in the pump's receiver after the post-lag tail and cleanly supersedes the gap: no double-apply (the snapshot is a full-grid reset, not an additive delta) and no lost output (unlike a point-to-point snapshot, which would race the buffered deltas).

Mechanism: a new `ResizeRequest::resync_only` flag tells the actor to skip the resize and only run its existing debounced resync broadcast (`broadcast_resync`, renamed from `broadcast_resync_after_resize` since it now serves both callers). The debounce coalesces a burst of lag events into one snapshot. `try_send` failure is benign (a resync is already queued, or the actor is gone).

Tested: new `resync_only_request_rebroadcasts_snapshot_without_resizing` asserts the snapshot is re-broadcast and the grid size is unchanged (the ignored 0x0 geometry must not take effect).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The last known corruption path from the TUI diagnosis (Rank 1). Closes the "stays mangled under heavy output" case.
Root cause
Each attached client has a per-pane output pump forwarding the actor's `broadcast` of VT chunks. The broadcast buffer is bounded (`DEFAULT_OUTPUT_BROADCAST = 256`); under a sustained output burst (a full-screen TUI repainting on attach/split, a log flood) a pump that falls behind gets `RecvError::Lagged(n)`, and those `n` chunks are gone. The arm only `warn!`ed, so the client's libghostty mirror stayed permanently diverged until some unrelated resize/reattach happened to resync.
Fix
On `Lagged`, the pump asks the actor to broadcast an in-band resync: a full grid snapshot (`PaneOutput::Resync`) on the same ordered broadcast channel. Because it rides the same channel, it lands after the post-lag tail and cleanly supersedes the gap: no double-apply (the snapshot is a full-grid reset, not an additive delta) and no lost output (unlike a point-to-point snapshot, which would race the buffered deltas).
Mechanism: a new `ResizeRequest::resync_only` flag tells the actor to skip the resize and only run its existing debounced resync broadcast (`broadcast_resync`, renamed from `broadcast_resync_after_resize` since it now serves both resize and lag recovery). The debounce coalesces a burst of lag events into one snapshot. `try_send` failure is benign (a resync is already queued, or the actor is gone).
Relationship to #132
#132 made a dropped-byte divergence self-heal on the next resync; this triggers that resync immediately on the drop instead of waiting for an unrelated event. Together they close the path.
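
For reviewers who want to see why an in-band snapshot converges, here is a minimal std-only sketch of the lag-then-resync flow. It is not the PR's code: `Ring`, `Recv`, `Actor`, and `run_burst` are hypothetical stand-ins (a single-receiver ring instead of the real broadcast channel, a `String` instead of the grid), and the resync is flushed as soon as the ring drains rather than on a debounce timer. With `recover = false` it mimics the old `warn!`-only arm.

```rust
use std::collections::VecDeque;

/// Messages on the simulated per-pane output channel.
enum PaneOutput {
    Delta(char),    // additive chunk: appended to the client mirror
    Resync(String), // full snapshot: replaces the client mirror
}

/// Result of one receive attempt, loosely modeled on a lossy broadcast receiver.
enum Recv {
    Item(PaneOutput),
    Lagged(usize), // this many messages were overwritten before being read
    Empty,
}

/// Hypothetical single-receiver ring standing in for the bounded broadcast buffer.
struct Ring {
    cap: usize, // assumed to be at least 1
    buf: VecDeque<PaneOutput>,
    dropped: usize,
}

impl Ring {
    fn new(cap: usize) -> Self {
        Ring { cap, buf: VecDeque::new(), dropped: 0 }
    }

    /// Sending never blocks: when full, the oldest unread message is discarded.
    fn send(&mut self, msg: PaneOutput) {
        if self.buf.len() == self.cap {
            self.buf.pop_front();
            self.dropped += 1;
        }
        self.buf.push_back(msg);
    }

    /// Reports the gap first, then yields the surviving tail in order.
    fn recv(&mut self) -> Recv {
        if self.dropped > 0 {
            return Recv::Lagged(std::mem::take(&mut self.dropped));
        }
        match self.buf.pop_front() {
            Some(msg) => Recv::Item(msg),
            None => Recv::Empty,
        }
    }
}

/// Hypothetical actor: owns the authoritative grid and a coalescing resync flag.
struct Actor {
    grid: String,
    resync_pending: bool,
}

impl Actor {
    fn write(&mut self, ring: &mut Ring, c: char) {
        self.grid.push(c);
        ring.send(PaneOutput::Delta(c));
    }

    /// The snapshot goes out on the same ring as the deltas, so it is ordered after them.
    fn flush_resync(&mut self, ring: &mut Ring) -> bool {
        let due = std::mem::take(&mut self.resync_pending);
        if due {
            ring.send(PaneOutput::Resync(self.grid.clone()));
        }
        due
    }
}

/// Feeds `input` through a ring of size `cap`, then drains it into a mirror.
/// Returns (authoritative grid, client mirror, chunks lost to lag).
fn run_burst(cap: usize, input: &str, recover: bool) -> (String, String, usize) {
    let mut ring = Ring::new(cap);
    let mut actor = Actor { grid: String::new(), resync_pending: false };
    let (mut mirror, mut lost) = (String::new(), 0);
    for c in input.chars() {
        actor.write(&mut ring, c);
    }
    loop {
        match ring.recv() {
            Recv::Item(PaneOutput::Delta(c)) => mirror.push(c),
            Recv::Item(PaneOutput::Resync(snap)) => mirror = snap,
            Recv::Lagged(n) => {
                lost += n;
                // Old behavior: only log. New behavior: ask for an in-band resync.
                actor.resync_pending = recover;
            }
            Recv::Empty => {
                if !actor.flush_resync(&mut ring) {
                    break;
                }
            }
        }
    }
    (actor.grid, mirror, lost)
}
```

Calling `run_burst(4, "abcdefgh", false)` leaves the mirror at `"efgh"` while the grid is `"abcdefgh"`; with `recover = true` the mirror first applies the surviving tail and is then overwritten by the snapshot, so both end up equal. Applying the tail before the snapshot is harmless here precisely because the snapshot is a full reset rather than a delta.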
Testing
`resync_only_request_rebroadcasts_snapshot_without_resizing`: the snapshot is re-broadcast and the grid size stays unchanged (the ignored 0×0 geometry must not take effect). `just ci` green.
🤖 Generated with Claude Code
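
The property the test checks, together with the coalescing described under Mechanism, can be illustrated with a small std-only model. Everything here is an assumption for illustration: only the `resync_only` flag name comes from the PR, while the `cols`/`rows` fields, `PaneActor`, the 50 ms `DEBOUNCE` window, and the "arm once, ignore until fired" debounce policy are hypothetical and may differ from the real actor.

```rust
use std::time::{Duration, Instant};

/// Assumed debounce window; the real value is not stated in the PR.
const DEBOUNCE: Duration = Duration::from_millis(50);

/// Hypothetical shape of the request; only `resync_only` is named in the PR.
struct ResizeRequest {
    cols: u16,
    rows: u16,
    resync_only: bool,
}

/// Hypothetical actor state reduced to what the flag touches.
struct PaneActor {
    cols: u16,
    rows: u16,
    resync_due: Option<Instant>,
    snapshots_sent: usize,
}

impl PaneActor {
    fn handle(&mut self, req: &ResizeRequest, now: Instant) {
        // A resync-only request carries placeholder geometry that must be ignored.
        if !req.resync_only {
            self.cols = req.cols;
            self.rows = req.rows;
        }
        // Arm the timer once; requests arriving while it is armed are coalesced.
        if self.resync_due.is_none() {
            self.resync_due = Some(now + DEBOUNCE);
        }
    }

    fn tick(&mut self, now: Instant) {
        if let Some(due) = self.resync_due {
            if now >= due {
                self.resync_due = None;
                self.snapshots_sent += 1; // stands in for the snapshot broadcast
            }
        }
    }
}

/// Three lag-triggered requests inside one window, then one tick past the deadline.
/// Returns (cols, rows, snapshots sent).
fn lag_burst_demo() -> (u16, u16, usize) {
    let mut actor = PaneActor { cols: 80, rows: 24, resync_due: None, snapshots_sent: 0 };
    let start = Instant::now();
    let lag = ResizeRequest { cols: 0, rows: 0, resync_only: true };
    for i in 0..3 {
        actor.handle(&lag, start + Duration::from_millis(i));
    }
    actor.tick(start + DEBOUNCE + Duration::from_millis(1));
    (actor.cols, actor.rows, actor.snapshots_sent)
}
```

In this model `lag_burst_demo()` yields an unchanged 80×24 grid and a single snapshot for the three requests, which is the same shape of assertion the PR's test makes.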