shift (module 2): work-queue runner + live dashboard/control + token & runtime visibility#2

Merged
AllenBW merged 12 commits into main from shift-watch
Jun 16, 2026

@AllenBW AllenBW commented Jun 16, 2026

Owner

Closes the candor gap in the headless runner: a run was a black box while it ran. shift watch is a zero-dependency live TUI over .shift/ with two-way control.

Built autonomously while the author was away (this branch + a reviewable trail is itself the shift use case).

Visibility

A dashboard redraws on an interval: progress bar (done/total), every bin with its status (done · current · pending · skipped · blocked), elapsed time, decision-log tail, and a "Needs you" count.

Control (a status bar can't take input, so the TUI does)

File-based signals under .shift/ the engine honors at the next stop:

  • [p] pause/resume — the headless runner idles (still time-boxed)
  • [k] skip current bin — new skipped status; work stays on the branch
  • [q] stop — the existing kill switch
  • [x] close the watcher (run keeps going)
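A minimal sketch of a file-based signal channel like the one described above, assuming each signal is a small marker file named STOP, PAUSE, or SKIP under .shift/ (those names appear later in this thread). The helper names `sendSignal`, `hasSignal`, `clearSignal`, and `togglePause` are hypothetical and are not claimed to be the API of lib/control.cjs.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Signal names taken from the thread; treating them as plain marker files is an assumption.
const SIGNALS = new Set(['STOP', 'PAUSE', 'SKIP']);

function signalPath(shiftDir, name) {
  if (!SIGNALS.has(name)) throw new Error(`unknown signal: ${name}`);
  return path.join(shiftDir, name);
}

// Writer side (the watcher): raise a signal, optionally with a payload such as a bin id.
function sendSignal(shiftDir, name, payload = '') {
  fs.mkdirSync(shiftDir, { recursive: true });
  fs.writeFileSync(signalPath(shiftDir, name), String(payload));
}

// Reader side (hook or runner): check whether a signal is currently raised.
function hasSignal(shiftDir, name) {
  return fs.existsSync(signalPath(shiftDir, name));
}

// Lower a signal; force: true makes this a no-op when the file is already gone.
function clearSignal(shiftDir, name) {
  fs.rmSync(signalPath(shiftDir, name), { force: true });
}

// [p] toggles: the first press raises PAUSE, the second lowers it.
// Returns whether the run is paused after the toggle.
function togglePause(shiftDir) {
  if (hasSignal(shiftDir, 'PAUSE')) clearSignal(shiftDir, 'PAUSE');
  else sendSignal(shiftDir, 'PAUSE');
  return hasSignal(shiftDir, 'PAUSE');
}
```

Because the channel is just files, the writer and the readers need no shared process or socket, which fits the "honored at the next stop" model: a reader only has to look when it reaches a boundary.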

Module-1 tie-in

shift status --line prints ⚙ shift 2/5 · 18m · ⚑1 for a ccstatusline custom-command widget, surfacing shift where you already look.
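As an illustration of how such a one-liner could be produced by a pure function, here is a sketch. The model fields (`done`, `total`, `elapsedMs`, `needsYou`) and the rule of hiding the flag when nothing needs attention are assumptions; only the output shape is taken from the example above.

```javascript
// Pure formatter: takes a plain model object and returns the status-bar string.
// Field names are hypothetical; the real model in watch-model.cjs may differ.
function renderLine({ done, total, elapsedMs, needsYou }) {
  const minutes = Math.floor(elapsedMs / 60000);
  const parts = [`⚙ shift ${done}/${total}`, `${minutes}m`];
  // Assumption: the "needs you" flag is only shown when the count is non-zero.
  if (needsYou > 0) parts.push(`⚑${needsYou}`);
  return parts.join(' · ');
}
```

Keeping the formatter free of I/O is what lets it be tested without a terminal.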

Design

  • lib/control.cjs: the signal channel (writer: watch; readers: hook + runner).
  • lib/watch-model.cjs: buildModel + a pure renderFrame/renderLine, so the dashboard is unit-tested without a TTY.
  • bin/shift watch is a thin TTY shell; hook applies SKIP, run-loop honors PAUSE.

77 shift tests (14 new: control + watch-model + hook-skip + run-loop-pause), all green. Known limitation (SPEC §13): pause/skip apply at bin boundaries, not mid-bin.

Base is shift-v1 so the diff is just this feature.

AllenBW added 7 commits June 13, 2026 20:37
- shift/install.sh wires the Stop hook into ~/.claude/settings.json idempotently
  (backup -> merge -> validate -> atomic move); never duplicates, updates the path
  on repo move, preserves existing hooks/settings.
- shift/lib/install.cjs: pure mergeStopHook() (tested); install.sh is a thin shell.
- shift/test/install.test.cjs: 7 tests (unit merge + live install.sh integration).
- README (root): list shift in the Modules table + candor pointer.
- shift/README: swap manual hook-wiring for the installer; resolve the hook-schema
  caveat (block/reason contract verified against the Claude Code hooks docs).
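The idempotent merge described above might look roughly like the sketch below. It assumes the Claude Code settings shape `hooks.Stop = [{ hooks: [{ type: 'command', command }] }]` and recognises a previous shift entry by the script name `shift-stop.cjs` (a file mentioned later in this thread). The matching rule and the exact signature are assumptions, not the tested implementation in lib/install.cjs.

```javascript
// Marker used to recognise an earlier shift entry (assumption: match on script name).
const HOOK_MARKER = 'shift-stop.cjs';

function isShiftHook(hook) {
  return Boolean(hook) && hook.type === 'command'
    && typeof hook.command === 'string' && hook.command.includes(HOOK_MARKER);
}

// Pure: returns a new settings object and never mutates the input.
function mergeStopHook(settings, hookCommand) {
  const next = JSON.parse(JSON.stringify(settings || {}));
  if (!next.hooks || typeof next.hooks !== 'object') next.hooks = {};
  const groups = Array.isArray(next.hooks.Stop) ? next.hooks.Stop : [];
  const kept = [];
  for (const group of groups) {
    const hooks = group && Array.isArray(group.hooks) ? group.hooks : [];
    const others = hooks.filter((h) => !isShiftHook(h));
    // Groups without a shift entry pass through untouched; a group that held one
    // keeps its other hooks, or is dropped if the shift entry was all it had.
    if (others.length === hooks.length) kept.push(group);
    else if (others.length > 0) kept.push({ ...group, hooks: others });
  }
  // Exactly one shift entry, always with the current path.
  kept.push({ hooks: [{ type: 'command', command: hookCommand }] });
  next.hooks.Stop = kept;
  return next;
}
```

Removing any old entry before appending the new one gives both properties at once: running the installer twice changes nothing, and running it after a repo move replaces the stale path.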
A real `shift run` smoke confirmed headless `claude -p` honors the Stop-hook
block and drives the queue warm (resolves the SPEC §9.2 open question). A
pre-flight audit of the previously-untested runner path drove these fixes:

- No false-green: classifyOutcome returns 'completed' only when the engine
  finalized (summary.md). A code-0 exit without finalize is 'incomplete' — the
  runner resumes if the queue advanced, else stops with a 'is the Stop hook
  wired?' diagnostic. `shift run` grades on summary.md, not the exit line.
- Stale-reset guard: auto-resume stops cleanly when the cached reset time is
  already in the past (was a maxResumes-bounded busy-spin).
- Per-spawn timeout (spawnTimeoutMinutes, default 30) kills a wedged claude so
  spawnSync can't hang the runner; launch failures + kills are surfaced.
- Warn when a headless run uses a Bash-prompting permission mode.
- Dropped a spurious audit suggestion (runner writing state.iterations) that
  would have double-counted the hook's bound tracking.

63 shift tests green (pure unit + hook/CLI/run-loop/install integration).
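The no-false-green rule above can be sketched as two small pure functions. The names `classifyOutcome`, 'completed', and 'incomplete' come from the commit message; the 'failed' label, the `nextAction` helper, and the input shapes are hypothetical, and the real runner likely distinguishes more cases (for example usage caps).

```javascript
// finalized should be true only when the engine wrote summary.md,
// e.g. fs.existsSync(path.join(shiftDir, 'summary.md')).
function classifyOutcome({ exitCode, finalized }) {
  if (finalized) return 'completed';
  // A clean exit without finalize is not success.
  if (exitCode === 0) return 'incomplete';
  return 'failed'; // hypothetical label for non-zero exits
}

// Decide what the runner does next (hypothetical helper).
function nextAction(outcome, { queueAdvanced }) {
  if (outcome === 'completed') return { action: 'done' };
  if (outcome === 'incomplete') {
    return queueAdvanced
      ? { action: 'resume' }
      : { action: 'stop', reason: 'exit 0 without finalize and no progress: is the Stop hook wired?' };
  }
  return { action: 'stop', reason: 'claude exited with an error' };
}
```

Grading on the artifact rather than the exit code means a silently unwired hook shows up as a diagnostic instead of a green run.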
A headless run was a black box while running — good trail after, no visibility
during. `shift watch` is a zero-dep live TUI over .shift/ that closes that gap:

- Dashboard: progress bar, per-bin status (done/current/pending/skipped/blocked),
  elapsed, decision-log tail, Needs-you count. Redraws on an interval.
- Two-way control via a file-based channel the engine honors: [p]ause (runner
  idles, still time-boxed), [k] skip current bin (new 'skipped' status), [q] stop
  (existing kill switch), [x] close watcher.
- lib/control.cjs (signal channel) + lib/watch-model.cjs (buildModel + a PURE
  renderFrame/renderLine, so the dashboard is unit-tested without a TTY).
- bin/shift gains 'watch' and 'status --line' (one-liner for the module-1 status
  bar — surfaces shift where you're already looking).

Engine integration: Stop hook applies SKIP (marks current bin skipped, advances)
and summary now reports skipped; run-loop honors PAUSE between spawns.

77 shift tests green.
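A sketch of how "runner idles, still time-boxed" could work between spawns. All inputs are injected so the loop can be tested with a fake clock; the function name, its parameters, and the return labels are hypothetical. It also checks the stop signal on every poll, which matches the stop-while-paused behaviour described further down this thread.

```javascript
// Polls until the pause is lifted, a stop is requested, or the time box expires.
// isPaused/isStopped: () => boolean, now: () => ms, sleep: (ms) => void (synchronous).
function waitWhilePaused({ isPaused, isStopped, now, sleep, deadline, pollMs = 1000 }) {
  while (isPaused()) {
    if (isStopped()) return 'stopped';       // stop wins over pause
    if (now() >= deadline) return 'timed-out'; // the time box still applies while idle
    sleep(pollMs);
  }
  return 'resumed';
}
```

In a synchronous runner, `sleep` could be implemented with `Atomics.wait` on a small `SharedArrayBuffer`; in tests it can simply advance a counter.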
Two P2s + cheap P3s from the verification pass (no P0/P1; nothing crashed or
corrupted state):

- Terminal hygiene (P2): restore cursor + raw mode on SIGTERM/SIGHUP/exit, not
  just SIGINT/keys; wrap draw() in try/catch; idempotent cleanup; drop Esc as an
  exit key (a split arrow escape sends a lone \x1b).
- Stale SKIP (P2): consume-on-read in the hook — a skip that misses its target is
  discarded, never left to fire on a later bin.
- STOP honored while paused (P3): pause+stop no longer parks until the time box.
- Progress bar fills by resolved bins (done+blocked+skipped), so a finalized run
  shows a full bar instead of ~40% under '● finalized'.
- Atomic state.json write (temp+rename) so a redraw never reads a half-written file.
- Ellipsis on truncated bin ids.
- examples/watch-demo.cjs: zero-cost demo of the dashboard + control flow.

Tests: 79 shift (+2: stop-while-paused, stale-skip-discarded), all green.
Residual P3s documented: [k] no-op feedback when no current bin; narrow-terminal
line wrapping.
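The temp+rename write mentioned above is a standard pattern; a minimal sketch follows. The helper name `writeJsonAtomic` and the temp-file naming are assumptions, not the repo's code.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Write the full payload to a sibling temp file, then rename it over the target.
// A rename within one directory replaces the target in a single step on POSIX
// filesystems, so a concurrent reader sees either the old file or the new one.
function writeJsonAtomic(file, value) {
  const tmp = path.join(path.dirname(file), `.${path.basename(file)}.${process.pid}.tmp`);
  fs.writeFileSync(tmp, JSON.stringify(value, null, 2) + '\n');
  fs.renameSync(tmp, file);
}
```

The temp file lives in the same directory as the target on purpose: a rename across filesystems is not atomic and can fail outright.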
@AllenBW

AllenBW commented Jun 16, 2026

Owner Author

Adversarial verification done — verdict SHIP, fixes landed (15970c6).

A 3-lens review (TUI hygiene / control races / render robustness) with adversarial re-verification found no P0/P1 — nothing crashes the TUI or corrupts run state. Fixed the two P2s + the cheap P3s:

  • Terminal hygiene (P2): cursor + raw mode now restored on SIGTERM/SIGHUP/exit (not just SIGINT/keys); draw() wrapped in try/catch; idempotent cleanup. Esc dropped as an exit key (a split arrow escape sends a lone \x1b).
  • Stale SKIP (P2): consume-on-read in the hook — a skip that misses its target is discarded, never left to fire on a later bin. (This also corrected the demo to model real [k] semantics: you can only skip the current bin.)
  • STOP while paused (P3): pause+stop no longer parks until the time box.
  • Progress bar fills by resolved bins (done+blocked+skipped) — a finalized run shows a full bar, not ~40% under ● finalized.
  • Atomic state.json write (temp+rename) so a redraw never reads a half-written file.

79 tests (+2). Residual P3s documented in SPEC §13: [k] gives no feedback when there's no current bin; narrow-terminal line wrapping. Zero-cost demo: node shift/examples/watch-demo.cjs.
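To illustrate the progress-bar fix, here is a sketch that fills by resolved bins rather than by done bins alone. The status names come from this thread; the function name, bar glyphs, width, and trailing counter are assumptions.

```javascript
// A bin counts as resolved once it is done, blocked, or skipped.
const RESOLVED = new Set(['done', 'blocked', 'skipped']);

function renderProgressBar(bins, width = 20) {
  const total = bins.length;
  const resolved = bins.filter((b) => RESOLVED.has(b.status)).length;
  const filled = total === 0 ? 0 : Math.round((resolved / total) * width);
  return `[${'█'.repeat(filled)}${'░'.repeat(width - filled)}] ${resolved}/${total}`;
}
```

With two done, one blocked, and two skipped bins, a done-only fill would stop at 40% even though the run is finalized; counting resolved bins fills the bar.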

- Tokens (output, the honest 'work' figure — not cache-inflated total) + runtime in
  the dashboard header, status --line, and per-bin columns. Summed from the session
  transcript (transcript_path from the hook payload; usage in message.usage).
- Up/down select a bin, Enter opens a detail view (status, runtime, token breakdown,
  commit, brief), esc back.
- Work record: every finalized run appended to .shift/history.jsonl; 'shift history'
  shows per-run rows + a totals footer; 'shift history <runId>' drills into one run.
- New pure modules: transcript.cjs (window-sum usage), timeline.cjs (append-only bin
  boundaries), history.cjs (ledger append/read/aggregate). Hook attributes per-bin
  runtime+tokens and writes the history record on finalize.
- Brief now tells the agent .shift/ is append-only bookkeeping (never edit state.json).

Known limitation (SPEC §13): per-bin attribution is best-effort in fully-headless runs
— an autonomous agent rewrites/deletes .shift/ mid-run and Claude Code sandboxes hook
writes to the project dir, so the boundary record can't be put out of reach. Run-level
tokens/runtime + the history record (the hook's final write) are authoritative.

96 shift tests, all green.
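A sketch of summing token usage from a transcript, assuming the transcript is JSONL, that usage sits at `message.usage` (as stated above), that the counters use the Anthropic API field names (`input_tokens`, `output_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`), and that each entry carries an ISO `timestamp`. The function name and the window shape are hypothetical.

```javascript
// Sums usage over all entries, or only over those whose timestamp falls in
// [window.from, window.to) when a window (epoch milliseconds) is given.
function sumUsage(jsonl, window = {}) {
  const from = window.from ?? -Infinity;
  const to = window.to ?? Infinity;
  const windowed = from !== -Infinity || to !== Infinity;
  const totals = { input: 0, output: 0, cacheRead: 0, cacheCreation: 0 };
  for (const line of jsonl.split('\n')) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // tolerate a partially written last line
    }
    const usage = entry?.message?.usage;
    if (!usage) continue;
    if (windowed) {
      const t = Date.parse(entry.timestamp);
      if (Number.isNaN(t) || t < from || t >= to) continue;
    }
    totals.input += usage.input_tokens || 0;
    totals.output += usage.output_tokens || 0;
    totals.cacheRead += usage.cache_read_input_tokens || 0;
    totals.cacheCreation += usage.cache_creation_input_tokens || 0;
  }
  return totals;
}
```

Reporting `output` separately from the cache counters is what keeps the headline figure from being inflated by cache reads; a per-bin figure is then just the same sum restricted to that bin's time window.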
@AllenBW

AllenBW commented Jun 16, 2026

Owner Author

Tokens, runtime + a work record added (7f44160).

Per the candor goal of making consumption legible:

  • Output tokens (the honest 'work produced' figure — not cache-inflated total) + runtime in the dashboard header, status --line, and per-bin columns. Summed from the session transcript (the hook gets transcript_path; usage lives in message.usage).
  • Drill-down: ↑/↓ select a bin, Enter opens a detail view (status, runtime, token breakdown in/out/cache, commit, brief), Esc goes back.
  • Work record: every finalized run appends to .shift/history.jsonl; shift history shows per-run rows + a totals footer (all runs, total time, total output tokens); shift history <runId> drills into one run.
  • New pure modules: transcript.cjs, timeline.cjs, history.cjs. 96 tests, all green. See it free: node shift/examples/watch-demo.cjs.
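The work record above is an append-only JSONL ledger; a minimal sketch of append, read, and aggregate follows. The helper names and the record fields (`runId`, `runtimeMs`, `outputTokens`) are hypothetical and not taken from history.cjs.

```javascript
const fs = require('node:fs');

// Append one finalized run as a single JSON line.
function appendRun(file, record) {
  fs.appendFileSync(file, JSON.stringify(record) + '\n');
}

// Read every run back; a missing ledger simply means no runs yet.
function readLedger(file) {
  if (!fs.existsSync(file)) return [];
  return fs.readFileSync(file, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Totals for the footer: run count, total time, total output tokens.
function aggregate(runs) {
  return runs.reduce(
    (acc, run) => ({
      runs: acc.runs + 1,
      runtimeMs: acc.runtimeMs + (run.runtimeMs || 0),
      outputTokens: acc.outputTokens + (run.outputTokens || 0),
    }),
    { runs: 0, runtimeMs: 0, outputTokens: 0 }
  );
}
```

One record per line means a new run never rewrites earlier ones, and a drill-down by run id is a plain filter over `readLedger`.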

Honest limitation I want to flag (SPEC §13). I validated this against real claude -p runs and hit a wall worth recording: per-bin token/runtime attribution is best-effort in fully-headless autonomous mode. Three findings, together, make it unachievable there:

  1. An autonomous agent rewrites/deletes files under .shift/ mid-run (observed: it rewrote state.json + log.md, deleted config.json/timeline.jsonl) — clobbering the hook's per-bin stamps.
  2. Claude Code sandboxes hook file-writes to the project dir, so the boundary record can't be relocated out-of-repo where the agent can't reach it.
  3. The transcript carries no per-bin marker to reconstruct boundaries from.

Run-level tokens/runtime and the history record (written as the hook's final action on finalize, after the agent's last turn) are authoritative regardless. Per-bin columns populate in interactive runs, when the agent leaves .shift/ alone, or in the demo, and show a placeholder otherwise. The brief now tells the agent .shift/ is append-only; making per-bin robust would need an engine-owned store the agent can't reach (a future item).

…(real per-bin fix)

Root cause (verified): a headless autonomous agent rewrites .shift/state.json to mark
bins done itself — bypassing the keep-going engine so the hook never drives the queue
or records per-bin boundaries. A probe hook disproved a sandbox: a Stop hook can write
anywhere (~/.local/state, /tmp, env-provided). So the fix is an engine-owned store
OUTSIDE the repo, where the agent (which only works in the repo) can't reach it.

- lib/store.cjs: engineDir(cwd) = $XDG_STATE_HOME/shift/<sha256(realpath(cwd))>
  (realpath so /tmp == /private/tmp; full-path hash so siblings don't collide;
  SHIFT_STATE_DIR override). mkdir -p.
- state.json, usage.json, history.jsonl, timeline now live in engineDir; the hook +
  bin/shift + watch-model read/write there. config.json stays user-editable in .shift/
  and is snapshotted into engineDir so a deletion can't break a run; summary/log/control
  stay in .shift/. A stray agent-written .shift/state.json is simply ignored.
- Always emit a timeline 'start' per bin (binWindows dedupes) so every bin has a window.

Validated: a real bypassPermissions run now records per-bin runtime+tokens for every
bin (35s/7k, 13s/2k) + a full history row. 99 shift tests green.
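The directory rule above can be sketched as follows. This assumes SHIFT_STATE_DIR replaces the whole `$XDG_STATE_HOME/shift` base and that the XDG fallback is `~/.local/state`; the `env` parameter is added here only to make the sketch testable and is not claimed to be part of lib/store.cjs.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

function engineDir(cwd, env = process.env) {
  // Base precedence (assumed): SHIFT_STATE_DIR, then $XDG_STATE_HOME/shift,
  // then ~/.local/state/shift.
  const base = env.SHIFT_STATE_DIR
    || path.join(env.XDG_STATE_HOME || path.join(os.homedir(), '.local', 'state'), 'shift');
  // realpath first, so symlinked or differently spelled paths to the same
  // directory hash to the same key; hash the full path so siblings differ.
  const key = crypto.createHash('sha256').update(fs.realpathSync(cwd)).digest('hex');
  const dir = path.join(base, key);
  fs.mkdirSync(dir, { recursive: true }); // mkdir -p
  return dir;
}
```

Both the hook and the CLI can call this with their own idea of the working directory and still agree on one location, because the key depends only on the resolved path.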
@AllenBW

AllenBW commented Jun 16, 2026

Owner Author

The real fix landed (5044dae) — per-bin attribution now works headless.

Root cause (verified, not guessed): a headless autonomous agent rewrites .shift/state.json to mark bins done itself, usurping the keep-going engine so the hook never drives the queue or records boundaries. (My earlier 'sandbox' theory was wrong — a one-shot probe hook proved a Stop hook can write anywhere, including ~/.local/state and /tmp.)

Fix: the engine's authoritative state now lives OUTSIDE the repo, where the agent — which only operates inside the repo — can't reach it. lib/store.cjs: engineDir(cwd) = $XDG_STATE_HOME/shift/<sha256(realpath(cwd))> (realpath so /tmp==/private/tmp; full-path hash so siblings don't collide; SHIFT_STATE_DIR override). state.json/usage.json/history.jsonl/timeline moved there. .shift/ keeps only what you + the agent legitimately touch: config.json (you edit; also snapshotted into the engine dir so a deletion can't break a run), summary.md, log.md/blocked.jsonl, and STOP/PAUSE/SKIP. A stray agent-written .shift/state.json is simply ignored.

Validated with a real bypassPermissions run: every bin now records runtime + tokens (35s · 7k, 13s · 2k) and a full history row — and there's no state.json in the repo for the agent to clobber. 99 tests green. Adversarial verification of the refactor is running.

…factor

Verification of 5044dae surfaced four issues (core path-agreement was already correct):

- P1 cross-run leak: cmdStart cleared STOP/summary/usage/timeline but NOT the other
  repo-side control/blocker signals — a stale PAUSE made `shift start && shift run` a
  multi-hour no-op, a stale SKIP silently skipped a bin, a stale blocked.jsonl re-blocked
  one. Now scrubs PAUSE/SKIP/blocked.jsonl/summary.md too. + regression test (cli.test).
- P1 broken demo: examples/watch-demo.cjs still seeded/read repo-side state after the
  relocation → 6 empty frames. Migrated to engineDir (seed state + snapshot config there,
  readHistory(edir)); now renders real per-bin data + history.
- P3 stale prompt: brief.cjs no longer tells the agent not to touch .shift/state.json
  (gone from the repo); guards the real repo-side surface + notes engine state is external.
  Test updated.
- P3 stale comments: history.cjs / shift-stop.cjs / bin/shift now say history lives in the
  engine dir, not .shift/.

100 shift tests green; demo verified end-to-end.
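The cross-run scrub could be sketched as below. The file names come from the list above; the helper name `scrubRunResidue`, its return value, and the idea of keeping the list in one constant are assumptions rather than the code in cmdStart.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Repo-side files that must not survive into the next run.
const RUN_RESIDUE = ['STOP', 'PAUSE', 'SKIP', 'blocked.jsonl', 'summary.md'];

// Removes leftover signals and artifacts; returns the names that were actually removed.
function scrubRunResidue(shiftDir) {
  const removed = [];
  for (const name of RUN_RESIDUE) {
    const file = path.join(shiftDir, name);
    if (fs.existsSync(file)) {
      fs.rmSync(file, { force: true });
      removed.push(name);
    }
  }
  return removed;
}
```

Driving the scrub from a single list makes the failure mode described above harder to repeat: a new signal file only needs to be added in one place.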
@AllenBW

AllenBW commented Jun 16, 2026

Owner Author

Adversarial verification of the relocation → 4 issues found + fixed (6b58697). Verdict: SHIP.

The review confirmed the core writer/reader path-agreement is correct across symlink/relative/trailing-slash cwd forms, and caught four issues (one I'd missed mattered):

  • P1 — cross-run leak (real daily-driver bug). cmdStart scrubbed STOP/summary/usage/timeline but not the other repo-side signals, so a second shift start in the same repo could leave a stale PAUSE (→ shift run idle-polls for the whole time box, a multi-hour no-op), SKIP (→ silently skips a bin), or blocked.jsonl (→ re-blocks with last run's note). Now scrubs all of them; +regression test.
  • P1 — demo broken by the relocation. examples/watch-demo.cjs still seeded/read repo-side state → 6 empty frames. Migrated to the engine dir; now renders real per-bin data + history.
  • P3 ×2 — stale wording. The agent-facing brief no longer warns about .shift/state.json (gone from the repo); module comments now say history lives in the engine dir.

100 tests green. This wraps the per-bin work: root cause proven, engine state relocated out of the agent's reach, validated on a real headless run, and hardened against cross-run residue.

@AllenBW AllenBW changed the base branch from shift-v1 to main June 16, 2026 18:29
@AllenBW AllenBW changed the title from "shift: live dashboard + keyboard control (shift watch)" to "shift (module 2): work-queue runner + live dashboard/control + token & runtime visibility" Jun 16, 2026
@AllenBW AllenBW marked this pull request as ready for review June 16, 2026 18:29
@AllenBW

AllenBW commented Jun 16, 2026

Owner Author

Retargeted to main — this is now the full module-2 PR (supersedes #1).

Contains the entire shift module in 11 commits: keep-going Stop-hook engine (v1) → headless runner + usage cap (v2) → verify gate (v3) → one-command installer → post-smoke hardening → live dashboard + keyboard control → per-bin & run-level tokens/runtime + work-record history → engine-state relocation (the real per-bin fix) → adversarial-review fixes → CI.

100 shift + 7 code-status-bar tests; GitHub Actions now runs both on every push/PR. Validated end-to-end on real headless claude -p runs (keep-going, per-bin token attribution). Ready to merge.

Adversarial coverage audit found the CLI surface had zero integration coverage and
several agent-proof contracts were untested. Added:

- CLI (cli.test): status (plain/PAUSED/no-run), status --line (the finalize-suppression
  gate + color), history <runId> drill-down + branch-suffix + no-match, unknown-subcommand
  usage/exit, config shallow-merge, history-preserved-across-restart.
- Agent-proof contracts (hook.test): a planted repo-side .shift/state.json is ignored;
  config falls back to the repo copy when the engine snapshot is gone; per-bin tokens
  recover from the transcript window when state.bins was clobbered.
- watch-model: transcript-derived per-bin/run tokens, the current-bin open window
  (live runtime/tokens), finalized read from .shift/summary.md while state is out-of-repo.
- store.test (new): engineDir key = sha256(realpath) basename, idempotent, sibling-collision-
  resistant, SHIFT_STATE_DIR/XDG base precedence.
- brief: per-git-flag forbid-guard combinations.
- Extracted moveSelection/clampSelection from cmdWatch into watch-model (pure, unit-tested).

118 shift tests, all green.
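For the extracted selection helpers, a sketch of what pure versions might look like. The names `moveSelection` and `clampSelection` come from the commit message; the signatures and the choice of -1 for an empty list are assumptions.

```javascript
// Keep an index inside [0, count - 1]; -1 signals "nothing to select" (assumed convention).
function clampSelection(index, count) {
  if (count <= 0) return -1;
  return Math.min(Math.max(index, 0), count - 1);
}

// Move by delta (-1 for up, +1 for down) without wrapping around.
function moveSelection(index, delta, count) {
  return clampSelection(index + delta, count);
}
```

Clamping is also useful on redraw: if the bin list shrinks between frames, re-clamping the stored index keeps the highlight on a valid row.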
@AllenBW AllenBW merged commit 95f51c5 into main Jun 16, 2026
1 check passed
@AllenBW AllenBW deleted the shift-watch branch June 16, 2026 20:50

1 participant