shift (module 2): work-queue runner + live dashboard/control + token & runtime visibility#2
Conversation
- shift/install.sh wires the Stop hook into ~/.claude/settings.json idempotently (backup -> merge -> validate -> atomic move); never duplicates, updates the path on repo move, preserves existing hooks/settings. - shift/lib/install.cjs: pure mergeStopHook() (tested); install.sh is a thin shell. - shift/test/install.test.cjs: 7 tests (unit merge + live install.sh integration). - README (root): list shift in the Modules table + candor pointer. - shift/README: swap manual hook-wiring for the installer; resolve the hook-schema caveat (block/reason contract verified against the Claude Code hooks docs).
A real `shift run` smoke confirmed headless `claude -p` honors the Stop-hook block and drives the queue warm (resolves the SPEC §9.2 open question). A pre-flight audit of the previously-untested runner path drove these fixes: - No false-green: classifyOutcome returns 'completed' only when the engine finalized (summary.md). A code-0 exit without finalize is 'incomplete' — the runner resumes if the queue advanced, else stops with a 'is the Stop hook wired?' diagnostic. `shift run` grades on summary.md, not the exit line. - Stale-reset guard: auto-resume stops cleanly when the cached reset time is already in the past (was a maxResumes-bounded busy-spin). - Per-spawn timeout (spawnTimeoutMinutes, default 30) kills a wedged claude so spawnSync can't hang the runner; launch failures + kills are surfaced. - Warn when a headless run uses a Bash-prompting permission mode. - Dropped a spurious audit suggestion (runner writing state.iterations) that would have double-counted the hook's bound tracking. 63 shift tests green (pure unit + hook/CLI/run-loop/install integration).
A headless run was a black box while running — good trail after, no visibility during. `shift watch` is a zero-dep live TUI over .shift/ that closes that gap: - Dashboard: progress bar, per-bin status (done/current/pending/skipped/blocked), elapsed, decision-log tail, Needs-you count. Redraws on an interval. - Two-way control via a file-based channel the engine honors: [p]ause (runner idles, still time-boxed), [k] skip current bin (new 'skipped' status), [q] stop (existing kill switch), [x] close watcher. - lib/control.cjs (signal channel) + lib/watch-model.cjs (buildModel + a PURE renderFrame/renderLine, so the dashboard is unit-tested without a TTY). - bin/shift gains 'watch' and 'status --line' (one-liner for the module-1 status bar — surfaces shift where you're already looking). Engine integration: Stop hook applies SKIP (marks current bin skipped, advances) and summary now reports skipped; run-loop honors PAUSE between spawns. 77 shift tests green.
Two P2s + cheap P3s from the verification pass (no P0/P1; nothing crashed or corrupted state): - Terminal hygiene (P2): restore cursor + raw mode on SIGTERM/SIGHUP/exit, not just SIGINT/keys; wrap draw() in try/catch; idempotent cleanup; drop Esc as an exit key (a split arrow escape sends a lone \x1b). - Stale SKIP (P2): consume-on-read in the hook — a skip that misses its target is discarded, never left to fire on a later bin. - STOP honored while paused (P3): pause+stop no longer parks until the time box. - Progress bar fills by resolved bins (done+blocked+skipped), so a finalized run shows a full bar instead of ~40% under '● finalized'. - Atomic state.json write (temp+rename) so a redraw never reads a half-written file. - Ellipsis on truncated bin ids. - examples/watch-demo.cjs: zero-cost demo of the dashboard + control flow. Tests: 79 shift (+2: stop-while-paused, stale-skip-discarded), all green. Residual P3s documented: [k] no-op feedback when no current bin; narrow-terminal line wrapping.
|
Adversarial verification done — verdict SHIP, fixes landed (15970c6). A 3-lens review (TUI hygiene / control races / render robustness) with adversarial re-verification found no P0/P1 — nothing crashes the TUI or corrupts run state. Fixed the two P2s + the cheap P3s:
79 tests (+2). Residual P3s documented in SPEC §13: |
- Tokens (output, the honest 'work' figure — not cache-inflated total) + runtime in the dashboard header, status --line, and per-bin columns. Summed from the session transcript (transcript_path from the hook payload; usage in message.usage). - Up/down select a bin, Enter opens a detail view (status, runtime, token breakdown, commit, brief), esc back. - Work record: every finalized run appended to .shift/history.jsonl; 'shift history' shows per-run rows + a totals footer; 'shift history <runId>' drills into one run. - New pure modules: transcript.cjs (window-sum usage), timeline.cjs (append-only bin boundaries), history.cjs (ledger append/read/aggregate). Hook attributes per-bin runtime+tokens and writes the history record on finalize. - Brief now tells the agent .shift/ is append-only bookkeeping (never edit state.json). Known limitation (SPEC §13): per-bin attribution is best-effort in fully-headless runs — an autonomous agent rewrites/deletes .shift/ mid-run and Claude Code sandboxes hook writes to the project dir, so the boundary record can't be put out of reach. Run-level tokens/runtime + the history record (the hook's final write) are authoritative. 96 shift tests, all green.
|
Tokens, runtime + a work record added (7f44160). Per the candor goal of making consumption legible:
Honest limitation I want to flag (SPEC §13). I validated this against real
Run-level tokens/runtime and the history record (written as the hook's final action on finalize, after the agent's last turn) are authoritative regardless. Per-bin columns populate in interactive runs / when the agent leaves |
…(real per-bin fix) Root cause (verified): a headless autonomous agent rewrites .shift/state.json to mark bins done itself — bypassing the keep-going engine so the hook never drives the queue or records per-bin boundaries. A probe hook disproved a sandbox: a Stop hook can write anywhere (~/.local/state, /tmp, env-provided). So the fix is an engine-owned store OUTSIDE the repo, where the agent (which only works in the repo) can't reach it. - lib/store.cjs: engineDir(cwd) = $XDG_STATE_HOME/shift/<sha256(realpath(cwd))> (realpath so /tmp == /private/tmp; full-path hash so siblings don't collide; SHIFT_STATE_DIR override). mkdir -p. - state.json, usage.json, history.jsonl, timeline now live in engineDir; the hook + bin/shift + watch-model read/write there. config.json stays user-editable in .shift/ and is snapshotted into engineDir so a deletion can't break a run; summary/log/control stay in .shift/. A stray agent-written .shift/state.json is simply ignored. - Always emit a timeline 'start' per bin (binWindows dedupes) so every bin has a window. Validated: a real bypassPermissions run now records per-bin runtime+tokens for every bin (35s/7k, 13s/2k) + a full history row. 99 shift tests green.
|
The real fix landed (5044dae) — per-bin attribution now works headless. Root cause (verified, not guessed): a headless autonomous agent rewrites Fix: the engine's authoritative state now lives OUTSIDE the repo, where the agent — which only operates inside the repo — can't reach it. Validated with a real |
…factor Verification of 5044dae surfaced four issues (core path-agreement was already correct): - P1 cross-run leak: cmdStart cleared STOP/summary/usage/timeline but NOT the other repo-side control/blocker signals — a stale PAUSE made `shift start && shift run` a multi-hour no-op, a stale SKIP silently skipped a bin, a stale blocked.jsonl re-blocked one. Now scrubs PAUSE/SKIP/blocked.jsonl/summary.md too. + regression test (cli.test). - P1 broken demo: examples/watch-demo.cjs still seeded/read repo-side state after the relocation → 6 empty frames. Migrated to engineDir (seed state + snapshot config there, readHistory(edir)); now renders real per-bin data + history. - P3 stale prompt: brief.cjs no longer tells the agent not to touch .shift/state.json (gone from the repo); guards the real repo-side surface + notes engine state is external. Test updated. - P3 stale comments: history.cjs / shift-stop.cjs / bin/shift now say history lives in the engine dir, not .shift/. 100 shift tests green; demo verified end-to-end.
|
Adversarial verification of the relocation → 4 issues found + fixed (6b58697). Verdict: SHIP. The review confirmed the core writer/reader path-agreement is correct across symlink/relative/trailing-slash cwd forms, and caught four issues (one I'd missed mattered):
100 tests green. This wraps the per-bin work: root cause proven, engine state relocated out of the agent's reach, validated on a real headless run, and hardened against cross-run residue. |
|
Retargeted to Contains the entire 100 |
Adversarial coverage audit found the CLI surface had zero integration coverage and several agent-proof contracts were untested. Added: - CLI (cli.test): status (plain/PAUSED/no-run), status --line (the finalize-suppression gate + color), history <runId> drill-down + branch-suffix + no-match, unknown-subcommand usage/exit, config shallow-merge, history-preserved-across-restart. - Agent-proof contracts (hook.test): a planted repo-side .shift/state.json is ignored; config falls back to the repo copy when the engine snapshot is gone; per-bin tokens recover from the transcript window when state.bins was clobbered. - watch-model: transcript-derived per-bin/run tokens, the current-bin open window (live runtime/tokens), finalized read from .shift/summary.md while state is out-of-repo. - store.test (new): engineDir key = sha256(realpath) basename, idempotent, sibling-collision- resistant, SHIFT_STATE_DIR/XDG base precedence. - brief: per-git-flag forbid-guard combinations. - Extracted moveSelection/clampSelection from cmdWatch into watch-model (pure, unit-tested). 118 shift tests, all green.
Closes the candor gap in the headless runner: a run was a black box while it ran.
shift watchis a zero-dependency live TUI over.shift/with two-way control.Built autonomously while the author was away (this branch + a reviewable trail is itself the
shiftuse case).Visibility
A dashboard redraws on an interval: progress bar (
done/total), every bin with status (✓done ·▶current ··pending ·⤫skipped ·✗blocked), elapsed, decision-log tail, "Needs you" count.Control (a status bar can't take input, so the TUI does)
File-based signals under
.shift/the engine honors at the next stop:skippedstatus; work stays on the branchModule-1 tie-in
shift status --line→⚙ shift 2/5 · 18m · ⚑1for a ccstatuslinecustom-commandwidget — surfaces shift where you already look.Design
lib/control.cjs— the signal channel (writer: watch; readers: hook + runner).lib/watch-model.cjs—buildModel+ a purerenderFrame/renderLine, so the dashboard is unit-tested without a TTY.bin/shift watchis a thin TTY shell; hook applies SKIP, run-loop honors PAUSE.77
shifttests (14 new: control + watch-model + hook-skip + run-loop-pause), all green. Known limitation (SPEC §13): pause/skip apply at bin boundaries, not mid-bin.Base is
shift-v1so the diff is just this feature.