shift (module 2): work-queue runner + live dashboard/control + token & runtime visibility by AllenBW · Pull Request #2 · AllenBW/agentic-workflow-toolkit

AllenBW · 2026-06-16T13:29:37Z

Closes the candor gap in the headless runner: a run was a black box while it ran. shift watch is a zero-dependency live TUI over .shift/ with two-way control.

Built autonomously while the author was away (this branch + a reviewable trail is itself the shift use case).

Visibility

A dashboard redraws on an interval: progress bar (done/total), every bin with status (✓ done · ▶ current · · pending · ⤫ skipped · ✗ blocked), elapsed, decision-log tail, "Needs you" count.

Control (a status bar can't take input, so the TUI does)

File-based signals under .shift/ the engine honors at the next stop:

[p] pause/resume — the headless runner idles (still time-boxed)
[k] skip current bin — new skipped status; work stays on the branch
[q] stop — the existing kill switch
[x] close the watcher (run keeps going)
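
A minimal sketch of how a file-based signal channel like this could work, assuming one marker file per signal under .shift/. The function names (`raise`, `isRaised`, `clear`, `togglePause`) are hypothetical and are not taken from lib/control.cjs; only the STOP/PAUSE/SKIP names come from this PR.

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Signal names as mentioned in this PR; everything else here is illustrative.
const SIGNALS = ['STOP', 'PAUSE', 'SKIP'];

function signalPath(shiftDir, name) {
  if (!SIGNALS.includes(name)) throw new Error(`unknown signal: ${name}`);
  return path.join(shiftDir, name);
}

// Writer side (the watcher): raise a signal by creating its marker file.
function raise(shiftDir, name, payload = '') {
  fs.mkdirSync(shiftDir, { recursive: true });
  fs.writeFileSync(signalPath(shiftDir, name), payload);
}

// Reader side (hook or runner): is the signal currently raised?
function isRaised(shiftDir, name) {
  return fs.existsSync(signalPath(shiftDir, name));
}

// Lower a signal; returns true only if it had been raised.
function clear(shiftDir, name) {
  try {
    fs.unlinkSync(signalPath(shiftDir, name));
    return true;
  } catch (err) {
    if (err.code === 'ENOENT') return false;
    throw err;
  }
}

// [p] is a toggle: resume if paused, otherwise pause.
function togglePause(shiftDir) {
  if (clear(shiftDir, 'PAUSE')) return 'resumed';
  raise(shiftDir, 'PAUSE');
  return 'paused';
}
```

Because the channel is just files, the reader only needs to check for them at its next stop, which matches the bin-boundary behavior described above.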

Module-1 tie-in

shift status --line → ⚙ shift 2/5 · 18m · ⚑1 for a ccstatusline custom-command widget — surfaces shift where you already look.
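
For illustration, a pure one-line renderer that reproduces the sample output above. The model field names (`done`, `total`, `elapsedMs`, `needsYou`) and the hour formatting are assumptions, not the actual watch-model.cjs shape.

```javascript
'use strict';

// Whole minutes below one hour, hours + minutes above (the hour format is assumed).
function formatElapsed(ms) {
  const totalMinutes = Math.floor(ms / 60000);
  if (totalMinutes < 60) return `${totalMinutes}m`;
  const hours = Math.floor(totalMinutes / 60);
  return `${hours}h${totalMinutes % 60}m`;
}

// Pure: same model in, same string out, so it can be tested without a TTY.
function renderLine(model) {
  const parts = [
    `⚙ shift ${model.done}/${model.total}`,
    formatElapsed(model.elapsedMs),
  ];
  // Show the "Needs you" flag only when something is waiting.
  if (model.needsYou > 0) parts.push(`⚑${model.needsYou}`);
  return parts.join(' · ');
}

console.log(renderLine({ done: 2, total: 5, elapsedMs: 18 * 60000, needsYou: 1 }));
// prints "⚙ shift 2/5 · 18m · ⚑1"
```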

Design

lib/control.cjs — the signal channel (writer: watch; readers: hook + runner).
lib/watch-model.cjs — buildModel + a pure renderFrame/renderLine, so the dashboard is unit-tested without a TTY.
bin/shift watch is a thin TTY shell; hook applies SKIP, run-loop honors PAUSE.

77 shift tests (14 new: control + watch-model + hook-skip + run-loop-pause), all green. Known limitation (SPEC §13): pause/skip apply at bin boundaries, not mid-bin.

Base is shift-v1 so the diff is just this feature.

…fy gate)

- shift/install.sh wires the Stop hook into ~/.claude/settings.json idempotently (backup -> merge -> validate -> atomic move); never duplicates, updates the path on repo move, preserves existing hooks/settings.
- shift/lib/install.cjs: pure mergeStopHook() (tested); install.sh is a thin shell.
- shift/test/install.test.cjs: 7 tests (unit merge + live install.sh integration).
- README (root): list shift in the Modules table + candor pointer.
- shift/README: swap manual hook-wiring for the installer; resolve the hook-schema caveat (block/reason contract verified against the Claude Code hooks docs).
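
A rough sketch of what an idempotent Stop-hook merge could look like. This is not the repo's mergeStopHook(): the signature, the `marker` substring used to recognize our own entry, and the settings shape (`hooks.Stop` as an array of groups, each with a `hooks` array of `{ type, command }`) are assumptions to check against the Claude Code hooks docs.

```javascript
'use strict';

// Return a NEW settings object whose Stop hooks contain exactly one entry
// for this tool. `marker` (hypothetical) identifies entries we own.
function mergeStopHook(settings, command, marker = 'shift-stop') {
  const next = structuredClone(settings ?? {});
  next.hooks = next.hooks ?? {};
  const groups = Array.isArray(next.hooks.Stop) ? next.hooks.Stop : [];

  // Remove any earlier copy of our hook (covers a repo move), keep the rest.
  // Groups left with no hooks at all are dropped.
  const kept = groups
    .map((group) => ({
      ...group,
      hooks: (group.hooks ?? []).filter(
        (hook) => !String(hook.command ?? '').includes(marker),
      ),
    }))
    .filter((group) => group.hooks.length > 0);

  kept.push({ hooks: [{ type: 'command', command }] });
  next.hooks.Stop = kept;
  return next;
}
```

Running it twice with the same command yields the same object, and running it with a new path replaces the old entry instead of adding a second one.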

A real `shift run` smoke confirmed headless `claude -p` honors the Stop-hook block and drives the queue warm (resolves the SPEC §9.2 open question). A pre-flight audit of the previously-untested runner path drove these fixes:

- No false-green: classifyOutcome returns 'completed' only when the engine finalized (summary.md). A code-0 exit without finalize is 'incomplete': the runner resumes if the queue advanced, else stops with a 'is the Stop hook wired?' diagnostic. `shift run` grades on summary.md, not the exit line.
- Stale-reset guard: auto-resume stops cleanly when the cached reset time is already in the past (was a maxResumes-bounded busy-spin).
- Per-spawn timeout (spawnTimeoutMinutes, default 30) kills a wedged claude so spawnSync can't hang the runner; launch failures + kills are surfaced.
- Warn when a headless run uses a Bash-prompting permission mode.
- Dropped a spurious audit suggestion (runner writing state.iterations) that would have double-counted the hook's bound tracking.

63 shift tests green (pure unit + hook/CLI/run-loop/install integration).
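
The no-false-green rule can be sketched roughly as follows. Only 'completed' and 'incomplete' are named in this PR; the 'failed' label, the `nextAction` helper, and all parameter names are hypothetical.

```javascript
'use strict';

// Grade a spawn on whether the engine finalized, not on the exit code.
// `finalized` stands for "summary.md exists after the spawn".
function classifyOutcome({ exitCode, finalized }) {
  if (finalized) return 'completed';
  if (exitCode === 0) return 'incomplete'; // clean exit, but nothing finalized
  return 'failed'; // hypothetical label for non-zero exits
}

// Decide what the runner does next (illustrative only).
function nextAction(outcome, { doneBefore, doneAfter }) {
  if (outcome === 'completed') return 'finish';
  if (outcome === 'incomplete') {
    // Resume only if the queue advanced; otherwise the Stop hook is
    // probably not wired, so stop with a diagnostic instead of spinning.
    return doneAfter > doneBefore ? 'resume' : 'stop: is the Stop hook wired?';
  }
  return 'stop';
}
```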

A headless run was a black box while running: good trail after, no visibility during. `shift watch` is a zero-dep live TUI over .shift/ that closes that gap:

- Dashboard: progress bar, per-bin status (done/current/pending/skipped/blocked), elapsed, decision-log tail, Needs-you count. Redraws on an interval.
- Two-way control via a file-based channel the engine honors: [p]ause (runner idles, still time-boxed), [k] skip current bin (new 'skipped' status), [q] stop (existing kill switch), [x] close watcher.
- lib/control.cjs (signal channel) + lib/watch-model.cjs (buildModel + a PURE renderFrame/renderLine, so the dashboard is unit-tested without a TTY).
- bin/shift gains 'watch' and 'status --line' (one-liner for the module-1 status bar; surfaces shift where you're already looking).

Engine integration: Stop hook applies SKIP (marks current bin skipped, advances) and summary now reports skipped; run-loop honors PAUSE between spawns. 77 shift tests green.

Two P2s + cheap P3s from the verification pass (no P0/P1; nothing crashed or corrupted state):

- Terminal hygiene (P2): restore cursor + raw mode on SIGTERM/SIGHUP/exit, not just SIGINT/keys; wrap draw() in try/catch; idempotent cleanup; drop Esc as an exit key (a split arrow escape sends a lone \x1b).
- Stale SKIP (P2): consume-on-read in the hook; a skip that misses its target is discarded, never left to fire on a later bin.
- STOP honored while paused (P3): pause+stop no longer parks until the time box.
- Progress bar fills by resolved bins (done+blocked+skipped), so a finalized run shows a full bar instead of ~40% under '● finalized'.
- Atomic state.json write (temp+rename) so a redraw never reads a half-written file.
- Ellipsis on truncated bin ids.
- examples/watch-demo.cjs: zero-cost demo of the dashboard + control flow.

Tests: 79 shift (+2: stop-while-paused, stale-skip-discarded), all green. Residual P3s documented: [k] no-op feedback when no current bin; narrow-terminal line wrapping.
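
To make the progress-bar change concrete, here is a small sketch of filling by resolved bins rather than by done bins. The `progressBar` name, the bin shape, and the block characters are assumptions for illustration.

```javascript
'use strict';

// A bin counts toward the bar once it is resolved, whatever the outcome.
const RESOLVED = new Set(['done', 'blocked', 'skipped']);

function progressBar(bins, width = 20) {
  const resolved = bins.filter((bin) => RESOLVED.has(bin.status)).length;
  const filled = bins.length === 0 ? 0 : Math.round((resolved / bins.length) * width);
  return '█'.repeat(filled) + '░'.repeat(width - filled);
}

// Two done, two blocked, one skipped: every bin is resolved, so the bar is full.
// Counting only 'done' would have filled 2/5 of it.
const finalized = [
  { status: 'done' }, { status: 'done' },
  { status: 'blocked' }, { status: 'blocked' },
  { status: 'skipped' },
];
console.log(progressBar(finalized, 10)); // prints "██████████"
```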

AllenBW · 2026-06-16T13:39:12Z

Adversarial verification done — verdict SHIP, fixes landed (15970c6).

A 3-lens review (TUI hygiene / control races / render robustness) with adversarial re-verification found no P0/P1 — nothing crashes the TUI or corrupts run state. Fixed the two P2s + the cheap P3s:

Terminal hygiene (P2): cursor + raw mode now restored on SIGTERM/SIGHUP/exit (not just SIGINT/keys); draw() wrapped in try/catch; idempotent cleanup. Esc dropped as an exit key (a split arrow escape sends a lone \x1b).
Stale SKIP (P2): consume-on-read in the hook — a skip that misses its target is discarded, never left to fire on a later bin. (This also corrected the demo to model real [k] semantics: you can only skip the current bin.)
STOP while paused (P3): pause+stop no longer parks until the time box.
Progress bar fills by resolved bins (done+blocked+skipped) — a finalized run shows a full bar, not ~40% under ● finalized.
Atomic state.json write (temp+rename) so a redraw never reads a half-written file.
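
The temp+rename pattern behind the atomic write looks roughly like this. `writeJsonAtomic` is a hypothetical helper name, not the function in the repo.

```javascript
'use strict';
const fs = require('node:fs');

// Write JSON so a concurrent reader sees either the old file or the new one,
// never a partial write. The temp file sits next to the target so the rename
// stays on one filesystem.
function writeJsonAtomic(file, value) {
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(value, null, 2) + '\n');
  fs.renameSync(tmp, file); // replaces the target in one step
}
```

The dashboard redraws on an interval, so without this a redraw could land between the truncate and the final write of a plain `writeFileSync` on state.json.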

79 tests (+2). Residual P3s documented in SPEC §13: [k] gives no feedback when there's no current bin; narrow-terminal line wrapping. Zero-cost demo: node shift/examples/watch-demo.cjs.

- Tokens (output, the honest 'work' figure, not cache-inflated total) + runtime in the dashboard header, status --line, and per-bin columns. Summed from the session transcript (transcript_path from the hook payload; usage in message.usage).
- Up/down select a bin, Enter opens a detail view (status, runtime, token breakdown, commit, brief), esc back.
- Work record: every finalized run appended to .shift/history.jsonl; 'shift history' shows per-run rows + a totals footer; 'shift history <runId>' drills into one run.
- New pure modules: transcript.cjs (window-sum usage), timeline.cjs (append-only bin boundaries), history.cjs (ledger append/read/aggregate). Hook attributes per-bin runtime+tokens and writes the history record on finalize.
- Brief now tells the agent .shift/ is append-only bookkeeping (never edit state.json).

Known limitation (SPEC §13): per-bin attribution is best-effort in fully-headless runs: an autonomous agent rewrites/deletes .shift/ mid-run and Claude Code sandboxes hook writes to the project dir, so the boundary record can't be put out of reach. Run-level tokens/runtime + the history record (the hook's final write) are authoritative.

96 shift tests, all green.
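
As a sketch of the append/read/aggregate ledger idea, assuming one JSON object per line. The record fields (`runId`, `runtimeMs`, `outputTokens`) and function names are hypothetical, not the history.cjs API.

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Append one finalized run as a single JSONL line.
function appendRun(file, record) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.appendFileSync(file, JSON.stringify(record) + '\n');
}

// Read every run; a missing ledger is an empty history, and unparseable
// lines are skipped rather than failing the whole read.
function readRuns(file) {
  let text;
  try {
    text = fs.readFileSync(file, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return [];
    throw err;
  }
  const runs = [];
  for (const line of text.split('\n')) {
    if (!line.trim()) continue;
    try { runs.push(JSON.parse(line)); } catch { /* skip damaged line */ }
  }
  return runs;
}

// Totals footer: run count, total runtime, total output tokens.
function totals(runs) {
  return runs.reduce(
    (acc, run) => ({
      runs: acc.runs + 1,
      runtimeMs: acc.runtimeMs + (run.runtimeMs || 0),
      outputTokens: acc.outputTokens + (run.outputTokens || 0),
    }),
    { runs: 0, runtimeMs: 0, outputTokens: 0 },
  );
}
```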

AllenBW · 2026-06-16T14:57:29Z

Tokens, runtime + a work record added (7f44160).

Per the candor goal of making consumption legible:

Output tokens (the honest 'work produced' figure — not cache-inflated total) + runtime in the dashboard header, status --line, and per-bin columns. Summed from the session transcript (the hook gets transcript_path; usage lives in message.usage).
Drill-down: ↑/↓ select a bin, ⏎ opens a detail view (status, runtime, token breakdown in/out/cache, commit, brief).
Work record: every finalized run appends to .shift/history.jsonl; shift history shows per-run rows + a totals footer (all runs, total time, total output tokens); shift history <runId> drills into one run.
New pure modules: transcript.cjs, timeline.cjs, history.cjs. 96 tests, all green. See it free: node shift/examples/watch-demo.cjs.
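
A minimal sketch of summing usage from a JSONL transcript over a time window. It assumes each relevant line parses to an object with `message.usage` (using the Anthropic API usage field names) and an ISO `timestamp`; the exact transcript shape and the `sumUsage` name are assumptions, not the transcript.cjs implementation.

```javascript
'use strict';

// Sum token usage from JSONL text, optionally limited to [fromMs, toMs).
function sumUsage(jsonlText, { fromMs = -Infinity, toMs = Infinity } = {}) {
  const unbounded = fromMs === -Infinity && toMs === Infinity;
  const sum = { input: 0, output: 0, cacheRead: 0, cacheCreation: 0 };
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue;
    let entry;
    try { entry = JSON.parse(line); } catch { continue; } // tolerate a torn line
    const usage = entry && entry.message && entry.message.usage;
    if (!usage) continue;
    // Entries without a parseable timestamp only count when no window is set.
    const at = Date.parse(entry.timestamp);
    const inWindow = Number.isNaN(at) ? unbounded : at >= fromMs && at < toMs;
    if (!inWindow) continue;
    sum.input += usage.input_tokens || 0;
    sum.output += usage.output_tokens || 0;
    sum.cacheRead += usage.cache_read_input_tokens || 0;
    sum.cacheCreation += usage.cache_creation_input_tokens || 0;
  }
  return sum;
}
```

Keeping `output` separate from the cache counters is what lets the dashboard show output tokens as the headline figure while the detail view still has the full breakdown.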

Honest limitation I want to flag (SPEC §13). I validated this against real claude -p runs and hit a wall worth recording: per-bin token/runtime attribution is best-effort in fully-headless autonomous mode. Three findings, together, make it unachievable there:

An autonomous agent rewrites/deletes files under .shift/ mid-run (observed: it rewrote state.json + log.md, deleted config.json/timeline.jsonl) — clobbering the hook's per-bin stamps.
Claude Code sandboxes hook file-writes to the project dir, so the boundary record can't be relocated out-of-repo where the agent can't reach it.
The transcript carries no per-bin marker to reconstruct boundaries from.

Run-level tokens/runtime and the history record (written as the hook's final action on finalize, after the agent's last turn) are authoritative regardless. Per-bin columns populate in interactive runs / when the agent leaves .shift/ alone / in the demo, and show — otherwise. The brief now tells the agent .shift/ is append-only; making per-bin robust would need an engine-owned store the agent can't reach (a future item).

…(real per-bin fix)

Root cause (verified): a headless autonomous agent rewrites .shift/state.json to mark bins done itself, bypassing the keep-going engine so the hook never drives the queue or records per-bin boundaries. A probe hook disproved a sandbox: a Stop hook can write anywhere (~/.local/state, /tmp, env-provided). So the fix is an engine-owned store OUTSIDE the repo, where the agent (which only works in the repo) can't reach it.

- lib/store.cjs: engineDir(cwd) = $XDG_STATE_HOME/shift/<sha256(realpath(cwd))> (realpath so /tmp == /private/tmp; full-path hash so siblings don't collide; SHIFT_STATE_DIR override). mkdir -p.
- state.json, usage.json, history.jsonl, timeline now live in engineDir; the hook + bin/shift + watch-model read/write there. config.json stays user-editable in .shift/ and is snapshotted into engineDir so a deletion can't break a run; summary/log/control stay in .shift/. A stray agent-written .shift/state.json is simply ignored.
- Always emit a timeline 'start' per bin (binWindows dedupes) so every bin has a window.

Validated: a real bypassPermissions run now records per-bin runtime+tokens for every bin (35s/7k, 13s/2k) + a full history row. 99 shift tests green.

AllenBW · 2026-06-16T15:59:57Z

The real fix landed (5044dae) — per-bin attribution now works headless.

Root cause (verified, not guessed): a headless autonomous agent rewrites .shift/state.json to mark bins done itself, usurping the keep-going engine so the hook never drives the queue or records boundaries. (My earlier 'sandbox' theory was wrong — a one-shot probe hook proved a Stop hook can write anywhere, including ~/.local/state and /tmp.)

Fix: the engine's authoritative state now lives OUTSIDE the repo, where the agent — which only operates inside the repo — can't reach it. lib/store.cjs: engineDir(cwd) = $XDG_STATE_HOME/shift/<sha256(realpath(cwd))> (realpath so /tmp==/private/tmp; full-path hash so siblings don't collide; SHIFT_STATE_DIR override). state.json/usage.json/history.jsonl/timeline moved there. .shift/ keeps only what you + the agent legitimately touch: config.json (you edit; also snapshotted into the engine dir so a deletion can't break a run), summary.md, log.md/blocked.jsonl, and STOP/PAUSE/SKIP. A stray agent-written .shift/state.json is simply ignored.
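
A sketch of the engineDir derivation described above. Treating SHIFT_STATE_DIR as a replacement for the `$XDG_STATE_HOME/shift` base, the `~/.local/state` fallback, and the `env` parameter are assumptions; the real lib/store.cjs may differ.

```javascript
'use strict';
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

function engineDir(cwd, env = process.env) {
  // Base precedence (assumed): SHIFT_STATE_DIR, then $XDG_STATE_HOME/shift,
  // then ~/.local/state/shift.
  const base = env.SHIFT_STATE_DIR
    || path.join(env.XDG_STATE_HOME || path.join(os.homedir(), '.local', 'state'), 'shift');

  // realpath first so symlinked spellings of one repo share a key;
  // hashing the full path keeps sibling repos apart.
  const key = crypto.createHash('sha256').update(fs.realpathSync(cwd)).digest('hex');

  const dir = path.join(base, key);
  fs.mkdirSync(dir, { recursive: true }); // mkdir -p, safe to repeat
  return dir;
}
```

Writers and readers agree on the location as long as they all go through this one function, whichever form of the cwd they were handed.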

Validated with a real bypassPermissions run: every bin now records runtime + tokens (35s · 7k, 13s · 2k) and a full history row — and there's no state.json in the repo for the agent to clobber. 99 tests green. Adversarial verification of the refactor is running.

…factor

Verification of 5044dae surfaced four issues (core path-agreement was already correct):

- P1 cross-run leak: cmdStart cleared STOP/summary/usage/timeline but NOT the other repo-side control/blocker signals. A stale PAUSE made `shift start && shift run` a multi-hour no-op, a stale SKIP silently skipped a bin, a stale blocked.jsonl re-blocked one. Now scrubs PAUSE/SKIP/blocked.jsonl/summary.md too. + regression test (cli.test).
- P1 broken demo: examples/watch-demo.cjs still seeded/read repo-side state after the relocation → 6 empty frames. Migrated to engineDir (seed state + snapshot config there, readHistory(edir)); now renders real per-bin data + history.
- P3 stale prompt: brief.cjs no longer tells the agent not to touch .shift/state.json (gone from the repo); guards the real repo-side surface + notes engine state is external. Test updated.
- P3 stale comments: history.cjs / shift-stop.cjs / bin/shift now say history lives in the engine dir, not .shift/.

100 shift tests green; demo verified end-to-end.

AllenBW · 2026-06-16T16:08:22Z

Adversarial verification of the relocation → 4 issues found + fixed (6b58697). Verdict: SHIP.

The review confirmed the core writer/reader path-agreement is correct across symlink/relative/trailing-slash cwd forms, and caught four issues (one I'd missed mattered):

P1 — cross-run leak (real daily-driver bug). cmdStart scrubbed STOP/summary/usage/timeline but not the other repo-side signals, so a second shift start in the same repo could leave a stale PAUSE (→ shift run idle-polls for the whole time box, a multi-hour no-op), SKIP (→ silently skips a bin), or blocked.jsonl (→ re-blocks with last run's note). Now scrubs all of them; +regression test.
P1 — demo broken by the relocation. examples/watch-demo.cjs still seeded/read repo-side state → 6 empty frames. Migrated to the engine dir; now renders real per-bin data + history.
P3 ×2 — stale wording. The agent-facing brief no longer warns about .shift/state.json (gone from the repo); module comments now say history lives in the engine dir.
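
The cross-run scrub can be illustrated with a small helper that removes last run's repo-side residue and leaves everything else alone. `scrubRunResidue` and the exact file list are assumptions based on the names in this comment, not the cmdStart code.

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Repo-side files that must not survive into the next run.
const RUN_RESIDUE = ['STOP', 'PAUSE', 'SKIP', 'blocked.jsonl', 'summary.md'];

// Delete stale signals and blocker notes; returns the names actually removed.
// config.json and anything else under .shift/ is left in place.
function scrubRunResidue(shiftDir) {
  const removed = [];
  for (const name of RUN_RESIDUE) {
    try {
      fs.unlinkSync(path.join(shiftDir, name));
      removed.push(name);
    } catch (err) {
      if (err.code !== 'ENOENT') throw err; // a missing file is the normal case
    }
  }
  return removed;
}
```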

100 tests green. This wraps the per-bin work: root cause proven, engine state relocated out of the agent's reach, validated on a real headless run, and hardened against cross-run residue.

AllenBW · 2026-06-16T18:29:32Z

Retargeted to main — this is now the full module-2 PR (supersedes #1).

Contains the entire shift module in 11 commits: keep-going Stop-hook engine (v1) → headless runner + usage cap (v2) → verify gate (v3) → one-command installer → post-smoke hardening → live dashboard + keyboard control → per-bin & run-level tokens/runtime + work-record history → engine-state relocation (the real per-bin fix) → adversarial-review fixes → CI.

100 shift + 7 code-status-bar tests; GitHub Actions now runs both on every push/PR. Validated end-to-end on real headless claude -p runs (keep-going, per-bin token attribution). Ready to merge.

Adversarial coverage audit found the CLI surface had zero integration coverage and several agent-proof contracts were untested. Added:

- CLI (cli.test): status (plain/PAUSED/no-run), status --line (the finalize-suppression gate + color), history <runId> drill-down + branch-suffix + no-match, unknown-subcommand usage/exit, config shallow-merge, history-preserved-across-restart.
- Agent-proof contracts (hook.test): a planted repo-side .shift/state.json is ignored; config falls back to the repo copy when the engine snapshot is gone; per-bin tokens recover from the transcript window when state.bins was clobbered.
- watch-model: transcript-derived per-bin/run tokens, the current-bin open window (live runtime/tokens), finalized read from .shift/summary.md while state is out-of-repo.
- store.test (new): engineDir key = sha256(realpath) basename, idempotent, sibling-collision-resistant, SHIFT_STATE_DIR/XDG base precedence.
- brief: per-git-flag forbid-guard combinations.
- Extracted moveSelection/clampSelection from cmdWatch into watch-model (pure, unit-tested).

118 shift tests, all green.
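
For a sense of why extracting the selection helpers makes them easy to test, here is one possible shape. The signatures and the -1 result for an empty list are assumptions; only the names moveSelection/clampSelection come from the commit.

```javascript
'use strict';

// Keep a selection index inside [0, count - 1]; -1 means nothing to select.
function clampSelection(index, count) {
  if (count <= 0) return -1;
  return Math.min(Math.max(index, 0), count - 1);
}

// Apply an up/down keypress (delta of -1 or +1) without running off either end.
function moveSelection(index, delta, count) {
  return clampSelection(index + delta, count);
}
```

Neither function touches the terminal, so the key handler in the TTY shell can stay a thin wrapper around them.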

AllenBW added 7 commits June 13, 2026 20:37

code-status-bar: add usage-bar tests + harden installer (review #1, #4)

6578288

shift: implement v1 keep-going engine with tests (module 2)

bebeeb2

shift: add v2 (headless auto-resume + usage cap) and v3 (per-bin veri…

f5e67ad

…fy gate)

ci: run both modules' tests on push + PR (GitHub Actions)

7451d79

AllenBW changed the base branch from shift-v1 to main June 16, 2026 18:29

AllenBW changed the title ~~shift: live dashboard + keyboard control (shift watch)~~ shift (module 2): work-queue runner + live dashboard/control + token & runtime visibility Jun 16, 2026

AllenBW marked this pull request as ready for review June 16, 2026 18:29

AllenBW mentioned this pull request Jun 16, 2026

shift (module 2): v1+v2+v3 + code-status-bar review fixes #1

Closed

AllenBW merged commit 95f51c5 into main Jun 16, 2026
1 check passed

AllenBW deleted the shift-watch branch June 16, 2026 20:50

AllenBW merged 12 commits into main from shift-watch
