Skip to content

shift (module 2): v1+v2+v3 + code-status-bar review fixes#1

Closed
AllenBW wants to merge 5 commits into
mainfrom
shift-v1
Closed

shift (module 2): v1+v2+v3 + code-status-bar review fixes#1
AllenBW wants to merge 5 commits into
mainfrom
shift-v1

Conversation

@AllenBW

@AllenBW AllenBW commented Jun 14, 2026

Copy link
Copy Markdown
Owner

Autonomous work-queue runner for Claude Code — built solid before merge (v1 + v2 + v3), plus the prior review fixes. 63 tests (shift 52, code-status-bar 7, all green), zero dependencies.

What it does

Pre-load bins of work (hand-written briefs and/or plugin plan folders), leave, and shift works the queue autonomously until it's empty or a bound trips — surviving the 5-hour wall by waiting for the reset. You review the branch + summary at the end. Design: shift/SPEC.md, shift/PLAN.md.

v1 — keep-going engine (Stop hook)

Marks each finished bin done, feeds back the next, finalizes on queue-empty / bound / kill switch. Bounds: time box + max iterations. Branch-only, no-push, decision log, Needs you: flagging.

v2 — all-day headless runner

  • shift run outer loop: spawn → let the engine grind → on a rate-limit wall, wait for the window to reset and resume; bounded by time, iterations, a usage cap, and a resume backstop.
  • Rate-limit exit signature is undocumented, so detection is inferred from cached usage (near-limit + future reset, sourced from the Stop hook payload's rate_limits) with config-overridable stderr patterns — no dependency on an exact exit code. Degrades gracefully when usage data is absent.

v3 — per-bin verify gate

A bin counts as done only if verify.command passes; failures re-feed it with the output up to maxAttempts, then block it. Catches "looked done but wasn't."

Review fixes (from the prior review)

  1. Tests for the shipped usage-bar.cjs (was untested). 2. Hook resolves repo from the payload cwd. 3. Summary surfaces logged Needs you: items. 4. Hardened install.sh (temp→validate-JSON→move; piped local-file guard; Node check).

Testing

  • Pure modules unit-tested (discovery, state, bounds, brief, decision, verify, usage, outcome, run-loop).
  • Integration: the Stop hook driven with crafted payloads; the run loop driven with injected effects (simulates a rate-limit→wait→resume→finalize cycle).
  • Not machine-testable here: the thin real-effects wiring in bin/shift run (real claude spawn) and a live rate-limit hit — flagged in SPEC §13.

Operational note

Unattended runs that execute commands likely need permissionMode: "dontAsk" (+ permissions.allow) or "bypassPermissions"; acceptEdits (default) only auto-approves edits. Documented in shift/README.md.

Still a draft — review at your pace.

@AllenBW AllenBW changed the title shift (module 2) v1 + code-status-bar review fixes shift (module 2): v1+v2+v3 + code-status-bar review fixes Jun 14, 2026
- shift/install.sh wires the Stop hook into ~/.claude/settings.json idempotently
  (backup -> merge -> validate -> atomic move); never duplicates, updates the path
  on repo move, preserves existing hooks/settings.
- shift/lib/install.cjs: pure mergeStopHook() (tested); install.sh is a thin shell.
- shift/test/install.test.cjs: 7 tests (unit merge + live install.sh integration).
- README (root): list shift in the Modules table + candor pointer.
- shift/README: swap manual hook-wiring for the installer; resolve the hook-schema
  caveat (block/reason contract verified against the Claude Code hooks docs).
@AllenBW

AllenBW commented Jun 15, 2026

Copy link
Copy Markdown
Owner Author

Merge-prep pushed (e172bb0).

  • shift/install.sh + shift/lib/install.cjs — one-command Stop-hook installer that merges into ~/.claude/settings.json idempotently (backup → merge → validate → atomic move); never duplicates, preserves existing hooks/settings, updates the path on repo move. Pure mergeStopHook() in lib/, shell is a thin I/O wrapper — same architecture as the rest of the module.
  • shift/test/install.test.cjs — +7 tests (unit merge cases + live install.sh integration against a temp HOME). shift suite now 59, all green.
  • Root README lists shift in the Modules table; shift/README.md swaps the manual hook-wiring for bash shift/install.sh.
  • Hook contract re-verified against the Claude Code hooks docs: {"decision":"block","reason":…} continues the session, {}/exit-0 allows the stop — matches shift-stop.cjs. Resolved the open "verify the schema" caveat in the README.

Still draft pending the manual end-to-end smoke (PLAN Task 9) — the real claude spawn in bin/shift run + a live rate-limit→resume are the only paths the 59 tests can't cover.

A real `shift run` smoke confirmed headless `claude -p` honors the Stop-hook
block and drives the queue warm (resolves the SPEC §9.2 open question). A
pre-flight audit of the previously-untested runner path drove these fixes:

- No false-green: classifyOutcome returns 'completed' only when the engine
  finalized (summary.md). A code-0 exit without finalize is 'incomplete' — the
  runner resumes if the queue advanced, else stops with a 'is the Stop hook
  wired?' diagnostic. `shift run` grades on summary.md, not the exit line.
- Stale-reset guard: auto-resume stops cleanly when the cached reset time is
  already in the past (was a maxResumes-bounded busy-spin).
- Per-spawn timeout (spawnTimeoutMinutes, default 30) kills a wedged claude so
  spawnSync can't hang the runner; launch failures + kills are surfaced.
- Warn when a headless run uses a Bash-prompting permission mode.
- Dropped a spurious audit suggestion (runner writing state.iterations) that
  would have double-counted the hook's bound tracking.

63 shift tests green (pure unit + hook/CLI/run-loop/install integration).
@AllenBW

AllenBW commented Jun 16, 2026

Copy link
Copy Markdown
Owner Author

Hardening pushed (c7adfb8) — validated by a real smoke + audit.

Ran a live shift run smoke (2 commit-a-file bins, bypassPermissions): both bins completed and committed in a single claude -p spawn, clean summary.md. This empirically resolves SPEC §9.2 — headless claude -p honors the Stop-hook {"decision":"block"} and keeps the session warm (was undocumented/unconfirmed).

A pre-flight adversarial audit of the previously-untested runner path then drove four fixes (each adversarially re-verified before commit; verdict: SHIP):

  1. No false-greenclassifyOutcome returns completed only when the engine finalized (summary.md). A claude -p that exits 0 without finalizing is incomplete: the runner resumes if the queue advanced, else stops with a "is the Stop hook wired?" diagnostic. Caught live with a fake claude — the old code reported this as success.
  2. Stale-reset guard — auto-resume stops cleanly when the cached reset time is in the past (was a maxResumes-bounded busy-spin).
  3. Per-spawn timeout (spawnTimeoutMinutes, default 30) so a wedged claude can't hang the runner; launch failures (claude not on PATH) and kills are now surfaced.
  4. Permission warning when a headless run uses a Bash-prompting mode.

Dropped one spurious audit suggestion (making the runner write state.iterations) — it would have double-counted the hook's bound tracking. One residual logged as a known limitation in SPEC §13 (P3, no regression): spawnSync timeout SIGTERMs only the direct claude, not tool grandchildren.

63 shift tests green. PLAN Task 9 (manual smoke) is satisfied — ready for review/merge.

@AllenBW

AllenBW commented Jun 16, 2026

Copy link
Copy Markdown
Owner Author

Superseded by #2, which now targets main and contains all of these commits plus the watch/tokens/relocation work. Closing in favor of the single PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant