Closed
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -2,3 +2,4 @@
*.bak
*.bak-*
node_modules/
.shift/
3 changes: 3 additions & 0 deletions README.md
@@ -19,9 +19,12 @@ Transparency isn't a feature bolted on the side; for agentic coding it's the who
| Module | What it is | Targets |
|---|---|---|
| [**code-status-bar**](./code-status-bar) | A status line that shows usage limits, cost, context health, and git/worktree state at a glance | Claude Code (via [ccstatusline](https://github.com/sirmalloc/ccstatusline)) |
| [**shift**](./shift) | An autonomous work-queue runner: pre-load bins of work, leave, and it keeps the agent grinding through them — past natural stop points and across rate-limit resets — leaving every decision logged and every change a reviewable commit | Claude Code (Stop hook + headless `-p`) |

> **New here? Start with the [Code Status Bar](./code-status-bar).** It installs as a portable, zero-dependency default, or an [opt-in colored variant](./code-status-bar#color--static-by-default-status-driven-by-opt-in) that recolors the usage bars **green → yellow → red** as you approach each limit — so you *feel* a wall coming before you read a single number. You could build it by hand in ccstatusline's editor; this is that setup already done — one command, no configuration, and still fully editable.

> **Going heads-down?** [**shift**](./shift) turns an unattended run — the *least* transparent mode there is — into an honest paper trail: you trade real-time steering for a `shift/<date>` branch, a decision log, and a "here's what I did and what needs you" summary. One command wires the hook; the safety model keeps the work on a branch and off your remotes.

More to come. Each module is self-contained, declares which agent it targets, and explains *why* every piece earns its place — because justifying the real estate is part of the philosophy.

## License
41 changes: 35 additions & 6 deletions code-status-bar/install.sh
@@ -12,17 +12,47 @@ DEST="$CONFIG_DIR/settings.json"
 COLORED=0
 [ "${1:-}" = "--colored" ] && COLORED=1

+if [ "$COLORED" -eq 1 ] && ! command -v node >/dev/null 2>&1; then
+  echo "Error: --colored needs Node on your PATH (the helper runs via node)." >&2
+  echo "Install Node, or use the default (no-flag) config." >&2
+  exit 1
+fi
+
 mkdir -p "$CONFIG_DIR"

+# Only treat the script's directory as a real clone if it actually contains this
+# module (both files present). When run via `curl | bash`, BASH_SOURCE is unset and
+# this stays 0, so we always download instead of copying a stray local file.
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" 2>/dev/null && pwd || echo "")"
+LOCAL_OK=0
+if [ -n "$SCRIPT_DIR" ] && [ -f "$SCRIPT_DIR/install.sh" ] && [ -f "$SCRIPT_DIR/settings.json" ]; then
+  LOCAL_OK=1
+fi

-# fetch <relative-path> <destination> — prefer a local clone, fall back to download.
+# fetch <relative-path> <destination>: download (or copy from a verified clone) to a
+# temp file, validate, then move into place — so a failed fetch never leaves a broken
+# or empty config at the destination.
 fetch() {
-  local rel="$1" out="$2"
-  if [ -n "$SCRIPT_DIR" ] && [ -f "$SCRIPT_DIR/$rel" ]; then
-    cp "$SCRIPT_DIR/$rel" "$out"
+  local rel="$1" out="$2" tmp
+  tmp="$(mktemp)"
+  if [ "$LOCAL_OK" -eq 1 ] && [ -f "$SCRIPT_DIR/$rel" ]; then
+    cp "$SCRIPT_DIR/$rel" "$tmp"
   else
-    curl -fsSL "$REPO_RAW/$rel" -o "$out"
+    curl -fsSL "$REPO_RAW/$rel" -o "$tmp"
   fi
+  if [ ! -s "$tmp" ]; then
+    echo "Error: fetched '$rel' is empty; aborting (your existing config is untouched)." >&2
+    rm -f "$tmp"; exit 1
+  fi
+  case "$rel" in
+    *.json)
+      if command -v node >/dev/null 2>&1; then
+        node -e 'JSON.parse(require("fs").readFileSync(process.argv[1],"utf8"))' "$tmp" 2>/dev/null \
+          || { echo "Error: fetched '$rel' is not valid JSON; aborting." >&2; rm -f "$tmp"; exit 1; }
+      fi
+      ;;
+  esac
+  mv "$tmp" "$out"
 }

 if [ -f "$DEST" ]; then
@@ -37,7 +67,6 @@ if [ "$COLORED" -eq 1 ]; then
   fetch "settings.colored.json" "$DEST"
   echo "Installed COLORED variant -> $DEST"
   echo "Helper script -> $SCRIPTS_DIR/usage-bar.cjs"
-  echo "(needs Node on your PATH at render time — ccstatusline already provides it)"
 else
   fetch "settings.json" "$DEST"
   echo "Installed -> $DEST"
8 changes: 8 additions & 0 deletions code-status-bar/package.json
@@ -0,0 +1,8 @@
{
  "name": "code-status-bar",
  "version": "0.1.0",
  "private": true,
  "description": "Usage-limit-aware Claude Code status bar (Agentic Workflow Toolkit module 1)",
  "engines": { "node": ">=18" },
  "scripts": { "test": "node --test" }
}
73 changes: 73 additions & 0 deletions code-status-bar/test/usage-bar.test.cjs
@@ -0,0 +1,73 @@
'use strict';
const { test } = require('node:test');
const assert = require('node:assert');
const cp = require('node:child_process');
const path = require('node:path');

const SCRIPT = path.resolve(__dirname, '..', 'scripts', 'usage-bar.cjs');
const ESC = '\x1b';
const YELLOW = `${ESC}[38;2;252;233;79m`;
const GREEN = `${ESC}[38;2;138;226;52m`;
const RED = `${ESC}[38;2;239;41;41m`;

function run(args, payload) {
  return cp.execFileSync('node', [SCRIPT, ...args], {
    input: payload === undefined ? '' : JSON.stringify(payload),
    encoding: 'utf8'
  });
}

function rl(over) {
  const now = Math.floor(Date.now() / 1000);
  return {
    rate_limits: Object.assign({
      five_hour: { used_percentage: 72, resets_at: now + 7200 },
      seven_day: { used_percentage: 41, resets_at: now + 432000 },
      seven_day_opus: { used_percentage: 88, resets_at: now + 432000 }
    }, over || {})
  };
}

test('session at 72% renders bold yellow with label and percent', () => {
  const out = run(['session'], rl());
  assert.ok(out.includes(YELLOW), 'expected yellow');
  assert.ok(out.includes('Session: '), 'expected label');
  assert.ok(out.includes('72.0%'), 'expected percent');
  assert.ok(out.startsWith(`${ESC}[1m`), 'expected bold prefix');
});

test('weekly at 41% is green, opus at 88% is red', () => {
  assert.ok(run(['weekly'], rl()).includes(GREEN));
  assert.ok(run(['opus'], rl()).includes(RED));
});

test('multiple limits are joined with a separator', () => {
  const out = run(['weekly', 'opus'], rl());
  assert.ok(out.includes('Weekly: '));
  assert.ok(out.includes('Weekly Opus: '));
  assert.ok(out.includes(' | '));
});

test('absent data renders nothing so the widget collapses', () => {
  assert.equal(run(['session'], {}), '');
  assert.equal(run(['session']), '');
  assert.equal(run(['session'], rl({ five_hour: undefined })), '');
});

test('non-numeric percentage renders nothing', () => {
  assert.equal(run(['session'], rl({ five_hour: { used_percentage: 'oops', resets_at: 0 } })), '');
});

test('thresholds: 50 -> yellow, just under -> green; 85 -> red, just under -> yellow', () => {
  const at = (p) => run(['session'], rl({
    five_hour: { used_percentage: p, resets_at: Math.floor(Date.now() / 1000) + 1 }
  }));
  assert.ok(at(50).includes(YELLOW), '50 should be yellow');
  assert.ok(at(49.9).includes(GREEN), '49.9 should be green');
  assert.ok(at(85).includes(RED), '85 should be red');
  assert.ok(at(84.9).includes(YELLOW), '84.9 should be yellow');
});

test('unknown limit name renders nothing', () => {
  assert.equal(run(['bogus'], rl()), '');
});
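
The tests above pin the color thresholds, but `scripts/usage-bar.cjs` itself is not part of this diff. As a rough sketch of logic that would satisfy them: the function name `colorFor` is hypothetical and not taken from the helper, while the escape codes are copied from the test constants.

```javascript
'use strict';

// ANSI truecolor escapes, copied from the constants in the test file.
const ESC = '\x1b';
const GREEN = `${ESC}[38;2;138;226;52m`;
const YELLOW = `${ESC}[38;2;252;233;79m`;
const RED = `${ESC}[38;2;239;41;41m`;

// Hypothetical helper: map a usage percentage to a color escape.
// Returns null for non-numeric input so the caller can render nothing.
function colorFor(percent) {
  if (typeof percent !== 'number' || !Number.isFinite(percent)) return null;
  if (percent >= 85) return RED;    // 85 and above: red
  if (percent >= 50) return YELLOW; // 50 up to just under 85: yellow
  return GREEN;                     // anything below 50: green
}

// Print which color each boundary value maps to.
for (const p of [49.9, 50, 84.9, 85]) {
  const name = { [GREEN]: 'green', [YELLOW]: 'yellow', [RED]: 'red' }[colorFor(p)];
  console.log(p, name);
}
```

Both boundaries are inclusive on the upper band, which is what the `50 -> yellow` and `85 -> red` assertions require.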
25 changes: 25 additions & 0 deletions shift/PLAN.md
@@ -1018,3 +1018,28 @@ Re-run with `maxIterations: 1`; confirm the run ends on "max iterations" with pe
- **Testing strategy (unit pure modules + integration hook/CLI + manual smoke + dry-run):** Tasks 1–7, 9. ✔
- **No third-party deps:** all `node:` built-ins. ✔
- **Known gaps (deferred, documented):** usage-cap data source and rate-limit termination signature → v2 (SPEC §9). Mid-bin early-stop accepted in v1, reviewer-caught; verify pass → v3.

---

## Implementation notes (as-built deviations)

Built on branch `shift-v1`. The draft code blocks above are the design intent; these corrections were applied during implementation:

- **`state.cjs` — carry `text` through the merge, strip it on save.** `mergeDiscovered` copies each bin's freshly-read `text` into the in-memory bin (the brief needs the body); `saveState` strips `text` before writing so `state.json` stays lean. Without this the fed-back brief had the instructions but not the task body — caught by the hook integration test, not the unit tests.
- **Review fix #2 — `shift-stop.cjs` resolves the repo from the hook payload's `cwd`** (`input.cwd || process.cwd()`); a hook's process cwd isn't guaranteed to be the project root. Has a dedicated test.
- **Review fix #3 — summary surfaces logged `Needs you:` lines**, not just blocked bins; `brief.cjs` documents the `Needs you: <detail>` convention. Has a test.
- **Security — `bin/shift` uses `execFileSync('git', [...args])`** (argument array, no shell) for branch ops, so a config-supplied branch name can't inject shell metacharacters; added `git checkout` fallbacks for Git < 2.23.
- **`package.json` — `"test": "node --test"`** (Node ≥18 auto-discovery; a bare `test/` arg isn't accepted) and `"engines": { "node": ">=18" }`.
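
To make the first note concrete, here is a minimal sketch of carrying `text` through the merge and stripping it on save. Only the names `mergeDiscovered` and `saveState` come from the notes; the bin shape (`id`, `status`, `text`), the function bodies, and `serializeState` (a file-free stand-in for `saveState`) are assumptions for illustration.

```javascript
'use strict';

// Assumed shape: a bin is { id, status, text }.
// Merge freshly discovered bins into existing state, keeping each known
// bin's status but always carrying the freshly read text in memory.
function mergeDiscovered(state, discovered) {
  const known = new Map(state.bins.map((b) => [b.id, b]));
  const bins = discovered.map((d) => {
    const prev = known.get(d.id);
    return { id: d.id, status: prev ? prev.status : 'pending', text: d.text };
  });
  return { ...state, bins };
}

// Hypothetical stand-in for saveState: drop `text` from every bin before
// serializing, so the persisted state stays lean.
function serializeState(state) {
  const bins = state.bins.map(({ text, ...rest }) => rest);
  return JSON.stringify({ ...state, bins }, null, 2);
}

const merged = mergeDiscovered(
  { bins: [{ id: '01-first-task', status: 'done' }] },
  [{ id: '01-first-task', text: 'Do X' }, { id: '02-next', text: 'Do Y' }]
);
const saved = serializeState(merged);
console.log(saved);
```

The in-memory bins keep the body for the brief, while the serialized form contains only `id` and `status`.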

All 28 `shift` tests + 7 `code-status-bar` tests pass; `install.sh` verified end-to-end.
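
The security note on `execFileSync` with an argument array can be checked with a small experiment. This is only a sketch: Node stands in for `git` so it runs anywhere, and the hostile branch name is made up for illustration.

```javascript
'use strict';
const { execFileSync } = require('node:child_process');

// A hostile "branch name" of the kind a config file could supply (made up).
const branch = 'shift/2024-01-01; echo injected';

// execFileSync passes each array element as exactly one argv entry and does
// not start a shell, so the `;` is never interpreted. The child process here
// simply writes back the argument it received.
const echoed = execFileSync(
  process.execPath,
  ['-e', 'process.stdout.write(process.argv[1])', branch],
  { encoding: 'utf8' }
);

console.log(echoed === branch);
```

The child receives the whole string, semicolon included, as a single argument, which is the property the branch operations rely on.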

---

## v2 + v3 (built on the same branch)

Added after v1 with the same TDD discipline (52 `shift` tests in total). See SPEC §13 for the design decisions.

- **v3 verify gate** — `lib/verify.cjs` (injectable exec) + a gate in the Stop hook: a bin passes only if `verify.command` exits 0; failures re-feed the bin with the output up to `verify.maxAttempts`, then block it. Tests: `verify.test.cjs` + hook gate cases.
- **v2 usage cap** — `lib/usage.cjs` caches the hook payload's `rate_limits` to `.shift/usage.json`; `evaluateBounds` gains a `usagePercent` arg (cap on weekly %); the hook reads it from the payload and degrades gracefully when absent. Tests: `usage.test.cjs` + bounds/decision/hook cases.
- **v2 headless runner** — `lib/outcome.cjs` (classify a spawn: completed / rate_limited / error, inferring rate-limit from cached usage since the exit signature is undocumented) + `lib/run-loop.cjs` (pure outer loop with injected effects: bounds, max-resumes backstop, wait-until-reset auto-resume) + `bin/shift run` (thin real-effects wiring). Tests: `outcome.test.cjs`, `run-loop.test.cjs`.
- **Security** — `lib/verify.cjs` uses `spawnSync(command, { shell: true })` with the whole user-config command (not interpolated); documented inline.
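
As an illustration of the verify gate with an injectable exec, here is a minimal sketch. The function name `verifyGate`, its return shape, and the exact attempt counting are assumptions, not the contents of `lib/verify.cjs`; only `verify.command`, `verify.maxAttempts`, and `spawnSync(command, { shell: true })` come from the notes above.

```javascript
'use strict';
const { spawnSync } = require('node:child_process');

// Default effect: run the whole configured command through the shell.
function shellExec(command) {
  const r = spawnSync(command, { shell: true, encoding: 'utf8' });
  return { status: r.status, output: `${r.stdout || ''}${r.stderr || ''}` };
}

// Hypothetical gate: decide what happens to a bin after one verify run.
// `attempts` is the number of failures already recorded for this bin.
function verifyGate(verify, attempts, exec = shellExec) {
  if (!verify || !verify.command) return { action: 'pass' }; // gate disabled
  const { status, output } = exec(verify.command);
  if (status === 0) return { action: 'pass' };
  const used = attempts + 1;
  // Assumed counting: block once the failure count reaches maxAttempts.
  if (used >= verify.maxAttempts) return { action: 'block', output };
  return { action: 'refeed', attempts: used, output };
}

// Drive the gate with a fake exec, as a unit test would.
const failing = () => ({ status: 1, output: '1 test failed' });
const config = { command: 'npm test', maxAttempts: 2 };
console.log(verifyGate(config, 0, failing).action);
console.log(verifyGate(config, 1, failing).action);
```

Injecting `exec` keeps the decision logic pure, so tests never have to spawn a real command.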
111 changes: 111 additions & 0 deletions shift/README.md
@@ -0,0 +1,111 @@
# shift

Autonomous work-queue runner for **Claude Code** — module 2 of the [Agentic Workflow Toolkit](../). Pre-load bins of work, leave, and `shift` keeps Claude working through them past natural stop points, using its best judgment, until the queue is empty or a bound is hit — surviving the 5-hour rate-limit wall by waiting for the window to reopen. You review the output at the end.

See [SPEC.md](./SPEC.md) and [PLAN.md](./PLAN.md) for the design.

## How it works

You drop work into source folders — hand-written briefs and/or plugin-generated plans (e.g. Superpowers' plans dir). `shift start` discovers them, records a run in `.shift/`, and creates a `shift/<date>` branch. Then:

- **Keep-going engine (Stop hook).** Each time the agent would stop, the hook marks the finished bin done, picks the next pending one, and feeds it back as the next instruction — so the session keeps working. When the queue drains (or a bound trips, or the kill switch is set) it lets the session stop and writes `.shift/summary.md`.
- **Verify gate.** If you set a `verify.command`, each bin must pass it (e.g. `npm test`) before it counts as done; failures re-feed the bin with the output (up to `maxAttempts`), then mark it blocked. This catches "looked done but wasn't."
- **All-day runner (`shift run`).** A headless outer loop that spawns Claude, lets the engine grind, and — when a spawn dies on the rate-limit wall — waits until the window resets and resumes. Bounded by wall-clock, max iterations, a usage cap, and a resume backstop.

The hook is safe to register globally: it no-ops in any repo that isn't an active `shift` run, and resolves the repo from the hook payload's `cwd`.
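
A minimal sketch of that guard, under stated assumptions: `resolveRepo` and `isActiveRun` are hypothetical names, and treating `.shift/state.json` as the marker of an active run is a guess; only the `input.cwd || process.cwd()` fallback is described in PLAN.md.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Prefer the payload's cwd; the hook's own process cwd may be elsewhere.
function resolveRepo(input) {
  return (input && input.cwd) || process.cwd();
}

// Assumption: an active run is marked by .shift/state.json in the repo.
function isActiveRun(repo) {
  return fs.existsSync(path.join(repo, '.shift', 'state.json'));
}

// Demo against a throwaway directory standing in for a project.
const repo = fs.mkdtempSync(path.join(os.tmpdir(), 'shift-demo-'));
const before = isActiveRun(resolveRepo({ cwd: repo })); // no run yet
fs.mkdirSync(path.join(repo, '.shift'));
fs.writeFileSync(path.join(repo, '.shift', 'state.json'), '{}');
const after = isActiveRun(resolveRepo({ cwd: repo })); // run present
console.log(before, after);
```

A hook built this way would return early whenever `isActiveRun` is false, which is what makes global registration harmless.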

## Safety model

Full best-judgment autonomy on reversible, in-worktree work. By default it will **not** push, publish, send externally, or delete outside the worktree — it does the preparable part and records a `Needs you:` line, which the summary collects. All work lands on the `shift/<date>` branch, so review is a clean diff. Every decision is logged. Hard stops: time box, max iterations, usage cap, kill switch (`shift stop`).
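
The hard stops can be pictured as one pure check. This sketch assumes a signature and a `run` shape (`startedAt`, `iterations`, `killSwitch`) that are not confirmed by the source; PLAN.md only names `evaluateBounds` and says it takes a `usagePercent` argument.

```javascript
'use strict';

// Hypothetical signature. Returns a stop reason, or null to keep going.
// `now` and `run.startedAt` are epoch milliseconds.
function evaluateBounds(bounds, run, now, usagePercent) {
  if (run.killSwitch) return 'kill switch';
  const hours = (now - run.startedAt) / 3600000;
  if (hours >= bounds.maxHours) return 'time box';
  if (run.iterations >= bounds.maxIterations) return 'max iterations';
  // Usage data can be absent (undefined): skip the cap rather than fail.
  if (typeof usagePercent === 'number' && usagePercent >= bounds.usageCapPercent) {
    return 'usage cap';
  }
  return null;
}

const bounds = { maxHours: 4, maxIterations: 30, usageCapPercent: 90 };
const run = { startedAt: 0, iterations: 3, killSwitch: false };
console.log(evaluateBounds(bounds, run, 3600000, 50));
console.log(evaluateBounds(bounds, run, 3600000, 95));
```

Passing `now` and `usagePercent` in as arguments keeps the check deterministic and easy to unit-test.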

## Install

1. Clone the toolkit (the hook runs from these files by absolute path, so it installs locally — no `curl | bash`).
2. Wire the Stop hook into `~/.claude/settings.json` — one command, idempotent:

```bash
bash shift/install.sh
```

It merges the entry below, which is safe globally because the hook no-ops in any repo without an active `.shift/` run. It backs up any existing settings first and never duplicates the entry; re-running after a `git pull` or a repo move just updates the path:

```json
{ "hooks": { "Stop": [
  { "matcher": "", "hooks": [
    { "type": "command", "command": "node /ABSOLUTE/PATH/TO/shift/hooks/shift-stop.cjs" }
  ] }
] } }
```

> **Hook contract (verified against the [Claude Code hooks docs](https://code.claude.com/docs/en/hooks)).** The Stop hook returns `{"decision":"block","reason":…}` to keep the session going — the `reason` becomes the next instruction — and omits `decision` (or exits 0) to allow the stop. The usage cap and `shift run` auto-resume read the hook payload's `rate_limits` when present and **skip cleanly when it's absent** (e.g. non-Pro/Max), so the engine never depends on it.

3. (Optional) put `shift/bin/shift` on your PATH — the installer prints the `ln -s` command.
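
The hook contract from step 2 can be sketched as a function that builds the JSON the hook prints. The name `stopHookOutput` and the bin shape are hypothetical; the `decision` and `reason` fields are the ones the contract describes.

```javascript
'use strict';

// Build the JSON a Stop hook prints. With a next bin, block the stop and
// hand its brief back as the reason; with none, emit no decision at all.
function stopHookOutput(nextBin) {
  if (!nextBin) return {};
  return { decision: 'block', reason: nextBin.text };
}

const keepGoing = stopHookOutput({ id: '01-first-task', text: 'Do the first task.' });
const allowStop = stopHookOutput(null);
console.log(JSON.stringify(keepGoing)); // {"decision":"block","reason":"Do the first task."}
console.log(JSON.stringify(allowStop)); // {}
```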

## Use

```bash
cd your-repo
mkdir queue && $EDITOR queue/01-first-task.md # one brief per file
shift start --dry-run # preview the queue, branch, bounds
shift start # init run + create shift/<date> branch
```

Then either:

- **Interactive:** open Claude Code in the repo and say *"begin the shift"* — the Stop hook drives it while you're away (within this session).
- **All-day / unattended:** `shift run` — the headless loop drives Claude, survives rate-limit resets, and stops on a bound.

```bash
shift status # progress anytime
shift stop # stop cleanly after the current bin
```

When it ends, read `.shift/summary.md` (bins done/blocked + a "Needs you" section) and review the `shift/<date>` branch.

## Configure (`.shift/config.json`)

```json
{
  "sources": [
    { "path": "queue", "kind": "briefs" },
    { "path": "docs/superpowers/plans", "kind": "plans" }
  ],
  "bounds": {
    "maxHours": 4,
    "maxIterations": 30,
    "maxResumes": 12,
    "spawnTimeoutMinutes": 30,
    "usageCapPercent": 90,
    "autoResumeOnReset": true
  },
  "definitionOfDone": "Builds and tests pass; work committed on the run branch.",
  "verify": { "command": "npm test", "maxAttempts": 2 },
  "permissionMode": "acceptEdits",
  "git": { "branch": "shift/{date}", "allowPush": false, "allowOutwardActions": false }
}
```

- **`usageCapPercent`** — stop when weekly usage reaches this (read from the hook payload's `rate_limits`; skipped when that data is absent, e.g. non-Pro/Max).
- **`autoResumeOnReset`** — on a rate-limit wall, `shift run` waits for the 5-hour window to reopen and resumes (never past the time box). If the cached reset time is stale/in the past it stops cleanly rather than busy-spinning.
- **`maxResumes`** — the runner's own backstop on the number of `claude` spawns (independent of the hook-maintained `maxIterations`/`maxHours`).
- **`spawnTimeoutMinutes`** — hard per-spawn wall: a wedged `claude` is killed (SIGTERM) so it can't hang the runner. Default 30.
- **`verify.command`** — per-bin acceptance gate; `null` disables it.
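
How `autoResumeOnReset` and `maxResumes` might interact can be sketched as a pure loop with injected effects. Everything here is an assumption for illustration (the name `runLoop`, the `fx` effect object, the result strings, and the synchronous shape); it is not the code in `lib/run-loop.cjs`.

```javascript
'use strict';

// Hypothetical pure loop. All side effects arrive via `fx`:
//   spawn()        -> 'completed' | 'rate_limited' | 'error'
//   now()          -> current time in ms
//   resetAt()      -> cached reset time in ms, or null when unknown
//   sleepUntil(ms) -> block until that time
//   deadline       -> end of the time box in ms
function runLoop(bounds, fx) {
  for (let spawns = 1; ; spawns++) {
    const outcome = fx.spawn();
    if (outcome !== 'rate_limited') return { ended: outcome, spawns };
    if (!bounds.autoResumeOnReset) return { ended: 'rate_limited', spawns };
    if (spawns >= bounds.maxResumes) return { ended: 'max resumes', spawns };
    const resetAt = fx.resetAt();
    // A missing or past reset time would make the loop busy-spin: stop.
    if (resetAt == null || resetAt <= fx.now()) return { ended: 'stale reset', spawns };
    // Never wait past the time box.
    if (resetAt >= fx.deadline) return { ended: 'time box', spawns };
    fx.sleepUntil(resetAt);
  }
}

// Fake effects with a controllable clock, as a unit test would use.
function fakeFx(outcomes, resetAt) {
  let t = 0;
  return {
    deadline: 10000,
    spawn: () => outcomes.shift(),
    now: () => t,
    resetAt: () => resetAt,
    sleepUntil: (ms) => { t = ms; }
  };
}

const bounds = { autoResumeOnReset: true, maxResumes: 12 };
console.log(runLoop(bounds, fakeFx(['rate_limited', 'completed'], 1000)));
console.log(runLoop(bounds, fakeFx(['rate_limited'], -5)));
```

With effects injected, the waiting and resuming behavior can be tested without a real clock or a real `claude` process.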

> A headless `shift run` grades success on `.shift/summary.md` (written only when the engine finalizes), not on the exit line: a `claude -p` that exits without finalizing is reported as *"no summary written — did NOT finalize"* with a hint to check the hook wiring, never as a false success.

### Permissions for unattended runs

`shift run` invokes `claude -p --permission-mode <permissionMode>`. `acceptEdits` (the default) auto-approves file edits but **other tools (e.g. Bash) can still prompt — and a headless run can't answer prompts.** For real unattended work that runs tests/commands, either:

- pre-allow the tools the work needs via `permissions.allow` in your Claude settings and set `"permissionMode": "dontAsk"`, or
- set `"permissionMode": "bypassPermissions"` (broadest; rely on the branch-only / no-push safety model and bounds).

Pick the narrowest mode that lets the work actually proceed.

## Develop

```bash
cd shift && npm test # node --test, zero dependencies
```

Pure logic lives in `lib/` (discovery, state, bounds, brief, decision, verify, usage, outcome, run-loop) and is unit-tested; `hooks/shift-stop.cjs` (the keep-going engine) and the `shift run` loop are integration-tested by driving them with injected effects / crafted hook input.