Add /harness skill (control surface for ~/.claude/)#1
Conversation
Packages the harness-engineering setup as a reusable, idempotent skill: operating-contract CLAUDE.md template, 4 guardrail hooks (block-force-push, format-on-edit, post-compact-reinject, verify-before-stop), /verify + /plan slash commands, auto-memory seeds (MEMORY.md index + concise / plan-first / verification-gate feedback + user_role template), and helper scripts for snapshotting ~/.claude/ to a private git repo plus a prompt template for a monthly remote-audit routine. The installer (scripts/install.sh) supports --dry-run / --force / --skip-memory / --skip-settings and never clobbers existing files unless asked. settings.json is patched additively (env + 4 hook entries only), preserving permissions/marketplaces/statusLine/etc. Smoke-tested in an isolated $HOME: dry-run writes nothing, real run installs 12 files + patches settings, second run is fully idempotent (skip-exists everywhere, "settings.json already current"), and the installed block-force-push.sh correctly blocks 'git push --force origin main' while allowing --force-with-lease and echoed strings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetric reversal of install.sh, conservative by default: - Removes hooks and slash commands only if their sha256 still matches the installed template; user-modified files are kept and reported. - Strips the 4 hook entries from settings.json; drops empty hook arrays. Leaves permissions, marketplaces, statusLine, etc. untouched. - Keeps CLAUDE.md, memory entries, and CLAUDE_CODE_AUTO_COMPACT_WINDOW by default — those tend to be customised. Opt in with --remove-claude-md, --remove-memory, --remove-env, or --all. - Flags: --dry-run, --force (skip content-match), --all. Smoke-tested: - Default uninstall after fresh install removes 6 files + cleans settings, keeps CLAUDE.md/memory/env. Empty parent dirs rmdir'd. - Modified hook is reported "keep (modified)" without --force; removed with. - --all on a fresh install reduces 13 files → empty settings.json + 0 files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ources
Restructure as a control surface with five sub-actions (install / uninstall /
snapshot / status / audit) instead of a one-shot bootstrap. Rationale: the
skill's job has grown beyond first-time setup — uninstall, snapshotting, and
the monthly audit loop are equal peers, and "bootstrap" undersold what it does.
Changes:
- skills/bootstrap-harness/ → skills/harness/
- SKILL.md frontmatter: name: harness; description rewritten around sub-actions
- README.md: rewritten human-facing overview with prominent sources section
(OpenAI harness-engineering article, Fowler writeup, Cherny / Willison /
Vincent / Huntley / Husain / Yegge), plus a sibling-link to
datashaman/harness-template (the project-scope counterpart)
- New scripts/status.sh: read-only reporter — installed / modified / missing
per surface, plus settings.json hook-wiring + env var, plus snapshot-repo
ahead/behind if SNAPSHOT_REPO is set
- install.sh, uninstall.sh: self-reference text updated; stale audit-routine.json
reference fixed to audit-prompt.md
- Top-level README.md: skill heading renamed and rewritten to advertise
sub-actions and credit sources
Smoke-tested round trip: install → status (everything 'installed'/'wired') →
modify a hook → status (correctly reports 'modified') → uninstall (default
keeps memory/CLAUDE.md/env) → uninstall --all (sweeps everything; settings.json
back to {}).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a new /harness skill that packages a reusable, user-scope (~/.claude/) Claude Code “harness” setup (operating contract, hooks, slash commands, memory seeds) with install/uninstall/status/snapshot utilities and an audit prompt.
Changes:
- Introduces the
harnessskill definition + documentation, including sub-action dispatch (install,uninstall,status,snapshot,audit). - Adds idempotent installer/uninstaller plus status + snapshot scripts to manage
~/.claude/surfaces and patchsettings.json. - Adds the shipped templates: CLAUDE.md operating contract, guardrail hooks, slash commands, and memory seeds; plus an audit prompt template.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| skills/harness/scripts/uninstall.sh | Conservative uninstaller that removes only unmodified installed files and cleans related settings.json entries |
| skills/harness/scripts/status.sh | Read-only status report for installed/modified/missing surfaces and settings wiring |
| skills/harness/scripts/snapshot.sh | Mirrors ~/.claude/ into a sanitized git snapshot repo, secret-scans, commits, and pushes |
| skills/harness/scripts/install.sh | Idempotent installer that lays down templates and patches settings.json to wire hooks/env |
| skills/harness/scripts/audit-prompt.md | Prompt template for a monthly remote audit routine |
| skills/harness/assets/memory/user_role.md.tmpl | User-role memory template seed |
| skills/harness/assets/memory/feedback_verification.md.tmpl | Verification-gate memory template seed |
| skills/harness/assets/memory/feedback_plan_first.md.tmpl | Plan-first memory template seed |
| skills/harness/assets/memory/feedback_concise.md.tmpl | Concise-output memory template seed |
| skills/harness/assets/memory/MEMORY.md.tmpl | Memory index template seed |
| skills/harness/assets/hooks/verify-before-stop.sh | Stop hook to block “done” when verification fails |
| skills/harness/assets/hooks/post-compact-reinject.sh | Post-compact hook to re-inject key context files |
| skills/harness/assets/hooks/format-on-edit.sh | PostToolUse hook to run available formatters after edits |
| skills/harness/assets/hooks/block-force-push.sh | PreToolUse hook to block destructive shell commands (e.g., force push / rm -rf) |
| skills/harness/assets/commands/verify.md | /verify slash command definition |
| skills/harness/assets/commands/plan.md | /plan slash command definition |
| skills/harness/assets/CLAUDE.md.tmpl | User-scope operating contract template |
| skills/harness/SKILL.md | Skill manifest + agent-facing operating instructions and sub-action dispatch |
| skills/harness/README.md | Human-facing README for the harness skill |
| README.md | Repository README updated to list and describe /harness |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Substantive fixes:
- **Portable SHA-256.** Replace bare `shasum` with a fallback chain (sha256sum
preferred on Linux → shasum on macOS → python3 hashlib). uninstall.sh and
status.sh both have the same helper. status.sh now reports `cannot hash`
explicitly when no tool is available, instead of silently treating empty
hashes as "matching".
- **Drop `HOME_CLAUDE` override.** Was half-implemented — the env override
affected file paths but settings.json had hard-coded `~/.claude/...`, so
hooks installed under an overridden home would never actually run. Use
$HOME/.claude consistently across install/uninstall/status. Tests use
HOME=/tmp/... instead. settings.json still stores the literal "~/.claude/..."
form (Claude Code expands ~ at hook-execution time, which is what we want
for portability across machines/users).
- **`do_or_dry` no longer uses `eval`.** Take argv directly via "$@" and use
printf %q for shell-quoted dry-run preview. Smoke-tested with a path
containing a single quote (`foo'bar`) — prints correctly escaped, doesn't
break.
- **`changed` flag now tracks env-var insertion.** Was a real bug: if hooks
were already wired but the env var was missing, install rewrote the file
but reported "already current". Now any env-var change marks `changed`.
Doc/comment fixes:
- install.sh usage no longer says "interactive" (script is non-interactive).
- block-force-push.sh comment corrected: handles `&&`, not bare `&`.
- uninstall.sh `--help` line range fixed: was 2-17, leaked `set -euo pipefail`;
now 2-16 (last comment line).
- README requirements: list sha256sum/shasum/python3 explicitly under the
hashing dependency.
End-to-end smoke test (HOME override mode) verifies:
- install (fresh) → `settings.json updated`; second install → `already current`
- env-var bug: strip env after install, re-run → correctly says `updated`
- modified hook reports `modified` in status, `keep (modified)` in uninstall
- settings.json hook commands stay as literal `~/.claude/...` strings
- single-quote in $HOME path doesn't break dry-run preview
- `--all` sweep: 24 actions, settings.json reset to `{}`, all files gone
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address two pieces of feedback:
1. Scope is no longer hard-coded to user — derive it from $SKILL_DIR.
The skill installs at the scope it itself lives in:
- $SKILL_DIR is under <X>/.claude/skills/... → target is <X>/.claude/
- If <X> is $HOME → user scope (CLAUDE.md at ~/.claude/, settings hooks
use ~/.claude/hooks/..., env var set, memory seeded)
- Otherwise → project scope (CLAUDE.md at <project>/, settings hooks use
project-relative .claude/hooks/..., no env var, no memory seeding —
memory is per-user by design)
- Running from a checkout with no .claude/ ancestor errors with a clear
prompt for --scope=user|project or --target=PATH
- Override always available via --scope or --target
2. Escape hatches so users can configure which changes to accept:
install.sh: --skip-{claude-md,hooks,commands,memory,settings}, plus a
positive --include=LIST shorthand (e.g. --include=hooks,commands).
uninstall.sh: --keep-hooks, --keep-commands, --keep-settings to preserve
specific surfaces that the default would remove. Existing --remove-*
flags continue to broaden the sweep.
3. Preflight banners on both install.sh and uninstall.sh — print scope,
target, the per-surface plan (with SKIP / KEEP markers reflecting active
flags), and a pointer to the inverse script with --all warnings. Read
before proceeding. Always recommend --dry-run for first runs.
4. Bug fixes:
- uninstall.sh CLAUDE_MD_PATH ordering: was referenced in banner before
definition; moved up.
- uninstall.sh --help: was using fixed line range that overshot when the
usage block grew; replaced with awk that prints leading comment block
up to first non-comment line — robust to script edits.
Smoke-tested both scopes end to end:
- User scope (skill at $HOME/.claude/skills/harness/): installs full kit,
settings.json hook commands are ~/.claude/hooks/..., status reports
scope=user, --all sweeps everything.
- Project scope (skill at <proj>/.claude/skills/harness/): installs only
to <proj>/.claude/ + project CLAUDE.md, hook commands are .claude/hooks/...,
$HOME is not touched, --all sweeps the project files.
- --include=hooks,commands installs only those two surfaces.
- --keep-hooks --keep-commands during uninstall preserves them.
- Round trip: 12 installed → 14 removed default → 21 removed --all.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add CI workflow: shellcheck + install/uninstall round-trip on Linux + macOS. - Soften block-force-push.sh: branch -D rule narrowed to protected names so worktree feature-branch cleanup is allowed. - Soften verify-before-stop.sh: only fires on a dirty working tree; clean tree short-circuits to allow Stop. Documents CLAUDE_SKIP_VERIFY=1 override. - Add scripts/_detect_stack.py — detects PHP/Laravel, Node/JS/TS, Python, Go, Rust, Ruby, Elixir from manifest files and emits Markdown bullets for the '## Stack signals' section. - install.sh now auto-fills CLAUDE.md '## Stack signals' from the detector at install time, removing one of the manual hand-edit steps. - Cosmetic: shellcheck-clean status.sh and uninstall.sh; selective SC2088/SC2016 disables for literal $VAR/tilde strings written into JSON. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- scripts/update.sh: refresh installed files vs current templates without clobbering customisations. sha256 content-match per file → identical / missing / modified. Modified files default to SKIP with a printed diff; --force overwrites, --merge writes the new template to <file>.new alongside for manual review. - scripts/doctor.sh: end-to-end diagnostic. Combines status with sanity checks: hashing tool present, target writable, settings.json parseable, all 4 hooks executable and wired (no foreign paths), block-force-push smoke- test, memory dir populated, CLAUDE.md '## Stack signals' not still placeholder, snapshot repo recency. - SKILL.md + README.md: dispatch tables, file inventories, and prose updated for the new sub-actions. - Add MIT LICENSE at repo root and link from main README. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Adoption was buried in chat. This wires it into the skill so the same
language ("adopt", "retrofit", "add to my project") routes to a real path.
- scripts/adopt.sh: project-scope only. Detects project root (refuses to
write into $HOME), reports existing CLAUDE.md / .claude/ / settings.json /
scripts/harness-check.sh, runs _detect_stack.py for stack signals, scaffolds
a starter scripts/harness-check.sh from a template, prints the next-step
install command. Skips overwrite without --force; --dry-run supported.
- assets/harness-check.sh.tmpl: stack-aware starter pass/fail gate. Runs
lint / types / tests for whichever ecosystem files are present
(package.json, composer.json, pyproject.toml, go.mod, Cargo.toml, Gemfile).
Empty-sensor case is treated as PASS so the script never strands Stop.
- SKILL.md / README.md / main README updated with dispatch entry, walkthrough,
flag table, and file inventory.
- CI roundtrip: new step verifies adopt writes the starter, won't overwrite
customisations without --force, refuses $HOME with exit 2, and emits the
expected starter blocks for the detected stack.
Doesn't run install itself — adopt prepares, the user reviews the install
preflight separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- status.sh: guard python3 availability and try/except settings.json parse so missing python3 / invalid JSON produces a clear status line instead of a runtime error. - snapshot.sh: secret scan now uses --exclude-dir=.git/node_modules/__pycache__ so it doesn't traverse those trees before filtering. - block-force-push.sh: comment claimed segment splitter handles lone & (background); it doesn't. Fix the comment to match reality. - update.sh: 'identical — no-op' now genuinely skips the cp instead of rewriting the file with the same contents (no mtime churn). - install.sh fill_stack_signals: export DETECT_ROOT inside the function so the log line is accurate without relying on caller-provided env. Drop the redundant DETECT_ROOT= prefixes from the two callers. - SKILL.md: collapse the two near-duplicate post-install hand-edit blocks into one that covers both user and project scope. - doctor.sh: explicit python3 capability check; settings.json validity + hook-wiring checks now skipped (with a clear WARN) when python3 is missing rather than misreporting 'not valid JSON'. PR description updated to reflect the actual count and surface of sub- actions (eight, including update / doctor / adopt added later in the PR). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot round-2 review — triage + fixesAddressed in Round 2 (8 comments) — fixes
Round 1 (11 comments on
|
Addresses two genuine correctness/security gaps from review. - install.sh + uninstall.sh: settings.json writes go through tmp + os.replace so an interrupted process can't leave settings.json empty/partial. Claude Code refuses to load invalid JSON, and "interrupted" is plausible (laptop sleep, ^C between truncate and flush, OOM). - block-force-push.sh: previously aspirational name — `git push origin :main` (refspec push-delete), `git push --delete origin main`, and `git push origin +HEAD:main` (the leading + is force without --force) all bypassed the rule. Three new patterns close the loop. Protected-branch list matches the existing `git branch -D` rule so feature-branch deletes still flow through. - CI: matrix step exercises all three new push patterns (block + allow), plus asserts no settings.json.tmp leftover after install and that the resulting JSON parses. Verified locally: 15/15 block-force-push test cases pass (8 block, 7 allow); shellcheck silent; doctor 11/0/0; settings.json write leaves no tmp file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- harness-check.sh.tmpl: drop the narrow grep on composer.json scripts before running pint. If ./vendor/bin/pint exists, just run it. The previous guard skipped pint in any project that didn't have a "pint" or "format:check" script wired — which is most of them. - CI: drop a fake package.json into the test $HOME so _detect_stack picks it up; assert the placeholder block in CLAUDE.md was replaced and that the detected bullets contain something from the manifest (TypeScript / Next / React / npm). - CI: doctor.sh round-trip step asserts "0 fail" on a clean install and that the block-force-push smoke result is present in the output. - CI: update.sh round-trip — modify a hook, run update, assert (a) the modified hook is flagged as such and not overwritten, (b) an identical hook's mtime is unchanged (genuine no-op), and (c) update --merge writes <file>.new alongside without clobbering the original. Smoke verified locally: stack signals filled correctly (TypeScript / Next / Vitest detected from a fake package.json), doctor 11/0/0, update.sh identical-mtime preserved, --merge writes .new. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The stack-signal auto-fill (added earlier this PR) replaces the placeholder in CLAUDE.md with detected bullets at install time. The existing 'status — every surface should report installed/wired/set' assertion grepped the full status output for any 'modified' line and failed — but the modification is now expected. Filter CLAUDE.md out of the modified check; any other modified surface still fails the build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build went green on 1fb78de — this is just clearing the Node.js 20 deprecation warning that GitHub now emits on v4. v5 ships with Node 24, which is the runtime GitHub will force on June 2nd 2026. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds a
/harnessskill that turns~/.claude/(or<project>/.claude/) into a proper harness — feedforward guides Claude reads before it acts, deterministic sensors that catch drift after, and an optional drift-detection loop that PRs deltas against the latest releases each month.A control surface, not a one-shot bootstrap. Eight sub-actions, all idempotent:
installCLAUDE.md, four guardrail hooks,/verifyand/planslash commands, auto-memory seeds,settings.jsonpatch (env + hooks only — never touches permissions / marketplaces / statusLine). Stack signals auto-filled from manifest detection.uninstall--allfor full sweep;--keep-*escape hatchesupdate--mergewrites new template to<file>.newfor diffable side-by-side reviewdoctoradoptscripts/harness-check.shpass/fail gate, prints next-step install command. Refuses$HOMEsnapshot~/.claude/→ a private git repo (caches scrubbed, secret-pattern scanned with--exclude-dir=.git, idempotent)statusinstalled/modified/missingper surface, hook-wiring, env var, snapshot-repo stateaudit/schedule) that researches the last ~30 days of Anthropic releases + canonical Claude Code voices and PRsaudits/YYYY-MM-DD-setup-audit.mdagainst the snapshot repoScope auto-detection: derived from
$SKILL_DIR. Override with--scope=user|projector--target=PATH.The four hooks:
block-force-push.sh(PreToolUse:Bash) — segment-aware matcher. Blocks force-push tomain/master, hard reset to remote,rm -rf ~,--no-verify, world-writable chmod, branch -D on protected names. Allows--force-with-leaseand worktree feature-branch deletion. Doesn't false-trigger on echoed strings.format-on-edit.sh(PostToolUse:Write|Edit) — runs Pint /bun run format/npm run format/ ruff / gofmt / cargo fmt if config exists. Silent on success.post-compact-reinject.sh(PostCompact) — re-cats./CLAUDE.md,./AGENTS.md,~/.claude/CLAUDE.mdafter autocompact, so the operating contract survives.verify-before-stop.sh(Stop) — refuses Stop only if the working tree is dirty AND./scripts/harness-check.shfails. Clean-tree short-circuit prevents stranding mid-investigation.CLAUDE_SKIP_VERIFY=1to override.CI
GitHub Actions workflow (
.github/workflows/skill-harness.yml):.shubuntu-latest+macos-latest$HOMErefusal + customisation preservationSources
The vocabulary follows OpenAI's Harness engineering and Martin Fowler's writeup. Day-to-day patterns are convergent picks from:
Sibling to datashaman/harness-template (the project-scope counterpart).
/harness adoptis the bridge — drops the spine of harness-template's pattern into any project.A note on the name
Originally drafted as
bootstrap-harness. Renamed toharnessbecause:/harness <action>reads like the corresponding sub-CLI in any other tool.🤖 Generated with Claude Code