Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions skills/harness/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ If the skill is being invoked from a checkout (no `.claude/` ancestor), the scri
| ------------------------ | --------------------------------------------------------------------------------------------------- |
| `~/.claude/CLAUDE.md` | Operating-contract template (default stance, editing rules, expected tools) |
| `~/.claude/hooks/` | `block-force-push.sh`, `format-on-edit.sh`, `post-compact-reinject.sh`, `verify-before-stop.sh` |
| `~/.claude/commands/` | `/verify` (run the project's pass/fail check), `/plan` (Goal/Constraints/Acceptance template) |
| `~/.claude/commands/` | `/verify` (project pass/fail), `/plan` (Goal/Constraints/Acceptance), `/critique` (mid-flow critique pass) |
| `~/.claude/projects/<slug>/memory/` | `MEMORY.md` index + `user_role`, `feedback_concise`, `feedback_plan_first`, `feedback_verification` |
| `~/.claude/settings.json`| Adds `env.CLAUDE_CODE_AUTO_COMPACT_WINDOW=400000` + 4 hook entries (only if missing) |

Expand All @@ -69,7 +69,7 @@ If the skill is being invoked from a checkout (no `.claude/` ancestor), the scri
| --------------- | ------------------------------------------------------------------------------------------------ |
| `<project>/CLAUDE.md` | Operating contract — **skipped if a project CLAUDE.md already exists** (`--force` overrides) |
| `<project>/.claude/hooks/` | Same 4 hooks |
| `<project>/.claude/commands/` | `/verify`, `/plan` |
| `<project>/.claude/commands/` | `/verify`, `/plan`, `/critique` |
| `<project>/.claude/settings.json` | 4 hook entries with **project-relative** paths (`.claude/hooks/...`) |
| Memory | NOT seeded at project scope — memory is per-user by design and lives under `$HOME` |
| Env var | NOT set at project scope — `CLAUDE_CODE_AUTO_COMPACT_WINDOW` is session-wide |
Expand Down Expand Up @@ -170,7 +170,7 @@ None. The skill detects intent from natural language. If unclear, it runs `statu
| `README.md` | This — human-facing overview |
| `assets/CLAUDE.md.tmpl` | Operating-contract template |
| `assets/hooks/*.sh` | Four hook scripts |
| `assets/commands/*.md` | `/verify`, `/plan` |
| `assets/commands/*.md` | `/verify`, `/plan`, `/critique` |
| `assets/memory/*.tmpl` | MEMORY.md index + 3 feedback memories + user_role template |
| `scripts/install.sh` | Idempotent installer (`--dry-run` / `--force` / `--skip-*`) |
| `scripts/uninstall.sh` | Symmetric uninstaller (content-match check; `--all` for full sweep) |
Expand Down
10 changes: 7 additions & 3 deletions skills/harness/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ What lands at user scope:
| ------------------------ | --------------------------------------------------------------------------------------------------- |
| Operating contract | `~/.claude/CLAUDE.md` |
| Hooks | `~/.claude/hooks/{block-force-push,format-on-edit,post-compact-reinject,verify-before-stop}.sh` |
| Slash commands | `~/.claude/commands/{verify,plan}.md` |
| Slash commands | `~/.claude/commands/{verify,plan,critique}.md` |
| Auto-memory | `~/.claude/projects/<slug>/memory/{MEMORY.md, user_role, feedback_concise, feedback_plan_first, feedback_verification}` |
| settings.json | Adds `env.CLAUDE_CODE_AUTO_COMPACT_WINDOW=400000` + 4 hook entries (uses `~/.claude/hooks/...` form) |

Expand All @@ -75,7 +75,7 @@ What lands at project scope (`<project>/.claude/`):
| --------------- | ----------------------------------------------------------------------------------------------- |
| Operating contract | `<project>/CLAUDE.md` (skipped if it already exists — most projects have one. `--force` overrides) |
| Hooks | `<project>/.claude/hooks/*.sh` |
| Commands | `<project>/.claude/commands/*.md` |
| Commands | `<project>/.claude/commands/{verify,plan,critique}.md` |
| settings.json | `<project>/.claude/settings.json` — 4 hook entries with `.claude/hooks/...` (project-relative) form. **No env var, no memory** at project scope. |
| Memory | (skipped — memory is per-user by design and lives under `$HOME` regardless of project scope) |

Expand Down Expand Up @@ -278,6 +278,10 @@ Env knobs (mirror `snapshot.sh`):

Scope: the snapshot repo doesn't mirror harness scripts or local search roots, so the remote agent does its own in-process structural pass (index sync, frontmatter) and adds the semantic checks. Stale-citation analysis stays local-only — the search roots aren't available remotely. The remote agent PRs `audits/memory/YYYY-MM-DD.md` with proposed edits and never modifies any memory entry.

## PostToolUse critique hook — recommendation

We considered a hook that fires after N consecutive `Edit`/`Write` calls and runs an automatic critique pass (diff summary + `advisor()` check). **Recommendation: don't ship it.** The boundary critiques we already have (`verify-before-stop.sh` at Stop, `advisor()` calls bracketing non-trivial work, `/critique` available on demand mid-flow) cover the same gap with much better signal-to-noise. An always-on PostToolUse critique would (a) burn tokens and wall-clock on edits that don't need it, (b) train the agent to ignore the noise, and (c) duplicate what `/critique` already provides on user demand. Revisit if users report that mid-flow drift is escaping all three boundary surfaces — but the right next step there would be tuning when `/critique` gets *suggested*, not making it automatic.

## Constraints

- **Never auto-fill stack signals or user_role.** Templates have placeholders; ask the user to fill them.
Expand All @@ -294,7 +298,7 @@ Scope: the snapshot repo doesn't mirror harness scripts or local search roots, s
| `README.md` | Human-facing overview (with sources / inspiration) |
| `assets/CLAUDE.md.tmpl` | Operating-contract template |
| `assets/hooks/*.sh` | Four hook scripts |
| `assets/commands/*.md` | `/verify`, `/plan` |
| `assets/commands/*.md` | `/verify`, `/plan`, `/critique` |
| `assets/memory/*.tmpl` | MEMORY.md index + 3 feedback memories + user_role template |
| `scripts/install.sh` | Idempotent installer (`--dry-run` / `--force` / `--skip-*`) |
| `scripts/uninstall.sh` | Symmetric uninstaller (content-match check; `--all` for full sweep) |
Expand Down
7 changes: 7 additions & 0 deletions skills/harness/assets/CLAUDE.md.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,18 @@
- Don't add comments unless the WHY is non-obvious. Don't add docs unless asked.
- Don't add error handling for impossible cases. Don't introduce abstractions for hypothetical futures.

## The PTC loop
Work in this harness is a **Plan → Tool → Critique** loop, named so we share vocabulary. Reflexion (Shinn et al. 2023, https://arxiv.org/abs/2303.11366) is the canonical reference for the pattern.
- **Plan** — `/plan`, plan mode (shift-tab twice). Draft Goal/Constraints/Acceptance before editing.
- **Tool** — `Edit`, `Write`, `Bash`, sub-agents. Where the work happens.
- **Critique** — `verify-before-stop.sh` and `advisor()` at boundaries; `/critique` *between* tool steps when an interpretation is about to harden. Don't only critique at Stop.

## Tools I expect you to use
- `advisor()` before committing to an approach on tasks longer than a few steps, and before declaring done.
- Sub-agents (`Explore`, `general-purpose`) for parallel research — don't duplicate their searches in the main thread.
- `/verify` to run the project's pass/fail check.
- `/plan` to draft Goal/Constraints/Acceptance for a non-trivial task.
- `/critique` to deliberately critique the most recent change mid-flow (don't wait for Stop).

## Stack signals
<!-- Replace with your default stack. Per-project CLAUDE.md overrides this. -->
Expand Down
19 changes: 19 additions & 0 deletions skills/harness/assets/commands/critique.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
---
description: Critique the most recent change before continuing — diff re-read + advisor() pass
---

A deliberate critique pass *between* steps, without waiting for Stop. Use mid-flow when an interpretation is about to harden into a chain of edits, or before declaring a milestone done.

Procedure:

1. **Re-read the change.** Pick the branch that matches the repo state — don't run all three:
- **Dirty working tree (git):** `git diff HEAD` for tracked changes, then `git status --porcelain` and `Read` any `??` (untracked) entries directly — `git diff` skips brand-new files.
- **Clean tree, change is the last commit (git):** `git show HEAD` to re-read what was just committed. Catches formatter output, Bash-generated files, and edits from earlier turns that the transcript may have aged out.
- **Fresh repo with no commits, or not a git repo:** summarise the last few `Edit`/`Write`/`Bash` calls that touched files, working from the tool transcript.
2. **Critique it yourself first.** One paragraph: what does this change actually do, what could be wrong, what assumption is it leaning on?
3. **Call `advisor()`.** The advisor sees the full transcript — frame the question concretely. Examples: "Is the off-by-one in `foo()` actually a bug or am I misreading the loop?", "Does this migration handle the concurrent-write case I claimed it did?", "I rewrote X to use Y — is the boundary correct, or did I leak Y's concerns into a caller?"
4. **Report a punch list.** Under 200 words. What stands, what's shaky, what's the next concrete check.

If `$ARGUMENTS` is provided, treat it as a scope hint — `/critique migration safety`, `/critique the test I just wrote` — and focus the diff re-read and the advisor question on that.

Do not start new implementation work in this turn. The output of `/critique` is *findings*, not edits — the user (or the next turn) decides what to do about them.
4 changes: 2 additions & 2 deletions skills/harness/scripts/install.sh
Original file line number Diff line number Diff line change
Expand Up @@ -201,13 +201,13 @@ EOF
if [ "$SCOPE" = "user" ]; then
plan_line "$TARGET/CLAUDE.md" "$SKIP_CLAUDE_MD" "0" "operating contract — edit Stack signals after install"
plan_line "$TARGET/hooks/*.sh" "$SKIP_HOOKS" "0" "4 guardrails: block-force-push / format-on-edit / post-compact-reinject / verify-before-stop"
plan_line "$TARGET/commands/{verify,plan}.md" "$SKIP_COMMANDS" "0" "slash commands"
plan_line "$TARGET/commands/*.md" "$SKIP_COMMANDS" "0" "slash commands"
plan_line "$MEMORY_DIR/*.md" "$SKIP_MEMORY" "0" "MEMORY.md index + 4 auto-memory templates"
plan_line "$TARGET/settings.json" "$SKIP_SETTINGS" "0" "additive: env.CLAUDE_CODE_AUTO_COMPACT_WINDOW + 4 hook entries (NEVER touches permissions, marketplaces, statusLine, advisorModel, theme)"
else
plan_line "$(dirname "$TARGET")/CLAUDE.md" "$SKIP_CLAUDE_MD" "0" "operating contract — skipped if project already has a CLAUDE.md (use --force to override)"
plan_line "$TARGET/hooks/*.sh" "$SKIP_HOOKS" "0" "4 guardrails"
plan_line "$TARGET/commands/{verify,plan}.md" "$SKIP_COMMANDS" "0" "slash commands"
plan_line "$TARGET/commands/*.md" "$SKIP_COMMANDS" "0" "slash commands"
plan_line "memory templates" "$SKIP_MEMORY" "$PROJECT_SCOPE_NO_MEMORY" "memory is per-user; not seeded at project scope"
plan_line "$TARGET/settings.json" "$SKIP_SETTINGS" "0" "additive: 4 hook entries (no env var at project scope)"
echo " (nothing in \$HOME is modified at project scope)"
Expand Down
4 changes: 2 additions & 2 deletions skills/harness/scripts/uninstall.sh
Original file line number Diff line number Diff line change
Expand Up @@ -168,8 +168,8 @@ cat <<EOF
EOF
[ $KEEP_HOOKS -eq 1 ] && echo " — $TARGET/hooks/*.sh (KEEP — --keep-hooks)" \
|| echo " • $TARGET/hooks/*.sh"
[ $KEEP_COMMANDS -eq 1 ] && echo " — $TARGET/commands/{verify,plan}.md (KEEP — --keep-commands)" \
|| echo " • $TARGET/commands/{verify,plan}.md"
[ $KEEP_COMMANDS -eq 1 ] && echo " — $TARGET/commands/*.md (KEEP — --keep-commands)" \
|| echo " • $TARGET/commands/*.md"
[ $KEEP_SETTINGS -eq 1 ] && echo " — Hook entries in $TARGET/settings.json (KEEP — --keep-settings)" \
|| echo " • Hook entries in $TARGET/settings.json (other settings untouched)"
cat <<EOF
Expand Down