Skip to content

feat(setup-swarm): broaden git reset allow-rule to all forms#25

Open
gabadi wants to merge 75 commits into
unclebob:mainfrom
gabadi:feat/broaden-git-reset-allow-rule
Open

feat(setup-swarm): broaden git reset allow-rule to all forms#25
gabadi wants to merge 75 commits into
unclebob:mainfrom
gabadi:feat/broaden-git-reset-allow-rule

Conversation

@gabadi

@gabadi gabadi commented Jun 23, 2026

Copy link
Copy Markdown

What

Step 4 of setup-swarm pre-authorizes git/gh commands so the integrator and specifier run unattended. The reset rule was scoped to remote refs only (git reset --hard origin/*), so resets to HEAD or a sha still tripped the auto-mode permission classifier.

Broadens it to git reset --hard* in both the JSON example and the Python merge list.

Why

From the crap4py enforcement-gate backlog (2026-06-22): the auto-mode classifier blocked in-role autonomous git reset --hard / gh pr merge across specifier+integrator in the same pipeline run. gh pr merge* was already covered; this closes the reset gap.

🤖 Generated with Claude Code

gabadi and others added 30 commits June 14, 2026 01:33
Record only actual discrepancies vs unclebob/swarm-forge, one ADR each:
- 0001 permanent fork, synced by merge (sync policy: merge not rebase, additive, rerere)
- 0002 idle gate + clear-first Stop-hook delivery (the only engine discrepancy;
  presence side channel for the awake notification)
Plus CONTEXT.md: minimal glossary of fork-specific terms only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ude Code first

- idle/busy marker is the one new piece; reuse upstream's queue as the inbox
- two cases: busy -> Stop hook delivers on next stop; idle -> deliver now (clear first either way)
- /clear clears the session for any agent -> re-injection is universal (no backend split)
- implement via Claude Code hooks first; codex/grok delivery pending

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ADRs 0003-0008 define fork divergences vs upstream (definition only,
not implementation):

- 0003 setup is a one-time, stack-aware skill (additive file, no merge
  conflict); run path does no project setup
- 0004 rework routes back to the stage whose decision it exposes;
  local fixes stay with finder; at most one bounce
- 0005 QA refutes (assumes build fails spec + tests too weak) rather
  than confirms; includes conversion fidelity
- 0006 harness-enforced holdout — status: proposed (open item)
- 0007 UX Engineer role that fixes against UX Intent (six-pack)
- 0008 terminal Integrator lands only behind a green CI gate, no
  local-merge fallback

CONTEXT.md gains glossary terms: Setup skill, Integrator, UX Engineer,
UX Intent, Refuting QA, Back-routing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Settle the open item as accepted. The end-to-end QA suite is held out
mechanically, not by prompt instruction:

- mechanism: git sparse-checkout excludes the suite's pinned path from
  each role worktree (absent from disk, still tracked in the commit) --
  rejecting rm-from-worktree (commit would drop it) and separate-branch
  (more flow, no extra protection)
- scope: hidden from all implementer roles; visible only to the
  specifier (authors it) and QA (runs it)
- precondition: specifier writes the suite under a pinned path; existing
  coder "ignore it" line stays as defense-in-depth

Backed by verification_loop.md: holdout leakage "must be enforced
architecturally," not instructionally. Adds CONTEXT.md term "QA holdout".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Record the feature-template divergence (idea L) from the real artifact
on backup/six-pre-reset, not the idea-file summary:

- structured comment header above the Gherkin: TRACKING, CONTRACT,
  CONSTRAINTS, SEQUENCING, NFR, SIDE EFFECTS, SCOPE (each with an
  Ask:/Format:); the spec-authoring layer the scenarios can't carry
- six-pack adds an 8th section, UX INTENT (home for ADR 0007); four-pack
  has the 7 only, since the UX Engineer is six-pack-only
- address every section; SEQUENCING/SIDE EFFECTS/UX INTENT default to
  `none` (a deliberate answer; `none` UX INTENT = UX Engineer skips)
- impl note: six-pack specifier prompt still says "seven sections" but
  the template has eight -- fix to eight/all on landing

Adds CONTEXT.md term "Spec header".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Idea P recorded from the real backup/six-pre-reset artifacts, split into
two independent divergences (plus idea Q folded into 0010):

- 0010 surface-harness doctrine: live-verification roles (QA both packs,
  UX Engineer six-pack) drive the running system through its real surface
  via a declared per-surface tool (surface tool table in
  engineering.prompt); every surface carries a mandatory idle baseline
  scenario -- the direct fix for the tetris idle-state defects. Replaces
  QA's "through the UI only" with "through the declared surface harness"
  + every Expected bullet -> assertion or NOT AUTOMATED (idea Q). No
  surface field in project.prompt (resolves the stale monolith table row)
- 0011 dependency-fidelity-manifest: new dependency-manifest.prompt
  declaring deps by tier 1/2/3 with machine-readable gaps the specifier
  and QA refuse to build on; twin-authorship + post-interaction-state
  rules; specifier-owned, defaults (none)

0005 gains a cross-ref: conversion fidelity is audited by 0010's
bullet->assertion-or-NOT-AUTOMATED rule. CONTEXT.md gains: Surface
harness, Baseline scenario, Fidelity manifest, Dependency tier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Record idea U from the backup design. Optional inline key=value tail on
window lines (existing 4-field lines unchanged; upstream hard-rejects
!=4 fields, so this is a real but backward-compatible parser change):

- model (all backends), effort (claude/copilot/grok), advisor (claude
  only); unsupported keys silently ignored
- per-role not per-backend; no pre-populated values (fully opt-in)
- lands on main (swarmforge.sh); runnable configs stay topology-only
- impl note: verify `claude --advisor` actually exists; treat as
  reserved-but-inert if not

No new CONTEXT.md vocabulary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hy only)

Audit of the real backup/six-pre-reset prompts vs the intent-written
ADRs surfaced lost decisions and contradictions; fix them and adopt the
house style (no rejected-options sections, no historical/legacy notes):

- 0007: add the universal visual-quality bar (AI-aesthetic anti-patterns,
  type hierarchy, WCAG contrast) the UX Engineer enforces regardless of
  project input; make the durable artifact concrete (observation-harness/,
  golden snapshots, rendering invariants); DESIGN.md is referenced-only
  (no scaffold, no walk-up)
- 0010: add the observation-harness/ committed regression record that QA
  re-executes
- 0008: integrator hands off to the curator; post-merge gate; N=3 cap
- 0004: two-scope cap -- one bounce per finding, N=3 cycles per feature
- 0005: conversion fidelity audited via 0010's bullet->assertion rule
- strip "## Considered options" from every ADR; remove legacy mapping
- CONTEXT.md: add Observation harness; update Back-routing for the caps

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Record the knowledge-promotion loop (issue #20), split into two
divergences:

- 0013 curator: terminal role after the integrator that promotes session
  retros to versioned repo knowledge via one self-merging PR, then
  releases the specifier. Capture-everything with scope tags; single
  discard gate (non-inferable check); routing ladder (gate > AGENTS.md >
  role file > reference > skill > upstream > ledger); append-only ledger;
  budgets 60/40; integrator->curator->specifier; empty run never stalls
- 0014 .agents contract: AGENTS.md + .agents/ written only by the curator,
  versioned in the repo (not ~/.claude), injected into each role bundle
  at launch (AGENTS.md for all, role file role-scoped; references by
  pointer); missing files silently skipped

CONTEXT.md: add Curator, Promoted knowledge, Knowledge ledger.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…detection

Final pipeline divergences (ideas R, S):

- 0015 R: constitution rule (workflow.prompt) -- a spec requirement that
  conflicts with real platform capability stops and reports to the user;
  a workaround comment in code is the smell the rule fired and was
  suppressed. Narrow to spec-vs-platform, binds all roles.
- 0016 S: cleaner (six-pack) / refactorer (four-pack) also scans boundary
  files at a ~15-20 mutation-site threshold (vs 100 for testable files);
  above = logic in an adapter shell, extract before handoff. Plus the
  "tested only through a stripped view = untested" anti-pattern.

Idea T (evidence-as-code) is covered by 0010's observation-harness loop;
G/H/I are rejected (no divergence, no ADR).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Permanent rebase index (one row per divergence: change + target + recover-from
branch:path + ADR) plus per-row recovery detail for the main script layer,
six-pack role prompts, and the setup-swarm skill. Records the section-C ADR
assignments and the decisions settled in the what's-missing pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ivergences

Promote the uncaptured main-side script divergences to ADRs: inlined prompt
bundle (idea B, translated onto the tmux harness), swarm self-pin/upgrade
(idea N), autonomous permission mode (--permission-mode auto, verified real),
worktree auto-compaction (idea F), and the retro-triage operator skill.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
0002 session-restart executing {message,hash,sender}; 0013 session-retro
transcript capture + before-idle trigger (idea J); 0003 setup-swarm rename,
setup-first/guard model (idea-K dropped) + install scaffold (idea O); 0010
hardener rendering-invariant property tests. Glossary: setup-swarm,
swarm-ready marker, session retro, retro-triage, prompt bundle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve six consistency/accuracy issues across the fork-divergence ADRs:

- four-pack freeze: scrub "apply to four-pack" instructions from ADRs
  0008/0009/0010/0011/0013/0015/0016 (frozen per ADR 0001 / manifest)
- logbook.json -> logbook.jsonl (real runtime file; handoff-lib.sh:33)
- ADR 0012 advisor: no --advisor flag; written as advisorModel into
  settings.local.json (per backup write_worktree_advisor)
- manifest Section C: drop stale "16 ADRs / no ADR" (now 0017-0021 assigned)
- CONTEXT.md: remove leaked memory-slug wikilink, keep ADR 0003 ref
- ADR 0006 holdout: key sparse-checkout on specifier role, not the master
  worktree name (ADR 0008 renames it to specifier)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ranch

Task-by-task plan (C1-C11 on main, D1-D14 on six-pack) re-applying every
documented divergence onto pristine upstream. One PR per delivery branch,
each divergence an ordered squashed commit. four-pack frozen; cmux dropped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…stream policy

Record the pristine-upstream baseline explicitly (manifest SHA line +
fork-base/<date>-<branch> tag) since a hard reset erases merge-history
anchoring. Fork divergences are squash-merged (one clean commit each);
upstream syncs are history-preserving merges (never squash/rebase).
Resolves manifest still-open #5 (PR shape).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: ADRs for fork divergences vs upstream

Record only actual discrepancies vs unclebob/swarm-forge, one ADR each:
- 0001 permanent fork, synced by merge (sync policy: merge not rebase, additive, rerere)
- 0002 idle gate + clear-first Stop-hook delivery (the only engine discrepancy;
  presence side channel for the awake notification)
Plus CONTEXT.md: minimal glossary of fork-specific terms only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0002): settle two-case delivery; universal re-injection; Claude Code first

- idle/busy marker is the one new piece; reuse upstream's queue as the inbox
- two cases: busy -> Stop hook delivers on next stop; idle -> deliver now (clear first either way)
- /clear clears the session for any agent -> re-injection is universal (no backend split)
- implement via Claude Code hooks first; codex/grok delivery pending

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr): capture five accepted pipeline divergences + one open item

ADRs 0003-0008 define fork divergences vs upstream (definition only,
not implementation):

- 0003 setup is a one-time, stack-aware skill (additive file, no merge
  conflict); run path does no project setup
- 0004 rework routes back to the stage whose decision it exposes;
  local fixes stay with finder; at most one bounce
- 0005 QA refutes (assumes build fails spec + tests too weak) rather
  than confirms; includes conversion fidelity
- 0006 harness-enforced holdout — status: proposed (open item)
- 0007 UX Engineer role that fixes against UX Intent (six-pack)
- 0008 terminal Integrator lands only behind a green CI gate, no
  local-merge fallback

CONTEXT.md gains glossary terms: Setup skill, Integrator, UX Engineer,
UX Intent, Refuting QA, Back-routing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0006): decide harness-enforced holdout via sparse-checkout

Settle the open item as accepted. The end-to-end QA suite is held out
mechanically, not by prompt instruction:

- mechanism: git sparse-checkout excludes the suite's pinned path from
  each role worktree (absent from disk, still tracked in the commit) --
  rejecting rm-from-worktree (commit would drop it) and separate-branch
  (more flow, no extra protection)
- scope: hidden from all implementer roles; visible only to the
  specifier (authors it) and QA (runs it)
- precondition: specifier writes the suite under a pinned path; existing
  coder "ignore it" line stays as defense-in-depth

Backed by verification_loop.md: holdout leakage "must be enforced
architecturally," not instructionally. Adds CONTEXT.md term "QA holdout".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0009): feature files open with a structured spec header

Record the feature-template divergence (idea L) from the real artifact
on backup/six-pre-reset, not the idea-file summary:

- structured comment header above the Gherkin: TRACKING, CONTRACT,
  CONSTRAINTS, SEQUENCING, NFR, SIDE EFFECTS, SCOPE (each with an
  Ask:/Format:); the spec-authoring layer the scenarios can't carry
- six-pack adds an 8th section, UX INTENT (home for ADR 0007); four-pack
  has the 7 only, since the UX Engineer is six-pack-only
- address every section; SEQUENCING/SIDE EFFECTS/UX INTENT default to
  `none` (a deliberate answer; `none` UX INTENT = UX Engineer skips)
- impl note: six-pack specifier prompt still says "seven sections" but
  the template has eight -- fix to eight/all on landing

Adds CONTEXT.md term "Spec header".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0010,0011): split observation harness into surface + fidelity

Idea P recorded from the real backup/six-pre-reset artifacts, split into
two independent divergences (plus idea Q folded into 0010):

- 0010 surface-harness doctrine: live-verification roles (QA both packs,
  UX Engineer six-pack) drive the running system through its real surface
  via a declared per-surface tool (surface tool table in
  engineering.prompt); every surface carries a mandatory idle baseline
  scenario -- the direct fix for the tetris idle-state defects. Replaces
  QA's "through the UI only" with "through the declared surface harness"
  + every Expected bullet -> assertion or NOT AUTOMATED (idea Q). No
  surface field in project.prompt (resolves the stale monolith table row)
- 0011 dependency-fidelity-manifest: new dependency-manifest.prompt
  declaring deps by tier 1/2/3 with machine-readable gaps the specifier
  and QA refuse to build on; twin-authorship + post-interaction-state
  rules; specifier-owned, defaults (none)

0005 gains a cross-ref: conversion fidelity is audited by 0010's
bullet->assertion-or-NOT-AUTOMATED rule. CONTEXT.md gains: Surface
harness, Baseline scenario, Fidelity manifest, Dependency tier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0012): per-role model/effort/advisor in swarmforge.conf

Record idea U from the backup design. Optional inline key=value tail on
window lines (existing 4-field lines unchanged; upstream hard-rejects
!=4 fields, so this is a real but backward-compatible parser change):

- model (all backends), effort (claude/copilot/grok), advisor (claude
  only); unsupported keys silently ignored
- per-role not per-backend; no pre-populated values (fully opt-in)
- lands on main (swarmforge.sh); runnable configs stay topology-only
- impl note: verify `claude --advisor` actually exists; treat as
  reserved-but-inert if not

No new CONTEXT.md vocabulary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr): recapture lost role detail; lean ADR style (divergence + why only)

Audit of the real backup/six-pre-reset prompts vs the intent-written
ADRs surfaced lost decisions and contradictions; fix them and adopt the
house style (no rejected-options sections, no historical/legacy notes):

- 0007: add the universal visual-quality bar (AI-aesthetic anti-patterns,
  type hierarchy, WCAG contrast) the UX Engineer enforces regardless of
  project input; make the durable artifact concrete (observation-harness/,
  golden snapshots, rendering invariants); DESIGN.md is referenced-only
  (no scaffold, no walk-up)
- 0010: add the observation-harness/ committed regression record that QA
  re-executes
- 0008: integrator hands off to the curator; post-merge gate; N=3 cap
- 0004: two-scope cap -- one bounce per finding, N=3 cycles per feature
- 0005: conversion fidelity audited via 0010's bullet->assertion rule
- strip "## Considered options" from every ADR; remove legacy mapping
- CONTEXT.md: add Observation harness; update Back-routing for the caps

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0013,0014): curator role + .agents knowledge injection (idea V)

Record the knowledge-promotion loop (issue #20), split into two
divergences:

- 0013 curator: terminal role after the integrator that promotes session
  retros to versioned repo knowledge via one self-merging PR, then
  releases the specifier. Capture-everything with scope tags; single
  discard gate (non-inferable check); routing ladder (gate > AGENTS.md >
  role file > reference > skill > upstream > ledger); append-only ledger;
  budgets 60/40; integrator->curator->specifier; empty run never stalls
- 0014 .agents contract: AGENTS.md + .agents/ written only by the curator,
  versioned in the repo (not ~/.claude), injected into each role bundle
  at launch (AGENTS.md for all, role file role-scoped; references by
  pointer); missing files silently skipped

CONTEXT.md: add Curator, Promoted knowledge, Knowledge ledger.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0015,0016): platform-feasibility stop rule + boundary-logic detection

Final pipeline divergences (ideas R, S):

- 0015 R: constitution rule (workflow.prompt) -- a spec requirement that
  conflicts with real platform capability stops and reports to the user;
  a workaround comment in code is the smell the rule fired and was
  suppressed. Narrow to spec-vs-platform, binds all roles.
- 0016 S: cleaner (six-pack) / refactorer (four-pack) also scans boundary
  files at a ~15-20 mutation-site threshold (vs 100 for testable files);
  above = logic in an adapter shell, extract before handoff. Plus the
  "tested only through a stripped view = untested" anti-pattern.

Idea T (evidence-as-code) is covered by 0010's observation-harness loop;
G/H/I are rejected (no divergence, no ADR).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: fork change manifest + migration recovery docs (impl playbook)

Permanent rebase index (one row per divergence: change + target + recover-from
branch:path + ADR) plus per-row recovery detail for the main script layer,
six-pack role prompts, and the setup-swarm skill. Records the section-C ADR
assignments and the decisions settled in the what's-missing pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0017,0018,0019,0020,0021): document section-C script-layer divergences

Promote the uncaptured main-side script divergences to ADRs: inlined prompt
bundle (idea B, translated onto the tmux harness), swarm self-pin/upgrade
(idea N), autonomous permission mode (--permission-mode auto, verified real),
worktree auto-compaction (idea F), and the retro-triage operator skill.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0002,0003,0010,0013): extend ADRs + CONTEXT glossary

0002 session-restart executing {message,hash,sender}; 0013 session-retro
transcript capture + before-idle trigger (idea J); 0003 setup-swarm rename,
setup-first/guard model (idea-K dropped) + install scaffold (idea O); 0010
hardener rendering-invariant property tests. Glossary: setup-swarm,
swarm-ready marker, session retro, retro-triage, prompt bundle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: fix cross-doc inconsistencies from code review

Resolve six consistency/accuracy issues across the fork-divergence ADRs:

- four-pack freeze: scrub "apply to four-pack" instructions from ADRs
  0008/0009/0010/0011/0013/0015/0016 (frozen per ADR 0001 / manifest)
- logbook.json -> logbook.jsonl (real runtime file; handoff-lib.sh:33)
- ADR 0012 advisor: no --advisor flag; written as advisorModel into
  settings.local.json (per backup write_worktree_advisor)
- manifest Section C: drop stale "16 ADRs / no ADR" (now 0017-0021 assigned)
- CONTEXT.md: remove leaked memory-slug wikilink, keep ADR 0003 ref
- ADR 0006 holdout: key sparse-checkout on specifier role, not the master
  worktree name (ADR 0008 renames it to specifier)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: fork divergence implementation plan — 2 PRs, one per delivery branch

Task-by-task plan (C1-C11 on main, D1-D14 on six-pack) re-applying every
documented divergence onto pristine upstream. One PR per delivery branch,
each divergence an ordered squashed commit. four-pack frozen; cmux dropped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(adr-0001): upstream baseline anchor + squash-divergence/merge-upstream policy

Record the pristine-upstream baseline explicitly (manifest SHA line +
fork-base/<date>-<branch> tag) since a hard reset erases merge-history
anchoring. Fork divergences are squash-merged (one clean commit each);
upstream syncs are history-preserving merges (never squash/rebase).
Resolves manifest still-open #5 (PR shape).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs: ADRs 0003–0021, fork-change-manifest, and implementation plan
* feat(swarmforge): autonomous permission mode for unattended roles (ADR 0019)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): pre-resolve role prompt bundle into XML envelope (ADR 0017)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): inject AGENTS.md + .agents/roles into role bundle (ADR 0014)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): per-role model/effort/advisor in swarmforge.conf (ADR 0012)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): enable auto-compaction on role worktrees (ADR 0020)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): sparse-checkout the QA holdout from shaping roles (ADR 0006)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): carry {message,hash,sender} in executing logbook entry (ADR 0002)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): pin-aware idempotent skill install at launch (ADR 0018)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): add agent-retro skill — scoped, capture-first, autonomous (ADR 0013)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat: restore retro-triage skill (ADR 0021)

Allow .claude/skills/ tracking via gitignore negation so operator-facing
skills can be committed alongside the repo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(swarmforge): setup-swarm skill + swarm-ready marker guard + scaffold (ADR 0003, Idea O)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(setup-swarm): ask for stack instead of detecting; list only implemented stacks

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(swarmforge): address PR #31 review on swarmforge.sh

- Drop run-path .gitignore writes (logbook.jsonl/tmp/) from ensure_initial_gitignore
  and ensure_runtime_git_excludes; the .gitignore scaffold is owned by the
  setup-swarm skill (ADR 0003). Reverts both functions to upstream form. (comment 3)
- Remove remove_nonessential_clone_files + its call: rm -rf $WORKING_DIR/examples
  is unbacked data loss on any project with an examples/ dir. (comment 4 / review #1)
- Shell-quote per-role model/effort launch flags with ${(q)...} across all four
  backends so a quote in swarmforge.conf can't break the launch command. (review #5)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent-retro): model-aware pricing + correct end_time on truncated tail

- extract.py priced every agent/subagent at hardcoded Opus rates regardless of
  model (and the rate was stale pre-4.6 Opus $15/$75). Replace PRICE_PER_M with
  a per-family PRICE_TABLE (opus/sonnet/haiku/fable, current rates) + price_for_model
  + compute_cost; capture the actual model from each session's and subagent's own
  assistant records and price accordingly. Unknown models fall back to Opus so cost
  is never understated. (review #2/#3)
- extract_metadata_lite read end_time from a partial record when the file exceeds
  the 64KB tail buffer (tail starts mid-line; a timestamp embedded in a large
  tool_result could win). Drop the partial first tail line before the backward
  scan over complete lines. (review #4)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(swarmforge): fold worktree settings into one shared writer (ADR 0020)

write_worktree_permissions (0020 compaction keys) and write_worktree_advisor
(0012 advisor model) were two separate inline-python read-modify-write passes
over the same .claude/settings.local.json. Replace both with a single
write_worktree_settings <path> [advisor] that applies the compaction keys
always and advisorModel when set, sharing one RMW. prepare_worktrees calls it
for compaction; launch_role calls it with the advisor. Resolves review #8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(retro-triage): also glob curator processed/ archive in detector (ADR 0021)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: ADR 0002 clear-first delivery engine design spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(swarmforge): implement ADR 0002 clear-first delivery engine

- handoff-lib.sh: add handoff_project_dir/from, handoff_pending_dir,
  handoff_busy_file, handoff_agent_type, handoff_clear_first_deliver
- send-handoff.sh: replace direct notify-agent.sh call with
  claude-aware idle path (pending queue + noclobber busy marker)
- swarm-stop.sh: new Stop hook that drains pending queue on task end
- swarmforge.sh: write_worktree_settings gains stop_script param;
  launch_role wires Stop hook for claude roles and drops positional
  prompt from claude launch_cmd (clear-first delivery handles it)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…aude branching

- Remove executing logbook entry from complete-handoff and swarm-stop
- Revert handoff_append_logbook to upstream 3-arg schema (no hash/sender)
- Extract _do_deliver shared function in swarm-handoff
- Remove claude/non-claude branching; always clear-first deliver
- Remove resolve_agent duplication; resolve_session parameterized
- Remove dangling executing-field reference from ADR 0017
gabadi and others added 30 commits June 15, 2026 01:08
…rotocol

tmux has extended-keys on (3.6b + xterm-ghostty). Claude Code in Ghostty
requests the kitty keyboard protocol, so Enter sends \x1b[13u, not \x0d.
send-keys C-m injects raw \x0d which Claude Code in kitty mode ignores —
causing all delivered text to pile up in the input bar without submitting.

tmux named key 'Enter' is translated through the extended-keys system and
sends the correct sequence for whatever keyboard protocol the pane requested.

Also switched to /swarm-persona slash command (faster than natural language)
and trimmed the C-m + C-j redundancy down to a single Enter per submit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 1743b4a6144f
…lds (#33)

* feat(roles): add curator, integrator, ux-engineer and feature template

Adds the three new ADR-mandated roles (0007 ux-engineer, 0008 integrator,
0013/0014 curator) and the Gherkin feature template to main, bringing it
to parity with the six-pack branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(main): correct PR scope — back-routes fix only

Remove role files (curator, integrator, ux-engineer, feature.feature)
that belong on six-pack, not main. Shared articles on main are copied
to runnable branches at startup; role files live on the runnable branch.

Apply the back-routes fix to handoffs.prompt (the shared article):
forward-handoff body rule is now scoped to forward handoffs only;
back-routes carry role-specified diagnostic fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarmforge): local-* article load order and curator skill symlinks

Load order: queue non-local articles before local-* so local overrides
win — the previous alphabetical glob caused local-workflow.prompt to
load before workflow.prompt, reversing the intended precedence.

link_curator_skills(): creates .claude/skills/<n> → ../../.agents/skills/<n>
symlinks at boot and per worktree so Claude Code discovers curator-promoted
skills without an extra manual step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(workflow): add Idle Gate section to shared article

Consolidates two universal role-prompt lines into the shared article
so they apply to all roles without duplication:
- agent-retro before idle (ADR 0013)
- wait for handoff (ADR 0002)

Both were removed from individual role prompts in a prior session
but never landed in the shared article.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert(swarmforge): restore original article glob — local-* are additive not overriding

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
#34)

Migration work is complete (PRs #31, #32, #33 merged). These files were
working artifacts — their decisions live in docs/adr/; the role prompts
and scripts have been updated. No live reference points to them.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Merges upstream's daemon-backed handoff system while preserving all
fork divergences. Key changes from upstream:

- New handoff protocol: draft-file format (type/to/priority/task/commit)
  via swarm_handoff.sh → swarm_handoff.bb (Babashka)
- handoffd.bb daemon owns tmux delivery; already includes C-m+C-j fix
- Helper scripts: ready_for_next.sh/bb, done_with_current.sh/bb variants
- swarmforge.conf: receive_mode (task|batch) as field 5
- roles.tsv runtime file written at startup

Fork additions preserved / extended:
- model/effort/advisor KV pairs shifted to field 6+ (field 5 = receive_mode)
- QA_HOLDOUT_PATH + DAEMON_DIR both kept in swarmforge.sh
- ROLE_MODELS/ROLE_EFFORTS/ROLE_ADVISORS + RECEIVE_MODES all kept
- workflow.prompt: Idle Gate section kept
- All ADRs, AGENTS.md, CONTEXT.md, skills (setup-swarm, agent-retro,
  retro-triage) preserved

Dropped: old send/receive/resend/complete-handoff.sh (replaced by daemon)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Changes vs upstream (ADR-0002, ADR-0018):

- handoffd.bb notify!: clear-first delivery — /clear then /swarm-persona
  combined with wake-message as a single turn
- swarmforge.sh: drop send_initial_prompt (violates ADR-0002 idle gate);
  update check_helper_scripts; remove swarm-stop.sh hook registration;
  restore send_initial_grok_prompt verbatim (grok-only, unchanged)
- handoffs.prompt: run agent-retro before done_with_current.sh
- workflow.prompt Idle Gate: wait for handoff only (retro now per-task)
- Delete swarm-handoff and swarm-stop.sh: redundant with daemon protocol
  and inbox state machine

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
merge(upstream): Babashka handoff port + daemon protocol + fork fixes
…gration

# Conflicts:
#	swarmforge/scripts/swarmforge.sh
Extract all fork-specific ADR implementations (0006, 0012, 0017, 0018,
0019, 0020, 0021) into swarmforge/scripts/fork.bb, loaded at runtime via
(load-file ...) before -main. This isolates fork additions from upstream's
swarmforge.bb to minimize future merge conflicts.

swarmforge.bb receives targeted additions only: kv-field parsing for model/
effort/advisor per role (ADR 0012), permission-mode auto, load-file hook,
and forward-declares for fork.bb symbols to satisfy SCI analysis phase.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat(fork): merge upstream Babashka port + migrate ADR divergences to fork.bb
Encodes the merge-base intersection technique for identifying real
conflicts, the fork.bb migration pattern, and swarm-forge-specific
rules (rtk prefix, gabadi/swarm-forge target, ADR style).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ixes

Upstream adds: iterm2 terminal adapter, extra-cli-args passthrough for window
configs, tmux pane-base-index detection fix, /tmp socket dir (cross-platform),
and handoff-protocol documentation for extra CLI args.

Conflict resolutions:
- swarmforge.bb: merge extra-args passthrough with fork's kv-map parsing
  (model=, effort=, advisor=); non-kv tokens become raw extra-args, kv tokens
  become structured fields; keep permission-mode auto (ADR-0019) and fork hooks.
- README.md: document both raw extra-cli-args and key=value overrides.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upstream's extra-cli-args passthrough already handles model/effort — no need
to re-implement kv-map parsing for those. Only advisor=X requires special
treatment since there is no --advisor CLI flag; it writes advisorModel to
.claude/settings.local.json via write-worktree-settings! in fork.bb.

parse-config now strips advisor= tokens before building extra-args (upstream
approach), and launch-command is simplified back to upstream's structure with
only --permission-mode auto preserved (ADR-0019).

ADR-0012 updated to reflect the narrowed scope: advisor-only interception.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hardcoded --permission-mode auto produced a duplicate flag when a user
passed --permission-mode in extra-args. Now injected only when absent
from extra-args — auto remains the fork default (ADR-0019) but extra-args
can override it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
merge(upstream): sync 5 upstream commits — iterm2, extra-args, tmux fixes
…54)

* fix(handoff): resolve state-dir from roles.tsv; add merge_and_process script

#46: swarm_handoff.bb state-dir now resolves outbox via SWARMFORGE_ROLE + roles.tsv
worktree-path lookup instead of JVM user.dir, matching the path handoffd.bb polls.
Silent delivery failure when invoked outside the assigned worktree is eliminated.

#44: add merge_and_process.sh — agents were improvising (observed: --theirs) because
the script referenced in git_handoff payloads did not exist. Script enforces
--no-ff merge and stops with a clear error on conflict; never --theirs/--ours.

Also adds constitution bullet: run swarm_handoff.sh from assigned worktree only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(handoff): sync worktree to trunk before task delivery

Eliminates per-role sync boilerplate: ready_for_next_task.bb and
ready_for_next_batch.bb now git-fetch + reset --hard to origin/HEAD
before printing TASK/BATCH. NO_TASK path is unaffected.

This is the enforcement layer fix for #44 — sync happens structurally
in the delivery mechanism, not as advisory text in 8 role prompts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(merge_and_process): merge canonical commit, not branch tip

Fetches the sender branch to ensure the commit is reachable, then
merges the exact canonical-commit from the git_handoff payload.
Previously merged origin/<sender-branch> tip, which could include
commits pushed after the handoff was sent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The grok case had )))) closing str-grok, case, str-base, AND cond->,
leaving (= index 0) and the transform as bare let expressions. The
transform's )))) then produced an unmatched ) at the top level,
causing a parse error on ./swarm start.

Fix: remove one ) so cond-> stays open through (= index 0) and the
transform; the transform's )))) correctly closes str-transform,
cond->, let, and defn.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
./swarm start passes "start" as the first positional arg to
swarmforge.bb. The default case treated any first arg as a working
directory, so run-main! received "start" and looked for config at
<cwd>/start/swarmforge/swarmforge.conf.

Add "start" to the case dispatch, routing it to run-main! with the
JVM user.dir (same as bare ./swarm with no args).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rules (#58)

* fix(handoffd): gate notify! on recipient idle state to prevent mid-task /clear

Adds UserPromptSubmit/Stop hooks to each role's settings.local.json that
write/remove a per-worktree agent-running marker. handoffd skips notify!
when the marker is present, so a /clear is never injected into a running
agent session. File delivery to inbox/new/ remains unconditional; the
queued handoff is picked up naturally via done_with_current -> ready_for_next.

Closes #56

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(handoffs): reframe git_handoff rules to separate forwarding from back-routing

Replaces the single ambiguous "no functional change" block with two distinct
rules: downstream forwarding requires a functional change; upstream back-routing
(ADR 0004) always uses git_handoff with the sender's branch HEAD, even with
no authored functional lines. Updates ADR 0004 to close the pending handoff
mechanism implementation item.

Closes #57

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…59)

Pin mattpocock/skills and install only grill-with-docs, domain-modeling,
and grilling — the skills needed for autonomous intentional-knowledge
capture during the specifier phase. Uses MATTPOCOCK_SKILLS_INCLUDE in
install-pins.conf to filter the archive, keeping unrelated skills out.
Sentinel tracks mattpocock SHA separately from entireio so either can
upgrade independently.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(swarm): apply drywall execution learnings to handoff protocol

- merge_and_process: drop unconditional git fetch; SHA is already in
  the shared object store for local worktrees
- handoffs: replace commit placeholder with positive generation command
- handoffs: commit findings before routing back upstream
- handoffs: send outbound handoff before agent-retro, not after

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarm): add task completion ordering rules to workflow.prompt

Universal rules: send outbound handoff before agent-retro, commit
findings before routing back. Placed in workflow.prompt so six-pack
inherits them on next merge from main.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarm): write draft before retro, queue after — not send before retro

Correct order: write handoff draft → agent-retro (reviews draft) →
swarm_handoff.sh (queues it) → done_with_current.sh.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarm): correct task completion order — queue handoff then retro

Order: swarm_handoff.sh (queue) → agent-retro → done_with_current.sh.
Previous commit had it backwards (draft → retro → queue).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarm): restore original phrasing for task completion ordering

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…#63)

grilling lives in skills/productivity/, not skills/engineering/.
Search all top-level subdirs under skills/ so MATTPOCOCK_SKILLS_INCLUDE
works regardless of which subdir the skill is in.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Passing the prompt file as both --append-system-prompt-file and a
positional arg caused Claude to respond at startup before any handoff
arrived. Drop the positional arg; the system prompt is sufficient and
the agent now stays idle until handoffd delivers the first wake-up.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ADR 0022 — moves the human gate from full spec review to a conversational
frontier brief. Everything after confirmation is autonomous.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(swarm): include remedy in commit-abbrev validation error

Agents receiving the 10-char rejection had no guidance on how to fix it,
causing repeated rediscovery of git rev-parse --short=10 each session.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(fork): propagate swarm allow-rules into worktree settings.local.json

Worktrees live outside the main project directory tree, so Claude Code
never finds the project's .claude/settings.json. write-worktree-settings!
now writes the integrator/specifier allow-rules into each worktree's
settings.local.json on every launch, including restarts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarm): apply drywall execution learnings to main branch

- workflow.prompt: act don't narrate in autonomous sessions
- fork.bb: clarify swarm-persona is pre-loaded by harness, not re-invokable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(swarm): simplify agent instruction and drop autonomous sessions rule

- fork.bb: drop swarm-persona invoke directive — harness already sends it via
  wake message; telling the agent to invoke it again causes double-load
- workflow.prompt: revert autonomous sessions section — too vague and not
  actionable as a constitution rule

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(fork): add fork.bb extension test runner

- test/fork_runner.bb: standalone bb script verifying all fork.bb overrides
  (write-agent-instruction-file!, write-worktree-settings!, write-persona-skill-file!,
   link-curator-skills!) exist and produce correct output
- bb.edn: add fork-test task (run: bb fork-test)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: add GitHub Actions workflow with pinned action hashes

Runs upstream tests (bb test) and fork extension tests (bb fork-test)
on push/PR to main and six-pack. Babashka v1.12.218 pinned by version URL;
actions/checkout pinned to commit hash.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: install zsh — scripts use #!/usr/bin/env zsh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add status table showing done/todo per stack and tool family
- Separate what-exists (crap4js v0.1.0, drywall v0.1.0, cargo-crap)
  from what-to-build (crap4py, mutate binary, mutate4py)
- Document install commands for released tools
- Move reference material (LCOV format, Uncle Bob CLIs) to appendix
- Record key decisions and rationale (why port mutation, why crap4py
  is new not a port, why one Rust binary covers JS/TS + Rust)
- Flag missing pieces: constitution wiring, repo locations, CI pattern

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ery (#69)

* fix(handoff): remove merge_and_process.sh — upstream uses LLM directive

Upstream's git_handoff body calls `merge_and_process <sender> <commit>`
as a natural language directive to the agent, not a shell command. No
script exists upstream. We added merge_and_process.sh by mistake.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(handoffd): notify only on new inbox write, not on duplicate delivery

Move notify! inside the when-not (fs/exists? target) guard so it fires
only when a new inbox file is actually written. Previously notify! could
fire on daemon restart if a handoff was already in the inbox (write
skipped, notification still sent), causing a spurious /clear +
/swarm-persona sequence to interrupt an agent mid-task.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The integrator/specifier auto-mode classifier blocks `git reset --hard`
when the target is not a remote ref (e.g. HEAD or a sha). Broaden the
pre-authorized allow-rule from `git reset --hard origin/*` to
`git reset --hard*` so unattended in-role resets never prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants