Quality loops for shipping with AI coding agents. Build with one model, review adversarially with a model from a different family, verify against a real runtime, and let the human gate only the irreversible steps.
Ship quality products · by running quality loops · by delegating as much as possible to agentic loops.
📖 Read the guide: QUALITY_LOOPS.md · 🌐 Web version: index.html (GitHub Pages once enabled)
A quality loop is a cycle where one model builds, a different-family model reviews it adversarially, a runtime gate verifies it against a real environment, and the human only steps in at the irreversible gates (production, money, destructive data, migrations).
Three things lift it above an ordinary AI code review:
- A reviewer of a different lineage — it doesn't share the builder's blind spots.
- Real runtime verification — diff-correct ≠ works. A change can pass every review round and still fail against the real API/environment.
- Structured findings: data, not prose (`--output-schema` / `--json-schema`), so re-review with memory and PR synthesis are trivial.
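As an illustration of why structured findings make re-review with memory cheap, here is a minimal Python sketch. The field names (`file`, `rule`, `severity`, `summary`) and the helpers `fingerprint` and `diff_rounds` are hypothetical, not a schema taken from the skill; the only point is that findings expressed as data can be compared between review rounds.

```python
import hashlib
import json

# Hypothetical finding shape. The real schema passed to the reviewer may differ.
def fingerprint(finding: dict) -> str:
    """Stable identity for a finding that ignores wording and severity changes."""
    key = json.dumps({"file": finding["file"], "rule": finding["rule"]}, sort_keys=True)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def diff_rounds(previous: list[dict], current: list[dict]) -> dict[str, list[dict]]:
    """Classify findings as resolved, persisting, or new between two review rounds."""
    prev = {fingerprint(f): f for f in previous}
    curr = {fingerprint(f): f for f in current}
    return {
        "resolved": [prev[k] for k in prev if k not in curr],
        "persisting": [curr[k] for k in curr if k in prev],
        "new": [curr[k] for k in curr if k not in prev],
    }
```

Keying the fingerprint on file and rule rather than on the summary text is a design assumption: a reviewer that rewords the same complaint in round two should still be recognised as raising the same finding.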
The core mindset shift: stop prompting every step by hand and design the loop that prompts your agents.
`cp -r skills/agentic-loop ~/.claude/skills/`

Then run `/agentic-loop` on a production-critical change in any project. The skill discovers the repo's verify command, diff base, and runtime target, then runs:
build → verify → adversarial review (Codex) → re-review with memory → runtime smoke → GO/NO-GO → ship
The agent decides GO; you hold the irreversible gate.
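The control flow of that pipeline can be sketched in a few lines of Python. This is not the skill's implementation: `run_loop`, `LoopResult`, and the four stage callables are hypothetical stand-ins for real agent and tool invocations, and the retry limit is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    decision: str  # "GO" or "NO-GO"
    log: list[str] = field(default_factory=list)


def run_loop(
    build: Callable[[list[str]], None],       # builder agent, given earlier findings
    verify: Callable[[], bool],                # the repo's verify command
    review: Callable[[list[str]], list[str]],  # adversarial reviewer, returns findings
    smoke: Callable[[], bool],                 # runtime smoke test against a real target
    max_rounds: int = 3,                       # assumed cap on fix/re-review rounds
) -> LoopResult:
    log: list[str] = []
    memory: list[str] = []  # findings carried across rounds ("re-review with memory")
    for round_no in range(1, max_rounds + 1):
        build(memory)
        if not verify():
            log.append(f"round {round_no}: verify failed")
            memory = memory + ["verify failed"]
            continue
        findings = review(memory)
        log.append(f"round {round_no}: review returned {len(findings)} finding(s)")
        if findings:
            memory = memory + findings
            continue
        if smoke():
            log.append(f"round {round_no}: runtime smoke passed")
            return LoopResult("GO", log)
        log.append(f"round {round_no}: runtime smoke failed")
        memory = memory + ["runtime smoke failed"]
    return LoopResult("NO-GO", log)
```

The sketch stops at the GO/NO-GO decision on purpose: shipping is the irreversible step, so it stays outside the automated loop.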
- Delegate the "after". The value is automating what you do after prompting, not the prompt.
- Don't look at the code too early. Let another agent review it before you do.
- Two model families > one. The reviewer must be a different lineage than the builder.
- Diff-correct ≠ works. Always a real runtime gate.
- Findings as data, not prose. `--output-schema` / `--json-schema`.
- Dynamic shape, not a persona zoo. Let the problem dictate structure.
- Isolate (git worktrees) so loops don't collide.
- Confront, don't obey. Verify every finding against real code.
- Autonomy = reversibility, not confidence. Human gate only on the irreversible.
- Treat the limit as a challenge. Subscription pricing → loop hard; API pricing → measure first.
- Skill = method; automation = schedule. In that order.
- Aim at something that seems impossible. The wall is farther than you think.
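One of the principles above, "Confront, don't obey", lends itself to a small mechanical check. The Python sketch below assumes a hypothetical finding shape with `file`, `line` (1-based), and `snippet` fields, and a hypothetical helper `confirm_finding`; it only rejects findings that cite code that is not actually there, and does not judge whether the finding is correct.

```python
from pathlib import Path


def confirm_finding(repo_root: str, finding: dict) -> bool:
    """Return True only if the quoted snippet really appears at the cited line."""
    path = Path(repo_root) / finding["file"]
    if not path.is_file():
        return False  # reviewer cited a file that does not exist
    lines = path.read_text(encoding="utf-8").splitlines()
    index = finding["line"] - 1  # findings are assumed to use 1-based line numbers
    if not 0 <= index < len(lines):
        return False  # cited line is outside the file
    snippet = finding["snippet"].strip()
    # An empty snippet proves nothing, so treat it as unconfirmed.
    return bool(snippet) and snippet in lines[index]
```

Findings that fail this check can be dropped or sent back to the reviewer before anyone spends a fix round on them.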
| Path | What |
|---|---|
| `skills/agentic-loop/SKILL.md` | The Claude Code skill (generic, self-contained). |
| `QUALITY_LOOPS.md` | The essay / justification of the method. |
| `index.html` | Human-readable field guide (web). |
The bottleneck in agent-assisted development isn't the model — it's the human orchestrating in the seams. A quality loop moves the human out of every reversible seam (verify, review, fix, re-review, integrate) and keeps them only where a mistake is irreversible. Autonomy scales with reversibility, not with model confidence.
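A minimal Python sketch of that gating rule follows. The set contents come from the reversible seams listed above; the name `needs_human_gate` and the choice to gate any step not on the allowlist are assumptions made for illustration.

```python
# Steps the text treats as reversible seams. Anything else is assumed to need a human.
REVERSIBLE_STEPS = {"verify", "review", "fix", "re-review", "integrate"}


def needs_human_gate(step: str, model_confidence: float) -> bool:
    """Decide whether a human must approve this step.

    model_confidence is accepted but deliberately ignored: the gate depends
    on whether the step can be undone, not on how sure the model is.
    """
    return step not in REVERSIBLE_STEPS
```

For example, `needs_human_gate("migration", 0.99)` returns `True` and `needs_human_gate("fix", 0.10)` returns `False`: high confidence does not unlock an irreversible step, and low confidence does not block a reversible one.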
- Reference talks on agentic loops: video 1 · video 2
- OpenAI Codex CLI — non-interactive · best practices · subagents
- Claude Code — headless / Agent SDK
Keywords: agentic loop, agentic loops, AI coding agents, adversarial code review, Claude Code skill, OpenAI Codex, autonomous coding, agentic workflows, runtime verification, ship quality software.