DeterminAgents

Universal audit harnesses for coding agents.

DeterminAgents is a portable library of self-discovering audit prompts. Hand one to an agent pointed at a repo; the agent discovers the project layout, runs a structured audit, and writes a report with severity, evidence, and concrete next steps.

What you get:

Portable audits with no hardcoded service names or repo paths
Repeatable harnesses: discovery phases, rubrics, report templates, and follow-up workflows
Decision-ready artifacts in docs/reports/, including optional system digests via auto-report

Use it when a codebase is large enough that structure beats ad-hoc prompting.

Design principle

Simple prompt + good harness > clever prompt + no harness.

Most of the lift in agentic work comes from the surrounding scaffolding — phases, severity rubrics, report formats, context overlays, fault-injection harnesses — not from clever wording inside the prompt. Every audit here is deliberately plain English; the structure around it does the work, and the structure is what gives the agent something to loop against until success criteria are met. The principle applies recursively to the library's own docs: trust over enumeration, defaults over variants. (See the v0.4 simplification commit for what that looked like applied to DeterminAgents itself.)

When this isn't worth it. A 200-line script or a one-off prototype doesn't need a phased audit with a severity rubric and a report file. Use direct prompts for trivial work.

Install

curl -fsSL https://raw.githubusercontent.com/iansherr/determinagents/main/install.sh | sh

Installs the library to ~/.determinagents/ (override with $DETERMINAGENTS_HOME) and a determinagents shim to ~/.local/bin/ (override with $DETERMINAGENTS_BIN).
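
If the determinagents command is not found after installing, the shim directory may simply be missing from PATH. The sketch below resolves both locations from the defaults and override variables described above, then checks PATH. The helper names (determinagents_home, determinagents_bin, on_path) are illustrative assumptions and are not part of the installer.

```sh
# Illustrative helpers; these names are not part of the DeterminAgents installer.

# Library location: $DETERMINAGENTS_HOME if set, else the documented default.
determinagents_home() {
  printf '%s\n' "${DETERMINAGENTS_HOME:-$HOME/.determinagents}"
}

# Shim location: $DETERMINAGENTS_BIN if set, else the documented default.
determinagents_bin() {
  printf '%s\n' "${DETERMINAGENTS_BIN:-$HOME/.local/bin}"
}

# Succeeds when directory $1 appears as an entry in PATH.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Warn (on stderr) when the shim directory is not on PATH.
if ! on_path "$(determinagents_bin)"; then
  echo "Add $(determinagents_bin) to PATH to use the determinagents shim" >&2
fi
```

Once the shim is reachable, determinagents doctor is the documented way to confirm that the install itself is healthy.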

determinagents version             # what's installed
determinagents doctor              # check the install is healthy
determinagents update              # check for updates, show diff, apply with confirmation
determinagents materialize         # install slash commands for your host tool
determinagents completions <shell> # print tab-completion script (bash, zsh, fish)
determinagents uninstall           # remove the library (prompts for confirmation)
determinagents help                # full command list

After determinagents update: audit content (phases, severity rubrics, the doc bodies themselves) flows through automatically — materialized slash commands are thin pointers that re-read the audit doc each time they run. Re-run determinagents materialize only when a new behavior is added (new slash command), the shared invocation header changes, or the hub command template changes. Each release that requires re-materialization will say so at the top of its CHANGELOG entry.

Materialization now defaults to one canonical command family: use /determinagents for onboarding and /determinagents <behavior> [flags] for direct runs.

Troubleshooting a wrong invocation: if you are running determinagents --help while trying to execute an audit, you are in the installer CLI rather than your host tool. Return to your host tool and run /determinagents <behavior> [flags] (for example, /determinagents error-handling).

To pin a branch (e.g., dev for unreleased work):

curl -fsSL https://raw.githubusercontent.com/iansherr/determinagents/dev/install.sh | sh -s -- --branch=dev

First run

After installing, the lowest-friction path:

Pick a repo — yours or anyone's. The read-only audits don't modify code.
Run an audit. STUB_AND_COMPLETENESS is a good starter: it surfaces phantom endpoints, dead handlers, and silent failures on most codebases without needing build infrastructure. Hand this prompt to your coding agent (Claude Code, Cursor, Gemini, etc.):
```
Run audits/STUB_AND_COMPLETENESS.md from $DETERMINAGENTS_HOME against
this repo. Report to docs/reports/STUB_AUDIT_<YYYY-MM-DD>.md.
```
Read the report in docs/reports/. Every finding has a file:line reference, a severity, and a suggested fix. The report's ## Next steps section contains paste-ready follow-up prompts.
Optional next moves: capture project-specific calibrations in docs/determinagents/AUDIT_CONTEXT.md so future audits skip known false positives (see specs/BOOTSTRAP.md); or run audits/RESOLVE_FROM_REPORT.md to work through findings with per-finding approval and one commit per fix.
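
The starter prompt leaves the report date as a placeholder. A minimal sketch of filling it in from a script before pasting the prompt into an agent is shown below; the function name starter_prompt is a hypothetical helper, not something the library ships.

```sh
# Illustrative helper; the function name is hypothetical.
# Prints the starter prompt with the report date filled in.
# $1: optional date in YYYY-MM-DD form (defaults to today).
starter_prompt() {
  report_date="${1:-$(date +%Y-%m-%d)}"
  # $DETERMINAGENTS_HOME is kept literal on purpose, as in the original prompt.
  printf '%s\n' 'Run audits/STUB_AND_COMPLETENESS.md from $DETERMINAGENTS_HOME against'
  printf 'this repo. Report to docs/reports/STUB_AUDIT_%s.md.\n' "$report_date"
}

starter_prompt
```

Passing an explicit date (for example, starter_prompt 2026-05-09) is useful when the report name needs to match an earlier run.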

Once that loop is comfortable, browse the audits table below for other audits to try, or INVOCATIONS.md for canonical paste-ready prompts. To install as slash commands in your host tool, see INSTALL.md.

Choose a behavior

Need	Run
Find stubs, phantom paths, and incomplete implementations	`/determinagents stub`
Run a security sweep	`/determinagents security`
Trace where a user action breaks	`/determinagents data-flow --target=<flow>`
Review runtime capacity and resource pressure	`/determinagents resource-capacity`
Find god-files and propose extraction seams	`/determinagents structural-entropy`
Find regression-prone complexity hotspots	`/determinagents regression-surface`
Don't know what to run — let it pick	`/determinagents next`
Set up and discover recursive improvement loops	`/determinagents init-loops`
Orchestrate recursive improvement loops automatically	`/determinagents loop-orchestrator`
Create a weekly or post-change system digest	`/determinagents auto-report --mode=baseline`
Work through report findings with approval gates	`/determinagents resolve --report=<path>`

Reports are meant to be read by humans and reused by agents. A typical finding looks like:

### P1: Save failures are logged but not surfaced to the user
- Evidence: src/api/client.ts:42
- Impact: failed writes can leave the UI showing stale success state
- Suggested fix: return a typed error from saveProfile() and render retryable error state in SavePanel
- Next step: /determinagents resolve --report=docs/reports/ERROR_HANDLING_2026-05-11.md
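
Because each finding heading starts with its severity, a report can be triaged with ordinary text tools. The sketch below counts findings at one severity level. It assumes every finding heading follows the "### P<n>: title" shape of the sample above, and the function name is illustrative.

```sh
# Count findings of one severity in a report.
# Assumes headings look like "### P1: <title>", as in the sample finding.
# $1: severity label (P0, P1, P2 or P3); $2: path to the report file.
# Example: count_findings P1 docs/reports/ERROR_HANDLING_2026-05-11.md
count_findings() {
  grep -c "^### $1:" "$2"
}
```

Note that grep exits with a non-zero status when the count is zero, so check the printed number rather than the exit status.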

Layout

determinagents/
├── README.md            # this file
├── INVOCATIONS.md       # paste-ready prompts for every behavior
├── INSTALL.md           # how an agent installs this library into a host tool
├── audits/              # the runnable docs
│   ├── STUB_AND_COMPLETENESS.md
│   ├── SECURITY_PENTEST.md
│   ├── DATA_FLOW_TRACE.md
│   ├── ERROR_HANDLING.md
│   ├── TEST_GAPS.md
│   ├── DOCS_DRIFT.md
│   ├── UX_DESIGN_AUDIT.md
│   ├── RESOURCE_CAPACITY.md
│   ├── SCENARIO_CHAINER.md     # meta: chains findings into simulations
│   ├── STRUCTURAL_ENTROPY.md
│   ├── PICK_NEXT.md            # meta: recommends what to run next
│   ├── RESOLVE_FROM_REPORT.md  # mutating: works through report findings
│   ├── STRUCTURAL_REFACTOR.md  # mutating: executes structural-entropy seams
│   ├── SECURITY_HUNT.md        # mutating: agentic vulnerability hunting
│   ├── DATA_FLOW_VERIFY.md     # mutating: observed-vs-theorized data flow
│   ├── TESTING_CREATOR.md      # mutating: writes new tests
│   └── HARNESS_CREATOR.md      # mutating: generates verification harnesses
└── specs/               # conventions and per-project artifact specs
    ├── FORMAT.md                  # how to author a new audit; harness conventions
    ├── HARNESS_STUBS.md           # boilerplate for common harnesses
    ├── BOOTSTRAP.md               # how to generate AUDIT_CONTEXT.md (cold + warm)
    ├── FEATURE_REGISTRY.md        # spec for the per-project feature registry
    ├── AUTOMATED_REPORTING.md     # project-facing system digest orchestrator
    ├── SIGNAL_SCHEMA.md           # JSON schema for auto-report signal output
    ├── AUDIT_CONTEXT_TEMPLATE.md  # minimal starting overlay (Global only)
    ├── AUDIT_CONTEXT_SECTIONS.md  # catalog of audit-specific sections (copy as needed)
    └── MAINTENANCE.md             # maintainer-only: keep the library current (refresh / integrate / brainstorm)

Available audits (read-only)

Audit	Finds
audits/STUB_AND_COMPLETENESS.md	Phantom endpoints, dead handlers, silent error swallowing, compiled-without-source files
audits/SECURITY_PENTEST.md	Auth bypass, injection, IDOR, hardcoded secrets, JWT issues, exposed internals
audits/DATA_FLOW_TRACE.md	Where a user action breaks between UI, network, handler, and DB
audits/ERROR_HANDLING.md	Silent catches, missing error UI, errors logged but not surfaced
audits/TEST_GAPS.md	Scenarios the test suite would miss — error paths, edge cases, integration boundaries
audits/DOCS_DRIFT.md	Claims in README and docs that the code no longer matches
audits/UX_DESIGN_AUDIT.md	CSS that violates DESIGN.md tokens — colors, spacing, radii, motion, typography
audits/DESIGN_HANDOFF_AUDIT.md	Mismatches between design handoff bundles and the target code, checked against the code itself rather than potentially misleading READMEs
audits/RESOURCE_CAPACITY.md	Runtime-agnostic capacity and resource-pressure risks across k8s, docker/compose, bare metal, or unraid-style deployments
audits/STRUCTURAL_ENTROPY.md	God-files and god-modules. Severity is driven by responsibility count, fan-in/out, and change velocity — not LOC alone. Outputs seam proposals consumed by `STRUCTURAL_REFACTOR.md`
audits/REGRESSION_SURFACE.md	Regression-prone complexity hotspots: overlapping responsibilities, fragile error handlers, fallback ladders
audits/PICK_NEXT.md	Meta-audit. Recommends which audit to run next based on report staleness, recent git history, and `AUDIT_CONTEXT.md` cadence preferences. Writes no report by default

Most audits run in 30–180 minutes at default scope, scaling with codebase size. Each audit doc supports --phases=N,M and --max-time=Xm to scope tighter.

SECURITY_PENTEST.md is the static half of security. For serious vulnerability discovery in codebases with build/test infrastructure, also use the agentic SECURITY_HUNT.md below.

Available creators (mutating — writes code)

Doc	What it does	Prerequisites
audits/RESOLVE_FROM_REPORT.md	Works through findings in any audit report — one at a time, with per-finding approval, separate commits, and verification	An audit report exists at `docs/reports/`; clean working tree
audits/STRUCTURAL_REFACTOR.md	Executes seam proposals from a `STRUCTURAL_ENTROPY` report. Per-seam loop with contract-before-code gate; one contract commit + one move commit per seam; before/after dependency artifacts	`STRUCTURAL_ENTROPY` report exists; disposable workspace; tests cover the target file
audits/SECURITY_HUNT.md	Agentic vulnerability hunting — agent gets execution capability to verify or refute bug hypotheses against one target file/function. Inspired by Mozilla's Firefox-hardening pipeline	Project builds locally; sanitizers configured; disposable workspace; AUDIT_CONTEXT.md `SECURITY_HUNT` section configured
audits/DATA_FLOW_VERIFY.md	Drives a real user flow end-to-end and observes wire traffic + DB state. The "observed" counterpart to `DATA_FLOW_TRACE.md`'s "inferred" — catches silent layer drift static analysis misses	Disposable workspace; app runs locally; AUDIT_CONTEXT.md `DATA_FLOW_VERIFY` section configured
audits/TESTING_CREATOR.md	Implements tests across four tiers — Adversarial, Chaos, Simulation, Forensics — beyond what `TEST_GAPS.md` covers	Run `TEST_GAPS.md` and `SECURITY_PENTEST.md` first
audits/HARNESS_CREATOR.md	Deterministically generates verification harnesses (Playwright, Docker, Fuzzing) to prove/refute audit findings	An audit report exists; disposable workspace
audits/RECURSIVE_IMPROVEMENT.md	Autonomously designs, executes, and evaluates experiments to improve a specific metric or solve an open-ended problem. Generates hypotheses, mutates code, and verifies against a harness.	Measurable goal; deterministic harness exists; disposable workspace
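
Several creators list an existing report or a clean working tree as prerequisites. One way to check both before handing a mutating doc to an agent is sketched below. The function names are illustrative, and the sketch assumes the target project is a git repository.

```sh
# Succeeds when the git working tree in directory $1 has no uncommitted
# or untracked changes. Fails if $1 is not inside a git repository.
tree_is_clean() {
  changes="$(git -C "$1" status --porcelain 2>/dev/null)" || return 1
  [ -z "$changes" ]
}

# Succeeds when directory $1 contains at least one Markdown report
# under docs/reports/.
has_report() {
  for f in "$1"/docs/reports/*.md; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# Both prerequisites together.
# Example (from the project root): ready_to_resolve .
ready_to_resolve() {
  has_report "$1" && tree_is_clean "$1"
}
```

An uncommitted report counts as a change, so commit the report before running the check.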

Two read-only audits — ERROR_HANDLING.md and STUB_AND_COMPLETENESS.md — include an optional mutating Phase 6 (fault injection and endpoint verification, respectively) that follows the harness conventions in specs/FORMAT.md. Use scope +harness to enable it.

The standard workflow: run an audit (read-only) → review the report → run RESOLVE_FROM_REPORT to work through findings → re-run the audit to verify clean state. For security-sensitive fixes, optionally chain into TESTING_CREATOR Tier 1 (Adversarial) afterwards to add executable coverage.

Per-project specs

Each spec below describes an artifact that every project generates its own instance of.

Spec	Project artifact	Purpose
specs/FEATURE_REGISTRY.md	`docs/determinagents/FEATURE_REGISTRY.md`	Living catalog of every testable feature with URL, auth, steps, pass criteria, tags
specs/AUDIT_CONTEXT_TEMPLATE.md	`docs/determinagents/AUDIT_CONTEXT.md`	Minimal starting overlay (Global only). Audit-specific sections come from AUDIT_CONTEXT_SECTIONS.md — copied in only when filled.
specs/AUTOMATED_REPORTING.md	`docs/reports/SYSTEM_DIGEST_<YYYY-MM-DD>.md` plus optional `docs/reports/signals/SYSTEM_DIGEST_<YYYY-MM-DD>.json`	Read-only synthesis harness that turns existing audit reports and explicit runtime snapshots into decision-ready system digests. JSON follows SIGNAL_SCHEMA.md.

Supporting docs: specs/FORMAT.md (audit authoring spec), specs/BOOTSTRAP.md (overlay generator workflow).

Conventions

Every audit:

Is read-only by default. Mutating docs, such as RESOLVE_FROM_REPORT.md, TESTING_CREATOR.md, and HARNESS_CREATOR.md, declare this prominently in their purpose sections.
Has phases so you can scope: run Phase 1 only for a quick pass, all phases for a deep pass.
Classifies findings by severity (P0/P1/P2/P3) with concrete criteria.
Emits a report with file:line references and concrete fixes — never "fix this."
Writes reports to docs/reports/ (in the target project) with a date-stamped name (e.g., STUB_AUDIT_2026-05-09.md).
Reads docs/determinagents/AUDIT_CONTEXT.md first if it exists, to apply project-specific calibrations.
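
Date stamps in YYYY-MM-DD form sort chronologically as plain strings, which makes the most recent report easy to find from a script. A small sketch follows. The function name is illustrative, and it assumes report names follow the <PREFIX>_<YYYY-MM-DD>.md pattern of the example above.

```sh
# Print the newest report for a given prefix, e.g. STUB_AUDIT.
# $1: report prefix; $2: project root (defaults to the current directory).
# Returns non-zero when no matching report exists.
latest_report() {
  latest=""
  # Globs expand in sorted order, so the last match carries the latest date.
  for f in "${2:-.}"/docs/reports/"$1"_*.md; do
    [ -e "$f" ] && latest="$f"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

For example, latest_report STUB_AUDIT run from the project root prints the path of the most recent stub audit, which can then be passed to a follow-up prompt.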

Companion: DESIGN.md

audits/UX_DESIGN_AUDIT.md assumes a DESIGN.md exists at the project root as the source of truth for design tokens. If your project doesn't have one, generate it first using the Google design.md spec:

Spec & format: https://github.com/google-labs-code/design.md
Overview: https://stitch.withgoogle.com/docs/design-md/overview/
Format: https://stitch.withgoogle.com/docs/design-md/format/

The bootstrap prompt for DESIGN.md is in INVOCATIONS.md. The other six audits do not require DESIGN.md.

Acknowledgements

Thank you to Mozilla Security for publicly sharing Behind the Scenes: Hardening Firefox (May 2026). Their description of the agentic-harness pipeline, the inner-loop framing ("there is a bug in this part of the code, please find it and build a testcase"), and the severity-by-defect-class rubric directly shaped audits/SECURITY_HUNT.md and the broader v0.3 / v0.4 design. Open writeups like this, from teams doing real production work, are how the rest of us learn.

Thank you also to the frontier model engineers who keep saying — out loud, against the cultural reflex of secrecy and the collective instinct to grind for the perfect prompt — that working with an agent to improve a prompt produces better prompts than working alone. This library is an outgrowth of that practice: a personal collection of prompts that worked, refined over time, until the scaffolds of a standard set became visible. The spec emerged from the pattern, not the other way around. The hope now is that publishing it helps others skip a few of the same steps.

And to Andrej Karpathy, whose observation that "LLMs are exceptionally good at looping until they meet specific goals — don't tell it what to do, give it success criteria and watch it go" is the cleanest one-line statement of why the harness, not the prompt, does most of the work. (See also Forrest Chang's andrej-karpathy-skills for a compact CLAUDE.md distillation of the same observations.)

Orchestrated by Ian Sherr at Time Worthy Media.
