Commit 93857be

Agent formulates assertions itself, presents to user for confirmation
1 parent d475fff commit 93857be

1 file changed

Lines changed: 4 additions & 4 deletions

AAE.md

@@ -56,7 +56,7 @@ To set up a new tuning session, work with the user to:
 - The **benchmark harness** (read-only) that defines the metric and evaluation procedure.
 - Any **configuration or constants** that are fixed.
 4. **Identify constraints**: Understand what the agent can and cannot change (see below).
-5. **Collect assertions**: Ask the user for any correctness invariants that must hold after every change (see below).
+5. **Formulate assertions**: After reading the code, derive correctness invariants and present them to the user for confirmation (see below).
 6. **Verify benchmark works**: Run the benchmark once to confirm it produces output.
 7. **Initialize results.tsv**: Create `results.tsv` with just the header row.
 8. **Confirm and go**: Confirm setup looks good with the user, then begin.
@@ -77,14 +77,14 @@ The user defines these per project. The agent must respect them strictly.

 ## Assertions

-The user may define correctness assertions: invariants that must hold after every change the agent makes. These act as a safety net, ensuring that optimizations do not silently break the program's correctness.
+After reading the target code and benchmark during setup, the agent must formulate correctness assertions: invariants that must hold after every change. These act as a safety net, ensuring that optimizations do not silently break the program's correctness.

-Examples of assertions:
+The agent derives these from its understanding of the code. Examples:
 - "The output must be a valid partition (every node assigned to exactly one block, no block empty)."
 - "The sorted output must be a permutation of the input."
 - "The loss must be finite and non-negative after every training step."

-During setup, ask the user for any assertions they want enforced. If provided, verify them after every implementation, before running the full benchmark. If an assertion fails, the change is incorrect; discard it immediately (no need to benchmark) and log it as a crash.
+Present the assertions to the user for confirmation during setup. The user may add, modify, or remove assertions. Once agreed upon, verify them after every implementation, before running the full benchmark. If an assertion fails, the change is incorrect; discard it immediately (no need to benchmark) and log it as a crash.

 Assertions are distinct from the benchmark metric. The metric measures performance; assertions guard correctness. A change that improves the metric but violates an assertion is always discarded.
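
To make the three example assertions in the Assertions section concrete, here is a minimal Python sketch of how they might be written as executable checks and run before the full benchmark. Everything named here is an assumption for illustration, not something AAE.md defines: the function names (`is_valid_partition`, `is_permutation`, `is_finite_nonnegative`, `check_assertions`), the representation of a partition as a node-to-block mapping with blocks numbered `0..num_blocks-1`, and the `"crash"` / `"ok"` status strings.

```python
import math
from collections import Counter
from typing import Callable, Hashable, Iterable, Mapping, Sequence


def is_valid_partition(
    assignment: Mapping[Hashable, int],
    nodes: Iterable[Hashable],
    num_blocks: int,
) -> bool:
    """Every node is assigned to exactly one block and no block is empty.

    Assumes (hypothetically) that blocks are numbered 0..num_blocks-1.
    A mapping gives each key a single value, so "exactly one block"
    reduces to "every node appears as a key".
    """
    if set(assignment) != set(nodes):
        return False
    return set(assignment.values()) == set(range(num_blocks))


def is_permutation(output: Sequence, original: Sequence) -> bool:
    """Output holds the same elements as the input, with the same multiplicities.

    Uses Counter, so elements must be hashable.
    """
    return Counter(output) == Counter(original)


def is_finite_nonnegative(loss: float) -> bool:
    """Loss is neither NaN nor infinite, and is not negative."""
    return math.isfinite(loss) and loss >= 0.0


def check_assertions(checks: Mapping[str, Callable[[], bool]]) -> list[str]:
    """Run every check and return the names of those that failed.

    A check that raises an exception is treated as a failure.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


# Hypothetical usage: run the checks after a change, before benchmarking.
failed = check_assertions({
    "partition": lambda: is_valid_partition({"a": 0, "b": 1}, ["a", "b"], 2),
    "permutation": lambda: is_permutation([1, 2, 3], [3, 1, 2]),
    "loss": lambda: is_finite_nonnegative(float("nan")),
})
# Any failure means the change is discarded and logged as a crash.
status = "crash" if failed else "ok"
```

In this sketch the NaN loss fails its check, so the change would be discarded without running the benchmark. Keeping each assertion as a named callable is one possible design; it makes it straightforward for the user to add, modify, or remove individual assertions during setup.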
