89 changes: 89 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,95 @@ All notable changes to LOOM will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.1.0] - 2026-05-20

**ægraph substrate goes production + first mechanized roundtrip
proof.** A minor-version bump: the v1.0.4 ægraph substrate is now a
default-on pipeline pass with cost-driven extraction and a widened
rule set, and the parser/encoder roundtrip proof (#48) gains a real
Rocq scaffold. Byte-neutral on the current corpus — this is an
infrastructure and correctness release, not a size-win release.

### Optimization

- **Track B (#134, re-applied in this release commit): cost-driven
ægraph extraction.** `egraph::extract()` now finds the union-find
root of the requested class, scans every class id whose `find()`
resolves to that root, and emits the representative with the
lowest *total* encoded-byte cost. New `Op::encoded_byte_cost()`
returns 1 for opcodes and `1 + LEB128(immediate)` for
`const` / `local.get`, mirroring wasm-encoder exactly. Subtree
cost is a HashMap-memoized DP keyed on UF root (the acyclic
invariant — child id < parent id — is the termination guarantee).
This closes the v1.0.5 Track 1 substrate gap: the manual UF-root
scan in `egraph_optimize_body` is deleted, and the call site is
now just `egraph.extract(root_class)`.

Process note: PR #134 merged but its `egraph.rs` / `lib.rs` diff
was silently clobbered when PR #137's rebase resolved conflicts by
whole-file copy from a pre-#134 branch. The content is re-applied
in this release commit; 25 egraph tests green.

- **Track C (#137): ægraph rule-set widening.** 11 new `Op`
variants for i64 (`Add`/`Sub`/`Mul`/`And`/`Or`/`Xor`/`Shl`/
`ShrS`/`ShrU`/`Eq`/`Eqz`) and 8 new identity rules — i64
`+0` / `|0` / `&-1` / `*1` plus three shift-by-zero folds. New
`Op::is_commutative()` + `EGraph::canonicalize_commutative()`
normalize operand order for the commutative i32/i64 ops so each
identity rule only needs the `(wild, Const)` form. One test
(`test_commutativity_zero_plus_x_folds`) is `#[ignore]`'d pending
insertion-time normalization — a v1.1.1 follow-up.

- **Track F: ægraph pass is default-on.** In practice the pass already
ran by default, since `should_run` is permissive when `--passes` is
not given; the stale "opt-in via --passes egraph" comment is
corrected. Default-on is revert-safe by construction:
`egraph_optimize_body` splices extraction back only when it is
strictly shorter than the original tree, so a function is either
improved or left byte-identical — never regressed.
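
A minimal sketch of the cost model and extraction described for Track B, together with the strictly-shorter splice guard from Track F. It is an illustration under stated assumptions, not LOOM's `egraph.rs`: the `Graph` type, `uleb128_len`, `sleb128_len` and `splice_if_shorter` are hypothetical names, `Op` is cut down to three variants, "shorter" is measured in encoded bytes, and the cost table is filled in one bottom-up pass over node ids (relying on child id < parent id) instead of the HashMap-memoized recursion the entry describes.

```rust
use std::collections::HashMap;

/// Byte length of the unsigned LEB128 encoding of `v` (7 payload bits per byte).
fn uleb128_len(v: u64) -> usize {
    let bits = (64 - v.leading_zeros()).max(1) as usize;
    (bits + 6) / 7
}

/// Byte length of the signed LEB128 encoding of `v` (magnitude bits plus a sign bit).
fn sleb128_len(v: i64) -> usize {
    let magnitude = if v < 0 { !v } else { v };
    let bits = (64 - magnitude.leading_zeros()) as usize + 1;
    (bits + 6) / 7
}

#[derive(Clone, Debug, PartialEq)]
enum Op {
    I32Const(i32),
    LocalGet(u32),
    I32Add,
}

impl Op {
    /// 1 byte per opcode, plus the LEB128 immediate where there is one.
    fn encoded_byte_cost(&self) -> usize {
        match self {
            Op::I32Const(v) => 1 + sleb128_len(i64::from(*v)),
            Op::LocalGet(i) => 1 + uleb128_len(u64::from(*i)),
            Op::I32Add => 1,
        }
    }
}

/// Toy e-graph (hypothetical): node id == index, children always have smaller ids.
#[derive(Default)]
struct Graph {
    nodes: Vec<(Op, Vec<usize>)>,
    parent: Vec<usize>, // union-find parent pointers
}

impl Graph {
    fn add(&mut self, op: Op, children: Vec<usize>) -> usize {
        let id = self.nodes.len();
        assert!(children.iter().all(|&c| c < id), "child id must be < parent id");
        self.nodes.push((op, children));
        self.parent.push(id);
        id
    }

    fn find(&self, mut id: usize) -> usize {
        while self.parent[id] != id {
            id = self.parent[id];
        }
        id
    }

    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        self.parent[ra.max(rb)] = ra.min(rb);
    }

    /// Emit the cheapest representative of `class` in stack (postfix) order.
    fn extract(&self, class: usize) -> Vec<Op> {
        // root -> (total cost, node id); children are visited before parents.
        let mut best: HashMap<usize, (usize, usize)> = HashMap::new();
        for (id, (op, children)) in self.nodes.iter().enumerate() {
            let child_cost: usize = children.iter().map(|&c| best[&self.find(c)].0).sum();
            let cost = op.encoded_byte_cost() + child_cost;
            let entry = best.entry(self.find(id)).or_insert((cost, id));
            if cost < entry.0 {
                *entry = (cost, id);
            }
        }
        let mut out = Vec::new();
        self.emit(class, &best, &mut out);
        out
    }

    fn emit(&self, class: usize, best: &HashMap<usize, (usize, usize)>, out: &mut Vec<Op>) {
        let (op, children) = &self.nodes[best[&self.find(class)].1];
        for &c in children {
            self.emit(c, best, out);
        }
        out.push(op.clone());
    }
}

/// Revert-safe splice: keep `extracted` only when it is strictly smaller.
fn splice_if_shorter(original: Vec<Op>, extracted: Vec<Op>) -> Vec<Op> {
    let size = |ops: &Vec<Op>| ops.iter().map(Op::encoded_byte_cost).sum::<usize>();
    if size(&extracted) < size(&original) { extracted } else { original }
}
```

With `local.get 0`, `i32.const 0` and `i32.add` added in that order, `extract` on the add's class returns all three ops. After `union(add, local_get)` it returns only `local.get 0`, because the 2-byte leaf is cheaper than the 5-byte sum. On a tie, `splice_if_shorter` keeps the original, which is what makes the guard revert-safe.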

### Proofs

- **Track A (#135): Path A for #48 — parser/encoder roundtrip
identity.** Total `Admitted.` count in `proofs/` drops 4 → 2.
`TermBijection.v` is rewritten from a 42-line placeholder into a
272-line self-contained file; both `term_conversion_bijection`
and `term_conversion_bijection_rev` close with `Qed`.
`StackSignature.v` adds `combined_kind` + `combined_kind_assoc` +
`compose_kind` + `compose_assoc_kind`, all `Qed` — the kind
component of composition associativity is closed. `Roundtrip.v`
lands the `ScopedModule` + LEB128 + section-codec scaffold. The
two remaining `Admitted.` are the `leb128_roundtrip` general-nat
induction step and the `StackSignature` dataflow component, both
documented with proof sketches.
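
The property behind the remaining `leb128_roundtrip` obligation can be exercised as an executable check. The sketch below is written in Rust and is not a translation of `Roundtrip.v`; `encode_uleb128` and `decode_uleb128` are hypothetical names, and spot-checking a handful of values is of course not a substitute for the induction the proof still needs.

```rust
/// Encode `value` as unsigned LEB128: 7 payload bits per byte,
/// high bit set on every byte except the last.
fn encode_uleb128(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Decode one unsigned LEB128 value from the front of `bytes`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before a terminating byte or runs past 10 bytes.
fn decode_uleb128(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None;
        }
        value |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

The roundtrip statement is then `decode_uleb128(&encode_uleb128(n)) == Some((n, len))` for every `n`, where `len` is the length of the encoding.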

### Measurement

- New `docs/measurements/v1.1.0-corpus-baseline.md`. LOOM produces
no regression on any corpus fixture (every LOOM Δ% ≤ 0). Per-file
deltas are unchanged from v1.0.5 — the ægraph pass is byte-neutral
on the current corpus because these fixtures lack the foldable
identity patterns the rule set targets; the substrate is wired and
will produce wins once such patterns appear.
- `measure_corpus.sh` `pct_delta` no longer coerces sentinel
strings (`error` / `invalid` / `timeout`) to `0`, which had
fabricated a `-100%` "win" on a failed or timed-out run. Such rows
now correctly read `n/a`.
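
The failure mode fixed in `pct_delta` can be illustrated as follows. The real harness is a shell script; this Rust sketch only mirrors the behaviour described above, and `parse_size`, `pct_delta_coercing` and the exact output format are assumptions made for the illustration.

```rust
/// Parse a size cell from the harness. Sentinel strings such as
/// "error", "invalid" or "timeout" are not numbers and yield `None`.
fn parse_size(cell: &str) -> Option<u64> {
    cell.trim().parse().ok()
}

/// Behaviour before the fix (for contrast): a sentinel silently becomes 0 bytes,
/// so a failed run looks like a 100% size reduction.
fn pct_delta_coercing(base: &str, out: &str) -> String {
    let b = parse_size(base).unwrap_or(0) as f64;
    let o = parse_size(out).unwrap_or(0) as f64;
    format!("{:+.1}", (o - b) / b * 100.0)
}

/// Behaviour after the fix: any non-numeric cell (or a zero baseline) reads "n/a".
fn pct_delta(base: &str, out: &str) -> String {
    match (parse_size(base), parse_size(out)) {
        (Some(b), Some(o)) if b > 0 => {
            format!("{:+.1}", (o as f64 - b as f64) / b as f64 * 100.0)
        }
        _ => "n/a".to_string(),
    }
}
```

Keeping the sentinel out of the arithmetic entirely, rather than special-casing the result afterwards, means a new sentinel string cannot reintroduce the fabricated win.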

### Deferred to v1.1.1

- **Track D — Track-3 housekeeping** (`Instruction` `Eq`/`Hash`,
`pub(crate)` `AdapterInfo`, surfaced `FusedOptimizationStats`,
no-silent-swallow in `optimize_fused_module`). Touches every
fused-optimizer call site; held back to keep the v1.1.0 review
surface bounded.
- **Track E — real meld-fused multi-component fixture.** Blocked on
a `meld`-binary permission wall and the absence of a component
pair with a shared cross-memory shape. Shipped as a documented
placeholder (`tests/corpus/MELD_FUSED_README.md`); the harness
carries a `meld_fused` workload slot that stays `n/a` until the
fixture lands.

## [1.0.5] - 2026-05-19

**Four-track v1.0.4 follow-through.** Each v1.0.4 infrastructure
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -9,7 +9,7 @@ members = [
]

[workspace.package]
- version = "1.0.5"
+ version = "1.1.0"
authors = ["PulseEngine <https://github.com/pulseengine>"]
edition = "2024"
license = "Apache-2.0"
104 changes: 104 additions & 0 deletions docs/measurements/v1.1.0-corpus-baseline.md
@@ -0,0 +1,104 @@
# v1.1.0 Corpus Baseline -- LOOM vs wasm-opt -O3

_Generated by `scripts/measure_corpus.sh` at `2026-05-20T05:26:16Z`._

- LOOM commit: `6ae62ed26f3a4e82d25d14e27adbbb615a45298b`
- LOOM branch: `main`
- LOOM version: `loom 1.0.5`
- wasm-opt: `wasm-opt version 116 (version_116)` (used)
- wasm-tools: `wasm-tools 1.243.0`

## Headline

On this corpus (only workloads where both LOOM and wasm-opt produced valid output): LOOM produced a **smaller** output than wasm-opt on: gale. wasm-opt beats LOOM on: httparse, state_machine, json_lite.

Missing fixtures (skipped, marked `n/a`):
- `nom_numbers`
- `loom`
- `calculator`
- `meld_fused`

## Red rows

- :red_circle: httparse: wasm-opt beats LOOM by 6.23% of baseline -> gap analysis recommended
- :red_circle: state_machine: wasm-opt beats LOOM by 9.00% of baseline -> gap analysis recommended
- :red_circle: json_lite: wasm-opt beats LOOM by 10.51% of baseline -> gap analysis recommended

## Results — file size (total bytes incl. all sections)

_File bytes include type / import / export / global and custom sections_
_(name, debug, attestation, dylink). These can change without code changes;_
_see the **code-section table** below for optimizer-relevant deltas._

| Workload | Baseline | LOOM | wasm-opt -O3 | wasm-opt → LOOM | LOOM Δ% | wasm-opt Δ% | Note |
|---|---:|---:|---:|---:|---:|---:|---|
| gale | 1941 | 1846 | 1925 | 1846 | -4.9 | -0.8 | kernel-FFI fixture |
| :red_circle: httparse | 4766 | 4668 | 4371 | 4292 | -2.1 | -8.3 | HTTP parser |
| nom_numbers | n/a | n/a | n/a | n/a | n/a | n/a | parser-combinator primitives |
| :red_circle: state_machine | 1655 | 1558 | 1409 | 1321 | -5.9 | -14.9 | FSM kernel |
| :red_circle: json_lite | 3510 | 3377 | 3008 | 2929 | -3.8 | -14.3 | minimal JSON tokenizer |
| loom | n/a | n/a | n/a | n/a | n/a | n/a | LOOM self-build (dogfood target) |
| calculator | n/a | n/a | n/a | n/a | n/a | n/a | component-shaped fixture |
| calculator_root | 2337724 | error | error | n/a | n/a | n/a | 2.3 MB component (root, large) |
| simple_component | 261 | 212 | error | n/a | -18.8 | n/a | tiny component (adapter-heavy) |
| calc_component | 442 | 392 | error | n/a | -11.3 | n/a | small component (adapter-heavy) |
| meld_fused | n/a | n/a | n/a | n/a | n/a | n/a | real meld-fused multi-component core (Track 3 target — see tests/corpus/MELD_FUSED_README.md) |

## Results — code section only (optimizer-relevant)

_Bytes of the wasm code section (function bodies) only — the surface_
_an optimizer actually changes. Use these deltas to compare optimizer_
_effectiveness fairly (independent of debug-info / attestation noise)._

| Workload | Baseline (code) | LOOM (code) | wasm-opt (code) | LOOM code Δ% | wasm-opt code Δ% | Note |
|---|---:|---:|---:|---:|---:|---|
| gale | 811 | 795 | 795 | -2.0 | -2.0 | kernel-FFI fixture |
| httparse | 3452 | 3433 | 3399 | -0.6 | -1.5 | HTTP parser |
| nom_numbers | n/a | n/a | n/a | n/a | n/a | parser-combinator primitives |
| state_machine | 1055 | 1037 | 992 | -1.7 | -6.0 | FSM kernel |
| json_lite | 2125 | 2071 | 2017 | -2.5 | -5.1 | minimal JSON tokenizer |
| loom | n/a | n/a | n/a | n/a | n/a | LOOM self-build (dogfood target) |
| calculator | n/a | n/a | n/a | n/a | n/a | component-shaped fixture |
| calculator_root | 106017 | n/a | n/a | n/a | n/a | 2.3 MB component (root, large) |
| simple_component | 9 | 9 | n/a | +0.0 | n/a | tiny component (adapter-heavy) |
| calc_component | 33 | 33 | n/a | +0.0 | n/a | small component (adapter-heavy) |
| meld_fused | n/a | n/a | n/a | n/a | n/a | real meld-fused multi-component core (Track 3 target — see tests/corpus/MELD_FUSED_README.md) |

## Components via meld (fused-core baseline)

_For Component-Model fixtures, wasm-opt cannot process the component
directly. `meld fuse` produces a single core module from the component;
that fused core is its own baseline and is structurally different from the
original component. The deltas below compare wasm-opt and LOOM against the
**meld output** as baseline._

| Workload | meld baseline | wasm-opt -O3 | LOOM | wasm-opt Δ% | LOOM Δ% | Note |
|---|---:|---:|---:|---:|---:|---|
| calculator_root | 128764 | 114639 | n/a | -11.0 | n/a | 2.3 MB component (root, large) |
| simple_component | 90 | 90 | 41 | +0.0 | -54.4 | tiny component (adapter-heavy) |
| calc_component | 135 | 135 | 86 | +0.0 | -36.3 | small component (adapter-heavy) |

## Methodology

For each workload (fixture path is relative to repo root):
1. Record baseline byte count via `wc -c` and code-section size via `wasm-tools dump`.
2. Run `loom optimize <fixture> -o <name>.loom.wasm`.
3. Run `wasm-opt -O3 <fixture> -o <name>.wopt.wasm` (skipped if wasm-opt unavailable).
4. Re-run LOOM on the wasm-opt output (`wasm-opt -> LOOM` column).
5. Validate every output via `wasm-tools validate`. **A validation failure is a HARD ERROR** -- the harness aborts with exit code 2.

Conventions:
- Δ% is `(out - base) / base * 100`. Negative means smaller (better).
- A row is flagged :red_circle: if LOOM grew the file vs. baseline, or if wasm-opt beats LOOM by more than 1% of baseline.
- Outputs of every run are in `/tmp/loom-measure-corpus` for forensic inspection.
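
The two conventions above can be written out as a short sketch. This is an illustration in Rust rather than the harness's own shell code; the function names are hypothetical, and the reading of "beats LOOM by more than 1% of baseline" as `(loom - wasm_opt) / base * 100 > 1` is an interpretation, though it is consistent with the red-row percentages reported in this file.

```rust
/// Δ% as defined in the conventions: `(out - base) / base * 100`.
fn delta_pct(base: u64, out: u64) -> f64 {
    (out as f64 - base as f64) / base as f64 * 100.0
}

/// How far wasm-opt is ahead of LOOM, as a percentage of the baseline size.
fn wasm_opt_lead_pct(base: u64, loom: u64, wasm_opt: u64) -> f64 {
    (loom as f64 - wasm_opt as f64) / base as f64 * 100.0
}

/// Red-row rule: LOOM grew the file, or wasm-opt leads by more than 1% of baseline.
/// `wasm_opt` is `None` when wasm-opt produced no valid output for the row.
fn is_red_row(base: u64, loom: u64, wasm_opt: Option<u64>) -> bool {
    let loom_grew = delta_pct(base, loom) > 0.0;
    let wasm_opt_leads = wasm_opt.map_or(false, |w| wasm_opt_lead_pct(base, loom, w) > 1.0);
    loom_grew || wasm_opt_leads
}
```

For the httparse row (baseline 4766, LOOM 4668, wasm-opt 4371) the lead is 297 / 4766, about 6.23% of baseline, so the row is flagged; for gale (1941, 1846, 1925) LOOM is the smaller output and the row stays unflagged.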

## Reproducing

```bash
# Build LOOM first (Z3 verification enabled)
Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h \
LIBRARY_PATH=/opt/homebrew/lib cargo build --release

# Run the harness
bash scripts/measure_corpus.sh
```
16 changes: 10 additions & 6 deletions loom-cli/src/main.rs
@@ -264,6 +264,7 @@ fn count_instructions_from_bytes(bytes: &[u8]) -> usize {
}

/// Optimize command implementation
+ #[allow(clippy::too_many_arguments)]
fn optimize_command(
input: String,
output: Option<String>,
@@ -527,12 +528,15 @@ fn optimize_command(
track_pass("canonicalize", before, after);
}

- // v1.0.5 Track 1: ægraph-based optimization. Runs AFTER canonicalize
- // (canonical operand order makes pattern matching deterministic) and
- // BEFORE peephole-synth (so the egraph engine gets first crack at
- // identity folds — the substrate is richer than peephole's linear
- // pattern matcher). Disabled by default for v1.0.5 since the
- // candidate set is tiny; opt-in via --passes egraph.
+ // ægraph-based optimization. Runs AFTER canonicalize (canonical
+ // operand order makes pattern matching deterministic) and BEFORE
+ // peephole-synth (so the egraph engine gets first crack at identity
+ // folds — the substrate is richer than peephole's linear pattern
+ // matcher). Default-on as of v1.1.0: cost-driven extraction (Track B)
+ // plus the widened i64/commutativity rule set (Track C) make it a
+ // net-neutral-or-better pass on the corpus. Each function is reverted
+ // untouched unless extraction is strictly shorter, so default-on
+ // cannot regress output — see egraph_optimize_body.
if should_run("egraph") {
println!(" Running: egraph");
let before = count_instructions(&module);
78 changes: 27 additions & 51 deletions loom-core/src/component_optimizer.rs
@@ -355,9 +355,7 @@ fn optimize_core_module(module_bytes: &[u8]) -> Result<Vec<u8>> {
" Encode failed after 'specialize_adapters' (reverting): {}",
e
);
- crate::stats::record_revert(
- "component:specialize_adapters/encode-failed",
- );
+ crate::stats::record_revert("component:specialize_adapters/encode-failed");
module.functions = saved_functions;
}
}
@@ -380,27 +378,18 @@ fn optimize_core_module(module_bytes: &[u8]) -> Result<Vec<u8>> {
match optimize_async_callback_adapters(&mut module) {
Ok(folded) if folded > 0 => match crate::encode::encode_wasm(&module) {
Ok(bytes) => {
- if let Err(e) =
- Validator::new_with_features(wasm_features_with_async()).validate_all(&bytes)
+ if let Err(e) = Validator::new_with_features(wasm_features_with_async())
+ .validate_all(&bytes)
{
- eprintln!(
- " Module invalid after 'async-adapter' (reverting): {}",
- e
- );
+ eprintln!(" Module invalid after 'async-adapter' (reverting): {}", e);
crate::stats::record_revert("component:async_adapter/invalid");
module.functions = saved_functions;
} else {
- eprintln!(
- " Async-callback adapter: {} call site(s) folded",
- folded
- );
+ eprintln!(" Async-callback adapter: {} call site(s) folded", folded);
}
}
Err(e) => {
- eprintln!(
- " Encode failed after 'async-adapter' (reverting): {}",
- e
- );
+ eprintln!(" Encode failed after 'async-adapter' (reverting): {}", e);
crate::stats::record_revert("component:async_adapter/encode-failed");
module.functions = saved_functions;
}
@@ -425,24 +414,15 @@ fn optimize_core_module(module_bytes: &[u8]) -> Result<Vec<u8>> {
if let Err(e) = Validator::new_with_features(wasm_features_with_async())
.validate_all(&bytes)
{
- eprintln!(
- " Module invalid after 'async-chain' (reverting): {}",
- e
- );
+ eprintln!(" Module invalid after 'async-chain' (reverting): {}", e);
crate::stats::record_revert("component:async_chain/invalid");
module.functions = saved_functions;
} else {
- eprintln!(
- " Async-chain composition: {} instructions removed",
- shrunk
- );
+ eprintln!(" Async-chain composition: {} instructions removed", shrunk);
}
}
Err(e) => {
- eprintln!(
- " Encode failed after 'async-chain' (reverting): {}",
- e
- );
+ eprintln!(" Encode failed after 'async-chain' (reverting): {}", e);
crate::stats::record_revert("component:async_chain/encode-failed");
module.functions = saved_functions;
}
@@ -848,19 +828,17 @@ fn has_unknown_instructions(instructions: &[Instruction]) -> bool {
for instr in instructions {
match instr {
Instruction::Unknown(_) => return true,
- Instruction::Block { body, .. } | Instruction::Loop { body, .. } => {
- if has_unknown_instructions(body) {
- return true;
- }
+ Instruction::Block { body, .. } | Instruction::Loop { body, .. }
+ if has_unknown_instructions(body) =>
+ {
+ return true;
}
Instruction::If {
then_body,
else_body,
..
- } => {
- if has_unknown_instructions(then_body) || has_unknown_instructions(else_body) {
- return true;
- }
+ } if (has_unknown_instructions(then_body) || has_unknown_instructions(else_body)) => {
+ return true;
}
_ => {}
}
@@ -1345,14 +1323,11 @@ mod async_adapter_tests {
assert!(!has_eq, "I32Eq must be gone after fold");
assert!(!has_set, "LocalSet (exit-code capture) must be gone");
assert!(
- body.iter()
- .any(|i| matches!(i, Instruction::I32Const(42))),
+ body.iter().any(|i| matches!(i, Instruction::I32Const(42))),
"fast-path constant 42 must remain"
);
assert!(
- !body
- .iter()
- .any(|i| matches!(i, Instruction::I32Const(-1))),
+ !body.iter().any(|i| matches!(i, Instruction::I32Const(-1))),
"slow-path constant -1 must be gone"
);
}
@@ -1589,19 +1564,17 @@ mod async_adapter_tests {
for instr in instrs {
match instr {
Instruction::I32Const(-1) => return true,
- Instruction::Block { body, .. } | Instruction::Loop { body, .. } => {
- if has_const_neg_one(body) {
- return true;
- }
+ Instruction::Block { body, .. } | Instruction::Loop { body, .. }
+ if has_const_neg_one(body) =>
+ {
+ return true;
}
Instruction::If {
then_body,
else_body,
..
- } => {
- if has_const_neg_one(then_body) || has_const_neg_one(else_body) {
- return true;
- }
+ } if (has_const_neg_one(then_body) || has_const_neg_one(else_body)) => {
+ return true;
}
_ => {}
}
@@ -1849,7 +1822,10 @@ mod adapter_spec_tests {
let mut module = mk_module(vec![func.clone()]);
let folded = specialize_adapters(&mut module).unwrap();

- assert_eq!(folded, 0, "Must not touch modules with Unknown instructions");
+ assert_eq!(
+ folded, 0,
+ "Must not touch modules with Unknown instructions"
+ );
assert_eq!(module.functions[0].instructions, func.instructions);
}
