Skip to content

air: make symbolic arena handles safe and generation-checked#179

Open
latifkasuli wants to merge 1 commit into
leanEthereum:mainfrom
latifkasuli:fix/safe-symbolic-arena-handles
Open

air: make symbolic arena handles safe and generation-checked#179
latifkasuli wants to merge 1 commit into
leanEthereum:mainfrom
latifkasuli:fix/safe-symbolic-arena-handles

Conversation

@latifkasuli
Copy link
Copy Markdown

@latifkasuli latifkasuli commented Mar 30, 2026

Summary

Fixes #170.

The thread-local symbolic expression arena in crates/backend/air/src/symbolic.rs used raw u32 byte offsets as public handles into a Vec<u8>. Because SymbolicExpression::Operation(u32) and safe get_node(u32) were public, downstream code could forge handles or keep stale handles and cause unchecked read_unaligned reads.

This PR closes the full issue, not just the immediate OOB symptom:

  1. Forgeable handles -- downstream code can no longer construct an arbitrary arena offset.
  2. Stale handles after clear -- handles carry a generation and are rejected after clear_arena().
  3. Cross-thread handle misuse -- handles carry a thread-local arena id and are rejected in the wrong thread.
  4. Index truncation / arena overflow -- allocation checks both the offset and resulting arena end before storing a u32 offset.
  5. Arena id wraparound -- thread-local arena ids are allocated with a checked atomic update instead of wrapping fetch_add.

What this PR does

  • Replaces raw u32 with an opaque SymbolicNodeRef<F> handle carrying arena_id, generation, and offset with private fields.
  • Replaces the bare Vec<u8> with ArenaState { arena_id, generation, bytes }.
  • Bumps generation on clear_arena() to invalidate outstanding handles.
  • Adds try_get_node and keeps get_node as the panic-on-invalid convenience wrapper.
  • Validates arena id, generation, and bounds before any read_unaligned.
  • Uses checked allocation math before converting offsets to u32.
  • Migrates rec_aggregation::compilation caches and dot-product detection helpers to key by SymbolicNodeRef<F>.
  • Keeps SymbolicExpression and SymbolicNodeRef Copy.

Files changed

File Change
crates/backend/air/src/symbolic.rs Core fix: ArenaState, SymbolicNodeRef, SymbolicNodeAccessError, checked arena id/allocation helpers, try_get_node, clear_arena, updated alloc_node and get_node, 8 unit tests
crates/backend/air/Cargo.toml Adds koala-bear as dev-dependency for tests
crates/rec_aggregation/src/compilation.rs Migrates Operation cache/dot-product codegen from u32 offsets to SymbolicNodeRef<F> handles
Cargo.lock Auto-updated

Test plan

Eight unit tests in symbolic::tests:

  • roundtrip_alloc_get -- basic alloc + read roundtrip
  • stale_handle_rejected_after_clear -- handle from previous generation returns StaleGeneration
  • old_handle_cannot_read_new_generation_bytes -- old handle cannot read data written in a new generation at the same offset
  • wrong_thread_handle_rejected -- handle created on one thread returns WrongArena on another thread
  • out_of_bounds_handle_rejected -- invalid in-module handle with large offset returns OutOfBounds
  • offset_truncation_detected -- allocation range fails before exceeding the u32 offset range
  • arena_id_overflow_detected -- arena id allocation fails instead of wrapping
  • arithmetic_produces_valid_handles -- arithmetic on non-constant expressions produces valid operation handles
  • Compile-time assertion that SymbolicExpression<KoalaBear> and SymbolicNodeRef<KoalaBear> are Copy

Verified after rebase:

cargo fmt --check
cargo test -p mt-air
cargo check --workspace
cargo clippy --workspace --all-targets -- -D warnings

@TomWambsgans TomWambsgans force-pushed the main branch 2 times, most recently from c6f9fd4 to c09c85a Compare April 26, 2026 21:43
@TomWambsgans TomWambsgans force-pushed the main branch 2 times, most recently from eacd019 to 9b2f632 Compare May 25, 2026 00:11
@TomWambsgans TomWambsgans force-pushed the main branch 2 times, most recently from c5a3050 to 9dc5d68 Compare May 28, 2026 12:02
Replace raw `u32` arena offsets with opaque `SymbolicNodeRef<F>` handles
that carry arena_id, generation, and offset. Validate all three before
any `read_unaligned`, turning four classes of UB into deterministic
errors:

1. Forgeable handles: `SymbolicNodeRef` fields are private, so
   downstream code can no longer construct arbitrary indices.
2. Stale handles after clear: `clear_arena()` bumps a generation
   counter; old handles fail with `StaleGeneration`.
3. Cross-thread misuse: each thread-local arena gets a unique
   `arena_id` from a global `AtomicU32`; handles from another
   thread fail with `WrongArena`.
4. Index truncation: `alloc_node` uses `u32::try_from(offset)`
   instead of a bare `as u32` cast.

`SymbolicExpression` remains `Copy`. The only downstream migration
is `Operation(u32)` -> `Operation(SymbolicNodeRef<F>)`, which is
a mechanical change (one call site in `rec_aggregation::compilation`).

Closes leanEthereum#170
@latifkasuli latifkasuli force-pushed the fix/safe-symbolic-arena-handles branch from 5508d33 to 3a22222 Compare June 3, 2026 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[H-3] Unchecked arena index enables out-of-bounds reads

1 participant