
feat(graph): Backend::MailboxSoa — classid node-match + CLAM/CAKES neighborhood (Inc 0) #544

Merged
AdaWorldAPI merged 14 commits into main from claude/inc0-mailbox-soa-backend on Jun 19, 2026

Conversation


@AdaWorldAPI AdaWorldAPI commented Jun 18, 2026

Owner

Inc 0 — Backend::MailboxSoa: the substrate-is-the-graph dispatch table, two facets landed

The substrate IS the graph (E-GUID-IS-THE-GRAPH); a query routes to the cheapest facet that answers it, off the GUID key, zero value decode. This PR lands the two key-only facets.

Facet 1 — classid node-match (MATCH (n:Label))

  • match_nodes_by_class(view, class_id) — classid prefix-route; reads only the class column.
  • match_node_by_local_key(view, local_key) — point lookup via MailboxSoaView::row_for_local_key (None-fallback to positional).
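
The two Facet 1 entry points can be sketched as follows. This is a minimal illustration, not the code in `graph/mailbox_scan.rs`: the `KeyView` trait, the `VecView` type, the `u16` class type, and the `u32` key type are all hypothetical stand-ins, and the positional fallback is only one reading of "None-fallback to positional".

```rust
/// Hypothetical, minimal stand-in for the key-only part of `MailboxSoaView`.
pub trait KeyView {
    /// Logical (populated) row count.
    fn n_rows(&self) -> usize;
    /// Class column; the only column the class route reads.
    fn class_id(&self) -> &[u16];
    /// Key index lookup; the default `None` models a view without an index.
    fn row_for_local_key(&self, local_key: u32) -> Option<usize> {
        let _ = local_key;
        None
    }
}

/// MATCH (n:Label): rows whose class column equals `class_id`.
pub fn match_nodes_by_class<V: KeyView>(view: &V, class_id: u16) -> Vec<usize> {
    view.class_id()
        .iter()
        .take(view.n_rows())
        .enumerate()
        .filter(|(_, c)| **c == class_id)
        .map(|(row, _)| row)
        .collect()
}

/// Point lookup: use the key index when present, otherwise (assumption)
/// treat the key as a row position and bounds-check it.
pub fn match_node_by_local_key<V: KeyView>(view: &V, local_key: u32) -> Option<usize> {
    view.row_for_local_key(local_key).or_else(|| {
        let row = local_key as usize;
        (row < view.n_rows()).then_some(row)
    })
}

/// Tiny in-memory view used for illustration only.
pub struct VecView {
    pub classes: Vec<u16>,
}

impl KeyView for VecView {
    fn n_rows(&self) -> usize {
        self.classes.len()
    }
    fn class_id(&self) -> &[u16] {
        &self.classes
    }
}
```

Neither function touches anything except the class column and the key index, which is the property the F2 gate below checks.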

Facet 2 — CLAM/CAKES neighborhood (proximity) — panCAKES ≡ radix trie ≡ HHTL

The CLAM cluster tree isn't a separate structure — it is the radix trie of the classid·HEEL·HIP·TWIG nibble paths already in the keys (E-PANCAKES-IS-RADIX-IS-HHTL). So neighborhood = pure prefix arithmetic:

  • NiblePath::common_prefix_depth (contract) — the radix-trie NN measure (CAKES attraction = longest-common-prefix).
  • MailboxSoaView::hhtl_path_at (contract) — per-row HHTL path, deferred-binding default None (canon NodeRow already carries key(16)).
  • clam_contained (is_ancestor_of = the radix subtree = CLAM cluster) + cakes_nearest (common_prefix_depth k-NN). Zero value decode.
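
To make "neighborhood = pure prefix arithmetic" concrete, here is a small sketch of the three operations over plain nibble vectors. The `Path` type is a hypothetical stand-in for the contract's path type, the inclusive ancestor rule is an assumption, and the sample rows in the usage note are invented to mirror the shape of the gates listed below.

```rust
/// Hypothetical nibble path: one 4-bit symbol per trie level.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Path(pub Vec<u8>);

impl Path {
    /// Number of leading nibbles shared by `self` and `other`.
    pub fn common_prefix_depth(&self, other: &Path) -> u8 {
        self.0.iter().zip(&other.0).take_while(|(a, b)| a == b).count() as u8
    }

    /// True when `self` is a prefix of `other` (assumption: a node is its own ancestor).
    pub fn is_ancestor_of(&self, other: &Path) -> bool {
        self.0.len() <= other.0.len()
            && self.common_prefix_depth(other) as usize == self.0.len()
    }
}

/// Rows inside the subtree rooted at `cluster`; rows without a path are skipped.
pub fn clam_contained(paths: &[Option<Path>], cluster: &Path) -> Vec<usize> {
    paths
        .iter()
        .enumerate()
        .filter(|(_, p)| matches!(p, Some(p) if cluster.is_ancestor_of(p)))
        .map(|(row, _)| row)
        .collect()
}

/// Top-`k` rows by shared-prefix depth with `query`, deepest first.
/// The sort is stable, so ties keep row order.
pub fn cakes_nearest(paths: &[Option<Path>], query: &Path, k: usize) -> Vec<(usize, u8)> {
    let mut ranked: Vec<(usize, u8)> = paths
        .iter()
        .enumerate()
        .filter_map(|(row, p)| p.as_ref().map(|p| (row, query.common_prefix_depth(p))))
        .collect();
    ranked.sort_by_key(|&(_, depth)| std::cmp::Reverse(depth));
    ranked.truncate(k);
    ranked
}
```

With rows `1·2·3`, `1·2·4`, `1·2·5`, and `7·0·0`, the cluster `1·2` contains rows 0, 1, 2, the leaf `1·2·3` narrows to row 0, and a 3-nearest query for `1·2·3` ranks `[(0,3),(1,2),(2,2)]`. Rows whose path is `None` never appear in either result.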

Gates (tests, all green, clippy clean)

  • node-match parity vs reference classid filter;
  • CLAM containment = the radix subtree (rows 0,1,2 under 1·2; leaf query narrows to 0);
  • CAKES ranks by shared-prefix depth [(0,3),(1,2),(2,2)];
  • deferred-None HHTL ⇒ scan yields nothing (coarser-facet fallback, never a wrong row);
  • F2 zero-value-decode: a GuardedSoa whose energy()/meta_raw() panic — all facets complete without touching the 480 B value slab.
  • 7/7 mailbox_scan, 21/21 hhtl, 674 contract green.
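
The F2 gate relies on a test double whose value-side accessors panic, so any facet that completes has provably stayed on the key side. A reduced sketch of that pattern, assuming a hypothetical `SoaView` trait with a single value accessor (the real view has more columns, and this `GuardedSoa` only borrows the name of the PR's test double):

```rust
/// Hypothetical view split into a key side and a value side.
pub trait SoaView {
    fn n_rows(&self) -> usize;
    fn class_id(&self) -> &[u16];
    /// Value-slab accessor: a key-only facet must never call this.
    fn energy(&self) -> &[f32];
}

/// Test double: key columns are readable, any value-slab read panics.
pub struct GuardedSoa {
    pub classes: Vec<u16>,
}

impl SoaView for GuardedSoa {
    fn n_rows(&self) -> usize {
        self.classes.len()
    }
    fn class_id(&self) -> &[u16] {
        &self.classes
    }
    fn energy(&self) -> &[f32] {
        panic!("value slab touched by a facet that claims zero value decode")
    }
}

/// Key-only facet under test: counts rows of a class from the class column alone.
pub fn count_class<V: SoaView>(view: &V, class_id: u16) -> usize {
    view.class_id()
        .iter()
        .take(view.n_rows())
        .filter(|c| **c == class_id)
        .count()
}
```

The guarantee is structural rather than statistical: if `count_class` ever read `energy()`, the test would abort instead of returning a count.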

Dispatch-table tiers still ahead (grounded, not faked)

  • EdgeBlock typed-edge ((a)-[r:TYPE]->(b)): 12-family/4-external or 32×4 turbovec, per classid → EdgeCodecFlavor (E-ADJACENCY-IS-KEY-AND-EDGECODEC).
  • helix exact-location (Signed360, HelixResidue tenant): the one tier that IS a value decode, costed as such (E-HELIX-IS-EXACT-LOCATION).
  • CHAODA anomaly + CausalEdge64 SPO (E-CLAM-IS-THE-MANIFOLD-ENGINE).

Additive, layout-preserving — no NodeRow/stride/ENVELOPE_LAYOUT_VERSION change, no new object model (F5).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Expanded architecture/routing/geometry epiphanies for mailbox-based execution and neighborhood semantics.
  • New Features
    • Added a new mailbox-based backend route with deferred edge traversal.
    • Introduced mailbox SoA scan helpers for node lookup, CLAM/CAKES-style containment & nearest neighbors, coarse edge-slot extraction, and prefix-depth distance ranking.
    • Added NiblePath::common_prefix_depth and exposed new MailboxSoaView accessors for per-row HHTL paths and edge blocks.
  • Tests
    • Added/extended unit tests for the new prefix-depth API, scan behaviors, and guarded access semantics.

…(Inc 0 first slice)

cypher-kanban-ast-unification-v1 Inc 0, the verified-safe half: the substrate IS
the graph (E-GUID-IS-THE-GRAPH), so MATCH (n:Label) is a classid prefix-route
over the zero-dep MailboxSoaView contract, resolved off the class column with
zero value decode.

- graph_router::Backend gains the MailboxSoa variant (the named router gap).
- graph/mailbox_scan.rs: match_nodes_by_class (classid route; reads only the
  class column) + match_node_by_local_key (local_key->row via row_for_local_key,
  None-fallback to positional address).
- Gates: parity (matched set == reference classid filter); F2 zero-value-decode
  proven structurally by a GuardedSoa whose energy()/meta_raw() panic on access;
  key-index point lookup. 4/4 green, no new clippy warnings.

Edge-traversal ((a)-[r]->(b)) deliberately deferred, grounded not faked:
CausalEdge64 (the edges_raw column) is an SPO triple, NOT a row->row adjacency
pointer, and the View exposes no EdgeBlock adjacency accessor. That is the
edge-representation boundary the 5+3 council said to pin first (verdict 4b);
it lands as the next slice once the classid-resolved edge rep + adjacency
accessor are added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi

coderabbitai Bot commented Jun 18, 2026

Walkthrough

Adds Backend::MailboxSoa variant and MailboxSoaView contract extensions (hhtl_path_at, edge_block_at), implements radix-trie shared-prefix depth utility, introduces a mailbox_scan module with NodeMatch-based class/local-key node matching and zero-decode neighborhood facets (CLAM containment, CAKES nearest, coarse edge slots, distance metrics), plus comprehensive guarded tests and design documentation clarifying FWHT ranking, IVF-PQ architecture, manifold geometry engine, helix location encoding, and multi-facet adjacency representation.

Changes

MailboxSoA Node-Match Routing and Zero-Decode Neighborhood Facets

| Layer | File(s) | Summary |
|---|---|---|
| Backend::MailboxSoa enum variant and module export | crates/lance-graph/src/graph/graph_router.rs, crates/lance-graph/src/graph/mod.rs | Adds MailboxSoa variant to the Backend enum with documentation describing canonical GUID-keyed substrate routing and deferred edge traversal; exports the new mailbox_scan submodule. |
| MailboxSoaView contract extensions | crates/lance-graph-contract/src/soa_view.rs | Adds optional per-row accessors hhtl_path_at(row) -> Option<NiblePath> and edge_block_at(row) -> Option<EdgeBlock> with #[inline] default None implementations for deferred per-row data materialization. |
| Radix-trie shared-prefix depth utility | crates/lance-graph-contract/src/hhtl.rs | Implements NiblePath::common_prefix_depth(self, other) -> u8 computing the longest matching prefix depth between two HHTL paths, plus unit tests verifying correct depth for identical, divergent, ancestor, disjoint, symmetric, and empty-path cases. |
| NodeMatch struct and basic scan functions | crates/lance-graph/src/graph/mailbox_scan.rs | Defines NodeMatch carrying row index and Backend::MailboxSoa tag; implements match_nodes_by_class scanning only the class_id SoA column and match_node_by_local_key resolving via row_for_local_key, both without value-slab column access. |
| Neighborhood geometry facets: CLAM, CAKES, and coarse edge slots | crates/lance-graph/src/graph/mailbox_scan.rs | Implements clam_contained filtering rows by CLAM subtree ancestry; cakes_nearest computing per-row shared-prefix depth vs. query and sorting descending; EdgeNeighbors struct and edge_slots_coarse reading EdgeBlock for CoarseOnly flavor and extracting non-zero slot bytes. |
| Distance metrics: PrefixDepth-based node distance | crates/lance-graph/src/graph/mailbox_scan.rs | Adds DistanceMeans enum and implements node_distance computing radix-tree hop distance via HHTL path common-prefix depth, returning None if either row lacks a materialized HHTL path. |
| Comprehensive zero-decode test suite | crates/lance-graph/src/graph/mailbox_scan.rs | Adds extensive unit tests using GuardedSoa test double that panics on value-slab access, validating class/local-key resolution, CLAM containment, CAKES depth sorting, edge-slot flavor gating, and node-distance prefix-depth metrics across positive/empty/skip-when-path-missing cases. |
| Design documentation: FWHT ranking, IVF-PQ architecture, manifold engine, helix location, and adjacency facets | .claude/board/EPIPHANIES.md | Prepends eight epiphany entries documenting HHTL coarse fingerprints as scale-free router; per-tenant Walsh–Hadamard spectrum and Kronecker-factorization; orthogonal bundles and Parseval recovery with Walsh–Hadamard equivalence; tenant-switch angle-ranking as CAM-PQ ADC (IVF-PQ); panCAKES/GUID-key prefix unification for zero-decode operations; CLAM manifold geometry ensemble with per-facet cost models; helix Signed360 three-rung decode-cost ladder; and multi-facet adjacency representation (HHTL/CLAM prefix cascade, typed EdgeBlock edges, CausalEdge64 causal arcs) with routing dispatch rules. |

Sequence Diagram(s)

sequenceDiagram
  participant CypherRouter
  participant match_nodes_by_class
  participant match_node_by_local_key
  participant clam_contained
  participant cakes_nearest
  participant MailboxSoaView

  rect rgba(100, 149, 237, 0.5)
    Note over CypherRouter,MailboxSoaView: MATCH (n:Label) — class route
    CypherRouter->>match_nodes_by_class: view, class_id
    match_nodes_by_class->>MailboxSoaView: class_id() column only
    MailboxSoaView-->>match_nodes_by_class: entity_type rows
    match_nodes_by_class-->>CypherRouter: Vec<NodeMatch>
  end

  rect rgba(144, 238, 144, 0.5)
    Note over CypherRouter,MailboxSoaView: MATCH (n {key}) — local-key route
    CypherRouter->>match_node_by_local_key: view, local_key
    match_node_by_local_key->>MailboxSoaView: row_for_local_key(local_key)
    MailboxSoaView-->>match_node_by_local_key: Option<usize>
    match_node_by_local_key-->>CypherRouter: Option<NodeMatch>
  end

  rect rgba(255, 218, 185, 0.5)
    Note over CypherRouter,MailboxSoaView: Neighborhood facets — zero-decode geometry
    CypherRouter->>clam_contained: view, query
    clam_contained->>MailboxSoaView: hhtl_path_at(row) per candidate
    MailboxSoaView-->>clam_contained: Option<NiblePath>
    clam_contained-->>CypherRouter: Vec<NodeMatch> in CLAM subtree

    CypherRouter->>cakes_nearest: view, query, k
    cakes_nearest->>MailboxSoaView: common_prefix_depth per row
    MailboxSoaView-->>cakes_nearest: u8 shared depth
    cakes_nearest-->>CypherRouter: Vec<(NodeMatch, depth)> sorted desc
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • AdaWorldAPI/lance-graph#437: Introduced the MailboxSoaView contract that this PR extends with hhtl_path_at and edge_block_at methods to enable zero-decode neighborhood facet implementations.
  • AdaWorldAPI/lance-graph#507: Extended the same mailbox-query stack by adding MailboxSoaView-backed accessors and new NiblePath routing helpers in lance-graph-contract, upon which the main PR's Backend::MailboxSoa scan logic directly builds.
  • AdaWorldAPI/lance-graph#542: Added MailboxSoaView::row_for_local_key key-to-row resolver that the new match_node_by_local_key function directly calls.

Poem

🐇 Hop through the SoA rows with care,
No value-slab decodes to spare!
class_id column, prefix depth measured true,
CLAM containment, CAKES through and through.
The helix ladder climbs with grace—
IVF-PQ geometry finds its place! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: introducing Backend::MailboxSoa with classid node-match and CLAM/CAKES neighborhood functionality. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6e2b3e1c4


Comment on lines +61 to +64
let classes = view.class_id();
classes
.iter()
.enumerate()


P1 Badge Clamp class scans to logical rows

This iterates the entire class_id() slice instead of the view's logical row count. That is fine for the test GuardedSoa, but the real in-memory MailboxSoA<N> reports n_rows() == populated while entity_type() returns the full backing array capacity, initialized to zeros (cognitive-shader-driver/src/mailbox_soa.rs documents this phantom-row guard around n_rows). In contexts using that view, MATCH can return unpopulated padding rows (for example an empty mailbox queried for class 0, or stale padding after the logical size shrinks), corrupting node-match results; please clamp the scan with view.n_rows()/take(view.n_rows()).

Owner Author


Fixed in 2e6d039: match_nodes_by_class now .take(view.n_rows()) so the scan stops at the logical row count and never surfaces the zero-padded capacity tail. Added a regression test (match_nodes_by_class_clamps_to_n_rows_ignoring_padding) modelling a 2-populated / 3-zero-padding view — class 7 returns only [0,1], class 0 returns empty.
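
The phantom-row problem and the clamp can be reproduced in a few lines. This is a sketch under stated assumptions: `PaddedMailbox`, its field names, and the `u16` class type are hypothetical, chosen only to model a fixed-capacity store whose logical size is smaller than its backing array.

```rust
/// Hypothetical fixed-capacity mailbox: `populated` rows are live,
/// the remaining capacity is zero padding.
pub struct PaddedMailbox<const N: usize> {
    pub entity_type: [u16; N],
    pub populated: usize,
}

impl<const N: usize> PaddedMailbox<N> {
    /// Logical row count, not the backing capacity.
    pub fn n_rows(&self) -> usize {
        self.populated
    }

    /// Scans the whole backing array: padding rows look like class 0.
    pub fn match_unclamped(&self, class_id: u16) -> Vec<usize> {
        self.scan(N, class_id)
    }

    /// Scans only the logical rows, in the spirit of `.take(view.n_rows())`.
    pub fn match_clamped(&self, class_id: u16) -> Vec<usize> {
        self.scan(self.n_rows(), class_id)
    }

    fn scan(&self, limit: usize, class_id: u16) -> Vec<usize> {
        self.entity_type
            .iter()
            .take(limit)
            .enumerate()
            .filter(|(_, c)| **c == class_id)
            .map(|(row, _)| row)
            .collect()
    }
}
```

For a view with two populated rows of class 7 and three padding rows, the unclamped scan for class 0 reports rows 2, 3, 4, while the clamped scan reports nothing, which matches the behavior the regression test describes.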


Generated by Claude Code

claude added 4 commits June 18, 2026 20:24
…rep boundary (§4b)

Operator correction: adjacency lives in two places, classid/key-resolved, never
query-guessed:
1. HHTL cascade in the GUID key = CLAM hierarchical neighborhood (NiblePath
   is_ancestor_of/prefix; graph/neighborhood/clam.rs) — free, zero value decode.
2. 16-byte EdgeBlock = explicit typed edges per EdgeCodecFlavor: CoarseOnly
   (16x8) = 12 in-family + 4 external; Pq32x4 (32x4) = turbovec residue edges.
3. edges_raw = CausalEdge64 = SPO causal arcs (separate facet).
The class picks the rep (classid -> ClassView -> EdgeCodecFlavor). Unblocks the
deferred edge half of #544: next slice exposes the HHTL/key + EdgeBlock per row
on MailboxSoaView, then CLAM prefix-route + EdgeBlock slot-deref, both
zero-value-decode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…lix Signed360 is the exact orthogonal point, "where" is a decode ladder

Operator correction: the helix is not more adjacency, it is the EXACT orthogonal
LOCATION. Adjacency (HHTL/CLAM near, EdgeBlock connected) is relational; helix
Signed360 (ValueTenant::HelixResidue, signed full-sphere golden-spiral, 6B) is
the absolute exact coordinate. "Where" is a decode-cost ladder:
1. HHTL/CLAM containment - key prefix, zero value decode (which cluster).
2. Helix PLACE - deterministic from the address, zero value decode.
3. Helix RESIDUE - Signed360 6B in the value slab, one value-tenant decode (exact).
Router consequence: proximity query = key (free); exact-position query = read the
HelixResidue tenant (a value decode, costed as such). Grounded in canonical_node
ValueTenant::HelixResidue + ValueSchema::Compressed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…LFD, not a containment check; the full geometry-of-a-node surface

Operator: ndarray also has chaoda. Grounded in ndarray/src/hpc/clam.rs (CAKES
Alg1/4/6, panCAKES Alg2, CHAODA Phase 4 anomaly_scores from LFD) + perturbation-
sim chaoda (CHAODA-lite, names ndarray ClamTree as production). The CLAM facet
is the manifold engine: containment + CAKES ranked-NN (attraction) + CHAODA
anomaly (repulsion) + panCAKES compression, one tree, LFD the shared measure.

Synthesized geometry-of-a-node: off one GUID the substrate answers which-cluster
(CLAM, free), nearest-similar (CAKES), how-anomalous (CHAODA), exact-location
(helix Signed360, value decode), connected-to (EdgeBlock), caused-by
(CausalEdge64) - a complete geometric+relational surface, each at its own decode
cost; the router dispatches a query to the cheapest facet that answers it.
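
One way to picture "dispatch to the cheapest facet that answers it" is a small table keyed by question and ordered by decode cost. Everything below is hypothetical (the `Cost`, `Question`, and `FacetEntry` types, the `route` function, and the budget parameter); only the cost classes assigned in `sample_table` follow what the commit messages in this PR state.

```rust
/// Decode-cost classes, cheapest first (the derived `Ord` follows declaration order).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Cost {
    KeyOnly,
    ValueDecode,
}

/// Two illustrative question kinds.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Question {
    Distance,
    ExactLocation,
}

/// One row of a hypothetical dispatch table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FacetEntry {
    pub name: &'static str,
    pub answers: Question,
    pub cost: Cost,
}

/// Prefix depth is key-only; the Hamming plane and the helix residue are value decodes.
pub fn sample_table() -> [FacetEntry; 3] {
    [
        FacetEntry { name: "hamming_plane", answers: Question::Distance, cost: Cost::ValueDecode },
        FacetEntry { name: "prefix_depth", answers: Question::Distance, cost: Cost::KeyOnly },
        FacetEntry { name: "helix_residue", answers: Question::ExactLocation, cost: Cost::ValueDecode },
    ]
}

/// Cheapest facet that answers `question` without exceeding `budget`.
pub fn route(table: &[FacetEntry], question: Question, budget: Cost) -> Option<&FacetEntry> {
    table
        .iter()
        .filter(|e| e.answers == question && e.cost <= budget)
        .min_by_key(|e| e.cost)
}
```

Under this sketch a distance query resolves to the key-only facet even when a value decode is allowed, and an exact-location query returns `None` under a key-only budget instead of silently paying for a decode.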

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…ixes (Inc 0, the manifold facet)

panCAKES == radix trie == HHTL (E-PANCAKES-IS-RADIX-IS-HHTL): the CLAM cluster
tree IS the radix trie of the classid·HEEL·HIP·TWIG nibble paths already in the
keys, so the structural neighborhood is pure prefix arithmetic, zero value decode.

- NiblePath::common_prefix_depth (contract) — the radix-trie nearest-neighbor
  measure; longest-common-prefix = CAKES attraction. +1 unit test.
- MailboxSoaView::hhtl_path_at (contract) — per-row HHTL NiblePath, deferred-
  binding default None (canon NodeRow already carries key(16); the override
  exposes what's there).
- graph::mailbox_scan::clam_contained (is_ancestor_of = the radix subtree =
  CLAM cluster) + cakes_nearest (common_prefix_depth ranking, k-NN). Both
  key-only, zero value decode.
- Tests: containment = radix subtree (rows 0,1,2 under 1·2; leaf narrows to 0);
  CAKES ranks by shared depth [(0,3),(1,2),(2,2)]; deferred-None yields nothing
  (coarser-facet fallback); F2 zero-value-decode extended to CLAM/CAKES (the
  GuardedSoa value columns still panic-guarded). 7/7 mailbox_scan + 21/21 hhtl
  green, clippy clean.

This is the first dispatch-table facet beyond the classid node-match: proximity/
neighborhood resolves on the key (CLAM/CAKES), free, per E-CLAM-IS-THE-MANIFOLD-
ENGINE. Edge-deref (EdgeBlock) + helix exact-location (value decode) are the
next tiers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
AdaWorldAPI changed the title from "feat(graph): Backend::MailboxSoa + the classid-route node-match (Inc 0 first slice)" to "feat(graph): Backend::MailboxSoa — classid node-match + CLAM/CAKES neighborhood (Inc 0)" on Jun 18, 2026
claude added 8 commits June 18, 2026 20:58
…ly/4-external slot decode (Inc 0)

The third dispatch-table facet: explicit typed edges (a)-[r]->... under the
classid-resolved EdgeCodecFlavor (E-ADJACENCY-IS-KEY-AND-EDGECODEC). EdgeBlock is
bytes 16..32 (the edge region), NOT the value slab, so still zero value decode.

- MailboxSoaView::edge_block_at(row) -> Option<EdgeBlock> (contract, deferred
  default None; the canon NodeRow carries edges(16), the override exposes it).
- graph::mailbox_scan::{EdgeNeighbors, edge_slots_coarse}: under CoarseOnly,
  decode the 12 in-family + 4 external slots to their populated (non-zero) refs,
  family vs external. Pq32x4 (turbovec residue) / CoarseResidue are refused -
  they are NOT adjacency, never coerced to slots (boundary 4b: classid-resolved,
  not query-guessed).
- Slot-byte -> neighbor-row resolution is deliberately deferred (the basin-local-
  index convention + zero-collision is the next encoding decision, analogous to
  local_key->row); this facet lands the structure (which slots are edges, family
  vs external, under which flavor), never fakes the row resolution.
- Tests: populated decode ([2,5] family + [1] external), all-zero = no edges,
  no-block = None, non-Coarse flavors refused. F2 zero-value-decode extended.
  9/9 mailbox_scan, clippy clean (sort_by_key + Reverse).
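
A sketch of the CoarseOnly slot decode described above. The types reuse the PR's names for readability but are hypothetical re-declarations, and the byte layout (family slots in bytes 0..12, external slots in bytes 12..16, zero meaning empty) is an assumption made for illustration.

```rust
/// Hypothetical mirror of the flavors named in this commit.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EdgeCodecFlavor {
    CoarseOnly,
    CoarseResidue,
    Pq32x4,
}

/// Populated slot bytes, split by region. Slot byte to row resolution is not attempted.
#[derive(Debug, PartialEq, Eq)]
pub struct EdgeNeighbors {
    pub family: Vec<u8>,
    pub external: Vec<u8>,
}

/// Decodes a 16-byte edge block as 12 in-family + 4 external slots.
/// Returns `None` for any flavor other than `CoarseOnly`.
pub fn edge_slots_coarse(block: &[u8; 16], flavor: EdgeCodecFlavor) -> Option<EdgeNeighbors> {
    if flavor != EdgeCodecFlavor::CoarseOnly {
        return None; // residue codes are not adjacency slots
    }
    // Keep only non-zero slot bytes, preserving slot order.
    let populated =
        |slots: &[u8]| slots.iter().copied().filter(|&b| b != 0).collect::<Vec<u8>>();
    Some(EdgeNeighbors {
        family: populated(&block[..12]),
        external: populated(&block[12..]),
    })
}
```

Refusing the residue flavors with `None`, instead of reinterpreting their bytes as slots, keeps the choice of representation on the class side and out of the query.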

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…anded facets

CI fmt --check failed on three blocks (let-chain filter, EdgeNeighbors collect,
cakes assert). Applied cargo fmt. Also refreshed the stale module doc header
(it still described only the node-match half and called CLAM/EdgeBlock
"deferred") to document the three landed key-resident facets + the genuinely
deferred tiers (slot->row convention, helix/CHAODA/SPO costed tier). No logic
change; 9/9 mailbox_scan green, fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…switch 16K-from-an-angle compare is the costed value sweep, composing with the free key facets as a two-stage cascade

"Switch tenant + compare across the 16K mailbox from an angle" decomposes as:
batch Hamming sweep (hamming_top_k over a contiguous identity plane) = the right
use of popcount on the homogeneous 16K fingerprint; "angle" = which plane
(content/topic/angle) + query; "tenant switch" = column selector. Load-bearing:
it composes with #544's free key facets as a two-stage HHTL cascade - CLAM/CAKES
prefix prune (free, zero decode) then angle-Hamming rank over the pruned set
(costed). Key prune + content rank = two halves of one query. Cost-class
boundary: this facet deliberately decodes the value plane, NOT in the F2
zero-decode class, lands on its own branch with its own cost gate. Grounded in
MailboxSoA content/topic/angle_row + ndarray hamming_top_k + cycle snapshot.
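
The two-stage cascade can be sketched with a toy fingerprint of one `u64` word per row (the real planes are far wider). Both function names are hypothetical and this is not the ndarray `hamming_top_k` API; a later commit in this PR also reframes the linear sweep as the no-index fallback, so treat this as an illustration of the prune-then-rank composition only.

```rust
/// Stage 1 (key-only): keep rows whose nibble path shares at least
/// `min_depth` leading nibbles with the query path.
pub fn prefix_prune(paths: &[Vec<u8>], query: &[u8], min_depth: usize) -> Vec<usize> {
    paths
        .iter()
        .enumerate()
        .filter(|(_, p)| p.iter().zip(query).take_while(|(a, b)| a == b).count() >= min_depth)
        .map(|(row, _)| row)
        .collect()
}

/// Stage 2 (value decode): rank the surviving rows by Hamming distance
/// on one fingerprint plane, nearest first.
pub fn hamming_rank(
    plane: &[u64],
    candidates: &[usize],
    query: u64,
    k: usize,
) -> Vec<(usize, u32)> {
    let mut ranked: Vec<(usize, u32)> = candidates
        .iter()
        .map(|&row| (row, (plane[row] ^ query).count_ones()))
        .collect();
    ranked.sort_by_key(|&(_, distance)| distance);
    ranked.truncate(k);
    ranked
}
```

Only rows that survive the free prefix prune ever have their plane word read, so the costed stage scales with the size of the pruned set and not with the whole mailbox.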

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…p; IVF coarse quantizer IS the HHTL/CLAM prefix

Operator: "sweep or just 90° fingerprint vector cam index" → CAM index. The
value-side rank is a CAM-PQ ADC (distance-table lookups + IVF probe), never a
linear Hamming sweep (that's only the no-index fallback). The load-bearing
correction: CAM-PQ = IVF-PQ, and its IVF coarse quantizer IS the HHTL/CLAM
prefix (turbovec: palette256 = coarse quantizer) while the PQ residual IS the
turbovec Pq32x4 / value-slab codes. So #544's cakes_nearest prefix prune is
literally the IVF coarse-quantization stage, not a prefilter on a scan. 90° =
content/topic/angle orthogonal axes each get their own distance table; tenant
switch = which orthogonal table to ADC against. Retitled the entry (was
"…-SWEEP-IS-PRUNE-THEN-RANK"); fixed an accidental duplicate E-PANCAKES header.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…seval (any basis); Walsh-Hadamard projection iff bipolar ±1 basis (our case); the third ranking path

Operator: "bundle a 90° sweep → a hadamard projection or something" — made
precise + graded. [G] orthogonal bundle is Parseval-recoverable in any
orthonormal basis; [G] it's a Walsh-Hadamard projection iff the basis is ±1 WH
(H^T H = N I); [G, this substrate] fingerprints are bipolar ±1 so it literally
is a WHT (canon bipolar-phase pyramid + witness.rs particle==wave). Caveats: two
Hadamards (transform = basis change, NOT the product = vsa_bind); "Walsh =
eigenbasis" only on hypercube-structured data (sketch.rs:20). Payoff: the third
ranking path — FWHT the field once, read many angle queries by Parseval
(transform-once/read-many); honest bound: no measured single-query speedup
(witness.rs). Grounded in ndarray::simd::wht_f32 + perturbation-sim sketch/witness.
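
The Parseval claim for a bipolar basis can be checked with an integer fast Walsh-Hadamard transform. The sketch below is a textbook unnormalized FWHT written for this note, not `wht_f32` or any other function from the crates named above; `fwht` and `dot` are illustrative names.

```rust
/// In-place fast Walsh-Hadamard transform (unnormalized, natural order).
/// `data.len()` must be a power of two.
pub fn fwht(data: &mut [i64]) {
    let n = data.len();
    assert!(n.is_power_of_two(), "length must be a power of two");
    let mut h = 1;
    while h < n {
        // Butterfly over blocks of size 2h: (a, b) -> (a + b, a - b).
        for block in (0..n).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (data[i], data[i + h]);
                data[i] = a + b;
                data[i + h] = a - b;
            }
        }
        h *= 2;
    }
}

/// Dot product, used for both sides of Parseval's identity.
pub fn dot(a: &[i64], b: &[i64]) -> i64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Because H^T H = N I, the transformed dot product equals N times the original one, and applying `fwht` twice returns N times the input. That is the "transform once, read many" property: after one transform of the field, each additional query costs a dot product in the Walsh domain.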

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…t WHT spectrum = cheap meta-awareness; WHT tensor-factorizes along HHTL → exponential prefix-table lookup (capstone)

Two graded claims (operator synthesis). Claim 1 [G]: one FWHT per tenant =
few-coefficient global summary (walsh_pyramid_energy: per-dyadic-level energy +
coarse fraction); coarse-dominant=predictable, fine-spread=surprising → a
free-energy/awareness proxy, transform-once, cheap, for all tenants a small
meta-fingerprint panel. Claim 2 [G mechanism / H "any factor"]: H_{16^n} =
H_16^⊗n (Kronecker; FWHT butterfly = the HHTL nibble tree), so a Walsh op over a
16^k prefix-subtree = k stages of 16-point table ops = O(k·16) vs O(16^k) for a
SEPARABLE (low-sequency) factor — "exponential lookup over prefixed tables"
(bgz-tensor attention-as-lookup + cascade as Kronecker). Caveat: only for
separable factors; full FWHT is O(N log N) not sub-linear. Capstone: five
threads (PANCAKES index / ORTHOGONAL-BUNDLE transform / TENANT-ANGLE tables /
spectrum summary / prefix-lookup composition) are ONE Walsh-Kronecker structure
along the nibble hierarchy.
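
The O(k·16) versus O(16^k) claim for a separable factor follows directly from the Kronecker structure. A short derivation, using w for the Walsh index and x for the position, each split into k nibbles (this notation is introduced here and is not from the commit):

```latex
% x = (x_1, \dots, x_k), \; w = (w_1, \dots, w_k), \; x_j, w_j \in \{0, \dots, 15\}
\begin{aligned}
H_{16^k} &= \underbrace{H_{16} \otimes \cdots \otimes H_{16}}_{k\ \text{factors}},
\qquad
\bigl(H_{16^k}\bigr)_{w,x} = \prod_{j=1}^{k} \bigl(H_{16}\bigr)_{w_j,\,x_j}, \\[6pt]
f(x) &= \prod_{j=1}^{k} f_j(x_j)
\;\Longrightarrow\;
\hat f(w) = \sum_{x} \bigl(H_{16^k}\bigr)_{w,x}\, f(x)
          = \prod_{j=1}^{k} \Bigl( \sum_{x_j=0}^{15} \bigl(H_{16}\bigr)_{w_j,\,x_j}\, f_j(x_j) \Bigr)
          = \prod_{j=1}^{k} \hat f_j(w_j).
\end{aligned}
```

Each inner sum has 16 terms, so one coefficient of a separable factor costs k sums of 16 terms, against 16^k terms for the direct sum. The factorization step is exactly where separability is used: for a factor that does not split across nibbles, the sum over x no longer breaks into per-level sums, which is the caveat stated above.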

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…oarse fingerprints route in-RAM (IVF) AND cross-server (shard) with one lookup; corrected capacity (8MiB/8GiB at 512B node)

Deployment-tier synthesis. Corrected arithmetic: canon 512B node → mailbox =
16384×512 = 8 MiB (not 2 MB; 2 MiB only at a 128B reduced node), 1024 prefixes =
8 GiB, coarse table = 512 KiB negligible. The scale-free insight: the same 1024
coarse fingerprints route at two scales with one lookup — in-RAM IVF probe
(which cluster) AND cross-server shard route (which server). HHTL prefix =
cluster key = IVF cell = shard key. "Delegate awareness to other servers" =
replicate the 512KiB coarse table + route by prefix; the IVF coarse stage IS the
shard router. Split: GUID key + routing uncompressed/transparent, 480B value
slab compresses in Lance, Raft per region. Fences: fork-policy P0 (path is
SurrealDB-on-TiKV not raw TiKV → STOP-and-ask); capacity-only, no throughput bench.
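
The corrected capacity figures are easy to recheck. The constant and function names below are hypothetical; the inputs (512 B and 128 B nodes, 16384 rows per mailbox, 1024 prefixes, a 512 KiB coarse table) come from the message above, and the per-entry size assumes the coarse table holds one entry per prefix.

```rust
pub const KIB: u64 = 1 << 10;
pub const MIB: u64 = 1 << 20;
pub const GIB: u64 = 1 << 30;

/// Rows in one mailbox.
pub const ROWS_PER_MAILBOX: u64 = 16_384;
/// Number of coarse prefixes, one mailbox each.
pub const PREFIXES: u64 = 1_024;

/// Bytes held by one mailbox at a given node size.
pub const fn mailbox_bytes(node_bytes: u64) -> u64 {
    ROWS_PER_MAILBOX * node_bytes
}

/// Bytes held by all mailboxes, one per prefix.
pub const fn all_mailboxes_bytes(node_bytes: u64) -> u64 {
    PREFIXES * mailbox_bytes(node_bytes)
}

/// Bytes per coarse-table entry, assuming one entry per prefix.
pub const fn coarse_entry_bytes(table_bytes: u64) -> u64 {
    table_bytes / PREFIXES
}
```

At 512 B per node this gives 8 MiB per mailbox and 8 GiB across 1024 prefixes; the 2 MiB figure only appears at 128 B per node. A 512 KiB coarse table works out to 512 B per prefix under the one-entry-per-prefix assumption.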

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…erent means, GUID = node"

Consolidates the distance facets into one typed surface (the task at hand): a
node IS its GUID, and node↔node separation is computed by a selectable means
anchored on the key. DistanceMeans names the dispatch surface; node_distance is
the dispatcher.

- PrefixDepth (landed, key-only, zero value decode): the CLAM/HHTL radix
  TREE-HOP metric (depth_a − cpd) + (depth_b − cpd) — a genuine metric
  (d(x,x)=0, symmetric, tree triangle inequality), NOT the earlier MAX−cpd which
  failed d(x,x)=0. Caught by the self-distance test.
- Value-decode means (Hamming-plane / HelixAngular / PqAdc) named in the enum doc
  as the costed tier — wired on their own branch with their own cost gate, return
  None here until then (never silently in the zero-decode path).

Tests: tree-hop metric (self=0, siblings=2, ancestor=1, cross-basin=4, symmetric,
monotone) + None on unmaterialized path. 11/11 mailbox_scan green, fmt + clippy clean.
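
The tree-hop formula is small enough to write out. This sketch uses plain nibble slices in place of the contract's path type, and its signature is hypothetical (the PR's `node_distance` takes a view and a distance means, which are not modelled here).

```rust
/// Shared leading nibbles of two paths.
pub fn common_prefix_depth(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Tree-hop distance: hops from `a` up to the deepest common ancestor,
/// then down to `b`. `None` when either row has no materialized path.
pub fn node_distance(a: Option<&[u8]>, b: Option<&[u8]>) -> Option<usize> {
    let (a, b) = (a?, b?);
    let cpd = common_prefix_depth(a, b);
    Some((a.len() - cpd) + (b.len() - cpd))
}
```

For `1·2·3` against itself the distance is 0, against the sibling `1·2·4` it is 2, and the parent `1·2` is 1 hop away. Two depth-2 paths with no shared prefix are 4 apart. The subtraction cannot underflow because the shared depth never exceeds either path length.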

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…ng rows (codex P1 #544)

The real MailboxSoA<N> reports n_rows() == populated while class_id()/
entity_type() borrow the full zero-padded backing capacity. Iterating the raw
slice surfaced phantom padding rows (MATCH class 0 hitting the zeroed tail, or
stale padding after a logical shrink). Added .take(view.n_rows()) so the scan
stops at the logical row count. Regression test: a 2-populated / 3-zero-padding
fake — class 7 returns only [0,1], class 0 returns empty. 12/12 green, fmt+clippy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
AdaWorldAPI pushed a commit that referenced this pull request Jun 19, 2026
# Conflicts:
#	crates/lance-graph/src/graph/mailbox_scan.rs
AdaWorldAPI pushed a commit that referenced this pull request Jun 19, 2026
…_at on the real owner (codex P2 #545)

The Hamming-plane DistanceMeans returned None on the live MailboxSoA<N> because
its MailboxSoaView impl didn't override identity_plane_at (only the unit-test
fake did). The owner DOES carry the W1b content/topic/angle planes, so the
override maps Content/Topic/Angle -> content_row/topic_row/angle_row, guarded by
`populated` (a row beyond the logical size returns None, never a zero-padded
capacity row — same discipline as n_rows() / the #544 clamp). +1 test:
populated row returns the plane byte-identical to content_row; padding row None.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/board/EPIPHANIES.md:
- Around line 21-42: Update the incorrect ndarray module path reference on line
23 from `ndarray::simd::wht_f32` to the correct path
`ndarray::hpc::fft::wht_f32` to match the actual forked ndarray crate structure.
Additionally, clarify the phrase "Theorem-checker applied" to explicitly state
that it refers to mathematical validation through formal reasoning rather than
an automated tool, and reference the grounding theorems (such as Weyl's
inequality and Davis–Kahan) as documented in `perturbation-sim/METHODS.md` for
reproducibility.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8df73c80-d494-4b1f-b225-529836a2dd57

📥 Commits

Reviewing files that changed from the base of the PR and between e2c518e and 2e6d039.

📒 Files selected for processing (2)
  • .claude/board/EPIPHANIES.md
  • crates/lance-graph/src/graph/mailbox_scan.rs

Comment on lines +21 to +42
## 2026-06-18 — E-WHT-META-AWARENESS-AND-KRONECKER-LOOKUP — per-tenant WHT spectrum = cheap global meta-awareness (a few coefficients); and the Walsh transform tensor-factorizes along HHTL (H_16^⊗n) → "exponential lookup over prefixed tables" for separable factors

**Status:** TWO claims, graded separately (operator synthesis: "operationalize WHT picking any tenant over the standing wave sorted by any factor → meta-awareness; …use it for HHTL 16ⁿ exponential lookup over prefixed tables"). Theorem-checker applied. Grounded in `perturbation-sim::sketch::{fwht, walsh_pyramid_energy}`, `ndarray::simd::wht_f32`, bgz-tensor attention-as-table-lookup + HHTL cascade, OGAR "Bipolar-phase pyramid = Walsh-Hadamard". Capstone of the geometry-of-a-node arc.

**Claim 1 — per-tenant WHT spectrum = global meta-awareness. [G as descriptor; framing as "awareness"].** `walsh_pyramid_energy` already gives Walsh energy per dyadic level + coarse fraction. So one FWHT of a tenant field → a **few-coefficient summary of all 16K rows**: coarse-dominant = smooth/clustered/predictable (low surprise), fine-spread = scattered (high surprise). For all tenants (hhtl/helix/energy/content/topic/angle) it's a small fixed "meta-fingerprint" panel, transform-once, reusable, **cheap** — and it is a **free-energy proxy** (spectral concentration = field predictability = the workspace's self-monitoring/awareness signal, per the active-inference loop). Sound and valuable.

**Claim 2 — WHT for HHTL 16ⁿ "exponential lookup over prefixed tables". [G for the mechanism; H for "any factor"].**
- **[G]** The Walsh-Hadamard matrix **tensor-factorizes along the nibble hierarchy**: `H_{16ⁿ} = H₁₆ ⊗ H₁₆ ⊗ … (n)` (Kronecker; the FWHT butterfly, radix-16). The HHTL nibble tree IS that tensor structure (`E-PANCAKES-IS-RADIX-IS-HHTL`).
- **[G]** Therefore a Walsh-domain operation over a 16ᵏ prefix-subtree decomposes into **k stages of 16-point table ops** — the per-level 16-entry tables are the "prefixed tables," composed by the tensor product → **O(k·16) vs O(16ᵏ)** for a **separable** factor (the bgz-tensor "attention as table lookup" + HHTL cascade, seen as the Kronecker factorization). "Top gaussian preserved level-to-level" (canon Parseval) is the condition making the common factor separable.
- **[H] Caveat (do NOT overclaim):** the exponential saving holds only for factors **low-sequency / separable** in the Walsh basis; an arbitrary high-sequency factor still has to touch the leaves. And a full FWHT of the whole field is O(N log N), not sub-linear — the exponential win is the **per-level table composition over a prefix subtree**, not the full transform.
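
The separable case in the bullets above can be sketched as follows. Assume a field that is a product of one 16-entry table per nibble level, `f(i) = g_0(i_0) · g_1(i_1) · …`. Because the natural-order Walsh kernel `(-1)^popcount(k & i)` splits nibble by nibble, each Walsh coefficient is the product of the per-level 16-point transforms indexed by the nibbles of `k`. All names here (`wht16`, `separable_coeff`, `direct_coeff`) are hypothetical illustrations, not functions from bgz-tensor or the HHTL cascade.

```rust
/// 16-point unnormalized WHT (natural order) of one per-level table.
/// Hypothetical helper for illustration.
fn wht16(table: [f32; 16]) -> [f32; 16] {
    let mut t = table;
    let mut h = 1;
    while h < 16 {
        for block in (0..16).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (t[i], t[i + h]);
                t[i] = a + b;
                t[i + h] = a - b;
            }
        }
        h *= 2;
    }
    t
}

/// Walsh coefficient `k` of a separable field: one lookup per nibble level
/// into the already-transformed tables, then a product over the levels.
fn separable_coeff(spectra: &[[f32; 16]], k: usize) -> f32 {
    spectra
        .iter()
        .enumerate()
        .map(|(level, s)| s[(k >> (4 * level)) & 0xF])
        .product()
}

/// Reference path: visit every leaf of the 16^n field for one coefficient.
fn direct_coeff(tables: &[[f32; 16]], k: usize) -> f32 {
    let n = 1usize << (4 * tables.len());
    (0..n)
        .map(|i| {
            // Field value at row i: product of per-level table entries.
            let value: f32 = tables
                .iter()
                .enumerate()
                .map(|(level, t)| t[(i >> (4 * level)) & 0xF])
                .product();
            // Natural-order Walsh sign: parity of the shared set bits.
            if (k & i).count_ones() % 2 == 0 { value } else { -value }
        })
        .sum()
}
```

With `n` levels, the factored path transforms `n` tables of 16 entries once and then answers any coefficient with `n` lookups, whereas the reference path touches all `16^n` leaves. Consistent with the caveat above, the shortcut exists only because the field was assumed separable; a non-separable field has no per-level tables to compose.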

**The capstone — five threads are ONE structure (Walsh-Kronecker factorization along the HHTL nibble hierarchy):**

| Thread | facet of the Kronecker/HHTL structure |
|---|---|
| `E-PANCAKES` (radix≡HHTL≡panCAKES) | the tensor **index** (n nibble levels) |
| `E-ORTHOGONAL-BUNDLE` (90°→WHT) | the tensor **transform** (H₁₆^⊗ⁿ) |
| `E-TENANT-ANGLE` CAM-PQ ADC | the tensor **tables** (prefixed 16-point distance tables) |
| Claim 1 (per-tenant spectrum) | the tensor **summary** (pyramid energy = meta-awareness) |
| Claim 2 (16ⁿ prefix lookup) | the tensor **composition** (k stages, not 16ᵏ) |

Cross-refs: `E-ORTHOGONAL-BUNDLE-IS-WHT-READOUT`, `E-TENANT-ANGLE-RANK-IS-CAM-PQ-ADC`, `E-PANCAKES-IS-RADIX-IS-HHTL`, `E-CLAM-IS-THE-MANIFOLD-ENGINE`, `ndarray::simd::wht_f32`, `perturbation-sim::sketch`, bgz-tensor attention-as-lookup.


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify Cargo.toml dependencies for perturbation-sim, bgz-tensor, and ndarray version.

# Check for direct dependencies:
rg -A 2 'perturbation-sim|bgz-tensor' 'Cargo.toml' --type toml

# Check ndarray version:
rg 'ndarray' 'Cargo.toml' --type toml

# Check if perturbation-sim and bgz-tensor are internal crates:
fd -t f 'Cargo.toml' crates/ | xargs rg -l 'perturbation-sim|bgz-tensor'

Repository: AdaWorldAPI/lance-graph

Length of output: 2532


🏁 Script executed:

# Check if perturbation-sim::sketch module exists with the claimed functions
find crates/perturbation-sim -name "*.rs" | head -20

Repository: AdaWorldAPI/lance-graph

Length of output: 1010


🏁 Script executed:

# Search for fwht and walsh_pyramid_energy functions
rg -i "fwht|walsh_pyramid_energy" crates/perturbation-sim/ --type rs

Repository: AdaWorldAPI/lance-graph

Length of output: 94


🏁 Script executed:

# Search in ndarray fork for wht_f32 (check local ndarray path if available)
rg -i "wht_f32|walsh.*hadamard" . --type rs -g "*/ndarray/*" -g "*/simd*" 2>/dev/null | head -30

Repository: AdaWorldAPI/lance-graph

Length of output: 49


🏁 Script executed:

# Check bgz-tensor for attention-as-lookup references
rg -i "attention|lookup|table" crates/bgz-tensor/ --type rs | head -20

Repository: AdaWorldAPI/lance-graph

Length of output: 94


🏁 Script executed:

# Search for "Theorem-checker" documentation
rg -i "theorem.?checker" . --type md --type txt 2>/dev/null

Repository: AdaWorldAPI/lance-graph

Length of output: 49


🏁 Script executed:

# Search for fwht and walsh_pyramid_energy without type flag
rg "fwht|walsh_pyramid_energy" crates/perturbation-sim/ --max-count=10

Repository: AdaWorldAPI/lance-graph

Length of output: 2404


🏁 Script executed:

# Check the main lib.rs structure in perturbation-sim
head -100 crates/perturbation-sim/src/lib.rs

Repository: AdaWorldAPI/lance-graph

Length of output: 4528


🏁 Script executed:

# Search for ndarray simd wht references in the full repo
rg "wht_f32|simd.*walsh" . --max-count=10

Repository: AdaWorldAPI/lance-graph

Length of output: 3097


🏁 Script executed:

# Look for bgz-tensor main module
head -100 crates/bgz-tensor/src/lib.rs 2>/dev/null || echo "File not found"

Repository: AdaWorldAPI/lance-graph

Length of output: 3380


🏁 Script executed:

# Search for Theorem-checker in markdown and docs
rg "Theorem" . -t md 2>/dev/null | head -20

Repository: AdaWorldAPI/lance-graph

Length of output: 886


Correct the ndarray module path reference.

The claim's grounding in `perturbation-sim::sketch::{fwht, walsh_pyramid_energy}` and bgz-tensor is sound: both crates exist with real implementations of Walsh-Hadamard transforms and attention-as-table-lookup. However, line 23 incorrectly references `ndarray::simd::wht_f32`; the actual module path in the forked ndarray crate is `ndarray::hpc::fft::wht_f32`. Update the reference to match the codebase (see `crates/perturbation-sim/src/sketch.rs` and `crates/bgz-tensor/src/adaptive_codec.rs` for usage).

Also clarify that "Theorem-checker applied" refers to mathematical validation through formal reasoning, not an automated tool. If reproducibility is a concern, document where the grounding theorems (Weyl's inequality, Davis–Kahan, etc., mentioned in `perturbation-sim/METHODS.md`) are formally stated.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/board/EPIPHANIES.md around lines 21 - 42, Update the incorrect
ndarray module path reference on line 23 from `ndarray::simd::wht_f32` to the
correct path `ndarray::hpc::fft::wht_f32` to match the actual forked ndarray
crate structure. Additionally, clarify the phrase "Theorem-checker applied" to
explicitly state that it refers to mathematical validation through formal
reasoning rather than an automated tool, and reference the grounding theorems
(such as Weyl's inequality and Davis–Kahan) as documented in
`perturbation-sim/METHODS.md` for reproducibility.

@AdaWorldAPI AdaWorldAPI merged commit e07e823 into main Jun 19, 2026
6 checks passed
AdaWorldAPI added a commit that referenced this pull request Jun 19, 2026
feat(graph): Hamming-plane distance — the first value-tier DistanceMeans (costed, stacked on #544)
AdaWorldAPI pushed a commit that referenced this pull request Jun 19, 2026
…ing gates F1-F5

Companion to docs/OGAR_AR_SHAPE_ENDGAME.md (the operator-ratified doctrine).
Each Inc promotes one CONJECTURE from §12 to FINDING (or demotes it on
falsifier).

Inc 1 — ClassView::policies typed THINK slot. F1: one ThinkSpec returns
Verdict::Reject on inconsistent row, ProceedAs on consistent. ~4 files.
Inc 2 — OgarAst enum + TripletProjection round-trip. F2: roundtrip_eq
green on 4-variant input. ~3 files.
Inc 3 — ArmDecision + Executor::{NativeLance, SurrealAst} stubs. F3:
same OgarAst::Do routed via 2 executors → same state_after. ~5 files.
Inc 4 — Curator promotion probe (≥2-curator rule mechanized). F4: ≥4
primitives surface under ≥2 curators with different syntactic forms.
~5 files.
Inc 5 — Litmus probe: same OgarAst::Do(PostInvoice, ...) ≡ across 4
executors. F5: semantic identity green; crown-detection sub-probe fails
with the named error message format. ~6 files.

Gates ordered F1 → F2 → F3 → F4 (parallel-OK) → F5.

This PR is the PLAN itself; no code lands here. 5+3 council runs pre-merge
on the plan; the gates above are PLAN-RATIFIED or REJECT-WITH-REASONS per
Inc. Each Inc opens its own PR off the doctrine branch post-ratification.

Cross-plan integration table (§4) ties this plan to
cypher-kanban-ast-unification-v1, lite-unified-surrealql-lance-v1,
probe-excel-compute-dag-v1, E-AR-PROJECTION-CORRECTION-1 Phase 1/2.

Open council questions §9: THINK slot ownership (ClassView vs sibling
ClassPolicies); BindingSet zero-dep tradeoff; crown-detection error
message format lock; F4 corpus surveyability (vendor vs zipball);
cross-plan ClassView layout collision.

Branch rebased onto origin/main (post-#544/#545 merge) — 5 commits ahead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Xzyc27Nx3f8WC5KzwfWfjx