Skip to content

feat(helix): golden-spiral Place/Residue codec (zero-dep + optional ndarray-hpc)#459

Merged
AdaWorldAPI merged 2 commits into
mainfrom
claude/gallant-rubin-Y9pQd
Jun 3, 2026
Merged

feat(helix): golden-spiral Place/Residue codec (zero-dep + optional ndarray-hpc)#459
AdaWorldAPI merged 2 commits into
mainfrom
claude/gallant-rubin-Y9pQd

Conversation

@AdaWorldAPI

@AdaWorldAPI AdaWorldAPI commented Jun 3, 2026

Copy link
Copy Markdown
Owner

What

New standalone crate crates/helix — the golden-spiral Place/Residue codec specified in crates/helix/KNOWLEDGE.md. HHTL is the deterministic PLACE (trie address — where); helix is the orthogonal RESIDUE (the hemispheric edge at that place). It is the discrete 2-D companion to jc::weyl.

Pipeline (the object speaks for itself)

ResidueEncoder::encode(place, n) runs four stages and returns a 3-byte ResidueEdge:

  1. Placement — equal-area √u hemisphere lift (HemispherePoint: r=√u, Y=√(1−r²), azimuth n·φ). Spherical equal-area, not hyperbolic.
  2. Coupling — stride-4-over-17 curve-ruler arc from the HHTL place offset (CurveRuler; gcd(4,17)=1 → full permutation).
  3. Fisher-Z alignmentarctanh (= hyperbolic depth ρ=2·arctanh(r), one function/two meanings) (Similarity).
  4. Euler hand-off + quantise(z·STRIDE) + γ·(rank/N − ln 17) → 256-palette RollingFloor (occupancy-drift detection + floor-version stamp).

Distance is metric-safe L1 over a 256×256 LUT (DistanceLut / ResidueEdge::distance_adaptive, triangle-inequality regression-tested); distance_heuristic is a non-metric byte-Hamming pre-filter (layer discipline inherited from bgz17). prove() closes the spec's 2-D-discrepancy open item — the golden-spiral hemisphere companion to jc::weyl (D*_φ=0.00160 < D*_ctrl=0.00252 at N=1597, a Fibonacci N).

Build & verification

  • Zero-dep default (edition 2021, empty [workspace], added to root Cargo.toml exclude; verified via cargo test --manifest-path crates/helix/Cargo.toml).
  • Optional ndarray-hpc feature routes batch Fisher-Z through ndarray::simd::simd_ln_f32 (F32x16 lanes); batch_fisher_z_matches_scalar_reference confirms bit-equivalence to the scalar path.
  • 61 unit + 6 doctests pass on both feature configs; clippy -D warnings + rustfmt clean.

Honest overlap (please read before flagging duplication)

A pre-build placement check (encoding-ecosystem.md) found ~80% of this pipeline already exists, some of it certified ρ≥0.999: the Fisher-Z→i8 quantiser (bgz-tensor::projection::Base17Fz / fisher_z::FamilyGamma), the golden-spiral azimuth proof (jc::weyl), the stride-4 zipper (thinking-engine::reencode_safety), the EULER_GAMMA hand-off (jc::precond), and the 256-palette/L1 endpoints (bgz17::palette). This crate is a deliberate, user-ratified zero-dep clean-room re-derivation (self-contained, regenerable-from-template). The genuinely new pieces are the equal-area √u hemisphere placement and the PLACE/RESIDUE doctrine. The overlap and a concrete consolidation path are documented in KNOWLEDGE.md § Overlap & Consolidation and tracked as TD-HELIX-OVERLAP-1. Owed before promotion past clean-room status: a fidelity-vs-ground-truth probe (naive-u8 floor gate ≥0.9980 Pearson — currently CONJECTURE, NOT RUN).

Corrections folded in

  • The spec referenced const::simd::{GOLDEN_RATIO,EULER_GAMMA,E} — that path does not exist; canonical is std::f64::consts (ndarray does not wrap them). helix uses local consts (E sourced from std; GOLDEN_RATIO/EULER_GAMMA as literals for toolchain-robustness, mirroring jc::weyl's local PHI_INV).
  • Golden-step iron rule respected: stride-4-over-17 is coprime (full permutation) — the banned Fibonacci-mod-17 (misses {6,7,10,11}) is not used.

Board hygiene (same commit)

LATEST_STATE + STATUS_BOARD (D-HELIX-1) + EPIPHANIES (E-HELIX-OVERLAP) + TECH_DEBT (TD-HELIX-OVERLAP-1) + AGENT_LOG.

Files

14 crate files (10 src/*.rs, KNOWLEDGE.md, Cargo.toml, .gitignore, examples/prove_residue.rs) + 5 board files + 1 line in root Cargo.toml. Rebased onto main post-#458 (no conflicts — disjoint file sets).

Built via an autoattended wave: 5 read-only research agents + 4 parallel Sonnet leaf workers (placement / fisher_z / quantize / prove, edit-only) + central consolidation.

https://claude.ai/code/session_013rjF2Dvo1DnBACpbpYSffE


Generated by Claude Code

Summary by CodeRabbit

  • New Features

    • New standalone helix codec: golden-spiral Place/Residue encoding with optional high-performance feature path, metric-safe distance lookup, adaptive quantization, and a proof/report harness.
    • Example executable demonstrating encoding, distance computation, and proof output.
  • Documentation

    • Comprehensive specification and knowledge docs describing design, invariants, and consolidation notes.
  • Tests

    • Extensive unit/doctest coverage validating encoding, quantization, distance, and numerical transforms.
  • Chores

    • Crate added to workspace with default zero-dep build and optional accelerator feature.

…darray-hpc)

New standalone crate crates/helix implementing the Place/Residue encoding from
KNOWLEDGE.md: HHTL is the deterministic PLACE, helix is the orthogonal RESIDUE.
Pipeline: equal-area sqrt(u) hemisphere placement (HemispherePoint) ->
stride-4-over-17 curve-ruler coupling (CurveRuler) -> Fisher-Z/arctanh alignment
(Similarity) -> EULER_GAMMA hand-off -> 256-palette rolling-floor quantise
(RollingFloor; occupancy-drift + floor-version stamp) -> 3-byte ResidueEdge
endpoint pair. Distance is metric-safe L1 via a 256x256 LUT
(DistanceLut::distance_adaptive); distance_heuristic is a non-metric byte-Hamming
pre-filter (layer discipline inherited from bgz17). prove() closes the 2-D
discrepancy Open Item -- the golden-spiral hemisphere companion to jc::weyl
(D*_phi < D*_ctrl at N=1597).

Default build is zero-dependency (edition 2021, empty [workspace], added to the
root Cargo.toml exclude list, verified via --manifest-path). The optional
ndarray-hpc feature routes batch Fisher-Z through ndarray::simd::simd_ln_f32 and
is bit-equivalent to the scalar path.

61 unit + 6 doctests pass on both feature configs; clippy -D warnings and
rustfmt clean.

Built via an autoattended wave: 5 read-only research agents + 4 parallel leaf
workers (placement/fisher_z/quantize/prove) + central consolidation.

Honest overlap: ~80% of the pipeline re-derives existing (in places CERTIFIED
rho>=0.999) primitives -- bgz-tensor Base17Fz/FamilyGamma, jc::weyl, the
stride-4 zipper, the EULER_GAMMA hand-off, the bgz17 palette. Shipped as a
user-directed clean-room substrate; the overlap and a consolidation path are
documented in crates/helix/KNOWLEDGE.md and TD-HELIX-OVERLAP-1. The genuinely
new pieces are the sqrt(u) equal-area hemisphere placement and the PLACE/RESIDUE
doctrine.

Board hygiene (same commit): LATEST_STATE, STATUS_BOARD (D-HELIX-1), EPIPHANIES
(E-HELIX-OVERLAP), TECH_DEBT (TD-HELIX-OVERLAP-1), AGENT_LOG.

https://claude.ai/code/session_013rjF2Dvo1DnBACpbpYSffE
@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a new standalone crate crates/helix implementing a deterministic golden-spiral Place/Residue codec (placement → curve‑ruler → Fisher‑Z → rolling‑floor quantization), metric-safe distance LUTs, residue encoder/serialization, a 2‑D discrepancy proof harness, optional ndarray‑accelerated SIMD paths, workspace integration, and board documentation updates.

Changes

Place/Residue Codec Implementation

Layer / File(s) Summary
Board & workspace integration
.claude/board/AGENT_LOG.md, .claude/board/EPIPHANIES.md, .claude/board/LATEST_STATE.md, .claude/board/STATUS_BOARD.md, .claude/board/TECH_DEBT.md, Cargo.toml
Board logs and status entries record D-HELIX-1 as shipped and track overlap/tech-debt; root Cargo.toml adds crates/helix to workspace members.
Crate manifest, ignore, lib root & KNOWLEDGE
crates/helix/.gitignore, crates/helix/Cargo.toml, crates/helix/src/lib.rs, crates/helix/KNOWLEDGE.md
Crate manifest (zero-dep default, optional ndarray-hpc feature), .gitignore, crate root with public module exports, and comprehensive spec mapping pipeline stages to modules and consolidation notes.
Mathematical primitives
crates/helix/src/constants.rs, crates/helix/src/curve_ruler.rs, crates/helix/src/fisher_z.rs, crates/helix/src/placement.rs
Exported constants, CurveRuler stride/mod mapping and APIs, Similarity/Fisher‑Z transforms with clamping, and HemispherePoint equal-area placement with tests.
Quantization foundation
crates/helix/src/quantize.rs
RollingFloor adaptive 256-bucket quantizer: quantize/observe/drift_score/roll with inertia gliding, versioning, bucket centers, and tests.
Metric-safe distance
crates/helix/src/distance.rs
DistanceLut builders: linear() and from_floor(), O(1) lookup, normalization/clamping, and metric-safety tests.
Residue codec core
crates/helix/src/residue.rs
ResidueEdge 3-byte serializable endpoints with metric and heuristic distance methods; ResidueEncoder composing placement→curve→Fisher‑Z→quantize, with encode/observe/roll APIs and tests.
Validation, SIMD & example
crates/helix/src/prove.rs, crates/helix/src/simd.rs, crates/helix/examples/prove_residue.rs
prove() compares golden-spiral vs control discrepancy (N=1597, grid=64), SIMD-optional batch_fisher_z and batch_l1_u8 with scalar fallbacks, and an example demonstrating encode/report and adaptive distances.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • AdaWorldAPI/lance-graph#425: Workspace [patch.crates-io].ndarray override may affect which ndarray version the ndarray-hpc feature resolves to.

Poem

🐰 I spun a spiral, seed by seed,
From hemisphere to quantized bead,
Fisher‑Z nudged values true,
The floor rolled tight, the LUT then knew,
A tiny rabbit cheers: encode and read!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: introduction of a golden-spiral Place/Residue codec crate with zero-dependency default and optional ndarray-hpc feature support.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/gallant-rubin-Y9pQd

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/helix/src/placement.rs`:
- Around line 76-83: The lift function can return NaN for y when n >= total
because u = (n+0.5)/total may exceed 1; update lift (the pub fn lift(n: usize,
total: usize) -> Self) to guard against out-of-range n by clamping either n (to
total - 1) or clamping u to the [0.0,1.0] range before computing y and r, and
ensure azimuth is computed consistently; replace the current direct computation
with the clamp so callers never get NaN y (or add an explicit documented panic
if you prefer rejecting bad input).

In `@crates/helix/src/simd.rs`:
- Line 15: The EPS constant is too small for f32 and the clamp in simd.rs
becomes a no-op; change the const EPS: f32 to an f32-representable epsilon (e.g.
f32::EPSILON or a small f32 like 1.1920929e-7_f32 or 1e-6_f32) so clamping in
the SIMD routines actually excludes ±1, update any comments to reflect the new
value, and add tests that call the SIMD similarity/Fisher-Z paths with s = 1.0,
s = -1.0 and out-of-range inputs asserting that the outputs (out[i]) are finite
(is_finite()) to prevent regressions.
- Around line 68-87: The batch_l1_u8 function currently depends on
ndarray::simd::U8x64 which is not present in published ndarray 0.16.1; either
vendor the intended ../../../ndarray fork that actually exposes ndarray::simd
(so Cargo.toml points to that local path and the repo includes it) or change
batch_l1_u8 to use a supported SIMD API (e.g. std::simd::Simd<u8, 64> /
portable_simd) and update the imports and loop to construct Simd vectors and
lanes accordingly; ensure the feature flag "ndarray-hpc" is updated to gate the
correct dependency (or renamed) so the code compiles with the actual dependency
providing U8x64, and update the function signature and uses of U8x64 in
batch_l1_u8 to match the chosen SIMD type.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5a6cd8d1-8474-484b-886d-e406b622ee78

📥 Commits

Reviewing files that changed from the base of the PR and between 41273f3 and 4b70c8a.

📒 Files selected for processing (20)
  • .claude/board/AGENT_LOG.md
  • .claude/board/EPIPHANIES.md
  • .claude/board/LATEST_STATE.md
  • .claude/board/STATUS_BOARD.md
  • .claude/board/TECH_DEBT.md
  • Cargo.toml
  • crates/helix/.gitignore
  • crates/helix/Cargo.toml
  • crates/helix/KNOWLEDGE.md
  • crates/helix/examples/prove_residue.rs
  • crates/helix/src/constants.rs
  • crates/helix/src/curve_ruler.rs
  • crates/helix/src/distance.rs
  • crates/helix/src/fisher_z.rs
  • crates/helix/src/lib.rs
  • crates/helix/src/placement.rs
  • crates/helix/src/prove.rs
  • crates/helix/src/quantize.rs
  • crates/helix/src/residue.rs
  • crates/helix/src/simd.rs

Comment thread crates/helix/src/placement.rs
Comment thread crates/helix/src/simd.rs Outdated
Comment thread crates/helix/src/simd.rs
Comment on lines +68 to +87
#[cfg(feature = "ndarray-hpc")]
pub fn batch_l1_u8(a: &[u8], b: &[u8], out: &mut [u16]) {
use ndarray::simd::U8x64;
assert_eq!(a.len(), b.len(), "batch_l1_u8: length mismatch");
assert_eq!(a.len(), out.len(), "batch_l1_u8: length mismatch");
let n = a.len();
let mut i = 0;
while i + 64 <= n {
let av = U8x64::from_slice(&a[i..i + 64]).to_array();
let bv = U8x64::from_slice(&b[i..i + 64]).to_array();
for l in 0..64 {
out[i + l] = (av[l] as i32 - bv[l] as i32).unsigned_abs() as u16;
}
i += 64;
}
while i < n {
out[i] = (a[i] as i32 - b[i] as i32).unsigned_abs() as u16;
i += 1;
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate the ndarray simd module and confirm the symbols used here exist.
fd -t f -e rs . | xargs rg -nP '\b(simd_ln_f32|F32x16|U8x64)\b' -g '!crates/helix/**'

Repository: AdaWorldAPI/lance-graph

Length of output: 12122


🏁 Script executed:

#!/bin/bash
# Show pinned ndarray version + relevant features for the ndarray-hpc build.
rg -n 'ndarray\s*=' crates/helix/Cargo.toml
rg -n 'ndarray-hpc' crates/helix/Cargo.toml

Repository: AdaWorldAPI/lance-graph

Length of output: 206


🏁 Script executed:

#!/bin/bash
# Confirm U8x64 + its needed API in the pinned local ndarray crate.
# (ndarray path is ../../../ndarray from crates/helix)

set -euo pipefail

ND_DIR="$(realpath crates/helix/../../../ndarray)"

if [ ! -d "$ND_DIR" ]; then
  echo "Missing ndarray dir at: $ND_DIR" >&2
  exit 1
fi

# Locate U8x64 definition and relevant methods.
rg -n --no-heading -S 'U8x64' "$ND_DIR" -g'*.rs' | head -n 80

# Find likely constructors/conversions for U8x64.
rg -n --no-heading -S 'U8x64::(from_slice|from_array)|fn (from_slice|from_array).*U8x64|to_array\(\)' "$ND_DIR" -g'*.rs' | head -n 80

# (Optional) confirm simd_ln_f32/F32x16 presence; batch_l1_u8 itself doesn't use them.
rg -n --no-heading -S '(simd_ln_f32|F32x16)' "$ND_DIR" -g'*.rs' | head -n 80

Repository: AdaWorldAPI/lance-graph

Length of output: 116


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Try the Cargo.toml-relative path first (no realpath available in sandbox)
CAND1="crates/helix/../../../ndarray"
if [ -d "$CAND1" ]; then
  echo "FOUND ndarray at: $CAND1"
  ND_DIR="$CAND1"
else
  echo "CAND1 missing: $CAND1" >&2
  ND_DIR=""
fi

# 2) Fallback: locate an ndarray crate directory within the repo
if [ -z "${ND_DIR:-}" ]; then
  # Look for a directory literally named ndarray (case-sensitive) within repo root.
  FOUND_DIRS="$(fd -td ndarray . | head -n 20 || true)"
  if [ -n "$FOUND_DIRS" ]; then
    ND_DIR="$(echo "$FOUND_DIRS" | head -n 1)"
    echo "FOUND ndarray via fd: $ND_DIR"
  else
    echo "Could not locate ndarray directory inside repository." >&2
    exit 1
  fi
fi

echo "Using ND_DIR=$ND_DIR" >&2

# 3) Find the U8x64 definition and methods relevant to crates/helix/src/simd.rs::batch_l1_u8
echo "---- U8x64 definition hits (first 60) ----"
rg -n --no-heading -S 'U8x64' "$ND_DIR" -g'*.rs' | head -n 60

echo "---- U8x64 impl hits (first 80) ----"
rg -n --no-heading -S 'impl[^\n]*U8x64' "$ND_DIR" -g'*.rs' | head -n 80

echo "---- from_slice / to_array hits (first 80) ----"
rg -n --no-heading -S '(from_slice|to_array)' "$ND_DIR" -g'*.rs' | rg -n 'U8x64|f32|u8' -m 80 || true

echo "---- simd_ln_f32 / F32x16 presence (informational) ----"
rg -n --no-heading -S '(simd_ln_f32|F32x16)' "$ND_DIR" -g'*.rs' | head -n 40 || true

Repository: AdaWorldAPI/lance-graph

Length of output: 166


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Try to find the ndarray version actually locked for this repo
# 1) Look for workspace Cargo.lock
if [ -f Cargo.lock ]; then
  LOCK="Cargo.lock"
elif [ -f crates/Cargo.lock ]; then
  LOCK="crates/Cargo.lock"
else
  echo "No Cargo.lock found at repo root or crates/." >&2
  exit 0
fi

echo "Using lockfile: $LOCK" >&2

# 2) Extract ndarray entry (name/version) and ndarray-hpc-related features if present
rg -n 'name = "ndarray"$' "$LOCK" -n | head -n 20 || true

# Print the ndarray package block around the match (limited)
python3 - <<'PY'
import re, itertools, pathlib
lock = pathlib.Path("Cargo.lock")
if not lock.exists():
    lock = pathlib.Path("crates/Cargo.lock")
text = lock.read_text()
m = re.search(r'\[\[package\]\]\nname = "ndarray"\nversion = "([^"]+)"', text)
if m:
    print("ndarray_version=", m.group(1))
else:
    print("ndarray_version=not_found")
PY

# 3) Also see if U8x64 appears anywhere else (sanity)
rg -n '\bU8x64\b' crates -S | head -n 40

Repository: AdaWorldAPI/lance-graph

Length of output: 2051


🌐 Web query:

ndarray 0.16.1 ndarray::simd U8x64 from_slice to_array

💡 Result:

In ndarray 0.16.1 there is no API named ndarray::simd::U8x64 (or U8x64 from_slice to_array)—the U8x64 SIMD type you’re asking about is from Rust’s SIMD APIs (std::simd / core::simd), not from ndarray. What to use instead (Rust SIMD): 1) Build a u8x64 SIMD vector from a slice with u8x64::from_slice(slice). - from_slice converts a slice to a SIMD vector containing slice[..N] and panics if the slice length is less than Simd::N (for u8x64, N = 64). [1][2] 2) Convert a SIMD vector back to a [T; N] array with to_array(). - to_array(self) -> [T; N] converts a SIMD vector to an array. [1] Example: - let v = std::simd::u8x64::from_slice(&bytes); - let arr: [u8; 64] = v.to_array(); Grounding for the SIMD methods/types: - u8x64 is defined as Simd<u8, 64> in std::simd. [3] - from_slice and to_array are documented on the u8x64 SIMD type. [1] Grounding that this is unrelated to ndarray: - The ndarray 0.16.1 docs describe n-dimensional arrays, slicing/views, ArrayBase, etc., but do not document a ndarray::simd module or an U8x64 type. [4][5]

Citations:


🌐 Web query:

ndarray 0.16.1 ndarray::simd simd_ln_f32 F32x16

💡 Result:

ndarray 0.16.1 does not expose a public API named ndarray::simd::simd_ln_f32 (and there’s no readily citable documentation for a symbol with that exact name/signature). The only authoritative “SIMD + ln” evidence surfaced here is from other Rust SIMD libraries/types (e.g., a packed SIMD type f32x16 has an ln method in packed_simd_2), not from ndarray’s documented API.[1][2] What I can substantiate about the pieces in your query: - ndarray v0.16.1 exists and is an n-dimensional array crate, but the docs overview do not list ndarray::simd or a simd_ln_f32 entry point.[1][3] - core/portable SIMD f32x16 is a nightly experimental SIMD vector type with 16 f32 lanes.[4] - A concrete SIMD type f32x16 from packed_simd_2 does implement ln(self) -> Self.[2] If you tell me where you saw ndarray::simd simd_ln_f32 F32x16 (e.g., a stack trace, a specific git commit, or a file path/line number), I can re-search specifically for that symbol in the ndarray 0.16.1 source tree and confirm whether it’s internal, behind a feature, or from a different crate/module.

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

ND_DIR="crates/helix/../../../ndarray"
if [ ! -d "$ND_DIR" ]; then
  echo "Missing local ndarray path: $ND_DIR" >&2
  exit 1
fi

# Confirm ndarray exports the symbols your helix code imports
rg -n 'pub\s+mod\s+simd' "$ND_DIR" -S || true
rg -n '\b(U8x64|F32x16|simd_ln_f32)\b' "$ND_DIR" -S --glob '*.rs'

Repository: AdaWorldAPI/lance-graph

Length of output: 125


Fix the ndarray::simd::{U8x64,…} assumptions behind ndarray-hpc in Helix.

crates/helix/Cargo.toml wires ndarray-hpc = ["dep:ndarray"] to a local-path ndarray (path = "../../../ndarray"), but in this workspace Cargo.lock resolves ndarray to 0.16.1. Public info for ndarray 0.16.1 does not expose ndarray::simd::U8x64 / simd_ln_f32, while crates/helix/src/simd.rs (batch_l1_u8, lines 68-87) directly uses ndarray::simd::U8x64.

  • Ensure the repository actually includes the intended ../../../ndarray fork/vendor that provides these ndarray::simd APIs for the ndarray-hpc feature; otherwise update the imports/implementation to use std::simd (or gate the function to a documented dependency that truly provides U8x64).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/helix/src/simd.rs` around lines 68 - 87, The batch_l1_u8 function
currently depends on ndarray::simd::U8x64 which is not present in published
ndarray 0.16.1; either vendor the intended ../../../ndarray fork that actually
exposes ndarray::simd (so Cargo.toml points to that local path and the repo
includes it) or change batch_l1_u8 to use a supported SIMD API (e.g.
std::simd::Simd<u8, 64> / portable_simd) and update the imports and loop to
construct Simd vectors and lanes accordingly; ensure the feature flag
"ndarray-hpc" is updated to gate the correct dependency (or renamed) so the code
compiles with the actual dependency providing U8x64, and update the function
signature and uses of U8x64 in batch_l1_u8 to match the chosen SIMD type.

Two valid review findings on PR #459:

1. placement.rs: HemispherePoint::lift produced NaN for n >= total
   (u > 1 -> sqrt(1-u) = NaN). lift is public API, so an external caller
   got a silent NaN. Clamp u to <= 1.0 so out-of-range n saturates to the
   rim (r=1, y=0), matching the documented no-NaN contract. Added the
   lift_out_of_range_n_saturates_to_rim test.

2. simd.rs: the batch Fisher-Z clamp epsilon 1e-9 is below the f32 ULP
   near 1.0 (~1.19e-7), so 1.0 - 1e-9 rounded to 1.0f32 and the clamp was
   a no-op -- s = +/-1 produced ln(0) = -inf. Bumped EPS to 1e-6 (the f64
   Similarity::fisher_z keeps 1e-9). Added the
   batch_fisher_z_boundary_inputs_are_finite test (+/-1, out-of-range).

The third comment (ndarray::simd::U8x64 "missing") is a false positive:
the ndarray-hpc feature resolves ndarray to the AdaWorldAPI fork at
../../../ndarray (which provides ::simd), not crates.io ndarray 0.16.1 --
the feature builds and tests green. Added a Cargo.toml note documenting
the sibling-fork requirement.

63 unit + 6 doctests pass on both feature configs; clippy -D warnings and
rustfmt clean.

https://claude.ai/code/session_013rjF2Dvo1DnBACpbpYSffE

Copy link
Copy Markdown
Owner Author

Thanks for the review — addressed in 07c2fcd.

1. placement.rs — NaN for n >= total ✅ Fixed. lift now clamps u to ≤ 1.0, so out-of-range n saturates to the rim (r = 1, y = 0) instead of sqrt(negative) = NaN, matching the documented no-NaN contract. Added lift_out_of_range_n_saturates_to_rim.

2. simd.rsEPS = 1e-9 below f32 precision ✅ Fixed (good catch). 1e-9 is under the f32 ULP near 1.0 (≈1.19e-7), so 1.0 - EPS rounded to 1.0f32 and the clamp was a no-op — s = ±1 produced ln(0) = -inf. Bumped to EPS = 1e-6 (the f64 Similarity::fisher_z keeps 1e-9, which is fine in f64). Added batch_fisher_z_boundary_inputs_are_finite (±1 + out-of-range → all finite).

3. simd.rsndarray::simd::U8x64 "missing" (ndarray 0.16.1) — this one is a false positive. The ndarray-hpc feature pins ndarray = { path = "../../../ndarray" } — the AdaWorldAPI fork (the same one lance-graph core depends on), which does export ::simd::{U8x64, simd_ln_f32, F32x16}. Being a path dependency, it never resolves to crates.io ndarray 0.16.1 regardless of any lockfile. The feature builds and its tests pass green. I added a Cargo.toml note documenting that ndarray-hpc requires the sibling fork checked out (the default build is zero-dep and needs none of it).

All 63 unit + 6 doctests pass on both the default and --features ndarray-hpc builds; clippy -D warnings + rustfmt clean.


Generated by Claude Code

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/helix/src/placement.rs (1)

76-83: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix doc invariants around HemispherePoint::lift saturation (r can be 1)

  • lift intentionally clamps u to 1.0 when n >= total (see lift_out_of_range_n_saturates_to_rim and the lift argument docs), so r can be exactly 1.0; update the file-level invariants that currently state r ∈ [0, 1) / “u < 1 always holds” to apply only to the intended range n < total.
  • The downstream +inf risk is not present: Similarity::fisher_z clamps inputs to [-1+ε, 1-ε] before the arctanh ln-form, so even Similarity(1.0).fisher_z() stays finite.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/helix/src/placement.rs` around lines 76 - 83, The file-level
invariants incorrectly assert r ∈ [0,1) / "u < 1 always holds" while
HemispherePoint::lift intentionally clamps u to 1.0 when n >= total (see
lift_out_of_range_n_saturates_to_rim), so update the docs/invariants to state
the strict bound r < 1 only for the intended range n < total and acknowledge
that lift may produce r == 1 for n >= total; also note that this does not
introduce +inf risk because Similarity::fisher_z already clamps inputs to [-1+ε,
1-ε] before the arctanh/ln form.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/helix/src/placement.rs`:
- Around line 76-83: The file-level invariants incorrectly assert r ∈ [0,1) / "u
< 1 always holds" while HemispherePoint::lift intentionally clamps u to 1.0 when
n >= total (see lift_out_of_range_n_saturates_to_rim), so update the
docs/invariants to state the strict bound r < 1 only for the intended range n <
total and acknowledge that lift may produce r == 1 for n >= total; also note
that this does not introduce +inf risk because Similarity::fisher_z already
clamps inputs to [-1+ε, 1-ε] before the arctanh/ln form.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 23189277-4643-4c35-a2ae-2592f6ab37dd

📥 Commits

Reviewing files that changed from the base of the PR and between 4b70c8a and 07c2fcd.

📒 Files selected for processing (3)
  • crates/helix/Cargo.toml
  • crates/helix/src/placement.rs
  • crates/helix/src/simd.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/helix/Cargo.toml
  • crates/helix/src/simd.rs

@AdaWorldAPI AdaWorldAPI merged commit ef35ff1 into main Jun 3, 2026
6 checks passed
AdaWorldAPI pushed a commit that referenced this pull request Jun 4, 2026
Board-hygiene follow-up for the merged helix PR (#459, merge ef35ff1):
the PR-number-specific entries that can only be added once the PR has a
number and a merge commit.

- PR_ARC_INVENTORY: prepend the #459 arc entry (Added/Locked/Deferred/
  Docs/Confidence) for crates/helix.
- LATEST_STATE: add the #459 row to the "Recently Shipped PRs" table.

The in-commit board updates (LATEST_STATE blockquote, STATUS_BOARD
D-HELIX-1, EPIPHANIES E-HELIX-OVERLAP, TECH_DEBT TD-HELIX-OVERLAP-1,
AGENT_LOG) already landed with #459.

https://claude.ai/code/session_013rjF2Dvo1DnBACpbpYSffE
AdaWorldAPI added a commit that referenced this pull request Jun 4, 2026
docs(board): post-merge governance for #459 (helix PR_ARC + LATEST_STATE table row)
AdaWorldAPI pushed a commit that referenced this pull request Jun 4, 2026
…ray is mandatory")

Two converging directives:

1. codex P2 (#460): an optional `path` dependency does not make the default
   build self-contained — Cargo reads every dependency manifest (incl. optional)
   during resolution, so `ndarray = { path = "../../../ndarray", optional = true }`
   makes a clean checkout without the sibling fail before feature selection. The
   "default build needs no ndarray checkout" claim was false.
2. Directive: "ndarray is mandatory for lance-graph" (it is The Foundation).

Fix: ndarray is now a mandatory, non-optional GIT dependency:
  ndarray = { git = "https://github.com/AdaWorldAPI/ndarray.git",
              branch = "master", default-features = false, features = ["std"] }

- git source resolves the manifest remotely (no sibling checkout needed);
- non-optional, so the `ndarray-hpc` feature is removed;
- src/simd.rs is now single-impl (always ndarray::simd, no scalar fallback;
  ndarray does its own AVX-512/AVX2/scalar dispatch internally).

No import cycle: helix is standalone (own [workspace] + root exclude) and the
fork is self-contained (only internal subcrate path deps, no lance-graph
back-dependency). [patch] is the wrong tool here (the workspace's
[patch.crates-io] ndarray is separately ineffective: 0.17.2 can't satisfy the
lance-index crates.io 0.16.1).

Verified: 63 unit + 6 doctests green, clippy -D warnings + fmt clean (git source
patched to the local master checkout for the in-sandbox build; github fetch is
validated by CI).

Board: PR_ARC #459 Correction + E-HELIX-NDARRAY-MANDATORY + LATEST_STATE +
STATUS_BOARD D-HELIX-1.

https://claude.ai/code/session_013rjF2Dvo1DnBACpbpYSffE
AdaWorldAPI pushed a commit that referenced this pull request Jun 4, 2026
…lict D-HELIX-1 status

CodeRabbit flagged two board-governance issues on #463:
- PR_ARC_INVENTORY.md edited the #459 historical section outside the
  Confidence field. Relocate the correction into a properly prepended
  #463 entry (rule-5: a separate-PR reversal gets its own entry) and
  reduce the #459 touch to a Confidence-line forward-pointer.
- STATUS_BOARD.md D-HELIX-1 status mixed obsolete (zero-dep / 61 tests /
  --features ndarray-hpc) with current facts. Clean it to post-#463
  reality only (mandatory git dep, no feature, 63 unit + 6 doctests).

ndarray dependency source kept as git branch=master per user decision;
the Codex P2 (offline) and CodeRabbit (pin-rev) threads are recorded as
accepted trade-offs (wontfix) in the new #463 entry.

https://claude.ai/code/session_013rjF2Dvo1DnBACpbpYSffE
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants