From f886615c1ba6f4bd7036c2c0b1635e81bed659ca Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 5 Jun 2026 12:22:17 +0000 Subject: [PATCH 1/2] =?UTF-8?q?docs(cesium-osm):=20=C2=A711=20ADR-024=20ad?= =?UTF-8?q?option=20callout=20=E2=80=94=20D-OSM-2=20=CF=81-vs-reference=20?= =?UTF-8?q?contract?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the runtime-side follow-up commitment that **OGAR #39 ADR-024 § Consequences** explicitly names by reference: *"Reports ρ-vs-reference on first per-country PBF run per the runtime session's §11 follow-up commitment"*. Sibling to #474 (the codex P2 D-OSM-2 ownership-boundary fix landed by the other session). This PR is purely additive — appends a new **§11 ADR-024 adoption — palette256 + HHTL codec contract** section at the end of `cesium-osm-substrate-v1.md` without touching the existing §1-§10 or the §11 (NOT covered) section labelling (now §11 ADR-024 + §10 NOT covered before _End of plan_). ## What the new §11 pins 1. **Adoption checklist** (ADR-024's 4-step) mapped onto D-OSM-2: - Prefix = Cesium TMS quadkey `qk://` (Q2 ruling). - Palette domain = OSM tag-values per-tile (≤256 at zoom 21 per ADR-024 §256-ceiling escape hatches) + tile-local quantized coords (sub-cm at zoom 21). - ρ-vs-reference target ≥ 0.99 matching the `arm-discovery` aerial-codebook anchor (ρ = 0.9973 vs cosine). - Decode = const-table lookup; zero-allocation. 2. **Falsifiable adoption contract** for D-OSM-2: - Report empirical ρ on first per-country PBF run (default: Liechtenstein per §6 OQ-OSM-4). - Report per-tile palette cardinality distribution (mean / p95 / p99) — escape hatch selected from measurement. - Decode bandwidth target ≥ 10⁸/sec on AVX-512. - If ρ < 0.99 on first run, document gap before multi-country ingest (escalate to per-tile, hierarchical, or palette-64K). 3. **Cross-arc adopters table** — three shipped anchors (`arm-discovery`, `Binary16K` `_effectiveReaders`, `bgz-tensor` `WeightPalette`) + two queued adopters (`D-OSM-2`, `D-SPLAT-4` — both named in ADR-024 § Consequences). 4. **Why §11 not §3**: ADR-024 codec contract is independent of §3 carrier-shape contract (Q1 v1 fallback). The Arrow `List>` column shape holds whether or not the tag value is palette-encoded; palette is a transparent codec underneath the column. Keeping §11 separate preserves §3's "carrier shape" framing while pinning the codec contract. ## Why a separate small PR (not bundled with #474) #474 (other session) landed the codex P2 D-OSM-2 ownership- boundary fix as a 1-line minimal edit. The §11 ADR-024 callout is purely additive (no overlap with #474's diff). Keeping them as two PRs preserves the codex-fix vs proactive-architectural-pin separation — codex reviewers can pattern-match each PR's intent without bundling. ## Companion (separate small PR — also being filed) D-SPLAT-4 in `splat-native-ultrasound-v1.md` deserves the symmetric ADR-024 adoption note (ADR-024 § Consequences names both D-OSM-2 AND D-SPLAT-4 as queued adopters). That edit ships as `claude/splat-native-adr-024-callout` since the splat-native arc has its own branch lineage and its own #471/#472 PR history. ## Test plan - [x] Docs/plan only — no source code; no build/test invoked. - [x] §11 cites ADR-024 by file:line (`OGAR/docs/ARCHITECTURAL- DECISIONS-2026-06-04.md`) + the arm-discovery ρ=0.9973 empirical anchor + bgz-tensor `WeightPalette` reference. - [x] Falsifiable contract is testable at D-OSM-2 implementation time (per-country PBF run; reported ρ + cardinality distribution + decode bandwidth). - [x] No collision with #474 (purely additive section append). - [ ] Codex re-review on this PR. --- .claude/plans/cesium-osm-substrate-v1.md | 45 ++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/.claude/plans/cesium-osm-substrate-v1.md b/.claude/plans/cesium-osm-substrate-v1.md index 45406dc1..d094cb9b 100644 --- a/.claude/plans/cesium-osm-substrate-v1.md +++ b/.claude/plans/cesium-osm-substrate-v1.md @@ -306,4 +306,49 @@ b-r-u/osmpbf v0.4 ────────┐ --- +## 11. ADR-024 adoption — palette256 + HHTL codec contract + +> **Pinned 2026-06-05 after ADR-024 landed in OGAR #39.** This section closes the runtime-side follow-up commitment that ADR-024 § Consequences names by reference ("Reports ρ-vs-reference on first per-country PBF run per the runtime session's §11 follow-up commitment"). + +The cesium-osm arc adopts **ADR-024** (Palette256 + HHTL codec — universal compression primitive) as the canonical compression contract for D-OSM-2 (OSM tag palette + tile-local coordinate quantization) and any future per-tile palette in this family. + +### What ADR-024 requires + +Per `OGAR/docs/ARCHITECTURAL-DECISIONS-2026-06-04.md § ADR-024` adoption checklist: + +1. **Identify the prefix.** For D-OSM-2: the Cesium TMS quadkey path `qk://` (Q2 ruling §2). The prefix is the per-tile spatial frame. +2. **Identify the palette domain.** For D-OSM-2: OSM tag-values clustered per tile (~95% body covered by ≤256 distinct values at zoom 21 per ADR-024 § 256-ceiling escape hatches) **and** tile-local 16-bit quantized coordinates (tile bounds ~4 m at zoom 21, sub-cm precision). +3. **Build the palette + measure ρ-vs-reference.** For D-OSM-2: the reference is exact-match tag equality; ρ for the per-tile tag palette is the proportion of correctly-decoded tag values vs ground truth. Per ADR-024 target: **ρ ≥ 0.99** matching the `lance-graph-arm-discovery` aerial-codebook anchor (ρ = 0.9973 vs cosine). +4. **Decode = const-table lookup.** Per-tile palette is runtime const-table; decode path is zero-allocation. Compile-time HHTL where the palette is shared across tiles (e.g. the global ~100 most-common OSM keys: `highway`, `building`, `name`, `addr:*`). + +### Falsifiable adoption contract for D-OSM-2 + +**D-OSM-2 implementation MUST report:** + +- Empirical ρ-vs-reference on the **first per-country PBF run** (default candidate: Liechtenstein PBF — smallest tractable corpus; per §6 OQ-OSM-4). +- Per-tile palette cardinality distribution (mean / p95 / p99). The 256-ceiling escape hatch (per-tile, hierarchical, or palette-64K) is selected on the basis of measured cardinality, not speculation. +- Decode bandwidth: target ≥ 10⁸ palette-decodes/sec on AVX-512 (gather + table-lookup), matching the bgz-tensor `AttentionTable` throughput regime. + +If the ρ-vs-reference falls below 0.99 on the first per-country run, **D-OSM-2 must document the gap before proceeding** to multi-country ingest — either by (a) escalating to per-tile palettes if the global palette undercovers, (b) escalating to palette-64K if cardinality genuinely exceeds 256, or (c) accepting the gap with rationale (e.g. ρ = 0.96 may be acceptable for navigation-style queries where exact tag equality is not load-bearing). + +### Cross-arc ADR-024 adopters (the codec spreads) + +| Adopter | Domain | Prefix | Palette domain | ρ-vs-reference | +|---|---|---|---|---| +| `arm-discovery` aerial codebook (anchor; **shipped**) | Distance | class identity | quantized cosine | **ρ = 0.9973** (the empirical floor) | +| `Binary16K` `_effectiveReaders` (anchor; **shipped**) | Security | row identity | 256 role-bit subsets | binary exact-match (popcount intersect) | +| `bgz-tensor` `WeightPalette` (anchor; **shipped**) | Attention | layer / head index | quantized dense weights | cosine (matches ADR-024 reference) | +| **D-OSM-2** (this plan, queued) | Geographic | Cesium TMS quadkey | tag-values + tile-local coords | **ρ ≥ 0.99** target (this section) | +| **D-SPLAT-4** (splat-native arc, queued) | Anatomical / volumetric | FMA NiblePath + SH basis-id | SH coefficients (ℓ=3) | **ρ ≥ 0.99** target (per splat-native plan amendment, separate PR) | + +The two queued adopters (D-OSM-2 + D-SPLAT-4) are explicitly named in ADR-024 § Consequences. This section is the runtime-side acknowledgment that the codec contract binds at adoption time, not after the fact. + +### Why this section is §11, not §3 + +Adding ADR-024 as a deliverable specification on D-OSM-2 (in §3) would conflate the carrier-shape contract (Q1 v1 fallback) with the compression contract (palette256 codec). They are independent — the Arrow `List>` shape (Q1 v1) holds *whether or not* the tag value is palette-encoded; the palette is a transparent codec underneath the column. Keeping the ADR-024 callout at §11 preserves §3's "carrier shape" framing while pinning the codec contract separately. + +When D-OSM-2 ships, the implementation PR cites both §3 (carrier) and §11 (codec) as its contract surface. + +--- + _End of plan v1._ From 90a062ae0863940cd8ae69672a79b3050b448bf7 Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 5 Jun 2026 12:29:03 +0000 Subject: [PATCH 2/2] docs(cesium-osm): address coderabbit review on #475 (prefix unification + coordinate-fidelity metric) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses two CodeRabbit Major findings on #475: ## Fix 1 — Prefix form unification (CR Major @ line 14) §11 step 1 used `qk://` while §2 Q2 declares the full identity as `osm/qk:////`. The two are not competing formats — the §11 form is the **tile-frame quadkey prefix portion** of the §2 Q2 form (truncated to what ADR-024 calls "the prefix" for palette purposes; §2 Q2 is the **full identity path** including namespace + element-type/ID suffix). Adds an explicit clarification noting this two-layer addressing distinction. Also clarifies that §2 Q2's bare `` means `` — the Q3-converted Y coordinate after the OSM-XYZ → TMS ingest boundary flip. §11 spells `` for explicitness; §2 Q2 leaves it bare because the ingest boundary normalizes to TMS before any consumer reads the identity. Result: downstream implementers see one canonical full-identity form (§2 Q2) and one canonical tile-frame prefix form (§11) that are the same key at different layers of detail — not two incompatible prefixes to split on. ## Fix 2 — Coordinate-fidelity metric (CR Major @ lines 17 + 320-333) §11 acceptance criteria measured ρ only for tag equality. The palette domain ALSO covers tile-local 16-bit quantized coordinates (called out in step 2), but coordinate fidelity could silently regress with no falsifiable gate. Adds a **coordinate-decode-error budget** alongside tag ρ in the §11 falsifiable adoption contract: - **Coordinate fidelity** target: **max ≤ 5 cm AND p95 ≤ 1 cm at zoom 21** (tile bounds ~4 m; 16-bit quantization theoretical resolution ~0.06 mm per axis — the 5 cm / 1 cm thresholds leave ~3 orders of magnitude headroom for projection round-trip error). Report max + p95 + p99 over all decoded coordinates. - **New escape hatch** (mirrors the per-tile palette escalation pattern): 24-bit-per-axis quantization OR per-tile floating- point fallback if max budget is breached. The escalation policy now triggers on EITHER ρ < 0.99 OR coordinate-budget breach — two falsifiable gates, both reported, neither silently regressing. Also tightened step 2's "sub-cm precision" claim to the more accurate "~0.06 mm per-axis resolution at zoom 21 — sub-mm" matching the theoretical floor. ## What this PR does NOT do - No source code; docs/plan only. - Does not duplicate work from #474 (other session's D-OSM-2 ownership-boundary fix) — purely additive in §11. - Does not touch §3 (carrier-shape contract) — coordinate fidelity is a §11 codec-contract concern per the §11-vs-§3 framing the initial PR pinned. ## Test plan - [x] Docs/plan only — no source code; no build/test invoked. - [x] Both findings addressed in single commit on the open #475 branch. - [x] Prefix-form unification preserves the §2 Q2 / §11 distinction (full identity vs tile-frame prefix); no breaking change to §2 Q2's existing form. - [x] Coordinate-fidelity budget is measurable at D-OSM-2 implementation time (per-country PBF run reports max + p95 + p99). - [ ] CodeRabbit re-review on this commit. --- .claude/plans/cesium-osm-substrate-v1.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/.claude/plans/cesium-osm-substrate-v1.md b/.claude/plans/cesium-osm-substrate-v1.md index d094cb9b..38eaad3a 100644 --- a/.claude/plans/cesium-osm-substrate-v1.md +++ b/.claude/plans/cesium-osm-substrate-v1.md @@ -316,20 +316,21 @@ The cesium-osm arc adopts **ADR-024** (Palette256 + HHTL codec — universal com Per `OGAR/docs/ARCHITECTURAL-DECISIONS-2026-06-04.md § ADR-024` adoption checklist: -1. **Identify the prefix.** For D-OSM-2: the Cesium TMS quadkey path `qk://` (Q2 ruling §2). The prefix is the per-tile spatial frame. -2. **Identify the palette domain.** For D-OSM-2: OSM tag-values clustered per tile (~95% body covered by ≤256 distinct values at zoom 21 per ADR-024 § 256-ceiling escape hatches) **and** tile-local 16-bit quantized coordinates (tile bounds ~4 m at zoom 21, sub-cm precision). +1. **Identify the prefix.** For D-OSM-2: the **tile-frame quadkey prefix** `qk://` — this is the per-tile spatial-frame portion derived from the **full identity path** `osm/qk:////` declared in §2 Q2. (Note: `` is the Q3-converted Y coordinate after the OSM-XYZ → TMS boundary flip; §2 Q2 uses bare `` to mean `` since the ingest boundary normalizes to TMS. The forms are not two competing prefixes — the §11 form is the §2 Q2 form truncated to the spatial-frame portion, which is what ADR-024 calls "the prefix" for palette purposes.) +2. **Identify the palette domain.** For D-OSM-2: OSM tag-values clustered per tile (~95% body covered by ≤256 distinct values at zoom 21 per ADR-024 § 256-ceiling escape hatches) **and** tile-local 16-bit quantized coordinates (tile bounds ~4 m at zoom 21; 16-bit quantization gives ~0.06 mm per-axis resolution — sub-mm at this zoom). 3. **Build the palette + measure ρ-vs-reference.** For D-OSM-2: the reference is exact-match tag equality; ρ for the per-tile tag palette is the proportion of correctly-decoded tag values vs ground truth. Per ADR-024 target: **ρ ≥ 0.99** matching the `lance-graph-arm-discovery` aerial-codebook anchor (ρ = 0.9973 vs cosine). 4. **Decode = const-table lookup.** Per-tile palette is runtime const-table; decode path is zero-allocation. Compile-time HHTL where the palette is shared across tiles (e.g. the global ~100 most-common OSM keys: `highway`, `building`, `name`, `addr:*`). ### Falsifiable adoption contract for D-OSM-2 -**D-OSM-2 implementation MUST report:** +**D-OSM-2 implementation MUST report (two metrics — tag fidelity AND coordinate fidelity; both falsifiable):** -- Empirical ρ-vs-reference on the **first per-country PBF run** (default candidate: Liechtenstein PBF — smallest tractable corpus; per §6 OQ-OSM-4). -- Per-tile palette cardinality distribution (mean / p95 / p99). The 256-ceiling escape hatch (per-tile, hierarchical, or palette-64K) is selected on the basis of measured cardinality, not speculation. -- Decode bandwidth: target ≥ 10⁸ palette-decodes/sec on AVX-512 (gather + table-lookup), matching the bgz-tensor `AttentionTable` throughput regime. +- **Tag fidelity** — empirical ρ-vs-reference on the **first per-country PBF run** (default candidate: Liechtenstein PBF — smallest tractable corpus; per §6 OQ-OSM-4). Target: **ρ ≥ 0.99**. +- **Coordinate fidelity** — empirical per-tile coordinate decode-error budget on the same first-per-country run, in meters: **max ≤ 5 cm and p95 ≤ 1 cm at zoom 21** (tile bounds ~4 m; 16-bit quantization theoretical resolution ~0.06 mm per axis — the 5 cm / 1 cm thresholds are ~3 orders of magnitude above the theoretical floor, leaving headroom for projection round-trip error). Report max + p95 + p99 over all decoded coordinates. If max exceeds the budget, the 16-bit quantization regime is undersized for the target use case and must escalate to 24-bit-per-axis OR per-tile floating-point fallback (the latter is a per-tile escape hatch analogous to per-tile palettes). +- **Per-tile palette cardinality distribution** (mean / p95 / p99). The 256-ceiling escape hatch (per-tile, hierarchical, or palette-64K) is selected on the basis of measured cardinality, not speculation. +- **Decode bandwidth** — target ≥ 10⁸ palette-decodes/sec on AVX-512 (gather + table-lookup), matching the bgz-tensor `AttentionTable` throughput regime. -If the ρ-vs-reference falls below 0.99 on the first per-country run, **D-OSM-2 must document the gap before proceeding** to multi-country ingest — either by (a) escalating to per-tile palettes if the global palette undercovers, (b) escalating to palette-64K if cardinality genuinely exceeds 256, or (c) accepting the gap with rationale (e.g. ρ = 0.96 may be acceptable for navigation-style queries where exact tag equality is not load-bearing). +If the ρ-vs-reference falls below 0.99 OR the coordinate-decode max/p95 budget is breached on the first per-country run, **D-OSM-2 must document the gap before proceeding** to multi-country ingest — either by (a) escalating to per-tile palettes if the global palette undercovers, (b) escalating to palette-64K if cardinality genuinely exceeds 256, (c) escalating to 24-bit-per-axis quantization if the coordinate budget is breached, or (d) accepting the gap with rationale (e.g. ρ = 0.96 may be acceptable for navigation-style queries where exact tag equality is not load-bearing). ### Cross-arc ADR-024 adopters (the codec spreads)