feat(contract): ContentStore — content-addressed cold text store (D-CC-ARI-3) by AdaWorldAPI · Pull Request #581 · AdaWorldAPI/lance-graph

AdaWorldAPI · 2026-06-21T15:40:07Z

What

The content-addressed cold text/blob store contract — the gating dependency for the AriGraph/OSINT episodic arc (D-CC-ARI-3). Zero-dep typed surface in lance-graph-contract:

ContentId(u64) = hash::fnv1a of the bytes — stable across versions/platforms (the correct content address; DefaultHasher must never key one; 0 = sentinel). Identical bytes ⇒ identical id ⇒ dedup.
SourceSpan{ContentId,u32,u32} = the fixed-size, Copy typed form of template-equivalence's (source_id,start,end) provenance. is_cited() = the gate's "no source span → no claim" predicate.
ContentStore (cold read: resolve(id) -> Option<&[u8]> zero-copy slice into the mmap/backing store; resolve_span/contains defaulted) + ContentSink (idempotent put -> ContentId, dedup by content-address — many episodes → one source row).
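
For readers who want the surface above in one place, here is a minimal sketch, not the code in content_store.rs. It assumes `hash::fnv1a` is the standard 64-bit FNV-1a, that `resolve_span` returns an `Option`, and that `is_cited()` means "non-sentinel source and non-empty span". The field name `source`, the `ContentId::of` constructor, and the in-memory `MemStore` are hypothetical stand-ins for illustration only.

```rust
use std::collections::HashMap;

/// Standard 64-bit FNV-1a. ASSUMPTION: the crate's `hash::fnv1a` may use
/// different parameters; these are the published FNV-1a 64 constants.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut h = OFFSET_BASIS;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(PRIME);
    }
    h
}

/// Fixed-size content address; 0 is reserved as the sentinel.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ContentId(pub u64);

impl ContentId {
    /// Hypothetical constructor: identical bytes give an identical id.
    pub fn of(bytes: &[u8]) -> Self {
        ContentId(fnv1a(bytes))
    }
}

/// Typed (source_id, start, end) provenance. `source` is a hypothetical name.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SourceSpan {
    pub source: ContentId,
    pub start: u32,
    pub end: u32,
}

impl SourceSpan {
    /// ASSUMED semantics: a claim is cited only if it points at a real
    /// source (non-sentinel id) and covers at least one byte.
    pub fn is_cited(self) -> bool {
        self.source.0 != 0 && self.end > self.start
    }
}

/// Cold read side: borrow bytes out of the backing store without copying.
pub trait ContentStore {
    fn resolve(&self, id: ContentId) -> Option<&[u8]>;

    fn contains(&self, id: ContentId) -> bool {
        self.resolve(id).is_some()
    }

    /// Returns None for a missing source or an out-of-bounds/inverted range.
    fn resolve_span(&self, span: SourceSpan) -> Option<&[u8]> {
        self.resolve(span.source)?
            .get(span.start as usize..span.end as usize)
    }
}

/// Write side: idempotent put, deduplicated by content address.
pub trait ContentSink {
    fn put(&mut self, bytes: &[u8]) -> ContentId;
}

/// Hypothetical in-memory backing store, for illustration only.
#[derive(Default)]
pub struct MemStore {
    rows: HashMap<ContentId, Vec<u8>>,
}

impl MemStore {
    pub fn row_count(&self) -> usize {
        self.rows.len()
    }
}

impl ContentStore for MemStore {
    fn resolve(&self, id: ContentId) -> Option<&[u8]> {
        self.rows.get(&id).map(Vec::as_slice)
    }
}

impl ContentSink for MemStore {
    fn put(&mut self, bytes: &[u8]) -> ContentId {
        let id = ContentId::of(bytes);
        // Re-putting the same bytes leaves the existing row untouched.
        self.rows.entry(id).or_insert_with(|| bytes.to_vec());
        id
    }
}
```

In this sketch, putting the same bytes twice yields the same `ContentId` and a single stored row, which is the "many episodes, one source row" behaviour described above. The sketch does not handle the case where real content hashes to 0 or where two different inputs collide; the PR text does not say how the contract treats either.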

Why this shape

Encodes the three rules from the design discussion:

The join key IS the identity — nothing variable-length enters the 512 B node; it carries only a fixed-size ContentId (a value tenant), the text lives in a columnar table and joins by id.
Content-address, not raw GUID — shared OSINT sources dedup.
Hot/cold firewall (ADR-022) — the hot path (SIMD sweep, AriGraph edge traversal) touches only ContentId/SourceSpan; bytes hydrate cold at the membrane (the fingerprint is the hot-path stand-in for text). resolve is never called during computation.
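
The three rules can be illustrated with a small sketch. Everything here is an assumption made for illustration: the PR does not give the 512 B node layout, the fingerprint width, or the distance metric, so the `Node` fields, the Hamming-distance sweep, and the `HashMap` standing in for the columnar table are all hypothetical.

```rust
use std::collections::HashMap;
use std::mem::size_of;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ContentId(pub u64);

/// HYPOTHETICAL 512-byte node: only fixed-size fields, no text.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Node {
    pub content: ContentId,    // 8 bytes: join key into the cold table
    pub fingerprint: [u64; 8], // 64 bytes: hot-path stand-in for the text
    pub reserved: [u8; 440],   // padding up to 512 bytes
}

// Compile-time check that nothing variable-length crept into the node.
const _: () = assert!(size_of::<Node>() == 512);

/// Bit-level distance between two fingerprints (assumed metric).
fn hamming(a: &[u64; 8], b: &[u64; 8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Hot path: scans fixed-size nodes and returns ids only.
/// It never sees the text table, so it cannot call resolve.
pub fn sweep(nodes: &[Node], query: &[u64; 8], max_distance: u32) -> Vec<ContentId> {
    nodes
        .iter()
        .filter(|n| hamming(&n.fingerprint, query) <= max_distance)
        .map(|n| n.content)
        .collect()
}

/// Cold path: after the sweep, join the surviving ids against the text table.
pub fn hydrate<'a>(
    table: &'a HashMap<ContentId, Vec<u8>>,
    ids: &[ContentId],
) -> Vec<&'a [u8]> {
    ids.iter()
        .filter_map(|id| table.get(id).map(Vec::as_slice))
        .collect()
}
```

The design point is in the signatures: `sweep` has no access to the table, while `hydrate` is the only function that touches bytes, and it runs once on the ids that survived the sweep.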

Scope

Additive, zero-dep; 6 tests (stable/dedup, idempotent put, resolve_span slice, OOB/missing errors, uncited-rejected); clippy clean. Board hygiene: LATEST_STATE.md Contract Inventory entry in the same commit.

Consumers: rs-graph-llm/episodic-arc-task (replaces its local fnv1a stand-in with ContentId/SourceSpan), template-equivalence (typed provenance). Also fixes the flagged WitnessEntry::tie_break_hash DefaultHasher correctness issue by giving content-addressing a stable canonical hash.

Coordination: if the other session is independently authoring content_store, this can be superseded/closed — content_store was absent from main and the active jirak session is on the supervisor surface (#578/#579/#580), so this fills the gap.

Plan: .claude/plans/arigraph-osint-episodic-v1.md.

🤖 Generated with Claude Code


Draft reference for the AriGraph/OSINT episodic-arc wiring (D-CC-ARI-3), parked on its own branch off merged main (content_store does not yet exist on main).

Zero-dep typed surface in lance-graph-contract:

- ContentId(u64) = fnv1a of the bytes (canon hash, stable across versions; the correct content address; DefaultHasher must never key one).
- SourceSpan{ContentId,u32,u32} = the typed (source_id,start,end) form of template-equivalence's provenance; is_cited() = "no source span -> no claim".
- ContentStore (cold read, resolve -> Option<&[u8]> zero-copy slice) + ContentSink (idempotent put -> dedup by content-address).

Hot path touches only ContentId/SourceSpan; bytes hydrate cold at the membrane (ADR-022).

Logic-complete + self-reviewed; cargo verification deferred (worktree was disk/sibling-constrained). Run `cargo test -p lance-graph-contract content_store` in a full checkout before merge. Author canonically or supersede as the other session's content_store work lands.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VGXeWN4XfVjteBVcVeuLo4

The content-addressed cold text/blob store for the AriGraph/OSINT episodic arc (D-CC-ARI-3).

Zero-dep typed surface in lance-graph-contract:

- ContentId(u64) = hash::fnv1a of the bytes (stable across versions; the correct content address; DefaultHasher must never key one; 0 = sentinel).
- SourceSpan{ContentId,u32,u32} = the fixed-size Copy typed form of template-equivalence's (source_id,start,end); is_cited() = "no source span -> no claim".
- ContentStore (cold resolve -> Option<&[u8]> zero-copy slice) + ContentSink (idempotent put -> dedup by content-address: many episodes -> one source).

Hot/cold firewall (ADR-022): the hot path touches only the fixed-size ContentId/SourceSpan; bytes hydrate cold at the membrane. Nothing variable-length enters the 512 B node.

Additive, zero-dep; +6 tests, clippy clean. Board: LATEST_STATE Contract Inventory. Consumers: rs-graph-llm/episodic-arc-task, template-equivalence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VGXeWN4XfVjteBVcVeuLo4

coderabbitai · 2026-06-21T15:40:15Z

Warning

Review limit reached

@AdaWorldAPI, we couldn't start this review because you've reached your PR review rate limit.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5ad3f39e-b537-4625-a345-a9e1fd44b9d1

📥 Commits

Reviewing files that changed from the base of the PR and between 98d5d2f and 6103438.

📒 Files selected for processing (3)

.claude/board/LATEST_STATE.md
crates/lance-graph-contract/src/content_store.rs
crates/lance-graph-contract/src/lib.rs

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 10b9bb5dc0

chatgpt-codex-connector · 2026-06-21T15:43:02Z

+    /// Span length in bytes.
+    #[must_use]
+    pub fn len(self) -> u32 {
+        self.end - self.start


Use a saturating length for public SourceSpan values

Because SourceSpan's fields are public, consumers can deserialize or build typed provenance with end < start without going through new(); in that case this subtraction panics in debug builds and wraps to a huge u32 in release, even though is_empty() treats the same span as empty. Any downstream code using len() to size or copy a malformed span can therefore mis-handle provenance; make the invariant unrepresentable or compute the length with saturating_sub.

SourceSpan's fields are public, so a consumer can build end < start (bypassing new()'s clamp); the old `end - start` panicked in debug and wrapped to a huge u32 in release, inconsistent with is_empty(). Use saturating_sub so len() reports 0 for a malformed span, matching is_empty()/is_cited().

+1 test (malformed_span_len_saturates_not_panics).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VGXeWN4XfVjteBVcVeuLo4

AdaWorldAPI · 2026-06-21T15:49:15Z

codex P2 (saturating SourceSpan::len) — ✅ fixed in 6103438.

SourceSpan's fields are public, so a consumer can build end < start (bypassing new()'s clamp). len() now uses saturating_sub, returning 0 for a malformed span — consistent with is_empty()/is_cited(), never panicking (debug) or wrapping (release). +1 test (malformed_span_len_saturates_not_panics). Now 7 tests, clippy clean.
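
The fix described above can be sketched roughly as follows. This is an illustration rather than the merged code: the field name `source` and the `is_empty()` body are assumptions, and only `len()` using `saturating_sub` is taken from the thread.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ContentId(pub u64);

/// Public fields mean a caller can build `end < start` directly.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SourceSpan {
    pub source: ContentId, // hypothetical field name
    pub start: u32,
    pub end: u32,
}

impl SourceSpan {
    /// Span length in bytes; 0 for a malformed (inverted) span.
    /// `end - start` would panic in debug builds and wrap in release.
    #[must_use]
    pub fn len(self) -> u32 {
        self.end.saturating_sub(self.start)
    }

    /// ASSUMED body: defined through len() so the two can never disagree.
    #[must_use]
    pub fn is_empty(self) -> bool {
        self.len() == 0
    }
}
```

With this shape, a span built as `start: 10, end: 4` reports a length of 0 and is empty, instead of reporting a length close to `u32::MAX` in release builds.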

Generated by Claude Code

claude added 2 commits June 21, 2026 15:39

chatgpt-codex-connector Bot reviewed Jun 21, 2026

View reviewed changes

AdaWorldAPI merged commit 96c1249 into main Jun 21, 2026
6 checks passed

AdaWorldAPI merged 3 commits into main from claude/content-store-contract-draft
