Build Capsule's offline end-to-end encrypted data plane #321

Merged
justin13888 merged 13 commits into master from feat-core-data-plane
Jun 1, 2026

@justin13888
Collaborator

Implements the complete offline cryptographic data plane for capsule-core — from CBOR/crypto primitives up through a signed-manifest provenance chain, CRDT metadata, portable backups, and an end-to-end capsule demo.

Summary

  • Stands up the full client-side crypto stack with no network dependency, gated behind a single verify_asset chokepoint that returns Accept / TerminalReject(reason) / Pending.
  • Defers live MLS behind an AlbumAuthority seam (a signature-backed ReferenceAuthority stands in for OpenMLS), so the data plane is complete and testable today.
  • Every signature and content hash commits to one canonical CBOR byte-identity contract (RFC 8949 §4.2), shared across manifests, sidecars, and backups.
  • Privacy-on-export strips device/camera identifiers and coarsens GPS by default; CRDT metadata (OR-Set tags, LWW caption) converges across devices.
  • Ships a narrated capsule demo exercising the whole flow with real crypto and inspectable on-disk artifacts.
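
The privacy-on-export default can be sketched as a pure function over a metadata record. This is a minimal illustration, not the PR's `metadata::export_policy` code: `CaptureMeta`, `ExportPolicy`, and their field names are hypothetical, and rounding to two decimal places is only one assumed way to reach roughly 1 km of precision.

```rust
/// Hypothetical capture metadata; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct CaptureMeta {
    pub camera_serial: Option<String>,
    pub device_id: Option<String>,
    pub session_id: Option<String>,
    /// (latitude, longitude) in decimal degrees.
    pub gps: Option<(f64, f64)>,
}

/// Per-export opt-ins. The default keeps nothing and coarsens GPS.
#[derive(Debug, Clone, Copy, Default)]
pub struct ExportPolicy {
    pub keep_identifiers: bool,
    pub keep_precise_gps: bool,
}

/// Round to two decimal places (0.01 degrees of latitude is about 1.1 km).
fn coarsen(deg: f64) -> f64 {
    (deg * 100.0).round() / 100.0
}

/// Returns a redacted copy; the input (the local sidecar) is not modified.
pub fn apply_export_policy(meta: &CaptureMeta, policy: ExportPolicy) -> CaptureMeta {
    let mut out = meta.clone();
    if !policy.keep_identifiers {
        out.camera_serial = None;
        out.device_id = None;
        out.session_id = None;
    }
    if !policy.keep_precise_gps {
        out.gps = out.gps.map(|(lat, lon)| (coarsen(lat), coarsen(lon)));
    }
    out
}
```

Taking the record by reference and returning a copy mirrors the stated property that the local sidecar stays untouched while only the exported view is redacted.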

Implementation

  1. Primitives → keys: canonical CBOR encoder, HKDF-SHA512 / Argon2id KDFs, streaming SHA-256, then a key hierarchy — hybrid Ed25519+ML-DSA-65 sigs (both halves required), ML-KEM-768 DEKs, master key, per-epoch AMKs, and an encrypted keystore.
  2. Asset path: AES-256-GCM STREAM (64 KiB chunks, ranged decrypt) + exact metadata-blob wire format; import emits a signed create manifest, append-only SHA-256 provenance chain, and signed SidecarV1.
  3. Verification & validation: verify_asset chokepoint plus pure structural / protocol / idempotency invariants with exhaustive table-driven negatives.
  4. Backup & recovery: deterministic tar artifact (HMAC + exporter-signed manifest), preview/dry-run/commit restore with chain reconciliation, master-key escrow + opt-in Shamir 2-of-3.
  5. Lifecycle Workspace: ties crypto to the on-disk library (import, CRDT edits, soft-delete/restore, backup round-trip to a fresh library with byte-equal verification).
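
The chunk arithmetic behind ranged decrypt can be sketched without any crypto. The 65,520-byte plaintext chunk and 16-byte tag come from the commit notes; the function names and the handling of an empty file (one tag-only final chunk) are assumptions for illustration.

```rust
/// Plaintext bytes per chunk and the AES-GCM tag size, per the commit notes.
pub const PLAIN_CHUNK: u64 = 65_520;
pub const TAG_LEN: u64 = 16;
/// 65_520 + 16 = 65_536 bytes, i.e. one 64 KiB ciphertext chunk.
pub const CIPHER_CHUNK: u64 = PLAIN_CHUNK + TAG_LEN;

/// Number of chunks for a plaintext of `len` bytes.
/// Assumption: an empty plaintext still produces one tag-only final chunk.
pub fn chunk_count(len: u64) -> u64 {
    if len == 0 { 1 } else { (len + PLAIN_CHUNK - 1) / PLAIN_CHUNK }
}

/// Total ciphertext size: every chunk carries its own tag.
pub fn ciphertext_len(len: u64) -> u64 {
    len + chunk_count(len) * TAG_LEN
}

/// Location of one ciphertext chunk inside the encrypted stream.
#[derive(Debug, PartialEq, Eq)]
pub struct ChunkSpan {
    pub index: u64,
    pub cipher_offset: u64,
    pub cipher_len: u64,
    pub is_last: bool,
}

/// Chunks that must be fetched and decrypted to serve the plaintext
/// range `[start, end)` of a file that is `len` bytes long.
pub fn chunks_for_range(len: u64, start: u64, end: u64) -> Vec<ChunkSpan> {
    assert!(start < end && end <= len, "range must be non-empty and in bounds");
    let total = chunk_count(len);
    let first = start / PLAIN_CHUNK;
    let last = (end - 1) / PLAIN_CHUNK;
    (first..=last)
        .map(|i| {
            let is_last = i == total - 1;
            // Only the final chunk may be shorter than PLAIN_CHUNK.
            let plain_len = if is_last { len - i * PLAIN_CHUNK } else { PLAIN_CHUNK };
            ChunkSpan {
                index: i,
                cipher_offset: i * CIPHER_CHUNK,
                cipher_len: plain_len + TAG_LEN,
                is_last,
            }
        })
        .collect()
}
```

Each span carries the `(index, is_last)` pair that a per-chunk decrypt call such as the described `decrypt_chunk(i, is_last)` would need, so a reader only touches the chunks overlapping the requested range.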

Followup

  • Wire live OpenMLS in place of ReferenceAuthority (X25519 X-Wing hybrid KEM half deferred alongside it); see DEFERRED.md.
  • capsule demo is a manual showcase; core paths are covered by automated tests.

- Pin stable RustCrypto deps (aes-gcm 0.10/aead 0.5 stream, hkdf, hmac,
  ed25519-dalek 2, ml-dsa 0.1, ml-kem 0.3, argon2, getrandom).
- crypto::hash: Hash32 newtype (CBOR byte-string serde) + streaming
  Sha256Hasher; NIST KAT + chunked-equals-one-shot tests.
- utils::hash now delegates to streaming crypto::hash (resolves the
  whole-file read TODO; zero behavior change).
Shared, load-bearing deterministic encoder over ciborium::Value:
definite-length only, shortest-form ints/floats (f16/f32/f64 via half),
map keys sorted by bytewise-lex of their encoded form. This is the
byte-identity contract every signature/content-hash commits to.
Tests: RFC 8949 Appendix A int/float KATs, key-ordering (length head
participates), idempotence, _unknown re-sort, struct round-trip, and a
golden hex vector as the cross-language conformance gate.
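
Two of the rules above, shortest-form integer heads and map keys ordered by the bytes of their encoding, can be shown in a few lines of dependency-free Rust. This is a sketch of the RFC 8949 §4.2 rules only; it does not use `ciborium`, skips floats, and the function names are hypothetical rather than the encoder's actual API.

```rust
/// Encode a CBOR head (major type 0..=7 plus argument) in shortest form.
pub fn encode_head(major: u8, arg: u64) -> Vec<u8> {
    let m = major << 5;
    match arg {
        0..=23 => vec![m | arg as u8],
        24..=0xff => vec![m | 24, arg as u8],
        0x100..=0xffff => {
            let mut v = vec![m | 25];
            v.extend_from_slice(&(arg as u16).to_be_bytes());
            v
        }
        0x1_0000..=0xffff_ffff => {
            let mut v = vec![m | 26];
            v.extend_from_slice(&(arg as u32).to_be_bytes());
            v
        }
        _ => {
            let mut v = vec![m | 27];
            v.extend_from_slice(&arg.to_be_bytes());
            v
        }
    }
}

/// Integers: major type 0 for n >= 0, major type 1 with argument -1 - n otherwise.
pub fn encode_int(n: i64) -> Vec<u8> {
    if n >= 0 {
        encode_head(0, n as u64)
    } else {
        // Bitwise NOT of the two's-complement value equals -1 - n.
        encode_head(1, !(n as u64))
    }
}

/// UTF-8 text string: major type 3, definite length.
pub fn encode_text(s: &str) -> Vec<u8> {
    let mut v = encode_head(3, s.len() as u64);
    v.extend_from_slice(s.as_bytes());
    v
}

/// Map from already-encoded (key, value) pairs. Keys are sorted by the
/// bytewise lexicographic order of their encodings; duplicates are rejected.
pub fn encode_map(mut entries: Vec<(Vec<u8>, Vec<u8>)>) -> Result<Vec<u8>, &'static str> {
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    if entries.windows(2).any(|w| w[0].0 == w[1].0) {
        return Err("duplicate map key");
    }
    let mut out = encode_head(5, entries.len() as u64);
    for (k, v) in entries {
        out.extend(k);
        out.extend(v);
    }
    Ok(out)
}
```

Because the sort runs over encoded keys, the length head takes part in the comparison: the text key `"z"` (`61 7a`) sorts before `"aa"` (`62 61 61`), and the integer 100 (`18 64`) sorts before -1 (`20`).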
- primitives: SuiteId(0x0001) dispatch + fail-closed unknown suites,
  versioned HKDF info labels, DeviceTier->Argon2id param table.
- rng: OS CSPRNG (getrandom) fill/random_array.
- kdf: HKDF-SHA512 derive_key32 with determinism, domain-separation,
  512->256 truncation-prefix, and a golden vector.
- pwkdf: Argon2id+AES-256-GCM wrap/unwrap with params recorded in-band
  (cross-tier unwrap); wrong-passphrase and tamper rejection tests.
…eystore

- hybrid_sig: Ed25519+ML-DSA-65 signatures; BOTH halves required to
  verify (exhaustive negatives: corrupt either half, swap halves,
  truncated ML-DSA). Deterministic 32-byte seeds; CBOR byte-string serde.
- kem: ML-KEM-768 DEK (deterministic 64-byte seed, encapsulate/decapsulate
  round-trip). X25519 X-Wing hybrid half deferred with OpenMLS.
- master: account master key; default-album-id derivation (HKDF, v8 UUID);
  symmetric seal/open for wrapping device keys.
- album: AmkVersion + random per-epoch Amk; derive_file_key/derive_blob_key.
- keystore: Account (master+IK+device) <-> encrypted AccountFile (master
  under passphrase, identity/device keys sealed under master); round-trip,
  wrong-passphrase, and canonical-serialization tests.
- stream: AES-256-GCM STREAM (EncryptorBE32) — 65520B plaintext -> 64KiB
  ciphertext chunks, incremental ciphertext hash, streaming encrypt/decrypt.
  decrypt_chunk(i,is_last) decrypts any chunk independently for ranged reads.
  Tests: round-trips (empty..multi-chunk), ranged==sequential, per-chunk
  tamper, reorder/drop/truncation, wrong key/prefix.
- blob: exact metadata-blob wire format suite(2,BE)|nonce(12)|ct|tag(16);
  fail-closed on unknown suite; content hash over full wire; fresh nonce.
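
A parser for the metadata-blob layout `suite(2,BE)|nonce(12)|ct|tag(16)` might look like the sketch below. The field widths and the fail-closed suite check come from the notes above; `BlobParts`, `BlobError`, and `parse_blob` are hypothetical names, and no decryption is performed here.

```rust
/// The only suite the notes mention, SuiteId(0x0001).
pub const SUITE_V1: u16 = 0x0001;
const SUITE_LEN: usize = 2;
const NONCE_LEN: usize = 12;
const TAG_LEN: usize = 16;

#[derive(Debug, PartialEq, Eq)]
pub enum BlobError {
    /// Shorter than suite + nonce + tag; cannot be a valid blob.
    TooShort,
    /// Fail closed: an unrecognized suite is never interpreted.
    UnknownSuite(u16),
}

/// Borrowed views into the wire bytes; nothing is copied.
#[derive(Debug, PartialEq, Eq)]
pub struct BlobParts<'a> {
    pub suite: u16,
    pub nonce: &'a [u8],
    pub ciphertext: &'a [u8],
    pub tag: &'a [u8],
}

/// Split a wire blob into its fields, rejecting unknown suites.
pub fn parse_blob(wire: &[u8]) -> Result<BlobParts<'_>, BlobError> {
    if wire.len() < SUITE_LEN + NONCE_LEN + TAG_LEN {
        return Err(BlobError::TooShort);
    }
    let suite = u16::from_be_bytes([wire[0], wire[1]]);
    if suite != SUITE_V1 {
        return Err(BlobError::UnknownSuite(suite));
    }
    let body_start = SUITE_LEN + NONCE_LEN;
    let tag_start = wire.len() - TAG_LEN;
    Ok(BlobParts {
        suite,
        nonce: &wire[SUITE_LEN..body_start],
        ciphertext: &wire[body_start..tag_start],
        tag: &wire[tag_start..],
    })
}
```

Since the content hash is taken over the full wire form, a caller would hash the original `wire` slice rather than the parsed `ciphertext` field.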
The MLS deferral boundary: a per-album trait exposing exactly what
verify_asset needs (epoch ceiling, per-epoch write-tier pubkey, AMK-present,
admin-chain validity). ReferenceAuthority is an admin-signed epoch ledger
standing in for live OpenMLS. Guardrails: admin signature mandatory +
verified, ceiling = max attested epoch (no fabrication/rewind), AMK presence
minted with the write-tier key. Tests: lookups, pending flip, forged/wrong-
admin attestation rejection, inconsistent-ceiling rejection.
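
The shape of this seam can be sketched as a trait plus an in-memory epoch ledger. Everything here is hypothetical: the method names, `MemLedger`, and the caller-supplied `verify_admin` function (which stands in for hybrid signature verification) are assumptions, and treating a re-attested epoch with a different key as a conflict is only one reading of "no fabrication/rewind".

```rust
use std::collections::BTreeMap;

/// Hypothetical seam: the four facts the notes say verify_asset needs.
pub trait AlbumAuthority {
    fn epoch_ceiling(&self) -> Option<u64>;
    fn write_pubkey(&self, epoch: u64) -> Option<&[u8]>;
    fn amk_present(&self, epoch: u64) -> bool;
    fn admin_chain_valid(&self) -> bool;
}

/// An admin-signed statement binding an epoch to its write-tier public key.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Attestation {
    pub epoch: u64,
    pub write_pubkey: Vec<u8>,
    pub admin_sig: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LedgerError {
    BadAdminSignature,
    ConflictingEpoch,
}

/// In-memory stand-in: epoch -> (write-tier pubkey, AMK present).
#[derive(Default)]
pub struct MemLedger {
    epochs: BTreeMap<u64, (Vec<u8>, bool)>,
}

impl MemLedger {
    /// Admit an attestation only if the admin signature verifies and it
    /// does not rewrite an epoch that was already attested.
    pub fn admit<F>(&mut self, att: Attestation, verify_admin: F) -> Result<(), LedgerError>
    where
        F: Fn(&Attestation) -> bool,
    {
        if !verify_admin(&att) {
            return Err(LedgerError::BadAdminSignature);
        }
        if let Some((existing, _)) = self.epochs.get(&att.epoch) {
            if *existing != att.write_pubkey {
                return Err(LedgerError::ConflictingEpoch);
            }
            return Ok(());
        }
        self.epochs.insert(att.epoch, (att.write_pubkey, false));
        Ok(())
    }

    /// Record that the AMK for an attested epoch has arrived.
    /// Returns false if the epoch was never attested.
    pub fn mark_amk_present(&mut self, epoch: u64) -> bool {
        match self.epochs.get_mut(&epoch) {
            Some(entry) => {
                entry.1 = true;
                true
            }
            None => false,
        }
    }
}

impl AlbumAuthority for MemLedger {
    /// Ceiling is the highest attested epoch, never a caller-supplied value.
    fn epoch_ceiling(&self) -> Option<u64> {
        self.epochs.keys().next_back().copied()
    }
    fn write_pubkey(&self, epoch: u64) -> Option<&[u8]> {
        self.epochs.get(&epoch).map(|(k, _)| k.as_slice())
    }
    fn amk_present(&self, epoch: u64) -> bool {
        self.epochs.get(&epoch).map_or(false, |(_, present)| *present)
    }
    /// Holds by construction here: only verified attestations are stored.
    fn admin_chain_valid(&self) -> bool {
        true
    }
}
```

Deriving the ceiling from stored attestations, rather than accepting it as input, is what prevents a caller from fabricating a higher epoch in this sketch.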
…epoint

- provenance/action: closed Action + DerivativeRole enums (unknown values
  rejected on decode).
- provenance/manifest: AssetManifest/DerivativeManifest with canonical
  signing bytes + two hybrid sigs; structural rules (prior-hash<->create,
  retention only on delete).
- provenance/record: append-only SHA-256 hash chain; append/verify_walk
  detect broken root, stale link, dropped/rewritten records, mirror mismatch.
- keys/directory: master-signed DeviceDirectory (lookup, monotonic version).
- verify_asset: THE chokepoint -> Accept / TerminalReject(reason) / Pending.
  Exhaustive negatives: reader-signed, removed-writer, wrong-epoch,
  forged-chain, replay, suite-downgrade, bad device sig, untrusted authority,
  device-added-after, unknown device, ciphertext mismatch, wrong album.
  Pending->Accept when AMK arrives; drop-in parity via &dyn AlbumAuthority.
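
The three-way verdict and the Pending to Accept flip can be illustrated with a reduced decision function. This is a sketch under heavy assumptions: the cryptographic checks are collapsed into precomputed booleans, the `RejectReason` variants cover only a few of the listed negatives, the check order is a guess, and the two-method trait is a cut-down stand-in for the real `AlbumAuthority`.

```rust
/// Hypothetical subset of the terminal reject reasons.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RejectReason {
    WrongAlbum,
    SuiteDowngrade,
    NotAWriter,
    BadDeviceSig,
    CiphertextMismatch,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    TerminalReject(RejectReason),
    Pending,
}

/// Cut-down stand-in for the authority seam.
pub trait AlbumAuthority {
    fn is_writer(&self, epoch: u64, device: &str) -> bool;
    fn amk_present(&self, epoch: u64) -> bool;
}

/// Precomputed check results. Real code would derive these from the
/// manifest, its hybrid signatures, and the ciphertext hash.
pub struct Checks<'a> {
    pub album_matches: bool,
    pub suite: u16,
    pub epoch: u64,
    pub signer: &'a str,
    pub device_sig_ok: bool,
    pub ciphertext_hash_ok: bool,
}

/// Assumed minimum acceptable suite.
pub const MIN_SUITE: u16 = 0x0001;

/// Terminal failures are checked first; a missing AMK is the only
/// condition that yields Pending, since it can resolve later.
pub fn verify_asset(c: &Checks<'_>, authority: &dyn AlbumAuthority) -> Verdict {
    use RejectReason::*;
    if !c.album_matches {
        return Verdict::TerminalReject(WrongAlbum);
    }
    if c.suite < MIN_SUITE {
        return Verdict::TerminalReject(SuiteDowngrade);
    }
    if !authority.is_writer(c.epoch, c.signer) {
        return Verdict::TerminalReject(NotAWriter);
    }
    if !c.device_sig_ok {
        return Verdict::TerminalReject(BadDeviceSig);
    }
    if !c.ciphertext_hash_ok {
        return Verdict::TerminalReject(CiphertextMismatch);
    }
    if !authority.amk_present(c.epoch) {
        return Verdict::Pending;
    }
    Verdict::Accept
}

/// Toy in-memory authority used to exercise the function.
pub struct ToyAuthority {
    pub writers: Vec<(u64, String)>,
    pub amk_epochs: Vec<u64>,
}

impl AlbumAuthority for ToyAuthority {
    fn is_writer(&self, epoch: u64, device: &str) -> bool {
        self.writers.iter().any(|(e, d)| *e == epoch && d == device)
    }
    fn amk_present(&self, epoch: u64) -> bool {
        self.amk_epochs.contains(&epoch)
    }
}
```

Taking `&dyn AlbumAuthority` is what allows a reference implementation and a later live one to be swapped without touching the decision logic.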
- protocol: fail-closed handshake (protocol_gate -> 426, suite-in-inventory,
  sidecar-schema closure) with date-grammar + range checks.
- structural: key-less manifest envelope checks (suite, album pin, added_at<ts,
  timestamp drift, prior-hash==stored-head, monotonic amk_version) as discrete
  predicates + a combined checker; table-driven accept/reject tests.
- idempotency: stable session/lifecycle/chunk idempotency keys.
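
The key-less structural checks lend themselves to discrete predicates folded into one combined checker. The sketch below assumes hypothetical `Envelope` and `Pinned` types, reads "monotonic amk_version" as non-decreasing, and treats the drift window as a caller-supplied bound; none of these details are confirmed by the PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StructuralError {
    UnknownSuite,
    AlbumMismatch,
    DeviceAddedAfter,
    TimestampDrift,
    StaleHead,
    AmkVersionRegressed,
}

/// Hypothetical manifest envelope fields that can be checked without keys.
#[derive(Debug, Clone, Copy)]
pub struct Envelope {
    pub suite: u16,
    pub album_id: u128,
    pub ts: u64,
    pub device_added_at: u64,
    pub prior_hash: Option<[u8; 32]>,
    pub amk_version: u32,
}

/// Hypothetical local state the envelope is checked against.
#[derive(Debug, Clone, Copy)]
pub struct Pinned {
    pub album_id: u128,
    pub now: u64,
    pub max_drift: u64,
    pub stored_head: Option<[u8; 32]>,
    pub last_amk_version: u32,
}

pub fn suite_known(e: &Envelope) -> bool { e.suite == 0x0001 }
pub fn album_pinned(e: &Envelope, p: &Pinned) -> bool { e.album_id == p.album_id }
pub fn added_before_ts(e: &Envelope) -> bool { e.device_added_at < e.ts }
pub fn within_drift(e: &Envelope, p: &Pinned) -> bool { e.ts.abs_diff(p.now) <= p.max_drift }
pub fn prior_matches_head(e: &Envelope, p: &Pinned) -> bool { e.prior_hash == p.stored_head }
/// Assumption: "monotonic" means the version never decreases.
pub fn amk_monotonic(e: &Envelope, p: &Pinned) -> bool { e.amk_version >= p.last_amk_version }

/// Combined checker: returns the first failing rule, in a fixed order.
pub fn check_envelope(e: &Envelope, p: &Pinned) -> Result<(), StructuralError> {
    use StructuralError::*;
    let rules = [
        (suite_known(e), UnknownSuite),
        (album_pinned(e, p), AlbumMismatch),
        (added_before_ts(e), DeviceAddedAfter),
        (within_drift(e, p), TimestampDrift),
        (prior_matches_head(e, p), StaleHead),
        (amk_monotonic(e, p), AmkVersionRegressed),
    ];
    for (ok, err) in rules.iter() {
        if !*ok {
            return Err(*err);
        }
    }
    Ok(())
}
```

Keeping each rule as its own function makes a table-driven test straightforward: start from one accepted envelope and mutate a single field per row.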
- metadata::crdt: OrSet (add_id binding, reject-unobserved-remove, union
  merge convergence/idempotence/associativity); Lww register with superseded
  log (ts+device tiebreak, cap 16); monotonic add_id Counter (reseed = max+1,
  never reuses a written counter).
- sidecar::sidecar_v1: signed SidecarV1 with sidecar_schema at CBOR integer
  field 0 (sorts first), CRDT fields, _unknown preserved+signed, hybrid
  signature over canonical bytes; refuses a schema newer than max-known.
- metadata::export_policy: strip camera serial/device_id/session_id, round
  GPS to ~1km by default; per-export opt-in retains; local sidecar untouched.
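
The convergence properties claimed for the tag set and caption can be demonstrated with a small observed-remove set and a last-writer-wins register. This is a simplified sketch, not the PR's `metadata::crdt` types: `AddId` as a `(device, counter)` pair is an assumption, and the superseded log with its cap of 16 is omitted.

```rust
use std::collections::BTreeSet;

/// Hypothetical unique tag for each add: (device, per-device counter).
pub type AddId = (u32, u64);

/// Observed-remove set: a remove tombstones only add_ids it has seen.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct OrSet {
    adds: BTreeSet<(String, AddId)>,
    removed: BTreeSet<AddId>,
}

impl OrSet {
    pub fn add(&mut self, tag: &str, id: AddId) {
        self.adds.insert((tag.to_string(), id));
    }

    /// Rejects a remove for a tag this replica has no live add for.
    pub fn remove(&mut self, tag: &str) -> Result<(), &'static str> {
        let observed: Vec<AddId> = self
            .adds
            .iter()
            .filter(|(t, id)| t == tag && !self.removed.contains(id))
            .map(|(_, id)| *id)
            .collect();
        if observed.is_empty() {
            return Err("remove of unobserved tag");
        }
        self.removed.extend(observed);
        Ok(())
    }

    pub fn contains(&self, tag: &str) -> bool {
        self.adds.iter().any(|(t, id)| t == tag && !self.removed.contains(id))
    }

    /// Union merge: commutative, associative, and idempotent.
    pub fn merge(&mut self, other: &OrSet) {
        self.adds.extend(other.adds.iter().cloned());
        self.removed.extend(other.removed.iter().copied());
    }
}

/// Last-writer-wins register with a (timestamp, device) tiebreak.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Lww {
    pub value: String,
    pub ts: u64,
    pub device: u32,
}

impl Lww {
    pub fn merge(&mut self, other: &Lww) {
        // Tuple comparison: higher ts wins, ties broken by device id.
        if (other.ts, other.device) > (self.ts, self.device) {
            *self = other.clone();
        }
    }
}
```

In this model a concurrent add survives a remove that never observed it, because the remove only tombstones the `AddId` values present on the removing replica at that moment.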
- artifact: deterministic uncompressed tar (VERSION, MANIFEST.cbor,
  keys/amk-ledger.cbor, sorted blobs/meta/provenance). MANIFEST authenticated
  two ways (HMAC under passphrase-derived wrap key + hybrid exporter sig);
  per-entry content hashes; AMK ledger sealed under the wrap key.
- restore: preview/dry-run(default)/commit with chain reconciliation —
  identical head no-op, absent applied, divergent quarantined (never silent
  overwrite). decrypt_asset rebuilds plaintext from the ledger AMK + manifest.
- escrow: master-key escrow (Argon2id) + opt-in Shamir 2-of-3 of the seed.
- Tests: byte-identical re-export, HMAC+sig tamper detection, wrong exporter,
  AMK-incomplete, reconciliation matrix, restore-to-fresh-lib recovers plaintext.
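
The reconciliation matrix reduces to a comparison of chain heads per asset. The sketch below models that decision and the dry-run default; the `Outcome` and `Mode` names are hypothetical, and treating preview and dry-run identically (neither writes) is an assumption.

```rust
use std::collections::BTreeMap;

/// Head of an asset's provenance chain (a SHA-256 digest in the PR).
pub type Head = [u8; 32];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// Local head equals the backup head: nothing to do.
    NoOp,
    /// Asset absent locally: restore it.
    Applied,
    /// Heads differ: set aside, never overwrite silently.
    Quarantined,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Preview,
    DryRun,
    Commit,
}

impl Default for Mode {
    /// Dry-run is the default, so a restore does not write unless asked.
    fn default() -> Self {
        Mode::DryRun
    }
}

pub fn reconcile(local: Option<&Head>, incoming: &Head) -> Outcome {
    match local {
        None => Outcome::Applied,
        Some(h) if h == incoming => Outcome::NoOp,
        Some(_) => Outcome::Quarantined,
    }
}

/// Produce the per-asset report; only Commit mutates the library,
/// and even then only for assets that were absent.
pub fn restore(
    library: &mut BTreeMap<String, Head>,
    backup: &BTreeMap<String, Head>,
    mode: Mode,
) -> Vec<(String, Outcome)> {
    let mut report = Vec::new();
    for (asset, head) in backup {
        let outcome = reconcile(library.get(asset), head);
        if mode == Mode::Commit && outcome == Outcome::Applied {
            library.insert(asset.clone(), *head);
        }
        report.push((asset.clone(), outcome));
    }
    report
}
```

Running the same function in dry-run and commit mode yields identical reports, which lets a caller inspect exactly what a commit would do before it happens.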
…ration

Ties the crypto data plane to the on-disk client library:
- import_asset: encrypt(STREAM) -> signed create manifest -> provenance
  chain -> signed SidecarV1 -> verify_asset(Accept) gate; writes plaintext +
  .cbor sidecar + .provenance.cbor.
- tag_add/set_caption: CRDT edits emitting metadata-update provenance records.
- soft_delete(retention)/restore: delete + trash-restore lifecycle records.
- export_backup/import_backup: portable artifact round-trip; ciphertext
  regenerated deterministically from the manifest nonce prefix (plaintext is
  the local source of truth).
- stream: encrypt_asset_with_prefix for deterministic ciphertext regeneration.
- E2E test: account->album->import->verify->edits->delete/restore->backup->
  restore to fresh library->byte-equal; untrusted exporter refused.
Drives capsule_core::lifecycle to run the whole offline flow with real
crypto and narrated output: account+keys -> album+authority -> import
(encrypt/sign/provenance/sidecar/verify_asset) -> CRDT edits -> soft-delete
+restore -> backup export -> restore into a fresh library -> byte-equal PASS
-> untrusted-exporter refusal -> Shamir 2-of-3. Writes inspectable on-disk
artifacts in the design's exact layout.
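
The Shamir 2-of-3 step at the end of the demo can be illustrated with byte-wise sharing over GF(2^8). This is a teaching sketch, not the PR's escrow code: the field (AES polynomial 0x11b), the share layout, and the function names are assumptions, and the random coefficients are passed in by the caller, who would draw them from a CSPRNG.

```rust
/// Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11b).
pub fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut p = 0u8;
    for _ in 0..8 {
        if b & 1 != 0 {
            p ^= a;
        }
        let carry = a & 0x80;
        a <<= 1;
        if carry != 0 {
            a ^= 0x1b;
        }
        b >>= 1;
    }
    p
}

/// Multiplicative inverse as a^254 (square-and-multiply); gf_inv(0) is 0.
pub fn gf_inv(a: u8) -> u8 {
    let (mut result, mut base, mut e) = (1u8, a, 254u8);
    while e > 0 {
        if e & 1 == 1 {
            result = gf_mul(result, base);
        }
        base = gf_mul(base, base);
        e >>= 1;
    }
    result
}

/// A share is the evaluation point x and one output byte per secret byte.
pub type Share = (u8, Vec<u8>);

/// Split with threshold 2: each byte s uses the line f(x) = s + r*x,
/// evaluated at x = 1, 2, 3. `coeffs` must be fresh random bytes.
pub fn split_2_of_3(secret: &[u8], coeffs: &[u8]) -> [Share; 3] {
    assert_eq!(secret.len(), coeffs.len(), "one random coefficient per secret byte");
    let share = |x: u8| -> Share {
        let ys = secret.iter().zip(coeffs).map(|(s, r)| *s ^ gf_mul(*r, x)).collect();
        (x, ys)
    };
    [share(1), share(2), share(3)]
}

/// Recover the secret from any two distinct shares by interpolating at 0:
/// s = (y1*x2 + y2*x1) / (x1 + x2), where + is XOR in this field.
pub fn combine(a: &Share, b: &Share) -> Option<Vec<u8>> {
    let (x1, x2) = (a.0, b.0);
    if x1 == 0 || x2 == 0 || x1 == x2 || a.1.len() != b.1.len() {
        return None;
    }
    let inv = gf_inv(x1 ^ x2);
    let secret = a
        .1
        .iter()
        .zip(&b.1)
        .map(|(y1, y2)| gf_mul(gf_mul(*y1, x2) ^ gf_mul(*y2, x1), inv))
        .collect();
    Some(secret)
}
```

With a threshold of two, a single share reveals nothing about the secret as long as each coefficient byte is uniformly random and never reused.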
@cloudflare-workers-and-pages

Deploying capsule with Cloudflare Pages

Latest commit: 6724004
Status: ✅  Deploy successful!
Preview URL: https://a24f0136.capsule-22k.pages.dev
Branch Preview URL: https://feat-core-data-plane.capsule-22k.pages.dev

@justin13888 justin13888 merged commit a979103 into master Jun 1, 2026
2 checks passed
@justin13888 justin13888 deleted the feat-core-data-plane branch June 1, 2026 20:40