204 changes: 204 additions & 0 deletions solutions/LP-0017.md
@@ -0,0 +1,204 @@
# Solution: LP-0017 — Whistleblower: censorship-resistant document upload and indexing

**Submitted by:** aegonmyy

## Summary

Whistleblower is a complete censorship-resistant document publishing pipeline on
the Logos stack, delivered as **two Logos Basecamp modules** plus a
**permissionless batch-anchor daemon**. A user picks a file; the app uploads it to
**Logos Storage** (obtaining a CID), broadcasts a metadata envelope over **Logos
Delivery** so the document is immediately discoverable, and can optionally anchor
the CID on-chain. Long-term anchoring is decoupled from publication: a standalone
CLI lets any altruistic third party gather broadcast CIDs and commit up to 50 per
transaction to a **LEZ SPEL registry program** — no coordination with the original
publisher, and the anchorer can be an anonymous `Private/` account so the index is
public while the publisher's identity is not. The upload → broadcast → anchor logic
is extracted into a reusable `logos-chronicle` module with a documented API.

## Repository

- **Repo:** https://github.com/aegonmyy/logoz
- **Branch / commit:** `main` @ `365273d`
- **v0.2.0 port branch:** [`port/v0.2.0`](https://github.com/aegonmyy/logoz/tree/port/v0.2.0)
- **Key paths:**
  - `logos-chronicle/` — reusable Logos module: upload, broadcast, anchor, publish pipeline (the extracted document-indexing module)
  - `logos-whistleblower/` — Logos Basecamp view plugin: QML desktop UI (file picker, publish status, history, anchor config)
  - `chronicle-registry/` — SPEL registry program (`chronicle_registry_core` shared types, `methods/guest` RISC0 guest, `ffi/` C-ABI shim, `idl/` IDL)
  - `batch-anchor/` — permissionless Waku-listener → dedup → batch-anchor CLI/daemon
  - `scripts/demo.sh` — reproducible end-to-end demo against a real local sequencer at `RISC0_DEV_MODE=0`
  - `scripts/bench-cu.sh` — CU benchmark harness (fresh-registry, dev-mode-off)
  - `.github/workflows/ci.yml` — CI (fmt / build+test / publish smoke / on-chain anchor e2e)

## Approach

### Pipeline & module extraction

The core `upload → broadcast → anchor` logic lives in **`logos-chronicle`**, a
self-contained Logos module with a documented JSON API (`uploadFileJson`,
`uploadStatusJson`, `publishFileJson`, broadcast + anchor calls). The Whistleblower
Basecamp app (`logos-whistleblower`) is a thin QML view plugin on top of it, so any
other Logos app can reuse the pipeline without depending on the Whistleblower UI —
which is exactly what the prize's "standalone document-indexing module" asks for.

### On-chain registry — LEZ SPEL program (chosen approach)

We chose a **LEZ program via the SPEL framework** over direct zone-SDK consensus
inscription. Justification: the zone SDK path currently requires a single
designated actor to perform consensus inscription (decentralised zone sequencers
are not yet shipped), which reintroduces a trust bottleneck — the exact
centralisation Whistleblower exists to avoid. A LEZ program keeps anchoring
permissionless and verifiable by anyone. The registry stores
`(cid, metadata_hash, anchor_timestamp, anchored_by, version)` per document, is
queryable by CID, and `index_batch` accepts up to **50 CIDs per transaction**
(`MAX_BATCH = 50`, ≥ the required 10). An IDL is provided.
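
The registry behaviour described above can be sketched as a small in-memory model. This is an illustration in Python, not the SPEL guest program: the class and method shapes are hypothetical, and the "first anchor wins" rule and the fixed `version=1` are assumptions made only so that the example is concrete. The per-document fields, the lookup by CID, and the `MAX_BATCH = 50` limit come from the text.

```python
from dataclasses import dataclass

MAX_BATCH = 50  # per-transaction limit stated in the text


@dataclass(frozen=True)
class Record:
    cid: str
    metadata_hash: str
    anchor_timestamp: int
    anchored_by: str
    version: int


class Registry:
    """In-memory stand-in for the on-chain registry (illustrative only)."""

    def __init__(self) -> None:
        self._records: dict[str, Record] = {}

    def index_batch(self, entries, anchored_by: str, now: int) -> int:
        """Register (cid, metadata_hash) pairs; return how many were new."""
        entries = list(entries)
        if len(entries) > MAX_BATCH:
            raise ValueError(f"batch of {len(entries)} exceeds MAX_BATCH={MAX_BATCH}")
        added = 0
        for cid, mhash in entries:
            if cid in self._records:
                continue  # assumption: keep the first anchor and do not fail
            self._records[cid] = Record(cid, mhash, now, anchored_by, version=1)
            added += 1
        return added

    def lookup(self, cid: str):
        """Return the record for `cid`, or None if it is not registered."""
        return self._records.get(cid)
```

In this model, re-submitting a CID that is already registered is a no-op rather than an error, and a batch of 51 entries is rejected before any entry is written.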

### Privacy-preserving anchoring

Because the anchorer can be a `Private/` account, the `index_batch` transaction
routes through the LEZ proving path and the on-chain `anchored_by` is an anonymous
key — the document index stays publicly verifiable while the whistleblower's
identity is not revealed.

### Metadata hash & envelope

`metadata_hash` is `v1:<sha-256-hex>`, computed over the canonical JSON of the
envelope fields with keys sorted alphabetically. It is stored on-chain and embedded
in every Waku envelope (`v`, `cid`, `title`, `description`, `content_type`,
`size_bytes`, `timestamp`, `tags`, `metadata_hash`) on topic
`/chronicle/1/document-index/json`, so any node can verify document integrity
without fetching from Storage.
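
As a rough illustration of the hashing rule above, the following Python sketch recomputes and checks a `metadata_hash`. Several details are assumptions, not taken from the implementation: that `metadata_hash` itself is excluded from the hashed payload, that the canonical form is compact UTF-8 JSON with sorted keys, and all example field values. The byte-exact canonicalisation used by `logos-chronicle` may differ.

```python
import hashlib
import json

# Envelope fields named in the text; `metadata_hash` itself is assumed to be
# excluded from the hashed payload (the source does not say).
HASHED_FIELDS = ("v", "cid", "title", "description", "content_type",
                 "size_bytes", "timestamp", "tags")


def metadata_hash(envelope: dict) -> str:
    """Return `v1:<sha-256-hex>` over key-sorted, compact JSON (assumed encoding)."""
    payload = {k: envelope[k] for k in HASHED_FIELDS}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return "v1:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(envelope: dict) -> bool:
    """Check that the embedded hash matches the recomputed one."""
    return envelope.get("metadata_hash") == metadata_hash(envelope)


envelope = {
    "v": 1,
    "cid": "example-cid",  # placeholder, not a real CID
    "title": "Example",
    "description": "Illustrative envelope",
    "content_type": "application/pdf",
    "size_bytes": 1024,
    "timestamp": 1700000000,
    "tags": ["demo"],
}
envelope["metadata_hash"] = metadata_hash(envelope)
```

Because the keys are sorted before hashing, the order in which a sender builds the envelope does not affect the result, and changing any hashed field makes `verify` return `False`.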

### Why the Logos stack

Whistleblower needs exactly what a centralised alternative cannot provide:
**Logos Storage** stores bytes durably without identifying the uploader; **Logos
Delivery** propagates the CID peer-to-peer so a document is findable the instant
it is published, with no index server to seize or block; and **LEZ** provides
trustless, permissionless on-chain anchoring with first-class private state so
anchoring never doxxes the publisher. In a centralised deployment, the host, the
index, and the payment rail are each a single point of censorship, which is the
whole reason the app exists.

### What was tried and did not work (documented as upstream issues)

- `spel program-id` requires a pre-built R0BF `ProgramBinary`, not a raw ELF, and
the format is undocumented — worked around with `tools/mk_program_binary.rs`
([spel#240](https://github.com/logos-co/spel/issues/240)).
- `lgs localnet start` hardcodes a pre-reorg LEZ config path and cannot start a
sequencer for recent `lez` pins; `scripts/demo.sh` applies a config-path bridge
automatically ([scaffold#230](https://github.com/logos-co/scaffold/issues/230)).
- Funding a fresh localnet account (`auth-transfer init` →
`ClaimedUnauthorizedAccount` for non-genesis accounts) is undocumented
([scaffold#232](https://github.com/logos-co/scaffold/issues/232)).
- Also filed: [spel#241](https://github.com/logos-co/spel/issues/241),
[scaffold#231](https://github.com/logos-co/scaffold/issues/231).

## Success Criteria Checklist

### Functionality

- [x] **Upload** — app uploads a selected file to Logos Storage and obtains a CID (`logos-chronicle` → Codex). Verified by `nix run .#smoke-storage`.
- [x] **Broadcast** — a metadata envelope (`cid`, `title`, `description`, `content_type`, `size_bytes`, `timestamp`, `tags`, plus `metadata_hash`) is published to `/chronicle/1/document-index/json` immediately after upload. Verified by `nix run .#smoke-broadcast`.
- [x] **On-chain anchoring** — an explicit "anchor on-chain" action distinct from the basic upload flow, invocable at any time after upload.
- [x] **Batch anchor tool** — `batch-anchor` subscribes to the Delivery topic, accumulates `(CID, metadata_hash)` tuples, submits them in a single batch tx, is permissionless (no publisher coordination), and is idempotent (re-submitting an already-registered CID does not fail).
- [x] **On-chain registry** — LEZ SPEL program (approach chosen + justified above); stores `(cid, metadata_hash, anchor_timestamp)` (plus `anchored_by`, `version`), queryable by CID, accepts batches up to 50 CIDs/tx.
- [x] **Document-indexing module** — `logos-chronicle`, self-contained with a documented API, reusable independently of the Whistleblower app.
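
The accumulate, dedup, and batch steps of the batch-anchor tool can be sketched as below. This is a hedged Python illustration, not the `batch-anchor` source: `plan_batches` and its parameters are hypothetical names, and the set of already-registered CIDs is assumed to have been obtained from registry lookups beforehand.

```python
from typing import Iterable, Iterator

MAX_BATCH = 50  # per-transaction limit stated in the text


def plan_batches(seen: Iterable[tuple[str, str]],
                 already_registered: set[str],
                 max_batch: int = MAX_BATCH) -> Iterator[list[tuple[str, str]]]:
    """Yield deduplicated batches of (cid, metadata_hash), each at most max_batch long."""
    unique: dict[tuple[str, str], None] = {}
    for cid, mhash in seen:
        if cid in already_registered:
            continue  # skip CIDs the registry already holds
        unique.setdefault((cid, mhash), None)  # dict preserves arrival order
    pending = list(unique)
    for start in range(0, len(pending), max_batch):
        yield pending[start:start + max_batch]
```

Each yielded list would correspond to one `index_batch` transaction. Skipping registered CIDs up front only saves work; it is the registry's idempotence that makes a stray re-submission harmless.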

### Usability

- [x] **Basecamp GUI** — `logos-whistleblower` QML app with local build instructions and a prebuilt `.lgx` asset, loadable in Logos Basecamp.
- [x] **Module as library/SDK** — `logos-chronicle` shipped with a README covering its API and integration steps.
- [x] **IDL for the LEZ program** — provided (SPEL framework).

### Reliability

- [x] **Upload retries** on transient Storage failures with exponential back-off, surfacing a clear error after exhausting retries.
- [x] **Broadcast dedup** — re-broadcasting the same CID does not create duplicate entries for subscribers (dedup by `(CID, metadata_hash)`).
- [x] **Batch-anchor resume** — on start the tool catches up from the Waku store and skips already-registered CIDs, so it resumes after a network interruption without re-processing. Publish and anchor ledgers persist across daemon restarts (verified by `nix run .#smoke-publish`).
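
The upload retry behaviour can be illustrated with a minimal Python sketch. The names `TransientError` and `upload_with_retry`, the attempt count, and the base delay are all hypothetical; the source states only that retries use exponential back-off and end with a clear error.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable Storage failure."""


def upload_with_retry(upload, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `upload()` until it succeeds, doubling the delay after each failure."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return upload()
        except TransientError as err:
            last_error = err
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s, ...
    raise RuntimeError(
        f"upload failed after {max_attempts} attempts: {last_error}"
    ) from last_error
```

Passing `sleep` as a parameter keeps the function testable without real delays. Non-transient exceptions are deliberately not caught, so they surface immediately instead of being retried.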

### Performance

- [x] **CU benchmarks** measured on a real local rc5 sequencer at `RISC0_DEV_MODE=0` (full Groth16), fresh registry per batch via `scripts/bench-cu.sh`:

| Operation | Batch size | CU (R0VM cycles) |
|-----------|-----------|------------------|
| `init_registry` | — | 414 |
| `index_batch` | 1 CID | 2,816 |
| `index_batch` | 50 CIDs (`MAX_BATCH`) | 213,348 |
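
For a quick sense of scale, the table values can be turned into per-CID figures. The snippet below is plain arithmetic on the measured numbers and adds no new measurements; the variable names are illustrative.

```python
# CU figures copied from the table above.
CU_SINGLE = 2_816      # index_batch, 1 CID
CU_BATCH_50 = 213_348  # index_batch, 50 CIDs (MAX_BATCH)

per_cid_in_batch = CU_BATCH_50 / 50   # average CU per CID in a full batch
fifty_times_single = 50 * CU_SINGLE   # 50 x the fresh-registry single-CID figure

print(f"average CU per CID at n=50: {per_cid_in_batch:.2f}")
print(f"50 x single-CID CU:         {fifty_times_single}")
```

The average works out to about 4,267 CU per CID in a full batch, against 2,816 CU for a single-CID call on a fresh registry. The arithmetic alone does not explain the difference, and the second figure is not a measurement of 50 consecutive calls against one registry.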

### Supportability

- [x] **Registry deployed & tested on LEZ testnet** — deployed and exercised end-to-end on the live **Testnet v0.2.0** (`https://testnet.lez.logos.co`, LEZ `v0.2.0` / commit `a58fbce2`) at `RISC0_DEV_MODE=0`. Program `96ad78fe…` and registry PDA `HvCtoPL6…` reproduce the documented IDs exactly. A CID was anchored via a **real Groth16 proof** (`index_batch` tx `02d8781403…`, confirmed on-chain and independently verified via `wallet chain-info transaction`), and `lookup` confirms it, while a bogus CID is reported as not registered. Full evidence + reproduction steps: [`docs/testnet-v020-live-evidence-20260702.md`](https://github.com/aegonmyy/logoz/blob/main/docs/testnet-v020-live-evidence-20260702.md).
- [x] **E2E integration tests in CI** — upload → broadcast → batch anchor run against a LEZ sequencer in standalone mode in CI (`publish` + `anchor` jobs).
- [x] **CI green on default branch** — `main` CI completed successfully on the code baseline (fmt, build+test, publish smoke, on-chain anchor e2e); `365273d` changes only docs (README + testnet evidence), no code or workflow files.
- [x] **README** — covers build steps, deployment addresses, running the Basecamp app, running the batch anchor tool, and querying the registry.
- [x] **Reproducible demo at `RISC0_DEV_MODE=0`** — `scripts/demo.sh` runs the full pipeline against a real local sequencer with real Groth16 proofs; verified end-to-end on a clean run.
- [x] **Narrated video demo** showing terminal output incl. proof generation at `RISC0_DEV_MODE=0` — https://youtu.be/JY-joCR_2ag

> **Testnet note.** On-chain evidence was originally captured on the **rc5** testnet,
> which was wiped and replaced by **Testnet v0.2.0** mid-submission. The hosted
> testnet is **now live again at the `v0.2.0` tag**, and the registry has been
> **re-deployed and re-anchored there** (see the Supportability evidence above): the
> program ID `96ad78fe…` and PDA `HvCtoPL6…` reproduce the documented values exactly,
> and a CID was anchored with a real Groth16 proof (`index_batch` tx `02d8781403…`,
> confirmed on-chain). This was done by talking to the hosted sequencer directly with
> a `v0.2.0`-final wallet plus the `batch-anchor` CLI — the released `lgs` *localnet*
> tooling still lags v0.2.0's on-disk layout ([scaffold#230](https://github.com/logos-co/scaffold/issues/230)), but that only affects local bring-up, not testnet
> interaction. The earlier rc5 records (tx `f14e39c9…`) are no longer queryable on the
> current network but are retained as the original run record; the program is
> reproducible from source (`make build`), and a clean code port is on `port/v0.2.0`.

## FURPS Self-Assessment

### Functionality
Full upload → broadcast → anchor pipeline; explicit optional on-chain anchor;
permissionless idempotent batch-anchor CLI (dedup, Waku store catch-up, resume);
LEZ SPEL registry (`init_registry`, `index_batch` up to 50 CIDs/tx) queryable by
CID; privacy-preserving anchoring via `Private/` anchorer; reusable
`logos-chronicle` module. Enforced limits: 100 MB max file size, envelope size
cap, source filename never reaches Storage (title-derived staging).

### Usability
QML Basecamp app with file picker, publish status, history, and anchor-config
dialog; prebuilt `.lgx` plus local build instructions. `logos-chronicle` exposes a
small JSON API and ships as a reusable module with a README. IDL provided for the
SPEL program. `scripts/setup.sh` one-shot bootstrap and `scripts/run-app.sh` launch.

### Reliability
Exponential-backoff upload retries with a clear terminal error; broadcast dedup by
`(CID, metadata_hash)`; batch-anchor catches up from the Waku store and skips
already-registered CIDs on restart; publish and anchor ledgers persist across
daemon restarts (verified by a restart assertion in `smoke-publish`). 16/16
batch-anchor unit tests pass.

### Performance
`init_registry` = 414 CU; `index_batch` n=1 = 2,816 CU; n=50 = 213,348 CU
(fresh registry, `RISC0_DEV_MODE=0`). CU is the guest execution-cycle delta logged
at execution time, independent of dev-mode. Because the guest borsh-serialises the
whole registry per call, per-call CU grows with stored entries — documented, with a
fresh-registry harness (`bench-cu.sh`) to isolate the batch cost.

### Supportability
CI runs four jobs (fmt; build + 16 tests; `publish` smoke against a live nwaku
node; `anchor` on-chain `index_batch` e2e against a local sequencer) and is green
on `main`. `RISC0_DEV_MODE=0` real-proof runs are shown via `scripts/demo.sh`
locally and in the demo video (dev-mode is used only on the hosted CI runner for
speed; CU is identical either way). Four isolated smoke tests
(`storage`/`broadcast`/`publish`/`anchor`) cover each pipeline stage. Codebase is
split into independently testable crates/modules; five upstream issues filed for
tooling gaps encountered.

## Supporting Materials

- **Narrated demo video (RISC0_DEV_MODE=0):** https://youtu.be/JY-joCR_2ag
- **Reproducible demo script:** [`scripts/demo.sh`](https://github.com/aegonmyy/logoz/blob/main/scripts/demo.sh)
- **CU benchmark harness:** [`scripts/bench-cu.sh`](https://github.com/aegonmyy/logoz/blob/main/scripts/bench-cu.sh)
- **Reusable module:** [`logos-chronicle/`](https://github.com/aegonmyy/logoz/tree/main/logos-chronicle)
- **Batch anchor CLI:** [`batch-anchor/`](https://github.com/aegonmyy/logoz/tree/main/batch-anchor)
- **SPEL registry program + IDL:** [`chronicle-registry/`](https://github.com/aegonmyy/logoz/tree/main/chronicle-registry)
- **v0.2.0 port branch:** [`port/v0.2.0`](https://github.com/aegonmyy/logoz/tree/port/v0.2.0)
- **Upstream issues filed:** spel [#240](https://github.com/logos-co/spel/issues/240) · [#241](https://github.com/logos-co/spel/issues/241) · scaffold [#230](https://github.com/logos-co/scaffold/issues/230) · [#231](https://github.com/logos-co/scaffold/issues/231) · [#232](https://github.com/logos-co/scaffold/issues/232)

## Terms & Conditions

By submitting this solution, I confirm that I have read and agree to the [Terms & Conditions](../TERMS.md).