feat(share): push/pull the indexed graph via an OCI registry #7
Committing the multi-MB graph to git is a poor fit: it needs LFS and bloats history. This adds `synapse push` / `synapse pull` so teams share a prebuilt graph through any OCI registry (GHCR/ECR/ACR/Harbor); teammates pull instead of re-indexing.

The graph is shipped as a single-layer OCI artifact (raw `.lbug` bytes + a small JSON config blob), with git commit / branch / synapse version / blake3 stamped into manifest annotations so identity is verifiable and staleness detectable without downloading the blob.

Isolation: all `oci-client` / `tokio` / `docker_credential` usage lives in the new `src/share.rs`; the rest of the crate stays synchronous (the module exposes sync facades that run a short-lived current-thread runtime internally). Gated by a default-on `share` Cargo feature, so `--no-default-features` drops the networking/TLS stack and the push/pull commands.

Auth: credentials are auto-discovered from the existing `docker login` (`~/.docker/config.json` + OS credential helpers, incl. macOS keychain) via the `docker_credential` crate, so no tokens live in synapse's config. Resolution order (auto): env override (SYNAPSE_REGISTRY_USER/PASS/TOKEN) -> docker creds -> anonymous. Public registries pull with zero setup.

Push is triple-locked so a fresh clone / CI can't push by accident: `push_enabled = true` in config (default false) AND a clean working tree (or `--allow-dirty`) AND interactive type-to-confirm (or `--yes`; non-TTY without `--yes` refuses rather than hangs). Artifacts are tagged by commit (a per-commit short SHA tag plus the moving `latest`).

Pull defaults to the moving tag (`latest`), NOT the local HEAD commit tag, which usually wouldn't exist for a teammate on a different commit and would hard-fail. It verifies the blob's blake3, writes the graph atomically (temp+rename), records a `.synapse/graph/origin.json` provenance sidecar, and warns loudly when the graph's indexed commit differs from local HEAD. `--tag <sha>` pulls the exact graph for a commit.
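The three push gates described above can be sketched as a single check that runs before any network call. This is only an illustration of the stated rules: `PushContext`, `PushRefusal`, `check_push_gates`, and the `confirm` callback are hypothetical names, not identifiers from `src/share.rs` or `main`.

```rust
/// Inputs to the push gates. Field names are illustrative assumptions.
#[derive(Debug, Clone, Copy)]
pub struct PushContext {
    pub push_enabled: bool, // `push_enabled` in config (default false)
    pub tree_clean: bool,   // working tree has no uncommitted changes
    pub allow_dirty: bool,  // --allow-dirty was passed
    pub yes: bool,          // --yes was passed
    pub is_tty: bool,       // stdin is an interactive terminal
}

/// Why a push was refused (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum PushRefusal {
    Disabled,
    DirtyTree,
    NonInteractive,
    NotConfirmed,
}

/// Applies the gates in order: config flag, clean tree, confirmation.
/// `confirm` is only invoked when an interactive prompt is actually needed.
pub fn check_push_gates(
    ctx: PushContext,
    confirm: impl FnOnce() -> bool,
) -> Result<(), PushRefusal> {
    if !ctx.push_enabled {
        return Err(PushRefusal::Disabled);
    }
    if !ctx.tree_clean && !ctx.allow_dirty {
        return Err(PushRefusal::DirtyTree);
    }
    if ctx.yes {
        return Ok(());
    }
    // Without --yes and without a TTY, refuse instead of blocking on a prompt.
    if !ctx.is_tty {
        return Err(PushRefusal::NonInteractive);
    }
    if confirm() {
        Ok(())
    } else {
        Err(PushRefusal::NotConfirmed)
    }
}
```

Passing the confirmation step as a closure keeps the gate logic testable without a real terminal, which is one plausible way to cover the guard behaviour in CLI tests.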
`status` surfaces the pulled graph's origin commit + staleness on every run (human + --json `origin`/`originStale`); `index` removes the now-stale provenance sidecar.

`init` now appends the graph dir to an existing root `.gitignore` (idempotent; leaves `synapse.toml` committable, and never creates a `.gitignore` where none existed) so the binary graph isn't committed.

Verified end-to-end against a real Azure Container Registry: push (auth auto-discovered from the macOS keychain, no config secrets) -> pull into a fresh clone -> byte-identical graph, queryable, with the staleness warning firing on a differing HEAD. Full suite (91 tests), fmt and clippy green; the lean `--no-default-features` build still compiles.

Bumps version to 0.2.0.

New: src/share.rs.
Touched: config (ShareConfig), cli (Push/Pull), main (cmd_push/cmd_pull, status origin, index sidecar cleanup, init gitignore), git (full_commit), errors (share variants), Cargo.toml (share feature + deps), README/LLM.md.
Tests: share-helper units, push/pull guard CLI tests, gitignore test, ladybug link_edges already covered.
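The `init` gitignore behaviour described above (append only to an existing file, idempotently) could look roughly like the following. The function name `ignore_graph_dir` and the example entry `.synapse/graph/` are assumptions for illustration; the real graph path comes from the config.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Appends `entry` to `<repo_root>/.gitignore` if that file already exists
/// and does not list the entry yet. Returns Ok(true) only when it appended.
pub fn ignore_graph_dir(repo_root: &Path, entry: &str) -> io::Result<bool> {
    let path = repo_root.join(".gitignore");

    // Never create a .gitignore where none existed.
    if !path.is_file() {
        return Ok(false);
    }

    // Idempotent: do nothing when the entry is already present.
    let existing = fs::read_to_string(&path)?;
    if existing.lines().any(|line| line.trim() == entry) {
        return Ok(false);
    }

    let mut file = OpenOptions::new().append(true).open(&path)?;
    // Keep the new entry on its own line even if the file lacks a trailing newline.
    if !existing.is_empty() && !existing.ends_with('\n') {
        writeln!(file)?;
    }
    writeln!(file, "{}", entry)?;
    Ok(true)
}
```

Returning a boolean lets the caller decide whether to print a notice, and running it twice leaves the file unchanged the second time.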
Review Summary by Qodo
Add OCI registry push/pull for sharing indexed graphs with safety guards
Walkthroughs
Description
• Add synapse push/pull commands to share indexed graphs via OCI registries
  - Eliminates need to commit multi-MB binary to git; teammates pull instead of re-indexing
  - Supports any OCI registry (GHCR, ECR, ACR, Harbor); credentials auto-discovered from docker login
• Implement triple-locked push safety gates
  - Requires push_enabled = true in config (default false), clean working tree, and interactive confirmation
  - Prevents accidental pushes from fresh clones or CI environments
• Add graph staleness detection and provenance tracking
  - Pull verifies blake3 integrity; warns when graph commit differs from local HEAD
  - status command surfaces pulled graph origin and staleness; index clears stale markers
• Isolate async/networking code in new src/share.rs module
  - All OCI, tokio, and docker_credential usage confined to single module; rest of crate stays synchronous
  - Gated by default-on share Cargo feature; --no-default-features drops networking/TLS stack
• Enhance init to gitignore graph working state
  - Appends graph directory to existing .gitignore (idempotent); keeps synapse.toml committable
Diagram
flowchart LR
A["synapse push"] -->|"triple-locked<br/>guards"| B["OCI Registry"]
C["synapse pull"] -->|"verify blake3<br/>check staleness"| B
B -->|"manifest annotations<br/>commit/version/blake3"| D["Shared Graph"]
E["docker login"] -->|"auto-discover<br/>credentials"| B
F["synapse status"] -->|"display origin<br/>& staleness"| G["Provenance<br/>sidecar"]
Code Review by Qodo
1. Commit compare can panic
pub fn compare_commit(graph_commit: Option<&str>, head_commit: Option<&str>) -> GraphFreshness {
    match (graph_commit, head_commit) {
        (Some(g), Some(h)) if !g.is_empty() && !h.is_empty() => {
            let n = g.len().min(h.len());
            if g[..n].eq_ignore_ascii_case(&h[..n]) {
                GraphFreshness::Match
            } else {
                GraphFreshness::Mismatch {
                    graph_commit: g.to_string(),
                    head_commit: h.to_string(),
                }
            }
        }
        _ => GraphFreshness::Unknown,
    }
}
1. Commit compare can panic 🐞 Bug ☼ Reliability
share::compare_commit slices strings with g[..n]/h[..n], which panics when n does not fall on a UTF-8 character boundary, as can happen when the registry-provided commit annotation contains multi-byte characters. Because the commit string is taken verbatim from OCI manifest annotations, a malformed/malicious artifact can crash synapse pull (denial-of-service).
Agent Prompt
### Issue description
`compare_commit()` slices `&str` values using byte indices (`g[..n]`), which can panic if `n` is not on a UTF-8 character boundary. The `graph_commit` value comes from untrusted OCI manifest annotations, so a crafted/garbled artifact can crash the process during `synapse pull`.
### Issue Context
- Commit strings are read directly from manifest annotations and later compared against local HEAD.
### Fix Focus Areas
- src/share.rs[131-146]
### Suggested fix
- Avoid `&str` slicing by byte offsets.
- Compare using bytes:
- `let gb = g.as_bytes(); let hb = h.as_bytes(); let n = gb.len().min(hb.len()); if gb[..n].eq_ignore_ascii_case(&hb[..n]) { ... }`
- (Optional hardening) Validate that commit strings are ASCII hex (and cap length) before comparing/storing.
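The suggested fix can be sketched as follows. This is a minimal, self-contained version under stated assumptions: `GraphFreshness` is re-declared here with the variants visible in the diff, and the `is_hex_commit` helper with its 64-character cap is a hypothetical take on the optional hardening, not code from the PR. Note that with the hardening in place, a non-hex annotation yields `Unknown` rather than `Mismatch`.

```rust
/// Re-declared for the sketch; variants mirror those used in the diff.
#[derive(Debug, PartialEq, Eq)]
pub enum GraphFreshness {
    Match,
    Mismatch { graph_commit: String, head_commit: String },
    Unknown,
}

/// Hypothetical hardening: accept only non-empty ASCII hex, capped at
/// 64 characters (the length of a SHA-256 object id in hex).
fn is_hex_commit(s: &str) -> bool {
    !s.is_empty() && s.len() <= 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

pub fn compare_commit(graph_commit: Option<&str>, head_commit: Option<&str>) -> GraphFreshness {
    match (graph_commit, head_commit) {
        (Some(g), Some(h)) if is_hex_commit(g) && is_hex_commit(h) => {
            // Compare the common prefix as bytes: slicing `&[u8]` cannot
            // panic on a UTF-8 boundary the way slicing `&str` can.
            let (gb, hb) = (g.as_bytes(), h.as_bytes());
            let n = gb.len().min(hb.len());
            if gb[..n].eq_ignore_ascii_case(&hb[..n]) {
                GraphFreshness::Match
            } else {
                GraphFreshness::Mismatch {
                    graph_commit: g.to_string(),
                    head_commit: h.to_string(),
                }
            }
        }
        // Missing, empty, or non-hex input: freshness cannot be determined.
        _ => GraphFreshness::Unknown,
    }
}
```

For example, `compare_commit(Some("abc123"), Some("ABC123def0"))` treats the short SHA as a prefix of the full one, while an annotation such as `"é1"` is rejected up front instead of reaching the slice.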
// Atomic write: temp file + rename, so an interrupted pull never leaves a
// half-written graph in place.
let graph_dir = repo.root.join(&config.graph.path);
std::fs::create_dir_all(&graph_dir)
    .with_context(|| format!("creating {}", graph_dir.display()))?;
let final_path = graph_dir.join("synapse.lbug");
if final_path.exists() {
    eprintln!(
        "warning: overwriting existing local graph at {}",
        final_path.display()
    );
}
let tmp_path = graph_dir.join("synapse.lbug.tmp");
std::fs::write(&tmp_path, &pulled.bytes)
    .with_context(|| format!("writing {}", tmp_path.display()))?;
std::fs::rename(&tmp_path, &final_path)
    .with_context(|| format!("replacing {}", final_path.display()))?;
2. Pull overwrite may fail 🐞 Bug ≡ Correctness
cmd_pull warns it will overwrite an existing synapse.lbug but uses std::fs::rename(tmp, final) without removing the destination first. On platforms where rename cannot replace an existing file (notably Windows), synapse pull will fail whenever the graph already exists.
Agent Prompt
### Issue description
`cmd_pull` implements an "atomic write" via temp file + `std::fs::rename`, but it does not ensure the destination does not exist. This breaks the intended overwrite behavior on platforms where `rename` fails if the destination exists.
### Issue Context
The code explicitly logs that it is overwriting an existing graph, so overwrite is intended behavior.
### Fix Focus Areas
- src/main.rs[733-749]
### Suggested fix
- Before `rename`, remove the destination if it exists:
- `if final_path.exists() { std::fs::remove_file(&final_path)?; }`
- Then `std::fs::rename(&tmp_path, &final_path)?;`
- Keep the temp+rename pattern for crash-safety.
- Consider also cleaning up a leftover temp file on failure (best-effort).
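A minimal sketch of the temp+rename pattern with the best-effort temp cleanup mentioned in the last bullet. The helper name `replace_file_atomically` is hypothetical. One caveat worth checking before adopting the pre-removal step: the Rust standard library documents `std::fs::rename` as replacing the destination file if it already exists, so this sketch relies on `rename` alone and keeps the swap a single step; removing the destination first would open a window in which no graph file exists.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `bytes` to a sibling temp file, then renames it over `final_path`.
/// On any failure the temp file is removed on a best-effort basis.
pub fn replace_file_atomically(final_path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Keep the temp file in the same directory so the rename never has to
    // cross a filesystem boundary.
    let mut tmp_name = final_path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = final_path.with_file_name(tmp_name);

    let result = fs::write(&tmp_path, bytes).and_then(|()| fs::rename(&tmp_path, final_path));

    if result.is_err() {
        // Best effort: ignore errors while cleaning up the leftover temp file.
        let _ = fs::remove_file(&tmp_path);
    }
    result
}
```

Calling it twice on the same path exercises the overwrite case, which is the scenario the review comment is concerned about.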
Why
Committing the multi-MB graph (`synapse.lbug`) to git is a poor fit: it needs LFS and bloats history. This adds `synapse push` / `synapse pull` so teams share a prebuilt graph through any OCI registry (GHCR/ECR/ACR/Harbor): a maintainer or CI pushes the graph, teammates pull it instead of re-indexing.

How it works
The graph ships as a single-layer OCI artifact (raw `.lbug` bytes + a tiny JSON config blob). Git commit / branch / synapse version / blake3 are stamped into manifest annotations, so identity is verifiable and staleness detectable without downloading the blob.

Isolation. All `oci-client` / `tokio` / `docker_credential` usage lives in the new `src/share.rs`; the rest of the crate stays synchronous (sync facades run a short-lived current-thread runtime internally, mirroring how all lbug specifics live in `ladybug_store.rs`). Gated by a default-on `share` Cargo feature; `--no-default-features` drops the networking/TLS stack and the commands.

Auth: uses your existing `docker login`
Credentials are auto-discovered from `~/.docker/config.json` + OS credential helpers (incl. macOS keychain) via the `docker_credential` crate. No tokens in synapse's config. Order (auto): env override (SYNAPSE_REGISTRY_USER/PASS/TOKEN) → docker creds → anonymous. Public registries pull with zero setup.

Push safety (triple-locked)
A fresh clone / CI can't push by accident. Push requires all of:
- `push_enabled = true` in `[share]` config (default false),
- a clean working tree (or `--allow-dirty`),
- interactive type-to-confirm (or `--yes`; non-TTY without `--yes` refuses rather than hangs).

Artifacts are tagged by commit (immutable per-commit short SHA + moving `latest`).

Pull + staleness
`pull` defaults to the moving tag (`latest`), deliberately not the local HEAD commit tag, which usually wouldn't exist for a teammate on a different commit (would hard-fail with `manifest unknown`). It verifies the blob's blake3, writes atomically (temp+rename), records a `.synapse/graph/origin.json` provenance sidecar, and warns loudly when the graph's commit ≠ local HEAD. `--tag <sha>` pulls the exact graph for a commit. `status` surfaces the pulled origin commit + staleness on every run (human + `--json` `origin`/`originStale`). `index` clears the now-stale sidecar.

init gitignores the graph
`init` appends the graph dir to an existing root `.gitignore` (idempotent; keeps `synapse.toml` committable; never creates a `.gitignore` where none existed), so the binary graph isn't committed.

Verification
End-to-end against a real Azure Container Registry: push (auth auto-discovered from the macOS keychain, zero config secrets) → pull into a fresh clone at a different commit → byte-identical graph, fully queryable (338 symbols, 648 ref edges), with the staleness warning firing on the differing HEAD. Push double-confirm exercised interactively.

Automated: full suite 91 tests (6 share-helper units, 5 push/pull guard CLI tests, 1 gitignore test, ladybug batch test), `cargo fmt --check` + `cargo clippy --all-targets` green, and the lean `--no-default-features` build compiles.

Bumps version to 0.2.0 (new feature).
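The credential resolution order from the Auth section (env override, then docker creds, then anonymous) can be sketched with the lookups injected as closures. Everything here is an assumption made for illustration: the `RegistryAuth` type, the `resolve_auth` function, the expansion of `SYNAPSE_REGISTRY_USER/PASS/TOKEN` into three separate variable names, the token taking precedence over user/pass, and the token being used as a bearer credential. The real lookup goes through the `docker_credential` crate, which is stubbed out here.

```rust
/// Hypothetical credential type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum RegistryAuth {
    Anonymous,
    Basic { user: String, pass: String },
    Bearer(String),
}

/// Resolves credentials in the documented order.
/// `env` reads an environment variable; `docker_lookup` stands in for the
/// docker config / OS credential helper discovery.
pub fn resolve_auth(
    env: impl Fn(&str) -> Option<String>,
    docker_lookup: impl FnOnce() -> Option<RegistryAuth>,
) -> RegistryAuth {
    // Treat empty variables as unset.
    let non_empty = |key: &str| env(key).filter(|v| !v.is_empty());

    // 1. Environment override (assumed: token wins over user/pass).
    if let Some(token) = non_empty("SYNAPSE_REGISTRY_TOKEN") {
        return RegistryAuth::Bearer(token);
    }
    if let (Some(user), Some(pass)) = (
        non_empty("SYNAPSE_REGISTRY_USER"),
        non_empty("SYNAPSE_REGISTRY_PASS"),
    ) {
        return RegistryAuth::Basic { user, pass };
    }

    // 2. Credentials from an existing `docker login`, 3. else anonymous.
    docker_lookup().unwrap_or(RegistryAuth::Anonymous)
}
```

In real use the first argument would be something like `|k| std::env::var(k).ok()`; injecting it keeps the ordering logic testable without touching the process environment.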