Skip to content

feat(share): push/pull the indexed graph via an OCI registry#7

Merged
prom3theu5 merged 1 commit into
mainfrom
feat/oci-share
Jun 2, 2026
Merged

feat(share): push/pull the indexed graph via an OCI registry#7
prom3theu5 merged 1 commit into
mainfrom
feat/oci-share

Conversation

@prom3theu5

Copy link
Copy Markdown
Member

Why

Committing the multi-MB graph (synapse.lbug) to git is poor — needs LFS, bloats history. This adds synapse push / synapse pull so teams share a prebuilt graph through any OCI registry (GHCR/ECR/ACR/Harbor): a maintainer or CI pushes the graph, teammates pull it instead of re-indexing.

How it works

The graph ships as a single-layer OCI artifact (raw .lbug bytes + a tiny JSON config blob). Git commit / branch / synapse version / blake3 are stamped into manifest annotations, so identity is verifiable and staleness detectable without downloading the blob.

Isolation. All oci-client / tokio / docker_credential usage lives in the new src/share.rs; the rest of the crate stays synchronous (sync facades run a short-lived current-thread runtime internally — mirrors how all lbug specifics live in ladybug_store.rs). Gated by a default-on share Cargo feature; --no-default-features drops the networking/TLS stack and the commands.

Auth — uses your existing docker login

Credentials are auto-discovered from ~/.docker/config.json + OS credential helpers (incl. macOS keychain) via the docker_credential crate. No tokens in synapse's config. Order (auto): env override (SYNAPSE_REGISTRY_USER/PASS/TOKEN) → docker creds → anonymous. Public registries pull with zero setup.

Push safety (triple-locked)

A fresh clone / CI can't push by accident. Push requires all of:

  1. push_enabled = true in [share] config (default false),
  2. a clean working tree (or --allow-dirty),
  3. interactive type-to-confirm (or --yes; non-TTY without --yes refuses rather than hangs).

Artifacts are tagged by commit (immutable per-commit short SHA + moving latest).

Pull + staleness

pull defaults to the moving tag (latest) — deliberately not the local HEAD commit tag, which usually wouldn't exist for a teammate on a different commit (would hard-fail manifest unknown). It verifies the blob's blake3, writes atomically (temp+rename), records a .synapse/graph/origin.json provenance sidecar, and warns loudly when the graph's commit ≠ local HEAD. --tag <sha> pulls the exact graph for a commit.

status surfaces the pulled origin commit + staleness on every run (human + --json origin/originStale). index clears the now-stale sidecar.

init gitignores the graph

init appends the graph dir to an existing root .gitignore (idempotent; keeps synapse.toml committable; never creates a .gitignore where none existed), so the binary graph isn't committed.

Verification

End-to-end against a real Azure Container Registry: push (auth auto-discovered from the macOS keychain, zero config secrets) → pull into a fresh clone at a different commit → byte-identical graph, fully queryable (338 symbols, 648 ref edges), with the staleness warning firing on the differing HEAD. Push double-confirm exercised interactively.

Automated: full suite 91 tests (6 share-helper units, 5 push/pull guard CLI tests, 1 gitignore test, ladybug batch test), cargo fmt --check + cargo clippy --all-targets green, and the lean --no-default-features build compiles.

Bumps version to 0.2.0 (new feature).

Note: manual validation pushed throwaway artifacts to prom/synapse-test and prom/rusttest in the ACR — prune those repos/tags if unwanted.

Committing the multi-MB graph to git is poor (needs LFS, bloats history). This
adds `synapse push` / `synapse pull` so teams share a prebuilt graph through any
OCI registry (GHCR/ECR/ACR/Harbor) — teammates pull instead of re-indexing.

The graph is shipped as a single-layer OCI artifact (raw .lbug bytes + a small
JSON config blob), with git commit / branch / synapse version / blake3 stamped
into manifest annotations so identity is verifiable and staleness detectable
without downloading the blob.

Isolation: all `oci-client` / `tokio` / `docker_credential` usage lives in the
new `src/share.rs`; the rest of the crate stays synchronous (the module exposes
sync facades that run a short-lived current-thread runtime internally). Gated by
a default-on `share` Cargo feature, so `--no-default-features` drops the
networking/TLS stack and the push/pull commands.

Auth: credentials are auto-discovered from the existing `docker login`
(`~/.docker/config.json` + OS credential helpers, incl. macOS keychain) via the
`docker_credential` crate — no tokens in synapse's config. Resolution order
(auto): env override (SYNAPSE_REGISTRY_USER/PASS/TOKEN) -> docker creds ->
anonymous. Public registries pull with zero setup.

Push is triple-locked so a fresh clone / CI can't push by accident:
`push_enabled = true` in config (default false) AND a clean working tree (or
--allow-dirty) AND interactive type-to-confirm (or --yes; non-TTY without --yes
refuses rather than hangs). Artifacts are tagged by commit (a per-commit short
SHA tag plus the moving `latest`).

Pull defaults to the moving tag (`latest`) — NOT the local HEAD commit tag,
which usually wouldn't exist for a teammate on a different commit and would hard
-fail. It verifies the blob's blake3, writes the graph atomically (temp+rename),
records a `.synapse/graph/origin.json` provenance sidecar, and warns loudly when
the graph's indexed commit differs from local HEAD. `--tag <sha>` pulls the
exact graph for a commit.

`status` surfaces the pulled graph's origin commit + staleness on every run
(human + --json `origin`/`originStale`); `index` removes the now-stale
provenance sidecar.

`init` now appends the graph dir to an existing root `.gitignore` (idempotent;
leaves `synapse.toml` committable, and never creates a `.gitignore` where none
existed) so the binary graph isn't committed.

Verified end-to-end against a real Azure Container Registry: push (auth
auto-discovered from the macOS keychain, no config secrets) -> pull into a
fresh clone -> byte-identical graph, queryable, with the staleness warning
firing on a differing HEAD. Full suite (91 tests), fmt and clippy green; the
lean `--no-default-features` build still compiles. Bumps version to 0.2.0.

New: src/share.rs. Touched: config (ShareConfig), cli (Push/Pull), main
(cmd_push/cmd_pull, status origin, index sidecar cleanup, init gitignore),
git (full_commit), errors (share variants), Cargo.toml (share feature + deps),
README/LLM.md. Tests: share-helper units, push/pull guard CLI tests, gitignore
test, ladybug link_edges already covered.
@qodo-code-review

Copy link
Copy Markdown

Review Summary by Qodo

Add OCI registry push/pull for sharing indexed graphs with safety guards

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add synapse push/pull commands to share indexed graphs via OCI registries
  - Eliminates need to commit multi-MB binary to git; teammates pull instead of re-indexing
  - Supports any OCI registry (GHCR, ECR, ACR, Harbor); credentials auto-discovered from docker login
• Implement triple-locked push safety gates
  - Requires push_enabled = true in config (default false), clean working tree, and interactive
  confirmation
  - Prevents accidental pushes from fresh clones or CI environments
• Add graph staleness detection and provenance tracking
  - Pull verifies blake3 integrity; warns when graph commit differs from local HEAD
  - status command surfaces pulled graph origin and staleness; index clears stale markers
• Isolate async/networking code in new src/share.rs module
  - All OCI, tokio, and docker_credential usage confined to single module; rest of crate stays
  synchronous
  - Gated by default-on share Cargo feature; --no-default-features drops networking/TLS stack
• Enhance init to gitignore graph working state
  - Appends graph directory to existing .gitignore (idempotent); keeps synapse.toml committable
Diagram
flowchart LR
  A["synapse push"] -->|"triple-locked<br/>guards"| B["OCI Registry"]
  C["synapse pull"] -->|"verify blake3<br/>check staleness"| B
  B -->|"manifest annotations<br/>commit/version/blake3"| D["Shared Graph"]
  E["docker login"] -->|"auto-discover<br/>credentials"| B
  F["synapse status"] -->|"display origin<br/>& staleness"| G["Provenance<br/>sidecar"]

Loading

Grey Divider

File Changes

1. src/share.rs ✨ Enhancement +489/-0

New module for OCI registry graph sharing

src/share.rs


2. src/cli.rs ✨ Enhancement +38/-0

Add Push and Pull command argument structs

src/cli.rs


3. src/config.rs ⚙️ Configuration changes +56/-0

Add ShareConfig section with registry/auth settings

src/config.rs


View more (9)
4. src/errors.rs Error handling +35/-0

Add share-specific error types for guards

src/errors.rs


5. src/git.rs ✨ Enhancement +6/-0

Add full_commit function for canonical identity

src/git.rs


6. src/lib.rs ✨ Enhancement +2/-0

Export share module behind feature gate

src/lib.rs


7. src/main.rs ✨ Enhancement +294/-0

Implement cmd_push and cmd_pull with safety guards

src/main.rs


8. Cargo.toml Dependencies +10/-2

Add share feature with oci-client/tokio/docker_credential

Cargo.toml


9. tests/cli.rs 🧪 Tests +157/-0

Add push/pull guard and gitignore integration tests

tests/cli.rs


10. tests/unit.rs 🧪 Tests +6/-0

Verify ShareConfig defaults in config roundtrip test

tests/unit.rs


11. README.md 📝 Documentation +26/-0

Document push/pull commands and sharing workflow

README.md


12. LLM.md 📝 Documentation +11/-0

Add OCI registry sharing section to quick reference

LLM.md


Grey Divider

Qodo Logo

@qodo-code-review

qodo-code-review Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (2)

Grey Divider


Action required

1. Commit compare can panic 🐞 Bug ☼ Reliability
Description
share::compare_commit slices strings with g[..n]/h[..n], which can panic on non-UTF8-boundary
indices when the registry-provided commit annotation contains multi-byte characters. Because the
commit string is taken verbatim from OCI manifest annotations, a malformed/malicious artifact can
crash synapse pull (denial-of-service).
Code

src/share.rs[R131-146]

Evidence
The commit value is read verbatim from OCI annotations and then compared using byte slicing, which
can panic on non-UTF8-boundary indices.

src/share.rs[82-94]
src/share.rs[131-146]
src/share.rs[313-319]
src/main.rs[768-784]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`compare_commit()` slices `&str` values using byte indices (`g[..n]`), which can panic if `n` is not on a UTF-8 character boundary. The `graph_commit` value comes from untrusted OCI manifest annotations, so a crafted/garbled artifact can crash the process during `synapse pull`.

### Issue Context
- Commit strings are read directly from manifest annotations and later compared against local HEAD.

### Fix Focus Areas
- src/share.rs[131-146]

### Suggested fix
- Avoid `&str` slicing by byte offsets.
- Compare using bytes:
 - `let gb = g.as_bytes(); let hb = h.as_bytes(); let n = gb.len().min(hb.len()); if gb[..n].eq_ignore_ascii_case(&hb[..n]) { ... }`
- (Optional hardening) Validate that commit strings are ASCII hex (and cap length) before comparing/storing.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Pull overwrite may fail 🐞 Bug ≡ Correctness
Description
cmd_pull warns it will overwrite an existing synapse.lbug but uses std::fs::rename(tmp, final)
without removing the destination first. On platforms where rename cannot replace an existing file
(notably Windows), synapse pull will fail whenever the graph already exists.
Code

src/main.rs[R733-749]

Evidence
The code indicates it will overwrite an existing graph but does not remove the existing destination
before rename, so overwrite is not reliably implemented.

src/main.rs[733-750]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`cmd_pull` implements an "atomic write" via temp file + `std::fs::rename`, but it does not ensure the destination does not exist. This breaks the intended overwrite behavior on platforms where `rename` fails if the destination exists.

### Issue Context
The code explicitly logs that it is overwriting an existing graph, so overwrite is intended behavior.

### Fix Focus Areas
- src/main.rs[733-749]

### Suggested fix
- Before `rename`, remove the destination if it exists:
 - `if final_path.exists() { std::fs::remove_file(&final_path)?; }`
 - Then `std::fs::rename(&tmp_path, &final_path)?;`
- Keep the temp+rename pattern for crash-safety.
- Consider also cleaning up a leftover temp file on failure (best-effort).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. PullArgs tag default mismatch 📘 Rule violation ⚙ Maintainability Rule 5
Description
PullArgs/pull --tag documentation claims synapse pull defaults to the HEAD commit tag, but the
implemented tag-resolution logic actually defaults to the configured moving tag (e.g. latest) when
no --tag is provided. This inconsistency can mislead users and future maintainers about the CLI’s
real behavior and cause confusion when pulls fetch latest by default.
Code

src/cli.rs[R62-65]

Evidence
Rule 3 calls for a single consistent convention within a subsystem, but the CLI help text for
PullArgs.tag describes a HEAD-commit-tag default while the code path used by cmd_pull relies on
share::resolve_pull_tag(args.tag.as_deref(), &config.share.moving_tag), which selects the moving
tag when --tag is absent. The share module further documents the moving tag as the intended
default to avoid manifest unknown for teammates on different commits, reinforcing that the
implementation’s behavior is deliberate and contradicts the current help text.

Rule Rule 5: Do Not Blend Conflicting Patterns (Choose One Convention and Apply Consistently)
src/cli.rs[61-65]
src/share.rs[150-166]
src/cli.rs[61-66]
src/main.rs[719-724]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Align the `PullArgs` / `pull --tag` help text with the actual behavior: `synapse pull` currently defaults to the configured moving tag (e.g. `latest`) via `share::resolve_pull_tag`, not the HEAD commit tag. Update documentation/help strings so users and maintainers see one consistent default-tag convention.

## Issue Context
- The `share` module intentionally treats the moving tag as the default to avoid `manifest unknown` for teammates on different commits.
- `cmd_pull` calls `share::resolve_pull_tag(args.tag.as_deref(), &config.share.moving_tag)`, which chooses the moving tag when `--tag` is not provided.
- Suggested wording adjustment: change the doc/help text to something like “Tag to pull (defaults to the configured moving tag, e.g. `latest`).”

## Fix Focus Areas
- src/cli.rs[61-65]
- src/share.rs[150-166]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. ShareConfig.auth precedence mismatch 📘 Rule violation ⚙ Maintainability Rule 5
Description
ShareConfig documentation says auto uses docker creds -> env -> anonymous, but resolve_auth
implements env -> docker -> anonymous. This conflicting precedence definition can cause incorrect
expectations about which credentials will be used.
Code

src/config.rs[R109-113]

Evidence
Rule 3 requires choosing one convention and applying it consistently. The config docs define auto
precedence as docker-first, while the implementation is env-first, creating a conflicting pattern
within the same feature.

Rule Rule 5: Do Not Blend Conflicting Patterns (Choose One Convention and Apply Consistently)
src/config.rs[109-113]
src/share.rs[249-287]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The documentation for `[share].auth = "auto"` describes a different credential resolution order than the actual implementation. This creates contradictory guidance within the share feature.

## Issue Context
`resolve_auth` currently treats environment variables as an override and falls back to docker credentials, then anonymous.

## Fix Focus Areas
- src/config.rs[109-113]
- src/share.rs[249-287]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@prom3theu5 prom3theu5 merged commit f4ee5dd into main Jun 2, 2026
1 check passed
@prom3theu5 prom3theu5 deleted the feat/oci-share branch June 2, 2026 00:03
Comment thread src/share.rs
Comment on lines +131 to +146
pub fn compare_commit(graph_commit: Option<&str>, head_commit: Option<&str>) -> GraphFreshness {
match (graph_commit, head_commit) {
(Some(g), Some(h)) if !g.is_empty() && !h.is_empty() => {
let n = g.len().min(h.len());
if g[..n].eq_ignore_ascii_case(&h[..n]) {
GraphFreshness::Match
} else {
GraphFreshness::Mismatch {
graph_commit: g.to_string(),
head_commit: h.to_string(),
}
}
}
_ => GraphFreshness::Unknown,
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Commit compare can panic 🐞 Bug ☼ Reliability

share::compare_commit slices strings with g[..n]/h[..n], which can panic on non-UTF8-boundary
indices when the registry-provided commit annotation contains multi-byte characters. Because the
commit string is taken verbatim from OCI manifest annotations, a malformed/malicious artifact can
crash synapse pull (denial-of-service).
Agent Prompt
### Issue description
`compare_commit()` slices `&str` values using byte indices (`g[..n]`), which can panic if `n` is not on a UTF-8 character boundary. The `graph_commit` value comes from untrusted OCI manifest annotations, so a crafted/garbled artifact can crash the process during `synapse pull`.

### Issue Context
- Commit strings are read directly from manifest annotations and later compared against local HEAD.

### Fix Focus Areas
- src/share.rs[131-146]

### Suggested fix
- Avoid `&str` slicing by byte offsets.
- Compare using bytes:
  - `let gb = g.as_bytes(); let hb = h.as_bytes(); let n = gb.len().min(hb.len()); if gb[..n].eq_ignore_ascii_case(&hb[..n]) { ... }`
- (Optional hardening) Validate that commit strings are ASCII hex (and cap length) before comparing/storing.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment thread src/main.rs
Comment on lines +733 to +749
// Atomic write: temp file + rename, so an interrupted pull never leaves a
// half-written graph in place.
let graph_dir = repo.root.join(&config.graph.path);
std::fs::create_dir_all(&graph_dir)
.with_context(|| format!("creating {}", graph_dir.display()))?;
let final_path = graph_dir.join("synapse.lbug");
if final_path.exists() {
eprintln!(
"warning: overwriting existing local graph at {}",
final_path.display()
);
}
let tmp_path = graph_dir.join("synapse.lbug.tmp");
std::fs::write(&tmp_path, &pulled.bytes)
.with_context(|| format!("writing {}", tmp_path.display()))?;
std::fs::rename(&tmp_path, &final_path)
.with_context(|| format!("replacing {}", final_path.display()))?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Pull overwrite may fail 🐞 Bug ≡ Correctness

cmd_pull warns it will overwrite an existing synapse.lbug but uses std::fs::rename(tmp, final)
without removing the destination first. On platforms where rename cannot replace an existing file
(notably Windows), synapse pull will fail whenever the graph already exists.
Agent Prompt
### Issue description
`cmd_pull` implements an "atomic write" via temp file + `std::fs::rename`, but it does not ensure the destination does not exist. This breaks the intended overwrite behavior on platforms where `rename` fails if the destination exists.

### Issue Context
The code explicitly logs that it is overwriting an existing graph, so overwrite is intended behavior.

### Fix Focus Areas
- src/main.rs[733-749]

### Suggested fix
- Before `rename`, remove the destination if it exists:
  - `if final_path.exists() { std::fs::remove_file(&final_path)?; }`
  - Then `std::fs::rename(&tmp_path, &final_path)?;`
- Keep the temp+rename pattern for crash-safety.
- Consider also cleaning up a leftover temp file on failure (best-effort).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant