feat: CLI/MCP parity, map command, and concurrency fixes #147
- load_or_index_graph now compares source file mtimes against graph.bin before trusting the cache; a stale cache triggers re-index automatically
- when a bridge is running (detected via .arbor/cache/db), skip the staleness check; the bridge's persister owns freshness
- graph persister acquires an exclusive advisory lock (.arbor/persist.lock) so only one bridge per project writes graph.bin, preventing double-writes when multiple Claude sessions target the same project
- add fs2 dep for cross-platform advisory file locks
- CLAUDE.md: sqz guidance block (auto-installed by sqz init)
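
A minimal sketch of the mtime comparison described above, assuming the check reduces to "is any source file newer than the cache file". The names `is_stale` and `cache_is_stale` are hypothetical and are not the PR's `sources_newer_than` helper, whose signature is not shown here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Returns true when any source mtime is strictly newer than the cache mtime.
/// Kept pure so the comparison can be tested without touching the disk.
fn is_stale(cache_mtime: SystemTime, source_mtimes: &[SystemTime]) -> bool {
    source_mtimes.iter().any(|m| *m > cache_mtime)
}

/// Hypothetical wrapper: reads mtimes from disk and applies `is_stale`.
/// A missing cache file is treated as stale so the caller re-indexes.
fn cache_is_stale(cache: &Path, sources: &[PathBuf]) -> io::Result<bool> {
    let cache_mtime = match fs::metadata(cache) {
        Ok(meta) => meta.modified()?,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(true),
        Err(e) => return Err(e),
    };
    let mut mtimes = Vec::with_capacity(sources.len());
    for src in sources {
        mtimes.push(fs::metadata(src)?.modified()?);
    }
    Ok(is_stale(cache_mtime, &mtimes))
}
```

A caller would skip this check entirely while a bridge is running, since the commit message says the bridge's persister owns freshness in that case.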
Arbor no longer writes an .arbor/ cache into a project that was never indexed unless the user opts in. Running a read command (map, query, callers, etc.) against an un-indexed project now fails fast with guidance instead of silently mutating the repo, which matters when an agent reads code in a separate project.
- split ensure_arbor_initialized into init_arbor_dir (unconditional, used by explicit init/index/setup) and a gated wrapper for implicit commands
- gate load_or_index_graph too, since read commands call it directly without ensure_arbor_initialized
- auto_index resolution: ARBOR_AUTO_INDEX env, then ~/.arbor/config.json "auto_index", else false. Lives globally since an un-indexed project has no project config to read
- already-indexed projects need no flag (.arbor/ presence is the opt-in)
- diff integration tests run on fresh temp repos: set ARBOR_AUTO_INDEX=1
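
The resolution order above (env, then global config, else false) can be sketched as a small pure function. Which env values count as "on" is an assumption: the commit only shows `ARBOR_AUTO_INDEX=1`, so this sketch accepts `1` and `true` and treats any other set value as off. `resolve_auto_index` is a hypothetical name.

```rust
/// Resolves the auto-index setting: environment first, then global config, else false.
/// `env_value` stands in for the ARBOR_AUTO_INDEX variable; `config_value` stands in
/// for the "auto_index" key of ~/.arbor/config.json, already parsed by the caller.
fn resolve_auto_index(env_value: Option<&str>, config_value: Option<bool>) -> bool {
    if let Some(raw) = env_value {
        // Assumption: "1" and "true" enable; any other set value disables
        // and still takes precedence over the config file.
        let v = raw.trim();
        return v == "1" || v.eq_ignore_ascii_case("true");
    }
    config_value.unwrap_or(false)
}
```

In a real caller the first argument would typically come from `std::env::var("ARBOR_AUTO_INDEX").ok().as_deref()`.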
new `arbor hook <harness> [path] [--global]` command installs arbor agent directives + hooks into a coding-agent harness. claude is the lone impl; Harness trait + dispatch leave room for opencode/codex/etc.
claude harness:
- CLAUDE.md: upserts a marker-delimited arbor guidance block (root CLAUDE.md preferred, then .claude/, else creates root). re-run replaces block in place, never duplicates.
- .claude/settings.json: adds the 3 Bash hooks (auto-init, block rg/recursive-grep/find, daily map inject) + arbor command permissions.allow. dedups by exact match, preserves user content.
- --global targets ~/.claude/. hooks, permissions, and guidance match the reference install byte-for-byte.
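
A minimal sketch of a marker-delimited upsert like the one described for CLAUDE.md: replace the block in place if the markers exist, otherwise append. The marker strings and the function name `upsert_block` are hypothetical; the installer's real delimiters are not shown in this PR text.

```rust
// Hypothetical markers, chosen only for this illustration.
const BEGIN: &str = "<!-- arbor:begin -->";
const END: &str = "<!-- arbor:end -->";

/// Inserts `block` between the markers, replacing a previous block if one exists.
fn upsert_block(doc: &str, block: &str) -> String {
    let wrapped = format!("{}\n{}\n{}", BEGIN, block.trim_end(), END);
    if let Some(start) = doc.find(BEGIN) {
        if let Some(rel_end) = doc[start..].find(END) {
            let end = start + rel_end + END.len();
            // Replace the old block in place, keeping the text on both sides.
            return format!("{}{}{}", &doc[..start], wrapped, &doc[end..]);
        }
    }
    // No complete block found: append, separated from existing content by a blank line.
    if doc.trim().is_empty() {
        format!("{}\n", wrapped)
    } else {
        format!("{}\n\n{}\n", doc.trim_end(), wrapped)
    }
}
```

Because the second run finds the markers and rewrites only the span between them, running it twice with the same block yields the same file, which is the "never duplicates" property the commit describes.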
927977b to dfdc874
🔴 Arbor PR Walk
Changed Files
🎯 Production Entry Points Reached
This change propagates to these entry points (HTTP handlers, jobs, CLI commands):
✅ Before You Merge
🔍 Sensitive Path Check — REVIEW REQUIRED
Sensitive call paths
📊 Analysis confidence: High · 1388 nodes · 5522ms
Java captured only the bare method name for `Foo.bar()`, dropping the class qualifier. Static calls then collided with same-named methods — edge dropped, or worse, linked to the wrong class. C# dropped static calls entirely. Both now keep the receiver (`Foo.bar`), which exact-matches the stored FQN in the builder. this./super./base. still strip to bare name for same-class resolution; instance calls keep `obj.method` and fail safe. Matches the behavior Go/C++/Rust/Python already had.
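
The naming rule described above can be sketched as a small function: same-class receivers strip to the bare method name, and every other receiver is kept as a qualifier. The function name `call_name` and its input shape (receiver text plus method name) are assumptions for illustration; the real parsers work on tree-sitter nodes.

```rust
/// Builds the call name recorded for a method invocation.
/// `receiver` is the text before the dot, if any (hypothetical input shape).
fn call_name(receiver: Option<&str>, method: &str) -> String {
    match receiver {
        // Same-class receivers (and unqualified calls) resolve by bare name.
        Some("this") | Some("super") | Some("base") | None => method.to_string(),
        // Type or instance receivers keep the qualifier, e.g. `Foo.bar`.
        Some(r) => format!("{}.{}", r, method),
    }
}
```

With this shape, `Foo.bar()` yields `Foo.bar`, which can exact-match a stored FQN, while `obj.method()` yields `obj.method`, which matches nothing and so produces no false edge.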
47d0e67 to 8da0b77
Pull request overview
This PR expands Arbor’s CLI to reach feature parity with the existing MCP tool surface, adds a token-budgeted `map` view for fast project orientation, introduces a new `arbor hook claude` installer for Claude Code harness integration, and includes several indexing/cache/concurrency improvements plus parser fixes for qualified static calls.
Changes:
- Added new CLI graph query commands (`callers`, `callees`, `entry-points`, `file-graph`, `inspect`, `path`) and enhanced `query` (multi-term OR + `--exclude-test`), plus a new token-budgeted `map` command with centrality persistence.
- Added MCP `get_map` tool and updated server behavior to support orientation workflows.
- Improved correctness and runtime behavior: Java/C# static call parsing fix, incremental indexing updates, cache staleness detection, atomic-ish cache writes, and a single-writer graph persister while a bridge runs.
Reviewed changes
Copilot reviewed 18 out of 20 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| docs/assets/CLAUDE-example.md | Example Claude Code instructions showing Arbor-first navigation workflow and map usage. |
| crates/arbor-watcher/src/lib.rs | Re-exports sources_newer_than for cache staleness checks. |
| crates/arbor-watcher/src/indexer.rs | Adds sources_newer_than helper and unit tests for stale-cache detection. |
| crates/arbor-server/src/sync_server.rs | Recomputes centrality after incremental graph patches/removals to keep map rankings aligned. |
| crates/arbor-mcp/src/lib.rs | Adds get_map tool and tool descriptions; adds “still indexing” guard before serving tools. |
| crates/arbor-gui/Cargo.toml | Updates egui/eframe dependencies to 0.31. |
| crates/arbor-graph/src/graph.rs | Adds rebuild_search_index() to restore non-serialized search index after deserialization. |
| crates/arbor-graph/src/builder.rs | Adds regression test ensuring qualified static calls resolve to the correct target class. |
| crates/arbor-core/src/languages/java.rs | Fixes Java call collection to retain qualifiers for static/type-qualified invocations; adds tests. |
| crates/arbor-core/src/languages/csharp.rs | Fixes C# call collection to retain qualifiers for static/type-qualified invocations. |
| crates/arbor-cli/tests/graph_commands_integration.rs | Adds broad integration coverage for new CLI graph commands and map. |
| crates/arbor-cli/tests/diff_command_integration.rs | Opts tests into auto-indexing via ARBOR_AUTO_INDEX=1. |
| crates/arbor-cli/src/main.rs | Adds new CLI subcommands and extends query with --exclude-test; wires dispatch. |
| crates/arbor-cli/src/hook/mod.rs | Introduces arbor hook <harness> scaffold (trait + dispatch). |
| crates/arbor-cli/src/hook/claude.rs | Implements Claude harness installer (CLAUDE.md upsert + settings.json hooks + permissions). |
| crates/arbor-cli/src/commands.rs | Major: auto-index gate, staleness checks, cache read/write changes, new commands, map implementation, background indexing/persister. |
| crates/arbor-cli/Cargo.toml | Adds fs2 dependency for advisory file locking. |
| CLAUDE.md | Updates docs for new MCP tool tiering and expanded CLI command reference + agent integration details. |
| Cargo.lock | Lockfile updates from dependency bumps (notably egui/eframe ecosystem). |
| .gitignore | Ignores .updates/ directory. |

    let tmp_path = graph_path.with_extension("json.tmp");
    let file = std::fs::File::create(&tmp_path)?;
    let writer = std::io::BufWriter::new(file);
    serde_json::to_writer_pretty(writer, graph)?;
    fs::rename(&tmp_path, &graph_path)?;


    + let tmp_path = graph_path.with_extension("bin.tmp");
      let bytes = bincode::serialize(graph)?;
    - fs::write(graph_path, bytes)?;
    + fs::write(&tmp_path, bytes)?;
    + fs::rename(&tmp_path, &graph_path)?;
      Ok(())

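
A self-contained sketch of the write-to-temp-then-rename pattern this change relies on, using only the standard library. `write_atomic` is a hypothetical name, and the `.tmp` suffix is appended to the full file name here, which is an assumption about the layout. It is "atomic-ish" in the PR's sense: the rename swaps the file in one step when both paths are on the same filesystem, but nothing is fsynced, so it does not guarantee durability across a crash.

```rust
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes `bytes` to `<path>.tmp`, then renames that file over `path`.
/// Readers see either the old content or the new content, never a partial write.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut tmp_name: OsString = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    fs::write(&tmp_path, bytes)?;
    // `fs::rename` replaces the destination if it already exists.
    fs::rename(&tmp_path, path)
}
```

If the write to the temp file fails partway, the original file at `path` is left untouched, which is the property the direct `fs::write` lacked.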

    fn parse_numstat_files(output: &str) -> Vec<String> {
        output
            .lines()
            .filter_map(|line| {
                let parts: Vec<&str> = line.split('\t').collect();
                if parts.len() < 3 {
                    return None;
                }
                let path = if parts[2].contains(" => ") || !parts[2].is_empty() {
                    parts[2]
                } else {
                    parts.get(2).copied().unwrap_or("")
                };
                let normalized = normalize_slashes(path.trim());
                if normalized.is_empty() {
                    None
                } else {
                    Some(normalized)
                }
            })
            .collect()
    }

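
Both arms of the `if` in the snippet above evaluate to `parts[2]` (the `else` arm only runs when `parts[2]` is empty, and then returns that same empty string), so the function can be sketched more simply with the same behavior. `normalize_slashes` is not shown in this PR text; the version below, which replaces backslashes with forward slashes, is an assumed stand-in.

```rust
/// Assumed stand-in for the project's `normalize_slashes` helper.
fn normalize_slashes(path: &str) -> String {
    path.replace('\\', "/")
}

/// Extracts the path column from `git diff --numstat` output
/// (`added<TAB>deleted<TAB>path`), skipping short or empty rows.
fn parse_numstat_files(output: &str) -> Vec<String> {
    output
        .lines()
        .filter_map(|line| {
            // The third tab-separated field is the path; shorter rows are skipped.
            let path = line.split('\t').nth(2)?;
            let normalized = normalize_slashes(path.trim());
            (!normalized.is_empty()).then_some(normalized)
        })
        .collect()
}
```

Like the original, this passes rename entries (paths containing ` => `) through unchanged; resolving them to the new path would need separate handling.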

    let glob_boost = match focus_glob {
        Some(pattern) if node.file.contains(pattern.trim_matches('*')) => 0.3,
        _ => 0.0,
    };

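
To make the behavior of this expression concrete, here is the same logic lifted into a standalone function (the name `glob_boost` and the plain `&str` file argument are assumptions for illustration). It strips leading and trailing `*` and then does a substring test, so it is not full glob matching: a wildcard in the middle of the pattern is compared literally.

```rust
/// Mirrors the expression above: strips leading/trailing `*`, then substring-matches.
fn glob_boost(file: &str, focus_glob: Option<&str>) -> f64 {
    match focus_glob {
        Some(pattern) if file.contains(pattern.trim_matches('*')) => 0.3,
        _ => 0.0,
    }
}
```

For example, `*commands*` boosts `crates/arbor-cli/src/commands.rs`, but `crates/*/src/*.rs` boosts nothing because the interior `*` characters would have to appear literally in the path.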

    // If the graph is empty, the background index hasn't finished yet
    if self.graph.read().await.node_count() == 0 {
        return Ok(Self::err_envelope(
            name,
            "Arbor is still indexing the project. Please retry in a few seconds.",
        ));
    }


    // Recompute centrality so the code map stays in sync with the
    // patched graph; new nodes start at 0.0 otherwise and the map drifts.
    let scores = compute_centrality(&g, 20, 0.85);
    g.set_centrality(scores.into_map());

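
For context, a minimal power-iteration PageRank sketch. It assumes the two numeric arguments to `compute_centrality` are an iteration count (20) and a damping factor (0.85); that reading fits the PR's mention of PageRank but is not confirmed by the code shown. The function `pagerank` and its index-based graph representation are hypothetical and are not Arbor's implementation.

```rust
/// Power-iteration PageRank over nodes `0..n` with directed `edges` (from, to).
/// Returns one score per node; scores sum to 1.
fn pagerank(n: usize, edges: &[(usize, usize)], iterations: usize, damping: f64) -> Vec<f64> {
    if n == 0 {
        return Vec::new();
    }
    let mut out_degree = vec![0usize; n];
    for &(from, _) in edges {
        out_degree[from] += 1;
    }
    let mut scores = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Rank held by nodes without outgoing edges is spread evenly over all nodes.
        let dangling: f64 = (0..n)
            .filter(|&i| out_degree[i] == 0)
            .map(|i| scores[i])
            .sum();
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for &(from, to) in edges {
            next[to] += damping * scores[from] / out_degree[from] as f64;
        }
        scores = next;
    }
    scores
}
```

Because every score depends on the whole edge set, adding or removing nodes changes the ranking of existing ones too, which is why recomputing after each patch keeps the map consistent.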
I will review this manually in some time. |
Description
This PR brings the CLI to parity with the MCP tool surface, adds a token-budgeted `map` command, includes several concurrency and indexing fixes, and adds a new `arbor hook` command that wires arbor into a coding-agent harness (Claude Code) in one shot. The CLI can now perform most operations previously available only to AI agents through the MCP bridge (graph queries, caller/callee lookups, symbol inspection, and path traversal) directly from the terminal. It also fixes a parser bug where Java and C# static method calls were mapped to the wrong target (or dropped).
Type of Change
Changes Made
- `map` command (`commands.rs`, `main.rs`): Produces a ranked, token-budgeted project skeleton using PageRank and entry-point detection. Supports `--tokens N`, `--focus`, `--focus-changed`, `--json`, and `--verbose`. Centrality scores are persisted to the cache so repeat invocations skip recomputation.
- `callers`, `callees`, `entry-points`, `file-graph`, `inspect`, and `path` commands matching the existing MCP tools. All support `--json`.
- `query "a|b" .` performs OR search with test-file filtering.
- `get_map` tool (`arbor-mcp/src/lib.rs`): Exposes the map output to agents as well.
- `arbor hook claude` (`arbor-cli/src/hook/`): New `arbor hook <harness> [path] [--global]` command installs arbor agent directives + hooks into a coding-agent harness in one shot. Claude is the lone impl; a `Harness` trait + dispatch leave room for opencode/codex/etc.
  - `CLAUDE.md`: upserts a marker-delimited arbor guidance block (root `CLAUDE.md` preferred, then `.claude/`, else creates root). Re-run replaces the block in place and never duplicates it.
  - `.claude/settings.json`: adds the 3 Bash hooks (auto-init, block recursive grep/find, daily map inject) plus arbor command `permissions.allow`. Dedups by exact match, preserves existing user content.
  - `--global` targets `~/.claude/`. Hooks, permissions, and guidance match the documented reference install byte-for-byte.
- Java/C# static-call fix (`arbor-core/src/languages/java.rs`, `csharp.rs`): Qualified static calls like `MathUtils.add()` were captured by the Java parser as the bare method name `add`, dropping the class qualifier. The builder then resolved that bare name by suffix-matching, so the edge was either dropped (ambiguous name) or linked to the wrong class (a same-named method in a sibling file). C# dropped static calls entirely. Both parsers now keep the receiver (`MathUtils.add`), which exact-matches the stored FQN in the builder and links to the correct class. `this.`/`super.`/`base.` calls still strip to the bare name for same-class resolution, and instance calls (`obj.method`) keep `obj.method` and fail safe (no false edge). This matches the behavior Go/C++/Rust/Python already had.
- Cache and concurrency fixes: atomic-ish cache writes (`.tmp` plus rename), cache staleness check, and sled lock avoidance when a bridge may hold the exclusive lock.
- Incremental indexing (`arbor-watcher/src/indexer.rs`): Re-parses only changed files.
- `auto_index` gate: Indexing of un-indexed projects is now gated behind the `auto_index` config option (default off) to avoid unexpected indexing.
- Docs: updated `CLAUDE.md` and added `docs/assets/CLAUDE-example.md`.
Known Limitations
`collect_calls` looks for a `call_expression` node that the tree-sitter-dart grammar never emits (it uses `member_access` + `selector`), so Dart currently captures zero call edges. This is a pre-existing, separate bug and is left for a follow-up rewrite; it is not addressed here.
Testing
- `cargo test --all`
- `cargo clippy --all`
- `flutter test` (not applicable: no visualizer changes)

Added integration coverage in `graph_commands_integration.rs` (531 lines) for the new query commands. Added regression tests for the static-call fix: parser-level (Java keeps the qualifier; `this.`/`super.` strip to bare name) and a builder end-to-end test proving a qualified static call resolves to the correct class instead of a same-named sibling.
Checklist