You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Last audited: 2026-03-14
Audit scope: All 16 source files (src/*.rs) + Cargo.toml — full manual review
Executive Summary
symtrace is a local-only, offline, privacy-respecting CLI tool.
It has no network capabilities, collects no telemetry, uses no unsafe
Rust code, and processes data only as explicitly requested by the user.
Property
Status
Network access
None — no HTTP, TCP, or DNS crates
Telemetry / analytics
None — zero tracking or phoning home
Data exfiltration
None — all output goes to stdout/stderr only
Environment variable access
Limited — reads only XDG_CACHE_HOME, LOCALAPPDATA, HOME, USERPROFILE for cache directory resolution
External process spawning
None — no std::process::Command
Unsafe Rust code
Denied — unsafe_code = "deny" enforced in Cargo.toml lints
File writes outside cache
None — writes only to $XDG_CACHE_HOME/symtrace/<repo_hash>/
Isolated — tree-sitter Trees cached in-memory only; no disk serialisation
Repository path validation
Enforced — canonicalized + directory check before git2 access
1. Ethical Data Processing
What data does symtrace access?
symtrace reads only local git repository data that the user explicitly
points it to via CLI arguments:
Git commit trees and blob objects (via libgit2)
Source code file contents (to build ASTs via tree-sitter)
What does symtrace do with the data?
Parses source code into Abstract Syntax Trees (ASTs)
Computes structural hashes (BLAKE3) for diff matching
Outputs a semantic diff report to stdout
What does symtrace NOT do?
Does NOT transmit any data over the network — there are zero
networking dependencies (reqwest, hyper, http, curl, etc. are
all absent from Cargo.toml)
Does NOT collect telemetry — no analytics, usage metrics, crash
reports, or any form of tracking
Does NOT spawn sub-processes — no std::process::Command usage
Does NOT read files outside the specified git repository
Does NOT modify the git repository in any way (read-only access)
Does NOT log sensitive information — no credentials, tokens, or
personal data appear in any output
Cache data
The on-disk AST cache is stored outside the repository tree:
$XDG_CACHE_HOME/symtrace/<blake3(canonical_repo_path)>/
(Windows: %LOCALAPPDATA%/symtrace/<hash>/)
Cache files contain serialised AST representations wrapped in a versioned
envelope. Cache data is:
Stored externally — not inside the repository directory
Never transmitted anywhere
Versioned with a schema tag — mismatched versions are rejected
Integrity-checked against the git blob OID
Bounded to 20 MiB maximum during deserialization
Safe to delete at any time without affecting functionality
2. Security Findings
2.1 Cache Deserialization Without Integrity Verification — MITIGATED
Field
Value
Severity
Low → Resolved
Location
src/ast_cache.rs — get() method
Description
Cache files were deserialized via bincode::deserialize without checksum or version tag verification.
Mitigation applied
Cache files are now wrapped in a CacheEnvelope with a version: u8 tag and blob_oid: String integrity field. Deserialization uses bincode::options().with_limit(20_971_520). On version mismatch or OID mismatch, the stale file is removed and the entry is treated as a cache miss.
bincode v1's deserialize could allocate large amounts of memory if a crafted input specified large Vec or String lengths.
Mitigation applied
All deserialization uses bincode::options().with_limit(20_971_520) (20 MiB cap). Payloads exceeding this limit cause a deserialization error, which is treated as a cache miss and the corrupt file is removed.
If a thread panics while holding a cache mutex (AstCache or TreeCache), the mutex becomes poisoned and subsequent unwrap() calls will panic.
Mitigation applied
All mutex acquisitions in both AstCache and TreeCache now use .lock().unwrap_or_else(|e| e.into_inner()), recovering gracefully from poisoned mutexes instead of panicking.
2.4 Repository Path — MITIGATED
Field
Value
Severity
Informational → Resolved
Location
src/main.rs — path canonicalization
Description
repo_path was accepted as a raw CLI string without canonicalization.
Mitigation applied
The path is now canonicalized via std::fs::canonicalize() before use. On Windows, the \\?\ UNC prefix is stripped. Cache is stored externally under a blake3(canonical_path) hash, preventing path spoofing and repository contamination.
2.5 Silent Parse Failure — Informational
Field
Value
Severity
Informational
Location
src/main.rs — parse_or_cached_with_tree()
Description
When ast_builder::parse_content returns Err, the error is silently discarded and the file is skipped.
Impact
No security impact. May confuse users if files are silently skipped.
2.6 Incremental Parsing TreeCache — Informational
Field
Value
Severity
Informational
Location
src/incremental_parse.rs — TreeCache
Description
The TreeCache stores tree-sitter Tree objects in a Mutex<LruCache<String, Tree>> (capacity 128). These trees are kept only in process memory and are never serialised to disk.
Security properties
(1) No disk persistence — trees exist only during program execution; (2) No deserialization surface — trees are computed via tree-sitter::Parser, not loaded from external storage; (3) Mutex-protected — concurrent access is serialised; (4) Bounded capacity — LRU eviction at 128 entries prevents unbounded memory growth; (5) Edit computation is pure arithmetic — compute_edit() performs common-prefix/suffix byte comparison with no I/O or allocations beyond a single InputEdit struct.
Impact
No security impact. The TreeCache introduces no new file I/O, no new environment variable access, and no new attack surface.
2.7 Cache Directory Permissions — RESOLVED
Field
Value
Severity
Low → Resolved
Location
src/ast_cache.rs — new() method
Description
The cache directory was created with default OS permissions, which on shared systems could allow other users to read cached AST data (containing source code).
Mitigation applied
On Unix systems, the cache directory permissions are now explicitly set to 0o700 (owner-only read/write/execute) immediately after creation. This prevents other users from listing or reading cached files.
The repository path was passed to git2::Repository::open() without verifying it was a directory. Passing a regular file path could produce confusing error messages.
Mitigation applied
After canonicalization, an explicit is_dir() check rejects non-directory paths with a clear error message before any git operations commence.
2.9 CLI Version Hardcoding — RESOLVED
Field
Value
Severity
Informational → Resolved
Location
src/cli.rs — #[command(version)]
Description
The CLI version was hardcoded as "0.1.0" instead of being derived from Cargo.toml, causing version output to be stale.
Mitigation applied
Replaced with env!("CARGO_PKG_VERSION") to always reflect the actual crate version.
3. Dependency Security Assessment
Crate
Version
Risk
Notes
clap
=4.5.60
Low
Widely audited CLI parser; version pinned
git2
=0.19.0
Low
Rust bindings for libgit2 (C library); mature; pinned
tree-sitter
=0.25.10
Low
Parsing framework with C runtime; well-tested; pinned
All dependency versions are exactly pinned (=x.y.z) in Cargo.toml
unsafe_code = "deny" enforced via [lints.rust] in Cargo.toml
cargo-deny configuration (deny.toml) checks advisories, licenses,
bans, and sources
Run cargo deny check and cargo audit locally before publishing
Key observations:
Zero networking crates — no reqwest, hyper, http, curl,
openssl, rustls, or any TLS dependency
Native code surface comes only from git2 (libgit2), tree-sitter
(C runtime), and grammar crates (generated C parsers)
All crate versions are current with no known CVEs at time of audit
4. File System Access Map
Operation
File
Scope
Repository::open()
src/git_layer.rs
Canonicalized repo path (read only)
fs::canonicalize()
src/main.rs
User-specified repo path → canonical
fs::create_dir_all()
src/ast_cache.rs
$XDG_CACHE_HOME/symtrace/<hash>/ only
fs::read()
src/ast_cache.rs
Cache .bin files only (bounded to 20 MiB)
fs::write()
src/ast_cache.rs
Cache .bin files only
fs::read_dir()
src/ast_cache.rs
Cache directory listing (for stats)
fs::remove_file()
src/ast_cache.rs
Stale/corrupt cache files only
std::env::var()
src/main.rs
XDG_CACHE_HOME, LOCALAPPDATA, HOME, USERPROFILE
No directory traversal vulnerabilities — all cache paths are constructed
from git2::Oid hex strings (40-char [0-9a-f] only). Cache is stored
outside the repository tree, preventing accidental commits.
5. Information Leakage Assessment
Channel
Content
Classification
stdout
Diff report (file paths, function names, line numbers, similarity scores)
Expected output
stderr
Cache stats, performance metrics
Metadata only
Disk cache
Serialised ASTs (includes leaf source text)
Stored in user cache dir
JSON output
Repository path, commit hashes
Expected output
No unexpected information leakage. The tool does not output credentials,
environment variables, or user PII.
6. Reporting Vulnerabilities
If you discover a security issue, please report it by opening an issue
or contacting the maintainer directly. Please include:
Run security checks — use cargo audit and cargo deny check
(configuration in deny.toml)
Review cache permissions — on shared systems, verify that
the cache directory is not world-readable
Delete cache when needed — the cache is stored outside the repo
under $XDG_CACHE_HOME/symtrace/; delete it freely
Use on trusted repositories only — tree-sitter parses source code
via C grammars; parser resource guardrails (file size, node count,
recursion depth, and timeout limits) mitigate pathological inputs
Configure resource limits — use --max-file-size,
--max-ast-nodes, --max-recursion-depth, and --parse-timeout-ms
CLI flags to adjust guardrails for your workload
Disable incremental parsing if needed — use --no-incremental
to force full re-parsing of every file; useful for auditing hash
correctness or when memory is constrained