Add a new family of code metrics derived from version control history, as a peer to the existing AST-derived metrics. The goal is to surface files that are most likely to contain vulnerabilities OR bugs, using the signals the empirical literature most consistently backs.
Naming principle. vcs is a generic abstraction; git is the v1 backend. Future backends (Mercurial, Jujutsu, Pijul) are plausible and must not require renaming. Generic types live under the vcs namespace; git-specific code lives under vcs::git. Cargo features mirror this split (umbrella vcs = ["vcs-git"], exactly like the existing all-languages umbrella).
Motivation and evidence: the synthesis in vulnerability-correlation.md (in the repo root), plus the broader defect-prediction literature. Concrete published effect sizes include Firefox NumChanges PD 86 / PF 23, RHEL4 ≥9-developer files reported as ~16× more likely to harbor a vulnerability, Windows Vista edit-frequency ρ ≈ 0.29, and Hassan's change entropy reaching Pearson 0.54 with file-level defects on Apache projects. Nagappan & Ball's relative-churn measures form the basis of Tornhill's well-known complexity × churn "hotspot" model.
A working shell-script prototype lives at git-history-risk-rank.sh; this issue replaces it with a first-class, tested, cross-platform Rust implementation and broadens the signal set.
This is the first metric family in the project that is language-agnostic and not AST-derived, so it also establishes the architectural pattern for any future non-AST signals.
Scope (v1)
v1 ships one backend (vcs-git), behind the umbrella vcs Cargo feature. Adding a second backend later is a separate issue and does not require renaming or moving v1 code.
A single history walk produces these per-file signals over two configurable time windows (defaults: 12 months "long", 90 days "recent"):
| Field | Type | Description | Primary literature support |
|---|---|---|---|
| commits_long | u32 | Distinct commits touching the file in the long window | Firefox NumChanges (PD 86); Vista edit-freq ρ ≈ 0.29 |
| age_days | | Days since the file's first commit (capped at window) | Chromium "new features" risk |
| last_modified_days | u32 | Days since the file's most recent commit | Operational filter / staleness |
| risk_score | f64 | Composite, formula-versioned (see below) | Literature-derived; non-cardinal |
| risk_score_version | u32 | Increments any time the formula changes | Forward-compatibility |
| hotspot_score | Option<f64> | complexity_index × churn_recent, present only when AST metrics are also computed | Nagappan & Ball; Tornhill |
| vcs_schema_version | u32 | Output shape version | Forward-compatibility |
The composite score uses log-scaling on every count, weights recent churn and recent commits highest, multiplies the author factor by the ownership-dilution factor (1 - ownership_top_share), treats file size as a tiny tie-breaker, and applies categorical multiplicative bumps for the RHEL4 6-developer (1.15×) and 9-developer (1.35×) thresholds plus a new-file bump (1.15× when age_days < recent_window_days). Bug-fix and security-fix commit counts feed in via a log-scaled additive term with double weight on security fixes. The exact formula is below and is also documented in src/vcs/score.rs with citations; risk_score_version lets it evolve without breaking downstream consumers.
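As a rough illustration of the shape described above, the sketch below combines log-scaled counts, the ownership-dilution factor, and the multiplicative bumps. It is not the v1 formula: the weight constants, the `Signals` struct, the `file_size_lines` field, and the function name are all placeholders invented for this example. Only the 1.15×/1.35× bumps, the new-file condition, and the double weight on security fixes come from the text. The sketch also assumes the developer-count bump applies only the highest threshold met, and that the author factor uses `authors_long`.

```rust
/// Hypothetical per-file inputs. Field names follow the issue's output shape,
/// except `file_size_lines`, which is invented for this sketch.
#[derive(Clone, Copy, Debug)]
pub struct Signals {
    pub commits_recent: u32,
    pub churn_recent: u64,
    pub commits_long: u32,
    pub churn_long: u64,
    pub authors_long: u32,
    pub ownership_top_share: f64,
    pub bug_fix_commits: u32,
    pub security_fix_commits: u32,
    pub age_days: u32,
    pub file_size_lines: u64,
}

// Placeholder weights: assumptions for illustration, not the v1 values.
const W_CHURN_RECENT: f64 = 3.0;
const W_COMMITS_RECENT: f64 = 3.0;
const W_CHURN_LONG: f64 = 1.0;
const W_COMMITS_LONG: f64 = 1.0;
const W_AUTHORS: f64 = 2.0;
const W_FIXES: f64 = 1.5;
const W_SIZE: f64 = 0.01;

/// ln(1 + x): zero counts contribute nothing and large counts are damped.
fn log_scale(x: f64) -> f64 {
    x.ln_1p()
}

pub fn risk_score_sketch(s: &Signals, recent_window_days: u32) -> f64 {
    // Author factor multiplied by ownership dilution (1 - ownership_top_share).
    let dilution = (1.0 - s.ownership_top_share).clamp(0.0, 1.0);
    let authors = log_scale(f64::from(s.authors_long)) * dilution;
    // Security fixes count double inside the log-scaled fix term.
    let fixes = log_scale(
        f64::from(s.bug_fix_commits) + 2.0 * f64::from(s.security_fix_commits),
    );
    let base = W_CHURN_RECENT * log_scale(s.churn_recent as f64)
        + W_COMMITS_RECENT * log_scale(f64::from(s.commits_recent))
        + W_CHURN_LONG * log_scale(s.churn_long as f64)
        + W_COMMITS_LONG * log_scale(f64::from(s.commits_long))
        + W_AUTHORS * authors
        + W_FIXES * fixes
        + W_SIZE * log_scale(s.file_size_lines as f64);

    // Developer-count bump; assumed to apply only the highest threshold met.
    let mut bump = if s.authors_long >= 9 {
        1.35
    } else if s.authors_long >= 6 {
        1.15
    } else {
        1.0
    };
    // New-file bump when the file is younger than the recent window.
    if s.age_days < recent_window_days {
        bump *= 1.15;
    }
    base * bump
}
```

Because the result is only meaningful for ranking, callers would compare scores across files in the same run rather than read any single value as an absolute risk.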
An alternative percentile-based score is available via --risk-formula percentile: each signal is re-ranked to its percentile within the analyzed set, then averaged. The literature explicitly recommends relative/percentile triggers over hard thresholds for cross-project robustness.
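A minimal sketch of the percentile variant follows. It assumes one particular percentile convention (the share of values at or below the given value), equal weighting of all signals, and a column-per-signal input layout; none of these details are specified in the issue, and the function names are hypothetical.

```rust
use std::cmp::Ordering;

/// Percentile of each value within `values`, defined here as the share of
/// values that are less than or equal to it (one convention among several).
pub fn percentile_ranks(values: &[f64]) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let n = sorted.len() as f64;
    values
        .iter()
        .map(|v| {
            // Count of entries <= v, found by binary search on the sorted copy.
            let at_or_below =
                sorted.partition_point(|x| x.total_cmp(v) != Ordering::Greater);
            at_or_below as f64 / n
        })
        .collect()
}

/// `signals[s][f]` is the value of signal `s` for file `f`.
/// Returns one averaged percentile per file, or `None` if there are no
/// signals or the columns have different lengths.
pub fn percentile_scores(signals: &[Vec<f64>]) -> Option<Vec<f64>> {
    let files = signals.first()?.len();
    if signals.iter().any(|column| column.len() != files) {
        return None;
    }
    let mut sums = vec![0.0_f64; files];
    for column in signals {
        for (sum, rank) in sums.iter_mut().zip(percentile_ranks(column)) {
            *sum += rank;
        }
    }
    let count = signals.len() as f64;
    Some(sums.into_iter().map(|s| s / count).collect())
}
```

Since every signal is mapped into the same (0, 1] range before averaging, no single raw count can dominate the score, which is the cross-project robustness argument made above.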
Explicitly out of scope (filed as follow-ups)
Per-function granularity via git blame + AST line spans
Just-in-time (commit-level) risk scoring (Kamei et al.)
Directory- and repo-level bus factor (Avelino DoA)
Full SZZ bug-inducing commit detection (developer-validated SZZ recall remains ≈0.55 even with LLM augmentation; out of scope for a metrics library)
Historical metric trend (time series over N historical points)
Persistent VCS history cache keyed by HEAD SHA
CVE / advisory linkage
Dependency graph integration
Submodule recursion
VCS backends other than git (Mercurial, Jujutsu, Pijul, …)
Architecture
New module tree under src/vcs/:
Generic (always compiled when any backend is enabled): error, options, stats, identity, classify, score, hotspot, and a build_history_index(root, options) entry point.
Backend-specific: src/vcs/git/ (repo, history, identity) — gated by vcs-git.
v1 does not introduce a Backend trait (premature abstraction with one backend). The top-level entry point delegates to the single available backend; the trait is extracted when a second backend lands.
Hierarchical Cargo features (mirrors the existing all-languages umbrella): the umbrella vcs = ["vcs-git"] pulls in the git backend.
build_history_index runs ONCE per invocation (before the AST walk) and produces HashMap<repo-relative-path, FileStats>. Walking history per file would be catastrophic on large repos.
CodeMetrics (in src/spaces.rs) gains pub vcs: Option<vcs::Stats> and a Vcs variant in Metric (mark #[non_exhaustive] if not already).
New bca vcs subcommand mirrors the prototype's ranked-list output. Integration into bca metrics/check/report via --metrics vcs is also wired up. The subcommand is backend-agnostic; it probes the working tree to decide which backend to use.
New POST /vcs endpoint on bca-web; new vcs_metrics(...) on Python bindings; opt-in vcs=True parameter on the existing Python analyze().
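The single-walk design above can be sketched as one fold over an already-decoded commit list. This is an illustration under assumptions, not the proposed implementation: `Commit`, `FileChange`, and `build_index_sketch` are hypothetical names, paths are plain `String` here instead of `bstr::BString`, and the real entry point would read history through the git backend rather than take a slice.

```rust
use std::collections::{HashMap, HashSet};

/// One file touched by a commit, with numstat-style line counts.
pub struct FileChange {
    pub path: String,
    pub added: u64,
    pub deleted: u64,
}

/// A decoded commit; `timestamp` is seconds since the UNIX epoch (UTC).
pub struct Commit {
    pub timestamp: i64,
    pub author: String,
    pub changes: Vec<FileChange>,
}

#[derive(Debug, Default)]
pub struct FileStats {
    pub commits_long: u32,
    pub commits_recent: u32,
    pub churn_long: u64,
    pub churn_recent: u64,
    pub authors_long: HashSet<String>,
    pub authors_recent: HashSet<String>,
}

/// Visit every commit exactly once and accumulate per-file counters.
pub fn build_index_sketch(
    commits: &[Commit],
    now: i64,
    long_secs: i64,
    recent_secs: i64,
) -> HashMap<String, FileStats> {
    let mut index: HashMap<String, FileStats> = HashMap::new();
    for commit in commits {
        // Future-dated commits (clock skew) are clamped to `now`.
        let ts = commit.timestamp.min(now);
        // The long window is inclusive at `now - long_secs`.
        if ts < now - long_secs {
            continue;
        }
        let recent = ts >= now - recent_secs;
        for change in &commit.changes {
            let stats = index.entry(change.path.clone()).or_default();
            let churn = change.added + change.deleted;
            stats.commits_long += 1;
            stats.churn_long += churn;
            stats.authors_long.insert(commit.author.clone());
            if recent {
                stats.commits_recent += 1;
                stats.churn_recent += churn;
                stats.authors_recent.insert(commit.author.clone());
            }
        }
    }
    index
}
```

The AST walk would then look up each analyzed file in the returned map, so the cost of reading history is paid once regardless of how many files are analyzed.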
When the input path is not under a working tree of a supported VCS, bca vcs errors clearly; bca metrics --metrics vcs succeeds with a one-shot warning and omits the vcs field per file.
Edge cases the implementation must handle
.mailmap respected; multiple author emails canonicalized to one identity
Co-authored-by: trailers parsed and counted
Bot identities filtered by default; configurable regex
First-parent history by default; --full-history opts into full DAG
Merge commits skipped by default; --include-merges to include
File rename detection on by default
Shallow clones detected; output flag truncated_shallow_clone: true and a warning
Bare repos and worktrees both supported via gix::open
Submodules NOT recursed into; documented as out-of-scope
Binary files skipped (numstat reports -)
Symlinks skipped
Deleted files skipped by default; --include-deleted opt-in
Untracked / gitignored files: vcs field is None, distinct from a tracked file with zero counts in window
Window units: 12mo, 90d, 2y, 8w, or ISO 8601 P12M
Window inclusive boundary at now - window
Future-dated commits (clock skew) clamped to now()
All time math in UTC; --as-of <RFC3339> for reproducible runs
Author emails never emitted by default; --emit-author-details opts into SHA-256 hashed canonical IDs
All path handling via bstr::BString; UTF-8 conversion only at output boundary with explicit error handling (per AGENTS.md path rules)
No unsafe; no unwrap/expect/panic! in non-test code; all gix errors mapped to typed vcs::Error
Metric enum marked #[non_exhaustive] so future variants don't break consumers
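Several of the time-related cases above (window units, the inclusive boundary, and clamping of future-dated commits) can be sketched with the standard library alone. The function names are hypothetical, and the unit conversions are assumptions: a year is taken as 365 days and `n` months as `n * 365 / 12` days, chosen so that `12mo` yields 365. Only single-component ISO 8601 durations such as `P12M` are handled.

```rust
/// Parse a window such as "12mo", "90d", "2y", "8w", or "P12M" into days.
/// Returns `None` for anything this sketch does not understand.
pub fn parse_window_days(s: &str) -> Option<u32> {
    let s = s.trim();
    let (digits, unit) = if let Some(rest) = s.strip_prefix('P') {
        // Minimal ISO 8601 form: P<n>D, P<n>W, P<n>M, P<n>Y.
        let split = rest.len().checked_sub(1)?;
        if !rest.is_char_boundary(split) {
            return None;
        }
        let (n, u) = rest.split_at(split);
        let unit = match u {
            "D" => "d",
            "W" => "w",
            "M" => "mo",
            "Y" => "y",
            _ => return None,
        };
        (n, unit)
    } else {
        // Suffix form: digits followed by a unit.
        let idx = s.find(|c: char| !c.is_ascii_digit())?;
        s.split_at(idx)
    };
    let n: u32 = digits.parse().ok()?;
    match unit {
        "d" => Some(n),
        "w" => n.checked_mul(7),
        "mo" => n.checked_mul(365).map(|days| days / 12),
        "y" => n.checked_mul(365),
        _ => None,
    }
}

/// True when a commit falls inside the window ending at `now`.
/// Timestamps are seconds since the UNIX epoch, in UTC.
pub fn in_window(commit_ts: i64, now: i64, window_days: u32) -> bool {
    // Future-dated commits (clock skew) are clamped to `now`.
    let ts = commit_ts.min(now);
    // The boundary at `now - window` is inclusive.
    let start = now - i64::from(window_days) * 86_400;
    ts >= start
}
```

Passing `now` in explicitly, instead of reading the wall clock inside these helpers, is what makes an `--as-of` style reproducible run straightforward.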
Composite risk-score formula (v1)
Log-scaled weighted sum, plus categorical multiplicative bumps:
Adding fields to CodeMetrics is backwards-compatible (serde makes additive changes safe; #253 confirms this). The Metric enum gains one variant — confirm #[non_exhaustive] so future additions are non-breaking.
Test strategy
Central helper tests/common/vcs_fixture.rs builds deterministic temp git repos via gix with fixed author identities and UNIX timestamps.
Per-signal unit tests assert exact integer counts against known fixtures: empty repo, single commit, two-author file, bot-excluded vs included, mailmap-canonicalized author, Co-authored-by, renamed file (with and without --no-follow-renames), exact window boundary (inclusive at now - window), keyword classification (bug/security/revert positive + false-positive avoidance).
Score-property tests use comparative assertions, not exact floats: high-churn beats low-churn, diluted-ownership beats concentrated, high author-count beats low, new-and-busy beats old-and-quiet.
Integration tests under tests/:
bca vcs --paths <fixture-repo> → anchored JSON snapshot (per .snapshot-anchor-baseline.txt rules: assert_eq! on integer fields above each insta::assert_json_snapshot!).
bca vcs outside a git repo → non-zero exit with clear error.
bca metrics --metrics vcs outside a git repo → succeeds with warning and omitted vcs field.
Defensive-refactor verification (per .claude/rules/testing.md): any tightening predicate gets a git checkout HEAD~1 revert test to prove it would fail against the pre-refactor code.
cargo build / cargo test --workspace --all-features / cargo clippy ... -D warnings all pass with and without the vcs feature.
make pre-commit (full validation gate) clean before submission.
Mutation testing of src/vcs/ added to the quarterly cron in a separate, follow-up PR (out-of-band, not v1).
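To make the exact-count style of the per-signal tests concrete, here is a small sketch of author counting with Co-authored-by trailers and bot exclusion. It makes several simplifying assumptions: identities are keyed by lower-cased email rather than resolved through .mailmap, bots are matched by exact author name against the default exclusion list instead of a configurable regex, and `CommitMeta`, `co_authors`, and `distinct_authors` are hypothetical names.

```rust
use std::collections::HashSet;

// Default bot names from the CLI surface; matched by exact name in this sketch.
const DEFAULT_BOTS: [&str; 6] = [
    "dependabot[bot]",
    "renovate[bot]",
    "github-actions[bot]",
    "pre-commit-ci[bot]",
    "mergify[bot]",
    "pyup-bot",
];

pub struct CommitMeta<'a> {
    pub author_name: &'a str,
    pub author_email: &'a str,
    pub message: &'a str,
}

/// Extract (name, lower-cased email) pairs from `Co-authored-by:` trailers.
pub fn co_authors(message: &str) -> Vec<(String, String)> {
    message
        .lines()
        .filter_map(|line| {
            let (key, value) = line.trim().split_once(':')?;
            if !key.eq_ignore_ascii_case("Co-authored-by") {
                return None;
            }
            // Expected value shape: `Name <email>`.
            let (name, rest) = value.trim().split_once('<')?;
            let email = rest.strip_suffix('>')?;
            Some((name.trim().to_string(), email.trim().to_ascii_lowercase()))
        })
        .collect()
}

/// Count distinct identities across authors and co-authors.
pub fn distinct_authors(commits: &[CommitMeta<'_>], exclude_bots: bool) -> usize {
    let mut seen: HashSet<String> = HashSet::new();
    for commit in commits {
        let mut people = vec![(
            commit.author_name.to_string(),
            commit.author_email.to_ascii_lowercase(),
        )];
        people.extend(co_authors(commit.message));
        for (name, email) in people {
            if exclude_bots && DEFAULT_BOTS.contains(&name.as_str()) {
                continue;
            }
            seen.insert(email);
        }
    }
    seen.len()
}
```

A fixture with two human identities and one bot should then yield exactly 2 with bot exclusion on and 3 with it off, which is the kind of integer assertion the fixtures are meant to pin down.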
Acceptance checklist
gix integrated behind the vcs-git Cargo feature with the explicit feature list above; umbrella vcs = ["vcs-git"] registered at the workspace root
All v1 signals implemented and unit-tested with deterministic synthetic git repos (fixed authors + UNIX timestamps)
bca vcs subcommand reproduces the prototype's ranked-list output, plus configurable windows, history mode, merge handling, bot filtering, identity emission, and --as-of
VCS fields available via bca metrics/check/report when --metrics vcs is selected
POST /vcs endpoint, plus optional repo_path on POST /metrics for AST + VCS in one call
vcs_metrics() Python function, plus opt-in vcs=True on analyze()
metrics/vcs.md mdBook chapter and updates to recipes/rest-api.md
Composite score formula documented in code AND in the book with explicit citations back to vulnerability-correlation.md
Metric enum is #[non_exhaustive]
All edge cases above handled with at least one test each
Defensive-refactor verification (per .claude/rules/testing.md) for any tightening predicates
Anchored snapshots per .snapshot-anchor-baseline.txt rules
make pre-commit clean (full validation gate)
Follow-up issues filed for per-function granularity, change entropy, JIT, bus factor, history trend, persistent cache, and additional VCS backends
git-history-risk-rank.sh either deleted or reduced to a documented historical reference
References
vulnerability-correlation.md (this repo)
Nagappan & Ball, "Use of Relative Code Churn Measures to Predict System Defect Density" (2005)
Hassan, "Predicting Faults Using the Complexity of Code Changes" (2009, change entropy)
Tornhill, "Your Code as a Crime Scene" (hotspots = complexity × churn)
CLI surface
bca vcs flags:
--long-window <DURATION> (default 12mo)
--recent-window <DURATION> (default 90d)
--top <N> (default 50; 0 = all)
--ref <REF> (default HEAD)
--full-history (default: first-parent only)
--include-merges (default: skip merges)
--no-follow-renames (default: follow)
--no-exclude-bots, --bot-pattern <REGEX> (default exclude: dependabot[bot], renovate[bot], github-actions[bot], pre-commit-ci[bot], mergify[bot], pyup-bot)
--as-of <RFC3339> (default: wall clock); for reproducible snapshots
--risk-formula {weighted|percentile} (default: weighted)
--emit-author-details (default: off; opts into SHA-256-hashed canonical author IDs)
--paths / --include / --exclude / --exclude-tests / --no-ignore
CLI/web/py crates list "vcs" in their default features so end-user binaries pick up every backend that ships.
gix feature set: ["max-performance-safe", "blob-diff", "mailmap", "revision", "index"].
Output shape (additive to CodeMetrics)
{
  "name": "src/foo.rs",
  "metrics": {
    "loc": { "...": "..." },
    "cyclomatic": { "...": "..." },
    "vcs": {
      "vcs_schema_version": 1,
      "risk_score_version": 1,
      "long_window_days": 365,
      "recent_window_days": 90,
      "commits_long": 42,
      "commits_recent": 11,
      "churn_long": 2150,
      "churn_recent": 480,
      "authors_long": 7,
      "authors_recent": 3,
      "ownership_top_share": 0.41,
      "burst": 0.26,
      "bug_fix_commits": 9,
      "security_fix_commits": 2,
      "revert_commits": 0,
      "age_days": 540,
      "last_modified_days": 7,
      "risk_score": 187.3,
      "hotspot_score": 423.1
    }
  }
}
Field notes: burst is commits_recent / commits_long, clamped to [0, 1]; revert_commits: ^Revert/rollback.
The composite risk-score formula is documented in src/vcs/score.rs with full citations. Score is ordinal, not cardinal: only relative ranks have meaning.
Integration test for the mixed shape: bca metrics --metrics cyclomatic,vcs --paths <fixture-repo> → mixed-output snapshot.