
feat(metrics): just-in-time (commit-level) VCS risk scoring #331

@dekobon

Follow-up to #328.

Add a just-in-time (commit-level) risk scoring mode that scores a specific commit (or PR diff) for defect-induction risk, rather than scoring each file in HEAD.

Why

The just-in-time (JIT) defect prediction literature (Kamei et al.; the systematic survey in ACM Computing Surveys, 2022) is mature and consistently finds that commit-level prediction at check-in time is well suited to CI gates. The survey's key finding for tooling: JIT models lose predictive power after one year and need to be retrained on recent data, so a static rule-based scorer (no ML) is the most maintainable starting point.

Scope

  • New CLI mode: bca vcs jit <commit-spec> (or --diff <file> for an arbitrary diff).
  • Compute per-commit features:
    • Size: lines added, lines deleted, files touched, hunks
    • Diffusion: subsystems / top-level directories touched
    • Author experience: prior commits in repo, prior commits in touched files (reusing the vcs-git walker)
    • History: prior bug-fix and security-fix commit counts on touched files
    • Touched files' priors: composite v1 risk_score of each touched file
  • Composite JIT score (versioned, separate from the file-level risk_score).
  • Output: JSON with per-feature contributions and overall score.
  • Optional CI integration: bca vcs jit --fail-over <threshold> exits non-zero.
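
To make the scope above concrete, here is a minimal Python sketch of the size and diffusion features, a rule-based composite with per-feature contributions, and the `--fail-over` gate. It is an illustration under stated assumptions, not the proposed implementation: the names `DiffFeatures`, `extract_features`, `jit_report`, and `exit_code`, the weights, the `log1p` scaling, and the `score_version` string are all hypothetical. Author experience, history, and file priors are left out because they need repository history rather than a diff.

```python
import math
import re
from dataclasses import asdict, dataclass

# Unified-diff hunk header: "@@ -start[,count] +start[,count] @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")

# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "lines_added": 0.30,
    "lines_deleted": 0.15,
    "files_touched": 0.25,
    "hunks": 0.10,
    "subsystems": 0.20,
}


@dataclass
class DiffFeatures:
    lines_added: int = 0
    lines_deleted: int = 0
    files_touched: int = 0
    hunks: int = 0
    subsystems: int = 0  # distinct top-level directories


def extract_features(diff_text: str) -> DiffFeatures:
    """Count size and diffusion features from a unified diff."""
    feats = DiffFeatures()
    paths = set()
    old_left = new_left = 0  # body lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                feats.lines_added += 1
                new_left -= 1
            elif line.startswith("-"):
                feats.lines_deleted += 1
                old_left -= 1
            elif not line.startswith("\\"):  # skip "\ No newline at end of file"
                old_left -= 1  # context line counts on both sides
                new_left -= 1
            continue
        match = HUNK_RE.match(line)
        if match:
            feats.hunks += 1
            # An omitted count in the hunk header means 1.
            old_left = int(match.group(1)) if match.group(1) is not None else 1
            new_left = int(match.group(2)) if match.group(2) is not None else 1
        elif line.startswith(("--- ", "+++ ")):
            path = line[4:].split("\t")[0]
            if path != "/dev/null":
                if path[:2] in ("a/", "b/"):
                    path = path[2:]
                paths.add(path)  # simplification: a rename counts as two files
    feats.files_touched = len(paths)
    feats.subsystems = len({p.split("/")[0] if "/" in p else "." for p in paths})
    return feats


def jit_report(feats: DiffFeatures) -> dict:
    """Combine features into a score, keeping each feature's contribution."""
    contributions = {
        name: weight * math.log1p(getattr(feats, name))
        for name, weight in WEIGHTS.items()
    }
    return {
        "score_version": "jit-sketch-0",  # versioned separately from risk_score
        "features": asdict(feats),
        "contributions": contributions,
        "score": sum(contributions.values()),
    }


def exit_code(score: float, fail_over=None) -> int:
    """Return 1 when a threshold is set and the score is strictly above it."""
    return 1 if fail_over is not None and score > fail_over else 0
```

Tracking the line counts from each hunk header, instead of only checking prefixes, keeps changed lines whose content begins with `--` or `++` from being mistaken for file headers. Whether the gate should trip on "greater than" or "greater than or equal to" the threshold is an open choice; the sketch assumes strictly greater.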

Edge cases

  • Merge commits: classify by parent count, optionally score per merged side branch.
  • New files: file priors are zero; rely on author experience and diff size.
  • Reverts: detect and either suppress or annotate.
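
The merge and revert cases could be triaged before scoring along these lines. This is a sketch, assuming the caller already has the commit's parent count and message; the function name `classify_commit` and the returned keys are hypothetical. Revert detection relies on the message that `git revert` generates by default (`Revert "<subject>"` plus a `This reverts commit <sha>.` body), so a revert with a hand-edited message would be missed.

```python
import re

# Body line written by `git revert` by default.
REVERT_BODY_RE = re.compile(r"This reverts commit ([0-9a-f]{7,64})")


def classify_commit(parent_count: int, message: str) -> dict:
    """Classify a commit by parent count and flag likely reverts."""
    if parent_count == 0:
        kind = "root"
    elif parent_count == 1:
        kind = "regular"
    elif parent_count == 2:
        kind = "merge"
    else:
        kind = "octopus-merge"

    match = REVERT_BODY_RE.search(message)
    is_revert = match is not None or message.startswith('Revert "')
    return {
        "kind": kind,
        "is_revert": is_revert,
        # None when the message does not name the reverted commit.
        "reverted_sha": match.group(1) if match else None,
    }
```

A scorer could then annotate the report with `is_revert` or skip scoring entirely, which matches the "suppress or annotate" choice left open above.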

Out of scope

  • ML-based JIT (LR / DBN / neural). Static rules first; ML deferred.
  • Server-side hook integration.

Acceptance criteria

  • bca vcs jit <commit> returns a stable JSON shape with feature breakdown and overall score.
  • Documented citations to Kamei et al. and at least one open replication study.
  • Unit tests on synthetic commits covering size, diffusion, and file-prior contributions.
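
One way a test could pin the "stable JSON shape" criterion is sketched below. The key names (`score_version`, `features`, `contributions`, `score`) are assumptions for illustration, not a shape this issue has fixed, and `shape_errors` and `dumps_stable` are hypothetical helpers.

```python
import json

# Hypothetical top-level keys and their expected types.
REQUIRED = {
    "score_version": str,
    "features": dict,
    "contributions": dict,
    "score": (int, float),
}


def shape_errors(report: dict) -> list:
    """Return a list of shape violations; an empty list means the shape holds."""
    errors = []
    for key, expected in REQUIRED.items():
        if key not in report:
            errors.append(f"missing key: {key}")
        elif not isinstance(report[key], expected):
            errors.append(f"wrong type for key: {key}")
    for key in sorted(set(report) - set(REQUIRED)):
        errors.append(f"unexpected key: {key}")
    features = report.get("features")
    contributions = report.get("contributions")
    if isinstance(features, dict) and isinstance(contributions, dict):
        # Assumes every feature reports exactly one contribution.
        if set(features) != set(contributions):
            errors.append("features and contributions list different names")
    return errors


def dumps_stable(report: dict) -> str:
    """Serialize with sorted keys so equal reports produce identical text."""
    return json.dumps(report, sort_keys=True, separators=(",", ":"))
```

Sorting keys at serialization time keeps the output byte-for-byte comparable across runs, which makes snapshot-style tests on synthetic commits straightforward.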


Labels: enhancement (New feature or request)
