big-code-analysis is a hard fork of the rust-code-analysis project. This project is an unapologetic vibe-coded fork that seeks to add as many features and functions as fast as possible.
Nonetheless, it is still a Rust library to analyze and extract information from source code written in many different programming languages. It is built on tree-sitter, a parser generator tool and incremental parsing library.
A command line tool called bca is provided to interact with the API of the library in an easy way.
This tool can be used to:
- Call big-code-analysis API
- Print nodes and metrics information
- Export metrics in different formats
- Generate a Markdown or HTML quality-metrics report (bca report markdown / bca report html)
In addition, we provide a bca-web tool to use the library through a REST API.
bca runs against its own source on every push to main and publishes
the result alongside the documentation:
- HTML hotspot report: https://dekobon.github.io/big-code-analysis/reports/index.html
- Markdown PR/MR comment: https://dekobon.github.io/big-code-analysis/reports/report.md
The wiring lives in
.github/workflows/pages.yml. For
downstream projects, the
CI integration recipe
is the canonical adoption guide — it documents the recommended
pinned-release install path (with BCA_VERSION + sha256 pin) plus
a cargo install alternative. The in-tree pages.yml workflow
builds bca from the current checkout because main may carry CLI
artifact schemas that no released bca supports yet — see the
schema-compatibility note in the recipe before copying that pattern.
big-code-analysis supports many programming languages and computes a wide variety of metrics. You can find up-to-date documentation at Documentation.
On the Commands page, there is a list of commands that can be run to get information about metrics, nodes, and other general data provided by this software.
big-code-analysis is published on crates.io and can be embedded
directly. The crate is on the 1.x line and ships under a written
stability contract: the public API surface is held stable across
patch and minor bumps, and breaking shape changes are reserved for
the next major bump. Metric values may still drift across minor
bumps when a grammar pin moves or a metric definition is fixed —
see STABILITY.md for the full versioning contract,
MSRV policy, escape hatches, and exactly what we do and do not
promise within 1.x.
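As a rough illustration of what the contract means for a dependency requirement: Cargo treats a bare requirement such as "1.1.0" as a caret requirement, so any later 1.x release satisfies it and 2.0.0 does not. The sketch below re-implements that rule for plain MAJOR.MINOR.PATCH versions with a major of 1 or higher. It ignores pre-release tags and the special 0.x rules, and `parse` and `caret_matches` are hypothetical helper names, not part of Cargo or of this crate.

```rust
/// Parse "MAJOR.MINOR.PATCH" into a tuple; returns None on malformed input.
fn parse(version: &str) -> Option<(u64, u64, u64)> {
    let mut parts = version.split('.').map(|p| p.parse::<u64>().ok());
    let triple = (parts.next()??, parts.next()??, parts.next()??);
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some(triple)
}

/// Simplified caret matching, e.g. requirement "1.1.0" means >=1.1.0, <2.0.0.
/// Returns None for malformed input or a 0.x requirement (out of scope here).
fn caret_matches(requirement: &str, candidate: &str) -> Option<bool> {
    let req = parse(requirement)?;
    let cand = parse(candidate)?;
    if req.0 == 0 {
        return None;
    }
    // Same major, and (minor, patch) at least as new as the requirement.
    Some(cand.0 == req.0 && (cand.1, cand.2) >= (req.1, req.2))
}

fn main() {
    for candidate in ["1.0.9", "1.1.0", "1.4.2", "2.0.0"] {
        println!(
            "requirement 1.1.0 vs {}: {:?}",
            candidate,
            caret_matches("1.1.0", candidate)
        );
    }
}
```

Under the stated contract, every version this sketch accepts should keep the same public API shape, although metric values may still differ between minor releases.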
For task-oriented walkthroughs — quick start, in-memory analysis,
walking FuncSpace results, and error handling — see the
Using as a Library
section of the book.
Python bindings (PyO3) live in
big-code-analysis-py/ and ship
the same metric pipeline as a Python package. See the book's
Python Bindings
section for the install matrix, batch / async / SARIF recipes, and
the full error taxonomy.
Every tree-sitter grammar is gated behind a per-language Cargo
feature. The default feature set is all-languages, so a bare
big-code-analysis = "1.1.0"
pulls every grammar in (matching the library's historical behaviour
and what the bca / bca-web binaries ship). Library consumers that
only need a subset of languages can opt out of the defaults and
re-enable just the grammars they want:
big-code-analysis = { version = "1.1.0", default-features = false, features = ["rust", "typescript"] }
Supported language features: bash, cpp, csharp, elixir,
go, groovy, irules, java, javascript, kotlin, lua,
mozjs, perl, php, python, ruby, rust, tcl,
typescript. The irules feature adds F5 iRules (a Tcl
dialect; extensions .irule / .irules). The
cpp feature covers the Cpp, Ccomment, and Preproc LANG
variants and pulls in bca-tree-sitter-mozcpp,
bca-tree-sitter-ccomment, and bca-tree-sitter-preproc together
(published forks of the matching Mozilla grammars — see the publish
strategy notes in RELEASING.md).
The LANG enum keeps every variant defined regardless of the active
feature set; selecting a [LANG] variant whose feature is off
returns Err(MetricsError::LanguageDisabled(LANG)) from every
dispatch entry point (analyze, metrics_from_tree, action,
get_ops, the deprecated get_function_spaces* shims, and
LANG::get_tree_sitter_language). The set of compiled-in variants
is queryable via LANG::is_enabled.
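The behaviour described above can be modelled with a small self-contained sketch. The `Lang`, `MetricsError`, and `analyze` definitions below are local stand-ins written for illustration, not the crate's actual types or signatures. The real crate decides which variants are enabled at compile time through Cargo features, whereas this sketch assumes a hard-coded list so that it runs without any dependencies.

```rust
use std::fmt;

/// Illustrative stand-in for the crate's `LANG` enum (only three variants).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Lang {
    Rust,
    Typescript,
    Python,
}

/// Illustrative stand-in for the crate's error type.
#[derive(Debug, PartialEq, Eq)]
enum MetricsError {
    LanguageDisabled(Lang),
}

impl fmt::Display for MetricsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MetricsError::LanguageDisabled(lang) => {
                write!(f, "language {:?} is not compiled in", lang)
            }
        }
    }
}

impl std::error::Error for MetricsError {}

/// Assumption: pretend the build used `features = ["rust", "typescript"]`.
const ENABLED: &[Lang] = &[Lang::Rust, Lang::Typescript];

impl Lang {
    /// Every variant exists; this reports whether its grammar is available.
    fn is_enabled(self) -> bool {
        ENABLED.contains(&self)
    }
}

/// Hypothetical dispatch entry point: counts lines for enabled languages
/// and returns an error for disabled ones instead of failing to compile.
fn analyze(lang: Lang, source: &str) -> Result<usize, MetricsError> {
    if !lang.is_enabled() {
        return Err(MetricsError::LanguageDisabled(lang));
    }
    Ok(source.lines().count())
}

fn main() {
    for lang in [Lang::Rust, Lang::Typescript, Lang::Python] {
        match analyze(lang, "fn main() {}\n") {
            Ok(lines) => println!("{:?}: {} line(s)", lang, lines),
            Err(err) => println!("{:?}: skipped ({})", lang, err),
        }
    }
}
```

Keeping every variant defined means calling code compiles against any feature set, and a caller that iterates over languages can check `is_enabled` first or simply handle the error at run time.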
The repository ships a Makefile that wraps every common build, test,
lint, and docs task. Run make help for the full list, and
make check-tools to verify the optional tools are installed.
make build # debug build of the entire workspace
make build-release # optimised release build
If you prefer to run cargo directly, or want to build a single crate:
cargo build # library only
cargo build -p big-code-analysis-cli # CLI only
cargo build -p big-code-analysis-web # web server only
cargo build --workspace # everything in one shot
make test # cargo test --workspace --all-features --lib --bins --tests
make test-doc # cargo test --workspace --all-features --doc
make pre-commit # full local gate: fmt-check, clippy, tests, udeps, lint families
make pre-commit is the recommended gate before committing; it is
equivalent to what CI runs. If GNU Make 4 or any of the optional
tools are unavailable, the raw cargo invocation still works:
cargo test --workspace --all-features --verbose
We use insta for snapshot tests. To update the snapshots, install cargo insta and run:
make insta-review # cargo insta test --review
This runs the tests, generates the new snapshot references, and lets you review them.
Have a look at the Update grammars guide to learn how to update language grammars.
If you want to contribute to the development of this software, have a look at the guidelines contained in our Developers Guide.
- Mozilla-defined grammars are released under the MIT license.
- big-code-analysis, big-code-analysis-cli and big-code-analysis-web are released under the Mozilla Public License v2.0.