track(grammar): tree-sitter-cpp blockers behind mozcpp/deepspeech skips (#83) #86

@dekobon

Summary

Track the upstream tree-sitter-cpp blockers that are causing
tree-sitter-mozcpp to skip the test_fn_id_strings test and a set of
DeepSpeech files (see #83). The skips themselves are documented and have
correct FIXME references after #83. This issue exists so we have a single
place to record release / fix status for the seven upstream defects and to
trigger a grammar bump once a fix for one of them ships in a release.

This is a watch-and-bump issue, not a coding task. No code change is
expected unless / until a new tree-sitter-cpp release ships on
crates.io with one of these fixes.

Why we cannot fix this locally

tree-sitter-mozcpp is a thin overlay on top of tree-sitter-cpp. The
overlay (tree-sitter-mozcpp/grammar.js, ~378 lines) only adds
Mozilla-specific macro tokens such as MOZ_NONHEAP_CLASS. Every parse
failure in #83 comes from a structural defect in the underlying
tree-sitter-cpp grammar — binary_expression, preproc_if,
enumerator, etc. We cannot fix those rules in the overlay, and per
AGENTS.md we cannot fork the vendored grammar or repoint
tree-sitter-cpp at a git sha:

External grammar crates are version-pinned (=X.Y.Z) in the root
Cargo.toml. Treat the pinned version as fixed: do not loosen pins
to a range without explicit user approval. Bumping a grammar version
is a deliberate, separate change.

The build pipeline reflects this — generate-grammars/generate-mozcpp.sh
reads the pin from tree-sitter-mozcpp/Cargo.toml and downloads the
matching crate from crates.io. There is no supported path to a git sha.
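
As a rough illustration of what "read the pin, download the matching crate" involves, here is a minimal sketch. It is not the contents of generate-mozcpp.sh: the function names `pinned_version` and `crate_download_url` are hypothetical, and the parser only handles the simple `name = "X.Y.Z"` / `name = "=X.Y.Z"` form. The download endpoint is the public crates.io one.

```shell
# Hypothetical helper: print the version a Cargo.toml pins for a crate.
# Only the plain `name = "X.Y.Z"` or `name = "=X.Y.Z"` form is handled;
# table-style dependencies ({ version = ... }) are out of scope here.
pinned_version() {
  sed -n -E "s/^$2[[:space:]]*=[[:space:]]*\"=?([0-9]+\.[0-9]+\.[0-9]+)\".*/\1/p" "$1" | head -n 1
}

# Hypothetical helper: build the crates.io download URL for an exact version.
crate_download_url() {
  printf 'https://crates.io/api/v1/crates/%s/%s/download\n' "$1" "$2"
}
```

With these defined, `crate_download_url tree-sitter-cpp "$(pinned_version tree-sitter-mozcpp/Cargo.toml tree-sitter-cpp)"` prints a URL that `curl -L` can fetch. Note that the URL is derived purely from a released version number, which is why a git sha has nowhere to go.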

Status of the seven upstream defects

Current pins:

  • root Cargo.toml: tree-sitter-mozcpp = "=0.20.4"
  • tree-sitter-mozcpp/Cargo.toml: tree-sitter-cpp = "0.23.4"
  • crates.io tree-sitter-cpp max_stable_version: 0.23.4 (2024-11-11)

| Affected test(s) | tree-sitter-cpp issue | Status | Released? |
|---|---|---|---|
| test_fn_id_strings | tree-sitter/tree-sitter-cpp#307 (string + macro concat) | OPEN | |
| deepspeech.cc, getopt_win.h, mmap.cc | tree-sitter/tree-sitter-cpp#308 (preprocessor conditionals) | OPEN | |
| deepspeech.h | tree-sitter/tree-sitter-cpp#309 (macro-generated enum values) | OPEN | |
| deepspeech.h | tree-sitter/tree-sitter-cpp#310 (function annotations) | OPEN | |
| fast-dtoa.cc | tree-sitter/tree-sitter-cpp#311 (>= operator) | OPEN | |
| left_test.cc | tree-sitter/tree-sitter-cpp#312 (trailing-backslash macros) | OPEN (feature) | |
| fst_test.h (×2 openfst) | tree-sitter/tree-sitter-cpp#252 (explicit operator-overload calls) | CLOSED 2025-09-16 via tree-sitter/tree-sitter-cpp#329 (commit 4910efc) | No, not in 0.23.4 |

Five bugs and one feature request remain open upstream. The seventh item,
tree-sitter/tree-sitter-cpp#252, was fixed on master in September 2025,
but no tree-sitter-cpp release has been cut since 2024-11-11 (v0.23.4),
so the fix is unreachable through crates.io today.

Acceptance criteria (any one of these unblocks an action)

  1. A new tree-sitter-cpp release on crates.io that includes the
    tree-sitter/tree-sitter-cpp#252 fix.
    When that ships:
    • Bump tree-sitter-cpp in tree-sitter-mozcpp/Cargo.toml to the new
      version.
    • Bump tree-sitter-mozcpp major/minor in the root Cargo.toml.
    • Run ./generate-grammars/generate-mozcpp.sh and review the diff.
    • Remove the two fst_test.h entries from the exclusion list in
      tests/deepspeech_test.rs.
    • Re-run cargo insta test --review and accept the resulting snapshot
      drift.
    • Check whether any other previously skipped DeepSpeech file now
      parses cleanly; if so, drop those exclusions too.
  2. Any of tree-sitter/tree-sitter-cpp#307 through #312 closed upstream
    and shipped in a release: repeat the same flow, dropping the
    corresponding entries.
  3. All seven closed and shipped: remove the FIXME blocks and
    #[ignore] markers entirely (this is the acceptance criterion of
    #83), close #83, and close this tracking issue.
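
The first bump step in the flow above can be sketched as a small helper. This is an illustration only: `bump_pin` is a hypothetical name, it handles just the simple `name = "X.Y.Z"` / `name = "=X.Y.Z"` form, and it keeps a leading `=` if one is already present. The new version number has to come from an actual crates.io release.

```shell
# Hypothetical helper: rewrite the version pinned for one crate in a
# Cargo.toml, preserving an existing "=" prefix on the requirement.
# Usage: bump_pin <manifest> <crate> <new-version>
bump_pin() {
  bump_tmp=$(mktemp) || return 1
  sed -E "s/^($2[[:space:]]*=[[:space:]]*\"=?)[0-9]+\.[0-9]+\.[0-9]+(\")/\1$3\2/" "$1" > "$bump_tmp" &&
    cat "$bump_tmp" > "$1"   # write back in place so file permissions are kept
  bump_status=$?
  rm -f "$bump_tmp"
  return "$bump_status"
}
```

For example, `bump_pin tree-sitter-mozcpp/Cargo.toml tree-sitter-cpp NEW_VERSION` (with NEW_VERSION standing in for the released version) would be followed by ./generate-grammars/generate-mozcpp.sh and cargo insta test --review as listed above. The edit is mechanical; reviewing the regenerated grammar diff and the snapshot drift is where the real work is.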

How to check status quickly

# Latest released tree-sitter-cpp on crates.io
curl -s https://crates.io/api/v1/crates/tree-sitter-cpp \
  | jq '.crate | {max_stable_version, updated_at}'

# Any tree-sitter-cpp release newer than what we pin
gh api repos/tree-sitter/tree-sitter-cpp/releases \
  --jq '.[] | {tag_name, published_at}' | head

# Status of the seven blockers
for n in 307 308 309 310 311 312 252; do
  gh issue view $n --repo tree-sitter/tree-sitter-cpp \
    --json number,title,state,closedAt
done
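
To turn the first check into a yes/no answer, the version reported by crates.io can be compared against the pin. The sketch below assumes plain X.Y.Z versions (no pre-release suffixes) and a `sort` that supports `-V`, as GNU coreutils does; `is_newer_release` is a hypothetical name.

```shell
# Hypothetical helper: succeed (exit 0) only when the candidate version is
# strictly newer than the pinned one. Assumes plain X.Y.Z on both sides.
# Usage: is_newer_release <pinned> <candidate>
is_newer_release() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$2" ]
}
```

Passing the pin (0.23.4 today) as the first argument and the max_stable_version value from the curl command as the second gives an exit status that a cron job or CI step could act on.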
