feat(tck): coverage matrix + rust_status tracking with CI enforcement #891
…#608)

Stand up the instrument that tracks openCypher TCK conformance toward the #609 100%-of-corpus bar, enforced by the existing `BDD — Rust` CI gate (no new job).

- tests/tck/coverage_matrix.json: per-feature `rust_status` (passing|skip) + written-scenario counts over the vendored 2024.3 corpus (220 features, 1,615 written / ~3,897 expanded; 6 passing = Literals1 today).
- crates/gf-api/tests/tck_coverage.rs: asserts the matrix stays in sync with the @skip-rust tags and covers every feature exactly once. With the BDD suite's fail_on_skipped, `rust_status == "passing"` ⟹ the feature runs and passes.
- docs/reference/tck-compliance.md: rewritten for the Rust v0.5 core: the ~3,897 corpus denominator, the supported-subset/documented-xfail definition, the revived regression-floor policy, and how to un-skip a tier. Replaces the stale v0.3.9 Python report.

Validated: cargo test -p gf-api --test tck_coverage green; fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Walkthrough: TCK Coverage Matrix and Validation Test
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/gf-api/tests/tck_coverage.rs (1)
70-106: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Also validate `_meta` aggregate fields to prevent silent drift.
The test verifies per-feature entries, but it does not assert `_meta.feature_files`, `_meta.scenarios_written`, and `_meta.scenarios_passing` against derived totals. Adding these checks keeps the summary metadata trustworthy.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-api/tests/tck_coverage.rs`:
- Around line 34-37: The substring matching in the closure of the `any()` method
call is too permissive and will match partial tags like `@skip-rust-foo` instead
of just the exact `@skip-rust` tag. Replace the `contains("skip-rust")` check
with exact tag matching by verifying that `@skip-rust` appears as a complete
word boundary (either followed by whitespace or at the end of the string) rather
than as a substring. This ensures only the exact `@skip-rust` tag triggers the
skip condition.
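
One way the exact-match check described above could look is sketched below. It assumes a Gherkin tag line is a run of whitespace-separated tokens, and the helper name `has_skip_rust_tag` is hypothetical, not taken from `tck_coverage.rs`.

```rust
/// Returns true only when `line` carries the exact `@skip-rust` tag.
/// Assumption: tags on a Gherkin tag line are separated by whitespace.
/// `has_skip_rust_tag` is a hypothetical name used for illustration.
fn has_skip_rust_tag(line: &str) -> bool {
    // Comparing whole tokens rejects `@skip-rust-foo` and a bare `skip-rust`
    // substring, both of which a `contains("skip-rust")` check would accept.
    line.split_whitespace().any(|tag| tag == "@skip-rust")
}
```

With this shape, `"  @tier1 @skip-rust @slow"` matches while `"@skip-rust-foo"` and `"# skip-rust in a comment"` do not.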
---
Nitpick comments:
In `@crates/gf-api/tests/tck_coverage.rs`:
- Around line 70-106: The test currently validates individual feature entries
but does not assert the aggregate metadata fields in the _meta object
(feature_files, scenarios_written, and scenarios_passing) against the derived
totals from the actual feature files. After the existing validation loop that
checks per-feature entries, add assertions that verify _meta.feature_files
equals the count of keys in actual, _meta.scenarios_written equals the sum of
all scenario counts across actual entries, and _meta.scenarios_passing equals
the sum of scenario counts only for entries where skipped is false. This
prevents the summary metadata from drifting silently.
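
A minimal sketch of the aggregate check the nitpick asks for, under stated assumptions: the `Derived` and `Meta` types and the `derive_meta` helper are hypothetical stand-ins, and the field names follow the review comment rather than the actual matrix schema.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-feature record derived from a `.feature` file:
/// its scenario count and whether it carries `@skip-rust`.
#[derive(Debug, Clone, Copy)]
struct Derived {
    scenarios: u64,
    skipped: bool,
}

/// Hypothetical mirror of the matrix `_meta` block named in the review comment.
#[derive(Debug, PartialEq, Eq)]
struct Meta {
    feature_files: u64,
    scenarios_written: u64,
    scenarios_passing: u64,
}

/// Recomputes the aggregates from the per-feature data so the test can
/// compare them against the `_meta` values stored in the matrix.
fn derive_meta(actual: &BTreeMap<String, Derived>) -> Meta {
    Meta {
        feature_files: actual.len() as u64,
        scenarios_written: actual.values().map(|d| d.scenarios).sum(),
        // Only features without `@skip-rust` contribute to the passing total.
        scenarios_passing: actual
            .values()
            .filter(|d| !d.skipped)
            .map(|d| d.scenarios)
            .sum(),
    }
}
```

The test would then parse `_meta` from the matrix into the same struct and compare the two with a single `assert_eq!`, so any drifted field shows up in the failure message.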
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 69334aca-40c1-40c1-95ad-822ebaf90f18
⛔ Files ignored due to path filters (1)
`docs/reference/tck-compliance.md` is excluded by `!**/*.md`, `!**/docs/**`
📒 Files selected for processing (2)
- `crates/gf-api/tests/tck_coverage.rs`
- `tests/tck/coverage_matrix.json`
mkdocs `--strict` aborts on a relative link to `tests/tck/coverage_matrix.json`: the target is at the repo root, outside the `docs/` tree, so mkdocs cannot resolve it as a doc file. Reference it as inline code (like the other in-body mentions) instead of a markdown link — the file's location is stated in prose. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ored blocks (#608)

The matrix is the instrument for the 100% gate (#609/#742), whose denominator is the number of scenarios cucumber actually runs (Scenario Outlines expanded by their Examples rows), not the count of authored Scenario/Outline blocks. The matrix previously recorded the authored count (1,615), which can't reconcile with the gate's corpus size.

- `coverage_matrix.json`: per-feature `scenarios` is now the expanded runnable count; `_meta` carries `scenarios_total` (3,880, the gate denominator) and keeps `scenarios_written` (1,615) for reference.
- `tck_coverage.rs`: `expanded_scenarios()` recomputes the same count by line scan (Scenario = 1; Scenario Outline = its Examples data rows), robust to the #886 And/But normalization, and asserts the matrix matches. Verified two independent ways: 1,339 plain + 2,541 example rows = 3,880 (written = 1,339 + 276 outlines = 1,615).
- `tck-compliance.md`: leads with the measured 3,880 denominator; the prior "~3,897" was an estimate (off by 17).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
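
The counting rule in this commit (Scenario = 1; Scenario Outline = its Examples data rows) can be sketched as a line scan. This is an illustration only, not the PR's actual `expanded_scenarios()`: it assumes the `Scenario:`, `Scenario Outline:` and `Examples:` keywords as written, treats the first table row after `Examples:` as a header, and skips `"""` docstrings so query text is never read as Gherkin structure.

```rust
/// Line-scan sketch: a plain `Scenario:` counts as 1; a `Scenario Outline:`
/// counts as the number of data rows (header excluded) in its `Examples:` tables.
fn expanded_scenarios(feature_text: &str) -> usize {
    #[derive(Clone, Copy, PartialEq)]
    enum Table {
        None,        // not inside an Examples table
        AwaitHeader, // saw `Examples:`, next table row is the header
        DataRows,    // header consumed, each further row is one scenario
    }

    let mut count = 0;
    let mut table = Table::None;
    let mut in_docstring = false;

    for line in feature_text.lines().map(str::trim) {
        if line.starts_with("\"\"\"") {
            in_docstring = !in_docstring;
            continue;
        }
        if in_docstring || line.is_empty() || line.starts_with('#') {
            continue;
        }
        if line.starts_with("Scenario Outline:") {
            table = Table::None; // contributes only through its Examples rows
        } else if line.starts_with("Scenario:") {
            count += 1;
            table = Table::None;
        } else if line.starts_with("Examples:") {
            table = Table::AwaitHeader;
        } else if line.starts_with('|') {
            match table {
                Table::AwaitHeader => table = Table::DataRows,
                Table::DataRows => count += 1,
                Table::None => {} // a step's data table, not an Examples row
            }
        } else if table == Table::DataRows {
            table = Table::None; // any other line ends the Examples table
        }
    }
    count
}
```

Because the scan only looks at scenario keywords and table rows, it does not depend on which step keyword (`And`, `But`, `Then`) a step line starts with, which is consistent with the robustness to step-keyword normalization mentioned above.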
Summary
Stands up the instrument that tracks openCypher TCK conformance toward the #609 100%-of-corpus bar, enforced by the existing `BDD — Rust` CI gate (no new CI job).

Closes #608. Part of M17.
What
- `tests/tck/coverage_matrix.json`: per-feature `rust_status` (`passing`|`skip`) + written-scenario counts over the vendored openCypher 2024.3 corpus: 220 features; 1,615 authored Scenario/Outline blocks expanding to 3,880 runnable scenarios (the gate denominator); 6 passing (`Literals1`) today.
- `crates/gf-api/tests/tck_coverage.rs`: asserts the matrix stays in sync with the `@skip-rust` tags and covers every feature exactly once. With the BDD suite's `fail_on_skipped`, this gives the invariant `rust_status == "passing"` ⟹ the feature runs and passes.
- `docs/reference/tck-compliance.md`: rewritten for the Rust v0.5 core: the 3,880-scenario corpus denominator, the supported-subset / documented-xfail definition, the revived regression-floor policy, and how to un-skip a tier. Replaces the stale v0.3.9 Python report.
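
The sync assertion in the second item can be pictured as a two-way comparison between the matrix and the tags found on disk. The sketch below is a hedged illustration: `sync_errors` and its map-based inputs are hypothetical, and the real test may structure its data and its failure messages differently.

```rust
use std::collections::BTreeMap;

/// Compares the matrix (feature -> `rust_status`) against the tags found on
/// disk (feature -> has `@skip-rust`). Returns one message per discrepancy.
/// Both input shapes are assumptions made for this sketch.
fn sync_errors(
    matrix: &BTreeMap<String, String>,
    skipped: &BTreeMap<String, bool>,
) -> Vec<String> {
    let mut errors = Vec::new();

    // Every feature file must appear in the matrix with the implied status.
    for (feature, &is_skipped) in skipped {
        let expected = if is_skipped { "skip" } else { "passing" };
        match matrix.get(feature) {
            None => errors.push(format!("{}: missing from matrix", feature)),
            Some(status) if status.as_str() != expected => errors.push(format!(
                "{}: matrix says {}, tags imply {}",
                feature, status, expected
            )),
            Some(_) => {}
        }
    }

    // Every matrix entry must correspond to a feature file on disk.
    for feature in matrix.keys().filter(|f| !skipped.contains_key(*f)) {
        errors.push(format!("{}: in matrix but no feature file", feature));
    }

    errors
}
```

An empty result means the matrix and the tags agree; a test built on this would assert that the returned list is empty and print the messages otherwise.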
Validation
`cargo test -p gf-api --test tck_coverage` green; `cargo fmt --all -- --check` clean. Data + test + docs only; no engine changes.
Effect
The TCK passing rate is now tracked + CI-enforced: a feature can't be marked `passing` without actually running, can't run without passing, and can't silently regress to `skip`. Today: 6 / 3,880; the matrix grows as tiers (#598–#601) and the feature cluster land.
🤖 Generated with Claude Code
Note
Add TCK coverage matrix with CI enforcement for Rust status tracking
- … `rust_status` (`skip` or `passing`) and expanded scenario count.
- … `.feature` files, failing if any feature is missing, extra, has a mismatched scenario count, or has a `rust_status` inconsistent with its `@skip-rust` tag.
- … `@skip-rust` tags will break CI until the coverage matrix is regenerated and committed.

Macroscope summarized e34d1cb.
Summary by CodeRabbit
- … `rust_status` based on presence of the `@skip-rust` tag and recalculates expected scenario counts from the feature text.