feat: support COUNT(*) pushdown on stable row id datasets #7360
Open
wkalt wants to merge 2 commits into
Benchmarks COUNT(*) via the scanner aggregate plan on a synthetic stable-row-id dataset (multiple fragments, scattered cross-fragment deletions, BTree scalar index on the filter column), covering unfiltered and filtered counts at two selectivities. Uses only public APIs, so it runs on any revision for before/after comparison.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The COUNT(*) fast path (CountFromMaskExec) previously refused to fire on datasets using stable row ids, so such counts fell back to the regular scan path -- a full scan when unfiltered, or an index-prefiltered scan plus row materialization when filtered.

The cause was a coordinate-space mismatch: the index prefilter and the deletion mask are both expressed in stable-id space, but the exec built its fragments-allow universe in row-address space. ANDing across the two silently dropped rows in fragments > 0, so the rule was gated off for stable row ids.

Build the universe in stable-id space instead. create_restricted_deletion_mask already returns a live-id allow list restricted to the covered fragments; it returns None only when there are no deletions and full coverage, in which case the universe is loaded from the covered fragments' row-id sequences (metadata, not column data).

For an unfiltered count (no prefilter) the universe is never needed -- the answer is just the live row count of the covered fragments, taken straight from fragment metadata via count_live_rows. Materializing the full stable-id universe (every row id in the dataset) just to take its length would be far more expensive than the answer. The default row-address path is unchanged.

Remove the gate in count_pushdown and cover the stable-id path with tests: firing with/without an indexed filter, cross-fragment deletions, and an end-to-end indexed-filter count.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
COUNT(*) pushdown (the count_pushdown rule -> CountFromMaskExec) was
disabled on datasets using stable row ids -- the count fell back to the regular
scan path. This enables it.
Why it was off, and the fix
The fast path intersects the scalar-index prefilter and the deletion mask (both
in stable-id space) with a fragments-allow universe -- but that universe was
built in row-address space. ANDing across the two id spaces silently dropped
rows in fragments > 0, so the rule was gated off entirely under stable row ids.
This builds the universe in stable-id space instead, via the live-id deletion
mask (restricted to the covered fragments). For an unfiltered count there is no
prefilter, so the universe is never materialized -- the answer comes straight
from fragment metadata.
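
The mismatch and the fix can be illustrated with a small standalone sketch. It does not use Lance's types or APIs: the `Fragment` struct, the helper names, and the example data are all hypothetical, and it assumes a row address packs the fragment id into the high 32 bits and the row offset into the low 32 bits, with stable ids assigned sequentially across fragments.

```rust
use std::collections::BTreeSet;

/// Hypothetical fragment description for this sketch (not Lance's real type).
struct Fragment {
    id: u32,
    /// Stable row ids of this fragment's rows, in offset order.
    stable_ids: Vec<u64>,
    /// Offsets within the fragment that have been deleted.
    deleted_offsets: Vec<u32>,
}

/// Assumed row-address layout: fragment id in the high 32 bits, offset in the low 32.
fn row_address(fragment_id: u32, offset: u32) -> u64 {
    ((fragment_id as u64) << 32) | offset as u64
}

/// Universe built in row-address space (the mismatched construction).
fn address_universe(frags: &[Fragment]) -> BTreeSet<u64> {
    frags
        .iter()
        .flat_map(|f| (0..f.stable_ids.len() as u32).map(move |o| row_address(f.id, o)))
        .collect()
}

/// Universe built in stable-id space: the live ids of the covered fragments.
fn stable_universe(frags: &[Fragment]) -> BTreeSet<u64> {
    frags
        .iter()
        .flat_map(|f| {
            f.stable_ids
                .iter()
                .enumerate()
                .filter(move |(o, _)| !f.deleted_offsets.contains(&(*o as u32)))
                .map(|(_, id)| *id)
        })
        .collect()
}

/// Unfiltered count: per-fragment arithmetic only, no id set is materialized.
fn sketch_count_live_rows(frags: &[Fragment]) -> usize {
    frags
        .iter()
        .map(|f| f.stable_ids.len() - f.deleted_offsets.len())
        .sum()
}

/// Filtered count: intersect the stable-id prefilter with a universe.
fn filtered_count(prefilter: &BTreeSet<u64>, universe: &BTreeSet<u64>) -> usize {
    prefilter.intersection(universe).count()
}

/// Two fragments of four rows each; stable id 5 (fragment 1, offset 1) is deleted.
fn demo_fragments() -> Vec<Fragment> {
    vec![
        Fragment { id: 0, stable_ids: vec![0, 1, 2, 3], deleted_offsets: vec![] },
        Fragment { id: 1, stable_ids: vec![4, 5, 6, 7], deleted_offsets: vec![1] },
    ]
}

/// Returns (count with mismatched universe, count with stable-id universe, unfiltered count).
fn demo() -> (usize, usize, usize) {
    let frags = demo_fragments();
    // Index prefilter result, expressed in stable-id space.
    let prefilter: BTreeSet<u64> = [1, 2, 5, 6, 7].into_iter().collect();
    (
        filtered_count(&prefilter, &address_universe(&frags)),
        filtered_count(&prefilter, &stable_universe(&frags)),
        sketch_count_live_rows(&frags),
    )
}
```

In this sketch the two id spaces coincide for fragment 0, because a zero fragment id leaves the address equal to the offset. That is why the mismatched intersection keeps fragment 0's matches and loses every match in fragment 1. Calling `demo()` from a `main` function shows the three counts side by side.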
Benchmark
New count_pushdown bench (synthetic: 5M rows, 50 fragments, ~1% scattered
deletions, BTree on the filter column).
cargo bench -p lance --bench count_pushdown, this branch vs main:
count_unfiltered
count_filtered_1pct
count_filtered_50pct
The filtered cases share the index-scan cost with the old path; the win there is
skipping materialization + counting of the matched rows.
Tests
Stable-id coverage in count_pushdown and count_from_mask, plus an
end-to-end indexed-filter count in dataset_aggregate (--features substrait).