fix(compaction): prune fully-deleted fragments from the fragment-reuse index#7378
Draft
xuanyu-z wants to merge 1 commit into
Draft
fix(compaction): prune fully-deleted fragments from the fragment-reuse index#7378xuanyu-z wants to merge 1 commit into
xuanyu-z wants to merge 1 commit into
Conversation
…e index A fragment that has all its rows deleted is removed from the manifest at delete time, so it never enters a compaction task and is absent from every fragment-reuse-index (FRI) group's old_frags. With deferred compaction (defer_index_remap=true) the inline remap is skipped, so FragReuseIndex::remap_row_id passed those addresses through unchanged instead of pruning them. An index that still covered the removed fragment then kept dangling references to it, surfacing as 'take received reference to fragment that does not exist' errors / ghost results after a later physical remap. Record the orphaned fragment IDs (covered by an index but no longer present in the dataset, and not part of any rewrite group) on the FRI version and prune those addresses to None in remap_row_id, fixing both auto-remap at load and physical remap. Index coverage bitmaps are left intact (masked at query time by effective_fragment_bitmap); only the row data is pruned. The new proto field is additive and backward-compatible. Adds a regression test. Fixes lance-format#7374
Contributor
|
Important This PR touches the Lance format specification. Substantive changes to the format specification — the If this is a meaningful format change:
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Fixes #7374. A vector (or scalar) index can be left with dangling references after a deferred compaction when a fragment was fully deleted beforehand.
When
deleteremoves every row of a fragment, the fragment is dropped from the manifest at delete time (apply_deletions→FragmentChange::Removed) — it gets no deletion file. It therefore never enters a later compaction task, so it is absent from every fragment-reuse-index (FRI) group'sold_frags. Withdefer_index_remap=truethe inline remap is skipped, andFragReuseIndex::remap_row_idreturns the original address (pass-through) for any address it has no mapping for — including that orphaned fragment's rows. An index that still covers the removed fragment keeps those entries, which resurface astake received reference to fragment that does not existerrors / ghost results once the index is physically remapped through the FRI.(The inline path is unaffected: it relies on
effective_fragment_bitmap— index coverage intersected with live fragments at query time — to mask the dead fragment, and the FRI deferral preserves that data instead of pruning it.)Fix
Record the orphaned fragment IDs on the FRI version and prune them by fragment ID in
remap_row_id:commit_compactioncomputesorphaned = (union of non-system indices' fragment_bitmap) − (fragments currently in the dataset). At that point the about-to-be-compacted fragments are still present, so only fragments removed before the compaction (e.g. fully-deleted ones) remain.FragReuseVersion(new additive, backward-compatible proto fieldremoved_fragments), andFragReuseIndex::remap_row_idreturnsNonefor any address whose fragment is in it — fixing both auto-remap-at-load and physical remap.Testing
test_defer_index_remap_fully_deleted_fragment(10 fragments + IVF_PQ, fully delete one fragment, deferredcompact_files, assert the FRI maps that fragment's addresses toNoneand a search returns no rows from it). Fails before the fix, passes after.dataset::optimizesuite and the frag-reuse / remapping tests pass locally.The regression test asserts at the FRI (
remap_row_id) level on purpose: an end-to-end search alone is masked byeffective_fragment_bitmapand passes even with the bug, so it isn't a reliable detector.