fix(reader): gracefully handle missing Parquet column index in row se… by jdwil · Pull Request #2464 · apache/iceberg-rust

jdwil · 2026-05-18T11:48:38Z

Gracefully handle Parquet files missing column/offset indexes by skipping page-level row selection and falling back to existing row-group filtering plus Arrow row filtering. This preserves predicate correctness for older or migrated Parquet files that lack page index metadata.
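The fallback decision described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: `PageIndexes`, `PagePruning`, and `plan_page_pruning` are hypothetical names, the index contents are simplified stand-ins for real Parquet metadata, and the predicate is assumed to be `id < upper`.

```rust
/// Hypothetical stand-in for the page index metadata of one Parquet column.
/// Real readers expose richer structures; these fields are illustrative only.
#[derive(Debug, Default)]
pub struct PageIndexes {
    /// Per-page (min, max) statistics, if the file carries a column index.
    pub column_index: Option<Vec<(i64, i64)>>,
    /// Per-page row counts, if the file carries an offset index.
    pub offset_index: Option<Vec<usize>>,
}

/// Outcome of planning page-level pruning (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum PagePruning {
    /// One flag per page: `true` means the page may contain matching rows.
    Selected(Vec<bool>),
    /// Page-level pruning is skipped; row-group and row filtering still apply.
    Skipped,
}

/// Plan page pruning for the assumed predicate `id < upper`.
/// If either index is missing, skip pruning instead of returning an error.
pub fn plan_page_pruning(
    row_selection_enabled: bool,
    indexes: &PageIndexes,
    upper: i64,
) -> PagePruning {
    if !row_selection_enabled {
        return PagePruning::Skipped;
    }
    match (&indexes.column_index, &indexes.offset_index) {
        (Some(stats), Some(_row_counts)) => {
            // A page can only match `id < upper` if its minimum is below `upper`.
            PagePruning::Selected(stats.iter().map(|(min, _max)| *min < upper).collect())
        }
        // Older or migrated files: no page index, so fall back gracefully.
        _ => PagePruning::Skipped,
    }
}
```

Returning a `Skipped` variant rather than an error keeps the missing-index case on the normal control path, so the later filtering stages run unchanged.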

Added integration coverage for Parquet files without column/offset indexes. The test verifies that scans no longer fail when page indexes are absent, page-level row selection is skipped gracefully, and predicate filtering (id < 3) still produces the correct filtered result set ([1, 2]) via the existing row-group + Arrow filtering path.
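To illustrate why the result stays correct without page indexes, here is a toy model of the remaining two stages, assuming the predicate `id < upper`. It is a sketch only: `RowGroup` and `scan_id_less_than` are hypothetical names, the data layout is invented for illustration, and the real integration test operates on actual Parquet files rather than in-memory vectors.

```rust
/// Hypothetical in-memory model of a Parquet row group holding an `id` column.
pub struct RowGroup {
    /// Minimum `id` recorded in the row-group statistics.
    pub min_id: i64,
    /// Maximum `id` recorded in the row-group statistics.
    pub max_id: i64,
    /// The `id` values stored in this row group.
    pub ids: Vec<i64>,
}

/// Evaluate `id < upper` with no page-level selection:
/// first prune whole row groups by statistics, then filter the surviving rows.
pub fn scan_id_less_than(row_groups: &[RowGroup], upper: i64) -> Vec<i64> {
    row_groups
        .iter()
        // Row-group filtering: drop groups whose minimum already fails the predicate.
        .filter(|rg| rg.min_id < upper)
        // Read every row of the surviving groups (no page pruning in this path).
        .flat_map(|rg| rg.ids.iter().copied())
        // Row filter: apply the predicate to each remaining row.
        .filter(|id| *id < upper)
        .collect()
}
```

With two made-up row groups holding ids 1 to 3 and 4 to 6, `scan_id_less_than(&groups, 3)` drops the second group by statistics and filters the first down to `[1, 2]`. Page-level selection would only reduce how much data is read; it does not change which rows match.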

Closes #2452

…lection

When row_selection_enabled is true and the Parquet file lacks column or offset index metadata (common with older/migrated files), the reader now skips page-level row pruning instead of returning an error. Row-group filtering via statistics and the ArrowPredicate row filter still function normally; only page-index-based RowSelection is skipped.

Closes apache#2452

jdwil force-pushed the fix/2452-graceful-missing-column-index branch from 3cf00d7 to f1a268a on May 18, 2026 12:05

jdwil wants to merge 1 commit into apache:main from jdwil:fix/2452-graceful-missing-column-index
