fix: tolerate startxref offset inside xref keyword #798

Closed
vitormattos wants to merge 2 commits into smalot:master from vitormattos:fix/startxref-whitespace-xref-stream

@vitormattos vitormattos commented Apr 24, 2026

Summary

  • Fixes parsing when startxref points one byte inside the xref keyword (landing on ref) or at leading whitespace before the xref section.
  • Normalizes the xref offset before choosing between the xref table path and the xref stream path.
  • Adds an integration regression fixture and test for the failing corpus file.
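
The normalization described in the bullets above can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's PHP code: the function names normalize_xref_offset and classify_xref, the one-byte back-step rule, and the object-header regex are all assumptions made for the example.

```python
import re

# PDF whitespace bytes: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "

# Hypothetical pattern for an indirect object header such as "7 0 obj",
# which is how a cross-reference stream begins.
OBJ_HEADER = re.compile(rb"\d+\s+\d+\s+obj")


def normalize_xref_offset(data: bytes, offset: int) -> int:
    """Return a corrected offset for a slightly misplaced startxref value."""
    pos = offset
    # Case 1: the offset lands on whitespace before the section; skip forward.
    while pos < len(data) and data[pos:pos + 1] in PDF_WHITESPACE:
        pos += 1
    # Case 2: the offset lands one byte inside "xref" (on "ref"); step back.
    if data.startswith(b"ref", pos) and pos >= 1 and data[pos - 1:pos] == b"x":
        pos -= 1
    return pos


def classify_xref(data: bytes, offset: int) -> str:
    """Decide which parsing path applies at the normalized offset."""
    pos = normalize_xref_offset(data, offset)
    if data.startswith(b"xref", pos):
        return "table"
    if OBJ_HEADER.match(data, pos):
        return "stream"
    return "unknown"


# Made-up sample bytes, not taken from the PR's fixtures.
table_pdf = b"%PDF-1.4\nxref\n0 1\n0000000000 65535 f \n"
stream_pdf = b"%PDF-1.5\n \n7 0 obj\n<< /Type /XRef >>\n"

xref_at = table_pdf.index(b"xref")
print(classify_xref(table_pdf, xref_at + 1))   # offset points at "ref"
print(classify_xref(stream_pdf, 9))            # offset points at whitespace
```

Classifying only after normalization is the point of the second bullet: a raw offset that lands on "ref" or on whitespace would otherwise fail the table check and could be sent down the wrong path.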

Reproduction

  • The fixture parses with pdfinfo but failed in this parser with the error: Invalid object reference for $obj.
  • New test: testParseFileWhenStartxrefPointsToLeadingWhitespaceInXrefStream.

Changes

Signed-off-by: Vitor Mattos <1079143+vitormattos@users.noreply.github.com>

@vitormattos (Author)

Added an extra regression sample from the same investigation: PullRequest794.pdf (veraPDF 6-6-2-3-2-t01-pass-c). This complements existing coverage and confirms the same startxref tolerance fix across another real-world file.

@vitormattos (Author)

This standalone PR has been restacked into the RawDataParser consolidation chain to reduce conflict hotspots in shared test files.

Superseded-by chain:

  • upstream base: #796
  • fork stack (PR797 replay): https://github.com/vitormattos/pdfparser/pull/26
  • fork stack (this PR798 replay): https://github.com/vitormattos/pdfparser/pull/27

Closing this standalone PR to keep a single merge path per source-file group.

@vitormattos vitormattos deleted the fix/startxref-whitespace-xref-stream branch April 27, 2026 17:10