
fix: aggregate pre-header xref offset robustness #2

Merged
vitormattos merged 2 commits into fork/libresign-parser-fixes from fix/invalid-object-reference-tolerant-parser on Apr 24, 2026


@vitormattos
Owner

Summary

  • Mirror of the upstream fix branch for parser robustness with pre-header bytes and absolute xref offsets.
  • Keeps LibreSign integration aligned while upstream review is in progress.

Upstream PR

Scope

  • Branch: fix/invalid-object-reference-tolerant-parser
  • Target aggregate branch: fork/libresign-parser-fixes

Some PDFs include bytes before the %PDF- header while still using
absolute xref offsets from the beginning of the file.

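The parser trimmed the bytes before %PDF-, which moved every later byte
toward the start of the buffer by the length of the trimmed prefix. The
absolute xref offsets then pointed at the wrong positions and xref lookups
failed. This manifested as an Invalid object reference error in the
veraPDF corpus header case.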
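
To make the offset shift concrete, here is a minimal sketch in Python. It is an illustration only: the byte string is a toy, not a real PDF, and `object_at` is a hypothetical helper, not part of the parser. It assumes the writer recorded the object's offset from the very first byte of the file.

```python
# Illustrative only: a toy byte string, not a real PDF and not the parser's code.
JUNK = b"\x00\x00garbage\n"  # bytes that precede the %PDF- header
BODY = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
data = JUNK + BODY

# Offset of "1 0 obj" measured from the start of the file, as a writer
# using file-absolute offsets would record it in the xref.
recorded_offset = data.index(b"1 0 obj")


def object_at(buf: bytes, offset: int) -> bytes:
    """Hypothetical helper: return the 7 bytes found at the given offset."""
    return buf[offset:offset + 7]


# Lookup against the original byte layout finds the object header.
intact = object_at(data, recorded_offset)

# Trimming everything before %PDF- shifts all bytes left by len(JUNK),
# so the same recorded offset now lands on unrelated bytes.
trimmed = data[data.index(b"%PDF-"):]
shifted = object_at(trimmed, recorded_offset)

# The lookup only works on the trimmed buffer if the offset is corrected
# by the prefix length, which is the bookkeeping that keeping the
# original layout avoids.
corrected = object_at(trimmed, recorded_offset - len(JUNK))
```

Keeping the original bytes, as the first change below does, means the recorded offsets can be used as-is.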

Changes:
- Keep original byte layout in RawDataParser::parseData
- Add stricter trailer key matching for /Size /Root /Encrypt /Info /Prev
- Add defensive handling in xref stream resolution when startxref is near,
  but not exactly at, the xref stream object
- Add regression fixture and integration test
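
As a rough illustration of why stricter trailer key matching matters, the Python sketch below contrasts a loose pattern with one that requires the key name to end at whitespace or a PDF delimiter. Both patterns and the function names are assumptions made for this example; they are not the expressions used in the parser.

```python
import re

# Toy trailer dictionary containing a key that merely starts with "/Size".
TRAILER = b"<< /SizeHint 99 /Size 12 /Prev 408 >>"


def loose_int(trailer: bytes, key: bytes):
    """Hypothetical loose match: key prefix, then the next integer anywhere."""
    m = re.search(rb"/" + re.escape(key) + rb"[^0-9]*([0-9]+)", trailer)
    return int(m.group(1)) if m else None


def strict_int(trailer: bytes, key: bytes):
    """Hypothetical strict match: the name must end right after the key.

    A PDF name is terminated by whitespace or a delimiter character,
    so the lookahead rejects longer names such as /SizeHint.
    """
    pattern = (
        rb"/" + re.escape(key)
        + rb"(?=[\s()<>\[\]{}/%])"  # end of the name token
        + rb"\s*([0-9]+)"           # the integer value
    )
    m = re.search(pattern, trailer)
    return int(m.group(1)) if m else None


loose_size = loose_int(TRAILER, b"Size")    # picks up the value of /SizeHint
strict_size = strict_int(TRAILER, b"Size")  # picks up the value of /Size
strict_prev = strict_int(TRAILER, b"Prev")
```

The sketch only handles integer-valued keys such as /Size and /Prev; keys whose values are indirect references or dictionaries (/Root, /Info, /Encrypt) would need their own value patterns.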

Regression fixture:
- samples/bugs/PullRequestInvalidObjectReference.pdf

Test:
- DocumentIssueFocusTest::testParseFileWithCompressedObjRefInXrefStream

Signed-off-by: Vitor Mattos <1079143+vitormattos@users.noreply.github.com>
@vitormattos vitormattos merged commit 5d96c2c into fork/libresign-parser-fixes Apr 24, 2026
34 checks passed