mask bytes in chm unmarshalUInt32 to avoid sign extension #2887
Open
rootvector2 wants to merge 1 commit into
Pull request overview
This PR aims to fix incorrect unsigned 32-bit (UInt32) decoding in the CHM parser by preventing byte sign-extension when assembling little-endian values from untrusted CHM data, and adds a regression test for the LZXC control block size field.
Changes:
- Mask each input byte with `& 0xff` when building `UInt32` values in `ChmItsfHeader` and `ChmLzxcControlData`.
- Add a JUnit test ensuring `0x00008000` is parsed as `32768` (not `-32768`) for `ChmLzxcControlData`.
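
To make the failure mode concrete, here is a minimal stand-alone sketch of the two expressions side by side. The class and method names are hypothetical and are not part of the parser; only the shape of the masked expression is taken from the diff, and the unmasked variant is an assumption about what the code looked like before the change.

```java
// Hypothetical stand-alone demo; not the parser classes touched by this PR.
class UInt32SignExtensionDemo {

    // Assumed pre-fix shape: each byte is promoted to int with sign extension,
    // so a byte such as 0x80 becomes 0xFFFFFF80 before it is shifted.
    static long readUnmasked(byte[] data, int pos) {
        return data[pos]
                | data[pos + 1] << 8
                | data[pos + 2] << 16
                | data[pos + 3] << 24;
    }

    // Masked shape, mirroring the expression in the diff: "& 0xff" keeps only
    // the low 8 bits of each promoted byte before shifting.
    static long readMasked(byte[] data, int pos) {
        return (data[pos] & 0xff)
                | (data[pos + 1] & 0xff) << 8
                | (data[pos + 2] & 0xff) << 16
                | (data[pos + 3] & 0xff) << 24;
    }

    public static void main(String[] args) {
        byte[] size = {0x00, (byte) 0x80, 0x00, 0x00}; // 0x00008000, little-endian
        System.out.println(readUnmasked(size, 0)); // prints -32768
        System.out.println(readMasked(size, 0));   // prints 32768
    }
}
```

The only difference between the two methods is the mask, which is why the regression test needs nothing more than a single byte with bit 7 set.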
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| .../TestChmLzxcControlData.java | Adds a regression test for UInt32 parsing when a non-leading byte has its high bit set. |
| .../ChmLzxcControlData.java | Updates `unmarshalUInt32` to mask bytes before shifting/OR-ing. |
| .../ChmItsfHeader.java | Updates `unmarshalUInt32` to mask bytes before shifting/OR-ing. |
Comment on lines +228 to +231

    dest = (data[this.getCurrentPlace()] & 0xff) |
            (data[this.getCurrentPlace() + 1] & 0xff) << 8 |
            (data[this.getCurrentPlace() + 2] & 0xff) << 16 |
            (data[this.getCurrentPlace() + 3] & 0xff) << 24;
Comment on lines +398 to +401

    dest = (data[this.getCurrentPlace()] & 0xff) |
            (data[this.getCurrentPlace() + 1] & 0xff) << 8 |
            (data[this.getCurrentPlace() + 2] & 0xff) << 16 |
            (data[this.getCurrentPlace() + 3] & 0xff) << 24;
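
One related point that may be worth checking on these lines: with an `int` mask, the whole right-hand side is still evaluated as a 32-bit `int`, so a top byte with bit 7 set produces a negative `int` that is then sign-extended if `dest` is a `long`. Whether that applies here depends on the type of `dest` and the realistic range of these fields, neither of which is visible in this excerpt; the `0x8000L` in the test only suggests that `getSize()` returns a `long`. The sketch below uses hypothetical names and shows the difference between masking with `0xff` and with `0xffL`.

```java
// Hypothetical stand-alone demo; names are illustrative, not from the PR.
class UInt32HighByteDemo {

    // Same int-typed expression as in the diff, widened to long on return.
    static long readIntMasked(byte[] data, int pos) {
        return (data[pos] & 0xff)
                | (data[pos + 1] & 0xff) << 8
                | (data[pos + 2] & 0xff) << 16
                | (data[pos + 3] & 0xff) << 24; // int overflow into the sign bit
    }

    // Variant that widens each byte to long before shifting, so bit 31
    // is an ordinary value bit rather than the sign bit.
    static long readLongMasked(byte[] data, int pos) {
        return (data[pos] & 0xffL)
                | (data[pos + 1] & 0xffL) << 8
                | (data[pos + 2] & 0xffL) << 16
                | (data[pos + 3] & 0xffL) << 24;
    }

    public static void main(String[] args) {
        byte[] value = {0x00, 0x00, 0x00, (byte) 0x80}; // 0x80000000, little-endian
        System.out.println(readIntMasked(value, 0));  // prints -2147483648
        System.out.println(readLongMasked(value, 0)); // prints 2147483648
    }
}
```

For fields that never reach `0x80000000` the two variants agree, so this is a possible follow-up rather than a blocker for the fix as written.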
Comment on lines +127 to +143

    @Test
    public void testUInt32HighBitNotSignExtended() throws Exception {
        // size = 0x00008000 little-endian; the 0x80 byte has bit 7 set, so an
        // unmasked shift sign-extends it and yields -32768 instead of 32768
        byte[] data = new byte[ChmConstants.CHM_LZXC_MIN_LEN];
        data[1] = (byte) 0x80;
        byte[] sig = ChmConstants.LZXC.getBytes(UTF_8);
        System.arraycopy(sig, 0, data, 4, sig.length);
        data[8] = 0x01; // version
        data[12] = 0x02; // resetInterval
        data[16] = 0x02; // windowSize
        data[20] = 0x02; // windowsPerReset

        ChmLzxcControlData control = new ChmLzxcControlData();
        control.parse(data, control);
        assertEquals(0x8000L, control.getSize());
    }
`unmarshalUInt32` in `ChmItsfHeader` and `ChmLzxcControlData` ORs the four little-endian bytes from an untrusted CHM without masking each with `& 0xff`, so any non-leading byte with bit 7 set is sign-extended and corrupts the value (a `0x8000` size/windowSize/resetInterval field reads back as `-32768`). Spotted because the sibling readers in `ChmItspHeader`, `ChmPmgiHeader`, `ChmPmglHeader` and `ChmLzxcResetTable` already mask, so this just brings the two stragglers in line.