Skip to content

mask bytes in chm unmarshalUInt32 to avoid sign extension#2887

Open
rootvector2 wants to merge 1 commit into
apache:mainfrom
rootvector2:chm-unmarshal-uint32-mask
Open

mask bytes in chm unmarshalUInt32 to avoid sign extension#2887
rootvector2 wants to merge 1 commit into
apache:mainfrom
rootvector2:chm-unmarshal-uint32-mask

Conversation

@rootvector2

Copy link
Copy Markdown

unmarshalUInt32 in ChmItsfHeader and ChmLzxcControlData ORs the four little-endian bytes from an untrusted chm without masking each & 0xff, so any non-leading byte with bit 7 set is sign-extended and corrupts the value (a 0x8000 size/windowSize/resetInterval field reads back as -32768); spotted because the sibling readers in ChmItspHeader, ChmPmgiHeader, ChmPmglHeader and ChmLzxcResetTable already mask, so this just brings the two stragglers in line.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix incorrect unsigned 32-bit (UInt32) decoding in the CHM parser by preventing byte sign-extension when assembling little-endian values from untrusted CHM data, and adds a regression test for the LZXC control block size field.

Changes:

  • Mask each input byte with & 0xff when building UInt32 values in ChmItsfHeader and ChmLzxcControlData.
  • Add a JUnit test ensuring 0x00008000 is parsed as 32768 (not -32768) for ChmLzxcControlData.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
.../TestChmLzxcControlData.java Adds a regression test for UInt32 parsing when a non-leading byte has its high bit set.
.../ChmLzxcControlData.java Updates unmarshalUInt32 to mask bytes before shifting/OR-ing.
.../ChmItsfHeader.java Updates unmarshalUInt32 to mask bytes before shifting/OR-ing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +228 to +231
dest = (data[this.getCurrentPlace()] & 0xff) |
(data[this.getCurrentPlace() + 1] & 0xff) << 8 |
(data[this.getCurrentPlace() + 2] & 0xff) << 16 |
(data[this.getCurrentPlace() + 3] & 0xff) << 24;
Comment on lines +398 to +401
dest = (data[this.getCurrentPlace()] & 0xff) |
(data[this.getCurrentPlace() + 1] & 0xff) << 8 |
(data[this.getCurrentPlace() + 2] & 0xff) << 16 |
(data[this.getCurrentPlace() + 3] & 0xff) << 24;
Comment on lines +127 to +143
@Test
public void testUInt32HighBitNotSignExtended() throws Exception {
// size = 0x00008000 little-endian; the 0x80 byte has bit 7 set, so an
// unmasked shift sign-extends it and yields -32768 instead of 32768
byte[] data = new byte[ChmConstants.CHM_LZXC_MIN_LEN];
data[1] = (byte) 0x80;
byte[] sig = ChmConstants.LZXC.getBytes(UTF_8);
System.arraycopy(sig, 0, data, 4, sig.length);
data[8] = 0x01; // version
data[12] = 0x02; // resetInterval
data[16] = 0x02; // windowSize
data[20] = 0x02; // windowsPerReset

ChmLzxcControlData control = new ChmLzxcControlData();
control.parse(data, control);
assertEquals(0x8000L, control.getSize());
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants