check snappy block length before crc trailer in decode_snappy#3807

Open
dxbjavid wants to merge 4 commits into apache:main from dxbjavid:snappy-block-len-check

@dxbjavid dxbjavid commented Jun 5, 2026

decode_snappy in lang/c/src/codec.c takes the block length straight from the container file. file_read_block_count only rejects negative values, so a snappy block of 1 to 3 bytes reaches decode_snappy, where the len-4 used for snappy_uncompressed_length, snappy_uncompress and the trailing CRC memcmp underflows to a huge size_t and reads out of bounds. The C++ reader in DataFile.cc already refuses len < 4 before the same subtraction, so add the matching check here.
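
To make the underflow concrete, here is a minimal standalone sketch. It is not the Avro code: `payload_len_unchecked`, `payload_len_checked` and `check_payload_len` are hypothetical helpers, and the sketch assumes the block length arrives as an `int64_t`. It only shows how `len - 4` wraps for lengths below 4 and how a `len < 4` guard avoids it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Avro code): the payload size computed the way the
 * description says decode_snappy does it, assuming len is an int64_t. */
static size_t payload_len_unchecked(int64_t len)
{
    /* For len in 1..3 the difference is negative and wraps when converted. */
    return (size_t)(len - 4);
}

/* Hypothetical guarded variant: returns 0 and stores the payload size,
 * or returns -1 when the block cannot hold the 4-byte CRC trailer. */
static int payload_len_checked(int64_t len, size_t *out)
{
    if (len < 4) {
        return -1;
    }
    *out = (size_t)(len - 4);
    return 0;
}

/* Self-check of both variants on the boundary values. */
static void check_payload_len(void)
{
    size_t out = 0;

    assert(payload_len_unchecked(3) == SIZE_MAX);      /* -1 wraps to SIZE_MAX */
    assert(payload_len_unchecked(1) == SIZE_MAX - 2);  /* -3 wraps likewise */
    assert(payload_len_checked(3, &out) == -1);        /* rejected up front */
    assert(payload_len_checked(4, &out) == 0 && out == 0);
}
```

Calling `check_payload_len()` from a small `main` exercises both paths. The wrapped value is what would be handed to any function taking a `size_t` length.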

@github-actions github-actions Bot added the C label Jun 5, 2026
@dxbjavid

Author

Hi maintainers,

Just a friendly follow-up on this PR. When you have a chance, could you please provide an update on its review status?

Happy to make any additional changes if needed.

Thank you!

@martin-g martin-g left a comment

Member

It would be nice to have some unit tests

Comment thread lang/c/src/codec.c Outdated
size_t outlen;

if (len < 4) {
avro_set_error("Snappy block is too small to contain a CRC32 checksum");

Member

The indentation is off.

Author

good catch, fixed. the block now uses the same spacing as the rest of the function.

@dxbjavid

Author

added a unit test (test_avro_3807) that pushes undersized snappy blocks of 0 to 3 bytes through avro_codec_decode and checks they're rejected rather than decoded. it uses exactly sized heap allocations so the memcheck/valgrind run flags the out-of-bounds read on the old path while parsing the length prefix. registered under add_avro_test_checkmem alongside the others, and I tidied up the indentation you flagged.
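
As a rough illustration of the exactly-sized-allocation idea, the sketch below is not the contents of test_avro_3807. `decode_block_stub` is a hypothetical stand-in for the decoder (it does not call avro_codec_decode or snappy), and `count_rejected_undersized` is an illustrative name. Each buffer is allocated with exactly `len` bytes, so a memory checker would flag any read past the end if the guard were missing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the decoder under test: rejects blocks too small
 * to hold a 4-byte CRC trailer, otherwise reads the trailer from the end. */
static int decode_block_stub(const void *data, int64_t len)
{
    uint32_t crc;

    if (len < 4) {
        return 1;   /* non-zero means "rejected" in this sketch */
    }
    memcpy(&crc, (const char *)data + (len - 4), sizeof(crc));
    (void)crc;
    return 0;
}

/* Feeds blocks of 0..3 bytes to the stub and counts how many are rejected.
 * Returns -1 if an allocation fails. */
static int count_rejected_undersized(void)
{
    int rejected = 0;

    for (int64_t len = 0; len <= 3; len++) {
        /* Exactly len bytes, no slack; the zero-length case passes NULL. */
        char *buf = len > 0 ? malloc((size_t)len) : NULL;

        if (len > 0 && buf == NULL) {
            return -1;
        }
        if (buf != NULL) {
            memset(buf, 0, (size_t)len);
        }
        if (decode_block_stub(buf, len) != 0) {
            rejected++;
        }
        free(buf);  /* free(NULL) is a no-op */
    }
    return rejected;
}
```

With the guard in place all four sizes are rejected before any byte of the buffer is touched, which is the property a memcheck run would confirm.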

Use one tab for indentation as the rest of the file
Comment thread lang/c/src/codec.c Outdated
uint32_t crc;
size_t outlen;

if (len < 4) {

Member

Suggested change
- if (len < 4) {
+ if (len < 4 || (uint64_t)len > SIZE_MAX) {

len should also not exceed SIZE_MAX; otherwise, on 32-bit systems, it may overflow when passed to snappy_uncompressed_length(), which accepts a size_t.

Author

good point, that 32-bit overflow would slip straight through. added the SIZE_MAX bound to the same guard and pushed. builds fine here with the snappy codec on and the test still passes.
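
The 32-bit concern can be reproduced on any machine with a small sketch that models a 32-bit `size_t`. Everything here is an assumption for illustration: `size32_t`, `SIZE32_MAX`, `narrow_unchecked`, `block_len_ok` and `check_block_len` are hypothetical names, and `uint32_t` stands in for the platform's `size_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for size_t on a 32-bit platform (an assumption of this sketch). */
typedef uint32_t size32_t;
#define SIZE32_MAX UINT32_MAX

/* What an unguarded conversion does: the high bits are silently dropped. */
static size32_t narrow_unchecked(int64_t len)
{
    return (size32_t)len;
}

/* Combined guard in the shape of the suggested change: the length must hold
 * the 4-byte CRC trailer and must fit in the narrower size type. */
static int block_len_ok(int64_t len)
{
    if (len < 4 || (uint64_t)len > SIZE32_MAX) {
        return 0;
    }
    return 1;
}

/* Self-check on the boundary values. */
static void check_block_len(void)
{
    const int64_t too_big = INT64_C(0x100000005);  /* 2^32 + 5 */

    assert(narrow_unchecked(too_big) == 5);  /* looks like a tiny block */
    assert(!block_len_ok(too_big));          /* guard rejects it instead */
    assert(!block_len_ok(3));
    assert(block_len_ok(4));
    assert(block_len_ok((int64_t)SIZE32_MAX));
}
```

Because `len < 4` is evaluated first, the cast to `uint64_t` is only reached for non-negative values, so a negative length cannot slip past the upper bound.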
