check snappy block length before crc trailer in decode_snappy#3807
check snappy block length before crc trailer in decode_snappy#3807dxbjavid wants to merge 4 commits into
Conversation
|
Hi maintainers, Just a friendly follow-up on this PR. When you have a chance, could you please provide an update on its review status? Happy to make any additional changes if needed. Thank you! |
martin-g
left a comment
There was a problem hiding this comment.
It would be nice to have some unit tests
| size_t outlen; | ||
|
|
||
| if (len < 4) { | ||
| avro_set_error("Snappy block is too small to contain a CRC32 checksum"); |
There was a problem hiding this comment.
good catch, fixed. the block now uses the same spacing as the rest of the function.
|
added a unit test (test_avro_3807) that pushes undersized snappy blocks of 0 to 3 bytes through avro_codec_decode and checks they're rejected rather than decoded. it uses exact sized heap allocations so the memcheck/valgrind run flags the out of bounds read on the old path while parsing the length prefix. registered under add_avro_test_checkmem alongside the others, and i tidied up the indentation you flagged. |
Use one tab for indentation as the rest of the file
| uint32_t crc; | ||
| size_t outlen; | ||
|
|
||
| if (len < 4) { |
There was a problem hiding this comment.
| if (len < 4) { | |
| if (len < 4 || (uint64_t)len > SIZE_MAX) { |
len should also be smaller than SIZE_MAX otherwise on 32-bit systems it may overflow when passed to snappy_uncompressed_length() that accepts size_t.
There was a problem hiding this comment.
good point, that 32-bit overflow would slip straight through. added the SIZE_MAX bound to the same guard and pushed. builds fine here with the snappy codec on and the test still passes.
decode_snappy in lang/c/src/codec.c takes the block length straight from the container file, where file_read_block_count only rejects negative values, so a snappy block of 1 to 3 bytes reaches it and the len-4 used for snappy_uncompressed_length, snappy_uncompress and the trailing CRC memcmp underflows to a huge size_t and reads out of bounds. The C++ reader in DataFile.cc already refuses len < 4 before the same subtraction, so add the matching check here.