Open
21 commits
6a89390
igzip: fix raw deflate over-consumption via read_in_length
asonje Mar 20, 2026
93f144e
rename trailer_overconsumption_fixed -> read_in_correction_applied; r…
asonje Mar 20, 2026
92eaf59
add IGZIP inflate/deflate counters to statistics; enable ENABLE_STATI…
asonje Mar 20, 2026
b48d118
add iaa_fallback_igzip: IAA can fall back to IGZIP before software zlib
asonje Mar 20, 2026
0a857e6
cmake: set ENABLE_STATISTICS=OFF by default
asonje Mar 20, 2026
1971197
- Reset statistics to off by default and formatting
asonje Mar 20, 2026
f7ee0ec
IGZIP: lift Z_FINISH-only restriction; enable full streaming compress
asonje Apr 3, 2026
ccd3d6c
fix iaa_fallback_igzip compress: set path_selected=IGZIP after fallback
asonje Apr 14, 2026
37fdc50
iaa: replace marker-based IsIAADecompressible with 512-byte threshold
asonje Apr 14, 2026
fd6ebbb
README: document USE_IGZIP cmake option and ISA-L dependency
asonje Apr 30, 2026
0366ddd
zlib_accel: guard pre_avail_in declaration with USE_IGZIP
asonje Apr 30, 2026
07338b4
Clang format
asonje Apr 30, 2026
8c1bf9a
remove cmake.txt; fix null deref in deflateSetDictionary
asonje May 4, 2026
dd02236
igzip: remove dead SupportedOptions/IGZIPShouldFallback stubs
asonje May 4, 2026
8ba37be
zlib_accel: fix read_in_correction_applied semantics and reset asymmetry
asonje May 4, 2026
cd7cb91
iaa: deprecate iaa_prepend_empty_block config option
asonje May 4, 2026
db0a00f
tests: add IAA->IGZIP fallback coverage
asonje May 4, 2026
a75858e
tests: restore IGZIP path assertions for SYNC_FLUSH regression tests
asonje May 4, 2026
6ef5f24
igzip: document read_in_correction_applied reset and ZSTATE_NEW_HDR c…
asonje May 4, 2026
8f63d4b
address round-2 Copilot review: README doc, igzip.h comment, IGZIP st…
asonje May 5, 2026
612c560
fix: restore gzip_flag after deflateReset on IGZIP stream (Cassandra …
asonje May 6, 2026
5 changes: 4 additions & 1 deletion .gitignore
@@ -1,2 +1,5 @@
/build/
/tests/build
/tests/build
cmake.txt
/resources/

15 changes: 11 additions & 4 deletions README.md
@@ -59,8 +59,10 @@ make
CMake supports the following options:
- USE_QAT (ON/OFF): include QAT acceleration
- USE_IAA (ON/OFF): include IAA acceleration
- USE_IGZIP (ON/OFF): include IGZIP acceleration (requires ISA-L)
- QPL_PATH: path to QPL for IAA acceleration (if not in a standard directory)
- QATZIP_PATH: path to QATzip for QAT acceleration (if not in a standard directory)
- ISAL_PATH: path to ISA-L for IGZIP acceleration (if not in a standard directory)
- DEBUG_LOG (ON/OFF): enable logging
- ENABLE_STATISTICS (ON/OFF): enable statistics
- COVERAGE (ON/OFF): enable test coverage (more details in a later section)
@@ -84,6 +86,9 @@ Requirements for IAA
- [accel-config](https://github.com/intel/idxd-config)
- [Query Processing Library](https://github.com/intel/qpl)

Requirements for IGZIP
- [ISA-L (Intel Intelligent Storage Acceleration Library)](https://github.com/intel/isa-l)

A setup with both QAT and IAA enabled has been tested on an AWS m7i.metal-24xl instance (Ubuntu 22.04, kernel 6.8.0).
Refer to the links above for instructions on how to install the dependencies.

@@ -178,20 +183,22 @@ use_zlib_uncompress
- Enable zlib for decompression
- Setting to 1 is recommended, to allow fall back to zlib in case accelerators cannot be used or experience an error.

iaa_fallback_igzip
- Values: 0,1. Default: 0
- If 1, and an IAA compression or decompression operation fails, the request is retried using IGZIP (if enabled) before falling back to software zlib. Useful on machines where IAA hardware is intermittently unavailable.
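
The retry order described for `iaa_fallback_igzip` can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `Path`, `Attempts`, and `RunWithFallback` are hypothetical names, and each backend is reduced to a callable that reports success or failure.

```cpp
#include <functional>

// Hypothetical names for illustration only; not the library's API.
enum class Path { IAA, IGZIP, ZLIB, NONE };

struct Attempts {
  std::function<bool()> iaa;    // returns true if the IAA request succeeded
  std::function<bool()> igzip;  // returns true if the IGZIP request succeeded
  std::function<bool()> zlib;   // returns true if software zlib succeeded
};

// Tries IAA first. If IAA fails and the fallback option is set (and IGZIP
// was built in), retries with IGZIP before dropping to software zlib.
// Returns the path that actually produced the result.
Path RunWithFallback(const Attempts& a, bool iaa_fallback_igzip,
                     bool igzip_enabled) {
  if (a.iaa()) {
    return Path::IAA;
  }
  if (iaa_fallback_igzip && igzip_enabled && a.igzip()) {
    return Path::IGZIP;
  }
  if (a.zlib()) {
    return Path::ZLIB;
  }
  return Path::NONE;
}
```

Returning the path that succeeded, rather than the path that was tried first, appears to be the concern behind commit ccd3d6c, which sets the selected path to IGZIP after a fallback.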

iaa_compress_percentage
- Values: 0-100. Default: 50
- If both IAA and QAT are enabled, percentage of compression calls to offload to IAA.

iaa_prepend_empty_block
- Values: 0,1. Default: 0
- Prepend an empty stored block to the compressed data to "mark" that the data was compressed by IAA.
- IAA has a 4kB history window limit and it is not able to decompress blocks that use a longer history window (up to 32kB per deflate standard).
- During decompression, this marker indicates that the data was compressed by IAA and is therefore guaranteed decompressible by IAA.
- **Deprecated.** This option is retained for backward compatibility and will be removed in a future release. Setting it to 1 has no effect on decompression.
- Background: the original design prepended a 5-byte empty stored-block marker to IAA-compressed output so the decompressor could identify IAA-produced data (which uses a 4kB history window). This approach was abandoned because QPL hardware always consumes all `available_in` bytes regardless of where the stream boundary falls, making marker-based detection unreliable when the caller does not supply the exact compressed size. IAA decompression eligibility is now determined by a 512-byte minimum input length threshold: callers such as Java's `ZipInputStream` feed chunks of ≤512 bytes when the compressed size is unknown, while Lucene stored-field reads always supply the exact size (>512 bytes).
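
For reference, the 5-byte marker mentioned above is an empty deflate stored block: one byte carrying BFINAL=0 and BTYPE=00 padded to a byte boundary, followed by LEN=0x0000 and its complement NLEN=0xFFFF. The sketch below shows how such a marker could be recognized at a given offset. `kEmptyStoredBlock` and `HasEmptyStoredBlockAt` are hypothetical names, and the caller is assumed to pass the length of any gzip header as `offset`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Empty stored block: header byte 0x00, LEN = 0x0000, NLEN = 0xFFFF.
constexpr std::array<uint8_t, 5> kEmptyStoredBlock = {0x00, 0x00, 0x00, 0xFF,
                                                      0xFF};

// Returns true if the 5 marker bytes appear at data[offset..offset+4].
// Hypothetical helper; offset is the length of any container header.
bool HasEmptyStoredBlockAt(const uint8_t* data, size_t length, size_t offset) {
  if (data == nullptr || offset > length ||
      length - offset < kEmptyStoredBlock.size()) {
    return false;
  }
  for (size_t i = 0; i < kEmptyStoredBlock.size(); ++i) {
    if (data[offset + i] != kEmptyStoredBlock[i]) {
      return false;
    }
  }
  return true;
}
```

A check like this only inspects the start of the input, so it cannot tell where the compressed stream ends, which is the limitation the background note describes.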

iaa_uncompress_percentage
- Values: 0-100. Default: 50
- If both IAA and QAT are enabled, percentage of decompression calls to offload to IAA.
- If iaa_prepend_empty_block = 1, this percentage is only applied to data with the empty block marker.

qat_periodical_polling = 0
- Values: 0,1. Default: 0
3 changes: 3 additions & 0 deletions config/config.cpp
@@ -30,6 +30,7 @@ uint32_t configs[CONFIG_MAX] = {
1, /*qat_compression_level*/
0, /*qat_compression_allow_chunking*/
0, /*ignore_zlib_dictionary*/
0, /*iaa_fallback_igzip*/
2, /*log_level*/
1000 /*log_stats_samples*/
};
@@ -56,6 +57,7 @@ bool LoadConfigFile(std::string& file_content, const char* file_path) {
"qat_compression_level",
"qat_compression_allow_chunking",
"ignore_zlib_dictionary",
"iaa_fallback_igzip",
"log_level",
"log_stats_samples"
};
@@ -91,6 +93,7 @@ bool LoadConfigFile(std::string& file_content, const char* file_path) {
trySetConfig(QAT_COMPRESSION_LEVEL, 9, 1);
trySetConfig(QAT_COMPRESSION_ALLOW_CHUNKING, 1, 0);
trySetConfig(IGNORE_ZLIB_DICTIONARY, 1, 0);
trySetConfig(IAA_FALLBACK_IGZIP, 1, 0);
trySetConfig(LOG_LEVEL, 2, 0);
trySetConfig(LOG_STATS_SAMPLES, UINT32_MAX, 0);
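
The `trySetConfig(OPTION, max, min)` calls above suggest that each option is accepted only within a bounded range. The sketch below illustrates that idea under stated assumptions: `TrySetOption` and the key/value map are hypothetical, and the real `trySetConfig` may parse or report errors differently.

```cpp
#include <cstdint>
#include <exception>
#include <map>
#include <string>

// Hypothetical helper: looks up `name` in parsed key/value pairs and stores
// the value in `slot` only if it is a decimal number within [min, max].
// Leaves `slot` (the default) untouched otherwise.
bool TrySetOption(const std::map<std::string, std::string>& kv,
                  const std::string& name, uint32_t max, uint32_t min,
                  uint32_t& slot) {
  auto it = kv.find(name);
  if (it == kv.end()) {
    return false;  // option not present in the config file
  }
  const std::string& text = it->second;
  if (text.empty() ||
      text.find_first_not_of("0123456789") != std::string::npos) {
    return false;  // not a plain unsigned decimal number
  }
  unsigned long long value = 0;
  try {
    value = std::stoull(text);
  } catch (const std::exception&) {
    return false;  // too large to represent
  }
  if (value < min || value > max) {
    return false;  // outside the allowed range
  }
  slot = static_cast<uint32_t>(value);
  return true;
}
```

With this shape, a call such as `TrySetOption(kv, "iaa_fallback_igzip", 1, 0, slot)` accepts only 0 or 1 and keeps the compiled-in default for anything else.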

3 changes: 2 additions & 1 deletion config/config.h
@@ -20,11 +20,12 @@ enum ConfigOption {
USE_IGZIP_UNCOMPRESS,
IAA_COMPRESS_PERCENTAGE,
IAA_UNCOMPRESS_PERCENTAGE,
IAA_PREPEND_EMPTY_BLOCK,
IAA_PREPEND_EMPTY_BLOCK, // DEPRECATED — see README for details
QAT_PERIODICAL_POLLING,
QAT_COMPRESSION_LEVEL,
QAT_COMPRESSION_ALLOW_CHUNKING,
IGNORE_ZLIB_DICTIONARY,
IAA_FALLBACK_IGZIP,
LOG_LEVEL,
LOG_STATS_SAMPLES,
CONFIG_MAX
66 changes: 32 additions & 34 deletions iaa.cpp
@@ -99,9 +99,19 @@ int CompressIAA(uint8_t* input, uint32_t* input_length, uint8_t* output,
output_shift += GZIP_EXT_XHDR_SIZE;
}

// If prepending an empty block, leave space for it to be added
// For zlib format, we don't need an empty block as a marker, as the zlib
// header includes info about the window size
// DEPRECATED: iaa_prepend_empty_block is no longer used by the decompressor.
// The original design used a 5-byte empty stored-block marker written at the
// start of IAA-compressed output so that the decompressor could detect and
// trust IAA-compressed data (which uses a 4kB history window). This approach
// was abandoned because QPL hardware always consumes all available_in bytes
// regardless of where the BFINAL=1 token falls (overconsumption bug), so a
// caller-supplied exact boundary is required instead. IsIAADecompressible now
// uses a 512-byte minimum input length threshold to gate IAA decompression:
// Java ZipInputStream feeds <=512-byte chunks (csize unknown), triggering
// overconsumption; Lucene stored-field reads always provide the exact
// compressed size (>512 bytes), where consuming all input is correct.
// The config option is retained for backward compatibility and will be
// removed in a future release.
bool prepend_empty_block = false;
CompressedFormat format = GetCompressedFormat(window_bits);
if (format != CompressedFormat::ZLIB &&
@@ -255,44 +265,32 @@ bool SupportedOptionsIAA(int window_bits, uint32_t input_length,
return false;
}

bool PrependedEmptyBlockPresent(uint8_t* input, uint32_t input_length,
CompressedFormat format) {
uint32_t header_length = GetHeaderLength(format);
if (header_length + PREPENDED_BLOCK_LENGTH > input_length) {
return false;
}

if (input[header_length] == 0 && input[header_length + 1] == 0 &&
input[header_length + 2] == 0 && input[header_length + 3] == 0xFF &&
input[header_length + 4] == 0xFF) {
Log(LogLevel::LOG_INFO, "PrependedEmptyBlockPresent() Line ", __LINE__,
" Empty block detected\n");
return true;
}

return false;
}

bool IsIAADecompressible(uint8_t* input, uint32_t input_length,
int window_bits) {
CompressedFormat format = GetCompressedFormat(window_bits);
if (format == CompressedFormat::ZLIB) {
int window = GetWindowSizeFromZlibHeader(input, input_length);
Log(LogLevel::LOG_INFO, "IsIAADecompressible() Line ", __LINE__, " window ",
window, "\n");
return window <= 12;
} else {
// if no empty block markers selected, we cannot tell for sure it's
// IAA-decompression, but we assume it is.
if (configs[IAA_PREPEND_EMPTY_BLOCK] == 0) {
return true;
} else if (configs[IAA_PREPEND_EMPTY_BLOCK] == 1 &&
PrependedEmptyBlockPresent(input, input_length, format)) {
return true;
} else {
return false;
}
}
// For raw deflate and gzip formats, QPL always reports total_in ==
// available_in regardless of where BFINAL=1 falls in the stream. This is
// safe only when the caller provides avail_in == actual_compressed_size
// (e.g. Lucene stored-field reads, where the exact compressed size is known
// from the .fdt file format).
//
// Callers that do not know the compressed size a priori — notably Java's
// ZipInputStream, which uses a fixed 512-byte internal buffer and feeds
// chunks of that size to inflate() — will have avail_in > actual_csize.
// QPL consuming all 512 bytes then reporting total_in=512 when actual_csize
// was 2 triggers ZipException at the Java level.
//
// Guard: only attempt IAA when input_length > 512. ZipInputStream always
// feeds chunks of at most 512 bytes, so any call above that threshold is
// guaranteed not to be a ZipInputStream-chunked read. Lucene stored-field
// entries are typically much larger than 512 bytes; the few that are smaller
// fall back to IGZIP which is also correct.
static constexpr uint32_t kZipInputStreamBufferSize = 512;
return input_length > kZipInputStreamBufferSize;
}
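
The `window <= 12` test in the zlib branch corresponds to IAA's 4kB history limit, since 2^12 = 4096. As a sketch of where that number can come from, the function below reads the window size from a zlib header as defined in RFC 1950 (CMF byte: low nibble CM, high nibble CINFO, window = 2^(CINFO + 8)). `WindowBitsFromZlibHeader` is a hypothetical name; it is an assumption that `GetWindowSizeFromZlibHeader` works along these lines.

```cpp
#include <cstddef>
#include <cstdint>

// Returns log2 of the window size (8..15) declared in a zlib header,
// or -1 if the first two bytes are not a valid RFC 1950 header.
// Hypothetical helper for illustration.
int WindowBitsFromZlibHeader(const uint8_t* input, size_t length) {
  if (input == nullptr || length < 2) {
    return -1;
  }
  const uint8_t cmf = input[0];
  const uint8_t flg = input[1];
  if ((cmf & 0x0F) != 8) {
    return -1;  // CM must be 8 (deflate)
  }
  if (((cmf << 8) | flg) % 31 != 0) {
    return -1;  // FCHECK: the 16-bit header must be a multiple of 31
  }
  const int cinfo = cmf >> 4;
  if (cinfo > 7) {
    return -1;  // windows larger than 32kB are not allowed
  }
  return cinfo + 8;
}
```

A stream with the common header bytes `0x78 0x9C` declares a 32kB window (15 bits) and would fail a `<= 12` check, while a stream whose header starts with `0x48` declares a 4kB window and would pass.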

#endif // USE_IAA