Add compression ratio calculation and per-column compression stats (#18184) by johnsolomonj · Pull Request #18185 · apache/pinot

johnsolomonj · 2026-04-13T18:59:50Z

Labels: feature, release-notes, observability

Summary

Draft implementation for the PEP proposed in #18184. Kept as draft pending design review on the issue.

Adds compression ratio tracking and per-column compression stats to Pinot's existing table size and metadata APIs:

Track uncompressed forward index sizes at write time in all raw column writers (BaseChunkForwardIndexWriter subclasses, VarByteChunkForwardIndexWriterV4/V5/V6, CLPForwardIndexCreatorV2)
Track raw ingest size at write time for dict columns in SegmentDictionaryCreator (STRING via Utf8.encodedLength, BYTES via array length, BIG_DECIMAL via BigDecimalUtils.byteSize; fixed-width types computed from totalDocs × typeWidth at seal time)
Persist uncompressed size and compression codec to metadata.properties per column
Expose compressionStats, columnCompressionStats, and storageBreakdown on both GET /tables/{table}/size and GET /tables/{table}/metadata
Add TABLE_COMPRESSION_RATIO_PERCENT and TABLE_TIERED_STORAGE_SIZE controller gauges with tier lifecycle management
Gated by table-level indexingConfig.compressionStatsEnabled flag (default: off, zero overhead when disabled)
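
The per-type accounting listed above for dictionary columns can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `RawIngestSizeTracker` and its method names are hypothetical, `String.getBytes(UTF_8).length` stands in for Guava's `Utf8.encodedLength`, and BIG_DECIMAL is left out because the exact `BigDecimalUtils.byteSize` encoding is not described here.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical tracker; class and method names are illustrative, not Pinot's API. */
final class RawIngestSizeTracker {
  private final boolean enabled;
  private long totalBytes;

  RawIngestSizeTracker(boolean enabled) {
    this.enabled = enabled;
  }

  /** STRING: count UTF-8 encoded bytes (stand-in for Utf8.encodedLength). */
  void addString(String value) {
    if (enabled) {
      totalBytes += value.getBytes(StandardCharsets.UTF_8).length;
    }
  }

  /** BYTES: count the array length. */
  void addBytes(byte[] value) {
    if (enabled) {
      totalBytes += value.length;
    }
  }

  /** Fixed-width types: computed once at seal time as totalDocs x typeWidth. */
  void addFixedWidth(long totalDocs, int typeWidthBytes) {
    if (enabled) {
      totalBytes += totalDocs * typeWidthBytes;
    }
  }

  long getTotalBytes() {
    return totalBytes;
  }
}
```

When the flag is off, every method reduces to a single boolean check, which is the sense in which the feature is meant to add no per-row encoding cost when disabled.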

Design document

See #18184 for the full PEP including motivation, prior art, API response structure, and known corner cases.

Key design decisions

Per-value tracking: Uncompressed size tracked at individual put*() callsites, capturing raw ingested data size without chunk headers or alignment padding
Shared codec resolution: ForwardIndexType.resolveCompressionType() handles CLP codec variants, used by both BaseSegmentCreator and ForwardIndexHandler
Dict columns included: Dictionary-encoded columns get codec="DICT_ENCODED", rawIngestSizeInBytes tracked via SegmentDictionaryCreator, and onDiskSizeInBytes = forward index + dictionary file size. Columns with mixed encoding across segments produce codec="MIXED" with a per-codec codecBreakdown (segments, rawIngestSizeInBytes, onDiskSizeInBytes per codec). hasDictionary field removed — encoding fully expressed via codec.
Backward compatible: New metadata fields are additive; old segments gracefully return defaults

Test plan

Unit tests for writer uncompressed size tracking (fixed-byte, var-byte V1-V3, V4/V5/V6)
Unit tests for CLP V2 sub-stream size aggregation
Unit tests for ForwardIndexType.resolveCompressionType() codec resolution
Unit tests for ForwardIndexHandler compression stats persistence on reload
Unit tests for SegmentDictionaryCreator.getTotalRawIngestBytes() (STRING UTF-8 multi-byte, BYTES, BIG_DECIMAL, MV columns)
Controller aggregation tests (dict sentinel preservation, negative ratio guards, partial coverage)
Integration test for end-to-end compression stats API response
E2E manual tests for dict-only, raw-only, mixed codec, and flag-off scenarios via both /size and /metadata APIs
Verify zero overhead when compressionStatsEnabled = false

codecov-commenter · 2026-04-13T19:51:35Z

Codecov Report

❌ Patch coverage is 72.11155% with 210 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.82%. Comparing base (b27a3ad) to head (c6fce9e).
⚠️ Report is 13 commits behind head on master.

Files with missing lines	Patch %	Lines
.../apache/pinot/controller/util/TableSizeReader.java	74.85%	26 Missing and 18 partials ⚠️
.../pinot/server/api/resources/TableSizeResource.java	9.30%	35 Missing and 4 partials ⚠️
...t/controller/util/ServerSegmentMetadataReader.java	72.89%	18 Missing and 11 partials ⚠️
...che/pinot/server/api/resources/TablesResource.java	77.67%	2 Missing and 23 partials ⚠️
...segment/creator/impl/SegmentDictionaryCreator.java	62.85%	8 Missing and 5 partials ⚠️
...oller/api/resources/PinotTableRestletResource.java	0.00%	12 Missing ⚠️
...local/segment/creator/impl/BaseSegmentCreator.java	72.22%	0 Missing and 10 partials ⚠️
...ent/creator/impl/fwd/CLPForwardIndexCreatorV2.java	47.36%	2 Missing and 8 partials ⚠️
...ment/index/forward/ForwardIndexCreatorFactory.java	68.18%	4 Missing and 3 partials ⚠️
...ocal/segment/creator/impl/ColumnIndexCreators.java	20.00%	2 Missing and 2 partials ⚠️
... and 9 more

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #18185      +/-   ##
============================================
+ Coverage     64.78%   64.82%   +0.04%     
  Complexity     1309     1309              
============================================
  Files          3380     3384       +4     
  Lines        209544   210244     +700     
  Branches      32797    32962     +165     
============================================
+ Hits         135746   136285     +539     
- Misses        62870    62943      +73     
- Partials      10928    11016      +88

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (ø)`
integration	`100.00% <ø> (ø)`
integration1	`100.00% <ø> (ø)`
integration2	`0.00% <ø> (ø)`
java-21	`64.82% <72.11%> (+0.04%)`	⬆️
temurin	`64.82% <72.11%> (+0.04%)`	⬆️
unittests	`64.81% <72.11%> (+0.04%)`	⬆️
unittests1	`56.95% <53.22%> (+<0.01%)`	⬆️
unittests2	`37.39% <68.65%> (+0.11%)`	⬆️

xiangfu0 · 2026-05-24T07:18:14Z

A few things:

Add docs for all the public APIs and configs.
Javadocs should follow Markdown style and start with ///.
Pinot supports having both a dictionary and a raw forward index, so checking for one to infer the presence or absence of the other won't hold.

johnsolomonj · 2026-06-02T21:47:22Z

Added /// Javadoc to all new public classes (ColumnCompressionStatsInfo, CompressionStatsSummary) and the compressionStatsEnabled config field.
Converted all class-level docs to /// markdown style.
Addressed. hasDictionary is removed and codec is the single source of truth. Dict columns set codec="DICT_ENCODED", so checking one encoding no longer implies anything about the other.

xiangfu0 · 2026-06-09T21:17:24Z

+      @JsonProperty("rawIngestSizeBytes") long rawIngestSizeBytes,
+      @JsonProperty("onDiskSizeBytes") long onDiskSizeBytes,
+      @JsonProperty("tier") @Nullable String tier,
+      @JsonProperty("columnCompressionStats") @Nullable Map<String, ColumnCompressionStatsInfo>


These per-column stats may blow up the response. Please make sure the REST API has an explicit param to request them; the default should be off.

Nice idea! Added a query param, includeColumnStats. Per-column stats are included only if this param is passed and set to true.
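
The opt-in behavior agreed on here could look something like the sketch below. It is not the PR's implementation: `SizeResponseSketch` and its fields are hypothetical stand-ins, and the real endpoints would receive `includeColumnStats` as a query parameter rather than a method argument.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical response holder; not Pinot's SegmentSizeInfo. */
final class SizeResponseSketch {
  final long rawIngestSizeBytes;
  final long onDiskSizeBytes;
  /** Null unless the caller opted in with includeColumnStats=true. */
  final Map<String, long[]> columnCompressionStats;

  private SizeResponseSketch(long raw, long onDisk, Map<String, long[]> columns) {
    this.rawIngestSizeBytes = raw;
    this.onDiskSizeBytes = onDisk;
    this.columnCompressionStats = columns;
  }

  /** perColumn maps a column name to {rawIngestBytes, onDiskBytes}. */
  static SizeResponseSketch build(Map<String, long[]> perColumn, boolean includeColumnStats) {
    long raw = 0;
    long onDisk = 0;
    // The summary totals are always computed.
    for (long[] sizes : perColumn.values()) {
      raw += sizes[0];
      onDisk += sizes[1];
    }
    // The per-column map is only attached when explicitly requested.
    Map<String, long[]> columns = includeColumnStats ? new LinkedHashMap<>(perColumn) : null;
    return new SizeResponseSketch(raw, onDisk, columns);
  }
}
```

With a null per-column field and a serializer configured to omit nulls, the default response stays small regardless of how many columns the table has.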

xiangfu0 · 2026-06-09T21:18:24Z

  protected int _chunkSize;
  protected long _dataOffset;
+  protected long _uncompressedSize;
+  protected boolean _trackUncompressedSize = true;


Should this default to false?

The default of true was effectively overridden by setTrackUncompressedSize(false) via ForwardIndexCreatorFactory before any writes happened, so it wasn't causing incorrect behavior. But false is the right default for clarity and safety. Fixed.

xiangfu0 · 2026-06-09T21:18:58Z

  private int _metadataSize = 0;
  private long _chunkOffset = 0;
+  private long _uncompressedSize = 0;
+  private boolean _trackUncompressedSize = true;


The default should be false, with the value set externally.

The default of true was effectively overridden by setTrackUncompressedSize(false) via ForwardIndexCreatorFactory before any writes happened, so it wasn't causing incorrect behavior. But false is the right default for clarity and safety. Fixed.

xiangfu0 · 2026-06-09T22:11:25Z

  }

  public void putInt(int value) {
+    if (_trackUncompressedSize) {


This puts the tracking on the hot path for every put call.
Would it be better to infer it from _chunkDataOffset?

xiangfu0 · 2026-06-09T22:20:59Z

+            accum[1] += info.getOnDiskSizeInBytes();
+            if (info.getCodec() != null) {
+              columnCodecMap.merge(col, info.getCodec(),
+                  (existing, incoming) -> existing.equals(incoming) ? existing : "MIXED");


This MIXED value doesn't provide much info. Can you make it a list of codecs?

When codec="MIXED" in the response, the codecBreakdown map provides the full per-codec detail — segment count, rawIngestSizeInBytes, and onDiskSizeInBytes for each codec. So the list of codecs and their sizes are already available via codecBreakdown.
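
To illustrate how a single codec label and a per-codec breakdown can coexist, here is a rough sketch. The class `ColumnCodecAggregator` and its nested `CodecTotals` type are hypothetical; only the "MIXED" label and the three breakdown fields (segments, rawIngestSizeInBytes, onDiskSizeInBytes) come from the discussion above.

```java
import java.util.TreeMap;

/** Hypothetical per-column aggregator across segments; not the PR's class. */
final class ColumnCodecAggregator {
  /** Totals for one codec, mirroring the fields named in codecBreakdown. */
  static final class CodecTotals {
    int segments;
    long rawIngestSizeInBytes;
    long onDiskSizeInBytes;
  }

  private final TreeMap<String, CodecTotals> breakdown = new TreeMap<>();

  void addSegment(String codec, long rawIngestBytes, long onDiskBytes) {
    CodecTotals totals = breakdown.computeIfAbsent(codec, k -> new CodecTotals());
    totals.segments++;
    totals.rawIngestSizeInBytes += rawIngestBytes;
    totals.onDiskSizeInBytes += onDiskBytes;
  }

  /** Single codec name, "MIXED" when segments disagree, or null when nothing was added. */
  String getCodec() {
    if (breakdown.isEmpty()) {
      return null;
    }
    return breakdown.size() == 1 ? breakdown.firstKey() : "MIXED";
  }

  /** The keys of this map are effectively the list of codecs seen for the column. */
  TreeMap<String, CodecTotals> getCodecBreakdown() {
    return breakdown;
  }
}
```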

xiangfu0 · 2026-06-09T22:26:30Z

+
+  @Override
+  @Nullable
+  public String getCompressionCodec() {


This should be a ChunkCompressionType, not a String.

getCompressionCodec() can return a ChunkCompressionType name, "DICT_ENCODED", or "MIXED" — a new enum would either duplicate ChunkCompressionType values (and drift when new codecs are added) or require a wrapper that still needs a String fallback. Kept it as String to avoid that coupling. Open to suggestions if there's a cleaner pattern you have in mind.

…to table size API

This feature enables tracking and reporting of forward index compression effectiveness across Pinot segments. When `compressionStatsEnabled` is set in table config's indexing config, segment creation records uncompressed forward index sizes and compression codec in metadata.properties. The server-side table size endpoint now returns per-segment and per-column raw/compressed forward index sizes. The controller aggregates these into table-level compression ratio metrics (raw/compressed), with partial coverage tracking for mixed-version clusters. Three new ControllerGauge metrics (TABLE_COMPRESSION_RATIO_PERCENT, TABLE_RAW_FORWARD_INDEX_SIZE_PER_REPLICA, TABLE_COMPRESSED_FORWARD_INDEX_SIZE_PER_REPLICA) are emitted for monitoring. ForwardIndexHandler is updated to persist compression metadata during segment reload operations (compression type change and dict-to-raw conversion).
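
As a rough sketch of the controller-side aggregation described in this commit message: the ratio is raw bytes over compressed bytes, and coverage is partial when some segments carry no stats. Everything below is an assumption for illustration. `CompressionSummarySketch` is a hypothetical class, non-positive sizes are assumed to mean "this segment has no stats", and returning 0.0 when no ratio can be computed is an arbitrary choice rather than the PR's behavior.

```java
/** Hypothetical table-level summary; not the PR's CompressionStatsSummary. */
final class CompressionSummarySketch {
  private long rawBytes;
  private long onDiskBytes;
  private int segmentsWithStats;
  private int totalSegments;

  /** Assumption: non-positive sizes mean the segment has no compression stats. */
  void addSegment(long segmentRawBytes, long segmentOnDiskBytes) {
    totalSegments++;
    if (segmentRawBytes > 0 && segmentOnDiskBytes > 0) {
      rawBytes += segmentRawBytes;
      onDiskBytes += segmentOnDiskBytes;
      segmentsWithStats++;
    }
  }

  /** raw / compressed, guarded so a missing denominator never yields a bogus ratio. */
  double getCompressionRatio() {
    return onDiskBytes > 0 ? (double) rawBytes / onDiskBytes : 0.0;
  }

  /** True whenever at least one known segment contributed no stats, including the zero-stats case. */
  boolean isPartialCoverage() {
    return totalSegments > 0 && segmentsWithStats < totalSegments;
  }
}
```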

…feature

- Add 6 new test files covering writer-level tracking, segment creation, corner cases, ForwardIndexHandler reload, and integration tests for both offline and realtime (Kafka) ingestion paths
- Merge redundant dual-loop in TableSizeReader into a single pass over server info, improving performance during table size aggregation
- Fix offline integration test teardown to properly wait for table data manager removal before stopping servers
- Wrap second table cleanup in offline test in finally block to prevent resource leaks on assertion failure

…tier breakdown, and stale metadata cleanup

- Wrap flat compression fields in nested CompressionStats DTO with @JsonInclude(NON_NULL)
- Add StorageBreakdown with per-tier segment count and size (always reported)
- Add per-column ColumnCompressionDetail with aggregated sizes, ratio, and codec (MIXED when codecs differ across segments)
- Gate compressionStats on tableConfig.indexingConfig.compressionStatsEnabled; suppress from JSON when OFF
- Fix isPartialCoverage: now correctly returns true when 0 segments have stats but non-missing segments exist
- Clear stale forwardIndex.compressionCodec and forwardIndex.uncompressedSizeBytes on raw-to-dict reload
- Support null values in SegmentMetadataUtils.updateMetadataProperties to clear properties
- Add TABLE_TIERED_STORAGE_SIZE gauge; emit tier metrics always; clear compression+tier gauges when flag OFF
- Add testRawToDictClearsCompressionStats, testCompressionStatsNullWhenFlagOff, per-column/tier assertions
- Update integration tests for nested compressionStats JSON structure
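
The "null value clears the property" convention mentioned in this commit can be sketched as below. This is a simplified stand-in: `MetadataUpdateSketch` is hypothetical and works on a plain `Map` rather than on the segment's metadata.properties file, and the key names in the usage note are shortened for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; a plain Map stands in for the segment metadata properties. */
final class MetadataUpdateSketch {
  /** Applies updates in place: a null value removes the key, anything else overwrites it. */
  static void applyUpdates(Map<String, String> metadata, Map<String, String> updates) {
    for (Map.Entry<String, String> entry : updates.entrySet()) {
      if (entry.getValue() == null) {
        metadata.remove(entry.getKey());
      } else {
        metadata.put(entry.getKey(), entry.getValue());
      }
    }
  }

  /** Builds the update set for a raw-to-dict reload; HashMap is used because it permits null values. */
  static Map<String, String> rawToDictUpdates(String codecKey, String sizeKey) {
    Map<String, String> updates = new HashMap<>();
    updates.put(codecKey, null);
    updates.put(sizeKey, null);
    return updates;
  }
}
```

For example, passing the codec and uncompressed-size keys through `rawToDictUpdates` and then `applyUpdates` leaves neither key in the metadata, so a later reader cannot pick up stale raw-index stats for a column that is now dictionary-encoded.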

…leSizeResource for dict

- Gate _totalRawIngestBytes accumulation in SegmentDictionaryCreator behind a _trackRawIngestBytes flag (passed from IndexCreationContext.isCompressionStatsEnabled() via DictionaryIndexType.createIndexCreator). Eliminates Utf8.encodedLength() and BigDecimalUtils.byteSize() calls on every row when the feature is disabled.
- Fix TableSizeResource to emit CODEC_DICT_ENCODED for dict columns instead of codec=null, include dict file size in onDiskSizeInBytes, and populate rawIngestSizeInBytes from getDictColumnRawIngestSizeBytes(), consistent with TablesResource handling.

…mns regardless of column filter

The metadata endpoint accepts an optional ?columns= filter; when omitted, JAX-RS provides an empty list making columnSet empty, so the column loop iterated zero columns and compression stats were never collected. Split the loop into two: a column-stats loop scoped to columnSet, and a separate compression-stats loop over allSegmentColumns. This keeps per-requested-column data scoped to the filter while ensuring compression stats always cover all segment columns.

…ablesResource

TableSizeReader: the summary guard required maxRawFwdIndexSize > 0 which is always false for dict-only tables (no raw forward index). Switch to summing per-column rawIngest and onDisk from perColumnMax for all table types, which is consistent with per-column output and covers dict-only, raw-only, and mixed tables correctly.

TablesResource: split the single column loop into a column-stats loop (scoped to caller's ?columns= filter) and a separate compression-stats loop over all segment columns, so compression stats are always collected regardless of the column filter.

…ardIndexSizeBytes to rawIngestSizeBytes/onDiskSizeBytes

These fields were added in this PR (not on master) so no backward compatibility concern. Aligns naming with ColumnCompressionStatsInfo (rawIngestSizeInBytes, onDiskSizeInBytes) and CompressionStatsSummary (rawIngestSizePerReplicaInBytes, onDiskSizePerReplicaInBytes) for consistency across the compression stats API.

…to gate per-column stats

Per-column compression stats (columnCompressionStats) can be large for tables with many columns. Add ?includeColumnStats=false (default) to both GET /tables/{table}/size and GET /tables/{table}/metadata so callers opt in explicitly.

- compressionStats summary and storageBreakdown always returned when feature flag enabled
- columnCompressionStats only computed and returned when includeColumnStats=true
- param flows end-to-end from controller to server; server skips per-column map construction when false, avoiding unnecessary CPU and response bloat

…edByteChunkForwardIndexWriter

Removes per-put if(_trackUncompressedSize) branches from putInt/putLong/putFloat/putDouble. _chunkDataOffset already accumulates the same byte count unconditionally for flush detection, so we read it once per chunk flush instead of re-incrementing per value. Updates testPartialChunkAccountedInClose to match per-chunk semantics and adds Javadoc clarifying that getUncompressedSize() is accurate only after close().

…artial chunk

Override getUncompressedSize() in FixedByteChunkForwardIndexWriter to return _uncompressedSize + _chunkDataOffset so callers reading before close() (e.g. writeMetadata()) get the correct total. Without this, partial chunks that have not yet triggered a flush return 0, causing compression stats to be silently omitted from segment metadata.
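
The per-chunk accounting from these two commits can be illustrated with the toy writer below. It is a sketch under assumptions, not FixedByteChunkForwardIndexWriter: `ChunkSizeTrackingWriter` is hypothetical, the buffer position plays the role of _chunkDataOffset, and the flush simply discards the chunk where a real writer would compress and write it.

```java
import java.nio.ByteBuffer;

/** Toy chunk writer; the buffer position plays the role of _chunkDataOffset. */
final class ChunkSizeTrackingWriter {
  private final ByteBuffer chunkBuffer;
  private long uncompressedSize; // bytes of chunks already flushed
  private boolean trackUncompressedSize; // defaults to false, enabled externally

  ChunkSizeTrackingWriter(int chunkSizeBytes) {
    if (chunkSizeBytes < Integer.BYTES) {
      throw new IllegalArgumentException("chunk must hold at least one int");
    }
    chunkBuffer = ByteBuffer.allocate(chunkSizeBytes);
  }

  void setTrackUncompressedSize(boolean track) {
    trackUncompressedSize = track;
  }

  /** No tracking branch here: the buffer position already counts the bytes written. */
  void putInt(int value) {
    if (chunkBuffer.remaining() < Integer.BYTES) {
      flushChunk();
    }
    chunkBuffer.putInt(value);
  }

  /** Tracking cost is paid once per chunk, not once per value. */
  private void flushChunk() {
    if (trackUncompressedSize) {
      uncompressedSize += chunkBuffer.position();
    }
    chunkBuffer.clear(); // a real writer would compress and write the chunk first
  }

  /** Includes the unflushed partial chunk so reads before close() are not short. */
  long getUncompressedSize() {
    return trackUncompressedSize ? uncompressedSize + chunkBuffer.position() : 0;
  }

  void close() {
    flushChunk();
  }
}
```

Adding the current buffer position in the getter is what keeps a segment whose data never fills a whole chunk from reporting 0.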

…acy writer, and ForwardIndexHandler

- MultiValueFixedByteRawIndexCreatorTest: tracking enabled/disabled
- MultiValueVarByteRawIndexCreatorTest: tracking enabled/disabled
- ForwardIndexWriterUncompressedSizeTest: legacy VarByteChunkForwardIndexWriter tracking
- ForwardIndexHandlerCompressionStatsTest: codec not persisted when compressionStatsEnabled=false

…rawIngestSize metadata persistence

- VarByteChunkSVForwardIndexTest: getUncompressedSize/setTrackUncompressedSize via SingleValueVarByteRawIndexCreator (enabled and disabled)
- SegmentDictionaryCreatorRawIngestSizeTest: end-to-end test verifying dict.rawIngestSizeBytes is persisted to segment metadata when compressionStatsEnabled

… behavior

The initial segment already has SNAPPY persisted (built with compressionStatsEnabled=true). When stats are disabled, the handler does not overwrite the metadata with the new codec, so the assertion is that the old value is unchanged, not null.

…rror

Without this, a stale gauge value from a previous successful fetch persists when all servers subsequently return errors. The test testGetTableSubTypeSizeAllErrors asserts the gauge must not exist after an all-error run.

…/size API

Root cause: the summary accumulation loop was gated on perColumnMax which is only populated when the server is called with includeColumnStats=true. For the default case (includeColumnStats=false), the server omits columnCompressionStats from SegmentSizeInfo so perColumnMax was always empty and _segmentsWithStats stayed 0, causing _compressionStats to be null.

Fix: use segment-level rawIngestSizeBytes/onDiskSizeBytes from SegmentSizeInfo for the summary (always populated by servers when compressionStatsEnabled). Dict-only segments count toward coverage but not the ratio to avoid skewing it toward zero. Keep per-column fallback for legacy servers that don't populate segment-level fields. Adds regression test testCompressionStatsSummaryPresentWhenColumnStatsExcluded.

johnsolomonj force-pushed the feature/compression-stats-tracking branch 3 times, most recently from b9e573e to 7667a13 Compare May 12, 2026 23:13

johnsolomonj mentioned this pull request May 19, 2026

[PEP] Add compression ratio calculation and per-column compression stats #18184

Open

johnsolomonj force-pushed the feature/compression-stats-tracking branch 4 times, most recently from 0bf95a3 to 0741c48 Compare May 19, 2026 19:41

xiangfu0 added the release-notes, observability, and feature labels May 20, 2026

xiangfu0 force-pushed the feature/compression-stats-tracking branch 2 times, most recently from d4ce64e to 1cea546 Compare May 24, 2026 06:41

xiangfu0 reviewed May 24, 2026

Comment thread pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java Outdated

xiangfu0 reviewed May 24, 2026

Comment thread ...ava/org/apache/pinot/integration/tests/CompressionStatsRealtimeIngestionIntegrationTest.java Outdated

xiangfu0 reviewed May 24, 2026

Comment thread ...t-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java Outdated

johnsolomonj force-pushed the feature/compression-stats-tracking branch from 992eeae to c53db31 Compare June 2, 2026 00:28

xiangfu0 reviewed Jun 9, 2026

johnsolomonj force-pushed the feature/compression-stats-tracking branch from d6abaf1 to f4902a8 Compare June 10, 2026 12:31

johnsolomonj marked this pull request as ready for review June 11, 2026 23:45

johnsolomonj changed the title ~~[Draft] Add compression ratio calculation and per-column compression stats (#18184)~~ Add compression ratio calculation and per-column compression stats (#18184) Jun 11, 2026

johnsolomonj force-pushed the feature/compression-stats-tracking branch from 9755191 to 2c7f1d0 Compare June 12, 2026 09:52

jsol-splunk added 3 commits June 15, 2026 07:53

jsol-splunk added 25 commits June 15, 2026 07:53

Fix ServerSegmentMetadataReader skip guard to drop codec=null entries only

f8a4011

Fix realtime integration test schema name to match table name

2d7b1a3

Add 5-param SegmentDictionaryCreator constructor with trackRawIngestBytes flag; fix tests to use it

caec779

Fix CompressionStatsRealtimeIngestionIntegrationTest time column to DaysSinceEpoch

8a972bb

Fix typo: getUnonDiskSizeBytes -> getUncompressedForwardIndexSizeBytes in TableSizeResource

5e31b8c

Fix CompressionStatsRealtimeIngestionIntegrationTest: override getSortedColumn to return null

d440713

Add /// Javadoc to StorageBreakdownInfo (new class added in this PR)

fa3bdb2

Fix controllerUrl null in CompressionStatsRealtimeIngestionIntegrationTest - use shared suite instance

946bd3f

Guard null rawIngestSizeBytes in integration test assertions for consuming/old segments

a01c3cf

Default _trackUncompressedSize to false; enabled externally via setTrackUncompressedSize when compressionStatsEnabled

a15f1db

Fix writer tracking tests: explicitly call setTrackUncompressedSize(true) since default is now false

26dba0f

Fix checkstyle indentation in setTrackUncompressedSize calls inside try blocks

233f600

Fix checkstyle line-length violation in TablesResource comment

3b10c14

Fix checkstyle line-length violations in pinot-controller

81f92c3

Fix compilation: add assertTrue import to MultiValueVarByteRawIndexCreatorTest

26e2981

johnsolomonj force-pushed the feature/compression-stats-tracking branch from 2c7f1d0 to bb679b5 Compare June 15, 2026 14:53

jsol-splunk added 4 commits June 15, 2026 10:49

Fix testGetTableMetadataMixedDictRawCodec: add includeColumnStats=true to request

a983200

columnCompressionStats is only returned when includeColumnStats=true is passed. The test was calling the endpoint without this param so ccs was always null.

Fix compilation: replace non-existent setUpHttpMocks with testRunner overload

c6fce9e

Added testRunner(servers, table, includeColumnStats) overload and used it in the regression test instead of a method that does not exist.

4 participants
