
Add user-facing warning when MSE Lite hits implicit query limits #18725

Open
anuragrai16 wants to merge 2 commits into apache:master from anuragrai16:mse-lite-warnings

anuragrai16 commented Jun 10, 2026

Contributor

Summary

MSE Lite silently truncates query results when a query runs without an explicit LIMIT clause: the planner inserts a PhysicalSort(FETCH=liteModeLimit) at the leaf-stage boundary, and the incomplete results are returned with no user-facing signal. This PR surfaces that truncation via three new fields in the broker response.

Changes

Planning side (pinot-query-planner):

  • PhysicalPlannerContext: two new fields to track when an implicit sort was injected and the effective limit
  • LiteModeSortInsertRule: sets the context flags in all three implicit-limit branches (sort-with-fetch, aggregate-no-limit, new-sort-insertion)
  • QueryEnvironment.QueryPlannerResult: exposes the planner context flags to the broker

Transport (pinot-spi, pinot-common):

  • New query option key LITE_MODE_IMPLICIT_LEAF_STAGE_LIMIT — injected by the broker after planning so servers can detect truncation
  • New DataTable.MetadataKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED (id 44) — set by the server when the implicit limit is binding
  • QueryOptionsUtils.getLiteModeImplicitLeafStageLimit() — parser for the new query option
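
As a rough illustration of the parser described in the last bullet, the sketch below reads an optional integer limit out of a query-options map. The class name and the option key string are assumptions made for this example only; the real key and helper are defined by the PR and may differ.

```java
import java.util.Map;

// Hypothetical stand-in for the parser described above; not the actual Pinot class.
final class LiteModeOptionSketch {
  // Assumed key string, for illustration only.
  static final String IMPLICIT_LEAF_STAGE_LIMIT_KEY = "liteModeImplicitLeafStageLimit";

  private LiteModeOptionSketch() {
  }

  /** Returns the implicit limit, or null when the option was not injected. */
  static Integer getImplicitLeafStageLimit(Map<String, String> queryOptions) {
    String raw = queryOptions.get(IMPLICIT_LEAF_STAGE_LIMIT_KEY);
    if (raw == null) {
      return null;
    }
    // Integer.parseInt throws NumberFormatException on malformed input.
    return Integer.parseInt(raw.trim());
  }
}
```

Returning null when the option is absent lets the server skip the truncation check entirely for queries where the planner did not inject a limit.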

Server-side detection (pinot-core):

  • InstanceResponseOperator.buildInstanceResponseBlock(): checks numRows >= implicitLimit and sets the metadata flag (handles aggregation queries via non-streaming combine path)
  • StreamingInstanceResponseOperator.getNextBlock(): tracks totalRowsStreamed during the streaming loop and sets the metadata flag after completion (handles selection queries — the common case for MSE Lite)
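
The two detection paths above can be sketched roughly as follows. This is a simplified model, not the operator code: the class and method names are hypothetical, and result blocks are reduced to plain row counts.

```java
import java.util.List;

// Hypothetical model of the two server-side checks; blocks are reduced to row counts.
final class LimitReachedSketch {
  private LimitReachedSketch() {
  }

  // Non-streaming path: a single combined result block is compared against the limit.
  static boolean limitReached(int numRows, Integer implicitLimit) {
    return implicitLimit != null && numRows >= implicitLimit;
  }

  // Streaming path: rows are counted across all streamed blocks and the check
  // runs once, after the streaming loop has completed.
  static boolean limitReachedStreaming(List<Integer> blockRowCounts, Integer implicitLimit) {
    long totalRowsStreamed = 0;
    for (int rows : blockRowCounts) {
      totalRowsStreamed += rows;
    }
    return implicitLimit != null && totalRowsStreamed >= implicitLimit;
  }
}
```

Both checks use `>=`, so a result whose size exactly equals the limit is also flagged.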

Stats propagation (pinot-query-runtime):

  • LeafOperator.StatKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED + case in mergeExecutionStats() switch — mandatory, as default throws IllegalArgumentException
  • Auto-maps to BrokerResponseNativeV2.StatKey by enum constant name
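
To show why the extra switch case is mandatory and how a name-based mapping can work, here is a minimal sketch. The enums, metadata key strings, and merge semantics (boolean OR encoded as the max of 0/1) are assumptions for illustration and do not reproduce the actual LeafOperator code.

```java
import java.util.Map;

// Hypothetical model of stat merging with a throwing default branch.
final class StatMergeSketch {
  enum LeafStatKey { NUM_DOCS_SCANNED, LITE_MODE_LEAF_STAGE_LIMIT_REACHED }

  enum BrokerStatKey { NUM_DOCS_SCANNED, LITE_MODE_LEAF_STAGE_LIMIT_REACHED }

  private StatMergeSketch() {
  }

  static void merge(Map<LeafStatKey, Long> stats, String metadataKey, String value) {
    switch (metadataKey) {
      case "numDocsScanned":
        stats.merge(LeafStatKey.NUM_DOCS_SCANNED, Long.parseLong(value), Long::sum);
        break;
      case "liteModeLeafStageLimitReached":
        // Boolean OR across servers, encoded as the max of 0/1.
        stats.merge(LeafStatKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED,
            Boolean.parseBoolean(value) ? 1L : 0L, Long::max);
        break;
      default:
        // Without an explicit case, any new metadata key would land here.
        throw new IllegalArgumentException("Unhandled metadata key: " + metadataKey);
    }
  }

  // Maps a leaf stat to the broker stat by enum constant name.
  static BrokerStatKey toBrokerKey(LeafStatKey key) {
    return BrokerStatKey.valueOf(key.name());
  }
}
```

With a name-based mapping, the broker-side enum only needs a constant with the same name; `valueOf` throws if the names drift apart.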

Broker response (pinot-common, pinot-broker):

  • BrokerResponseNativeV2: new StatKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED, accessor/merge methods, three new JSON fields in @JsonPropertyOrder
  • isPartialResult() updated to include isMseLiteLeafStageLimitReached()
  • MultiStageBrokerRequestHandler: injects query option before dispatch; sets planning-time response fields (effectiveLimit, fanOutAdjustedLimitApplied) and logs a WARN after execution

New response fields

| Field | Type | When present |
|---|---|---|
| mseLiteLeafStageLimitReached | boolean | Always (false if not reached) |
| mseLiteLeafStageEffectiveLimit | int | Only when planner injected an implicit limit |
| mseLiteFanOutAdjustedLimitApplied | boolean | Only when planner injected an implicit limit |

Example response (limit reached)

{
  "numRowsResultSet": 8,
  "partialResult": true,
  "mseLiteLeafStageLimitReached": true,
  "mseLiteLeafStageEffectiveLimit": 2,
  "mseLiteFanOutAdjustedLimitApplied": false
}

Test plan

  • Unit Tests
  • Manual verification against STREAM quickstart with 17 test cases covering both streaming (selection) and non-streaming (aggregation) operator paths

TC-1: Implicit limit reached — aggregation query
Screenshot 2026-06-10 at 9 44 57 AM

TC-2: Implicit limit reached — selection query
Screenshot 2026-06-10 at 9 45 25 AM

TC-3: Implicit limit NOT reached — selection query with leaf limit override
Screenshot 2026-06-10 at 9 45 41 AM

TC-4: Implicit limit NOT reached — selection query with default limit
Screenshot 2026-06-10 at 9 45 54 AM

TC-5: Implicit limit NOT reached even with override
Screenshot 2026-06-10 at 10 16 59 AM

TC-6: Failure on fetching more than applied leaf limit
Screenshot 2026-06-10 at 10 17 15 AM

TC-7: Fan out adjusted implicit limit - warning is correct
Screenshot 2026-06-10 at 10 18 10 AM

@anuragrai16

Contributor Author

CC @shauryachats @ankitsultana

codecov-commenter commented Jun 10, 2026

Codecov Report

❌ Patch coverage is 73.91304% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.78%. Comparing base (6464736) to head (e40bbc1).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...requesthandler/MultiStageBrokerRequestHandler.java | 0.00% | 10 Missing ⚠️ |
| ...r/streaming/StreamingInstanceResponseOperator.java | 77.77% | 1 Missing and 1 partial ⚠️ |
| .../java/org/apache/pinot/query/QueryEnvironment.java | 77.77% | 2 Missing ⚠️ |
| ...common/response/broker/BrokerResponseNativeV2.java | 91.66% | 0 Missing and 1 partial ⚠️ |
| ...he/pinot/query/context/PhysicalPlannerContext.java | 80.00% | 1 Missing ⚠️ |
| .../physical/v2/opt/rules/LiteModeSortInsertRule.java | 83.33% | 1 Missing ⚠️ |
| .../apache/pinot/spi/trace/DefaultRequestContext.java | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18725      +/-   ##
============================================
- Coverage     64.78%   64.78%   -0.01%     
  Complexity     1309     1309              
============================================
  Files          3380     3380              
  Lines        209540   209601      +61     
  Branches      32797    32806       +9     
============================================
+ Hits         135751   135788      +37     
- Misses        62863    62893      +30     
+ Partials      10926    10920       -6     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-21 | 64.78% <73.91%> (-0.01%) ⬇️ |
| temurin | 64.78% <73.91%> (-0.01%) ⬇️ |
| unittests | 64.78% <73.91%> (-0.01%) ⬇️ |
| unittests1 | 56.97% <86.20%> (-0.01%) ⬇️ |
| unittests2 | 37.26% <18.84%> (-0.01%) ⬇️ |


@xiangfu0 xiangfu0 left a comment

Contributor

Found one high-signal issue; see inline comment.

Integer implicitLimit = QueryOptionsUtils.getLiteModeImplicitLeafStageLimit(
    _queryContext.getQueryOptions());
// false-positive when table has exactly implicitLimit rows
if (implicitLimit != null && baseResultsBlock.getNumRows() >= implicitLimit) {

Contributor

numRows >= implicitLimit is not a valid truncation signal. When the complete result cardinality is exactly the implicit limit, this marks mseLiteLeafStageLimitReached=true and partialResult=true even though nothing was dropped; the new equality-case test bakes in that false positive. This needs an explicit "limit actually hit" signal from the limiting operator (and the same fix is needed in StreamingInstanceResponseOperator) instead of inferring truncation from equality.
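
To make the equality case concrete, here is a minimal sketch of one way a limiting step could report an explicit "limit actually hit" signal: it fills up to the limit and then checks whether the source still has rows. This is an illustrative model only (the class and method names are hypothetical), not how Pinot's selection or group-by operators are implemented, and peeking at the source may carry a cost in a real operator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical limiter that reports truncation explicitly instead of inferring it.
final class TruncationSignalSketch {
  static final class Limited<T> {
    final List<T> _rows;
    final boolean _truncated;

    Limited(List<T> rows, boolean truncated) {
      _rows = rows;
      _truncated = truncated;
    }
  }

  private TruncationSignalSketch() {
  }

  static <T> Limited<T> take(Iterator<T> source, int limit) {
    List<T> rows = new ArrayList<>();
    while (rows.size() < limit && source.hasNext()) {
      rows.add(source.next());
    }
    // Truncated only if the source still holds rows after the limit was filled.
    return new Limited<>(rows, source.hasNext());
  }
}
```

With a source of exactly `limit` rows, `rows.size() >= limit` is true while the explicit flag stays false, which is the false positive described above.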

Contributor Author

Yes, this is a known limitation, documented in both the code and the PR, and it should be rare in practice. Will add a note to the MSE Lite docs too.

@ankitsultana ankitsultana left a comment

Contributor

very excited to see this! left some minor comments.

i understand that there's a false positive issue but i think it's okay for a first cut.

// false-positive when table has exactly implicitLimit rows
if (implicitLimit != null && baseResultsBlock.getNumRows() >= implicitLimit) {
  instanceResponseBlock.addMetadata(
      MetadataKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED.getName(), "true");

Contributor

would it be better to log baseResultsBlock.getNumRows() instead of just a true boolean?

also, afair, baseResultsBlock should ideally have at most implicitLimit rows because both selection and group by operators will not allow returning more than that. in that case i guess we can just keep this true

Contributor Author

don't think that's needed. the final response has a mseLiteLeafStageEffectiveLimit field that tells the user the effective limit pushed to the servers, so we already know each server returns <= mseLiteLeafStageEffectiveLimit rows

liteModeLimit);
return sort;
}
_context.setLiteModeImplicitSortApplied(liteModeLimit);

Contributor

nit: probably overkill, but it might be easy to miss calling setLiteModeImplicitSortApplied as this code evolves. one way to mitigate that could be to add an onMatchInternal(call) => Pair<PRelNode, Integer> function. but no one reads code anymore so should be okay
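
A rough sketch of that suggestion, under heavy simplification: plain strings stand in for PRelNode, Map.Entry stands in for Pair, and the class, the Context stand-in, and the branch logic are all hypothetical. The point is only that every branch returns the implicit limit alongside the node, so the context is updated in exactly one place.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical illustration of the suggestion above; strings stand in for PRelNode.
final class SortInsertRuleSketch {
  /** Minimal stand-in for the planner context. */
  static final class Context {
    int _effectiveSortLimit = -1;

    void setLiteModeImplicitSortApplied(int limit) {
      _effectiveSortLimit = limit;
    }
  }

  private SortInsertRuleSketch() {
  }

  // Every branch returns the implicit limit next to the node (null when none was
  // injected), so a newly added branch cannot silently skip the bookkeeping.
  static Map.Entry<String, Integer> onMatchInternal(String node, boolean hasExplicitLimit,
      int liteModeLimit) {
    if (hasExplicitLimit) {
      return new SimpleImmutableEntry<>(node, null);
    }
    return new SimpleImmutableEntry<>("Sort(fetch=" + liteModeLimit + ", input=" + node + ")",
        liteModeLimit);
  }

  // The single place that records the implicit limit on the context.
  static String onMatch(String node, boolean hasExplicitLimit, int liteModeLimit, Context context) {
    Map.Entry<String, Integer> result = onMatchInternal(node, hasExplicitLimit, liteModeLimit);
    if (result.getValue() != null) {
      context.setLiteModeImplicitSortApplied(result.getValue());
    }
    return result.getKey();
  }
}
```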

liteModeImplicitSortApplied = true;
liteModeEffectiveSortLimit = physicalPlannerContext.getLiteModeEffectiveSortLimit();
}
return new QueryPlannerResult(dispatchableSubPlan, explainStr, tableNames, extraFields,

Contributor

if we get rid of the boolean we can just pass physicalPlannerContext.getLiteModeEffectiveSortLimit() here and avoid the 7 other lines above

Contributor Author

Good point. Using only liteModeEffectiveSortLimit everywhere now


DispatchableSubPlan dispatchableSubPlan = queryPlanResult.getQueryPlan();

// Inject the implicit leaf-stage limit as a query option so servers can detect truncation

Contributor

you might also want to emit a metric when the limit is reached. also from Cellar PoV, with the current change would you be able to identify exact queries which hit this limit? you might want to ensure that broker event listener captures this flag too

Contributor Author

Makes sense.
Added a metric at the end similar to groupsTrimmed. And added the flag to augmentStatistics in BaseBrokerRequestHandler

ankitsultana added the mse-physical-optimizer (Multi-stage engine physical query optimizer) and mse-lite-mode (Multi-stage engine lite mode) labels Jun 11, 2026
anuragrai16 added a commit to anuragrai16/pinot that referenced this pull request Jun 16, 2026
- Remove redundant _liteModeImplicitSortApplied boolean; derive from
  _liteModeEffectiveSortLimit >= 0 (simplifies PhysicalPlannerContext,
  QueryEnvironment, QueryPlannerResult)
- Convert ternary to if/else in LiteModeSortInsertRule aggregate branch
- Add comment to LITE_MODE_IMPLICIT_LEAF_STAGE_LIMIT explaining it is a
  system-internal query option
- Add BROKER_RESPONSES_WITH_MSE_LITE_LEAF_STAGE_LIMIT_REACHED metric
- Add isMseLiteLeafStageLimitReached to BrokerResponse interface and
  RequestContext for broker event listener capture
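
The first bullet above (dropping the boolean and deriving it from the limit) can be sketched as below. The class name and the -1 "not applied" default are assumptions for illustration; the commit message only states that the flag is derived from `_liteModeEffectiveSortLimit >= 0`.

```java
// Hypothetical stand-in for the planner context after the boolean was removed.
final class PlannerContextSketch {
  // Assumed sentinel: a negative value means no implicit sort was injected.
  private int _liteModeEffectiveSortLimit = -1;

  void setLiteModeEffectiveSortLimit(int limit) {
    _liteModeEffectiveSortLimit = limit;
  }

  int getLiteModeEffectiveSortLimit() {
    return _liteModeEffectiveSortLimit;
  }

  // Derived rather than stored, so the flag and the limit cannot get out of sync.
  boolean isLiteModeImplicitSortApplied() {
    return _liteModeEffectiveSortLimit >= 0;
  }
}
```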
…esults

When a query runs under MSE Lite without an explicit LIMIT, the planner
silently inserts a PhysicalSort at the leaf-stage boundary to cap rows
per server. Users get incomplete results with no indication. This change
adds three new fields to BrokerResponseNativeV2 (mseLiteLeafStageLimitReached,
mseLiteLeafStageEffectiveLimit, mseLiteFanOutAdjustedLimitApplied) that
signal when the implicit limit was binding.

Detection happens at execution time on each server via a new
DataTable.MetadataKey, propagated through LeafOperator.StatKey to the
broker response. Both StreamingInstanceResponseOperator (selection queries)
and InstanceResponseOperator (aggregation queries) are instrumented.