Add user-facing warning when MSE Lite hits implicit query limits#18725
Add user-facing warning when MSE Lite hits implicit query limits#18725anuragrai16 wants to merge 2 commits into
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #18725 +/- ##
============================================
- Coverage 64.78% 64.78% -0.01%
Complexity 1309 1309
============================================
Files 3380 3380
Lines 209540 209601 +61
Branches 32797 32806 +9
============================================
+ Hits 135751 135788 +37
- Misses 62863 62893 +30
+ Partials 10926 10920 -6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
xiangfu0
left a comment
There was a problem hiding this comment.
Found one high-signal issue; see inline comment.
| Integer implicitLimit = QueryOptionsUtils.getLiteModeImplicitLeafStageLimit( | ||
| _queryContext.getQueryOptions()); | ||
| // false-positive when table has exactly implicitLimit rows | ||
| if (implicitLimit != null && baseResultsBlock.getNumRows() >= implicitLimit) { |
There was a problem hiding this comment.
numRows >= implicitLimit is not a valid truncation signal. When the complete result cardinality is exactly the implicit limit, this marks mseLiteLeafStageLimitReached=true and partialResult=true even though nothing was dropped; the new equality-case test bakes in that false positive. This needs an explicit "limit actually hit" signal from the limiting operator (and the same fix is needed in StreamingInstanceResponseOperator) instead of inferring truncation from equality.
There was a problem hiding this comment.
Yes, this is a known and documented limitation (and should be rare in reality) both in code and PR. Will add a note in MSE Lite docs too
ankitsultana
left a comment
There was a problem hiding this comment.
very excited to see this! left some minor comments.
i understand that there's a false positive issue but i think it's okay for a first cut.
| // false-positive when table has exactly implicitLimit rows | ||
| if (implicitLimit != null && baseResultsBlock.getNumRows() >= implicitLimit) { | ||
| instanceResponseBlock.addMetadata( | ||
| MetadataKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED.getName(), "true"); |
There was a problem hiding this comment.
would it better to log baseResultsBlock.getNumRows() instead of just a true boolean?
also, afair, baseResultsBlock should ideally have at most implicitLimit rows because both selection and group by operators will not allow returning more than that. in that case i guess we can just keep this true
There was a problem hiding this comment.
dont think that's needed. the final response has a mseLiteLeafStageEffectiveLimit that conveys to the user what was the effective limit pushed to the server, so we already know each server is returning <= mseLiteLeafStageEffectiveLimit
| liteModeLimit); | ||
| return sort; | ||
| } | ||
| _context.setLiteModeImplicitSortApplied(liteModeLimit); |
There was a problem hiding this comment.
nit: probably overkill, but it might be easy to miss calling setLiteModeImplicitSortApplied as this code evolves. one way to mitigate that could be add an onMatchInternal(call) => Pair<PRelNode, Integer> function. but no one reads code anymore so should be okay
| liteModeImplicitSortApplied = true; | ||
| liteModeEffectiveSortLimit = physicalPlannerContext.getLiteModeEffectiveSortLimit(); | ||
| } | ||
| return new QueryPlannerResult(dispatchableSubPlan, explainStr, tableNames, extraFields, |
There was a problem hiding this comment.
if we get rid of the boolean we can just pass physicalPlannerContext.getLiteModeEffectiveSortLimit() here and avoid the 7 other lines above
There was a problem hiding this comment.
Good point. Using only liteModeEffectiveSortLimit everywhere now
|
|
||
| DispatchableSubPlan dispatchableSubPlan = queryPlanResult.getQueryPlan(); | ||
|
|
||
| // Inject the implicit leaf-stage limit as a query option so servers can detect truncation |
There was a problem hiding this comment.
you might also want to emit a metric when the limit is reached. also from Cellar PoV, with the current change would you be able to identify exact queries which hit this limit? you might want to ensure that broker event listener captures this flag too
There was a problem hiding this comment.
Makes sense.
Added a metric at the end similar to groupsTrimmed. And added the flag to augmentStatistics in BaseBrokerRequestHandler
- Remove redundant _liteModeImplicitSortApplied boolean; derive from _liteModeEffectiveSortLimit >= 0 (simplifies PhysicalPlannerContext, QueryEnvironment, QueryPlannerResult) - Convert ternary to if/else in LiteModeSortInsertRule aggregate branch - Add comment to LITE_MODE_IMPLICIT_LEAF_STAGE_LIMIT explaining it is a system-internal query option - Add BROKER_RESPONSES_WITH_MSE_LITE_LEAF_STAGE_LIMIT_REACHED metric - Add isMseLiteLeafStageLimitReached to BrokerResponse interface and RequestContext for broker event listener capture
…esults When a query runs under MSE Lite without an explicit LIMIT, the planner silently inserts a PhysicalSort at the leaf-stage boundary to cap rows per server. Users get incomplete results with no indication. This change adds three new fields to BrokerResponseNativeV2 (mseLiteLeafStageLimitReached, mseLiteLeafStageEffectiveLimit, mseLiteFanOutAdjustedLimitApplied) that signal when the implicit limit was binding. Detection happens at execution time on each server via a new DataTable.MetadataKey, propagated through LeafOperator.StatKey to the broker response. Both StreamingInstanceResponseOperator (selection queries) and InstanceResponseOperator (aggregation queries) are instrumented.
- Remove redundant _liteModeImplicitSortApplied boolean; derive from _liteModeEffectiveSortLimit >= 0 (simplifies PhysicalPlannerContext, QueryEnvironment, QueryPlannerResult) - Convert ternary to if/else in LiteModeSortInsertRule aggregate branch - Add comment to LITE_MODE_IMPLICIT_LEAF_STAGE_LIMIT explaining it is a system-internal query option - Add BROKER_RESPONSES_WITH_MSE_LITE_LEAF_STAGE_LIMIT_REACHED metric - Add isMseLiteLeafStageLimitReached to BrokerResponse interface and RequestContext for broker event listener capture
61f2b07 to
e40bbc1
Compare
Summary
MSE Lite silently truncates query results when a query runs without an explicit
LIMITclause — the planner inserts aPhysicalSort(FETCH=liteModeLimit)at the leaf-stage boundary, but returns incomplete results with no user-facing signal. This PR surfaces that truncation via three new fields in the broker response.Changes
Planning side (
pinot-query-planner):PhysicalPlannerContext: two new fields to track when an implicit sort was injected and the effective limitLiteModeSortInsertRule: sets the context flags in all three implicit-limit branches (sort-with-fetch, aggregate-no-limit, new-sort-insertion)QueryEnvironment.QueryPlannerResult: exposes the planner context flags to the brokerTransport (
pinot-spi,pinot-common):LITE_MODE_IMPLICIT_LEAF_STAGE_LIMIT— injected by the broker after planning so servers can detect truncationDataTable.MetadataKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED(id 44) — set by the server when the implicit limit is bindingQueryOptionsUtils.getLiteModeImplicitLeafStageLimit()— parser for the new query optionServer-side detection (
pinot-core):InstanceResponseOperator.buildInstanceResponseBlock(): checksnumRows >= implicitLimitand sets the metadata flag (handles aggregation queries via non-streaming combine path)StreamingInstanceResponseOperator.getNextBlock(): trackstotalRowsStreamedduring the streaming loop and sets the metadata flag after completion (handles selection queries — the common case for MSE Lite)Stats propagation (
pinot-query-runtime):LeafOperator.StatKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED+caseinmergeExecutionStats()switch — mandatory, asdefaultthrowsIllegalArgumentExceptionBrokerResponseNativeV2.StatKeyby enum constant nameBroker response (
pinot-common,pinot-broker):BrokerResponseNativeV2: newStatKey.LITE_MODE_LEAF_STAGE_LIMIT_REACHED, accessor/merge methods, three new JSON fields in@JsonPropertyOrderisPartialResult()updated to includeisMseLiteLeafStageLimitReached()MultiStageBrokerRequestHandler: injects query option before dispatch; sets planning-time response fields (effectiveLimit,fanOutAdjustedLimitApplied) and logs a WARN after executionNew response fields
mseLiteLeafStageLimitReachedmseLiteLeafStageEffectiveLimitmseLiteFanOutAdjustedLimitAppliedExample response (limit reached)
{ "numRowsResultSet": 8, "partialResult": true, "mseLiteLeafStageLimitReached": true, "mseLiteLeafStageEffectiveLimit": 2, "mseLiteFanOutAdjustedLimitApplied": false }Test plan
TC-1: Implicit limit reached — aggregation query

TC-2: Implicit limit reached — selection query

TC-3: Implicit limit NOT reached — selection query with leaf limit override

TC-4: Implicit limit NOT reached — selection query with default limit

TC-5: Implicit limit NOT reached even with override

TC-6: Failure on fetching more than applied leaf limit

TC-7: Fan out adjusted implicit limit - warning is correct
