[SPARK-56032][SQL][FOLLOWUP] Skip FilterExec subexpression elimination codegen when there is no common subexpression#56209
Open
cloud-fan wants to merge 2 commits into
Conversation
…when there is no common subexpression ### What changes were proposed in this pull request? `FilterExec` whole-stage codegen takes the subexpression-elimination (CSE) path whenever `subexpressionEliminationEnabled && otherPreds.nonEmpty`, regardless of whether any common subexpression actually exists. That path emits an `inputVarsEvalCode` prologue at the top of the per-row loop that eagerly evaluates every input column referenced by `otherPreds` (required so eliminated subexpressions can be materialized into shared variables). When there is nothing to eliminate, this prologue provides no benefit but still defeats the short-circuiting the non-CSE path gets from loading columns lazily, just before the predicate that needs them. This PR gates the CSE path on whether `otherPreds` actually contain a common subexpression (`hasCommonSubexpressions`, using the same `EquivalentExpressions` detection and `output` binding as the CSE codegen). When there is none, it falls back to the non-CSE `generatePredicateCode`, which loads columns lazily and preserves short-circuiting. ### Why are the changes needed? For a filter with no common subexpression but multiple conjuncts over different columns (e.g. `q_int BETWEEN ... AND (decimal_a BETWEEN ... OR decimal_b BETWEEN ...)`), the eager prologue decodes the decimal columns for every row, including rows a cheaper earlier predicate would have rejected. Decoding a high-precision decimal allocates a BigInteger/BigDecimal per call, so this is pure waste and shows up as a measurable performance regression versus the lazy non-CSE path. ### Does this PR introduce _any_ user-facing change? No. This is a codegen-only change; query results are unchanged. ### How was this patch tested? New unit test in `WholeStageCodegenSuite` asserting that, for a filter with no common subexpression, the CSE-enabled generated code is identical to the CSE-disabled code (i.e. it falls back to the lazy, short-circuiting path). The existing `FilterExec` CSE tests, which use genuine common subexpressions, still exercise the CSE path and pass. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude (Claude Code) Co-authored-by: Isaac
Reuse a single EquivalentExpressions analysis between the has-common-subexpression gate and the CSE codegen, instead of building it twice. Add a subexpressionEliminationForWholeStageCodegen overload that takes a pre-built EquivalentExpressions, and have FilterExec hold the bound predicates and their CSE analysis as @transient lazy vals (driver-only codegen state; EquivalentExpressions is not serializable). Co-authored-by: Isaac
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
This is a follow-up of #54862, which introduced subexpression elimination (CSE) in
FilterExecwhole-stage codegen.FilterExectakes the CSE codegen path wheneversubexpressionEliminationEnabled && otherPreds.nonEmpty, regardless of whether any common subexpression actually exists. That path emits aninputVarsEvalCodeprologue at the top of the per-row loop that eagerly evaluates every input column referenced byotherPreds(required so eliminated subexpressions can be materialized into shared variables). When there is nothing to eliminate, this prologue provides no benefit but still defeats the short-circuiting the non-CSE path gets from loading columns lazily, just before the predicate that needs them.This PR gates the CSE path on whether
otherPredsactually contain a common subexpression, using the sameEquivalentExpressionsanalysis (andoutputbinding) as the CSE codegen, so it agrees exactly with whether that path would find anything. When there is none, it falls back to the non-CSEgeneratePredicateCode, which loads columns lazily and preserves short-circuiting. Filters that do have a common subexpression are unaffected.To avoid analyzing the predicates twice,
subexpressionEliminationForWholeStageCodegengains an overload that accepts a pre-builtEquivalentExpressions, so the single analysis used by the gate is reused by the codegen.Why are the changes needed?
For a filter with no common subexpression but multiple conjuncts over different columns (e.g.
q_int BETWEEN ... AND (decimal_a BETWEEN ... OR decimal_b BETWEEN ...)), the eager prologue decodes the decimal columns for every row, including rows a cheaper earlier predicate would have rejected. Decoding a high-precision decimal allocates aBigInteger/BigDecimalper call, so this is pure waste and shows up as a measurable performance regression versus the lazy non-CSE path (observed on TPC-DS q28).Does this PR introduce any user-facing change?
No. This is a codegen-only change; query results are unchanged.
How was this patch tested?
New unit test in
WholeStageCodegenSuiteasserting that, for a filter with no common subexpression, the CSE-enabled generated code is identical to the CSE-disabled code (i.e. it falls back to the lazy, short-circuiting path). The existingFilterExecCSE tests, which use genuine common subexpressions, still exercise the CSE path and pass.Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude (Claude Code)