[improvement](mtmv) Optimize MTMV partition lineage check #63899
Open
seawinde wants to merge 1 commit into
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: Creating a complex partitioned async MTMV can spend excessive FE CPU in partition lineage analysis. The hot path repeatedly shuttles partition expressions and checked expressions through the full plan lineage replacer, so wide `UNION ALL`, join, and aggregate plans repeat the same plan walks many times during `CREATE MATERIALIZED VIEW` analysis.
Root cause: In `PartitionIncrementMaintainer.PartitionIncrementChecker.checkPartition()`, each partition candidate and each checked expression calls `ExpressionUtils.shuttleExpressionWithLineage()` separately. Each call traverses the plan through `ExpressionLineageReplacer` and rebuilds equivalent normalized expressions.
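To make the cost pattern concrete, here is a minimal toy sketch of per-expression versus batched resolution. It is not the Doris code: `PlanNode`, `resolveOne`, `resolveBatch`, and the string-based expressions are hypothetical stand-ins, and the real lineage replacer does considerably more than an alias lookup. The sketch only shows why N separate calls cost N plan walks while one batched call costs one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every name here is hypothetical, not a Doris class or method.
final class BatchedLineageSketch {
    /** A plan node that defines some aliases (alias -> underlying expression). */
    static final class PlanNode {
        final Map<String, String> aliases = new HashMap<>();
        final List<PlanNode> children = new ArrayList<>();
    }

    /** Counts visited nodes so the two shapes can be compared. */
    static int nodeVisits = 0;

    /** Collects every alias definition reachable from the node: one full plan walk. */
    static void collect(PlanNode node, Map<String, String> out) {
        nodeVisits++;
        out.putAll(node.aliases);
        for (PlanNode child : node.children) {
            collect(child, out);
        }
    }

    /** Unbatched shape: every expression triggers its own plan walk. */
    static String resolveOne(String expr, PlanNode plan) {
        Map<String, String> defs = new HashMap<>();
        collect(plan, defs);
        return defs.getOrDefault(expr, expr);
    }

    /** Batched shape: a single plan walk serves the whole expression list. */
    static List<String> resolveBatch(List<String> exprs, PlanNode plan) {
        Map<String, String> defs = new HashMap<>();
        collect(plan, defs);
        List<String> result = new ArrayList<>();
        for (String expr : exprs) {
            result.add(defs.getOrDefault(expr, expr));
        }
        return result;
    }
}
```

In this toy model both shapes return the same results; only the number of node visits differs, growing with (expressions × plan nodes) in the unbatched case.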
Change Summary:
| File | Change Description |
|------|-------------------|
| `PartitionIncrementMaintainer.java` | Batch lineage shuttle calls, cache lineage-visible named expressions by plan identity, cache normalized expressions, and reuse the normalization rewrite context during one partition increment check. |
| `PartitionColumnTraceTest.java` | Add a CTE plus `UNION ALL` plus wide aggregate lineage test to keep partition lineage behavior covered. |
| `test_mtmv_partition_lineage_performance.groovy` | Add a desensitized static SQL performance regression case for the complex partitioned MTMV shape. |
Design Rationale: The change keeps the existing `ExpressionLineageReplacer` semantics and limits caching to a single `PartitionIncrementCheckContext`. This avoids sharing mutable analysis state across optimizer contexts while removing repeated full plan walks for the same plan and expression set.
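The scoping idea in the rationale can be illustrated with a small sketch of a cache keyed by plan identity and owned by one check context. This is an assumption about the general technique, not the PR's implementation: `CheckContextSketch`, `getOrCompute`, and the use of `IdentityHashMap` are hypothetical choices made here for illustration.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names throughout; this is not the PR's PartitionIncrementCheckContext.
final class CheckContextSketch<P, V> {
    // Keyed by reference identity, so two structurally equal plans stay separate entries.
    private final Map<P, V> byPlanIdentity = new IdentityHashMap<>();
    private int computations = 0;

    /** Returns the cached value for this exact plan object, computing it on first use. */
    V getOrCompute(P plan, Function<P, V> expensiveWalk) {
        V cached = byPlanIdentity.get(plan);
        if (cached == null) {
            cached = expensiveWalk.apply(plan);
            computations++;
            byPlanIdentity.put(plan, cached);
        }
        return cached;
    }

    /** Number of times the expensive walk actually ran in this context. */
    int computations() {
        return computations;
    }
}
```

Because the map lives inside the context object, it is discarded with that context, and a new check starts with an empty cache rather than inheriting state from another analysis.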
### Release note
Improve performance when creating complex partitioned async materialized views.
### Check List (For Author)
- Test: Unit Test / Manual test
- Unit Test: `./run-fe-ut.sh --run org.apache.doris.nereids.rules.exploration.mv.PartitionColumnTraceTest`
- Manual test: `git diff --check`
- Manual test: Tried `./run-regression-test.sh --run -d performance_p0 -s test_mtmv_partition_lineage_performance`, but the local Doris FE was not running on `127.0.0.1:9030`, so the regression could not execute SQL.
- Behavior changed: No
- Does this need documentation: No
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Member
Author
run buildall
Contributor
TPC-H: Total hot run time: 31786 ms
Contributor
TPC-DS: Total hot run time: 172293 ms
Contributor
FE UT Coverage Report
Increment line coverage
Contributor
FE Regression Coverage Report
Increment line coverage