Skip to content

fix groupconcat#24382

Closed
daviszhen wants to merge 6 commits into
matrixorigin:3.0-devfrom
daviszhen:0513-fix-groupconcat
Closed

fix groupconcat#24382
daviszhen wants to merge 6 commits into
matrixorigin:3.0-devfrom
daviszhen:0513-fix-groupconcat

Conversation

@daviszhen
Copy link
Copy Markdown
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #24359

What this PR does / why we need it:

group_concat 改写为window函数的方式不对,导致执行结果错误.

修改为main上, 改写agg + order 的方式. 不再使用改写为window函数的方式.

@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

Copy link
Copy Markdown
Contributor

@aunjgr aunjgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right direction (revert window-rewrite, use SORT+AGG with per-aggregate ORDER BY). Two correctness gaps not covered by the new tests.

Blocking

1. Multiple GROUP_CONCATs with different ORDER BY clauses are conflated (pkg/sql/plan/query_builder.go:3155-3175, pkg/sql/plan/having_binder.go:291).

The accumulator ctx.groupConcatOrderBys collects ORDER BY specs from all GROUP_CONCAT aggregates in the SELECT and emits a single composite SORT. For:

SELECT GROUP_CONCAT(x ORDER BY a),
       GROUP_CONCAT(y ORDER BY b DESC)
FROM t GROUP BY g;

the resulting SORT key is (g ASC, a, b DESC). Only the first aggregate's ordering is honored; the second sees rows ordered by (a, b DESC) rather than b DESC. Result: wrong concatenation order for the second aggregate.

The previous window-rewrite supported this correctly via independent window specs per aggregate. The new test file group_concat_orderby_test.go does not cover this shape.

Two options:

  • Push ORDER BY into each GROUP_CONCAT aggregate so the runtime sorts per-aggregate inside its own state.
  • Reject (with a clear error) multiple GROUP_CONCATs with conflicting orderings, until per-aggregate sort is supported.

2. Shuffle/parallel AGG can re-order input across the SORT (pkg/sql/plan/shuffle.go:614-697, untouched).

Nothing in this PR prevents dontShuffle from returning false on the AGG node whose child is a SORT carrying groupConcatOrderBys. Hash-shuffle between SORT and AGG, or multi-CN merge, can interleave rows from different upstream producers within the same group, breaking per-group ordering. No OrderRequired flag on the AGG, no Shuffle=false enforcement, no merge-sort placement at the consumer.

Small-table unit tests won't hit this; production data above threshHoldForShuffleGroup will. The window-rewrite was immune because window operators preserve their own input ordering; the new SORT+AGG pattern relies on the AGG not reshuffling, which isn't guaranteed.

Test coverage gaps

group_concat_orderby_test.go only checks plan shape, not result correctness. Missing cases:

  • Multiple GROUP_CONCATs with different ORDER BY (concern #1).
  • DISTINCT inside GROUP_CONCAT.
  • GROUP_CONCAT inside HAVING.
  • Custom SEPARATOR + ORDER BY.
  • Large-data shuffled execution (concern #2).

Verified

  • Removal of window path is clean: no remaining references to forceWindows or the old getFrame helper.
  • having_binder.go:248 correctly returns the syntax error (the previous bare moerr.NewSyntaxErrorf(...) without return is fixed).
  • BindContext.groupConcatOrderBys is planning-time only, not serialized; safe for plan cache and remote exec.
  • Updated .result files (function_group_concat.result:294, window/window.result:4166) — empty-table GROUP_CONCAT now returns one NULL row, matching MySQL semantics. New result is correct.

Minor

  • having_binder.go:250 uses len(astExpr.Exprs)-1 because the parser appends the separator as the last element — correct but reads as off-by-one. Worth a comment.

@daviszhen
Copy link
Copy Markdown
Contributor Author

此修改会触发bug

@daviszhen daviszhen closed this May 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants