Skip to content

feat: partition root table support with boundary-based leaf pairing#39

Open
leaocx wants to merge 1 commit into
cloudberry-contrib:mainfrom
leaocx:feat/partition-root-support
Open

feat: partition root table support with boundary-based leaf pairing#39
leaocx wants to merge 1 commit into
cloudberry-contrib:mainfrom
leaocx:feat/partition-root-support

Conversation

@leaocx

@leaocx leaocx commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Summary

Allow partition root tables in --include-table / --dest-table / --exclude-table and expand them into leaf-to-leaf pairs matched by partition boundary, with cross-version normalisation (GP6 ↔ GP7 ↔ CBDB) and partial-match support.

Motivation

  1. Partition root tables could not be specified as copy targets — cbcopy would Fatal on validation
  2. Large partition tables fell through to CopyOnMaster (coordinator single-pipe) because root reltuples=0, causing apparent hangs on multi-million row tables
  3. Leaf pairing by _prt_N name suffix silently misrouted data after add/drop partition operations
  4. Cross-version copy (GP6 ↔ GP7/CBDB) could not match leaves because boundary string formats differ
  5. Structural mismatch (different leaf counts) caused all-or-nothing fallback — either full leaf-to-leaf or full root→root, with no middle ground

Changes

File Change
option/option.go Add COPY_ON_SEGMENT constant; ForceOnSegment field on Table; fix validatePartTables / ValidateDestTables for partition roots; fix "does not exists" typo
copy/copy.go Register --copy-on-segment flag
copy/copy_command.go Add forceOnSegment param to CreateCopyStrategy to bypass row-count threshold
copy/copy_manager.go Pass ForceOnSegment to strategy selection
copy/copy_metadata.go Propagate ForceOnSegment in fillTablePairChan
copy/copy_query.go Add Boundary field to PartLeafTable; GP7/CBDB uses pg_get_expr(relpartbound, oid); GP6 upgraded to pg_get_partition_rule_def(r.oid, true) (replaces parrangestart::text which returned internal node-tree)
copy/copy_query_wrapper.go Extract matchLeavesByBoundary (pure matching) and decidePairAction (pure decision) for direct testability; pairPartitionLeaves orchestrates DB fetch + matching + action; read-error on leaf metadata now Fatal
copy/boundary_normalize.go New. normalizeBoundary() converts GP6 rule-def and GP7/CBDB pg_get_expr to canonical form (e.g. RANGE:[2024-01-01,2024-02-01), LIST:(a,b), DEFAULT); single-column limitation documented
copy/boundary_normalize_test.go New. 20 unit tests for normalisation (GP6, GP7, cross-version equality, real DB fixtures from HashData 3.13.33 and Cloudberry 2.4.0)
copy/copy_query_wrapper_test.go New. 31 unit tests: leavesByRoot (3), matchLeavesByBoundary (21 — exact/partial/cross-version/edge cases), decidePairAction (7 — full/partial/zero × flag)

Key design decisions

Boundary normalisation

  • GP6 pg_get_partition_rule_def: PARTITION p1 START ('2024-01-01'::date) END ('2024-02-01'::date) WITH (...)RANGE:[2024-01-01,2024-02-01)
  • GP7/CBDB pg_get_expr: FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')RANGE:[2024-01-01,2024-02-01)
  • Same normalised string = same boundary → enables cross-version leaf-to-leaf matching
  • Single-column bounds only; multi-column bounds returned as-is (fails safe)

Pure functions for testability

  • matchLeavesByBoundary(src, dst []PartLeafTable) — DB-free, gplog-free matching core; both production and tests call the same function
  • decidePairAction(matchedCount, unmatchedCount, allowReshape) — pure decision returning PairFull / PairPartial / PairRootFallback / PairFatal; directly asserted in unit tests

Partial matching

  • Match as many leaves as possible by normalised boundary
  • Matched pairs → leaf-to-leaf copy (precise, parallel)
  • Unmatched src leaves → copy individually to dst root (partition routing handles placement)
  • Replaces old all-or-nothing approach

--allow-partition-reshape flag

  • Full match → no flag needed, leaf-to-leaf
  • Partial match → requires flag; without it, Fatal with actionable hint
  • Zero matches → requires flag for root→root fallback; without it, Fatal
  • Dest is plain table → always root→root (Warn only, no flag needed)

Error handling

  • Read-error on leaf metadata → Fatal (not silent fallback). Transient failures should not silently flatten a partitioned copy.
  • All fallback reasons logged at Warn/Critical with actionable hints

Test results

Unit tests: 51/51 PASS

Category Count Coverage
Boundary normalisation 20 GP6/GP7/CBDB formats, cross-version equality, real DB fixtures
matchLeavesByBoundary 21 Exact match, partial match, name drift, empty/duplicate boundaries, cross-version GP6↔GP7
decidePairAction 7 Full/partial/zero × with/without flag, edge cases
Other (leavesByRoot, etc.) 3 Filter by root name

E2E cross-cluster tests: 45/45 PASS

9 direction combinations × 5 scenarios:

Combination P1 leaf-to-leaf P2 name-drift P3a no-flag→Fatal P3b reshape→partial P4 plain-dest
GP6 → GP6
GP7 → GP7
CBDB → CBDB
GP7 → CBDB
CBDB → GP7
GP6 → GP7
GP6 → CBDB
GP7 → GP6
CBDB → GP6

Environments:

  • GP6: HashData 3.13.33 (PG 9.4.26)
  • GP7: Greenplum 7.1.0 (PG 12.12)
  • CBDB: Apache Cloudberry 2.4.0 (PG 14.4)

Known scope exclusions

  • --include-table without --dest-table on CB-family (separate issue)
  • Multi-level (intermediate) partitions (separate issue)

@CLAassistant

CLAassistant commented Jun 18, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@talmacschen-arch talmacschen-arch left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this — the boundary-based leaf pairing is the right core idea, and --copy-on-segment / ForceOnSegment solves a real problem. I have a few concerns, the first two of which I think should be addressed before merge. Detailed below.

What the PR does well

  • Boundary-based pairing (never _prt_N suffix/ordinal) is exactly right. pg_get_expr(relpartbound, oid) for GP7/CBDB and the parrange*/parlistvalues concat for GP6 are name-independent, so data can't be misrouted after add/drop partition. 👍
  • The CopyOnMaster fallthrough fix (ForceOnSegment for root→root, where reltuples=0 would otherwise route a multi-million-row tree onto the single coordinator pipe) is a genuine, easy-to-miss bug. Good catch.

🔴 1. The entire test matrix is GP6→GP6; the CB-family path is unverified

The test plan is 50 E2E scenarios across two GP 6.20.3 clusters. But the hardest partition behavior in cbcopy lives on the CB family (Cloudberry / SynxDB / Apache Cloudberry) source path, which is a completely different code path from GP6:

  • The GP7/CBDB boundary SQL you added — coalesce((SELECT pg_get_expr(relpartbound, oid) ...), '') in copy_query.go — is never exercised by GP6 testing (GP6 takes the pg_partition_rule branch).
  • The CBDB branch of leaf pairing / split is unverified.

GP6 happens to "just work" for partition roots because its DDL is inline (a single CREATE TABLE ... PARTITION BY (...) brings all leaves along), which masks problems that only appear on declarative-partition CB sources. Please add a CB→CB (or SynxDB→SynxDB) E2E pass covering at least: boundary pairing success, leaf-count-mismatch fallback, and the large-table root→root path. Until then, "partition root support" is verified only for GP6.

🔴 2. --include-table <root> without --dest-table is still broken on CB family, and the title implies otherwise

All of the new pairing logic lives in processDestinationTables, which is only reached when --dest-table is supplied (GetUserTables: the len(GetDestTablesByDb) == 0 branch returns earlier). The --include-table <root> / no---dest-table path is untouched and still flows through collectTablesAndSchemas → DDL extraction → getUserTableRelationsWithIncludeFiltering.

On CB-family source that extraction matches OIDs by exact FQN and does not expand a root OID to its descendants, so it emits only CREATE TABLE root ... PARTITION BY (...) — an empty partition skeleton on the destination (no leaves, no ATTACH PARTITION). In --metadata-only this is silent; in a full copy the data stage then fails with relation "..._1_prt_xxx" does not exist. (This is the gap the author flagged at queries_relations.go:161-163 — the unported gpbackup GP7+ change e00d2a1f.)

This is pre-existing, not introduced here — but the PR title "partition root table support" reads as if --include-table <root> now works, when on CB family the most common usage (copy a partition table to another cluster, same name, no --dest-table) still yields an empty skeleton. Please either scope the title/description to "--dest-table data-copy path only" and note the remaining DDL gap as a separate issue, or address the OID expansion in this PR.

🟡 3. Fallback reasons are logged at Debug — a "looks-like-success" trap

Successful pairing logs at Info ("mapping partition root ... as N leaf->leaf pairs"), but every fallback reason (leaf count mismatch, empty boundary, no boundary match, duplicate boundary) logs at gplog.Debug, which is invisible by default. A large table can silently fall back to root→root and lose all per-leaf parallelism with no signal to the operator. Please promote the fallback to gplog.Warn, with the reason and the consequence, e.g.:

WARN: partition "src...sales" -> "dst...sales": leaf pairing failed
      (leaf count mismatch 5 vs 4); falling back to whole-tree root->root
      copy (forced ON SEGMENT). Per-leaf parallelism is lost.

Concrete instance of why this matters: GP6→CB --dest-table <root> will never pair, because the GP6 boundary string format (parrange* concat) and the CBDB format (pg_get_expr) differ, so it always falls back — silently, today.

🟡 4. No automated tests

Seven files changed, zero _test.go. pairPartitionLeaves is pure, table-driven logic that's straightforward to unit test: boundary match, leaf-count mismatch, empty boundary, duplicate boundary, no-match. Given #1 (CB path can't be exercised on the GP6 test rig), unit tests are the cheapest way to cover the CBDB branch's pairing logic deterministically.

🟢 5. Intermediate partitions (partType == 'i') are not rejected

validatePartTables now continues for Partition == 1 and otherwise only checks existence. A multi-level partition's intermediate level (neither a true root nor a leaf) isn't a root, so it falls to the "does not exist on source" Warn and is silently dropped rather than rejected with a clear error. Minor / rare, but worth an explicit fatal for clarity.


To be clear on severity: #1 and #2 are the ones I'd want resolved (or the scope explicitly narrowed) before merge; #3#5 are quality improvements. Happy to re-review once the CB-family path has a test pass.

@talmacschen-arch

Copy link
Copy Markdown
Collaborator

Following up specifically on the leaf-count-mismatch handling, which I think needs a stronger stance than the current behavior.

Today, in pairPartitionLeaves:

if len(srcLeaves) != len(destLeaves) {
    gplog.Debug("... leaf count mismatch (%v vs %v); falling back to root->root", ...)
    return nil, nil, false
}

→ the caller silently falls back to a whole-tree root->root copy (forced ON SEGMENT). My concern is not just that it's logged at Debug (raised in my earlier review) — it's that two genuinely different situations are collapsed into one silent fallback:

  1. Destination is a plain (non-partitioned) tablelen(destLeaves) == 0. Flattening the whole partition tree into a single table is an unambiguous, intended operation. root->root is correct here. A Warn + fallback is fine.

  2. Destination is itself a partitioned table, but its leaf count / boundaries differ from the source — this is a structural mismatch in a partition-to-partition migration. root->root relies on kernel tuple routing, which gives three outcomes:

    • dest ranges cover all source rows → correct (legitimate re-partition);
    • dest is missing a range and has no DEFAULT partition → COPY errors loudly (safe-ish);
    • dest is missing a range and has a DEFAULT partition → rows silently land in the default partition → silent data misplacement.

The third outcome is the dangerous one, and for a migration-fidelity tool it shouldn't be auto-resolved by a silent fallback. I'd argue case 2 should Fatal by default, printing both leaf counts (and the first non-matching boundary), and only degrade to root->root when the user explicitly opts in (e.g. a --allow-partition-reshape flag) for the legitimate re-partition-on-migrate use case.

Concretely:

  • len(destLeaves) == 0 (plain dest) → Warn + root->root flatten. (keep)
  • len(destLeaves) > 0 and leaf count / boundary differs → Fatal with a clear message, unless --allow-partition-reshape is set → then Warn + root->root.
  • dest root does not exist → Fatal (already correct).

This keeps the boundary-pairing fast path, keeps the legitimate flatten case working, but stops a structural divergence between a partitioned source and a partitioned destination from being silently absorbed (and possibly misrouted into a DEFAULT partition). It also makes the "I deliberately want to re-partition during migration" path an explicit, auditable choice rather than an invisible default.

@leaocx leaocx force-pushed the feat/partition-root-support branch from a255d72 to 3a78ce5 Compare June 24, 2026 02:13

@talmacschen-arch talmacschen-arch left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the substantial rework — this revision is a big step up from the previous one, and most of my earlier concerns are addressed. Summary first, then the one blocking item and two should-fix items.

What's resolved (credit where due)

  • Honest scoping. The PR body now explicitly lists Known scope exclusions (--include-table without --dest-table on CB-family; multi-level partitions). queries_relations.go is untouched in this branch, which is fine now that it's declared rather than implied by the title.

  • The structural-mismatch safety gate. This was my biggest concern last round. Boundary-parse failures (empty/duplicate src or dest boundary, bad name format) now all funnel into unmatched, and the aggregate decision Fatals unless --allow-partition-reshape is set:

    • zero matches → Fatal without the flag (copy_query_wrapper.go:539)
    • partial match → Fatal without the flag (copy_query_wrapper.go:550)

    So a partitioned→partitioned copy can no longer silently reshape/misroute (e.g. into a dst DEFAULT partition) without explicit opt-in. This is the right default.

  • Boundary-based pairing instead of _prt_N name suffix — robust to add/drop-partition name drift. Good call.

🔴 Blocking: the unit tests duplicate the matching logic instead of exercising the real function

copy_query_wrapper_test.go defines its own pairLeaves + splitFqn (the comment even says "DB-free helper mirroring pairPartitionLeaves matching logic"). It never calls the real pairPartitionLeaves. Two consequences:

  1. The safety gate is completely untested. The test-side pairLeaves has no Fatal / allowReshape logic at all — it just returns unmatched. The production function Fatals on partial/zero match without the flag. So the single most safety-critical behavior in this PR — "structural mismatch is fatal unless --allow-partition-reshape" — has zero assertions on it.
  2. The copy will drift. A future change to pairPartitionLeaves leaves the mirrored test green, giving false confidence.

leavesByRoot and normalizeBoundary are tested against the real symbols, which is great — the gap is specifically the pairing orchestration + the gate.

Suggested fix: extract the pure matching core (build destByNorm → loop → produce matched/unmatched) into a DB-free, gplog-free function that both production and tests call. Then split the decision (full / partial / zero → should-fatal?) into a pure function returning a decision/enum, and assert directly that allowReshape=false yields "fatal" for partial and zero. That tests the real wiring and finally covers the gate.

🟡 Should-fix: the 45/45 e2e matrix is a claim the unit tests don't ground

The normalization tests use hand-written representative strings, which assumes the real pg_get_partition_rule_def / pg_get_expr output looks exactly like those literals. If a given version emits a type cast (e.g. FOR VALUES FROM ('2024-01-01'::date) ...) or a multi-column bound, the regex capture won't strip cleanly, the normalized strings won't compare equal across versions, the leaves become unmatched, and the copy Fatals. That fails safe (no silent bad data), but it would break a cross-version copy that should have succeeded.

Could you capture the actual boundary strings emitted by each version (GP6 / GP7 / CBDB) during those e2e runs and paste them into the tests as fixtures? That turns "45/45 passed" into a reproducible assertion and pins the regex against real output rather than assumed output.

🟡 Should-fix: read-error path silently flattens

In pairPartitionLeaves, if reading src/dst leaves errors, it Warns and falls back to root→root flatten (copy_query_wrapper.go:451 / :456), ungated by --allow-partition-reshape. A transient failure to read leaf metadata isn't a structural decision the user made — silently flattening a partitioned copy on a read error seems wrong. I'd make that path Fatal.

🟢 Minor

  • The GP7 range regex assumes single-column, cast-free bounds; multi-column or type-decorated output won't normalize (fails safe). Worth a one-line comment noting the known limitation.
  • ValidateDestTables keeps "does not exists" while validatePartTables was corrected to "does not exist" in the same PR — cosmetic inconsistency.

Holding at Request Changes for the 🔴 test item; the two 🟡 items would be good to fold in while you're here. Nice progress overall.

Allow partition root tables in --include-table / --dest-table /
--exclude-table (previously caused CRITICAL error). Expand partition
roots into leaf-to-leaf pairs matched by partition boundary (not
_prt_N ordinal which drifts after add/drop partition).

Key changes:
- Add --copy-on-segment flag and ForceOnSegment to bypass row-count
  threshold for partition root tables whose reltuples=0
- Pair leaves by normalised range/list/default boundary across GP6,
  GP7, and CBDB (cross-version leaf-to-leaf matching supported)
- GP6 boundary query upgraded from parrangestart::text (internal
  node-tree) to pg_get_partition_rule_def(r.oid, true) for
  human-readable boundary values
- Boundary normalisation: GP6 rule-def and GP7/CBDB pg_get_expr both
  normalise to canonical form (e.g. RANGE:[2024-01-01,2024-02-01))
- Partial matching: maximise leaf-to-leaf pairs by boundary; unmatched
  src leaves copy individually to dst root via partition routing
- Extract matchLeavesByBoundary (pure matching) and decidePairAction
  (pure decision) for direct unit-test coverage
- --allow-partition-reshape flag controls partial-match and zero-match
  behaviour (Fatal without flag, partial/root-fallback with flag)
- Read-error on leaf metadata now Fatal (not silent fallback)
- Promote all fallback reasons from Debug to Warn/Critical with
  actionable hints
- Fix validatePartTables / ValidateDestTables to allow partition roots
- Fix "does not exists" typo in ValidateDestTables
- Propagate ForceOnSegment through fillTablePairChan
- Add 51 unit tests covering boundary normalisation, real DB fixtures,
  cross-version matching, leaf pairing, and pair-action decisions
@leaocx leaocx force-pushed the feat/partition-root-support branch from 3a78ce5 to ef68f87 Compare June 26, 2026 01:10

@talmacschen-arch talmacschen-arch left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 4 review — head ef68f87c

Thanks for the rework. The blocking item from my previous review is fully resolved. This time I also stood up two real clusters and ran the PR binary end-to-end between them, so the findings below are from live cross-version copies, not static reading.

Test environment (real, end-to-end):

  • Source: Greenplum Database 6.27.1 (PostgreSQL 9.4.26), multi-host.
  • Destination: Apache Cloudberry / SynxDB4 — PostgreSQL 14.4 (Apache Cloudberry 2.1.0) (SynxDB4 4.5.0 build 130), multi-host.
  • Binary: this PR built from ef68f87 (cbcopy v1.1.5+dev.6.gef68f87), deployed cluster-wide on both sides.

✅ Resolved since last round

  • Tests now exercise the real production code. matchLeavesByBoundary (pure matching core) and decidePairAction(...) PairAction (pure decision enum) are extracted, and TestMatchLeaves_* / TestDecidePairAction_* call them directly — no more parallel re-implementation in the test. TestDecidePairAction_* asserts the --allow-partition-reshape Fatal gate across all count×flag combinations.
  • Read-leaf failure is now fatal (copy_query_wrapper.go:573/577), not a silent root→root flatten.
  • All boundary-failure paths funnel through decidePairAction; ForceOnSegment is wired end-to-end; "does not exists" typo fixed. go test ./copy/ is green.

✅ End-to-end happy path (CBDB → CBDB)

--dest-table rename of a LIST root and a named date-range root, with the destination pre-built using different leaf names but identical boundaries:

partition "s.sales"->"s.sales_r": all 3 leaves matched by boundary (leaf-to-leaf)
partition "s.evt"->"s.evt_r":     all 2 leaves matched by boundary (leaf-to-leaf)
Database src1: successfully copied 5 tables ... failed 0 tables

Verified the rows landed in the boundary-correct destination leaves (east→_prt_e, west→_prt_w, south→default; jan/feb to the right range leaves), 30/30 and 20/20 rows. When boundaries normalize, the design works correctly.

🔴 Blocking: boundary normalization does not match real GP6 range partitions

On a real GP6 6.27.1 source, pg_get_partition_rule_def(oid, true) returns range-leaf boundaries in a form this PR's normalizeBoundary() does not normalize, so cross-version leaf pairing fails. Strings captured from the live cluster and run through the PR's own function:

Partition (same bounds on both sides) Real GP6 pg_get_partition_rule_def normalizeBoundary Real CBDB pg_get_expr Pairs?
int range, EVERY START (0) END (100) EVERY (100) (unchanged) RANGE:[0,100)
int range, named PARTITION jan START (1) END (101) (unchanged) RANGE:[1,101)
numeric range, EVERY START (0.0) END (10.0) EVERY (10.0) (unchanged) RANGE:[0.0,10.0)
bigint range, EVERY START ('1'::bigint) END ('101'::bigint) EVERY (100) (unchanged) RANGE:[1,101)
date range, EVERY START ('2024-01-01'::date) END ('2024-02-01'::date) EVERY ('1 mon'::interval) (unchanged) RANGE:[2024-01-01,2024-02-01)
date range, named PARTITION jan START ('2024-01-01'::date) END (…) RANGE:[…) RANGE:[…)
list / default PARTITION pa VALUES('apple', 'banana') / DEFAULT PARTITION other LIST:(…) / DEFAULT same

End-to-end confirmation (GP6 → CBDB, identical Jan/Feb/Mar bounds on both sides), real run:

# default (no flag) — hard abort, no data moved:
[CRITICAL] leaf pairing for "e2e.m"->"s.m_r": no boundary matches found
           (boundary format may differ across versions). Use --allow-partition-reshape ...

# with --allow-partition-reshape — root→root flatten ON SEGMENT; data lands correctly
# (30/30 rows, routed by the dest's own constraints) but per-leaf parallelism is lost:
[WARNING] ... falling back to root->root ON SEGMENT (per --allow-partition-reshape)

Two root causes:

  1. Dispatch by prefix misses EVERY-generated range leaves (dominant). normalizeBoundary (boundary_normalize.go:29-37) routes on the leading token FOR VALUES / PARTITION / DEFAULT PARTITION. But GP6 range partitions defined with EVERY — a very common idiom (monthly/daily ranges) — come back as a bare START (…) END (…) EVERY (…) with no PARTITION <name> prefix, so they hit the final return s and are never normalized. This affects every range type.

  2. reGP6RuleRange requires quoted bounds; GP6 doesn't quote int4/numeric. Even when the PARTITION prefix is present (explicitly named partitions), reGP6RuleRange (boundary_normalize.go:50) requires START\s*\('([^']+)'…. Real GP6 emits unquoted int4 and numeric bounds (START (1), START (0.0)); only bigint/date/text are quoted with a cast.

The captured-fixture tests in boundary_normalize_test.go use a PARTITION p1 START ('1'::integer) … WITH (…) shape that real GP6 6.27.1 does not emit for these cases (no EVERY; int4 is not quoted), which is why the unit tests stay green while the feature fails on real data.

Severity. Fail-safe — a mismatch produces a clear Fatal by default, or a correct-but-slower root→root flatten under --allow-partition-reshape (verified: data lands correctly), so there is no data corruption. But the headline leaf-to-leaf pairing silently never engages for the most common GP6 range tables on the --dest-table path this PR is built for.

Suggested direction (verified locally; leaving the implementation to you)

I tried a minimal change and re-ran the same GP6→CBDB copy; both the EVERY integer and date ranges then paired leaf-to-leaf (all 3 leaves matched by boundary), and all existing copy/ unit tests stayed green. Two parts:

  • Dispatch on content, not just the leading token — accept a bare START (…) (with an optional PARTITION <name> prefix and trailing EVERY (…) / WITH (…)):
    if strings.HasPrefix(upper, "PARTITION ") || strings.HasPrefix(upper, "DEFAULT PARTITION") ||
        strings.HasPrefix(upper, "DEFAULT") || strings.HasPrefix(upper, "START") ||
        strings.Contains(upper, " START (") || strings.Contains(upper, "VALUES") {
        return normalizeGP6RuleDef(s)
    }
  • Make the bound quotes/cast optional so bare int4/numeric and quoted date/bigint both parse:
    reGP6RuleRange = regexp.MustCompile(`(?i)START\s*\(\s*'?([^')]+?)'?(?:::\w+)?\s*\)\s*(INCLUSIVE|EXCLUSIVE)?\s*END\s*\(\s*'?([^')]+?)'?(?:::\w+)?\s*\)\s*(INCLUSIVE|EXCLUSIVE)?`)

Regression tests worth adding (fail on ef68f87, pass after the fix)

These use boundary strings captured verbatim from the live GP6.27.1 / CBDB clusters; they would have caught this:

func TestRealGP6_CrossVersionPairing(t *testing.T) {
    cases := []struct{ name, gp6, cbdb string }{
        {"date range via EVERY", "START ('2024-01-01'::date) END ('2024-02-01'::date) EVERY ('1 mon'::interval)", "FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')"},
        {"int range via EVERY",  "START (0) END (100) EVERY (100)",      "FOR VALUES FROM (0) TO (100)"},
        {"int range, named",     "PARTITION jan START (1) END (101)",    "FOR VALUES FROM (1) TO (101)"},
        {"numeric range, EVERY", "START (0.0) END (10.0) EVERY (10.0)",  "FOR VALUES FROM (0.0) TO (10.0)"},
    }
    for _, c := range cases {
        if g, b := normalizeBoundary(c.gp6), normalizeBoundary(c.cbdb); g != b {
            t.Errorf("%s: GP6 %q -> %q  !=  CBDB %q -> %q", c.name, c.gp6, g, c.cbdb, b)
        }
    }
}

I'd recommend grounding the GP6 fixtures in boundary_normalize_test.go on real pg_get_partition_rule_def output (the EVERY form and unquoted int/numeric), since those are what production GP6 actually emits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants