to 4.0: fix(data-branch): unify diff/merge collect-range over arbitrary DAG depth (cherry-pick #24371 to 4.0-dev)#24378
Merged
mergify[bot] merged 6 commits intoMay 14, 2026
Conversation
`data branch diff` previously walked only the two endpoint relations when computing change streams. Mutations performed on intermediate ancestors (e.g. t1 in a chain t0->t1->t2) that were inherited into the endpoint via cloning never reached the diff, causing rows to silently disappear from `diff t2 against t0`. Replace the old case-by-case branches (LCA == base / LCA == tar / LCA elsewhere with three sub-conditions each) with a single ancestor walk driven by the LCA->endpoint paths captured from the branch DAG. For each ancestor A on path[LCA, endpoint] we collect A's own mutations within (A.CTS, nextChildOnPath.CloneTS] (or (A.CTS, endpointSP] when A is the endpoint). The LCA's prefix that the other side inherits identically is pruned to skip redundant IO. The model generalizes to any tree depth and any pair of nodes. Add `DataBranchDAG.PathFromAncestor` and the corresponding chain fields on `branchMetaInfo` to thread the path information from the DAG resolver to the collect-range builder. The new `buildSideCollectRange` helper materializes one side of the diff uniformly; the historical case-2/3/4 logic is gone. BVT additions exercise the new code path: * test/distributed/cases/git4data/branch/diff/diff_12.sql: 12 cases covering linear chains, mid-chain writes, snapshot mixes, no-PK chains, and post-fork ancestor mutations. * test/distributed/cases/git4data/branch/diff/diff_13.sql: one giant tree case (19 nodes, depth 4, 10 leaves) with 10 diff pairs across non-adjacent levels. Whole `git4data/branch/diff/` BVT now passes 1317/1317. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…esolver
Follow-up to the unified collect-range fix. No semantic change; BVT
still passes 1317/1317.
* Drop stale comments left over from the pre-unified design:
- "1. has no lca / 2. ..." ASCII case list in
decideLCABranchTSFromBranchDAG that described the old branch_t1_ts
/ branch_t2_ts model.
- Lone "Boundary semantics:" header in decideCollectRange whose
body was already absorbed into the new running example block.
- "// has no lca" trailing comment in diffOnBase that contradicted
the lca-handling block right above it.
* Refactor decideLCABranchTSFromBranchDAG so the LCA path information
is computed in a single linear flow (FindLCA → PathFromAncestor →
switch on lcaTableID) instead of a `var (...)` block + defer that
re-wired branchInfo from outside. Adds a small `buildTSs` helper
so the int64-to-types.TS lift no longer repeats four times. The
function-level doc comment now states what each output field is
for and who consumes it downstream.
* Drop an unused error-wrap path: the previous code re-wrapped
`dag.GetCloneTS` not-found into an internal error even though the
DAG always populates CloneTS for any node it knows about (FindLCA
has already validated descent). The simpler `_, _ := dag.GetCloneTS`
matches the intent.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eliminate the last vestiges of the pre-unified design where
branchMetaInfo carried three pieces of information that all derive
from the LCA->endpoint paths:
* lcaType (lcaEmpty / lcaLeft / lcaRight / lcaOther) — was set in
five places but only ever read at one decision point ("is the
side empty?"); the other reads forwarded it through diffDataHelper
and buildHashmapForTable as a *dead parameter*. Replace the gate
with branchMetaInfo.hasLCA() and remove the dead parameter
entirely.
* tarBranchTS / baseBranchTS — per-side LCA snapshot used by
findDeleteAndUpdateBat and diffOnBase. Both values are exactly
pathFromLCAToOther[1].cloneTS (or otherSP when the other endpoint
is the LCA). Expose three small helpers on branchMetaInfo —
tarLCASnapshot, baseLCASnapshot, lcaProbeSnapshot — and let the
consumers compute on demand.
Other touch-ups in the same spirit:
* decideLCABranchTSFromBranchDAG now also handles the
same-table-id-no-DAG case directly with synthetic single-node
paths, instead of leaking the special-case setup into
decideCollectRange. The function is once again single-purpose.
* The lcaEmpty / lcaLeft / lcaRight / lcaOther constants are gone.
* data_branch_hashdiff_test.go updated for the new signatures.
Whole `git4data/branch/diff/` BVT still passes 1317/1317. Frontend
unit tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous refactor swapped the meaning of pathFromLCAToTar vs
pathFromLCAToBase in the LCA-snapshot helpers, which broke
no-PK / fake-PK merge cases (merge_6, merge_7) where tombstones on
the descendant side were resolved against the wrong moment of the
LCA. The right semantics:
tarLCASnapshot returns the moment tar's view of the LCA was
anchored — tar's first-child CloneTS when tar forked off the LCA,
otherwise base's first-child CloneTS, otherwise baseSP. The
baseLCASnapshot mirror works the same way.
This matches the original `tarBranchTS = CloneTS(tarFirstChild)`
behavior. After the fix the whole merge/ BVT (merge_1..7) goes back
to 100%, and the chain merge_8.sql (172 statements over 9 cases)
also passes 100%.
Includes:
* test/distributed/cases/git4data/branch/merge/merge_8.{sql,result}
— chain merge BVT covering cascade up the chain, direct grandchild
-> grandparent merge, conflict skip/accept across multiple chain
depths, snapshot-scoped, no-PK chain, alternating-round merges,
and order-of-merges regression.
* Regenerated diff_12.result / diff_13.result against the corrected
helpers (no semantic change, only the LCA probe TS that drives
tombstone resolution).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* SCA fail (gofmt): the ASCII tree art in three doc comments needed tab-indented continuation lines. Run gofmt -w; no semantic change. * Must Fix (Qodo / 7-jury review): PathFromAncestor's ok=false return was discarded with `_`, which would leak nil slices into buildSideCollectRange and panic at `selfPath[len-1]`. Propagate it as an internal error so a corrupted DAG fails fast and loud instead of crashing the diff. * Should Fix: add defensive length check before ctsList[0] in buildSideCollectRange — getTablesCreationCommitTS theoretically could return an empty slice on RPC corner cases. * Should Fix: bound PathFromAncestor's walk by maxBranchDAGDepth (=1024) so a Parent-pointer cycle in the DAG cannot cause an infinite loop. Real chains never approach this; deep observed trees sit around 16 levels. * Should Fix: extract tableStuff.resolvedSnapshots so the per-side snapshot derivation lives in one place. Replaces three duplicated copies in decideCollectRange, diffOnBase, and hashDiffIfHasLCA — the same triple-duplication that previously hid the LCA-snapshot mismatch bug fixed in dd0205e. Whole `git4data/branch/diff/` BVT still 1317/1317. Whole `git4data/branch/merge/` BVT still 466/466. Frontend unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
Open
1 task
XuPeng-SH
approved these changes
May 13, 2026
Contributor
Merge Queue Status
This pull request spent 19 seconds in the queue, including 2 seconds running CI. Required conditions to merge
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Backport of #24371 to
4.0-dev.What type of PR is this?
Which issue(s) this PR fixes:
issue #21979 (cherry-picked from #24371)
What this PR does / why we need it:
Cherry-picks the 5 substantive commits from #24371 (skips the
intermediate
Merge branch 'main'merge commit, which only resolvescontext against
main). All commits applied cleanly with no conflicts.0b566bbe75b0f44f14d5e844aaf7ffb63cdac409e193420425The final tree state of all 11 modified files (
pkg/frontend/data_branch*.go,pkg/frontend/databranchutils/branch_dag.go, and the new BVT cases undertest/distributed/cases/git4data/branch/{diff,merge}/) is byte-identicalto the head of #24371 (verified via per-file shasum diff).
TL;DR (from #24371)
data branch diffanddata branch mergepreviously dropped rowssilently whenever the LCA → endpoint chain in the branch DAG had any
intermediate node that performed inserts/updates/deletes before the
endpoint was forked. Symptom: rows that the endpoint inherits via
multi-level cloning were never compared on its side of the diff, so
downstream merges failed to migrate them too. The fix replaces the
historical case-by-case logic with a single ancestor-walking model
that works uniformly for any DAG depth and shape.
Refer to the original PR for the full bug analysis, fix design, and
running 19-node tree example used throughout the new code comments.
Verification on 4.0-dev
pkg/frontendbuild +go vetclean.pkg/frontendunit tests pass (includingTestHashDiff_NoLCAWithStubHandles,TestDiffDataHelper_*,TestRunLCAProbeWithReaderFallback_*,TestHandleDelsOnLCA_*,TestLCAProbeJoinCastType).gofmt -lclean on all five modified Go files.diff_12,diff_13,merge_8) included with theircorresponding
.resultfiles.Special notes for your reviewer:
4.0-dev(no conflicts), so thediff against
4.0-devmatches the diff againstmainon the sourcebranch (i.e. nothing on
mainbetween the two relevant base commitstouches the migrated files).
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com