[4.0-dev] fix data branch no-op update diff#25159
Conversation
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
XuPeng-SH
left a comment
There was a problem hiding this comment.
I re-checked this PR from a first-principles / unhappy-path angle. The no-op diff suppression is moving in the right direction, but I still see one blocker in the current head.
Blocker: reverted same-PK update chains can still emit a false UPDATE
The current handleDelsOnLCA path still decides “on LCA” from PK existence only. That is not strong enough for a repeated same-PK chain like:
- LCA row =
A - intermediate update =
B - later revert =
A
In that shape, both tombstones probe the same LCA row by PK, so the intermediate tombstone can be removed from tombBat before the rowid-based local-cancel path runs. That leaves the intermediate inserted row in dataHashmap, and the later compare logic only suppresses the final A == A no-op row — it can still emit the intermediate B as an UPDATE.
Why I think this is real
- The LCA join is built on PK only (
right join (values ...) ... on lca.pk = pks.pk ... order by pks.__idx_). - Every joined row is copied into
dBat, so duplicate PK probes remain duplicated. - The rowid-based local insert/delete cancel-out logic only runs on rows that are still left in
tombBatafter the LCA classification. - The update detector then processes each
dBatrow independently.
Suggested fix
- Do not decide “on LCA” from PK existence alone.
- Reconcile tombstones against local
dataHashmapby rowid before (or independently of) the PK-only LCA probe, or otherwise ensure only the tombstone that actually deletes the LCA version reaches the update/no-op comparison path. - Please also update the unit coverage so duplicate PK probes from the LCA join are represented explicitly, not just a single LCA hit.
Given that the PR is specifically trying to make repeated-update / revert chains no-op, this remaining path still looks like a blocker to me.
I checked this path against the current PR head. Classification: not applicable as a blocker for the current code path. The key premise is that both same-PK tombstones can reach I verified this two ways:
Verification: No production fallback or special-case logic was added. Update after rebase: conflict with latest |
2cb21c3 to
3eeec9d
Compare
XuPeng-SH
left a comment
There was a problem hiding this comment.
Reviewed current head (3eeec9d) from a correctness / unhappy-path angle.\n\nI re-checked the previous same-PK A -> B -> A concern against the real hashDiff path. The current buildHashmapForTable path compacts tombstones by PK before LCA probing: it keeps the latest tombstone, removes stale tombstones, and removes stale data rows older than the latest tombstone. That means the LCA probe receives one candidate for this chain, and the new tuple-vs-LCA-row comparison suppresses the final no-op when the visible row matches the LCA row.\n\nI also checked the fake-PK path, migrated hashmap close path, compare normalization, batch ownership around update emission, and the reader/SQL LCA result post-processing. I do not see a remaining blocker.\n\nLocal verification:\n- git diff --check origin/4.0-dev...HEAD\n- go test ./pkg/frontend -run 'TestCompareTupleWithBatchRow|TestCompareTupleValueWithVectorNormalizesValues|TestDataBranchCompareValueErrors|TestNormalizeCompareValueSupportedForms|TestHashDiff_HasLCANoopUpdateFiltering|TestHashDiff_HasLCAFakePKUpdateIsNotNoop|TestHashDiff_FakePKClosesMigratedHashmaps|TestLCAProbeJoinCastType' -count=1\n- go test ./pkg/frontend -run 'TestHashDiff|TestRunLCAProbe|TestCompareRowInWrappedBatches|TestCompareTuple|TestDataBranchCompare|TestNormalizeCompare|TestLCAProbeJoinCastType' -count=1\n- go test ./pkg/frontend/databranchutils -count=1\n\nApproved.
|
Tick the box to add this pull request to the merge queue (same as
|
Summary
Related
Refs #23751
Port of #25149
Tests