[flink] Preserve log-only source restore mode by luoyuxia · Pull Request #3355 · apache/fluss

luoyuxia · 2026-05-20T03:30:20Z

Purpose

Linked issue: close #3354

Preserve the original Fluss-only (non-lake) startup semantics after a Flink source restore. When the source starts from a specified Fluss log position, it should not initialize lake snapshot splits after restore just because lakeSource is available again.

Brief change log

Persist an empty remainingHybridLakeFlussSplits list when the enumerator checkpoints with lakeSource == null.
Allow empty pending hybrid split lists to deserialize without lakeSource, while still requiring lakeSource for non-empty hybrid split state.
Document the three-state meaning of pendingHybridLakeFlussSplits so null, empty, and non-empty states are explicit.
Add a restore regression test that checkpoints a Fluss-only enumerator, restores it with a non-null lakeSource and a registered lake snapshot, and verifies no LakeSnapshotSplit is generated.
Add a serializer round-trip test for empty pending hybrid split state with lakeSource == null.
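The three-state reading of pendingHybridLakeFlussSplits can be sketched roughly as follows. This is a minimal illustration, not the Fluss implementation: the class name, the Kind enum, the use of String as a stand-in for a split, and the exact meaning assigned to the empty state are assumptions based on the change log and the in-code comment quoted later in the review.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the three states; not the actual Fluss enumerator code. */
final class PendingHybridSplitState {

    /** Assumed interpretation of null, empty, and non-empty pending split lists. */
    enum Kind {
        NOT_INITIALIZED, // null: lake split initialization has not run yet
        NOTHING_PENDING, // empty: initialization is settled, no lake splits remain
        PENDING          // non-empty: hybrid lake/Fluss splits still need assignment
    }

    static Kind classify(List<String> pendingHybridLakeFlussSplits) {
        if (pendingHybridLakeFlussSplits == null) {
            return Kind.NOT_INITIALIZED;
        }
        return pendingHybridLakeFlussSplits.isEmpty() ? Kind.NOTHING_PENDING : Kind.PENDING;
    }

    /**
     * Value to checkpoint: a Fluss-only enumerator (lakeSource == null) stores an
     * empty list so a later restore does not read the state as "not initialized".
     */
    static List<String> stateToCheckpoint(Object lakeSource, List<String> pending) {
        return lakeSource == null ? Collections.<String>emptyList() : pending;
    }
}
```

With this reading, a Fluss-only checkpoint always classifies as NOTHING_PENDING after restore, regardless of whether a lakeSource is available at that point.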

Tests

./mvnw -pl fluss-flink/fluss-flink-common -Dtest=SourceEnumeratorStateSerializerTest,FlinkSourceEnumeratorTest#testRestoreFlussOnlySourceWithLakeSourceDoesNotGenerateLakeSplits test
./mvnw -pl fluss-flink/fluss-flink-common spotless:apply
git diff --check

API and Format

No public API, storage format, or checkpoint serializer version change. The fix only changes the value stored in the existing enumerator state field for Fluss-only source checkpoints and permits that empty-state sentinel to deserialize without a lake source.
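A rough sketch of how an empty-state sentinel could deserialize without a lake source while non-empty state still requires one. The wire layout here (a presence flag, a count, then UTF strings) and all names are hypothetical and chosen only for illustration; the real SourceEnumeratorStateSerializer format is not shown in this PR description.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical codec for the pending split list; strings stand in for real splits. */
final class PendingSplitsCodecSketch {

    static byte[] serialize(List<String> pending) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(pending != null); // false encodes the null state
            if (pending != null) {
                out.writeInt(pending.size());
                for (String split : pending) {
                    out.writeUTF(split);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static List<String> deserialize(byte[] data, boolean lakeSourceAvailable) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            if (!in.readBoolean()) {
                return null; // state was never initialized
            }
            int count = in.readInt();
            // Only non-empty hybrid state needs a lake source to decode its splits.
            if (count > 0 && !lakeSourceAvailable) {
                throw new IllegalStateException(
                        "lakeSource is required to restore non-empty hybrid split state");
            }
            List<String> result = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                result.add(in.readUTF());
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the sketch is that the check sits on the element count rather than on the mere presence of the field, so the field layout itself does not have to change.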

Documentation

No new feature or user-facing configuration change.

Generative AI disclosure

Yes: OpenAI Codex was used to author this PR following the repository AGENTS.md guidance.

luoyuxia · 2026-05-20T07:29:26Z

@loserwang1024 Could you please help review?

loserwang1024 · 2026-05-20T08:43:57Z

+                // Preserve Fluss-only (non-lake) startup across restore. Otherwise a restored
+                // enumerator with a non-null lakeSource would treat null as "not initialized yet"
+                // and generate lake snapshot splits.
+                lakeSource == null ? Collections.emptyList() : pendingHybridLakeFlussSplits;


I understand what you mean, but the code seems hard to understand without sufficient context. I've also thought about this:

If the enumerator is created by FlinkSource#createEnumerator, the job is starting without state. Whether to generate lake splits then depends only on whether a lakeSource is present.

If the enumerator is created by FlinkSource#restoreEnumerator, there is no need to generate lake splits again. Before the first checkpoint is taken, FlinkSourceEnumerator#start → FlinkSourceEnumerator#generateHybridLakeFlussSplits has already been executed, so the lake splits do not need to be regenerated on restore.

By this logic, even a job that was originally started from a specified timestamp will not read the lake splits after a stateful restart, as long as a checkpoint has been taken.
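The rule proposed above can be written down as a small decision function. This is only a sketch of the reviewer's reasoning: the class, enum, and method names are hypothetical, and the real enumerator does not necessarily expose the creation path this way.

```java
/** Hypothetical encoding of the rule: generate lake splits only on a stateless start. */
final class LakeSplitDecisionSketch {

    /** Which FlinkSource factory method produced the enumerator. */
    enum CreationPath {
        CREATE_ENUMERATOR, // stateless start
        RESTORE_ENUMERATOR // start from a checkpoint or savepoint
    }

    static boolean shouldGenerateLakeSplits(CreationPath path, boolean lakeSourcePresent) {
        switch (path) {
            case CREATE_ENUMERATOR:
                // Fresh start: the presence of a lake source alone decides.
                return lakeSourcePresent;
            case RESTORE_ENUMERATOR:
                // Lake split generation already ran before the first checkpoint.
                return false;
            default:
                throw new IllegalArgumentException("Unknown creation path: " + path);
        }
    }
}
```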

I agree the original approach in snapshotState() is hard to understand without sufficient context.

I've reworked the fix to use the checkpointTriggeredBefore flag in generateHybridLakeFlussSplits() instead. While the restore-awareness logic still requires some thought, this approach keeps all the complexity contained within a single method rather than spreading it across snapshotState().

Additionally, I changed startInStreamModeForNonPartitionedTable to call generateHybridLakeFlussSplits() synchronously, consistent with the partitioned-table path in start(). This ensures lake split initialization always completes before any checkpoint can be triggered, which is a prerequisite for the checkpointTriggeredBefore guard to work correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
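The reworked guard and the synchronous initialization described above might look roughly like the model below. It is a simplified sketch under stated assumptions: checkpointTriggeredBefore is assumed to be true for an enumerator restored from a checkpoint, a counter stands in for real split generation, and restoreWith is a hypothetical helper that is not part of Fluss.

```java
/** Hypothetical model of the guard; method names follow the PR text, bodies are assumptions. */
final class GuardedEnumeratorSketch {
    private final boolean hasLakeSource;
    // Assumption: true when this instance was created from checkpointed state.
    private final boolean checkpointTriggeredBefore;
    private int lakeSplitGenerations = 0;

    GuardedEnumeratorSketch(boolean hasLakeSource, boolean checkpointTriggeredBefore) {
        this.hasLakeSource = hasLakeSource;
        this.checkpointTriggeredBefore = checkpointTriggeredBefore;
    }

    /** Runs lake split initialization synchronously, before any checkpoint can happen. */
    void start() {
        generateHybridLakeFlussSplits();
    }

    private void generateHybridLakeFlussSplits() {
        // All restore-awareness lives in this one method.
        if (checkpointTriggeredBefore || !hasLakeSource) {
            return;
        }
        lakeSplitGenerations++;
    }

    int lakeSplitGenerations() {
        return lakeSplitGenerations;
    }

    /** Hypothetical stand-in for restoreEnumerator: the restored instance carries the flag. */
    GuardedEnumeratorSketch restoreWith(boolean lakeSourceAfterRestore) {
        return new GuardedEnumeratorSketch(lakeSourceAfterRestore, true);
    }
}
```

In this model a Fluss-only enumerator that is restored with a lake source never generates lake splits, which is the behavior the PR aims to preserve. If start() ran the initialization asynchronously, a checkpoint could be taken before it finished, and the guard would then skip lake splits that were never generated.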

luoyuxia · 2026-05-20T09:45:01Z

@loserwang1024 Comments have been addressed.

luoyuxia force-pushed the fix-flink-log-only-restore-lake-splits branch 2 times, most recently from cdabb47 to 71b01aa on May 20, 2026 03:55

[flink] Preserve log-only source restore mode

a01bd57

luoyuxia force-pushed the fix-flink-log-only-restore-lake-splits branch from 71b01aa to a01bd57 on May 20, 2026 05:54

luoyuxia requested a review from loserwang1024 May 20, 2026 07:28

loserwang1024 reviewed May 20, 2026

address comment

e58b0cb

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

luoyuxia requested a review from loserwang1024 May 20, 2026 09:41

luoyuxia wants to merge 2 commits into apache:main from luoyuxia:fix-flink-log-only-restore-lake-splits
