Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Versions: 1.15.0, 1.15.1
Description
- When native checkpoints and incremental checkpointing are enabled in the RocksDB state backend, state data larger than state.storage.fs.memory-threshold is stored in a data file (FileStateHandle, RelativeFileStateHandle, etc.) rather than inlined in the checkpoint metadata as a ByteStreamStateHandle, for example base-path1/chk-1/file1.
- Then restore the job from base-path1/chk-1 in CLAIM mode using the changelog state backend, with the checkpoint path set to base-path2. The new checkpoint is saved in base-path2/chk-2, and it still needs the previous checkpoint file (base-path1/chk-1/file1).
- Then restore the job from base-path2/chk-2 with the changelog state backend. Flink tries to read base-path2/chk-2/file1 rather than the actual file location base-path1/chk-1/file1, which leads to a FileNotFoundException, and the job fails.
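The steps above can be pictured with a small, self-contained model. This is only a sketch of the symptom, not Flink's actual code: the class `RelativeHandleSketch` and the helpers `storedAsFile` and `resolveRelative` are hypothetical names, and the model assumes that a relative handle keeps only a file name and is resolved against whichever checkpoint directory is being restored.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelativeHandleSketch {

    /**
     * Hypothetical model of the threshold rule: state larger than the
     * threshold goes to its own data file, smaller state is inlined
     * in the checkpoint metadata.
     */
    public static boolean storedAsFile(long stateSizeBytes, long thresholdBytes) {
        return stateSizeBytes > thresholdBytes;
    }

    /**
     * Hypothetical model of a relative handle: only the file name is
     * recorded, so the location depends on the directory restored from.
     */
    public static Path resolveRelative(Path restoredCheckpointDir, String relativeName) {
        return restoredCheckpointDir.resolve(relativeName);
    }

    public static void main(String[] args) {
        Path chk1 = Paths.get("base-path1", "chk-1");
        Path chk2 = Paths.get("base-path2", "chk-2");
        Path actualFile = chk1.resolve("file1"); // where the data really lives

        // With a 20-byte threshold, a 64-byte state is written to a file.
        System.out.println("stored as file: " + storedAsFile(64, 20));

        // First restore (from chk-1): the relative name resolves to the real file.
        Path first = resolveRelative(chk1, "file1");
        System.out.println("first restore finds file: " + first.equals(actualFile));

        // chk-2 reuses file1 without copying it. Resolving the same relative
        // name against chk-2 points at a file that was never written there.
        Path second = resolveRelative(chk2, "file1");
        System.out.println("second restore finds file: " + second.equals(actualFile));
    }
}
```

In this model the second lookup misses because the handle's location is derived from the restore directory instead of being carried as an absolute path; whether that matches the real root cause should be checked against the fix itself.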
How to reproduce?
- Set state.storage.fs.memory-threshold to a small value, such as '20b'.
- Run org.apache.flink.test.checkpointing.ChangelogPeriodicMaterializationSwitchStateBackendITCase#testSwitchFromDisablingToEnablingInClaimMode.
Issue Links
- relates to FLINK-25872: Restoring from non-changelog checkpoint with changelog state-backend enabled in CLAIM mode discards state in use (Closed)
- relates to FLINK-28699: Native rocksdb full snapshot in non-incremental checkpointing (Resolved)