Details
- Type: Sub-task
- Status: Resolved
- Priority: Blocker
- Resolution: Fixed
None
Description
Follow-up testing for https://issues.apache.org/jira/browse/FLINK-32070
1.20 is the MVP version for FLIP-306. The feature is fairly complex and should be tested carefully. The main idea of FLIP-306 is to merge checkpoint files on the TM side and provide new StateHandles to the JM. There will be a TM-managed directory under the 'shared' checkpoint directory for each subtask, and a TM-managed directory under the 'taskowned' checkpoint directory for each Task Manager. Under these newly introduced directories, the checkpoint files are merged into a smaller set of files. The following scenarios need to be tested, including but not limited to:
- With file merging enabled, periodic checkpoints complete properly, and failover, restore and rescaling also work.
- When file merging is switched on and off across jobs, checkpointing and recovery still work properly.
- No TM-managed directory is left over, especially when no checkpoint completes before the job is cancelled.
- File merging has no effect on (native) savepoints.
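The left-over directory scenario can be checked by hand, or with a small helper along the following lines. This is only a sketch under assumptions: the checkpoint storage is a local file system, the path passed in is the per-job checkpoint directory that contains 'shared' and 'taskowned', and any sub-directory found directly under those two is treated as a candidate TM-managed directory. The class and method names are hypothetical and not part of Flink.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink.
public class LeftoverDirCheck {

    // Lists sub-directories directly under <jobCheckpointDir>/shared and
    // <jobCheckpointDir>/taskowned. Assumption: every such sub-directory is a
    // candidate TM-managed directory; plain files are ignored.
    public static List<Path> findSubDirs(Path jobCheckpointDir) throws IOException {
        List<Path> found = new ArrayList<>();
        for (String parent : new String[] {"shared", "taskowned"}) {
            Path dir = jobCheckpointDir.resolve(parent);
            if (!Files.isDirectory(dir)) {
                continue; // parent directory does not exist, nothing to report
            }
            try (Stream<Path> children = Files.list(dir)) {
                children.filter(Files::isDirectory).forEach(found::add);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: LeftoverDirCheck <job-checkpoint-dir>");
            return;
        }
        List<Path> leftovers = findSubDirs(Paths.get(args[0]));
        leftovers.forEach(p -> System.out.println("left over: " + p));
        System.out.println(leftovers.isEmpty()
                ? "no sub-directories left over"
                : leftovers.size() + " sub-directories left over");
    }
}
```

Running it after cancelling a job that never completed a checkpoint should report no sub-directories if the cleanup works as described above.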
Besides the behaviors above, it is also worth validating space amplification control and the metrics. All config options can be found under 'execution.checkpointing.file-merging'.
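For the on/off switching scenario, the two configuration variants could be generated as follows. This is a minimal sketch: only the 'execution.checkpointing.file-merging' prefix comes from this issue. The option names 'enabled' and 'max-space-amplification' are assumptions that should be checked against the Flink 1.20 configuration documentation, the value 2 is purely illustrative, and the class itself is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that renders config entries for a test run.
public class FileMergingConfigSketch {

    // Prefix taken from the issue description.
    public static final String PREFIX = "execution.checkpointing.file-merging.";

    // Builds the options for one test run. The key suffixes below are
    // assumptions; verify them against the Flink configuration docs.
    public static Map<String, String> options(boolean enabled) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(PREFIX + "enabled", Boolean.toString(enabled));
        conf.put(PREFIX + "max-space-amplification", "2"); // illustrative value
        return conf;
    }

    // Renders the options as "key: value" lines, one per entry.
    public static String render(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        conf.forEach((k, v) -> sb.append(k).append(": ").append(v).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("# run with file merging on");
        System.out.print(render(options(true)));
        System.out.println("# run with file merging off");
        System.out.print(render(options(false)));
    }
}
```

Restoring a job started with one variant from a checkpoint taken with the other covers the switching case listed above.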
Attachments
Issue Links
- is blocked by
  - FLINK-35778 Escape URI reserved characters when creating file-merging directories (Resolved)
  - FLINK-35784 The cp file-merging directory not properly registered in SharedStateRegistry (Resolved)