Details
- Sub-task
- Status: Closed
- Blocker
- Resolution: Fixed
- 1.3.2, 1.4.0
- None
Description
Currently, we always delete a checkpoint handle if the handle itself (or the checkpoint data it points to on the DFS) cannot be read: https://github.com/apache/flink/blob/91a4b276171afb760bfff9ccf30593e648e91dfb/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/ZooKeeperCompletedCheckpointStore.java#L180
This causes problems when the DFS is temporarily unavailable: we could inadvertently delete all checkpoints even though they are still valid.
A user reported this problem on the mailing list: https://lists.apache.org/thread.html/9dc9b719cf8449067ad01114fedb75d1beac7b4dff171acdcc24903d@%3Cuser.flink.apache.org%3E
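To illustrate the behavior asked for here, the following is a minimal sketch, not the actual Flink fix. `RecoverSketch`, `Handle`, and `recover` are hypothetical names that stand in for the checkpoint store and its state handles, and a plain `Map` stands in for ZooKeeper. It assumes a failed read may be transient, so it skips the checkpoint for this recovery attempt but leaves the handle in the store.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are part of Flink's API.
class RecoverSketch {

    /** Stand-in for a state handle whose data lives on the DFS. */
    interface Handle {
        String retrieve() throws IOException;
    }

    /**
     * Tries to read every handle in the store. A handle that cannot be read
     * is skipped, but it is never removed from the store.
     */
    static List<String> recover(Map<String, Handle> store) {
        List<String> recovered = new ArrayList<>();
        for (Map.Entry<String, Handle> entry : store.entrySet()) {
            try {
                recovered.add(entry.getValue().retrieve());
            } catch (IOException e) {
                // Assumption: the DFS may only be temporarily unavailable,
                // so keep the handle and let a later attempt retry it.
                System.err.println("Could not read " + entry.getKey()
                        + ", keeping handle: " + e.getMessage());
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        Map<String, Handle> store = new LinkedHashMap<>();
        store.put("chk-1", () -> "checkpoint-1");
        store.put("chk-2", () -> { throw new IOException("DFS unavailable"); });

        List<String> recovered = recover(store);
        System.out.println("Recovered: " + recovered);
        System.out.println("Handles still in store: " + store.size());
    }
}
```

The point of the sketch is only that the unreadable handle stays in the store after a failed read. Whether a handle should eventually be discarded after repeated failures is a separate policy question that this sketch does not address.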
Issue Links
- causes: FLINK-22502 DefaultCompletedCheckpointStore drops unrecoverable checkpoints silently (Resolved)
- links to