Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Done
None
Description
At the moment, failure handling for asynchronously triggered checkpoints in the checkpoint coordinator happens in several different places. We could organise it in a similar way to the failure handling for synchronously triggered checkpoints in CheckpointTriggerResult, where error cases are classified. This would simplify, for example, integrating the error counter for FLINK-4810.
See also discussion here: https://github.com/apache/flink/pull/6567
The specific design document: https://docs.google.com/document/d/1ce7RtecuTxcVUJlnU44hzcO2Dwq9g4Oyd8_biy94hJc/edit?usp=sharing
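To make the proposal concrete, here is a minimal sketch of what classified failure handling could look like: every asynchronous checkpoint failure is mapped to a reason and routed through one handler, which is also where an error counter of the kind asked for in FLINK-4810 could live. All names here (CheckpointFailureSketch, FailureReason, FailureHandler, tolerableFailures) are hypothetical and are not Flink's actual API; which reasons count towards the limit is likewise an assumption made for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in Flink under these names. */
public class CheckpointFailureSketch {

    /** Illustrative classification of asynchronous checkpoint failures. */
    enum FailureReason {
        DECLINED_BY_TASK(true),
        EXPIRED(true),
        // Assumption: these two are not real failures of the job,
        // so they do not count towards the failure limit.
        SUBSUMED(false),
        COORDINATOR_SHUTDOWN(false);

        final boolean countsTowardsLimit;

        FailureReason(boolean countsTowardsLimit) {
            this.countsTowardsLimit = countsTowardsLimit;
        }
    }

    /** Single place that receives every classified failure. */
    static class FailureHandler {
        private final int tolerableFailures;
        private final Map<FailureReason, Integer> perReason =
                new EnumMap<>(FailureReason.class);
        private int continuousFailures;

        FailureHandler(int tolerableFailures) {
            this.tolerableFailures = tolerableFailures;
        }

        /** Records the failure; returns true if the job should now be failed. */
        boolean onFailure(long checkpointId, FailureReason reason) {
            perReason.merge(reason, 1, Integer::sum);
            if (reason.countsTowardsLimit) {
                continuousFailures++;
            }
            return continuousFailures > tolerableFailures;
        }

        /** A completed checkpoint resets the continuous failure count. */
        void onSuccess(long checkpointId) {
            continuousFailures = 0;
        }

        /** Number of failures recorded so far for the given reason. */
        int count(FailureReason reason) {
            return perReason.getOrDefault(reason, 0);
        }
    }

    public static void main(String[] args) {
        FailureHandler handler = new FailureHandler(1);
        System.out.println(handler.onFailure(1L, FailureReason.SUBSUMED));         // false
        System.out.println(handler.onFailure(2L, FailureReason.DECLINED_BY_TASK)); // false
        System.out.println(handler.onFailure(3L, FailureReason.EXPIRED));          // true
    }
}
```

Because every call site reports through onFailure, adding a new counter or policy means changing one class rather than each place where a checkpoint can be aborted.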
Issue Links
- blocks: FLINK-12209 Refactor failure handling with CheckpointFailureManager (Closed)
- is depended upon by: FLINK-4810 Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful checkpoints (Closed)
- is related to: FLINK-11662 Discarded checkpoint can cause Tasks to fail (Closed)
- relates to: FLINK-12058 Cancel checkpoint operations belonging to a discarded/aborted checkpoint (Closed)
- links to