[FLINK-12058] Cancel checkpoint operations belonging to a discarded/aborted checkpoint - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 1.7.2, 1.8.0
Fix Version/s: None
Component/s: Runtime / Checkpointing
Labels: None

Description

To save CPU cycles and reduce disk and network I/O, we should try to cancel local checkpoint operations belonging to discarded, aborted, or subsumed checkpoints. For example, if a Task declines a checkpoint, the CheckpointCoordinator will discard this checkpoint. However, the other checkpointing operations belonging to this checkpoint won't necessarily be notified and canceled, so they may keep running even though their result will never be used.

The notification mechanism could piggyback on the existing CancelCheckpointMarker or be a separate signal sent to all participating Tasks.
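
To make the idea concrete, here is a minimal sketch of what the task side could look like, assuming each local checkpoint operation is represented by a `Future` and that some abort signal carrying the checkpoint ID reaches the Task. The class `LocalCheckpointOperations` and its methods are hypothetical names chosen for illustration; they are not existing Flink classes or APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Future;

// Hypothetical helper (not part of Flink): tracks in-flight local
// checkpoint operations per checkpoint ID so they can be canceled
// once the checkpoint is known to be discarded or aborted.
final class LocalCheckpointOperations {

    private final Map<Long, List<Future<?>>> pending = new ConcurrentHashMap<>();

    // Called when a Task starts a local operation (e.g. an async snapshot upload).
    void register(long checkpointId, Future<?> operation) {
        pending.computeIfAbsent(checkpointId, id -> new CopyOnWriteArrayList<>())
               .add(operation);
    }

    // Called when the abort signal for the given checkpoint arrives.
    // Returns how many operations were actually canceled.
    int notifyCheckpointAborted(long checkpointId) {
        List<Future<?>> operations = pending.remove(checkpointId);
        if (operations == null) {
            return 0; // nothing running, or already cleaned up
        }
        int canceled = 0;
        for (Future<?> operation : operations) {
            // cancel(true) requests interruption; returns false if already done.
            if (operation.cancel(true)) {
                canceled++;
            }
        }
        return canceled;
    }

    // Called when the checkpoint completed normally; only drops the bookkeeping.
    void notifyCheckpointCompleted(long checkpointId) {
        pending.remove(checkpointId);
    }
}
```

This sketch deliberately ignores two things a real implementation would have to handle: an operation registered after the abort signal has already arrived, and subsumed checkpoints, where all operations with an ID lower than the newly completed checkpoint would need to be canceled as well.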

Issue Links

duplicates

FLINK-8871 Checkpoint cancellation is not propagated to stop checkpointing threads on the task manager

Closed

is related to

FLINK-10724 Refactor failure handling in check point coordinator

Closed

People

Assignee: Unassigned

Reporter: Till Rohrmann

Votes: 0

Watchers: 4

Dates

Created: 28/Mar/19 14:21

Updated: 02/Oct/19 17:43

Resolved: 22/May/19 09:59