FLINK-5053: Incremental / lightweight snapshots for checkpoints


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented

    Description

      Currently there is basically no difference between savepoints and checkpoints in Flink; both are created through exactly the same process.

      However, savepoints and checkpoints have slightly different meanings, which we should take into account to keep Flink efficient:

      • Savepoints are (typically infrequently) triggered by the user to create a state from which the application can be restarted, e.g. because Flink, the user code, or the parallelism needs to be changed.
      • Checkpoints are (typically frequently) triggered by the system to allow for fast recovery in case of failure, while keeping the job/system unchanged.

      This means that savepoints and checkpoints can have different properties in that:

      • Savepoints should represent a state of the application in which characteristics of the job (e.g. parallelism) can be adjusted for the next restart. One example of something that savepoints need to be aware of is key-groups. Savepoints can potentially be a little more expensive than checkpoints, because they are usually created far less frequently and are triggered by the user.
      • Checkpoints are frequently triggered by the system to allow for fast failure recovery. However, failure recovery leaves all characteristics of the job unchanged, so checkpoints do not have to be aware of them; think again of key-groups. Checkpoints should run faster than savepoints; in particular, it would be nice to have incremental checkpoints.

      For a first approach, I would suggest the following steps/changes:

      • In checkpoint coordination: differentiate between triggering checkpoints and savepoints. Introduce properties for checkpoints that describe their set of abilities, e.g. "is-key-group-aware", "is-incremental" (see the properties sketch after this list).
      • In the state handle infrastructure: introduce state handles that reflect incremental checkpoints and drop full key-group awareness, i.e. they would cover folders instead of files and, instead of a keygroup_id -> file/offset mapping, perhaps only keep a keygroup_range -> folder mapping (see the state handle sketch below).
      • Backend side: we should start with RocksDB by reintroducing something similar to semi-async snapshots, but using BackupableDBOptions::setShareTableFiles(true) and transferring only new incremental outputs to HDFS (see the backup sketch below). Note that using RocksDB's internal backup mechanism means giving up the information about individual key-groups. But as explained above, this should be totally acceptable for checkpoints, while savepoints should use the key-group-aware, fully async mode. Of course, we also need to implement the ability to restore from both types of snapshots.
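
      For illustration, such a set of snapshot properties could look roughly like the sketch below. The class and flag names are made up for this issue and are not an existing Flink API:

      {code:java}
      /**
       * Hypothetical descriptor of a snapshot's abilities as proposed above.
       * Names are illustrative only, not an existing Flink API.
       */
      public final class SnapshotProperties {

          private final boolean keyGroupAware;  // "is-key-group-aware"
          private final boolean incremental;    // "is-incremental"

          private SnapshotProperties(boolean keyGroupAware, boolean incremental) {
              this.keyGroupAware = keyGroupAware;
              this.incremental = incremental;
          }

          /** Savepoints: user-triggered, key-group aware, always full. */
          public static SnapshotProperties forSavepoint() {
              return new SnapshotProperties(true, false);
          }

          /** Checkpoints: system-triggered, may drop key-group awareness and be incremental. */
          public static SnapshotProperties forCheckpoint() {
              return new SnapshotProperties(false, true);
          }

          public boolean isKeyGroupAware() {
              return keyGroupAware;
          }

          public boolean isIncremental() {
              return incremental;
          }
      }
      {code}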
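
      Similarly, a state handle that only tracks a key-group range and a covering folder (rather than a per-key-group file/offset mapping) might be sketched as follows; again, the names are hypothetical:

      {code:java}
      import java.io.Serializable;
      import java.net.URI;

      /**
       * Hypothetical state handle for an incremental, non-key-group-aware checkpoint:
       * it only records which backup folder covers which key-group range, instead of
       * a keygroup_id -> file/offset mapping. Not an existing Flink class.
       */
      public class IncrementalFolderStateHandle implements Serializable {

          private static final long serialVersionUID = 1L;

          /** First and last key-group covered by the folder (both inclusive). */
          private final int startKeyGroup;
          private final int endKeyGroup;

          /** Location of the covering backup folder, e.g. on HDFS. */
          private final URI folder;

          public IncrementalFolderStateHandle(int startKeyGroup, int endKeyGroup, URI folder) {
              this.startKeyGroup = startKeyGroup;
              this.endKeyGroup = endKeyGroup;
              this.folder = folder;
          }

          /** True if the given key-group falls into the covered range. */
          public boolean containsKeyGroup(int keyGroup) {
              return keyGroup >= startKeyGroup && keyGroup <= endKeyGroup;
          }

          public URI getFolder() {
              return folder;
          }
      }
      {code}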
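
      On the backend side, a rough sketch of the RocksDB backup mechanism via the Java API is shown below. The paths are placeholders and the actual Flink integration (async execution, transferring the backup folder incrementally to HDFS) is omitted; the important part is setShareTableFiles(true), which lets consecutive backups share unchanged SST files so that each new backup only adds the delta:

      {code:java}
      import java.io.File;

      import org.rocksdb.BackupEngine;
      import org.rocksdb.BackupableDBOptions;
      import org.rocksdb.Env;
      import org.rocksdb.Options;
      import org.rocksdb.RestoreOptions;
      import org.rocksdb.RocksDB;
      import org.rocksdb.RocksDBException;

      /** Standalone sketch of incremental RocksDB snapshots via the backup engine. */
      public class RocksIncrementalBackupSketch {

          static {
              RocksDB.loadLibrary();
          }

          public static void main(String[] args) throws RocksDBException {
              final String dbPath = "/tmp/rocksdb-instance";     // local working dir (placeholder)
              final String backupPath = "/tmp/rocksdb-backups";  // local backup dir (placeholder)
              final String restorePath = "/tmp/rocksdb-restored";

              new File(dbPath).mkdirs();
              new File(backupPath).mkdirs();
              new File(restorePath).mkdirs();

              try (Options options = new Options().setCreateIfMissing(true);
                   RocksDB db = RocksDB.open(options, dbPath);
                   // Shared table files: unchanged SSTs are reused across backups,
                   // so every createNewBackup() call only writes the new delta.
                   BackupableDBOptions backupOptions =
                           new BackupableDBOptions(backupPath).setShareTableFiles(true);
                   BackupEngine backupEngine = BackupEngine.open(Env.getDefault(), backupOptions)) {

                  db.put("key".getBytes(), "value".getBytes());

                  // Flush memtables first so the backup consists of SST files only.
                  backupEngine.createNewBackup(db, true);

                  // During failure recovery, the latest backup would be restored
                  // into a fresh local directory.
                  try (RestoreOptions restoreOptions = new RestoreOptions(false)) {
                      backupEngine.restoreDbFromLatestBackup(restorePath, restorePath, restoreOptions);
                  }
              }
          }
      }
      {code}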

      One remaining problem with the suggested approach is that even checkpoints should support scale-down, in case only a smaller number of instances is available for recovery (a sketch of the required key-group range overlap computation follows below).
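
      To illustrate the scale-down concern, the following self-contained sketch (hypothetical helper code, not Flink's actual key-group assignment) shows how a recovering subtask at reduced parallelism could determine which old subtasks' checkpoint folders overlap its new key-group range and therefore have to be read and filtered:

      {code:java}
      import java.util.ArrayList;
      import java.util.List;

      /** Sketch of mapping a new (smaller) parallelism back to old checkpoint folders. */
      public class ScaleDownAssignmentSketch {

          /** Key-group range [start, end] of subtask 'index' out of 'parallelism' subtasks. */
          static int[] rangeFor(int maxParallelism, int parallelism, int index) {
              int start = index * maxParallelism / parallelism;
              int end = (index + 1) * maxParallelism / parallelism - 1;
              return new int[] {start, end};
          }

          /** Old subtask indices whose key-group ranges overlap the new subtask's range. */
          static List<Integer> oldSubtasksToRead(
                  int maxParallelism, int oldParallelism, int newParallelism, int newIndex) {
              int[] newRange = rangeFor(maxParallelism, newParallelism, newIndex);
              List<Integer> result = new ArrayList<>();
              for (int oldIndex = 0; oldIndex < oldParallelism; oldIndex++) {
                  int[] oldRange = rangeFor(maxParallelism, oldParallelism, oldIndex);
                  boolean overlaps = oldRange[0] <= newRange[1] && newRange[0] <= oldRange[1];
                  if (overlaps) {
                      result.add(oldIndex);
                  }
              }
              return result;
          }

          public static void main(String[] args) {
              // Scaling down from 4 to 3 subtasks with max parallelism 128:
              // new subtask 0 covers key-groups [0, 41] and must read the
              // folders written by old subtasks 0 and 1 -> prints [0, 1].
              System.out.println(oldSubtasksToRead(128, 4, 3, 0));
          }
      }
      {code}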

            People

              Assignee: Xiaogang Shi (shixg)
              Reporter: Stefan Richter (srichter)
              Votes: 1
              Watchers: 17

