[FLINK-13093] Provide an easy way to modify max parallelism using the State Processor API - ASF JIRA

Details

Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: API / DataStream, Runtime / State Backends
Labels:
- auto-unassigned

Description

Currently, the State Processor API offers no easy way to modify the max parallelism of a job. With the current API, one would have to read all state that exists in a loaded ExistingSavepoint, extract it as DataSets, and then create a NewSavepoint that has the new max parallelism, with all the extracted data sets bootstrapped as new state.

It would be nice if the user could simply do something like the following (API is TBD):

ExistingSavepoint savepoint = Savepoint.load(env, "path", backend);
savepoint.modifyMaxParallelism("newPath", newMaxParallelism);

Under the hood, a batch job would be launched that repartitions all existing operator state using the new max parallelism and writes the repartitioned state data to the new savepoint path.
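
To illustrate why a full repartitioning job is needed, here is a minimal, standalone sketch of how keyed state maps to key groups. It assumes the commonly described scheme in which a key's group is a hash of the key modulo the max parallelism, and a key group's owning subtask is keyGroup * parallelism / maxParallelism. The class KeyGroupSketch, its method names, and the mix() hash are hypothetical stand-ins for illustration, not Flink's actual classes or hash function.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of key-group assignment; hypothetical names, not Flink's classes. */
final class KeyGroupSketch {

    /** Stand-in bit mixer, used only for illustration (assumption: not Flink's real hash). */
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Key group of a key: always a value in [0, maxParallelism). */
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(mix(key.hashCode()), maxParallelism);
    }

    /** Index of the parallel subtask that owns a given key group. */
    static int subtaskFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    /** Keys whose key group differs between the old and new max parallelism. */
    static List<String> keysThatMove(List<String> keys, int oldMax, int newMax) {
        List<String> moved = new ArrayList<>();
        for (String key : keys) {
            if (keyGroupFor(key, oldMax) != keyGroupFor(key, newMax)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("user-" + i);
        }
        List<String> moved = keysThatMove(keys, 128, 200);
        System.out.println(moved.size() + " of " + keys.size() + " keys change key group");
    }
}
```

Because the key group depends on the max parallelism itself, changing that value generally moves keys to different key groups, so the state cannot simply be copied over: every entry has to be read, reassigned, and rewritten, which is the work the proposed batch job would do.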

The API can be designed such that the user cannot modify the max parallelism and add new operators at the same time, so as not to overcomplicate the batch job.

People

Assignee: Unassigned

Reporter: Tzu-Li (Gordon) Tai

Votes: 2

Watchers: 6

Dates

Created: 04/Jul/19 07:37

Updated: 27/Apr/21 23:03