Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Incomplete
Description
With intermediate results available in the distributed runtime as of FLINK-986, we could now incrementally resume failed jobs if we cached those results (FLINK-1404). Moreover, Flink users could build incremental Flink jobs using count/collect/print and, ultimately, continue from old job results in an interactive shell environment such as the scala-shell.
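For illustration, here is a minimal sketch of the kind of incremental program the description envisions, written against Flink's batch DataSet Scala API; the pipeline itself is an illustrative assumption, not from this issue. Today each action such as count() or collect() submits a separate blocking job that recomputes the shared pipeline from the source (cf. FLINK-6110); with cached intermediate results, later jobs could resume from the materialized result instead.
{code:scala}
import org.apache.flink.api.scala._

object IncrementalJobSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Shared intermediate pipeline: without result caching, every
    // action below re-executes it from the source.
    val wordCounts = env
      .fromElements("to be or not to be")
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .groupBy(0)
      .sum(1)

    // Each action submits its own job today. With cached intermediate
    // results, the second job could resume from the materialized
    // wordCounts result instead of recomputing it.
    println(wordCounts.count())           // job 1
    wordCounts.collect().foreach(println) // job 2, currently recomputes
  }
}
{code}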
The following tasks need to be completed for that to happen:
- Cache the results
- Keep the ExecutionGraph in the JobManager
- Change the scheduling mechanism to trace results back from the sinks
- Implement session management to eventually discard old results and ExecutionGraphs from the TaskManagers/JobManager (a sketch follows this list)
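As a rough illustration of the last task, here is a hypothetical sketch of JobManager-side session bookkeeping; all names (SessionState, SessionManager, expireIdleSessions) are illustrative assumptions, not Flink APIs. The idea is simply to retain the ExecutionGraph and the handles to cached results per session and drop them after an idle timeout.
{code:scala}
import java.time.{Duration, Instant}
import scala.collection.mutable

// Hypothetical types, not Flink classes: placeholders for the retained
// ExecutionGraph and the handles to cached intermediate result partitions.
final case class SessionState(
    executionGraph: AnyRef,
    cachedResults: mutable.Map[String, AnyRef],
    var lastAccess: Instant)

final class SessionManager(idleTimeout: Duration) {
  private val sessions = mutable.Map.empty[String, SessionState]

  // Look up a session and refresh its last-access timestamp.
  def touch(sessionId: String): Option[SessionState] =
    sessions.get(sessionId).map { s => s.lastAccess = Instant.now(); s }

  // Run periodically: discard ExecutionGraphs and cached results of
  // sessions idle longer than the timeout, freeing resources on the
  // TaskManagers/JobManager.
  def expireIdleSessions(): Unit = {
    val cutoff = Instant.now().minus(idleTimeout)
    val expired = sessions.collect {
      case (id, s) if !s.lastAccess.isAfter(cutoff) => id
    }
    expired.foreach(sessions.remove)
  }
}
{code}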
Issue Links
- is duplicated by
  - FLINK-6110 Flink unnecessarily repeats shared work triggered by different blocking sinks, leading to massive inefficiency (Closed)
- is related to
  - FLINK-986 Add intermediate results to distributed runtime (Resolved)
- is superseded by
  - FLINK-10429 Redesign Flink Scheduling, introducing dedicated Scheduler component (Closed)
- relates to
  - FLINK-2245 Programs that contain collect() reported as multiple jobs in the Web frontend (Closed)
  - FLINK-1730 Add a FlinkTools.persist style method to the Data Set (Closed)