[FLINK-10653] Introduce Pluggable Shuffle Service Architecture - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.9.0
Component/s: Runtime / Network
Labels: None

Description

This is the umbrella issue for improving the shuffle architecture.

Shuffle is the process of transferring data between stages: the sender side writes outputs and the receiver side reads them. In Flink's implementation it consists of three parts (the writer, the transport layer, and the reader), which are unified for both streaming and batch jobs.

In detail, the current ResultPartitionWriter interface on the upstream side supports only in-memory outputs for streaming jobs and local persistent file outputs for batch jobs. If we implement another writer, such as DfsWriter, RdmaWriter, or SortMergeWriter, on top of the ResultPartitionWriter interface, there is no unified mechanism for extending the reader side accordingly.

To make the shuffle architecture more flexible and to support more scenarios, especially for batch jobs, a high-level shuffle architecture is needed that manages and extends the writer and reader sides together.

Refer to the design doc for more details.
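
The idea of extending the writer and reader sides together can be illustrated with a minimal sketch. All names below (PartitionWriter, PartitionReader, ShuffleBackend, InMemoryShuffleBackend, ShuffleDemo) are hypothetical and chosen only for illustration; they are not the interfaces introduced by this issue. The sketch assumes a single plugin object that hands out a matching writer/reader pair, so a new writer can never be added without its corresponding reader.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical names throughout; this is NOT Flink's actual shuffle API.

/** Writer side: the producer task emits records into a partition. */
interface PartitionWriter extends AutoCloseable {
    void write(String record);

    @Override
    void close();
}

/** Reader side: the consumer task pulls records from a partition. */
interface PartitionReader extends AutoCloseable {
    /** Returns the next record, or null when the partition is drained. */
    String next();

    @Override
    void close();
}

/** One pluggable backend supplies both sides, so they always match. */
interface ShuffleBackend {
    PartitionWriter createWriter(String partitionId);

    PartitionReader createReader(String partitionId);
}

/** Simplest possible backend: each partition is an in-memory queue. */
final class InMemoryShuffleBackend implements ShuffleBackend {
    private final Map<String, Queue<String>> partitions = new HashMap<>();

    @Override
    public PartitionWriter createWriter(String partitionId) {
        Queue<String> queue = partitions.computeIfAbsent(partitionId, id -> new ArrayDeque<>());
        return new PartitionWriter() {
            @Override
            public void write(String record) {
                queue.add(record);
            }

            @Override
            public void close() {
                // Nothing to release for an in-memory queue.
            }
        };
    }

    @Override
    public PartitionReader createReader(String partitionId) {
        Queue<String> queue = partitions.computeIfAbsent(partitionId, id -> new ArrayDeque<>());
        return new PartitionReader() {
            @Override
            public String next() {
                return queue.poll();
            }

            @Override
            public void close() {
                // Nothing to release for an in-memory queue.
            }
        };
    }
}

final class ShuffleDemo {
    /** Writes all records through the backend's writer, then reads them back. */
    static List<String> roundTrip(ShuffleBackend backend, String partitionId, List<String> records) {
        try (PartitionWriter writer = backend.createWriter(partitionId)) {
            for (String record : records) {
                writer.write(record);
            }
        }
        List<String> received = new ArrayList<>();
        try (PartitionReader reader = backend.createReader(partitionId)) {
            String record;
            while ((record = reader.next()) != null) {
                received.add(record);
            }
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new InMemoryShuffleBackend(), "p0", List.of("a", "b", "c")));
    }
}
```

Under this assumption, a DFS-based or sort-merge shuffle would be another ShuffleBackend implementation, and the calling code in roundTrip would stay unchanged.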

Issue Links

is a parent of

FLINK-13247 Implement external shuffle service for YARN

Open

FLINK-11805 A Common External Shuffle Service Framework

Reopened

FLINK-13246 Implement external shuffle service for Kubernetes

Reopened

relates to

FLINK-1833 Refactor partition availability notification in ExecutionGraph

Closed

supersedes

FLINK-1833 Refactor partition availability notification in ExecutionGraph

Closed

Sub-Tasks

1.

Introduce ResultPartitionWithConsumableNotifier in task for notifying consumable result partition

Resolved

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

2.

Remove the schedule mode property from RPDD to TDD

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

3.

Introduce ShuffleMaster in Job Master

Resolved

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 40m

4.

Introduce ShuffleEnvironment in Task Executor

Resolved

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

5.

Activate default shuffle implementation and remove legacy code

Closed

Zhijiang

6.

Extend the necessary methods in ResultPartitionWriter interface

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

7.

Make ResultPartitionWriter extend AutoCloseable

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

8.

Make InputGate interface extend AutoCloseable

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

9.

Remove IOMode from NetworkEnvironment

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

10.

Remove KvState related components from NetworkEnvironment

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

11.

Refactor the creation of ResultPartition and InputGate into NetworkEnvironment

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

12.

Replace IntermediateResultPartitionID with ResultPartitionID in ResultPartitionDeploymentDescriptor

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 10m

13.

Refactor to simplify the process of scheduleOrUpdateConsumers

Resolved

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

14.

Refactor the constructor of NetworkEnvironment

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

15.

Abstract TaskEventPublisher interface for simplifying NetworkEnvironment

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

16.

Move network related options to NetworkEnvironmentOptions

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

17.

Remove legacy fields for SingleInputGate

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

18.

Remove unregister task from NetworkEnvironment to simplify the interface of ShuffleService

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

19.

Introduce partition/gate setup to decouple task registration with NetworkEnvironment

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 40m

20.

Refactor IOMetrics to not distinguish between local/remote in/out bytes

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

21.

Introduce InputGateWithMetrics in Task to increment numBytesIn metric

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

22.

Consider introducing batch metric register in NetworkEnvironment

Closed

Zhijiang

23.

Refactor ResultPartitionManager to break tie with Task

Resolved

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

24.

Move network metrics setup into NetworkEnvironment

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

25.

Refactor the start method of ConnectionManager

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

26.

Move Task.inputGatesById to NetworkEnvironment

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

27.

Introduce an encapsulated metric group layout for shuffle API and deprecate old one

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

28.

Remove getBufferProvider method from ResultPartitionWriter interface

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

29.

Switch Task from ResultPartition to ResultPartitionWriter interface

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

30.

Make NetworkEnvironment#start() return the bound data port

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

31.

Remove getOwningTaskName method from InputGate

Resolved

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

32.

Refactor abstract InputGate to general interface

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

33.

Introduce NetworkEnvironment.getUnreleasedPartitions instead of using getResultPartitionManager

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

34.

Introduce ShuffleService interface and its configuration

Resolved

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

35.

Make shuffle environment implementation independent of IOManager

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

36.

Remove abstract getPageSize method from InputGate

Resolved

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

37.

Remove ExecutionAttemptID argument from ResultPartitionFactory#create

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

38.

Add partition lifecycle related Shuffle API

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

39.

Introduce ShuffleDescriptor#ReleaseType and ShuffleDescriptor#getSupportedReleaseTypes

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 40m

40.

Refactor the process of SchedulerNG#requestPartitionState

Closed

Zhijiang

41.

Remove getBufferSize method from BufferPoolFactory

Closed

Zhijiang

100%

Original Estimate - Not Specified

Time Spent - 20m

42.

Remove ShuffleDescriptor.ReleaseType and make release semantics fixed per partition type

Closed

Andrey Zagrebin

100%

Original Estimate - Not Specified

Time Spent - 20m

People

Assignee: Zhijiang

Reporter: Zhijiang

Votes: 5

Watchers: 30

Dates

Created: 23/Oct/18 10:12

Updated: 09/Oct/20 07:07

Resolved: 09/Oct/20 07:04

Time Tracking

Estimated: Not Specified

Remaining: 0h

Logged: 13h 50m