  1. Flink
  2. FLINK-21110 Optimize scheduler performance for large-scale jobs
  3. FLINK-21326

Optimize building topology when initializing ExecutionGraph

Details

    Description

      The main idea of optimizing the topology-building procedure is to put all the vertices that consume the same result partitions into one group, and to put all the result partitions that have the same consumer vertices into one group. The corresponding data structures are ConsumerVertexGroup (for the vertices) and ConsumedPartitionGroup (for the result partitions). EdgeManager is used to store the relationship between the groups. The procedure of creating ExecutionEdge is replaced with building EdgeManager.

      With these improvements, the complexity of building topology in ExecutionGraph decreases from O(N^2) to O(N). 
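
      The grouping described above can be sketched as follows. This is a minimal illustration, not Flink's actual implementation: the classes are simplified stand-ins that only borrow the names from this issue, the fields and methods (connectAllToAll, pairwiseEdgeCount, groupedReferenceCount) are hypothetical, and the sketch assumes an all-to-all connection in which every consumer vertex reads every result partition.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: simplified stand-ins, not Flink's actual classes. */
final class GroupedTopologySketch {

    /** Result partitions that share the same set of consumer vertices. */
    static final class ConsumedPartitionGroup {
        final List<Integer> partitionIds;
        ConsumedPartitionGroup(List<Integer> partitionIds) {
            this.partitionIds = Collections.unmodifiableList(partitionIds);
        }
    }

    /** Vertices that consume the same set of result partitions. */
    static final class ConsumerVertexGroup {
        final List<Integer> vertexIds;
        ConsumerVertexGroup(List<Integer> vertexIds) {
            this.vertexIds = Collections.unmodifiableList(vertexIds);
        }
    }

    /** Stores the relationship between groups (hypothetical field names). */
    static final class EdgeManager {
        final Map<Integer, ConsumerVertexGroup> consumersByPartition = new HashMap<>();
        final Map<Integer, ConsumedPartitionGroup> consumedByVertex = new HashMap<>();
    }

    private static List<Integer> range(int n) {
        List<Integer> ids = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            ids.add(i);
        }
        return ids;
    }

    /** Assumed all-to-all pattern: one shared group per side, built in linear time. */
    static EdgeManager connectAllToAll(int numPartitions, int numVertices) {
        ConsumedPartitionGroup partitions = new ConsumedPartitionGroup(range(numPartitions));
        ConsumerVertexGroup vertices = new ConsumerVertexGroup(range(numVertices));
        EdgeManager manager = new EdgeManager();
        for (int p : partitions.partitionIds) {
            manager.consumersByPartition.put(p, vertices); // every partition shares one group
        }
        for (int v : vertices.vertexIds) {
            manager.consumedByVertex.put(v, partitions); // every vertex shares one group
        }
        return manager;
    }

    /** Objects needed if each (partition, vertex) pair had its own edge object. */
    static long pairwiseEdgeCount(int numPartitions, int numVertices) {
        return (long) numPartitions * numVertices;
    }

    /** Group references held by the manager in the grouped representation. */
    static long groupedReferenceCount(EdgeManager manager) {
        return (long) manager.consumersByPartition.size() + manager.consumedByVertex.size();
    }

    public static void main(String[] args) {
        int n = 1000;
        EdgeManager manager = connectAllToAll(n, n);
        System.out.println("pairwise edges: " + pairwiseEdgeCount(n, n));
        System.out.println("grouped references: " + groupedReferenceCount(manager));
    }
}
```

      Under these assumptions, a per-pair representation grows with the product of the two sides, while the grouped representation grows with their sum, because every partition points at the same vertex group and every vertex points at the same partition group.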

      Furthermore, ExecutionEdge and all its related calls are replaced with ConsumedPartitionGroup and ConsumerVertexGroup.

      The detailed design doc is located at: https://docs.google.com/document/d/1OjGAyJ9Z6KsxcMtBHr6vbbrwP9xye7CdCtrLvf8dFYw/edit?usp=sharing

            People

              Thesharing Zhilong Hong