Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-21110 Optimize scheduler performance for large-scale jobs
  3. FLINK-21915

Optimize Execution#finishPartitionsAndUpdateConsumers

    XMLWordPrintableJSON

    Details

      Description

      Based on the scheduler benchmark PartitionReleaseInBatchJobBenchmark introduced in FLINK-20612, we find that there's another procedure that has O(N^2) computation complexity: Execution#finishPartitionsAndUpdateConsumers

      Once an execution is finished, it will finish all its BLOCKING partitions and update the partition info to all consumer vertices. The procedure can be illustrated as the following pseudo code:

      for all Execution in ExecutionGraph:
        for all produced IntermediateResultPartition of the Execution:
          for all consumer ExecutionVertex of the IntermediateResultPartition:
            update or cache partition info

      This procedure has O(N^2) complexity in total.

      Based on FLINK-21326, the consumed partitions are grouped if they are connected to the same consumer vertices. Therefore, we can update partition info of the entire ConsumedPartitionGroup in batch, rather than one by one. This will decrease the complexity from O(N^2) to O(N).

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                Thesharing Zhilong Hong
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated: