  Flink / FLINK-16430 FLIP-119 Pipelined Region Scheduling / FLINK-19286

Improve pipelined region scheduling performance


Details

    Description

      In my recent TPC-DS benchmark, pipelined region scheduling was slower than lazy-from-sources scheduling.
      The regression is caused by suboptimal implementations in PipelinedRegionSchedulingStrategy, including:
      1. topological sorting of the vertices to deploy
      2. an unnecessary O(V) loop when sorting an empty set of regions

      After improving these implementations, pipelined region scheduling turned out to be 10% faster than lazy-from-sources scheduling in the same benchmark setup.
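
      The second item can be illustrated with a minimal sketch. It is not the actual PipelinedRegionSchedulingStrategy code: the class RegionSorter, the method sortInTopologicalOrder, and the use of plain strings as region identifiers are all hypothetical, chosen only to show how an early return avoids walking the whole topology when there is nothing to sort.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Flink's API. */
final class RegionSorter {

    private RegionSorter() {}

    /**
     * Returns the regions in {@code toSort}, ordered as they appear in
     * {@code topologicallySortedRegions}.
     *
     * Assumes {@code toSort} is a subset of the topology and that the
     * topology contains no duplicates.
     */
    static List<String> sortInTopologicalOrder(
            List<String> topologicallySortedRegions, Set<String> toSort) {

        // Fast path: with nothing to sort, skip the walk over the topology.
        if (toSort.isEmpty()) {
            return Collections.emptyList();
        }

        List<String> sorted = new ArrayList<>(toSort.size());
        for (String region : topologicallySortedRegions) {
            if (toSort.contains(region)) {
                sorted.add(region);
                // Stop as soon as every requested region has been placed.
                if (sorted.size() == toSort.size()) {
                    break;
                }
            }
        }
        return sorted;
    }
}
```

      With a topology of r1, r2, r3, r4, calling sortInTopologicalOrder with the set {r3, r1} yields [r1, r3], while an empty set returns immediately without touching the topology.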


            People

              zhuzh Zhu Zhu
              zhuzh Zhu Zhu