Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-24892 FLIP-187: Adaptive Batch Scheduler
  3. FLINK-25995

Make implicit assumption of SQL local hash explicit

    XMLWordPrintableJSON

Details

    Description

      If there are multiple consecutive and the same hash shuffles, SQL planner will change them except the first one to use forward partitioner, so that these operators can be chained to reduce unnecessary shuffles.

      However, sometimes the local hash operators are not chained (e.g. multiple inputs), and this kind of forward partitioners will turn into forward job edges. These forward edges still have the local keyBy assumption, so that they cannot be changed into rescale/rebalance edges, otherwise it can lead to incorrect results. This prevents the adaptive batch scheduler from determining parallelism for other forward edge downstream job vertices (see FLINK-25046).

      When SQL planner optimizes the case of multiple consecutive the same groupBy, it should use ForwardForConsecutiveHashPartitioner, so that the runtime framework can further decide whether the partitioner can be changed to rescale or not.

      Attachments

        Issue Links

          Activity

            People

              godfreyhe godfrey he
              zhuzh Zhu Zhu
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: