Spark / SPARK-26543

Support the coordinator to determine post-shuffle partitions more reasonably

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2
    • Fix Version/s: None
    • Component/s: SQL
    • Patch

    Description

      In Spark SQL, when adaptive execution (AE) is enabled with 'set spark.sql.adaptive.enabled=true', the ExchangeCoordinator is introduced to determine the number of post-shuffle partitions. Under certain conditions, however, the coordinator does not perform well: some tasks are always retained even though their Shuffle Read Size / Records is 0.0B/0. We could increase spark.sql.adaptive.shuffle.targetPostShuffleInputSize to work around this, but that is unreasonable, because targetPostShuffleInputSize should not be set too large. As follows:

      The ExchangeCoordinator can filter out the useless (0 B) partitions automatically.
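
To illustrate the behaviour described above, here is a minimal Python sketch. It is a simplified model of greedily packing adjacent pre-shuffle partitions against a target size, not the actual ExchangeCoordinator code. The function names `coalesce` and `drop_empty`, the sample sizes, and the 64-unit target are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) range of pre-shuffle partition indices


def coalesce(sizes: List[int], target: int) -> List[Range]:
    """Greedily pack adjacent partitions; start a new range when adding
    the next partition would push the running total above `target`."""
    ranges: List[Range] = []
    start, acc = 0, 0
    for i, size in enumerate(sizes):
        if i > start and acc + size > target:
            ranges.append((start, i))
            start, acc = i, 0
        acc += size
    if sizes:
        ranges.append((start, len(sizes)))
    return ranges


def drop_empty(sizes: List[int], ranges: List[Range]) -> List[Range]:
    """Proposed extra step (assumption): discard ranges that hold no data."""
    return [(s, e) for (s, e) in ranges if sum(sizes[s:e]) > 0]


# Hypothetical skewed input: two large partitions surrounded by empty ones.
sizes = [0, 0, 200, 0, 0, 200, 0]
packed = coalesce(sizes, target=64)
filtered = drop_empty(sizes, packed)

print(packed)    # [(0, 2), (2, 3), (3, 5), (5, 6), (6, 7)]
print(filtered)  # [(2, 3), (5, 6)]
```

In this model, three of the five post-shuffle ranges carry 0 bytes, because an oversized partition forces a cut both before and after it. Raising the target does not remove these cuts unless it exceeds the size of the largest partition, which is why filtering empty ranges is the more reasonable fix.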

            People

              Assignee: Unassigned
              Reporter: chenliang (southernriver)
              Votes: 0
              Watchers: 4
