SPARK-32919 (sub-task of SPARK-30602, SPIP: Support push-based shuffle to improve shuffle efficiency)

Add support in Spark driver to coordinate the shuffle map stage in push-based shuffle by selecting external shuffle services for merging shuffle partitions


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: Shuffle, Spark Core
    • Labels: None

    Description

      At the beginning of a shuffle map stage, the driver needs to select external shuffle services to act as the mergers of the shuffle partitions for the corresponding shuffle.

      The selection currently relies on the information that is immediately available to the driver: the locations of current and past executors. Ideally, the selection would sit behind a pluggable interface, so that it could also use information tracked outside of a Spark application, either for better load balancing or for a disaggregated deployment environment.
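      The selection described above can be illustrated with a minimal sketch. It is not Spark's implementation: the names `MergerSelector`, `ExecutorHistorySelector`, and `select_mergers` are hypothetical, and the policy shown (prefer hosts of current executors, then fall back to hosts of past executors, without duplicates) is only one assumed reading of "current and past executor location information".

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class MergerSelector(ABC):
    """Hypothetical pluggable interface: choose shuffle-service hosts to act as mergers."""

    @abstractmethod
    def select_mergers(self, num_mergers_needed: int) -> List[str]:
        """Return at most num_mergers_needed distinct host names."""


class ExecutorHistorySelector(MergerSelector):
    """Assumed default policy: hosts of current executors first, then past ones."""

    def __init__(self, active_hosts: Sequence[str], past_hosts: Sequence[str]):
        self.active_hosts = list(active_hosts)
        self.past_hosts = list(past_hosts)

    def select_mergers(self, num_mergers_needed: int) -> List[str]:
        selected: List[str] = []
        seen = set()
        # Walk active hosts before past hosts so live locations are preferred.
        for host in self.active_hosts + self.past_hosts:
            if len(selected) >= num_mergers_needed:
                break
            if host not in seen:  # one merger per host
                seen.add(host)
                selected.append(host)
        return selected


selector = ExecutorHistorySelector(
    active_hosts=["host-a", "host-b", "host-a"],
    past_hosts=["host-c", "host-b", "host-d"],
)
mergers = selector.select_mergers(3)
```

      Because callers depend only on the abstract `MergerSelector`, a different implementation (for example, one that consults an external load tracker) could be swapped in without changing the caller, which is the point of placing the selection behind an interface.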


          People

            Assignee: vsowrirajan Venkata krishnan Sowrirajan
            Reporter: mshen Min Shen
