  1. Spark
  2. SPARK-30602 SPIP: Support push-based shuffle to improve shuffle efficiency
  3. SPARK-32919

Add support in Spark driver to coordinate the shuffle map stage in push-based shuffle by selecting external shuffle services for merging shuffle partitions

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: Shuffle, Spark Core
    • Labels:
      None

      Description

      At the beginning of a shuffle map stage, the driver needs to select external shuffle services to act as the mergers of the shuffle partitions for the corresponding shuffle.

      We currently make this selection using the immediately available information about current and past executor locations. Ideally, the selection would sit behind a pluggable interface so that we could leverage information tracked outside of a Spark application for better load balancing or for a disaggregated deployment environment.
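
      As an illustration only, the idea of a pluggable selection interface with a default policy based on current and past executor locations could be sketched as follows. This is not Spark's actual API: `MergerSelector`, `ExecutorLocationSelector`, `executor_added`, `executor_removed`, and `select_mergers` are hypothetical names, and the ordering policy (hosts with live executors first, then hosts that previously ran executors) is an assumption made for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MergerSelector(ABC):
    """Hypothetical pluggable interface for choosing merger hosts."""

    @abstractmethod
    def select_mergers(self, num_needed: int) -> List[str]:
        """Return up to num_needed shuffle-service hosts to act as mergers."""


class ExecutorLocationSelector(MergerSelector):
    """Assumed default policy: use current, then past, executor locations."""

    def __init__(self) -> None:
        # host -> number of live executors (dicts keep insertion order)
        self._live: Dict[str, int] = {}
        # hosts that no longer run any executor, oldest first
        self._past: List[str] = []

    def executor_added(self, host: str) -> None:
        self._live[host] = self._live.get(host, 0) + 1
        if host in self._past:
            self._past.remove(host)

    def executor_removed(self, host: str) -> None:
        count = self._live.get(host, 0)
        if count > 1:
            self._live[host] = count - 1
        elif count == 1:
            # Last executor on this host is gone; remember it as a past location.
            del self._live[host]
            self._past.append(host)

    def select_mergers(self, num_needed: int) -> List[str]:
        # Prefer hosts with live executors, then fall back to past locations.
        candidates = list(self._live) + self._past
        return candidates[:max(num_needed, 0)]


selector = ExecutorLocationSelector()
for host in ["host-a", "host-b", "host-c"]:
    selector.executor_added(host)
selector.executor_removed("host-b")

mergers = selector.select_mergers(3)
```

      A selector backed by information tracked outside the application (for example, a cluster-wide load tracker) would implement the same hypothetical `MergerSelector` interface, so the driver-side call site would not need to change.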

            People

            • Assignee:
              vsowrirajan Venkata krishnan Sowrirajan
              Reporter:
              mshen Min Shen
            • Votes:
              0
              Watchers:
              7
