  1. Spark
  2. SPARK-33235 Push-based Shuffle Improvement Tasks
  3. SPARK-35426

When addMergerLocation exceeds maxRetainedMergerLocations, we should remove the merger based on merged shuffle data size.

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, when addMergerLocation causes the number of merger locations to exceed maxRetainedMergerLocations, we simply remove the oldest merger. It would be better to choose the merger to remove based on merged shuffle data size.

      We should remove the mergers holding the largest amount of merged shuffle data, so that the remaining mergers potentially have more disk space available to store new merged shuffle data.
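
      The proposed eviction rule can be sketched as follows. This is a minimal, self-contained illustration and not Spark's actual implementation: the class MergerRegistry, the methods recordMergedBytes and retainedHosts, and the idea that the driver already knows the merged bytes per host are all assumptions made for the example. Only the names addMergerLocation and maxRetainedMergerLocations come from the issue.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the driver-side merger bookkeeping; not Spark code.
final class MergerRegistry {
    private final int maxRetainedMergerLocations;
    // host -> merged shuffle bytes reported for that host (assumed to be available).
    // LinkedHashMap iterates in insertion order, so ties resolve to the oldest host.
    private final Map<String, Long> mergedBytesByHost = new LinkedHashMap<>();

    MergerRegistry(int maxRetainedMergerLocations) {
        if (maxRetainedMergerLocations < 1) {
            throw new IllegalArgumentException("maxRetainedMergerLocations must be >= 1");
        }
        this.maxRetainedMergerLocations = maxRetainedMergerLocations;
    }

    // Hypothetical hook: accumulate merged bytes for a host that is still retained.
    void recordMergedBytes(String host, long bytes) {
        mergedBytesByHost.computeIfPresent(host, (h, old) -> old + bytes);
    }

    // Adds a merger host; if the cap is reached, first evicts the host
    // holding the most merged shuffle data instead of the oldest one.
    void addMergerLocation(String host) {
        if (mergedBytesByHost.containsKey(host)) {
            return; // already retained, nothing to do
        }
        if (mergedBytesByHost.size() >= maxRetainedMergerLocations) {
            String victim = null;
            long largest = 0L;
            for (Map.Entry<String, Long> e : mergedBytesByHost.entrySet()) {
                // Strict '>' keeps the earliest-inserted host when sizes are equal.
                if (victim == null || e.getValue() > largest) {
                    largest = e.getValue();
                    victim = e.getKey();
                }
            }
            mergedBytesByHost.remove(victim);
        }
        mergedBytesByHost.put(host, 0L);
    }

    // Snapshot of the currently retained hosts, in insertion order.
    Set<String> retainedHosts() {
        return new LinkedHashSet<>(mergedBytesByHost.keySet());
    }
}
```

      With a cap of 2, hosts host-a (100 bytes merged) and host-b (500 bytes merged), adding host-c would evict host-b under this rule, whereas the current oldest-first behavior would evict host-a. When all hosts report the same size, the sketch falls back to evicting the oldest host.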

          People

            Assignee: Unassigned
            Reporter: Qi Zhu (zhuqi)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: