Spark / SPARK-34601

Do not delete shuffle file on executor lost event when using remote shuffle service

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: Shuffle
    • Labels:

      Description

      Several efforts are under way on disaggregated/remote shuffle services (e.g. LinkedIn shuffle, Facebook shuffle service, Uber shuffle service). Such a remote shuffle service is not the Spark External Shuffle Service. It could be a third-party shuffle solution that the user enables by setting spark.shuffle.manager. In those systems, shuffle data is stored on a server other than the executor, so Spark should not mark shuffle data as lost when the executor is lost. We could add a Spark configuration to control this behavior. By default, Spark would still mark shuffle files as lost. For a disaggregated/remote shuffle service, users could set the configuration so that shuffle files are not marked as lost.
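The proposed behavior can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: the class `ToyMapOutputTracker` and the config key `spark.shuffle.markFileLostOnExecutorLost` are hypothetical names made up for illustration, since the description does not name the configuration. It assumes the flag defaults to true, so the current behavior is kept unless the user opts out.

```python
from dataclasses import dataclass, field

# Hypothetical config key, invented for this sketch; the issue does not name one.
MARK_LOST_KEY = "spark.shuffle.markFileLostOnExecutorLost"


@dataclass
class ToyMapOutputTracker:
    """Toy stand-in for a scheduler's record of where map outputs were written."""
    conf: dict = field(default_factory=dict)
    # (shuffle_id, map_id) -> id of the executor that produced the output
    outputs: dict = field(default_factory=dict)

    def register(self, shuffle_id, map_id, executor_id):
        self.outputs[(shuffle_id, map_id)] = executor_id

    def on_executor_lost(self, executor_id):
        """Return the map outputs marked lost, i.e. those needing recomputation."""
        # Assumed default is True: shuffle files are still marked lost.
        if not self.conf.get(MARK_LOST_KEY, True):
            # Remote shuffle service: the data lives elsewhere, keep everything.
            return []
        lost = [key for key, owner in self.outputs.items() if owner == executor_id]
        for key in lost:
            del self.outputs[key]
        return lost


# Default behavior: losing exec-1 invalidates the output it wrote.
default_tracker = ToyMapOutputTracker()
default_tracker.register(0, 0, "exec-1")
default_tracker.register(0, 1, "exec-2")
print(default_tracker.on_executor_lost("exec-1"))  # [(0, 0)]

# Opt-out for a remote shuffle service: nothing is marked lost.
remote_tracker = ToyMapOutputTracker(conf={MARK_LOST_KEY: False})
remote_tracker.register(0, 0, "exec-1")
print(remote_tracker.on_executor_lost("exec-1"))  # []
```

With the flag off, the scheduler in this toy model would not resubmit map tasks after an executor loss, which is the saving the issue is after when the shuffle data is held by a separate service.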

            People

            • Assignee: Unassigned
            • Reporter: bobyangbo BoYang
            • Votes: 0
            • Watchers: 2
