Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-34601

Do not delete shuffle file on executor lost event when using remote shuffle service

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Resolved
    • Major
    • Resolution: Won't Fix
    • 3.2.0
    • None
    • Shuffle

    Description

      There are multiple work going on with disaggregated/remote shuffle service (e.g. LinkedIn shuffle, Facebook shuffle service, Uber shuffle service). Such remote shuffle service is not Spark External Shuffle Service. It could be third party shuffle solution and user uses it by setting spark.shuffle.manager. In those systems, shuffle data will be stored on different server other than executor. Spark should not mark shuffle data lost when the executor is lost. We could add a Spark configuration to control this behavior. By default, Spark still mark shuffle file lost. For disaggregated/remote shuffle service, people could set the configure to not mark shuffle file lost.

      Attachments

        Activity

          People

            Unassigned Unassigned
            bobyangbo BoYang
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: