SPARK-17547

Temporary shuffle data files may be leaked following exception in write

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.3, 1.6.0, 2.0.0
    • Fix Version/s: 1.6.3, 2.0.1, 2.1.0
    • Component/s: Shuffle
    • Labels:
      None

      Description

      SPARK-8029 modified shuffle writers to first stage their data to a temporary file in the same directory as the final destination file and then to atomically rename the file at the end of the write job. However, this change introduced the potential for the temporary output file to be leaked if an exception occurs during the write because the shuffle writers' existing error cleanup code doesn't handle this new temp file.

      This is easy to fix: we just need to add a finally block so that the temporary file is always either moved or deleted before exiting the shuffle write method.
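
      A minimal sketch of the pattern described above, in plain Java NIO rather than Spark's actual shuffle writer code. The class name, the writeThenCommit method and the WriteBody callback are hypothetical names chosen for illustration; they are not Spark APIs. The sketch stages output in a temp file next to the destination, renames it on success, and uses a finally block so the temp file never outlives the call.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not taken from the Spark code base.
final class TempFileShuffleWriteSketch {

    /** Hypothetical callback that produces the output bytes and may throw. */
    @FunctionalInterface
    interface WriteBody {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Writes to a temp file in the same directory as finalFile, then renames it.
     * The finally block ensures the temp file is either moved or deleted.
     */
    static void writeThenCommit(Path finalFile, WriteBody body) throws IOException {
        // Stage in the destination directory so the rename stays on one file system.
        Path dir = finalFile.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, finalFile.getFileName().toString(), ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                body.writeTo(out); // may throw part-way through the write
            }
            // Commit: after a successful move, tmp no longer exists.
            Files.move(tmp, finalFile, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; removes the leftover file on any failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

      With this shape, a failure inside the callback still propagates to the caller, but the directory is left without a stray temp file. One caveat of the sketch: if the cleanup itself throws, that exception replaces the original one, so production code would likely log or suppress cleanup failures instead.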

            People

            • Assignee:
              joshrosen Josh Rosen
              Reporter:
              joshrosen Josh Rosen
            • Votes:
              0
              Watchers:
              3
