SPARK-32901

UnsafeExternalSorter may cause a SparkOutOfMemoryError to be thrown while spilling

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.7, 3.0.1
    • Fix Version/s: 2.4.8, 3.0.2, 3.1.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      Consider the following sequence of events:

      1. UnsafeExternalSorter runs out of space in its pointer array and attempts to allocate a large array to replace the current one.
      2. TaskMemoryManager tries to allocate the memory backing the large array using MemoryManager, but MemoryManager grants most, but not all, of the requested memory.
      3. TaskMemoryManager asks UnsafeExternalSorter to spill, which causes UnsafeExternalSorter to spill the current run to disk, to free its record pages and to reset its UnsafeInMemorySorter.
      4. UnsafeInMemorySorter frees its pointer array, and tries to allocate a new small pointer array.
      5. TaskMemoryManager tries to allocate the memory backing the small array using MemoryManager, but MemoryManager is unwilling to give it any memory, as the TaskMemoryManager is still holding on to the memory it got for the large array.
      6. TaskMemoryManager again asks UnsafeExternalSorter to spill, but this time there is nothing to spill.
      7. UnsafeInMemorySorter receives less memory than it requested and throws a SparkOutOfMemoryError, which causes the current task to fail.
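
The acquire, spill, and retry pattern in steps 2, 3, 6 and 7 can be sketched with a toy model. This is only an illustration under stated assumptions: ToyPool, ToySpillable and ToyTaskMemory are hypothetical names, not Spark classes, and IllegalStateException stands in for SparkOutOfMemoryError so that the example has no dependencies.

```java
// Toy model only: the names below are hypothetical and are not Spark's classes.
final class ToyPool {
    private final long capacity;
    private long used;

    ToyPool(long capacity, long alreadyUsed) {
        this.capacity = capacity;
        this.used = alreadyUsed;
    }

    /** Grants as much of the request as is currently free, possibly less or zero. */
    long acquire(long requested) {
        long granted = Math.max(0, Math.min(requested, capacity - used));
        used += granted;
        return granted;
    }

    void release(long bytes) {
        used = Math.max(0, used - bytes);
    }
}

/** A consumer that can be asked to give memory back by spilling. */
interface ToySpillable {
    /** Returns the number of bytes actually released to the pool. */
    long spill(long bytesNeeded);
}

final class ToyTaskMemory {
    private final ToyPool pool;

    ToyTaskMemory(ToyPool pool) {
        this.pool = pool;
    }

    /** Tries the pool first, then asks the consumer to spill and retries once. */
    long acquire(long required, ToySpillable consumer) {
        long got = pool.acquire(required);
        if (got < required) {
            long released = consumer.spill(required - got);
            if (released > 0) {
                got += pool.acquire(required - got);
            }
        }
        return got; // may still be less than required
    }

    /** Mirrors step 7: a short grant is handed back and reported as an error. */
    long acquireOrFail(long required, ToySpillable consumer) {
        long got = acquire(required, consumer);
        if (got < required) {
            pool.release(got);
            throw new IllegalStateException(
                "Unable to acquire " + required + " bytes, got " + got);
        }
        return got;
    }
}
```

In this model, a consumer whose spill callback returns 0 (nothing left to spill, as in step 6) leaves the caller with a short grant, and acquireOrFail turns that into an exception, which corresponds to the task failure in step 7.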

      A simple way to fix this is to avoid allocating a new array in UnsafeInMemorySorter.reset() and to allocate it on demand instead.
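
A minimal sketch of the on-demand idea, assuming a simplified sorter that stores plain long values: reset() only drops the array, and the next insert allocates a fresh one. ToyAllocator and ToySorter are hypothetical names used for illustration; this is not the actual patch and does not reflect the real UnsafeInMemorySorter API.

```java
// Illustrative sketch; ToyAllocator and ToySorter are hypothetical, not Spark's API.
interface ToyAllocator {
    /** Returns an array of the given length, or throws if memory is unavailable. */
    long[] allocate(int length);
}

final class ToySorter {
    private final ToyAllocator allocator;
    private final int initialSize; // assumed to be at least 1
    private long[] array;          // null between a reset and the next insert
    private int count;

    ToySorter(ToyAllocator allocator, int initialSize) {
        this.allocator = allocator;
        this.initialSize = initialSize;
        this.array = allocator.allocate(initialSize);
    }

    /** Called while spilling: drops the array but allocates nothing. */
    void reset() {
        array = null;
        count = 0;
    }

    /** Allocates or grows the array on demand, then stores the value. */
    void insert(long value) {
        if (array == null) {
            array = allocator.allocate(initialSize);
        } else if (count == array.length) {
            long[] bigger = allocator.allocate(array.length * 2);
            System.arraycopy(array, 0, bigger, 0, count);
            array = bigger;
        }
        array[count++] = value;
    }

    int size() {
        return count;
    }

    boolean hasArray() {
        return array != null;
    }
}
```

Because reset() never calls the allocator, it cannot fail for lack of memory in the middle of a spill. Any allocation failure moves to the next insert, where the normal acquire-and-spill path applies.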

            People

            • Assignee: Tom van Bussel (tomvanbussel)
            • Reporter: Tom van Bussel (tomvanbussel)
