Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-25081

Nested spill in ShuffleExternalSorter may access a released memory page

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • 1.6.0, 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2
    • 2.2.3, 2.3.3, 2.4.0
    • Spark Core

    Description

      This issue is pretty similar to SPARK-21907.
      "allocateArray" in ShuffleInMemorySorter.reset may trigger a spill and cause ShuffleInMemorySorter access the released `array`. Another task may get the same memory page from the pool. This will cause two tasks access the same memory page. When a task reads memory written by another task, many types of failures may happen. Here are some examples I have seen:

      • JVM crash. (This is easy to reproduce in a unit test as we fill newly allocated and deallocated memory with 0xa5 and 0x5a bytes which usually points to an invalid memory address)
      • java.lang.IllegalArgumentException: Comparison method violates its general contract!
      • java.lang.NullPointerException at org.apache.spark.memory.TaskMemoryManager.getPage(TaskMemoryManager.java:384)
      • java.lang.UnsupportedOperationException: Cannot grow BufferHolder by size -536870912 because the size after growing exceeds size limitation 2147483632

      Attachments

        Issue Links

          Activity

            People

              zsxwing Shixiong Zhu
              zsxwing Shixiong Zhu
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: