Spark / SPARK-7542

Support off-heap sort buffer in UnsafeExternalSorter

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.6.0
    • Component/s: Spark Core
    • Labels: None

    Description

      UnsafeExternalSorter, introduced in SPARK-7081, uses on-heap long[] arrays as its sort buffers. When records are small, the sort array can be as large as the data pages themselves, so it would be useful to allocate this array off-heap (using our unsafe LongArray). Unfortunately, we can't currently do this: TimSort calls allocate() to create temporary data buffers but never calls a corresponding cleanup method to free them. That is harmless for on-heap arrays, which are garbage collected, but off-heap buffers would leak.

      We should look into extending TimSort with buffer freeing methods, then consider switching to LongArray in UnsafeShuffleSortDataFormat.
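
      The allocate-without-free problem described above can be sketched in a few lines. This is a minimal illustration, not Spark's TimSort: TrackingAllocator and MergeStep are hypothetical names, the buffers are ordinary long[] arrays standing in for an off-heap LongArray, and the allocator only counts live buffers so that a missing free() becomes visible.

```java
// Hypothetical stand-in for an off-heap allocator. It only counts live
// buffers; a real version would hand out off-heap memory instead of long[].
final class TrackingAllocator {
    private int live = 0;

    long[] allocate(int length) {
        live++;
        return new long[length];
    }

    void free(long[] buffer) {
        live--;
    }

    int liveBuffers() {
        return live;
    }
}

// Hypothetical merge step that needs a temporary buffer, in the way a
// merge-based sort such as TimSort does.
final class MergeStep {
    // Merges the adjacent sorted runs [lo, mid) and [mid, hi) of data in place.
    // freeTmp = false mimics the current behavior (allocate, never free);
    // freeTmp = true mimics the proposed buffer-freeing extension.
    static void merge(long[] data, int lo, int mid, int hi,
                      TrackingAllocator alloc, boolean freeTmp) {
        int len1 = mid - lo;
        long[] tmp = alloc.allocate(len1);
        try {
            // Copy the left run aside, then merge it back with the right run.
            System.arraycopy(data, lo, tmp, 0, len1);
            int i = 0, j = mid, k = lo;
            while (i < len1 && j < hi) {
                data[k++] = (tmp[i] <= data[j]) ? tmp[i++] : data[j++];
            }
            while (i < len1) {
                data[k++] = tmp[i++];
            }
        } finally {
            if (freeTmp) {
                alloc.free(tmp); // the cleanup call that is missing today
            }
        }
    }
}
```

      With freeTmp set to false, liveBuffers() grows by one per merge, which is the leak that would occur if the temporary buffers were off-heap; with freeTmp set to true the count returns to zero after each merge.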

            People

              Assignee: Davies Liu (davies)
              Reporter: Josh Rosen (joshrosen)
              Votes: 0
              Watchers: 6