SPARK-4480: Avoid many small spills in external data structures

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.1.1, 1.2.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      The following output is provided by shenh062326 in SPARK-4380.

      14/11/13 19:20:43 INFO collection.ExternalSorter: Thread 60 spilling in-memory batch of 4792 B to disk (292769 spills so far)
      14/11/13 19:20:43 INFO collection.ExternalSorter: Thread 60 spilling in-memory batch of 4760 B to disk (292770 spills so far)
      14/11/13 19:20:43 INFO collection.ExternalSorter: Thread 60 spilling in-memory batch of 4520 B to disk (292771 spills so far)
      14/11/13 19:20:43 INFO collection.ExternalSorter: Thread 60 spilling in-memory batch of 4560 B to disk (292772 spills so far)
      14/11/13 19:20:43 INFO collection.ExternalSorter: Thread 60 spilling in-memory batch of 4792 B to disk (292773 spills so far)
      14/11/13 19:20:43 INFO collection.ExternalSorter: Thread 60 spilling in-memory batch of 4784 B to disk (292774 spills so far)
      

      Spilling many small files has two consequences. First, it can cause "too many open files" exceptions, as observed in SPARK-3633. Second, it degrades performance. We have seen slight performance regressions from 1.0.2 to 1.1.0, and this is likely the cause.
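      To make "many small files" concrete, the log excerpt above can be summarized with a small helper. This is only an illustrative sketch and not part of Spark: the object name `SpillLogSummary` and its methods are hypothetical, and it assumes sizes are logged in plain bytes ("B") as in the excerpt, skipping lines that use any other unit.

```scala
// Illustrative helper, not part of Spark. Assumes batch sizes are logged in
// plain bytes ("B"), as in the excerpt; lines with other units are skipped.
object SpillLogSummary {
  // Unanchored so the pattern can match in the middle of a full log line.
  private val SpillLine =
    """spilling in-memory batch of (\d+) B to disk \((\d+) spills so far\)""".r.unanchored

  /** Returns (batch size in bytes, spills so far) for each matching log line. */
  def parse(lines: Seq[String]): Seq[(Long, Long)] =
    lines.collect { case SpillLine(bytes, count) => (bytes.toLong, count.toLong) }

  /** Mean batch size in bytes, or None if no line matched. */
  def meanBatchBytes(lines: Seq[String]): Option[Double] = {
    val sizes = parse(lines).map(_._1)
    if (sizes.isEmpty) None else Some(sizes.sum.toDouble / sizes.size)
  }
}
```

      Applied to the six lines quoted above, the mean batch size comes out at roughly 4,700 bytes per spill, after more than 292,000 spills on a single thread.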

      Note that this is spun off from SPARK-4452, whose fix involves a bigger change in the way we track shuffle memory. This issue is smaller in scope: it only ensures that we do not spill constantly, regardless of the policy used to track shuffle memory.
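      One way to avoid constant spilling independently of the memory-tracking policy is to enforce a floor on how much data a buffer accumulates before it may spill. The sketch below illustrates that idea only; it is not necessarily the change that resolved this issue. The class `SpillPolicy`, the parameter `minSpillBytes`, and the callback `requestMemory` (standing in for whatever shuffle memory tracker is in use) are all hypothetical names.

```scala
// Hypothetical sketch: names and behavior are illustrative, not Spark's API.
// `requestMemory` stands in for any shuffle memory tracker: it receives the
// number of bytes wanted and returns the number of bytes actually granted.
final class SpillPolicy(minSpillBytes: Long, requestMemory: Long => Long) {
  // Bytes the buffer may hold before it must ask for more. It starts at the
  // floor, so a buffer smaller than `minSpillBytes` never spills.
  private var threshold: Long = minSpillBytes

  /** Returns true if a buffer currently holding `currentBytes` should spill. */
  def shouldSpill(currentBytes: Long): Boolean = {
    if (currentBytes < threshold) {
      false
    } else {
      // Try to grow the allowance to twice the current size before spilling.
      val wanted = 2 * currentBytes - threshold
      threshold += requestMemory(wanted)
      currentBytes >= threshold
    }
  }

  /** Call after a spill: the allowance falls back to the floor, not to zero. */
  def reset(): Unit = {
    threshold = minSpillBytes
  }
}
```

      Even if the tracker grants nothing, for example `new SpillPolicy(5L * 1024 * 1024, _ => 0L)`, a 4792-byte batch like those in the log would not be spilled under this sketch; the buffer would have to reach the 5 MB floor first.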

            People

            • Assignee:
              andrewor14 Andrew Or
            • Reporter:
              andrewor14 Andrew Or
            • Votes:
              0
            • Watchers:
              5