Spark / SPARK-4808

Spark fails to spill with small number of large objects


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2, 1.1.0, 1.2.0, 1.2.1
    • Fix Version/s: 1.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Spillable's maybeSpill does not allow a spill to occur until at least 1000 elements have been read, and then only evaluates spilling on every 32nd element thereafter. When a small number of very large items is being tracked, out-of-memory conditions may occur.
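The gate described above can be sketched as follows. This is a minimal, simplified model of the pre-fix behavior, not the actual Spillable code; the 1000-element threshold and the every-32nd-element check come from the description, while the names and the 5 MB memory threshold are illustrative assumptions.

```scala
// Simplified sketch of the element-count gate in maybeSpill (assumed names):
// a spill is only even considered after 1000 elements have been read, and
// then only on every 32nd element, regardless of how large each element is.
object MaybeSpillSketch {
  val trackMemoryThreshold = 1000               // hard-coded element-count gate
  val myMemoryThreshold: Long = 5L * 1024 * 1024 // assumed initial memory threshold

  // Returns true if a spill would be triggered at this element count and size.
  def maybeSpill(elementsRead: Long, currentMemory: Long): Boolean =
    elementsRead > trackMemoryThreshold &&
      elementsRead % 32 == 0 &&
      currentMemory >= myMemoryThreshold

  def main(args: Array[String]): Unit = {
    // 10 elements totaling 10 GB: never spills, because elementsRead <= 1000.
    println(MaybeSpillSketch.maybeSpill(10, 10L * 1024 * 1024 * 1024)) // false
  }
}
```

With only a handful of huge objects the element count never clears the gate, so memory usage grows unchecked until the JVM runs out of heap.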

      I suspect that the 1000-element threshold and the every-32nd-element behavior were intended to reduce the cost of the estimateSize() call. That logic has since been extracted into SizeTracker, which implements its own exponential backoff for size estimation, so these gates no longer save any estimation work; they merely avoid acting on an estimated size that is already available.
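The exponential backoff mentioned above can be sketched roughly as follows. This is a toy model in the style of SizeTracker, not its actual implementation: the class and field names are invented, and the 1.1 growth rate is an assumption. The point is that the expensive full size estimate is taken at geometrically growing intervals, so the per-element cost of having an up-to-date estimate is already small.

```scala
// Hedged sketch of exponential-backoff size sampling (assumed names):
// take a costly full size estimate only when the update count reaches the
// next sample target, which grows geometrically by `growthRate`.
class BackoffSampler(growthRate: Double = 1.1) {
  private var numUpdates: Long = 0
  private var nextSampleNum: Long = 1
  var samplesTaken: Int = 0 // counts stand-ins for SizeEstimator.estimate calls

  def afterUpdate(): Unit = {
    numUpdates += 1
    if (numUpdates == nextSampleNum) {
      samplesTaken += 1 // a real tracker would re-estimate its size here
      nextSampleNum = math.ceil(numUpdates * growthRate).toLong
    }
  }
}
```

For example, after 1000 updates such a sampler has taken only a few dozen full estimates, which is why gating maybeSpill on element counts no longer buys anything.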

            People

              Assignee: mkim Mingyu Kim
              Reporter: dlawler Dennis Lawler
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated:
                Resolved: