SPARK-10000: Consolidate storage and execution memory management


Details

    • Type: Story
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: Block Manager, Spark Core
    • Labels: None

    Description

      Memory management in Spark is currently broken down into two disjoint regions: one for execution and one for storage. The sizes of these regions are statically configured and fixed for the duration of the application.
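
      Before unification, this split is set through two static properties. As a minimal sketch (the fractions below are the documented pre-1.6 defaults, shown purely for illustration):

          // Legacy static memory management (Spark <= 1.5): two fixed, disjoint regions.
          import org.apache.spark.SparkConf

          val conf = new SparkConf()
            .set("spark.storage.memoryFraction", "0.6") // heap share reserved for cached blocks
            .set("spark.shuffle.memoryFraction", "0.2") // heap share reserved for execution (shuffles, joins, sorts, aggregations)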

      There are several limitations to this approach. It requires user expertise to avoid unnecessary spilling, and there are no sensible defaults that will work for all workloads. As a Spark user, I want Spark to manage the memory more intelligently so I do not need to worry about how to statically partition the execution (shuffle) memory fraction and cache memory fraction. More importantly, applications that do not use caching can use only a small fraction of the heap space, resulting in suboptimal performance.
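
      To make the cost concrete: under the legacy defaults, a job that never caches anything is still limited to roughly spark.shuffle.memoryFraction × spark.shuffle.safetyFraction = 0.2 × 0.8 = 16% of the heap for execution, while the 60% storage region sits idle (a rough illustration assuming the documented pre-1.6 defaults).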

      Instead, we should unify these two regions and let one borrow from the other when possible.
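
      One way to sketch the borrowing semantics (hypothetical names and logic; an illustrative model, not the eventual implementation): both kinds of memory draw from a single pool, storage may grow into memory that execution is not using, and execution may reclaim space by evicting cached blocks, but never below a protected storage floor.

          // Hypothetical unified pool with asymmetric borrowing; illustrative only.
          class UnifiedPool(maxMemory: Long, storageFloor: Long) {
            private var executionUsed = 0L
            private var storageUsed = 0L

            /** Execution may evict cached blocks to grow, but never below the storage floor. */
            def acquireExecution(bytes: Long): Long = synchronized {
              val free = maxMemory - executionUsed - storageUsed
              val evictable = math.max(0L, storageUsed - storageFloor)
              val granted = math.min(bytes, free + evictable)
              if (granted > free) storageUsed -= granted - free // evict cached blocks
              executionUsed += granted
              granted // may be less than requested; the caller spills the remainder
            }

            /** Storage may only borrow memory that execution is not currently using. */
            def acquireStorage(bytes: Long): Boolean = synchronized {
              val free = maxMemory - executionUsed - storageUsed
              if (bytes <= free) { storageUsed += bytes; true } else false
            }
          }

      The asymmetry in this sketch is deliberate: evicted cached blocks can be recomputed or reread from disk, whereas forcibly reclaiming memory from a task mid-computation would be far more disruptive, so storage never evicts execution.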

      Sub-Tasks

        There are no Sub-Tasks for this issue.


          People

            Assignee: Andrew Or (andrewor14)
            Reporter: Reynold Xin (rxin)
            Votes: 2
            Watchers: 34

            Dates

              Created:
              Updated:
              Resolved:
