  1. Spark
  2. SPARK-10000

Consolidate storage and execution memory management

    Details

    • Type: Story
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: Block Manager, Spark Core
    • Labels: None
    • Target Version/s:

      Description

      Memory management in Spark is currently broken down into two disjoint regions: one for execution and one for storage. The sizes of these regions are statically configured and fixed for the duration of the application.

      There are several limitations to this approach. It requires user expertise to avoid unnecessary spilling, and there are no sensible defaults that will work for all workloads. As a Spark user, I want Spark to manage the memory more intelligently so I do not need to worry about how to statically partition the execution (shuffle) memory fraction and cache memory fraction. More importantly, applications that do not use caching use only a small fraction of the heap space, resulting in suboptimal performance.
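The cost of a fixed split can be made concrete with a little arithmetic. The following is a minimal sketch, not Spark code: the function name `static_regions`, the 1000 MB heap, and the 60/20 percentages are all illustrative assumptions rather than values taken from this ticket.

```python
def static_regions(heap_mb, storage_pct, execution_pct):
    """Split a heap into two fixed, disjoint regions (hypothetical helper).

    Percentages are integers so the arithmetic stays exact.
    """
    if storage_pct + execution_pct > 100:
        raise ValueError("regions cannot exceed the heap")
    return {
        "storage": heap_mb * storage_pct // 100,
        "execution": heap_mb * execution_pct // 100,
    }


# Assumed numbers: a 1000 MB heap split 60% storage / 20% execution.
regions = static_regions(1000, 60, 20)

# An application that caches nothing can still use only the execution
# region; the whole storage region sits idle for the life of the job.
idle_mb = regions["storage"]
usable_mb = regions["execution"]
```

Under these assumed numbers, a job that never caches is limited to 200 MB of execution memory while 600 MB of storage memory goes unused, which is the suboptimal case the description refers to.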

      Instead, we should unify these two regions and let each borrow from the other when possible.
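One way to picture the proposal is a single pool that both kinds of allocation draw from. The sketch below is only an illustration under stated assumptions: the class `UnifiedMemoryPool` and its methods are hypothetical, and the borrowing policy (storage takes only free memory, execution may evict cached data) is an assumption chosen for the example, since the ticket does not specify one.

```python
class UnifiedMemoryPool:
    """Hypothetical single pool shared by execution and storage."""

    def __init__(self, total):
        self.total = total
        self.execution_used = 0
        self.storage_used = 0

    @property
    def free(self):
        return self.total - self.execution_used - self.storage_used

    def acquire_storage(self, n):
        """Grant up to n units from whatever is currently free."""
        granted = min(n, self.free)
        self.storage_used += granted
        return granted

    def acquire_execution(self, n):
        """Grant up to n units, evicting cached data if free memory
        is not enough (assumed policy, not taken from the ticket)."""
        shortfall = max(0, n - self.free)
        evicted = min(shortfall, self.storage_used)
        self.storage_used -= evicted
        granted = min(n, self.free)
        self.execution_used += granted
        return granted


pool = UnifiedMemoryPool(1000)
cached = pool.acquire_storage(800)      # storage borrows most of the pool
working = pool.acquire_execution(500)   # execution reclaims part of it
```

With no caching at all, `acquire_execution` in this sketch could be granted the entire pool, which is the behavior the fixed split rules out.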

      There are no Sub-Tasks for this issue.

        Activity

        bowenzhangusa Bowen Zhang added a comment -

        Reynold Xin, I am very interested in this new story and am trying to understand it. Beyond consolidating these two memory regions into one, are there other tricky aspects that could pose a challenge, or other use cases that should be taken into account?

        rxin Reynold Xin added a comment -

        Not much, unless you can think of something. It might regress behavior for users who expect their workloads to behave exactly as they do in current Spark.

        bowenzhangusa Bowen Zhang added a comment -

        Reynold Xin, I am interested in this ticket. Can you assign it to me?

        rxin Reynold Xin added a comment -

        Bowen Zhang, thanks for the interest. This task is significant and involves substantial refactoring of internals, so it might be hard for somebody less familiar with Spark to just pick up. However, we will post a design doc soon and try to break this down into multiple tasks. Please follow the ticket and see if you can help contribute to some of them. Thanks!

        bowenzhangusa Bowen Zhang added a comment -

        Reynold Xin, sounds good.


          People

          • Assignee: andrewor14 Andrew Or
          • Reporter: rxin Reynold Xin
          • Votes: 2
          • Watchers: 31
