[SPARK-3466] Limit size of results that a driver collects for each action

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Right now, operations like collect() and take() can crash the driver with an out-of-memory error (OOM) if they bring back too much data. We should add a spark.driver.maxResultSize setting (or something like that) that makes the driver abort a job if its result is too big. The default could be some fraction of the driver's memory, or a fixed value such as 100 MB.
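
      To make the proposal concrete, here is a minimal sketch of the kind of driver-side check described above. It is plain Python, not Spark's actual implementation: the names collect_with_limit, parse_size, and ResultTooLargeError are hypothetical, and pickle stands in for whatever serialization the tasks would use. The idea is that the driver adds up the serialized size of each partition's result as it arrives and aborts as soon as the running total exceeds the configured limit, instead of running out of memory later.

```python
import pickle


class ResultTooLargeError(Exception):
    """Hypothetical error raised when collected results exceed the limit."""


def parse_size(text):
    """Convert a size string such as '100m' or '1g' into a number of bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)  # no suffix: treat the value as plain bytes


def collect_with_limit(partitions, max_result_size):
    """Gather partition results, aborting once their serialized size is too big.

    `partitions` is an iterable of lists standing in for per-task results;
    `max_result_size` plays the role of the proposed maxResultSize setting.
    """
    limit = parse_size(max_result_size)
    total_bytes = 0
    results = []
    for index, partition in enumerate(partitions):
        serialized = pickle.dumps(partition)  # stand-in for a task result
        total_bytes += len(serialized)
        if total_bytes > limit:
            # Abort before deserializing or storing any more data.
            raise ResultTooLargeError(
                f"results of {index + 1} partitions ({total_bytes} bytes) "
                f"exceed the limit of {limit} bytes"
            )
        results.extend(pickle.loads(serialized))
    return results
```

      With a generous limit the call behaves like an ordinary collect; with a limit smaller than the data it raises the error instead of exhausting memory. Checking the size before deserializing is the design choice that matters here, since the serialized length is cheap to measure.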

          People

            Assignee: Davies Liu (davies)
            Reporter: Matei Alexandru Zaharia (matei)