Details
-
New Feature
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
Description
Right now, operations like collect() and take() can crash the driver with an OOM if they bring back too many data. We should add a spark.driver.maxResultSize setting (or something like that) that will make the driver abort a job if its result is too big. We can set it to some fraction of the driver's memory by default, or to something like 100 MB.