Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version/s: 3.1.1-incubating
Component/s: None
Description
We should add a big warning regarding the two storage level settings, especially since we have MEMORY_ONLY as the default value. What is the impact of changing any of these settings? What is kept in memory, and what is not? How much memory will my executors need when I use one storage level or another?
Currently these two settings mean nothing to the average user (you will probably only know how to use them if you have a deep understanding of how Spark works). For the average user we should probably also consider making DISK_ONLY (or perhaps MEMORY_AND_DISK?) the default. A short sketch of what the levels mean in plain Spark terms follows below.
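To make the trade-off concrete for the documentation, here is a minimal sketch of what the two storage levels mean in plain Spark terms. The SparkContext setup and the input path are hypothetical, and the project's actual configuration keys are intentionally not shown; this only illustrates the Spark StorageLevel semantics behind them.
{code:scala}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical standalone setup for illustration only.
val conf = new SparkConf().setAppName("storage-level-demo").setMaster("local[*]")
val sc   = new SparkContext(conf)

val events = sc.textFile("hdfs:///tmp/events")   // hypothetical input path

// MEMORY_ONLY (the current default): partitions are kept deserialized in
// executor memory; anything that does not fit is dropped and recomputed later.
events.persist(StorageLevel.MEMORY_ONLY)

// MEMORY_AND_DISK (one proposed alternative): partitions that do not fit in
// memory are spilled to the executors' local disks instead of being recomputed.
// An RDD's storage level can only be assigned once, so this call would be used
// instead of, not in addition to, the one above:
// events.persist(StorageLevel.MEMORY_AND_DISK)

events.count()   // triggers evaluation, so the chosen storage level takes effect
{code}
In short, MEMORY_ONLY favors speed but can silently cost recomputation (or fail with OOM-pressure symptoms) when executor memory is tight, while MEMORY_AND_DISK and DISK_ONLY trade read speed for predictable memory usage, which is the point the warning in the docs should make.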