Description
Often the driver/master node has less RAM allocated than the worker nodes.
If the user runs take or first on a cached RDD, the task can be launched locally on the master. The master then attempts to put the first block (or first few blocks) in memory, which can lead to an OOM on the master.
Perhaps the simplest solution is to not put blocks in memory on the master node when Spark is running in cluster mode.
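A toy sketch of the proposed policy may make the intent clearer. It is plain Python, not Spark code: ToyBlockStore, should_cache and compute_or_get are hypothetical names invented for illustration and do not correspond to Spark's actual block manager API. The assumption is that each node knows whether it is the driver and whether the application runs in cluster mode.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlockStore:
    """Hypothetical stand-in for a node's in-memory block store (not Spark's API)."""
    is_driver: bool
    cluster_mode: bool
    blocks: dict = field(default_factory=dict)

    def should_cache(self) -> bool:
        # Proposed policy: skip in-memory caching only on the driver in cluster mode.
        return not (self.is_driver and self.cluster_mode)

    def compute_or_get(self, block_id, compute):
        # Return a cached block if present; otherwise compute it.
        if block_id in self.blocks:
            return self.blocks[block_id]
        data = compute()
        # Store the block only when the policy allows it on this node.
        if self.should_cache():
            self.blocks[block_id] = data
        return data
```

Under this sketch, a task that runs locally on the driver in cluster mode still returns its result but leaves the driver's store empty, while workers and local-mode runs cache as before.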
See the related discussion on the mailing list: https://groups.google.com/forum/#!topic/spark-users/eu9RJc3nQng