Details
Bug
Status: Closed
Major
Resolution: Fixed
0.10.0
None
None
Spark
Description
java.lang.OutOfMemoryError: Java heap space
The code has an unnecessary .collect() that forces all interaction data into the memory of the client/driver. Because the data ends up on the driver, increasing the executor memory will not help with this.
To work around it, remove the line linked below and rebuild Mahout.
https://github.com/apache/mahout/blob/mahout-0.10.x/spark/src/main/scala/org/apache/mahout/drivers/TextDelimitedReaderWriter.scala#L157
The errant line reads:
interactions.collect()
This copies all of the user action data into driver memory, which can exhaust the heap on large inputs. Removing it should let Spark keep the data distributed across the executors and manage memory itself.
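To illustrate why the call is harmful, here is a minimal sketch, not the Mahout code itself. The object name CollectSketch, the helpers parse and distinctUsers, and the comma-delimited sample data are all hypothetical; the only assumption is that interactions is an RDD of (userID, itemID) pairs. Only the aggregated result returns to the driver, while the commented-out collect() would copy every pair into the driver's heap.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CollectSketch {
  // Hypothetical stand-in for the reader: parse "user,item" lines into pairs.
  def parse(lines: RDD[String]): RDD[(String, String)] =
    lines.map(_.split(","))
      .filter(_.length >= 2)
      .map(a => (a(0).trim, a(1).trim))

  // Stays distributed: the executors do the work and only one Long
  // is sent back to the driver.
  def distinctUsers(interactions: RDD[(String, String)]): Long =
    interactions.map(_._1).distinct().count()

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("collect-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val interactions = parse(sc.parallelize(Seq("u1,i1", "u1,i2", "u2,i1")))

      // interactions.collect()  // would materialize every pair on the driver

      println(distinctUsers(interactions))
    } finally {
      sc.stop()
    }
  }
}
```

If some driver-side inspection really is needed, bounded actions such as take(n) or count() are usually a safer choice than collect(), since the amount of data returned does not grow with the input.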