Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 0.8.0, 0.8.1, 0.9.0
Description
When spark.cleaner.ttl is set, Spark performs the cleanup on schedule, but it cleans up all broadcast variables rather than only those older than the TTL. The next mapPartitions task then fails with an exception because it cannot find its broadcast variable, even when the variable was created immediately before the task ran.
Our temporary workaround is to leave the TTL unset and accept an ongoing memory leak, which eventually forces a restart.
We are using JavaSparkContext, and our broadcast variables are Java HashMaps.
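A minimal sketch of the scenario described above, assuming a hypothetical job (class name, key/value contents, and the 300-second TTL are illustrative, and the exact mapPartitions signature varies across the affected versions). The broadcast is created right before the action, yet once a cleanup tick fires, tasks fail because the broadcast block has been removed regardless of its age:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.broadcast.Broadcast;

public class TtlBroadcastRepro {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("ttl-broadcast-repro")
                .set("spark.cleaner.ttl", "300"); // periodic cleanup interval, in seconds
        JavaSparkContext sc = new JavaSparkContext(conf);

        HashMap<String, String> lookup = new HashMap<String, String>();
        lookup.put("key", "value");

        // Broadcast created immediately before the task runs...
        final Broadcast<HashMap<String, String>> bc = sc.broadcast(lookup);

        // ...yet after a cleanup tick, the task below fails because the
        // broadcast block is gone, even though it is younger than the TTL.
        List<String> out = sc.parallelize(Arrays.asList("key"))
                .mapPartitions(new FlatMapFunction<Iterator<String>, String>() {
                    public Iterable<String> call(Iterator<String> it) {
                        List<String> result = new ArrayList<String>();
                        while (it.hasNext()) {
                            // throws if the broadcast has already been cleaned
                            result.add(bc.value().get(it.next()));
                        }
                        return result;
                    }
                })
                .collect();
        sc.stop();
    }
}
```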