Spark / SPARK-1042

Spark cleans all Java broadcast variables when the spark.cleaner.ttl is reached

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.8.1, 0.9.0
    • Fix Version/s: 0.9.2
    • Component/s: Java API, Spark Core
    • Labels:

      Description

      When spark.cleaner.ttl is set, Spark performs the cleanup on time, but it cleans all broadcast variables, not just the ones that are older than the TTL. The next mapPartitions call then throws an exception because it cannot find its broadcast variable, even when that variable was created immediately before the task ran.

      Our temporary workaround is to leave the TTL unset and accept an ongoing memory leak, which forces a restart.

      We are using JavaSparkContext and our broadcast variables are Java HashMaps.
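
      The difference between the expected and the observed cleanup can be illustrated with a small toy model. This is only a sketch and not Spark's cleaner code: the class TtlCleanerSketch, its methods, and the numeric ids and timestamps are all hypothetical names chosen for illustration. It assumes each broadcast variable is tracked by an id and a creation time in milliseconds.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: this is NOT Spark's cleaner implementation.
// All names here are hypothetical and exist purely for illustration.
class TtlCleanerSketch {
    // Maps a hypothetical broadcast id to its creation time in milliseconds.
    private final Map<Long, Long> createdAt = new HashMap<>();

    void register(long id, long nowMillis) {
        createdAt.put(id, nowMillis);
    }

    boolean contains(long id) {
        return createdAt.containsKey(id);
    }

    // Expected behavior: drop only the entries that are older than the TTL.
    void cleanOlderThan(long ttlMillis, long nowMillis) {
        long threshold = nowMillis - ttlMillis;
        createdAt.values().removeIf(created -> created < threshold);
    }

    // Behavior described in this report: every entry is dropped, regardless of age.
    void cleanAll() {
        createdAt.clear();
    }

    public static void main(String[] args) {
        TtlCleanerSketch cleaner = new TtlCleanerSketch();
        cleaner.register(1L, 0L);      // old entry
        cleaner.register(2L, 9_000L);  // created shortly before the cleanup
        cleaner.cleanOlderThan(5_000L, 10_000L);
        System.out.println(cleaner.contains(1L)); // false: older than the TTL
        System.out.println(cleaner.contains(2L)); // true: still within the TTL
    }
}
```

      In this model, the report corresponds to cleanAll() running where cleanOlderThan() was expected, so the entry created just before the cleanup disappears along with the old one.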

            People

            • Assignee:
              qqsun8819 OuyangJin
              Reporter:
              sliwo Tal Sliwowicz
            • Votes:
              0
              Watchers:
              6

              Dates

              • Created:
                Updated:
                Resolved: