Spark / SPARK-1866

Closure cleaner does not null shadowed fields when outer scope is referenced

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels:
      None

      Description

      Take the following example:

      val x = 5
      val instances = new org.apache.hadoop.fs.Path("/") /* non-serializable */
      sc.parallelize(0 until 10).map { _ =>
        val instances = 3
        (instances, x)
      }.collect
      

      This produces a "java.io.NotSerializableException: org.apache.hadoop.fs.Path", even though the outer instances is never actually used within the closure. If you rename the outer variable instances to anything else, the code executes correctly, which indicates that the name collision between the two variables is what causes the issue.

      Additionally, if the outer scope is not referenced at all (i.e., we drop the reference to "x" in the example above), the issue does not appear.
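      Since the failure only occurs when the two names collide, renaming the inner variable works around the bug until the cleaner handles shadowed fields. A minimal sketch, assuming the same sc and the same outer scope as above (the name "count" is just an illustrative choice):

      val x = 5
      val instances = new org.apache.hadoop.fs.Path("/") // non-serializable, never used in the closure
      sc.parallelize(0 until 10).map { _ =>
        val count = 3 // renamed from `instances`, so it no longer shadows the outer field
        (count, x)
      }.collect
      

      With the shadowing removed, the closure cleaner can null out the unused outer field as usual, so the Path instance is never pulled into the serialized task.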

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
            • Reporter:
              ilikerps (Aaron Davidson)
            • Votes:
              1
            • Watchers:
              3

              Dates

              • Created:
                Updated: