Spark / SPARK-8368

ClassNotFoundException in closure for map

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.1, 1.5.0
    • Component/s: SQL
    • Labels: None
    • Environment: CentOS 6.5, Java 1.7.0_67, Scala 2.10.4. The project was built on Windows 7 and run on CentOS 6.x, either on a Spark standalone cluster or in local mode.

      Description

      After upgrading the cluster from Spark 1.3.0 to 1.4.0 (rc4), I encountered the following exception:
      ======begin exception========

      Exception in thread "main" java.lang.ClassNotFoundException: com.yhd.ycache.magic.Model$$anonfun$9$$anonfun$10
      at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      at java.lang.Class.forName0(Native Method)
      at java.lang.Class.forName(Class.java:278)
      at org.apache.spark.util.InnerClosureFinder$$anon$4.visitMethodInsn(ClosureCleaner.scala:455)
      at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.accept(Unknown Source)
      at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.accept(Unknown Source)
      at org.apache.spark.util.ClosureCleaner$.getInnerClosureClasses(ClosureCleaner.scala:101)
      at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:197)
      at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:132)
      at org.apache.spark.SparkContext.clean(SparkContext.scala:1891)
      at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:294)
      at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:293)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
      at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
      at org.apache.spark.rdd.RDD.map(RDD.scala:293)
      at org.apache.spark.sql.DataFrame.map(DataFrame.scala:1210)
      at com.yhd.ycache.magic.Model$.main(SSExample.scala:239)
      at com.yhd.ycache.magic.Model.main(SSExample.scala)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
      at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
      at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

      ===============end exception===========

      I simplified the code that causes this issue, as follows:
      ==========begin code==================

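      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.mllib.linalg.Vectors
      import org.apache.spark.mllib.regression.LabeledPoint
      import org.apache.spark.sql.hive.HiveContext

      object Model extends Serializable {
        def main(args: Array[String]) {
          // The only argument is the Hive SQL query to run
          val Array(sql) = args
          val sparkConf = new SparkConf().setAppName("Mode Example")
          val sc = new SparkContext(sparkConf)
          val hive = new HiveContext(sc)
          // Get data by Hive SQL
          val rows = hive.sql(sql)

          // The exception is thrown here, while Spark cleans the closure passed to map
          val data = rows.map(r => {
            val arr = r.toSeq.toArray
            val label = 1.0
            // Unused nested function, kept as in the original reproduction
            def fmap = (input: Any) => 1.0
            val feature = arr.map(_ => 1.0)
            LabeledPoint(label, Vectors.dense(feature))
          })

          data.count()
        }
      }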
      

      =====end code===========
      This code runs fine in spark-shell, but fails when it is submitted to a Spark cluster (standalone or local mode). I tried the same code on Spark 1.3.0 (local mode) and encountered no exception.
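
The frames `java.lang.Class.forName` and `sun.misc.Launcher$AppClassLoader.loadClass` in the trace above show the nested closure class being looked up through the application class loader. One way to narrow this down might be to check which loaders can actually see a given class at the point of failure. The following is only a diagnostic sketch, not a fix: `LoaderCheck` and `canLoad` are hypothetical names introduced here, and the class names passed in `main` are placeholders rather than anything taken from Spark.

```scala
object LoaderCheck {
  // Returns true if `name` resolves through `loader`, without initializing the class.
  // Hypothetical helper for diagnosis only; it is not part of Spark.
  def canLoad(name: String, loader: ClassLoader): Boolean =
    try {
      Class.forName(name, false, loader)
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  def main(args: Array[String]): Unit = {
    // Loader attached to the current thread, and loader that defined this object
    val contextLoader = Thread.currentThread().getContextClassLoader
    val definingLoader = getClass.getClassLoader

    println(s"context loader:  $contextLoader")
    println(s"defining loader: $definingLoader")

    // Placeholder class names; substitute the class reported in the exception
    for (name <- Seq("java.lang.String", "com.example.DoesNotExist")) {
      println(s"$name via context loader:  ${canLoad(name, contextLoader)}")
      println(s"$name via defining loader: ${canLoad(name, definingLoader)}")
    }
  }
}
```

If the two loaders disagree about the class named in the exception, that would suggest the lookup is going through a loader that does not see the application jar. This report does not confirm that as the cause.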

              People

              • Assignee: Yin Huai (yhuai)
              • Reporter: CHEN Zhiwei (zwChan)
              • Shepherd: Andrew Or
              • Votes: 0
              • Watchers: 8
