Description
From the mailing list:
Since upgrading to Spark 1.4, I'm getting a scala.reflect.internal.MissingRequirementError when creating a DataFrame from an RDD. The error references a case class in the application (the RDD's type parameter), which has been verified to be present.

Items of note:
1) This is running on AWS EMR (YARN). I do not get this error running locally (standalone).
2) Reverting to Spark 1.3.1 makes the problem go away.
3) The jar file containing the referenced class (the app assembly jar) is not listed in the classpath expansion dumped in the error message.

I have seen SPARK-5281, and am guessing that this is the root cause, especially since the code added there is involved in the stack trace. That said, my grasp of Scala reflection isn't strong enough to make sense of the change and say for sure. It certainly looks, though, as if in this scenario the current thread's context classloader may not be what we think it is (given #3 above). Any ideas?

App code:

def registerTable[A <: Product : TypeTag](name: String, rdd: RDD[A])(implicit hc: HiveContext) = {
  val df = hc.createDataFrame(rdd)
  df.registerTempTable(name)
}

Stack trace:

scala.reflect.internal.MissingRequirementError: class com....MyClass in JavaMirror with sun.misc.Launcher$AppClassLoader@d16e5d6 of type class sun.misc.Launcher$AppClassLoader with classpath [lots and lots of paths and jars, but not the app assembly jar] not found
    at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
    at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:119)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:21)
    at com.ipcoop.spark.sql.SqlEnv$$typecreator1$1.apply(SqlEnv.scala:87)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:71)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:59)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28)
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410)
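A sketch of a commonly suggested workaround, assuming the root cause is indeed that the thread context classloader consulted by ScalaReflection cannot see the app assembly jar (the save/restore wrapper and the object name are illustrative, not from the report):

import scala.reflect.runtime.universe.TypeTag
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.hive.HiveContext

object SqlEnvWorkaround {
  def registerTable[A <: Product : TypeTag](name: String, rdd: RDD[A])(implicit hc: HiveContext): Unit = {
    // Remember the loader currently installed on this thread.
    val previous = Thread.currentThread().getContextClassLoader
    try {
      // Temporarily point the context classloader at the loader that
      // loaded the application classes, which can see the assembly jar.
      Thread.currentThread().setContextClassLoader(getClass.getClassLoader)
      val df = hc.createDataFrame(rdd) // schemaFor resolves the case class here
      df.registerTempTable(name)
    } finally {
      // Restore the original loader so the change does not leak.
      Thread.currentThread().setContextClassLoader(previous)
    }
  }
}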
Another report:
Hi, I use Spark 1.4.0-SNAPSHOT (per the classpath below). I tried to create a DataFrame from an RDD as below, but got scala.reflect.internal.MissingRequirementError:

val partitionedTestDF2 = pairVarRDD.toDF("column1","column2","column3")
// pairVarRDD is RDD[Record4Dim_2], and Record4Dim_2 is a case class

How can I fix this?

Exception in thread "main" scala.reflect.internal.MissingRequirementError: class etl.Record4Dim_2 in JavaMirror with sun.misc.Launcher$AppClassLoader@30177039 of type class sun.misc.Launcher$AppClassLoader with classpath [file:/local/spark140/conf/,file:/local/spark140/lib/spark-assembly-1.4.0-SNAPSHOT-hadoop2.6.0.jar,file:/local/spark140/lib/datanucleus-core-3.2.10.jar,file:/local/spark140/lib/datanucleus-rdbms-3.2.9.jar,file:/local/spark140/lib/datanucleus-api-jdo-3.2.6.jar,file:/etc/hadoop/conf/] and parent being sun.misc.Launcher$ExtClassLoader@52c8c6d9 of type class sun.misc.Launcher$ExtClassLoader with classpath [file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/sunec.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/sunjce_provider.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/sunpkcs11.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/zipfs.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/localedata.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/dnsns.jar] and parent being primordial classloader with boot classpath [/usr/jdk64/jdk1.7.0_67/jre/lib/resources.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/rt.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/sunrsasign.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/jsse.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/jce.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/charsets.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/jfr.jar:/usr/jdk64/jdk1.7.0_67/jre/classes] not found.
    at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
    at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:119)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:21)
    at no.uni.computing.etl.LoadWrfV14$$typecreator1$1.apply(LoadWrfV14.scala:91)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:71)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:59)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28)
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410)
    at org.apache.spark.sql.SQLContext$implicits$.rddToDataFrameHolder(SQLContext.scala:335)

BR, Patcharee
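For reference, a minimal self-contained sketch of the failing pattern (the field names and types of Record4Dim_2 are assumptions for illustration; the report does not show them):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Field names/types are hypothetical; the report only names the class.
// Record4Dim_2 is a top-level case class, yet schemaFor still fails to
// resolve it through the context classloader on the affected build.
case class Record4Dim_2(column1: String, column2: Int, column3: Double)

object LoadWrfRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("toDF-repro"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val pairVarRDD = sc.parallelize(Seq(Record4Dim_2("a", 1, 2.0)))
    // toDF goes through rddToDataFrameHolder and ScalaReflection.schemaFor,
    // the frames at the bottom of the stack trace above.
    val partitionedTestDF2 = pairVarRDD.toDF("column1", "column2", "column3")
    partitionedTestDF2.show()
  }
}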
Issue Links
- duplicates: SPARK-8368 ClassNotFoundException in closure for map (Resolved)
- is broken by: SPARK-5281 Registering table on RDD is giving MissingRequirementError (Resolved)
- relates to: SPARK-8489 Add regression tests for SPARK-8470 (Resolved)
- links to