Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Not A Problem
- Affects Version/s: 1.2.0
- None
- None
- Environment: CDH 5.3.2
Description
DataType.fromJson fails in spark-shell if the schema includes a "udt" type. The same call works when run in an application.
As a result, I cannot read a Parquet file that includes a UDT field from spark-shell. DataType.fromCaseClass does not support UDTs.
I can load the class by name in the shell, which shows that my UDT is on the classpath:
scala> Class.forName("com.bwang.MyTestUDT")
res6: Class[_] = class com.bwang.MyTestUDT
But DataType.fromJson fails:
scala> DataType.fromJson(json)
java.lang.ClassNotFoundException: com.bwang.MyTestUDT
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:190)
    at org.apache.spark.sql.catalyst.types.DataType$.parseDataType(dataTypes.scala:77)
The reason is that DataType.fromJson tries to load udtClass using this code:
case JSortedObject(
    ("class", JString(udtClass)),
    ("pyClass", _),
    ("sqlType", _),
    ("type", JString("udt"))) =>
  Class.forName(udtClass).newInstance().asInstanceOf[UserDefinedType[_]]
}
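Unfortunately, my UDT is loaded by SparkIMain$TranslatingClassLoader, but DataType is loaded by Launcher$AppClassLoader. The one-argument Class.forName resolves the name through the class loader of the calling class, here the loader of DataType, which cannot see classes that are visible only to the REPL loader.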
scala> DataType.getClass.getClassLoader
res2: ClassLoader = sun.misc.Launcher$AppClassLoader@6876fb1b

scala> this.getClass.getClassLoader
res3: ClassLoader = org.apache.spark.repl.SparkIMain$TranslatingClassLoader@63d36b29
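One possible way around this kind of mismatch is to pass a class loader explicitly instead of relying on the caller's loader. The sketch below is not Spark's code: `ClassLoading` and `loadClass` are hypothetical names, and it assumes the REPL installs its loader as the thread context class loader, which this report does not confirm.

```scala
object ClassLoading {
  // Hypothetical helper: prefer the current thread's context class loader
  // and fall back to the loader that loaded this object when none is set.
  def loadClass(name: String): Class[_] = {
    val loader = Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    // The three-argument Class.forName looks the name up in the given loader
    // rather than in the loader of the calling class.
    Class.forName(name, true, loader)
  }
}
```

Under that assumption, `ClassLoading.loadClass("com.bwang.MyTestUDT")` would go through the REPL loader, whereas the one-argument `Class.forName` inside DataType goes through Launcher$AppClassLoader.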
Issue Links
- is duplicated by SPARK-20252 java.lang.ClassNotFoundException: $line22.$read$$iwC$$iwC$movie_row (Resolved)