Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12045

ClassNotFoundException for GenericUDF [Spark Branch]

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • spark-branch, 2.1.0
    • Spark
    • None
    • Cloudera QuickStart VM - CDH5.4.2
      beeline

    Description

      If I execute the following query in beeline, I get ClassNotFoundException for the UDF class.

      drop function myGenericUdf;
      create function myGenericUdf as 'org.example.myGenericUdf' using jar 'hdfs:///tmp/myudf.jar';
      select distinct myGenericUdf(1,2,1) from mytable;
      

      In my example, myGenericUdf just looks for the 1st argument's value in the others and returns the index. I don't think this is related to the actual GenericUDF function.

      Note that:
      "select myGenericUdf(1,2,1) from mytable;" succeeds
      If I use the non-generic implementation of the same UDF, the select distinct call succeeds.

      StackTrace:

      15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
      Serialization trace:
      genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
      chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
      colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
      childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
      childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
      aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
      org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
      Serialization trace:
      genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
      chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
      colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
      childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
      childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
      aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
      	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
      	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
      	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
      	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
      	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1069)
      	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:960)
      	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:974)
      	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:416)
      	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
      	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
      	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:505)
      	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
      	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
      	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
      	at scala.Option.getOrElse(Option.scala:120)
      	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
      	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
      	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
      	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
      	at scala.Option.getOrElse(Option.scala:120)
      	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
      	at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
      	at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:80)
      	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
      	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
      	at scala.Option.getOrElse(Option.scala:120)
      	at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
      	at org.apache.spark.scheduler.DAGScheduler.visit$2(DAGScheduler.scala:338)
      	at org.apache.spark.scheduler.DAGScheduler.getAncestorShuffleDependencies(DAGScheduler.scala:355)
      	at org.apache.spark.scheduler.DAGScheduler.registerShuffleDependencies(DAGScheduler.scala:317)
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getShuffleMapStage(DAGScheduler.scala:218)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:301)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:298)
      	at scala.collection.immutable.List.foreach(List.scala:318)
      	at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:298)
      	at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:310)
      	at org.apache.spark.scheduler.DAGScheduler.newStage(DAGScheduler.scala:244)
      	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:731)
      	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
      	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
      	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
      Caused by: java.lang.ClassNotFoundException: org.example.myGenericUDF
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      	at java.lang.Class.forName0(Native Method)
      	at java.lang.Class.forName(Class.java:270)
      	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
      	... 72 more
      

      Attachments

        1. HIVE-12045.patch
          19 kB
          Xuefu Zhang
        2. HIVE-12045.4-spark.patch
          26 kB
          Rui Li
        3. HIVE-12045.3-spark.patch
          26 kB
          Rui Li
        4. HIVE-12045.2-spark.patch
          19 kB
          Xuefu Zhang
        5. HIVE-12045.2-spark.patch
          19 kB
          Xuefu Zhang
        6. HIVE-12045.1-spark.patch
          3 kB
          Rui Li
        7. hive.log.gz
          1.93 MB
          Xuefu Zhang
        8. genUDF.patch
          7 kB
          Xuefu Zhang
        9. example.jar
          3 kB
          Zsolt Tóth

        Issue Links

          Activity

            People

              lirui Rui Li
              ztoth Zsolt Tóth
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: