Description
I run SQL like the following:
CREATE TEMPORARY FUNCTION test_avg AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFAverage';
SELECT
test_avg(1),
test_avg(substr(value,5))
FROM src;
then I get this exception:
15/03/19 09:36:45 ERROR CliDriver: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 6, HPC-3): java.lang.ClassCastException: org.apache.hadoop.hive.ql.udf.generic.GenericUDAFAverage$AverageAggregationBuffer cannot be cast to org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AbstractAggregationBuffer
    at org.apache.spark.sql.hive.HiveUdafFunction.<init>(hiveUdfs.scala:369)
    at org.apache.spark.sql.hive.HiveGenericUdaf.newInstance(hiveUdfs.scala:214)
    at org.apache.spark.sql.hive.HiveGenericUdaf.newInstance(hiveUdfs.scala:188)
I found that GenericUDAFAverage uses the deprecated AggregationBuffer interface, which has been superseded by AbstractAggregationBuffer. Spark casts the buffer to AbstractAggregationBuffer and so rejects any UDAF whose buffer still implements only the old AggregationBuffer interface; as a result, GenericUDAFAverage cannot work. I think this restriction is unnecessary, since AbstractAggregationBuffer itself implements the deprecated interface.
The relevant code in Spark (hiveUdfs.scala):

// Cast required to avoid type inference selecting a deprecated Hive API.
private val buffer =
  function.getNewAggregationBuffer.asInstanceOf[GenericUDAFEvaluator.AbstractAggregationBuffer]
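To illustrate why the cast fails, here is a minimal sketch using hypothetical stand-in types (not the real Hive classes): AbstractAggregationBuffer is a class implementing the deprecated AggregationBuffer interface, but a buffer class that implements the interface directly, as GenericUDAFAverage's does, is not an AbstractAggregationBuffer, so the downcast throws.

```java
// Hypothetical stand-ins mirroring the Hive type hierarchy described above.
public class CastSketch {
    // The deprecated Hive interface.
    interface AggregationBuffer {}

    // Its newer replacement: a class that implements the old interface.
    static abstract class AbstractAggregationBuffer implements AggregationBuffer {}

    // Like GenericUDAFAverage's buffer: implements the old interface directly.
    static class AverageAggregationBuffer implements AggregationBuffer {}

    public static void main(String[] args) {
        AggregationBuffer buf = new AverageAggregationBuffer();

        // Treating it as the deprecated interface always works:
        System.out.println("is AggregationBuffer: "
            + (buf instanceof AggregationBuffer));

        // Casting to AbstractAggregationBuffer throws ClassCastException,
        // which is the failure Spark hits in HiveUdafFunction:
        try {
            AbstractAggregationBuffer bad = (AbstractAggregationBuffer) buf;
            System.out.println("unexpected: cast succeeded");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException as expected");
        }
    }
}
```

Casting to the parent AggregationBuffer interface instead would accept buffers of both styles, which is why the restriction seems unnecessary.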