Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-6477

Aggregation functions for tiny/smallint broken with parquet

    XMLWordPrintableJSON

Details

    Description

      Given the following table:

      CREATE TABLE IF NOT EXISTS commontypesagg (
      id int,
      bool_col boolean,
      tinyint_col tinyint,
      smallint_col smallint,
      int_col int,
      bigint_col bigint,
      float_col float,
      double_col double,
      date_string_col string,
      string_col string)
      PARTITIONED BY (year int, month int, day int)
      STORED AS PARQUET;
      

      The following queries throws ClassCastException:

      select count(tinyint_col), min(tinyint_col), max(tinyint_col), sum(tinyint_col) from commontypesagg;
      
      select count(smallint_col), min(smallint_col), max(smallint_col), sum(smallint_col) from commontypesagg;
      

      Exception is the following:

      2014-01-29 14:02:11,381 INFO org.apache.hadoop.mapred.TaskStatus: task-diagnostic-info for task attempt_201401290934_0006_m_000001_1 : java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":null,"bool_col":null,"tinyint_col":1,"smallint_col":null,"int_col":null,"bigint_col":null,"float_col":null,"double_col":null,"date_string_col":null,"string_col":null,"year":"2009","month":"1"}
      at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      at org.apache.hadoop.mapred.Child.main(Child.java:262)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":null,"bool_col":null,"tinyint_col":1,"smallint_col":null,"int_col":null,"bigint_col":null,"float_col":null,"double_col":null,"date_string_col":null,"string_col":null,"year":"2009","month":"1"}
      at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:529)
      at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
      ... 8 more
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Byte
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:796)
      at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
      at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:844)
      at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
      at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
      at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:844)
      at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
      at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
      at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:844)
      at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:519)
      ... 9 more
      Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Byte
      at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaByteObjectInspector.get(JavaByteObjectInspector.java:40)
      at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:666)
      at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
      at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
      at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
      at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:629)
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:826)
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:723)
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:791)
      ... 18 more
      

      Attachments

        1. HIVE-6477.2.patch
          19 kB
          Szehon Ho
        2. HIVE-6477.patch
          19 kB
          Szehon Ho

        Issue Links

          Activity

            People

              szehon Szehon Ho
              szehon Szehon Ho
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: