Details
-
Bug
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
0.9.0
-
None
-
Hive 0.9.0 on JDK 1.6.0_35
Description
Recently, I was helping one of my colleague to fix one Hive issue. I have to admit that the query didn't do what was planned, but I don't think it should break the hive.
For a hive table, the following query will break
select count(distinct columnA), columnB from table group by columnA, columnB
I know the correct query should be
select count(distinct columnA), columnB from table group by columnB
But even in the incorrect query, hive should return me 1 as the unique count of columnA for every columnB, but instead, it breaks with the following stack trace:
Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:106)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:274)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:259)
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:188)
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:197)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:959)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1005)