HIVE-4237

Hive cannot run the following query and throws java.lang.IndexOutOfBoundsException in org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector

    Description

      Recently, I was helping one of my colleagues fix a Hive issue. I have to admit that the query didn't do what was intended, but I don't think it should break Hive.

      On a Hive table, the following query fails:

      select count(distinct columnA), columnB from table group by columnA, columnB

      I know the correct query should be:
      select count(distinct columnA), columnB from table group by columnB

      But even for the incorrect query, Hive should return 1 as the distinct count of columnA for every (columnA, columnB) group. Instead, it fails with the following stack trace:

      Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
      at java.util.ArrayList.RangeCheck(ArrayList.java:547)
      at java.util.ArrayList.get(ArrayList.java:322)
      at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
      at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:106)
      at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:274)
      at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:259)
      at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:188)
      at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:197)
      at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
      at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:959)
      at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1005)
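
      For reference, the expected results of the two queries above can be sketched outside Hive. The example below is only an illustration of standard SQL semantics: it uses Python's built-in sqlite3 module instead of Hive, and the table name t and the sample rows are made up for the purpose. It does not reproduce the exception, only the output the reporter expected.

```python
import sqlite3

# Hypothetical sample data (not from the report): (columnA, columnB) pairs.
ROWS = [("a1", "b1"), ("a1", "b1"), ("a2", "b1"), ("a3", "b2")]


def run_queries(rows):
    """Run both queries from the report against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (columnA TEXT, columnB TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

    # Query as reported: columnA is both the DISTINCT argument and a grouping
    # key, so each group holds a single (non-NULL) columnA value.
    reported = conn.execute(
        "SELECT count(DISTINCT columnA), columnB FROM t "
        "GROUP BY columnA, columnB ORDER BY columnA, columnB"
    ).fetchall()

    # Intended query: group by columnB only.
    intended = conn.execute(
        "SELECT count(DISTINCT columnA), columnB FROM t "
        "GROUP BY columnB ORDER BY columnB"
    ).fetchall()

    conn.close()
    return reported, intended


reported, intended = run_queries(ROWS)
print(reported)  # [(1, 'b1'), (1, 'b1'), (1, 'b2')]
print(intended)  # [(2, 'b1'), (1, 'b2')]
```

      With this sample data, the reported query yields a count of 1 for every group, while the intended query yields the number of distinct columnA values per columnB.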

          People

            Assignee: Unassigned
            Reporter: Yong Zhang (java8964)
            Votes: 0
            Watchers: 2
