HIVE-13659

An empty where condition leads to vectorization exceptions instead of throwing a compile time error

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: Hive
    • Labels: None

    Description

      A partial query such as

      select count (distinct field) from table where field;

      (note the missing 'field=value' comparison) resulted in the following error in the task logs, instead of failing early at compile time:

      org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
              at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
              at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
              at java.util.concurrent.FutureTask.run(FutureTask.java:262)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:326)
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
              ... 14 more
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
              at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
              ... 17 more
      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
              at org.apache.hadoop.hive.ql.exec.vector.expressions.SelectColumnIsTrue.evaluate(SelectColumnIsTrue.java:46)
              at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:106)
              at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
              at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
              at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:164)
              at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
              ... 18 more
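
The innermost cause in the trace is a downcast that assumes the filter column is long-backed. The following is a minimal sketch of that failure mode, not Hive's code: the classes below are hypothetical stand-ins that only borrow the names from the trace, and treating the value 1 as true is an assumption of the sketch.

```java
// Illustrative sketch only: these are hypothetical stand-ins, not Hive's real classes.
class WhereFieldCastSketch {
    // Counts rows whose value is 1, assuming the filter column is long-backed.
    static int countTrue(ColumnVector col) {
        // Unchecked assumption: fails at run time for any other vector type.
        LongColumnVector longs = (LongColumnVector) col;
        int selected = 0;
        for (long v : longs.vector) {
            if (v == 1) {
                selected++;
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // A boolean-like column stored as longs works as expected.
        System.out.println(countTrue(new LongColumnVector(new long[] {1, 0, 1})));

        // A string column reaches the same code path and fails only at run time.
        try {
            countTrue(new BytesColumnVector(new byte[][] {{97}}));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}

abstract class ColumnVector {}

class LongColumnVector extends ColumnVector {
    final long[] vector;

    LongColumnVector(long[] vector) {
        this.vector = vector;
    }
}

class BytesColumnVector extends ColumnVector {
    final byte[][] vector;

    BytesColumnVector(byte[][] vector) {
        this.vector = vector;
    }
}
```

Nothing in the sketch validates the column type before the cast, so the mismatch surfaces per batch inside the task rather than when the query is planned.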
      

      Comment by Gunther:
      I think this works by implicitly converting the expr to boolean (if a cast is necessary). This query looks legal to me and probably needs to be handled in vectorization.

      Comment by Ashutosh:
      Oracle, PostgreSQL, and SQL Server throw an error for this if the type of field is not boolean. However, MySQL and Hive (with vectorization off) execute the query by implicitly adding a cast to boolean. Hive should be consistent in its behavior regardless of whether vectorization is on or off.
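
The two behaviors described in the comments (reject a non-boolean predicate, or add an implicit cast) can both be decided at plan time, before any row is processed. The sketch below illustrates that choice under stated assumptions: `planBareColumnFilter`, `ColType`, and the `strict` flag are hypothetical names invented for illustration, and this is not the approach taken in the attached patches.

```java
// Illustrative sketch only: hypothetical names, not Hive's planner API.
class FilterTypeCheckSketch {
    enum ColType { BOOLEAN, STRING, BIGINT }

    // Returns the predicate text to plan for "WHERE <column>".
    // strict = true mimics engines that reject non-boolean predicates;
    // strict = false mimics engines that add an implicit cast.
    static String planBareColumnFilter(String column, ColType type, boolean strict) {
        if (type == ColType.BOOLEAN) {
            return column; // already a valid predicate
        }
        if (strict) {
            throw new IllegalArgumentException(
                "WHERE clause must be boolean, got " + type + " for column " + column);
        }
        return "CAST(" + column + " AS BOOLEAN)";
    }

    public static void main(String[] args) {
        System.out.println(planBareColumnFilter("field", ColType.STRING, false));
        try {
            planBareColumnFilter("field", ColType.STRING, true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at plan time: " + e.getMessage());
        }
    }
}
```

Either branch gives the same outcome with vectorization on or off, because the decision is made once on the expression type instead of inside a type-specific operator.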

      Attachments

        1. HIVE-13659.03.patch
          13 kB
          Matt McCline
        2. HIVE-13659.02.patch
          13 kB
          Matt McCline
        3. HIVE-13659.01.patch
          5 kB
          Matt McCline

            People

              mmccline Matt McCline
              mmccline Matt McCline
              Votes: 0
              Watchers: 2
