HIVE-12795: Vectorized execution causes ClassCastException

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.3.0, 2.1.0
    • Component/s: Query Processor
    • Labels: None

    Description

      In some Hive versions, with the following settings:
      set hive.auto.convert.join=false;
      set hive.vectorized.execution.enabled = true;

      some join queries fail with a ClassCastException.
      Stack trace:

      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableStringObjectInspector
      at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:419)
      at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1102)
      at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:55)
      at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
      at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
      at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
      at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:431)
      at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126)
      ... 22 more
      
      

      It cannot be reproduced in Hive 2.0 and 1.3 because they take a different code path.
      To reproduce:

      CREATE TABLE test1 (
        id string
      )
      PARTITIONED BY (
        cr_year bigint,
        cr_month bigint
      )
      ROW FORMAT SERDE
        'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
      STORED AS INPUTFORMAT
        'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
      OUTPUTFORMAT
        'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
      TBLPROPERTIES (
        'serialization.null.format'=''
      );

      CREATE TABLE test2 (
        id string
      )
      PARTITIONED BY (
        cr_year bigint,
        cr_month bigint
      )
      ROW FORMAT SERDE
        'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
      STORED AS INPUTFORMAT
        'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
      OUTPUTFORMAT
        'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
      TBLPROPERTIES (
        'serialization.null.format'=''
      );

      set hive.auto.convert.join=false;
      set hive.vectorized.execution.enabled = true;

      SELECT cr.id1,
             cr.id2
      FROM (
        SELECT t1.id id1,
               t2.id id2
        FROM (SELECT * FROM test1) t1
        LEFT OUTER JOIN test2 t2
          ON t1.id = t2.id
      ) cr;
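
      On an affected version, one possible way to stay out of the failing path is sketched below. It is inferred from the trigger conditions and the stack trace above (the exception is raised while VectorReduceSinkOperator initializes), not taken from the attached patches, so treat it as an assumption rather than a confirmed workaround. It costs the query the benefit of vectorized execution.

      ```sql
      -- Assumed session-level workaround (not from the patches):
      -- with vectorization off, the vectorized reduce sink operator is not used.
      set hive.vectorized.execution.enabled=false;

      -- Rerun the same join from the reproduction steps.
      SELECT cr.id1,
             cr.id2
      FROM (
        SELECT t1.id id1,
               t2.id id2
        FROM (SELECT * FROM test1) t1
        LEFT OUTER JOIN test2 t2
          ON t1.id = t2.id
      ) cr;
      ```

      The setting applies only to the current session, so other queries can keep vectorization enabled.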
      
      

      Attachments

        1. HIVE-12795.1.patch
          6 kB
          Yongzhi Chen
        2. HIVE-12795.2.patch
          6 kB
          Yongzhi Chen

          People

            Assignee: Yongzhi Chen
            Reporter: Yongzhi Chen