Hive / HIVE-7847

Query on an ORC partitioned table fails when a table column type changes


Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0, 0.12.0, 0.13.0
    • Fix Version/s: None
    • Component/s: File Formats
    • Labels: None

    Description

      I used the following script to test an ORC column type change on a partitioned table on branch-0.13:

      use test;
      DROP TABLE if exists orc_change_type_staging;
      DROP TABLE if exists orc_change_type;
      CREATE TABLE orc_change_type_staging (
          id int
      );
      CREATE TABLE orc_change_type (
          id int
      ) PARTITIONED BY (`dt` string)
      stored as orc;
      --- load staging table
      LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE orc_change_type_staging;
      --- populate orc hive table
      INSERT OVERWRITE TABLE orc_change_type partition(dt='20140718') select * FROM orc_change_type_staging limit 1;
      --- change column id from int to bigint
      ALTER TABLE orc_change_type CHANGE id id bigint;
      INSERT OVERWRITE TABLE orc_change_type partition(dt='20140719') select * FROM orc_change_type_staging limit 1;
      SELECT id FROM orc_change_type where dt between '20140718' and '20140719';
      

      It fails on the last query, "SELECT id FROM orc_change_type where dt between '20140718' and '20140719';", with the following exception:

      Error: java.io.IOException: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
              at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
              at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
              at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
              at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
              at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
              at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
              at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
              at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
              at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
              at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
      Caused by: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
              at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
              at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
              at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
              at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
              at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
              at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
              at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
              ... 11 more
      Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
              at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:127)
              at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
              ... 15 more
      

      The value object is reused for each row we deserialize, so the read fails as soon as we move on to the next path, which has a different schema: the reader for the bigint column expects a LongWritable but receives the IntWritable created for the previous path. Resetting the value object each time we finish reading one path solves this problem.
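
      For illustration, here is a minimal, self-contained Java sketch of that failure mode and of the reset idea. It is not the actual patch; SchemaAwareReader and its methods are hypothetical names that only mimic how a per-path record reader fills a caller-supplied value object.

      import org.apache.hadoop.io.IntWritable;
      import org.apache.hadoop.io.LongWritable;
      import org.apache.hadoop.io.Writable;

      // Hypothetical stand-in for a per-path record reader: each path creates a value
      // object matching its own schema and fills the object handed to next().
      public class ReusedValueSketch {

          static class SchemaAwareReader {
              private final boolean columnIsBigint;

              SchemaAwareReader(boolean columnIsBigint) {
                  this.columnIsBigint = columnIsBigint;
              }

              // Value object matching this path's schema.
              Writable createValue() {
                  return columnIsBigint ? new LongWritable() : new IntWritable();
              }

              // Fills the caller-supplied value; a bigint column casts to LongWritable.
              void next(Writable value) {
                  if (columnIsBigint) {
                      ((LongWritable) value).set(42L);   // fails if value is still an IntWritable
                  } else {
                      ((IntWritable) value).set(42);
                  }
              }
          }

          public static void main(String[] args) {
              SchemaAwareReader oldPartition = new SchemaAwareReader(false); // dt=20140718, id int
              SchemaAwareReader newPartition = new SchemaAwareReader(true);  // dt=20140719, id bigint

              // Buggy behaviour: the value created for the first path is reused for the second.
              Writable reused = oldPartition.createValue();
              oldPartition.next(reused);
              try {
                  newPartition.next(reused);   // IntWritable cannot be cast to LongWritable
              } catch (ClassCastException e) {
                  System.out.println("reuse across paths fails: " + e);
              }

              // Fix idea: recreate (reset) the value object when switching paths so it
              // always matches the schema of the file currently being read.
              Writable fresh = newPartition.createValue();
              newPartition.next(fresh);
              System.out.println("after reset: " + fresh);
          }
      }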

      Attachments

        1. HIVE-7847.1.patch (0.6 kB, Zhichun Wu)
        2. vector_alter_partition_change_col.q (5 kB, Matt McCline)


          People

            Assignee: Zhichun Wu (wzc1989)
            Reporter: Zhichun Wu (wzc1989)
            Votes: 1
            Watchers: 6
