Hive / HIVE-26955

Select query fails when decimal column data type is changed to string/char/varchar in Parquet

Details

    Description

      Steps to reproduce

      create table test_parquet (id decimal) stored as parquet;
      insert into test_parquet values(238);
      alter table test_parquet change id id string;
      select * from test_parquet;
      
      Error: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs:/namenode:8020/warehouse/tablespace/managed/hive/test_parquet/delta_0000001_0000001_0000/000000_0 (state=,code=0)
          at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:624)
          at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:531)
          at org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:194)
          ... 55 more
      Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file file:/home/centos/Apache-Hive-Tarak/itests/qtest/target/localfs/warehouse/test_parquet/000000_0
          at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:255)
          at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
          at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:87)
          at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:89)
          at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:771)
          at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:335)
          at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:562)
          ... 57 more
      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo
          at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$5.convert(ETypeConverter.java:669)
          at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$5.convert(ETypeConverter.java:664)
          at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$BinaryConverter.addBinary(ETypeConverter.java:977)
          at org.apache.parquet.column.impl.ColumnReaderBase$2$6.writeValue(ColumnReaderBase.java:360)
          at org.apache.parquet.column.impl.ColumnReaderBase.writeCurrentValueToConverter(ColumnReaderBase.java:410)
          at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:30)
          at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
          at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:230)
          ... 63 more

      However, the same sequence works as expected on an ORC table:

      create table test_orc (id decimal) stored as orc;
      insert into test_orc values(238);
      alter table test_orc change id id string;
      select * from test_orc;
      +--------------+
      | test_orc.id  |
      +--------------+
      | 238          |
      +--------------+

      It also works on a text table:

      create table test_text (id decimal) stored as textfile;
      insert into test_text values(238);
      alter table test_text change id id string;
      select * from test_text;
      +---------------+
      | test_text.id  |
      +---------------+
      | 238           |
      +---------------+

      A similar exception is thrown when the column is changed to varchar or char instead of string.
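
      The varchar and char cases would presumably be reproduced with the same pattern as the string case. The sketch below is an assumption rather than part of the original report: the table names and the length 10 are illustrative only.

      -- Hypothetical table names and lengths, chosen only for illustration.
      create table test_parquet_varchar (id decimal) stored as parquet;
      insert into test_parquet_varchar values(238);
      alter table test_parquet_varchar change id id varchar(10);
      select * from test_parquet_varchar;

      create table test_parquet_char (id decimal) stored as parquet;
      insert into test_parquet_char values(238);
      alter table test_parquet_char change id id char(10);
      select * from test_parquet_char;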

            People

              sbadhya Sourabh Badhya
              tarak271 Taraka Rama Rao Lethavadla

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 40m