HIVE-19069

Hive can't read int32 and int64 Parquet decimal values

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: Types

    Description

      Parquet allows several physical types for the Decimal logical data type (int32, int64, fixed_len_byte_array and binary):
      https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal

      But Hive supports only "fixed_len_byte_array":
      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L335

      After creating a Parquet external table and querying it via Hive, the read fails:

      hive> select * from decimal_parquet;
      OK
      Failed with exception java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file maprfs:///tmp/decimal_parquet/0_0_0.parquet
      

      A sample Parquet file with int32 decimal values is attached to this issue:

      vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /tmp/decimal_parquet/0_0_0.parquet 
      message root {
        optional binary a (UTF8);
        optional int32 b (DECIMAL(7,2));
      }
      
      vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar cat /tmp/md4107_par/0_0_0.parquet 
      a = a
      b = 100
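
For reference, the Parquet format stores a decimal as an unscaled integer and the value is unscaledValue * 10^(-scale). Under that rule, and assuming parquet-tools prints the raw stored integer here, the int32 value 100 in a DECIMAL(7,2) column would correspond to 1.00. The following is only a minimal sketch of that decoding using the JDK's BigDecimal; the class and method names are hypothetical and this is not Hive's converter code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Hive or parquet-mr.
public class DecimalDecodeSketch {

    // int32-backed decimal: the stored int is the unscaled value.
    static BigDecimal fromInt32(int unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // int64-backed decimal: same rule with a wider unscaled value.
    static BigDecimal fromInt64(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // fixed_len_byte_array / binary: big-endian two's complement unscaled value.
    static BigDecimal fromBytes(byte[] bigEndianTwosComplement, int scale) {
        return new BigDecimal(new BigInteger(bigEndianTwosComplement), scale);
    }

    public static void main(String[] args) {
        // Column b from the sample file: int32 value 100, DECIMAL(7,2).
        System.out.println(fromInt32(100, 2));
        // The same value as it would look in a byte-array encoding.
        System.out.println(fromBytes(new byte[] {0x00, 0x64}, 2));
    }
}
```

All three physical types reduce to the same (unscaled value, scale) pair, so a reader that already handles fixed_len_byte_array would presumably only need the two integer cases added.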
      

      Attachments

        1. 0_0_0.parquet
          0.4 kB
          Vitalii Diravka

            People

              Assignee: Unassigned
              Reporter: Vitalii Diravka
              Votes: 1
              Watchers: 5