  Hive / HIVE-19069

Hive can't read int32 and int64 Parquet decimal values


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: Types
    • Labels:

      Description

      Parquet supports several physical types for the DECIMAL logical type:
      https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal

      But Hive supports reading decimals only from "fixed_len_byte_array":
      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L335
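      The missing conversion itself is straightforward. When a DECIMAL column is stored as int32 or int64, the stored integer is the unscaled value, and the scale comes from the schema annotation (e.g. DECIMAL(7,2)). A minimal sketch of that reconstruction, using plain `java.math.BigDecimal` rather than Hive's internal converter classes (the class and method names here are illustrative, not from the Hive codebase):

      ```java
      import java.math.BigDecimal;

      // Sketch: rebuild a decimal from the raw integer a Parquet int32/int64
      // DECIMAL column stores. The integer is the unscaled value; the scale
      // is taken from the column's DECIMAL(precision, scale) annotation.
      public class Int32DecimalSketch {
          // Hypothetical helper (not Hive API): unscaled value + scale -> decimal.
          static BigDecimal fromUnscaled(long unscaled, int scale) {
              return BigDecimal.valueOf(unscaled, scale);
          }

          public static void main(String[] args) {
              // For a DECIMAL(7,2) column, the stored int32 10000 means 100.00.
              System.out.println(fromUnscaled(10000, 2)); // prints 100.00
          }
      }
      ```

      A fix along these lines would add int32/int64 cases next to the existing fixed_len_byte_array branch in ETypeConverter, reading the primitive value and applying the annotated scale.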

      After creating a Parquet external table and querying it via Hive:

      hive> select * from decimal_parquet;
      OK
      Failed with exception java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file maprfs:///tmp/decimal_parquet/0_0_0.parquet
      

      A sample Parquet file with int32 decimal values is attached to this JIRA:

      vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /tmp/decimal_parquet/0_0_0.parquet 
      message root {
        optional binary a (UTF8);
        optional int32 b (DECIMAL(7,2));
      }
      
      vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar cat /tmp/md4107_par/0_0_0.parquet 
      a = a
      b = 100
      

        Attachments

        1. 0_0_0.parquet
          0.4 kB
          Vitalii Diravka

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
              • Reporter:
                Vitalii Diravka
              • Votes:
                1
              • Watchers:
                6

                Dates

                • Created:
                  Updated:
                  Resolved: