[HIVE-17843] UINT32 Parquet columns are handled as signed INT32-s, silently reading incorrect data


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      An unsigned 32-bit Parquet column, such as

      optional int32 uint_32_col (UINT_32)

      is read by Hive as if it were signed. Stored values of 2^31 or greater are therefore returned as negative numbers, with no error or warning.
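      The effect can be reproduced outside Hive. The following is a minimal Java sketch, not the code from the attached patches: the class name UInt32Demo and the helper asUnsigned are hypothetical, and it assumes only that the writer stored the unsigned value's bit pattern in a 32-bit slot, as the UINT_32 annotation on an int32 column implies.

```java
public class UInt32Demo {
    // Hypothetical helper: reinterpret the raw INT32 bits as an unsigned
    // 32-bit value by widening to long without sign extension.
    static long asUnsigned(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    public static void main(String[] args) {
        // Bit pattern a writer would store for the UINT_32 value 3000000000.
        int raw = (int) 3_000_000_000L;

        // Reading the bits as a signed INT32 gives a negative number.
        System.out.println("signed read:   " + raw);             // -1294967296

        // Reading the same bits as unsigned recovers the stored value.
        System.out.println("unsigned read: " + asUnsigned(raw)); // 3000000000
    }
}
```

      Because the unsigned range does not fit in a signed 32-bit integer, a faithful read needs a wider type such as long, which is why the sketch returns long rather than int.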

      Attachments

        1. test_uint.parquet
          0.3 kB
          Janaki Lahorani
        2. HIVE-17843.4.patch
          65 kB
          Janaki Lahorani
        3. HIVE-17843.3.patch
          65 kB
          Janaki Lahorani
        4. HIVE-17843.2.patch
          63 kB
          Janaki Lahorani
        5. HIVE-17843.1.patch
          63 kB
          Janaki Lahorani
        6. HIVE-17843.1.patch
          63 kB
          Janaki Lahorani
        7. data_with_valid_values.parquet
          0.8 kB
          Janaki Lahorani
        8. data_including_invalid_values.parquet
          0.9 kB
          Janaki Lahorani


            People

              Assignee: Janaki Lahorani (janulatha)
              Reporter: Zoltan Ivanfi (zi)