Hive / HIVE-8359

Maps containing null values are not correctly written to Parquet files


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.1
    • Fix Version/s: 1.1.0
    • Component/s: File Formats
    • Labels:
      None

      Description

      I tried to write a map<string,string> column to a Parquet file. The table should contain:

      {"key3":"val3","key4":null}
      {"key3":"val3","key4":null}
      {"key1":null,"key2":"val2"}
      {"key3":"val3","key4":null}
      {"key3":"val3","key4":null}
      

      ... but a query such as

      SELECT * FROM mytable

      shows that the table is corrupted:

      {"key3":"val3"}
      {"key4":"val3"}
      {"key3":"val2"}
      {"key4":"val3"}
      {"key1":"val3"}
      

      I have not been able to read the Parquet file in our software afterwards, so I suspect that the file itself is corrupted.
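
      The wrong rows above are consistent with one simple failure mode: every key is emitted, but an entry whose value is null is otherwise skipped, so keys and values drift out of alignment on read. The sketch below is a toy model of that symptom in Python. It is an assumption made for illustration, not the Hive Parquet writer's code, and the function names are hypothetical.

```python
# Toy model of the symptom only. This is NOT Hive or Parquet code.
# Assumption: the writer emits every key, but skips the value (and the
# entry count) whenever the value is null.

def write_dropping_null_entries(rows):
    """Flatten rows into a key stream, a value stream and per-row entry counts."""
    keys, values, counts = [], [], []
    for row in rows:
        written = 0
        for key, value in row.items():
            keys.append(key)          # the key is always emitted
            if value is not None:     # a null value is silently skipped
                values.append(value)
                written += 1
        counts.append(written)
    return keys, values, counts


def read_back(keys, values, counts):
    """Rebuild rows by pairing the two streams positionally."""
    rows, k, v = [], 0, 0
    for count in counts:
        row = {}
        for _ in range(count):
            row[keys[k]] = values[v]
            k += 1
            v += 1
        rows.append(row)
    return rows


expected = [
    {"key3": "val3", "key4": None},
    {"key3": "val3", "key4": None},
    {"key1": None, "key2": "val2"},
    {"key3": "val3", "key4": None},
    {"key3": "val3", "key4": None},
]

corrupted = read_back(*write_dropping_null_entries(expected))
for row in corrupted:
    print(row)
```

      Under this assumption the model yields the same five single-entry rows as the query output reported above. That only shows the symptom is compatible with misaligned key and value streams; it does not establish the actual root cause.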

      For those who are interested, I generated this Parquet table from an Avro file.

        Attachments

        1. map_null_val.avro (0.3 kB, Frédéric TERRAZZONI)
        2. HIVE-8359.5.patch (40 kB, Sergio Peña)
        3. HIVE-8359.4.patch (29 kB, Sergio Peña)
        4. HIVE-8359.2.patch (39 kB, Mickael Lacour)
        5. HIVE-8359.1.patch (8 kB, Sergio Peña)


              People

              • Assignee: Sergio Peña (spena)
              • Reporter: Frédéric TERRAZZONI (Akryus)
              • Votes: 0
              • Watchers: 5
