Hive / HIVE-8359

Maps containing null values are not correctly written to Parquet files

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.1
    • Fix Version/s: 1.1.0
    • Component/s: File Formats
    • Labels: None

    Description

      I tried to write a map<string,string> column to a Parquet file. The table should contain:

      {"key3":"val3","key4":null}
      {"key3":"val3","key4":null}
      {"key1":null,"key2":"val2"}
      {"key3":"val3","key4":null}
      {"key3":"val3","key4":null}
      

      ... but a query such as

      SELECT * FROM mytable;

      shows that the table is corrupted:

      {"key3":"val3"}
      {"key4":"val3"}
      {"key3":"val2"}
      {"key4":"val3"}
      {"key1":"val3"}
      

      I have not been able to read the Parquet file in our software afterwards, so I suspect that the file itself is corrupted.
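
      The corrupted rows shown above are consistent with one possible failure mode: every key is written, null values are dropped, and keys and values are then paired purely by position. The short Python sketch below only simulates that hypothesis on the sample data. It is not Hive's or Parquet's writer code, and the function name simulate_dropped_nulls is invented for illustration.

```python
# Illustrative simulation only: NOT the actual Hive/Parquet code path.
# Assumption: all keys are written, null values are silently dropped,
# and a reader then pairs keys and values by position, one entry per row.

expected_rows = [
    {"key3": "val3", "key4": None},
    {"key3": "val3", "key4": None},
    {"key1": None, "key2": "val2"},
    {"key3": "val3", "key4": None},
    {"key3": "val3", "key4": None},
]


def simulate_dropped_nulls(rows):
    """Pair flattened keys with the surviving (non-null) values by position."""
    # Every key is kept, in row order.
    keys = [key for row in rows for key in row]
    # Null values are skipped, so the value column is shorter than the key column.
    values = [value for row in rows for value in row.values() if value is not None]
    # zip() stops at the shorter sequence, giving one key/value pair per row here.
    return [{key: value} for key, value in zip(keys, values)]


observed_rows = simulate_dropped_nulls(expected_rows)
for row in observed_rows:
    print(row)
```

      Under these assumptions the simulation yields the same five single-entry maps as the query output in this report, which suggests that keys and values fell out of alignment when the nulls were written.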

      For those who are interested, I generated this Parquet table from an Avro file.

      Attachments

        1. map_null_val.avro
          0.3 kB
          Frédéric TERRAZZONI
        2. HIVE-8359.1.patch
          8 kB
          Sergio Peña
        3. HIVE-8359.2.patch
          39 kB
          Mickael Lacour
        4. HIVE-8359.4.patch
          29 kB
          Sergio Peña
        5. HIVE-8359.5.patch
          40 kB
          Sergio Peña

            People

              Assignee: Sergio Peña (spena)
              Reporter: Frédéric TERRAZZONI (Akryus)
              Votes: 0
              Watchers: 5
