Hive / HIVE-9303

Parquet files are written with incorrect definition levels

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.1
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels: None

    Description

      The definition level, which determines which level of nesting is NULL, appears to always be n or n-1, where n is the maximum definition level. This means that only the innermost level of nesting can be NULL. This is only relevant for Parquet files. For example:

      CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>)
      STORED AS TEXTFILE;
      
      INSERT OVERWRITE TABLE text_tbl
      SELECT IF(false, named_struct("b", named_struct("c", 1)), NULL)
      FROM tbl LIMIT 1;
      
      CREATE TABLE parq_tbl
      STORED AS PARQUET
      AS SELECT * FROM text_tbl;
      
      SELECT * FROM text_tbl;
      => NULL # right
      
      SELECT * FROM parq_tbl;
      => {"b":{"c":null}} # wrong
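
      The symptom above can be modelled with a small Python sketch of Dremel-style definition levels for the column path a.b.c, assuming every field on the path is optional, so n = 3. The helper names (definition_level, observed_level, decode) are hypothetical and exist only for illustration. observed_level merely imitates the "always n or n-1" behaviour described in this report; it is not Hive's writer code.

```python
# Illustrative model only: not Hive's or Parquet's actual implementation.
PATH = ["a", "b", "c"]   # column path; every field is assumed optional
MAX_LEVEL = len(PATH)    # n = 3, the maximum definition level


def definition_level(row):
    """Count how many fields along PATH are non-NULL in `row`."""
    level = 0
    current = row
    for name in PATH:
        current = current.get(name) if isinstance(current, dict) else None
        if current is None:
            break
        level += 1
    return level


def observed_level(level):
    """Imitate the reported symptom: levels are clamped to n-1 or n."""
    return max(level, MAX_LEVEL - 1)


def decode(level, leaf=1):
    """Rebuild the value of column `a` from a definition level."""
    if level == 0:
        return None                      # `a` itself is NULL
    value = leaf if level == MAX_LEVEL else None
    # Wrap the innermost value in the structs that are defined at this level.
    for name in reversed(PATH[1:level + 1]):
        value = {name: value}
    return value


null_row = {"a": None}
correct = definition_level(null_row)     # 0: the outermost struct is NULL
print(decode(correct))                   # None
print(decode(observed_level(correct)))   # {'b': {'c': None}}
```

      Under this model a NULL at the outermost level should be written with definition level 0, but clamping it to n-1 = 2 makes a reader reconstruct {"b":{"c":null}}, which matches the parq_tbl result shown above.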
      

      Attachments

        1. HIVE-9303.1.patch
          5 kB
          Brock Noland
        2. HIVE-9303.1.patch
          5 kB
          Sergio Peña

          People

            Assignee: Sergio Peña (spena)
            Reporter: Skye Wanderman-Milne (skye)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved: