IMPALA-11719

Inconsistency in printing NULL values

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.2.0
    • Component/s: Backend
    • Labels: None

    Description

      In Impala, null values at the top level or inside collections are printed as "NULL":

      select int_array from functional_parquet.complextypestbl;
      +------------------------+
      | int_array              |
      +------------------------+
      | [-1]                   |
      | [1,2,3]                |
      | [NULL,1,2,NULL,3,NULL] |
      | []                     |
      | NULL                   |
      | NULL                   |
      | NULL                   |
      | NULL                   |
      +------------------------+

      Null values inside a struct are printed as "null":

      select small_struct from functional_parquet.complextypes_structs;
      +------------------------------------+
      | small_struct                       |
      +------------------------------------+
      | NULL                               |
      | {"i":19191,"s":"small_struct_str"} |
      | {"i":98765,"s":null}               |
      | {"i":null,"s":"str"}               |
      | {"i":98765,"s":"abcde f"}          |
      | {"i":null,"s":null}                |
      +------------------------------------+

      Hive behaves differently: "NULL" is used only for top-level values, while "null" is printed in both collections and structs.

      select int_array from functional_parquet.complextypestbl;
      +-------------------------+
      |        int_array        |
      +-------------------------+
      | [-1]                    |
      | [1,2,3]                 |
      | [null,1,2,null,3,null]  |
      | []                      |
      | NULL                    |
      | NULL                    |
      | NULL                    |
      | NULL                    |
      +-------------------------+

      select small_struct from functional_parquet.complextypes_structs;
      +-------------------------------------+
      |            small_struct             |
      +-------------------------------------+
      | NULL                                |
      | {"i":19191,"s":"small_struct_str"}  |
      | {"i":98765,"s":null}                |
      | {"i":null,"s":"str"}                |
      | {"i":98765,"s":"abcde f"}           |
      | {"i":null,"s":null}                 |
      +-------------------------------------+

      Officially, we print collections and structs in JSON form, and the JSON null literal is the lowercase "null".
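
      To illustrate why the casing matters for JSON output, here is a minimal Python sketch that feeds both spellings to a standard JSON parser. The helper name `is_valid_json` is hypothetical and is not part of Impala or Hive; it only shows that a strict parser accepts lowercase `null` and rejects uppercase `NULL`.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON with Python's standard parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    # Collection as printed by Hive: lowercase nulls parse as JSON.
    print(is_valid_json('[null,1,2,null,3,null]'))   # True
    # Collection as printed by Impala in this report: uppercase NULL is not JSON.
    print(is_valid_json('[NULL,1,2,NULL,3,NULL]'))   # False
    # Struct as printed by both engines.
    print(is_valid_json('{"i":null,"s":"str"}'))     # True
```

      Under this check, the struct output shown above is already valid JSON in both engines, while Impala's collection output is not.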

      We should decide how we handle this situation.

      1. Have a uniform NULL representation everywhere: top level, collections and structs
        • either "NULL" or "null" everywhere
      2. Have "NULL" on the top level and "null" in collections and structs, like Hive
      3. Leave everything as it is now: "NULL" at the top level and in collections, "null" in structs.
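
      The options above can be compared side by side with a small Python sketch. It is not Impala's implementation (the backend is C++); the function names and the policy names `uniform_upper`, `uniform_lower`, `hive` and `current` are assumptions introduced here purely for illustration, and only integers, strings, lists and dicts (standing in for structs) are handled.

```python
import json


def null_token(policy: str, context: str) -> str:
    """Pick the null spelling for a context: "top", "collection" or "struct"."""
    if policy == "uniform_upper":   # option 1, "NULL" everywhere
        return "NULL"
    if policy == "uniform_lower":   # option 1, "null" everywhere
        return "null"
    if policy == "hive":            # option 2, "NULL" only at the top level
        return "NULL" if context == "top" else "null"
    if policy == "current":         # option 3, "null" only inside structs
        return "null" if context == "struct" else "NULL"
    raise ValueError(f"unknown policy: {policy}")


def render(value, policy: str, context: str = "top") -> str:
    """Render a value the way a result printer might, under the given policy."""
    if value is None:
        return null_token(policy, context)
    if isinstance(value, list):     # collection
        items = (render(v, policy, "collection") for v in value)
        return "[" + ",".join(items) + "]"
    if isinstance(value, dict):     # struct
        fields = (json.dumps(k) + ":" + render(v, policy, "struct")
                  for k, v in value.items())
        return "{" + ",".join(fields) + "}"
    if isinstance(value, str):
        # Strings are quoted only when nested, as in the struct output above.
        return value if context == "top" else json.dumps(value)
    return str(value)


if __name__ == "__main__":
    row = [None, 1, 2, None, 3, None]
    for policy in ("uniform_upper", "uniform_lower", "hive", "current"):
        print(policy, render(row, policy), render({"i": None, "s": "str"}, policy))
```

      With the `current` policy the sketch reproduces the Impala output in this report, and with the `hive` policy it reproduces the Hive output.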

          People

            Assignee: Daniel Becker
            Reporter: Daniel Becker