Spark / SPARK-32501

Inconsistent NULL conversions to strings

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: SQL
    • Labels: None

    Description

      1. It is impossible to distinguish an empty string from a null element in a nested column, for instance:

      scala> Seq(Seq(""), Seq(null)).toDF().show
      +-----+
      |value|
      +-----+
      |   []|
      |   []|
      +-----+
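
The collision above can be reproduced without Spark. The following is a minimal sketch, assuming that null elements are written as empty strings when a nested value is converted. `NullCollision` and `legacyRender` are hypothetical names used only for illustration, not Spark code.

```scala
object NullCollision {
  // Hypothetical helper (not Spark's implementation): a null element
  // is written as an empty string, which is the assumed cause of the
  // identical output for Seq("") and Seq(null).
  def legacyRender(xs: Seq[String]): String =
    xs.map(x => if (x == null) "" else x).mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    println(legacyRender(Seq("")))    // prints []
    println(legacyRender(Seq(null)))  // prints [] as well
  }
}
```

Under that assumption the two inputs become the same string, so the reader of the output cannot tell which one was shown.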
      

      2. NULL is converted inconsistently for top-level values and nested columns: a top-level NULL is shown as `null`, but a NULL struct field is shown as an empty string, for instance:

      scala> sql("select named_struct('c', null), null").show
      +---------------------+----+
      |named_struct(c, NULL)|NULL|
      +---------------------+----+
      |                   []|null|
      +---------------------+----+
      

      3. `.show()` does not use the same conversion as the one that produces Hive strings, so its output differs from that of `spark-sql` (and from the SQL tests):

      spark-sql> select named_struct('c', null) as struct;
      {"c":null}
      
      scala> sql("select named_struct('c', null) as struct").show
      +------+
      |struct|
      +------+
      |    []|
      +------+
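
One way to avoid all three inconsistencies is to route top-level and nested values through a single conversion routine. The sketch below is plain Scala and only illustrates the idea. `ConsistentRender` and `render` are hypothetical names, and writing null as `null` at every depth is an assumed rule, not necessarily the behaviour Spark adopted.

```scala
object ConsistentRender {
  // Hypothetical rule: null is written as "null" at any nesting depth,
  // so top-level and nested values share one conversion.
  def render(value: Any): String = value match {
    case null       => "null"
    case xs: Seq[_] => xs.map(render).mkString("[", ", ", "]")
    case other      => other.toString
  }

  def main(args: Array[String]): Unit = {
    println(render(null))       // prints null
    println(render(Seq(null)))  // prints [null]
    println(render(Seq("")))    // prints []
  }
}
```

With one shared routine, an empty string and a null element no longer produce the same output, and a top-level null is written the same way as a nested one.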
      

            People

              Assignee: Max Gekk (maxgekk)
              Reporter: Max Gekk (maxgekk)
