Spark / SPARK-33246

Spark SQL null semantics documentation is incorrect

Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 3.0.2, 3.1.0
    • Fix Version/s: 3.0.2, 3.1.0
    • Component/s: Documentation, SQL
    • Labels: None

    Description

      The documentation of Spark SQL's null semantics is (I believe) incorrect.

      The documentation states that "NULL AND False" yields NULL, when in fact it yields False.

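      // `spark` is the active SparkSession. The import is already in scope in
      // spark-shell and is needed for .toDF and the 'symbol column syntax elsewhere.
      import spark.implicits._

      Seq[(java.lang.Boolean, java.lang.Boolean)](
        (true, null),
        (false, null),
        (null, true),
        (null, false),
        (null, null)
      )
        .toDF("left_operand", "right_operand")
        .withColumn("OR", 'left_operand || 'right_operand)
        .withColumn("AND", 'left_operand && 'right_operand)
        .show(truncate = false)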
      
      +------------+-------------+----+-----+
      |left_operand|right_operand|OR  |AND  |
      +------------+-------------+----+-----+
      |true        |null         |true|null |
      |false       |null         |null|false|
      |null        |true         |true|null |
      |null        |false        |null|false|  <---- this line is incorrect in the docs
      |null        |null         |null|null |
      +------------+-------------+----+-----+
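
For reference, the output above matches standard SQL three-valued logic, which can be sketched in plain Scala without Spark. This is only an illustration, not how Spark evaluates these expressions internally: `Option[Boolean]` stands in for a nullable boolean, `None` stands in for NULL, and the `ThreeValuedLogic` object and its helpers are hypothetical names that are not part of any Spark API.

```scala
// Sketch of SQL-style three-valued logic; None stands in for NULL.
// These helpers are illustrative and are not part of Spark's API.
object ThreeValuedLogic {
  type Tri = Option[Boolean]

  // AND: false dominates; otherwise any NULL operand makes the result NULL.
  def and(l: Tri, r: Tri): Tri = (l, r) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // OR: true dominates; otherwise any NULL operand makes the result NULL.
  def or(l: Tri, r: Tri): Tri = (l, r) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // Render None as "null" to mirror the DataFrame output.
  def show(t: Tri): String = t.map(_.toString).getOrElse("null")

  def main(args: Array[String]): Unit = {
    // Same operand pairs as in the DataFrame example.
    val rows: Seq[(Tri, Tri)] = Seq(
      (Some(true), None),
      (Some(false), None),
      (None, Some(true)),
      (None, Some(false)),
      (None, None)
    )
    for ((l, r) <- rows)
      println(s"${show(l)} | ${show(r)} | OR=${show(or(l, r))} | AND=${show(and(l, r))}")
  }
}
```

Under this model `and(None, Some(false))` is `Some(false)`: a false operand decides the result of AND no matter what the unknown operand would have been, which is the behavior the DataFrame output shows for "NULL AND False".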
      

          People

            stwhit Stuart White