  1. Spark
  2. SPARK-38790

Unexpected behaviour for "IN" operator in Spark when null is involved in array


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Spark Core, SQL
    • Labels: None
    • Environment: Tested on pyspark 3.2.1

    Description

       

      The `IN` operator in spark.sql is giving unexpected results:

      1 in (null, 1) => true

      1 in (null, 2) => null

      I would have expected the second expression to return false.

       

       

      >>> spark.sql('SELECT 1 in (null, 1)').show()
      +----------------+
      |(1 IN (NULL, 1))|
      +----------------+
      |            true|
      +----------------+
      
      >>> spark.sql('SELECT 1 in (null, 2)').show()
      +----------------+
      |(1 IN (NULL, 2))|
      +----------------+
      |            null|
      +----------------+
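
The issue is resolved as "Not A Problem", which is consistent with SQL's three-valued logic: `1 IN (NULL, 2)` expands to `1 = NULL OR 1 = 2`, and `NULL OR false` is NULL rather than false. The following is only a minimal pure-Python sketch of that rule, not Spark's implementation. The helper name `sql_in` is hypothetical, and `None` stands in for NULL.

```python
from typing import Iterable, Optional


def sql_in(value: Optional[int], candidates: Iterable[Optional[int]]) -> Optional[bool]:
    """Model SQL's three-valued `value IN (candidates...)` for a non-empty list.

    None plays the role of NULL. This is an illustrative model only.
    """
    if value is None:
        # NULL compared with anything is unknown.
        return None
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            # value = NULL is unknown; remember it, but keep looking for a match.
            saw_null = True
        elif candidate == value:
            # A definite match makes the whole OR chain true.
            return True
    # No match: unknown if any comparison was against NULL, otherwise false.
    return None if saw_null else False


print(sql_in(1, [None, 1]))  # True
print(sql_in(1, [None, 2]))  # None
print(sql_in(1, [3, 2]))     # False
```

If a strict true/false answer is needed in Spark SQL itself, wrapping the predicate as `coalesce(1 IN (NULL, 2), false)` should turn the NULL result into false.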
       

       

      Attachments

        1. image-2022-04-05-22-21-27-695.png
          38 kB
          Vishnu K Suman


          People

            Assignee: Unassigned
            Reporter: Vishnu K Suman (visuman)
            Votes: 0
            Watchers: 2
