Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-48652

Casting Issue in Spark SQL: String Column Compared to Integer Value Yields Empty Results

    XMLWordPrintableJSON

Details

    Description

      In Spark SQL, comparing a string column to an integer value can lead to unexpected results due to type casting resulting in an empty result set.

      case class Person(id: String, name: String)
      val personDF = Seq(Person("a", "amit"), Person("b", "abhishek")).toDF()
      personDF.createOrReplaceTempView("person_ddf")
      val sqlQuery = "SELECT * FROM person_ddf WHERE id <> -1"
      val resultDF = spark.sql(sqlQuery)
      resultDF.show() // Empty result due to type casting issue 
      
      

      Below is the logical and physical plan which I m getting

      == Parsed Logical Plan ==
      'Project [*]
      +- 'Filter NOT ('id = -1)
         +- 'UnresolvedRelation [person_ddf], [], false
      
      == Analyzed Logical Plan ==
      id: string, name: string
      Project [id#356, name#357]
      +- Filter NOT (cast(id#356 as int) = -1)
         +- SubqueryAlias person_ddf
            +- View (`person_ddf`, [id#356,name#357])
               +- LocalRelation [id#356, name#357]

      But when I m using the same query and table in Redshift which is based on PostGreSQL. I am getting the desired result.

      select * from person where id <> -1; 

      Explain plan obtained in Redshift.

      XN Seq Scan on person  (cost=0.00..0.03 rows=1 width=336)
        Filter: ((id)::text <> '-1'::text) 

       

      In the execution plan for Spark, the ID column is cast as an integer, while in Redshift, the ID column is cast as a varchar.

      Shouldn't Spark SQL handle this the same way as Redshift, using the datatype of the ID column rather than the datatype of -1?

       

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              abhiips07 Abhishek Singh
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: