Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
1.6.3, 2.0.2, 2.1.2, 2.2.1, 2.3.0
Description
scala> spark.version
res1: String = 2.2.1

scala> spark.sql("select cast('2017-03-01 00:00:00' as timestamp) between cast('2017-02-28' as date) and cast('2017-03-01' as date)").show
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|((CAST(CAST(2017-03-01 00:00:00 AS TIMESTAMP) AS STRING) >= CAST(CAST(2017-02-28 AS DATE) AS STRING)) AND (CAST(CAST(2017-03-01 00:00:00 AS TIMESTAMP) AS STRING) <= CAST(CAST(2017-03-01 AS DATE) AS STRING)))|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                                                                                                          false|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
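As shown above, when a timestamp is compared to a date in Spark SQL, both the timestamp and the date are downcast to string, which leads to an unexpected result: the comparison becomes lexicographic, and the string '2017-03-01 00:00:00' sorts after its own prefix '2017-03-01', so the upper-bound check fails. When I run the same SQL in Presto/Athena, I get the expected result: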
select cast('2017-03-01 00:00:00' as timestamp) between cast('2017-02-28' as date) and cast('2017-03-01' as date)
  _col0
1 true
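The difference between the two results can be reproduced outside Spark. The following is a minimal sketch in plain Scala, assuming the STRING casts in the plan above render the values exactly as the literals shown; the object and method names are hypothetical and only illustrate the two comparison strategies, not Spark's or Presto's actual implementation.

```scala
import java.sql.{Date, Timestamp}

// Hypothetical helper object, for illustration only.
object StringVsTimestampCompare {
  // Assumed string renderings of the casted values from the query above.
  val ts = "2017-03-01 00:00:00"
  val lower = "2017-02-28"
  val upper = "2017-03-01"

  // BETWEEN evaluated on strings: a plain lexicographic comparison.
  def betweenAsStrings(v: String, lo: String, hi: String): Boolean =
    v.compareTo(lo) >= 0 && v.compareTo(hi) <= 0

  // BETWEEN evaluated on timestamps: each date is promoted to midnight of that day.
  def betweenAsTimestamps(v: Timestamp, lo: Date, hi: Date): Boolean = {
    val loTs = new Timestamp(lo.getTime)
    val hiTs = new Timestamp(hi.getTime)
    !v.before(loTs) && !v.after(hiTs)
  }

  def main(args: Array[String]): Unit = {
    // "2017-03-01 00:00:00" is longer than its prefix "2017-03-01", so it sorts after it.
    println(betweenAsStrings(ts, lower, upper)) // prints false
    // Midnight on 2017-03-01 equals the upper bound once both sides are timestamps.
    println(betweenAsTimestamps(Timestamp.valueOf(ts), Date.valueOf(lower), Date.valueOf(upper))) // prints true
  }
}
```

The first call mirrors the string comparison visible in the Spark plan; the second mirrors what one would expect if the dates were promoted to timestamps before comparing.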
Is this a bug in Spark, or a feature?