Details
Bug
Status: Resolved
Major
Resolution: Cannot Reproduce
3.0.0, 3.0.1
None
None
Important
Description
How to reproduce this behavior?
- TZ="America/Los_Angeles" ./bin/spark-shell
- sql("set spark.sql.hive.convertMetastoreOrc=true")
- sql("set spark.sql.orc.impl=hive")
- sql("create table t_spark(col timestamp) stored as orc;")
- sql("insert into t_spark values (cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp));")
- sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)
This will return empty results, which is incorrect.
- sql("set spark.sql.orc.impl=native")
- sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)
This will return 1 row, which is the expected output.
With spark.sql.hive.convertMetastoreOrc=true and spark.sql.orc.impl=hive, the same query returns the correct result once ORC filter pushdown is turned off.
- sql("set spark.sql.orc.filterPushdown=false")
- sql("select col, date_format(col, 'DD') from t_spark where col = cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp);").show(false)
This will return 1 row, which is the expected output.
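The three cases above can be compared side by side with a short script pasted into the same spark-shell session. This is only a sketch: `countMatches` is a hypothetical helper that is not part of the original report, it assumes the table t_spark has already been created and populated as described above, and it counts matching rows instead of calling show(false).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not from the original report): applies one combination
// of ORC settings, runs the filter query, and returns the number of matching rows.
def countMatches(spark: SparkSession, orcImpl: String, pushdown: Boolean): Long = {
  spark.sql("set spark.sql.hive.convertMetastoreOrc=true")
  spark.sql(s"set spark.sql.orc.impl=$orcImpl")
  spark.sql(s"set spark.sql.orc.filterPushdown=$pushdown")
  spark.sql(
    "select col from t_spark where col = " +
      "cast('2100-01-01 01:33:33.123America/Los_Angeles' as timestamp)"
  ).count()
}

// In spark-shell, `spark` is the predefined session; t_spark must already exist.
for ((impl, pushdown) <- Seq(("hive", true), ("native", true), ("hive", false))) {
  val rows = countMatches(spark, impl, pushdown)
  println(s"orc.impl=$impl filterPushdown=$pushdown -> $rows row(s)")
}
```

According to the description, the first combination is the one that returned no rows, while the other two returned 1 row. Because the issue was resolved as Cannot Reproduce, the counts may all be 1 on other builds or environments.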