Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Versions: 3.0.0, 3.0.1, 3.1.0
Description
The `date_trunc` SQL expression returns incorrect results for the 'minute' format string.
Context: With the 'minute' format string, `date_trunc` should truncate a timestamp so that its seconds field is set to zero.
Repro (run the following commands in spark-shell):
spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
spark.sql("SELECT date_trunc('minute', '1769-10-17 17:10:02')").show()
Spark currently returns the incorrect value
1769-10-17 17:10:02
instead of the expected value
1769-10-17 17:10:00
This happens because `truncTimestamp` in `org.apache.spark.sql.catalyst.util.DateTimeUtils` incorrectly assumes that time zone offsets never have second granularity, and therefore skips the time zone adjustment when truncating a timestamp to the minute.
The same assumption is currently applied when truncating timestamps to the microsecond, millisecond, second, or minute.
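
The effect can be reproduced outside Spark with plain `java.time`. The sketch below is illustrative only and is not Spark's implementation: the object name `MinuteTruncSketch` and the helpers `truncNaive` and `truncZoneAware` are hypothetical. It assumes the JVM's bundled time zone data, in which America/Los_Angeles uses a local mean time offset of -07:52:58 for dates in 1769. With that offset, the local time 17:10:02 falls exactly on a UTC minute boundary, so rounding the UTC microsecond count down to a whole minute changes nothing, whereas truncating the local wall-clock time gives 17:10:00.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.temporal.ChronoUnit

// Hypothetical helper object for illustration; not part of Spark.
object MinuteTruncSketch {
  private val MicrosPerSecond = 1000000L
  private val MicrosPerMinute = 60L * MicrosPerSecond

  // Microseconds since the epoch for a local date-time in the given zone.
  def toMicros(local: LocalDateTime, zone: ZoneId): Long = {
    val instant = local.atZone(zone).toInstant
    Math.addExact(
      Math.multiplyExact(instant.getEpochSecond, MicrosPerSecond),
      instant.getNano / 1000L)
  }

  // Local date-time in the given zone for microseconds since the epoch.
  def toLocal(micros: Long, zone: ZoneId): LocalDateTime = {
    val seconds = Math.floorDiv(micros, MicrosPerSecond)
    val nanos = Math.floorMod(micros, MicrosPerSecond) * 1000L
    LocalDateTime.ofInstant(Instant.ofEpochSecond(seconds, nanos), zone)
  }

  // Zone-unaware truncation: round the UTC microsecond count down to a
  // whole minute. This is only correct when the zone offset is itself a
  // whole number of minutes.
  def truncNaive(micros: Long): Long =
    micros - Math.floorMod(micros, MicrosPerMinute)

  // Zone-aware truncation: truncate the local wall-clock time to the
  // minute, then convert back to microseconds since the epoch.
  def truncZoneAware(micros: Long, zone: ZoneId): Long =
    toMicros(toLocal(micros, zone).truncatedTo(ChronoUnit.MINUTES), zone)
}
```

To try it in spark-shell or a Scala REPL, convert `LocalDateTime.of(1769, 10, 17, 17, 10, 2)` with `toMicros` for `ZoneId.of("America/Los_Angeles")`, apply each truncation, and convert the results back with `toLocal` to compare them.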