Spark / SPARK-33404

"date_trunc" expression returns incorrect results

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.0.1, 3.1.0
    • Fix Version/s: 3.0.2, 3.1.0
    • Component/s: SQL

    Description

      The `date_trunc` SQL expression returns incorrect results for the 'minute' format string.

      Context: truncating to 'minute' should set the seconds field of the timestamp to zero.

      Repro (run the following commands in spark-shell):

      spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
      spark.sql("SELECT date_trunc('minute', '1769-10-17 17:10:02')").show()

      Spark currently returns the incorrect value

      1769-10-17 17:10:02

      instead of the expected value

      1769-10-17 17:10:00

      This happens because truncTimestamp in org.apache.spark.sql.catalyst.util.DateTimeUtils incorrectly assumes that a time zone offset never has second granularity, so it skips the time zone adjustment when truncating a timestamp to the minute.
      The same assumption is currently applied when truncating timestamps to microsecond, millisecond, second, or minute.
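
      The effect can be illustrated outside Spark with a minimal sketch that uses only java.time. It is not Spark's truncTimestamp code: the object DateTruncSketch and both helper methods are hypothetical names introduced for illustration. The sketch relies on the fact that the tz database records local mean time for America/Los_Angeles before 1883, giving an offset of -07:52:58, so truncating the UTC instant to a whole minute does not zero the local seconds.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.temporal.ChronoUnit

// Hypothetical illustration only; names below are not part of Spark.
object DateTruncSketch {
  val zone: ZoneId = ZoneId.of("America/Los_Angeles")
  val input: LocalDateTime = LocalDateTime.parse("1769-10-17T17:10:02")

  // Mirrors the flawed assumption: truncate the UTC epoch seconds to a
  // multiple of 60 without first applying the zone offset.
  def truncateIgnoringOffset(local: LocalDateTime): LocalDateTime = {
    val epochSeconds = local.atZone(zone).toEpochSecond
    // floorMod keeps the remainder non-negative for pre-1970 instants.
    val truncated = epochSeconds - Math.floorMod(epochSeconds, 60L)
    Instant.ofEpochSecond(truncated).atZone(zone).toLocalDateTime
  }

  // Truncates the local wall-clock time, which is what 'minute' should mean.
  def truncateInLocalTime(local: LocalDateTime): LocalDateTime =
    local.truncatedTo(ChronoUnit.MINUTES)

  def main(args: Array[String]): Unit = {
    println(s"offset:          ${zone.getRules.getOffset(input)}")
    println(s"ignoring offset: ${truncateIgnoringOffset(input)}")
    println(s"in local time:   ${truncateInLocalTime(input)}")
  }
}
```

      Under these assumptions, the first helper leaves the seconds at 02 because the corresponding UTC instant already falls on a whole minute, while the second helper yields the expected 17:10:00.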

          People

            Assignee: Utkarsh Agarwal (utkarsh39)
            Reporter: Utkarsh Agarwal (utkarsh39)