Details
- Type: Umbrella
- Status: Resolved
- Priority: Major
- Resolution: Done
- Affects Version/s: 2.4.0
- None
Description
Spark 2.4 and earlier versions use a hybrid calendar (Julian + Gregorian) in date/timestamp parsing, functions, and expressions. This ticket aims to switch Spark to the Proleptic Gregorian calendar and to use the java.time classes introduced in Java 8 for timestamp/date manipulations. One purpose of the switch is to conform to the SQL standard, which assumes this calendar.
Release note:
Spark 3.0 has switched to the Proleptic Gregorian calendar for parsing, formatting, and converting dates and timestamps, as well as for extracting sub-components such as years and days. It uses the Java 8 API classes from the java.time packages, which are based on the ISO chronology. Previous versions of Spark performed those operations using the hybrid calendar (Julian + Gregorian). The change might affect results for dates and timestamps before October 15, 1582 (Gregorian).
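The effect described in the release note can be illustrated outside Spark with plain JDK classes. The sketch below is not Spark code: the class name `CalendarSwitchDemo` and its helper methods are hypothetical, and it assumes that `java.util.GregorianCalendar` with its default cutover stands in for the hybrid calendar while `java.time.LocalDate` stands in for the Proleptic Gregorian one. It compares the epoch-day number each calendar assigns to the same year/month/day label.

```java
import java.time.LocalDate;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarSwitchDemo {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Days since 1970-01-01 under the Proleptic Gregorian calendar (java.time, ISO chronology).
    static long prolepticEpochDay(int year, int month, int day) {
        return LocalDate.of(year, month, day).toEpochDay();
    }

    // Days since 1970-01-01 under the hybrid Julian + Gregorian calendar.
    // GregorianCalendar treats dates before its default cutover (1582-10-15) as Julian.
    static long hybridEpochDay(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();                   // reset all fields, including time of day
        cal.set(year, month - 1, day); // Calendar months are 0-based
        return Math.floorDiv(cal.getTimeInMillis(), MILLIS_PER_DAY);
    }

    // How many days apart the two calendars place the same year/month/day label.
    static long shiftInDays(int year, int month, int day) {
        return hybridEpochDay(year, month, day) - prolepticEpochDay(year, month, day);
    }

    public static void main(String[] args) {
        System.out.println(shiftInDays(1582, 10, 15)); // 0: both calendars agree from the cutover on
        System.out.println(shiftInDays(1582, 10, 4));  // 10: last Julian day in the hybrid calendar
        System.out.println(shiftInDays(2000, 1, 1));   // 0: modern dates are unaffected
    }
}
```

Under these assumptions, a label such as 1582-10-04 refers to a physical day ten days later in the hybrid calendar than in the Proleptic Gregorian one, which is why results for values before October 15, 1582 may differ between Spark 2.4 and 3.0.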
Issue Links
- is related to
  - SPARK-31702 Old POSIXlt, POSIXct and Date become corrupt due to calendar difference (Open)
  - SPARK-30688 Spark SQL Unix Timestamp produces incorrect result with unix_timestamp UDF (In Progress)
  - SPARK-31404 file source backward compatibility after calendar switch (Resolved)
  - SPARK-30760 Port `millisToDays` and `daysToMillis` on Java 8 time API (Resolved)
- relates to
  - SPARK-18381 Wrong date conversion between spark and python for dates before 1583 (Open)
  - SPARK-30951 Potential data loss for legacy applications after switch to proleptic Gregorian calendar (Resolved)
- supercedes
  - SPARK-34675 TimeZone inconsistencies when JVM and session timezones are different (Reopened)
- links to