Description
Currently, Spark converts dates to/from timestamps with millisecond precision, but internally Catalyst's TimestampType values are stored as microseconds since the epoch. When such a conversion is needed in other date-time functions like DateTimeUtils.monthsBetween, the function has to convert microseconds to milliseconds and only then to days (see https://github.com/apache/spark/blob/06217cfded8d32962e7c54c315f8e684eb9f0999/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L577-L580), which adds overhead without any benefit.
Previously, the intermediate milliseconds made sense because they could be passed directly to TimeZone.getOffset, but Spark has since switched to the Java 8 time API and ZoneId, so conversions to milliseconds are no longer needed.
This ticket aims to replace millisToDays with microsToDays, and daysToMillis with daysToMicros, in DateTimeUtils.
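For illustration only, here is a minimal sketch of how such helpers could be written directly on top of ZoneId, skipping the millisecond step entirely. The object name MicrosDaysSketch and the MICROS_PER_SECOND constant are assumptions for this sketch, not the actual DateTimeUtils API:

{code:scala}
import java.time.{Instant, LocalDate, ZoneId}

// Hypothetical sketch, not the actual DateTimeUtils code: converts
// microseconds since the epoch to days since the epoch (and back)
// via the Java 8 time API, with no intermediate milliseconds.
object MicrosDaysSketch {
  private final val MICROS_PER_SECOND = 1000000L

  // Days since epoch for the local date of the given microsecond timestamp.
  def microsToDays(micros: Long, zoneId: ZoneId): Int = {
    val instant = Instant.ofEpochSecond(
      Math.floorDiv(micros, MICROS_PER_SECOND),
      Math.floorMod(micros, MICROS_PER_SECOND) * 1000L)
    instant.atZone(zoneId).toLocalDate.toEpochDay.toInt
  }

  // Microseconds since epoch at local midnight of the given epoch day.
  def daysToMicros(days: Int, zoneId: ZoneId): Long = {
    val instant = LocalDate.ofEpochDay(days).atStartOfDay(zoneId).toInstant
    Math.addExact(
      Math.multiplyExact(instant.getEpochSecond, MICROS_PER_SECOND),
      instant.getNano / 1000L)
  }
}
{code}

For example, MicrosDaysSketch.microsToDays(0L, ZoneId.of("America/Los_Angeles")) yields -1, because the instant 1970-01-01 00:00:00 UTC falls on 1969-12-31 in that zone; the time-zone-aware behavior is preserved without ever materializing milliseconds.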