Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 4.0.0-beta-1
Description
HIVE-25268 switched the internal implementation of date_format from java.text.SimpleDateFormat to java.time.format.DateTimeFormatter in order to avoid some inconsistencies (arguably wrong results) for dates prior to 1900.
However, the API of the underlying formatter is exposed to the user since they need to pass patterns that are valid for the respective formatter.
Changing the formatter implementation resolves the bugs in HIVE-25268 but also leads to backward incompatible behavior.
Consider for example the following query where the letter 'u' is used to format the date:
select date_format('2023-09-08','u');
The query above returns a different result depending on which formatter is used underneath.
In SimpleDateFormat, the letter 'u' means day of the week so the query returns 5.
In DateTimeFormatter, the letter 'u' means year so the query returns 2023.
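The difference can be reproduced outside Hive with plain JDK classes. The following is a minimal sketch, not Hive's implementation: the class name PatternLetterU and its two helper methods are hypothetical, and only the behavior of the two JDK formatters is taken from the description above.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class; not part of Hive.
final class PatternLetterU {

    // Formats an ISO date (yyyy-MM-dd) with the legacy java.text formatter.
    static String withSimpleDateFormat(String isoDate, String pattern) {
        // Convert to java.util.Date at local midnight; SimpleDateFormat also
        // uses the default time zone, so the calendar day is preserved.
        Date date = Date.from(
            LocalDate.parse(isoDate).atStartOfDay(ZoneId.systemDefault()).toInstant());
        return new SimpleDateFormat(pattern, Locale.ROOT).format(date);
    }

    // Formats the same ISO date with the java.time formatter.
    static String withDateTimeFormatter(String isoDate, String pattern) {
        return LocalDate.parse(isoDate)
            .format(DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
    }

    public static void main(String[] args) {
        // 'u' is the day number of the week in SimpleDateFormat (1 = Monday).
        System.out.println(withSimpleDateFormat("2023-09-08", "u"));  // 5 (Friday)
        // 'u' is the year in DateTimeFormatter.
        System.out.println(withDateTimeFormatter("2023-09-08", "u")); // 2023
    }
}
```

The same pattern string is valid for both formatters, so the change in meaning is silent: no error is raised, only the result differs.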
The goal of this ticket is to make the underlying formatter of the date_format function configurable by the end user via a property, similarly to what was done in HIVE-25576. For this purpose we could reuse the same property: hive.datetime.formatter
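One way to picture the proposal is a small dispatch on the property value. This is only a sketch under assumptions: the class ConfigurableDateFormat, the enum FormatterKind, and the value names SIMPLE and DATETIME are illustrative and are not confirmed by this ticket; the actual Hive code and accepted values may differ.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;

// Hypothetical sketch of a formatter switch; not Hive's actual code.
final class ConfigurableDateFormat {

    // Assumed value names for the hive.datetime.formatter property.
    enum FormatterKind { SIMPLE, DATETIME }

    // Formats an ISO date (yyyy-MM-dd) using the formatter selected by the
    // property value; unknown values raise IllegalArgumentException.
    static String dateFormat(String isoDate, String pattern, String propertyValue) {
        FormatterKind kind =
            FormatterKind.valueOf(propertyValue.trim().toUpperCase(Locale.ROOT));
        LocalDate day = LocalDate.parse(isoDate);
        switch (kind) {
            case SIMPLE:
                // Legacy behavior: pattern letters follow java.text.SimpleDateFormat.
                Date date = Date.from(day.atStartOfDay(ZoneId.systemDefault()).toInstant());
                return new SimpleDateFormat(pattern, Locale.ROOT).format(date);
            case DATETIME:
            default:
                // Newer behavior: pattern letters follow java.time.format.DateTimeFormatter.
                return day.format(DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        System.out.println(dateFormat("2023-09-08", "u", "SIMPLE"));   // 5
        System.out.println(dateFormat("2023-09-08", "u", "DATETIME")); // 2023
    }
}
```

Patterns whose letters mean the same thing in both formatters, such as yyyy-MM-dd for this date, produce identical output under either setting, so only queries that rely on diverging letters would need the legacy option.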
Issue Links
- relates to
  - HIVE-25268 date_format udf returns wrong results for dates prior to 1900 if the local timezone is other than UTC (Closed)
  - HIVE-25576 Configurable datetime formatter for unix_timestamp, from_unixtime (Closed)