A unix_timestamp conversion where the year is given in 2-digit format returns a value that differs from Hive's result:
select from_unixtime(unix_timestamp('31-AUG-94', 'dd-MMM-yy'),'yyyyMMdd');
Impala Result: 20940831
Hive Result: 19940831
If the 2-digit year is equal to '99', Impala returns NULL. The condition that appears to be responsible:
if (tok_len < 4 && dt_result->year < 99) dt_result->year += 2000;
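To make the effect of this condition concrete, here is a minimal sketch that isolates it in a standalone function. The function name ApplyCurrentRule and its plain int parameters are hypothetical stand-ins for the parser state (tok_len, dt_result->year); only the condition itself is taken from the line above.

```cpp
// Hypothetical stand-in for the quoted condition; not Impala's actual function.
// tok_len: length of the year token in the input, year: parsed numeric value.
int ApplyCurrentRule(int tok_len, int year) {
  // Same test as the quoted line: short year tokens below 99 get 2000 added.
  if (tok_len < 4 && year < 99) year += 2000;
  return year;
}
```

With this rule, ApplyCurrentRule(2, 94) gives 2094, which matches the Impala result above, while ApplyCurrentRule(2, 99) leaves the year at 99. That is consistent with the NULL reported for '99', assuming a year of 99 is later rejected as out of range.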
Improvements to evaluate:
- Include the year '99' in the condition, because excluding it can lead to undesired Impala behaviour.
- Impala's unix_timestamp conversion should return the same value as Hive.
- It may be worth implementing a parameter (e.g. a 'Century Break', as found in other systems) that lets the user choose which century is added in such conversions, since automated logic cannot match every user's expectations.
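One possible shape for such a century break is a sliding 100-year window, sketched below. The function ResolveTwoDigitYear and its window_start parameter are hypothetical and not part of Impala. For comparison, Java's SimpleDateFormat resolves a 2-digit year into the window from 80 years before to 20 years after the time the formatter was created, which may explain why Hive returns 1994 here.

```cpp
// Hypothetical helper, not an existing Impala function.
// Maps a 2-digit year (0..99) to the unique 4-digit year inside the
// 100-year window [window_start, window_start + 99].
// Assumes window_start is positive and two_digit_year is in 0..99.
int ResolveTwoDigitYear(int two_digit_year, int window_start) {
  // First year of the century that contains window_start, e.g. 1900 for 1935.
  int century = window_start - window_start % 100;
  int candidate = century + two_digit_year;
  // Years that fall before the window belong to the following century.
  if (candidate < window_start) candidate += 100;
  return candidate;
}
```

With an illustrative window_start of 1935, ResolveTwoDigitYear(94, 1935) yields 1994 and ResolveTwoDigitYear(99, 1935) yields 1999, while ResolveTwoDigitYear(14, 1935) yields 2014. Exposing window_start as a query option or startup flag would be one way to let users pick the break themselves.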