IMPALA-3894

unix_timestamp date conversion for 2-digit years is broken

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: Impala 2.10.0
    • Component/s: Backend
    • Labels:

      Description

      The unix_timestamp conversion of a date with a 2-digit year returns a value that differs from Hive's result:
      select from_unixtime(unix_timestamp('31-AUG-94', 'dd-MMM-yy'),'yyyyMMdd');
      Impala Result: 20940831
      Hive Result: 19940831
      If the 2-digit year is equal to '99', Impala returns NULL.
      Logic:
      if (tok_len < 4 && dt_result->year < 99) dt_result->year += 2000;
      timestamp-parse-util.h#L466
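
To make the two reported symptoms concrete, the quoted condition can be reproduced in isolation. This is only a sketch: `ExpandYearAsReported` is a hypothetical name, not a function from the Impala code base, and it models nothing beyond the single `if` statement quoted above.

```cpp
// Standalone illustration of the condition quoted in the report.
// ExpandYearAsReported is a hypothetical helper, not Impala code.
int ExpandYearAsReported(int year, int tok_len) {
  // Year tokens shorter than 4 characters with a value below 99 get 2000 added.
  if (tok_len < 4 && year < 99) year += 2000;
  return year;
}

// ExpandYearAsReported(94, 2) returns 2094, matching the 20940831 result.
// ExpandYearAsReported(99, 2) returns 99, because 99 < 99 is false.
```

The second case leaves the year at 99, which would be consistent with the reported NULL if such a year falls outside the timestamp range Impala accepts. That link is an inference, not something the report states.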
      Improvement to evaluate:

      • Include the year '99' in the condition, because excluding it leads to undesired Impala behaviour (a NULL result).
      • Make Impala's unix_timestamp conversion return the same value as Hive.
      • Consider adding a parameter (e.g. a 'Century Break', as found in other products) that lets the user choose which century is added in such conversions, since automated logic cannot match every user's expectations.
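
One way to read the 'Century Break' suggestion is as a configurable pivot: 2-digit years at or above the pivot map to the 1900s, and the rest map to the 2000s. The sketch below assumes that reading. The name `ExpandWithCenturyBreak`, its parameters, and the -1 error value are all hypothetical, and it does not describe how Hive or the eventual Impala fix resolves 2-digit years.

```cpp
// Hypothetical helper sketching a configurable century break.
// Years at or above century_break land in the 1900s, the rest in the 2000s.
// Returns -1 when either argument is outside the range 0..99.
int ExpandWithCenturyBreak(int two_digit_year, int century_break) {
  if (two_digit_year < 0 || two_digit_year > 99) return -1;
  if (century_break < 0 || century_break > 99) return -1;
  return two_digit_year >= century_break ? 1900 + two_digit_year
                                         : 2000 + two_digit_year;
}

// With a break of 50:
//   ExpandWithCenturyBreak(94, 50) returns 1994
//   ExpandWithCenturyBreak(99, 50) returns 1999
//   ExpandWithCenturyBreak(17, 50) returns 2017
```

With a break of 50, the example from the description ('31-AUG-94') would resolve to 1994, and '99' no longer needs special handling.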

              People

              • Assignee: Tianyi Wang (tianyiwang)
              • Reporter: Adriano Simone (adriano.simone_impala_b5e7)