IMPALA-3894

unix_timestamp date conversion for 2-digit years is broken


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: Impala 2.10.0
    • Component/s: Backend

    Description

      unix_timestamp conversions where the year is given in 2-digit format return values that differ from Hive's result:
      select from_unixtime(unix_timestamp('31-AUG-94', 'dd-MMM-yy'),'yyyyMMdd');
      Impala Result: 20940831
      Hive Result: 19940831
      If the 2-digit year is equal to '99', Impala returns NULL.
      Logic:
      if (tok_len < 4 && dt_result->year < 99) dt_result->year += 2000;
      timestamp-parse-util.h#L466
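
      To make the effect of that condition concrete, here is a minimal sketch that isolates it as a pure function. The name AdjustTwoDigitYear and its free-standing form are hypothetical; only the condition itself is taken from the line quoted above. It shows that '94' becomes 2094 while '99' is left as the literal year 99. A plausible reason for the NULL, stated here as an assumption, is that year 99 falls outside Impala's supported timestamp range.

      ```cpp
      // Hypothetical stand-alone version of the adjustment quoted in this report.
      // tok_len is the length of the year token in the format string ("yy" -> 2).
      constexpr int AdjustTwoDigitYear(int year, int tok_len) {
        // Mirrors: if (tok_len < 4 && dt_result->year < 99) dt_result->year += 2000;
        // The strict '<' means a parsed value of exactly 99 is never adjusted.
        return (tok_len < 4 && year < 99) ? year + 2000 : year;
      }
      ```

      With this sketch, AdjustTwoDigitYear(94, 2) yields 2094, which matches the Impala result 20940831 above, and AdjustTwoDigitYear(99, 2) yields 99.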
      Improvements to evaluate:

      • Include the year '99' in the condition, because excluding it leads to undesired Impala behaviour (a NULL result).
      • Make Impala's unix_timestamp conversion return the same value as Hive.
      • Consider adding a parameter (e.g. a 'Century Break', as found in other systems) that lets the user choose which century is added in such conversions, since automated logic cannot match every user's expectations.
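
      The 'Century Break' idea can be sketched as a 100-year window whose first year is configurable. This is only an illustration under stated assumptions, not the fix that was actually committed: the function ResolveTwoDigitYear and the parameter century_break_year are hypothetical names. Java's SimpleDateFormat resolves "yy" with a window of 80 years before and 20 years after the time the formatter is created; assuming Hive's result comes from that rule, a window starting 80 years before the current year would approximate it at year granularity.

      ```cpp
      // Hypothetical sketch: map a two-digit year into the 100-year window
      // [century_break_year, century_break_year + 99].
      // Assumes 0 <= yy <= 99 and century_break_year > 0.
      constexpr int ResolveTwoDigitYear(int yy, int century_break_year) {
        // Try the century that contains the window start first; if the result
        // falls before the window start, it belongs to the next century.
        return (century_break_year - century_break_year % 100 + yy) < century_break_year
                   ? century_break_year - century_break_year % 100 + yy + 100
                   : century_break_year - century_break_year % 100 + yy;
      }
      ```

      For example, with a window starting in 1936, ResolveTwoDigitYear(94, 1936) gives 1994 and ResolveTwoDigitYear(35, 1936) gives 2035. A window starting in 2000 reproduces the "always add 2000" behaviour while also covering '99', which becomes 2099.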


          People

            Assignee: Tianyi Wang
            Reporter: Adriano Simone