Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-27586

Parse dates from strings ignoring trailing (potentialy) invalid chars

    XMLWordPrintableJSON

Details

    Description

      The goal of this ticket is to extract and return a valid date from a string value when there is a valid date prefix in the string.

      The following table contains a few illustrative examples highlighting what happens now and what will happen after the proposed changes to ignore trailing characters. HIVE-20007 introduced some behavior changes around this area so the table also displays what was the Hive behavior before that change.

      ID String value Before HIVE-20007 Current behavior Ignore trailing chars
      1 2023-08-03_16:02:00 2023-08-03 null 2023-08-03
      2 2023-08-03-16:02:00 2023-08-03 null 2023-08-03
      3 2023-08-0316:02:00 2024-06-11 null 2023-08-03
      4 03-08-2023 0009-02-12 null 0003-08-20
      5 2023-08-03 GARBAGE 2023-08-03 2023-08-03 2023-08-03
      6 2023-08-03TGARBAGE 2023-08-03 2023-08-03 2023-08-03
      7 2023-08-03_GARBAGE 2023-08-03 null 2023-08-03

      This change partially (see example 3 and 4) restores the behavior changes introduced by HIVE-20007 and at the same time makes the current behavior of handling trailing invalid chars more uniform.

      This change will have an impact on various Hive SQL functions and operators (+/-) that accept dates from string values. A partial list of affected functions is outlined below:

      • CAST (V AS DATE)
      • CAST (V AS TIMESTAMP)
      • TO_DATE
      • DATE_ADD
      • DATE_DIFF
      • WEEKOFYEAR
      • DAYOFWEEK
      • TRUNC

      Attachments

        Issue Links

          Activity

            People

              zabetak Stamatis Zampetakis
              zabetak Stamatis Zampetakis
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: