HIVE-20007 (sub-task of HIVE-12191, Hive timestamp problems)

Hive should carry out timestamp computations in UTC

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Hive

    Description

      Hive currently uses the "local" time of a java.sql.Timestamp to represent the SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use Timestamp#getYear() and similar methods to implement SQL functions like year.

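      When the SQL session's time zone observes daylight saving time (DST), such as America/Los_Angeles, which alternates between PST and PDT, some local times cannot be represented because the zone skips them when clocks move forward. For example, 02:10 on 2015-03-08 does not exist in that zone and is shifted to 03:10: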

      hive> select TIMESTAMP '2015-03-08 02:10:00.101';
      2015-03-08 03:10:00.101
      

      Using UTC instead of the SQL session time zone as the underlying zone for a java.sql.Timestamp avoids this bug, while still returning correct values for getYear etc. Using UTC as the convenience representation (timestamp without time zone has no real zone) would make timestamp calculations more consistent and avoid similar problems in the future.
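
The skipped-hour behaviour described above can be reproduced outside Hive. The following is a minimal sketch, not Hive's implementation: it uses java.time rather than java.sql.Timestamp so that it does not have to change the JVM default zone, and the class and method names (DstGapSketch, roundTrip) are hypothetical. It interprets the zone-less literal from the example in a given zone and reads the wall-clock value back.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical illustration class; not part of Hive.
final class DstGapSketch {

    // Interpret a zone-less wall-clock value in the given zone,
    // then read the wall-clock value back from the resulting instant.
    static LocalDateTime roundTrip(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        // The literal from the example: 2015-03-08 02:10:00.101
        LocalDateTime literal = LocalDateTime.of(2015, 3, 8, 2, 10, 0, 101_000_000);

        // 02:10 falls in the spring-forward gap of this zone, so java.time
        // moves it later by the length of the gap (one hour).
        System.out.println(roundTrip(literal, ZoneId.of("America/Los_Angeles")));
        // prints 2015-03-08T03:10:00.101

        // UTC has no DST transitions, so every wall-clock value survives.
        System.out.println(roundTrip(literal, ZoneOffset.UTC));
        // prints 2015-03-08T02:10:00.101
    }
}
```

Because UTC has no gaps or overlaps, using it as the underlying zone keeps field accessors such as the year or hour equal to what the user wrote.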

      Notably, this would break the unix_timestamp UDF, which specifies that its result is with respect to "the default timezone and default locale". That function would need to be updated to use the zone given by System.getProperty("user.timezone").
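
One way to picture the unix_timestamp change is the sketch below. It is an assumption-laden illustration, not the patch attached to this issue: the class UnixTimestampSketch, its methods, and the fixed "yyyy-MM-dd HH:mm:ss" pattern are all hypothetical. It resolves the zone from the user.timezone system property and uses that zone only at the point where a zone-less timestamp is converted to epoch seconds.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration class; not part of Hive.
final class UnixTimestampSketch {

    // Assumed input pattern for this sketch only.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Zone named by the user.timezone system property; falls back to the
    // JVM default when the property is unset or empty.
    static ZoneId sessionZone() {
        String id = System.getProperty("user.timezone");
        return (id == null || id.isEmpty()) ? ZoneId.systemDefault() : ZoneId.of(id);
    }

    // Convert a zone-less timestamp string to seconds since the epoch,
    // interpreting the wall-clock value in the given zone.
    static long toEpochSeconds(String text, ZoneId zone) {
        return LocalDateTime.parse(text, FORMAT).atZone(zone).toEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(toEpochSeconds("2015-03-08 12:00:00", sessionZone()));
    }
}
```

The design point is that internal timestamp arithmetic stays in UTC, and the session zone is applied only by the functions whose contract mentions the default time zone.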

      Attachments

        1. HIVE-20007.patch
          4.14 MB
          jcamachorodriguez

            People

              Assignee: jcamacho Jesús Camacho Rodríguez
              Reporter: rdblue Ryan Blue