HIVE-9482: Hive parquet timestamp compatibility

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 1.2.0
    • Component/s: File Formats
    • Labels: None

    Description

      In the current Hive implementation, timestamps are stored in UTC (converted from the current time zone), based on the original Parquet timestamp spec.
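
      To make the conversion concrete, here is a minimal sketch of what "stored in UTC (converted from the current time zone)" means for a wall-clock timestamp. The class and method names are hypothetical and not taken from the Hive code base, and the sketch ignores how Parquet physically encodes the value.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Illustration only: hypothetical names, not Hive's actual classes.
public class TimestampUtcConversion {

    // Write path: treat the wall-clock value as local time in the writer's
    // zone and store the equivalent UTC wall-clock value.
    public static LocalDateTime toStoredUtc(LocalDateTime local, ZoneId writerZone) {
        return local.atZone(writerZone)
                    .withZoneSameInstant(ZoneOffset.UTC)
                    .toLocalDateTime();
    }

    // Read path: the inverse, from the stored UTC value back to the
    // reader's local wall-clock value.
    public static LocalDateTime fromStoredUtc(LocalDateTime stored, ZoneId readerZone) {
        return stored.atZone(ZoneOffset.UTC)
                     .withZoneSameInstant(readerZone)
                     .toLocalDateTime();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("America/Los_Angeles");
        LocalDateTime local = LocalDateTime.of(2015, 1, 27, 10, 0);

        LocalDateTime stored = toStoredUtc(local, zone);
        System.out.println(stored);                      // 2015-01-27T18:00
        System.out.println(fromStoredUtc(stored, zone)); // 2015-01-27T10:00
    }
}
```

      A tool that reads the stored value without applying the inverse conversion would see 18:00 instead of 10:00, which is the kind of mismatch this issue describes.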

      However, we found that this is not compatible with other tools, and after some investigation it is not how other file formats, or even some databases, handle timestamps (the Hive timestamp is closer to a 'timestamp without timezone' data type).

      This is the first part of the fix, which restores compatibility with Parquet timestamp files generated by external tools by skipping the conversion on read.

      A later fix will change the write path to not convert, and stop the read conversion even for files written by Hive itself.
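
      The read-side behaviour described for this first part of the fix can be sketched roughly as follows. This is an assumption-laden illustration: the `writtenByHive` flag stands in for whatever mechanism Hive uses to tell its own files from externally generated ones, which this description does not specify, and all names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Illustration only: hypothetical names, not Hive's actual reader.
public class ParquetTimestampReadSketch {

    // 'writtenByHive' is an assumed input; how the origin of a file is
    // detected is outside the scope of this sketch.
    public static LocalDateTime readTimestamp(LocalDateTime stored,
                                              boolean writtenByHive,
                                              ZoneId readerZone) {
        if (!writtenByHive) {
            // External tools store the wall-clock value unchanged, so the
            // reader returns it without any conversion.
            return stored;
        }
        // Files written by Hive still hold UTC values at this stage, so the
        // reader converts them back to the local wall-clock value.
        return stored.atZone(ZoneOffset.UTC)
                     .withZoneSameInstant(readerZone)
                     .toLocalDateTime();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("America/Los_Angeles");
        LocalDateTime stored = LocalDateTime.of(2015, 1, 27, 18, 0);

        System.out.println(readTimestamp(stored, false, zone)); // 2015-01-27T18:00
        System.out.println(readTimestamp(stored, true, zone));  // 2015-01-27T10:00
    }
}
```

      Once the later fix lands, the second branch would no longer be needed, because neither the write path nor the read path would convert.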

      Attachments

        1. parquet_external_time.parq
          0.2 kB
          Szehon Ho
        2. HIVE-9482.patch
          28 kB
          Szehon Ho
        3. HIVE-9482.patch
          28 kB
          Szehon Ho
        4. HIVE-9482.2.patch
          28 kB
          Szehon Ho

            People

              Assignee: Szehon Ho
              Reporter: Szehon Ho