IMPALA-2111

Parquet TIMESTAMP is in milliseconds

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: Impala 2.2, Impala 2.3.0
    • Fix Version/s: None
    • Component/s: Backend

    Description

      Based on documentation:
      http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_parquet.html#parquet_data_types_unique_1

      In Impala, the Parquet INT96 type maps to TIMESTAMP.
      A Sqoop import of DATE, DATETIME and TIMESTAMP columns (sqlType 91/93) instead writes INT64 annotated with TIMESTAMP_MILLIS, as described in:
      https://github.com/apache/parquet-format/blob/master/LogicalTypes.md

      In Impala such a column appears as a BIGINT. Casting it to TIMESTAMP produces wrong results because the underlying value is in milliseconds, while the cast expects seconds.

      Impala should recognize INT64 + OriginalType DATE/DATETIME/TIMESTAMP as TIMESTAMP in milliseconds.
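
      The size of the discrepancy can be illustrated outside Impala. The following is a minimal Python sketch, not Impala code: the helper names and the sample value are hypothetical, and it only models the arithmetic of reading an epoch-milliseconds integer as if it were epoch seconds.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_to_timestamp(value: int) -> datetime:
    """Interpret an integer as seconds since the Unix epoch."""
    return EPOCH + timedelta(seconds=value)


def millis_to_timestamp(value: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch."""
    return EPOCH + timedelta(milliseconds=value)


# Hypothetical stored value: 2015-06-01 00:00:00 UTC in epoch milliseconds,
# as an INT64 column annotated with TIMESTAMP_MILLIS would hold it.
stored = 1_433_116_800_000

# Reading the value in its real unit gives the intended instant.
correct = millis_to_timestamp(stored)

# Reading the same integer as seconds is off by a factor of 1000. The result
# is so far in the future that Python's datetime cannot represent it.
try:
    wrong = seconds_to_timestamp(stored)
except OverflowError:
    wrong = None

# Scaling to seconds first recovers the intended instant.
rescaled = seconds_to_timestamp(stored // 1000)

print(correct)
print(wrong)
print(rescaled)
```

      Under the same assumption, dividing the BIGINT by 1000 before the cast is the analogous manual workaround until the annotation is recognized natively.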

      Attachments

        1. timestamp_millis.parquet
          8 kB
          Taras Bobrovytsky

            People

              Assignee: Unassigned
              Reporter: Andrés Cordero (acordero)
              Votes: 3
              Watchers: 11
