IMPALA-2111

Parquet TIMESTAMP is in milliseconds


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: Impala 2.2, Impala 2.3.0
    • Fix Version/s: None
    • Component/s: Backend

    Description

      Based on documentation:
      http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_parquet.html#parquet_data_types_unique_1

      INT96 maps to TIMESTAMP in Impala.
      A Sqoop import of DATE, DATETIME, and TIMESTAMP columns (sqlType 91/93) instead maps them to INT64 annotated with TIMESTAMP_MILLIS, as described in:
      https://github.com/apache/parquet-format/blob/master/LogicalTypes.md

      In Impala such a column appears as a BIGINT. Casting it to TIMESTAMP produces wrong results because the underlying value is in milliseconds, while the cast expects seconds.
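To make the unit mismatch concrete, here is a minimal Python sketch. It is not Impala code: the function names and the sample value are hypothetical and chosen only for illustration. It shows what a millisecond epoch value decodes to when it is handled correctly, and roughly which calendar year the same number lands in when it is read as seconds.

```python
from datetime import datetime, timezone

# Average Gregorian year length in seconds, used only for a rough estimate.
SECONDS_PER_YEAR = 365.2425 * 24 * 3600


def millis_to_utc(epoch_millis: int) -> datetime:
    """Decode a TIMESTAMP_MILLIS-style value (milliseconds since the Unix epoch)."""
    seconds, millis = divmod(epoch_millis, 1000)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return whole.replace(microsecond=millis * 1000)


def approx_year_if_read_as_seconds(epoch_millis: int) -> int:
    """Estimate the year obtained when the same number is misread as seconds."""
    return 1970 + int(epoch_millis / SECONDS_PER_YEAR)


# Hypothetical sample: 2015-05-01 00:00:00 UTC stored as milliseconds.
sample = 1_430_438_400_000

print(millis_to_utc(sample).isoformat())       # correct interpretation
print(approx_year_if_read_as_seconds(sample))  # a year tens of thousands of years in the future
```

The misread value is too large for `datetime.fromtimestamp` to represent, which is why the second function only estimates the year arithmetically. How Impala itself reports such an out-of-range cast is not covered by this sketch.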

      Impala should recognize INT64 + OriginalType DATE/DATETIME/TIMESTAMP as TIMESTAMP in milliseconds.


          People

            Assignee: Unassigned
            Reporter: Andrés Cordero (acordero)
            Votes: 3
            Watchers: 11
