IMPALA-4825

Change Timestamp representation for binary compatibility with Kudu.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Do
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:

      Description

      Kudu represents a timestamp as a 12-byte field in rows returned to Impala. In addition, Kudu stores the 4-byte date portion first, followed by the 8-byte time portion. Impala has those fields reversed (timestamp-value.h) and treats a timestamp as a 16-byte field.

      In order to maintain Impala/Kudu tuple-format compatibility, we should change Impala's representation to match Kudu's.
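
To make the two layouts concrete, here is a minimal C++ sketch. It assumes the 12-byte date-first layout that this description attributes to Kudu and a time-first in-memory struct. All names (`TimeFirstTimestamp`, `DecodeDateFirst`, `EncodeDateFirst`, the `kWire*` constants) are hypothetical and are not taken from Impala or Kudu.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 12-byte layout as described above: 4-byte date at offset 0,
// 8-byte time at offset 4, no padding between them.
constexpr std::size_t kWireDateOffset = 0;
constexpr std::size_t kWireTimeOffset = 4;
constexpr std::size_t kWireSize = 12;

// Hypothetical in-memory layout with the fields reversed: time, then date.
struct TimeFirstTimestamp {
  int64_t time;  // 8-byte time portion
  int32_t date;  // 4-byte date portion
};

// Copies the two fields out of a date-first buffer. memcpy is used because
// the time field starts at offset 4 and may not be 8-byte aligned.
inline TimeFirstTimestamp DecodeDateFirst(const unsigned char* buf) {
  TimeFirstTimestamp ts;
  std::memcpy(&ts.date, buf + kWireDateOffset, sizeof(ts.date));
  std::memcpy(&ts.time, buf + kWireTimeOffset, sizeof(ts.time));
  return ts;
}

// Writes the two fields back in date-first order.
inline void EncodeDateFirst(const TimeFirstTimestamp& ts, unsigned char* buf) {
  std::memcpy(buf + kWireDateOffset, &ts.date, sizeof(ts.date));
  std::memcpy(buf + kWireTimeOffset, &ts.time, sizeof(ts.time));
}
```

The per-field copy in this sketch is the kind of work that matching layouts would avoid: if both sides agreed on field order and size, the 12 bytes could be used in place.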

        Activity

        dknupp David Knupp added a comment -

        Reopening briefly to change the resolution.

        mjacobs Matthew Jacobs added a comment -

        Kudu isn't going to have a 12-byte timestamp as this JIRA states; it will only have a 64-bit Unix time in microseconds.
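
For illustration, a 64-bit Unix time in microseconds can be split into a date part and a time-of-day part roughly as follows. This is only a sketch: it assumes the date part is a count of days since the Unix epoch and the time part is nanoseconds within the day, which is not necessarily how Impala encodes either field. `DateTimeParts`, `FromUnixMicros` and `ToUnixMicros` are hypothetical names.

```cpp
#include <cstdint>

constexpr int64_t kMicrosPerDay = 86400LL * 1000000LL;

// Assumed two-field form: days since 1970-01-01 plus nanoseconds of day.
struct DateTimeParts {
  int64_t nanos_of_day;
  int32_t days_since_epoch;
};

// Splits Unix microseconds into (days, nanos of day), rounding the day
// count toward negative infinity so pre-1970 values keep a non-negative
// time of day.
inline DateTimeParts FromUnixMicros(int64_t micros) {
  int64_t days = micros / kMicrosPerDay;
  int64_t rem = micros % kMicrosPerDay;
  if (rem < 0) {
    rem += kMicrosPerDay;
    --days;
  }
  return DateTimeParts{rem * 1000, static_cast<int32_t>(days)};
}

// Recombines the two parts; sub-microsecond precision is dropped.
inline int64_t ToUnixMicros(const DateTimeParts& p) {
  return static_cast<int64_t>(p.days_since_epoch) * kMicrosPerDay +
         p.nanos_of_day / 1000;
}
```

Because the single 64-bit value and the two-field form differ in size and encoding, a conversion like this is needed either way, regardless of field order.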

        alex.behm Alexander Behm added a comment -

        Dan Hecht, the Kudu layout is explained in the JIRA description. Impala currently does not need to copy the fixed-size portions of tuples when scanning Kudu tables. The motivation is to maintain that perf advantage with the new Kudu timestamp type.

        dhecht Dan Hecht added a comment -

        Alexander Behm, what is the Kudu in-memory format you want to match specifically? And why do you want to match it exactly? Don't we need to copy the slot anyway?

        alex.behm Alexander Behm added a comment -

        Yes, that's what a comment in timestamp-value.h basically says. However, within a tuple, we 8-byte align a timestamp so it ends up using 16 bytes. See PrimitiveType.java and types.h.
        The ask is to make a timestamp use 12 bytes in a tuple.
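
The 12-versus-16 arithmetic can be sketched as follows, assuming each slot starts at the next multiple of its alignment. `AlignUp` and the constants are hypothetical helpers for illustration, not Impala's tuple layout code.

```cpp
#include <cstddef>

// Rounds offset up to the next multiple of alignment.
constexpr std::size_t AlignUp(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) / alignment * alignment;
}

constexpr std::size_t kTimestampPayloadBytes = 12;  // 8-byte time + 4-byte date

// With 8-byte alignment, a second timestamp slot starts at offset 16,
// so each slot effectively occupies 16 bytes.
constexpr std::size_t kNextSlotAligned8 = AlignUp(kTimestampPayloadBytes, 8);

// With 4-byte alignment, the next slot could start at offset 12.
constexpr std::size_t kNextSlotAligned4 = AlignUp(kTimestampPayloadBytes, 4);

static_assert(kNextSlotAligned8 == 16, "8-byte alignment pads 12 bytes to 16");
static_assert(kNextSlotAligned4 == 12, "4-byte alignment leaves no padding");
```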

        lv Lars Volker added a comment -

        My understanding was that Impala swaps the two fields to save the 4 bytes otherwise needed to properly align the members of the class. That allows Impala to use 12 bytes instead of 16. I don't know how Kudu does it.
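
The effect of field order on padding can be checked with a small C++ sketch. The structs below are hypothetical stand-ins, not the actual class in timestamp-value.h, and `alignas(8)` is added so the 64-bit field is 8-byte aligned regardless of platform.

```cpp
#include <cstddef>
#include <cstdint>

// Date first: the 8-byte field must start at offset 8, so 4 bytes of
// padding sit between the two fields.
struct DateFirst {
  int32_t date;
  alignas(8) int64_t time;
};

// Time first: the 4-byte field follows immediately at offset 8, so the
// 12 bytes of data are contiguous (any padding is at the tail).
struct TimeFirst {
  alignas(8) int64_t time;
  int32_t date;
};

static_assert(offsetof(DateFirst, time) == 8, "padding after date");
static_assert(offsetof(TimeFirst, date) == 8, "no padding between fields");

// Offset of the first byte past the last field in each layout.
constexpr std::size_t kDateFirstPayloadEnd =
    offsetof(DateFirst, time) + sizeof(int64_t);
constexpr std::size_t kTimeFirstPayloadEnd =
    offsetof(TimeFirst, date) + sizeof(int32_t);

static_assert(kDateFirstPayloadEnd == 16, "date-first data spans 16 bytes");
static_assert(kTimeFirstPayloadEnd == 12, "time-first data spans 12 bytes");
```

In this sketch `sizeof` is 16 for both structs because of tail padding, which is consistent with the earlier comment that an 8-byte-aligned timestamp ends up using 16 bytes in a tuple.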


          People

          • Assignee:
            mjacobs Matthew Jacobs
            Reporter:
            alex.behm Alexander Behm
          • Votes:
            0
            Watchers:
            6
