IMPALA-7472

Consider removing TimestampValue::FromSubsecondUnixTime


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Component/s: Backend

    Description

      TimestampValue::FromSubsecondUnixTime converts a double interpreted as unix time in seconds to TimestampValue. Impala uses it in two cases:
      1: double <-> timestamp casting
      2: aggregate function AVG on timestamps - it converts timestamp to double, calculates the average on doubles, and converts the average back to timestamp.

      My concern is that a double's precision varies with distance from 0 (1970-01-01): sub-second timestamps around 2018 are rounded/distorted by up to roughly a hundred nanoseconds (adjacent doubles at that magnitude are about 238 ns apart), and these errors can add up when a large number of timestamps are averaged.
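As a rough illustration of the precision concern, the sketch below measures the spacing between adjacent doubles at a given magnitude and round-trips a (seconds, nanoseconds) pair through a double. The helper names `UlpAt` and `RoundTripErrorNanos` are hypothetical and this is not Impala's actual conversion code; it only assumes that the timestamp is stored as seconds plus a nanosecond fraction.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Spacing between `x` and the next larger representable double.
double UlpAt(double x) {
  return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}

// Hypothetical helper (not Impala code): converts (seconds, nanos) to a
// double holding Unix time in seconds, converts it back to nanoseconds,
// and returns the difference from the original value in nanoseconds.
int64_t RoundTripErrorNanos(int64_t seconds, int64_t nanos) {
  assert(nanos >= 0 && nanos < 1000000000LL);
  const int64_t kNanosPerSec = 1000000000LL;
  // Forward conversion: this is where the rounding to a double happens.
  double as_double =
      static_cast<double>(seconds) + static_cast<double>(nanos) / 1e9;
  // Backward conversion: split into whole seconds and a rounded fraction.
  double whole = std::floor(as_double);
  int64_t back_nanos = std::llround((as_double - whole) * 1e9);
  int64_t back_total = static_cast<int64_t>(whole) * kNanosPerSec + back_nanos;
  return back_total - (seconds * kNanosPerSec + nanos);
}
```

For 1514764800 (2018-01-01 00:00:00 UTC), `UlpAt` returns 2^-22 s, so a timestamp with a nanosecond fraction generally does not survive the round trip, while the same fraction close to the epoch does.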

      I think that decimal (with nanosec precision) could be used for both cases:
      1. Decimal could be used as an intermediate stage: double <-> decimal and decimal <-> timestamp conversions have well-defined rounding rules since decimal V2.
      2. Decimal could be used as aggregation state - this would probably make the AVG slower, but it would be much more precise.
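The trade-off in the second point can be sketched by comparing a double-based average with one that keeps an exact aggregation state. This is only an illustration under stated assumptions: timestamps are modelled as nanoseconds since the Unix epoch (the hypothetical `UnixNanos` alias), a 128-bit integer stands in for a nanosecond-precision decimal, and `__int128` is a GCC/Clang extension rather than standard C++. Neither function is Impala's implementation, and the integer division truncates instead of following the decimal V2 rounding rules.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a timestamp: nanoseconds since the Unix epoch.
using UnixNanos = int64_t;

// Double-based average: each value becomes seconds as a double, the doubles
// are summed and divided, and the result is converted back to nanoseconds.
UnixNanos AvgViaDouble(const std::vector<UnixNanos>& ts) {
  assert(!ts.empty());
  double sum = 0.0;
  for (UnixNanos t : ts) sum += static_cast<double>(t) / 1e9;
  double avg_seconds = sum / static_cast<double>(ts.size());
  return static_cast<UnixNanos>(std::llround(avg_seconds * 1e9));
}

// Exact aggregation state: a 128-bit integer sum of nanoseconds, used here
// as a stand-in for a decimal with nanosecond precision. The final division
// truncates toward zero.
UnixNanos AvgViaWideInt(const std::vector<UnixNanos>& ts) {
  assert(!ts.empty());
  __int128 sum = 0;
  for (UnixNanos t : ts) sum += t;
  return static_cast<UnixNanos>(sum / static_cast<__int128>(ts.size()));
}
```

Averaging several copies of the same 2018 timestamp returns the input unchanged with the wide-integer state, whereas the double-based version cannot reproduce the nanosecond digits.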


          People

            Assignee: Unassigned
            Reporter: Csaba Ringhofer (csringhofer)
