Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-5121

avg() on timestamp col is wrong with -use_local_tz_for_unix_timestamp_conversions

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.5.0, Impala 2.2.10, Impala 2.3.4
    • Fix Version/s: None
    • Component/s: Backend
    • Labels:
    • Target Version:
    • Epic Color:
      ghx-label-7

      Description

      The flag '-use_local_tz_for_unix_timestamp_conversions' was added for IMPALA-97. Enabling it results in timestamps sometimes being converted into localtime, but unfortunately this doesn't seem to be well defined when/where this conversion will happen.

      I've noticed that its use seems to break the avg() aggregate function on timestamp types (despite being an odd function on timestamps, it should still work).

      Impala by default, i.e. not enabling this flag:

      [localhost:21000] > select timestamp_col from functional.alltypestiny;
      Query: select timestamp_col from functional.alltypestiny
      Query submitted at: 2017-03-27 18:50:57 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
      Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=8242bb6012948f06:143961ed00000000
      +---------------------+
      | timestamp_col       |
      +---------------------+
      | 2009-01-01 00:00:00 |
      | 2009-01-01 00:01:00 |
      | 2009-02-01 00:00:00 |
      | 2009-02-01 00:01:00 |
      | 2009-03-01 00:00:00 |
      | 2009-03-01 00:01:00 |
      | 2009-04-01 00:00:00 |
      | 2009-04-01 00:01:00 |
      +---------------------+
      Fetched 8 row(s) in 0.02s
      [localhost:21000] > select avg(timestamp_col) from functional.alltypestiny;
      Query: select avg(timestamp_col) from functional.alltypestiny
      Query submitted at: 2017-03-27 18:50:59 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
      Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=534f6ab59b201b5e:40e2a86d00000000
      +---------------------+
      | avg(timestamp_col)  |
      +---------------------+
      | 2009-02-14 23:45:30 |
      +---------------------+
      

      Then enabling the flag results in the same timestamps returned when scanning, but evaluating them in avg() results in them being converted:

      [localhost:21000] > select timestamp_col from functional.alltypestiny;
      Query: select timestamp_col from functional.alltypestiny
      Query submitted at: 2017-03-27 18:51:17 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
      Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=ac4ab8fd8caf4be9:ebb0834d00000000
      +---------------------+
      | timestamp_col       |
      +---------------------+
      | 2009-01-01 00:00:00 |
      | 2009-01-01 00:01:00 |
      | 2009-02-01 00:00:00 |
      | 2009-02-01 00:01:00 |
      | 2009-03-01 00:00:00 |
      | 2009-03-01 00:01:00 |
      | 2009-04-01 00:00:00 |
      | 2009-04-01 00:01:00 |
      +---------------------+
      Fetched 8 row(s) in 0.30s
      [localhost:21000] > select avg(timestamp_col) from functional.alltypestiny;
      Query: select avg(timestamp_col) from functional.alltypestiny
      Query submitted at: 2017-03-27 18:51:25 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
      Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=9e4e2c16896090f7:8922c4f200000000
      +---------------------+
      | avg(timestamp_col)  |
      +---------------------+
      | 2009-02-15 00:00:30 |
      +---------------------+
      Fetched 1 row(s) in 0.12s
      

      This behavior seems inconsistent and I'm pretty sure is not intentional. There are two misleading functions on TimestampValue that will do this conversion when the flag is set: ToUnixTime() and ToSubsecondUnixTime(). avg() seems to have started using ToSubsecondUnixTime() after IMPALA-2914.

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              mjacobs Matthew Jacobs
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated: