Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4120

Incorrect results with LEAD() analytic function

    Details

      Description

      for the following query

      SELECT
        FROM_UNIXTIME( UNIX_TIMESTAMP( CONCAT(CAST( ssm.ymd as STRING ),CAST( pl.time AS string )), 'yyyyMMddHH:mm:ss' ) ) AS datetime_click,
        FROM_UNIXTIME( LEAD( UNIX_TIMESTAMP( CONCAT( CAST( ssm.ymd as STRING ), CAST( pl.time AS string ) ), 'yyyyMMddHH:mm:ss' ), 1 ) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) ) AS datetime_next_click_v1,
        LEAD( FROM_UNIXTIME( UNIX_TIMESTAMP( CONCAT( CAST( ssm.ymd as STRING ), CAST( pl.time AS string ) ), 'yyyyMMddHH:mm:ss' ) ), 1 ) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) AS datetime_next_click_v2,
        LEAD( ssm.ymd, 1 ) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) AS ymd_next_click,
        LEAD( pl.time, 1 ) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) AS time_next_click
      FROM
        trivago_analytic.session_stats_master ssm
      JOIN ssm.co_log_entries AS cl
      JOIN ssm.page_log_entries AS pl
        ON pl.date_id = cl.date_id
      WHERE
        ssm.ymd BETWEEN 20160501 AND 20160503
        AND ssm.crawler_id = 0
        AND cl.page_id = 8001
      ORDER BY ssm.ymd, cl.date_id;
      

      datetime_next_click_v1 returns different values than datetime_next_click_v2 even though they should return the same one to my understanding. datetime_next_click_v1 is the correct one.

      I attached (reduced to relevant columns) table structure, query, query plan and profile.
      impalad version 2.6.0-cdh5.8.0 RELEASE

      please let me know if you need more information

        Attachments

        1. explain
          1 kB
          Clemens Valiente
        2. query_profile
          613 kB
          Clemens Valiente
        3. query_result.csv
          7 kB
          Clemens Valiente
        4. query.sql
          1 kB
          Clemens Valiente
        5. table_definition.hql
          0.4 kB
          Clemens Valiente
        6. thrift_profile_b64a3852190d9ac2-412186998b9ee4a1.txt
          102 kB
          Clemens Valiente

          Issue Links

            Activity

              People

              • Assignee:
                kwho Michael Ho
                Reporter:
                clemens.valiente@trivago.com Clemens Valiente
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: