Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Impala 2.6.0, Impala 2.7.0
Description
For the following query:
SELECT
  FROM_UNIXTIME(
    UNIX_TIMESTAMP(
      CONCAT(CAST(ssm.ymd AS STRING), CAST(pl.time AS STRING)),
      'yyyyMMddHH:mm:ss'
    )
  ) AS datetime_click,
  FROM_UNIXTIME(
    LEAD(
      UNIX_TIMESTAMP(
        CONCAT(CAST(ssm.ymd AS STRING), CAST(pl.time AS STRING)),
        'yyyyMMddHH:mm:ss'
      ),
      1
    ) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id)
  ) AS datetime_next_click_v1,
  LEAD(
    FROM_UNIXTIME(
      UNIX_TIMESTAMP(
        CONCAT(CAST(ssm.ymd AS STRING), CAST(pl.time AS STRING)),
        'yyyyMMddHH:mm:ss'
      )
    ),
    1
  ) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) AS datetime_next_click_v2,
  LEAD(ssm.ymd, 1) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) AS ymd_next_click,
  LEAD(pl.time, 1) OVER (PARTITION BY ssm.tracking_int_id ORDER BY cl.date_id) AS time_next_click
FROM trivago_analytic.session_stats_master ssm
JOIN ssm.co_log_entries AS cl
JOIN ssm.page_log_entries AS pl ON pl.date_id = cl.date_id
WHERE ssm.ymd BETWEEN 20160501 AND 20160503
  AND ssm.crawler_id = 0
  AND cl.page_id = 8001
ORDER BY ssm.ymd, cl.date_id;
datetime_next_click_v1 returns different values than datetime_next_click_v2, even though, to my understanding, they should return the same value. datetime_next_click_v1 is the correct one.
I attached the table structure (reduced to the relevant columns), the query, the query plan, and the profile.
impalad version 2.6.0-cdh5.8.0 RELEASE
Please let me know if you need more information.
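The comparison at the heart of this report can be reduced to the sketch below. It applies LEAD before FROM_UNIXTIME in one column and after it in the other, which should give identical results for every row, including NULL for the last row of each partition. The inline `clicks` data and the column names `id`, `seq`, and `ts_str` are hypothetical and are not taken from the attached schema. It is also an assumption that such a small flat input shows the discrepancy at all; the original query reads from nested collections and a much larger table.

```sql
-- Hypothetical reduced comparison; `clicks`, `id`, `seq`, and `ts_str`
-- are illustrative names, not part of the attached table structure.
WITH clicks AS (
  SELECT 1 AS id, 1 AS seq, '2016050112:00:00' AS ts_str
  UNION ALL SELECT 1, 2, '2016050112:05:00'
  UNION ALL SELECT 1, 3, '2016050213:10:00'
)
SELECT
  seq,
  -- v1: take the next row's epoch seconds, then format them
  FROM_UNIXTIME(
    LEAD(UNIX_TIMESTAMP(ts_str, 'yyyyMMddHH:mm:ss'), 1)
      OVER (PARTITION BY id ORDER BY seq)
  ) AS next_v1,
  -- v2: format every row first, then take the next row's string
  LEAD(FROM_UNIXTIME(UNIX_TIMESTAMP(ts_str, 'yyyyMMddHH:mm:ss')), 1)
    OVER (PARTITION BY id ORDER BY seq) AS next_v2
FROM clicks
ORDER BY seq;
```

Any row where next_v1 and next_v2 differ would show the same symptom as datetime_next_click_v1 versus datetime_next_click_v2 above.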
Attachments
Issue Links
- is related to IMPALA-4518: behavior with LEAD/LAG on string column/expression where offset goes beyond row bounds differs from postgres (Resolved)