Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Cannot Reproduce
Affects Version/s: Impala 3.0, Impala 2.12.0
Description
Recent runs of the stress test on a 135-node cluster intermittently produced inconsistent results for TPCDS-Q18a. The TPC-DS scale factor is 10000.
--- result_correct.txt 2018-09-10 08:54:30.427603941 -0700
+++ result_incorrect.txt 2018-09-10 17:39:59.512926323 -0700
@@ -1,3 +1,4 @@
+opening /tmp/stress/instance1/data/jenkins/workspace/impala-test-stress-secure-140node/archive/result_hashes/input.txt
 +------------------+------------+----------+-----------+-------+--------+----------+--------+----------+---------+------+
 | i_item_id | ca_country | ca_state | ca_county | agg1 | agg2 | agg3 | agg4 | agg5 | agg6 | agg7 |
 +------------------+------------+----------+-----------+-------+--------+----------+--------+----------+---------+------+
@@ -13,7 +14,7 @@
 | AAAAAAAAAABMAAAA | | IN | | 67.00 | 105.60 | 2232.51 | 74.08 | -1114.55 | 1964.50 | 1.00 |
 | AAAAAAAAAABNFAAA | | IN | | 40.00 | 115.76 | 0.00 | 70.61 | -459.60 | 1933.00 | 3.00 |
 | AAAAAAAAAACBBAAA | | IN | | 32.00 | 37.99 | 0.00 | 8.73 | -448.64 | 1963.00 | 3.00 |
-| AAAAAAAAAACCAAAA | | IN | | 56.00 | 2.50 | 0.00 | 0.62 | -62.72 | NULL | 4.00 |
+| AAAAAAAAAACCAAAA | | IN | | 56.00 | 2.50 | 0.00 | 0.62 | -62.72 | 38463209| 4.00 |
 | AAAAAAAAAACDCAAA | | IN | | 30.00 | 53.19 | 0.00 | 17.02 | -505.80 | 1990.00 | 6.00 |
 | AAAAAAAAAACFDAAA | | IN | | 58.00 | 113.96 | 0.00 | 19.37 | -2148.90 | 1974.00 | 1.00 |
 | AAAAAAAAAACHEAAA | | IN | | 16.00 | 19.90 | 0.00 | 13.13 | 9.76 | 1960.00 | 3.00 |
@@ -101,4 +102,4 @@
 | AAAAAAAAAAPKBAAA | | IN | | 2.00 | 65.90 | 0.00 | 58.65 | 60.24 | 1954.00 | 3.00 |
 | AAAAAAAAAAPOAAAA | | IN | | 92.00 | 125.36 | 0.00 | 94.02 | 1743.40 | 1963.00 | 6.00 |
 | AAAAAAAAAAPODAAA | | IN | | 75.00 | 119.08 | 0.00 | 104.79 | 4501.50 | 1981.00 | 5.00 |
-+------------------+------------+----------+-----------+-------+--------+----------+--------+----------+---------+------+
\ No newline at end of file
++------------------+------------+----------+-----------+-------+--------+----------+--------+----------+---------+------+
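The diff above mixes one substantive change (agg6 for item AAAAAAAAAACCAAAA is NULL in the correct run and 38463209 in the incorrect one) with noise: an extra "opening ..." log line and a trailing-newline difference. A row-by-row comparison that looks only at table rows makes the real discrepancy easier to spot. The following is a minimal sketch, not the stress test's own comparison logic; the function names `table_rows` and `differing_rows` and the rule "a data row is any line starting with `|`" are assumptions made for illustration.

```python
from itertools import zip_longest


def table_rows(text):
    """Return the table rows of a shell-style result dump as tuples of cells.

    Assumption: a row is any line that starts with '|'. Borders ('+---+'),
    log lines and blank lines are skipped. The header row is kept as a row.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, then split on the inner ones and trim padding.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        rows.append(tuple(cell.strip() for cell in inner.split("|")))
    return rows


def differing_rows(expected_text, actual_text):
    """Pair rows by position; return (index, expected_row, actual_row) for mismatches.

    A row that exists in only one input is paired with None.
    """
    pairs = zip_longest(table_rows(expected_text), table_rows(actual_text))
    return [(i, exp, act) for i, (exp, act) in enumerate(pairs) if exp != act]


if __name__ == "__main__":
    # File names taken from the diff header above.
    with open("result_correct.txt") as f_ok, open("result_incorrect.txt") as f_bad:
        for index, exp, act in differing_rows(f_ok.read(), f_bad.read()):
            print(index, exp, act)
```

Because cell padding is stripped before comparing, a value printed without its trailing space (as in `38463209|`) is still compared by content rather than by layout.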
The problem is not reproducible when running the query from the Impala shell.
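Since the wrong result shows up only occasionally, one way to look for it outside the stress test is to run the query many times and compare a hash of each result against a known-good hash. The sketch below shows only that bookkeeping and is not the stress test's implementation. `run_query` is a hypothetical callable that returns the result rows of one execution (for example, a thin wrapper around whatever client is in use), and the hashing scheme is an assumption. A single-client loop like this may also fail to reproduce a problem that appears only under concurrent load.

```python
import hashlib


def result_hash(rows):
    """Return a SHA-256 hex digest of the rows, in order.

    Assumption: rows are sequences of values; None stands for SQL NULL and is
    encoded as a NUL byte, every other value by its str() form.
    """
    digest = hashlib.sha256()
    for row in rows:
        for value in row:
            digest.update(b"\x00" if value is None else str(value).encode("utf-8"))
            digest.update(b"\x1f")  # column separator
        digest.update(b"\x1e")  # row separator
    return digest.hexdigest()


def find_mismatches(run_query, expected_hash, attempts):
    """Call run_query() `attempts` times; return (attempt, hash) for each mismatch."""
    mismatches = []
    for attempt in range(attempts):
        actual = result_hash(run_query())
        if actual != expected_hash:
            mismatches.append((attempt, actual))
    return mismatches
```

The query ends with ORDER BY ... LIMIT 100, so hashing rows in their returned order is reasonable here, provided the ordering columns determine the row order uniquely.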
The query is TPCDS Q18a:
with results as (
  select i_item_id, ca_country, ca_state, ca_county,
         cast(cs_quantity as decimal(12,2)) agg1,
         cast(cs_list_price as decimal(12,2)) agg2,
         cast(cs_coupon_amt as decimal(12,2)) agg3,
         cast(cs_sales_price as decimal(12,2)) agg4,
         cast(cs_net_profit as decimal(12,2)) agg5,
         cast(c_birth_year as decimal(12,2)) agg6,
         cast(cd1.cd_dep_count as decimal(12,2)) agg7
  from catalog_sales, customer_demographics cd1, customer_demographics cd2,
       customer, customer_address, date_dim, item
  where cs_sold_date_sk = d_date_sk
    and cs_item_sk = i_item_sk
    and cs_bill_cdemo_sk = cd1.cd_demo_sk
    and cs_bill_customer_sk = c_customer_sk
    and cd1.cd_gender = 'F'
    and cd1.cd_education_status = 'Unknown'
    and c_current_cdemo_sk = cd2.cd_demo_sk
    and c_current_addr_sk = ca_address_sk
    and c_birth_month in (1, 6, 8, 9, 12, 2)
    and d_year = 1998
    and ca_state in ('MS', 'IN', 'ND', 'OK', 'NM', 'VA', 'MS')
)
select i_item_id, ca_country, ca_state, ca_county,
       agg1, agg2, agg3, agg4, agg5, agg6, agg7
from (
  select i_item_id, ca_country, ca_state, ca_county,
         avg(agg1) agg1, avg(agg2) agg2, avg(agg3) agg3, avg(agg4) agg4,
         avg(agg5) agg5, avg(agg6) agg6, avg(agg7) agg7
  from results
  group by i_item_id, ca_country, ca_state, ca_county
  union all
  select i_item_id, ca_country, ca_state, NULL as county,
         avg(agg1) agg1, avg(agg2) agg2, avg(agg3) agg3, avg(agg4) agg4,
         avg(agg5) agg5, avg(agg6) agg6, avg(agg7) agg7
  from results
  group by i_item_id, ca_country, ca_state
  union all
  select i_item_id, ca_country, NULL as ca_state, NULL as county,
         avg(agg1) agg1, avg(agg2) agg2, avg(agg3) agg3, avg(agg4) agg4,
         avg(agg5) agg5, avg(agg6) agg6, avg(agg7) agg7
  from results
  group by i_item_id, ca_country
  union all
  select i_item_id, NULL as ca_country, NULL as ca_state, NULL as county,
         avg(agg1) agg1, avg(agg2) agg2, avg(agg3) agg3, avg(agg4) agg4,
         avg(agg5) agg5, avg(agg6) agg6, avg(agg7) agg7
  from results
  group by i_item_id
  union all
  select NULL AS i_item_id, NULL as ca_country, NULL as ca_state, NULL as county,
         avg(agg1) agg1, avg(agg2) agg2, avg(agg3) agg3, avg(agg4) agg4,
         avg(agg5) agg5, avg(agg6) agg6, avg(agg7) agg7
  from results
) foo
order by ca_country, ca_state, ca_county, i_item_id
limit 100;
cc'ing tarmstrong@cloudera.com, twm378