Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 4.2.0
-
None
Description
IMPALA-10973 has a bug where a union fragment without a scan node can be over-parallelized by the backend scheduler by 1 instance. This can be reproduced by running TPC-DS Q11 with MT_DOP=1.
The planner plans 2 instances of F11:
|  F11:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
|  Per-Instance Resources: mem-estimate=27.53MB mem-reservation=17.00MB thread-reservation=1
|  14:UNION
|  |  mem-estimate=0B mem-reservation=0B thread-reservation=0
|  |  tuple-ids=28 row-size=44B cardinality=14.80K
|  |  in pipelines: 41(GETNEXT)
|  |
|  41:AGGREGATE [FINALIZE]
|  |  output: sum:merge(ws_ext_list_price - ws_ext_discount_amt)
|  |  group by: c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year
|  |  having: sum(ws_ext_list_price - ws_ext_discount_amt) > CAST(0 AS DECIMAL(3,0))
|  |  mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB thread-reservation=0
|  |  tuple-ids=27 row-size=169B cardinality=14.80K
|  |  in pipelines: 41(GETNEXT), 16(OPEN)
|  |
|  40:EXCHANGE [HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year)]
|  |  mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
|  |  tuple-ids=27 row-size=169B cardinality=148.00K
|  |  in pipelines: 16(GETNEXT)
But the backend scheduler schedules 1 extra instance of F11, so the query summary shows 3 instances instead of 2:
|  F11:EXCHANGE SENDER  3  3  130.372us  157.038us                   52.86 KB   192.00 KB
|  14:UNION             3  3  113.073us  196.543us  1.33K   14.80K    8.00 KB           0
|  41:AGGREGATE         3  3    1.437ms    1.787ms  1.33K   14.80K   17.11 MB    17.00 MB  FINALIZE
|  40:EXCHANGE          3  3   82.561us  116.482us  1.33K  148.00K  104.00 KB    10.34 MB  HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year)
This happens because the backend scheduler mistakenly thinks that this fragment is free to be assigned randomly, since it does not have a scan node and its number of input fragments is less than the number of backends.
https://github.com/apache/impala/blob/112bab64b77d6ed966b1c67bd503ed632da6f208/be/src/scheduling/scheduler.cc#L441
This branch should additionally check if instances_per_host.empty().
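The effect of the missing check can be illustrated with a toy model. This is only a sketch under stated assumptions and is not the code in scheduler.cc: the names InstancesPerHost, TotalInstances, ScheduleSketch and require_empty_map are hypothetical, the map stands in for instances_per_host, the "number of input fragments" is modelled as the total instance count already in the map, and the random choice is replaced by "first backend without an instance" so the result is deterministic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the scheduler's host -> instance-count map.
using InstancesPerHost = std::map<std::string, int>;

// Sums the instance counts over all hosts.
int TotalInstances(const InstancesPerHost& instances_per_host) {
  int total = 0;
  for (const auto& entry : instances_per_host) total += entry.second;
  return total;
}

// Toy model of the branch discussed above, not the real Impala code.
// instances_per_host holds the instances inherited from input fragments.
// With require_empty_map == false the extra instance is added whenever the
// fragment has no scan node and fewer inputs than backends (the reported bug).
// With require_empty_map == true it is added only if the map is still empty.
InstancesPerHost ScheduleSketch(InstancesPerHost instances_per_host,
    const std::vector<std::string>& backends, bool has_scan_node,
    bool require_empty_map) {
  assert(!backends.empty());
  bool fewer_inputs_than_backends =
      TotalInstances(instances_per_host) < static_cast<int>(backends.size());
  bool free_to_assign = !has_scan_node && fewer_inputs_than_backends;
  if (require_empty_map) {
    free_to_assign = free_to_assign && instances_per_host.empty();
  }
  if (free_to_assign) {
    // Stand-in for a random choice: take the first backend with no instance yet.
    for (const std::string& backend : backends) {
      if (instances_per_host.count(backend) == 0) {
        instances_per_host[backend] = 1;
        break;
      }
    }
  }
  return instances_per_host;
}
```

In this model, a fragment that inherits one instance on each of two hosts in a three-backend cluster ends up with 3 instances without the emptiness check and 2 instances with it, which mirrors the hosts=2 instances=2 plan versus the 3 instances seen in the summary.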
Attached is the full profile.
Attachments
Issue Links
- causes: IMPALA-12135 TestDataStreamSenderTpch.test_krpc_datastream_sender_shuffle fails with memory limit exceeded (Resolved)