Union fragment without scan node can be overparallelized by backend scheduler by 1

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: Impala 4.2.0
    • Fix Version: Impala 4.3.0
    • Component: Backend
    • Labels: None

    Description

      The change made for IMPALA-10973 has a bug: the backend scheduler can overparallelize a union fragment that has no scan node, scheduling one more instance than the planner planned. This can be reproduced by running TPC-DS Q11 with MT_DOP=1.

      The planner plans 2 instances of F11:

      |  F11:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
      |  Per-Instance Resources: mem-estimate=27.53MB mem-reservation=17.00MB thread-reservation=1
      |  14:UNION
      |  |  mem-estimate=0B mem-reservation=0B thread-reservation=0
      |  |  tuple-ids=28 row-size=44B cardinality=14.80K
      |  |  in pipelines: 41(GETNEXT)
      |  |
      |  41:AGGREGATE [FINALIZE]
      |  |  output: sum:merge(ws_ext_list_price - ws_ext_discount_amt)
      |  |  group by: c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address, d_year
      |  |  having: sum(ws_ext_list_price - ws_ext_discount_amt) > CAST(0 AS DECIMAL(3,0))
      |  |  mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB thread-reservation=0
      |  |  tuple-ids=27 row-size=169B cardinality=14.80K
      |  |  in pipelines: 41(GETNEXT), 16(OPEN)
      |  |
      |  40:EXCHANGE [HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year)]
      |  |  mem-estimate=10.34MB mem-reservation=0B thread-reservation=0
      |  |  tuple-ids=27 row-size=169B cardinality=148.00K
      |  |  in pipelines: 16(GETNEXT) 

      However, the backend scheduler schedules 1 extra instance of F11, so 3 instances run:

      |  F11:EXCHANGE SENDER           3      3  130.372us  157.038us                        52.86 KB      192.00 KB                                                                                                                    
      |  14:UNION                      3      3  113.073us  196.543us    1.33K      14.80K    8.00 KB              0                                                                                                                    
      |  41:AGGREGATE                  3      3    1.437ms    1.787ms    1.33K      14.80K   17.11 MB       17.00 MB  FINALIZE                                                                                                          
      |  40:EXCHANGE                   3      3   82.561us  116.482us    1.33K     148.00K  104.00 KB       10.34 MB  HASH(c_customer_id,c_first_name,c_last_name,c_preferred_cust_flag,c_birth_country,c_login,c_email_address,d_year) 

      This happens because the backend scheduler mistakenly treats the fragment as free to be assigned randomly: the fragment has no scan node, and its number of input fragments is less than the number of backends.
      https://github.com/apache/impala/blob/112bab64b77d6ed966b1c67bd503ed632da6f208/be/src/scheduling/scheduler.cc#L441 

      The branch at that line should additionally check whether instances_per_host.empty() is true.
      Attached is the full profile.
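
      The proposed extra check can be illustrated with a minimal sketch. This is not Impala's actual scheduler code: the struct, the function names, and every field other than instances_per_host are hypothetical, and the meaning assumed here for instances_per_host (host to number of instances already placed for the fragment) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for the per-fragment state the scheduler
// inspects. Only `instances_per_host` is a name taken from the report.
struct FragmentInfo {
  bool has_scan_node;                             // fragment contains a scan node
  std::size_t num_input_fragments;                // "num input fragment" in the report
  std::map<std::string, int> instances_per_host;  // assumed: host -> instances already placed
};

// Condition as described in the report: no scan node and fewer inputs than backends.
bool CanAssignRandomlyBefore(const FragmentInfo& f, std::size_t num_backends) {
  assert(num_backends > 0);
  return !f.has_scan_node && f.num_input_fragments < num_backends;
}

// Same condition with the proposed extra check: random assignment is allowed
// only when no instances have been placed for the fragment yet.
bool CanAssignRandomlyProposed(const FragmentInfo& f, std::size_t num_backends) {
  return f.instances_per_host.empty() && CanAssignRandomlyBefore(f, num_backends);
}
```

      Under these assumptions, a union fragment with no scan node whose instances are already placed on two hosts passes the earlier condition but is rejected once the emptiness check is added, so no extra randomly assigned instance is created for it.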

            People

              Assignee: Riza Suminto (rizaon)
              Reporter: Riza Suminto (rizaon)