IMPALA-1473

Incorrect cardinality displayed in exec summary for merging exchange on coordinator.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:

      Description

      Consider the following exec summary for TPCH-Q21.

      select
        s_name,
        count(*) as numwait
      from
        supplier,
        lineitem l1,
        orders,
        nation
      where
        s_suppkey = l1.l_suppkey
        and o_orderkey = l1.l_orderkey
        and o_orderstatus = 'F'
        and l1.l_receiptdate > l1.l_commitdate
        and exists (
          select
            *
          from
            lineitem l2
          where
            l2.l_orderkey = l1.l_orderkey
            and l2.l_suppkey <> l1.l_suppkey
        )
        and not exists (
          select
            *
          from
            lineitem l3
          where
            l3.l_orderkey = l1.l_orderkey
            and l3.l_suppkey <> l1.l_suppkey
            and l3.l_receiptdate > l3.l_commitdate
        )
        and s_nationkey = n_nationkey
        and n_name = 'SAUDI ARABIA'
      group by
        s_name
      order by
        numwait desc,
        s_name
      limit 100
      

      Node 21 (the merging exchange) actually returns 100 rows, but the exec summary incorrectly displays 300.

      I'm not sure whether this bug is specific to merging exchanges or applies to exchanges with limits in general.

      Operator              #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail                      
      ----------------------------------------------------------------------------------------------------------------------------------
      21:MERGING-EXCHANGE        1  341.969us  341.969us      300         100          0        -1.00 B  UNPARTITIONED               
      12:TOP-N                   3    1.738ms    1.928ms      300         100   20.00 KB        4.10 KB                              
      20:AGGREGATE               3  141.305ms  141.559ms      411      10.39K    6.29 MB       10.00 MB  FINALIZE                    
      19:EXCHANGE                3  163.412us  178.198us    1.20K      10.39K          0              0  HASH(s_name)                
      11:AGGREGATE               3  159.478ms  212.594ms    1.20K      10.39K   11.47 MB       10.00 MB                              
      10:HASH JOIN               3    1s485ms    1s575ms    4.14K     600.12K  838.05 MB       14.27 MB  LEFT ANTI JOIN, PARTITIONED 
      |--18:EXCHANGE             3  293.828ms  308.780ms    3.79M     600.12K          0              0  HASH(l3.l_orderkey)         
      |  05:SCAN HDFS            3    2s787ms    3s233ms    3.79M     600.12K   65.66 MB      264.00 MB  tpch.lineitem l3            
      09:HASH JOIN               3    4s791ms    4s880ms   73.09K     600.12K  838.05 MB       33.58 MB  LEFT SEMI JOIN, PARTITIONED 
      |--17:EXCHANGE             3  380.799ms  406.256ms    6.00M       6.00M          0              0  HASH(l2.l_orderkey)         
      |  04:SCAN HDFS            3    2s745ms    3s046ms    6.00M       6.00M   64.73 MB      264.00 MB  tpch.lineitem l2            
      16:EXCHANGE                3    3.863ms    3.915ms   75.87K     600.12K          0              0  HASH(l1.l_orderkey)         
      08:HASH JOIN               3  387.206ms  400.723ms   75.87K     600.12K    8.59 MB        28.00 B  INNER JOIN, BROADCAST       
      |--15:EXCHANGE             3   15.843us   19.148us        1           1          0              0  BROADCAST                   
      |  03:SCAN HDFS            1  443.178ms  443.178ms        1           1   41.00 KB       32.00 MB  tpch.nation                 
      07:HASH JOIN               3    3s889ms    4s269ms    1.83M     600.12K  566.03 MB       13.11 MB  INNER JOIN, BROADCAST       
      |--14:EXCHANGE             3   141.48ms  144.758ms  729.41K     500.00K          0              0  BROADCAST                   
      |  02:SCAN HDFS            2    1s744ms    2s386ms  729.41K     500.00K   32.16 MB      176.00 MB  tpch.orders                 
      06:HASH JOIN               3    1s763ms    1s879ms    3.79M     600.12K   12.43 MB      472.66 KB  INNER JOIN, BROADCAST       
      |--13:EXCHANGE             3    3.112ms    5.549ms   10.00K      10.00K          0              0  BROADCAST                   
      |  00:SCAN HDFS            1  508.179ms  508.179ms   10.00K      10.00K    2.24 MB       32.00 MB  tpch.supplier               
      01:SCAN HDFS               3    1s854ms    1s895ms    3.79M     600.12K   65.38 MB      264.00 MB  tpch.lineitem l1            
      

        Activity

        Guru Medasani added a comment -

        TPCDS query 7 showed similar behavior.

        Query:

        select
          i_item_id,
          avg(ss_quantity) agg1,
          avg(ss_list_price) agg2,
          avg(ss_coupon_amt) agg3,
          avg(ss_sales_price) agg4
        from
          store_sales,
          customer_demographics,
          date_dim,
          item,
          promotion
        where
          ss_sold_date_sk = d_date_sk
          and ss_item_sk = i_item_sk
          and ss_cdemo_sk = cd_demo_sk
          and ss_promo_sk = p_promo_sk
          and cd_gender = 'F'
          and cd_marital_status = 'W'
          and cd_education_status = 'Primary'
          and (p_channel_email = 'N'
            or p_channel_event = 'N')
          and d_year = 1998
          and ss_sold_date_sk between 2450815 and 2451179 -- partition key filter
        group by
          i_item_id
        order by
          i_item_id
        limit 100
        

        Exec Summary:

        Operator              #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail                         
        ----------------------------------------------------------------------------------------------------------------------------------
        17:MERGING-EXCHANGE        1  198.162us  198.162us      400         100          0        -1.00 B  UNPARTITIONED                  
        10:TOP-N                   4   13.299ms   13.728ms      400         100  148.00 KB        6.25 KB                                 
        16:AGGREGATE               4  479.146ms  542.757ms  180.00K     179.73K   38.49 MB       10.00 MB  FINALIZE                       
        15:EXCHANGE                4   18.521ms    19.92ms  720.00K     179.73K          0              0  HASH(i_item_id)                
        09:AGGREGATE               4    7s481ms    8s016ms  720.00K     179.73K  793.16 MB       12.07 MB                                 
        08:HASH JOIN               4   968.64ms    1s032ms   22.38M       1.63B   26.92 MB       73.48 KB  INNER JOIN, BROADCAST          
        |--14:EXCHANGE             4   76.135us   77.429us    1.79K       1.80K          0              0  BROADCAST                      
        |  04:SCAN HDFS            1   66.655ms   66.655ms    1.79K       1.80K  275.09 KB       48.00 MB  tpcds_parquet.promotion        
        07:HASH JOIN               4    2s813ms    3s011ms   22.67M       1.63B   60.00 MB       13.60 MB  INNER JOIN, BROADCAST          
        |--13:EXCHANGE             4   13.731ms   14.841ms  360.00K     360.00K          0              0  BROADCAST                      
        |  03:SCAN HDFS            1  138.558ms  138.558ms  360.00K     360.00K    7.98 MB       96.00 MB  tpcds_parquet.item             
        06:HASH JOIN               4  529.185ms  566.291ms   22.67M       1.63B   13.27 MB        27.00 B  INNER JOIN, BROADCAST          
        |--12:EXCHANGE             4   28.338us   29.773us      365           3          0              0  BROADCAST                      
        |  02:SCAN HDFS            1   20.911ms   20.911ms      365           3    1.20 MB       32.00 MB  tpcds_parquet.date_dim         
        05:HASH JOIN               4   16s622ms   18s042ms   22.67M       1.63B    6.52 MB        1.83 MB  INNER JOIN, BROADCAST          
        |--11:EXCHANGE             4     1.99ms    1.193ms   27.44K      27.44K          0              0  BROADCAST                      
        |  01:SCAN HDFS            1   630.69ms   630.69ms   27.44K      27.44K    9.38 MB       64.00 MB  tpcds_parquet.customer_demo... 
        00:SCAN HDFS               4   42s654ms         1m    1.63B       1.63B  318.16 MB      616.00 MB  tpcds_parquet.store_sales
        
        Matthew Jacobs added a comment -

        FYI, I didn't see the issue for TPCDS query 7, but it did repro for TPCH-Q21.

        It looks like the profile also shows the wrong number of rows returned at the merging exchange node.

        Thomas Tauber-Marshall added a comment -

        commit 7fad3e5dc38c1097db6be24da0cda6941f554150
        Author: Thomas Tauber-Marshall <tmarshall@cloudera.com>
        Date: Mon Oct 10 10:32:55 2016 -0700

        IMPALA-3002/IMPALA-1473: Cardinality observability cleanup

        IMPALA-3002:
        The shell prints an incorrect value for '#Rows' in the exec
        summary for broadcast nodes due to incorrect logic around
        whether to use max or agg stats. This patch makes the behavior
        consistent with the way the be treats exec summaries in
        summary-util.cc. This incorrect logic was also duplicated in
        the impala_beeswax test framework.

        IMPALA-1473:
        When there is a merging exchange with a limit, we may copy rows
        into the output batch beyond the limit. In this case, we currently
        update the output batch's size to reflect the limit, but we also
        need to update ExecNode::num_rows_returned_ or the exec summary
        may show that the exchange node returned more rows than it really
        did.

        Additionally, PlanFragmentExecutor::GetNext does not update
        rows_produced_counter_ in some cases, leading the runtime profile
        to display an incorrect value for 'RowsProduced'.

        Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
        Reviewed-on: http://gerrit.cloudera.org:8080/4679
        Reviewed-by: Matthew Jacobs <mj@cloudera.com>
        Tested-by: Internal Jenkins
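
        To make the IMPALA-3002 part of the fix concrete, here is a minimal C++ sketch of the max-vs-agg choice it describes; the function and parameter names are hypothetical, not the actual shell or summary-util.cc code:

        #include <algorithm>
        #include <cstdint>
        #include <numeric>
        #include <vector>

        // Hedged sketch: a node whose instances each receive the full row set
        // (e.g. a BROADCAST exchange) should display the per-instance maximum
        // (max stats); for other nodes the per-instance counts are summed
        // (agg stats).
        int64_t DisplayedNumRows(const std::vector<int64_t>& per_instance_rows,
                                 bool is_broadcast) {
          if (per_instance_rows.empty()) return 0;
          if (is_broadcast) {
            return *std::max_element(per_instance_rows.begin(),
                                     per_instance_rows.end());
          }
          return std::accumulate(per_instance_rows.begin(),
                                 per_instance_rows.end(), int64_t{0});
        }

        This is consistent with the summaries above: broadcast exchange 13 reports 10.00K rows across 3 hosts (the per-host maximum) rather than a 30.00K sum.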
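
        For the IMPALA-1473 part, here is a minimal sketch of the pattern the commit describes, assuming a simplified row batch and exchange node (the class and member names are hypothetical, not Impala's actual code): trimming the output batch at the limit is not enough on its own, because the exec summary and the 'RowsProduced' profile counter are derived from the node's running row count.

        #include <cstdint>

        // Simplified stand-in for Impala's RowBatch (hypothetical).
        struct RowBatch {
          int num_rows = 0;
          void set_num_rows(int n) { num_rows = n; }
        };

        class MergingExchange {
         public:
          explicit MergingExchange(int64_t limit) : limit_(limit) {}

          // Called after merging sender rows into 'batch'. The merge may have
          // copied rows beyond the limit; the fix is to roll the counter back
          // in addition to trimming the batch.
          void EnforceLimit(RowBatch* batch) {
            num_rows_returned_ += batch->num_rows;
            if (limit_ >= 0 && num_rows_returned_ > limit_) {
              int excess = static_cast<int>(num_rows_returned_ - limit_);
              batch->set_num_rows(batch->num_rows - excess);  // drop overshoot
              num_rows_returned_ = limit_;  // keep the counter truthful
            }
          }

          int64_t num_rows_returned() const { return num_rows_returned_; }

         private:
          const int64_t limit_;  // -1 means no limit
          int64_t num_rows_returned_ = 0;
        };

        Without the counter rollback, num_rows_returned_ would read 300 in the TPCH-Q21 summary above even though only 100 rows were actually returned.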


          People

          • Assignee:
            Thomas Tauber-Marshall
          • Reporter:
            Alexander Behm