IMPALA-8026

Actual row counts for nested loop join are way too high while the query is executing

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 3.1.0
    • Fix Version/s: Impala 3.2.0
    • Component/s: Backend
    • Labels:
      None

      Description

      Consider this extract from a query plan:

      Operator                      #Rows  Est. #Rows
      --------------------------------------------------------------
      …
      |  10:HASH JOIN               9.53M      18.14K 
      |  |--19:EXCHANGE                 1           1
      |  |  00:SCAN HDFS                1           1
      |  06:NESTED LOOP JOIN        4.88B     863.84K 
      |  |--18:EXCHANGE                 1           1
      |  |  04:SCAN HDFS                1           1
      |  05:HASH JOIN               9.53M     863.84K
      

      If the above is to be believed, the 06 nested loop join produced 4.88 billion rows. That number is far too large: joining 1 row with roughly 9.53 million rows cannot produce about 500 times that many rows.

      It appears that the nested loop join actually processed and returned about 9.53 million rows, since that is the count produced by the 10 hash join, which joins a single row with the output of the nested loop join.

      Because the same bogus result appears across multiple plans, the reported actual row count is likely simply wrong and bears no relation to the number of rows actually returned.
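
      The arithmetic behind this argument can be sketched as a quick sanity check. This is only an illustration, not Impala code: the helper name max_join_rows is hypothetical, and the row counts are the rounded values shown in the plan extract, so the ratio is approximate.

```python
def max_join_rows(left_rows: int, right_rows: int) -> int:
    """Upper bound on the rows an inner or cross join can emit.

    The worst case is that every left row pairs with every right row.
    (Hypothetical helper for illustration; not part of Impala.)
    """
    return left_rows * right_rows


# Rounded counts as displayed in the plan extract above.
probe_rows = 9_530_000         # output of 05:HASH JOIN feeding the 06 join
build_rows = 1                 # 18:EXCHANGE on the other side of the 06 join
reported_rows = 4_880_000_000  # "#Rows" reported for 06:NESTED LOOP JOIN

bound = max_join_rows(probe_rows, build_rows)
ratio = reported_rows / bound
exceeds_bound = reported_rows > bound

print(f"upper bound: {bound:,}")
print(f"reported:    {reported_rows:,}")
print(f"reported / bound: {ratio:.1f}")
print(f"reported count exceeds the possible maximum: {exceeds_bound}")
```

      With these rounded inputs the reported count exceeds the largest possible output by a factor of roughly 500, which is why the counter, not the join, is the suspect.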

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                Paul.Rogers Paul Rogers