IMPALA-4231

Performance regression in TPC-H Q2 due to delay in filter arrival

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:

      Description

      The commit http://gerrit.cloudera.org:8080/3873 ("IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder") caused TPC-H Q2 to regress by up to 2x on some systems.

      I was able to reproduce the regression locally, with the query going from ~2s to ~3s on TPC-H scale factor 20 with 3 Impala daemons in my minicluster. I spent some time comparing profiles, and the key difference seems to be that runtime filters arrived at the scans later, so they were ineffective at reducing the size of the join. The filter arrival time went from slightly under 1s to slightly over 1s.

      The regression goes away if I set:

      set RUNTIME_FILTER_WAIT_TIME_MS=1500;
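
      To illustrate why a small shift in arrival time can matter this much, here is a toy model in Python (not Impala code). It assumes a scan waits at most RUNTIME_FILTER_WAIT_TIME_MS for a filter and that a filter arriving after that deadline is not applied at all, which simplifies the real behaviour. The function names, the 1000 ms baseline wait, the arrival times of 950 ms and 1050 ms, the row count and the filter selectivity are all illustrative assumptions, not values taken from the profiles.

```python
def filter_applied(arrival_ms: int, wait_time_ms: int) -> bool:
    """Toy rule: the filter helps only if it arrives within the scan's wait time."""
    return arrival_ms <= wait_time_ms


def rows_reaching_join(total_rows: int, keep_one_in: int,
                       arrival_ms: int, wait_time_ms: int) -> int:
    """Rows the scan passes to the join under the toy rule above.

    keep_one_in is a hypothetical selectivity: the filter keeps 1 row in N.
    """
    if filter_applied(arrival_ms, wait_time_ms):
        return total_rows // keep_one_in
    return total_rows


if __name__ == "__main__":
    TOTAL_ROWS = 1_000_000   # hypothetical scan output without a filter
    KEEP_ONE_IN = 100        # hypothetical filter selectivity

    # (label, filter arrival in ms, wait time in ms); all values are assumed
    scenarios = [
        ("before the commit, 1000 ms wait", 950, 1000),
        ("after the commit, 1000 ms wait", 1050, 1000),
        ("after the commit, 1500 ms wait", 1050, 1500),
    ]
    for label, arrival_ms, wait_ms in scenarios:
        rows = rows_reaching_join(TOTAL_ROWS, KEEP_ONE_IN, arrival_ms, wait_ms)
        print(f"{label}: {rows} rows reach the join")
```

      Under these assumptions, a filter arriving 50 ms after the deadline costs the scan the whole benefit of filtering, and raising the wait time to 1500 ms restores it, which is consistent with the workaround above.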

            People

            • Assignee: Tim Armstrong (tarmstrong)
            • Reporter: Tim Armstrong (tarmstrong)
