IMPALA-4231

Performance regression in TPC-H Q2 due to delay in filter arrival


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend

    Description

      The commit http://gerrit.cloudera.org:8080/3873, "IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder", caused a regression of up to 2x in TPC-H Q2 on some systems.

      I was able to reproduce a regression of ~2s to ~3s locally on TPC-H scale factor 20 with 3 Impala daemons in my minicluster. I spent some time comparing profiles, and the key difference seems to be that the runtime filters arrived at the scans later, so they were ineffective at reducing the size of the join. The filter arrival time went from slightly under 1s to slightly over 1s.

      The regression goes away if I set:

      set RUNTIME_FILTER_WAIT_TIME_MS=1500;
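      To illustrate why a small shift in filter arrival time can matter so much, here is a toy model in Python. It is only a sketch under simplifying assumptions: the scan waits for the filter up to the wait time and then starts without it, rows scanned before the filter arrives reach the join unfiltered, and the row counts, scan rate, and the 1000 ms baseline wait time are made-up illustrative values. The function name and parameters are hypothetical and do not correspond to Impala code.

```python
def rows_reaching_join(total_rows: int, keep_percent: int,
                       arrival_ms: int, wait_ms: int,
                       scan_rows_per_ms: int) -> int:
    """Toy estimate of how many scanned rows reach the join.

    Assumptions (not taken from Impala's implementation):
      - the scan starts when the filter arrives or the wait time expires,
        whichever comes first;
      - rows scanned before the filter arrives are not filtered;
      - once the filter is present, only keep_percent of rows survive.
    """
    scan_start_ms = min(arrival_ms, wait_ms)
    # Time spent scanning without a filter (zero if the filter arrived in time).
    unfiltered_ms = max(0, arrival_ms - scan_start_ms)
    unfiltered_rows = min(total_rows, unfiltered_ms * scan_rows_per_ms)
    filtered_rows = total_rows - unfiltered_rows
    return unfiltered_rows + filtered_rows * keep_percent // 100


# Illustrative numbers only: 1M rows, filter keeps 1%, 10k rows scanned per ms.
before = rows_reaching_join(1_000_000, 1, arrival_ms=950, wait_ms=1000,
                            scan_rows_per_ms=10_000)
after = rows_reaching_join(1_000_000, 1, arrival_ms=1050, wait_ms=1000,
                           scan_rows_per_ms=10_000)
longer_wait = rows_reaching_join(1_000_000, 1, arrival_ms=1050, wait_ms=1500,
                                 scan_rows_per_ms=10_000)
print(before, after, longer_wait)
```

      In this model, a filter that arrives just before the wait time expires is applied to every row, while one that arrives just after it misses a large share of the scan. Raising the wait time above the arrival time restores the original behaviour, which matches the effect of the setting above.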

          People

            Assignee: Tim Armstrong (tarmstrong)
            Reporter: Tim Armstrong (tarmstrong)