Disable ineffective bloom filters for Kudu scan

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 3.4.0
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend, Frontend
    • Labels: None

    Description

      In the bloom-filter benchmark for Kudu, query TPCH-Q9 shows a performance regression. The profile shows that the hash joins generate 5 bloom filters, and some of them are not useful for filtering rows. When all bloom filters are pushed to Kudu, evaluating them adds extra cost to the Kudu scan, which causes the regression.

      The regression on Q9 looks a lot like https://issues.apache.org/jira/browse/IMPALA-9302, where Q9 initially regressed significantly with multithreading because ineffective filters were not being disabled. This query is somewhat special in that many filters are pushed to scan 2, and most of them are not useful. Based on that experience, we need to add a way to disable ineffective filters for Kudu scans.
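
      One way to picture "disabling ineffective filters" is a reject-ratio check: count how many rows each filter has tested and how many it rejected, and stop applying a filter once it has seen enough rows yet rejects too few of them. The sketch below is illustrative only and is not Impala's or Kudu's actual code; the names `FilterStats` and `update_enabled` and the thresholds `min_rows` and `min_reject_ratio` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    """Running counters for one runtime filter (hypothetical structure)."""
    considered: int = 0   # rows tested against the filter so far
    rejected: int = 0     # rows the filter eliminated so far
    enabled: bool = True  # whether the scan should keep applying the filter

    def record(self, considered: int, rejected: int) -> None:
        """Accumulate counters from one batch of scanned rows."""
        self.considered += considered
        self.rejected += rejected

    def reject_ratio(self) -> float:
        """Fraction of tested rows that the filter rejected."""
        return self.rejected / self.considered if self.considered else 0.0


def update_enabled(stats: FilterStats,
                   min_rows: int = 16384,            # assumed sample size
                   min_reject_ratio: float = 0.1     # assumed threshold
                   ) -> bool:
    """Disable a filter that has seen enough rows but rejects too few.

    A filter is never judged before `min_rows` rows have been tested, so a
    slow start does not disable a filter that may turn out to be selective.
    """
    if stats.enabled and stats.considered >= min_rows:
        if stats.reject_ratio() < min_reject_ratio:
            stats.enabled = False
    return stats.enabled
```

      With these assumed thresholds, a filter that rejected 200 of 20000 rows would be disabled, while one that rejected 15000 of 20000 would stay on. For a Kudu scan the evaluation happens on the Kudu side, so where such counters would be collected is an open design question that this sketch does not address.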

            People

              wzhou Wenzhe Zhou
              wzhou Wenzhe Zhou