IMPALA-9302

Multithreaded scanners don't check for filter effectiveness


Details

    Description

      This can be reproduced with TPC-H Q9. I saw it locally at scale factor 30, where the mt_dop=4 version of the query uses a lot more CPU in the scan than the mt_dop=0 version.

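      This turns out to be because none of the runtime filters are being disabled, not even the ineffective ones. With mt_dop=4, filters 2, 4, 5 and 8 reject no rows, yet each is still applied to essentially all of the roughly 31M rows: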

                Filter 2 (16.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 30.97M (30970695)
                   - Rows rejected: 0 (0)
                   - Rows total: 31.01M (31009074)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 4 (8.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 30.97M (30970695)
                   - Rows rejected: 0 (0)
                   - Rows total: 31.01M (31009074)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 5 (8.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 30.97M (30970695)
                   - Rows rejected: 0 (0)
                   - Rows total: 31.01M (31009074)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 8 (1.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 31.01M (31009074)
                   - Rows rejected: 0 (0)
                   - Rows total: 31.01M (31009074)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 10 (1.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 31.01M (31009074)
                   - Rows rejected: 29.32M (29317263)
                   - Rows total: 31.01M (31009074)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
      

      In contrast, here are the filters for mt_dop=0, where the filters that reject nothing (2, 4, 5 and 8) are applied to only about 8M of the 180M rows:

                Filter 2 (16.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 8.18M (8180257)
                   - Rows rejected: 0 (0)
                   - Rows total: 180.00M (179998372)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 4 (8.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 8.18M (8180257)
                   - Rows rejected: 0 (0)
                   - Rows total: 180.00M (179998372)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 5 (8.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 8.18M (8180257)
                   - Rows rejected: 0 (0)
                   - Rows total: 180.00M (179998372)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 8 (1.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 8.41M (8406914)
                   - Rows rejected: 0 (0)
                   - Rows total: 180.00M (179998372)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
                Filter 10 (1.00 MB):
                   - Files processed: 0 (0)
                   - Files rejected: 0 (0)
                   - Files total: 0 (0)
                   - RowGroups processed: 0 (0)
                   - RowGroups rejected: 0 (0)
                   - RowGroups total: 0 (0)
                   - Rows processed: 180.00M (179998372)
                   - Rows rejected: 170.18M (170177099)
                   - Rows total: 180.00M (179998372)
                   - Splits processed: 0 (0)
                   - Splits rejected: 0 (0)
                   - Splits total: 0 (0)
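
      To put numbers on the two profiles above, the counters can be turned into two ratios per filter: the fraction of scanned rows that were tested against the filter, and the fraction of tested rows that it rejected. The following is only a small illustrative sketch; the struct and function names are made up for this example and are not Impala code.

```cpp
#include <cassert>
#include <cstdint>

// Counter values for one filter, as printed in the profile excerpts.
// (Hypothetical helper type, not part of Impala.)
struct FilterCounters {
  int64_t rows_processed;  // "Rows processed"
  int64_t rows_rejected;   // "Rows rejected"
  int64_t rows_total;      // "Rows total"
};

// Fraction of scanned rows that were actually tested against the filter.
double ProcessedFraction(const FilterCounters& c) {
  if (c.rows_total == 0) return 0.0;
  return static_cast<double>(c.rows_processed) / c.rows_total;
}

// Fraction of tested rows that the filter eliminated.
double RejectRatio(const FilterCounters& c) {
  assert(c.rows_rejected <= c.rows_processed);
  if (c.rows_processed == 0) return 0.0;
  return static_cast<double>(c.rows_rejected) / c.rows_processed;
}
```

      With the numbers from this ticket, filter 2 is tested against about 99.9% of rows with mt_dop=4 (30970695 / 31009074) but only about 4.5% with mt_dop=0 (8180257 / 179998372), while rejecting nothing in either case. Filter 10 rejects roughly 94.5% of the rows it sees in both runs.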
      

      Perf top showed 28% of CPU time in impala::BloomFilter::BucketFindAVX2, which corroborates this.
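
      For illustration, the kind of effectiveness check that appears to be missing on the multithreaded path could look roughly like the sketch below. This is not the actual Impala implementation: the struct, the function, and both constants (the minimum reject ratio and the number of rows seen before the first check) are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-filter bookkeeping kept by one scanner instance.
struct LocalFilterStats {
  int64_t considered = 0;  // rows tested against the filter so far
  int64_t rejected = 0;    // rows the filter eliminated so far
  bool enabled = true;     // cleared once the filter is judged ineffective
};

// Assumed tuning knobs; names and values are illustrative only.
constexpr double kMinRejectRatio = 0.1;
constexpr int64_t kRowsPerCheck = 16 * 1024;

// Records the outcome of applying the filter to one batch. Once at least
// kRowsPerCheck rows have been seen, the cumulative reject ratio is checked
// after every batch, and the filter is disabled if the ratio is too low.
// A disabled filter is never updated or re-enabled.
void UpdateAndCheck(LocalFilterStats* stats, int64_t batch_rows,
                    int64_t batch_rejected) {
  assert(batch_rejected <= batch_rows);
  if (!stats->enabled) return;
  stats->considered += batch_rows;
  stats->rejected += batch_rejected;
  if (stats->considered < kRowsPerCheck) return;
  const double ratio =
      static_cast<double>(stats->rejected) / stats->considered;
  if (ratio < kMinRejectRatio) stats->enabled = false;
}
```

      Under these assumptions, a filter that rejects nothing (like filters 2, 4, 5 and 8 here) would be switched off after the first 16K rows instead of being probed for all 31M, while a filter with a reject ratio like filter 10's would stay enabled.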

          People

            tarmstrong Tim Armstrong