Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 3.3.0
Description
This can be reproduced with TPC-H Q9. I saw it locally at scale factor 30, where the mt_dop=4 version of the query uses a lot more CPU in the scan than the mt_dop=0 version.
The cause turns out to be that none of the runtime filters are getting disabled with mt_dop=4, not even the ineffective ones that reject no rows.
Filter 2 (16.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 30.97M (30970695)
   - Rows rejected: 0 (0)
   - Rows total: 31.01M (31009074)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 4 (8.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 30.97M (30970695)
   - Rows rejected: 0 (0)
   - Rows total: 31.01M (31009074)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 5 (8.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 30.97M (30970695)
   - Rows rejected: 0 (0)
   - Rows total: 31.01M (31009074)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 8 (1.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 31.01M (31009074)
   - Rows rejected: 0 (0)
   - Rows total: 31.01M (31009074)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 10 (1.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 31.01M (31009074)
   - Rows rejected: 29.32M (29317263)
   - Rows total: 31.01M (31009074)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
In contrast, here are the filters for mt_dop=0, where the filters that reject nothing are applied to only a small fraction of the rows.
Filter 2 (16.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 8.18M (8180257)
   - Rows rejected: 0 (0)
   - Rows total: 180.00M (179998372)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 4 (8.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 8.18M (8180257)
   - Rows rejected: 0 (0)
   - Rows total: 180.00M (179998372)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 5 (8.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 8.18M (8180257)
   - Rows rejected: 0 (0)
   - Rows total: 180.00M (179998372)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 8 (1.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 8.41M (8406914)
   - Rows rejected: 0 (0)
   - Rows total: 180.00M (179998372)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Filter 10 (1.00 MB):
   - Files processed: 0 (0)
   - Files rejected: 0 (0)
   - Files total: 0 (0)
   - RowGroups processed: 0 (0)
   - RowGroups rejected: 0 (0)
   - RowGroups total: 0 (0)
   - Rows processed: 180.00M (179998372)
   - Rows rejected: 170.18M (170177099)
   - Rows total: 180.00M (179998372)
   - Splits processed: 0 (0)
   - Splits rejected: 0 (0)
   - Splits total: 0 (0)
Perf top showed 28% of CPU time in impala::BloomFilter::BucketFindAVX2, which corroborates this.
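To make the difference between the two profiles easier to see, here is a small Python sketch that takes the "Rows processed / rejected / total" counters quoted above and computes, per filter, the fraction of rows the filter was applied to and the fraction it rejected. The two cutoffs (MIN_REJECT_RATIO and MIN_PROCESSED_FRACTION) and the function names are assumptions made for this illustration only; they are not taken from the Impala source and do not describe how the scanner actually decides to disable a filter.

```python
# Rows counters copied from the profiles in this report:
# filter id -> (rows processed, rows rejected, rows total).
MT_DOP_4 = {
    2: (30970695, 0, 31009074),
    4: (30970695, 0, 31009074),
    5: (30970695, 0, 31009074),
    8: (31009074, 0, 31009074),
    10: (31009074, 29317263, 31009074),
}
MT_DOP_0 = {
    2: (8180257, 0, 179998372),
    4: (8180257, 0, 179998372),
    5: (8180257, 0, 179998372),
    8: (8406914, 0, 179998372),
    10: (179998372, 170177099, 179998372),
}

# Hypothetical cutoffs for this sketch; not values from the Impala code base.
MIN_REJECT_RATIO = 0.1        # below this, treat a filter as ineffective
MIN_PROCESSED_FRACTION = 0.5  # above this, treat a filter as "never disabled"


def summarize(filters):
    """Return {filter_id: (processed_fraction, reject_ratio)}."""
    summary = {}
    for filter_id, (processed, rejected, total) in filters.items():
        processed_fraction = processed / total if total else 0.0
        reject_ratio = rejected / processed if processed else 0.0
        summary[filter_id] = (processed_fraction, reject_ratio)
    return summary


def ineffective_but_still_applied(filters):
    """Ids of filters that reject almost nothing yet ran on most rows."""
    return sorted(
        filter_id
        for filter_id, (processed_fraction, reject_ratio) in summarize(filters).items()
        if reject_ratio < MIN_REJECT_RATIO
        and processed_fraction > MIN_PROCESSED_FRACTION
    )


if __name__ == "__main__":
    for label, filters in (("mt_dop=4", MT_DOP_4), ("mt_dop=0", MT_DOP_0)):
        print(label, "ineffective but still applied:",
              ineffective_but_still_applied(filters))
        for filter_id, (frac, ratio) in sorted(summarize(filters).items()):
            print(f"  Filter {filter_id}: applied to {frac:.1%} of rows, "
                  f"rejected {ratio:.1%} of those")
```

With the counters above, this flags filters 2, 4, 5 and 8 for mt_dop=4 and none for mt_dop=0. In both runs filter 10 rejects roughly 95% of the rows it sees, so it is the only filter worth keeping enabled.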