Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Impala 3.4.0
Description
TPC-DS query 13 has a set of predicates on the ca_state column of the customer_address table that are currently evaluated after the join of customer_address and store_sales. These ca_state predicates could be pushed down to the customer_address scan node, which would reduce the size of the join input by a factor of 3.4.
As an experiment, I added a redundant predicate to the query (see attached query13_mod.sql), which causes the planner to evaluate the predicate at the scan node.
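The idea behind the redundant predicate can be sketched on a toy dataset. This is only an illustration, not the contents of query13_mod.sql: the table contents, the state lists and the profit ranges below are made-up assumptions, and SQLite is used purely to check that the extra predicate does not change the result. It says nothing about how the Impala planner behaves. When every branch of a disjunction carries its own ca_state IN-list, the union of those lists is implied by the whole disjunction, so it can be added as a standalone predicate that references only customer_address.

```python
import sqlite3

# Hypothetical miniature tables; column names follow TPC-DS, values are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_address (ca_address_sk INTEGER, ca_state TEXT);
    CREATE TABLE store_sales (ss_addr_sk INTEGER, ss_net_profit REAL);
    INSERT INTO customer_address VALUES
        (1, 'CO'), (2, 'OR'), (3, 'VA'), (4, 'NY'), (5, 'WA');
    INSERT INTO store_sales VALUES
        (1, 150.0), (2, 200.0), (3, 100.0), (4, 150.0), (5, 200.0), (1, 900.0);
""")

# Disjunction mixing columns of both tables, so it can only be evaluated
# after the join (illustrative values, assumed for this sketch).
DISJUNCTION = """
    (   (ca_state IN ('CO', 'OH', 'TX') AND ss_net_profit BETWEEN 100 AND 200)
     OR (ca_state IN ('OR', 'MN', 'KY') AND ss_net_profit BETWEEN 150 AND 300)
     OR (ca_state IN ('VA', 'CA', 'MS') AND ss_net_profit BETWEEN 50 AND 250))
"""

# Redundant single-table predicate: the union of the per-branch state lists.
REDUNDANT = "ca_state IN ('CO','OH','TX','OR','MN','KY','VA','CA','MS')"

BASE = """
    SELECT ca_address_sk, ca_state, ss_net_profit
    FROM store_sales JOIN customer_address ON ss_addr_sk = ca_address_sk
    WHERE {where}
    ORDER BY 1, 2, 3
"""

original = conn.execute(BASE.format(where=DISJUNCTION)).fetchall()
modified = conn.execute(
    BASE.format(where=DISJUNCTION + " AND " + REDUNDANT)
).fetchall()

# Rows of customer_address that survive if the redundant predicate is
# applied at the scan, before the join.
scan_rows = conn.execute(
    "SELECT COUNT(*) FROM customer_address WHERE " + REDUNDANT
).fetchone()[0]

print(original == modified, scan_rows)
```

Both queries return the same rows, while the single-table predicate alone already discards the customer_address rows that no branch of the disjunction could ever match.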
Performance of the original and modified queries at 10 TB scale factor:
Original: 164 seconds
Modified: 44 seconds
Query profiles for both versions are attached.
Attachments
Issue Links
- causes
  - IMPALA-11274 CNF Rewrite causes a regress in join node performance (Resolved)
- is related to
  - IMPALA-11770 Review Predicate arguments for impact of CNF rewrites (Open)
  - IMPALA-9620 Predicates in the SELECT and GROUP-BY cause failure with CNF rewrite enabled (Resolved)
- relates to
  - IMPALA-8165 Planner does not push through predicates when there is a disjunction (Resolved)