Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
Impala 3.1.0
-
None
-
None
-
ghx-label-3
Description
Consider the TPC-H queries. Here is a part of one of them:
select count(*) from tpch.customer c, tpch.orders o where c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey and o_orderdate < '1995-03-15'
This produces the following cardinality estimates (using the revised, standard join cardinality estimator):
| 02:HASH JOIN [INNER JOIN, BROADCAST] | | | hash predicates: o.o_custkey = c.c_custkey | | | runtime filters: RF000 <- c.c_custkey | | | row-size=59B cardinality=30.00K | | | | | |--04:EXCHANGE [BROADCAST] | | | | | | | 00:SCAN HDFS [tpch.customer c] | | | partitions=1/1 files=1 size=23.08MB row-size=29B cardinality=30.00K | | | predicates: c.c_mktsegment = 'BUILDING' | | | | | 01:SCAN HDFS [tpch.orders o] | | partitions=1/1 files=1 size=162.56MB row-size=30B cardinality=150.00K | | predicates: o_orderdate < '1995-03-15' | | runtime filters: RF000 -> o.o_custkey |
Run the query, the actual count is 147K vs. the estimate of 30K. The culprit here is the HDFS Scan node with an estimated cardinality of 150K. Run just this query:
select count(*) from tpch.orders o where o_orderdate < '1995-03-15'
With this plan:
| 00:SCAN HDFS [tpch.orders o] | | partitions=1/1 files=1 size=162.56MB row-size=22B cardinality=150.00K | | predicates: o_orderdate < '1995-03-15' |
With with an actual result of 727K. Had this been used in the previous plan instead of 30K, the join cardinality would have been about four times larger, or 120K, which is far closer to the actual total.
So, the problem is the mis-estimation of the inequality, which are very hard to estimate. The actual selectivity, found by running a count(*) on the table, is 727K / 1.5M =~ 0.5. But, Impala estimated 30K / 1.5M = 0.02. Most systems estimate around 0.45 or so.
For example, using a default estimate of 1/3 for inequality provides the following plan, which is much better:
03:HASH JOIN [INNER JOIN] | hash predicates: c.c_custkey = o.o_custkey | runtime filters: RF002 <- o.o_custkey | row-size=71B cardinality=120.00K | |--01:SCAN HDFS [tpch.orders o] | partitions=1/1 files=1 size=162.56MB row-size=42B cardinality=600.00K | predicates: o_orderdate < '1995-03-15' | runtime filters: RF000 -> o.o_orderkey
Makes one wonder if the data set was designed with the industry-standard defaults in mind...