Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Impala 2.12.0
Description
The following plan, in which the hash join's cardinality estimate is 9223372036854775807 (Long.MAX_VALUE):
F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
Per-Host Resources: mem-estimate=33.94MB mem-reservation=1.94MB
02:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: b.code = a.code
|  fk/pk conjuncts: none
|  runtime filters: RF000 <- a.code
|  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB
|  tuple-ids=1,0 row-size=163B cardinality=9223372036854775807    <==== Estimation due to overflow.
|
|--03:EXCHANGE [BROADCAST]
|  |  mem-estimate=0B mem-reservation=0B
|  |  tuple-ids=0 row-size=82B cardinality=823
|  |
|  F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B
|  00:SCAN HDFS [default.sample_07 a, RANDOM]
|     partitions=1/1 files=1 size=44.98KB
|     stats-rows=823 extrapolated-rows=disabled
|     table stats: rows=823 size=44.98KB
|     column stats: all
|     mem-estimate=32.00MB mem-reservation=0B
|     tuple-ids=0 row-size=82B cardinality=823
|
01:SCAN HDFS [default.sample_08 b, RANDOM]
   partitions=1/1 files=1 size=44.99KB
   runtime filters: RF000 -> b.code
   stats-rows=823 extrapolated-rows=disabled
   table stats: rows=823 size=44.99KB
   column stats: all
   mem-estimate=32.00MB mem-reservation=0B
   tuple-ids=1 row-size=82B cardinality=823
is the result of both join columns having an NDV (number of distinct values) of 0.
The join cardinality estimation at
https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/planner/JoinNode.java#L368
should handle this case more gracefully.
IMPALA-7310 makes it a bit more likely that someone will run into this.
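
As an illustration of how a zero NDV can turn into the cardinality seen in the plan, here is a minimal Java sketch. It is not the code in JoinNode.java: the class name, method names, and the formula |L| * |R| / max(NDV) are assumptions chosen for illustration, and returning -1 for "unknown" is only one possible way to handle the case. The part that is standard Java behaviour is that dividing by 0.0 yields positive infinity and Math.round maps positive infinity to Long.MAX_VALUE.

```java
// Hypothetical sketch, not the actual Impala implementation.
final class JoinCardinalitySketch {

    // Unguarded estimate (assumed formula): |L| * |R| / max(ndv(L), ndv(R)).
    // With both NDVs at 0 the division produces +Infinity, and
    // Math.round(+Infinity) returns Long.MAX_VALUE.
    static long naiveEstimate(long lhsCard, long rhsCard, double lhsNdv, double rhsNdv) {
        double joinCard = (lhsCard / Math.max(lhsNdv, rhsNdv)) * rhsCard;
        return Math.round(joinCard);
    }

    // Guarded variant: if no usable NDV exists, report the cardinality as
    // unknown (-1 here, an assumed convention) instead of overflowing.
    static long guardedEstimate(long lhsCard, long rhsCard, double lhsNdv, double rhsNdv) {
        double maxNdv = Math.max(lhsNdv, rhsNdv);
        if (maxNdv <= 0) {
            return -1;
        }
        double joinCard = (lhsCard / maxNdv) * rhsCard;
        return Math.round(joinCard);
    }

    public static void main(String[] args) {
        // Row counts taken from the plan above (823 rows on each side).
        System.out.println(naiveEstimate(823, 823, 0, 0));
        System.out.println(guardedEstimate(823, 823, 0, 0));
        System.out.println(guardedEstimate(823, 823, 823, 823));
    }
}
```

With the row counts from the plan, the unguarded method returns 9223372036854775807 when both NDVs are 0, while the guarded one returns -1. Whether an unknown estimate or some fallback NDV is the better choice depends on how the planner consumes the value.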