Details
Sub-task
Status: Resolved
Major
Resolution: Resolved
Impala 3.0
None
Description
Impala makes heavy use of stats, which is a good thing. Stats feed into query planning where they allow the planner to choose among a fixed set of alternatives such as: do I put t1 on the build or probe side of a join?
Because the planner decisions tend to be discrete, we only need enough information to decide whether to do A or B (or, more generally, to choose among a set of choices A, B, C, ... N).
Data sizes often differ vastly between paths. Stats help refine these numbers, but much of the information only needs to be in the ballpark: is table t1 larger or smaller than t2? Often one table is much larger than the other, so even a rough size estimate will force the right decision (put the smaller table on the build side of a join).
Today, if Impala has no stats, it refuses to even consider table size. Consider the following unit test:
runTest("SELECT a FROM functional.tinytable;", -1);
This plans the given query, then verifies that the planner's estimated result cardinality matches the number given. In this case, tinytable has no stats, so the planner does not know the cardinality and reports -1. OK...
The table turns out to have 3 rows. Suppose I join it to a hypothetical hugetable of 1 million rows. Without even a guess at tinytable's cardinality, Impala can't choose a good plan.
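To make the problem concrete, here is a minimal sketch of a discrete build-side decision. It is not Impala's actual join planning code. The class BuildSideChoice, the Side enum, and chooseBuildSide() are hypothetical names, and the rule "smaller input goes on the build side" is the simplification used in this description.

```java
final class BuildSideChoice {
  enum Side { LEFT, RIGHT, UNDECIDED }

  /**
   * Picks which join input to build the hash table from.
   * A negative row count means "cardinality unknown", as in the test above.
   */
  static Side chooseBuildSide(long leftRows, long rightRows) {
    // With either side unknown there is nothing to compare.
    if (leftRows < 0 || rightRows < 0) return Side.UNDECIDED;
    // Otherwise the smaller input goes on the build side.
    return leftRows <= rightRows ? Side.LEFT : Side.RIGHT;
  }
}
```

With tinytable at -1 and hugetable at 1,000,000 the sketch cannot decide. Any rough estimate for tinytable, whether 1 or 3 or 30, gives the same answer.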
The suggestion is to use table size to estimate row cardinality. Pick some assumed row width, say 100 bytes. Then estimate the row count as file size / assumed row width. This gives a ballpark number that would be plenty good for the planner to choose the proper plan much of the time.
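The arithmetic is small enough to sketch. The class RowCountEstimate and the method estimateRows() are hypothetical names, not existing Impala code. Rounding up, so that a non-empty file never estimates to zero rows, is an assumption of this sketch rather than something the proposal specifies.

```java
final class RowCountEstimate {
  // Assumed average row width in bytes; 100 is the value floated above.
  static final long ASSUMED_ROW_WIDTH_BYTES = 100;

  /**
   * Estimates the row count from the total file size.
   * Returns -1 (unknown) if the file size is not available.
   */
  static long estimateRows(long totalFileBytes, long rowWidthBytes) {
    if (totalFileBytes < 0 || rowWidthBytes <= 0) return -1;
    if (totalFileBytes == 0) return 0;
    // Ceiling division: any non-empty file counts as at least one row.
    return (totalFileBytes - 1) / rowWidthBytes + 1;
  }
}
```

For example, a 1,001-byte table with the assumed 100-byte width estimates to 11 rows, and any file of 1 to 100 bytes estimates to 1 row.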
Since this is such an easy estimate to make, and will address the occasional case in which stats are not available, it seems a shame to not take advantage of this information.
In terms of implementation, HdfsScanNode.computeCardinalities() already uses some extrapolation, if enabled. It can be extended to do the last-ditch extrapolation suggested above if, after the current techniques, the cardinality is still undefined.
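The ordering described here can be sketched as a fallback chain. This is not the actual body of HdfsScanNode.computeCardinalities(). The class CardinalityFallback, the method resolve(), and its parameters are hypothetical, and the round-up behavior is an assumption of the sketch.

```java
final class CardinalityFallback {
  static final long UNKNOWN = -1;
  // Assumed average row width in bytes (the value proposed in this ticket).
  static final long ASSUMED_ROW_WIDTH_BYTES = 100;

  /**
   * Returns the first defined cardinality, in order of preference:
   * table stats, the existing extrapolation, then the file-size guess.
   * Negative inputs mean "not available".
   */
  static long resolve(long statsRows, long extrapolatedRows, long totalFileBytes) {
    if (statsRows >= 0) return statsRows;
    if (extrapolatedRows >= 0) return extrapolatedRows;
    // Last-ditch estimate: only reached when everything else is undefined.
    if (totalFileBytes == 0) return 0;
    if (totalFileBytes > 0) return (totalFileBytes - 1) / ASSUMED_ROW_WIDTH_BYTES + 1;
    return UNKNOWN;
  }
}
```

Putting the file-size guess last means real stats and the existing extrapolation always win when they are present, so the change only affects plans that would otherwise have no cardinality at all.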
If we apply this simple fix in a prototype build, the new test result is closer to reality:
runTest("SELECT a FROM functional.tinytable;", 1);
Given that the fix is so simple, any reason not to use the file size, when available? Is 100 a reasonable assumed row width? Should this functionality always be on, not just when enabled using the back-end config?