[IMPALA-8034] PlannerTest cardinality tests are not realistic - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: Impala 3.1.0
Fix Version/s: Impala 3.2.0
Component/s: Frontend
Labels: None

Description

Impala generally assumes that queries are M:1, joined on the FK/PK. A PK uniquely identifies a row, so |pk1| = |Table|. This assumption is built into join estimation, along with the assumption that columns are independent: if a key has multiple columns, |pk1| * |pk2| * … * |pkn| = |Table|.
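As a rough illustration of the compound-key identity, the sketch below multiplies per-column distinct-value counts (NDVs) and compares the product to the row count. The function name `compound_key_ndv` and the (store_id, day) table are hypothetical, chosen only for this example; this is not Impala's planner code.

```python
from math import prod


def compound_key_ndv(column_ndvs):
    """Distinct key combinations, assuming the key columns are independent."""
    return prod(column_ndvs)


# Hypothetical table keyed by (store_id, day): 50 stores, 365 days,
# one row per store per day.
table_rows = 50 * 365
key_ndv = compound_key_ndv([50, 365])

# With a true compound key, the NDV product equals the row count.
print(key_ndv == table_rows)
```

The identity holds here only because every (store_id, day) pair occurs exactly once, which is the situation the independence assumption describes.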

However, PlannerTest frequently joins on non-independent, non-unique columns. A query might join on both the (unique) id column and the non-unique int_col column, which throws off the calculations. For example:

select *
from functional.alltypesagg a
full outer join functional.alltypessmall b using (id, int_col)
right join functional.alltypesaggnonulls c on (a.id = c.id and b.string_col = c.string_col)

If we then try to make the estimated cardinalities match the actual cardinalities obtained from running the query, we end up fighting our own assumptions. This shows up in the code: rather than using the classic assumption that the key columns are independent, the code applies special adjustments for redundant columns, perhaps so that tests such as the one above produce good estimates.
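To see why a redundant join column fights the independence assumption, consider a minimal sketch using the textbook equi-join estimate |L| * |R| / max(ndv(L keys), ndv(R keys)), with the compound-key NDV taken as the product of the per-column NDVs. The function `estimate_join_rows` and all statistics below are hypothetical; this is not Impala's actual estimator, which, as noted above, applies its own adjustments.

```python
from math import prod


def estimate_join_rows(left_rows, right_rows, left_ndvs, right_ndvs):
    """Textbook equi-join estimate under the independence assumption.

    Each *_ndvs list holds the NDV of every join column on that side;
    the compound-key NDV is their product (deliberately uncapped here).
    """
    key_ndv = max(prod(left_ndvs), prod(right_ndvs))
    return left_rows * right_rows / key_ndv


# Hypothetical statistics: id is unique on both sides; int_col is not.
big_rows, big_id_ndv, big_int_ndv = 10_000, 10_000, 1_000
small_rows, small_id_ndv, small_int_ndv = 100, 100, 10

# Join on the unique id column only.
on_id = estimate_join_rows(big_rows, small_rows, [big_id_ndv], [small_id_ndv])

# Join on (id, int_col): the NDV product is 10,000,000, far more than the
# 10,000 rows in the table, so the estimate shrinks by a factor of 1,000.
on_both = estimate_join_rows(
    big_rows, small_rows,
    [big_id_ndv, big_int_ndv],
    [small_id_ndv, small_int_ndv],
)

print(on_id, on_both)
```

Because id already identifies a row, adding int_col to the join cannot remove many matches in practice, yet the naive product treats it as an independent filter. Matching the actual row counts for such a query therefore requires some correction on top of the independence assumption.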

It would be better to modify (or add) tests that are based on our assumptions, so we can verify that the intended logic works. It is then fine to add a few “oddball” queries to see how well the estimates hold up when the data (or user) does not follow the independence assumption.

Alternatively, add new tests that use realistic joins and retain the existing tests, adding a note that explains why the resulting cardinality estimates appear wrong (because the joins use unrealistic, redundant columns, which real users seldom do).

People

Assignee: Paul Rogers

Reporter: Paul Rogers

Votes: 0

Watchers: 2

Dates

Created: 31/Dec/18 20:25

Updated: 14/Mar/19 14:23

Resolved: 07/Feb/19 01:10