Description
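The default row estimation for multi-key joins divides the row estimate by the product of the NDVs (numbers of distinct values) of the join columns. When the join columns are correlated, that product overstates the number of distinct key combinations, so the row estimate can come out too low. Try adding a config option that assumes the columns are correlated, in which case we divide the row estimate only by the largest NDV.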
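To make the difference concrete, here is a minimal sketch of the two denominators. It is not Hive's actual estimator: the function name `estimate_join_rows`, the `assume_correlated` flag, and the choice of one NDV per join column are all assumptions made for illustration, and the row counts in the example are invented.

```python
from math import prod


def estimate_join_rows(left_rows, right_rows, key_ndvs, assume_correlated=False):
    """Illustrative join cardinality estimate (hypothetical helper, not Hive code).

    key_ndvs holds one NDV per join column. How that per-column NDV is
    derived from the two join sides is outside the scope of this sketch.
    """
    if not key_ndvs:
        raise ValueError("at least one join column NDV is required")
    if assume_correlated:
        # Correlated columns: only the largest NDV limits the key combinations.
        denominator = max(key_ndvs)
    else:
        # Default behaviour described above: treat columns as independent.
        denominator = prod(key_ndvs)
    # Never estimate fewer than one row.
    return max(1, (left_rows * right_rows) // denominator)


# Two-column join with invented statistics.
independent = estimate_join_rows(1_000_000, 1_000_000, [1000, 100])
correlated = estimate_join_rows(1_000_000, 1_000_000, [1000, 100], assume_correlated=True)
```

With these invented inputs the correlated estimate is larger than the independent one by a factor equal to the smaller NDV, which is the gap the proposed config is meant to close.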
Attachments
Issue Links
- is related to
  - HIVE-20537 Multi-column joins estimates with uncorrelated columns different in CBO and Hive (Closed)
  - HIVE-17308 Improvement in join cardinality estimation (Closed)