Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Join row-count estimates regressed significantly after the changes made for DRILL-7148. This affects several TPC-DS queries.
One of the fixes for DRILL-7148 changed DrillRelMDDistinctRowcount to fall back to the guess of 0.1 * input_row_count only when some column in the group-by key lacks NDV (number of distinct values) statistics. However, the fix was incorrect: it caused the guessed NDV to be used even when statistics were present.
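The intended fallback rule can be sketched roughly as follows. This is a minimal illustration, not the actual DrillRelMDDistinctRowcount code: the function name, the dictionary of per-column NDVs, and the way multiple column NDVs are combined (product capped at the input row count) are all assumptions made for the example.

```python
from typing import Mapping, Sequence

GUESS_FACTOR = 0.1  # the 0.1 * input_row_count guess described in the issue


def distinct_row_count(input_row_count: float,
                       group_by_cols: Sequence[str],
                       ndv_stats: Mapping[str, float]) -> float:
    """Hypothetical sketch of the intended behavior.

    Use NDV statistics when every group-by column has them;
    otherwise fall back to the 0.1 * input_row_count guess.
    """
    ndvs = [ndv_stats.get(col) for col in group_by_cols]
    if any(ndv is None for ndv in ndvs):
        # At least one column has no NDV statistic: use the guess.
        return GUESS_FACTOR * input_row_count
    # Assumption: combine per-column NDVs as a product, capped at the row count.
    estimate = 1.0
    for ndv in ndvs:
        estimate *= ndv
    return min(estimate, input_row_count)
```

The regression described above corresponds to the guess branch being taken even when `ndv_stats` covers every column in the key.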
Because of the regression, the NDV was estimated as 0.1 * input_row_count, so the join cardinality was severely underestimated. For TPCDS-21: 400M * 15 / Max(40M, 15) = 150.
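The arithmetic can be checked with a short sketch. It assumes the join estimate has the form left_rows * right_rows / max(left_ndv, right_ndv), as the expression above suggests, and that the guessed NDV is 0.1 * 400M = 40M. The function name and the "with statistics" NDV of 15 are hypothetical, chosen only to show the contrast.

```python
def join_row_count(left_rows: float, right_rows: float,
                   left_ndv: float, right_ndv: float) -> float:
    """Join cardinality estimate: product of input row counts
    divided by the larger of the two join-key NDVs."""
    return left_rows * right_rows / max(left_ndv, right_ndv)


LEFT_ROWS = 400_000_000   # the 400M-row input from the issue
RIGHT_ROWS = 15           # the 15-row input from the issue

# Regressed behavior: NDV guessed as 0.1 * input_row_count (40M).
regressed = join_row_count(LEFT_ROWS, RIGHT_ROWS, 0.1 * LEFT_ROWS, RIGHT_ROWS)

# Hypothetical case where statistics report an NDV of 15 on both sides.
with_stats = join_row_count(LEFT_ROWS, RIGHT_ROWS, 15, 15)

print(f"regressed estimate:  {regressed:,.0f}")
print(f"with NDV statistics: {with_stats:,.0f}")
```

Under these assumptions the guessed NDV shrinks the estimate by several orders of magnitude compared with an estimate based on a small, statistics-derived NDV.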
Issue Links
- Dependent: DRILL-7227 TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100 (Resolved)