Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
Depending on the available statistics, FilterStatsRule sometimes estimates the row count as numRows/3. As a result, a lower keyCount is projected for hash table sizing, which causes frequent rehashing.
E.g. TPC-DS Q74 @ 10TB: as part of evaluating the conditions "t_s_firstyear.year_total > 0, t_w_secyear.year_total / t_w_firstyear.year_total , t_s_secyear.year_total / t_s_firstyear.year_total", it projects one third of the rows, which causes the hash table to be rehashed in the downstream vertex.
We may need to check whether statistics can be projected correctly for these columns.
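To make the effect concrete, here is a minimal sketch of how a numRows/3 estimate can undersize a hash table and force a resize once the real keys arrive. This is not Hive code: the class `RehashSketch`, its methods, the power-of-two capacity policy, and the 0.75 load factor are all assumptions chosen for illustration, and the row counts are made-up numbers rather than values from Q74.

```java
public final class RehashSketch {

    // Hypothetical fallback estimate: one third of the input rows,
    // mirroring the numRows/3 behaviour described in this issue.
    static long estimateRows(long numRows) {
        return numRows / 3;
    }

    // Smallest power-of-two capacity that holds expectedKeys
    // without exceeding the load factor (assumed sizing policy).
    static long initialCapacity(long expectedKeys, double loadFactor) {
        long needed = (long) Math.ceil(expectedKeys / loadFactor);
        long capacity = 1;
        while (capacity < needed) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Number of capacity doublings (each one a full rehash) needed when
    // the table was sized for expectedKeys but receives actualKeys.
    static int countResizes(long expectedKeys, long actualKeys, double loadFactor) {
        long capacity = initialCapacity(expectedKeys, loadFactor);
        int resizes = 0;
        while (actualKeys > (long) (capacity * loadFactor)) {
            capacity <<= 1;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        long numRows = 3_000_000L;       // illustrative input size
        double loadFactor = 0.75;        // assumed load factor
        long estimated = estimateRows(numRows);

        System.out.println("estimated keyCount: " + estimated);
        System.out.println("resizes with numRows/3 estimate: "
                + countResizes(estimated, numRows, loadFactor));
        System.out.println("resizes with accurate estimate: "
                + countResizes(numRows, numRows, loadFactor));
    }
}
```

Under these assumptions, the table sized from the numRows/3 estimate has to grow at least once if the filter in fact passes every row, while a table sized from the true row count does not. The larger the gap between estimated and actual keys, the more doublings are needed.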