Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
With hive.stats.fetch.column.stats=true, we estimate data size from column stats when annotating operators with statistics. However, when column stats are only partially available, we are likely to underestimate data size, which may hurt performance, e.g. by picking an inappropriate small table for a map join.
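
A minimal sketch of how the underestimate can arise. It assumes "partial" means that some columns have no stats at all, and that data size is computed as row count times the summed average column lengths. This is not Hive's actual code: the function name, column names, lengths, and the 10 MB threshold are all hypothetical values chosen for illustration.

```python
from typing import Dict, List

# Hypothetical size limit below which a table is treated as "small" for map join.
MAP_JOIN_THRESHOLD = 10_000_000


def estimate_data_size(num_rows: int, columns: List[str],
                       avg_col_len: Dict[str, float]) -> int:
    """Estimate data size as num_rows * sum of average column lengths.

    Columns without stats contribute nothing, which is the assumed
    source of the underestimate.
    """
    row_width = sum(avg_col_len[c] for c in columns if c in avg_col_len)
    return int(num_rows * row_width)


columns = ["id", "name", "payload"]
num_rows = 1_000_000

# Stats for every column (illustrative average lengths in bytes).
full_stats = {"id": 8.0, "name": 20.0, "payload": 200.0}
# Partial stats: only "id" was analyzed.
partial_stats = {"id": 8.0}

full_estimate = estimate_data_size(num_rows, columns, full_stats)
partial_estimate = estimate_data_size(num_rows, columns, partial_stats)

print(full_estimate)                           # 228000000
print(partial_estimate)                        # 8000000
print(partial_estimate < MAP_JOIN_THRESHOLD)   # True: wrongly looks small enough
print(full_estimate < MAP_JOIN_THRESHOLD)      # False: actually too large
```

Under these assumptions, the table with partial stats looks like a valid small table for a map join even though the estimate from complete stats is far above the limit.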