Details
- Type: Improvement
- Status: Closed
- Priority: Minor
- Resolution: Fixed
- 2.1.0, 2.2.0
Description
After HIVE-13287 went in, we calculate IN clause estimates natively (instead of just dividing the incoming number of rows by 2). However, because the values of each column are assumed to be uniformly distributed, we might end up heavily underestimating or overestimating the resulting number of rows.
This issue adds a factor that multiplies the IN clause estimate so we can alleviate this problem. The solution is not very elegant, but it is the best we can do until we have histograms to improve our estimate.
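To make the idea concrete, here is a minimal sketch of how such a factor could scale a uniform-distribution estimate. It is not Hive's actual code: the function name `estimate_in_rows`, the parameter `in_factor`, and the selectivity formula (number of IN values divided by the column's number of distinct values) are assumptions chosen only for illustration.

```python
def estimate_in_rows(num_rows: int, ndv: int, num_in_values: int,
                     in_factor: float = 1.0) -> int:
    """Hypothetical estimator for `col IN (v1, ..., vk)`; not Hive's implementation.

    num_rows      -- rows entering the filter
    ndv           -- number of distinct values of the column (from column stats)
    num_in_values -- number of values listed in the IN clause
    in_factor     -- assumed tuning multiplier applied to the estimate
    """
    if num_rows <= 0 or num_in_values <= 0:
        return 0
    ndv = max(ndv, 1)  # guard against missing or zero NDV stats

    # Uniform assumption: every distinct value covers num_rows / ndv rows,
    # so k listed values select at most k / ndv of the input.
    selectivity = min(1.0, num_in_values / ndv)

    # Scale by the factor, then clamp so the result never exceeds the input.
    estimate = round(num_rows * selectivity * in_factor)
    return int(min(num_rows, max(0, estimate)))


if __name__ == "__main__":
    print(estimate_in_rows(1000, 100, 5))        # plain uniform estimate
    print(estimate_in_rows(1000, 100, 5, 2.0))   # factor compensates for skew
```

With a factor above 1 the estimate grows to compensate for skewed columns where the listed values are more frequent than average; with a factor below 1 it shrinks. The clamp keeps the result within the incoming row count regardless of the factor.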
Attachments
Issue Links
- relates to: HIVE-13287 Add logic to estimate stats for IN operator (Closed)