Details
- Type: Task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The Bloom filters are sized by estimating the NDV (number of distinct values) and then applying an FPP (false positive probability) of 75%. This may be too high to be very useful: at that FPP a correctly sized filter rejects only about 25% of non-matching rows, so if our filters are currently filtering out more than that, it is only because we are overestimating NDV.
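To make the sizing argument concrete, here is a minimal sketch using the textbook Bloom filter formula m = -n ln(p) / (ln 2)^2. This is an assumption for illustration only: Impala's actual filters and sizing code may use different math, and the function names `bloom_bits` and `effective_fpp` are hypothetical, not Impala APIs.

```python
import math

LN2_SQUARED = math.log(2) ** 2


def bloom_bits(ndv: int, fpp: float) -> float:
    """Bits needed by a classic Bloom filter for `ndv` distinct values at target `fpp`.

    Textbook formula, assuming an optimal (real-valued) number of hash functions.
    """
    if ndv <= 0 or not 0.0 < fpp < 1.0:
        raise ValueError("ndv must be positive and fpp must be in (0, 1)")
    return -ndv * math.log(fpp) / LN2_SQUARED


def effective_fpp(bits: float, true_ndv: int) -> float:
    """False positive probability actually achieved when `true_ndv` values are inserted.

    Inverse of the sizing formula above, under the same assumptions.
    """
    if bits <= 0 or true_ndv <= 0:
        raise ValueError("bits and true_ndv must be positive")
    return math.exp(-(bits / true_ndv) * LN2_SQUARED)


if __name__ == "__main__":
    target_fpp = 0.75
    true_ndv = 1_000
    for estimated_ndv in (1_000, 2_000, 4_000):
        bits = bloom_bits(estimated_ndv, target_fpp)
        fpp = effective_fpp(bits, true_ndv)
        print(
            f"estimated NDV {estimated_ndv}: {bits / true_ndv:.2f} bits per true value, "
            f"effective FPP {fpp:.3f}, rejects {1 - fpp:.1%} of non-matching rows"
        )
```

Under these assumptions the effective FPP equals the target FPP raised to the power (estimated NDV / true NDV), so a filter sized for 75% only rejects substantially more than 25% of non-matching rows when the NDV estimate is too high.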
Issue Links
- depends upon
  - IMPALA-10110 Separate option to control fpp for bloom filter sizing (Resolved)
- is related to
  - IMPALA-5633 Bloom filters underestimate false positive probability (Open)
  - IMPALA-11924 Bloom filter size is unaffected by column NDV (Resolved)
  - IMPALA-12451 Cardinality underestimation can hurt bloom filter effectiveness (Open)