Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
After a few experiments with the TPC-DS 10TB dataset, we observed that in some cases semijoin reducers were not effective: they either did not reduce the number of records at all or reduced the relation only slightly.
In some cases we can make the semijoin reducer more effective by adding more columns, but this also requires a bigger Bloom filter, so deciding how many columns to include in the Bloom filter becomes more delicate.
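To illustrate why more columns imply a bigger filter, here is a minimal sketch based on the textbook Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2. It is not taken from the Hive code base; the function names and the NDV (number of distinct values) bound are assumptions made for illustration only.

```python
import math


def bloom_filter_bits(expected_entries: int, fpp: float) -> int:
    """Bits needed by a standard Bloom filter (textbook formula, not Hive's sizing code)."""
    if expected_entries <= 0:
        raise ValueError("expected_entries must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    return math.ceil(-expected_entries * math.log(fpp) / (math.log(2) ** 2))


def combined_ndv_upper_bound(column_ndvs, row_count: int) -> int:
    """Hypothetical upper bound on distinct key combinations for a multi-column key.

    The product of the per-column NDVs can never exceed the number of rows.
    """
    return min(math.prod(column_ndvs), row_count)
```

Under these assumptions, a key on two columns with 100 and 50 distinct values may need a filter sized for up to 5,000 entries, whereas a key on the first column alone needs at most 100.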
The current decision model always chooses multi-column semijoin reducers when they are available, but this may not always be beneficial if a single column can already reduce the target relation significantly.
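One possible shape for a more careful decision is sketched below. This is a toy cost model, not the existing Hive logic and not a proposed patch: the `Candidate` type, the `net_benefit` formula, and the cost weights are all hypothetical. It weighs the rows each candidate removes from the target relation against the number of entries its Bloom filter must hold, and rejects candidates whose benefit does not cover that cost.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of one semijoin reducer candidate."""
    columns: Tuple[str, ...]
    selectivity: float   # fraction of target rows that survive the filter (0..1)
    bloom_entries: int   # distinct keys the Bloom filter must hold


def net_benefit(c: Candidate, target_rows: int,
                row_cost: float = 1.0, entry_cost: float = 1.0) -> float:
    """Rows removed from the target, minus the (assumed linear) cost of the filter."""
    rows_removed = target_rows * (1.0 - c.selectivity)
    return rows_removed * row_cost - c.bloom_entries * entry_cost


def choose_reducer(candidates: Sequence[Candidate], target_rows: int,
                   row_cost: float = 1.0, entry_cost: float = 1.0) -> Optional[Candidate]:
    """Pick the candidate with the highest positive net benefit, or None if none pays off."""
    def score(c: Candidate) -> float:
        return net_benefit(c, target_rows, row_cost, entry_cost)

    best = max(candidates, key=score, default=None)
    if best is None or score(best) <= 0:
        return None
    return best
```

With these made-up weights, a single-column reducer that keeps 10% of a 1,000,000-row target with a 1,000-entry filter beats a two-column reducer that keeps 9% but needs 500,000 entries. The multi-column candidate wins only when the extra columns remove substantially more rows.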