Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
It would be beneficial to extend runtime filters to push set exclusion down to scan nodes. This could optimize NOT IN and EXCEPT-style queries or, more generally, ANTI JOINs, as well as OUTER JOINs that filter out non-null attributes from the nullable side.
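This is almost the inverse of a traditional Bloom filter: a Bloom filter can show that a value is definitely absent from a set but never that it is definitely present, while exclusion needs the opposite guarantee. Other data structures might therefore be more efficient.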
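As a rough illustration of the idea, the Python sketch below models an exclusion filter as an exact, size-capped set of build-side keys that a scan consults before emitting a row. Everything here is hypothetical: `build_exclusion_filter`, `scan_with_exclusion`, `left_anti_join` and the `max_entries` cap are names invented for this sketch and are not Impala APIs. It models LEFT ANTI JOIN semantics on a single equality key, not null-aware NOT IN.

```python
# Hypothetical sketch of a set-exclusion runtime filter; not Impala code.

def build_exclusion_filter(build_keys, max_entries=1024):
    """Return an exact set of non-NULL build-side keys, or None if too large.

    The set must be exact: a scan may drop a row only when its key is
    definitely present on the build side. A Bloom filter's false positives
    would drop rows that the anti join should keep.
    """
    keys = set()
    for key in build_keys:
        if key is None:
            continue  # a NULL key never satisfies an equality join condition
        keys.add(key)
        if len(keys) > max_entries:
            return None  # too large to ship to the scan; disable the filter
    return keys


def scan_with_exclusion(rows, key_of, exclusion):
    """Yield rows, skipping those whose key is known to be on the build side."""
    for row in rows:
        if exclusion is not None and key_of(row) in exclusion:
            continue  # the anti join would discard this row anyway
        yield row


def left_anti_join(probe_rows, key_of, build_keys):
    """Reference LEFT ANTI JOIN: keep probe rows with no matching build key."""
    build = {k for k in build_keys if k is not None}
    return [r for r in probe_rows if key_of(r) not in build]


def first_column(row):
    return row[0]


orders = [(1, "a"), (2, "b"), (3, "c"), (None, "d")]
blocked = [2, None, 5]

exclusion = build_exclusion_filter(blocked)
scanned = list(scan_with_exclusion(orders, first_column, exclusion))
result = left_anti_join(scanned, first_column, blocked)
```

Because the filter only removes rows that the join would discard, running the join on the pre-filtered scan gives the same result as running it on the full scan. When the set exceeds the cap, the filter is simply disabled and the join alone remains responsible for correctness.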
This would also complement Impala's left-deep pipelined query planning very well, since these queries would otherwise require complex query plans due to the reordering restrictions on ANTI/OUTER joins.