Details
- New Feature
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
If a dimension table and fact table are joined:
select * from store join store_sales on (store.id = store_sales.store_id) where store.s_store_name = 'My Store'
One optimization that can be done is to get the min/max store id values that come out of the scan/filter of the store table, and send this min/max value (via Tez edge) to the task which is scanning the store_sales table.
We can add a BETWEEN(min, max) predicate to the store_sales TableScan, and this predicate can be pushed down to the storage handler (for example, for the ORC format). Pushing a min/max predicate to the ORC reader would let us skip entire row groups during the table scan.
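The idea above can be sketched in a few lines of Python. This is only an illustrative simulation, not Hive or ORC code: `RowGroup`, `min_max_of_filtered_dimension`, and `scan_fact_with_runtime_filter` are hypothetical names, the sample data is made up, and row groups are assumed to carry min/max statistics for `store_id`.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a row group with min/max column statistics."""
    min_store_id: int
    max_store_id: int
    rows: list  # list of (store_id, amount) tuples


def min_max_of_filtered_dimension(store_rows, store_name):
    """Scan/filter the dimension table and return (min_id, max_id), or None if empty."""
    ids = [row["id"] for row in store_rows if row["s_store_name"] == store_name]
    if not ids:
        return None
    return min(ids), max(ids)


def scan_fact_with_runtime_filter(row_groups, id_range):
    """Apply BETWEEN(min, max) to the fact scan, skipping row groups via statistics."""
    if id_range is None:
        # Assumption: for an inner join, an empty dimension side means no fact row can match.
        return [], 0
    lo, hi = id_range
    matched, groups_read = [], 0
    for group in row_groups:
        # Skip the whole group when its [min, max] range cannot overlap [lo, hi].
        if group.max_store_id < lo or group.min_store_id > hi:
            continue
        groups_read += 1
        matched.extend(r for r in group.rows if lo <= r[0] <= hi)
    return matched, groups_read


store = [
    {"id": 7, "s_store_name": "My Store"},
    {"id": 9, "s_store_name": "My Store"},
    {"id": 42, "s_store_name": "Other Store"},
]
store_sales_row_groups = [
    RowGroup(1, 5, [(1, 10.0), (5, 20.0)]),
    RowGroup(6, 10, [(7, 30.0), (8, 5.0), (9, 15.0)]),
    RowGroup(40, 50, [(42, 99.0)]),
]

id_range = min_max_of_filtered_dimension(store, "My Store")
rows, groups_read = scan_fact_with_runtime_filter(store_sales_row_groups, id_range)
print(id_range, groups_read, rows)
```

Note that the BETWEEN predicate is only a coarse pre-filter: in this sketch the row with `store_id` 8 passes the range check even though no matching store exists, so the join itself still has to run to produce the correct result.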
Attachments
Issue Links
- is blocked by
  - HIVE-15270 ExprNode/Sarg changes to support values supplied during query runtime (Resolved)
- is depended upon by
  - HIVE-10924 add support for MERGE statement (Resolved)
- is related to
  - HIVE-16001 add test for merge + runtime filtering (Resolved)
  - HIVE-15698 Vectorization support for min/max/bloomfilter runtime filtering (Closed)
- is required by
  - HIVE-16862 Implement a similar feature like "hive.tez.dynamic.semijoin.reduction" in hive on spark (Resolved)
- relates to
  - HIVE-15802 Changes to expected entries for dynamic bloomfilter runtime filtering (Closed)
- links to