Details
- New Feature
- Status: Open
- Major
- Resolution: Unresolved
- 4.0.0
- None
- None
Description
This project aims to let users of Apache Iceberg and other non-native table formats take advantage of Hive's advanced optimizations.
Apache Hive provides several optimizations that depend on the storage layout of Hive native tables, such as Bucket Map Join, Sort Merge Bucket Join, and GroupByOptimizer. These optimizations are not available to non-native tables because they rely on hardcoded logic. For example, hashing algorithms are implemented outside StorageHandlers, so enabling Bucket Map Join on Iceberg tables with Bucket Transforms is unrealistic.
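To make the hashing point concrete, the sketch below compares two bucket-assignment functions on the same integer keys. `native_style_bucket` is a simplified, hypothetical stand-in for a hash hardcoded on the planner side (it is not Hive's actual code), and `iceberg_style_bucket` is modeled on the Murmur3-based formula that the Iceberg table spec describes for integer values. All function names here are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit unsigned value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32_long(value, seed=0):
    """Murmur3 x86 32-bit hash of an 8-byte little-endian long."""
    data = (value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")
    h = seed
    for i in (0, 4):  # two 4-byte blocks, no tail bytes
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * 0xCC9E2D51) & MASK32
        k = _rotl32(k, 15)
        k = (k * 0x1B873593) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    # Finalization: mix in the input length (8 bytes), then avalanche.
    h ^= 8
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def native_style_bucket(key, num_buckets):
    """Assumption: a planner-side rule that treats the int value as its own hash."""
    return (key & 0x7FFFFFFF) % num_buckets


def iceberg_style_bucket(key, num_buckets):
    """Modeled on the Iceberg spec: (murmur3(value) & Integer.MAX_VALUE) % N."""
    return (murmur3_32_long(key) & 0x7FFFFFFF) % num_buckets


def mismatched_keys(keys, num_buckets):
    """Return the keys that the two functions place in different buckets."""
    return [
        k for k in keys
        if native_style_bucket(k, num_buckets) != iceberg_style_bucket(k, num_buckets)
    ]


if __name__ == "__main__":
    for key in range(8):
        print(key, native_style_bucket(key, 4), iceberg_style_bucket(key, 4))
    print("keys placed differently:", len(mismatched_keys(range(100), 4)))
```

If the planner assumes the first function while the table was written with the second, rows with equal join keys can sit in different bucket numbers on each side, so a bucket-to-bucket join would miss matches. This is one way to read why the hashing logic would need to be reachable through the StorageHandler.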
Some discussion has already taken place in HIVE-27734; this is the first design doc.
Issue Links
- is duplicated by
  - HIVE-27734 Add Iceberg's storage-partitioned join capabilities to Hive's [sorted-]bucket-map-join (Resolved)
1. Bucket Map Join on Iceberg tables | Resolved | Shohei Okumiya
2. Sort Merge Bucket Join for Iceberg tables | Open | Unassigned
3. Group By Optimization for Iceberg tables | Open | Unassigned
4. Flexible join optimization based on partition transform specs | Open | Unassigned