[HIVE-28410] Partition-Aware Optimization for Iceberg or OTF - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 4.0.0
Fix Version/s: None
Component/s: Iceberg integration, StorageHandler
Labels: None

Description

This project aims to let users of Apache Iceberg and other non-native table formats take advantage of Hive's advanced optimizations.

Apache Hive provides several optimizations that depend on the storage layout of Hive native tables, such as Bucket Map Join, Sort Merge Bucket Join, and GroupByOptimizer. These optimizations are not available to non-native tables because they rely on hardcoded logic. For example, the hashing algorithms are implemented outside StorageHandlers, so enabling Bucket Map Join on Iceberg tables with Bucket Transforms is currently not feasible.
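To illustrate why hardcoded hashing blocks these optimizations, here is a minimal sketch comparing two bucketing schemes for the same integer key. `iceberg_bucket` follows the bucket transform as defined in the Iceberg table spec (32-bit Murmur3 with seed 0 over the value encoded as an 8-byte little-endian long, then `(hash & Integer.MAX_VALUE) % N`). `hive_like_bucket` is a simplified, hypothetical stand-in for a native scheme that uses the integer value itself as the hash code; it is an assumption for illustration and not a copy of Hive's actual code, which also varies by bucketing version. All function names are illustrative.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit unsigned integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 (x86 variant), returned as an unsigned integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    full = n - (n % 4)
    # Body: process 4-byte little-endian blocks.
    for i in range(0, full, 4):
        (k,) = struct.unpack_from("<I", data, i)
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[full:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
    # Finalization mix.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def iceberg_bucket(value: int, num_buckets: int) -> int:
    """Bucket transform for int/long values per the Iceberg table spec."""
    h = murmur3_x86_32(struct.pack("<q", value))
    return (h & 0x7FFFFFFF) % num_buckets


def hive_like_bucket(value: int, num_buckets: int) -> int:
    """ASSUMPTION: simplified native-style scheme, hash code = the int itself."""
    return (value & 0x7FFFFFFF) % num_buckets


if __name__ == "__main__":
    key, n = 34, 16
    print(iceberg_bucket(key, n))    # 3
    print(hive_like_bucket(key, n))  # 2
```

The same key lands in different buckets under the two schemes, so a join planner that assumes one fixed hash function cannot line up its splits with files bucketed by the other. This is the kind of mismatch that goes away if the storage handler, rather than the optimizer, supplies the bucketing function.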

Earlier discussion took place in ~~HIVE-27734~~; this is the first design doc.

Issue Links

is duplicated by

HIVE-27734 Add Iceberg's storage-partitioned join capabilities to Hive's [sorted-]bucket-map-join

Resolved

Sub-Tasks

1.	Bucket Map Join on Iceberg tables	Resolved	Shohei Okumiya
2.	Sort Merge Bucket Join for Iceberg tables	Open	Unassigned
3.	Group By Optimization for Iceberg tables	Open	Unassigned
4.	Flexible join optimization based on partition transform specs	Open	Unassigned

People

Assignee: Shohei Okumiya

Reporter: Shohei Okumiya

Votes: 0

Watchers: 4

Dates

Created: 28/Jul/24 06:23

Updated: 01/Nov/24 10:41