Details
Type: Epic
Status: Open
Priority: Major
Resolution: Unresolved
Description
Iceberg V2 adds support for row-level modifications.
One way to implement this is via equality-based delete files:
https://iceberg.apache.org/spec/#equality-delete-files
https://iceberg.apache.org/spec/#scan-planning
We could implement this by performing an ANTI HASH JOIN between the data files and the delete files, similar to what we do for Hive full ACID tables:
https://github.com/apache/impala/blob/f5fc08573352d0a1943296209791a4db17268086/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java#L1729-L1735
The complexity arises when different delete files use different sets of columns. In that case we will need multiple ANTI HASH JOINs stacked on top of each other.
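To illustrate the stacked anti-join idea, here is a minimal Python sketch. It is not Impala's planner code; all names (`anti_hash_join`, `apply_equality_deletes`, the `eq_cols`, `rows`, and `seq` fields) are hypothetical and chosen only for this example. It assumes rows are plain dicts and that, per the Iceberg spec, an equality delete applies only to data rows whose data sequence number is strictly lower than that of the delete file. Partition matching is left out for brevity.

```python
from collections import defaultdict


def anti_hash_join(data_rows, delete_rows, eq_cols):
    """Left anti join: keep data rows that no applicable delete row matches."""
    # Build side: for each delete key, remember the highest sequence number.
    build = {}
    for d in delete_rows:
        key = tuple(d[c] for c in eq_cols)
        build[key] = max(build.get(key, -1), d["seq"])
    # Probe side: a data row is dropped only if a matching delete is newer.
    return [
        r for r in data_rows
        if build.get(tuple(r[c] for c in eq_cols), -1) <= r["seq"]
    ]


def apply_equality_deletes(data_rows, delete_files):
    """Run one anti join per distinct equality column list, stacked in sequence."""
    groups = defaultdict(list)
    for f in delete_files:
        # Tag each delete row with the sequence number of its delete file.
        groups[tuple(f["eq_cols"])].extend(
            dict(row, seq=f["seq"]) for row in f["rows"]
        )
    for eq_cols, delete_rows in groups.items():
        data_rows = anti_hash_join(data_rows, delete_rows, eq_cols)
    return data_rows


# Hypothetical sample data: two delete files with different equality columns.
data = [
    {"id": 1, "name": "a", "seq": 1},
    {"id": 2, "name": "b", "seq": 1},
    {"id": 3, "name": "c", "seq": 3},  # written after the deletes, so it survives
    {"id": 4, "name": "d", "seq": 1},
]
delete_files = [
    {"eq_cols": ["id"], "seq": 2, "rows": [{"id": 1}, {"id": 3}]},
    {"eq_cols": ["name"], "seq": 2, "rows": [{"name": "b"}]},
]

result = apply_equality_deletes(data, delete_files)
print([r["id"] for r in result])
```

Delete files that share the same equality column list are merged into a single build side, so the number of joins grows with the number of distinct column lists and not with the number of delete files.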
Attachments
1. Create a simple test table with equality deletes | Resolved | Tamas Mate
2. Basic equality delete support | Resolved | Gabor Kaszab
3. Add support for multiple equality field ID list | Resolved | Gabor Kaszab
4. Support equality delete files that don't contain the partition values | Open | Unassigned
5. Support equality deletes when table has partition or schema evolution | Resolved | Gabor Kaszab
6. Push down conjuncts to the equality delete scanner | Open | Unassigned
7. Missing field ID in the eq-delete file could filter out rows with null values | Open | Unassigned
8. Use max(data_sequence_number) for joining equality delete rows | Open | Unassigned
9. Add better cardinality estimation for Iceberg V2 tables with equality deletes | Open | Unassigned