[IMPALA-12995] Don't always sort data for Iceberg partitioned inserts - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Frontend
Labels:
- impala-iceberg

Epic Color:
ghx-label-1

Description

Currently we always do a SHUFFLE and SORT before inserting data into a partitioned Iceberg table.
With SHUFFLE, we can guarantee that each partition is assigned to a single sink operator, hence we can minimize the number of files being created.
With SORT, we can write partitions one after the other, therefore we only need to write one file at a time (and only buffer data for that one file). This way we can avoid out of memory situations in the sink.

SORT does a total ordering of the incoming records, therefore it can be very expensive. And when SORT needs to spill to disk, its fragment cannot receive incoming records, blocking the execution of the whole query cluster-wide (because after some time every sender is blocked on the receiving SORT fragment).

In a lot of cases the SORT is not necessary because there's only a very few partitions assigned to each sink, especially in large clusters with high MT_DOP.

During planning Impala should decide whether to add the SORT node for such partitioned INSERTs. Also, it should respect the optimizer hints CLUSTERED/NOCLUSTERED.
https://impala.apache.org/docs/build/html/topics/impala_hints.html

Attachments

Issue Links

is related to

IMPALA-6692 When partition exchange is followed by sort each sort node becomes a synchronization point across the cluster

Reopened

Activity

People

Assignee:: Unassigned

Reporter:: Zoltán Borók-Nagy

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 11/Apr/24 11:12

Updated:: 11/Apr/24 18:07