Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Currently we always do a SHUFFLE and SORT before inserting data into a partitioned Iceberg table.
With SHUFFLE, we can guarantee that each partition is assigned to a single sink operator, hence we can minimize the number of files being created.
With SORT, we can write partitions one after the other, so we only need to write one file at a time (and only buffer data for that one file). This avoids out-of-memory situations in the sink.
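The effect of SORT on sink memory can be illustrated with a small simulation. This is only a sketch and not Impala code: `peak_open_writers` and the sample partition keys are hypothetical, and it assumes each open partition file costs one buffer in the sink.

```python
def peak_open_writers(partition_keys, close_on_key_change):
    """Return the maximum number of partition files open at once in one sink.

    close_on_key_change=True models sorted input: once the partition key
    changes, the previous partition never reappears, so its file can be
    closed immediately. With False, every file stays open until the end.
    """
    open_keys = set()
    peak = 0
    previous = None
    for key in partition_keys:
        if close_on_key_change and previous is not None and key != previous:
            open_keys.discard(previous)
        open_keys.add(key)
        peak = max(peak, len(open_keys))
        previous = key
    return peak


# Hypothetical rows arriving at a single sink, identified by partition key.
rows = ["2024-01", "2024-03", "2024-02", "2024-01", "2024-03", "2024-02"]

unsorted_peak = peak_open_writers(rows, close_on_key_change=False)
sorted_peak = peak_open_writers(sorted(rows), close_on_key_change=True)
```

With unsorted input the sink holds one writer per distinct partition it sees; with sorted input it holds one writer at a time, regardless of the number of partitions.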
SORT imposes a total order on the incoming records, so it can be very expensive. When SORT needs to spill to disk, its fragment cannot receive incoming records, which blocks execution of the whole query cluster-wide (after some time every sender is blocked on the receiving SORT fragment).
In many cases the SORT is unnecessary because only a few partitions are assigned to each sink, especially in large clusters with high MT_DOP.
During planning, Impala should decide whether to add the SORT node for such partitioned INSERTs. It should also respect the optimizer hints CLUSTERED/NOCLUSTERED.
https://impala.apache.org/docs/build/html/topics/impala_hints.html
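One possible shape for the planning decision is sketched below. It is not taken from the Impala planner: `should_add_sort`, the `Hint` enum, and the `max_partitions_per_sink` threshold are all hypothetical names, and the sketch assumes the SHUFFLE spreads partitions evenly across sinks, which a hash exchange does not guarantee.

```python
import math
from enum import Enum
from typing import Optional


class Hint(Enum):
    """Hypothetical representation of the CLUSTERED/NOCLUSTERED hints."""
    CLUSTERED = "clustered"
    NOCLUSTERED = "noclustered"


def should_add_sort(
    estimated_partitions: Optional[int],
    num_sinks: int,
    hint: Optional[Hint] = None,
    max_partitions_per_sink: int = 4,  # assumed threshold, not an Impala option
) -> bool:
    """Decide whether to place a SORT before a partitioned table sink."""
    # An explicit hint always wins over the estimate.
    if hint is Hint.CLUSTERED:
        return True
    if hint is Hint.NOCLUSTERED:
        return False
    # Without an estimate, keep the current behaviour and sort.
    if estimated_partitions is None:
        return True
    if num_sinks <= 0:
        raise ValueError("num_sinks must be positive")
    # Assumes an even spread of partitions over sinks after the SHUFFLE.
    per_sink = math.ceil(estimated_partitions / num_sinks)
    return per_sink > max_partitions_per_sink
```

For example, with 100 estimated partitions on 10 nodes at MT_DOP 12 (120 sinks), `should_add_sort(100, 120)` returns `False`, while `should_add_sort(10000, 4)` returns `True`. A real implementation would also need to account for skew and for the memory each open file consumes.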
Issue Links
- is related to IMPALA-6692 "When partition exchange is followed by sort each sort node becomes a synchronization point across the cluster" (Reopened)