Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Invalid
Description
The coalesce partitions operator reduces the number of partitions to a specified target count.
The target partition count must be >= 1.
If the target partition count is >= the number of input partitions, the operator is a no-op and can be optimized out of the plan.
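The two rules above could be checked at planning time along these lines. This is only a sketch in Python for illustration: the function name `plan_coalesce` and its return convention (None meaning "drop the operator") are hypothetical and not part of any existing API.

```python
from typing import Optional


def plan_coalesce(input_partitions: int, target_partitions: int) -> Optional[int]:
    """Return the output partition count, or None when the operator is a no-op.

    Hypothetical helper: names and return convention are assumptions.
    """
    if target_partitions < 1:
        raise ValueError("target partition count must be >= 1")
    if target_partitions >= input_partitions:
        # Nothing to reduce, so the planner can remove the operator.
        return None
    return target_partitions


print(plan_coalesce(64, 8))  # 8
print(plan_coalesce(4, 8))   # None
```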
The simplest implementation would assign one or more whole input partitions to each output partition. This works well when the number of input partitions is divisible by the number of output partitions, e.g. going from 64 input partitions to 8 output partitions, where each output partition receives 8 inputs. In other cases the resulting partitions may have data skew, e.g. going from 3 partitions to 2 leaves one output partition with two input partitions and the other with one. It would be possible to do the partitioning at the row level, but that would add a lot of overhead, and the "repartition" operator should be used for that case.
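A minimal sketch of the partition-level assignment, assuming contiguous grouping of input partition indices (the ticket does not say which inputs go to which output, so the grouping order and the name `assign_partitions` are assumptions made for illustration):

```python
from typing import List


def assign_partitions(input_partitions: int, target_partitions: int) -> List[List[int]]:
    """Group input partition indices into contiguous output partitions.

    Hypothetical helper: contiguous grouping is one possible choice.
    """
    if target_partitions < 1:
        raise ValueError("target partition count must be >= 1")
    if input_partitions < 1:
        return []
    # A target >= the input count is a no-op: one input per output.
    n_out = min(target_partitions, input_partitions)
    base, extra = divmod(input_partitions, n_out)
    groups = []
    start = 0
    for i in range(n_out):
        # The first `extra` outputs take one additional input partition.
        size = base + (1 if i < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


print([len(g) for g in assign_partitions(64, 8)])  # [8, 8, 8, 8, 8, 8, 8, 8]
print(assign_partitions(3, 2))                     # [[0, 1], [2]]
```

The 3-to-2 case shows the skew at the partition level: one output holds twice as many input partitions as the other. Since whole partitions are moved without inspecting rows, the actual row-count imbalance also depends on how large each input partition is.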