Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
0.9
Description
Flink allows combiners which accept a type I and combine the values of this type into type O. In Google Dataflow, combiners are more generalized. They accept an Input I, produce an intermediate combine value of T, and finally an output O. Flink's combiners are like the SimpleCombineFn in Google Dataflow.
Right now, we translate the KeyedCombineFn into a SortPartition followed by a MapPartition to emulate the Combiner's behavior. Rudimentary performance tests showed that this behavior causes a significant increase in run time compared to the proper Combine implementation.
Let's implement a more generalized Combiner to create a better mapping from Google Dataflow to Flink.
Attachments
Issue Links
- is related to
-
FLINK-1780 Rename FlatCombineFunction to GroupCombineFunction
- Closed
- relates to
-
FLINK-1043 Alternative combine interface
- Resolved
1.
|
Add GroupReducePartial Operator | Resolved | Maximilian Michels |