Details
Type: Bug
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Affects Version/s: 1.1.3
Fix Version/s: None
Description
The Flink KMeans Java example [1] demonstrates the ForwardedFields function annotation. However, the example job took longer than expected even on medium-sized data. Merely removing the function annotation from the example code, without any other change, produced a better execution plan and a shorter run time. With ForwardedFields enabled, the execution plan shows that no combiner is used and that the two Map tasks are not chained. The experiment is documented in [2].
[1] https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java
[2] https://drive.google.com/open?id=0B0IlZv0uHBuvVEZ5ZmNpN19jVVU
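
To illustrate why a missing combiner can hurt run time, here is a minimal plain-Java sketch. It does not use Flink's API or reproduce its optimizer. It only counts how many per-centroid partial records would leave each partition with and without local pre-aggregation. The class name CombinerSketch, the Partial type, and the sample points are all hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CombinerSketch {

    /** Partial sum for one centroid: coordinate sums plus the number of points folded in. */
    static final class Partial {
        final int centroidId;
        final double sumX;
        final double sumY;
        final long count;

        Partial(int centroidId, double sumX, double sumY, long count) {
            this.centroidId = centroidId;
            this.sumX = sumX;
            this.sumY = sumY;
            this.count = count;
        }

        /** Adds another partial for the same centroid to this one. */
        Partial merge(Partial other) {
            return new Partial(centroidId, sumX + other.sumX, sumY + other.sumY, count + other.count);
        }
    }

    /** Pre-aggregates one partition locally, leaving at most one record per centroid. */
    static List<Partial> combine(List<Partial> partition) {
        Map<Integer, Partial> byCentroid = new LinkedHashMap<>();
        for (Partial p : partition) {
            byCentroid.merge(p.centroidId, p, Partial::merge);
        }
        return new ArrayList<>(byCentroid.values());
    }

    /** Counts the records that would be sent to the final reduce step. */
    static int shuffledRecords(List<List<Partial>> partitions, boolean useCombiner) {
        int total = 0;
        for (List<Partial> partition : partitions) {
            total += useCombiner ? combine(partition).size() : partition.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Two partitions with three points each, assigned to centroids 1 and 2.
        List<List<Partial>> partitions = List.of(
            List.of(new Partial(1, 1.0, 1.0, 1), new Partial(1, 2.0, 2.0, 1), new Partial(2, 9.0, 9.0, 1)),
            List.of(new Partial(2, 8.0, 8.0, 1), new Partial(2, 7.0, 7.0, 1), new Partial(1, 3.0, 3.0, 1)));

        System.out.println("records without combiner: " + shuffledRecords(partitions, false));
        System.out.println("records with combiner:    " + shuffledRecords(partitions, true));
    }
}
```

With this toy input, six records are shuffled without the combiner and four with it. On real data the gap grows with the number of points per partition, since the combined output is bounded by the number of centroids.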