Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
1.2.0
-
None
Description
I am using group combiner in DataSet API with disabled object reuse.
In code it may be expressed as follows:
tuples.groupBy(1) .combineGroup((it, collector) -> { // store first item for future use Pojo stored = it.next(); while (it.hasNext()) { .... } })
It seems even the object reuse feature is disabled, my instance is actually replaced when .next() is called on the iterator. It leads to very confusing and wrong results.
I checked the Flink codebase and it seems CombiningUnilateralSortMerger is actually reusing object instances even though object reuse is explicitly disabled.
In spilling phase user's combiner is called with instance of CombineValueIterator that actually reuses instances without any warning.
When I disable combiner and use groupReduce only with the same reduce function, results are fine.
Please let me know if you can confirm this as a bug. From my point of view it's highly critical as I am getting unpredictable results.
Attachments
Issue Links
- links to