Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-6394

GroupCombine reuses instances even though object reuse is disabled

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0, 1.2.2
    • Component/s: DataSet API
    • Labels:
      None

      Description

      I am using group combiner in DataSet API with disabled object reuse.

      In code it may be expressed as follows:

      tuples.groupBy(1)
            .combineGroup((it, collector) -> {
               // store first item for future use
               Pojo stored = it.next();
               while (it.hasNext()) {
                 ....
               }
            })
      

      It seems even the object reuse feature is disabled, my instance is actually replaced when .next() is called on the iterator. It leads to very confusing and wrong results.

      I checked the Flink codebase and it seems CombiningUnilateralSortMerger is actually reusing object instances even though object reuse is explicitly disabled.
      In spilling phase user's combiner is called with instance of CombineValueIterator that actually reuses instances without any warning.

      See https://github.com/apache/flink/blob/d7b59d761601baba6765bb4fc407bcd9fd6a9387/flink-runtime/src/main/java/org/apache/flink/runtime/operators/sort/CombiningUnilateralSortMerger.java#L550

      When I disable combiner and use groupReduce only with the same reduce function, results are fine.

      Please let me know if you can confirm this as a bug. From my point of view it's highly critical as I am getting unpredictable results.

        Issue Links

          Activity

          Hide
          ykt836 Kurt Young added a comment -

          Hi Jaromir Vanek, thanks for reporting this, i will take a look at it soon.

          Show
          ykt836 Kurt Young added a comment - Hi Jaromir Vanek , thanks for reporting this, i will take a look at it soon.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user KurtYoung opened a pull request:

          https://github.com/apache/flink/pull/3803

          FLINK-6394 [runtime] Respect object reuse configuration when execut…

          …ing group combining function

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/KurtYoung/flink flink-6394

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3803.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3803


          commit 3f81f60603940ec2b26e81814224b69ae40afcc0
          Author: Kurt Young <ykt836@gmail.com>
          Date: 2017-04-30T08:56:00Z

          FLINK-6394 [runtime] Respect object reuse configuration when executing group combining function


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user KurtYoung opened a pull request: https://github.com/apache/flink/pull/3803 FLINK-6394 [runtime] Respect object reuse configuration when execut… …ing group combining function You can merge this pull request into a Git repository by running: $ git pull https://github.com/KurtYoung/flink flink-6394 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3803.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3803 commit 3f81f60603940ec2b26e81814224b69ae40afcc0 Author: Kurt Young <ykt836@gmail.com> Date: 2017-04-30T08:56:00Z FLINK-6394 [runtime] Respect object reuse configuration when executing group combining function
          Hide
          vanekjar Jaromir Vanek added a comment -

          Hi Kurt,

          thank you very much for your fix. It seems it resolves this issue. It is totally correct.

          But generally, in Flink codebase there are tens of spots using if (objectReuseEnabled) branching. How do we know there are no other occurrences of this bug? I am afraid this style of dealing with object reuse is not sustainable for future development. There may always appear a new issue related to missed if (objectReuseEnabled) check.

          Show
          vanekjar Jaromir Vanek added a comment - Hi Kurt, thank you very much for your fix. It seems it resolves this issue. It is totally correct. But generally, in Flink codebase there are tens of spots using if (objectReuseEnabled) branching. How do we know there are no other occurrences of this bug? I am afraid this style of dealing with object reuse is not sustainable for future development. There may always appear a new issue related to missed if (objectReuseEnabled) check.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3803

          Good catch and good changes. Thanks @KurtYoung

          +1 to merge these

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3803 Good catch and good changes. Thanks @KurtYoung +1 to merge these
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user KurtYoung commented on the issue:

          https://github.com/apache/flink/pull/3803

          Sorry i wasn't available in the last couple days, it seems the 1.3 branch has been created, should i merge this PR in to 1.3 branch too?

          Show
          githubbot ASF GitHub Bot added a comment - Github user KurtYoung commented on the issue: https://github.com/apache/flink/pull/3803 Sorry i wasn't available in the last couple days, it seems the 1.3 branch has been created, should i merge this PR in to 1.3 branch too?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3803

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/3803
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fhueske commented on the issue:

          https://github.com/apache/flink/pull/3803

          Yes, 1.3 and 1.2 as well. Thank you!

          Show
          githubbot ASF GitHub Bot added a comment - Github user fhueske commented on the issue: https://github.com/apache/flink/pull/3803 Yes, 1.3 and 1.2 as well. Thank you!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user KurtYoung commented on the issue:

          https://github.com/apache/flink/pull/3803

          Ok, do it right way.

          Show
          githubbot ASF GitHub Bot added a comment - Github user KurtYoung commented on the issue: https://github.com/apache/flink/pull/3803 Ok, do it right way.

            People

            • Assignee:
              ykt836 Kurt Young
              Reporter:
              vanekjar Jaromir Vanek
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development