Flink / FLINK-1085

Unnecessary failing of GroupReduceCombineDriver


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.1-incubating, 0.7.0-incubating
    • Fix Version/s: 0.10.0
    • Component/s: Runtime / Task

    Description

      A recent update (commit cbbcf7820885a8a9734ffeba637b0182a6637939) changed the GroupReduceCombineDriver so that it no longer uses an asynchronous partial sorter. Instead, the driver fills a sort buffer with records, sorts the buffer, combines the sorted records, clears the buffer, and starts filling it again.

      The GroupReduceCombineDriver fails if a record cannot be serialized into an empty sort buffer, i.e., if the record is too large for the buffer.

      Instead of failing, the driver should emit a WARN message for the first record that is too large and simply forward every record that does not fit into the empty sort buffer. It could also keep counting how many records were forwarded this way and report that statistic in a second WARN message.
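
      The proposed behavior could look roughly like the following sketch. It is not the actual GroupReduceCombineDriver code: the class CombineSketch, the Result holder, the sizeOf size model (key length plus 4 bytes), and the use of a TreeMap in place of the real sort buffer are all assumptions made for illustration. Records are (key, count) pairs, and combining sums the counts per key.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the combine driver; not Flink's implementation.
class CombineSketch {

    /** Emitted records plus the number of records forwarded without combining. */
    static final class Result {
        final List<Map.Entry<String, Integer>> output = new ArrayList<>();
        int forwarded = 0;
    }

    // Assumed size model: key characters plus 4 bytes for the int value.
    static int sizeOf(Map.Entry<String, Integer> record) {
        return record.getKey().length() + 4;
    }

    // Order the buffered records by key, sum the values per key, emit, clear the buffer.
    static void sortCombineAndClear(List<Map.Entry<String, Integer>> buffer, Result result) {
        TreeMap<String, Integer> combined = new TreeMap<>();
        for (Map.Entry<String, Integer> r : buffer) {
            combined.merge(r.getKey(), r.getValue(), Integer::sum);
        }
        for (Map.Entry<String, Integer> e : combined.entrySet()) {
            result.output.add(new SimpleEntry<>(e.getKey(), e.getValue()));
        }
        buffer.clear();
    }

    static Result combine(List<Map.Entry<String, Integer>> input, int capacityBytes) {
        Result result = new Result();
        List<Map.Entry<String, Integer>> buffer = new ArrayList<>();
        int used = 0;
        for (Map.Entry<String, Integer> record : input) {
            int size = sizeOf(record);
            if (used + size > capacityBytes) {
                // Buffer is full: sort, combine, emit, and continue with an empty buffer.
                sortCombineAndClear(buffer, result);
                used = 0;
            }
            if (size > capacityBytes) {
                // Record does not fit even into the empty buffer: forward it instead of failing.
                if (result.forwarded == 0) {
                    System.err.println("WARN: record too large for sort buffer, forwarding uncombined");
                }
                result.forwarded++;
                result.output.add(record);
                continue;
            }
            buffer.add(record);
            used += size;
        }
        // Flush whatever is left in the buffer at the end of the input.
        sortCombineAndClear(buffer, result);
        if (result.forwarded > 0) {
            System.err.println("WARN: forwarded " + result.forwarded + " oversized record(s) uncombined");
        }
        return result;
    }
}
```

      Forwarding an oversized record unchanged is safe here only because a combiner is an optional pre-aggregation step; the downstream reduce still sees every record. Calling CombineSketch.combine(records, 12) with a record whose key is longer than 8 characters exercises the forwarding path.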

          People

            Assignee: Unassigned
            Reporter: Fabian Hueske (fhueske)