Flink / FLINK-9908

Inconsistent state of SlotPool after ExecutionGraph cancellation


    Details

      Description

      If the ExecutionGraph is scheduled and cancelled concurrently, requested slots may not be properly returned to the SlotPool. This leaves the SlotPool in an inconsistent state: it considers some of its slots occupied even though the respective Execution has already been cancelled.

      The problem seems to be caused by propagating the cancellation of the overall scheduling future to the individual scheduling futures. If an individual scheduling future is cancelled, the callback that produces its value, and that also handles the failure case, is never called.
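
      The suspected mechanism can be reproduced with plain java.util.concurrent.CompletableFuture, independent of Flink. The following is a minimal sketch, not the actual Flink code: ToySlotPool, simulate and the slot name "slot-0" are hypothetical stand-ins, and the assumption is that the individual scheduling future is a dependent stage (created here with handle) of the slot request future. Cancelling the dependent stage before the slot request completes means its callback is skipped, so the toy pool never gets the slot back.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

final class CancelledSchedulingSketch {

    /** Hypothetical stand-in for the pool's bookkeeping; this is not Flink's SlotPool. */
    static final class ToySlotPool {
        final AtomicInteger occupied = new AtomicInteger();
        final CompletableFuture<String> pendingRequest = new CompletableFuture<>();

        CompletableFuture<String> requestSlot() {
            return pendingRequest;
        }

        /** The pool hands out a slot and marks it as occupied. */
        void fulfillPendingRequest() {
            occupied.incrementAndGet();
            pendingRequest.complete("slot-0");
        }

        void returnSlot(String slot) {
            occupied.decrementAndGet();
        }
    }

    /**
     * Returns how many slots the toy pool still considers occupied after the
     * (already cancelled) execution received its slot.
     */
    static int simulate(boolean propagateCancellation) {
        ToySlotPool pool = new ToySlotPool();
        CompletableFuture<String> slotFuture = pool.requestSlot();

        // Individual scheduling future. In this toy the execution is already
        // cancelled, so the callback's only job is to hand the slot back.
        CompletableFuture<Void> schedulingFuture = slotFuture.handle((slot, failure) -> {
            if (slot != null) {
                pool.returnSlot(slot);
            }
            return null;
        });

        if (propagateCancellation) {
            // Completes the dependent stage right away; the handle callback
            // above will not run when slotFuture completes later.
            schedulingFuture.cancel(false);
        }

        // The slot request is fulfilled after the cancellation.
        pool.fulfillPendingRequest();
        return pool.occupied.get();
    }

    public static void main(String[] args) {
        System.out.println("occupied with propagated cancellation: " + simulate(true));
        System.out.println("occupied without propagated cancellation: " + simulate(false));
    }
}
```

      In this sketch the run with propagated cancellation ends with one slot still marked as occupied, while the run without it ends with none. One possible way to avoid the leak, under the same assumptions, is to leave the individual futures uncancelled so that their callbacks still run and can release the slot.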

      People

      Assignee: Till Rohrmann (trohrmann)
      Reporter: Till Rohrmann (trohrmann)
