Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-25817 FLIP-201: Persist local state in working directory
  3. FLINK-25855

DefaultDeclarativeSlotPool rejects offered slots when the job is restarting

Attach filesAttach ScreenshotVotersWatch issueWatchersLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      The DefaultDeclarativeSlotPool rejects offered slots if the job is currently restarting. The problem is that in case of a job restart, the scheduler sets the required resources to zero. Hence, all offered slots will be rejected.

      This is a problem for local recovery because rejected slots will be freed by the TaskExecutor and thereby all local state will be deleted. Hence, in order to properly support local recovery, we need to handle this situation somehow. I do see different options here:

      This problem only affects the DefaultScheduler since the AdaptiveScheduler sets the required resources when transitioning into the WaitingForResources state.

      Accept excess slots

      Accepting excess slots means that the DefaultDeclarativeSlotPool accepts slots which exceed the currently required set of slots.

      Advantages:

      • Easy to implement

      Disadvantages:

      • Offered slots that are not really needed will only be freed after the idle slot timeout. This means that some resources might be left unused for some time.

      Let DefaultDeclarativeSlotPool accept excess slots only when job is restarting

      Here the idea is to only accept excess slots when the job is currently restarting. This will required that the scheduler tells the DefaultDeclarativeSlotPool about the restarting state.

      Advantages:

      • We would only accept excess slots for the time of restarting

      Disadvantages:

      • We are complicating the semantics of the DefaultDeclarativeSlotPool. Moreover, we are introducing additional signals that communicate the restarting state to the pool.

      Don't immediately free slots on the TaskExecutor when they are rejected

      Instead of freeing the slot immediately on the TaskExecutor after it is rejected. We could also retry for some time and only free the slot after some timeout.

      Advantages:

      • No changes on the JobMaster side needed.

      Disadvantages:

      • Complication of the slot lifecycle on the TaskExecutor
      • Unneeded slots are not made available for other jobs as fast as possible

      Don't zero resource requirements during job restart

      Instead of zeroing the resource requirements during a job restart, we could also keep the last know requirements. Once the job is restarted, we could adjust the requirements.

      Advantages:

      • Conceptually easy to do

      Disadvantages:

      • The old requirements mustn't necessarily be the new ones
      • Convolutes logic in the scheduler

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            trohrmann Till Rohrmann
            trohrmann Till Rohrmann
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment