FLINK-24038

DispatcherResourceManagerComponent fails to deregister application if no leading ResourceManager

Details

    • A new multiple-component leader election service was implemented that runs only a single leader election per Flink process. If this causes problems, set `high-availability.use-old-ha-services: true` in `flink-conf.yaml` to fall back to the old high availability services.

    Description

      With FLINK-21667 we introduced a change that can cause the DispatcherResourceManagerComponent to fail when it tries to stop the application. The DispatcherResourceManagerComponent needs a leading ResourceManager to successfully execute the stop/deregister application call. If its ResourceManager is not the leader, the call fails fatally. With multiple standby JobManager processes, the leading ResourceManager may be running in a different process.

      I see two possible solutions:

      1. Run the leader election process for the whole JobManager process.
      2. Move the registration/deregistration of the application out of the ResourceManager so that it can be executed without a leader.
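
To make the failure mode and the second option concrete, here is a minimal Java sketch. It is not Flink code: `ClusterFramework`, `LeaderGatedResourceManager`, and `ProcessLevelDeregistration` are hypothetical names invented for illustration, and the real ResourceManager API may look quite different. The sketch only assumes that deregistration today goes through a component that acts only while it holds leadership.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; none of these types are actual Flink classes.
final class DeregistrationSketch {

    /** Stands in for the external framework that tracks the application registration. */
    static final class ClusterFramework {
        private final AtomicBoolean registered = new AtomicBoolean(true);

        void unregister() {
            registered.set(false);
        }

        boolean isRegistered() {
            return registered.get();
        }
    }

    /** Failure mode: deregistration only succeeds if this instance is the leader. */
    static final class LeaderGatedResourceManager {
        private final boolean isLeader;
        private final ClusterFramework framework;

        LeaderGatedResourceManager(boolean isLeader, ClusterFramework framework) {
            this.isLeader = isLeader;
            this.framework = framework;
        }

        CompletableFuture<Void> deregisterApplication() {
            if (!isLeader) {
                // A standby process has no leading ResourceManager, so the call fails.
                CompletableFuture<Void> failed = new CompletableFuture<>();
                failed.completeExceptionally(
                        new IllegalStateException("no leading ResourceManager in this process"));
                return failed;
            }
            framework.unregister();
            return CompletableFuture.completedFuture(null);
        }
    }

    /** Option 2: deregistration lives at the process level and needs no leader. */
    static final class ProcessLevelDeregistration {
        private final ClusterFramework framework;

        ProcessLevelDeregistration(ClusterFramework framework) {
            this.framework = framework;
        }

        CompletableFuture<Void> deregisterApplication() {
            framework.unregister();
            return CompletableFuture.completedFuture(null);
        }
    }
}
```

In this sketch, a standby process that calls `LeaderGatedResourceManager.deregisterApplication()` gets an exceptionally completed future and the application stays registered, whereas `ProcessLevelDeregistration` completes regardless of leadership. Option 1 would instead make the leadership check hold for every component in the leading process.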

          People

            trohrmann Till Rohrmann
            Votes: 0
            Watchers: 10
