Flink / FLINK-25893

ResourceManagerServiceImpl's lifecycle can lead to exceptions

    Description

      The ResourceManagerServiceImpl lifecycle can lead to exceptions when calling ResourceManagerServiceImpl.deregisterApplication. The problem arises when the DispatcherResourceManagerComponent is shut down before the ResourceManagerServiceImpl gains leadership, or while it is starting the ResourceManager.

      One problem is that deregisterApplication returns an exceptionally completed future if there is no leading ResourceManager.
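
      A minimal sketch of this first failure mode, not Flink's actual code: NoLeaderSketch, its nested Gateway interface, and the exception type are hypothetical stand-ins chosen for illustration. It only shows how a shutdown that arrives before leadership is granted ends up with an exceptionally completed future.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for the service; not Flink's ResourceManagerServiceImpl. */
public class NoLeaderSketch {

    /** Hypothetical minimal gateway with only the call discussed here. */
    public interface Gateway {
        CompletableFuture<Void> deregisterApplication();
    }

    private final Object lock = new Object();

    // Stays null until leadership has been granted.
    private Gateway leaderGateway;

    public void grantLeadership(Gateway gateway) {
        synchronized (lock) {
            leaderGateway = gateway;
        }
    }

    public CompletableFuture<Void> deregisterApplication() {
        synchronized (lock) {
            if (leaderGateway == null) {
                // No leader yet: the caller receives a failed future.
                CompletableFuture<Void> failed = new CompletableFuture<>();
                failed.completeExceptionally(
                        new IllegalStateException("No leading ResourceManager."));
                return failed;
            }
            return leaderGateway.deregisterApplication();
        }
    }
}
```

      In this sketch, a caller that shuts down before grantLeadership has run always observes the failure, even though nothing needed to be deregistered.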

      Another problem is that even when there is a leading ResourceManager, it may not have been started yet. In that case, the ResourceManagerGateway.deregisterApplication call is discarded. The reason for this behaviour is that we create the ResourceManager in one Runnable and only start it in another, so a deregisterApplication call can acquire the lock in between the two.
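
      The interleaving can be replayed deterministically with a small sketch. Everything here is an assumption made for illustration: CreateStartRaceSketch, FakeResourceManager, and runNext are hypothetical names, the two Runnables are queued and run by hand rather than by an executor, and a discarded call is modelled as a future that never completes.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of the create/start split; not Flink's actual code. */
public class CreateStartRaceSketch {

    /** Hypothetical ResourceManager stand-in that drops calls until started. */
    public static class FakeResourceManager {
        private boolean started;

        public void start() {
            started = true;
        }

        public CompletableFuture<Void> deregisterApplication() {
            if (started) {
                return CompletableFuture.completedFuture(null);
            }
            // Assumption: a discarded call is modelled as a never-completing future.
            return new CompletableFuture<>();
        }
    }

    private final Object lock = new Object();
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private FakeResourceManager leader;

    /** Queues creation and start as two separate Runnables. */
    public void grantLeadership() {
        pending.add(() -> {
            synchronized (lock) {
                leader = new FakeResourceManager();
            }
        });
        pending.add(() -> {
            synchronized (lock) {
                leader.start();
            }
        });
    }

    /** Runs the next queued Runnable; returns false if none was left. */
    public boolean runNext() {
        Runnable next = pending.poll();
        if (next == null) {
            return false;
        }
        next.run();
        return true;
    }

    public CompletableFuture<Void> deregisterApplication() {
        synchronized (lock) {
            if (leader == null) {
                CompletableFuture<Void> failed = new CompletableFuture<>();
                failed.completeExceptionally(
                        new IllegalStateException("No leading ResourceManager."));
                return failed;
            }
            // The leader exists but may not have been started yet.
            return leader.deregisterApplication();
        }
    }
}
```

      Calling grantLeadership(), then runNext() once, then deregisterApplication(), and only then runNext() again reproduces the window: the lock is free between the two Runnables, so the call reaches a ResourceManager that exists but has not been started.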

      I suggest correcting the lifecycle and contract of ResourceManagerServiceImpl.deregisterApplication.

      Please note that due to this problem, the error reporting of this method has been suppressed. See FLINK-25885 for more details.

            People

              xtsong Xintong Song
              trohrmann Till Rohrmann