Mesos / MESOS-9885

Resource provider configuration is only removed after its container has been stopped, causing issues in failover scenarios


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.8.0
    • Fix Version/s: None
    • Component/s: resource provider
    • Labels: None

    Description

      An agent could crash while it is handling a REMOVE_RESOURCE_PROVIDER_CONFIG call. In that case, the resource provider won't be removed, because its configuration is only removed once the actual resource provider container has been stopped: in LocalResourceProviderDaemonProcess::remove, os::rm is only called if cleanupContainers was successful. Since the configuration is still present after agent failover, the resource provider will still be running. This can be a problem for frameworks/operators, because there is no feedback channel that informs them whether their removal request was successful.
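
      The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Mesos implementation: ToyAgent, removeConfig, recover and the crashDuringCleanup flag are hypothetical names invented for illustration, and the two erase calls merely stand in for cleanupContainers and os::rm.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for agent state. NOT the Mesos implementation; all names
// here are hypothetical and exist only to illustrate the ordering issue.
struct ToyAgent {
  std::set<std::string> configsOnDisk;      // survives an agent failover
  std::set<std::string> runningContainers;  // resource provider containers

  // Models the removal order from the report: the config is deleted only
  // after container cleanup has finished. `crashDuringCleanup` simulates
  // the agent dying while handling the call. Returns true if the call
  // ran to completion.
  bool removeConfig(const std::string& name, bool crashDuringCleanup) {
    if (crashDuringCleanup) {
      return false;  // agent dies before the config is ever deleted
    }
    runningContainers.erase(name);  // stands in for cleanupContainers
    configsOnDisk.erase(name);      // stands in for os::rm
    return true;
  }

  // Models failover: every config still on disk is launched again.
  void recover() {
    runningContainers = configsOnDisk;
  }
};
```

      In this model, a call to removeConfig with crashDuringCleanup set to true followed by recover leaves the provider both configured and running, and the caller has no way to observe that the removal did not take effect.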

          People

            Assignee: Unassigned
            Reporter: Jan Schlicht (nfnt)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated: