Mesos / MESOS-6586

Teardown endpoint should remove framework

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.1
    • Fix Version/s: None
    • Component/s: cli, HTTP API, scheduler api
    • Labels:

      Description

      The Mesos /teardown endpoint currently:

      • Removes the framework on the mesos-master. As a result, the framework is in the removed state.
      • Shuts down all executors and tasks running on the Mesos agents.
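
      For reference, a minimal sketch of how an operator might call the endpoint. It assumes the master serves it at /master/teardown (the prefixed form of /teardown) and accepts the framework ID as a form-encoded frameworkId field. The master address and framework ID below are hypothetical placeholders, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_teardown_request(master_url: str, framework_id: str) -> Request:
    """Build (but do not send) a POST to the master's teardown endpoint."""
    # The framework ID travels as a form-encoded body field.
    body = urlencode({"frameworkId": framework_id}).encode("ascii")
    return Request(
        url=master_url.rstrip("/") + "/master/teardown",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical master address and framework ID, for illustration only.
request = build_teardown_request(
    "http://mesos-master.example:5050", "example-framework-id-0001"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

      Sending this request triggers the two effects listed above; nothing in it reaches the framework's scheduler process.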

      However, I'd also expect that a message from the mesos-master is sent to the framework (Scheduler API) so that the framework processes can initiate a shutdown as well. This is not the case. As a result, it is necessary to manually suspend the framework, e.g. by using the DC/OS UI.

      A possible solution would be to add a teardown callback to the scheduler API that notifies the framework when the mesos-master has initiated a teardown. The mesos-master should mark the framework as removed only once the framework has confirmed its termination, e.g. by sending a message to the mesos-master indicating that the termination has succeeded or has been started.
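
      The proposed handshake could look roughly like the toy model below. This is only a sketch of the idea, not Mesos code: the Master and Scheduler classes, the teardown callback, and the TEARING_DOWN state are all hypothetical names introduced for illustration.

```python
from enum import Enum


class FrameworkState(Enum):
    ACTIVE = "active"
    TEARING_DOWN = "tearing_down"  # hypothetical intermediate state
    REMOVED = "removed"


class Scheduler:
    """Stand-in for a framework's scheduler process."""

    def __init__(self) -> None:
        self.running = True

    def teardown(self) -> bool:
        """Hypothetical callback: stop the scheduler and acknowledge."""
        self.running = False
        return True


class Master:
    """Stand-in for mesos-master, tracking one state per framework."""

    def __init__(self) -> None:
        self.schedulers = {}
        self.states = {}

    def register(self, framework_id: str, scheduler: Scheduler) -> None:
        self.schedulers[framework_id] = scheduler
        self.states[framework_id] = FrameworkState.ACTIVE

    def teardown(self, framework_id: str) -> FrameworkState:
        # Notify the scheduler first; mark the framework removed only
        # after it acknowledges, otherwise leave it in TEARING_DOWN.
        self.states[framework_id] = FrameworkState.TEARING_DOWN
        if self.schedulers[framework_id].teardown():
            self.states[framework_id] = FrameworkState.REMOVED
        return self.states[framework_id]


master = Master()
scheduler = Scheduler()
master.register("example-framework", scheduler)
final_state = master.teardown("example-framework")
print(final_state.value, scheduler.running)
```

      Making removal depend on the acknowledgement means a scheduler that fails to shut down stays visible in an intermediate state instead of silently lingering after the framework has been marked removed.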

      This change will also affect the dcos service shutdown command which uses the /teardown endpoint. From a DC/OS CLI perspective, I'd expect that the dcos service shutdown service-id command shuts down all components of the framework, not only the executors and tasks.

      Also, for consistency I'd expect this shutdown action to be available in the DC/OS UI as well. So far on DC/OS, you can only Suspend a service / framework, which stops the framework instances but does not remove the framework from the mesos-master or terminate its executors. As far as I am aware, there is no documentation that explains in detail the difference between the shutdown command in the DC/OS CLI and the Suspend button in the DC/OS UI. Users need to understand exactly what these actions do to the system, especially when the actions are not consistent. Again, I'd recommend adding a new button to the DC/OS UI that uses the /teardown endpoint.

      Tested on DC/OS with the frameworks conductr and elasticsearch.

              People

              • Assignee: Unassigned
              • Reporter: Markus Jura (markusjura)
