Mesos / MESOS-6828

Consider ways for frameworks to ignore offers with an Unavailability

Details

    • Type: Improvement
    • Status: Accepted
    • Priority: Major
    • Resolution: Unresolved

    Description

      Because maintenance primitives in Mesos are opt-in, cluster administrators are left with a gap when frameworks have not opted in.

      An example case:

      • Cluster with reasonable churn (tasks terminate naturally)
      • Operator specifies maintenance schedule

      Ideally, even if no framework had opted in to maintenance primitives, the operator would have some way to prevent frameworks from scheduling further work on agents in the schedule. Natural task termination would then let the nodes drain gracefully, after which the operator could perform maintenance.

      Two options have been discussed so far:

      1. Provide a capability for frameworks to automatically filter offers with an Unavailability set.
        • Pro: Finer-grained control. Allows other frameworks to keep scheduling short-lived tasks that can complete before the Unavailability.
        • Con: All frameworks have to be updated. For legacy frameworks, consider exposing this as an environment variable read by the scheduler driver.
      2. Provide a flag on the master to filter all offers with an Unavailability set.
        • Pro: Immediately actionable / usable.
        • Con: Coarse-grained. Some frameworks may lose efficiency.
        • Con: Dangerous. Planning a multi-day maintenance schedule for an entire cluster would prevent every framework from scheduling further work, potentially stalling the cluster.
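
      To make option 1 concrete, here is a minimal framework-side sketch of the filtering decision. It is not Mesos code: the `Offer` and `Unavailability` classes are hypothetical, simplified stand-ins for the real offer messages, and `partition_offers` and its `min_headroom_ns` parameter are names invented for this illustration. With no headroom configured, every offer carrying an Unavailability is declined. With a headroom, offers whose Unavailability starts far enough in the future stay usable, which corresponds to the "short-lived tasks" Pro above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Unavailability:
    # Hypothetical stand-in for the maintenance window attached to an offer.
    start_ns: int                      # window start, in nanoseconds
    duration_ns: Optional[int] = None  # None means an open-ended window


@dataclass
class Offer:
    # Hypothetical, simplified offer carrying only the fields this sketch needs.
    offer_id: str
    agent_id: str
    unavailability: Optional[Unavailability] = None


def partition_offers(
    offers: List[Offer],
    now_ns: int,
    min_headroom_ns: Optional[int] = None,
) -> Tuple[List[Offer], List[Offer]]:
    """Split offers into (usable, declined).

    With min_headroom_ns=None, every offer that has an Unavailability is
    declined. Otherwise such an offer is kept only if its Unavailability
    starts at least min_headroom_ns after now_ns.
    """
    usable: List[Offer] = []
    declined: List[Offer] = []
    for offer in offers:
        window = offer.unavailability
        if window is None:
            # No maintenance scheduled for this agent.
            usable.append(offer)
        elif (min_headroom_ns is not None
              and window.start_ns - now_ns >= min_headroom_ns):
            # Enough time remains before maintenance for a short-lived task.
            usable.append(offer)
        else:
            declined.append(offer)
    return usable, declined
```

      A scheduler would call `partition_offers` on each batch of offers and decline everything in the second list. Where the headroom value comes from (for example a framework setting, or an environment variable for legacy frameworks) is left open here.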

      Action Items: Provide further context for each option and consider others. We need something users can consume immediately to fill the gap until maintenance primitives are the norm. We also need to prevent dangerous scenarios like the second Con listed for option #2.

            People

              hartem Artem Harutyunyan
              jvanremoortere Joris Van Remoortere
              Joris Van Remoortere Joris Van Remoortere