Mesos / MESOS-1148

Add support for rate limiting slave removal

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: master
    • Labels:

      Description

      To safeguard against unforeseen bugs that lead to widespread slave removal, it would be nice to rate limit the decision to remove slaves and/or send TASK_LOST messages for tasks on those slaves. Ideally, this would give an operator enough notice to intervene before the cluster is impacted.

          Activity

          vinodkone Vinod Kone added a comment -

          commit 3048e5e1686a5ae0a0f04fd30fdda0a380e9d13d
          Author: Vinod Kone <vinodkone@gmail.com>
          Date: Tue Feb 3 14:34:28 2015 -0800

          Added metrics for slave shutdowns.

          Review: https://reviews.apache.org/r/30584

          commit 886efefc5f294b3ea22c1fa2ce70a9e9324eca19
          Author: Vinod Kone <vinodkone@gmail.com>
          Date: Thu Jan 29 11:51:02 2015 -0800

          Rate limited the removal of slaves failing health checks.

          Review: https://reviews.apache.org/r/30514

          commit fafccbd9dcb92d4db27bc272d7443ac6ebecbae8
          Author: Vinod Kone <vinodkone@gmail.com>
          Date: Thu Jan 29 12:44:53 2015 -0800

          Moved framework related rate limiters into Master::Frameworks.

          Review: https://reviews.apache.org/r/30511

          vinodkone Vinod Kone added a comment - edited

          https://reviews.apache.org/r/30511
          https://reviews.apache.org/r/30514
          https://reviews.apache.org/r/30584

          vinodkone Vinod Kone added a comment -

          Here's the proposal:

          --> Add a flag to the slave that describes the rate of slave removal.

          --> Configure a rate limiter on the master that is shared among the slave observers.

          --> Update RateLimiter so that its users can discard the acquired future (will create a new ticket for this).

          --> Update the slave observer to acquire a permit (future) for slave removal, and cancel the removal if a pong is received before the future is ready.
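
The proposal above can be sketched roughly as follows. This is a minimal, self-contained illustration using simulated time, not the libprocess RateLimiter API or the code from the linked reviews. The names `Permit`, `RemovalLimiter`, `SlaveObserver`, and `simulatePartition`, and the rate of one removal per 600 seconds, are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the "future" returned by the limiter.
// Discarding it before it is granted cancels the pending removal.
struct Permit {
  bool discarded = false;
  bool granted = false;
  std::function<void()> onGranted;
};

// Hypothetical limiter shared by all observers: grants at most one
// permit per `intervalSecs` of simulated time, in FIFO order.
class RemovalLimiter {
public:
  explicit RemovalLimiter(int64_t intervalSecs) : interval_(intervalSecs) {}

  std::shared_ptr<Permit> acquire(std::function<void()> onGranted) {
    std::shared_ptr<Permit> permit = std::make_shared<Permit>();
    permit->onGranted = std::move(onGranted);
    queue_.push_back(permit);
    return permit;
  }

  // Called periodically with the current simulated time in seconds.
  void tick(int64_t now) {
    while (!queue_.empty()) {
      if (queue_.front()->discarded) {
        queue_.pop_front();  // Discarded permits do not consume a slot.
        continue;
      }
      if (now < nextGrant_) break;  // Rate limit not yet satisfied.
      std::shared_ptr<Permit> permit = queue_.front();
      queue_.pop_front();
      permit->granted = true;
      nextGrant_ = now + interval_;
      permit->onGranted();
    }
  }

private:
  int64_t interval_;
  int64_t nextGrant_ = 0;
  std::deque<std::shared_ptr<Permit>> queue_;
};

// Hypothetical per-slave observer: asks for a permit when health checks
// fail and discards it if a pong arrives before the permit is granted.
class SlaveObserver {
public:
  SlaveObserver(std::string id, RemovalLimiter* limiter,
                std::vector<std::string>* removed)
    : id_(std::move(id)), limiter_(limiter), removed_(removed) {}

  void healthCheckFailed() {
    if (permit_) return;  // A removal is already pending.
    const std::string id = id_;
    std::vector<std::string>* removed = removed_;
    permit_ = limiter_->acquire([id, removed]() { removed->push_back(id); });
  }

  void pong() {
    if (permit_ && !permit_->granted) permit_->discarded = true;
    permit_.reset();
  }

private:
  std::string id_;
  RemovalLimiter* limiter_;
  std::vector<std::string>* removed_;
  std::shared_ptr<Permit> permit_;
};

// Three slaves fail health checks at once; one recovers while queued.
std::vector<std::string> simulatePartition() {
  std::vector<std::string> removed;
  RemovalLimiter limiter(600);  // Assumed rate: 1 removal per 10 minutes.
  SlaveObserver a("slave-a", &limiter, &removed);
  SlaveObserver b("slave-b", &limiter, &removed);
  SlaveObserver c("slave-c", &limiter, &removed);

  a.healthCheckFailed();
  b.healthCheckFailed();
  c.healthCheckFailed();
  limiter.tick(0);    // First permit is granted immediately.
  b.pong();           // slave-b answers before its permit is ready.
  limiter.tick(300);  // Too early for the next removal.
  limiter.tick(600);  // Next slot opens; slave-b is skipped.
  return removed;
}
```

Calling `simulatePartition()` from a `main` function returns the slaves that were actually removed. In this sketch the slave that answered with a pong while still queued is never removed, and the remaining removals are spaced by the configured interval, which is the window an operator would have to intervene.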


            People

            • Assignee:
              vinodkone Vinod Kone
              Reporter:
              wfarner Bill Farner
            • Votes:
              0
              Watchers:
              3
