Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- None
- Twitter Mesos Q1 Sprint 3
- 3
Description
Much like we rate limit slave removals in the common path (MESOS-1148), we need to rate limit the slave removals that occur during master recovery. When a master recovers and is using a strict registry, any slave that does not re-register within a timeout is removed.
Currently there is a safeguard in place to abort when too many slaves have not re-registered. However, that safeguard only covers the extreme case: during a transient partition that stays below its threshold, we still don't want to remove a large number of slaves all at once without rate limiting.
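To make the proposal concrete, here is a minimal sketch of what rate-limited removal after recovery could look like. It is not the Mesos implementation: `PlannedRemoval`, `planRemovals`, and the fixed `interval` parameter are hypothetical names chosen for illustration, and the sketch assumes a simple "at most one removal per interval" policy rather than whatever limiter the master actually uses.

```cpp
#include <chrono>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical type for illustration only; not a Mesos API.
struct PlannedRemoval {
  std::string slaveId;
  // Offset from the moment the re-registration timeout expires.
  std::chrono::milliseconds delay;
};

// Builds a removal schedule for recovered slaves that have not
// re-registered, spacing removals so that at most one occurs per
// `interval` (an assumed, simplified rate limit).
std::vector<PlannedRemoval> planRemovals(
    const std::vector<std::string>& recovered,
    const std::unordered_set<std::string>& reregistered,
    std::chrono::milliseconds interval)
{
  std::vector<PlannedRemoval> plan;

  for (const std::string& id : recovered) {
    if (reregistered.count(id) > 0) {
      continue;  // This slave came back in time; nothing to remove.
    }

    // The n-th pending removal waits n intervals.
    const auto slot =
        static_cast<std::chrono::milliseconds::rep>(plan.size());
    plan.push_back({id, interval * slot});
  }

  return plan;
}
```

With four recovered slaves, one of which re-registered, and a one-second interval, the three remaining removals would be spread over 0 s, 1 s, and 2 s instead of happening at once. A real implementation would presumably also re-check each slave just before its removal fires, so that slaves re-registering once a transient partition heals are skipped.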
Issue Links
- relates to: MESOS-1148 Add support for rate limiting slave removal (Resolved)