Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- None
- None
Description
If we can assume that all frameworks are PARTITION_AWARE (e.g., for Mesos 2), we can likely remove the code that rate-limits agent removal. In that case "agent removal" just means marking the agent as UNREACHABLE; because this is a non-destructive operation, we don't need to be as careful about when we do it. If a framework responds to UNREACHABLE by terminating and replacing tasks, it can (and often should) apply its own safety mechanism, whether a rate limit or something else.
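To illustrate what a framework-side safety mechanism could look like, here is a minimal sketch of a token-bucket limiter that caps how quickly a scheduler replaces tasks reported as unreachable. Everything in it is hypothetical: `ReplacementLimiter`, `on_task_unreachable`, `replace_task`, and the `deferred` list are illustrative names, not part of any Mesos API, and a real framework may well prefer a different policy.

```python
import time


class ReplacementLimiter:
    """Hypothetical token bucket: allows up to `burst` replacements at once,
    then refills at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate_per_sec = rate_per_sec
        self.burst = burst
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(burst)
        self.last_refill = clock()

    def try_acquire(self):
        """Return True and consume a token if a replacement is allowed now."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_sec)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def on_task_unreachable(task_id, limiter, replace_task, deferred):
    """Hypothetical handler a scheduler might call when a task becomes
    unreachable: replace it if the limiter allows, otherwise defer it."""
    if limiter.try_acquire():
        replace_task(task_id)
    else:
        deferred.append(task_id)
```

With this shape, a large partition that marks many agents UNREACHABLE at once would trigger at most `burst` immediate replacements; the remaining tasks wait in `deferred`, which gives the agents a chance to reconnect before their tasks are replaced.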
Issue Links
- is blocked by
  - MESOS-7721 Master's agent removal rate limit also applies to agent unreachability. (Accepted)
- is related to
  - MESOS-8386 Inaccurate rate limiting of marking agents unreachable after master failover. (Open)