Details
- Type: Epic
- Status: Accepted
- Priority: Major
- Resolution: Unresolved
Manage Offers in Allocator
Description
Currently, Offers are managed by the Master while Resources are handled by the Allocator. This split introduces a variety of races between the Master and Allocator actors, and it limits the information that the Allocator can act upon. See the linked issues for examples of these races and limitations.
The goal of this epic is to track a refactor of the Master and Allocator. The Master should continue to manage communication with Frameworks, including the act of sending Offers, but all state associated with Offers (primarily OfferIDs and Timers) will be offloaded to the Allocator.
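As a rough illustration of the proposed split, the sketch below shows offer state (IDs and deadlines) owned by a single allocator-side object, so a caller such as the Master would only forward offers and report outcomes. Everything here is hypothetical: `OfferRegistry`, `OutstandingOffer`, and their methods are names invented for this sketch and are not Mesos types or APIs, and the real refactor would have to fit the libprocess actor model.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical record of one outstanding offer (not a Mesos type).
struct OutstandingOffer {
  std::string frameworkId;
  std::string agentId;
  double cpus;
  Clock::time_point deadline;
};

// Hypothetical allocator-side registry that owns offer IDs and deadlines.
class OfferRegistry {
public:
  // Records a new offer and returns its ID. In this sketch the caller
  // only forwards the offer to the framework and keeps no state itself.
  std::uint64_t create(std::string frameworkId, std::string agentId,
                       double cpus, Clock::time_point now,
                       std::chrono::seconds timeout) {
    assert(timeout.count() > 0);
    const std::uint64_t id = nextId_++;
    offers_.emplace(id, OutstandingOffer{std::move(frameworkId),
                                         std::move(agentId), cpus,
                                         now + timeout});
    return id;
  }

  // Removes an offer (accepted, declined, or rescinded) and returns its
  // record so the resources can be recovered; empty if the ID is unknown.
  std::optional<OutstandingOffer> remove(std::uint64_t id) {
    auto it = offers_.find(id);
    if (it == offers_.end()) {
      return std::nullopt;
    }
    OutstandingOffer offer = std::move(it->second);
    offers_.erase(it);
    return offer;
  }

  // Removes every offer whose deadline has passed and returns their IDs.
  std::vector<std::uint64_t> expire(Clock::time_point now) {
    std::vector<std::uint64_t> expired;
    for (auto it = offers_.begin(); it != offers_.end();) {
      if (it->second.deadline <= now) {
        expired.push_back(it->first);
        it = offers_.erase(it);
      } else {
        ++it;
      }
    }
    return expired;
  }

  std::size_t size() const { return offers_.size(); }

private:
  std::uint64_t nextId_ = 1;
  std::unordered_map<std::uint64_t, OutstandingOffer> offers_;
};
```

Because creation, removal, and expiry all go through one owner, resource accounting and offer bookkeeping cannot drift apart between two actors, which is the class of race the linked issues describe.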
Issue Links
- blocks
  - MESOS-4303 Support resources re-shuffle when new framework registered (Open)
  - MESOS-6844 Avoid offer fragmentation between multiple frameworks / within a single framework. (Open)
  - MESOS-8524 When `UPDATE_SLAVE` messages are received, offers might not be rescinded due to a race (Open)
  - MESOS-7966 check for maintenance on agent causes fatal error (Resolved)
  - MESOS-8638 Support re-balancing of outstanding offers to satisfy fairness / quota updates. (Open)
  - MESOS-8639 Improve rescinding of offers during quota configuration changes. (Open)
- is related to
  - MESOS-6596 Dynamic reservation endpoint returns 409s (Open)
  - MESOS-7639 Oversubscription could crash the master due to CHECK failure in the allocator (Resolved)
  - MESOS-1452 Improve Master::removeOffer to avoid further resource accounting bugs. (Resolved)
  - MESOS-3147 Allocator refactor (Resolved)
- relates to
  - MESOS-8850 Race between master and allocator when destroying shared volume could lead to sorter check failure. (Open)
  - MESOS-7566 Master crash due to failed check in DRFSorter::remove (Accepted)
  - MESOS-3078 Recovered resources are not re-allocated until the next allocation delay. (Reviewable)