Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Q3 Sprint 3, Q3 Sprint 4
- 8
Description
The following sequence of events can cause an overcommit:
--> Launch task is called for a task whose executor is already running.
--> The executor's resources are not accounted for on the master.
--> The executor exits, and the exit event is enqueued behind the launch task on the master.
--> The master sends the task to the slave, which needs to commit resources for both the task and the (new) executor.
--> The master processes the executor exited event and re-offers the executor's resources, causing an overcommit of resources.
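
The sequence above can be sketched as a toy simulation. This is not Mesos code: the `Master` and `Slave` classes, their method names, and the CPU figures are all hypothetical, and the model reflects one reading of the steps (the master charges only the task because it believes the executor is already running). It just shows how the master's bookkeeping and the slave's actual commitments can drift apart.

```python
# Toy model of the race described above. All names and numbers are
# illustrative assumptions, not Mesos APIs or real resource values.

TOTAL_CPUS = 4.0      # hypothetical slave capacity
EXECUTOR_CPUS = 1.0   # hypothetical executor footprint
TASK_CPUS = 1.0       # hypothetical task footprint


class Master:
    def __init__(self):
        # The first executor is already running and already charged.
        self.used = EXECUTOR_CPUS

    def launch_task(self):
        # The executor is believed to be running, so only the task is charged.
        self.used += TASK_CPUS

    def executor_exited(self):
        # Recover the exited executor's resources so they can be re-offered.
        self.used -= EXECUTOR_CPUS

    def offerable(self):
        return TOTAL_CPUS - self.used


class Slave:
    def __init__(self):
        self.executor_running = True
        self.committed = EXECUTOR_CPUS

    def executor_exit(self):
        self.executor_running = False
        self.committed -= EXECUTOR_CPUS

    def run_task(self):
        # With no executor running, a new one must be started for the task.
        if not self.executor_running:
            self.executor_running = True
            self.committed += EXECUTOR_CPUS
        self.committed += TASK_CPUS

    def free(self):
        return TOTAL_CPUS - self.committed


def simulate():
    master, slave = Master(), Slave()
    master.launch_task()      # master charges the task only
    slave.executor_exit()     # exit event is queued behind the launch
    slave.run_task()          # slave commits task + new executor
    master.executor_exited()  # master releases the old executor's resources
    return master.offerable(), slave.free()


if __name__ == "__main__":
    offerable, free = simulate()
    print("master would offer:", offerable)
    print("slave actually has free:", free)
```

In this sketch the master ends up willing to offer more CPUs than the slave has free, because the new executor's resources were never charged on the master.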
Issue Links
- blocks: MESOS-1654 Expose ephemeral ports per container as a resource (Open)
- is blocked by: MESOS-1718 Command executor can overcommit the agent. (Accepted)
- is related to: MESOS-1720 Slave should send exited executor message when the executor is never launched. (Resolved)
- relates to: MESOS-1674 Kill private_resources and treat 'ephemeral_ports' as a resource. (Resolved)