- Type: Epic
- Status: Accepted
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Component/s: allocation, framework, master
- Labels:
- Epic Name: oversubscription for reservation
Reserved resources allow frameworks and cluster operators to ensure that sufficient resources are available when needed. Reservations are usually made to guarantee enough resources under peak load. Often, reserved resources are not actually allocated: the frameworks do not use them, so they sit reserved but idle.
This underutilization is either an opportunity cost or a direct cost, particularly to the cluster operator. Reserved but unallocated resources held by a Lender Framework could be optimistically offered to other frameworks, which we refer to as Tenant Frameworks. When the Lender Framework requests the resources back, some of the Tenant Framework's tasks are evicted, so the original resource offer guarantee is preserved.
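The reclaim step can be illustrated with a small sketch. It is not the Mesos agent's eviction logic: the function name `evict_for_lender`, the CPU-only resource model, and the largest-task-first policy are all assumptions made here for illustration.

```python
def evict_for_lender(tenant_tasks, needed_cpus):
    """Pick tenant tasks to evict until `needed_cpus` have been freed.

    `tenant_tasks` maps a (hypothetical) task id to the revocable CPUs
    that task holds. Returns (task ids to evict, total CPUs freed).
    Largest tasks are evicted first; this policy is an arbitrary choice
    for the sketch, not something the epic specifies.
    """
    evicted, freed = [], 0.0
    by_size = sorted(tenant_tasks.items(), key=lambda kv: kv[1], reverse=True)
    for task_id, cpus in by_size:
        if freed >= needed_cpus:
            break  # the lender's request is already covered
        evicted.append(task_id)
        freed += cpus
    return evicted, freed
```

If the tenant tasks together hold fewer CPUs than the lender needs, the sketch evicts all of them and returns the smaller freed amount; a real implementation would have to handle that shortfall explicitly.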
The first step is to identify resources that are reserved but not allocated. We then offer these resources to other frameworks, marked as revocable. This allows Tenant Frameworks to use them temporarily in a 'best-effort' fashion, knowing that they could be revoked or reclaimed at any time.
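As a rough sketch of this first step, the idle portion of each reservation can be computed and wrapped in an offer flagged as revocable. The `Reservation` and `RevocableOffer` classes, the `reservation_slack` function, and the CPU-only model are hypothetical names introduced for illustration; they are not part of the Mesos allocator API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """Hypothetical record of one role's reservation on an agent (CPUs only)."""
    role: str
    reserved_cpus: float
    allocated_cpus: float  # portion of the reservation currently in use


@dataclass(frozen=True)
class RevocableOffer:
    """Hypothetical offer of idle reserved CPUs; always marked revocable."""
    lender_role: str
    cpus: float
    revocable: bool = True


def reservation_slack(reservations):
    """Return one revocable offer per reservation that has idle capacity."""
    offers = []
    for r in reservations:
        idle = r.reserved_cpus - r.allocated_cpus
        if idle > 0:
            offers.append(RevocableOffer(lender_role=r.role, cpus=idle))
    return offers
```

For example, a role that has reserved 8 CPUs and is using 3 would yield a revocable offer of 5 CPUs, while a fully used reservation yields nothing.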
- is duplicated by: MESOS-6113 Offer unused quota resources as revocable (Accepted)
- is related to: MESOS-1607 Introduce optimistic offers. (Accepted)
| Key | Summary | Status | Assignee |
|---|---|---|---|
| MESOS-1615 | Create design document for Optimistic Offers | Resolved | Joseph Wu |
| MESOS-3887 | Add a flag to master to enable optimistic offers. | Reviewable | Guangya Liu |
| MESOS-3888 | Support distinguishing revocable resources in the Resource protobuf. | Open | Unassigned |
| MESOS-3889 | Modify Oversubscription documentation to explicitly forbid the QoS Controller from killing executors running on optimistically offered resources. | Accepted | Unassigned |
| MESOS-3890 | Add notion of evictable task to RunTaskMessage | Reviewable | Guangya Liu |
| MESOS-3891 | Add a helper function to the Agent to check available resources before launching a task. | Accepted | Guangya Liu |
| MESOS-3892 | Add a helper function to the Agent to retrieve the list of executors that are using optimistically offered, revocable resources. | Open | Unassigned |
| MESOS-3893 | Implement tests for verifying allocator resource math. | Accepted | Guangya Liu |
| MESOS-3894 | Rebuild reservation slack allocator state during master failover. | Accepted | Guangya Liu |
| MESOS-3895 | Update reservation slack allocator state during agent failover. | Accepted | Artem Harutyunyan |
| MESOS-3896 | Add accounting for reservation slack in the allocator. | Accepted | Unassigned |
| MESOS-3897 | Identify and implement test cases for verifying eviction logic in the agent | Open | Unassigned |
| MESOS-3898 | Identify and implement test cases for handling a race between optimistic lender and tenant offers. | Accepted | Unassigned |
| MESOS-3930 | Set resource type as USAGE_SLACK for Oversubscription | Open | Unassigned |
| MESOS-3931 | Do not enable task and executor run on different resources | Reviewable | Guangya Liu |
| MESOS-3955 | Add helper function to get stateless resources. | Reviewable | Guangya Liu |
| MESOS-4123 | Added USAGE_SLACK metrics to snapshot endpoint for master/agent | Reviewable | Guangya Liu |
| MESOS-4124 | Added ALLOCATION_SLACK metrics to snapshot endpoint for master/agent | Reviewable | Guangya Liu |
| MESOS-4145 | Update allocator to get allocation slack resources | Reviewable | Guangya Liu |
| MESOS-4146 | Distinguish usage slack and allocation slack revocable resources | Reviewable | Guangya Liu |
| MESOS-4148 | Set task as REASON_RESOURCE_PREEMPTED if not enough allocation slack resources | Reviewable | Guangya Liu |
| MESOS-4265 | Launch tasks after executors evicted | Open | Unassigned |
| MESOS-4267 | Added helper function to flatten resources. | Reviewable | Guangya Liu |
| MESOS-4320 | Did not rescind offer if offer did not include USAGE_SLACK | Open | Unassigned |
| MESOS-4321 | Slave total resources in master does not include ALLOCATION SLACK | Reviewable | Guangya Liu |
| MESOS-4322 | The load qos controller should use only USAGE SLACK resources. | Reviewable | Guangya Liu |
| MESOS-4323 | Deprecate the revocable_resources metrics for both master and agent | Open | Guangya Liu |
| MESOS-4327 | Update state endpoint support both usage and allocation slack resources. | Reviewable | Guangya Liu |
| MESOS-4426 | Should not send resource offer if there are not enough resources for a task | Reviewable | Guangya Liu |
| MESOS-4447 | Renamed reserved() API to reservations() | Resolved | Guangya Liu |