[MESOS-3157] Only perform periodic resource allocations. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: allocation
Labels: None

Epic Link: allocator performance phase 2

Description

Our deployment environments have a lot of churn, with many short-lived frameworks that often revive offers. Each allocator run takes a long time (from seconds up to minutes).

In this situation, event-triggered allocation causes the event queue in the allocator process to grow very long, and the allocator effectively becomes unresponsive (e.g., a revive offers message takes too long to reach the head of the queue).

We have been running a patch that removes all event-triggered allocations and allocates only periodically, on the allocation interval. In our deployments this works well and noticeably improves the allocator's responsiveness.
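
To illustrate the backlog described above, here is a minimal toy model of a single-threaded allocator queue. It is only a sketch, not Mesos code: `ToyAllocator`, `simulate`, the `"revive"` and `"tick"` event names, and the 2-second allocation cost are all hypothetical, and the model assumes that handling an event costs nothing apart from any allocation run it triggers.

```python
from collections import deque


class ToyAllocator:
    """Toy single-threaded actor; NOT the Mesos allocator API."""

    def __init__(self, allocation_cost, event_triggered):
        self.allocation_cost = allocation_cost  # simulated seconds per run
        self.event_triggered = event_triggered  # allocate on every event?
        self.queue = deque()
        self.clock = 0.0       # simulated time, in seconds
        self.allocations = 0   # number of allocation runs performed
        self.dirty = False     # True if state changed since the last run

    def enqueue(self, event):
        self.queue.append(event)

    def _allocate(self):
        # An allocation run blocks the queue for `allocation_cost` seconds.
        self.clock += self.allocation_cost
        self.allocations += 1
        self.dirty = False

    def drain(self):
        """Process every queued event in FIFO order.

        Returns, for each event, the simulated time at which it reached
        the head of the queue (that is, how long it waited).
        """
        waits = []
        while self.queue:
            event = self.queue.popleft()
            waits.append(self.clock)
            if event == "tick":
                # Periodic allocation: run once if anything changed.
                if self.dirty:
                    self._allocate()
            else:
                self.dirty = True
                if self.event_triggered:
                    self._allocate()
        return waits


def simulate(event_triggered, revives=100, allocation_cost=2.0):
    """Queue `revives` revive events followed by one interval tick.

    Returns (wait of the last revive in seconds, allocation runs).
    """
    allocator = ToyAllocator(allocation_cost, event_triggered)
    for _ in range(revives):
        allocator.enqueue("revive")
    allocator.enqueue("tick")
    waits = allocator.drain()
    return waits[revives - 1], allocator.allocations
```

With these made-up numbers, `simulate(True)` returns `(198.0, 100)`: the last revive waits behind 99 allocation runs. `simulate(False)` returns `(0.0, 1)`: no revive waits, and a single run on the tick covers all of them. The trade-off is that a change is not acted on until the next tick, which appears to be the behavior MESOS-3078 describes for recovered resources.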

Issue Links

contains

MESOS-2285 Eliminate dependency on master::Flags in Allocator

Resolved

duplicates

MESOS-4766 Improve allocator performance.

Resolved

MESOS-4767 Apply batching to allocation events to reduce allocator backlogging.

Resolved

is related to

MESOS-4102 Quota doesn't allocate resources on slave joining.

Resolved

MESOS-3353 generic mechanism to smuggle allocation options

Open

MESOS-4694 DRFAllocator takes very long to allocate resources with a large number of frameworks

Resolved

is superseded by

MESOS-6904 Perform batching of allocations to reduce allocator queue backlogging.

Resolved

relates to

MESOS-3078 Recovered resources are not re-allocated until the next allocation delay.

Reviewable

People

Assignee: Unassigned

Reporter: James Peach

Shepherd: Benjamin Mahler

Votes: 1

Watchers: 8

Dates

Created: 27/Jul/15 21:05

Updated: 07/Jun/18 21:52

Resolved: 07/Jun/18 21:52