[YARN-789] Enable zero capabilities resource requests in fair scheduler - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.4-alpha
Fix Version/s: 2.1.0-beta
Component/s: scheduler
Labels: None
Target Version/s: 2.1.0-beta
Hadoop Flags: Reviewed

Description

Per the discussion in YARN-689, reposting the updated use case:

1. I have a set of services co-existing with a Yarn cluster.

2. These services run out of band from Yarn. They are not started as Yarn containers and they don't use Yarn containers for processing.

3. These services dynamically use different amounts of CPU and memory based on their load. They manage their CPU and memory requirements independently. In other words, depending on their load, they may require more CPU but not more memory, or vice versa.

By using Yarn as the RM for these services, I'm able to share and utilize the resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on all the resources.

These services run an AM that reserves resources on their behalf. When this AM gets the requested resources, the services bump up their CPU/memory utilization out of band from Yarn. If the Yarn allocations are released or preempted, the services back off on their resource utilization. By doing this, Yarn and these services correctly share the cluster resources, with the Yarn RM being the only one that does the overall resource bookkeeping.

So as not to break the container lifecycle, the services' AM starts containers in the corresponding NMs. These container processes basically sleep forever (i.e. sleep 10000d). They use almost no CPU or memory (less than 1MB). Thus it is reasonable to assume their required CPU and memory utilization is NIL (more on hard enforcement later). Because of this almost-NIL utilization of CPU and memory, it should be possible to specify zero for one of the dimensions (CPU or memory) when making a request.

The current limitation is that the increment is also the minimum.

If we set the memory increment to 1MB, a pure CPU request would have to specify 1MB of memory. That would work. However, it would allow arbitrary memory requests without the desired normalization (increments of 256, 512, etc.).

If we set the CPU increment to 1 CPU, a pure memory request would have to specify 1 CPU. CPU amounts are much smaller than memory amounts, and because we don't have fractional CPUs, every pure memory request would waste 1 CPU, reducing the overall utilization of the cluster.
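To make the limitation concrete, here is a minimal sketch of increment-based normalization with the minimum decoupled from the increment. The class `IncrementNormalizer` and its `normalize` method are hypothetical names chosen for illustration; they are not the scheduler's actual API, and the 512 MB increment and 8192 MB maximum are assumed values.

```java
// Hypothetical sketch: names and values are illustrative, not YARN APIs.
final class IncrementNormalizer {

    /**
     * Rounds the requested amount up to the next multiple of the increment,
     * then clamps the result to the range [minimum, maximum].
     */
    static int normalize(int requested, int minimum, int increment, int maximum) {
        if (requested < 0) {
            throw new IllegalArgumentException("requested must be non-negative");
        }
        if (increment <= 0) {
            throw new IllegalArgumentException("increment must be positive");
        }
        int roundedUp = ((requested + increment - 1) / increment) * increment;
        return Math.min(maximum, Math.max(minimum, roundedUp));
    }

    public static void main(String[] args) {
        int incrementMb = 512; // assumed memory increment
        int maximumMb = 8192;  // assumed memory maximum

        // Minimum tied to the increment: a pure CPU request is still charged 512 MB.
        System.out.println(normalize(0, incrementMb, incrementMb, maximumMb));

        // Minimum of zero: a pure CPU request keeps zero memory.
        System.out.println(normalize(0, 0, incrementMb, maximumMb));

        // Non-zero requests are still rounded up to a multiple of the increment.
        System.out.println(normalize(600, 0, incrementMb, maximumMb));
    }
}
```

With the minimum set to zero, the increment keeps doing its normalization job for non-zero requests, while a zero in one dimension passes through unchanged.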

Finally, on hard enforcement.

For CPU. Hard enforcement can be done via the cgroup cpu controller. By using an absolute minimum of a few CPU shares (e.g. 10) in the LinuxContainerExecutor, we ensure there are enough CPU cycles to run the sleep process. This absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the shares for 1 CPU are 1024.
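The CPU floor described above amounts to a single `max` operation. The sketch below assumes 1024 shares per CPU and a floor of 10 shares, as stated in the description; the class `CpuSharesSketch` and its method are hypothetical and do not reflect the actual LinuxContainerExecutor code.

```java
// Hypothetical sketch: not the LinuxContainerExecutor implementation.
final class CpuSharesSketch {

    // Shares granted per requested CPU, per the description (1 CPU = 1024 shares).
    static final int SHARES_PER_CPU = 1024;

    // Assumed absolute floor so that a zero-CPU container can still run its sleep process.
    static final int MIN_SHARES = 10;

    /** Returns the cgroup cpu shares to assign to a container requesting the given CPUs. */
    static int cpuShares(int requestedCpus) {
        if (requestedCpus < 0) {
            throw new IllegalArgumentException("requestedCpus must be non-negative");
        }
        return Math.max(MIN_SHARES, requestedCpus * SHARES_PER_CPU);
    }

    public static void main(String[] args) {
        System.out.println(cpuShares(0)); // floor applies only for a zero-CPU request
        System.out.println(cpuShares(1)); // floor never applies once at least 1 CPU is requested
    }
}
```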

For memory. Hard enforcement is currently done by ProcfsBasedProcessTree.java; using an absolute minimum of 1 or 2 MB would take care of zero memory resources. Again, this absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the memory increment is several MBs, if not 1GB.
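The memory floor can be sketched the same way. The example assumes a 2 MB floor (the description suggests 1 or 2 MB); `MemoryFloorSketch` and its methods are hypothetical helpers and are not part of ProcfsBasedProcessTree.

```java
// Hypothetical sketch: not the ProcfsBasedProcessTree API.
final class MemoryFloorSketch {

    static final long BYTES_PER_MB = 1024L * 1024L;

    // Assumed absolute floor; the description suggests 1 or 2 MB.
    static final long MIN_LIMIT_BYTES = 2 * BYTES_PER_MB;

    /** Returns the enforced memory limit in bytes for a container requesting the given MBs. */
    static long effectiveLimitBytes(int requestedMb) {
        if (requestedMb < 0) {
            throw new IllegalArgumentException("requestedMb must be non-negative");
        }
        return Math.max(MIN_LIMIT_BYTES, requestedMb * BYTES_PER_MB);
    }

    /** Returns true if the observed usage exceeds the enforced limit. */
    static boolean isOverLimit(long usedBytes, int requestedMb) {
        return usedBytes > effectiveLimitBytes(requestedMb);
    }

    public static void main(String[] args) {
        // A sleep process using under 1 MB stays within the floor of a zero-memory request.
        System.out.println(isOverLimit(512 * 1024L, 0));
        // A zero-memory container that grows to 5 MB would be flagged.
        System.out.println(isOverLimit(5 * BYTES_PER_MB, 0));
    }
}
```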

Attachments

- YARN-789.patch, 07/Jun/13 20:11, 10 kB, Alejandro Abdelnur
- YARN-789.patch, 13/Jun/13 00:28, 25 kB, Alejandro Abdelnur
- YARN-789.patch, 13/Jun/13 20:05, 22 kB, Alejandro Abdelnur
- YARN-789.patch, 14/Jun/13 17:21, 20 kB, Alejandro Abdelnur

Issue Links

depends upon

- YARN-803 factor out scheduler config validation from the ResourceManager to each scheduler implementation (Closed)

is blocked by

- YARN-788 Rename scheduler resource minimum to increment (Resolved)

is broken by

- YARN-3996 YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305 (Open)

is related to

- YARN-787 Remove resource min from Yarn client API (Closed)
- YARN-788 Rename scheduler resource minimum to increment (Resolved)
- MAPREDUCE-5310 MRAM should not normalize allocation request capabilities (Closed)

relates to

- YARN-689 Add multiplier unit to resourcecapabilities (Resolved)

People

Assignee: Alejandro Abdelnur

Reporter: Alejandro Abdelnur

Votes: 0

Watchers: 14

Dates

Created: 07/Jun/13 19:45

Updated: 30/Jul/15 01:35

Resolved: 14/Jun/13 19:20