[YARN-8804] resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 3.2.0
Fix Version/s: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2, 2.8.6
Component/s: capacityscheduler
Labels: None
Target Version/s: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2, 2.8.6
Hadoop Flags: Reviewed

Description

This problem was introduced by YARN-4280. Since that change, a parent queue deducts a child queue's headroom from its own resource limit when the child queue has reached its resource limit and the skipped type is QUEUE_LIMIT. For the deepest parent queue, the resource limit is still calculated correctly. For a non-deepest parent queue, however, the headroom it deducts may be much larger than the sum of the headroom of the child queues that actually reached their limits. The resource limit of the non-deepest parent may then be much lower than its true value, which blocks allocation for later queues.

To reproduce this problem with a unit test:
(1) The cluster has two nodes, each with <10GB, 10core> of resources, and the 3-level queue hierarchy below. The max-capacity of "c1" is 10 and that of every other queue is 100, so the max-capacity of queue "c1" is <2GB, 2core>.

                  Root
                /   |   \
               a    b     c
              10   20    70
                        /   \
                      c1     c2
              10(max=10)     90
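
The <2GB, 2core> figure for "c1" follows from multiplying percentages down the tree. A minimal sketch of that arithmetic, assuming a queue's absolute max-capacity is its parent's absolute max-capacity times its own max-capacity percentage. The class and method names are hypothetical and this is not Hadoop code:

```java
/** Toy model of the queue tree in step (1). Hypothetical names; not Hadoop code. */
final class QueueCapacitySketch {
    // Two nodes of <10GB, 10core> each; memory and vcores scale identically here.
    static final double CLUSTER_GB = 20.0;

    /** Assumption: absolute % = parent's absolute % * own % of parent / 100. */
    static double absolutePercent(double parentAbsolutePercent, double ownPercent) {
        return parentAbsolutePercent * ownPercent / 100.0;
    }

    /** Converts an absolute percentage of the cluster into GB. */
    static double toGb(double absolutePercent) {
        return CLUSTER_GB * absolutePercent / 100.0;
    }

    public static void main(String[] args) {
        double rootMax = 100.0;
        double cMax = absolutePercent(rootMax, 100.0); // "c": max-capacity 100
        double c1Max = absolutePercent(cMax, 10.0);    // "c1": max-capacity 10
        System.out.println("c  max-capacity (GB): " + toGb(cMax));
        System.out.println("c1 max-capacity (GB): " + toGb(c1Max));
    }
}
```

Under that assumption "c" may grow to the whole 20GB cluster, while "c1" is capped at 2GB.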

(2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1core>) on nm1.
(3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1core>) on nm1.
(4) app1 and app2 each request one <2GB, 1core> container.
(5) nm1 sends one heartbeat.
Queue "c" now has a lower used-capacity percentage than queue "b", so the allocation order is "a" -> "c" -> "b".
Queue "c1" has reached its queue limit, so the request of app1 stays pending.
The headroom of queue "c1" is <1GB, 1core> (= max-capacity - used).
The headroom of queue "c" is <18GB, 18core> (= max-capacity - used).
After the allocation attempt for queue "c", the resource limit of queue "b" is wrongly calculated as <2GB, 2core>.
The headroom of queue "b" is therefore <1GB, 1core> (= resource-limit - used),
so the scheduler does not allocate the <2GB, 1core> container for app2 on nm1.
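
The numbers in this walk-through can be replayed with plain arithmetic. The sketch below covers only the memory dimension in GB, takes the headroom values from the report as inputs, and uses hypothetical class and method names. It is not the scheduler's code, and the "leaf headroom only" variant is an illustrative contrast, not necessarily what the attached patches do:

```java
/** Toy arithmetic for steps (4)-(5), memory in GB. Hypothetical names; not Hadoop code. */
final class BlockedHeadroomSketch {
    /** Headroom is the limit minus current usage, floored at zero. */
    static int headroom(int limitGb, int usedGb) {
        return Math.max(0, limitGb - usedGb);
    }

    /** Limit left for later siblings after a blocked queue's headroom is deducted. */
    static int limitAfterDeduction(int parentLimitGb, int deductedHeadroomGb) {
        return Math.max(0, parentLimitGb - deductedHeadroomGb);
    }

    /** A request fits only if the queue's headroom covers it. */
    static boolean canAllocate(int limitGb, int usedGb, int requestGb) {
        return headroom(limitGb, usedGb) >= requestGb;
    }

    public static void main(String[] args) {
        int clusterGb = 20;  // two 10GB nodes
        int usedB = 1;       // am2 running in queue "b"
        int requestGb = 2;   // container requested by app2

        int headroomC1 = headroom(2, 1); // c1: max-capacity 2GB, 1GB used by am1
        int headroomC = 18;              // value stated in the report for queue "c"

        // Reported behaviour: the whole headroom of "c" is deducted.
        int reportedLimitB = limitAfterDeduction(clusterGb, headroomC);
        // Illustrative contrast: deduct only the headroom of the blocked leaf "c1".
        int leafOnlyLimitB = limitAfterDeduction(clusterGb, headroomC1);

        System.out.println("limit of b, deducting headroom of c:  " + reportedLimitB);
        System.out.println("app2 fits: " + canAllocate(reportedLimitB, usedB, requestGb));
        System.out.println("limit of b, deducting headroom of c1: " + leafOnlyLimitB);
        System.out.println("app2 fits: " + canAllocate(leafOnlyLimitB, usedB, requestGb));
    }
}
```

With the 18GB deduction, "b" is left with a 2GB limit and 1GB of headroom, which cannot hold the 2GB request. Deducting only the 1GB of "c1" would leave ample room.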

Attachments


YARN-8804.001.patch (9 kB), 20/Sep/18 08:01, Tao Yang
YARN-8804.002.patch (10 kB), 21/Sep/18 04:11, Tao Yang
YARN-8804.003.patch (9 kB), 21/Sep/18 22:45, Tao Yang

Issue Links

is caused by: YARN-4280 CapacityScheduler reservations may not prevent indefinite postponement on a busy cluster (Resolved)

People

Assignee: Tao Yang

Reporter: Tao Yang

Votes: 0

Watchers: 9

Dates

Created: 20/Sep/18 07:33

Updated: 27/Sep/18 03:27

Resolved: 27/Sep/18 00:26