[YARN-3849] Too much of preemption activity causing continuos killing of containers across queues - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.7.0
Fix Version/s: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1
Component/s: capacityscheduler
Labels:
None

Target Version/s:

2.6.4
Hadoop Flags:

Reviewed

Description

Two queues are used. Each queue has given a capacity of 0.5. Dominant Resource policy is used.

1. An app is submitted in QueueA which is consuming full cluster capacity
2. After submitting an app in QueueB, there are some demand and invoking preemption in QueueA
3. Instead of killing the excess of 0.5 guaranteed capacity, we observed that all containers other than AM is getting killed in QueueA
4. Now the app in QueueB is trying to take over cluster with the current free space. But there are some updated demand from the app in QueueA which lost its containers earlier, and preemption is kicked in QueueB now.

Scenario in step 3 and 4 continuously happening in loop. Thus none of the apps are completing.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

0004-YARN-3849-branch2-7.patch
17/Nov/15 13:48
21 kB
Sunil G
0004-YARN-3849-branch2-6.patch
06/Jan/16 16:45
16 kB
Sunil G
0004-YARN-3849.patch
06/Jul/15 04:04
30 kB
Sunil G
0003-YARN-3849.patch
02/Jul/15 15:45
30 kB
Sunil G
0002-YARN-3849.patch
01/Jul/15 11:40
29 kB
Sunil G
0001-YARN-3849.patch
29/Jun/15 18:53
54 kB
Sunil G

Activity

People

Assignee:: Sunil G

Reporter:: Sunil G

Votes:: 0 Vote for this issue

Watchers:: 19 Start watching this issue

Dates

Created:: 25/Jun/15 16:16

Updated:: 25/Oct/19 20:25

Resolved:: 11/Jul/15 17:30