[YARN-8127] Resource leak when async scheduling is enabled - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.2.0, 3.1.1, 2.10.2
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

Brief steps to reproduce:

1. Enable async scheduling with 5 threads.
2. Submit a large number of jobs to try to exhaust cluster resources.
3. After a while, observe that the resource allocated on an NM exceeds the total resource requested by the containers allocated on it.

It looks like the commit phase does not handle reserved containers in a synchronized way, so some proposals are incorrectly accepted and the resource for a single container is subsequently deducted multiple times.
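
The suspected failure mode can be illustrated with a minimal, self-contained sketch. This is not the CapacityScheduler code and not the attached patch: `Node`, `Proposal`, `commitUnchecked`, `commitChecked` and `simulate` are hypothetical names invented for illustration. It assumes two scheduling threads each built a proposal for the same reserved container from the same snapshot, and it replays the two commits sequentially so the result is deterministic.

```java
public class ReservedCommitSketch {

    /** Hypothetical stand-in for per-node bookkeeping; not a YARN class. */
    static final class Node {
        int allocatedMb = 0;               // resource deducted from the node so far
        String reservedContainerId = null; // null when nothing is reserved
    }

    /** Hypothetical proposal: turn the reserved container into an allocation. */
    static final class Proposal {
        final String containerId;
        final int memoryMb;

        Proposal(String containerId, int memoryMb) {
            this.containerId = containerId;
            this.memoryMb = memoryMb;
        }
    }

    /** Commit that trusts the proposal and never re-checks the reservation. */
    static boolean commitUnchecked(Node node, Proposal p) {
        node.allocatedMb += p.memoryMb;
        node.reservedContainerId = null;
        return true;
    }

    /** Commit that re-validates the reservation under the node lock first. */
    static boolean commitChecked(Node node, Proposal p) {
        synchronized (node) {
            if (!p.containerId.equals(node.reservedContainerId)) {
                return false; // stale proposal: the reservation was already consumed
            }
            node.allocatedMb += p.memoryMb;
            node.reservedContainerId = null;
            return true;
        }
    }

    /** Commits two proposals for the same reserved container; returns allocated MB. */
    static int simulate(boolean checked) {
        Node node = new Node();
        node.reservedContainerId = "container_1";

        // Both proposals were derived from the same snapshot of the node.
        Proposal[] proposals = {
            new Proposal("container_1", 1024),
            new Proposal("container_1", 1024)
        };
        for (Proposal p : proposals) {
            if (checked) {
                commitChecked(node, p);
            } else {
                commitUnchecked(node, p);
            }
        }
        return node.allocatedMb;
    }

    public static void main(String[] args) {
        System.out.println("unchecked: " + simulate(false) + " MB"); // unchecked: 2048 MB
        System.out.println("checked: " + simulate(true) + " MB");    // checked: 1024 MB
    }
}
```

In the unchecked variant the node ends up with 2048 MB allocated for a single 1024 MB container, which matches the symptom in step 3. The checked variant rejects the second, stale proposal, so the allocation stays at 1024 MB.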

Attachments

- YARN-8127.branch-2.10.004.patch, 04/Oct/21 20:05, 6 kB, Eric Payne
- YARN-8127.004.patch, 11/Apr/18 06:24, 6 kB, Tao Yang
- YARN-8127.003.patch, 11/Apr/18 04:46, 6 kB, Tao Yang
- YARN-8127.002.patch, 11/Apr/18 02:25, 7 kB, Tao Yang
- YARN-8127.001.patch, 10/Apr/18 14:57, 7 kB, Tao Yang

People

Assignee: Tao Yang
Reporter: Weiwei Yang
Votes: 0
Watchers: 10

Dates

Created: 09/Apr/18 02:48
Updated: 05/Oct/21 21:27
Resolved: 05/Oct/21 21:27