[YARN-4697] NM aggregation thread pool is not bound by limits - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.9.0, 3.0.0-alpha1
Component/s: nodemanager
Labels: None
Target Version/s: 2.8.0, 2.7.3, 2.6.4
Hadoop Flags: Reviewed

Description

In LogAggregationService.java we create a thread pool to upload logs from the NodeManager to HDFS if log aggregation is turned on. This is a cached thread pool, which according to the javadoc is an unlimited pool of threads.
If log aggregation has previously run into problems, this could cause trouble on restart. The number of threads created at that point could be huge, which would put a large load on the NameNode and in the worst case could even bring it down due to file descriptor issues.
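The difference between an unbounded cached pool and a bounded one can be illustrated with a minimal, standalone sketch using only java.util.concurrent. It is not the code from the attached patches: the class name BoundedAggregationPool, the helper methods, and the cap MAX_THREADS are hypothetical, and the blocking task merely stands in for a slow log upload. The casts rely on the OpenJDK behavior that both Executors factories return a ThreadPoolExecutor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedAggregationPool {
    // Hypothetical cap for illustration; not a value taken from the patch.
    static final int MAX_THREADS = 4;

    // Submits `tasks` jobs that all block until released, then reports the
    // largest number of threads the pool ever had alive at once.
    static int peakThreads(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // stand-in for a slow upload
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            // Let the blocked tasks finish and shut the pool down cleanly.
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Cached pool: every task that finds no idle thread gets a new thread.
    static int peakWithCachedPool(int tasks) {
        return peakThreads(
            (ThreadPoolExecutor) Executors.newCachedThreadPool(), tasks);
    }

    // Fixed pool: at most `maxThreads` threads; extra tasks wait in a queue.
    static int peakWithFixedPool(int tasks, int maxThreads) {
        return peakThreads(
            (ThreadPoolExecutor) Executors.newFixedThreadPool(maxThreads), tasks);
    }

    public static void main(String[] args) {
        System.out.println("cached pool peak threads: " + peakWithCachedPool(50));
        System.out.println("fixed pool peak threads:  "
            + peakWithFixedPool(50, MAX_THREADS));
    }
}
```

With 50 simultaneously blocked tasks, the cached pool should report a peak of 50 threads, while the fixed pool should stay at the cap of 4 and queue the rest. A backlog of pending aggregations on restart would scale the cached pool in the same way.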

Attachments

yarn4697.001.patch, 17/Feb/16 00:19, 7 kB, Haibo Chen
yarn4697.002.patch, 17/Feb/16 18:11, 8 kB, Haibo Chen
yarn4697.003.patch, 20/Feb/16 01:31, 9 kB, Haibo Chen
yarn4697.004.patch, 23/Feb/16 17:16, 12 kB, Haibo Chen

Issue Links

is related to

YARN-4325 Nodemanager log handlers fail to send finished/failed events in some cases (Resolved)

YARN-4984 LogAggregationService shouldn't swallow exception in handling createAppDir() which cause thread leak. (Resolved)

People

Assignee: Haibo Chen

Reporter: Haibo Chen

Votes: 0

Watchers: 12

Dates

Created: 17/Feb/16 00:05

Updated: 26/Feb/20 05:29

Resolved: 24/Feb/16 23:05