Hadoop YARN
YARN-3491

PublicLocalizer#addResource is too slow.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nodemanager
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

Based on profiling, the bottleneck in PublicLocalizer#addResource is getInitializedLocalDirs, which calls checkLocalDir.
checkLocalDir is very slow, taking about 10+ ms per call.
The total delay will be approximately (number of local dirs) * 10+ ms,
and this delay is added for each public resource localization.
Because PublicLocalizer#addResource is slow, the thread pool can't be fully utilized: instead of doing public resource localization in parallel (multithreading), public resource localization is serialized most of the time.

Also, PublicLocalizer#addResource runs in the Dispatcher thread,
so the Dispatcher thread will be blocked by PublicLocalizer#addResource for a long time.
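As a rough illustration of the arithmetic above (the dir count here is hypothetical; the ~10 ms per-call cost is from the profiling):

```java
// Hypothetical sketch of the per-resource delay described above; the
// number of local dirs (12) is an assumption, not from the JIRA.
public class AddResourceDelay {
    static long perResourceDelayMs(int numLocalDirs, long checkLocalDirMs) {
        // getInitializedLocalDirs ends up calling checkLocalDir once per local dir
        return (long) numLocalDirs * checkLocalDirMs;
    }

    public static void main(String[] args) {
        // e.g. a node with 12 local dirs pays ~120 ms per public resource
        System.out.println(perResourceDelayMs(12, 10) + " ms per public resource");
    }
}
```

With dozens of public resources per container, this adds seconds of serialized work on the Dispatcher thread.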

      1. YARN-3491.000.patch
        2 kB
        zhihai xu
      2. YARN-3491.001.patch
        3 kB
        zhihai xu
      3. YARN-3491.002.patch
        5 kB
        zhihai xu
      4. YARN-3491.003.patch
        15 kB
        zhihai xu
      5. YARN-3491.004.patch
        16 kB
        zhihai xu

        Issue Links

          Activity

Jason Lowe added a comment -

          Could you elaborate a bit on why the submit is time consuming? Unless I'm mistaken, the FSDownload constructor is very cheap and queueing should be simply tacking an entry on a queue.

zhihai xu added a comment -

          I saw the serialization for public resource localization in the following logs:
          The following log shows two private localization requests and many public localization requests from container_e30_1426628374875_110892_01_000475

          2015-04-07 22:49:56,750 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e30_1426628374875_110892_01_000475 transitioned from NEW to LOCALIZING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/user/databot/.staging/job_1426628374875_110892/job.xml transitioned from INIT to DOWNLOADING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/user/databot/.staging/job_1426628374875_110892/job.jar transitioned from INIT to DOWNLOADING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp-1316042064/reflections.jar transitioned from INIT to DOWNLOADING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp-327542609/service-media-sdk.jar transitioned from INIT to DOWNLOADING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp1631960573/service-local-search-sdk.jar transitioned from INIT to DOWNLOADING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp-1521315530/ace-geo.jar transitioned from INIT to DOWNLOADING
          2015-04-07 22:49:56,751 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp1347512155/cortex-server.jar transitioned from INIT to DOWNLOADING
          

          The following log shows how the public resource localizations are processed.

          2015-04-07 22:49:56,758 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_e30_1426628374875_110892_01_000475
          
          2015-04-07 22:49:56,758 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp-1316042064/reflections.jar, 1428446867531, FILE, null }
          
          2015-04-07 22:49:56,882 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp-327542609/service-media-sdk.jar, 1428446864128, FILE, null }
          
          2015-04-07 22:49:56,902 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp-1316042064/reflections.jar(->/data2/yarn/nm/filecache/4877652/reflections.jar) transitioned from DOWNLOADING to LOCALIZED
          
          2015-04-07 22:49:57,127 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp1631960573/service-local-search-sdk.jar, 1428446858408, FILE, null }
          
          2015-04-07 22:49:57,145 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp-327542609/service-media-sdk.jar(->/data11/yarn/nm/filecache/4877653/service-media-sdk.jar) transitioned from DOWNLOADING to LOCALIZED
          
          2015-04-07 22:49:57,251 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp-1521315530/ace-geo.jar, 1428446862857, FILE, null }
          
          2015-04-07 22:49:57,270 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://nameservice1/tmp/temp1444482237/tmp1631960573/service-local-search-sdk.jar(->/data1/yarn/nm/filecache/4877654/service-local-search-sdk.jar) transitioned from DOWNLOADING to LOCALIZED
          
          2015-04-07 22:49:57,383 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp1347512155/cortex-server.jar, 1428446857069, FILE, null }
          

Based on the log, you can see the thread pool is not fully used: only one thread is active, even though the default thread pool size is 4.
"Downloading public rsrc" is printed from the Dispatcher thread.
"transitioned from DOWNLOADING to LOCALIZED" is printed from a PublicLocalizer thread.
You can see these two messages are interleaved:
          "Downloading public rsrc"
          "transitioned from DOWNLOADING to LOCALIZED"
          "Downloading public rsrc"
          "transitioned from DOWNLOADING to LOCALIZED"
          "Downloading public rsrc"
          "transitioned from DOWNLOADING to LOCALIZED"

Also, when you compare the time the Dispatcher thread takes to process a localization event for a public resource versus a private resource,
there is a huge difference:
The time to process two localization events for private resources in the Dispatcher thread is less than one millisecond,
based on the following log:

          2015-04-07 22:49:56,758 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_e30_1426628374875_110892_01_000475
          2015-04-07 22:49:56,758 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp-1316042064/reflections.jar, 1428446867531, FILE, null }
          

The time to process one localization event for a public resource in the Dispatcher thread is 124 milliseconds,
based on the following log:

          2015-04-07 22:49:56,758 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp-1316042064/reflections.jar, 1428446867531, FILE, null }
          2015-04-07 22:49:56,882 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp1444482237/tmp-327542609/service-media-sdk.jar, 1428446864128, FILE, null }
          

The following is the code that processes localization events in the Dispatcher thread:

              public void handle(LocalizerEvent event) {
                String locId = event.getLocalizerId();
                switch (event.getType()) {
                case REQUEST_RESOURCE_LOCALIZATION:
                  // 0) find running localizer or start new thread
                  LocalizerResourceRequestEvent req =
                    (LocalizerResourceRequestEvent)event;
                  switch (req.getVisibility()) {
                  case PUBLIC:
                    publicLocalizer.addResource(req);
                    break;
                  case PRIVATE:
                  case APPLICATION:
                    synchronized (privLocalizers) {
                      LocalizerRunner localizer = privLocalizers.get(locId);
                      if (null == localizer) {
                        LOG.info("Created localizer for " + locId);
                        localizer = new LocalizerRunner(req.getContext(), locId);
                        privLocalizers.put(locId, localizer);
                        localizer.start();
                      }
                      // 1) propagate event
                      localizer.addResource(req);
                    }
                    break;
                  }
                  break;
                }
              }
          
zhihai xu added a comment -

Hi Jason Lowe, thanks for the comment. Queueing itself is fast, but it takes a longer time to hand the FSDownload to a new worker thread.
Only when all threads in the thread pool are already in use is the submit very fast, since it just adds an entry to the queue via LinkedBlockingQueue#offer.
Based on the following code in ThreadPoolExecutor#execute, corePoolSize is the thread pool size, which is 4 in this case.
workQueue.offer(command) is fast, but addWorker is slow, and the task is only queued when all threads in the thread pool are already running.

             public void execute(Runnable command) {
                  if (command == null)
                      throw new NullPointerException();
                  /*
                   * Proceed in 3 steps:
                   *
                   * 1. If fewer than corePoolSize threads are running, try to
                   * start a new thread with the given command as its first
                   * task.  The call to addWorker atomically checks runState and
                   * workerCount, and so prevents false alarms that would add
                   * threads when it shouldn't, by returning false.
                   *
                   * 2. If a task can be successfully queued, then we still need
                   * to double-check whether we should have added a thread
                   * (because existing ones died since last checking) or that
                   * the pool shut down since entry into this method. So we
                   * recheck state and if necessary roll back the enqueuing if
                   * stopped, or start a new thread if there are none.
                   *
                   * 3. If we cannot queue task, then we try to add a new
                   * thread.  If it fails, we know we are shut down or saturated
                   * and so reject the task.
                   */
                  int c = ctl.get();
                  if (workerCountOf(c) < corePoolSize) {
                      if (addWorker(command, true))
                          return;
                      c = ctl.get();
                  }
                  if (isRunning(c) && workQueue.offer(command)) {
                      int recheck = ctl.get();
                      if (! isRunning(recheck) && remove(command))
                          reject(command);
                      else if (workerCountOf(recheck) == 0)
                          addWorker(null, false);
                  }
                  else if (!addWorker(command, false))
                      reject(command);
              }
          

The issue is:
If the time to run one FSDownload (resource localization) is close to the time to run the submit (adding the FSDownload to a worker thread),
oscillation will happen and only one worker thread will be running. The Dispatcher thread will then be blocked for a longer time.
The above logs prove this situation: LocalizerRunner#addResource, used by the private localizer, takes less than one millisecond to process one REQUEST_RESOURCE_LOCALIZATION event, but PublicLocalizer#addResource, used by the public localizer, takes 124 milliseconds to process one REQUEST_RESOURCE_LOCALIZATION event.
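The effect can be reproduced outside YARN with a plain ThreadPoolExecutor. The following is a minimal sketch, not NodeManager code: sleeps stand in for the real submit and download costs, and the durations are made up. When the gap between submits is at least the task duration, a 4-thread pool degenerates to roughly one task in flight.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SlowSubmitDemo {
    // Returns the maximum number of tasks observed running concurrently.
    static int maxConcurrency(long submitGapMs, long taskMs, int tasks)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                max.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(taskMs);       // stands in for the download
                } catch (InterruptedException ignored) {
                }
                running.decrementAndGet();
            });
            Thread.sleep(submitGapMs);          // stands in for the slow addResource
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // slow submit (gap >= task duration): the pool is effectively serialized
        System.out.println("slow submit: " + maxConcurrency(60, 40, 8));
        // fast submit: the pool fills up to its 4 threads
        System.out.println("fast submit: " + maxConcurrency(0, 100, 8));
    }
}
```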

Sangjin Lee added a comment -

          I have the same question as Jason Lowe. The actual call

                      synchronized (pending) {
                        pending.put(queue.submit(new FSDownload(lfs, null, conf,
                            publicDirDestPath, resource, request.getContext().getStatCache())),
                            request);
                      }
          

          should be completely non-blocking and there is nothing that's expensive about it with the possible exception of the synchronization. Could you describe the root cause of the slowness you're seeing in some more detail?

zhihai xu added a comment -

Hi Sangjin Lee, that is a good point. I had assumed queue.submit was the bottleneck. But queue.submit is just part of the code in PublicLocalizer#addResource; the bottleneck may come from publicRsrc.getPathForLocalization, since we added a lot of work to LocalResourcesTrackerImpl#getPathForLocalization, such as stateStore.startResourceLocalization(user, appId, ((LocalResourcePBImpl) lr).getProto(), localPath);

I should describe it more clearly. Based on the log, the issue is that PublicLocalizer#addResource is very slow, which blocks the Dispatcher thread. Looking at the following code in PublicLocalizer#addResource, I felt queue.submit might take most of the CPU cycles; based on Jason Lowe's and your comments, the slowness may instead come from other code such as publicRsrc.getPathForLocalization or dirsHandler.getLocalPathForWrite. Either way, I think moving all of this code in PublicLocalizer#addResource from the Dispatcher thread to the PublicLocalizer thread would be a good optimization. We can use a synchronized list of LocalizerResourceRequestEvent to store these events for public resource localization, similar to what LocalizerRunner does for private resource localization.
I will do some more profiling to see what the bottleneck in PublicLocalizer#addResource is.

              public void addResource(LocalizerResourceRequestEvent request) {
                // TODO handle failures, cancellation, requests by other containers
                LocalizedResource rsrc = request.getResource();
                LocalResourceRequest key = rsrc.getRequest();
                LOG.info("Downloading public rsrc:" + key);
                /*
                 * Here multiple containers may request the same resource. So we need
                 * to start downloading only when
                 * 1) ResourceState == DOWNLOADING
                 * 2) We are able to acquire non blocking semaphore lock.
                 * If not we will skip this resource as either it is getting downloaded
                 * or it FAILED / LOCALIZED.
                 */
          
                if (rsrc.tryAcquire()) {
                  if (rsrc.getState() == ResourceState.DOWNLOADING) {
                    LocalResource resource = request.getResource().getRequest();
                    try {
                      Path publicRootPath =
                          dirsHandler.getLocalPathForWrite("." + Path.SEPARATOR
                              + ContainerLocalizer.FILECACHE,
                            ContainerLocalizer.getEstimatedSize(resource), true);
                      Path publicDirDestPath =
                          publicRsrc.getPathForLocalization(key, publicRootPath);
                      if (!publicDirDestPath.getParent().equals(publicRootPath)) {
                        DiskChecker.checkDir(new File(publicDirDestPath.toUri().getPath()));
                      }
          
                      // In case this is not a newly initialized nm state, ensure
                      // initialized local/log dirs similar to LocalizerRunner
                      getInitializedLocalDirs();
                      getInitializedLogDirs();
          
                      // explicitly synchronize pending here to avoid future task
                      // completing and being dequeued before pending updated
                      synchronized (pending) {
                        pending.put(queue.submit(new FSDownload(lfs, null, conf,
                            publicDirDestPath, resource, request.getContext().getStatCache())),
                            request);
                      }
                    } catch (IOException e) {
                      rsrc.unlock();
                      publicRsrc.handle(new ResourceFailedLocalizationEvent(request
                        .getResource().getRequest(), e.getMessage()));
                      LOG.error("Local path for public localization is not found. "
                          + " May be disks failed.", e);
                    } catch (IllegalArgumentException ie) {
                      rsrc.unlock();
                      publicRsrc.handle(new ResourceFailedLocalizationEvent(request
                          .getResource().getRequest(), ie.getMessage()));
                      LOG.error("Local path for public localization is not found. "
                          + " Incorrect path. " + request.getResource().getRequest()
                          .getPath(), ie);
                    } catch (RejectedExecutionException re) {
                      rsrc.unlock();
                      publicRsrc.handle(new ResourceFailedLocalizationEvent(request
                        .getResource().getRequest(), re.getMessage()));
                      LOG.error("Failed to submit rsrc " + rsrc + " for download."
                          + " Either queue is full or threadpool is shutdown.", re);
                    }
                  } else {
                    rsrc.unlock();
                  }
                }
              }
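The restructuring proposed above (enqueue on the Dispatcher thread, do the expensive work on the localizer's own thread) could look roughly like this. The class and method shapes below are hypothetical, not the actual patch; a String stands in for LocalizerResourceRequestEvent.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: the Dispatcher thread only enqueues; the localizer thread drains
// the queue and performs the slow path/dir work and the executor submit.
public class QueueingPublicLocalizer extends Thread {
    private final BlockingQueue<String> requests = new LinkedBlockingQueue<>();

    // Called from the Dispatcher thread: O(1), non-blocking.
    public void addResource(String request) {
        requests.offer(request);
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String req = requests.take();
                process(req); // getLocalPathForWrite, getPathForLocalization,
                              // getInitializedLocalDirs, queue.submit(...) etc.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Placeholder for the slow work moved off the Dispatcher thread.
    protected void process(String request) {
    }
}
```

With this shape, the Dispatcher thread's cost per REQUEST_RESOURCE_LOCALIZATION event drops to a queue insert, matching how LocalizerRunner#addResource behaves for private resources.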
          
          zxu zhihai xu added a comment -

          Hi Jason Lowe and Sangjin Lee, I think I know what the bottleneck in PublicLocalizer#addResource is.
          I checked old NM logs from the 2.3.0 release code: PublicLocalizer#addResource took less than one millisecond in the 2.3.0 release.

          2014-10-21 18:11:10,956 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-602532977/asm.jar, 1413914982330, FILE, null }
          2014-10-21 18:11:10,956 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-983952127/start.jar, 1413914978818, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-700474448/jsch.jar, 1413914981670, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-295789958/kfs.jar, 1413914974035, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp1832142372/datasvc-search.jar, 1413914970738, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-1244404847/args4j.jar, 1413914982044, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp729860031/slf4j-log4j12.jar, 1413914980407, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-1748521227/jackson-mapper-asl.jar, 1413914983142, FILE, null }
          2014-10-21 18:11:10,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-246818030/jasper-compiler.jar, 1413914979243, FILE, null }
          2014-10-21 18:11:10,958 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://nameservice1/tmp/temp-1620691366/tmp-1703279108/spiffy.jar, 1413914974080, FILE, null }
          

          Then I compared the public localization code; the difference is in LocalResourcesTrackerImpl#getPathForLocalization.
          The following code was added after the 2.3.0 release:

              rPath = new Path(rPath,
                  Long.toString(uniqueNumberGenerator.incrementAndGet()));
              Path localPath = new Path(rPath, req.getPath().getName());
              LocalizedResource rsrc = localrsrc.get(req);
              rsrc.setLocalPath(localPath);
              LocalResource lr = LocalResource.newInstance(req.getResource(),
                  req.getType(), req.getVisibility(), req.getSize(),
                  req.getTimestamp());
              try {
                stateStore.startResourceLocalization(user, appId,
                    ((LocalResourcePBImpl) lr).getProto(), localPath);
              } catch (IOException e) {
                LOG.error("Unable to record localization start for " + rsrc, e);
              }
          

          I think stateStore.startResourceLocalization is most likely the bottleneck.
          startResourceLocalization stores the state in leveldb, and the leveldb operation is time-consuming: it needs to go through the JNI interface.

            public void startResourceLocalization(String user, ApplicationId appId,
                LocalResourceProto proto, Path localPath) throws IOException {
              String key = getResourceStartedKey(user, appId, localPath.toString());
              try {
                db.put(bytes(key), proto.toByteArray());
              } catch (DBException e) {
                throw new IOException(e);
              }
            }
          

          I think it would be better to do these levelDB operations in a separate thread using AsyncDispatcher in NMLeveldbStateStoreService.

          jlowe Jason Lowe added a comment -

          Storing asynchronously is going to be a bit dangerous: we do not want to create a situation where a resource has started localizing but we haven't recorded the fact that we started it. Theoretically we could end up doing a recovery where we leak a resource, or fail to realize that a localization started but did not complete and needs to be cleaned up.

          I think it's best at this point to have some hard evidence from a profiler or targeted log statements around the suspected code where all the time is being spent in the NM rather than guessing.
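          Targeted log statements of the kind suggested here can be as simple as a small timing helper around each suspect section. This is an illustrative sketch, not code from the NodeManager:

```java
// Hypothetical timing helper: wraps a code section and prints how long
// it took when the try-with-resources block closes.
public class TimedSection implements AutoCloseable {
    private final String label;
    private final long startNanos = System.nanoTime();

    public TimedSection(String label) {
        this.label = label;
    }

    // Elapsed wall-clock time in milliseconds since construction.
    public long elapsedMs() {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    @Override
    public void close() {
        System.out.println(label + " took " + elapsedMs() + " ms");
    }
}
```

          Usage around the suspect code would look like `try (TimedSection t = new TimedSection("getInitializedLocalDirs")) { getInitializedLocalDirs(); }`, logging one line per section.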

          zxu zhihai xu added a comment -

          Yes, I agree that storing asynchronously is going to be a bit dangerous.
          I will do more profiling in PublicLocalizer#addResource to get detailed timings for each sub-section of the code.

          zxu zhihai xu added a comment -

          Hi Jason Lowe, you are right; I am sorry that all my previous guesses were wrong.
          I did the profiling, and the bottleneck is the following code:

          getInitializedLocalDirs();
          getInitializedLogDirs();
          

          More precisely, the bottleneck is checkLocalDir, which calls getFileStatus.
          I did two rounds of profiling:
          1. I measured the time in PublicLocalizer#addResource:
          The following code, including the leveldb operation, takes 1 ms.

                      Path publicRootPath =
                          dirsHandler.getLocalPathForWrite("." + Path.SEPARATOR
                              + ContainerLocalizer.FILECACHE,
                            ContainerLocalizer.getEstimatedSize(resource), true);
                      Path publicDirDestPath =
                          publicRsrc.getPathForLocalization(key, publicRootPath);
                      if (!publicDirDestPath.getParent().equals(publicRootPath)) {
                        DiskChecker.checkDir(new File(publicDirDestPath.toUri().getPath()));
                      }
          

          getInitializedLocalDirs and getInitializedLogDirs took 12 ms together.

          And the following queue.submit code took less than 1 ms.

                      synchronized (pending) {
                        pending.put(queue.submit(new FSDownload(lfs, null, conf,
                            publicDirDestPath, resource, request.getContext().getStatCache())),
                            request);
                      }
          

          2. Then I measured the time in getInitializedLocalDirs and getInitializedLogDirs.
          I found that checkLocalDir, which is called by getInitializedLocalDirs, is really slow:
          checkLocalDir takes 14 ms, and there is only one local dir in my test environment.

            synchronized private List<String> getInitializedLocalDirs() {
              List<String> dirs = dirsHandler.getLocalDirs();
              List<String> checkFailedDirs = new ArrayList<String>();
              for (String dir : dirs) {
                try {
                  checkLocalDir(dir);
                } catch (YarnRuntimeException e) {
                  checkFailedDirs.add(dir);
                }
              }
          

          The log in my previous comment has more than 10 local dirs, which means checkLocalDir is called more than 10 times.
          10 * 14 ms is roughly 140 ms, so that is where the 100+ ms delay comes from.
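          The arithmetic above can be modeled directly. The 14 ms per checkLocalDir call and the dir count are the measurements from this comment; the class itself is only illustrative:

```java
// Illustrative model of the delay described above: getInitializedLocalDirs
// calls checkLocalDir once per local dir, and the original code runs this
// for every public resource, serialized on the Dispatcher thread.
public class LocalizationDelayModel {
    static long addResourceDelayMs(int numLocalDirs, long checkLocalDirMs) {
        return (long) numLocalDirs * checkLocalDirMs;
    }

    public static void main(String[] args) {
        // ~140 ms per public resource with 10+ local dirs at 14 ms each.
        long perResource = addResourceDelayMs(10, 14);
        // A container with hundreds of public resources multiplies that delay.
        long hundredResources = 100 * perResource;
        System.out.println(perResource + " ms per resource, "
            + hundredResources + " ms for 100 resources");
    }
}
```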

          I attached a patch, YARN-3491.000.patch, to fix the issue. The patch calls getInitializedLocalDirs only once per container.
          The original code calls getInitializedLocalDirs for every public resource, and a container can have hundreds of public resources, which is the situation in my previous log.

          Jason Lowe, could you review it? Thanks

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12726118/YARN-3491.000.patch
          against trunk revision 76e7264.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7374//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7374//console

          This message is automatically generated.

          zxu zhihai xu added a comment -

          I uploaded a new patch, YARN-3491.001.patch, for review.
          Thinking about it a bit more, the old patch may cause a big delay if multiple containers are submitted at the same time.
          For example, the following log shows 4 containers submitted at very close times:

          2015-04-07 21:42:22,071 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e30_1426628374875_110648_01_078264 transitioned from NEW to LOCALIZING
          2015-04-07 21:42:22,074 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e30_1426628374875_110652_01_093777 transitioned from NEW to LOCALIZING
          2015-04-07 21:42:22,076 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e30_1426628374875_110668_01_049049 transitioned from NEW to LOCALIZING
          2015-04-07 21:42:22,078 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e30_1426628374875_110668_01_085183 transitioned from NEW to LOCALIZING
          

          The new patch can overlap the delay with public localization from the previous container, which is a little better and more consistent with the behavior of the old code.
          It is also better for a container that has only private resources and no public resources: in that case, no delay is added to the Dispatcher thread.
          Finally, the change in the new patch is a little smaller than in the first patch.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12726221/YARN-3491.001.patch
          against trunk revision c6b5203.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7383//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7383//console

          This message is automatically generated.

          zxu zhihai xu added a comment -

          I did more profiling in checkLocalDir, and the result really surprised me.
          The most time-consuming code is status.getPermission(), not lfs.getFileStatus.
          status.getPermission() takes 4 or 5 ms, and checkLocalDir calls status.getPermission() three times.
          That is why checkLocalDir takes 10+ ms.

            private boolean checkLocalDir(String localDir) {
          
              Map<Path, FsPermission> pathPermissionMap = getLocalDirsPathPermissionsMap(localDir);
          
              for (Map.Entry<Path, FsPermission> entry : pathPermissionMap.entrySet()) {
                FileStatus status;
                try {
                  status = lfs.getFileStatus(entry.getKey());
                } catch (Exception e) {
                  String msg =
                      "Could not carry out resource dir checks for " + localDir
                          + ", which was marked as good";
                  LOG.warn(msg, e);
                  throw new YarnRuntimeException(msg, e);
                }
          
                if (!status.getPermission().equals(entry.getValue())) {
                  String msg =
                      "Permissions incorrectly set for dir " + entry.getKey()
                          + ", should be " + entry.getValue() + ", actual value = "
                          + status.getPermission();
                  LOG.warn(msg);
                  throw new YarnRuntimeException(msg);
                }
              }
              return true;
            }
          

          Then I went deeper into the source code and found out why status.getPermission takes most of the time:
          lfs.getFileStatus returns a RawLocalFileSystem#DeprecatedRawLocalFileStatus:

              public FsPermission getPermission() {
                if (!isPermissionLoaded()) {
                  loadPermissionInfo();
                }
                return super.getPermission();
              }
          

          So status.getPermission calls loadPermissionInfo.
          Based on the following code, loadPermissionInfo is the bottleneck: it runs "ls -ld" to get the permission, which is really slow.

              /// loads permissions, owner, and group from `ls -ld`
              private void loadPermissionInfo() {
                IOException e = null;
                try {
                  String output = FileUtil.execCommand(new File(getPath().toUri()), 
                      Shell.getGetPermissionCommand());
                  StringTokenizer t =
                      new StringTokenizer(output, Shell.TOKEN_SEPARATOR_REGEX);
                  //expected format
                  //-rw-------    1 username groupname ...
                  String permission = t.nextToken();
                  if (permission.length() > FsPermission.MAX_PERMISSION_LENGTH) {
                    //files with ACLs might have a '+'
                    permission = permission.substring(0,
                      FsPermission.MAX_PERMISSION_LENGTH);
                  }
                  setPermission(FsPermission.valueOf(permission));
                  t.nextToken();
          
                  String owner = t.nextToken();
                  // If on windows domain, token format is DOMAIN\\user and we want to
                  // extract only the user name
                  if (Shell.WINDOWS) {
                    int i = owner.indexOf('\\');
                    if (i != -1)
                      owner = owner.substring(i + 1);
                  }
                  setOwner(owner);
          
                  setGroup(t.nextToken());
                } catch (Shell.ExitCodeException ioe) {
                  if (ioe.getExitCode() != 1) {
                    e = ioe;
                  } else {
                    setPermission(null);
                    setOwner(null);
                    setGroup(null);
                  }
                } catch (IOException ioe) {
                  e = ioe;
                } finally {
                  if (e != null) {
                    throw new RuntimeException("Error while running command to get " +
                                               "file permissions : " + 
                                               StringUtils.stringifyException(e));
                  }
                }
              }
          

          We should call getPermission as little as possible in the future.
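          One way to avoid the fork entirely is the java.nio.file API, which reads permissions with a stat-like call instead of executing "ls -ld". This is only a sketch assuming a POSIX filesystem, not what the Hadoop code in question did:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class NioPermissionCheck {
    // Reads permissions via java.nio (a native stat-style call) rather
    // than forking "ls -ld" the way loadPermissionInfo does. Throws
    // UnsupportedOperationException on non-POSIX filesystems (e.g. Windows).
    static String permissionString(Path p) throws IOException {
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(p);
        return PosixFilePermissions.toString(perms); // e.g. "rwxr-xr-x"
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nm-local-dir");
        System.out.println(permissionString(dir));
    }
}
```

          Avoiding the per-call process fork is what makes the repeated permission checks cheap.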

          wilfreds Wilfred Spiegelenburg added a comment -

          Could this change not cause a new issue? What happens if a directory goes from bad to good while the localizer is running: could that leave us trying to use an uninitialised directory, causing failures that are difficult to detect until a new localizer is started and the directory is initialised? getLocalDirs() only returns "good" dirs, and thus only the good dirs get initialised.

          Looking over the code, there is also a lot of unneeded object creation that could be stripped out, speeding things up and lowering memory usage.

          zxu zhihai xu added a comment -

          Hi Wilfred Spiegelenburg, thanks for the review. A directory going from bad to good can happen at any time; it is asynchronous to both public and private resource localization. Even without my change, it can still happen right after the local and log dirs are initialized in the current code. Also, private resource localization initializes the local and log dirs per container, not per resource. Our purpose is to reduce the chance of failure.

          Looking over the code there is also a lot of unneeded object creation which could be stripped out speeding things up and lowering memory usage.

          I profiled PublicLocalizer#addResource; no other code took much time except checkLocalDir, which calls getPermission three times. getPermission runs the command "ls -ld" to get the permission, which is very slow.

          But your comment gave me a good idea for a better solution which can save more time:
          We can call LocalDirsHandlerService#getLastDisksCheckTime to get the timestamp of the previous disk check. Using this information, we only need to initialize the local and log dirs when the timestamp has changed. The timestamp only changes every two minutes by default, so we won't initialize the local and log dirs more than once every two minutes.

              diskHealthCheckInterval = conf.getLong(
                  YarnConfiguration.NM_DISK_HEALTH_CHECK_INTERVAL_MS,
                  YarnConfiguration.DEFAULT_NM_DISK_HEALTH_CHECK_INTERVAL_MS);
          public static final long DEFAULT_NM_DISK_HEALTH_CHECK_INTERVAL_MS = 120000L;
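          The timestamp gating described above can be sketched as a small self-contained class (a sketch under assumed names, not the actual patch; LocalDirsHandlerService#getLastDisksCheckTime would play the role of the timestamp supplier, and the expensive work would be initializing the local and log dirs):

```java
import java.util.function.LongSupplier;

/** Sketch (not the actual patch): re-run an expensive initialization only
 *  when an external "last check" timestamp has advanced. */
public class TimestampGatedInit {
  private final LongSupplier lastCheckTime; // e.g. getLastDisksCheckTime
  private final Runnable expensiveInit;     // e.g. initializing local/log dirs
  private long seen = Long.MIN_VALUE;

  public TimestampGatedInit(LongSupplier lastCheckTime, Runnable expensiveInit) {
    this.lastCheckTime = lastCheckTime;
    this.expensiveInit = expensiveInit;
  }

  /** Runs the initialization at most once per observed timestamp. */
  public synchronized boolean maybeInit() {
    long now = lastCheckTime.getAsLong();
    if (now == seen) {
      return false;          // disks not re-checked since the last init
    }
    seen = now;
    expensiveInit.run();
    return true;
  }
}
```

          With the default two-minute disk-check interval, every addResource call between disk checks takes the cheap early-return path.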
          

          Hi Jason Lowe, do you think my new idea is reasonable? I would greatly appreciate any feedback on it.

          zxu zhihai xu added a comment -

          I uploaded a new patch YARN-3491.002.patch for review. The new patch will only initialize the local and log Dirs when DisksCheckTime is changed.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 34s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 javac 7m 32s There were no new javac warning messages.
          +1 javadoc 9m 34s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 5m 24s The applied patch generated 1 additional checkstyle issues.
          +1 install 1m 35s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 2s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 5m 50s Tests passed in hadoop-yarn-server-nodemanager.
              46m 29s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12728188/YARN-3491.002.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / a00e001
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/7502/artifact/patchprocess/checkstyle-result-diff.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7502/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7502/testReport/
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7502/console

          This message was automatically generated.

          jira.shegalov Gera Shegalov added a comment -

          We should switch to io.nativeio.NativeIO.POSIX#getFstat as implementation in RawLocalFileSystem to get rid of shell-based implementation for FileStatus.

          zxu zhihai xu added a comment -

          Hi Gera Shegalov, thanks for the information. Could you give some details on how to switch to io.nativeio.NativeIO.POSIX#getFstat?
          Currently the attached patch tries to limit the number of calls to getInitializedLocalDirs; even if we switch to io.nativeio.NativeIO.POSIX#getFstat, the attached patch should still be useful. IMHO it will be good to decrease the number of calls to getInitializedLocalDirs and getInitializedLogDirs no matter which API we use.

          Should we create a separate follow-up JIRA for switching to io.nativeio.NativeIO.POSIX#getFstat?

          jira.shegalov Gera Shegalov added a comment -

          Agreed, reducing the number of system calls is a good idea. Using JNI instead of "ls" can be handled with a separate JIRA.

          zxu zhihai xu added a comment -

          Thanks Gera Shegalov! I created YARN-3549 for switching to io.nativeio.NativeIO.POSIX#getFstat.

          zxu zhihai xu added a comment -

          I uploaded a new patch, YARN-3491.003.patch, for review.
          I think the new patch YARN-3491.003 is better than the previous solutions.
          It solves the race condition ("a directory goes from bad to good") from Wilfred Spiegelenburg's comment.
          I added a small feature in DirectoryCollection.java: DirsChangeListener.
          ResourceLocalizationService can register a DirsChangeListener for localDirs and logDirs.
          Once DirectoryCollection#localDirs changes in checkDirs, ResourceLocalizationService will get the DirsChangeListener#onDirsChanged callback, which will call getInitializedLocalDirs and getInitializedLogDirs.
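          A minimal sketch of that listener mechanism (the class and method names mirror the ones described above, but the implementation here is assumed, not the actual patch):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Sketch: a DirectoryCollection-like holder that notifies registered
 *  listeners whenever checkDirs() changes the set of good directories. */
public class DirectoryCollectionSketch {
  public interface DirsChangeListener {
    void onDirsChanged();
  }

  private final List<DirsChangeListener> listeners = new CopyOnWriteArrayList<>();
  private List<String> goodDirs = List.of();

  public void registerDirsChangeListener(DirsChangeListener l) {
    listeners.add(l);
  }

  /** Called by the periodic disk checker with the current set of good dirs. */
  public void checkDirs(List<String> newGoodDirs) {
    if (!newGoodDirs.equals(goodDirs)) {
      goodDirs = List.copyOf(newGoodDirs);
      for (DirsChangeListener l : listeners) {
        l.onDirsChanged(); // e.g. re-initialize local/log dirs
      }
    }
  }
}
```

          The callback fires only on an actual change, so the expensive re-initialization happens exactly when a directory transitions between good and bad rather than on every addResource call.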

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 39s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 33s There were no new javac warning messages.
          +1 javadoc 9m 31s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 21s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 3s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 5m 51s Tests passed in hadoop-yarn-server-nodemanager.
              41m 32s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730114/YARN-3491.003.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / a319771
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7681/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7681/testReport/
          Java 1.7.0_55
          uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7681/console

          This message was automatically generated.

          wilfreds Wilfred Spiegelenburg added a comment -

          Can we clean up getInitializedLocalDirs() and getInitializedLogDirs() now that we're changing them?
          Neither of the methods need to return anything since we do not use the return value. Also a rename of the methods would make it clearer:
          getInitializedLogDirs() --> initializeLogDirs()
          getInitializedLocalDirs() --> initializeLocalDirs()

          zxu zhihai xu added a comment -

          thanks Wilfred Spiegelenburg for the review. I uploaded a new patch YARN-3491.004.patch, which addressed all your comments.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 43s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 33s There were no new javac warning messages.
          +1 javadoc 9m 36s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 0m 36s The applied patch generated 3 new checkstyle issues (total was 177, now 178).
          -1 whitespace 0m 1s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 33s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 2s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 5m 57s Tests passed in hadoop-yarn-server-nodemanager.
              42m 5s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730351/YARN-3491.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 338e88a
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/7700/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/7700/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7700/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7700/testReport/
          Java 1.7.0_55
          uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7700/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 40s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 32s There were no new javac warning messages.
          +1 javadoc 9m 37s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 36s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 32s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 3s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 5m 51s Tests passed in hadoop-yarn-server-nodemanager.
              41m 52s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730368/YARN-3491.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 338e88a
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7701/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7701/testReport/
          Java 1.7.0_55
          uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7701/console

          This message was automatically generated.

          rkanter Robert Kanter added a comment -

          LGTM +1. I'll hold off on committing just yet in case anyone else has more comments first.

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 41s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 34s There were no new javac warning messages.
          +1 javadoc 9m 34s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 37s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 32s The patch built with eclipse:eclipse.
          +1 findbugs 1m 3s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 5m 51s Tests passed in hadoop-yarn-server-nodemanager.
              41m 54s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730710/YARN-3491.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / a583a40
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7727/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7727/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7727/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 53s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 50s There were no new javac warning messages.
          +1 javadoc 10m 3s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 49s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 35s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 17s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          -1 yarn tests 49m 44s Tests failed in hadoop-yarn-server-resourcemanager.
              87m 16s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
            hadoop.yarn.server.resourcemanager.TestRMRestart



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730708/YARN-3385.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 90b3845
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7725/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7725/testReport/
          Java 1.7.0_55
          uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7725/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 43s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 36s There were no new javac warning messages.
          +1 javadoc 9m 38s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 36s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 2s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          -1 yarn tests 5m 48s Tests failed in hadoop-yarn-server-nodemanager.
              41m 58s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730730/YARN-3491.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / a583a40
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7729/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7729/testReport/
          Java 1.7.0_55
          uname Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7729/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 43s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 31s There were no new javac warning messages.
          +1 javadoc 9m 33s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 37s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 35s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 3s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 6m 1s Tests passed in hadoop-yarn-server-nodemanager.
              42m 10s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12730772/YARN-3491.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / a583a40
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7730/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7730/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7730/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

          LGTM

          rkanter Robert Kanter added a comment -

          Thanks Zhihai. Committed to trunk and branch-2!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #7750 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7750/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/CHANGES.txt
          vinodkv Vinod Kumar Vavilapalli added a comment -

          zhihai xu, interesting JIRA and great profiling!

          zxu zhihai xu added a comment -

          thanks Jason Lowe, Sangjin Lee and Gera Shegalov for the valuable suggestions.
          thanks Wilfred Spiegelenburg and Anubhav Dhoot for reviewing the patch.
          thanks Robert Kanter for the review and committing the patch.
          thanks Vinod Kumar Vavilapalli for your feedback!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #187 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/187/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #920 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/920/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2118 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2118/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #177 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/177/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #187 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/187/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2136 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2136/)
          YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
          brahmareddy Brahma Reddy Battula added a comment -

          I feel, this should go in branch-2.7 as well..?


            People

            • Assignee: zhihai xu
            • Reporter: zhihai xu