[MAPREDUCE-6631] shuffle handler would benefit from per-local-dir threads - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.7.2, 3.0.0-alpha1
Fix Version/s: None
Component/s: nodemanager
Labels: None

Description

jlowe and I discussed this while investigating the I/O starvation we have been seeing on our clusters lately (possibly amplified by increased Tez workloads).

If a particular disk is slow, it is very likely that all shuffle netty threads will end up blocked on the read side of sendfile(). (sendfile() is asynchronous on the outbound socket side, but not on the read side.) This slows down the entire shuffle subsystem.

It seems like we could make the netty threads more asynchronous by introducing a small set of threads per local-dir that are responsible for the actual sendfile() invocations.

This would improve not only shuffles that span drives, but also the case of a single large shuffle served from a single local-dir. Other drives could continue serving shuffle requests, AND we would avoid a large number of readers (2X number_of_cores by default) all fighting for the same drive, which is unfair to everything else on the system.
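
To make the idea concrete, here is a rough sketch of the dispatching part only. It is not taken from ShuffleHandler and does not touch netty: the class name PerDirDispatcher, its methods, and the threads-per-dir parameter are all made up for illustration, and the Callable stands in for whatever blocking sendfile()-style transfer the handler would perform. The only point is that each local-dir gets its own small pool, so a transfer stuck on one drive cannot occupy the threads serving another.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: routes blocking file transfers to a small thread pool per local-dir. */
final class PerDirDispatcher implements AutoCloseable {
    // One small fixed-size pool per configured local-dir (the sizing is an assumption).
    private final Map<String, ExecutorService> pools = new LinkedHashMap<>();

    PerDirDispatcher(List<String> localDirs, int threadsPerDir) {
        for (String dir : localDirs) {
            pools.put(normalize(dir), Executors.newFixedThreadPool(threadsPerDir));
        }
    }

    // Ensure a trailing slash so "/data1" does not also match "/data10/...".
    private static String normalize(String dir) {
        return dir.endsWith("/") ? dir : dir + "/";
    }

    /** Returns the configured local-dir (with trailing slash) that contains the path. */
    String localDirFor(String path) {
        String best = null;
        for (String dir : pools.keySet()) {
            // Prefer the longest matching prefix in case local-dirs are nested.
            if (path.startsWith(dir) && (best == null || dir.length() > best.length())) {
                best = dir;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No local-dir configured for " + path);
        }
        return best;
    }

    /**
     * Queues the blocking transfer on the pool that owns the file's local-dir.
     * The caller (e.g. an I/O event thread) returns immediately with a Future.
     */
    <T> Future<T> submit(String path, Callable<T> transfer) {
        return pools.get(localDirFor(path)).submit(transfer);
    }

    @Override
    public void close() {
        // Let already queued transfers finish, but accept no new ones.
        for (ExecutorService pool : pools.values()) {
            pool.shutdown();
        }
    }
}
```

With something like this, a request for a file under /data1/nm-local would go through submit() and queue behind other /data1 transfers only, while requests for /data2/nm-local proceed on their own threads. Keeping threadsPerDir small also caps how many readers can hit one drive at a time, which is the fairness point above.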

People

Assignee: Unassigned

Reporter: Nathan Roberts

Votes: 0

Watchers: 10

Dates

Created: 08/Feb/16 18:38

Updated: 20/Oct/17 21:30