Hadoop Map/Reduce
MAPREDUCE-3278

0.20: avoid a busy-loop in ReduceTask scheduling

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 1.1.0
    • Component/s: mrv1, performance, task
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Looking at profiling results, it became clear that the ReduceTask has the following busy-loop, which causes it to consume 100% of a CPU during the fetch phase in some configurations:

      • the number of reduce fetcher threads is configured to be larger than the number of hosts
      • therefore "busyEnough()" never returns true
      • the "scheduling" portion of the code cannot schedule any new fetches, since all of the pending fetches in the mapLocations buffer correspond to hosts that are already being fetched (i.e. hosts already in the uniqueHosts map)
      • getCopyResult() immediately returns null, since there are no completed maps

      Hence the ReduceTask spins back and forth between trying to schedule fetches (and failing) and trying to grab completed results (of which there are none), with no waits in between (see the sketch below).
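
      A minimal standalone sketch of the spin described above; the class and method bodies here are illustrative stand-ins, not the actual ReduceTask internals (only the names busyEnough, mapLocations, uniqueHosts, and getCopyResult come from the code being described):

      import java.util.Queue;
      import java.util.concurrent.ConcurrentLinkedQueue;

      // Illustrative sketch of the fetch loop's busy-wait; not the real ReduceTask code.
      public class BusyFetchLoop {
          private final Queue<String> copyResults = new ConcurrentLinkedQueue<>();

          // Stand-in for busyEnough(): with more fetcher threads configured than
          // hosts, it never reports "busy enough", so the loop keeps polling.
          private boolean busyEnough() { return false; }

          // Stand-in for the scheduling step: every pending location in
          // mapLocations points at a host already in uniqueHosts, so nothing
          // new can be scheduled this iteration.
          private void scheduleNewFetches() { }

          // Stand-in for getCopyResult(): returns null immediately while no map
          // output copy has completed.
          private String getCopyResult() { return copyResults.poll(); }

          public void run() {
              while (true) {
                  if (!busyEnough()) {
                      scheduleNewFetches();   // schedules nothing in this state
                  }
                  String result = getCopyResult();
                  if (result == null) {
                      continue;               // no wait anywhere: spins at ~100% CPU
                  }
                  // ... process the completed copy ...
              }
          }
      }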

      Attachments

      1. mr-3278.txt (4 kB, Todd Lipcon)
      2. reducer-cpu-usage.png (99 kB, Todd Lipcon)

        Activity

        Matt Foley added a comment -

        Closed upon release of Hadoop-1.1.0.

        Todd Lipcon added a comment -

        Committed to branch-0.20-security. Thanks, Eli.

        Eli Collins added a comment -

        +1 Nice find.

        I went over your change a couple of times (I wasn't familiar with ReduceTask) and it looks correct to me. I agree that you should be able to correctly wait on copyResultsOrNewEventsLock without a timeout, but being conservative makes sense: at worst you fall back to the bug that is already present.
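
        The conservative approach described here amounts to a guarded wait with a timeout. A minimal sketch of that pattern, assuming a notification is posted whenever a copy completes or new map-completion events arrive (only the name copyResultsOrNewEventsLock comes from this discussion; everything else is illustrative):

        import java.util.Queue;
        import java.util.concurrent.ConcurrentLinkedQueue;

        // Illustrative sketch of the wait-with-timeout pattern; not the actual patch.
        public class GuardedFetchLoop {
            private final Object copyResultsOrNewEventsLock = new Object();
            private final Queue<String> copyResults = new ConcurrentLinkedQueue<>();

            // Called when a fetcher finishes a copy or new completed-map events arrive.
            public void signalProgress(String completedCopy) {
                if (completedCopy != null) {
                    copyResults.add(completedCopy);
                }
                synchronized (copyResultsOrNewEventsLock) {
                    copyResultsOrNewEventsLock.notifyAll();
                }
            }

            // Consumer side: instead of spinning, block until notified or until a
            // short timeout expires. The timeout is the conservative part: a missed
            // notification degrades to the old polling behavior rather than a hang.
            public String awaitCopyResult() throws InterruptedException {
                synchronized (copyResultsOrNewEventsLock) {
                    String result;
                    while ((result = copyResults.poll()) == null) {
                        copyResultsOrNewEventsLock.wait(2000); // timeout value chosen arbitrarily here
                    }
                    return result;
                }
            }
        }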

        Todd Lipcon added a comment -

        Here's a candidate patch which fixed the CPU spinning on my cluster.

        Worth noting that this problem is more severe when the fetcher thread count is configured higher than the number of nodes. But it still happens even if you have fewer fetchers than nodes, as soon as the number of unique nodes holding map output drops below the number of threads.
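
        A hedged sketch of the gating condition this describes (the signature and parameter names are hypothetical, not the real ReduceTask fields): scheduling only pauses while every copier thread has a distinct host to fetch from, and since at most one fetch runs per unique host at a time, the check can never pass once the unique-host count drops below the thread count.

        // Hypothetical shape of the check; illustrative only.
        boolean busyEnough(int numInFlightFetches, int numCopierThreads) {
            // In-flight fetches are capped by the number of unique hosts still
            // holding map output (one fetch per host at a time), so once that
            // count drops below numCopierThreads this never returns true and
            // the scheduling loop keeps polling.
            return numInFlightFetches >= numCopierThreads;
        }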

        Todd Lipcon added a comment -

        Here's a before-and-after of a node running terasort. On the left (unpatched) terasort you can see the point where the reducers start and eat up a ton of CPU. On the right (patched) terasort, the reducers add more iowait but CPU usage is minimal. top showed the reducers in the fetch stage using ~15% CPU instead of ~105% CPU. Total terasort time improved by roughly 10%. I'll upload a patch after a bit more testing.

        Todd Lipcon added a comment -

        AFAIK this only applies to the 0.20 code. The Shuffle was substantially rewritten for 0.21 by MAPREDUCE-318, which also did a big refactor. This JIRA is for a more targeted bug fix on the stable branch.


          People

          • Assignee: Todd Lipcon
          • Reporter: Todd Lipcon
          • Votes: 0
          • Watchers: 7
