[HADOOP-2437] final map output not evenly distributed across multiple disks - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 0.16.0
Fix Version/s: 0.15.2
Component/s: None
Labels: None

Description

It seems that the output location for the final merge output of a job's map tasks is not selected randomly.

As a result, a job with many map tasks eventually runs out of TaskTrackers asking for more tasks, because the disk holding most of the map outputs ends up with less free space than the threshold set by mapred.local.dir.minspacestart.

Perhaps the starting point of the round-robin selection among the multiple locations should be randomized.
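
The suggestion above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the attached patches or Hadoop's actual allocator: the class RandomStartRoundRobin, its freeBytes bookkeeping, and the pick method are hypothetical names. Each instance (think: each task) starts its rotation at a random index, so independent tasks do not all favor the same first location, and locations without enough room are skipped.

```java
import java.util.Random;

/** Hypothetical sketch: round-robin over local dirs with a randomized start. */
final class RandomStartRoundRobin {
    private final long[] freeBytes; // assumed free space per location (illustrative bookkeeping)
    private int next;               // index of the next location to try

    RandomStartRoundRobin(long[] freeBytes, Random rng) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("need at least one location");
        }
        this.freeBytes = freeBytes.clone();
        // Randomize where the rotation begins instead of always starting at index 0.
        this.next = rng.nextInt(freeBytes.length);
    }

    /** Returns the index of the chosen location, or -1 if none has enough room. */
    int pick(long size) {
        for (int tried = 0; tried < freeBytes.length; tried++) {
            int i = next;
            next = (next + 1) % freeBytes.length; // advance the rotation on every attempt
            if (freeBytes[i] >= size) {
                freeBytes[i] -= size;             // account for the space just handed out
                return i;
            }
        }
        return -1;
    }
}
```

With equal free space on every location, repeated calls to pick spread outputs evenly regardless of the random start; the randomization matters when many short-lived allocators each make only one or two selections.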

In our case:
110,000 maps, each about 3GB final output, on a 1300 node cluster.
After about 79,000 maps had been processed, the final map outputs ('file.out') were distributed across the 4 locations as follows:
location1: 24,000
location2: 25
location3: 55,000
location4: 7
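
To make the skew in these counts concrete, the share of outputs per location can be computed as below. This is a small arithmetic sketch only; the class OutputSkew and its shares method are hypothetical helpers, and the input numbers are the four counts reported above.

```java
/** Hypothetical helper: fraction of map outputs that landed on each location. */
final class OutputSkew {
    static double[] shares(long[] counts) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        double[] result = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            result[i] = (double) counts[i] / total; // share of all outputs on location i
        }
        return result;
    }

    public static void main(String[] args) {
        // Counts for location1..location4 as reported in this issue.
        long[] counts = {24_000, 25, 55_000, 7};
        double[] s = shares(counts);
        for (int i = 0; i < s.length; i++) {
            System.out.printf("location%d: %.2f%%%n", i + 1, 100 * s[i]);
        }
    }
}
```

An even spread would put roughly a quarter of the outputs on each location; here two locations hold nearly all of them.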

Attachments

HADOOP-2437_1_20071218.patch (19/Dec/07 09:11, 0.9 kB, Arun Murthy)
HADOOP-2437_1_20071218.patch (18/Dec/07 19:17, 1 kB, Arun Murthy)
HADOOP-2437_2_20071220.patch (19/Dec/07 19:03, 4 kB, Arun Murthy)

Issue Links

relates to

HDFS-325 DFS should not use round robin policy in determing on which volume (file system partition) to allocate for the next block

Reopened

People

Assignee: Arun Murthy

Reporter: Christian Kunz

Votes: 0

Watchers: 0

Dates

Created: 15/Dec/07 23:56

Updated: 08/Jul/09 16:52

Resolved: 20/Dec/07 11:50