Hadoop Map/Reduce
MAPREDUCE-1521

Protection against incorrectly configured reduces


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.1
    • Component/s: jobtracker
    • Labels: None

    Description

      We've seen a fair number of instances where naive users process huge data-sets (>10TB) with a badly mis-configured number of reduces, e.g. a single reduce.

      This is a significant problem on large clusters, since each attempt of the reduce takes a long time to shuffle and then runs into problems such as exhausting local disk space; the job then repeats this for all 4 attempts before finally failing.

      Proposal: Come up with heuristics/configs to fail such jobs early.

      Thoughts?
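
      As a rough illustration of the kind of heuristic being proposed, the sketch below fails a job early when the projected input per reduce exceeds a configurable limit. The class, method, and config key names (e.g. mapreduce.reduce.input.limit) are assumptions for illustration only, not the committed patch.

      // Illustrative sketch only: a check run before any reduce is scheduled.
      // The config key and the estimator input are assumed names, not actual Hadoop APIs.
      public class ReduceInputLimitCheck {

        /** Assumed config key; a value of -1 disables the check. */
        public static final String REDUCE_INPUT_LIMIT_KEY = "mapreduce.reduce.input.limit";
        public static final long DISABLED = -1L;

        /**
         * @param estimatedMapOutputBytes total map output bytes projected by the
         *                                job's resource estimator
         * @param numReduces              configured number of reduces for the job
         * @param limitBytes              maximum bytes a single reduce may be asked to shuffle
         * @return true if the job should be failed before any reduce is scheduled
         */
        public static boolean exceedsReduceInputLimit(long estimatedMapOutputBytes,
                                                      int numReduces,
                                                      long limitBytes) {
          if (limitBytes == DISABLED || numReduces <= 0) {
            return false;                 // check disabled, or a map-only job
          }
          // Divide before comparing so very large estimates cannot overflow long arithmetic.
          long perReduceBytes = estimatedMapOutputBytes / numReduces;
          return perReduceBytes > limitBytes;
        }
      }

      A JobInProgress could run such a check once the map-output estimate stabilizes and fail the job with a message telling the user to raise the number of reduces or the configured limit; the exact hook point and messaging are left open here.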

      Attachments

        1. MAPREDUCE-1521-0.20-yahoo.patch
          12 kB
          Mahadev Konar
        2. MAPREDUCE-1521-0.20-yahoo.patch
          11 kB
          Mahadev Konar
        3. MAPREDUCE-1521-0.20-yahoo.patch
          11 kB
          Mahadev Konar
        4. MAPREDUCE-1521-0.20-yahoo.patch
          9 kB
          Mahadev Konar
        5. MAPREDUCE-1521-0.20-yahoo.patch
          3 kB
          Mahadev Konar
        6. MAPREDUCE-1521-trunk.patch
          13 kB
          Mahadev Konar
        7. resourceestimator-threshold.txt
          2 kB
          Todd Lipcon
        8. resourcestimator-overflow.txt
          1 kB
          Todd Lipcon


            People

              Assignee:
              Mahadev Konar
              Reporter:
              Arun Murthy
              Votes:
              0
              Watchers:
              13

              Dates

                Created:
                Updated:
                Resolved: