Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      To minimize the number of map fetch failures reported by reducers across an NM restart, it would be nice if reducers only reported a fetch failure after trying for a specified period of time to retrieve the data.
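
      A minimal sketch of the idea, assuming a configurable retry window and illustrative
      method names (copyMapOutput, reportFetchFailure) rather than the actual patch code:

          // Sketch only: keep retrying a fetch until a deadline elapses before reporting
          // a fetch failure, so a brief NM restart does not trigger a failure storm.
          void fetchWithRetry(MapHost host) throws InterruptedException {
            final long retryTimeoutMs = 30000L;   // hypothetical tolerated NM restart window
            final long retryIntervalMs = 1000L;   // hypothetical pause between attempts
            final long retryStart = Time.monotonicNow();
            while (true) {
              try {
                copyMapOutput(host);              // hypothetical fetch of remaining map outputs
                return;                           // success: no failure is ever reported
              } catch (IOException e) {
                if (Time.monotonicNow() - retryStart >= retryTimeoutMs) {
                  reportFetchFailure(host, e);    // only report after the whole window elapsed
                  return;
                }
                Thread.sleep(retryIntervalMs);    // give the NM time to come back up
              }
            }
          }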

      Attachments

      1. MAPREDUCE-5891-v6.patch
        28 kB
        Junping Du
      2. MAPREDUCE-5891-v5.patch
        28 kB
        Junping Du
      3. MAPREDUCE-5891-v4.patch
        28 kB
        Junping Du
      4. MAPREDUCE-5891-v3.patch
        28 kB
        Junping Du
      5. MAPREDUCE-5891-v2.patch
        27 kB
        Junping Du
      6. MAPREDUCE-5891.patch
        26 kB
        Junping Du
      7. MAPREDUCE-5891-demo.patch
        14 kB
        Junping Du

          Activity

          Junping Du added a comment -

          Hi Jason Lowe, I'd like to work on this if you haven't started on it already.

          Jason Lowe added a comment -

          Thanks, Junping! I've been busy with other things and haven't started on this yet.

          I had a few ideas around either having the AM apply a "grace period" to reported errors to cover the NM restart window, or having the Fetcher only report shuffle errors after trying for a period of time that would cover the NM restart window. But I haven't had a chance to prototype those ideas yet.

          Junping Du added a comment -

          Thanks, Jason Lowe, for sharing your thoughts. I have a demo patch that works on the fetcher side. It isn't complete yet since it lacks tests, but it would be great if you could review it and provide some feedback on the approach I'm taking.
          In this demo patch:

          • add retry logic in openConnection (previously, we only had retry logic in the actual connect).
          • add retry logic in copyMapOutput. If an IOException is thrown within the tolerated NM downtime, the retry logic will rebuild the connection and skip map tasks that were already shuffled in a previous iteration.
          • refactor some code within copyFromHost() to reuse code as much as possible and keep it concise. (A rough sketch of this retry structure follows below.)
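
          A rough sketch of the structure described above, with hypothetical helper names
          (not the actual demo patch code):

              // Retry around connection setup, then around the copy itself; on an error
              // during the tolerated NM downtime, rebuild the connection and only
              // re-request the maps that have not been shuffled yet.
              openConnectionWithRetry(url);
              Set<TaskAttemptID> remaining = new HashSet<TaskAttemptID>(maps);
              try {
                copyMapOutputs(host, remaining);
              } catch (IOException e) {
                setupConnectionsWithRetry(host, remaining);  // reconnect; URL rebuilt from remaining maps
                copyMapOutputs(host, remaining);             // remaining excludes maps already shuffled
              }
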
          Jason Lowe added a comment -

          Thanks for posting a prototype, Junping! It's similar to what I was thinking. Some initial comments from a brief look at the patch:

          It could be very inefficient to re-shuffle a large map output that we already have. If we get an error and need to reconnect then we should rebuild the URL based on the remaining maps rather than keep using the original URL and throwing away data for maps we already processed. In essence we should never reconnect and ask for what we already have.

          It seems like we should only need one-level of retry rather than two. I think we only need to retry at the top-level where we try to copy map outputs. If copying map outputs fails either because of a severed connection, a connection reset/timeout, or whatever and we have not spent too much time trying to recover from errors for the same map output then we retry. As soon as we successfully shuffle a map then we reset the time (because things are working and we're making progress) and move onto the next map output. I don't think we need to retry at this level and also at the socket connection level. The upper level retry will cause us to automatically retry the lower-level socket connection, unless I'm missing something.

          We probably should use Time.monotonicNow or something similar that won't break the desired delay behavior if the clock changes.
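
          For example (a sketch, not the patch itself), the delay check could be written as:

              // Monotonic time is immune to wall-clock adjustments (NTP, manual changes),
              // so the retry window cannot be cut short or stretched by a clock jump.
              long deadline = Time.monotonicNow() + retryTimeoutMs;
              // fragile alternative: System.currentTimeMillis() + retryTimeoutMs
              boolean expired = Time.monotonicNow() >= deadline;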

          "miliseconds" should be "milliseconds"

          Junping Du added a comment -

          Thanks Jason Lowe for the quick feedback!

          If we get an error and need to reconnect then we should rebuild the URL based on the remaining maps rather than keep using the original URL and throwing away data for maps we already processed.

          That's a very good point, will fix it in a later patch.

          It seems like we should only need one-level of retry rather than two. I think we only need to retry at the top-level where we try to copy map outputs. If copying map outputs fails either because of a severed connection, a connection reset/timeout, or whatever and we have not spent too much time trying to recover from errors for the same map output then we retry.

          One level of retry sounds good. Actually, the current implementation is mostly a one-level retry; it just retries the connection and copyMapOutput separately. The reason I chose to handle the two separately is that the connection retry can hit some IOExceptions (e.g. failures in verifyConnection() that are not caused by an NM restart) that we shouldn't retry at all. Wrapping that kind of exception inside the retry seems unnecessary. Thoughts?

          We probably should use Time.monotonicNow or something similar that won't break the desired delay behavior if the clock changes. "miliseconds" should be "milliseconds"

          Also good points, will fix them in a later patch.

          Jason Lowe added a comment -

          The reason I chose to handle the two separately is that the connection retry can hit some IOExceptions (e.g. failures in verifyConnection() that are not caused by an NM restart) that we shouldn't retry at all. Wrapping that kind of exception inside the retry seems unnecessary. Thoughts?

          Ah, yes good point. We shouldn't retry exceptions that we know aren't related to the NM bouncing, like secret hash verification failures.

          I was under the impression that the copyMapOutput retry could cause a reconnect which itself would have retries. If that's not the case then there's no issue with nested retries.

          Junping Du added a comment -

          Thanks Jason Lowe for the review! I just updated the patch, addressing most of your previous comments and adding a unit test. Please help review it again, thanks!

          I was under the impression that the copyMapOutput retry could cause a reconnect which itself would have retries. If that's not the case then there's no issue with nested retries.

          If copyMapOutput throws an exception during an NM restart, it will first go to reconnect with retry, since in most cases the connect will fail unless we tolerate some time waiting for the NM to recover. We could also give up retrying in connect, but the logic would be more complex, something like the following, which may not be necessary:

          while (...) {
            try {
              failedTasks = copyMapOutput(...);
            } catch (IOException e) {
              try {
                connect(...);
              } catch (IOException e2) {
                // do nothing, go back to the loop and retry copyMapOutput.
              }
            }
          }

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12664422/MAPREDUCE-5891.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4825//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4825//console

          This message is automatically generated.

          Jason Lowe added a comment -

          Thanks for updating the patch, Junping. Comments:

          DEFAULT_SHUFFLE_FETCH_* should be in MRJobConfig

          openConnectionWithRetry should not ignore InterruptedException. Fetchers are shutdown by being interrupted, so I think minimally we should check for stopped==true if one occurs and act accordingly.

          We log a WARN when we can retry but only an INFO when we failed to read a map header and are not retrying. That seems backwards. Also the message logged when we can't retry is a lot more informative than the one when we can.

          We are retrying one more time when we're past the retry timeout which could result in a significantly longer time to discover fetch failures that aren't NM restart-related. This is also inconsistent with how openConnectionWithRetry behaves.

          Only the retry enabled property was added to mapred-default.xml. We should also add the other two properties with their defaults and appropriate descriptions for documentation.

          There should be a unit test to verify fetch errors can still be reported even with retry enabled, as it's important that we don't break the ability to recover from errors not related to NM restart.

          Nit: mapreduce.reduce.shuffle.fetch.interval-ms should be mapreduce.reduce.shuffle.fetch.retry.interval-ms to clearly indicate this is an interval only applicable for fetch retry. Similarly mapreduce.reduce.shuffle.fetch.timeout-ms should be mapreduce.reduce.shuffle.fetch.retry.timeout-ms.

          Nit: "which means it haven't retried yet." should be "which means it hasn't retried yet."
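
          For reference, a sketch of how the Fetcher might read the renamed retry keys; the
          default values shown here are placeholders, not the ones from the patch:

              boolean fetchRetryEnabled = job.getBoolean(
                  "mapreduce.reduce.shuffle.fetch.retry.enabled", true);
              int fetchRetryIntervalMs = job.getInt(
                  "mapreduce.reduce.shuffle.fetch.retry.interval-ms", 1000);
              int fetchRetryTimeoutMs = job.getInt(
                  "mapreduce.reduce.shuffle.fetch.retry.timeout-ms", 30000);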

          Junping Du added a comment -

          Thanks Jason Lowe for the review and comments! In the v2 patch, I addressed all your comments.

          We are retrying one more time when we're past the retry timeout which could result in a significantly longer time to discover fetch failures that aren't NM restart-related. This is also inconsistent with how openConnectionWithRetry behaves.

          Nice catch. Moved the timeout judgment inside copyMapOutput to decide whether to throw an exception for retry (before the timeout) or fail (at or after the timeout).

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12664795/MAPREDUCE-5891-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4829//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4829//console

          This message is automatically generated.

          Jason Lowe added a comment -

          Thanks for updating the patch! Comments:

          SHUFFLE_FETCH_TIMEOUT_MS = "mapreduce.reduce.shuffle.fetch.timeout-ms" but it should be "mapreduce.reduce.shuffle.fetch.retry.timeout-ms"

          openConnectionWithRetry calls abortConnect if stopped, but the one caller of this function does the same thing when it returns. Maybe openConnectionWithRetry should just return if stopped?

          Nit: The code block in copyMapOutput's catch of IOException is getting really long. It would be good to refactor some of this code into methods

          Minor nit: "get failed" should be "failed".

          openConnectionWithRetry is being called and retries even if fetch retry is disabled

          Shouldn't we be setting retryStartTime back to zero instead of endTime below? Otherwise the next error could timeout without any retry if the transfer before the error took longer than the timeout interval.

                // Refresh retryStartTime as map task make progress if retried before.
                if (retryStartTime != 0) {
                  retryStartTime = endTime;
                }
          

          Also wondering if we should reset it after each successful transfer (e.g.: after a successful header parse and successful shuffle)?

          Junping Du added a comment -

          Thanks Jason Lowe for the comments!

          SHUFFLE_FETCH_TIMEOUT_MS should be "mapreduce.reduce.shuffle.fetch.retry.timeout-ms"

          Nice catch, done.

          openConnectionWithRetry calls abortConnect if stopped, but the one caller of this function does the same thing when it returns. Maybe openConnectionWithRetry should just return if stopped?

          Yes. The caller can even return directly, since the caller from the upper layer already handles it. Fixed.

          Nit: The code block in copyMapOutput's catch of IOException is getting really long. It would be good to refactor some of this code into methods. Minor nit: "get failed" should be "failed".

          Done.

          openConnectionWithRetry is being called and retries even if fetch retry is disabled

          Good point, fixed.

          Shouldn't we be setting retryStartTime back to zero instead of endTime below?

          Also good one, fixed it.

          Also wondering if we should reset it after each successful transfer (e.g.: after a successful header parse and successful shuffle)?

          That may not be necessary. If retryStartTime is not 0, this fetcher hasn't successfully made any progress since the last getMapOutput failure, so it should keep retrying and let the wait time accumulate until the timeout.
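
          A sketch of the bookkeeping this describes, with illustrative field names:

              // retryStartTime == 0 means no outstanding failure; it is stamped on the first
              // error and cleared again only once a map output is shuffled successfully.
              if (retryStartTime == 0) {
                retryStartTime = Time.monotonicNow();       // first failure since last progress
              }
              if (Time.monotonicNow() - retryStartTime > fetchRetryTimeoutMs) {
                failMaps(remaining);                        // window exhausted: report the failure
              }
              // otherwise keep retrying; after a successful shuffle: retryStartTime = 0;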

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12665542/MAPREDUCE-5891-v3.patch
          against trunk revision 270a271.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4838//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12665542/MAPREDUCE-5891-v3.patch
          against trunk revision 258c7d0.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4839//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4839//console

          This message is automatically generated.

          Ming Ma added a comment -

          Thanks, Junping, Jason for the useful patch.

          In the case where slowstart is set to some small value, the reducer will fetch some mapper output and wait for the rest. Is it possible that Fetcher.retryStartTime is set to some old value due to an earlier restart of NM host A, and thus the fetcher retry is marked as timed out when it later tries to handle the restart of NM host B?

          To make sure the fetcher doesn't unnecessarily retry in the decommission scenario, it seems the assumption is that we will have some sort of graceful decommission support so that during the decommission process the fetcher will still be able to get mapper output. Is that true?

          If we get time to do YARN-1593, that will further reduce the chance of shuffle handler restart. Any opinion on that?

          Jason Lowe added a comment -

          Thanks for updating the patch, Junping. Comments:

          I'm not sure the following log message adds that much value given the exception is going to be logged just a bit later. A different log message appears when retry is enabled, so I think this just adds to the log length without adding a lot of valuable information.

          +        if (!fetchRetryEnabled) {
          +           LOG.warn("Failed to connect to host: " + url + 
          +                    " when retry is not enabled");
          +           throw e;
          +        }
          

          setupConnectionsWithRetry is now inconsistent when it comes to calling abortConnect() when stopped is true.

          Nit: The following code should simply be retryStartTime = 0;

          +      if (retryStartTime != 0) {
          +        retryStartTime = 0;
          +      }
          

          And to address some of Ming's comments:

          Is it possible Fetcher.retryStartTime is set to some old value due to early NM host A restart, and thus mark fetcher retry timed out when it later tries to handle NM host B restart?

          Yes, that is totally possible. Nice catch, Ming!
          @Junping, the code needs to set retryStartTime to zero at the beginning of copyFromHost as well to make sure remnants from previous failures don't leak into new, fresh attempts.

          To make sure fetcher doesn't unnecessarily retry for the decommission scenario, it seems the assumption is we will have some sort of graceful decommission support so that during decommission process the fetcher will still be able to get mapper output.

          Yes, see YARN-914. If the decommission still takes place even though the application is still running and output still needs to be fetched from that node, the RM will inform the AM of the node being removed from the cluster. The AM will then inform the reducers that the map outputs on that node are obsolete and will reschedule map tasks on other nodes rather than have the reducers keep trying against a decomm'd node.

          If we get time to do YARN-1593, that will further reduce the chance of shuffle handler restart. Any opinion on that?

          Yes that would be quite nice, although that seems like a very significant feature to implement in the 2.6 timeframe. It creates new security, reacquisition, and failure issues, and while it does significantly reduce the scenarios where the ShuffleHandler will need to be restarted it doesn't, by itself, preclude the need to do so. For example, if the ShuffleHandler has a bugfix that needs to be deployed we either need the ability for MapReduce to request different versions of an aux service and have multiple versions running simultaneously or we need to restart the ShuffleHandler service. The latter requires workarounds like this JIRA to avoid potential fetch failure storms when the ShuffleHandler service is down temporarily.

          Junping Du added a comment -

          Thanks for the comments, Ming Ma and Jason Lowe!

          In the case slowstart is set to some small value, the reducer will fetch some mapper output and wait for the rest. Is it possible Fetcher.retryStartTime is set to some old value due to early NM host A restart, and thus mark fetcher retry timed out when it later tries to handle NM host B restart?

          Nice catch! Fixed per Jason's suggestion below.

          so I think this just adds to the log length without adding a lot of valuable information

          Agreed. Removed the log.

          Nit: The following code should simply be retryStartTime = 0;

          Fixed.

          setupConnectionsWithRetry is now inconsistent when it comes to calling abortConnect() when stopped is true.

          Good point. Fixed.

          Also, I agree with Jason Lowe's comments above on YARN-914 and YARN-1593.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12666395/MAPREDUCE-5891-v4.patch
          against trunk revision 3a0142b.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4846//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4846//console

          This message is automatically generated.

          Junping Du added a comment -

          Hi Jason Lowe and Ming Ma, any additional comments on the latest patch?

          Ming Ma added a comment -

          Thanks, Junping. Regarding the default value, mapreduce.reduce.shuffle.fetch.retry.enabled is set to true by default, while NM recovery is set to false by default. That means that by default the fetcher will retry even though the ShuffleHandler won't be able to serve mapper outputs after a restart. It doesn't seem like a big deal; I just want to call out whether that is intentional. Do we foresee other scenarios where fetch retry will be useful? If not, reducers could ask YARN whether NM recovery is enabled, or ask the ShuffleHandler whether recovery is enabled, without defining this retry property.

          Jason Lowe added a comment -

          Thanks for updating the patch, Junping, and sorry for the delay in re-review. The fixes all look fine.

          I agree with Ming that we should be consistent about the default state of this feature and NM restart, although I'm not a fan of adding a YARN API to query NM restart. Task containers currently don't talk with the NM, and IMHO this is not a good enough reason to change that. I'm OK with adding it to the shuffle protocol if we can do it in a backwards-compatible way, although I don't know offhand how that would be accomplished. Another approach is to try to tie the two properties together and have the default value of mapreduce.reduce.shuffle.fetch.retry.enabled in mapred-default.xml be ${yarn.nodemanager.recovery.enabled}, so they could still be set independently but by default the NM restart setting drives the fetch retry setting.
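
          A sketch of that fallback (the property names are the ones discussed here; the constant
          names and defaults in the real patch may differ):

              // If the MR-level retry flag is unset, fall back to the NM recovery flag.
              boolean defaultRetry = conf.getBoolean(
                  "yarn.nodemanager.recovery.enabled", false);
              boolean fetchRetryEnabled = conf.getBoolean(
                  "mapreduce.reduce.shuffle.fetch.retry.enabled", defaultRetry);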

          Ming Ma added a comment -

          The patch looks good. I like Jason's idea to have mapreduce.reduce.shuffle.fetch.retry.enabled use ${yarn.nodemanager.recovery.enabled} as the default value. As for the other approaches,

          a) Dynamic MR-to-YARN query: given that the NM recovery flag is a global cluster-level setting (although it is possible to configure it on a per-NM basis), can we derive the value of mapreduce.reduce.shuffle.fetch.retry.enabled at job submission time from some YARN API call to the RM?

          b) Shuffle protocol change: it seems Fetcher and ShuffleHandler check HTTP headers via property key names. So if we add a new property to indicate whether recovery is supported and keep the same HTTP "version" property, a new version of the Fetcher might be able to work with an old version of the ShuffleHandler, and vice versa.

          Jason Lowe added a comment -

          a) dynamic MR to YARN query, given NM recovery flag is a global cluster level setting ( although it is possible to config it on per NM basis ), can we derive the value of mapreduce.reduce.shuffle.fetch.retry.enabled at job submission time from some YARN API call to RM?

          The RM is unaware of whether the NM supports work-preserving restart, and I'd rather not add that coupling just for this.

          b) shuffle protocol change. It seems Fetcher and ShuffleHandler check http header via property key names. So if we add a new property to indicate if recovery is supported and continue to keep the same http "version" property, new version of fetcher might be able to work with old version of shufflehandler, and vise versa.

          True, we could add a new HTTP header that new Fetchers could query.
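
          A sketch of what that could look like on the Fetcher side; the header name here is
          made up purely for illustration:

              // An old ShuffleHandler would not set the header, so getHeaderField returns null
              // and parseBoolean(null) yields false, keeping the behavior backwards-compatible.
              String recoveryHeader = connection.getHeaderField("ShuffleRecoveryEnabled");
              boolean serverSupportsRecovery = Boolean.parseBoolean(recoveryHeader);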

          Junping Du added a comment -

          Sorry for coming back a little late, I have been busy traveling recently. I will deliver a quick fix for the default value - I prefer to use ${yarn.nodemanager.recovery.enabled} as the default since that seems to be the easiest way for the short term. Thanks Jason Lowe and Ming Ma for the review and the good comments here!

          Junping Du added a comment -

          Fixed the default value as discussed above in the v5 patch.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12669470/MAPREDUCE-5891-v5.patch
          against trunk revision ea4e2e8.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4890//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4890//console

          This message is automatically generated.

          Jason Lowe added a comment -

          Thanks for updating the patch, Junping.

          The property in mapred-default.xml should be updated to use ${yarn.nodemanager.recovery.enabled}, otherwise in practice we'll never fallback to the code default that tries to lookup from the NM property.

          Nit: huffleFetchEnabledDefault should be shuffleFetchEnabledDefault

          Ming Ma do you have additional comments or concerns?

          Junping Du added a comment -

          Thanks Jason Lowe for reviewing the patch! Fixed it per your comments above in the v6 patch.

          Ming Ma added a comment -

          Junping, Jason, the patch looks good to me.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12669804/MAPREDUCE-5891-v6.patch
          against trunk revision 1cf3198.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4895//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4895//console

          This message is automatically generated.

          Jason Lowe added a comment -

          +1 lgtm. Committing this.

          Jason Lowe added a comment -

          Thanks to Junping for the contribution and to Ming for additional review! I committed this to trunk and branch-2.

          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #685 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/685/)
          MAPREDUCE-5891. Improved shuffle error handling across NM restarts. Contributed by Junping Du (jlowe: rev 2c3da25fd718b3a9c1ed67f05b577975ae613f4e)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/CHANGES.txt
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1901 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1901/)
          MAPREDUCE-5891. Improved shuffle error handling across NM restarts. Contributed by Junping Du (jlowe: rev 2c3da25fd718b3a9c1ed67f05b577975ae613f4e)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1876 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1876/)
          MAPREDUCE-5891. Improved shuffle error handling across NM restarts. Contributed by Junping Du (jlowe: rev 2c3da25fd718b3a9c1ed67f05b577975ae613f4e)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
          • hadoop-mapreduce-project/CHANGES.txt

            People

            • Assignee: Junping Du
            • Reporter: Jason Lowe
            • Votes: 0
            • Watchers: 12

              Dates

              • Created:
                Updated:
                Resolved:
