[MAPREDUCE-2187] map tasks timeout during sorting - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.20.2, 0.20.205.0
Fix Version/s: 0.20.205.0
Component/s: None
Labels: None

Release Note:
I just committed this. Thanks Anupam!

Description

During the execution of a large job, map tasks time out:

INFO mapred.JobClient: Task Id : attempt_201010290414_60974_m_000057_1, Status : FAILED
Task attempt_201010290414_60974_m_000057_1 failed to report status for 609 seconds. Killing!

The mapper itself has already finished; according to the logs, the timeout occurs during the merge-sort phase.
The intermediate data generated by the map task is quite large, which I think is the cause.

The logs show that the merge sort had been running for 10 minutes when the task was killed.
I think mapred.Merger should call Reporter.progress() at some point during the merge so that the task keeps reporting status.
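
A minimal sketch of the suggestion above, not the committed patch: a k-way merge that pings a progress callback every N records, so that a long merge does not look like a hung task. The Progressable interface here is a local stand-in modeled on Hadoop's org.apache.hadoop.util.Progressable (which Reporter extends); ProgressMergeSketch, merge, and reportEvery are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

class ProgressMergeSketch {

    /** Local stand-in for Hadoop's Progressable (assumption: not the real class). */
    public interface Progressable {
        void progress();
    }

    /** One sorted input run together with its current head element. */
    private static final class Segment implements Comparable<Segment> {
        private final Iterator<Integer> it;
        private int head;

        Segment(Iterator<Integer> it) {
            this.it = it;
            this.head = it.next(); // caller guarantees the run is non-empty
        }

        /** Moves to the next element; returns false when the run is exhausted. */
        boolean advance() {
            if (!it.hasNext()) {
                return false;
            }
            head = it.next();
            return true;
        }

        @Override
        public int compareTo(Segment other) {
            return Integer.compare(head, other.head);
        }
    }

    /** Merges sorted runs and calls reporter.progress() once per reportEvery records. */
    public static List<Integer> merge(List<List<Integer>> sortedRuns,
                                      Progressable reporter,
                                      int reportEvery) {
        if (reportEvery <= 0) {
            throw new IllegalArgumentException("reportEvery must be positive");
        }
        PriorityQueue<Segment> heap = new PriorityQueue<>();
        for (List<Integer> run : sortedRuns) {
            if (!run.isEmpty()) {
                heap.add(new Segment(run.iterator()));
            }
        }
        List<Integer> out = new ArrayList<>();
        long merged = 0;
        while (!heap.isEmpty()) {
            Segment smallest = heap.poll();
            out.add(smallest.head);
            if (smallest.advance()) {
                heap.add(smallest); // re-insert with its new head
            }
            // Report liveness periodically instead of staying silent until the merge ends.
            if (++merged % reportEvery == 0) {
                reporter.progress();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<List<Integer>> runs = Arrays.asList(
                Arrays.asList(1, 4, 7), Arrays.asList(2, 5, 8), Arrays.asList(3, 6, 9));
        List<Integer> merged = merge(runs, () -> calls[0]++, 3);
        System.out.println(merged + " progress calls: " + calls[0]);
    }
}
```

Reporting every N records rather than on every record keeps the callback overhead small; the right interval in a real merger would depend on record size and the configured task timeout.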

Attachments

MAPREDUCE-2187-trunk-v3.patch (01/Aug/11 22:16, 8 kB, Anupam Seth)
MAPREDUCE-2187-trunk-v2.patch (30/Jul/11 02:21, 7 kB, Anupam Seth)
MAPREDUCE-2187-trunk.patch (07/Jun/11 22:00, 6 kB, Anupam Seth)
MAPREDUCE-2187-MR-279-v2.patch (30/Jul/11 02:20, 8 kB, Anupam Seth)
MAPREDUCE-2187-branch-MR-279.patch (22/Jun/11 14:04, 7 kB, Anupam Seth)
MAPREDUCE-2187-22.patch (07/Jun/11 22:07, 6 kB, Anupam Seth)
MAPREDUCE-2187-20-security-v2.patch (30/Jul/11 02:19, 7 kB, Anupam Seth)
MAPREDUCE-2187-20-security.patch (07/Jun/11 17:02, 6 kB, Anupam Seth)

Issue Links

relates to: MAPREDUCE-2177 The wait for spill completion should call Condition.awaitNanos(long nanosTimeout) (Open)

People

Assignee: Anupam Seth

Reporter: Gianmarco De Francisci Morales

Votes: 1

Watchers: 9

Dates

Created: 15/Nov/10 17:29

Updated: 19/Oct/11 00:26

Resolved: 01/Aug/11 22:53