[HADOOP-1152] Reduce task hang failing in MapOutputCopier.copyOutput - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.13.0
Component/s: None
Labels: None

Description

We had a couple of reduce tasks hang, repeating the output below.

2007-03-22 23:57:16,296 WARN org.apache.hadoop.mapred.TaskRunner: java.io.IOException: Path /hadoop/mapred/local/task_0026_r_000307_0/map_7854.out already exists
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.rename(InMemoryFileSystem.java:246)
at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:471)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:336)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)

2007-03-22 23:57:16,296 WARN org.apache.hadoop.mapred.TaskRunner: task_0026_r_000307_0 adding host ______ to penalty box, next contact in 192 seconds

===============================
Before the above output, the log showed:

2007-03-22 18:15:24,274 ERROR org.apache.hadoop.mapred.TaskRunner: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:416)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getLength(InMemoryFileSystem.java:286)
at org.apache.hadoop.fs.FilterFileSystem.getLength(FilterFileSystem.java:178)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:340)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)
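
One possible reading of the two traces: the earlier NullPointerException is thrown at ReduceTaskRunner.java:340, after the rename at line 336, so the destination file may already have been in place when that attempt died. Each later retry would then fail at the rename with "already exists". The sketch below illustrates only that retry pattern. ToyInMemoryFs, copyOutput, countRenameFailures, and the ".tmp" file name are hypothetical stand-ins, not Hadoop's classes or the actual ReduceTaskRunner code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class RenameRetrySketch {

    /** Hypothetical toy file system; NOT Hadoop's InMemoryFileSystem. */
    static class ToyInMemoryFs {
        private final Map<String, byte[]> files = new HashMap<>();

        void create(String path, byte[] data) {
            files.put(path, data);
        }

        /** Refuses to overwrite an existing destination, like the rename in the log. */
        void rename(String src, String dst) throws IOException {
            if (files.containsKey(dst)) {
                throw new IOException("Path " + dst + " already exists");
            }
            byte[] data = files.remove(src);
            if (data == null) {
                throw new IOException("Path " + src + " does not exist");
            }
            files.put(dst, data);
        }
    }

    /** One copy attempt: write a temp file, rename it, then run a step that may fail. */
    static void copyOutput(ToyInMemoryFs fs, String tmp, String dst,
                           boolean failAfterRename) throws IOException {
        fs.create(tmp, new byte[] {1, 2, 3});
        fs.rename(tmp, dst);
        if (failAfterRename) {
            // Assumption: stands in for the NullPointerException seen after the rename.
            throw new NullPointerException("simulated failure after rename");
        }
    }

    /** Runs one attempt that fails after the rename, then counts failing retries. */
    static int countRenameFailures(int retries) {
        ToyInMemoryFs fs = new ToyInMemoryFs();
        String tmp = "map_7854.out.tmp"; // hypothetical temp name
        String dst = "map_7854.out";
        try {
            copyOutput(fs, tmp, dst, true);
        } catch (Exception e) {
            // First attempt dies after the rename; dst is left behind.
        }
        int failures = 0;
        for (int i = 0; i < retries; i++) {
            try {
                copyOutput(fs, tmp, dst, false);
            } catch (IOException e) {
                failures++; // every retry hits "already exists"
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(countRenameFailures(3));
    }
}
```

In this toy model no retry can succeed until the leftover destination is removed, which would match a task that keeps logging the same warning without making progress.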

Attachments

1152.patch, 20/Apr/07 09:24, 1 kB, Tahir Hashmi

People

Assignee: Tahir Hashmi

Reporter: Koji Noguchi

Votes: 0

Watchers: 2

Dates

Created: 23/Mar/07 16:29

Updated: 08/Jul/09 16:52

Resolved: 21/Apr/07 18:57
