Issue Details (XML | Word | Printable)

Key: HADOOP-1152
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Tahir Hashmi
Reporter: Koji Noguchi
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Reduce task hang failing in MapOutputCopier.copyOutput

Created: 23/Mar/07 04:29 PM   Updated: 08/Jul/09 04:52 PM
Return to search
Component/s: None
Affects Version/s: None
Fix Version/s: 0.13.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works 1152.patch 2007-04-20 09:24 AM Tahir Hashmi 1 kB

Resolution Date: 21/Apr/07 06:57 PM


 Description  « Hide
We had couple of reduce tasks hang repeating the output below.

2007-03-22 23:57:16,296 WARN org.apache.hadoop.mapred.TaskRunner: java.io.IOException: Path /hadoop/mapred/local/task_0026_r_000307_0/map_7854.out already exists
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.rename(InMemoryFileSystem.java:246)
at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:471)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:336)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)

2007-03-22 23:57:16,296 WARN org.apache.hadoop.mapred.TaskRunner: task_0026_r_000307_0 adding host ______ to penalty box, next contact in 192 seconds

===============================
Before the above output, there was

2007-03-22 18:15:24,274 ERROR org.apache.hadoop.mapred.TaskRunner: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:416)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getLength(InMemoryFileSystem.java:286)
at org.apache.hadoop.fs.FilterFileSystem.getLength(FilterFileSystem.java:178)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:340)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Devaraj Das added a comment - 23/Mar/07 05:13 PM
Dug the logs a bit more (for the hung reduces).. Here is another relevant "rename" exception, I found for all the hung reduces; this is for the .crc file. I looked at the code also - and it seems like the first rename is somehow not really updating (or the updates are not visible) the datastructures in the ramfs. So, later on the first getLength() is failing on the final file that rename just attempted to create (in the ramfs datastructure). However, later on, the final filenames become visible in the ramfs and then the renames really fail leading to the chain of IOExceptions... Some weird behavior...

2007-03-22 18:17:28,670 WARN org.apache.hadoop.mapred.TaskRunner: task_0026_r_000307_0 copy failed: task_0026_m_007854_0 from __
2007-03-22 18:17:28,670 WARN org.apache.hadoop.mapred.TaskRunner: java.io.IOException: Path /export/crawlspace3/kryptonite/hadoop/mapred/local/task_0026_r_000307_0/.map_7854.out.crc already exists
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.rename(InMemoryFileSystem.java:246)
at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:480)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:336)
at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:274)


Johan Oskarsson added a comment - 19/Apr/07 11:00 AM
This is happening on a daily basis on our 25 node cluster running 0.12.3 causing serious delays.
I noticed there's no news for about a month so thought I'd ask if a hack would be to just make it overwrite the file that already exists?
I've not had time to look at the code but as long as it doesn't hang the job I'm fine with a minor performance hit until a real solution can be found.

Devaraj Das added a comment - 19/Apr/07 01:03 PM
Actually the problem might not be in the rename. The NPE should not happen in the first place.
Here is a patch that will kill the task if it ever encounters a NPE in the affected part of the code. It will also log the state of the ramfs in terms of the files it currently has and their lengths. It will also log a diagnostic message which you would see in the web ui for the killed task. Please upload those log messages whenever you see tasks failing with this exception.

Tahir Hashmi added a comment - 20/Apr/07 09:24 AM
Looked at this with Devaraj yesterday and our theory about why this fails is that in MapTask.MapOutputCopier.copyOutput(), there's a call to rename a temporary file to the actual .out file. After the rename, another call is made to get the length of the actual file (which is same as that of the temporary file, obviously). Between these calls, if a MergeThread flushes the file to .out file to disk and deletes it, the call to getLength() will fail.

Sameer's suggestion was to simply invoke getLength() on the temporary file before the rename and discard the value in case the rename fails. After the file is renamed, it should be assumed that the local code no longer owns it. 1152.patch has these changes incorporated.


Devaraj Das added a comment - 20/Apr/07 09:38 AM
+1, regardless of the race condition theory, it makes sense to not call getLength on a renamed ramfs file (that the merge thread might have already deleted in the time interval between rename and getLength). But it seems very likely the race condition is caused due to this.

Hadoop QA added a comment - 20/Apr/07 10:47 AM
-1, build or testing failed

2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12355904/1152.patch against trunk revision r530556.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/65/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/65/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.


Tom White added a comment - 21/Apr/07 06:56 PM
I've just committed this. Thanks Tahir!

Hadoop QA added a comment - 22/Apr/07 11:22 AM