
Key: HADOOP-5746
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Amar Kamat
Reporter: Devaraj Das
Votes: 0
Watchers: 2

Hadoop Common

Errors encountered in MROutputThread after the last map/reduce call can go undetected

Created: 27/Apr/09 03:39 AM   Updated: 08/Jul/09 05:05 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.20.1

Time Tracking:
Not Specified

File Attachments (all licensed for inclusion in ASF works):
  5746-reproduce.1.patch   1 kB     2009-06-03 06:16 AM   Amar Kamat
  5746-testcase.patch      3 kB     2009-06-03 06:16 AM   Amar Kamat
  5746.1.patch             0.7 kB   2009-04-27 06:57 PM   Devaraj Das
  5746.6.patch             0.8 kB   2009-06-03 06:16 AM   Amar Kamat

Hadoop Flags: Reviewed
Release Note:
If the child (streaming) process returned successfully but the MROutputThread threw an error, there was no way to detect it because all IOExceptions were ignored. Such errors can occur when, for example, DFS clients have been closed. Now a check for errors in the threads is made before finishing the task, and an exception is thrown that fails the task.
Resolution Date: 04/Jun/09 12:45 PM


Description
The framework map/reduce bridge methods make a check at the beginning of the respective methods whether MROutputThread encountered an exception while writing keys/values that the streaming process emitted. However, if the exception happens in MROutputThread after the last call to the map/reduce method, the exception goes undetected. An example of such an exception is an exception from the DFSClient that fails to write to a file on the HDFS.
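A minimal sketch of the timing problem described above, not the actual streaming code. The class LateOutputErrorSketch and all of its members are hypothetical stand-ins: a background thread records its error in a field, map() checks that field only on entry, and the thread is made to fail after the last map() call. Only a check performed after joining the thread can surface the error.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for the streaming bridge; not the real Hadoop code. */
class LateOutputErrorSketch {
    // First error recorded by the background output thread, or null.
    private volatile Throwable outputError;
    // Released once the last record has been handed over.
    private final CountDownLatch lastRecordSent = new CountDownLatch(1);
    private Thread outputThread;

    /** Starts an output thread that fails only after the last map call. */
    void start() {
        outputThread = new Thread(() -> {
            try {
                lastRecordSent.await();
                throw new IOException("simulated DFS write failure");
            } catch (Throwable t) {
                outputError = t; // remembered for whoever checks next
            }
        });
        outputThread.start();
    }

    /** Per-record call: only sees errors recorded before it runs. */
    void map(String key, String value) throws IOException {
        if (outputError != null) {
            throw new IOException("output thread failed", outputError);
        }
        // ... the record would be written to the child process here ...
    }

    /** Finishes the task; the final check is what catches a late error. */
    void finish(boolean checkAfterJoin) throws InterruptedException {
        lastRecordSent.countDown();
        outputThread.join();
        if (checkAfterJoin && outputError != null) {
            throw new RuntimeException("output thread failed", outputError);
        }
    }

    /** Runs one simulated task and reports whether the late error surfaced. */
    static boolean lateErrorDetected(boolean checkAfterJoin) {
        LateOutputErrorSketch task = new LateOutputErrorSketch();
        task.start();
        try {
            task.map("k1", "v1");
            task.map("k2", "v2"); // last call: no error has happened yet
            task.finish(checkAfterJoin);
            return false;
        } catch (RuntimeException e) {
            return true;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("unexpected in this sketch", e);
        }
    }
}
```

In this sketch, lateErrorDetected(false) models the behavior reported here (the task appears to succeed), while lateErrorDetected(true) models a final check before the task finishes.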

Devaraj Das added a comment - 27/Apr/09 06:57 PM
Ok here is an early version of the patch (no testcase yet). The patch applies on 0.18 as well.

Amar Kamat added a comment - 03/Jun/09 06:16 AM
Attaching the patch [5746.6.patch]. This bug is very timing-dependent, which makes it hard to reproduce and test. The problem occurs when the pipe process finishes and the output thread then fails with some exception (like fs errors).
Attaching a framework change [5746-reproduce.1.patch] and a testcase [5746-testcase.patch] to verify the fix. The caller of PipeMapRed.waitOutputThreads(), i.e. PipeMapRed.mapRedFinished(), simply ignores IOException, hence I changed the exception to RuntimeException. I don't know why PipeMapRed.mapRedFinished() ignores IOException, but for now I have kept it as it is.
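To illustrate why the exception type matters, here is a rough sketch under stated assumptions. The class SwallowedExceptionSketch and its two methods are hypothetical; they only borrow the names from the comment above and do not reproduce the real PipeMapRed code. A caller that catches and ignores IOException hides a checked failure, while a RuntimeException passes through the same catch block.

```java
import java.io.IOException;

/** Hypothetical illustration; not the real PipeMapRed implementation. */
class SwallowedExceptionSketch {

    /** Stand-in for waiting on the output threads and reporting their failure. */
    static void waitOutputThreads(boolean useRuntimeException) throws IOException {
        if (useRuntimeException) {
            throw new RuntimeException("output thread failed");
        }
        throw new IOException("output thread failed");
    }

    /** Stand-in for a caller that ignores IOException, as described above. */
    static void mapRedFinished(boolean useRuntimeException) {
        try {
            waitOutputThreads(useRuntimeException);
        } catch (IOException ignored) {
            // Swallowed: the task would look successful from here on.
        }
    }

    /** Returns true if the failure reaches the code that called mapRedFinished. */
    static boolean failurePropagates(boolean useRuntimeException) {
        try {
            mapRedFinished(useRuntimeException);
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }
}
```

Here failurePropagates(false) returns false because the IOException is swallowed, and failurePropagates(true) returns true because the unchecked exception is not caught by the IOException handler.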

Amar Kamat added a comment - 04/Jun/09 12:22 PM
Result of test-patch

[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

Streaming tests passed on my box except TestStreamingExitStatus which fails even on trunk.


Devaraj Das added a comment - 04/Jun/09 12:45 PM
I just committed this. Thanks, Amar!