Hadoop Common / HADOOP-5746

Errors encountered in MROutputThread after the last map/reduce call can go undetected

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      If the child (streaming) process returned successfully and the MROutputThread threw an error, there was no way to detect that, as all IOExceptions were ignored. Such issues can occur when, for example, DFS clients have been closed. Now a check for errors (in threads) is made before finishing off the task, and an exception is thrown that fails the task.

      Description

      The framework map/reduce bridge methods check, at the beginning of each call, whether MROutputThread encountered an exception while writing the keys/values emitted by the streaming process. However, if the exception happens in MROutputThread after the last call to the map/reduce method, it goes undetected. One example is an exception from the DFSClient when it fails to write to a file on HDFS.
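
The timing problem described above can be sketched as follows. This is only an illustration under stated assumptions: `Bridge`, `outputError`, `map`, `finish` and `taskFails` are hypothetical names, not the real PipeMapRed code, and a latch is used to force the output thread to fail only after the last `map` call.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;

public class DeferredErrorCheck {

    /** Hypothetical stand-in for the streaming bridge; not the real PipeMapRed. */
    static class Bridge {
        // Error recorded by the background output thread, if any.
        private volatile Throwable outputError;
        // Released once the last map() call has been made.
        private final CountDownLatch lastCallDone = new CountDownLatch(1);
        private final Thread outputThread;

        Bridge() {
            outputThread = new Thread(() -> {
                try {
                    // Simulate a failure that happens only after the last map() call.
                    lastCallDone.await();
                    throw new IOException("simulated DFS write failure");
                } catch (Throwable t) {
                    outputError = t;
                }
            });
            outputThread.start();
        }

        /** Checks for an output-thread error at the start of every call. */
        void map(String record) throws IOException {
            if (outputError != null) {
                throw new IOException("output thread failed", outputError);
            }
            // Forwarding the record to the child process is omitted here.
        }

        /** With checkErrors == false, a late failure is silently lost. */
        void finish(boolean checkErrors) throws InterruptedException {
            lastCallDone.countDown();
            outputThread.join();
            if (checkErrors && outputError != null) {
                throw new RuntimeException("output thread failed", outputError);
            }
        }
    }

    /** Returns true if the simulated task ends with a failure. */
    static boolean taskFails(boolean checkErrors) {
        Bridge bridge = new Bridge();
        try {
            bridge.map("k1\tv1");
            bridge.map("k2\tv2");
            bridge.finish(checkErrors);
            return false;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("not expected in this sketch", e);
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("without final check, task fails: " + taskFails(false));
        System.out.println("with final check, task fails: " + taskFails(true));
    }
}
```

Without the final check the per-call checks never see the error, because it is recorded only after the last call. With the check in `finish`, the same error fails the task.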

      1. 5746-testcase.patch
        3 kB
        Amar Kamat
      2. 5746-reproduce.1.patch
        1 kB
        Amar Kamat
      3. 5746.6.patch
        0.8 kB
        Amar Kamat
      4. 5746.1.patch
        0.7 kB
        Devaraj Das

        Activity

        Devaraj Das created issue -
        Devaraj Das added a comment -

        Ok here is an early version of the patch (no testcase yet). The patch applies on 0.18 as well.

        Devaraj Das made changes -
        Field Original Value New Value
        Attachment 5746.1.patch [ 12406555 ]
        Amar Kamat added a comment -

        Attaching the patch [5746.6.patch]. Reproducing and testing this bug looks very timing-dependent. The problem occurs when the pipe process finishes and the output thread then fails with an exception (such as a filesystem error).
        Attaching a framework change [5746-reproduce.1.patch] and a testcase [5746-testcase.patch] to verify the fix. The caller of PipeMapRed.waitOutputThreads(), i.e. PipeMapRed.mapRedFinished(), simply ignores IOException, so I changed the exception to RuntimeException. I don't know why PipeMapRed.mapRedFinished() ignores IOException, but for now I have kept it as it is.
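
A minimal sketch of why the exception type matters here, assuming simplified stand-ins for the two PipeMapRed methods named above. The real signatures and bodies are not shown in this issue, so the `unchecked` parameter and the `failureReachesCaller` helper are purely illustrative.

```java
import java.io.IOException;

public class SwallowedException {

    /** Hypothetical stand-in: reports an output-thread failure in one of two ways. */
    static void waitOutputThreads(boolean unchecked) throws IOException {
        if (unchecked) {
            throw new RuntimeException("output thread failed");
        }
        throw new IOException("output thread failed");
    }

    /** Hypothetical stand-in for a caller that ignores IOException. */
    static void mapRedFinished(boolean unchecked) {
        try {
            waitOutputThreads(unchecked);
        } catch (IOException ignored) {
            // Swallowed: a checked IOException never leaves this method.
        }
    }

    /** Returns true if the failure propagates out of mapRedFinished(). */
    static boolean failureReachesCaller(boolean unchecked) {
        try {
            mapRedFinished(unchecked);
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("IOException reaches caller: " + failureReachesCaller(false));
        System.out.println("RuntimeException reaches caller: " + failureReachesCaller(true));
    }
}
```

The unchecked exception bypasses the `catch (IOException ...)` block, so it can fail the task without changing how the caller handles IOException.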

        Amar Kamat made changes -
        Attachment 5746.6.patch [ 12409736 ]
        Attachment 5746-reproduce.1.patch [ 12409737 ]
        Attachment 5746-testcase.patch [ 12409738 ]
        Amar Kamat added a comment -

        Result of test-patch:

        [exec] -1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
        [exec] Please justify why no tests are needed for this patch.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        Streaming tests passed on my box, except TestStreamingExitStatus, which fails even on trunk.

        Devaraj Das added a comment -

        I just committed this. Thanks, Amar!

        Devaraj Das made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Assignee Amar Kamat [ amar_kamat ]
        Fix Version/s 0.20.1 [ 12313866 ]
        Fix Version/s 0.21.0 [ 12313563 ]
        Resolution Fixed [ 1 ]
        Amar Kamat made changes -
        Release Note If the child (streaming) process returned successfully and the MROutputThread threw an error, there was no way to detect that, as all IOExceptions were ignored. Such issues can occur when, for example, DFS clients have been closed. Now a check for errors (in threads) is made before finishing off the task, and an exception is thrown that fails the task.
        Owen O'Malley made changes -
        Component/s contrib/streaming [ 12310972 ]

          People

          • Assignee: Amar Kamat
          • Reporter: Devaraj Das
          • Votes: 0
          • Watchers: 2
