Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-421

mapred pipes might return exit code 0 even when failing

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: pipes
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      up to hadoop 0.18.3 org.apache.hadoop.mapred.JobShell ensured that 'hadoop jar' returns non-zero exit code when the job fails.
      This is no longer true after moving this to org.apache.hadoop.util.RunJar.

      Pipes jobs submitted through cli never returned proper exit code.

      The main methods in org.apache.hadoop.util.RunJar. and org.apache.hadoop.mapred.pipes.Submitter should be modified to return an exit code similar to how org.apache.hadoop.mapred.JobShell did it.

      1. MAPREDUCE-421.patch
        0.5 kB
        Christian Kunz

        Issue Links

          Activity

          Christian Kunz created issue -
          Christian Kunz made changes -
          Field Original Value New Value
          Component/s mapred [ 12310690 ]
          Christian Kunz made changes -
          Link This issue is related to HADOOP-4340 [ HADOOP-4340 ]
          Hide
          Qi Liu added a comment -

          Can we use Thread.setUncaughtExceptionHandler() to resolve this?

          Show
          Qi Liu added a comment - Can we use Thread.setUncaughtExceptionHandler() to resolve this?
          Hide
          Christian Kunz added a comment -

          As Qi's comment indicates, the statement in the description should be changed.

          'hadoop jar' will return correct exit code depending on the handling of uncaught exceptions. If the default handler has as consequence a non-zero exit code (seems to be the case, but is this fact actually documented somewhere?), then no change is needed.

          'mapred pipes' still needs modifications to return non-zero exit code for improper submission.

          Show
          Christian Kunz added a comment - As Qi's comment indicates, the statement in the description should be changed. 'hadoop jar' will return correct exit code depending on the handling of uncaught exceptions. If the default handler has as consequence a non-zero exit code (seems to be the case, but is this fact actually documented somewhere?), then no change is needed. 'mapred pipes' still needs modifications to return non-zero exit code for improper submission.
          Owen O'Malley made changes -
          Project Hadoop Common [ 12310240 ] Hadoop Map/Reduce [ 12310941 ]
          Key HADOOP-6030 MAPREDUCE-421
          Component/s mapred [ 12310690 ]
          Hide
          Christian Kunz added a comment -

          Changed summary to reflect the fact that the issue is limited to pipes.

          Show
          Christian Kunz added a comment - Changed summary to reflect the fact that the issue is limited to pipes.
          Christian Kunz made changes -
          Summary hadoop jar and hadoop/mapred pipes might return exit code 0 even when failing mapred pipes might return exit code 0 even when failing
          Assignee Christian Kunz [ ckunz ]
          Component/s pipes [ 12312908 ]
          Christian Kunz made changes -
          Attachment MAPREDUCE-421.patch [ 12413047 ]
          Christian Kunz made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.20.1 [ 12314047 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12413047/MAPREDUCE-421.patch
          against trunk revision 793136.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12413047/MAPREDUCE-421.patch against trunk revision 793136. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/378/console This message is automatically generated.
          Hide
          Amareshwari Sriramadasu added a comment -

          +1

          Show
          Amareshwari Sriramadasu added a comment - +1
          Hide
          Owen O'Malley added a comment -

          There should be a unit test for this so that it doesn't regress in the future.

          Show
          Owen O'Malley added a comment - There should be a unit test for this so that it doesn't regress in the future.
          Owen O'Malley made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Owen O'Malley added a comment -

          I realized that this was difficult to test.

          I just committed this. Thanks, Christian!

          Show
          Owen O'Malley added a comment - I realized that this was difficult to test. I just committed this. Thanks, Christian!
          Owen O'Malley made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #3 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/3/)
          Pipes need to set the exit return code when something goes wrong.
          (Christian Kunz via omalley)

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #3 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/3/ ) Pipes need to set the exit return code when something goes wrong. (Christian Kunz via omalley)
          Hide
          gary murry added a comment -

          Hey, even if this hard to write new unit tests for, could we get a note on how it was tested? Also a note on why we were not concerned by the failed core and contrib tests? Thanks.

          Show
          gary murry added a comment - Hey, even if this hard to write new unit tests for, could we get a note on how it was tested? Also a note on why we were not concerned by the failed core and contrib tests? Thanks.
          Hide
          Christian Kunz added a comment -

          1) Tested by submitting 'mapred pipes' commands with incorrect parameters. Obviously the patch does not impact correct pipes jobs, otherwise we would have seen test failures.
          2) I would interpret Amareshwari's '+1' as an indication that the tests failed for unrelated reasons (unfortunately, cannot be verified, as the link to the testReport no longer exists). That test fail for unrelated reasons is not unusual – it is my subjective impression that it happens often.

          Show
          Christian Kunz added a comment - 1) Tested by submitting 'mapred pipes' commands with incorrect parameters. Obviously the patch does not impact correct pipes jobs, otherwise we would have seen test failures. 2) I would interpret Amareshwari's '+1' as an indication that the tests failed for unrelated reasons (unfortunately, cannot be verified, as the link to the testReport no longer exists). That test fail for unrelated reasons is not unusual – it is my subjective impression that it happens often.
          Hide
          gary murry added a comment -

          Thanks for adding the extra info on your testing. As for the core and contrib test failures, I agree that we have tests that fail for unrelated reason. But it is good to link those failure to the jira, so that we can get momentum to get them fixed too. It would be nice some day to have enough confidence in all the tests that we can say a failure is most likely related to the code it is testing. Thanks again.

          Show
          gary murry added a comment - Thanks for adding the extra info on your testing. As for the core and contrib test failures, I agree that we have tests that fail for unrelated reason. But it is good to link those failure to the jira, so that we can get momentum to get them fixed too. It would be nice some day to have enough confidence in all the tests that we can say a failure is most likely related to the code it is testing. Thanks again.
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Patch Available Patch Available
          26d 21h 44m 1 Christian Kunz 09/Jul/09 19:39
          Patch Available Patch Available Open Open
          5d 21h 1m 1 Owen O'Malley 15/Jul/09 16:41
          Open Open Resolved Resolved
          43d 6h 25m 1 Owen O'Malley 27/Aug/09 23:06

            People

            • Assignee:
              Christian Kunz
              Reporter:
              Christian Kunz
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development