Hadoop Common / HADOOP-4340

"hadoop jar" always returns exit code 0 (success) to the shell when jar throws a fatal exception

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18.1, 0.19.0, 0.20.0
    • Fix Version/s: 0.18.2
    • Component/s: None
    • Labels:
      None
    • Environment:

      Ubuntu 8.04 Server, 7 Hadoop nodes, GNU bash, version 3.2.39(1)-release (i486-pc-linux-gnu)

    • Hadoop Flags:
      Reviewed

      Description

      Running "hadoop jar" always returns 0 (success) when the jar dies with a stack trace. As an example, run these commands:

      /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-0.18.1-examples.jar pi 10 10 2>&1; echo $?
      exits with 0

      /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-0.18.1-examples.jar pi 2>&1; echo $?
      exits with 255

      /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-0.18.1-examples.jar 2>&1; echo $?
      exits with 0

      This seems to be expected behavior. However, running:

      /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-0.18.1-examples.jar pi 10 badparam 2>&1; echo $?
      java.lang.NumberFormatException: For input string: "badparam"
      at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
      at java.lang.Long.parseLong(Long.java:403)
      at java.lang.Long.parseLong(Long.java:461)
      at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:241)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:252)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
      at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
      at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
      at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
      at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
      exits with 0.

      In my opinion, if a jar throws an exception that kills the program being run, and the developer doesn't catch the exception and exit sanely with an exit code, Hadoop should at least exit with a non-zero exit code.

      As another example, while running a main class that exits with an exit code of 201, Hadoop will preserve the correct exit code:

      public static void main(String[] args) throws Exception {
        System.exit(201);
      }

      But when deliberately creating a null pointer exception, Hadoop exits with 0.

      public static void main(String[] args) throws Exception {
        Object o = null;
        o.toString();
        System.exit(201);
      }

      This behaviour makes it very difficult, if not impossible, to use Hadoop programmatically with tools such as HOD or non-Java data processing frameworks: if a jar crashes with an unhandled exception, Hadoop doesn't inform the calling program in a well-behaved way (polling stderr for output is not a very good way to detect application failure).
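
A calling program that relies on the exit status, as described above, might look like the following minimal sketch. It is an illustration only: the class name `ExitStatusCaller` and the method `runAndGetStatus` are hypothetical, and the command to run is assumed to be passed on the command line (for instance one of the `hadoop jar` invocations shown earlier).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical caller: runs an external command and acts on its exit status.
public class ExitStatusCaller {

    /** Runs the command, forwards its output, and returns its exit status. */
    static int runAndGetStatus(List<String> command) {
        try {
            // inheritIO() sends the child's stdout/stderr to this process's streams.
            Process p = new ProcessBuilder(command).inheritIO().start();
            return p.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: ExitStatusCaller <command> [args...]");
            System.exit(2);
        }
        int status = runAndGetStatus(Arrays.asList(args));
        if (status != 0) {
            // Failure is detected from the status alone; no stderr parsing needed.
            System.err.println("command failed with exit status " + status);
        }
        System.exit(status);
    }
}
```

This kind of caller only works if the wrapped command reports failure through a non-zero status, which is the behaviour this issue asks for.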

      I'm not a Java programmer, so I don't know what the best code to signal failure would be.

      Please let me know what other information I can include about my setup

      Thanks.

      1. patch-4340.txt
        0.5 kB
        Amareshwari Sriramadasu
      2. patch-4340-1.txt
        0.9 kB
        Amareshwari Sriramadasu
      3. HADOOP-4340_2_20081029.patch
        2 kB
        Arun C Murthy

          Activity

          steve_l added a comment -

          Looking at the stack trace, the cause is that JobShell.main() doesn't set an exit code:

          public static void main(String[] argv) throws Exception {
            JobShell jshell = new JobShell();
            ToolRunner.run(jshell, argv);
          }

          It should be System.exit(ToolRunner.run(...)).

          The question is, what is going to break?
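
The difference between discarding and propagating the status can be sketched without Hadoop on the classpath. In the sketch below, `MiniTool` and `runTool` are hypothetical stand-ins for Hadoop's `Tool` and `ToolRunner.run`; they are not the real API, and the class name is made up for illustration.

```java
// Hypothetical stand-ins for Hadoop's Tool and ToolRunner; not the real API.
public class ExitCodeSketch {

    /** Minimal stand-in for a tool whose run method returns an exit status. */
    public interface MiniTool {
        int run(String[] args);
    }

    /** Stand-in for ToolRunner.run: returns whatever status the tool reports. */
    static int runTool(MiniTool tool, String[] args) {
        return tool.run(args);
    }

    public static void main(String[] args) {
        MiniTool failing = a -> 1; // a tool that reports failure

        // Calling runTool(failing, args) and ignoring the result, as in the
        // main method quoted above, would let the JVM exit with status 0.
        // Passing the result to System.exit makes the failure visible to the shell.
        System.exit(runTool(failing, args));
    }
}
```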

          Amareshwari Sriramadasu added a comment -

          Thanks, Steve, for finding the cause. That looks like a bug; fixing it should not break anything.

          Amareshwari Sriramadasu added a comment -

          Patch returning exit code from JobShell

          Vinod Kumar Vavilapalli added a comment -

          The patch will still return a zero exit code if the jar throws an uncaught exception. It merely tries to pass any non-zero return code that the Tool itself returns; uncaught exceptions are still not shielded.

          Vinod Kumar Vavilapalli added a comment -

          My bad, an exception in main WILL return a non-zero exit code. The reason the above patch still looked insufficient to me is that ExampleDriver catches uncaught exceptions from the examples and returns silently. I think that needs to be fixed.

          +1 for the fix. Examples can be fixed here or separately.

          Amareshwari Sriramadasu added a comment -

          Changed ExampleDriver also to return with non-zero exit code.

          Amareshwari Sriramadasu added a comment -

          test-patch result:

               [exec]
               [exec] -1 overall.
               [exec]
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec]
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no tests are needed for this patch.
               [exec]
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec]
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec]
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          

          All core and contrib tests passed on my machine

          Arun C Murthy added a comment -

          In ExampleDriver.java, it isn't quite elegant to call System.exit from inside a catch clause; we should use an exit code variable instead:

          int exitCode = -1;
          ...
          
          try {
           ...
           pgd.driver(argv);
           exitCode = 0;
          } catch(...) {
           ...
          }
          
          System.exit(exitCode);
          
          

          Ideally, ProgramDriver.driver should have returned an exit code... sigh!

          We should also fix ProgramDriver.driver to throw an IllegalArgumentException when the sanity checks fail.
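
A self-contained sketch of the pattern suggested above, under stated assumptions: `driver` here is a hypothetical stand-in for ProgramDriver.driver (it only checks its arguments and accepts the single program name "pi"), and the class name is invented for illustration. It is not the committed patch.

```java
// Hypothetical illustration of the exit-code pattern; not the actual patch.
public class DriverExitSketch {

    /** Stand-in driver: rejects bad input with IllegalArgumentException. */
    static void driver(String[] argv) {
        if (argv.length == 0) {
            throw new IllegalArgumentException("no program name given");
        }
        if (!"pi".equals(argv[0])) {
            throw new IllegalArgumentException("unknown program: " + argv[0]);
        }
        // A real driver would invoke the selected program here.
    }

    /** Returns 0 only if driver() completed; stays -1 on any failure. */
    static int runDriver(String[] argv) {
        int exitCode = -1;
        try {
            driver(argv);
            exitCode = 0;
        } catch (Throwable t) {
            t.printStackTrace();
        }
        return exitCode;
    }

    public static void main(String[] argv) {
        // System.exit is called once, outside the try/catch.
        System.exit(runDriver(argv));
    }
}
```

On a POSIX shell, a status of -1 passed to System.exit is reported as 255, since only the low 8 bits of the status are visible to the parent process.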

          Arun C Murthy added a comment -

          Updated patch.

          Owen O'Malley added a comment -

          +1

          Arun C Murthy added a comment -

          I just committed this. Thanks to Amareshwari and Steve too!

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12393014/HADOOP-4340_2_20081029.patch
          against trunk revision 709022.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3508/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3508/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3508/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3508/console

          This message is automatically generated.

          Hudson added a comment -

          Integrated in Hadoop-trunk #647 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/647/).
          Correctly set the exit code from JobShell.main so that the 'hadoop jar' command returns the right code to the user.


            People

            • Assignee:
              Arun C Murthy
              Reporter:
              David Litster