Hadoop Map/Reduce
MAPREDUCE-4825

JobImpl.finished doesn't expect ERROR as a final job state

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha, 0.23.5
    • Fix Version/s: 3.0.0, 2.0.3-alpha, 0.23.6
    • Component/s: mr-am
    • Labels:
      None

      Description

      TestMRApp.testJobError causes AsyncDispatcher to exit via System.exit because an exception is thrown in the dispatcher thread. Console output from testJobError:

      2012-11-27 18:46:15,240 ERROR [AsyncDispatcher event handler] impl.TaskImpl (TaskImpl.java:internalError(665)) - Invalid event T_SCHEDULE on Task task_0_0000_m_000000
      2012-11-27 18:46:15,242 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(132)) - Error in dispatcher thread
      java.lang.IllegalArgumentException: Illegal job state: ERROR
      	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.finished(JobImpl.java:838)
      	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InternalErrorTransition.transition(JobImpl.java:1622)
      	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InternalErrorTransition.transition(JobImpl.java:1)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.access$3(StateMachineFactory.java:287)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
      	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:723)
      	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:1)
      	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:974)
      	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
      	at java.lang.Thread.run(Thread.java:662)
      2012-11-27 18:46:15,242 INFO  [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(135)) - Exiting, bbye..
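
      The trace above suggests that JobImpl.finished validates the final state and rejects anything it does not recognise. The patch itself is not reproduced on this page, so the following is only a minimal sketch of that failure mode and of one plausible shape for the fix. SketchJobState, FinishedSketch, finishedBefore, finishedAfter and the counts map are hypothetical stand-ins, not the real Hadoop classes or methods; only the exception message is taken from the log.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the job state enum; NOT the real Hadoop type.
enum SketchJobState { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

// Hypothetical stand-in for the part of JobImpl that handles final states.
class FinishedSketch {
  // Counts jobs per final state (a stand-in for the AM metrics).
  final Map<SketchJobState, Integer> counts = new EnumMap<>(SketchJobState.class);

  // Assumed pre-fix behaviour: ERROR falls through to the default branch.
  SketchJobState finishedBefore(SketchJobState finalState) {
    switch (finalState) {
      case SUCCEEDED:
      case FAILED:
      case KILLED:
        counts.merge(finalState, 1, Integer::sum);
        break;
      default:
        throw new IllegalArgumentException("Illegal job state: " + finalState);
    }
    return finalState;
  }

  // One plausible fix: treat ERROR as a legitimate final state.
  SketchJobState finishedAfter(SketchJobState finalState) {
    switch (finalState) {
      case SUCCEEDED:
      case FAILED:
      case KILLED:
      case ERROR:
        counts.merge(finalState, 1, Integer::sum);
        break;
      default:
        throw new IllegalArgumentException("Illegal job state: " + finalState);
    }
    return finalState;
  }
}
```

      In this sketch, an exception thrown from finishedBefore on the dispatcher thread would propagate the same way the IllegalArgumentException does in the trace, whereas finishedAfter returns normally for ERROR.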
      

          Activity

          Jason Lowe added a comment -

          Simple fix. No additional unit tests since this is fixing an existing test.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12555049/MAPREDUCE-4825.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3072//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3072//console

          This message is automatically generated.

          Robert Joseph Evans added a comment -

          The patch looks fine to me. +1

          I'll check it in.

          Robert Joseph Evans added a comment -

          Thanks Jason,

          I put this in trunk, branch-2, and branch-0.23

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3069 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3069/)
          MAPREDUCE-4825. JobImpl.finished doesn't expect ERROR as a final job state (jlowe via bobby) (Revision 1414840)

          Result = SUCCESS
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1414840
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
          Bikas Saha added a comment -

          The JobImpl state machine allows SUCCEEDED->ERROR and FAILED->ERROR transitions, each of which would call finished(). Would there be a problem with metrics being notified of success/failure and then again of error?
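
          The concern can be made concrete with a toy model. This is a sketch under stated assumptions, not Hadoop code: DoubleNotifySketch, its State enum, transitionTo and the notifications list are all hypothetical names, and the only behaviour modelled is that every transition into a final state calls finished().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the concern; names are illustrative, not Hadoop's API.
class DoubleNotifySketch {
  enum State { RUNNING, SUCCEEDED, FAILED, ERROR }

  State state = State.RUNNING;
  // Records every metrics notification in the order it was sent.
  final List<State> notifications = new ArrayList<>();

  // Assumption: each transition into a final state goes through finished().
  void transitionTo(State next) {
    state = finished(next);
  }

  // Notifies metrics unconditionally, with no memory of earlier notifications.
  State finished(State finalState) {
    notifications.add(finalState);
    return finalState;
  }
}
```

          Driving this model through SUCCEEDED and then ERROR leaves two entries in notifications for a single job, which is the double notification being asked about.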

          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #51 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/51/)
          MAPREDUCE-4825. JobImpl.finished doesn't expect ERROR as a final job state (jlowe via bobby) (Revision 1414840)

          Result = SUCCESS
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1414840
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #450 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/450/)
          svn merge -c 1414840 FIXES: MAPREDUCE-4825. JobImpl.finished doesn't expect ERROR as a final job state (jlowe via bobby) (Revision 1414844)

          Result = SUCCESS
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1414844
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1241 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1241/)
          MAPREDUCE-4825. JobImpl.finished doesn't expect ERROR as a final job state (jlowe via bobby) (Revision 1414840)

          Result = FAILURE
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1414840
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
          Jason Lowe added a comment -

          "Would there be a problem in metrics being notified of success/failure and then again of error?"

          Potentially. I forgot that the job could leave these terminal states. Some potential ways to address it:

          • Don't allow the state to leave "terminal" states like SUCCEEDED/FAILED/KILLED.
          • Add metrics for "errored" jobs to distinguish between failed and error. This still means that the sum of metrics could exceed the total number of jobs, since a job can both succeed and error.
          • Have finished() skip incrementing any metrics if the job is already in a terminal state (SUCCEEDED/FAILED/KILLED) to avoid double-counting a job.
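
          The third option above could look roughly like the following. This is a hedged sketch only: GuardedFinishSketch, TERMINAL, counts and total are hypothetical names, and this page does not show how MAPREDUCE-4835 was actually resolved.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the "skip metrics if already terminal" option.
class GuardedFinishSketch {
  enum State { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

  // States in which the job has already been counted once.
  static final Set<State> TERMINAL =
      EnumSet.of(State.SUCCEEDED, State.FAILED, State.KILLED);

  State state = State.RUNNING;
  final Map<State, Integer> counts = new EnumMap<>(State.class);

  State finished(State finalState) {
    // Only count the job if it has not already reached a terminal state.
    if (!TERMINAL.contains(state)) {
      counts.merge(finalState, 1, Integer::sum);
    }
    state = finalState;
    return finalState;
  }

  // Sum of all per-state counters.
  int total() {
    return counts.values().stream().mapToInt(Integer::intValue).sum();
  }
}
```

          With this guard, a job that goes SUCCEEDED and then ERROR is counted once, as SUCCEEDED. The trade-off is that the later error is not visible in the counters at all, which is where the second option (a separate "errored" metric) differs.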
          Jason Lowe added a comment -

          Filed MAPREDUCE-4835 to track the metric double-counting problem. Thanks for pointing out the issue, Bikas!


            People

            • Assignee: Jason Lowe
            • Reporter: Jason Lowe
            • Votes: 0
            • Watchers: 6
