Hadoop Map/Reduce / MAPREDUCE-2667

MR279: mapred job -kill leaves application in RUNNING state

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: mrv2
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The mapred job -kill command doesn't seem to fully clean up the application.

      If you kill a job and run mapred job -list again it still shows up as running:

      mapred job -kill job_1310072430717_0003
      Killed job job_1310072430717_0003

      mapred job -list
      Total jobs:1
      JobId State StartTime UserName Queue Priority SchedulingInfo
      job_1310072430717_0003 RUNNING 0 tgraves default NORMAL 98.139.92.22:19888/yarn/job/job_1310072430717_3_3

      Running kill again will error out.

      It also still shows up in the RM Applications UI as running with a note of: Kill Job received from client
      job_1310072430717_0003 Job received Kill while in RUNNING state.

      1. MAPREDUCE-2587-279-v2.patch
        14 kB
        Thomas Graves
      2. MAPREDUCE-2667-mr279.patch
        1 kB
        Thomas Graves
      3. MAPREDUCE-2667-mr279-v2.patch
        1 kB
        Thomas Graves

        Activity

        Thomas Graves added a comment -

        The issue here seems to be that, even though the unregister routine in the RMCommunicator sets the state to KILLED
        and then calls finishApplicationMaster, finishApplicationMaster just sends the FINISHED event, which the
        ApplicationImpl doesn't handle while it is in the RUNNING state. So the KILLED state that was set is effectively ignored.

        2011-07-08 19:50:21,889 ERROR org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.ApplicationImpl: Can't handle this event at current state
        org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH at RUNNING
            at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:416)
            at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:331)
            at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:39)
            at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:476)
            at org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.ApplicationImpl.handle(ApplicationImpl.java:587)
            at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:202)
            at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:187)
            at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:111)
            at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
            at java.lang.Thread.run(Thread.java:619)
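
To illustrate the failure mode described above, here is a minimal, self-contained sketch of a table-driven state machine that has no transition for FINISH while in RUNNING. It is not the actual ApplicationImpl or StateMachineFactory code: the class KillStateSketch, its nested App class, the enums, and the single KILL transition in the table are all hypothetical names and assumptions chosen for illustration. It only shows why the KILLED state set on the application master side never takes effect when the only event sent is one the current state cannot handle.

```java
import java.util.EnumMap;
import java.util.Map;

public class KillStateSketch {
    enum State { RUNNING, KILLED }
    enum Event { FINISH, KILL }

    /** Hypothetical stand-in for the RM-side application state machine. */
    static class App {
        private State state = State.RUNNING;
        // Transition table: current state -> (event -> next state).
        private final Map<State, Map<Event, State>> transitions = new EnumMap<>(State.class);

        App() {
            Map<Event, State> fromRunning = new EnumMap<>(Event.class);
            // Assumption for this sketch: RUNNING handles KILL but has no arc for FINISH.
            fromRunning.put(Event.KILL, State.KILLED);
            transitions.put(State.RUNNING, fromRunning);
        }

        void handle(Event event) {
            Map<Event, State> arcs = transitions.get(state);
            State next = (arcs == null) ? null : arcs.get(event);
            if (next == null) {
                // No arc registered for this (state, event) pair: reject, state is unchanged.
                throw new IllegalStateException("Invalid event: " + event + " at " + state);
            }
            state = next;
        }

        State getState() {
            return state;
        }
    }

    public static void main(String[] args) {
        App app = new App();
        // The AM side records KILLED locally, but only a FINISH event is sent on.
        State amSideState = State.KILLED;
        try {
            app.handle(Event.FINISH);
        } catch (IllegalStateException e) {
            System.out.println("Can't handle this event at current state: " + e.getMessage());
        }
        System.out.println("AM-side state: " + amSideState + ", RM-side state: " + app.getState());
    }
}
```

In this sketch the rejected event leaves the RM-side state at RUNNING, which matches the symptom in the description: the job keeps showing up as RUNNING in mapred job -list after the kill.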

        Mahadev konar added a comment -

        +1, looks good to me. I'll push it.

        Thomas Graves added a comment -

        Fixed wrapping.

        Thomas Graves added a comment -

        Please ignore the attachment MAPREDUCE-2587-279-v2.patch; I accidentally attached the wrong file.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12486054/MAPREDUCE-2667-mr279-v2.patch
        against trunk revision 1144403.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/452//console

        This message is automatically generated.

        Mahadev konar added a comment -

        I just pushed this. Thanks Thomas!


          People

          • Assignee:
            Thomas Graves
            Reporter:
            Thomas Graves
          • Votes:
            0
            Watchers:
            3
