Hadoop Map/Reduce / MAPREDUCE-2875

NM does not communicate Container crash to RM

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • 0.23.0
    • None
    • None

    Description

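      The container crash detection code path in the NodeManager (NM) is faulty: when a container is killed, the NM hits an invalid state transition and does not report the crash to the ResourceManager (RM).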

      Steps to reproduce:
      1. Run a job.
      2. Kill the AM container on the NM.

      The NM log shows:
      org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_KILLED_ON_REQUEST at RUNNING
      at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:297)
      at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:39)
      at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:439)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:685)
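The trace above is what a table-driven state machine produces when an event arrives in a state that has no transition registered for it. The following is a minimal sketch of that failure mode, not Hadoop's `StateMachineFactory` API: the class `TransitionTableSketch`, its enums, and its exception type are hypothetical names chosen for illustration, and the transitions registered in `main` are assumptions, not the NodeManager's actual table.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical toy state machine; it only mimics the shape of the failure in the log.
public class TransitionTableSketch {
    enum State { RUNNING, KILLING, EXITED_WITH_FAILURE }
    enum Event { KILL_CONTAINER, CONTAINER_KILLED_ON_REQUEST }

    // Stand-in for the exception seen in the NM log (illustrative only).
    static class InvalidTransitionException extends RuntimeException {
        InvalidTransitionException(State state, Event event) {
            super("Invalid event: " + event + " at " + state);
        }
    }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    void addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, k -> new EnumMap<>(Event.class)).put(on, to);
    }

    // Looks up (current, event); an unregistered pair is rejected and the state is unchanged.
    State handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new InvalidTransitionException(current, event);
        }
        current = next;
        return current;
    }

    State current() {
        return current;
    }

    public static void main(String[] args) {
        TransitionTableSketch sm = new TransitionTableSketch();
        // Assumed table: the "killed" event is only registered for KILLING, not RUNNING.
        sm.addTransition(State.RUNNING, Event.KILL_CONTAINER, State.KILLING);
        sm.addTransition(State.KILLING, Event.CONTAINER_KILLED_ON_REQUEST,
                State.EXITED_WITH_FAILURE);
        try {
            // The container is killed externally while still RUNNING.
            sm.handle(Event.CONTAINER_KILLED_ON_REQUEST);
        } catch (InvalidTransitionException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

In this sketch the rejected event leaves the machine in `RUNNING`, so nothing downstream learns that the container exited. That is one plausible reading of why the crash never reaches the RM, stated here as an assumption rather than a confirmed diagnosis.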

            People

              Assignee: Siddharth Seth (sseth)
              Reporter: Sharad Agarwal (sharadag)
              Votes: 0
              Watchers: 3
