Hadoop YARN / YARN-9645

Fix Invalid event FINISHED_CONTAINERS_PULLED_BY_AM at NEW on NM restart

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.2.1
    • Component/s: None
    • Labels: None

    Description

      *Description:* While the NMs are restarting, the RM throws org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: FINISHED_CONTAINERS_PULLED_BY_AM at NEW

      *Environment:*
      Server OS: Ubuntu
      Cluster nodes: 2 RMs / 4850 NMs
      240 machines in total, each running 21 Docker containers (1 DN and 20 NMs)

      Steps:
      1. Bring the cluster to about 53,000 containers in the running state.
      2. Restart the NMs and check the RM log.

      2019-06-24 09:37:35,345 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 32744 submitted by user root
      2019-06-24 09:37:35,346 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     IP=255.255.19.245       OPERATION=Submit Application Request    TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1561358926330_32744   QUEUENAME=default
      2019-06-24 09:37:35,345 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle this event at current state
      org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: FINISHED_CONTAINERS_PULLED_BY_AM at NEW
              at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
              at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
              at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
              at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:669)
              at org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:99)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1107)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1091)
              at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:221)
              at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:143)
              at java.lang.Thread.run(Thread.java:748)
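
      The trace above shows the RM-side node state machine rejecting an event for which no transition is registered in the node's current state. The following is a minimal, self-contained sketch of that failure mode. It does not use the Hadoop classes; `NodeStateMachineSketch`, `Machine`, `baseMachine` and the reduced enums are hypothetical names chosen for illustration only. The last part shows one plausible remedy, registering a transition for the (NEW, FINISHED_CONTAINERS_PULLED_BY_AM) pair. Whether the attached patches take exactly this approach is an assumption, not something stated in this report.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch only: a tiny table-driven state machine, NOT Hadoop's StateMachineFactory.
public class NodeStateMachineSketch {
    // Reduced, hypothetical subsets of the node states and event types.
    enum NodeState { NEW, RUNNING }
    enum NodeEventType { STARTED, FINISHED_CONTAINERS_PULLED_BY_AM }

    /** Thrown when no transition is registered for a (state, event) pair. */
    static class InvalidTransitionException extends RuntimeException {
        InvalidTransitionException(NodeState state, NodeEventType event) {
            super("Invalid event: " + event + " at " + state);
        }
    }

    static class Machine {
        private final Map<NodeState, Map<NodeEventType, NodeState>> table =
            new EnumMap<NodeState, Map<NodeEventType, NodeState>>(NodeState.class);
        private NodeState current = NodeState.NEW;

        /** Registers: in state 'from', event 'on' moves the machine to state 'to'. */
        Machine addTransition(NodeState from, NodeState to, NodeEventType on) {
            table.computeIfAbsent(from,
                s -> new EnumMap<NodeEventType, NodeState>(NodeEventType.class)).put(on, to);
            return this;
        }

        /** Applies the event, or throws if the current state has no entry for it. */
        NodeState handle(NodeEventType event) {
            Map<NodeEventType, NodeState> row = table.get(current);
            NodeState next = (row == null) ? null : row.get(event);
            if (next == null) {
                throw new InvalidTransitionException(current, event);
            }
            current = next;
            return current;
        }
    }

    /** A machine that only accepts FINISHED_CONTAINERS_PULLED_BY_AM once RUNNING. */
    static Machine baseMachine() {
        return new Machine()
            .addTransition(NodeState.NEW, NodeState.RUNNING, NodeEventType.STARTED)
            .addTransition(NodeState.RUNNING, NodeState.RUNNING,
                NodeEventType.FINISHED_CONTAINERS_PULLED_BY_AM);
    }

    public static void main(String[] args) {
        // The event arrives while the node is still NEW: no table entry, so it is rejected.
        Machine strict = baseMachine();
        try {
            strict.handle(NodeEventType.FINISHED_CONTAINERS_PULLED_BY_AM);
        } catch (InvalidTransitionException e) {
            System.out.println(e.getMessage());
            // prints "Invalid event: FINISHED_CONTAINERS_PULLED_BY_AM at NEW"
        }

        // Assumed remedy: also accept the event in NEW and stay in NEW.
        Machine tolerant = baseMachine().addTransition(
            NodeState.NEW, NodeState.NEW, NodeEventType.FINISHED_CONTAINERS_PULLED_BY_AM);
        System.out.println(tolerant.handle(NodeEventType.FINISHED_CONTAINERS_PULLED_BY_AM)); // prints "NEW"
        System.out.println(tolerant.handle(NodeEventType.STARTED)); // prints "RUNNING"
    }
}
```

      In this sketch the early event is simply accepted with a self-transition. A real fix would also need to decide what to do with the event's payload, which the sketch does not model.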
      
      

      Attachments

        1. YARN-9645-001.patch
          4 kB
          Bilwa S T
        2. YARN-9645-002.patch
          4 kB
          Bilwa S T

          People

            Assignee: Bilwa S T (BilwaST)
            Reporter: krishna reddy (mkris.reddy@gmail.com)
            Votes: 0
            Watchers: 4
