Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
If the RM goes down and comes back up, it tells the NM to reboot. When the NM reboots, it aggregates the logs for any applications it still has and then sends each of those applications the APPLICATION_LOG_HANDLING_FINISHED event. I saw a case where an application that was still in the RUNNING state received APPLICATION_LOG_HANDLING_FINISHED and hit an invalid state transition:
DeletionService #1
2012-04-11 15:12:40,476 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state
[AsyncDispatcher event handler]org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:382)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:517)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:509)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
at java.lang.Thread.run(Thread.java:619)
2012-04-11 15:12:40,476 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1333003059741_15999 transitioned from RUNNING to null
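To make the failure mode in the log above concrete, here is a minimal sketch of a table-driven state machine that rejects an event with no registered transition for the current state. It is not the NodeManager's ApplicationImpl or Hadoop's StateMachineFactory. The class name AppStateMachineSketch, the FINISHED state, the FINISH_APPLICATION event, and the two transitions registered below are assumptions made for illustration. Only RUNNING and APPLICATION_LOG_HANDLING_FINISHED come from this report.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, heavily simplified model of an application state machine.
// It only illustrates why an event is rejected when the current state has
// no transition registered for it.
class AppStateMachineSketch {
    // RUNNING appears in the report; FINISHED is an assumed terminal state.
    enum State { RUNNING, FINISHED }

    // APPLICATION_LOG_HANDLING_FINISHED appears in the report;
    // FINISH_APPLICATION is an assumed event for this sketch.
    enum Event { FINISH_APPLICATION, APPLICATION_LOG_HANDLING_FINISHED }

    // Transition table: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TRANSITIONS =
            new EnumMap<>(State.class);

    static {
        // Assumed transitions. Note there is no entry for
        // (RUNNING, APPLICATION_LOG_HANDLING_FINISHED).
        register(State.RUNNING, Event.FINISH_APPLICATION, State.FINISHED);
        register(State.FINISHED, Event.APPLICATION_LOG_HANDLING_FINISHED, State.FINISHED);
    }

    private static void register(State from, Event on, State to) {
        TRANSITIONS
                .computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class))
                .put(on, to);
    }

    private State current = State.RUNNING;

    State getState() {
        return current;
    }

    // Applies the event, or throws if the current state has no transition for it.
    // In this sketch the state is left unchanged when the event is rejected.
    void handle(Event event) {
        Map<Event, State> allowed = TRANSITIONS.get(current);
        State next = (allowed == null) ? null : allowed.get(event);
        if (next == null) {
            throw new IllegalStateException(
                    "Invalid event: " + event + " at " + current);
        }
        current = next;
    }

    public static void main(String[] args) {
        AppStateMachineSketch app = new AppStateMachineSketch();
        try {
            // Mirrors the reported sequence: log handling finishes while RUNNING.
            app.handle(Event.APPLICATION_LOG_HANDLING_FINISHED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        // The same event is accepted once the application has finished.
        app.handle(Event.FINISH_APPLICATION);
        app.handle(Event.APPLICATION_LOG_HANDLING_FINISHED);
        System.out.println(app.getState());
    }
}
```

Under these assumptions, the event is rejected because no transition is registered for it in the RUNNING state, which is the kind of condition the stack trace reports.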
Issue Links
- is duplicated by: YARN-495 Change NM behavior of reboot to resync (Closed)