Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
0.23.1
None
Reviewed
Fixed an NPE occurring during scheduling in the ResourceManager.
Description
Sometimes a NODE_UPDATE event sent to the scheduler throws an NPE, which causes scheduling to stop while the ResourceManager keeps running.
I have been observing this intermittently for the last 3 weeks.
With the latest svn code, I tried to run sort twice, and both times the job got stuck due to the NPE.
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.containerLaunchedOnNode(SchedulerApp.java:181)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.containerLaunchedOnNode(CapacityScheduler.java:596)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:539)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:617)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:77)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:294)
    at java.lang.Thread.run(Thread.java:619)
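
The trace ends in an event-processor thread's run(), which is consistent with the symptom described above: an uncaught exception ends that one thread while the rest of the process keeps running. The following is a minimal, self-contained sketch of that failure mode, not the actual Hadoop code and not the fix applied for this issue. The class DispatcherSketch, the APPS map, and the container ids c1, c2, c3 are all hypothetical names made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class DispatcherSketch {
    // Hypothetical registry: container id -> application name. "c2" is deliberately missing.
    static final Map<String, String> APPS = new HashMap<>();
    static {
        APPS.put("c1", "sort");
        APPS.put("c3", "sort");
    }

    // Hypothetical handler that dereferences a lookup result without a null check.
    static void handle(String containerId) {
        String app = APPS.get(containerId); // null for an unknown container
        app.length();                       // throws NullPointerException when app is null
    }

    // Processes three queued events on one thread and returns how many were handled.
    // With guard == false, the first exception ends the processor thread.
    // With guard == true, a failing event is logged and skipped.
    static int countHandled(boolean guard) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("c1");
        queue.add("c2");
        queue.add("c3");
        AtomicInteger handled = new AtomicInteger();

        Thread processor = new Thread(() -> {
            String id;
            while ((id = queue.poll()) != null) {
                if (guard) {
                    try {
                        handle(id);
                        handled.incrementAndGet();
                    } catch (RuntimeException e) {
                        System.err.println("skipping event " + id + ": " + e);
                    }
                } else {
                    handle(id); // an exception here propagates out of run()
                    handled.incrementAndGet();
                }
            }
        });
        // Report the thread's death in one line instead of a full stack trace.
        processor.setUncaughtExceptionHandler(
            (t, e) -> System.err.println("processor thread died: " + e));
        processor.start();
        try {
            processor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        // The main thread survives in both cases; only the processor thread can die.
        System.out.println("unguarded handled: " + countHandled(false));
        System.out.println("guarded handled: " + countHandled(true));
    }
}
```

In the unguarded run, the event for c3 is never processed because the thread has already exited, which mirrors scheduling stopping while the process stays up. Catching the exception per event is only one possible mitigation; it keeps the loop alive but can hide the underlying null, so fixing the null at its source would still be needed.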