Details
Description
I wrote a test case to reproduce another problem for branch-2 and found new NPE error, log:
FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in handling event type NODE_UPDATE to the Event Dispatcher java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516) at org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225) at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31) at org.mockito.internal.MockHandler.handle(MockHandler.java:97) at org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply(<generated>) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) at java.lang.Thread.run(Thread.java:745)
Reproduce this error in chronological order:
1. AM started and requested 1 container with schedulerRequestKey#1 :
ApplicationMasterService#allocate --> CapacityScheduler#allocate --> SchedulerApplicationAttempt#updateResourceRequests --> AppSchedulingInfo#updateResourceRequests
Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
2. Scheduler allocatd 1 container for this request and accepted the proposal
3. AM removed this request
ApplicationMasterService#allocate --> CapacityScheduler#allocate --> SchedulerApplicationAttempt#updateResourceRequests --> AppSchedulingInfo#updateResourceRequests --> AppSchedulingInfo#addToPlacementSets --> AppSchedulingInfo#updatePendingResources
Removed schedulerRequestKey#1 from schedulerKeyToPlacementSets)
4. Scheduler applied this proposal
CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> AppSchedulingInfo#allocate
Throw NPE when called schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, type, node);