Hadoop YARN / YARN-6629

NPE occurs when a container allocation proposal is applied after its resource requests have been removed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.9.0, 3.0.0-alpha2
    • Fix Version/s: 3.1.0, 2.10.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      While writing a test case to reproduce another problem on branch-2, I found a new NPE. Log:

      FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in handling event type NODE_UPDATE to the Event Dispatcher
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
              at org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
              at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
              at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
              at org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply(<generated>)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
              at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
              at java.lang.Thread.run(Thread.java:745)
      

      Steps to reproduce, in chronological order:
      1. The AM starts and requests 1 container with schedulerRequestKey#1:
      ApplicationMasterService#allocate --> CapacityScheduler#allocate --> SchedulerApplicationAttempt#updateResourceRequests --> AppSchedulingInfo#updateResourceRequests
      This adds schedulerRequestKey#1 to schedulerKeyToPlacementSets.
      2. The scheduler allocates 1 container for this request and accepts the proposal.
      3. The AM removes this request:
      ApplicationMasterService#allocate --> CapacityScheduler#allocate --> SchedulerApplicationAttempt#updateResourceRequests --> AppSchedulingInfo#updateResourceRequests --> AppSchedulingInfo#addToPlacementSets --> AppSchedulingInfo#updatePendingResources
      This removes schedulerRequestKey#1 from schedulerKeyToPlacementSets.
      4. The scheduler applies the proposal:
      CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> AppSchedulingInfo#allocate
      An NPE is thrown at schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, type, node); because the lookup returns null once the key has been removed in step 3.
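
      The ordering above can be modelled with a small, self-contained sketch. This is not the YARN code: AllocationRaceSketch, SchedulingInfo, PlacementSet, allocateUnchecked and allocateChecked are hypothetical names, and the scheduler key is reduced to an int. Only the map name schedulerKeyToPlacementSets and the unguarded get(...).allocate(...) pattern come from the report. The null check in allocateChecked is one possible guard, shown as an assumption; it is not necessarily what the attached patches do.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AllocationRaceSketch {

    /** Hypothetical stand-in for the per-key placement set. */
    static final class PlacementSet {
        private int pending;

        PlacementSet(int pending) {
            this.pending = pending;
        }

        void allocate() {
            pending--;
        }

        int getPending() {
            return pending;
        }
    }

    /** Simplified model of the scheduling info; not the real AppSchedulingInfo. */
    static final class SchedulingInfo {
        private final Map<Integer, PlacementSet> schedulerKeyToPlacementSets =
            new ConcurrentHashMap<>();

        /** A request for zero containers removes the key, as in step 3. */
        void updateResourceRequests(int schedulerKey, int numContainers) {
            if (numContainers <= 0) {
                schedulerKeyToPlacementSets.remove(schedulerKey);
            } else {
                schedulerKeyToPlacementSets.put(
                    schedulerKey, new PlacementSet(numContainers));
            }
        }

        /** Mirrors the failing pattern: the lookup result is used unchecked. */
        void allocateUnchecked(int schedulerKey) {
            schedulerKeyToPlacementSets.get(schedulerKey).allocate();
        }

        /** Assumed guard: report failure if the request no longer exists. */
        boolean allocateChecked(int schedulerKey) {
            PlacementSet ps = schedulerKeyToPlacementSets.get(schedulerKey);
            if (ps == null) {
                return false;
            }
            ps.allocate();
            return true;
        }
    }

    public static void main(String[] args) {
        SchedulingInfo info = new SchedulingInfo();
        info.updateResourceRequests(1, 1); // step 1: AM requests 1 container
        // step 2: proposal accepted; the map is unchanged at this point
        info.updateResourceRequests(1, 0); // step 3: AM removes the request
        try {
            info.allocateUnchecked(1);     // step 4: apply the stale proposal
        } catch (NullPointerException e) {
            System.out.println("unchecked apply: NullPointerException");
        }
        System.out.println("checked apply succeeded: " + info.allocateChecked(1));
    }
}
```

      In this model the unchecked path throws in step 4, while the guarded path returns false so the caller can treat the stale proposal as rejected.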

        Attachments

        1. YARN-6629.branch-2.001.patch
          8 kB
          Tao Yang
        2. YARN-6629.006.patch
          9 kB
          Tao Yang
        3. YARN-6629.005.patch
          9 kB
          Tao Yang
        4. YARN-6629.004.patch
          7 kB
          Tao Yang
        5. YARN-6629.003.patch
          7 kB
          Tao Yang
        6. YARN-6629.002.patch
          6 kB
          Tao Yang
        7. YARN-6629.001.patch
          1 kB
          Tao Yang

            People

            • Assignee: Tao Yang
            • Reporter: Tao Yang
            • Votes: 0
            • Watchers: 10
