YARN-7249

Fix CapacityScheduler NPE when a container is preempted while its node is being removed


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.8.1, 2.7.5
    • Fix Version/s: 2.8.2, 2.7.6
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      This issue can occur when three conditions are all satisfied:

      1) A node is being removed from the scheduler.
      2) A container running on that node is being preempted.
      3) A rare race condition causes the scheduler to pass a null node to the leaf queue.

      The fix is to add a null-node check inside CapacityScheduler.
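
      The race and the guard can be illustrated with a minimal sketch. This is not the actual patch: NullNodeGuardSketch, Node, addNode, removeNode and completedContainer are hypothetical stand-ins for the scheduler's node map and completion path, assumed here only to show why a lookup can return null and how an early return avoids the NPE.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a scheduler's node registry; not the real YARN classes.
class NullNodeGuardSketch {

    // Hypothetical stand-in for a scheduler node that tracks running containers.
    static final class Node {
        final String id;
        int runningContainers;

        Node(String id, int runningContainers) {
            this.id = id;
            this.runningContainers = runningContainers;
        }
    }

    private final Map<String, Node> nodes = new ConcurrentHashMap<>();

    void addNode(String id, int runningContainers) {
        nodes.put(id, new Node(id, runningContainers));
    }

    // Models the "node is being removed" event.
    void removeNode(String id) {
        nodes.remove(id);
    }

    /**
     * Models the container-completion path triggered by preemption.
     * Returns true if the completion was applied to a live node, and
     * false if it was skipped because the node had already been removed.
     */
    boolean completedContainer(String nodeId) {
        // The lookup returns null if the removal event was handled first.
        Node node = nodes.get(nodeId);
        if (node == null) {
            // Guard: without this check, the dereference below would throw
            // a NullPointerException.
            return false;
        }
        node.runningContainers--;
        return true;
    }
}
```

      In this sketch, calling completedContainer for a node after removeNode has run returns false instead of throwing, which is the behavior a null-node check is meant to provide.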

      Stack trace:

      2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(714)) - Error in handling event type KILL_RESERVED_CONTAINER to the scheduler 
      java.lang.NullPointerException 
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308) 
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469) 
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497) 
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505) 
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341) 
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127) 
      at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705) 
      

      This issue exists only in 2.8.x.

          People

            Assignee: Wangda Tan (leftnoteasy)
            Reporter: Wangda Tan (leftnoteasy)