Hadoop YARN / YARN-5195

RM intermittently crashed with NPE while handling APP_ATTEMPT_REMOVED event when async-scheduling is enabled in CapacityScheduler

    Details

    • Hadoop Flags: Reviewed

      Description

      While running Gridmix experiments, I once came across an incident where the RM went down with the following exception:

      2016-05-28 15:45:24,459 [ResourceManager Event Processor] FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1282)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:860)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1319)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:704)
              at java.lang.Thread.run(Thread.java:745)
      2016-05-28 15:45:24,460 [ApplicationMasterLauncher #49] INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1464449118385_0006_000001
      2016-05-28 15:45:24,460 [ResourceManager Event Processor] INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
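
The trace above shows only that a null reference was dereferenced inside LeafQueue.completedContainer while the event processor was handling APP_ATTEMPT_REMOVED; the report does not say which reference was null. As a rough illustration of the general hazard when two threads share scheduler state, the Java sketch below assumes, purely for the sake of example, that a bookkeeping entry can be removed by one thread just before another thread looks it up. Every name in it (NullSafeCompletion, NodeState, addNode, removeNode, completedContainer, running) is hypothetical and is not taken from YARN or from any of the attached patches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified model; none of these names are YARN classes. */
public class NullSafeCompletion {

    /** Stand-in for per-node scheduler bookkeeping. */
    static final class NodeState {
        private int runningContainers;

        NodeState(int runningContainers) {
            this.runningContainers = runningContainers;
        }

        synchronized void release() {
            runningContainers--;
        }

        synchronized int running() {
            return runningContainers;
        }
    }

    // Shared between threads, so a lookup can miss right after a removal.
    private final Map<String, NodeState> nodes = new ConcurrentHashMap<>();

    void addNode(String nodeId, int runningContainers) {
        nodes.put(nodeId, new NodeState(runningContainers));
    }

    void removeNode(String nodeId) {
        nodes.remove(nodeId);
    }

    /**
     * Releases one container on the given node.
     * Returns false, instead of throwing, when the node is already gone.
     */
    boolean completedContainer(String nodeId) {
        NodeState node = nodes.get(nodeId); // may be null after a concurrent removal
        if (node == null) {
            return false; // skip the update rather than dereference null
        }
        node.release();
        return true;
    }

    /** Number of running containers, or 0 if the node is unknown. */
    int running(String nodeId) {
        NodeState node = nodes.get(nodeId);
        return node == null ? 0 : node.running();
    }

    public static void main(String[] args) {
        NullSafeCompletion scheduler = new NullSafeCompletion();
        scheduler.addNode("node-1", 2);
        System.out.println(scheduler.completedContainer("node-1"));
        scheduler.removeNode("node-1");
        System.out.println(scheduler.completedContainer("node-1"));
    }
}
```

The lookup result is held in a local variable and checked once, so a removal that happens between the check and the use cannot turn into a NullPointerException. Whether skipping the update is the right recovery in the real scheduler is a separate question that this sketch does not answer.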
      

        Attachments

        1. YARN-5195.01.patch
          1 kB
          sandflee
        2. YARN-5195.02.patch
          4 kB
          sandflee
        3. YARN-5195.03.patch
          4 kB
          sandflee
        4. YARN-5195-branch-2.7.001.patch
          5 kB
          Jonathan Hung
        5. YARN-5195-branch-2.8.001.patch
          4 kB
          Jason Darrell Lowe
        6. YARN-5195-branch-2.8.001.patch
          4 kB
          Jonathan Hung

              People

              • Assignee: sandflee
              • Reporter: Karam Singh (karams)
              • Votes: 0
              • Watchers: 15
