Hadoop YARN / YARN-4230 (sub-task of YARN-1197: Support changing resources of an allocated container)

Increasing container resource while there is no headroom left will cause ResourceManager to crash

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      This issue was found during end-to-end testing of YARN-1197 in YARN-4175.

      When increasing the resource of a container, if there is no headroom left for the user, the ResourceManager crashes with a NullPointerException (NPE).

      The following is the stack trace:

      15/10/05 20:35:21 INFO capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.9375 absoluteUsedCapacity=0.9375 used=<memory:15360, vCores:9> cluster=<memory:16384, vCores:16>
      15/10/05 20:35:49 FATAL resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.IncreaseContainerAllocator.assignContainers(IncreaseContainerAllocator.java:327)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:66)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:474)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:819)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:572)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:423)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1177)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1274)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:134)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691)
              at java.lang.Thread.run(Thread.java:745)
      15/10/05 20:35:49 INFO resourcemanager.ResourceManager: Exiting, bbye..
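
      The failure mode above can be illustrated with a minimal sketch. This is not the code from the attached patch: the class HeadroomGuardSketch, the Assignment type, and the methods tryIncrease, grantUnguarded and grantGuarded are hypothetical stand-ins, and the assumption that the allocation step yields null when the user has no headroom is made only for illustration. It shows how an unchecked result can raise an NPE in a scheduling loop, and how a null guard lets the loop skip the request instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; none of these types are YARN classes.
class HeadroomGuardSketch {
    /** Hypothetical result of one container-increase attempt. */
    static final class Assignment {
        final boolean skipped;
        Assignment(boolean skipped) { this.skipped = skipped; }
    }

    /**
     * Hypothetical allocation step (assumption): returns null when the
     * requested increase does not fit into the user's remaining headroom.
     */
    static Assignment tryIncrease(long headroomMb, long deltaMb) {
        if (deltaMb > headroomMb) {
            return null; // nothing could be assigned
        }
        return new Assignment(false);
    }

    /** Unguarded loop: dereferences the result without a null check. */
    static int grantUnguarded(long headroomMb, List<Long> deltasMb) {
        int granted = 0;
        for (long delta : deltasMb) {
            Assignment a = tryIncrease(headroomMb, delta);
            if (!a.skipped) { // throws NullPointerException when a == null
                granted++;
                headroomMb -= delta;
            }
        }
        return granted;
    }

    /** Guarded loop: a null result means "skip this request". */
    static int grantGuarded(long headroomMb, List<Long> deltasMb) {
        int granted = 0;
        for (long delta : deltasMb) {
            Assignment a = tryIncrease(headroomMb, delta);
            if (a == null) {
                continue; // no headroom for this request, try the next one
            }
            if (!a.skipped) {
                granted++;
                headroomMb -= delta;
            }
        }
        return granted;
    }

    public static void main(String[] args) {
        List<Long> requests = Arrays.asList(1024L);
        // With zero headroom the guarded loop grants nothing and returns.
        System.out.println("guarded grants: " + grantGuarded(0L, requests));
        try {
            grantUnguarded(0L, requests);
        } catch (NullPointerException e) {
            System.out.println("unguarded loop threw NullPointerException");
        }
    }
}
```

      In the ResourceManager the equivalent exception is fatal because it escapes the scheduler event handler, as the FATAL log line in the trace shows; in this sketch it is caught only so that both variants can be run side by side.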
      

        Activity

        mding MENG DING added a comment -

        The fix is simple. Attaching the patch with an added test case.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote  Subsystem        Runtime   Comment
        0     pre-patch        16m 55s   Pre-patch trunk compilation is healthy.
        +1    @author          0m 0s     The patch does not contain any @author tags.
        +1    tests included   0m 0s     The patch appears to include 1 new or modified test files.
        +1    javac            7m 55s    There were no new javac warning messages.
        +1    javadoc          10m 39s   There were no new javadoc warning messages.
        -1    release audit    0m 16s    The applied patch generated 1 release audit warnings.
        +1    checkstyle       0m 49s    There were no new checkstyle issues.
        +1    whitespace       0m 0s     The patch has no lines that end in whitespace.
        +1    install          1m 33s    mvn install still works.
        +1    eclipse:eclipse  0m 35s    The patch built with eclipse:eclipse.
        +1    findbugs         1m 33s    The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1    yarn tests       61m 55s   Tests passed in hadoop-yarn-server-resourcemanager.
                               102m 13s



        Subsystem                                    Report/Notes
        Patch URL                                    http://issues.apache.org/jira/secure/attachment/12765197/YARN-4230.1.patch
        Optional Tests                               javadoc javac unit findbugs checkstyle
        git revision                                 trunk / 874c8ed
        Release Audit                                https://builds.apache.org/job/PreCommit-YARN-Build/9361/artifact/patchprocess/patchReleaseAuditProblems.txt
        hadoop-yarn-server-resourcemanager test log  https://builds.apache.org/job/PreCommit-YARN-Build/9361/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results                                 https://builds.apache.org/job/PreCommit-YARN-Build/9361/testReport/
        Java                                         1.7.0_55
        uname                                        Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output                               https://builds.apache.org/job/PreCommit-YARN-Build/9361/console

        This message was automatically generated.

        jianhe Jian He added a comment -

        looks good, committing.

        jianhe Jian He added a comment -

        Committed to trunk and branch-2, thanks MENG DING!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #8612 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8612/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #1250 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1250/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #513 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/513/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2459 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2459/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #525 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/525/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #485 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/485/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2423 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2423/)
        YARN-4230. RM crashes with NPE when increasing container resource if (jianhe: rev 9849c8b3865c7c9c9be81ae0ef8f29caa1d5f881)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/IncreaseContainerAllocator.java

          People

          • Assignee: mding MENG DING
          • Reporter: mding MENG DING
          • Votes: 0
          • Watchers: 9
