Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-3005

MR app hangs because of a NPE in ResourceManager

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: mrv2
    • Labels:
      None

      Description

      The app hangs and it turns out to be a NPE in ResourceManager. This happened two of five times on Karam Singh's sort runs on a big cluster.

      2011-09-12 15:02:33,715 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:244)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:206)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.allocate(SchedulerApp.java:230)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1120)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignNodeLocalContainers(LeafQueue.java:961)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:933)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:725)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:577)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:509)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:579)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:620)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:75)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:266)
              at java.lang.Thread.run(Thread.java:619)
      
      1. MAPREDUCE-3005.patch
        9 kB
        Arun C Murthy
      2. MAPREDUCE-3005.patch
        2 kB
        Arun C Murthy
      3. MAPREDUCE-3005-20110914.txt
        1 kB
        Vinod Kumar Vavilapalli

        Activity

        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #801 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/801/)
        MAPREDUCE-3005. Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #801 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/801/ ) MAPREDUCE-3005 . Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-0.23-Build #5 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/5/)
        Merge -r 1170878:1170879 from trunk to branch-0.23 to fix MAPREDUCE-3005.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170880
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-0.23-Build #5 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/5/ ) Merge -r 1170878:1170879 from trunk to branch-0.23 to fix MAPREDUCE-3005 . acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170880 Files : /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #831 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/831/)
        MAPREDUCE-3005. Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #831 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/831/ ) MAPREDUCE-3005 . Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #7 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/7/)
        Merge -r 1170878:1170879 from trunk to branch-0.23 to fix MAPREDUCE-3005.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170880
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-0.23-Build #7 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/7/ ) Merge -r 1170878:1170879 from trunk to branch-0.23 to fix MAPREDUCE-3005 . acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170880 Files : /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #901 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/901/)
        MAPREDUCE-3005. Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #901 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/901/ ) MAPREDUCE-3005 . Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #966 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/966/)
        MAPREDUCE-3005. Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #966 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/966/ ) MAPREDUCE-3005 . Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #889 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/889/)
        MAPREDUCE-3005. Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints.

        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #889 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/889/ ) MAPREDUCE-3005 . Fix both FifoScheduler and CapacityScheduler to correctly enforce locality constraints. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170879 Files : /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
        Hide
        Arun C Murthy added a comment -

        I just committed this.

        Show
        Arun C Murthy added a comment - I just committed this.
        Hide
        Mahadev konar added a comment -

        +1 looks good to me!

        Show
        Mahadev konar added a comment - +1 looks good to me!
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12494527/MAPREDUCE-3005.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/740//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/740//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494527/MAPREDUCE-3005.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/740//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/740//console This message is automatically generated.
        Hide
        Arun C Murthy added a comment -

        Same fix, after testing. Also added a unit test.

        Show
        Arun C Murthy added a comment - Same fix, after testing. Also added a unit test.
        Hide
        Arun C Murthy added a comment -

        Preliminary patch, I'll test on cluster, add unit tests etc.

        Show
        Arun C Murthy added a comment - Preliminary patch, I'll test on cluster, add unit tests etc.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        My averseness to null-checks is vindicated

        Arun, can you please provide a patch? Thanks!

        Show
        Vinod Kumar Vavilapalli added a comment - My averseness to null-checks is vindicated Arun, can you please provide a patch? Thanks!
        Hide
        Arun C Murthy added a comment -

        Thinking about it some more...

        We also need to fix LeafQueue.canAssign to ensure it returns 'false' for node-local allocations when rackRequest is null or rackRequest.getNumContainers() == 0. This prevents allocations on nodes which are on racks on which apps don't need containers.

        Show
        Arun C Murthy added a comment - Thinking about it some more... We also need to fix LeafQueue.canAssign to ensure it returns 'false' for node-local allocations when rackRequest is null or rackRequest.getNumContainers() == 0. This prevents allocations on nodes which are on racks on which apps don't need containers.
        Hide
        Arun C Murthy added a comment -

        +1 for the patch.

        Normally, the MR AM shouldn't have outstanding node-local requests without rack-local requests. But this fix is valid for lots of scenarios.

        Show
        Arun C Murthy added a comment - +1 for the patch. Normally, the MR AM shouldn't have outstanding node-local requests without rack-local requests. But this fix is valid for lots of scenarios.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        I am generally averse to null checks, but this one seems valid, Arun can you please confirm this fix? We do remove rack-entries from the requests-table when there are no longer any requests against that rack (in the same method, AppSchedulingInfo.allocateNodeLocal(), but in previous scheduling cycle).

        Will get this verified on the cluster. Will see if I can add a test too.

        Show
        Vinod Kumar Vavilapalli added a comment - I am generally averse to null checks, but this one seems valid, Arun can you please confirm this fix? We do remove rack-entries from the requests-table when there are no longer any requests against that rack (in the same method, AppSchedulingInfo.allocateNodeLocal(), but in previous scheduling cycle). Will get this verified on the cluster. Will see if I can add a test too.

          People

          • Assignee:
            Arun C Murthy
            Reporter:
            Vinod Kumar Vavilapalli
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development