Hadoop YARN / YARN-4975

Fair Scheduler: exception thrown when a parent queue marked 'parent' has configured child queues

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: fairscheduler
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We upgraded our clusters from 2.4.1 to 2.7.2 and saw the following exception in the RM logs:

      Caused by: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Both <reservation> and type="parent" found for queue root.adhoc which is unsupported
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:519)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:352)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1440)
      

      The exception suggests that we configured a <reservation> element, but we did not. The issue is that AllocationFileLoaderService#loadQueue assumes that a queue marked as type="parent" cannot have configured child queues. That is a problem when a queue is marked as 'parent' while it has no configured child queues and child queues are added to it later.
      The exception message is also misleading, since we have not configured 'reservation' anywhere.

      How to reproduce:
      Run the Fair Scheduler with the following queue configuration:

      <queue name="p" type="parent">
        <weight>10</weight>
        <maxRunningApps>300</maxRunningApps>
        <queue name="c">
          <weight>3</weight>
        </queue>
      </queue>
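
      The rule the report asks for can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code in AllocationFileLoaderService: the class QueueConfigCheck and its methods parse, hasChildElement, and validate are hypothetical names, and the error text borrows the wording proposed in the review comments on this issue. A queue counts as a parent if it has type="parent" or contains child queue elements, and only the combination of a parent queue with a <reservation> element is rejected.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch; not the actual AllocationFileLoaderService logic. */
final class QueueConfigCheck {

  /** Parses an XML snippet and returns its root element. */
  public static Element parse(String xml) throws Exception {
    return DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
        .getDocumentElement();
  }

  /** True if the element has a direct child element with the given tag. */
  public static boolean hasChildElement(Element queue, String tag) {
    NodeList children = queue.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node n = children.item(i);
      if (n instanceof Element && tag.equals(((Element) n).getTagName())) {
        return true;
      }
    }
    return false;
  }

  /** Rejects a queue only if it is a parent AND has a reservation element. */
  public static void validate(Element queue) {
    boolean isParent = "parent".equals(queue.getAttribute("type"))
        || hasChildElement(queue, "queue");
    boolean isReservable = hasChildElement(queue, "reservation");
    if (isParent && isReservable) {
      throw new IllegalArgumentException("The configuration settings for "
          + queue.getAttribute("name") + " are invalid. A queue element that "
          + "contains child queue elements or that has the type=\"parent\" "
          + "attribute cannot also include a reservation element.");
    }
  }

  public static void main(String[] args) throws Exception {
    // The configuration from this report: a parent with a child, no reservation.
    validate(parse("<queue name=\"p\" type=\"parent\"><weight>10</weight>"
        + "<queue name=\"c\"><weight>3</weight></queue></queue>"));
    System.out.println("parent with child queue: accepted");

    // A parent queue that also declares a reservation is rejected.
    try {
      validate(parse("<queue name=\"p\" type=\"parent\"><reservation/></queue>"));
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

      Under this sketch, the reproduction config is accepted because it contains no <reservation> element, which is the behavior the report expects.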
      
      1. YARN-4975.001.patch
        7 kB
        Yufei Gu
      2. YARN-4975.002.patch
        8 kB
        Yufei Gu

          Activity

          yufeigu Yufei Gu added a comment -

          Thanks Daniel Templeton for the review and commit!

          templedf Daniel Templeton added a comment -

          Never mind. I pulled in YARN-4997 and YARN-6000, and now this patch applies to branch-2 cleanly.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11178 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11178/)
          YARN-4975. Fair Scheduler: exception thrown when a parent queue marked (templedf: rev f85b74ccf9f1c1c1444cc00750b03468cbf40fb9)

          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
          templedf Daniel Templeton added a comment -

          Thanks for the patch, Yufei Gu! Committed to trunk. If you want to bring it back into branch-2, I'll need a branch-2 patch.

          yufeigu Yufei Gu added a comment -

          I ran the failed tests locally. The failures are unrelated to this patch.

          hadoopqa Hadoop QA added a comment -
          -1 overall

          | Vote | Subsystem  | Runtime | Comment |
          | 0    | reexec     | 0m 22s  | Docker mode activated. |
          | +1   | @author    | 0m 0s   | The patch does not contain any @author tags. |
          | +1   | test4tests | 0m 0s   | The patch appears to include 1 new or modified test files. |
          | +1   | mvninstall | 13m 57s | trunk passed |
          | +1   | compile    | 0m 33s  | trunk passed |
          | +1   | checkstyle | 0m 21s  | trunk passed |
          | +1   | mvnsite    | 0m 34s  | trunk passed |
          | +1   | mvneclipse | 0m 14s  | trunk passed |
          | +1   | findbugs   | 1m 0s   | trunk passed |
          | +1   | javadoc    | 0m 21s  | trunk passed |
          | +1   | mvninstall | 0m 37s  | the patch passed |
          | +1   | compile    | 0m 36s  | the patch passed |
          | +1   | javac      | 0m 36s  | the patch passed |
          | +1   | checkstyle | 0m 20s  | the patch passed |
          | +1   | mvnsite    | 0m 40s  | the patch passed |
          | +1   | mvneclipse | 0m 13s  | the patch passed |
          | +1   | whitespace | 0m 0s   | The patch has no whitespace issues. |
          | +1   | findbugs   | 1m 15s  | the patch passed |
          | +1   | javadoc    | 0m 20s  | the patch passed |
          | -1   | unit       | 41m 30s | hadoop-yarn-server-resourcemanager in the patch failed. |
          | +1   | asflicense | 0m 19s  | The patch does not generate ASF License warnings. |
          |      |            | 64m 32s | |

          | Reason             | Tests |
          | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
          |                    | hadoop.yarn.server.resourcemanager.TestResourceTrackerService |

          | Subsystem      | Report/Notes |
          | Docker         | Image:yetus/hadoop:a9ad5d6 |
          | JIRA Issue     | YARN-4975 |
          | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12849388/YARN-4975.002.patch |
          | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
          | uname          | Linux b1904647aadc 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
          | Build tool     | maven |
          | Personality    | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
          | git revision   | trunk / 425a7e5 |
          | Default Java   | 1.8.0_121 |
          | findbugs       | v3.0.0 |
          | unit           | https://builds.apache.org/job/PreCommit-YARN-Build/14755/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
          | Test Results   | https://builds.apache.org/job/PreCommit-YARN-Build/14755/testReport/ |
          | modules        | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14755/console |
          | Powered by     | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

          This message was automatically generated.

          templedf Daniel Templeton added a comment -

          +1 pending Jenkins' approval.

          yufeigu Yufei Gu added a comment -

          Thanks, Daniel Templeton, for the review. The comments make sense to me. I uploaded patch 002 to address them.

          templedf Daniel Templeton added a comment -

          Changes look good to me. Couple of nits:

          • This message: "Can't mark <reservation> to a parent queue: " could be clearer; maybe "The configuration settings for " + queue + " are invalid. A queue element that contains child queue elements or that has the type="parent" attribute cannot also include a reservation element."
          • I'm not a huge fan of expected exceptions in tests. I'd rather you catch the exception and make sure the exception text is from the right exception. With expected exceptions, you could get the exception for the wrong reason and still pass.
          • In the last test, it would be nice to do a couple of basic asserts to confirm that the config was instantiated correctly, i.e. check that the parent and child queues exist. It's redundant, but better to be safe.
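
          The second nit can be illustrated with a small, framework-free Java sketch. Everything in it is hypothetical and only stands in for the real loader and test: the ConfigException type, the load method, and its two failure modes are assumptions made for illustration. It shows why checking the exception text is stricter than merely expecting an exception of the right type.

```java
/** Hypothetical sketch of "catch and check the message" versus "expect any exception". */
final class ExpectedExceptionSketch {

  /** Hypothetical stand-in for the loader's exception type. */
  static final class ConfigException extends Exception {
    ConfigException(String message) {
      super(message);
    }
  }

  /** Hypothetical loader that can fail for two different reasons. */
  public static void load(String queue, boolean typeParent, boolean hasReservation)
      throws ConfigException {
    if (queue == null || queue.isEmpty()) {
      throw new ConfigException("Queue name is missing");
    }
    if (typeParent && hasReservation) {
      throw new ConfigException("The configuration settings for " + queue
          + " are invalid. A parent queue cannot also include a reservation element.");
    }
  }

  /** True only when load fails AND the failure is the reservation check. */
  public static boolean rejectsReservationOnParent(String queue) {
    try {
      load(queue, true, true);
      return false; // no exception at all: the check did not fire
    } catch (ConfigException e) {
      // An "expected exception" style test would stop at the type;
      // here the message must also come from the right check.
      return e.getMessage().contains("reservation element");
    }
  }

  public static void main(String[] args) {
    System.out.println(rejectsReservationOnParent("root.adhoc")); // true
    System.out.println(rejectsReservationOnParent(""));           // false: wrong reason
  }
}
```

          With an empty queue name, load still throws a ConfigException, so a test that only expects the exception type would pass even though the reservation check never ran.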
          hadoopqa Hadoop QA added a comment -
          +1 overall

          | Vote | Subsystem  | Runtime | Comment |
          | 0    | reexec     | 0m 18s  | Docker mode activated. |
          | +1   | @author    | 0m 0s   | The patch does not contain any @author tags. |
          | +1   | test4tests | 0m 0s   | The patch appears to include 1 new or modified test files. |
          | +1   | mvninstall | 14m 25s | trunk passed |
          | +1   | compile    | 0m 39s  | trunk passed |
          | +1   | checkstyle | 0m 26s  | trunk passed |
          | +1   | mvnsite    | 0m 42s  | trunk passed |
          | +1   | mvneclipse | 0m 18s  | trunk passed |
          | +1   | findbugs   | 1m 8s   | trunk passed |
          | +1   | javadoc    | 0m 23s  | trunk passed |
          | +1   | mvninstall | 0m 36s  | the patch passed |
          | +1   | compile    | 0m 37s  | the patch passed |
          | +1   | javac      | 0m 37s  | the patch passed |
          | +1   | checkstyle | 0m 20s  | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 43 unchanged - 1 fixed = 43 total (was 44) |
          | +1   | mvnsite    | 0m 37s  | the patch passed |
          | +1   | mvneclipse | 0m 16s  | the patch passed |
          | +1   | whitespace | 0m 0s   | The patch has no whitespace issues. |
          | +1   | findbugs   | 1m 14s  | the patch passed |
          | +1   | javadoc    | 0m 21s  | the patch passed |
          | +1   | unit       | 42m 0s  | hadoop-yarn-server-resourcemanager in the patch passed. |
          | +1   | asflicense | 0m 18s  | The patch does not generate ASF License warnings. |
          |      |            | 66m 1s  | |

          | Subsystem      | Report/Notes |
          | Docker         | Image:yetus/hadoop:a9ad5d6 |
          | JIRA Issue     | YARN-4975 |
          | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12846626/YARN-4975.001.patch |
          | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
          | uname          | Linux 3db8c3b80b0e 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
          | Build tool     | maven |
          | Personality    | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
          | git revision   | trunk / c18590f |
          | Default Java   | 1.8.0_111 |
          | findbugs       | v3.0.0 |
          | Test Results   | https://builds.apache.org/job/PreCommit-YARN-Build/14624/testReport/ |
          | modules        | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/14624/console |
          | Powered by     | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

          This message was automatically generated.

          yufeigu Yufei Gu added a comment -

          YARN-2738 makes two assumptions:
          1. Only leaf queues are reservable.
          2. A queue marked as "parent" is a leaf, i.e. it has no configured child queues.

          The second assumption does not hold here.


            People

            • Assignee:
              yufeigu Yufei Gu
            • Reporter:
              ashwinshankar77 Ashwin Shankar
            • Votes:
              0
            • Watchers:
              9
