  1. Hadoop YARN
  2. YARN-3764

CapacityScheduler should forbid moving LeafQueue from one parent to another

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.1, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Currently, CapacityScheduler does not correctly handle a reinitialization that moves a LeafQueue to a different parent. For example:

      Given the initial queue structure:

          root
            |
            a (100)
          /   \
         x     y
        (50)   (50)
      

      and a reinitialization using the following structure:

           root
           /   \ 
      (50)a     x (50)
          |
          y
         (100)
      

      the actual queue structure after reinitialization is:

           root
          /    \
         a (50) x (50)
        /  \
       x    y
      (50)  (100)
      

      We should forbid admins from doing that.
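
The resulting structure is what one would expect from a merge that updates or adds child queues but never detaches existing ones. The following is a toy model of such a merge, not the Hadoop implementation; the class `QueueMergeDemo`, its `Node` type, and the `merge` and `paths` methods are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an add-or-update merge; NOT the Hadoop implementation. */
final class QueueMergeDemo {

  /** A minimal queue node: a short name plus named children. */
  static final class Node {
    final String name;
    final Map<String, Node> children = new LinkedHashMap<>();

    Node(String name, Node... kids) {
      this.name = name;
      for (Node k : kids) {
        children.put(k.name, k);
      }
    }
  }

  /** Updates or adds children found in fresh; never detaches old ones. */
  static void merge(Node old, Node fresh) {
    for (Node child : fresh.children.values()) {
      Node existing = old.children.get(child.name);
      if (existing != null) {
        merge(existing, child);              // same name: update in place
      } else {
        old.children.put(child.name, child); // new name: attach
      }
    }
  }

  /** Collects every queue path in the tree, depth first. */
  static List<String> paths(Node node, String prefix, List<String> out) {
    String path = prefix.isEmpty() ? node.name : prefix + "." + node.name;
    out.add(path);
    for (Node child : node.children.values()) {
      paths(child, path, out);
    }
    return out;
  }

  public static void main(String[] args) {
    // Old tree: root -> a -> { x, y }
    Node old = new Node("root", new Node("a", new Node("x"), new Node("y")));
    // New tree: root -> { a -> y, x }
    Node fresh = new Node("root", new Node("a", new Node("y")), new Node("x"));
    merge(old, fresh);
    // prints [root, root.a, root.a.x, root.a.y, root.x]
    System.out.println(paths(old, "", new ArrayList<>()));
  }
}
```

In this model the stale `root.a.x` survives next to the newly added `root.x`, which matches the structure shown above.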

        Activity

        leftnoteasy Wangda Tan added a comment -

        CS's reinitialize logic creates new queues but only copies their configuration properties to the old queues; the new queues are discarded after reinitialization.

        A comprehensive fix is to copy the old queue's runtime information (runningApplications, etc.) to the new queue and discard the old queue after reinitialization.

        A short-term fix is to disallow removing a queue from under its parentQueue. In other words, CS will throw an exception if a LeafQueue is moved from one parent to another. I prefer the comprehensive fix for 2.8.0 and the short-term fix for 2.7.1/2.6.1 (if required).

        Thoughts?
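
As a rough illustration of the short-term fix, the check below rejects a refresh in which a queue that exists in both configurations ends up under a different path. This is a minimal sketch under stated assumptions, not the attached patch: the class `QueueMoveCheck`, the method `validateNoMovedQueues`, the map-based representation, and the exception message are all hypothetical.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Sketch of a "no moved queues" check; all names here are hypothetical. */
final class QueueMoveCheck {

  /**
   * Both maps go from short queue name to full queue path.
   * Throws if a queue present in both maps has a different path.
   */
  static void validateNoMovedQueues(Map<String, String> oldPaths,
      Map<String, String> newPaths) throws IOException {
    for (Map.Entry<String, String> e : oldPaths.entrySet()) {
      String newPath = newPaths.get(e.getKey());
      if (newPath != null && !newPath.equals(e.getValue())) {
        throw new IOException(e.getKey() + " is moved from " + e.getValue()
            + " to " + newPath + " after refresh, which is not allowed.");
      }
    }
  }

  public static void main(String[] args) {
    Map<String, String> oldPaths = new HashMap<>();
    oldPaths.put("a", "root.a");
    oldPaths.put("x", "root.a.x");
    oldPaths.put("y", "root.a.y");

    Map<String, String> newPaths = new HashMap<>();
    newPaths.put("a", "root.a");
    newPaths.put("x", "root.x");   // x now sits directly under root
    newPaths.put("y", "root.a.y");

    try {
      validateNoMovedQueues(oldPaths, newPaths);
      System.out.println("refresh accepted");
    } catch (IOException ex) {
      System.out.println("refresh rejected: " + ex.getMessage());
    }
  }
}
```

Comparing full paths rather than parent objects keeps the sketch independent of how the scheduler stores its queue tree. A queue that is absent from the new configuration is not covered by this check.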

        leftnoteasy Wangda Tan added a comment -

        The following test case reproduces this issue:

          @Test
          public void testQueueParsingWithMoveQueue()
              throws IOException {
            // Initial hierarchy: root -> a (100) -> { x (50), y (50) }
            YarnConfiguration conf = new YarnConfiguration();
            CapacitySchedulerConfiguration csConf =
                new CapacitySchedulerConfiguration(conf);
            csConf.setQueues("root", new String[] { "a" });
            csConf.setQueues("root.a", new String[] { "x", "y" });
            csConf.setCapacity("root.a", 100);
            csConf.setCapacity("root.a.x", 50);
            csConf.setCapacity("root.a.y", 50);

            // nodeLabelManager is a field of the enclosing test class.
            CapacityScheduler capacityScheduler = new CapacityScheduler();
            RMContextImpl rmContext =
                new RMContextImpl(null, null, null, null, null, null,
                    new RMContainerTokenSecretManager(csConf),
                    new NMTokenSecretManagerInRM(csConf),
                    new ClientToAMTokenSecretManagerInRM(), null);
            rmContext.setNodeLabelManager(nodeLabelManager);
            capacityScheduler.setConf(csConf);
            capacityScheduler.setRMContext(rmContext);
            capacityScheduler.init(csConf);
            capacityScheduler.start();

            // New hierarchy: x moves from root.a to root; y stays under a.
            csConf.setQueues("root", new String[] { "a", "x" });
            csConf.setQueues("root.a", new String[] { "y" });
            csConf.setCapacity("root.x", 50);
            csConf.setCapacity("root.a", 50);
            csConf.setCapacity("root.a.y", 100);

            capacityScheduler.reinitialize(csConf, rmContext);

            // "a" should now have only "y" as a child; with the bug the
            // stale "x" is still attached, so this assertion fails.
            Assert.assertEquals(1, ((ParentQueue) capacityScheduler.getQueue("a"))
                .getChildQueues().size());
          }
        
        vinodkv Vinod Kumar Vavilapalli added a comment -

        "A short-term fix is to disallow removing a queue from under its parentQueue."

        We never supported removing queues, so this is not just a short-term fix; it is the right fix for now.

        leftnoteasy Wangda Tan added a comment -

        Vinod Kumar Vavilapalli, agreed. Updated the title/description; I will search for or file a separate ticket for moving/removing queues.

        leftnoteasy Wangda Tan added a comment -

        Attached initial patch for review.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote  Subsystem        Runtime   Comment
        -1    pre-patch        20m 52s   Findbugs (version ) appears to be broken on trunk.
        +1    @author          0m 0s     The patch does not contain any @author tags.
        +1    tests included   0m 0s     The patch appears to include 1 new or modified test files.
        +1    javac            9m 42s    There were no new javac warning messages.
        +1    javadoc          10m 48s   There were no new javadoc warning messages.
        +1    release audit    0m 21s    The applied patch does not increase the total number of release audit warnings.
        +1    checkstyle       0m 41s    There were no new checkstyle issues.
        -1    whitespace       0m 0s     The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix.
        +1    install          1m 32s    mvn install still works.
        +1    eclipse:eclipse  0m 34s    The patch built with eclipse:eclipse.
        +1    findbugs         1m 26s    The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1    yarn tests       54m 5s    Tests passed in hadoop-yarn-server-resourcemanager.
                               100m 6s



        Subsystem                                     Report/Notes
        Patch URL                                     http://issues.apache.org/jira/secure/attachment/12737374/YARN-3764.1.patch
        Optional Tests                                javadoc javac unit findbugs checkstyle
        git revision                                  trunk / bc85959
        whitespace                                    https://builds.apache.org/job/PreCommit-YARN-Build/8186/artifact/patchprocess/whitespace.txt
        hadoop-yarn-server-resourcemanager test log   https://builds.apache.org/job/PreCommit-YARN-Build/8186/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results                                  https://builds.apache.org/job/PreCommit-YARN-Build/8186/testReport/
        Java                                          1.7.0_55
        uname                                         Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output                                https://builds.apache.org/job/PreCommit-YARN-Build/8186/console

        This message was automatically generated.

        jianhe Jian He added a comment -

        Looks good, +1

        jianhe Jian He added a comment -

        Committed to trunk, branch-2, and branch-2.7. Thanks, Wangda!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7966 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7966/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #219 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/219/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #949 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/949/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Hdfs-trunk #2147 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2147/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #208 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/208/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2165 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2165/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #217 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/217/)
        YARN-3764. CapacityScheduler should forbid moving LeafQueue from one parent to another. Contributed by Wangda Tan (jianhe: rev 6ad4e59cfc111a92747fdb1fb99cc6378044832a)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
        • hadoop-yarn-project/CHANGES.txt

          People

          • Assignee:
            leftnoteasy Wangda Tan
          • Reporter:
            leftnoteasy Wangda Tan
          • Votes:
            0
          • Watchers:
            9
