Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: capacityscheduler
    • Labels:
      None
    • Target Version/s:

      Description

      We would like to have the capability (same as the Fair Scheduler has) to move applications between queues.

      We have made a baseline implementation and tests to start with - and we would like the community to review, come up with suggestions and finally have this contributed.

      The current implementation is available for 2.4.1 - so the first thing is that we'd need to identify the target version as there are differences between 2.4.* and 3.* interfaces.

      The background is described at http://blog.sequenceiq.com/blog/2014/07/02/move-applications-between-queues/ and the baseline implementation and tests are at:

      https://github.com/sequenceiq/hadoop-common/blob/branch-2.4.1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/a/ExtendedCapacityScheduler.java#L924

      https://github.com/sequenceiq/hadoop-common/blob/branch-2.4.1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/a/TestExtendedCapacitySchedulerAppMove.java

      1. YARN-2248-1.patch
        25 kB
        Krisztian Horvath
      2. YARN-2248-2.patch
        32 kB
        Krisztian Horvath
      3. YARN-2248-3.patch
        33 kB
        Krisztian Horvath

          Activity

          keyki Krisztian Horvath added a comment -

          We will, and we'll let you know if anything seems abnormal.

          subru Subru Krishnan added a comment -

          [~jmatyas], Krisztian Horvath, we (myself and Carlo Curino) did some testing on our side. It will be good if you guys also take a look and validate it. Thanks.

          jianhe Jian He added a comment -

          I just committed YARN-2378. Thanks Krisztian Horvath and [~jmatyas] for your contributions!

          matyix Janos Matyas added a comment -

          Sounds good - let us know if we can help anyhow - we use this feature internally, so once you submit a patch we can check/test on our side as well.

          curino Carlo Curino added a comment -

          I work with Subru Krishnan on this... We can run this further on our test clusters (to increase our confidence level before commit); we have a Gridmix harness that, among other things, ends up exercising this (once clean, we will release it as part of YARN-1051).

          More generally, I agree that merging the two is the best approach, and committing in 2.6.0 sounds good.

          leftnoteasy Wangda Tan added a comment -

          Krisztian Horvath, I agree we should get move-app committed in 2.6.0.

          keyki Krisztian Horvath added a comment -

          Is there a chance we can get this committed in 2.6.0?

          subru Subru Krishnan added a comment -

          Thanks Krisztian Horvath. I just added all your test cases and ran them & they do pass with my patch including the queue metrics test. The test cases are quite useful, thanks again.

          keyki Krisztian Horvath added a comment -

          Hi,

          As long as we don't break the functionality, we can merge them and try to take the best out of both, so yes. Have you tried your patch with the queue metrics test yet?

          subru Subru Krishnan added a comment -

          Hi Krisztian Horvath, we have been working on adding support for move in the Capacity Scheduler for some time as part of YARN-2378 (originally YARN-1707), and Varun Vasudev was kind enough to point out that you were doing the same. To prevent duplication, I suggest we merge our work.

          I looked at your patch & we are doing essentially the same thing (which was good validation for both of us). Based on Wangda Tan's feedback, I think it would be easiest if I merged your metrics test with the patch I have. Would that be OK?

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12654481/YARN-2248-3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4216//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4216//console

          This message is automatically generated.

          keyki Krisztian Horvath added a comment -

          The order in which the live containers' resources are released matters, because otherwise the user's resource usage can drop below <0, 0>. To avoid this, I changed the order. More metrics tests added.

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12654330/YARN-2248-2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4215//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4215//console

          This message is automatically generated.

          keyki Krisztian Horvath added a comment -

          Can anyone take a look at the patch? I've some concerns regarding the live containers.

          Movement steps:

          1, Check that the target queue has enough capacity, plus some further validation; throw an exception otherwise (same as with FairScheduler)
          2, Remove the app attempt from the current queue
          3, Release the resources used by live containers on this queue
          4, Remove the application from the queue and its parents up to root (--numApplications)
          5, Update QueueMetrics
          6, Set the new queue in the application
          7, Allocate the resources consumed by the live containers on the new queue (basically the resource usage is moved here from the original queue)
          8, Submit a new app attempt
          9, Add the application (++numApplications)
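
The movement steps above can be sketched as a small toy model. This is only an illustration under stated assumptions, not the code from the attached patches: `Res`, `Queue`, `App`, `submit`, and `moveApplication` are hypothetical names rather than CapacityScheduler classes, queues are flat (the walk up to root is left out), and the app-attempt and QueueMetrics handling is reduced to a single counter.

```java
// Toy model of the move steps; none of these types are real YARN classes.
class MoveAppSketch {

    /** Hypothetical resource vector: memory in MB and virtual cores. */
    static final class Res {
        final int memMb;
        final int vcores;

        Res(int memMb, int vcores) {
            this.memMb = memMb;
            this.vcores = vcores;
        }

        Res plus(Res o) { return new Res(memMb + o.memMb, vcores + o.vcores); }

        Res minus(Res o) { return new Res(memMb - o.memMb, vcores - o.vcores); }

        boolean fitsIn(Res limit) { return memMb <= limit.memMb && vcores <= limit.vcores; }

        boolean isNegative() { return memMb < 0 || vcores < 0; }

        @Override
        public String toString() { return "<" + memMb + ", " + vcores + ">"; }
    }

    /** Hypothetical leaf queue; parent queues up to root are left out. */
    static final class Queue {
        final String name;
        final Res capacity;
        Res used = new Res(0, 0);
        int numApplications = 0;

        Queue(String name, Res capacity) {
            this.name = name;
            this.capacity = capacity;
        }

        void allocate(Res r) { used = used.plus(r); }

        void release(Res r) {
            Res after = used.minus(r);
            // Guard against usage dropping below <0, 0>.
            if (after.isNegative()) {
                throw new IllegalStateException("usage of " + name + " would drop below <0, 0>");
            }
            used = after;
        }
    }

    /** Hypothetical application holding the total resources of its live containers. */
    static final class App {
        final String id;
        final Res liveUsage;
        Queue queue;

        App(String id, Res liveUsage, Queue queue) {
            this.id = id;
            this.liveUsage = liveUsage;
            this.queue = queue;
        }
    }

    /** Places a new application on a queue and charges its live usage there. */
    static App submit(String id, Queue queue, Res liveUsage) {
        App app = new App(id, liveUsage, queue);
        queue.allocate(liveUsage);
        queue.numApplications++;
        return app;
    }

    static void moveApplication(App app, Queue target) {
        Queue source = app.queue;
        if (source == target) {
            return; // nothing to move
        }
        // Step 1: validate before touching any state, so a rejected move changes nothing.
        if (!target.used.plus(app.liveUsage).fitsIn(target.capacity)) {
            throw new IllegalStateException("queue " + target.name + " lacks capacity for " + app.id);
        }
        // Steps 2-5: detach from the source queue and release the live containers' usage.
        source.release(app.liveUsage);
        source.numApplications--;
        // Step 6: point the application at its new queue.
        app.queue = target;
        // Step 7: charge the same usage to the target queue.
        target.allocate(app.liveUsage);
        // Steps 8-9: register the application on the target queue.
        target.numApplications++;
    }

    public static void main(String[] args) {
        Queue a = new Queue("a", new Res(8192, 8));
        Queue b = new Queue("b", new Res(4096, 4));
        App app = submit("app_1", a, new Res(2048, 2));
        moveApplication(app, b);
        System.out.println(a.name + " used=" + a.used + " apps=" + a.numApplications);
        System.out.println(b.name + " used=" + b.used + " apps=" + b.numApplications);
    }
}
```

Doing the validation first means a rejected move leaves both queues untouched. The guard in `release` makes an ordering mistake fail loudly instead of silently producing negative usage.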

          keyki Krisztian Horvath added a comment -

          I've found some issues with queue metrics update

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12654072/YARN-2248-1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4203//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4203//console

          This message is automatically generated.

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Do you mind attaching a patch against latest YARN trunk? Thanks..

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Janos Matyas, tx for opening this. Assigning it to you..


            People

            • Assignee:
              matyix Janos Matyas
            • Reporter:
              matyix Janos Matyas
            • Votes:
              0
            • Watchers:
              15
