Details

      Description

      Subtask of YARN-2139.

      This should include

      • Add API support for disk I/O as a third resource type.
      • NM should report this information to the RM.
      • RM should consider this to avoid over-allocation.

      Attachments

      1. YARN-2618-7.patch (104 kB, Wei Yan)
      2. YARN-2618-6.patch (97 kB, Wei Yan)
      3. YARN-2618-5.patch (97 kB, Wei Yan)
      4. YARN-2618-4.patch (97 kB, Wei Yan)
      5. YARN-2618-3.patch (65 kB, Wei Yan)
      6. YARN-2618-2.patch (52 kB, Wei Yan)
      7. YARN-2618-1.patch (47 kB, Wei Yan)

        Activity

        kasha Karthik Kambatla added a comment -

        Thanks Wei. Quickly skimmed through the patch. High-level comments:

        1. We should annotate all configuration changes as @Private. Users don't need to see them yet.
        2. Add unit tests to verify the patch indeed avoids over-allocating disk resources.
        ywskycn Wei Yan added a comment -

        Thanks, Karthik Kambatla. Uploaded a new patch that addresses the comments.
        The existing patch works well with FairScheduler. But for FifoScheduler and CapacityScheduler, it cannot avoid over-allocating disk resources. This is because both Fifo and Capacity only consider memory capacity when assigning containers to nodes, and allow CPU resources to be over-consumed. Jian He, do you know of any special reason why CapacityScheduler supports over-consuming CPU resources?

        leftnoteasy Wangda Tan added a comment -

        Wei Yan, Capacity Scheduler already supports multi-dimensional resources through DominantResourceCalculator, and it should work once DRC is updated to support disk. The following statement is not true:

        "This is because both Fifo and Capacity only consider memory capacity when assigning containers to nodes"

        See CapacitySchedulerConfiguration.getResourceCalculator.

        ywskycn Wei Yan added a comment -

        Thanks for pointing that out, Wangda Tan. I'll check that and update the test cases for Capacity.

        kasha Karthik Kambatla added a comment -

        Thanks Wei. Comments:

        1. I think we should avoid adding new methods to BuilderUtils. Resource.newInstance does the necessary job. But for the number of places BuilderUtils is used, we want to get rid of it.
        2. Add vdisks to ResourceType.
        3. DominantResourceCalculator#getResourceAsValue: We should probably leave the method's name as is, and change the boolean argument to either an int or the ResourceType itself.
        4. In DRC#computeAvailableContainers, we should add check-for-zero for memory and cpu as well. Similarly, DRC#ratio should have zero-checks for memory and cpu as well.
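
The zero-checks suggested in item 4 could look roughly like the sketch below. It is not taken from the patch: the `Res` class, the `fit` helper, and the class name are hypothetical stand-ins for YARN's `Resource` and `DominantResourceCalculator`, and the sketch assumes that a zero requirement in one dimension should simply place no limit on that dimension.

```java
// Illustrative sketch only; not the code from the YARN-2618 patch.
final class ZeroSafeCalculatorSketch {

    /** Hypothetical three-dimensional resource vector: memory (MB), vcores, vdisks. */
    static final class Res {
        final int memory;
        final int vcores;
        final int vdisks;

        Res(int memory, int vcores, int vdisks) {
            this.memory = memory;
            this.vcores = vcores;
            this.vdisks = vdisks;
        }
    }

    /** Containers that fit along one dimension; a zero requirement imposes no limit. */
    static int fit(int available, int required) {
        return required == 0 ? Integer.MAX_VALUE : available / required;
    }

    /** Number of containers that fit, limited by the scarcest dimension. */
    static int computeAvailableContainers(Res available, Res required) {
        return Math.min(fit(available.memory, required.memory),
                Math.min(fit(available.vcores, required.vcores),
                        fit(available.vdisks, required.vdisks)));
    }

    /** Largest per-dimension ratio a/b; dimensions where b is zero are skipped. */
    static float ratio(Res a, Res b) {
        float max = 0f;
        if (b.memory != 0) {
            max = Math.max(max, (float) a.memory / b.memory);
        }
        if (b.vcores != 0) {
            max = Math.max(max, (float) a.vcores / b.vcores);
        }
        if (b.vdisks != 0) {
            max = Math.max(max, (float) a.vdisks / b.vdisks);
        }
        return max;
    }
}
```

With 8192 MB, 8 vcores and 4 vdisks available, a request of (1024, 1, 0) yields 8 containers instead of failing with a division by zero on the disk dimension.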

        "Capacity Scheduler already supports multi-dimensional resources through DominantResourceCalculator, and it should work once DRC is updated to support disk."

        I haven't looked at the CS code much, but it appears the scheduling decisions use DRC#divide which only looks at memory. Wangda Tan - any thoughts here?

        vvasudev Varun Vasudev added a comment -

        Karthik Kambatla, Wei Yan - question on "vdisks". What is the expected difference between a container with 1 vdisk and one with 2 vdisks? Is it that the second gets a higher proportion of disk operations or is it meant to imply a greater number of spindles? Sorry if the question seems very basic - I've gone through the parent ticket but I'd like to understand the term independent of the enforcement mechanism.

        kasha Karthik Kambatla added a comment -

        In the context of this JIRA, vdisks is always 1. So, the value of vdisks per container doesn't have any meaning here.

        In the context of YARN-2139, vdisks captures both (1) the proportion of disk resources and (2) potential parallelism. The proportion of disk resources is realized only through enforcement. As Bikas suggested on the umbrella JIRA and I responded, the potential parallelism (a.k.a. the number of spindles) can be achieved through the number of local directories assigned to the container.

        vvasudev Varun Vasudev added a comment -

        Thanks for the reply! I'll follow up on the parent JIRA on expressing parallelism.

        kasha Karthik Kambatla added a comment -

        The design doc talks about addressing this as part of spindle locality, which I am okay with fixing as part of YARN-2139 or on another JIRA

        vvasudev Varun Vasudev added a comment -

        Wei Yan - can you please think about changing 'vdisks' to something that reflects that it is a weighted number? Something like 'weighted-disk-io'. The term vdisks intuitively seems to imply a spindle with the 'v' meaning that we don't know if it's SSD or a SATA drive.

        kasha Karthik Kambatla added a comment -

        Varun Vasudev - Filed YARN-2941 to revisit the config names.

        leftnoteasy Wangda Tan added a comment -

        Hi Karthik Kambatla,

        "I haven't looked at the CS code much, but it appears the scheduling decisions use DRC#divide which only looks at memory. Wangda Tan - any thoughts here?"

        Does "DRC" mean "Default..." or "Dominant..."? By default CS uses DefaultResourceCalculator, but it can also avoid over-allocating CPU when DominantResourceCalculator is configured.

        Could you please confirm this, Varun Vasudev?
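
The difference between the two calculators can be illustrated with a simplified sketch. It does not use the real YARN classes: resources are modeled as hypothetical `int[]` vectors of {memory MB, vcores, vdisks}, and the two methods only approximate the behaviour described in this thread, where a memory-only calculator ignores the other dimensions and a multi-dimensional one checks all of them.

```java
// Illustrative sketch only; these are not the YARN calculator classes.
final class CalculatorComparisonSketch {

    /** Memory-only check: index 0 (memory) is the only dimension considered. */
    static boolean fitsMemoryOnly(int[] available, int[] request) {
        return request[0] <= available[0];
    }

    /** Multi-dimensional check: every dimension must fit. */
    static boolean fitsAllDimensions(int[] available, int[] request) {
        if (available.length != request.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        for (int i = 0; i < request.length; i++) {
            if (request[i] > available[i]) {
                return false;
            }
        }
        return true;
    }
}
```

On a node with 4096 MB of memory but no free vcores, a request for {1024, 1, 1} passes the memory-only check and is rejected by the multi-dimensional one, which is the over-allocation the thread is concerned with.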

        ywskycn Wei Yan added a comment -

        Wangda Tan, yes, by configuring the scheduler to use DRC, we can avoid over-consumption. We'll upload a new patch that includes a test case for CapacityScheduler.

        ywskycn Wei Yan added a comment -

        Uploaded a new patch incorporating the comments. Verified on a local cluster that "avoid over-allocation of disk resources" works well with FairScheduler and CapacityScheduler (with DRF enabled).

        kasha Karthik Kambatla added a comment -

        Looks mostly good. One minor comment - DominantResourceCalculator#getResourceAsValue:

        1. Add a comment to the tune of "lower the rank, more dominant the resource"
        2. Also add a check rank > 0 and rank < number of resources, otherwise throw IllegalArgumentException
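
A rank-based lookup with the requested bounds check might be sketched as follows. This is not the patch's implementation: resources are modeled as hypothetical `float[]` vectors rather than YARN `Resource` objects, and the sketch assumes zero-based ranks, so valid values run from 0 to one less than the number of resources; the convention in the actual patch may differ.

```java
// Illustrative sketch only; not DominantResourceCalculator itself.
final class RankedShareSketch {

    /**
     * Returns the share (used / cluster) of the resource with the given rank.
     * The lower the rank, the more dominant the resource: rank 0 is the
     * largest share. Ranks are assumed to be zero-based in this sketch.
     */
    static float getResourceAsValue(float[] cluster, float[] used, int rank) {
        if (cluster.length != used.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        if (rank < 0 || rank >= used.length) {
            throw new IllegalArgumentException(
                    "rank must be in [0, " + used.length + "), got " + rank);
        }
        float[] shares = new float[used.length];
        for (int i = 0; i < used.length; i++) {
            // A resource the cluster does not have contributes a zero share.
            shares[i] = cluster[i] == 0f ? 0f : used[i] / cluster[i];
        }
        java.util.Arrays.sort(shares); // ascending order
        return shares[shares.length - 1 - rank];
    }
}
```

Passing the rank as an int keeps a single method for any number of resource types, which is the motivation for replacing the boolean argument.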
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12686151/YARN-2618-3.patch
        against trunk revision 03867eb.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 9 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 73 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.client.cli.TestYarnCLI
        org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
        org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched
        org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
        org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestRMNMRPCResponseId
        org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
        org.apache.hadoop.yarn.server.resourcemanager.TestAppManager
        org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.TestDominantResourceFairnessPolicy
        org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
        org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMReconnect

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6066//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6066//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6066//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6066//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6066//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6066//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-common.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6066//console

        This message is automatically generated.

        ywskycn Wei Yan added a comment -

        Uploaded a new patch that fixes the test failures.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12686237/YARN-2618-4.patch
        against trunk revision 2e98ad3.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6071//console

        This message is automatically generated.

        vvasudev Varun Vasudev added a comment -

        Wei, Karthik, in the parent JIRA design docs, the model presented was that when an app asks for vdisks as part of the resource allocation, it is asking for a portion of the disk operations on the node. There's a section at the end on extending vdisks to include spindles, but it wasn't clear how to extend it. If the vdisks in the Resource class represents a share of disk operations, can we change the name in the Resource class as well to reflect this (from vdisks to something else)?

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12686243/YARN-2618-5.patch
        against trunk revision 2e98ad3.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 20 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 73 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
        org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
        org.apache.hadoop.yarn.server.resourcemanager.TestRM
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6072//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6072//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6072//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6072//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6072//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6072//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-common.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6072//console

        This message is automatically generated.

        ywskycn Wei Yan added a comment -

        The four failed test cases passed locally. Some of the failures were timeout issues.

        kasha Karthik Kambatla added a comment -

        Looks good. Ran findbugs locally, and didn't see any new issues. The tests pass locally as well.

        "If the vdisks in the Resource class represents a share of disk operations, can we change the name in the Resource class as well to reflect this (from vdisks to something else)?"

        Spoke to Varun offline. I created a sub-task earlier to revisit the config names and it is a blocker for the merge. Let us look into all the configs together there.

        +1. Committing this.

        ywskycn Wei Yan added a comment -

        Thanks, Karthik Kambatla.

        kasha Karthik Kambatla added a comment -

        Sorry, looks like I forgot to check this in. Tried applying the patch, and there is one conflict. Wei Yan - mind updating the patch?

        ywskycn Wei Yan added a comment -

        Karthik Kambatla, sure, will do it soon.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12686243/YARN-2618-5.patch
        against trunk revision c906a1d.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7101//console

        This message is automatically generated.

        ywskycn Wei Yan added a comment -

        Rebased the patch.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12707506/YARN-2618-6.patch
        against trunk revision 2228456.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 20 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
        org.apache.hadoop.yarn.client.api.impl.TestYarnClient
        org.apache.hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl
        org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
        org.apache.hadoop.yarn.client.TestGetGroups
        org.apache.hadoop.yarn.client.api.impl.TestNMClient
        org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
        org.apache.hadoop.yarn.client.TestApplicationMasterServiceProtocolOnHA
        org.apache.hadoop.yarn.client.TestRMFailover
        org.apache.hadoop.yarn.server.resourcemanager.TestRMHA
        org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
        org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication
        org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
        org.apache.hadoop.yarn.server.resourcemanager.TestRM
        org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
        org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7116//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7116//console

        This message is automatically generated.

        ywskycn Wei Yan added a comment -

        Fixed the test failures.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12723515/YARN-2618-7.patch
        against trunk revision 3fb5abf.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 22 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7231//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7231//console

        This message is automatically generated.

        ywskycn Wei Yan added a comment -

        Karthik Kambatla, could you help commit the patch?

        djp Junping Du added a comment -

        Kicked off the tests again manually.

        hadoopqa Hadoop QA added a comment -



        -1 overall

        Vote  Subsystem  Runtime  Comment
        -1    patch      0m 0s    The patch command could not apply the patch during dryrun.

        Subsystem       Report/Notes
        Patch URL       http://issues.apache.org/jira/secure/attachment/12723515/YARN-2618-7.patch
        Optional Tests  javadoc javac unit findbugs checkstyle
        git revision    trunk / bb9ddef
        Console output  https://builds.apache.org/job/PreCommit-YARN-Build/7687/console

        This message was automatically generated.

        ywskycn Wei Yan added a comment -

        Thanks, Junping Du, I'll rebase the patch.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        I haven't looked at this so far. Thanks for re-kicking it, Junping! Taking a quick look now.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Okay, I quickly scanned it. It seems you are having other related discussions on the umbrella ticket and other JIRAs, so please go ahead.

        Is this only for trunk, or for branch-2 as well?

        kasha Karthik Kambatla added a comment -

        Vinod Kumar Vavilapalli - we were thinking of working on a branch and merging back to trunk in phases. Do you think this alone can go directly to trunk?

        Related - it would be nice if the scheduler parts of YARN-2140 were also worked on in a branch. Also, I'm looking to hear thoughts on the branch-development thread on yarn-dev@.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Okay, makes sense. Let's get it into a branch. /cc Varun Vasudev.

        vvasudev Varun Vasudev added a comment -

        Karthik Kambatla - should we commit this to the YARN-2139 branch? Should we get the branch up to date with trunk first?

        kasha Karthik Kambatla added a comment -

        Varun Vasudev - thanks for the ping. I haven't had the time to do a thorough review of the remaining tasks here, and hence have avoided committing this. Do you have the cycles to help shepherd this work into the branch?

        And yes, we should true YARN-2139 up to trunk and commit this.

        hadoopqa Hadoop QA added a comment -



        -1 overall

        Vote  Subsystem  Runtime  Comment
        -1    patch      0m 0s    The patch command could not apply the patch during dryrun.

        Subsystem       Report/Notes
        Patch URL       http://issues.apache.org/jira/secure/attachment/12723515/YARN-2618-7.patch
        Optional Tests  javadoc javac unit findbugs checkstyle
        git revision    trunk / a2bd621
        Console output  https://builds.apache.org/job/PreCommit-YARN-Build/8167/console

        This message was automatically generated.

        chris.douglas Chris Douglas added a comment -

        Canceling patch.

        The parent JIRA seems to have stalled, unfortunately. Please feel free to rebase and reopen discussion.


          People

          • Assignee:
            ywskycn Wei Yan
            Reporter:
            ywskycn Wei Yan
          • Votes:
            3
            Watchers:
            15
