Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      Currently, max-am-resource-percentage considers default_partition only. When a queue can access multiple partitions, we should be able to compute max-am-resource-percentage based on that.

      1. 0001-YARN-3216.patch
        8 kB
        Sunil G
      2. 0002-YARN-3216.patch
        26 kB
        Sunil G
      3. 0003-YARN-3216.patch
        34 kB
        Sunil G
      4. 0004-YARN-3216.patch
        33 kB
        Sunil G
      5. 0005-YARN-3216.patch
        43 kB
        Sunil G
      6. 0006-YARN-3216.patch
        43 kB
        Sunil G
      7. 0007-YARN-3216.patch
        46 kB
        Sunil G
      8. 0008-YARN-3216.patch
        68 kB
        Sunil G
      9. 0009-YARN-3216.patch
        68 kB
        Sunil G
      10. 0010-YARN-3216.patch
        63 kB
        Sunil G
      11. 0011-YARN-3216.patch
        63 kB
        Sunil G


          Activity

          leftnoteasy Wangda Tan added a comment -

          There are two approaches to doing this:

          • Make maxAMResource = queue's total guaranteed resource (the sum of the queue's guaranteed resource on all partitions) * maxAMResourcePercent. This is straightforward, but it can also lead to too many AMs being launched under a single partition.
          • Compute maxAMResource per queue per partition. This keeps AM usage across partitions more balanced, but it can make debugging harder (my application gets stuck because the AMResourceLimit of a partition is reached).

          I prefer the 1st solution since it's easier to understand and debug.
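
          For illustration only, a minimal sketch of how the two options differ (this is not CapacityScheduler code; the class name, method names and the guaranteedByPartition map are assumptions made for the example):

            import java.util.HashMap;
            import java.util.Map;
            import org.apache.hadoop.yarn.api.records.Resource;
            import org.apache.hadoop.yarn.util.resource.Resources;

            // Hypothetical sketch only. 'guaranteedByPartition' stands in for the queue's
            // guaranteed resource on each partition it can access.
            public final class AMLimitSketch {
              // Option 1: a single AM limit for the queue, summed over all partitions.
              static Resource option1(Map<String, Resource> guaranteedByPartition,
                  float maxAMResourcePercent) {
                Resource total = Resources.createResource(0, 0);
                for (Resource r : guaranteedByPartition.values()) {
                  Resources.addTo(total, r);
                }
                return Resources.multiply(total, maxAMResourcePercent);
              }

              // Option 2: an independent AM limit per queue per partition.
              static Map<String, Resource> option2(Map<String, Resource> guaranteedByPartition,
                  float maxAMResourcePercent) {
                Map<String, Resource> limits = new HashMap<>();
                for (Map.Entry<String, Resource> e : guaranteedByPartition.entrySet()) {
                  limits.put(e.getKey(), Resources.multiply(e.getValue(), maxAMResourcePercent));
                }
                return limits;
              }
            }

          With option 1 a single partition can absorb the whole limit; with option 2 each partition gets its own, smaller limit.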

          sunilg Sunil G added a comment -

          Yes Wangda Tan,
          I also feel that we can go with the 1st solution for now.

          I will take this over and upload a patch. Thank you Wangda Tan.

          sunilg Sunil G added a comment -

          Hi Wangda Tan
          Attaching an initial version of the work-in-progress patch. In FiCaSchedulerApp, I used getAMResourceRequest from RMApp and compared it with the container resource request to determine whether a container is the AM container.

          +    if (request.equals(rmApp.getAMResourceRequest())) {
          +      setAMResource(node.getPartition(), request.getCapability());
          +    }
          

          But I suspect a potential problem: if a buggy app submits a resource request identical to the one the RM creates for the AM, this check could misfire. In that case I feel I need an event from RMAppAttemptImpl (where we detect the AM container) to the scheduler, so it can update the AM used-resource details for the specified partition. Could you please share your thoughts on this?
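
          A possible way to avoid that ambiguity, sketched below for illustration only, is to rely on the AM flag carried by the RMContainer once the master container is known, instead of comparing ResourceRequests (the variables mirror the snippet above and are not taken from any specific patch revision):

            // Hypothetical sketch; 'rmContainer', 'node' and 'setAMResource' are assumed names.
            if (rmContainer.isAMContainer()) {
              // Charge the AM resource against the partition of the node hosting the AM,
              // instead of inferring the AM container by comparing ResourceRequests.
              setAMResource(node.getPartition(), rmContainer.getContainer().getResource());
            }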

          Naganarasimha Naganarasimha G R added a comment -

          Hi Sunil G & Wangda Tan,
          This issue is very critical for our use case, where the cluster has 2 partitions (neither of them the DEFAULT_PARTITION). If max-am-resource-percentage is calculated based on the DEFAULT_PARTITION size, it practically limits us to only one app being submitted.
          Also, even though it is harder to debug, I feel it is better to opt for approach 2, since we can clearly specify the AMResourceLimit for each partition; further, we have JIRAs such as YARN-3946 which try to indicate the reasons an application is not being launched. Thoughts?

          sunilg Sunil G added a comment -

          Hi NGarla_Unused
          Option 2 may come with more complexity, and users may not understand that calculation very well. One reason is that when a queue spans multiple partitions, the resource available per partition also shrinks. It is possible that, even to launch a single application, we may need to exceed this limit entirely in one partition, or even prevent that application from launching. Yes, it is the more apt approach, but we also have to see how much complexity it adds. We can discuss further and decide by weighing the simplicity and correctness of each approach.

          Naganarasimha Naganarasimha G R added a comment -

          Hi Sunil G, ideally max-am-resource-percentage would be configurable per queue per partition, which would make things clear, but at a minimum I feel it should be the 2nd option. In terms of debuggability, in the case Wangda Tan mentioned ("can also lead to too many AMs launched under a single partition"), a user viewing the web UI would need to look at each partition accessible to the queue, find out which partition is using more AM resource, and then try to resolve it. With the 2nd option it would be clear what the max-am-resource-percentage per partition is, given UI changes to show AM resource usage per partition per queue.

          sunilg Sunil G added a comment -

          Synced up with NGarla_Unused offline.
          One concern with option 2 is that max-am-resource is calculated at the partition level, which is a subset of the queue itself. Assume a configuration (a corner-case scenario) where a queue has 10GB of resource with 0.1 as max-am-resource-percent. With 2 partitions sharing this queue at 50% each, the max AM resource will be 500MB per partition. Earlier, a single application could have gotten 1GB (as Naga said, over-utilizing from the other partition). Such cases will pop up if we go with option 2.
          I feel this can also be considered as a point when deciding between these 2 options. I also agree that having an unused DEFAULT_PARTITION sharing AM resources may lead to issues in the future, so the earlier we get rid of that, the better. So my suggestion is to implement option 2, with one partition able to borrow AM resources from another partition.

          Wangda Tan, could you also share your thoughts here? We can sync up offline as needed to discuss this.

          leftnoteasy Wangda Tan added a comment -

          Thanks for sharing your thoughts, Sunil G, Naganarasimha G R.
          Having reconsidered this problem, I prefer to go with option 2: a partition should be treated as a sub-cluster, so it should have its own max-am-resource-percentage. I think what we can do is:

          • Have a per-queue-per-partition "max AM percentage" configuration; if it is not specified, we assume it inherits the queue's configured value.
          • We guarantee launching at least one AM container in every partition, regardless of the AM-resource-percentage setting.
          • If we change a node's label, we should update the per-queue-per-partition AM-used resource as well (we have YARN-4082 committed).

          Sounds like a plan?

          sunilg Sunil G added a comment -

          Yes. This sounds good. I will work on a patch.

          Naganarasimha Naganarasimha G R added a comment -

          Wangda Tan, yes, this sounds like a good plan to me!

          sunilg Sunil G added a comment -

          Attaching an initial work-in-progress patch. I will add tests in a coming patch. Kindly help review it.

          leftnoteasy Wangda Tan added a comment -

          Hi Sunil G,
          Thanks for working on this. I have not finished reviewing yet; some comments so far:
          1) (minor) Could you make the am-limit computation treat label=="" as a normal label? That would simplify the logic. You can use a map to store the computed am-limit-by-partition to avoid duplicated computation.
          2) (major) getAMResourceLimitPerPartition should use partition.totalResource (from RMNodeLabelsManager.getPartitionResource) instead of clusterResource.
          3) (minor) ResourceUsage#getAllAMUsed is not used.
          4) (major) LeafQueue#getNumActiveAppsPerPartition is an O(n) operation and should be optimized, otherwise activateApplications becomes an O(n^2) operation.

          leftnoteasy Wangda Tan added a comment -

          O -> O(n); the sticker syntax is too unfriendly to computer science.

          sunilg Sunil G added a comment -

          Thank you Wangda Tan for sharing the comments. Yeah, I think the smileys were given preference.

          I will address them in the next patch.

          getAMResourceLimitPerPartition should use partition.totalResource

          Here we would like to get the resource available to the partition within a queue and then take the AM % of that resource value, correct?

          leftnoteasy Wangda Tan added a comment -

          Here we would like to get the resource available to the partition within a queue and then take the AM % of that resource value, correct?

          Exactly. When we compute the am-resource-limit, the queue resource should be partition.totalResource * queue.capacity-of-the-partition.
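
          As a minimal sketch of that formula (illustration only; the class and parameter names are assumptions, and the normalization/rounding done by the real LeafQueue code is omitted):

            import org.apache.hadoop.yarn.api.records.Resource;
            import org.apache.hadoop.yarn.util.resource.Resources;

            // Hypothetical sketch of the per-queue, per-partition AM limit.
            // partitionTotalResource   : total resource of the partition
            //                            (e.g. from RMNodeLabelsManager.getPartitionResource)
            // queueCapacityOnPartition : the queue's absolute capacity fraction on that partition
            // maxAMResourcePercent     : the per-partition (or inherited) am-percent setting
            public final class PerPartitionAMLimitSketch {
              static Resource amResourceLimitPerPartition(Resource partitionTotalResource,
                  float queueCapacityOnPartition, float maxAMResourcePercent) {
                // queue resource on the partition = partition.totalResource * queue.capacity-of-the-partition
                Resource queuePartitionResource =
                    Resources.multiply(partitionTotalResource, queueCapacityOnPartition);
                // AM limit on the partition = queue's partition share * max-am-resource-percent
                return Resources.multiply(queuePartitionResource, maxAMResourcePercent);
              }
            }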

          eepayne Eric Payne added a comment -

          Hi Sunil G, Wangda Tan, and Naganarasimha G R. Thank you all for the great work.

          Have you considered how the Max Application Master Resources will be presented in the GUI? I assume it will just be expressed in the existing Max Application Master Resources field under the partition-specific tab in the scheduler page. Is that correct?

          Naganarasimha Naganarasimha G R added a comment -

          Hi Eric Payne,
          Yes, that would be the plan, along with support in REST (along the lines of YARN-4162); otherwise users will not be aware of such configuration.

          sunilg Sunil G added a comment -

          Hi Eric Payne
          Thank you for sharing the comments. As Naga mentioned, the per-partition (per-queue) am-resource-percent configuration can also be fetched via REST. Also, as you mentioned, this information (such as "AM resource usage per queue per partition") can be shown in the GUI, in the partition tab of the scheduler page.

          sunilg Sunil G added a comment -

          Hi Wangda Tan
          Attaching the v2 version of the patch, addressing the major comments. Kindly help review it.

          leftnoteasy Wangda Tan added a comment -

          Thanks Sunil G.

          Went through the patch, some comments:

          1. AbstractCSQueue: Instead of adding AM-used-resource to parentQueue, I think we only need to track AM-used-resource on LeafQueue and User. Currently we don't have a limit on AM-used-resource at the parentQueue level, so the aggregated resource may not be very useful. We can add it along the hierarchy if we want to limit max-am-percent on parentQueue in the future.

          2. CapacitySchedulerConfiguration: Instead of introducing a new configuration, MAXIMUM_AM_RESOURCE_PARTITION_SUFFIX, I suggest using the existing one: maximum-am-resource-percent. If queue.accessible-node-labels.<label>.maximum-am-resource-percent is not set, it uses queue.maximum-am-resource-percent (see the configuration sketch at the end of this comment). Please let me know if there's any specific reason to add a new maximum-am-resource-partition.

          3. LeafQueue: I'm wondering whether we need to maintain a map of PartitionInfo: PartitionInfo.getActiveApplications is only used to check whether there are any activated apps under a partition, which is equivalent to queueUsage.getAMUsed(partitionName) > 0.

          4. SchedulerApplicationAttempt: I think the return value of getAMUsed should be:

          • Before the AM container is allocated, it returns AM-Resource-Request.resource on partition=AM-Resource-Request.node-label-request.
          • After the AM container is allocated, it returns AM-Container.resource on partition=AM-Node.partition.
          • You don't have to update am-resource when the AM container has just been allocated, because AM-container.resource and am-resource-request.node-label-request won't change, but you do need to update it if the partition of the AM container's NM is updated. I'm not sure whether this is clear to you; please let me know if you need me to elaborate on this comment.

          I noticed you removed some code from FiCaSchedulerApp's constructor. I think getAMUsed should still return the correct value before the AM container is allocated, otherwise the computation might be wrong. Let me know if I misunderstood your code.
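
          For illustration, a hedged sketch of the suggested configuration fallback (the queue path root.a, the label x and the values are made up; the yarn.scheduler.capacity.<queue-path> prefix is assumed from the standard CapacityScheduler property naming):

            import org.apache.hadoop.conf.Configuration;

            // Hypothetical example; queue path, label and values are illustrative only.
            Configuration conf = new Configuration();
            // Queue-level default, used when no per-partition value is set.
            conf.setFloat("yarn.scheduler.capacity.root.a.maximum-am-resource-percent", 0.1f);
            // Per-partition override for label "x"; if absent, the queue-level value above applies.
            conf.setFloat(
                "yarn.scheduler.capacity.root.a.accessible-node-labels.x.maximum-am-resource-percent",
                0.2f);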

          leftnoteasy Wangda Tan added a comment -

          And I forgot to mention:
          5. About am-resource-percent per user per partition: currently you have only considered am-resource-percent per queue; I think you need to calculate (not configure) the per-user-per-partition am-resource-limit as well. Since the patch is already quite complex, I'm fine with doing the am-resource-limit-per-user math in a separate JIRA.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 19m 46s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 javac 9m 6s There were no new javac warning messages.
          +1 javadoc 11m 44s There were no new javadoc warning messages.
          -1 release audit 0m 18s The applied patch generated 1 release audit warnings.
          -1 checkstyle 1m 10s The applied patch generated 18 new checkstyle issues (total was 271, now 268).
          -1 whitespace 0m 8s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 44s mvn install still works.
          +1 eclipse:eclipse 0m 37s The patch built with eclipse:eclipse.
          +1 findbugs 1m 40s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 yarn tests 62m 27s Tests failed in hadoop-yarn-server-resourcemanager.
              108m 45s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestWorkPreservingRMRestartForNodeLabel
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12765017/0003-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 30ac69c
          Release Audit https://builds.apache.org/job/PreCommit-YARN-Build/9355/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9355/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/9355/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9355/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9355/testReport/
          Java 1.7.0_55
          uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9355/console

          This message was automatically generated.

          sunilg Sunil G added a comment -

          Thank you Wangda Tan for sharing the comments.

          Please let me know if there's any specific reason to add a new maximum-am-resource-partition.

          I agree with you. We could use the same configuration name under each label.

          if there's any activated apps under a partition, it is equivalent to queueUsage.getAMUsed(partitionName)

          Yes, this will be enough. I kept a new map with the idea of maintaining some more information along similar lines to User, but as of now the suggested change is enough. I will remove the map.

          you don't have to update am-resource when AM container just allocated, because AM-container.resource and am-resource-request.node-label-request won't be changed, but you need to update this if partition of AM-container's NM updated

          As I see it, we may need the changes below.

          • In FiCaSchedulerApp's ctor, update AM-Resource-Request.resource on the partition (keep the existing code), but use rmApp.getAMResourceRequest().getNodeLabelExpression() for setAMResource instead of setting it to NO_LABEL, because this information won't change later.
          • If the partition of the AM container's NM is updated, we need to change the AM resource, which I am handling in nodePartitionUpdated as below.
            +    if (rmContainer.isAMContainer()) {
            +      setAppAMNodePartitionName(newPartition);
            +      this.attemptResourceUsage.decAMUsed(oldPartition, containerResource);
            +      this.attemptResourceUsage.incAMUsed(newPartition, containerResource);
            +      getCSLeafQueue().decAMUsedResource(oldPartition, containerResource, this);
            +      getCSLeafQueue().incAMUsedResource(newPartition, containerResource, this);
            +    }
            

            Here, AM-Resource-Request.resource is updated in FiCaSchedulerApp's ctor based on rmApp.getAMResourceRequest. Once the container is allocated, this resource becomes part of the partition with no change in the resource itself, so I feel I do not need to update the resource in the allocate call of FiCaSchedulerApp. Am I correct?

          • am-resource-percent per user per partition: yes, I will raise a new ticket to handle this and will make the changes there instead of in this one.
          leftnoteasy Wangda Tan added a comment -

          Sunil G,

          In FiCaSchedulerApp's ctor, update AM-Resource-Request.resource on the partition (keep the existing code), but use rmApp.getAMResourceRequest().getNodeLabelExpression() for setAMResource instead of setting it to NO_LABEL, because this information won't change later.

          Makes sense to me.

          Here, AM-Resource-Request.resource is updated in FiCaSchedulerApp's ctor based on rmApp.getAMResourceRequest. Once the container is allocated, this resource becomes part of the partition with no change in the resource itself, so I feel I do not need to update the resource in the allocate call of FiCaSchedulerApp. Am I correct?

          I think so. Maybe we need to consider container increase as well, but I think it's an edge case; we may or may not need to handle it.

          sunilg Sunil G added a comment -

          Attaching an updated version of the patch, addressing the comments.

          I will upload another patch with more test cases to cover all possible error conditions.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 53s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 8m 15s There were no new javac warning messages.
          +1 javadoc 10m 27s There were no new javadoc warning messages.
          -1 release audit 0m 20s The applied patch generated 1 release audit warnings.
          -1 checkstyle 0m 51s The applied patch generated 11 new checkstyle issues (total was 270, now 260).
          -1 whitespace 0m 6s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 50s mvn install still works.
          +1 eclipse:eclipse 0m 40s The patch built with eclipse:eclipse.
          +1 findbugs 1m 31s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 61m 0s Tests passed in hadoop-yarn-server-resourcemanager.
              102m 58s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12765766/0004-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 8f19538
          Release Audit https://builds.apache.org/job/PreCommit-YARN-Build/9391/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9391/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/9391/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9391/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9391/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9391/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

          Hi Sunil G,

          I think the changes to AbstractCSQueue may not be necessary; could you take a look at my previous comment (copied here) and let me know your thoughts?

          AbstractCSQueue: Instead of adding AM-used-resource to parentQueue, I think we only need to track AM-used-resource on LeafQueue and User. Currently we don't have a limit on AM-used-resource at the parentQueue level, so the aggregated resource may not be very useful. We can add it along the hierarchy if we want to limit max-am-percent on parentQueue in the future.

          If max-am-percent for a queue's partitions isn't set, I think it should use queue.max-am-percent instead of 0, to avoid painful configuration (an admin would otherwise have to set max-am-percent after adding a new partition).

          I found the logic in your patch to be: if max-am-percent for partition-x is not set, partition-x's am-limit equals the default partition's am-limit, which does not look correct to me. The am-limit under each partition should be calculated independently, since the total resource for different partitions varies.

          If you agree, could you merge the am-limit computation logic of default partition and specific partition?

          Thoughts?

          Thanks,

          sunilg Sunil G added a comment -

          Hi Wangda Tan
          Thank you very much for the comments. I am updating the patch to address them.

          Instead of adding AM-used-resource to parentQueue, I think we may only need to calculate AM-used-resource on LeafQueue and User

          Yes, I understood your point. I have now kept the changes only in LeafQueue.

          If you agree, could you merge the am-limit computation logic of default partition and specific partition?

          Yes. It will be better to fall back to the per-queue am-limit when the label am-limit is not present. I agree with your point on ease of configuration and avoiding extra if checks.

          One more point. As per YARN-3265, you introduced queueResourceLimitsInfo.getQueueCurrentLimit() in place of queueHeadroomInfo.getQueueMaxCap(), and this is used in the old getAMResourceLimit to get the per-queue max capacity.

          Now, queueResourceLimitsInfo.getQueueCurrentLimit() is common to the whole queue, and a queue may have 2 or 3 accessible labels, so I feel I may not always be able to use this total value as in getAMResourceLimit. Hence I think I need to calculate max capacity based on the per-queue label percentage. How do you feel?

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 41s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 5 new or modified test files.
          +1 javac 7m 44s There were no new javac warning messages.
          +1 javadoc 10m 10s There were no new javadoc warning messages.
          -1 release audit 0m 19s The applied patch generated 1 release audit warnings.
          -1 checkstyle 0m 49s The applied patch generated 7 new checkstyle issues (total was 191, now 177).
          -1 whitespace 0m 13s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 29s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 1m 27s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 yarn tests 57m 9s Tests failed in hadoop-yarn-server-resourcemanager.
              96m 40s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12766026/0005-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / db93047
          Release Audit https://builds.apache.org/job/PreCommit-YARN-Build/9404/artifact/patchprocess/patchReleaseAuditProblems.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9404/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/9404/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9404/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9404/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9404/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

          now queueResourceLimitsInfo.getQueueCurrentLimit() is common per queue. And a queue may have 2 or 3 accessible-labels. So I feel I may not be able to use this total value always like in getAMResourceLimit. Hence I think I need to calculate max-capacity based on label-percentage per-queue. How do you feel?

          You can take a look at AbstractCSQueue#getCurrentLimitResource as an example of how a queue calculates max capacity by partition.

          sunilg Sunil G added a comment -

          Hi Wangda Tan
          Thank you for confirming that max capacity should also be used while calculating max-am-resource per partition. Yes, getCurrentLimitResource shows how we can get the per-partition max capacity in a queue, and I will use the same approach. Also, TestApplicationLimits is not correct: it statically takes the full cluster resource as the queue limit. I will correct that as well in the next patch.

          sunilg Sunil G added a comment -

          Attaching a patch addressing the comments. Kindly help review it.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 9s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 5 new or modified test files.
          +1 javac 8m 26s There were no new javac warning messages.
          +1 javadoc 10m 36s There were no new javadoc warning messages.
          -1 release audit 0m 19s The applied patch generated 1 release audit warnings.
          +1 checkstyle 0m 49s There were no new checkstyle issues.
          +1 whitespace 0m 10s The patch has no lines that end in whitespace.
          +1 install 1m 31s mvn install still works.
          +1 eclipse:eclipse 0m 37s The patch built with eclipse:eclipse.
          +1 findbugs 1m 30s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 yarn tests 58m 24s Tests failed in hadoop-yarn-server-resourcemanager.
              99m 34s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices
          Timed out tests org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
            org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12766145/0006-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / e617cf6
          Release Audit https://builds.apache.org/job/PreCommit-YARN-Build/9411/artifact/patchprocess/patchReleaseAuditProblems.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9411/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9411/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9411/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

          Hi Sunil,

          Thanks for update, some questions/comments:

1) getAMResourceLimitPerPartition: why compute the max of the queue's max resource and the configured resource?

          ...
          Resources.max(resourceCalculator,
            lastClusterResource, queueMaxResource, partitionCapacity),
          ...
          

          I think partitionCapacity should be enough.

          And some naming suggestions:

          • partitionCapacity -> queuePartitionResource, "capacity" means the percentage of capacity in most of CS logic.

2) Should getUserAMResourceLimit consider the partition as well? I think we should use getAMResourceLimitPerPartition to get a userAMResourceLimitPerPartition.

3) If you agree with 2): getAMResourceLimit and getUserAMResourceLimit are used by tests/REST API only, so I think we should remove them and use get(User)AMLimitPerPartition instead.

4) Could you give a default am-label value to MockRM.submitApp? That could avoid some test code changes.

5) I suggest adding the new tests to a new file such as TestApplicationLimitsByPartition; TestNodeLabelContainerAllocation is used to test container allocation behavior. And could you add more corner-case tests, such as less than one application being activated in a queue, and that by default a partition's am-percent is the queue's am-percent, etc.

          sunilg Sunil G added a comment -

          Hi Wangda Tan
          Thank you for sharing the comments.

          why computing max of queue's max resource and configured resource?

I have one scenario here. When a queue's capacity is 0.1 and its max-capacity is 1.0, the queue can use all of the cluster's resources. So if all nodes are in NO_LABEL, applications running in this queue can, in the best case, take all nodes.

So max-capacity is helpful here, and as I see it, getAMResourceLimit() already takes this into account:

              synchronized (queueResourceLimitsInfo) {
                queueCurrentLimit = queueResourceLimitsInfo.getQueueCurrentLimit();
              }
              Resource queueCap = Resources.max(resourceCalculator, lastClusterResource,
                  absoluteCapacityResource, queueCurrentLimit);
          

Am I missing something? Please correct me if I am wrong.
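To make the scenario concrete, here is a rough illustration (just a sketch with made-up numbers and names, not code from the patch):

    // Hypothetical numbers: 100 GB cluster, queue capacity = 10%,
    // max-capacity = 100%, maximum-am-resource-percent = 0.1 (values in MB).
    long clusterMB = 100 * 1024;
    long configuredMB = (long) (clusterMB * 0.10);    // 10 GB guaranteed to the queue
    long queueCurrentLimitMB = clusterMB;             // best case under max-capacity = 1.0
    long amLimitConfiguredOnly = (long) (configuredMB * 0.10);                                // ~1 GB
    long amLimitWithQueueLimit = (long) (Math.max(configuredMB, queueCurrentLimitMB) * 0.10); // ~10 GB

In this example, basing the AM limit only on the configured capacity would cap AMs at about 1 GB even though the queue can actually grow to the whole cluster.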

          getUserAMResourceLimit should consider partition as well

          As suggested in an earlier comment, I created a JIRA to track user-am-resource-limit per-partition separately. YARN-4229. I will link the issue here.

          I will address all the other comments in next patch.

          sunilg Sunil G added a comment -

          Hi Wangda Tan
Attaching a patch addressing most of the comments except "why computing max-queue limit". I have shared my thoughts in an earlier comment; kindly help to check the same.
As needed, I will update a subsequent patch based on the discussion.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 21m 8s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 3 new or modified test files.
          +1 javac 10m 28s There were no new javac warning messages.
          +1 javadoc 14m 4s There were no new javadoc warning messages.
          +1 release audit 0m 48s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 1m 41s There were no new checkstyle issues.
          -1 whitespace 0m 11s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 2m 26s mvn install still works.
          +1 eclipse:eclipse 0m 51s The patch built with eclipse:eclipse.
          +1 findbugs 1m 49s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 yarn tests 61m 45s Tests failed in hadoop-yarn-server-resourcemanager.
              115m 16s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestWorkPreservingRMRestartForNodeLabel
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
          Timed out tests org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12766591/0007-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 56dc777
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/9446/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9446/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9446/testReport/
          Java 1.7.0_55
          uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9446/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

Thanks for the update, Sunil G.

          As suggested in an earlier comment, I created a JIRA to track user-am-resource-limit per-partition separately. YARN-4229. I will link the issue here.

I think most of the per-partition user-am-resource-limit is covered by the patch, correct? One thing lacking in the patch is that we haven't considered the number of active users for each partition; I'm not sure if that is the highest priority.

          So max-capacity will be helpful here, and as I see it getAMResourceLimit() is having this also considered.

I remember what we discussed while doing YARN-2637; I think the approach in your patch is slightly different from YARN-2637:

          YARN-2637 is: amLimit = max(queue-limit, queue-configured-capacity) * max-am-percent
And queue-limit is based on the queue's max capacity as well as siblings' used resources, so it is possible that queue-limit is less than queue-configured-capacity.

          In your patch:
amLimit = max(queue-max-capacity, queue-configured-capacity) * max-am-percent. Since queue-max-capacity is always >= queue-configured-capacity, this effectively reduces to queue-max-capacity * max-am-percent.

Until we have queue-limit computation per node partition, I think we should use queue-configured-capacity. Otherwise, a queue whose max-capacity >> configured-capacity will be problematic.
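To spell out the difference, here is a minimal sketch of the three variants (made-up numbers and names, resources reduced to MB; not code from either patch):

    // Hypothetical inputs: guaranteed 10 GB, queue-limit 40 GB, max-capacity 100 GB, percent 0.1
    long queueConfiguredMB  = 10 * 1024;
    long queueLimitMB       = 40 * 1024;
    long queueMaxCapacityMB = 100 * 1024;
    double maxAMPercent     = 0.1;

    // YARN-2637 (default partition): max(queue-limit, configured capacity) * percent   -> 4 GB
    long amLimitYarn2637 = (long) (Math.max(queueLimitMB, queueConfiguredMB) * maxAMPercent);

    // This patch as it stands: max(max-capacity, configured capacity) * percent,
    // i.e. effectively max-capacity * percent                                           -> 10 GB
    long amLimitPatch = (long) (Math.max(queueMaxCapacityMB, queueConfiguredMB) * maxAMPercent);

    // Suggested until a per-partition queue-limit exists: configured capacity only      -> 1 GB
    long amLimitSuggested = (long) (queueConfiguredMB * maxAMPercent);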

          Few more comments:

          • For the userAMCheck:
            if (getNumActiveApplications() < 1) {
               //...
            }
            

  I think we should be able to allocate at least one AM container in the partition, correct? Just like what you did for the queue (a per-partition sketch follows at the end of this comment):

            	            || (Resources.lessThanOrEqual(resourceCalculator,
            	                lastClusterResource, queueUsage.getAMUsed(partitionName),
            	                Resources.none()))) {
            
          • Could you use org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestUtils.getConfigurationWithQueueLabels(Configuration) instead of redefining the queue's configuration in the test?
          • Could you add a brief description at each of your test case?
• I think there are a few cases that need to be covered: 1) User's AM limit. 2) AM-limit of a queue/user which can allocate multiple AMs (I didn't see this case in your tests; this is important to make sure we calculate the total AM resource correctly). 3) AM-usage being updated after we update the partition of nodes.

          Thoughts?
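For reference, a rough sketch of the per-partition "at least one AM" check described in the first bullet (hypothetical names; it simply mirrors the queue-level check quoted above, applied to a user's AM usage on one partition):

    // Activate if the user's AM usage on this partition plus the new AM stays
    // under the user AM limit, or if the user has no AM on the partition yet.
    boolean canActivate =
        Resources.lessThanOrEqual(resourceCalculator, lastClusterResource,
            Resources.add(userAMUsedOnPartition, amResourceRequest),
            userAMLimitOnPartition)
        || Resources.lessThanOrEqual(resourceCalculator, lastClusterResource,
            userAMUsedOnPartition, Resources.none());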

          leftnoteasy Wangda Tan added a comment -

          And could you take a look at test failures? I think they're related.

          sunilg Sunil G added a comment -

          Thank you Wangda Tan for the comments.

Otherwise, a queue whose max-capacity >> configured-capacity will be problematic

I also wanted to use queue-limit computation for node partitions at first. Yes, for now we can use queue-configured-capacity, which is already handled with cachedHeadroom at the queue level. This almost matches our use case, and I feel I can file a ticket to handle queue-limit computation for node partitions. Thoughts?

For the userAMCheck, I think we should be able to allocate at least one AM container in the partition, correct? Just like what you did for the queue:

Yes. We just need that one check to handle the user-am-limit. I will close YARN-4229 as Won't Fix since we are already handling this here.

I will submit a patch with the test case modifications as suggested.

          leftnoteasy Wangda Tan added a comment -

          This can almost match our usecase. and I feel I can file a ticket to handle queue-limit computation for node partition. Thoughts?

We need it, but I think the problem is that we don't have an API for applications to get headroom for different partitions. IIRC, the queue limit is used to compute a more precise headroom for an application. We can add queue-limit computation per node partition after we have such an API.

          sunilg Sunil G added a comment -

Sure Wangda Tan. For now we will use queueResourceLimitsInfo.getQueueCurrentLimit() as one limit when calculating the AM resource limit from max-am-resource-percent, along with the configured capacity of the partition in the queue.

          leftnoteasy Wangda Tan added a comment -

For now we will use queueResourceLimitsInfo.getQueueCurrentLimit() as one limit when calculating the AM resource limit from max-am-resource-percent, along with the configured capacity of the partition in the queue.

I think we should use queueResourceLimitsInfo.getQueueCurrentLimit() as one limit for the non-labeled partition only.

          sunilg Sunil G added a comment -

Yes. For labeled partitions, we will only use the per-queue per-partition resource limit (based on absolute capacity). For the empty label, we will use both. Correct?
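Put together, a minimal sketch of that combination (hypothetical names; not the final patch):

    // Default (empty) partition: max of the queue's configured resource on the
    // partition and the queue's current limit. Labeled partitions: configured
    // per-queue per-partition resource only.
    Resource queuePartitionUsableResource =
        nodePartition.equals(RMNodeLabelsManager.NO_LABEL)
            ? Resources.max(resourceCalculator, lastClusterResource,
                queuePartitionResource, queueCurrentLimit)
            : queuePartitionResource;
    Resource amResourceLimitForPartition = Resources.multiplyAndNormalizeUp(
        resourceCalculator, queuePartitionUsableResource,
        maxAMResourcePerQueuePercent, minimumAllocation);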

          leftnoteasy Wangda Tan added a comment -

Yes. For labeled partitions, we will only use the per-queue per-partition resource limit (based on absolute capacity). For the empty label, we will use both. Correct?

          Yes I think so.

          sunilg Sunil G added a comment -

Thanks Wangda Tan. I'll also write more tests to cover these cases too. Will upload a patch soon.

          sunilg Sunil G added a comment -

          Thank you Wangda Tan for sharing the comments. Attaching an updated patch addressing the comments.

A few notes about this patch:
1. User AM limit changes are also included in this patch.
2. More test cases are added for the user AM check, changing partition resources, etc.

          Kindly help to check the same.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 11s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 7 new or modified test files.
          +1 javac 7m 57s There were no new javac warning messages.
          +1 javadoc 10m 21s There were no new javadoc warning messages.
          +1 release audit 0m 25s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 49s There were no new checkstyle issues.
          -1 whitespace 0m 15s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 32s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 1m 28s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 63m 34s Tests passed in hadoop-yarn-server-resourcemanager.
              104m 11s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12767216/0008-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 58590fe
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/9469/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9469/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9469/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9469/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

Thanks for the update, Sunil G.

          Some minor comments:

          1) Minor suggestions in LeafQueue#getAMResourceLimitPerPartition:

              Resource queueCurrentLimit;
              synchronized (queueResourceLimitsInfo) {
                queueCurrentLimit = queueResourceLimitsInfo.getQueueCurrentLimit();
              }
          

could be placed within the if (!nodePartition.equals(RMNodeLabelsManager.NO_LABEL)) check.

It may be better to assign a variable to

          Resources.max(resourceCalculator,
                      lastClusterResource, queueCurrentLimit,
                      queuePartitionResource)
          

          Such as, queuePartitionUsableResource, etc.

2) Will tests fail if we revert the changes to TestNodeLabelContainerAllocation? Same question for TestWorkPreservingRMRestartForNodeLabel.

          3) Tests:
          Suggestions:
          testAtleastOneAMRunScenarioPerPartition -> testAtleastOneAMRunPerPartition

          sunilg Sunil G added a comment -

          Hi Wangda Tan
Thank you for sharing the comments. Updating the patch to address the same.

2) Will tests fail if we revert the changes to TestNodeLabelContainerAllocation

Yes, tests will fail. For example, if we do not give any label to submitApp, it will try to allocate on the default label, because in MockRM we set NO_LABEL on all AM resource requests by default. Then MockRM.launchAndRegisterAM(app1, rm1, nm1) will try to schedule on nm1, which only accepts "x", so the AM will not be allocated; it will stay in the SCHEDULED state.

              MockNM nm1 = rm1.registerNode("h1:1234", 8000); // label = x
              rm1.registerNode("h2:1234", 8000); // label = y
              MockNM nm3 = rm1.registerNode("h3:1234", 8000); // label = <empty>
          
              // launch an app to queue a1 (label = x), and check all container will
              // be allocated in h1
              RMApp app1 = rm1.submitApp(200, "app", "user", null, "a1", "x");
              MockAM am1 = MockRM.launchAndRegisterAM(app1, rm1, nm1);
          
          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 35s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 7 new or modified test files.
          +1 javac 8m 7s There were no new javac warning messages.
          +1 javadoc 10m 26s There were no new javadoc warning messages.
          +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 50s There were no new checkstyle issues.
          +1 whitespace 0m 15s The patch has no lines that end in whitespace.
          +1 install 1m 30s mvn install still works.
          +1 eclipse:eclipse 0m 35s The patch built with eclipse:eclipse.
          +1 findbugs 1m 29s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 64m 8s Tests passed in hadoop-yarn-server-resourcemanager.
              105m 24s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12768001/0009-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 381610d
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9520/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9520/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9520/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

          Sunil G.

I just checked the code: getComplexConfigurationWithQueueLabels and the other methods used by TestNodeLabelContainerAllocation have a default node label expression setting. IIUC, these tests shouldn't fail. I think the failures may be related to this code in MockRM:

    amResourceRequest.setNodeLabelExpression((amLabel == null) ? "" : amLabel
        .trim());
          

          Thanks,

          sunilg Sunil G added a comment -

Yes Wangda. I feel we do not need to give a default label ("") in the AM resource request from MockRM.

          sunilg Sunil G added a comment -

          Hi Wangda Tan
In MockRM#submitApp, if we set a default label on the amResourceRequest, it will force the scheduler to allocate the AM on that label. I think we do not need to set a default label from submitApp; we can set it when a test case specifically needs it. Will that be fine? Attaching a patch with the same.
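A hypothetical sketch of what that MockRM#submitApp change could look like (not the actual patch): only set the node label expression on the AM ResourceRequest when a test passes a label explicitly.

    if (amLabel != null && !amLabel.trim().isEmpty()) {
      // Force the AM onto the requested partition only when a test asks for it.
      amResourceRequest.setNodeLabelExpression(amLabel.trim());
    }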

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 12s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 5 new or modified test files.
          +1 javac 7m 59s There were no new javac warning messages.
          +1 javadoc 10m 29s There were no new javadoc warning messages.
          +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 49s There were no new checkstyle issues.
          +1 whitespace 0m 12s The patch has no lines that end in whitespace.
          +1 install 1m 30s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 31s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 yarn tests 57m 47s Tests failed in hadoop-yarn-server-resourcemanager.
              98m 29s  



          Reason Tests
          Timed out tests org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12768201/0010-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 124a412
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9538/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9538/testReport/
          Java 1.7.0_55
          uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9538/console

          This message was automatically generated.

          sunilg Sunil G added a comment -

Attaching an updated version after optimizing the test in MockRM.

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 18m 45s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 5 new or modified test files.
          +1 javac 8m 52s There were no new javac warning messages.
          +1 javadoc 11m 32s There were no new javadoc warning messages.
          +1 release audit 0m 26s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 55s There were no new checkstyle issues.
          +1 whitespace 0m 12s The patch has no lines that end in whitespace.
          +1 install 1m 39s mvn install still works.
          +1 eclipse:eclipse 0m 36s The patch built with eclipse:eclipse.
          +1 findbugs 1m 35s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 61m 12s Tests passed in hadoop-yarn-server-resourcemanager.
              105m 48s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12768224/0011-YARN-3216.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 124a412
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9541/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9541/testReport/
          Java 1.7.0_55
          uname Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9541/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

          Sunil G,

Thanks for the update, the latest patch looks good to me; I will commit in a few days if there are no objections.

          leftnoteasy Wangda Tan added a comment -

Committed to trunk/branch-2. Thanks Sunil G, and thanks for the reviews from Naganarasimha G R/Eric Payne!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8711 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8711/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #600 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/600/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #1324 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1324/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2478 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2478/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2531 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2531/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          sunilg Sunil G added a comment -

          Thank you very much Wangda Tan for the review and commit. Thank you Naganarasimha G R and Eric Payne for the review.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #589 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/589/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/)
          YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java
          Naganarasimha Naganarasimha G R added a comment -

          Hi Tan, Wangda and Sunil G, what about this for 2.7.3?

          sunilg Sunil G added a comment -

          Yes NGarla_Unused. It covers an important fix for AM allocation. However, we need YARN-4304 along with this; otherwise the UI will not show correct information. I missed adding some screenshots there, which I will add today.

          leftnoteasy Wangda Tan added a comment -

          I feel this could be risky given the complexity of this patch, and I'm not very sure whether all of its dependencies are in branch-2.7. I would prefer to make a decision after we have a branch-2.7 patch for this.

          Will review YARN-4304 tomorrow.

          sunilg Sunil G added a comment -

          Yes, the patch covers major changes in the scheduler and depends on a few metric items in NodeLabel, some of which do not seem to be present in the 2.7 line. I will double-confirm that.
          I will work on a 2.7 patch for this and post it here. Thank you Wangda Tan.

          Naganarasimha Naganarasimha G R added a comment -

          That would be good, Sunil G.
          Please give it a try, as I feel this is one of the important limitations of using the Node Label feature!

          Naganarasimha Naganarasimha G R added a comment -

          Hi Tan, Wangda & Sunil G,
          Given that YARN-4751 is trying to do for 2.6 what YARN-4304 does for trunk and 2.8, do we need to raise a JIRA to backport this (maybe not completely, but at least the bug should not exist)? I recently came across this issue in the user forum, reported by Marcin (Handy company); they had to use an inconvenient workaround to overcome it.

          sunilg Sunil G added a comment -

          I agree with your point, and I am sorry for missing this. I had already done some work on it. Yes, I will help complete a cleaner version of YARN-3216 and YARN-4304 here.

          Naganarasimha Naganarasimha G R added a comment -

          Sunil G, before putting in the effort I would just like to know Tan, Wangda's thoughts on this.
          Also, isn't YARN-4304 already planned to be handled in YARN-4751? If Eric agrees, you could assign it to yourself and handle both issues there, or else raise a new JIRA mapped to this one; anything is fine with me.

          sunilg Sunil G added a comment -

          Yes. YARN-4751 handles that with the currently available mechanisms, but if we port YARN-3216 we can handle it in a better way. Eric Payne has listed the dependent JIRAs in the comments section of YARN-4751. Yes, we can wait for Wangda Tan's opinion.

          marcin.tustin Marcin Tustin added a comment -

          I've written up how we worked around this issue here: https://medium.com/handy-tech/practical-capacity-scheduling-with-yarn-28548ae4fb88#.5ihha0oqy
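
          For readers who land here before picking up this fix, one common stop-gap is to raise the AM resource percentage on the affected queue, so that the limit computed from the default partition alone is less likely to block AMs. The sketch below only illustrates that kind of configuration change and is not necessarily the approach described in the write-up above; the queue path root.labeled and the 0.5 value are illustrative assumptions, while the property keys are the standard CapacityScheduler ones:

          import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration;

          // Minimal sketch of the stop-gap: bump maximum-am-resource-percent for one queue.
          // "root.labeled" and 0.5 are hypothetical values chosen only for illustration.
          public class AmPercentWorkaroundSketch {
            public static void main(String[] args) {
              CapacitySchedulerConfiguration csConf = new CapacitySchedulerConfiguration();

              // Cluster-wide default: AM containers may use at most 10% of a queue's resources.
              csConf.setFloat("yarn.scheduler.capacity.maximum-am-resource-percent", 0.1f);

              // Per-queue override for the hypothetical queue root.labeled, so AMs are less likely
              // to be starved while the limit is still computed from the default partition only.
              csConf.setFloat("yarn.scheduler.capacity.root.labeled.maximum-am-resource-percent", 0.5f);

              System.out.println(csConf.get("yarn.scheduler.capacity.root.labeled.maximum-am-resource-percent"));
            }
          }

          In practice the same keys would normally be set in capacity-scheduler.xml rather than programmatically; the programmatic form is used here only to keep the sketch self-contained.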
          leftnoteasy Wangda Tan added a comment -

          Thanks Marcin Tustin, that is very helpful!

          sunilg Sunil G added a comment -

          Yes. This is helpful. Thanks Marcin.

          With the recent progress in YARN-4751, I think I can make the necessary changes here too. I will wait for the dependencies of YARN-4751 to be cleared, and then make progress here.

          Naganarasimha Naganarasimha G R added a comment -

          Thanks Marcin Tustin, it is a useful doc. And yes Sunil G, it would be ideal to have a fix specific to 2.7.x, as this is a major limitation in using node labels.

          marcin.tustin Marcin Tustin added a comment -

          Sunil G Yeah, YARN-4751 will be VERY nice to have.

          Tan, Wangda NGarla_Unused My pleasure! Just trying to build up the community's knowledge, and keep it shared.


            People

            • Assignee:
              sunilg Sunil G
            • Reporter:
              leftnoteasy Wangda Tan
            • Votes:
              0
            • Watchers:
              15
