Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.9, 2.1.1-beta
    • Fix Version/s: 2.7.0
    • Component/s: mr-am, mrv2
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      This introduces two new MR2 job configs, listed below, which allow users to control the maximum number of simultaneously running tasks of the submitted job across the cluster:

      * mapreduce.job.running.map.limit (default: 0, for no limit)
      * mapreduce.job.running.reduce.limit (default: 0, for no limit)

      This is controllable at a per-job level.
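
      For illustration, a minimal sketch of setting these limits from the standard Java Job API (the limit values and the job name here are hypothetical, not defaults):

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.mapreduce.Job;

          // Throttle this job to at most 10 concurrent maps and 5 concurrent
          // reduces across the cluster; 0 (the default) means no limit.
          Configuration conf = new Configuration();
          conf.setInt("mapreduce.job.running.map.limit", 10);
          conf.setInt("mapreduce.job.running.reduce.limit", 5);
          Job job = Job.getInstance(conf, "throttled-job");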

      Description

      It would be nice if users could specify a limit on the number of map or reduce tasks that are running simultaneously. Occasionally users perform operations in tasks that can lead to DDoS scenarios if too many tasks run simultaneously (e.g., accessing a database, web service, etc.). Having the ability to throttle the number of simultaneously running tasks would give users a way to mitigate issues with too many tasks on a large cluster attempting to access a service at any one time.

      This is similar to the functionality requested by MAPREDUCE-224 and implemented by HADOOP-3412, but it was dropped in MRv2.

      1. MAPREDUCE-5583v4.patch
        28 kB
        Jason Lowe
      2. MAPREDUCE-5583v3.patch
        28 kB
        Jason Lowe
      3. MAPREDUCE-5583v2.patch
        28 kB
        Jason Lowe
      4. MAPREDUCE-5583v1.patch
        17 kB
        Jason Lowe
      5. MAPREDUCE-5583-branch2.4.1.patch
        16 kB
        Yang Hao

        Issue Links

          Activity

          acmurthy Arun C Murthy added a comment -

          Jason Lowe We can already accomplish this with queue/user limits?

          jlowe Jason Lowe added a comment -

           Not in a general way. Different jobs can have different limits, and queue/user limits are too granular a tool to handle that appropriately. We're either creating a ton of queues for the various scenarios, which is a huge pain from the usability and admin point of view, or we're artificially constricting jobs that don't need those limits but happen to run in a queue that was shrunk for other jobs. For example, take the case where we need to increase the memory for map tasks. If we took the use-the-queue-as-the-limit route we would now have fewer tasks running simultaneously than we did before, which is undesirable, and the queue would need to be changed each time the job grows or shrinks. If we could limit it per-job in the AM it would have run with the appropriate parallelism, assuming the original queue had the capacity.

          Having per-job limits allows the user to tune their jobs in a much more intuitive way and without requiring admins to assist in that tuning.

          vinodkv Vinod Kumar Vavilapalli added a comment -

           Yes, +1 for this. This is how HADOOP-5170 was supposed to be addressed, but it couldn't be in MRv1.

          acmurthy Arun C Murthy added a comment -

           I'm very concerned about allowing this sort of control to the end user without checks and balances. If everyone does the same "I want a max of n tasks", all the resources in the cluster can deadlock. This is one of the main reasons we haven't supported this. I'm still against it.

          acmurthy Arun C Murthy added a comment -

          Looks like we are rehashing the discussion on HADOOP-5170.

          sandyr Sandy Ryza added a comment -

          Arun C Murthy, how would this cause deadlock?

          acmurthy Arun C Murthy added a comment -

           Cluster with 100,000 containers, 1,000 jobs, each with 100,000 tasks, and each specifies that it can only run 5 tasks. So you are now only using 5% of the cluster and no one makes progress, leading to very poor utilization and a peanut-buttering effect.

           Admittedly it's a contrived example, and yes, I agree a user can hack his own AM to do this - but let's not make this trivial for normal users. Supporting it out of the box leads to all sorts of bad side effects.

          Some form of admin control (e.g. queue with a max-cap) for a small number of use-cases where you actually need this feature is much safer.

          toffer Francis Liu added a comment -

           Cluster with 100,000 containers, 1,000 jobs, each with 100,000 tasks, and each specifies that it can only run 5 tasks. So you are now only using 5% of the cluster and no one makes progress, leading to very poor utilization and a peanut-buttering effect.

           Given that YARN is supposed to engender a diverse set of AMs, this seems to be a problem that should be solved by the RM anyway? I'm not that familiar with the scheduler, but if we were to use queues to limit the number of tasks, wouldn't the outcome be the same? Since we're bound by the upper-limit config on max jobs?

          Some form of admin control (e.g. queue with a max-cap) for a small number of use-cases where you actually need this feature is much safer.

           We have a number of use cases and the list is growing. I'm hoping we can come up with a solution that does not require users to hack the MRv2 AM. This would not only be useful as a manual MR config; I can see it being useful as something an InputFormat/OutputFormat sets automatically, or maybe even something that DSLs can leverage. Apart from queues, some users control this by limiting the number of reducers or by controlling the map tasks. The latter is done by merging split files, which is undesirable as it would make a task failure costly. So it'd be great if we could have a clean way of doing this.

          jlowe Jason Lowe added a comment -

          Had an offline discussion about this with Arun, and he suggested using the ANY ask (i.e.: host="*") to act as a limit to the request. YARN only schedules containers for an application as long as the ANY ask is non-zero, so sending a request for 100 hosts and 10 racks but an ANY ask of 1 will only return 1 container. If the AM carefully modulates the ANY ask then it can self-limit without needing to give up telling the RM about all of its locality desires.

           Attaching a patch that implements this approach. It needs unit tests, but I've manually tested it and maps and reduces are being limited accordingly. The mapreduce.job.running.maps.limit and mapreduce.job.running.reduces.limit properties control it, where 0 (the default) means no limit; otherwise it specifies the number of maps or reduces, respectively, that will be allowed to run concurrently.

          Feedback appreciated.
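
           As a rough sketch of the idea (hypothetical names, not the actual patch code), the node- and rack-level asks keep the full locality picture while only the ANY-level ask is clamped to the remaining headroom under the limit:

             // Remaining headroom under the configured per-job limit.
             int remaining = Math.max(0, mapLimit - runningMaps);
             ResourceRequest anyAsk = lookupAnyAsk(mapPriority, mapCapability); // hypothetical lookup
             if (mapLimit > 0 && anyAsk != null
                 && anyAsk.getNumContainers() > remaining) {
               // Clamp only the ANY ask; node/rack asks stay untouched so the RM
               // can still pick the best-located containers to grant.
               anyAsk.setNumContainers(remaining);
             }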

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12680200/MAPREDUCE-5583v1.patch
          against trunk revision 42bbe37.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4999//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4999//console

          This message is automatically generated.

          jlowe Jason Lowe added a comment -

          Added a unit test.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12683415/MAPREDUCE-5583v2.patch
          against trunk revision 2967c17.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core:

          org.apache.hadoop.mapreduce.lib.db.TestDbClasses

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5047//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5047//console

          This message is automatically generated.

          jlowe Jason Lowe added a comment -

          TestDbClasses failure appears to be unrelated. I can't reproduce it locally, and the only changes in the patch that would be exposed to hadoop-mapreduce-client-core are the addition of new properties to MRJobConfig and mapred-default.xml. The code that acts on those new properties is above hadoop-mapreduce-client-core. I suspect it's a sporadic failure due to a very aggressive timeout (only 1 second). Filed MAPREDUCE-6172 to track that issue.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12684819/MAPREDUCE-5583-branch2.4.0.patch
          against trunk revision 7caa3bc.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5056//console

          This message is automatically generated.

          yanghaogn Yang Hao added a comment -

           The patch for branch-2.4.1.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12684848/MAPREDUCE-5583-branch2.4.1.patch
          against trunk revision 7caa3bc.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5057//console

          This message is automatically generated.

          yanghaogn Yang Hao added a comment -

           The config mapreduce.job.running.reduces.limit in mapred-default.xml should be mapreduce.job.running.reduce.limit.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12698945/MAPREDUCE-5583v3.patch
          against trunk revision 3338f6d.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5196//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12684848/MAPREDUCE-5583-branch2.4.1.patch
          against trunk revision 3338f6d.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5197//console

          This message is automatically generated.

          jlowe Jason Lowe added a comment -

           The config mapreduce.job.running.reduces.limit in mapred-default.xml should be mapreduce.job.running.reduce.limit.

          Nice catch, Yang! I fixed this and rebased the patch on trunk.

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12699734/MAPREDUCE-5583v3.patch
          against trunk revision d49ae72.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5209//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5209//console

          This message is automatically generated.

          djp Junping Du added a comment -

          Thanks Jason Lowe for updating the patch. A few comments here:
           I see we check canAssignMaps() three times in the while loops in assignMapsWithLocality(). Can we simply return at the beginning of the method if this is false?

          +  protected void setRequestLimit(Priority priority, Resource capability,
          +      Integer limit) {
          +    if (limit == null || limit < 0) {
          +      limit = Integer.MAX_VALUE;
          +    }
          

           Any special reason to use Integer instead of int for limit? If not, I think using int would be slightly better, since then we can use "if (limit <= 0)".

          +  private void applyRequestLimits() {
          +    Iterator<ResourceRequest> iter = requestLimits.values().iterator();
          +    while (iter.hasNext()) {
          +      ResourceRequest reqLimit = iter.next();
          +      int limit = reqLimit.getNumContainers();
          +      Map<String, Map<Resource, ResourceRequest>> remoteRequests =
          +          remoteRequestsTable.get(reqLimit.getPriority());
          +      Map<Resource, ResourceRequest> reqMap = (remoteRequests != null)
          +          ? remoteRequests.get(ResourceRequest.ANY) : null;
          +      ResourceRequest req = (reqMap != null)
          +          ? reqMap.get(reqLimit.getCapability()) : null;
          +      if (req == null) {
          +        continue;
          +      }
          +      // update an existing ask or send a new one if updating
          +      if (ask.remove(req) || requestLimitsToUpdate.contains(req)) {
          +        ResourceRequest newReq = req.getNumContainers() > limit
          +            ? reqLimit : req;
          +        ask.add(newReq);
          +        LOG.info("Applying ask limit of " + newReq.getNumContainers()
          +            + " for priority:" + reqLimit.getPriority()
          +            + " and capability:" + reqLimit.getCapability());
          +      }
          +      if (limit == Integer.MAX_VALUE) {
          +        iter.remove();
          +      }
          +    }
          +    requestLimitsToUpdate.clear();
          +  }
          

           I think we are filtering the ask here (to make sure we don't ask for more containers than the limit allows) before making the remote request to the RM. If so, why do we only filter the request with "Locality.ANY"? The local-node and local-rack request numbers could also exceed this limit, couldn't they? I think we should filter the requests starting from Locality.ANY and could extend to RACK_LOCAL, then NODE_LOCAL. Thoughts?
           In addition, it sounds like we use Integer.MAX_VALUE as a marker for no limit. Can we check it earlier and return?

          djp Junping Du added a comment -

           More comments to come later.

          jlowe Jason Lowe added a comment -

          Thanks for the review, Junping!

           I see we check canAssignMaps() three times in the while loops in assignMapsWithLocality(). Can we simply return at the beginning of the method if this is false?

          canAssignMaps() is not a constant predicate function, and it's important to make it a conditional of the three while loops. As we assign more maps we can end up hitting the configured task limit.

           We could add an "early out" since once canAssignMaps returns false it will not return true for the remainder of the method. However it's a very cheap function to evaluate, and this optimization is hardly going to save any computation. We'd be optimizing for the uncommon case where the RM handed us more containers than we're supposed to assign. In the common case, the list of containers to assign is empty and that will preclude the need to call canAssignMaps. Since the "early out" at the beginning of the method doesn't save much I didn't make this change. If it's important this be done and I missed something, please let me know.

           Any special reason to use Integer instead of int for limit? If not, I think using int would be slightly better, since then we can use "if (limit <= 0)".

          This was an artifact of an earlier design, and I see no reason to use Integer. Changed to int. However I left it as "if (limit < 0)" because there are cases where we want the limit to really be 0 (i.e.: to not request any more containers of that priority).

           why do we only filter the request with "Locality.ANY"?

          We only need to check ANY because of the way the YARN AM-RM protocol works. ANY limits the total number of containers the application is requesting. Any new container being requested, whether that's node-local or rack-local, requires the ANY level to be incremented by one so the RM knows it needs to allocate one more container. The RM will only grant as many containers as the app requested at the ANY level regardless of how many it asked at the rack or node level.

          It's important we only limit at the ANY level because the AM has no idea what locality is available on the cluster. Essentially what we're doing with this patch is giving the RM the full information of what we want wrt. locality but then also telling it to not give us all of those containers right now. That allows the RM to make the decision of which containers to grant, and it will try to do its best with locality. The problem with having the AM try to limit tasks at the rack or node locality level is that it has no idea what locality is available at the time – only the RM does. For example, if the AM wants to run two map tasks but is configured to only run one, it would have to decide which map task to request. However it doesn't know which task has availability on the cluster. It could end up asking for only task 1, not realizing that task 1 has no locality available right now (node/rack is full) but task 2 does. By telling the RM we want nodes for task 1 and task 2 but only allowing it to give us one container right now, the RM is able to decide which node to allocate based on the requested locality. Therefore the AM is more likely to get better locality with this approach than by trying to limit its task requests directly itself.

           In addition, it sounds like we use Integer.MAX_VALUE as a marker for no limit. Can we check it earlier and return?

          We can't early-out because if a limit is being removed (i.e.: set to MAX_VALUE) then we need to send the original, unfiltered ask for ANY. Failure to do so means the RM wouldn't see the updated ANY ask and the limit would still be in place.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12700591/MAPREDUCE-5583v4.patch
          against trunk revision 9a37247.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5221//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5221//console

          This message is automatically generated.

          djp Junping Du added a comment -

          Thanks Jason Lowe for updating the patch!

          In the common case, the list of containers to assign is empty and that will preclude the need to call canAssignMaps. Since the "early out" at the beginning of the method doesn't save much I didn't make this change.

          Agree. "early out" doesn't seem to have much value here.

          However I left it as "if (limit < 0)" because there are cases where we want the limit to really be 0 (i.e.: to not request any more containers of that priority).

           If so, I think we should keep the logic consistent here. Many places seem to treat limit=0 the same way as <0, e.g.

          +  private boolean canAssignMaps() {
          +    return (maxRunningMaps <= 0
          +        || assignedRequests.maps.size() < maxRunningMaps);
          +  }
          +
          +  private boolean canAssignReduces() {
          +    return (maxRunningReduces <= 0
          +        || assignedRequests.reduces.size() < maxRunningReduces);
          +  }
          

          ANY limits the total number of containers the application is requesting. Any new container being requested, whether that's node-local or rack-local, requires the ANY level to be incremented by one so the RM knows it needs to allocate one more container.

          Agree.

          The RM will only grant as many containers as the app requested at the ANY level regardless of how many it asked at the rack or node level.

           I am not sure how big the impact of any inconsistency among the rack-local, node-local and ANY requests is here. From a quick check of the FifoScheduler (which is simpler than CS or FS), it looks like it will first check maxContainers from the ANY request, as you said. However, when maxContainers > 0 (a request for ANY exists), it will go ahead and assign containers according to the requests in the sequence node-local, rack-local, then ANY. Assume the same request for 3 containers on node0, rack0 and ANY; it could be adjusted to 2 containers for ANY on the client side, but the real allocation could still be 3 containers. Am I missing something here?

          It's important we only limit at the ANY level because the AM has no idea what locality is available on the cluster. Essentially what we're doing with this patch is giving the RM the full information of what we want wrt. locality but then also telling it to not give us all of those containers right now. That allows the RM to make the decision of which containers to grant, and it will try to do its best with locality.

           I can see that keeping the node- and rack-level requests gives the RM richer locality information. However, I have a quick question here: do we add back the filtered asks later, once the limit is no longer hit? I didn't find this important logic there.

          jlowe Jason Lowe added a comment -

           If so, I think we should keep the logic consistent here. Many places seem to treat limit=0 the same way as <0

          The limit I referred to before is a temporary limit applied to the current ask. This limit is the one the user specified, which is effectively constant for the lifetime of the job. It makes no sense to configure a job with a maximum concurrent task limit of zero – the job would just hang. Therefore the code treats a user-specified limit of 0 as no limit. However during a limit-enabled job's lifetime it may need to temporarily set the ask to zero because it's already running at the configured limit. That's why one of them uses limit=0 to really mean no tasks will be scheduled and the other does not.
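
           Restating that distinction as a small sketch (runningMaps and the surrounding context are hypothetical, not the actual patch code):

             // User-facing config: 0 (the default) or a negative value means no limit.
             int configuredLimit = conf.getInt("mapreduce.job.running.map.limit", 0);
             boolean limitEnabled = configuredLimit > 0;
             // Per-heartbeat ask when a limit is enabled: this may legitimately be 0
             // while the job is already running at its configured limit, and it grows
             // again as running tasks complete.
             int currentAsk = limitEnabled
                 ? Math.max(0, configuredLimit - runningMaps)
                 : Integer.MAX_VALUE; // stands in for "no throttling"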

           Assume the same request for 3 containers on node0, rack0 and ANY; it could be adjusted to 2 containers for ANY on the client side, but the real allocation could still be 3 containers. Am I missing something here?

          Yes, it looks like the FIFO scheduler handles this differently and IMHO has a bug. Both the CapacityScheduler and, from my brief glance at it, the FairScheduler will only assign as many containers as specified by ANY. Each container allocated will always decrement the current ANY ask, and the scheduler checks at the top of the assignment loop if there are still ANY asks before proceeding. See LeafQueue#assignContainers and SchedulerApplicationAttempt#getTotalRequiredResources. It's true that it will consider node-level asks and rack-level asks, but it only assigns one container. As I stated before, the problem with the AM tuning down the rack- and node- level requests is knowing which racks and nodes are the correct ones to limit. The AM cannot know this, but the RM can. That's why the full locality information should be sent to the RM and let it decide where to allocate the limited number of containers.

           do we add back the filtered asks later, once the limit is no longer hit?

          Yes, RMContainerAllocator performs the current limit calculation every heartbeat and will send the current requested ask (i.e.: how many containers we are short of hitting the limit). See applyConcurrentTaskLimits. So if the limit is 10 and we get 7 containers, it will turn around and ask for 3 the next heartbeat. If two tasks then complete, it will update the ask from 3 to 5 (assuming no more containers were granted in the interim).

          kasha Karthik Kambatla added a comment -

          from my brief glance at it, the FairScheduler will only assign as many containers as specified by ANY.

          I think so too. If FIFO doesn't do this today, I agree it is a bug.

          djp Junping Du added a comment -

           Thanks Jason Lowe for the explanation and Karthik Kambatla for the confirmation on the FairScheduler.
           I agree that we should fix this in the FifoScheduler given the expected scheduler behavior we discussed above, and I will file a JIRA for this issue later.
           The patch looks good to me now except one nit below:

          +    ResourceRequest oldReqLimit = requestLimits.put(newReqLimit, newReqLimit);
          +    if (oldReqLimit == null || oldReqLimit.getNumContainers() < limit) {
          +      requestLimitsToUpdate.add(newReqLimit);
          +    }
          

           Looks like requestLimits will always have the same key and value here. So the return value for requestLimits.put(newReqLimit, newReqLimit) can only be null or the same as newReqLimit (even in numOfContainers), so checking for "oldReqLimit.getNumContainers() < limit" sounds unnecessary to me.

          djp Junping Du added a comment -

          Filed YARN-3274 to address the issue in FifoScheduler.

          jlowe Jason Lowe added a comment -

           So the return value for requestLimits.put(newReqLimit, newReqLimit) can only be null or the same as newReqLimit (even in numOfContainers), so checking for "oldReqLimit.getNumContainers() < limit" sounds unnecessary to me.

           The return value from requestLimits.put will be the previously stored request limit, which is not necessarily newReqLimit if it's not null. If a job with 3 tasks has a limit of 1 task and 1 is already running, then the current ask is 0. As that task completes the ask will go from 0 to 1 since we can now run another task. In that case oldReqLimit.getNumContainers() == 0 and newReqLimit.getNumContainers() == 1, so they are not the same.

          The intent of this check is to re-send the ask if the limit is being raised. If we don't re-send the ask when the limit goes up then the RM may fail to allocate more containers when we need them which would result in a hang. (AM thinks it's asking, RM thinks the AM doesn't want any.)

          djp Junping Du added a comment -

           As that task completes the ask will go from 0 to 1 since we can now run another task. In that case oldReqLimit.getNumContainers() == 0 and newReqLimit.getNumContainers() == 1, so they are not the same.

           I see. I always forget our old hack that compare() in ResourceRequestComparator is inconsistent with equals() in ResourceRequest.
           Latest patch LGTM. Given the latest patch was uploaded almost a week ago, I'm kicking off the Jenkins test again.
           +1 pending the Jenkins result.

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12700591/MAPREDUCE-5583v4.patch
          against trunk revision b18d383.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5239//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5239//console

          This message is automatically generated.

          djp Junping Du added a comment -

          +1. Committing it in.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #7245 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7245/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          djp Junping Du added a comment -

I have committed this patch to trunk and branch-2. Thanks Jason Lowe for the patch, and Yang Hao and Karthik Kambatla for the review!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #121 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/121/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #855 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/855/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk #2053 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2053/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #112 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/112/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #121 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/121/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2071 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2071/)
          MAPREDUCE-5583. Ability to limit running map and reduce tasks. Contributed by Jason Lowe. (junping_du: rev 4228de94028f1e10ca59ce23e963e488fe566909)

          • hadoop-mapreduce-project/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java

People

• Assignee: jlowe Jason Lowe
• Reporter: jlowe Jason Lowe
• Votes: 2
• Watchers: 27
