Hadoop YARN / YARN-2005

Blacklisting support for scheduling AMs

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.10, 2.4.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side.

      Attachments

      1. YARN-2005.009.patch
        64 kB
        Anubhav Dhoot
      2. YARN-2005.008.patch
        67 kB
        Anubhav Dhoot
      3. YARN-2005.007.patch
        63 kB
        Anubhav Dhoot
      4. YARN-2005.006.patch
        54 kB
        Anubhav Dhoot
      5. YARN-2005.006.patch
        54 kB
        Anubhav Dhoot
      6. YARN-2005.005.patch
        48 kB
        Anubhav Dhoot
      7. YARN-2005.004.patch
        46 kB
        Anubhav Dhoot
      8. YARN-2005.003.patch
        41 kB
        Anubhav Dhoot
      9. YARN-2005.002.patch
        40 kB
        Anubhav Dhoot
      10. YARN-2005.001.patch
        40 kB
        Anubhav Dhoot

        Issue Links

          Activity

          jlowe Jason Lowe added a comment -

          This is particularly helpful on a busy cluster where one node happens to be in a state where it can't launch containers for some reason but hasn't self-declared an UNHEALTHY state. In that scenario the only place with spare capacity is a node that fails every container attempt, and apps can fail due to the RM not realizing that repeated AM attempts on the same node aren't working.

          In that sense a fix for YARN-1073 could help quite a bit, but there could still be scenarios where a particular app's AMs end up failing on certain nodes but other containers run just fine.

          sunilg Sunil G added a comment -

          Hi Jason Lowe

          As discussed in YARN-2293, AMs can fail on a few nodes, and such nodes can be avoided when launching the next attempt or even new AMs.

          A scoring mechanism based on AM container failures will be the key point here. Container failures that are not related to a buggy application can be considered genuine candidates here.

          The scoring mechanism also has to be lenient over time. If a node is blacklisted and stays quiet for a long time, running only normal containers, it can be brought back as a normal node.

          I would like to keep the discussion on this here. Please share your thoughts.

          jlowe Jason Lowe added a comment -

          My concern with a cluster-wide approach like that proposed in YARN-2293 is the ability of one buggy setup from a user/app to spoil nodes for others. For example, if there's a workflow that constantly spams the RM with job submissions of AMs that are broken (AM either fails instantly or is able to register but then fails), how does the RM/NM distinguish that failure as being specific to the node vs. a buggy application?

          Even a per-application blacklisting logic could be beneficial to help prevent subsequent attempts from the same application launching on the same node. We may consider doing per-application blacklisting logic if that's simpler to manage/implement in light of buggy apps on a cluster.

          sunilg Sunil G added a comment -

          Yes, Jason Lowe.
          A buggy application is my concern too.

          If an application's AM fails on a node, prevent the RM from launching further attempts of that application on the same node.
          If two or more different applications fail on a node, such a node can be given lower priority (lower rank) when scheduling AMs for newer applications.

          Generally this logic can only prioritize the nodes to be selected for AM scheduling. In the worst case, the lowest-ranked NM can still be scheduled for a new AM.

          jlowe Jason Lowe added a comment -

          If two or more different applications fail on a node, such a node can be given lower priority (lower rank) when scheduling AMs for newer applications.

          The problem there is a workflow that spams bad applications. Many separate applications fail in that scenario, so I guess it depends upon what you mean by "different" applications. Is that different users, app names, or ...?

          In the worst case, the lowest-ranked NM can still be scheduled for a new AM.

          Also, ranking alone is not sufficient. We've seen instances on busy clusters where a bad node was the only node with free resources, and all the AM attempts were scheduled in quick succession on that node, causing the overall application to fail. Relying solely on node weighting is not going to prevent that problem, since the only eligible node at the time it wants to schedule is a bad one. In addition to pure node ordering based on weighting, it needs some kind of weight threshold below which it will refuse to use the node completely. As mentioned, this weight could be modulated with some time-based metric to allow even poor nodes to be tried if we wait too long. However we need a "don't even go there" level to avoid rapid rescheduling of failed AM attempts on the same node in a busy cluster scenario.

          sunilg Sunil G added a comment -

          Is that different users, app names, or ...?

          Yes. App name is the first thing that came to my mind. As you mentioned, the challenge here is to find the real buggy application that comes in as a workflow. There can also be genuine cases where a workflow of jobs fails because of a node problem.
          To overcome this, multiple inputs can be considered, such as app name, user, queue, etc.

          Point 1:
          An app from "user1" with name "job1" failed on node1. If the same app name "job1" fails again on the same node, the recent history and the AMs currently running on that node can be cross-checked. This may give a better idea of the behavior of that node.
          In simple words, a sample of 2 or more different applications (categorized by name, user, etc.) always has to be considered before taking a decision on a node.

          Point 2:
          If an app from "user1" with name "job2" fails on node1, it is very appropriate to try its second attempt on a different node.

          However we need a "don't even go there" level to avoid rapid rescheduling of failed AM attempts on the same node in a busy cluster scenario.

          This is one of my real intentions as well. But continuous monitoring of the cluster along with its historical data will play a pivotal role here, and time also has to be a factor in the decision. I could jot down a few points and share them as a doc, and we can see whether this adds value to the system without leaving it open to being gamed.

          jlowe Jason Lowe added a comment -

          App name is the first thing that came to my mind.

          The problem with app name in the workflow spamming case is that many workflows I've seen use a different app name each time they submit, since the app name often includes some timestamp indicating which data window it's consuming/producing. If the workflow is retrying the same failed apps then the app name may not be changing, but if it's plowing ahead submitting other jobs then it very likely is changing.

          If an app from "user1" with name "job2" fails on node1, it is very appropriate to try its second attempt on a different node.

          Totally agree. I think it's worthwhile to consider implementing a relatively simple app-specific blacklisting logic to avoid this fairly common scenario. We can then follow that up with a much more sophisticated blacklisting algorithm with fancy weighting, time decays, etc., but the biggest problem we're seeing probably doesn't need anything that fancy to solve 80% of the cases we see.

          I could jot down a few points and share them as a doc

          Sounds good, feel free to post one.

          sunilg Sunil G added a comment -

          many workflows I've seen use a different app name each time they submit

          Yes. This is a really valid point in a production cluster. I expect the user will be the same for these, so the user can play a bigger role in identifying such problematic apps.
          As mentioned in point 2:

          • a failed app can be restricted from running on the same node on a reattempt
          • if an app has failed for a given user and the same user submits another application, it is good to schedule that one on a different node as well.

          but the biggest problem we're seeing probably doesn't need anything that fancy to solve 80% of the cases we see.

          I agree that simple logic can be shaped up first, and then we can look at feasibility and analyze how much it really helps. After that it is better to go for more complexity. Kindly share your thoughts on the same.

          jlowe Jason Lowe added a comment -

          As I mentioned earlier, as a first step I think we could implement an app-specific blacklisting approach similar to what is done by the MapReduce AM today. We would track, per application, the nodes that have failed an AM attempt and refuse to launch subsequent AM attempts for that application on those nodes. If we want to keep it really simple, we could just do literally that. From there we can sprinkle additional logic to make it a bit more sophisticated, e.g.: having the blacklisting auto-disable when the percentage of blacklisted nodes compared to the total active nodes is above some threshold and/or the app has waited some amount of time for an AM container for the next attempt.
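
          A minimal sketch of the per-application blacklisting idea described above; the class and method names here are illustrative, not the ones used in the eventual patch:

            import java.util.Collections;
            import java.util.HashSet;
            import java.util.Set;

            // Tracks, for one application, the hosts that failed an AM attempt and
            // refuses to hand them out again unless too much of the cluster is
            // already blacklisted (the "auto-disable" threshold mentioned above).
            public class PerAppAmBlacklist {
              private final Set<String> failedHosts = new HashSet<String>();
              private final double disableThreshold; // e.g. a fraction of active hosts

              public PerAppAmBlacklist(double disableThreshold) {
                this.disableThreshold = disableThreshold;
              }

              /** Record the host on which an AM attempt of this app failed. */
              public synchronized void addAmFailureOn(String host) {
                failedHosts.add(host);
              }

              /** Hosts to exclude when scheduling the next AM attempt. */
              public synchronized Set<String> getBlacklist(int activeHosts) {
                if (activeHosts <= 0
                    || (double) failedHosts.size() / activeHosts > disableThreshold) {
                  // Too much of the cluster is blacklisted: disable rather than starve.
                  return Collections.<String>emptySet();
                }
                return new HashSet<String>(failedHosts);
              }
            }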

          sunilg Sunil G added a comment -

          we could implement an app-specific blacklisting approach similar to what is done by the MapReduce AM today.

          +1. Yes, I also feel this can be the first step. Without complicating things too much, we can take a step forward.
          For that, we need changes in the RMAppImpl transitions to keep track of the host where the previous failed attempt ran, and then pass that to the Scheduler as an exclude list used only for AM scheduling. Could I work on this? I can share a prototype soon, and later it can be brought to production standard by splitting into appropriate sub-JIRAs.

          adhoot Anubhav Dhoot added a comment -

          Assigning to myself as I am starting work on this. Sunil G, let me know if you have made progress on this already.

          sunilg Sunil G added a comment -

          Hi Anubhav Dhoot
          I started working on this a little earlier and did some analysis on it. Please feel free to start working on it, and I can help with the reviews. If work on any other sub-parts is needed to finish this, please let me know and I can give you a hand.
          Thank you.

          stevel@apache.org Steve Loughran added a comment -

          This is what we do for Slider (http://steveloughran.blogspot.co.uk/2015/05/dynamic-datacentre-applications.html), with SLIDER-856 containing the failure analysis as part of the placement rework of SLIDER-611.

          It differentiates:

          • known node failure events (counts against node reliability)
          • known app failures (limits exceeded) (counts against component reliability, not nodes)
          • pre-emption (don't worry about them)
          • startup failures (often a symptom of TCP port conflict, localisation failure, lack of keytabs, or some other incompatibility between container and node)
          • general "container exit" events (count against node and component)

          Also

          • it resets the counters regularly.
          • has different failure thresholds for different components (e.g for 30+ region servers, we have a higher threshold than for the 2 hbase masters)
          • doesn't let the unreliability of one component on a node count against it being used for requesting different components on it. (Mixed merit here; good for things like port conflict, bad for other causes).

          None of this looks at AM failures. We haven't seen specific problems there to the same extent as with some containers, because YARN does the tracking, the AM doesn't have any hard-coded ports, and with one AM per app the failure rate is much lower. Where we do have problems it is usually immediately obvious at launch time, and almost invariably environment related.

          stevel@apache.org Steve Loughran added a comment -

          Don't do it yet, but plan for a future version to add liveness probes, which is what we're adding to Slider soon. The AM already registers its IPC and HTTP ports; if the AM could also register a health URL, such as the Codahale /healthy URL, then something near the RM could decide when the AM had failed. For that we need:

          • URLs to be provided at AM registration, or updated later
          • something to do the liveness checks. The RM is overloaded on a big cluster, but a little YARN service that could be launched standalone or embedded would be enough. I have all the code for liveness probes (basic TCP, http gets & status, with a launch track policy: you are given time to start, but once a probe is up, it must stay up). Of course, it'd need to run on an RM node for the redirect logic to not bounce it through the RM proxy.
          • AMs to provide simple health URLs which return an HTTP error code on failure, 200 if happy.
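
          A minimal sketch of the probe idea outlined in this comment: periodically GET the health URL the AM registered and treat anything other than HTTP 200 (or no answer at all) as unhealthy. The class name and URL are hypothetical, and a real probe would also need the launch grace period mentioned above:

            import java.net.HttpURLConnection;
            import java.net.URL;

            public class AmHealthProbe {
              /** True if the AM's health URL answers 200 within the timeout. */
              static boolean isHealthy(String healthUrl, int timeoutMs) {
                try {
                  HttpURLConnection conn =
                      (HttpURLConnection) new URL(healthUrl).openConnection();
                  conn.setConnectTimeout(timeoutMs);
                  conn.setReadTimeout(timeoutMs);
                  conn.setRequestMethod("GET");
                  return conn.getResponseCode() == 200;
                } catch (Exception e) {
                  return false; // refused connection, timeout, etc. count as unhealthy
                }
              }

              public static void main(String[] args) {
                // Hypothetical health URL registered by an AM.
                System.out.println(isHealthy("http://am-host:8088/healthy", 2000));
              }
            }
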
          sunilg Sunil G added a comment -

          Hi Steve Loughran
          In our environments we have seen AM container launch failures on specific nodes due to memory issues (-Xmx configs), while on other nodes it was fine. So if an AM container fails on node1, for its second attempt we can try a node other than node1, keeping in mind that we should do that skipping only for a certain duration or number of retries. This is not a complete solution, but it is somewhat of a fail-safe option.

          known node failure events (counts against node reliability)

          A proposal was made earlier in YARN-2293, where a count or score was kept against the reliability of a node (container failures there also contribute to node reliability). I can see SLIDER-856 takes a somewhat similar approach. Do you see any advantage in doing this in YARN?

          stevel@apache.org Steve Loughran added a comment -

          Sunil: we use the scoring to decide whether to trust nodes for specific components (e.g. region servers); we can't do anything in the AM for AM failures.

          Like you propose, you can differentiate some node-related as well as node-unrelated problems, though a generic System.exit(-1) has to be treated as a "both" failure. That is, unless the AM could exit with a specific error code meaning 'this node doesn't suit us', which could be used to bail out on problems like a missing keytab, a port in use, no GPU, ...

          sunilg Sunil G added a comment -

          Thank you Steve Loughran for sharing your thoughts.
          I completely agree with your point that the AM also has to return a specific code, as per an agreement with the RM, so that we can take a decision more wisely. Still, I feel it will be completely up to the AM to do that handling.
          If it's OK, I think we can spin this return-code handling off as a separate ticket and try to make progress on it there.

          adhoot Anubhav Dhoot added a comment -

          Submitted a patch that implements per-app blacklisting for the AM.
          Since AM allocations happen like any other allocation, the blacklisting also has to be done in the same place; the patch therefore removes the blacklist entries after the AM launches to limit the impact on the user's own blacklist.
          It also implements a threshold for how much of the cluster can be blacklisted.
          It adds a configuration to turn the feature on/off.
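
          A hedged sketch of the flow described in this comment: the per-app AM blacklist is added when the AM container is requested and removed again once the AM launches, so it does not linger in the application's normal blacklist. The SchedulerFacade interface is a stand-in, not the actual YARN scheduler API:

            import java.util.ArrayList;
            import java.util.Collections;
            import java.util.List;

            interface SchedulerFacade {
              // Stand-in for the scheduler call that takes blacklist additions/removals.
              void updateBlacklist(String appAttemptId,
                  List<String> additions, List<String> removals);
            }

            class AmBlacklistFlow {
              private final SchedulerFacade scheduler;
              private final List<String> amBlacklist;

              AmBlacklistFlow(SchedulerFacade scheduler, List<String> amBlacklist) {
                this.scheduler = scheduler;
                this.amBlacklist = new ArrayList<String>(amBlacklist);
              }

              /** Before requesting the AM container: add the failed hosts. */
              void onAmContainerRequested(String appAttemptId) {
                scheduler.updateBlacklist(appAttemptId, amBlacklist,
                    Collections.<String>emptyList());
              }

              /** Once the AM has launched: remove them again so the app's normal
               *  containers (and the user's own blacklist) are unaffected. */
              void onAmLaunched(String appAttemptId) {
                scheduler.updateBlacklist(appAttemptId,
                    Collections.<String>emptyList(), amBlacklist);
              }
            }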

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 59s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 3 new or modified test files.
          +1 javac 7m 34s There were no new javac warning messages.
          +1 javadoc 9m 40s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 41s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 6s The patch has no lines that end in whitespace.
          +1 install 1m 35s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 3m 46s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 tools/hadoop tests 0m 52s Tests passed in hadoop-sls.
          +1 yarn tests 0m 23s Tests passed in hadoop-yarn-api.
          -1 yarn tests 51m 5s Tests failed in hadoop-yarn-server-resourcemanager.
              95m 51s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12741955/YARN-2005.001.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / aa5b15b
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8350/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-sls test log https://builds.apache.org/job/PreCommit-YARN-Build/8350/artifact/patchprocess/testrun_hadoop-sls.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8350/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8350/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8350/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8350/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

          Addressed the test failure. An unmanaged AM also executes the AMLaunched transition, which was triggering the allocate call that removes the AM blacklist. Changed it so that it does not execute for an unmanaged AM.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 17m 20s Findbugs (version ) appears to be broken on trunk.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 3 new or modified test files.
          +1 javac 7m 39s There were no new javac warning messages.
          +1 javadoc 9m 38s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 30s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 5s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 35s The patch built with eclipse:eclipse.
          +1 findbugs 3m 49s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 tools/hadoop tests 0m 52s Tests passed in hadoop-sls.
          +1 yarn tests 0m 22s Tests passed in hadoop-yarn-api.
          +1 yarn tests 51m 4s Tests passed in hadoop-yarn-server-resourcemanager.
              95m 8s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12742183/YARN-2005.002.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 60b858b
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8358/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-sls test log https://builds.apache.org/job/PreCommit-YARN-Build/8358/artifact/patchprocess/testrun_hadoop-sls.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8358/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8358/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8358/testReport/
          Java 1.7.0_55
          uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8358/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

          The checkstyle error is unavoidable (preexisting).
          Jason Lowe, Sunil G: this is as per the discussion here and is ready for your review. Jian He, Karthik Kambatla, I would appreciate your review as well.

          jlowe Jason Lowe added a comment -

          Thanks for posting a patch, Anubhav!

          I haven't done a very in-depth review, but here's a few comments/questions so far:

          Why add an interface to the scheduler to get the number of nodes? Is there a reason we can't use ClusterMetrics.getNumActiveNMs?

          The blacklist view of the cluster is static. It takes a snapshot of the number of nodes and doesn't update as nodes are added or removed from the cluster. That's problematic if the number of nodes changes drastically from one attempt to the next. I'm thinking in particular about recovery scenarios or something similar where we may create attempts when only a few (possibly none?) of the nodes have registered.

          adhoot Anubhav Dhoot added a comment -

          getNumActiveNMs returns the number of unique NodeIds, while blacklisting occurs on the host part of the NodeId. This ensures we count 2 different NodeIds on the same host as 1 host for blacklisting. This shows up in the unit tests, where we use multiple NMs on the same host.

          I was wondering how often to update this. Should we do it every time we try to get the blacklist (adding another API on BlacklistManager) or at the start of each attempt? I am thinking every time we get the blacklist would be ideal. I will update the patch accordingly.
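
          A small illustration of the host-versus-NodeId distinction described above: blacklisting works on hosts, so several NodeManagers on the same host (for example "host1:8041" and "host1:8042" in the unit tests) must be counted once when sizing the blacklist against the threshold. Names here are illustrative:

            import java.util.Arrays;
            import java.util.HashSet;
            import java.util.List;
            import java.util.Set;

            public class HostCount {
              static int countDistinctHosts(List<String> nodeIds) {
                Set<String> hosts = new HashSet<String>();
                for (String nodeId : nodeIds) {
                  // A NodeId is "host:port"; keep only the host part.
                  int colon = nodeId.lastIndexOf(':');
                  hosts.add(colon >= 0 ? nodeId.substring(0, colon) : nodeId);
                }
                return hosts.size();
              }

              public static void main(String[] args) {
                List<String> nodeIds =
                    Arrays.asList("host1:8041", "host1:8042", "host2:8041");
                System.out.println(countDistinctHosts(nodeIds)); // prints 2
              }
            }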

          adhoot Anubhav Dhoot added a comment -

          Added a call to update nodeHostCount before getting the blacklist.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 18m 12s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 3 new or modified test files.
          +1 javac 7m 34s There were no new javac warning messages.
          +1 javadoc 9m 34s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 41s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 8s The patch has no lines that end in whitespace.
          +1 install 1m 38s mvn install still works.
          +1 eclipse:eclipse 0m 32s The patch built with eclipse:eclipse.
          +1 findbugs 3m 44s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 tools/hadoop tests 0m 52s Tests passed in hadoop-sls.
          +1 yarn tests 0m 22s Tests passed in hadoop-yarn-api.
          +1 yarn tests 51m 2s Tests passed in hadoop-yarn-server-resourcemanager.
              95m 57s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12743007/YARN-2005.003.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 7405c59
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8400/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-sls test log https://builds.apache.org/job/PreCommit-YARN-Build/8400/artifact/patchprocess/testrun_hadoop-sls.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8400/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8400/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8400/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8400/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

          Jason Lowe, I would appreciate your review of the updated patch.

          sunilg Sunil G added a comment -

          Hi Anubhav Dhoot
          Thank you for sharing the patch. I have a couple of doubts.

          • DEFAULT_FAILURE_THRESHOLD
            The default is now 0.8; I feel we should keep this as a configurable limit. Based on cluster size, the user can decide up to which threshold we can support AM blacklisting.
          • Below code from CS#allocate:

               application.updateBlacklist(blacklistAdditions, blacklistRemovals);

            Assume a case where app1's AM is running on node1. Due to a failure there, the app is relaunched on node2, and node1 is marked for blacklisting by SimpleBlacklistManager.
            Since node1 is now blacklisted, all containers of this app will be blacklisted on node1. Is this intended? Please correct me if I am wrong.

          adhoot Anubhav Dhoot added a comment -

          Thanks for your comments, Sunil.
          1. Yes, we can make it configurable.
          2. The nodes are removed from the blacklist once the AM launches, to limit this issue; see the comment in the patch:

           // Remove the blacklist added by AM blacklist from shared blacklist

          The root cause is that YARN does not distinguish between scheduling the AM and scheduling normal containers, so they share the blacklist as well.
          Let me know if you still feel this is an issue; otherwise I can fix (1) and upload a new iteration.

          jaideepdhok Jaideep Dhok added a comment -

          Anubhav Dhoot, is it possible to add a metric for the total number of blacklisted nodes?

          adhoot Anubhav Dhoot added a comment -

          The actual blacklist is already available in the RM REST API, e.g. http://localhost:23188/ws/v1/cluster/apps/application_1436839322176_0001/appattempts. We can add a metric if you still feel it's needed.

          adhoot Anubhav Dhoot added a comment -

          Addressed the feedback by adding a configuration for the threshold.
          Also changed the default to true and updated the tests.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 18m 43s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 6 new or modified test files.
          +1 javac 8m 1s There were no new javac warning messages.
          +1 javadoc 10m 6s There were no new javadoc warning messages.
          +1 release audit 0m 21s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 43s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 7s The patch has no lines that end in whitespace.
          +1 install 1m 23s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 3m 56s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 tools/hadoop tests 0m 53s Tests passed in hadoop-sls.
          +1 yarn tests 0m 23s Tests passed in hadoop-yarn-api.
          +1 yarn tests 51m 54s Tests passed in hadoop-yarn-server-resourcemanager.
              98m 20s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12745548/YARN-2005.004.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 3ec0a04
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8553/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-sls test log https://builds.apache.org/job/PreCommit-YARN-Build/8553/artifact/patchprocess/testrun_hadoop-sls.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8553/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8553/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8553/testReport/
          Java 1.7.0_55
          uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8553/console

          This message was automatically generated.

          sunilg Sunil G added a comment -

Thanks Anubhav Dhoot. Sorry for the delayed response.

          The nodes are removed from blacklist once the launch of the AM happens to limit this issue.

          Yes. I feel this will be fine.

          jianhe Jian He added a comment -

It seems the patch will blacklist a node immediately once the AM container fails; I think we should blacklist a node only after a configurable threshold. Some apps may still want to be restarted on the same node for reasons like data locality - the AM does not want to transfer its local data to a different machine when restarted.
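
A minimal sketch of the kind of per-node failure threshold suggested here; the class name, fields, and threshold semantics are illustrative assumptions rather than code from the patch:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical helper: count AM container failures per host and only
    // blacklist a host once it crosses a configurable failure threshold.
    public class PerNodeAmFailureTracker {
      private final int failureThreshold;
      private final Map<String, Integer> failuresByHost = new HashMap<>();

      public PerNodeAmFailureTracker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
      }

      /** Record one AM container failure on the given host. */
      public void recordAmFailure(String host) {
        failuresByHost.merge(host, 1, Integer::sum);
      }

      /** True once the host has accumulated enough AM failures to be blacklisted. */
      public boolean shouldBlacklist(String host) {
        return failuresByHost.getOrDefault(host, 0) >= failureThreshold;
      }
    }

With a threshold of, say, 2, a single AM failure would not exclude the host (preserving data locality for the next attempt), while repeated failures would.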

          adhoot Anubhav Dhoot added a comment -

Hi Jian, for apps with the default max attempts, aren't we wasting attempts if we retry on the same node, and putting the application in danger of failing sooner?

          bikassaha Bikas Saha added a comment -

I believe the reverse case is also valid. A user may want to specify a locality constraint for the AM, but today that is ignored because it is dropped from the AM resource request by the RM. Similarly, only the memory resource constraint is used and the others (CPU etc.) are dropped. Perhaps this JIRA should tackle this problem holistically by fully respecting the resource request as specified in the AM container launch context.

          adhoot Anubhav Dhoot added a comment -

I think blacklisting can have lots of policies and constraints and will probably change over time. Since RMAppAttemptImpl#ScheduleTransition drops the locality constraint, it seems OK for the current blacklisting to also be locality-constraint unaware. Should we start simple and keep a separate JIRA for honoring AM locality in scheduling and blacklisting at the same time? Jian He, Bikas Saha, let me know if you agree and I can file that JIRA.

          bikassaha Bikas Saha added a comment -

I am fine with opening a separate JIRA for the specific case I mentioned. Opened YARN-3994 for that. If you want, you can extend its scope to blacklisting.

          asuresh Arun Suresh added a comment -

          Thanks for the patch Anubhav Dhoot,

          Couple of comments:

1. noBlacklist in DisabledBlacklistManager can be made static final.
2. getNumClusterHosts() in AbstractYarnScheduler: any reason we are creating a new set? I think returning this.nodes.size() should suffice, right?
3. Won't removing from the shared blacklist cause problems if the shared blacklist already contained the blacklisted node?
          adhoot Anubhav Dhoot added a comment -

Thanks Arun Suresh for the review.
1. The getNumClusterHosts is described above. Basically the blacklist is based on hostname, and this counts the number of unique hostnames across all NMs.
3. As discussed offline, this is a limitation of the current scheduler API, which has no notion of a blacklist for the RM's own use versus the user's blacklist. We could end up removing a blacklisted node that the user added.
To avoid this we would have to add an API to the scheduler to manage a separate blacklist (say, call it a system blacklist) that we merge with the user's blacklist during allocation.
I wonder what others think about that?
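
A rough illustration of that merge-at-allocation idea; this is only a sketch under the assumption that the scheduler can tell an AM-container request apart from a normal one, and the names here are invented:

    import java.util.HashSet;
    import java.util.Set;

    // Sketch: union the RM-managed (system) blacklist with the user's blacklist
    // for AM placement only, leaving the user's own list untouched.
    public final class BlacklistMergeSketch {
      private BlacklistMergeSketch() {
      }

      public static Set<String> effectiveBlacklist(Set<String> userBlacklist,
                                                   Set<String> systemBlacklist,
                                                   boolean waitingForAmContainer) {
        Set<String> merged = new HashSet<>(userBlacklist);
        if (waitingForAmContainer) {
          // Only AM container placement consults the system blacklist.
          merged.addAll(systemBlacklist);
        }
        return merged;
      }
    }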

          adhoot Anubhav Dhoot added a comment -

Added a patch that maintains a separate system blacklist for launching AMs, distinct from the user blacklist. This avoids accidentally affecting the user's blacklist for launching containers.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 19m 21s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 6 new or modified test files.
          +1 javac 7m 48s There were no new javac warning messages.
          +1 javadoc 9m 39s There were no new javadoc warning messages.
          +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 47s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          -1 whitespace 0m 12s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 27s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 3m 51s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 tools/hadoop tests 0m 52s Tests passed in hadoop-sls.
          -1 yarn tests 0m 22s Tests failed in hadoop-yarn-api.
          -1 yarn tests 57m 41s Tests failed in hadoop-yarn-server-resourcemanager.
              104m 16s  



          Reason Tests
          Failed unit tests hadoop.yarn.conf.TestYarnConfigurationFields
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12750966/YARN-2005.005.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 71566e2
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/whitespace.txt
          hadoop-sls test log https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/testrun_hadoop-sls.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8874/testReport/
          Java 1.7.0_55
          uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8874/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

Fixed the YarnConfiguration unit test. The other failure does not reproduce locally for me.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 3m 22s trunk compilation may be broken.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 7 new or modified test files.
          -1 javac 2m 25s The patch appears to cause the build to fail.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12751130/YARN-2005.006.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 7ecbfd4
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8881/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

Retriggering due to an error in Jenkins:
          ERROR: Workspace has a .git repository, but it appears to be corrupt.
          hudson.plugins.git.GitException: Error performing git command

          adhoot Anubhav Dhoot added a comment -

Reattaching the patch to trigger Jenkins.

          jianhe Jian He added a comment -

Sorry for the late response.
I feel a simpler way is to do this inside the scheduler. RMApp and RMAppAttempt need not be involved in the loop.
This way, the YarnScheduler interface can also avoid exposing a public updateBlacklist interface.
The SchedulerApplication can hold a reference to the blacklisted nodes, and AppSchedulingInfo can consult this reference for blacklisted nodes.

          adhoot Anubhav Dhoot added a comment -

Jian He, thanks for your comments.

RMApp and RMAppAttempt need not be involved in the loop.

In the current patch the following responsibilities are assigned to RMApp/RMAppAttempt:
a) When an AM fails, add that host to the system blacklist.
b) Before launching the AM, activate the system blacklist with the currently known AM failure hosts.
c) After the AM launch succeeds, deactivate the system blacklist to avoid impacting other user allocations.
Since the AM launch is the responsibility of RMAppAttempt, I kept all of these there. Can you please elaborate on where and how these would be done in SchedulerApplication/AppSchedulingInfo in a clean way? Thanks.
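
A compressed sketch of that (a)/(b)/(c) flow; the class and method names here are assumptions made for illustration, not the interfaces in the patch:

    import java.util.HashSet;
    import java.util.Set;

    // Sketch of the activate/deactivate lifecycle described above, with invented names.
    public class AmBlacklistLifecycleSketch {
      private final Set<String> amFailureHosts = new HashSet<>();
      private boolean systemBlacklistActive;

      // (a) An AM container failed on this host: remember it.
      public void onAmContainerFailed(String host) {
        amFailureHosts.add(host);
      }

      // (b) About to request a new AM container: apply the known failure hosts.
      public Set<String> activateForAmLaunch() {
        systemBlacklistActive = true;
        return new HashSet<>(amFailureHosts);
      }

      // (c) The AM container was allocated: stop applying the system blacklist
      //     so normal user allocations are not affected.
      public void deactivateAfterAmLaunch() {
        systemBlacklistActive = false;
      }

      public boolean isSystemBlacklistActive() {
        return systemBlacklistActive;
      }
    }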

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 20m 6s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 7 new or modified test files.
          +1 javac 7m 55s There were no new javac warning messages.
          +1 javadoc 9m 50s There were no new javadoc warning messages.
          +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 2m 17s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 17s The patch has no lines that end in whitespace.
          +1 install 1m 32s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 5m 32s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 tools/hadoop tests 0m 54s Tests passed in hadoop-sls.
          +1 yarn tests 0m 23s Tests passed in hadoop-yarn-api.
          -1 yarn tests 1m 59s Tests failed in hadoop-yarn-common.
          -1 yarn tests 54m 0s Tests failed in hadoop-yarn-server-resourcemanager.
              106m 28s  



          Reason Tests
          Failed unit tests hadoop.yarn.util.TestRackResolver
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12751303/YARN-2005.006.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 3aac475
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8886/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-sls test log https://builds.apache.org/job/PreCommit-YARN-Build/8886/artifact/patchprocess/testrun_hadoop-sls.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8886/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8886/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8886/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8886/testReport/
          Java 1.7.0_55
          uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8886/console

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

Anubhav Dhoot,
I think one possible solution is that we can add the necessary fields to AppAttemptAddedSchedulerEvent, such as "lastAttemptState" and "AMNode", etc., which should be enough for the scheduler application/attempt to make decisions.

Another suggestion: we may not need to create a separate getNumClusterHosts(); using the existing number of NMs should be enough. It is rare for multiple NMs to run on the same host, and even if multiple NMs are running, an AM failure could still relate to a specific NM's config.
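
For what a cluster-host count might be used for, here is a sketch of capping the blacklist relative to cluster size; the cap semantics and names are assumptions for illustration, not necessarily what the patch does:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Sketch: stop applying the blacklist once too large a fraction of the
    // cluster would be excluded, to avoid starving the AM of candidate hosts.
    public class CappedAmBlacklistSketch {
      private final double disableThreshold;   // e.g. 0.8 = never blacklist more than 80% of hosts
      private final Set<String> blacklisted = new HashSet<>();

      public CappedAmBlacklistSketch(double disableThreshold) {
        this.disableThreshold = disableThreshold;
      }

      public void addNode(String host) {
        blacklisted.add(host);
      }

      /** Hosts to actually pass as blacklist additions, given the cluster size. */
      public List<String> blacklistAdditions(int numClusterHosts) {
        if (blacklisted.size() > disableThreshold * numClusterHosts) {
          // Too many suspect hosts; fall back to no blacklist at all.
          return Collections.emptyList();
        }
        return new ArrayList<>(blacklisted);
      }
    }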

          jianhe Jian He added a comment -

Ah, I forgot SchedulerApplication isn't aware of whether the application has failed or not. I think passing the blacklist info through the event is a good option to avoid the updateBlacklist API in the scheduler.
Also, it looks like the current patch will add the node to the blacklist regardless of whether the AM is FAILED/KILLED/FINISHED. Ideally, we should add it only if the AM failed. Further, we may not want to blacklist the node if the AM failed for reasons like preemption.
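
For example, a guard along these lines could skip blacklisting for exits that are not the node's fault; which ContainerExitStatus values to exclude is an assumption here, not the rule the patch ended up with:

    import org.apache.hadoop.yarn.api.records.ContainerExitStatus;

    // Sketch: decide whether an AM container exit should count against the node.
    // The chosen statuses are illustrative, not the committed logic.
    public final class AmFailureClassifierSketch {
      private AmFailureClassifierSketch() {
      }

      public static boolean shouldBlacklistNode(int exitStatus) {
        switch (exitStatus) {
          case ContainerExitStatus.SUCCESS:      // normal completion
          case ContainerExitStatus.PREEMPTED:    // preempted, not the node's fault
          case ContainerExitStatus.ABORTED:      // aborted by the framework
            return false;
          case ContainerExitStatus.DISKS_FAILED: // clearly a node-local problem
          default:
            return true;
        }
      }
    }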

          adhoot Anubhav Dhoot added a comment -

The attached patch makes a couple of changes. Instead of adding a new scheduler API, it uses the same allocate call to update the system blacklist. The scheduler updates and uses the system/user blacklist based on whether it is an AM launch or not.
It also tracks the cause of the container failure to decide whether to blacklist the node or not. If we need to consider other reasons for blacklisting, I propose we use follow-up JIRAs in order to make progress on this one.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 17m 25s Findbugs (version ) appears to be broken on trunk.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 7 new or modified test files.
          +1 javac 7m 52s There were no new javac warning messages.
          +1 javadoc 10m 4s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 32s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 25s The patch has no lines that end in whitespace.
          +1 install 1m 33s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          -1 findbugs 4m 43s The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 0m 23s Tests passed in hadoop-yarn-api.
          +1 yarn tests 2m 0s Tests passed in hadoop-yarn-common.
          -1 yarn tests 52m 13s Tests failed in hadoop-yarn-server-resourcemanager.
              99m 35s  



          Reason Tests
          FindBugs module:hadoop-yarn-server-resourcemanager
          Failed unit tests hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
            hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
            hadoop.yarn.server.resourcemanager.scheduler.fair.TestSchedulingUpdate
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestReservations
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
            hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12753810/YARN-2005.007.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 7d6687f
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8983/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          Findbugs warnings https://builds.apache.org/job/PreCommit-YARN-Build/8983/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8983/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8983/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8983/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8983/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8983/console

          This message was automatically generated.

          adhoot Anubhav Dhoot added a comment -

Addressed test failures caused by missing app attempt data in the mock apps used by the tests.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 18m 38s Findbugs (version ) appears to be broken on trunk.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 8 new or modified test files.
          +1 javac 7m 56s There were no new javac warning messages.
          +1 javadoc 10m 4s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 45s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 21s The patch has no lines that end in whitespace.
          +1 install 1m 31s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 4m 44s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 0m 24s Tests passed in hadoop-yarn-api.
          +1 yarn tests 2m 1s Tests passed in hadoop-yarn-common.
          +1 yarn tests 54m 38s Tests passed in hadoop-yarn-server-resourcemanager.
              103m 48s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12753897/YARN-2005.008.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 09c64ba
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8992/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8992/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8992/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8992/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8992/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8992/console

          This message was automatically generated.

          sunilg Sunil G added a comment -

Hi Anubhav Dhoot,
Thank you for updating the patch. I have one comment.

isWaitingForAMContainer is now used in two cases: to set the ContainerType and also in the blacklist case. And this check is now hit on every heartbeat from the AM.

I think it is better to set a state called amIsStarted in SchedulerApplicationAttempt. This can be set from two places:
1. RMAppAttemptImpl#AMContainerAllocatedTransition can call a new scheduler API to set the amIsStarted flag when the AM container is launched and registered. We need to pass the ContainerId to this new API to get the attempt object and set the flag.
2. AbstractYarnScheduler#recoverContainersOnNode can also invoke this API to set the flag.

So then we can read it directly from SchedulerApplicationAttempt every time a heartbeat call comes from the AM. If we are not doing this in this ticket, I can open another ticket for this optimization. Please share your thoughts.
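
A tiny sketch of the cached flag being proposed above; the field and method names are hypothetical (this optimization was later deferred to a follow-up JIRA):

    // Sketch of the proposed cached flag; names are hypothetical.
    public class SchedulerAttemptAmFlagSketch {
      private volatile boolean amStarted;

      /** Would be set once the AM container is allocated/launched, or on recovery. */
      public void markAmStarted() {
        amStarted = true;
      }

      /** Cheap per-heartbeat check, instead of recomputing isWaitingForAMContainer. */
      public boolean isWaitingForAmContainer() {
        return !amStarted;
      }
    }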

          He Tianyi He Tianyi added a comment -

          Hi,

I've seen AM failures due to Java dependency problems or mistyped parameters; in those cases we certainly do not want to blacklist the node.
In a future version, will we be able to tell whether the failure is caused by the node itself or by the application (i.e., by enforcing exit code conventions)?

          kasha Karthik Kambatla added a comment -

          Thanks for working on this, Anubhav. The approach looks good to me. Minor comments/nits on the patch itself:

          1. Spurious changes/imports in AbstractYarnScheduler, FifoScheduler, RMAppImpl, TestAbstractYarnScheduler, TestAMRestart, YarnScheduler.
          2. In AppSchedulingInfo
            1. synchronize only the common updateBlacklist method?
            2. Also, should we just synchronize on the list in question and not all of AppSchedulingInfo? I am fine with leaving it as is, if that is a lot of work and won't yield any big gains.
  3. The comments in the common updateBlacklist method refer to userBlacklist while the method operates on both system and user blacklists.
            4. Should transferStateFromPreviousAppSchedulingInfo update the systemBlacklist as well? Or, is the decision here to have the system blacklist per app-attempt?
          3. BlacklistAdditionsRemovals:
            1. Mark it Private
            2. Rename to BlacklistUpdates
            3. Rename members blacklistAdditions and blacklistRemovals to additions and removals, and update the getters accordingly?
4. BlacklistManager
            1. Mark it Private
            2. Rename addNodeContainerFailure to addNode?
            3. Rename getter?
          5. In DisabledBlacklistManager, define a static EMPTY_LIST similar to SimpleBlacklistManager and use that to avoid creating two ArrayLists for each AppAttempt.
6. RMAppImpl: If AM blacklisting is not enabled, we don't need to read the disable threshold.
          7. RMAppAttempt: Instead of getAMBlacklist, should we call it getSystemBlacklist to be consistent with the way we refer to it in the scheduler?
          8. RMAppAttemptImpl
            1. EMPTY_SYSTEM_BLACKLIST is unused
            2. Update any variables based on the method name - getAMBlacklist vs getSystemBlacklist
  3. Shouldn't we blacklist nodes on LaunchFailedTransition - maybe in a follow-up JIRA? Can we file one if you agree?
9. The MockRM change is unrelated? I like the change, but maybe we should do it in a separate clean-up JIRA. It might have a few other things to clean up.
          10. TestAMRestart: A couple of unused variables.
          11. Why are the changes to TestCapacityScheduler needed? They don't look related to this patch.
          12. TestRMAppLogAggregationStatus change is unrelated.
          kasha Karthik Kambatla added a comment -

          In my comments above, if 2.4 shouldn't update the systemBlacklist, we likely don't need the new method in RMAppAttempt.

          adhoot Anubhav Dhoot added a comment -

Hi Karthik Kambatla, thanks for your comments.

2.4 - We do not need to update the systemBlacklist, as it is updated to the complete list by the RMAppAttemptImpl#ScheduleTransition call every time.
11, 12 - The changes were needed because we now need a valid submission context for isWaitingForAMContainer.
9 - It is needed by the new test added in TestAMRestart.
8.3 - Yes, I can file a follow-up for that.
Addressed the rest of them.

          adhoot Anubhav Dhoot added a comment -

He Tianyi, yes, we are using the ContainerExitStatus for this. We can refine the conditions in a follow-up if needed.

          adhoot Anubhav Dhoot added a comment -

Sunil G, that's a good suggestion. Added a follow-up for this: YARN-4143.

          adhoot Anubhav Dhoot added a comment -

Added YARN-4144 to also add the node that causes a LaunchFailedTransition to the AM blacklist.

          adhoot Anubhav Dhoot added a comment -

          Addressed feedback

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 17m 29s Findbugs (version ) appears to be broken on trunk.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 7 new or modified test files.
          +1 javac 7m 48s There were no new javac warning messages.
          +1 javadoc 10m 13s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 27s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 27s The patch has no lines that end in whitespace.
          +1 install 1m 32s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 4m 39s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 0m 24s Tests passed in hadoop-yarn-api.
          +1 yarn tests 2m 1s Tests passed in hadoop-yarn-common.
          +1 yarn tests 54m 33s Tests passed in hadoop-yarn-server-resourcemanager.
              102m 1s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12755286/YARN-2005.009.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f103a70
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9084/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/9084/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/9084/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9084/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9084/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9084/console

          This message was automatically generated.

          kasha Karthik Kambatla added a comment -

          Looks good to me, trusting my previous review.

          +1. Checking this in.

          kasha Karthik Kambatla added a comment -

          Just committed this to trunk and branch-2. Intentionally left out other branches, since the fix is complex enough.

          Thanks Anubhav for fixing this long-standing issue.

          Thanks Bikas, Jason, Jian, Steve and Wangda for chiming in.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8445 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8445/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
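          The files in the list above center on a new org.apache.hadoop.yarn.server.resourcemanager.blacklist package (BlacklistManager, SimpleBlacklistManager, DisabledBlacklistManager, BlacklistUpdates) plus hooks in the schedulers, RMAppImpl, RMAppAttemptImpl, YarnConfiguration and yarn-default.xml. As a rough sketch of the idea only (the class, method, and threshold names below are illustrative assumptions, not the API committed here), an RM-side AM blacklist can be thought of as a per-application set of nodes whose AM attempts failed, which is ignored once it would cover too much of the cluster so the next attempt can still be scheduled somewhere:

// Illustrative sketch only: names and the threshold semantics are assumptions,
// not the code committed in YARN-2005.
import java.util.HashSet;
import java.util.Set;

public class AmBlacklistSketch {
  // Nodes on which AM attempts of this application have failed.
  private final Set<String> failedNodes = new HashSet<>();
  // Assumed safeguard: stop honoring the blacklist once it would cover more
  // than this fraction of the cluster, so the next AM can still be placed.
  private final double disableThreshold;

  public AmBlacklistSketch(double disableThreshold) {
    this.disableThreshold = disableThreshold;
  }

  /** Record an AM container failure on the given node. */
  public void addFailure(String nodeId) {
    failedNodes.add(nodeId);
  }

  /**
   * Nodes to blacklist for the next AM attempt, or an empty set when the
   * list has grown past the threshold and blacklisting is ignored.
   */
  public Set<String> blacklistForNextAttempt(int clusterNodeCount) {
    int cap = (int) (clusterNodeCount * disableThreshold);
    if (failedNodes.size() > cap) {
      return new HashSet<>();
    }
    return new HashSet<>(failedNodes);
  }

  public static void main(String[] args) {
    AmBlacklistSketch mgr = new AmBlacklistSketch(0.5);
    mgr.addFailure("node1");
    mgr.addFailure("node2");
    // 2 of 4 nodes is within the 50% cap, so both stay blacklisted.
    System.out.println(mgr.blacklistForNextAttempt(4));
    mgr.addFailure("node3");
    // 3 of 4 nodes exceeds the cap, so the sketch falls back to no blacklist.
    System.out.println(mgr.blacklistForNextAttempt(4));
  }
}

          In the committed patch the equivalent logic lives in the blacklist package and is wired into RMAppImpl and the per-attempt scheduling path, with DisabledBlacklistManager presumably serving as the no-op used when the feature is turned off via the new configuration entries.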
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #387 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/387/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1119 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1119/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #381 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/381/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2306 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2306/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2329 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2329/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #366 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/366/)
          YARN-2005. Blacklisting support for scheduling AMs. (Anubhav Dhoot via kasha) (kasha: rev 81df7b586a16f8226c7b01c139c1c70c060399c3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/SimpleBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/logaggregationstatus/TestRMAppLogAggregationStatus.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/TestBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerUtils.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/DisabledBlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistUpdates.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/blacklist/BlacklistManager.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
          Hide
          adhoot Anubhav Dhoot added a comment -

          Thanks Jian He, Sunil G, and Jason Lowe for the reviews, and Karthik Kambatla for the review and commit!

          Hide
          sjlee0 Sangjin Lee added a comment -

          Would this be a good candidate for backporting to 2.6.x and 2.7.x? Anubhav Dhoot, thoughts?

          Hide
          djp Junping Du added a comment -

          I think this new feature is still maturing. There is related work, such as YARN-4389 and YARN-4576, that was recently checked in or proposed. I would prefer to keep it in a minor release rather than a maintenance release until it is mature enough.

          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          -1 for backporting this. While I understand that the original feature request is useful for keeping AM scheduling from getting blocked, there are far too many issues with the feature as it stands. Please see my comments on YARN-4576 and YARN-4837.


            People

            • Assignee:
              adhoot Anubhav Dhoot
            • Reporter:
              jlowe Jason Lowe
            • Votes:
              3
            • Watchers:
              38

            Dates

            • Created:
            • Updated:
            • Resolved:
