  Hadoop YARN / YARN-4284

condition for AM blacklisting is too narrow

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Per YARN-2005, there is now a way to blacklist nodes for AM purposes so the next app attempt can be assigned to a different node.

      However, the condition under which the node currently gets blacklisted is limited to DISKS_FAILED. A whole host of other issues can cause the failure and should make us place the AM elsewhere: full disks, JVM crashes, memory issues, and so on.

      Since AM blacklisting is per-app, there is little practical downside to blacklisting the node on any failure (although it might blacklist the node more aggressively than necessary). I would propose placing the next app attempt on a different node after any failure.
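
      To make the proposal concrete, here is a minimal Java sketch of the condition change (the class and method names are hypothetical and not taken from the attached patches):

      import org.apache.hadoop.yarn.api.records.ContainerExitStatus;

      /**
       * Illustrative sketch of the condition change proposed above. This is
       * not the patch code; the class and method names are made up.
       */
      public final class AmBlacklistConditionSketch {

        /** Current YARN-2005 behavior: blacklist the node only on DISKS_FAILED. */
        static boolean blacklistBefore(int amContainerExitStatus) {
          return amContainerExitStatus == ContainerExitStatus.DISKS_FAILED;
        }

        /**
         * Proposed behavior: because AM blacklisting is per-application, treat
         * any AM container failure as a reason to try a different node for the
         * next attempt. (The discussion below refines this to exclude
         * preemption.)
         */
        static boolean blacklistProposed(int amContainerExitStatus) {
          return amContainerExitStatus != ContainerExitStatus.SUCCESS;
        }
      }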

      Attachments

      1. YARN-4284.001.patch
        2 kB
        Sangjin Lee
      2. YARN-4284.002.patch
        9 kB
        Sangjin Lee

          Activity

          sjlee0 Sangjin Lee added a comment -

          v.1 patch

          sunilg Sunil G added a comment -

          Hi Sangjin Lee
          As part of YARN-2293, we were looking into a proposal where we wanted to score NMs based on its performance (more failures of attempts, launch failures,disk crash etc will result in decrementing score). And for all applications AMs, its always best schedule to a highest ranked NM (best performed so far).
          But this is a very generic proposal, and we thought of achieving this step by step, and YARN-2005 was a first step for this as suggested by Jason Lowe. Coming to an improvement, your proposal is very much the same as the next step to this and it can give a better probability of a successful AM container launch for all applications. Currently there are chances that new application's first AM will still fail and only second one will be successful because of AM blacklisting.
          Downfall in achieving this is to collect the general failures (disk crashes/jvm launch problem) Vs application specific errors (some AM containers may not run on a node due to its memory or some other factors). If we cannot achieve this, then there are chances that due to one specific problem with an AM in a node may result that node to be blacklisted for all other apps, this may be dangerous.
          So as per you thought also, I think we can collect all the other issues in container launches and segregate to generic errors, and blacklist for a period of time. +1 for this. Thoughts?

          sjlee0 Sangjin Lee added a comment -

          Hi Sunil G, thanks for the comment. Yes, I've been following the discussion on YARN-2005 as well as YARN-2293. Although it would be nice to have a reliable scoring mechanism as a basis for assigning AM containers, what's implemented in YARN-2005 is actually a pretty solid solution to this problem. By the way, this is one of the more common issues our users encounter.

          The only problem with YARN-2005 is that the blacklisting condition is too narrow. In fact, we rarely encounter the DISKS_FAILED error; it's usually more like INVALID (-1000) or other errors. We could try to be very precise and blacklist nodes only if the container exit status is purely due to the node itself and not caused by the app, but maintaining that precise condition may prove brittle.

          IMO the key is that the blacklisting implemented in YARN-2005 is per-app. As such, we can afford to be more aggressive instead of trying to come up with a 100% accurate blacklisting condition. Since it is per-app, there is no risk that one bad app causes a node to be blacklisted for all other apps (correct me if I'm wrong). Thoughts? Do you see other risks in taking this approach?
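
          To make the brittleness concrete, a hypothetical "precise" check would have to hand-maintain a classification like the sketch below (illustrative only; only the DISKS_FAILED and INVALID (-1000) statuses are taken from this discussion):

          import org.apache.hadoop.yarn.api.records.ContainerExitStatus;

          // Hypothetical "precise" variant: blacklist only on exit statuses we
          // are confident are the node's fault. Every new failure mode (full
          // disk, JVM crash, NM-side problem) would require revisiting this
          // list, and the failures most commonly seen in practice (INVALID)
          // say nothing definitive about whether the node or the app was at
          // fault.
          final class PreciseAmBlacklistSketch {
            static boolean nodeCausedFailure(int amContainerExitStatus) {
              switch (amContainerExitStatus) {
                case ContainerExitStatus.DISKS_FAILED:
                  // Clearly a node problem, but rarely seen in practice.
                  return true;
                case ContainerExitStatus.INVALID:
                  // -1000: no meaningful exit status was reported, so a
                  // precise classification would have to leave it out even
                  // though it is the common case.
                  return false;
                default:
                  // Ambiguous: could be the app or the node.
                  return false;
              }
            }
          }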

          sunilg Sunil G added a comment -

          Thank you Sangjin Lee for the comments.
          Yes, I understood your point and got the idea from the patch as well. I was under the assumption that we were looking into general blacklisting for all apps upon seeing a failure of one app attempt on a node. Thank you for clarifying.

          This change seems almost fine to me, though as you said, the approach is slightly aggressive in marking a node as blacklisted per app. I am also worried about cases like preemption by the RM (ContainerExitStatus.PREEMPTED or KILLED_BY_RESOURCEMANAGER). Due to queue over-usage, the RM may select an AM container to preempt (this is very unlikely with YARN-1496, but it is possible). If the application marks that node as blacklisted due to preemption or similar cases, that does not seem correct to me. How do you feel?

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 20m 0s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 8m 57s There were no new javac warning messages.
          +1 javadoc 11m 37s There were no new javadoc warning messages.
          +1 release audit 0m 26s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 58s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 41s mvn install still works.
          +1 eclipse:eclipse 0m 38s The patch built with eclipse:eclipse.
          +1 findbugs 1m 39s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 58m 38s Tests passed in hadoop-yarn-server-resourcemanager.
              104m 37s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12767725/YARN-4284.001.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 0c4af0f
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9506/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9506/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9506/console

          This message was automatically generated.

          stevel@apache.org Steve Loughran added a comment -

          What if you only have one or two nodes?

          sunilg Sunil G added a comment -

          Yes Steve Loughran, cluster-wide blacklisting is a potential problem. YARN-2005 has a config parameter named yarn.am.blacklisting.disable-failure-threshold with a default value of 0.8. This means that one application can only blacklist a maximum of 80% of the nodes in the cluster, which ensures that some nodes remain selectable.

          sjlee0 Sangjin Lee added a comment -

          Steve Loughran, Sunil G, if you have one or two nodes and the AM container of an app fails, yarn.am.blacklisting.disable-failure-threshold will ensure that it cannot blacklist the entire cluster for that app. Once you're above the threshold, the blacklisting is cleared and all nodes are available. Again, this is per-app behavior; other apps are not affected by this decision whatsoever.
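
          For illustration, the threshold behavior described above amounts to roughly the following (a minimal sketch; the helper is hypothetical, and only the config key and its 0.8 default come from this discussion):

          import org.apache.hadoop.conf.Configuration;

          // Hypothetical sketch of the disable-threshold behavior: once an app
          // has blacklisted a large enough fraction of the cluster, its AM
          // blacklist is ignored and every node becomes a candidate again.
          final class AmBlacklistThresholdSketch {
            static final String DISABLE_THRESHOLD_KEY =
                "yarn.am.blacklisting.disable-failure-threshold";
            static final float DEFAULT_DISABLE_THRESHOLD = 0.8f;

            static boolean blacklistDisabled(Configuration conf,
                int blacklistedNodes, int clusterNodes) {
              float threshold =
                  conf.getFloat(DISABLE_THRESHOLD_KEY, DEFAULT_DISABLE_THRESHOLD);
              return clusterNodes > 0
                  && (float) blacklistedNodes / clusterNodes >= threshold;
            }
          }

          On a two-node cluster with one node blacklisted the fraction is 0.5, below the 0.8 default, so the remaining node is used; on a one-node cluster the fraction is 1.0, the blacklist is ignored, and the only node stays available.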

          As for the condition for applying blacklisting, I think we can add PREEMPTED to that list (i.e., for not blacklisting). I'm not so sure about KILLED_BY_RESOURCEMANAGER. I think it is possible for an AM container to be killed by the resource manager due to a node issue. Any failure to heartbeat properly will cause the AM container to be killed by the RM, but that heartbeat failure can have many causes. Just because it was killed by the RM doesn't mean definitively that it was purely an app problem. What do you think?

          I think we may want to approach this from the point of view of anti-affinity. Currently there is an inherent affinity to nodes when it comes to assigning the AM containers. In my view, anti-affinity is a better default behavior. In the worst-case scenario, when the AM container failure was caused purely by the app, running subsequent attempts on different nodes only makes it clear that the failures were unrelated to the nodes. This helps troubleshooting a great deal. Today, when all AM containers land on the same node, we sometimes spend a fair amount of time convincing our users that the failure had nothing to do with the node.

          Thoughts and comments are welcome. Thanks!

          sunilg Sunil G added a comment -

          Yes Sangjin Lee. Above the threshold, all nodes are cleared from the blacklist. Thank you for the correction.

          "Just because it was killed by the RM doesn't mean definitively that it was purely an app problem."

          I think so, yes. It may not be app-specific.

          "anti-affinity is a better default behavior. In the worst-case scenario, when the AM container failure was caused purely by the app, running subsequent attempts on different nodes only makes it clear that the failures were unrelated to the nodes"

          Yes, I agree with your point. It can help isolate the cause of a container failure. So we could skip only PREEMPTED for now and consider all other failure cases for blacklisting. Correct?

          sjlee0 Sangjin Lee added a comment -

          I think the current patch would be sufficient, yes. I'd also like to hear from others to see if I missed anything important.

          jianhe Jian He added a comment -

          "Currently there is an inherent affinity to nodes when it comes to assigning the AM containers."

          I have one question: why is there an inherent affinity to nodes when assigning AM containers? Is this what you see in real scenarios?
          IIUC, the AM container request is 'any', so allocation is based on node heartbeats - whichever node heartbeats first will get the AM container?

          sjlee0 Sangjin Lee added a comment -

          It happens more often than not. If a node is having issues and the cluster is pretty busy, the node appears to have more capacity than other nodes, so it gets preferred for the allocation. This continues unless and until the node enters the UNHEALTHY state. The result is a node-affinity situation, even though the scheduler itself is neutral.

          jianhe Jian He added a comment -

          I see, sounds good to me, thanks for your explanation.

          sjlee0 Sangjin Lee added a comment -

          Jason Lowe, could you kindly let me know what you think? Thanks!

          jlowe Jason Lowe added a comment -

          Sorry for arriving a bit late. I agree that DISKS_FAILED is too narrow of a failure to use as a trigger, and it's probably better to err on the side of covering too many failure reasons than not enough.

          Patch looks pretty good, but I agree that we should not be blacklisting on preempted containers. Those have nothing to do with the AM or the node, and we shouldn't make the reschedule of a preempted AM more difficult as a result. There should be a test for that scenario.

          sjlee0 Sangjin Lee added a comment -

          Thanks. I agree that PREEMPTED is unequivocally not a node issue. I'll update the patch shortly.

          sjlee0 Sangjin Lee added a comment -

          v.2 patch posted.

          Added PREEMPTED as a condition for not blacklisting nodes. Also added a unit test for it.
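
          For reference, the v.2 behavior amounts to roughly the following (a sketch only; the actual change is in RMAppAttemptImpl.java, and this helper name is made up):

          import org.apache.hadoop.yarn.api.records.ContainerExitStatus;

          // Hypothetical sketch of the v.2 condition: blacklist the node for
          // this app on any AM container failure except preemption, which says
          // nothing about the node's health.
          final class AmBlacklistV2Sketch {
            static boolean shouldBlacklistNode(int amContainerExitStatus) {
              return amContainerExitStatus != ContainerExitStatus.PREEMPTED;
            }
          }

          The added unit test covers exactly that case: a preempted AM attempt must not add its node to the application's blacklist.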

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 30s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 56s There were no new javac warning messages.
          +1 javadoc 10m 25s There were no new javadoc warning messages.
          +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 0m 48s The applied patch generated 1 new checkstyle issues (total was 123, now 123).
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 29s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 1m 30s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 58m 20s Tests passed in hadoop-yarn-server-resourcemanager.
              99m 1s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12768452/YARN-4284.002.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 15eb84b
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9555/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
          hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9555/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9555/testReport/
          Java 1.7.0_55
          uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9555/console

          This message was automatically generated.

          sjlee0 Sangjin Lee added a comment -

          The checkstyle issue is not new (and not something I can fix in this patch: it is the file length).

          sjlee0 Sangjin Lee added a comment -

          Could you take a quick look at the updated patch, and let me know your comments? Thanks!

          jlowe Jason Lowe added a comment -

          +1 latest patch lgtm. Committing this.

          jlowe Jason Lowe added a comment -

          Thanks to Sangjin for the contribution and to Sunil, Steve, and Jian for additional review! I committed this to trunk and branch-2.

          sjlee0 Sangjin Lee added a comment -

          Thanks Jason Lowe!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #587 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/587/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8710 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8710/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1323 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1323/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #599 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/599/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2477 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2477/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2530 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2530/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #540 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/540/)
          YARN-4284. condition for AM blacklisting is too narrow. Contributed by (jlowe: rev 33a03af3c396097929b9cd9c790d7f52eddc13e0)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java

            People

            • Assignee:
              sjlee0 Sangjin Lee
              Reporter:
              sjlee0 Sangjin Lee
            • Votes:
              0
              Watchers:
              15
