Hadoop YARN / YARN-165

RM should point tracking URL to RM web page for app when AM fails

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha, 0.23.5
    • Fix Version/s: 3.0.0, 2.0.3-alpha, 0.23.5
    • Component/s: resourcemanager
    • Labels:
      None

      Description

      Currently, when an ApplicationMaster fails, the ResourceManager sets the tracking URL to an empty string (see RMAppAttemptImpl.ContainerFinishedTransition). Unfortunately, when the client then follows the proxy URL, it gets a web page showing an HTTP 500 error and an ugly backtrace, because "http://" isn't a very helpful tracking URL.

      It would be much more helpful if the proxy URL redirected to the RM webapp page for the specific application. That page shows the various AM attempts and pointers to their logs, which will be useful for debugging the problems that caused the AM attempts to fail.
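
The requested behavior can be illustrated with a minimal sketch. This is not the attached patch: the class name, the helper names, the example host, and the "cluster/app/<application id>" path layout are all assumptions made for illustration. The sketch also leaves the scheme off the fallback URL, on the assumption that it is prepended elsewhere, as the bare "http://" mentioned above suggests.

```java
public class TrackingUrlFallback {

    /** Hypothetical helper: joins path segments onto a base with single slashes. */
    public static String joinPath(String base, String... parts) {
        StringBuilder sb = new StringBuilder(base);
        for (String part : parts) {
            if (sb.length() == 0 || sb.charAt(sb.length() - 1) != '/') {
                sb.append('/');
            }
            sb.append(part);
        }
        return sb.toString();
    }

    /**
     * Returns the AM-provided tracking URL when there is one; otherwise falls
     * back to the RM web page for the application instead of an empty string.
     * The "cluster/app" layout is an assumption, not taken from the patch.
     */
    public static String trackingUrlFor(String amTrackingUrl,
                                        String rmWebAppHostPort,
                                        String appId) {
        if (amTrackingUrl != null && !amTrackingUrl.isEmpty()) {
            return amTrackingUrl;
        }
        return joinPath(rmWebAppHostPort, "cluster", "app", appId);
    }

    public static void main(String[] args) {
        // A failed AM leaves no tracking URL, so the RM app page is used.
        System.out.println(trackingUrlFor(
            "", "rm.example.com:8088", "application_1351000000000_0001"));
        // prints "rm.example.com:8088/cluster/app/application_1351000000000_0001"
    }
}
```

With a fallback of this kind, a client following the proxy URL after an AM failure would land on a page listing the AM attempts rather than on an HTTP 500 error.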

      1. YARN-165-branch23.patch
        12 kB
        Jason Lowe
      2. YARN-165-branch23.patch
        11 kB
        Jason Lowe
      3. YARN-165.patch
        11 kB
        Jason Lowe

        Issue Links

          Activity

          Jason Lowe created issue -
          Vinod Kumar Vavilapalli added a comment -

          Upgrading it to blocker. It is a terrible user experience when AMs don't start as expected. Should be an easy fix.

          Vinod Kumar Vavilapalli made changes -
          Field Original Value New Value
          Priority Major [ 3 ] Blocker [ 1 ]
          Jason Lowe added a comment -

          Patch to change the tracking URL to point to the RM app page when the AM fails.

          Jason Lowe made changes -
          Attachment YARN-165.patch [ 12550974 ]
          Jason Lowe made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Assignee Jason Lowe [ jlowe ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12550974/YARN-165.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/125//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/125//console

          This message is automatically generated.

          Vinod Kumar Vavilapalli made changes -
          Link This issue is related to MAPREDUCE-2783 [ MAPREDUCE-2783 ]
          Vinod Kumar Vavilapalli added a comment -

          This behavior resulted after MAPREDUCE-2783.

          Vinod Kumar Vavilapalli added a comment -

          Patch looks good. The test too.

          Any manual tests you have done? I'd expect:

          • a fresh request to an already dead AM getting directed properly.
          • an existing page may get a 404 but on refresh should redirect properly.
          Jason Lowe added a comment -

          Sorry I should have mentioned that I performed some manual tests as well. I ran a sleep job and harshly terminated (via kill -9) the AM to simulate an unclean teardown of the AM. A fresh request to the tracking URL provided by the RM directed to the RM's app page for the sleep job as intended. An existing AM page when refreshed redirected to the RM's app page.

          Jason Lowe added a comment -

          Patch for branch-0.23.

          Jason Lowe made changes -
          Attachment YARN-165-branch23.patch [ 12551565 ]
          Robert Joseph Evans added a comment -

          The patches look good to me +1. I'll check them in.

          Robert Parker added a comment -

          I am getting an NPE on the test added by YARN-165-branch23.patch

          Robert Joseph Evans added a comment -

          Holding off on checking in 0.23. The tests pass on trunk and branch-2. I already checked them in.

          Jason Lowe added a comment -

          Sorry, my bad. I manually tested the fix just like I did for the trunk patch but forgot to verify the unit test, duh.

          New patch for branch-0.23 to fix the unit test.

          Jason Lowe made changes -
          Attachment YARN-165-branch23.patch [ 12551570 ]
          Robert Joseph Evans added a comment -

          The test now passes, and it looks like it was a minor test issue, not a code one.

          +    when(container.getId()).thenReturn(
          +        BuilderUtils.newContainerId(applicationAttempt.getAppAttemptId(), 1));
          

          I am +1 on the new patch and will check it in.

          Robert Joseph Evans added a comment -

          Thanks Jason,

          I pulled this into trunk, branch-2, and branch-0.23

          Robert Joseph Evans made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 2.0.3-alpha [ 12323272 ]
          Fix Version/s 0.23.5 [ 12323311 ]
          Fix Version/s 3.0.0 [ 12323268 ]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #2946 (See https://builds.apache.org/job/Hadoop-trunk-Commit/2946/)
          YARN-165. RM should point tracking URL to RM web page for app when AM fails (jlowe via bobby) (Revision 1404211)

          Result = SUCCESS
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404211
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #23 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/23/)
          YARN-165. RM should point tracking URL to RM web page for app when AM fails (jlowe via bobby) (Revision 1404211)

          Result = SUCCESS
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404211
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #422 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/422/)
          YARN-165. RM should point tracking URL to RM web page for app when AM fails Contributed by Jason Lowe. (Revision 1404224)

          Result = SUCCESS
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404224
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1213 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1213/)
          YARN-165. RM should point tracking URL to RM web page for app when AM fails (jlowe via bobby) (Revision 1404211)

          Result = FAILURE
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404211
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1243 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1243/)
          YARN-165. RM should point tracking URL to RM web page for app when AM fails (jlowe via bobby) (Revision 1404211)

          Result = ABORTED
          bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404211
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Jason Lowe made changes -
          Link This issue is related to YARN-236 [ YARN-236 ]
          Thomas Graves made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Jason Lowe
              Reporter:
              Jason Lowe
            • Votes:
              0
              Watchers:
              8
