Hadoop YARN / YARN-227

Application expiration difficult to debug for end-users

    Details

      Description

      When an AM attempt expires, the AMLivelinessMonitor in the RM kills the job and marks it as failed. However, no diagnostic messages are set for the application to indicate that it failed because of expiration. Even if the AM logs are examined, it's often not obvious that the application was externally killed. The only evidence of what happened to the application is currently in the RM logs, and those are often not accessible to users.
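To make the expiry mechanism above concrete, here is a minimal sketch of a heartbeat-based liveliness check. It is not the actual AMLivelinessMonitor code: the class name `LivelinessSketch`, its methods, the attempt id string, and the 600,000 ms interval are all assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified liveliness monitor; not the real AMLivelinessMonitor. */
public class LivelinessSketch {
    private final long expiryIntervalMs;
    // Last heartbeat time (ms) seen for each tracked attempt.
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    public LivelinessSketch(long expiryIntervalMs) {
        this.expiryIntervalMs = expiryIntervalMs;
    }

    /** Record a heartbeat from an attempt at the given time. */
    public void ping(String attemptId, long nowMs) {
        lastHeartbeat.put(attemptId, nowMs);
    }

    /** Return true, and stop tracking the attempt, if it has been silent too long. */
    public boolean checkExpired(String attemptId, long nowMs) {
        Long last = lastHeartbeat.get(attemptId);
        if (last == null) {
            return false; // unknown or already expired attempt
        }
        if (nowMs - last > expiryIntervalMs) {
            lastHeartbeat.remove(attemptId);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative interval only; a real cluster takes this from configuration.
        LivelinessSketch monitor = new LivelinessSketch(600_000L);
        monitor.ping("attempt_1", 0L);
        System.out.println(monitor.checkExpired("attempt_1", 300_000L)); // still alive
        System.out.println(monitor.checkExpired("attempt_1", 700_000L)); // expired
    }
}
```

In this sketch the expiry decision is made entirely on the monitor's side, which is why the AM's own logs need not contain any trace of it.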

      1. YARN-227-branch-0.23.patch
        6 kB
        Jason Lowe
      2. YARN-227-branch-0.23.patch
        6 kB
        Jason Lowe
      3. YARN-227.patch
        7 kB
        Jason Lowe
      4. YARN-227.patch
        7 kB
        Jason Lowe

        Issue Links

          Activity

          Jason Lowe created issue -
          Jason Lowe added a comment -

          Patch to add diagnostics to the expired attempt indicating it timed out. Also changed the tracking URL to point to the RM's app page when the attempt expires so it's not left dangling, referencing an app attempt that is no longer there.
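The two changes described in this comment can be sketched as follows. This is not the code from the patch: `ExpiredAttemptSketch`, `onExpire`, the message wording, the attempt id, and both URLs are hypothetical placeholders, used only to show the idea of recording a diagnostic and repointing the tracking URL when an attempt expires.

```java
/** Hypothetical stand-in for an attempt record; not the real RMAppAttemptImpl. */
public class ExpiredAttemptSketch {
    private final String attemptId;
    private final StringBuilder diagnostics = new StringBuilder();
    private String trackingUrl;
    private String state = "RUNNING";

    public ExpiredAttemptSketch(String attemptId, String amTrackingUrl) {
        this.attemptId = attemptId;
        this.trackingUrl = amTrackingUrl;
    }

    /** Handle expiry: fail the attempt, explain why, and repoint the tracking URL. */
    public void onExpire(String rmAppPageUrl) {
        state = "FAILED";
        // Assumed message wording, for illustration only.
        diagnostics.append("ApplicationMaster for attempt ")
                   .append(attemptId)
                   .append(" timed out");
        // The AM is gone, so its own URL would dangle; fall back to the RM's app page.
        trackingUrl = rmAppPageUrl;
    }

    public String getState() { return state; }
    public String getDiagnostics() { return diagnostics.toString(); }
    public String getTrackingUrl() { return trackingUrl; }

    public static void main(String[] args) {
        // Placeholder id and URLs; none of these come from the issue.
        ExpiredAttemptSketch attempt = new ExpiredAttemptSketch(
            "appattempt_0001_000001", "http://am.example.com:12345/");
        attempt.onExpire("http://rm.example.com:8088/cluster/app/application_0001");
        System.out.println(attempt.getState());
        System.out.println(attempt.getDiagnostics());
        System.out.println(attempt.getTrackingUrl());
    }
}
```

With the diagnostic stored on the attempt itself, a user can see the cause of the failure without access to the RM logs.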

          Jason Lowe made changes -
          Field Original Value New Value
          Attachment YARN-227.patch [ 12565361 ]
          Jason Lowe made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Target Version/s 2.0.3-alpha, 0.23.7 [ 12323272, 12323953 ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12565361/YARN-227.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/355//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/355//console

          This message is automatically generated.

          Jason Lowe made changes -
          Assignee Jason Lowe [ jlowe ]
          Hitesh Shah made changes -
          Labels usability
          Hitesh Shah made changes -
          Link This issue blocks YARN-414 [ YARN-414 ]
          Jonathan Eagles added a comment -

          +1, Jason. If you can provide a branch-0.23 patch, I can check the code in there too.

          Jason Lowe added a comment -

          Thanks for the review, Jon. Patch for branch-0.23 is attached.

          Jason Lowe made changes -
          Attachment YARN-227-branch-0.23.patch [ 12571954 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12571954/YARN-227-branch-0.23.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 one of tests included doesn't have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/463//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/463//console

          This message is automatically generated.

          Jonathan Eagles added a comment -

          It looks like the eclipse:eclipse issue is spurious.

          Jason Lowe added a comment -

          Updated patches to add timeouts.

          Jason Lowe made changes -
          Attachment YARN-227-branch-0.23.patch [ 12572099 ]
          Attachment YARN-227.patch [ 12572100 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12572100/YARN-227.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 one of tests included doesn't have a timeout.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/470//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/470//console

          This message is automatically generated.

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3420 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3420/)
          YARN-227. Application expiration difficult to debug for end-users (Jason Lowe via jeagles) (Revision 1453080)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453080
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Jonathan Eagles added a comment -

          +1. Thanks so much for this patch, Jason.

          Jonathan Eagles made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 0.23.7 [ 12323953 ]
          Fix Version/s 2.0.4-beta [ 12324029 ]
          Fix Version/s 3.0.0 [ 12323268 ]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #147 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/147/)
          YARN-227. Application expiration difficult to debug for end-users (Jason Lowe via jeagles) (Revision 1453080)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453080
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #545 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/545/)
          YARN-227. Application expiration difficult to debug for end-users (Jason Lowe via jeagles) (Revision 1453092)

          Result = FAILURE
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453092
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1336 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1336/)
          YARN-227. Application expiration difficult to debug for end-users (Jason Lowe via jeagles) (Revision 1453080)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453080
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1364 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1364/)
          YARN-227. Application expiration difficult to debug for end-users (Jason Lowe via jeagles) (Revision 1453080)

          Result = SUCCESS
          jeagles : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453080
          Files :

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Allen Wittenauer made changes -
          Fix Version/s 3.0.0 [ 12323268 ]
          Transition                     Time In Source Status   Execution Times   Last Executer     Last Execution Date
          Open -> Patch Available        58d 22h 36m             1                 Jason Lowe        17/Jan/13 19:34
          Patch Available -> Resolved    47d 3h 59m              1                 Jonathan Eagles   05/Mar/13 23:33
          Resolved -> Closed             174d 22h 41m            1                 Arun C Murthy     27/Aug/13 23:15

            People

            • Assignee:
              Jason Lowe
              Reporter:
              Jason Lowe
            • Votes:
              0
              Watchers:
              4
