  1. Hadoop YARN
  2. YARN-4422

Generic AHS sometimes doesn't show started, node, or logs on App page

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None

      Description

      Sometimes the AM container for an app isn't able to start the JVM. This can happen if bogus JVM options are given to the AM container (-Dyarn.app.mapreduce.am.command-opts=-InvalidJvmOption) or when the AM container's environment variables are misconfigured (-Dyarn.app.mapreduce.am.env="JAVA_HOME=/foo/bar/baz").

      When the AM container for an app isn't able to start the JVM, the Application page for that application shows N/A for the Started, Node, and Logs columns. It does have links for each app attempt, and if you click on one of them, you go to the Application Attempt page, where you can see all containers with links to their logs and nodes, including the AM container. But none of that shows up for the app attempts on the Application page.

      Also, on the Application Attempt page, in the Application Attempt Overview section, the AM Container value is null and the Node value is N/A.

      1. AppPage no logs or node.jpg
        236 kB
        Eric Payne
      2. AppAttemptPage no container or node.jpg
        298 kB
        Eric Payne
      3. YARN-4422.001.patch
        5 kB
        Eric Payne

          Activity

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          leftnoteasy Wangda Tan added a comment -

          Committed to branch-2.8.

          mingma Ming Ma added a comment -

          Thanks Eric Payne.

          eepayne Eric Payne added a comment -

          Thanks! Will this fix address MAPREDUCE-5502 or MAPREDUCE-4428? It doesn't seem so, but would like to confirm.

          Ming Ma, thanks for your interest. No, this JIRA does not fix the issue documented in MAPREDUCE-5502 or MAPREDUCE-4428. This JIRA only affects the Generic application history server's GUI and not the RM Application GUI. Also, as documented in those JIRAs, the problem is not a missing link in the GUI, but that the log history is missing altogether.

          mingma Ming Ma added a comment -

          Thanks! Will this fix address MAPREDUCE-5502 or MAPREDUCE-4428? It doesn't seem so, but would like to confirm.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #674 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/674/)
          YARN-4422. Generic AHS sometimes doesn't show started, node, or logs on (jeagles: rev 4ff973f96ae7f77cda3b52b38427e2991819ad31)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/AppAttemptFinishedEvent.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8934 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8934/)
          YARN-4422. Generic AHS sometimes doesn't show started, node, or logs on (jeagles: rev 4ff973f96ae7f77cda3b52b38427e2991819ad31)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/AppAttemptFinishedEvent.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java
          jeagles Jonathan Eagles added a comment -

          +1. Thanks, Eric Payne.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote | Subsystem | Runtime | Comment
          0 | reexec | 0m 0s | Docker mode activated.
          +1 | @author | 0m 0s | The patch does not contain any @author tags.
          -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          -1 | mvninstall | 1m 50s | root in trunk failed.
          +1 | compile | 1m 13s | trunk passed with JDK v1.8.0_66
          +1 | compile | 1m 20s | trunk passed with JDK v1.7.0_85
          +1 | checkstyle | 0m 21s | trunk passed
          +1 | mvnsite | 1m 0s | trunk passed
          +1 | mvneclipse | 0m 27s | trunk passed
          +1 | findbugs | 1m 52s | trunk passed
          +1 | javadoc | 0m 38s | trunk passed with JDK v1.8.0_66
          +1 | javadoc | 0m 45s | trunk passed with JDK v1.7.0_85
          +1 | mvninstall | 0m 56s | the patch passed
          +1 | compile | 1m 14s | the patch passed with JDK v1.8.0_66
          +1 | javac | 1m 14s | the patch passed
          +1 | compile | 1m 23s | the patch passed with JDK v1.7.0_85
          +1 | javac | 1m 23s | the patch passed
          -1 | checkstyle | 0m 21s | Patch generated 2 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server (total was 80, now 82).
          +1 | mvnsite | 1m 1s | the patch passed
          +1 | mvneclipse | 0m 27s | the patch passed
          +1 | whitespace | 0m 0s | Patch has no whitespace issues.
          +1 | findbugs | 2m 13s | the patch passed
          +1 | javadoc | 0m 39s | the patch passed with JDK v1.8.0_66
          +1 | javadoc | 0m 44s | the patch passed with JDK v1.7.0_85
          +1 | unit | 3m 52s | hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.8.0_66.
          -1 | unit | 69m 0s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66.
          +1 | unit | 4m 32s | hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.7.0_85.
          -1 | unit | 66m 20s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_85.
          -1 | asflicense | 0m 20s | Patch generated 1 ASF License warnings.
             |  | 163m 45s |



          Reason | Tests
          JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens, hadoop.yarn.server.resourcemanager.TestAMAuthorization
          JDK v1.7.0_85 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens, hadoop.yarn.server.resourcemanager.TestAMAuthorization



          Subsystem | Report/Notes
          Docker | Image:yetus/hadoop:0ca8df7
          JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775871/YARN-4422.001.patch
          JIRA Issue | YARN-4422
          Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname | Linux d922df5158b1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool | maven
          Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision | trunk / 9d817fa
          mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/branch-mvninstall-root.txt
          findbugs | v3.0.0
          checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt
          unit | https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
          unit | https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_85.txt
          unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_85.txt
          JDK v1.7.0_85 Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9869/testReport/
          asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/9869/artifact/patchprocess/patch-asflicense-problems.txt
          modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server
          Max memory used | 43MB
          Powered by | Apache Yetus http://yetus.apache.org
          Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9869/console

          This message was automatically generated.

          eepayne Eric Payne added a comment -

          Attaching YARN-4422.001.patch. Jonathan Eagles or Jason Lowe, would you mind taking a look?

          The problem was that when the Applications page in the Generic AHS renders, it depends on MASTER_CONTAINER_EVENT_INFO being in the AppAttemptReport. If it's not there, the page gives up on trying to print the start time, node, or log links. That information does appear when you click on the app attempt link because, when the Application Attempt page renders, it just gets the whole list of containers for the app attempt and prints that information for each one, including the AM container. Even so, that page still has no indication of which container is the AM container.

          MASTER_CONTAINER_EVENT_INFO isn't in the AppAttemptReport because it is provided by the REGISTER event in the System Metrics Publisher. Since this use case never gets to the point of AM registration, MASTER_CONTAINER_EVENT_INFO is never published.

          However, in all of these cases, the RM container does get a FINISHED event. I fixed this by adding the MASTER_CONTAINER_EVENT_INFO to the FINISHED event.
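
For illustration only (this is not the attached patch): a minimal Java sketch of the behaviour described in this comment. Apart from MASTER_CONTAINER_EVENT_INFO, which is named above, every class, method, event-type string, and value here is a hypothetical stand-in rather than actual Hadoop code. It assumes the report builder simply looks for the master-container entry in any published event and falls back to N/A when none is found.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MasterContainerSketch {
    // Key name taken from the comment above; the string value is a placeholder.
    static final String MASTER_CONTAINER_EVENT_INFO = "MASTER_CONTAINER_EVENT_INFO";

    // Hypothetical stand-in for a published app-attempt event.
    static final class Event {
        final String type;
        final Map<String, String> info = new HashMap<>();
        Event(String type) { this.type = type; }
    }

    // Simulates what gets published for one app attempt.
    // amRegistered=false models an AM whose JVM never started.
    // fixApplied=true models also attaching the AM container to the FINISHED event.
    static List<Event> publish(String amContainerId, boolean amRegistered, boolean fixApplied) {
        List<Event> events = new ArrayList<>();
        if (amRegistered) {
            Event registered = new Event("REGISTERED");
            registered.info.put(MASTER_CONTAINER_EVENT_INFO, amContainerId);
            events.add(registered);
        }
        Event finished = new Event("FINISHED");
        if (fixApplied) {
            finished.info.put(MASTER_CONTAINER_EVENT_INFO, amContainerId);
        }
        events.add(finished);
        return events;
    }

    // Returns the AM container id from the first event that carries it, or null.
    static String findMasterContainer(List<Event> events) {
        for (Event e : events) {
            String id = e.info.get(MASTER_CONTAINER_EVENT_INFO);
            if (id != null) {
                return id;
            }
        }
        return null;
    }

    // Models the page column: without an AM container id there is nothing to link to.
    static String renderLogsColumn(List<Event> events) {
        String id = findMasterContainer(events);
        return id == null ? "N/A" : "logs:" + id;
    }

    public static void main(String[] args) {
        String am = "container_1_0001_01_000001"; // made-up container id
        System.out.println(renderLogsColumn(publish(am, false, false))); // AM never registered, no fix
        System.out.println(renderLogsColumn(publish(am, false, true)));  // AM never registered, with fix
    }
}
```

In this model the first call renders N/A because only a FINISHED event without the AM container is available, while the second call can render a log link because the FINISHED event carries the container id.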


            People

            • Assignee:
              eepayne Eric Payne
              Reporter:
              eepayne Eric Payne
            • Votes:
              0
              Watchers:
              11
