Hadoop YARN / YARN-5190

Registering/unregistering container metrics triggered by ContainerEvent and ContainersMonitorEvent conflict, causing an uncaught exception in ContainersMonitorImpl

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      The exception stack is as follows:

      2016-05-22 01:50:04,554 [Container Monitor] ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[Container Monitor,5,main] threw an Exception.
      org.apache.hadoop.metrics2.MetricsException: Metrics source ContainerResource_container_1463840817638_14484_01_000010 already exists!
              at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
              at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
              at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.forContainer(ContainerMetrics.java:212)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.forContainer(ContainerMetrics.java:198)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:385)
      

      After YARN-4906, there are multiple places that obtain ContainerMetrics for a particular container, which can cause a race condition when different threads register the same container metrics with DefaultMetricsSystem. Because the MetricsException that can be thrown is not handled properly, it can bring down the ContainersMonitorImpl daemon thread or even the whole NM.
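
The race described above can be illustrated without Hadoop. The following is a minimal sketch, not the actual patch: `StrictRegistry`, `SafeMetricsCache` and `registerConcurrently` are hypothetical stand-ins for a metrics system that rejects duplicate source names and for a per-container cache. It assumes that making "create and register" atomic per container ID is one acceptable way to keep several threads from registering the same source twice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in classes; this is not the Hadoop metrics API. */
public class MetricsRaceSketch {

  /** Registry that rejects duplicate source names, like the error in the stack trace. */
  static final class StrictRegistry {
    private final Map<String, Object> sources = new ConcurrentHashMap<>();

    void register(String name, Object source) {
      if (sources.putIfAbsent(name, source) != null) {
        throw new IllegalStateException("Metrics source " + name + " already exists!");
      }
    }
  }

  /** Per-container cache where creation and registration happen atomically per key. */
  static final class SafeMetricsCache {
    private final StrictRegistry registry;
    private final Map<String, Object> byContainer = new ConcurrentHashMap<>();
    final AtomicInteger registrations = new AtomicInteger();

    SafeMetricsCache(StrictRegistry registry) {
      this.registry = registry;
    }

    Object forContainer(String containerId) {
      // computeIfAbsent runs the mapping function at most once per key,
      // so only one thread ever reaches registry.register for this ID.
      return byContainer.computeIfAbsent(containerId, id -> {
        Object metrics = new Object();
        registry.register("ContainerResource_" + id, metrics);
        registrations.incrementAndGet();
        return metrics;
      });
    }
  }

  /** Calls forContainer from several threads at once; returns the number of registrations. */
  static int registerConcurrently(int threads) {
    SafeMetricsCache cache = new SafeMetricsCache(new StrictRegistry());
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          start.await(); // release all workers together to provoke contention
          cache.forContainer("container_1");
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      workers[i].start();
    }
    start.countDown();
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return cache.registrations.get();
  }

  public static void main(String[] args) {
    System.out.println("registrations: " + registerConcurrently(8));
  }
}
```

With a plain check-then-register sequence instead of `computeIfAbsent`, two threads could both see "not present" and the second `register` call would throw, which is the failure mode shown in the stack trace.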

      1. YARN-5190-v2.patch
        7 kB
        Junping Du
      2. YARN-5190-branch-2.7.001.patch
        3 kB
        Wangda Tan
      3. YARN-5190.patch
        8 kB
        Junping Du

          Activity

          leftnoteasy Wangda Tan added a comment -

          Junping Du, HADOOP-13362 should be able to fix the issue; thanks for pointing this out to me. Closing this ticket.

          djp Junping Du added a comment -

          Hi Wangda Tan, HADOOP-13362 was proposed to fix this issue for branch-2.7 and has already been checked in. Is there anything more to fix here?

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 5m 55s branch-2.7 passed
          +1 compile 0m 22s branch-2.7 passed with JDK v1.8.0_101
          +1 compile 0m 27s branch-2.7 passed with JDK v1.7.0_111
          +1 checkstyle 0m 16s branch-2.7 passed
          +1 mvnsite 0m 27s branch-2.7 passed
          +1 mvneclipse 0m 13s branch-2.7 passed
          +1 findbugs 0m 50s branch-2.7 passed
          +1 javadoc 0m 15s branch-2.7 passed with JDK v1.8.0_101
          +1 javadoc 0m 19s branch-2.7 passed with JDK v1.7.0_111
          +1 mvninstall 0m 21s the patch passed
          +1 compile 0m 20s the patch passed with JDK v1.8.0_101
          +1 javac 0m 20s the patch passed
          +1 compile 0m 23s the patch passed with JDK v1.7.0_111
          +1 javac 0m 23s the patch passed
          -1 checkstyle 0m 13s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 94 unchanged - 0 fixed = 97 total (was 94)
          +1 mvnsite 0m 25s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          -1 whitespace 0m 0s The patch has 1760 line(s) that end in whitespace. Use git apply --whitespace=fix.
          -1 whitespace 0m 40s The patch 81 line(s) with tabs.
          +1 findbugs 0m 59s the patch passed
          +1 javadoc 0m 13s the patch passed with JDK v1.8.0_101
          +1 javadoc 0m 18s the patch passed with JDK v1.7.0_111
          +1 unit 5m 28s hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_101.
          +1 unit 5m 55s hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_111.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          26m 19s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:c420dfe
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827638/YARN-5190-branch-2.7.001.patch
          JIRA Issue YARN-5190
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a47bc16db437 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.7 / 67204f2
          Default Java 1.7.0_111
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13052/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/13052/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/13052/artifact/patchprocess/whitespace-tabs.txt
          JDK v1.7.0_111 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13052/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13052/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          leftnoteasy Wangda Tan added a comment -

          Reopened for branch-2.7

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #9906 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9906/)
          YARN-5190. Registering/unregistering container metrics in (jianhe: rev 99cc439e29794f8e61bebe03b2a7ca4b6743ec92)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainerMetrics.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainerMetrics.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/DefaultMetricsSystem.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
          jianhe Jian He added a comment -

          Committed to trunk, branch-2, and branch-2.8. Thanks, Junping!

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 22s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 11s Maven dependency ordering for branch
          +1 mvninstall 5m 58s trunk passed
          +1 compile 6m 12s trunk passed
          +1 checkstyle 1m 20s trunk passed
          +1 mvnsite 1m 16s trunk passed
          +1 mvneclipse 0m 24s trunk passed
          +1 findbugs 1m 53s trunk passed
          +1 javadoc 1m 11s trunk passed
          0 mvndep 0m 12s Maven dependency ordering for patch
          +1 mvninstall 0m 58s the patch passed
          +1 compile 6m 12s the patch passed
          +1 javac 6m 12s the patch passed
          +1 checkstyle 1m 20s the patch passed
          +1 mvnsite 1m 17s the patch passed
          +1 mvneclipse 0m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 11s the patch passed
          +1 javadoc 1m 11s the patch passed
          -1 unit 7m 8s hadoop-common in the patch failed.
          +1 unit 11m 25s hadoop-yarn-server-nodemanager in the patch passed.
          +1 asflicense 0m 22s The patch does not generate ASF License warnings.
          52m 16s



          Reason Tests
          Failed junit tests hadoop.ipc.TestIPC



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12807807/YARN-5190-v2.patch
          JIRA Issue YARN-5190
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 24c739c523c5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / ead61c4
          Default Java 1.8.0_91
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-YARN-Build/11824/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/11824/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/11824/testReport/
          modules C: hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: .
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/11824/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          djp Junping Du added a comment -

          Thanks, Jian He, for the review and comments! The v2 patch incorporates your comments and fixes a checkstyle issue reported by Jenkins.

          jianhe Jian He added a comment -

          Looks good; only a minor comment on the format: would the version below be slightly better, since it avoids a couple of null checks?

               ContainerId containerId = monitoringEvent.getContainerId();
          -    ContainerMetrics usageMetrics = ContainerMetrics
          -        .forContainer(containerId, containerMetricsPeriodMs,
          -        containerMetricsUnregisterDelayMs);
          +    ContainerMetrics usageMetrics;
          
               int vmemLimitMBs;
               int pmemLimitMBs;
               int cpuVcores;
               switch (monitoringEvent.getType()) {
               case START_MONITORING_CONTAINER:
          +     usageMetrics = ContainerMetrics
          +          .forContainer(containerId, containerMetricsPeriodMs,
          +          containerMetricsUnregisterDelayMs);
                 ContainerStartMonitoringEvent startEvent =
                     (ContainerStartMonitoringEvent) monitoringEvent;
                 usageMetrics.recordStateChangeDurations(
          @@ -640,9 +642,16 @@ private void updateContainerMetrics(ContainersMonitorEvent monitoringEvent) {
                     vmemLimitMBs, pmemLimitMBs, cpuVcores);
                 break;
               case STOP_MONITORING_CONTAINER:
          -      usageMetrics.finished();
          +       usageMetrics = ContainerMetrics.getContainerMetrics(
          +          containerId);
          +      if (usageMetrics != null) {
          +        usageMetrics.finished();
          +      }
                 break;
               case CHANGE_MONITORING_CONTAINER_RESOURCE:
          +      usageMetrics = ContainerMetrics
          +          .forContainer(containerId, containerMetricsPeriodMs,
          +              containerMetricsUnregisterDelayMs);
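
The shape of the suggestion in the diff above can be sketched in isolation: the start event uses a get-or-create call, while the stop event only looks up existing metrics and guards against null, so a stop for an already-finished container does not create and register a fresh metrics object. This is a hedged sketch under stated assumptions; `LazyLookupSketch`, its `Metrics` type and the `EventType` enum are hypothetical stand-ins, not the NodeManager classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; the types here are stand-ins, not NodeManager classes. */
public class LazyLookupSketch {

  enum EventType { START_MONITORING, STOP_MONITORING }

  /** Minimal stand-in for a per-container metrics object. */
  static final class Metrics {
    volatile boolean finished;

    void finished() {
      finished = true;
    }
  }

  private final Map<String, Metrics> cache = new ConcurrentHashMap<>();
  final AtomicInteger creations = new AtomicInteger();

  /** Get-or-create: creates (and would register) metrics if none exist yet. */
  Metrics forContainer(String id) {
    return cache.computeIfAbsent(id, k -> {
      creations.incrementAndGet();
      return new Metrics();
    });
  }

  /** Lookup only: returns null when no metrics exist for the container. */
  Metrics getContainerMetrics(String id) {
    return cache.get(id);
  }

  void handle(EventType type, String id) {
    Metrics usageMetrics;
    switch (type) {
    case START_MONITORING:
      usageMetrics = forContainer(id);
      break;
    case STOP_MONITORING:
      // Do not create metrics just to finish them immediately.
      usageMetrics = getContainerMetrics(id);
      if (usageMetrics != null) {
        usageMetrics.finished();
      }
      break;
    default:
      break;
    }
  }

  public static void main(String[] args) {
    LazyLookupSketch sketch = new LazyLookupSketch();
    sketch.handle(EventType.STOP_MONITORING, "container_1");
    System.out.println("creations after stop-only: " + sketch.creations.get());
  }
}
```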
          
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 12s Maven dependency ordering for branch
          +1 mvninstall 6m 25s trunk passed
          +1 compile 7m 6s trunk passed
          +1 checkstyle 1m 28s trunk passed
          +1 mvnsite 1m 30s trunk passed
          +1 mvneclipse 0m 24s trunk passed
          +1 findbugs 2m 12s trunk passed
          +1 javadoc 1m 11s trunk passed
          0 mvndep 0m 11s Maven dependency ordering for patch
          +1 mvninstall 1m 6s the patch passed
          +1 compile 6m 58s the patch passed
          +1 javac 6m 58s the patch passed
          -1 checkstyle 1m 22s root: The patch generated 1 new + 107 unchanged - 0 fixed = 108 total (was 107)
          +1 mvnsite 1m 23s the patch passed
          +1 mvneclipse 0m 25s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 23s the patch passed
          +1 javadoc 1m 13s the patch passed
          +1 unit 7m 50s hadoop-common in the patch passed.
          +1 unit 11m 11s hadoop-yarn-server-nodemanager in the patch passed.
          +1 asflicense 0m 22s The patch does not generate ASF License warnings.
          56m 5s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12807755/YARN-5190.patch
          JIRA Issue YARN-5190
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 029bd00463ad 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / dc26601
          Default Java 1.8.0_91
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/11818/artifact/patchprocess/diff-checkstyle-root.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/11818/testReport/
          modules C: hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: .
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/11818/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          djp Junping Du added a comment -

          Attached a patch, with a unit test, that fixes the three issues mentioned in my previous comment.

          djp Junping Du added a comment - - edited

          Discussed offline with Jian He; we think a few things need to be fixed here:
          1. Fix the asymmetric behavior of register()/unregisterSource() in MetricsSystemImpl: the source name is still left in sourceNames.map in DefaultMetricsSystem after unregisterSource().

          2. ContainerMetrics.finished() could get called twice: once in the container life cycle (introduced in YARN-4906) and once in the container monitoring life cycle. Ideally, ContainerMetrics.finished() would be called only once, in one place, for a given container. In practice, however, the container event life cycle and the container monitor event life cycle are independent and cannot replace each other. As an alternative, we will make sure scheduleTimerTaskForUnregistration() is called only once; otherwise there will be more unregistration threads than needed.

          3. If a ContainerMetrics has already been finished (triggered via ContainerDoneTransition by ContainerKillEvent, ContainerDoneEvent, etc.), the current logic in ContainersMonitorImpl.updateContainerMetrics(ContainersMonitorEvent) still registers the metrics into DefaultMetricsSystem first (via ContainerMetrics.forContainer(...)) and unregisters them soon after. This is completely unnecessary.

          Will deliver a fix for the three issues raised above.
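
Items 1 and 2 can be sketched as follows. This is only an illustration of the two ideas, assuming hypothetical `SourceNames` and `Metrics` classes rather than the real DefaultMetricsSystem and ContainerMetrics: unregistering removes the name that registering added, and finished() uses a compare-and-set guard so the unregistration is scheduled only once no matter how many life cycles call it.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of symmetric name bookkeeping and a finish-once guard. */
public class UnregisterOnceSketch {

  /** Name bookkeeping in which unregister undoes what register did. */
  static final class SourceNames {
    private final Set<String> names = new HashSet<>();

    synchronized void register(String name) {
      if (!names.add(name)) {
        throw new IllegalStateException("Metrics source " + name + " already exists!");
      }
    }

    synchronized void unregister(String name) {
      // Removing the name here is what lets the same name be registered again later.
      names.remove(name);
    }
  }

  /** Metrics object whose finished() has an effect only on the first call. */
  static final class Metrics {
    private final AtomicBoolean finished = new AtomicBoolean(false);
    final AtomicInteger scheduledUnregistrations = new AtomicInteger();

    void finished() {
      // compareAndSet succeeds for exactly one caller, even under concurrency.
      if (finished.compareAndSet(false, true)) {
        // Stand-in for scheduling a delayed unregistration task.
        scheduledUnregistrations.incrementAndGet();
      }
    }
  }

  public static void main(String[] args) {
    SourceNames names = new SourceNames();
    names.register("ContainerResource_c1");
    names.unregister("ContainerResource_c1");
    names.register("ContainerResource_c1"); // succeeds because unregister removed the name

    Metrics metrics = new Metrics();
    metrics.finished(); // e.g. from the container life cycle
    metrics.finished(); // e.g. from the monitoring life cycle
    System.out.println("scheduled: " + metrics.scheduledUnregistrations.get());
  }
}
```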


            People

            • Assignee:
              djp Junping Du
              Reporter:
              djp Junping Du
            • Votes:
              0
              Watchers:
              14
