Hadoop YARN / YARN-4051

ContainerKillEvent lost when container is still recovering and application finishes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2
    • Component/s: nodemanager
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      As in YARN-4050, the NM event dispatcher is blocked while a container is still in the NEW state. If the application is finished at that point, the container remains alive even after the NM event dispatcher is unblocked.
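The race described above can be sketched in miniature. The names here (LostKillDemo, MiniContainer) are illustrative stand-ins, not NM code: a kill arriving while the container has not left the NEW state matches no transition and is silently dropped.

```java
// Illustrative stand-ins, not NM classes: before the fix, a kill arriving
// while the container is still in NEW (i.e. still recovering) matches no
// registered transition and is silently dropped.
public class LostKillDemo {
    enum ContainerState { NEW, RUNNING, DONE }

    static class MiniContainer {
        ContainerState state = ContainerState.NEW;
        boolean killed = false;

        void handleKill() {
            if (state == ContainerState.RUNNING) {
                killed = true;
                state = ContainerState.DONE;
            }
            // NEW: no transition registered, the event is lost
        }
    }

    public static void main(String[] args) {
        MiniContainer c = new MiniContainer();
        c.handleKill();                   // application finishes during recovery
        c.state = ContainerState.RUNNING; // recovery completes later
        System.out.println(c.killed);     // false: the container stays alive
    }
}
```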

      Attachments

      1. YARN-4051.08.patch-branch-2
        13 kB
        sandflee
      2. YARN-4051.08.patch
        14 kB
        sandflee
      3. YARN-4051.07.patch
        9 kB
        sandflee
      4. YARN-4051.06.patch
        6 kB
        sandflee
      5. YARN-4051.05.patch
        4 kB
        sandflee
      6. YARN-4051.04.patch
        4 kB
        sandflee
      7. YARN-4051.03.patch
        8 kB
        sandflee
      8. YARN-4051.02.patch
        7 kB
        sandflee
      9. YARN-4051.01.patch
        7 kB
        sandflee

        Issue Links

          Activity

          vinodkv Vinod Kumar Vavilapalli added a comment -

          2.8.1 became a security release. Moving fix-version to 2.8.2 after the fact.

          hudson Hudson added a comment -

          FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11414 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11414/)
          YARN-4051. ContainerKillEvent lost when container is still recovering (jlowe: rev 7114baddb627628a54cdab77f68504332a5a0e28)

          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManagerRecovery.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
          sandflee sandflee added a comment -

          Thanks Jason Lowe for your review and commit!

          jlowe Jason Lowe added a comment -

          Thanks, sandflee! I committed this to trunk, branch-2, and branch-2.8. I filed YARN-6349 to track the followup work relating to client kill requests while containers are still recovering.

          jlowe Jason Lowe added a comment -

          +1 for the branch-2 patch as well. The unit test failure appears to be unrelated, and the test passes for me locally with the patch applied.

          Committing this.

          sandflee sandflee added a comment -

          Updated the patch for branch-2.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 13m 41s Docker mode activated.
          0 patch 0m 4s The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 6m 54s branch-2 passed
          +1 compile 0m 25s branch-2 passed with JDK v1.8.0_121
          +1 compile 0m 29s branch-2 passed with JDK v1.7.0_121
          +1 checkstyle 0m 20s branch-2 passed
          +1 mvnsite 0m 30s branch-2 passed
          +1 mvneclipse 0m 14s branch-2 passed
          +1 findbugs 0m 51s branch-2 passed
          +1 javadoc 0m 16s branch-2 passed with JDK v1.8.0_121
          +1 javadoc 0m 22s branch-2 passed with JDK v1.7.0_121
          +1 mvninstall 0m 27s the patch passed
          +1 compile 0m 27s the patch passed with JDK v1.8.0_121
          +1 javac 0m 27s the patch passed
          +1 compile 0m 28s the patch passed with JDK v1.7.0_121
          +1 javac 0m 28s the patch passed
          +1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 173 unchanged - 1 fixed = 173 total (was 174)
          +1 mvnsite 0m 30s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 8s the patch passed
          +1 javadoc 0m 16s the patch passed with JDK v1.8.0_121
          +1 javadoc 0m 18s the patch passed with JDK v1.7.0_121
          -1 unit 13m 55s hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_121.
          +1 asflicense 0m 18s The patch does not generate ASF License warnings.
          57m 40s



          Reason Tests
          JDK v1.8.0_121 Failed junit tests hadoop.yarn.server.nodemanager.webapp.TestNMWebServer
          JDK v1.7.0_121 Failed junit tests hadoop.yarn.server.nodemanager.webapp.TestNMWebServer



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:b59b8b7
          JIRA Issue YARN-4051
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12859050/YARN-4051.08.patch-branch-2
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux dd915b1c5b46 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2 / 9f9ccb2
          Default Java 1.7.0_121
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_121 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-YARN-Build/15297/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.7.0_121.txt
          JDK v1.7.0_121 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15297/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/15297/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          +1 for the latest patch, however it doesn't apply to branch-2. Could you provide a patch for branch-2 as well?

          sandflee sandflee added a comment -

          Patch updated; also fixed the test failure.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 12m 38s trunk passed
          +1 compile 0m 29s trunk passed
          +1 checkstyle 0m 19s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 42s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 24s the patch passed
          +1 javac 0m 24s the patch passed
          -0 checkstyle 0m 17s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 166 unchanged - 1 fixed = 167 total (was 167)
          +1 mvnsite 0m 23s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 47s the patch passed
          -1 javadoc 0m 15s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 230 unchanged - 0 fixed = 231 total (was 230)
          +1 unit 13m 6s hadoop-yarn-server-nodemanager in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          32m 51s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue YARN-4051
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12858839/YARN-4051.08.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 6ed05f917389 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / cc1292e
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/15282/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          javadoc https://builds.apache.org/job/PreCommit-YARN-Build/15282/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15282/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/15282/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          Thanks for updating the patch!

          I'm OK with fixing the lost kill-from-AM event in a separate JIRA, but I adjusted the headline of this one to avoid confusion.

          Should we use NMNotYetReadyException in the case where the AM tries to kill a container still recovering? We already throw it in similar situations where the NM isn't ready to handle the request.

          Nits:

          • ",because " should be " because "
          • ContainerImpl#isRecovering should check recoveredStatus before container state since recoveredStatus is the cheaper check and likely to avoid a subsequent state check and corresponding lock acquisition.
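The ordering asked for in the second nit can be sketched with simplified stand-in types (the enums, field names, and exact status values here are assumptions for illustration; the real logic lives in ContainerImpl): the plain volatile field read short-circuits before the lock-acquiring state lookup.

```java
// Simplified stand-ins; the real fields live in ContainerImpl, and the
// exact recovered-status values here are assumptions for illustration.
public class IsRecoveringSketch {
    enum RecoveredStatus { NOT_RECOVERED, REQUESTED, LAUNCHED, COMPLETED }
    enum State { NEW, RUNNING, DONE }

    static class RecoverableContainer {
        volatile RecoveredStatus recoveredStatus = RecoveredStatus.NOT_RECOVERED;
        private final Object lock = new Object();
        private State state = State.NEW;

        // Stand-in for the lock-acquiring state getter the nit refers to.
        State getContainerState() {
            synchronized (lock) { return state; }
        }

        // Check the cheap volatile field first: in the common, non-recovered
        // case the && short-circuits and the lock is never taken.
        boolean isRecovering() {
            return recoveredStatus != RecoveredStatus.NOT_RECOVERED
                && recoveredStatus != RecoveredStatus.COMPLETED
                && getContainerState() == State.NEW;
        }
    }

    public static void main(String[] args) {
        RecoverableContainer c = new RecoverableContainer();
        System.out.println(c.isRecovering()); // false: short-circuits on the field
        c.recoveredStatus = RecoveredStatus.REQUESTED;
        System.out.println(c.isRecovering()); // true: recovering and still NEW
    }
}
```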
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 13m 21s trunk passed
          +1 compile 0m 30s trunk passed
          +1 checkstyle 0m 21s trunk passed
          +1 mvnsite 0m 30s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 0m 46s trunk passed
          +1 javadoc 0m 19s trunk passed
          +1 mvninstall 0m 26s the patch passed
          +1 compile 0m 29s the patch passed
          +1 javac 0m 29s the patch passed
          -0 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 153 unchanged - 1 fixed = 154 total (was 154)
          +1 mvnsite 0m 32s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 59s the patch passed
          -1 javadoc 0m 17s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 230 unchanged - 0 fixed = 231 total (was 230)
          -1 unit 13m 31s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          34m 57s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRecovery



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue YARN-4051
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12857685/YARN-4051.07.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 0694ad23f830 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 7992426
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/15243/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          javadoc https://builds.apache.org/job/PreCommit-YARN-Build/15243/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/15243/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15243/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/15243/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          sandflee sandflee added a comment -

          Thanks Jason Lowe,

          Quoting Jason Lowe: "I'm also wondering about the scenario where the kill event is coming in from an AM and not the RM."

          I simply throw a YarnException when the AM stops a recovering container, but it seems NMClientAsyncImpl couldn't retry stopContainer afterwards. Could we fix this in a new issue?

                      .addTransition(ContainerState.RUNNING,
                          EnumSet.of(ContainerState.DONE, ContainerState.FAILED),
                          ContainerEventType.STOP_CONTAINER,
                          new StopContainerTransition())
          

          I made two other changes:
          1. Use app.handle(new ApplicationContainerInitEvent(container)) when recovering containers, because there is a race: if the FINISH event arrives before the ApplicationContainerInitEvent is processed, the containers have not yet been added to the app.
          2. Use a ConcurrentHashMap to store containers in the app, because I encountered a ConcurrentModificationException while iterating app.getContainers(), and I also see the web UI and AppLogAggregator using app.getContainers() without protection.
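The motivation for the second change can be demonstrated in isolation (this is a generic JDK demo, not NM code): structurally modifying a plain HashMap during iteration fails fast with ConcurrentModificationException, while ConcurrentHashMap's weakly consistent iterators tolerate concurrent changes.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Generic JDK demo (not NM code): a plain HashMap fails fast when modified
// during iteration; ConcurrentHashMap's weakly consistent iterators do not.
public class ContainersMapDemo {
    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        plain.put("container_1", "RUNNING");
        plain.put("container_2", "RUNNING");
        try {
            for (String id : plain.keySet()) {
                plain.remove("container_2"); // structural change mid-iteration
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("HashMap: ConcurrentModificationException");
        }

        Map<String, String> safe = new ConcurrentHashMap<>();
        safe.put("container_1", "RUNNING");
        safe.put("container_2", "RUNNING");
        for (String id : safe.keySet()) {
            safe.remove("container_2");      // fine: weakly consistent iterator
        }
        System.out.println("ConcurrentHashMap iteration completed");
    }
}
```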

          jlowe Jason Lowe added a comment -

          Thanks for updating the patch! In the future, please don't delete patches and re-upload them with the same name. It can lead to very confusing cases where Jenkins comments on a patch that happens to have the same name as one of the current attachments but isn't actually the patch that was tested.

          The following code won't actually cause it to ignore the FINISH_APPS event. The continue in the for loop is degenerate, so all this does is log warnings but otherwise is semantically the same logic:

                  for (Container container : app.getContainers().values()) {
                    if (container.isRecovering()) {
                      LOG.warn("drop FINISH_APPS event to " + appID + "because container "
                          + container.getContainerId() + "is recovering");
                      continue;
                    }
                  }
          

          Also this shouldn't be a warning since it's not actually wrong when this happens, correct? Similarly the warn log when ignoring the FINISH_CONTAINERS event seems like that should just be an info log at best.

          I'm also wondering about the scenario where the kill event is coming in from an AM and not the RM. If a container is still in the recovering state when we open up the client service for new requests it seems a client (e.g.: AM) could come in and ask for a still-recovering container to be killed. I think the container process will be orphaned if that occurs, since the NM will mistakenly believe the container has not been launched yet.
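One way the loop could actually change behavior, sketched with stand-in types (FinishAppsGate and this Container interface are illustrative, not the NM's classes): surface a boolean the caller can act on to defer the FINISH_APPS event, instead of a loop-local continue that alters nothing.

```java
import java.util.Collection;
import java.util.List;

// Stand-in types, not the NM's classes. Unlike the degenerate loop in the
// review comment, the result is returned so the caller can actually defer
// the FINISH_APPS event while a container is still recovering.
public class FinishAppsGate {
    interface Container {
        boolean isRecovering();
        String getContainerId();
    }

    static boolean shouldDeferFinishApps(Collection<Container> containers) {
        for (Container container : containers) {
            if (container.isRecovering()) {
                // note the spaces around "because" -- the other review nit
                System.out.println("Waiting to process FINISH_APPS because container "
                    + container.getContainerId() + " is recovering");
                return true; // one recovering container is enough to defer
            }
        }
        return false; // safe to process the event now
    }

    public static void main(String[] args) {
        Container recovering = new Container() {
            public boolean isRecovering() { return true; }
            public String getContainerId() { return "container_1"; }
        };
        System.out.println(shouldDeferFinishApps(List.of(recovering))); // true
        System.out.println(shouldDeferFinishApps(List.of()));           // false
    }
}
```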

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 12m 20s trunk passed
          +1 compile 0m 27s trunk passed
          +1 checkstyle 0m 20s trunk passed
          +1 mvnsite 0m 26s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 39s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 24s the patch passed
          +1 javac 0m 24s the patch passed
          -0 checkstyle 0m 17s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 140 unchanged - 1 fixed = 142 total (was 141)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 47s the patch passed
          +1 javadoc 0m 14s the patch passed
          +1 unit 13m 2s hadoop-yarn-server-nodemanager in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          32m 19s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue YARN-4051
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12856722/YARN-4051.06.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux fde589693e14 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 28daaf0
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/15201/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/15201/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/15201/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          sandflee sandflee added a comment -

          Since the RM will resend FINISH_APPS/FINISH_CONTAINER if the NM reports the app/container as running, it seems safe to drop the event while the container is recovering, Jason Lowe

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 6s docker + precommit patch detected.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 3m 11s trunk passed
          +1 compile 0m 54s trunk passed with JDK v1.8.0_60
          +1 compile 0m 51s trunk passed with JDK v1.7.0_79
          +1 checkstyle 0m 28s trunk passed
          +1 mvnsite 1m 21s trunk passed
          +1 mvneclipse 0m 39s trunk passed
          -1 findbugs 1m 19s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 3 extant Findbugs warnings.
          +1 javadoc 1m 29s trunk passed with JDK v1.8.0_60
          +1 javadoc 3m 59s trunk passed with JDK v1.7.0_79
          +1 mvninstall 1m 17s the patch passed
          +1 compile 0m 53s the patch passed with JDK v1.8.0_60
          +1 javac 0m 53s the patch passed
          +1 compile 0m 50s the patch passed with JDK v1.7.0_79
          +1 javac 0m 50s the patch passed
          -1 checkstyle 0m 27s Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 265, now 265).
          +1 mvnsite 1m 22s the patch passed
          +1 mvneclipse 0m 39s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 xml 0m 0s The patch has no ill-formed XML file.
          +1 findbugs 4m 10s the patch passed
          +1 javadoc 1m 40s the patch passed with JDK v1.8.0_60
          +1 javadoc 4m 11s the patch passed with JDK v1.7.0_79
          +1 unit 0m 24s hadoop-yarn-api in the patch passed with JDK v1.8.0_60.
          +1 unit 2m 4s hadoop-yarn-common in the patch passed with JDK v1.8.0_60.
          +1 unit 9m 0s hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_60.
          +1 unit 0m 24s hadoop-yarn-api in the patch passed with JDK v1.7.0_79.
          +1 unit 2m 8s hadoop-yarn-common in the patch passed with JDK v1.7.0_79.
          +1 unit 8m 57s hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_79.
          +1 asflicense 0m 24s Patch does not generate ASF License warnings.
          57m 12s



          Subsystem Report/Notes
          Docker Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-12
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12771915/YARN-4051.05.patch
          JIRA Issue YARN-4051
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 361522427d7d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/patchprocess/apache-yetus-fa12328/precommit/personality/hadoop.sh
          git revision trunk / 9ad708a
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-YARN-Build/9670/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9670/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          JDK v1.7.0_79 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9670/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
          Max memory used 227MB
          Powered by Apache Yetus http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9670/console

          This message was automatically generated.

          sandflee sandflee added a comment -

          Set the default timeout to 2 minutes, since the default NM expiry timeout is 10 minutes.

          sandflee sandflee added a comment -

          thanks Jason Lowe

          Should the value be infinite by default? The concern is that if one container has issues recovering (due to log aggregation woes or whatever) then we risk expiring all of the containers on this node if we don't re-register with the RM within the node expiry interval. I think it makes sense if we have also fixed the recovery paths so there aren't potentially long-running procedures (like contacting HDFS) during the recovery process. If we haven't then we could create as many problems as we're solving by waiting forever.
          – Agreed! I share this concern.

          Why does the patch change the check interval? If it's to reduce the logging then we can better fix that by only logging when the status changes rather than every iteration.
          – yes, it was to reduce the logging; since recovery is usually very fast, I changed it back

          Nit: A value of zero should also be treated as a disabled max time.
          – zero means register with the RM at once, whether or not the NM has completed recovery, yes?

          Nit: "Max time to wait NM to complete container recover before register to RM " should be "Max time NM will wait to complete container recovery before registering with the RM".
          – corrected

          jlowe Jason Lowe added a comment -

          Thanks for updating the patch!

          Should the value be infinite by default? The concern is that if one container has issues recovering (due to log aggregation woes or whatever) then we risk expiring all of the containers on this node if we don't re-register with the RM within the node expiry interval. I think it makes sense if we have also fixed the recovery paths so there aren't potentially long-running procedures (like contacting HDFS) during the recovery process. If we haven't then we could create as many problems as we're solving by waiting forever.

          Why does the patch change the check interval? If it's to reduce the logging then we can better fix that by only logging when the status changes rather than every iteration.

          Nit: A value of zero should also be treated as a disabled max time.

          Nit: "Max time to wait NM to complete container recover before register to RM " should be "Max time NM will wait to complete container recovery before registering with the RM".
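The wait-before-register behavior these review nits describe could look roughly like the following; `shouldKeepWaiting` and its parameters are illustrative stand-ins, not the committed patch, and it assumes a value of zero or less disables the cap as requested:

```java
public class RecoveryWaitSketch {

  /**
   * Decide whether the NM should keep waiting for container recovery before
   * registering with the RM. maxWaitMs <= 0 means no cap (wait until recovery
   * completes), covering the nit that zero should also disable the max time.
   */
  static boolean shouldKeepWaiting(long elapsedMs, long maxWaitMs,
      boolean recoveryDone) {
    if (recoveryDone) {
      return false; // recovery finished: register with the RM now
    }
    if (maxWaitMs <= 0) {
      return true; // cap disabled: keep waiting indefinitely
    }
    return elapsedMs < maxWaitMs; // cap set: give up once it is exceeded
  }

  public static void main(String[] args) {
    System.out.println(shouldKeepWaiting(30_000, 120_000, false));  // true
    System.out.println(shouldKeepWaiting(150_000, 120_000, false)); // false
    System.out.println(shouldKeepWaiting(999_999, 0, false));       // true
  }
}
```

Capping the wait trades a small window of out-of-order events for not risking expiry of every container on the node when a single container recovers slowly.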

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 6s docker + precommit patch detected.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 3m 12s trunk passed
          +1 compile 0m 50s trunk passed with JDK v1.8.0_60
          +1 compile 0m 47s trunk passed with JDK v1.7.0_79
          +1 checkstyle 0m 27s trunk passed
          +1 mvneclipse 0m 37s trunk passed
          -1 findbugs 1m 17s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 3 extant Findbugs warnings.
          +1 javadoc 1m 23s trunk passed with JDK v1.8.0_60
          +1 javadoc 3m 45s trunk passed with JDK v1.7.0_79
          +1 mvninstall 1m 13s the patch passed
          +1 compile 0m 46s the patch passed with JDK v1.8.0_60
          +1 javac 0m 46s the patch passed
          +1 compile 0m 46s the patch passed with JDK v1.7.0_79
          +1 javac 0m 46s the patch passed
          -1 checkstyle 0m 25s Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 265, now 265).
          +1 mvneclipse 0m 37s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 xml 0m 0s The patch has no ill-formed XML file.
          +1 findbugs 3m 54s the patch passed
          +1 javadoc 1m 18s the patch passed with JDK v1.8.0_60
          +1 javadoc 3m 48s the patch passed with JDK v1.7.0_79
          +1 unit 0m 19s hadoop-yarn-api in the patch passed with JDK v1.8.0_60.
          +1 unit 1m 46s hadoop-yarn-common in the patch passed with JDK v1.8.0_60.
          -1 unit 22m 50s hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_60.
          +1 unit 0m 21s hadoop-yarn-api in the patch passed with JDK v1.7.0_79.
          +1 unit 2m 2s hadoop-yarn-common in the patch passed with JDK v1.7.0_79.
          -1 unit 23m 23s hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_79.
          +1 asflicense 0m 23s Patch does not generate ASF License warnings.
          79m 57s



          Reason Tests
          JDK v1.8.0_60 Timed out junit tests org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRecovery
          JDK v1.7.0_79 Timed out junit tests org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRecovery



          Subsystem Report/Notes
          Docker Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-10
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12771487/YARN-4051.04.patch
          JIRA Issue YARN-4051
          Optional Tests asflicense javac javadoc mvninstall unit findbugs checkstyle compile xml
          uname Linux 26da36b52fe8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/patchprocess/apache-yetus-ee5baeb/precommit/personality/hadoop.sh
          git revision trunk / 94a1833
          Default Java 1.7.0_79
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-YARN-Build/9647/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9647/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/9647/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.8.0_60.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/9647/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.7.0_79.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/9647/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.8.0_60.txt https://builds.apache.org/job/PreCommit-YARN-Build/9647/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.7.0_79.txt
          JDK v1.7.0_79 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9647/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn
          Max memory used 227MB
          Powered by Apache Yetus http://yetus.apache.org
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9647/console

          This message was automatically generated.

          sandflee sandflee added a comment -

          The NM registers with the RM after all containers are recovered by default, and the user can set a timeout value.

          jlowe Jason Lowe added a comment -

          If I understand this correctly, we're saying that the problem described in YARN-4050 is holding up the main event dispatcher and the NM is semi-hung, yet we want to hurry and register with the ResourceManager before containers have recovered? Seems to me we need to address the problem described in YARN-4050 if possible (e.g.: skip HDFS operations if we recovered at least one container in the running or completed states since we know it must have done HDFS init in the previous NM instance). Otherwise we are hacking around the fact that we registered too soon and aren't able to properly handle the out-of-order events. I'd much rather deal with the root cause if possible than patch all the separate symptoms.
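The root-cause direction suggested here (skip the slow HDFS work when recovery proves it already happened) reduces to a state check over the recovered containers; the enum and method below are illustrative stand-ins, not actual NM code:

```java
import java.util.List;

public class RecoverySkipHdfsSketch {

  // Stand-in for the container states seen during NM recovery.
  enum RecoveredState { NEW, LOCALIZING, RUNNING, COMPLETED }

  /**
   * If at least one recovered container already reached RUNNING or COMPLETED,
   * the previous NM instance must have finished its HDFS init, so the
   * potentially long-running HDFS operations can be skipped on restart.
   */
  static boolean canSkipHdfsInit(List<RecoveredState> recovered) {
    for (RecoveredState s : recovered) {
      if (s == RecoveredState.RUNNING || s == RecoveredState.COMPLETED) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(canSkipHdfsInit(
        List.of(RecoveredState.NEW, RecoveredState.RUNNING)));    // true
    System.out.println(canSkipHdfsInit(
        List.of(RecoveredState.NEW, RecoveredState.LOCALIZING))); // false
  }
}
```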

          sandflee sandflee added a comment -

          Is it possible for the finish application or complete container requests to arrive at this point?
          Yes, we saw this in YARN-4050. If we register with the RM only after container recovery completes, we face the risk that the containers running on this node will be killed when recovery takes much longer (as in YARN-4050); for long-running services, that is not ideal.

          Show
          sandflee sandflee added a comment - Is it possible for the finish application or complete container requests to arrive at this point? yes, we see this in YARN-4050 . If we register to RM after complete container recover, we must face the risk that the container running on this node will be killed if container recovery takes much more time(in YARN-4050 ), for long-runing-services, maybe not so perfect.
          Hide
          jlowe Jason Lowe added a comment -

          For RM finish application or complete container request, let RM retry, seems a little complicated,should we do that?

          Is it possible for the finish application or complete container requests to arrive at this point? We should not be registering with the RM until we've completed the container recovery process. As such, it should be impossible to be told by the RM these things as we should not even be talking to it at that point. Similarly, I believe the cleanest fix for the stop container request race is to avoid opening the client port until all the containers have recovered. I know there's some issue there where we need to know the bind address of the client port during recovery but don't want to start listening on the port yet. If the RPC layer supported that, it'd be a lot cleaner to simply not "open the front doors" while we're still coming up and recovering – then all these races simply aren't possible.

          sandflee sandflee added a comment -

          Thanks Jason, sorry, I only just noticed your reply.

          It seems more reasonable to let callers retry until the NM has recovered its containers.
          1. For the AM stopContainer request, we could handle it simply, the same way as startContainers.
          2. For the RM finish application or complete container request, letting the RM retry seems a little complicated; should we do that?
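The retry idea in point 1 can be sketched as a simple readiness gate, similar in spirit to the NMNotYetReadyException the NM already throws for start-container requests before it has connected to the RM. The class and method names below (RetryGate, checkReady) are illustrative, not the actual patch:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Minimal sketch of the "let the caller retry" approach: reject
 * stop-container requests until recovery has finished, the same way
 * start-container requests are rejected. Illustrative names only.
 */
public class RetryGate {
  private final AtomicBoolean containersRecovered = new AtomicBoolean(false);

  /** Called once container recovery completes. */
  public void markRecovered() {
    containersRecovered.set(true);
  }

  /** Throws if the NM is not ready, so the caller backs off and retries. */
  public void checkReady(String op) {
    if (!containersRecovered.get()) {
      throw new IllegalStateException(
          "Rejecting " + op + ": NM is still recovering containers");
    }
  }

  public static void main(String[] args) {
    RetryGate gate = new RetryGate();
    boolean rejected = false;
    try {
      gate.checkReady("stopContainers");
    } catch (IllegalStateException e) {
      rejected = true;                 // caller would back off and retry
    }
    gate.markRecovered();
    gate.checkReady("stopContainers"); // now passes without throwing
    System.out.println("rejectedBeforeRecovery=" + rejected);
  }
}
```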

          jlowe Jason Lowe added a comment -

          Thanks for the patch! Sorry for the delay, as I missed this when it was originally filed.

          I'm lukewarm on an event buffering approach since we have to track it and remember to propagate it at all the appropriate times, which is a maintenance burden. Would it be simpler if we simply prevented the kill request from coming in too soon? Seems like another way to fix this would be to prevent kill requests from arriving before we're done recovering containers. We could do a similar "try again" response as we do for container start requests while still recovering, and we can postpone finish application processing until after containers are recovered.

          However we decide to fix this, there should be a unit test to cover the scenario.

          sandflee sandflee added a comment -

          Could anyone help review it?

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 26s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 57s There were no new javac warning messages.
          +1 javadoc 9m 59s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 37s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 21s mvn install still works.
          +1 eclipse:eclipse 0m 32s The patch built with eclipse:eclipse.
          +1 findbugs 1m 13s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 6m 20s Tests passed in hadoop-yarn-server-nodemanager.
              44m 54s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12750841/YARN-4051.03.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 13604bd
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8865/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8865/testReport/
          Java 1.7.0_55
          uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8865/console

          This message was automatically generated.

          sandflee sandflee added a comment -

          If recovered as REQUESTED, try to clean up container resources and go to the DONE state.

          sandflee sandflee added a comment -

          Pend the kill event while the container is being recovered, then act like recoveredAsKilled:
          if the container is recovered as COMPLETED, go to the DONE state;
          if recovered as LAUNCHED, try to reacquire the container and kill it;
          if recovered as REQUESTED, try to clean up container state and go to the DONE state.
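The pended-kill behavior described above can be sketched as a tiny state machine: a kill arriving mid-recovery is remembered instead of being lost, then replayed once recovery resolves the container's fate. The class, states, and method names here are illustrative; the real logic lives in the NM's ContainerImpl state machine.

```java
/**
 * Minimal sketch of "pend the kill until recovery finishes".
 * Illustrative only; not the actual YARN-4051 patch code.
 */
public class RecoveringContainer {
  enum RecoveredState { REQUESTED, LAUNCHED, COMPLETED }
  enum State { RECOVERING, RUNNING, DONE }

  private State state = State.RECOVERING;
  private boolean killPending = false;

  /** A kill that arrives mid-recovery is remembered instead of dropped. */
  public void kill() {
    if (state == State.RECOVERING) {
      killPending = true;            // pend it; replay after recovery
    } else if (state == State.RUNNING) {
      state = State.DONE;            // normal kill path
    }
  }

  /** Recovery resolved the container's fate; apply any pended kill. */
  public void recovered(RecoveredState rs) {
    switch (rs) {
      case COMPLETED:                // already finished: nothing to kill
        state = State.DONE;
        break;
      case LAUNCHED:                 // reacquired; deliver the pended kill
        state = State.RUNNING;
        if (killPending) {
          kill();
        }
        break;
      case REQUESTED:                // never launched: clean up if killed
        state = killPending ? State.DONE : State.RUNNING;
        break;
    }
  }

  public State getState() {
    return state;
  }

  public static void main(String[] args) {
    RecoveringContainer c = new RecoveringContainer();
    c.kill();                               // arrives while still recovering
    c.recovered(RecoveredState.LAUNCHED);   // pended kill is applied now
    System.out.println("state=" + c.getState());
  }
}
```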

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 15m 56s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 40s There were no new javac warning messages.
          +1 javadoc 9m 36s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 35s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 22s mvn install still works.
          +1 eclipse:eclipse 0m 32s The patch built with eclipse:eclipse.
          +1 findbugs 1m 14s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 6m 16s Tests passed in hadoop-yarn-server-nodemanager.
              43m 38s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12750647/YARN-4051.02.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 8dfec7a
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8849/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8849/testReport/
          Java 1.7.0_55
          uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8849/console

          This message was automatically generated.

          sandflee sandflee added a comment -

          Fix checkstyle errors.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 11s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 44s There were no new javac warning messages.
          +1 javadoc 9m 48s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 0m 38s The applied patch generated 5 new checkstyle issues (total was 96, now 101).
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 18s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 14s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 6m 7s Tests passed in hadoop-yarn-server-nodemanager.
              43m 58s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12750297/YARN-4051.01.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 53bef9c
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8842/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8842/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8842/testReport/
          Java 1.7.0_55
          uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8842/console

          This message was automatically generated.

          sandflee sandflee added a comment -

          Pend the kill event if the container is in the NEW state and is recovered as LAUNCHED; send the kill event when the container becomes RUNNING.


            People

            • Assignee: sandflee
            • Reporter: sandflee
            • Votes: 0
            • Watchers: 14
