Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-4726 [Umbrella] Allocation reuse for application upgrades
  3. YARN-5620

Core changes in NodeManager to support re-initialization of Containers with new launchContext

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: None
    • Labels:
      None

      Description

      JIRA proposes to modify the ContainerManager (and other core classes) to support upgrade of a running container with a new ContainerLaunchContext as well as the ability to rollback the upgrade if the container is not able to restart using the new launch Context.

      1. YARN-5620.001.patch
        35 kB
        Arun Suresh
      2. YARN-5620.002.patch
        41 kB
        Arun Suresh
      3. YARN-5620.003.patch
        46 kB
        Arun Suresh
      4. YARN-5620.004.patch
        37 kB
        Arun Suresh
      5. YARN-5620.005.patch
        37 kB
        Arun Suresh
      6. YARN-5620.006.patch
        42 kB
        Arun Suresh
      7. YARN-5620.007.patch
        42 kB
        Arun Suresh
      8. YARN-5620.008.patch
        56 kB
        Arun Suresh
      9. YARN-5620.009.patch
        55 kB
        Arun Suresh
      10. YARN-5620.010.patch
        54 kB
        Arun Suresh
      11. YARN-5620.011.patch
        50 kB
        Arun Suresh
      12. YARN-5620.012.patch
        49 kB
        Arun Suresh
      13. YARN-5620.013.patch
        58 kB
        Arun Suresh
      14. YARN-5620.014.patch
        57 kB
        Arun Suresh
      15. YARN-5620.015.patch
        59 kB
        Arun Suresh
      16. YARN-5620.016.patch
        58 kB
        Arun Suresh

        Issue Links

          Activity

          Hide
          asuresh Arun Suresh added a comment -

          Aah.. got it, since LocalizedResource#ref is a Queue and not a Set, an entry for the container will still exist..
          Agreed, I will raise a jira.

          Show
          asuresh Arun Suresh added a comment - Aah.. got it, since LocalizedResource#ref is a Queue and not a Set, an entry for the container will still exist.. Agreed, I will raise a jira.
          Hide
          jianhe Jian He added a comment -

          In that case, IIUC, if a LocalizedResource is referenced by both v1 and v3. when we release v1 resources, the refCount (LocalizedResource#ref) is decremented, but the LocalizedResource is not actually released because v3 is still referencing it.

          Show
          jianhe Jian He added a comment - In that case, IIUC, if a LocalizedResource is referenced by both v1 and v3. when we release v1 resources, the refCount (LocalizedResource#ref) is decremented, but the LocalizedResource is not actually released because v3 is still referencing it.
          Hide
          asuresh Arun Suresh added a comment -

          I agree.. but what if the resourcesets for v2 and v3 are only incremental resourceSets (we merge the newResourceSet with the current) ? In which case, v3 might still reference v1 resources.

          Show
          asuresh Arun Suresh added a comment - I agree.. but what if the resourcesets for v2 and v3 are only incremental resourceSets (we merge the newResourceSet with the current) ? In which case, v3 might still reference v1 resources.
          Hide
          jianhe Jian He added a comment -

          Arun Suresh,
          I think we need one more jira to release the older than previous resources, if the container is upgraded from v1 -> v2 -> v3. The v1 resources are not referenced by container any more, it needs to be released, I briefly checked the code, maybe below code in ResourceLocalizationService need to be invoked for v1 resources.

              for (Map.Entry<LocalResourceVisibility, Collection<LocalResourceRequest>> e :
                   rsrcs.entrySet()) {
                LocalResourcesTracker tracker = getLocalResourcesTracker(e.getKey(), c.getUser(), 
                    c.getContainerId().getApplicationAttemptId()
                    .getApplicationId());
                for (LocalResourceRequest req : e.getValue()) {
                  tracker.handle(new ResourceReleaseEvent(req,
                      c.getContainerId()));
                }
              }
          
          Show
          jianhe Jian He added a comment - Arun Suresh , I think we need one more jira to release the older than previous resources, if the container is upgraded from v1 -> v2 -> v3. The v1 resources are not referenced by container any more, it needs to be released, I briefly checked the code, maybe below code in ResourceLocalizationService need to be invoked for v1 resources. for (Map.Entry<LocalResourceVisibility, Collection<LocalResourceRequest>> e : rsrcs.entrySet()) { LocalResourcesTracker tracker = getLocalResourcesTracker(e.getKey(), c.getUser(), c.getContainerId().getApplicationAttemptId() .getApplicationId()); for (LocalResourceRequest req : e.getValue()) { tracker.handle( new ResourceReleaseEvent(req, c.getContainerId())); } }
          Hide
          asuresh Arun Suresh added a comment -

          YARN-5657 has been committed.. should be taken care of now...

          Show
          asuresh Arun Suresh added a comment - YARN-5657 has been committed.. should be taken care of now...
          Hide
          gtCarrera9 Li Lu added a comment -

          which are generally ignored, since they conflict with existing indentation.

          Thanks for the note!

          Show
          gtCarrera9 Li Lu added a comment - which are generally ignored, since they conflict with existing indentation. Thanks for the note!
          Hide
          asuresh Arun Suresh added a comment -

          The test failed locally for me when i reverted the patch, so I assumed it was unrelated... ill take a look.

          There's no note about this either...

          I mentioned i will be cleaning up the imports when I check in and the remaining as switch case indentation issues.. which are generally ignored, since they conflict with existing indentation.

          Show
          asuresh Arun Suresh added a comment - The test failed locally for me when i reverted the patch, so I assumed it was unrelated... ill take a look. There's no note about this either... I mentioned i will be cleaning up the imports when I check in and the remaining as switch case indentation issues.. which are generally ignored, since they conflict with existing indentation.
          Hide
          gtCarrera9 Li Lu added a comment -

          YARN-5657 blocks ongoing Jenkins testing to some NM/container related new code (noticed this when testing YARN-5638). Also, I noticed there were several trivial complaints from checkstyle in the last Jenkins run. Were those warnings addressed when the code got committed? There's no note about this either...

          Show
          gtCarrera9 Li Lu added a comment - YARN-5657 blocks ongoing Jenkins testing to some NM/container related new code (noticed this when testing YARN-5638 ). Also, I noticed there were several trivial complaints from checkstyle in the last Jenkins run. Were those warnings addressed when the code got committed? There's no note about this either...
          Hide
          ajisakaa Akira Ajisaka added a comment -

          Hi Arun Suresh, Jian He, and Varun Vasudev,
          This commit broke YARN-5657. Would you look into this issue?

          Show
          ajisakaa Akira Ajisaka added a comment - Hi Arun Suresh , Jian He , and Varun Vasudev , This commit broke YARN-5657 . Would you look into this issue?
          Hide
          asuresh Arun Suresh added a comment - - edited

          Committed this to trunk and branch-2
          Thanks again for the review Jian He and Varun Vasudev

          Show
          asuresh Arun Suresh added a comment - - edited Committed this to trunk and branch-2 Thanks again for the review Jian He and Varun Vasudev
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10443 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10443/)
          YARN-5620. Core changes in NodeManager to support re-initialization of (arun suresh: rev 40b5a59b726733df456330a26f03d5174cc0bc1c)

          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerEventType.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerState.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceSet.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java
          • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerReInitEvent.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/event/ContainerLocalizationRequestEvent.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
          • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10443 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10443/ ) YARN-5620 . Core changes in NodeManager to support re-initialization of (arun suresh: rev 40b5a59b726733df456330a26f03d5174cc0bc1c) (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerEventType.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerState.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceSet.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerReInitEvent.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/event/ContainerLocalizationRequestEvent.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java
          Hide
          asuresh Arun Suresh added a comment -

          Committing this shortly based on Varun Vasudev's and Jian He's +1. Will take care of the unused imports when I check in.

          Show
          asuresh Arun Suresh added a comment - Committing this shortly based on Varun Vasudev 's and Jian He 's +1. Will take care of the unused imports when I check in.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 7m 7s trunk passed
          +1 compile 0m 27s trunk passed
          +1 checkstyle 0m 22s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 41s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 25s the patch passed
          +1 javac 0m 25s the patch passed
          -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 13 new + 529 unchanged - 5 fixed = 542 total (was 534)
          +1 mvnsite 0m 26s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 46s the patch passed
          +1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 11s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          27m 55s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828647/YARN-5620.016.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 45370375ce2a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2a8f55a
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13112/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13112/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 21s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 7m 7s trunk passed +1 compile 0m 27s trunk passed +1 checkstyle 0m 22s trunk passed +1 mvnsite 0m 27s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 41s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 25s the patch passed +1 javac 0m 25s the patch passed -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 13 new + 529 unchanged - 5 fixed = 542 total (was 534) +1 mvnsite 0m 26s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 46s the patch passed +1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 11s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 27m 55s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828647/YARN-5620.016.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 45370375ce2a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 2a8f55a Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13112/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13112/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Thanks for the review Varun Vasudev..
          Uploading final patch with the changes in comments and class rename you suggested.

          Will commit after a good Jenkins run

          Show
          asuresh Arun Suresh added a comment - Thanks for the review Varun Vasudev .. Uploading final patch with the changes in comments and class rename you suggested. Will commit after a good Jenkins run
          Hide
          vvasudev Varun Vasudev added a comment -

          Thanks for the patch Arun Suresh. +1 except for some minor comment fixes.

          1)

          +   * Resource is localized while the container is running - create symlinks.
          

          Comment is the same for two transition handlers - maybe change slightly to provide more context?

          2)

          +      // If Container died during an upgrade, dont bother retrying.
          

          What is this comment for? There’s no change in the code and it looks we're just going through the regular retry mechanism.

          3)

          +  static class KilledExternallyForReInitTransition extends ContainerTransition {
          

          Maybe this should be renamed? My understanding is that this really is “Killed by YARN framework to restart container"

          Show
          vvasudev Varun Vasudev added a comment - Thanks for the patch Arun Suresh . +1 except for some minor comment fixes. 1) + * Resource is localized while the container is running - create symlinks. Comment is the same for two transition handlers - maybe change slightly to provide more context? 2) + // If Container died during an upgrade, dont bother retrying. What is this comment for? There’s no change in the code and it looks we're just going through the regular retry mechanism. 3) + static class KilledExternallyForReInitTransition extends ContainerTransition { Maybe this should be renamed? My understanding is that this really is “Killed by YARN framework to restart container"
          Hide
          asuresh Arun Suresh added a comment -

          The previous Jenkins run looks bad because I had manually cancelled it.
          The run before that ran fine with the same patch.

          The testcase failure is not related.
          Jian He / Varun Vasudev .. if you are fine with the latest patch, can we push this into trunk so we can start jenkins against YARN-5637 ?

          Show
          asuresh Arun Suresh added a comment - The previous Jenkins run looks bad because I had manually cancelled it. The run before that ran fine with the same patch. The testcase failure is not related. Jian He / Varun Vasudev .. if you are fine with the latest patch, can we push this into trunk so we can start jenkins against YARN-5637 ?
          Hide
          jianhe Jian He added a comment -

          KilledExternallyForReInitTransition we use this function to merge the existing ResourceSet and upgraded ResourceSe

          I see, I overlook it's passing both parameters. thanks

          Show
          jianhe Jian He added a comment - KilledExternallyForReInitTransition we use this function to merge the existing ResourceSet and upgraded ResourceSe I see, I overlook it's passing both parameters. thanks
          Hide
          asuresh Arun Suresh added a comment - - edited

          Jian He, I intentionally wanted to keep the merge function as a static utility function. I would prefer not adding a new constructor... Also, if you notice in the KilledExternallyForReInitTransition we use this function to merge the existing ResourceSet and upgraded ResourceSet and set the value of container.resourceSet to the merged RS.
          I can change it if your are particular... but would prefer to keep it as such.

          Show
          asuresh Arun Suresh added a comment - - edited Jian He , I intentionally wanted to keep the merge function as a static utility function. I would prefer not adding a new constructor... Also, if you notice in the KilledExternallyForReInitTransition we use this function to merge the existing ResourceSet and upgraded ResourceSet and set the value of container.resourceSet to the merged RS. I can change it if your are particular... but would prefer to keep it as such.
          Hide
          jianhe Jian He added a comment -

          Thanks Arun for updating,
          The merged variable is not initialized with the existing resources, right? We can probably create a new constructor in ResourceSet to take in ResourceSet as a parameter.

              ResourceSet merged = new ResourceSet();
              for (ResourceSet rs : resourceSets) {
                // This should overwrite existing symlinks
                merged.localizedResources.putAll(rs.localizedResources);
          
          Show
          jianhe Jian He added a comment - Thanks Arun for updating, The merged variable is not initialized with the existing resources, right? We can probably create a new constructor in ResourceSet to take in ResourceSet as a parameter. ResourceSet merged = new ResourceSet(); for (ResourceSet rs : resourceSets) { // This should overwrite existing symlinks merged.localizedResources.putAll(rs.localizedResources);
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 24s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          -1 mvninstall 11m 17s root in trunk failed.
          +1 compile 0m 43s trunk passed
          +1 checkstyle 0m 35s trunk passed
          +1 mvnsite 0m 35s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 49s trunk passed
          +1 javadoc 0m 19s trunk passed
          +1 mvninstall 0m 32s the patch passed
          +1 compile 0m 32s the patch passed
          -1 javac 0m 32s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 113 new + 29 unchanged - 5 fixed = 142 total (was 34)
          -1 checkstyle 0m 32s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 433 new + 1202 unchanged - 413 fixed = 1635 total (was 1615)
          +1 mvnsite 0m 35s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 1s the patch passed
          +1 javadoc 0m 19s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 480 unchanged - 2 fixed = 480 total (was 482)
          -1 unit 15m 12s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          35m 3s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
            hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828347/YARN-5620.015.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 483df05816d2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 86c9862
          Default Java 1.8.0_101
          mvninstall https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/branch-mvninstall-root.txt
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13100/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13100/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 24s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. -1 mvninstall 11m 17s root in trunk failed. +1 compile 0m 43s trunk passed +1 checkstyle 0m 35s trunk passed +1 mvnsite 0m 35s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 49s trunk passed +1 javadoc 0m 19s trunk passed +1 mvninstall 0m 32s the patch passed +1 compile 0m 32s the patch passed -1 javac 0m 32s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 113 new + 29 unchanged - 5 fixed = 142 total (was 34) -1 checkstyle 0m 32s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 433 new + 1202 unchanged - 413 fixed = 1635 total (was 1615) +1 mvnsite 0m 35s the patch passed +1 mvneclipse 0m 13s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 1m 1s the patch passed +1 javadoc 0m 19s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 480 unchanged - 2 fixed = 480 total (was 482) -1 unit 15m 12s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 20s The patch does not generate ASF License warnings. 35m 3s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager   hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828347/YARN-5620.015.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 483df05816d2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 86c9862 Default Java 1.8.0_101 mvninstall https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/branch-mvninstall-root.txt findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13100/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13100/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13100/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 6m 46s trunk passed
          +1 compile 0m 28s trunk passed
          +1 checkstyle 0m 23s trunk passed
          +1 mvnsite 0m 26s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 40s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 23s the patch passed
          +1 javac 0m 23s the patch passed
          -1 checkstyle 0m 20s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 15 new + 529 unchanged - 5 fixed = 544 total (was 534)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 48s the patch passed
          +1 javadoc 0m 15s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 9s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 17s The patch does not generate ASF License warnings.
          27m 25s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828347/YARN-5620.015.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux dc6b6d89e65e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 86c9862
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13101/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13101/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13101/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13101/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13101/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 21s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 6m 46s trunk passed +1 compile 0m 28s trunk passed +1 checkstyle 0m 23s trunk passed +1 mvnsite 0m 26s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 40s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 22s the patch passed +1 compile 0m 23s the patch passed +1 javac 0m 23s the patch passed -1 checkstyle 0m 20s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 15 new + 529 unchanged - 5 fixed = 544 total (was 534) +1 mvnsite 0m 24s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 48s the patch passed +1 javadoc 0m 15s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 9s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 17s The patch does not generate ASF License warnings. 27m 25s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828347/YARN-5620.015.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux dc6b6d89e65e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 86c9862 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13101/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13101/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13101/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13101/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13101/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 24s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          -1 mvninstall 2m 40s root in trunk failed.
          +1 compile 0m 38s trunk passed
          +1 checkstyle 0m 36s trunk passed
          +1 mvnsite 0m 48s trunk passed
          +1 mvneclipse 0m 17s trunk passed
          +1 findbugs 1m 8s trunk passed
          +1 javadoc 0m 26s trunk passed
          +1 mvninstall 0m 38s the patch passed
          -1 compile 0m 36s hadoop-yarn-server-nodemanager in the patch failed.
          -1 javac 0m 36s hadoop-yarn-server-nodemanager in the patch failed.
          -1 checkstyle 0m 29s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 15 new + 530 unchanged - 5 fixed = 545 total (was 535)
          +1 mvnsite 0m 40s the patch passed
          +1 mvneclipse 0m 16s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 15s the patch passed
          +1 javadoc 0m 24s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 2m 22s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 31s The patch does not generate ASF License warnings.
          15m 1s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.webapp.TestNMWebServices



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828288/YARN-5620.014.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux d5879fe6cbea 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 86c9862
          Default Java 1.8.0_101
          mvninstall https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/branch-mvninstall-root.txt
          findbugs v3.0.0
          compile https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13099/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13099/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 24s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. -1 mvninstall 2m 40s root in trunk failed. +1 compile 0m 38s trunk passed +1 checkstyle 0m 36s trunk passed +1 mvnsite 0m 48s trunk passed +1 mvneclipse 0m 17s trunk passed +1 findbugs 1m 8s trunk passed +1 javadoc 0m 26s trunk passed +1 mvninstall 0m 38s the patch passed -1 compile 0m 36s hadoop-yarn-server-nodemanager in the patch failed. -1 javac 0m 36s hadoop-yarn-server-nodemanager in the patch failed. -1 checkstyle 0m 29s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 15 new + 530 unchanged - 5 fixed = 545 total (was 535) +1 mvnsite 0m 40s the patch passed +1 mvneclipse 0m 16s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 1m 15s the patch passed +1 javadoc 0m 24s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 2m 22s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 31s The patch does not generate ASF License warnings. 15m 1s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.webapp.TestNMWebServices Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828288/YARN-5620.014.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux d5879fe6cbea 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 86c9862 Default Java 1.8.0_101 mvninstall https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/branch-mvninstall-root.txt findbugs v3.0.0 compile https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt javac https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13099/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13099/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13099/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Deleted, renamed and re-uploaded the attachment

          Show
          asuresh Arun Suresh added a comment - Deleted, renamed and re-uploaded the attachment
          Hide
          asuresh Arun Suresh added a comment -

          Uploading patch with minor refactoring:

          • ResourceSet::localizedResources is now a Map<String, Path>
          • Make sure that the RetryContext of the Container is updated after re-initialization
          Show
          asuresh Arun Suresh added a comment - Uploading patch with minor refactoring: ResourceSet::localizedResources is now a Map<String, Path> Make sure that the RetryContext of the Container is updated after re-initialization
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 7m 1s trunk passed
          +1 compile 0m 27s trunk passed
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 41s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 24s the patch passed
          +1 javac 0m 24s the patch passed
          -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 14 new + 530 unchanged - 5 fixed = 544 total (was 535)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 45s the patch passed
          +1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 21s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          27m 45s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828288/YARN-5620.014.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux b266e3d80f76 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / e793309
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13098/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13098/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13098/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13098/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13098/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 19s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 7m 1s trunk passed +1 compile 0m 27s trunk passed +1 checkstyle 0m 24s trunk passed +1 mvnsite 0m 27s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 41s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 22s the patch passed +1 compile 0m 24s the patch passed +1 javac 0m 24s the patch passed -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 14 new + 530 unchanged - 5 fixed = 544 total (was 535) +1 mvnsite 0m 24s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 45s the patch passed +1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 21s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 27m 45s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828288/YARN-5620.014.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux b266e3d80f76 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / e793309 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13098/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13098/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13098/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13098/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13098/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Reverting this:

          Also.. I pulled in a change from YARN-5637 here. Essentially, if the LocalResourceTracker gets a REQUEST for a resource and if it already exists, then it just ignores the request..

          Since this is already taken care of..

          Show
          asuresh Arun Suresh added a comment - Reverting this: Also.. I pulled in a change from YARN-5637 here. Essentially, if the LocalResourceTracker gets a REQUEST for a resource and if it already exists, then it just ignores the request.. Since this is already taken care of..
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 8m 19s trunk passed
          +1 compile 0m 30s trunk passed
          +1 checkstyle 0m 26s trunk passed
          +1 mvnsite 0m 31s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 50s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 26s the patch passed
          -1 javac 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17)
          -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 11 new + 530 unchanged - 5 fixed = 541 total (was 535)
          +1 mvnsite 0m 25s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 54s the patch passed
          +1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 24s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          29m 44s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl
            hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor
            hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828221/YARN-5620.013.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux c4f942af9660 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 729de3e
          Default Java 1.8.0_101
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13094/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13094/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 20s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 8m 19s trunk passed +1 compile 0m 30s trunk passed +1 checkstyle 0m 26s trunk passed +1 mvnsite 0m 31s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 50s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 26s the patch passed -1 javac 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17) -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 11 new + 530 unchanged - 5 fixed = 541 total (was 535) +1 mvnsite 0m 25s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 54s the patch passed +1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 24s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 29m 44s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl   hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor   hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828221/YARN-5620.013.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux c4f942af9660 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 729de3e Default Java 1.8.0_101 findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13094/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13094/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13094/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Jian He, thanks for checking.. Please find updated patch (v013)...

          ReInitializationContext#resourceSet is never initialized with the pending resources ?

          fixed this.. looks like the test case was passing because the localization request was fired from the resourceSet in the event.. so it did get localized.

          Also, the symlinks for the upgraded resources are not overridden correctly. we probably need to change this to be a map of (symlink -> localizedPath)

          Can you take a look at the changes to ResourceSet in the latest patch? The mergeWith function merges with upgraded resource set with the existing one and creates a new ResourceSet. It takes care of overwriting the symlinks.
          I changed the localizedResources map from <Path, List<Links>> to <Path, Set<Links>> since it is easier to de-dup.. like you said, I kept the call chain intact (we can handle in another JIRA) but converting it back to <Path, List<Links>> for the callers. Also, this happens only once when container is started and every upgrade.. so am guess it should be fine.

          Also.. I pulled in a change from YARN-5637 here. Essentially, if the LocalResourceTracker gets a REQUEST for a resource and if it already exists, then it just ignores the request.. I modified it slightly to ensure that the container receives Resource Localized message even if the resource exists since re-initialization can be kicked off only after the container receives a Resource Localized message for ALL its required resources,

          Show
          asuresh Arun Suresh added a comment - Jian He , thanks for checking.. Please find updated patch (v013)... ReInitializationContext#resourceSet is never initialized with the pending resources ? fixed this.. looks like the test case was passing because the localization request was fired from the resourceSet in the event.. so it did get localized. Also, the symlinks for the upgraded resources are not overridden correctly. we probably need to change this to be a map of (symlink -> localizedPath) Can you take a look at the changes to ResourceSet in the latest patch? The mergeWith function merges with upgraded resource set with the existing one and creates a new ResourceSet. It takes care of overwriting the symlinks. I changed the localizedResources map from <Path, List<Links>> to <Path, Set<Links>> since it is easier to de-dup.. like you said, I kept the call chain intact (we can handle in another JIRA) but converting it back to <Path, List<Links>> for the callers. Also, this happens only once when container is started and every upgrade.. so am guess it should be fine. Also.. I pulled in a change from YARN-5637 here. Essentially, if the LocalResourceTracker gets a REQUEST for a resource and if it already exists, then it just ignores the request.. I modified it slightly to ensure that the container receives Resource Localized message even if the resource exists since re-initialization can be kicked off only after the container receives a Resource Localized message for ALL its required resources,
          Hide
          jianhe Jian He added a comment -

          Arun Suresh, found a couple more issues while looking at the other patch:

          • not sure if I missed something, Looks like the ReInitializationContext#resourceSet is never initialized with the pending resources ?
          • Also, the symlinks for the upgraded resources are not overridden correctly.
            we probably need to change this to be a map of (symlink -> localizedPath) so that the same symlink can be overwritten to point to the new resource. If this involves a chain of caller changes, it can be done separately.
              private Map<Path, List<String>> localizedResources =
                  new ConcurrentHashMap<>();
            
          Show
          jianhe Jian He added a comment - Arun Suresh , found a couple more issues while looking at the other patch: not sure if I missed something, Looks like the ReInitializationContext#resourceSet is never initialized with the pending resources ? Also, the symlinks for the upgraded resources are not overridden correctly. we probably need to change this to be a map of (symlink -> localizedPath) so that the same symlink can be overwritten to point to the new resource. If this involves a chain of caller changes, it can be done separately. private Map<Path, List< String >> localizedResources = new ConcurrentHashMap<>();
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 8m 9s trunk passed
          +1 compile 0m 32s trunk passed
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 0m 31s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 49s trunk passed
          +1 javadoc 0m 18s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 27s the patch passed
          +1 javac 0m 27s the patch passed
          -1 checkstyle 0m 22s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 10 new + 515 unchanged - 4 fixed = 525 total (was 519)
          +1 mvnsite 0m 29s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 54s the patch passed
          +1 javadoc 0m 18s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 34s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          29m 53s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828059/YARN-5620.012.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux f4886f9afe87 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9faccd1
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13087/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13087/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13087/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13087/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13087/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 21s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 8m 9s trunk passed +1 compile 0m 32s trunk passed +1 checkstyle 0m 24s trunk passed +1 mvnsite 0m 31s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 49s trunk passed +1 javadoc 0m 18s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 27s the patch passed +1 javac 0m 27s the patch passed -1 checkstyle 0m 22s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 10 new + 515 unchanged - 4 fixed = 525 total (was 519) +1 mvnsite 0m 29s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 54s the patch passed +1 javadoc 0m 18s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 34s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 29m 53s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828059/YARN-5620.012.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux f4886f9afe87 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 9faccd1 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13087/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13087/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13087/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13087/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13087/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Done.. Thanks Jian He..

          Show
          asuresh Arun Suresh added a comment - Done.. Thanks Jian He ..
          Hide
          jianhe Jian He added a comment -

          Arun, thank you very much for the prompt response..
          I think you forgot to remove the unused checkAndUpdatePending method, given that we anyway need one more patch.. few nits:

          • reInitContext no need to be volatile
          • remove unused imports in ContainerLocalizationRequestEvent

          latest patch looks good to me.
          Varun Vasudev, want to take a look ?

          Show
          jianhe Jian He added a comment - Arun, thank you very much for the prompt response.. I think you forgot to remove the unused checkAndUpdatePending method, given that we anyway need one more patch.. few nits: reInitContext no need to be volatile remove unused imports in ContainerLocalizationRequestEvent latest patch looks good to me. Varun Vasudev , want to take a look ?
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 8m 15s trunk passed
          +1 compile 0m 31s trunk passed
          +1 checkstyle 0m 22s trunk passed
          +1 mvnsite 0m 28s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 50s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 25s the patch passed
          +1 compile 0m 28s the patch passed
          +1 javac 0m 28s the patch passed
          -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 12 new + 514 unchanged - 4 fixed = 526 total (was 518)
          +1 mvnsite 0m 27s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 54s the patch passed
          +1 javadoc 0m 17s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 5s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          29m 23s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828048/YARN-5620.011.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux dc2101f05586 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / cc01ed70
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13085/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13085/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13085/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13085/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13085/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 20s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 8m 15s trunk passed +1 compile 0m 31s trunk passed +1 checkstyle 0m 22s trunk passed +1 mvnsite 0m 28s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 50s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 25s the patch passed +1 compile 0m 28s the patch passed +1 javac 0m 28s the patch passed -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 12 new + 514 unchanged - 4 fixed = 526 total (was 518) +1 mvnsite 0m 27s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 54s the patch passed +1 javadoc 0m 17s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 5s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 29m 23s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828048/YARN-5620.011.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux dc2101f05586 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / cc01ed70 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13085/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13085/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13085/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13085/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13085/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Thanks Jian He.. Uploading patch (v011) with the changes.

          I left the CLEANUP_CONTAINER_FOR_REINIT there, even though it does the same thing as CLEANUP_CONTAINER. It is sent by a different source, it can be used for debugging etc.

          Show
          asuresh Arun Suresh added a comment - Thanks Jian He .. Uploading patch (v011) with the changes. I left the CLEANUP_CONTAINER_FOR_REINIT there, even though it does the same thing as CLEANUP_CONTAINER. It is sent by a different source, it can be used for debugging etc.
          Hide
          jianhe Jian He added a comment - - edited

          It is also possible that the an admin logs into the NM and does a 'kill -9' which will also cause the ContainerLaunch to send CONTAINER_KILLED_ON_REQUEST but it wont be in KILLING state.. right ?

          I guess in this case, it’s also fine to do the upgrade… because the upgrade API does accept it, it’s hard to distinguish which one should go first.. It's also likely the reverse can also happen because it's transient, if the setKillForReInitialization is called first, then the container process is killed, It will be considered as re-init, even though it is killed by external signal. so keep it consistent ?

          Actually if you look at the prepareContainerUpgrade() function,

          ah, yes, mislooked . thank you !

          The problem with getAllResourcesByVisibility, is it gets all resources. I just need the pending resources

          In this case, the pendingResources in the same as the getAllResourcesByVisibility, right? basically, I meant like below.. and the newly added methods could be not needed.

          Map<LocalResourceVisibility, Collection<LocalResourceRequest>>
              pendingResources = ((ContainerReInitEvent) event).getResourceSet()
              .getAllResourcesByVisibility();
          if (!pendingResources.isEmpty()) {
            container.dispatcher.getEventHandler().handle(
                new ContainerLocalizationRequestEvent(container, pendingResources));
          } else {
          
          • Forgot to say, similarly, is the change in ResourceLocalizedWhileRunningTransition required. as the symlinks are also already distinct.
          Show
          jianhe Jian He added a comment - - edited It is also possible that the an admin logs into the NM and does a 'kill -9' which will also cause the ContainerLaunch to send CONTAINER_KILLED_ON_REQUEST but it wont be in KILLING state.. right ? I guess in this case, it’s also fine to do the upgrade… because the upgrade API does accept it, it’s hard to distinguish which one should go first.. It's also likely the reverse can also happen because it's transient, if the setKillForReInitialization is called first, then the container process is killed, It will be considered as re-init, even though it is killed by external signal. so keep it consistent ? Actually if you look at the prepareContainerUpgrade() function, ah, yes, mislooked . thank you ! The problem with getAllResourcesByVisibility, is it gets all resources. I just need the pending resources In this case, the pendingResources in the same as the getAllResourcesByVisibility, right? basically, I meant like below.. and the newly added methods could be not needed. Map<LocalResourceVisibility, Collection<LocalResourceRequest>>     pendingResources = ((ContainerReInitEvent) event).getResourceSet()     .getAllResourcesByVisibility(); if  (!pendingResources.isEmpty()) {   container.dispatcher.getEventHandler().handle(       new  ContainerLocalizationRequestEvent(container, pendingResources)); }  else  { Forgot to say, similarly, is the change in ResourceLocalizedWhileRunningTransition required. as the symlinks are also already distinct.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 7m 14s trunk passed
          +1 compile 0m 26s trunk passed
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 41s trunk passed
          +1 javadoc 0m 16s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 23s the patch passed
          +1 javac 0m 23s the patch passed
          -1 checkstyle 0m 20s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 22 new + 575 unchanged - 4 fixed = 597 total (was 579)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 9s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 44s the patch passed
          +1 javadoc 0m 13s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 7s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          27m 30s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827991/YARN-5620.010.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux d596c40c7af7 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / cc01ed70
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13081/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13081/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13081/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13081/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13081/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 11s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 7m 14s trunk passed +1 compile 0m 26s trunk passed +1 checkstyle 0m 24s trunk passed +1 mvnsite 0m 27s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 41s trunk passed +1 javadoc 0m 16s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 23s the patch passed +1 javac 0m 23s the patch passed -1 checkstyle 0m 20s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 22 new + 575 unchanged - 4 fixed = 597 total (was 579) +1 mvnsite 0m 24s the patch passed +1 mvneclipse 0m 9s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 44s the patch passed +1 javadoc 0m 13s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 7s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 27m 30s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827991/YARN-5620.010.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux d596c40c7af7 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / cc01ed70 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13081/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13081/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13081/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13081/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13081/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Fixing failed tests (The TestDefaultContainerExecutor error seems to be unrelated) and some more checkstyles.

          Show
          asuresh Arun Suresh added a comment - Fixing failed tests (The TestDefaultContainerExecutor error seems to be unrelated) and some more checkstyles.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 23s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 8m 31s trunk passed
          +1 compile 0m 31s trunk passed
          +1 checkstyle 0m 25s trunk passed
          +1 mvnsite 0m 32s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 50s trunk passed
          +1 javadoc 0m 20s trunk passed
          +1 mvninstall 0m 28s the patch passed
          +1 compile 0m 28s the patch passed
          +1 javac 0m 28s the patch passed
          -1 checkstyle 0m 23s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 25 new + 604 unchanged - 4 fixed = 629 total (was 608)
          +1 mvnsite 0m 28s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 58s the patch passed
          +1 javadoc 0m 18s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 32s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 18s The patch does not generate ASF License warnings.
          30m 35s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
            hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRegression
            hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827978/YARN-5620.009.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 761d12e98010 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / cc01ed70
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13080/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13080/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13080/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13080/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13080/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 23s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 8m 31s trunk passed +1 compile 0m 31s trunk passed +1 checkstyle 0m 25s trunk passed +1 mvnsite 0m 32s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 50s trunk passed +1 javadoc 0m 20s trunk passed +1 mvninstall 0m 28s the patch passed +1 compile 0m 28s the patch passed +1 javac 0m 28s the patch passed -1 checkstyle 0m 23s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 25 new + 604 unchanged - 4 fixed = 629 total (was 608) +1 mvnsite 0m 28s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 58s the patch passed +1 javadoc 0m 18s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 32s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 18s The patch does not generate ASF License warnings. 30m 35s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.containermanager.TestContainerManager   hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRegression   hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827978/YARN-5620.009.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 761d12e98010 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / cc01ed70 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13080/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13080/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13080/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13080/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13080/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment - - edited

          Updating patch.

          • Addressing Jian He's latest comments
          • some javadoc, checkstyle and javac fixes

          IIUC, in this case, the ContainerImpl will receive the KILL event first and move to the KILLING state, and the CONTAINER_KILLED_ON_REQUEST will be sent to the container at KILLING state..

          It goes to KILLING stage only if the AM explicitly sends a kill signal or the RM asks NM to kill. It is also possible that the an admin logs into the NM and does a 'kill -9' which will also cause the ContainerLaunch to send CONTAINER_KILLED_ON_REQUEST but it wont be in KILLING state.. right ?

          ..In testContainerUpgradeSuccess, could you make newStartFile a new upgrade resource, and verify the output is written into it, this verifies the part about the localization part as well.

          Actually if you look at the prepareContainerUpgrade() function, we create a new script file scriptFile_new which is passed into the prepareContainerLaunchContext() function which associates the new file to a new dest_file_new location.. this should verify that the upgrade needed a new localized resource. The output of the script is also written to a new start_file_n.txt which we read and verify to check if the new process has actually started.

          Also by the way:

          We can use the ResourceSet#getAllResourcesByVisibility method instead, and so the getLocalPendingRequests method and the new constructor in ContainerLocalizationRequestEvent is not needed

          The problem with getAllResourcesByVisibility, is it gets all resources. I just need the pending resources... So if you are ok with it, Id like to keep it as is..

          Show
          asuresh Arun Suresh added a comment - - edited Updating patch. Addressing Jian He 's latest comments some javadoc, checkstyle and javac fixes IIUC, in this case, the ContainerImpl will receive the KILL event first and move to the KILLING state, and the CONTAINER_KILLED_ON_REQUEST will be sent to the container at KILLING state.. It goes to KILLING stage only if the AM explicitly sends a kill signal or the RM asks NM to kill. It is also possible that the an admin logs into the NM and does a 'kill -9' which will also cause the ContainerLaunch to send CONTAINER_KILLED_ON_REQUEST but it wont be in KILLING state.. right ? ..In testContainerUpgradeSuccess, could you make newStartFile a new upgrade resource, and verify the output is written into it, this verifies the part about the localization part as well. Actually if you look at the prepareContainerUpgrade() function, we create a new script file scriptFile_new which is passed into the prepareContainerLaunchContext() function which associates the new file to a new dest_file_new location.. this should verify that the upgrade needed a new localized resource. The output of the script is also written to a new start_file_n.txt which we read and verify to check if the new process has actually started. Also by the way: We can use the ResourceSet#getAllResourcesByVisibility method instead, and so the getLocalPendingRequests method and the new constructor in ContainerLocalizationRequestEvent is not needed The problem with getAllResourcesByVisibility, is it gets all resources. I just need the pending resources... So if you are ok with it, Id like to keep it as is..
          Hide
          jianhe Jian He added a comment -

          One more thing about the test.. In testContainerUpgradeSuccess, could you make newStartFile a new upgrade resource, and verify the output is written into it, this verifies the part about the localization part as well.

          Show
          jianhe Jian He added a comment - One more thing about the test.. In testContainerUpgradeSuccess, could you make newStartFile a new upgrade resource, and verify the output is written into it, this verifies the part about the localization part as well.
          Hide
          jianhe Jian He added a comment -

          the container should be killable explicitly via an external signal.

          IIUC, in this case, the ContainerImpl will receive the KILL event first and move to the KILLING state, and the CONTAINER_KILLED_ON_REQUEST will be sent to the container at KILLING state.

          Show
          jianhe Jian He added a comment - the container should be killable explicitly via an external signal. IIUC, in this case, the ContainerImpl will receive the KILL event first and move to the KILLING state, and the CONTAINER_KILLED_ON_REQUEST will be sent to the container at KILLING state.
          Hide
          asuresh Arun Suresh added a comment -

          Thanks Jian He..

          Will update patch shortly with your suggestions. But, with regard to this :

          Looks like we don’t need the killedForReInitialization flag in ContainerLaunch, because container_killed event can already be distinguished based on whether container is at Reinit or running state.

          This we actually do need the flag.. Since even while the container is re-initing (while resource is localizing but before the container_cleanup signal is sent), the container should be killable explicitly via an external signal. The ContainerLaunch should be able to know the difference and after killing, it should either ContainerEventType.CONTAINER_KILLED_ON_REQUEST or ContainerEventType.CONTAINER_KILLED_FOR_REINIT. That reminds me... I have to add the following transition as well:

          .addTransition(ContainerState.REINITIALIZING,
                  ContainerState.EXITED_WITH_FAILURE,
                  ContainerEventType.CONTAINER_KILLED_ON_REQUEST,
                  new KilledExternallyTransition())
          
          Show
          asuresh Arun Suresh added a comment - Thanks Jian He .. Will update patch shortly with your suggestions. But, with regard to this : Looks like we don’t need the killedForReInitialization flag in ContainerLaunch, because container_killed event can already be distinguished based on whether container is at Reinit or running state. This we actually do need the flag.. Since even while the container is re-initing (while resource is localizing but before the container_cleanup signal is sent), the container should be killable explicitly via an external signal. The ContainerLaunch should be able to know the difference and after killing, it should either ContainerEventType.CONTAINER_KILLED_ON_REQUEST or ContainerEventType.CONTAINER_KILLED_FOR_REINIT . That reminds me... I have to add the following transition as well: .addTransition(ContainerState.REINITIALIZING, ContainerState.EXITED_WITH_FAILURE, ContainerEventType.CONTAINER_KILLED_ON_REQUEST, new KilledExternallyTransition())
          Hide
          jianhe Jian He added a comment -

          Arun, thanks for updating ! looks good to me overall, few more comments:

          • Unused parameter container in createReInitContext
          • We can use the ResourceSet#getAllResourcesByVisibility method instead, and so the getLocalPendingRequests method and the new constructor in ContainerLocalizationRequestEvent is not needed
            Collection<LocalResourceRequest> reqs = getLocalPendingRequests(event);
            if (!reqs.isEmpty()) {
              container.dispatcher.getEventHandler().handle( 
                  new ContainerLocalizationRequestEvent(container, reqs));
            
          • In ResourceLocalizedWhileReInitTransition, why is checkAndUpdatePending needed ? why do we need to pass in a hashset to store the links ? I think the symlinks should be always distinct. I wonder we can can just call “resourceSet.resourceLocalized”
          • The newly added interfaces (startReInitialization/completeReInitialization/isReInitializing) in NMContext. Thinking if we can have a flag in ContainerImpl for this.. the advantage is that it won’t cause object leak for any reason we forgot to remove the container from the set. Else, we can expose a single getReinitializingContainers() in the NMContext instead of three, as a getter API looks conforming more with the other APIs in the NMContext.
          • we lost the tracking of the failed resource if we set the reInitContext to null? probably we should add the failed resource to ContainerImpl.resourceSet
            LOG.error("Container [" + container.getContainerId() + "] Re-init" +
                " failed !! Resource [" + failedEvent.getResource() + "] could" +
                " not be localized !!"); 
            container.reInitContext = null;
            
          • Looks like we don’t need the killedForReInitialization flag in ContainerLaunch, because container_killed event can already be distinguished based on whether container is at Reinit or running state.
          • nit: fix the format of the new imports in ContainerManagerImpl
          • the preUpgradeCheck method can also be reused by the localize API.

          Tests looks good, only very minor things:

          • could you add comments to prepareInitialContainer/prepareContainerUpgrade method about what’s it doing, that helps people to understand without reading through the code
          • Wait for new processStartfile to be creases typo : to be created

            Do you think we should cleanup the old script file ? If the upgrade uses the same script file name, it will be overwritten right ? the token file is anyway overwritten right ?

            I agree we don't need to clean up old files, if it can be overwritten.

          Show
          jianhe Jian He added a comment - Arun, thanks for updating ! looks good to me overall, few more comments: Unused parameter container in createReInitContext We can use the ResourceSet#getAllResourcesByVisibility method instead, and so the getLocalPendingRequests method and the new constructor in ContainerLocalizationRequestEvent is not needed Collection<LocalResourceRequest> reqs = getLocalPendingRequests(event); if (!reqs.isEmpty()) {   container.dispatcher.getEventHandler().handle(       new ContainerLocalizationRequestEvent(container, reqs)); In ResourceLocalizedWhileReInitTransition, why is checkAndUpdatePending needed ? why do we need to pass in a hashset to store the links ? I think the symlinks should be always distinct. I wonder we can can just call “resourceSet.resourceLocalized” The newly added interfaces (startReInitialization/completeReInitialization/isReInitializing) in NMContext. Thinking if we can have a flag in ContainerImpl for this.. the advantage is that it won’t cause object leak for any reason we forgot to remove the container from the set. Else, we can expose a single getReinitializingContainers() in the NMContext instead of three, as a getter API looks conforming more with the other APIs in the NMContext. we lost the tracking of the failed resource if we set the reInitContext to null? probably we should add the failed resource to ContainerImpl.resourceSet LOG.error( "Container [" + container.getContainerId() + "] Re-init" +     " failed !! Resource [" + failedEvent.getResource() + "] could" +     " not be localized !!" ); container.reInitContext = null ; Looks like we don’t need the killedForReInitialization flag in ContainerLaunch, because container_killed event can already be distinguished based on whether container is at Reinit or running state. nit: fix the format of the new imports in ContainerManagerImpl the preUpgradeCheck method can also be reused by the localize API. Tests looks good, only very minor things: could you add comments to prepareInitialContainer/prepareContainerUpgrade method about what’s it doing, that helps people to understand without reading through the code Wait for new processStartfile to be creases typo : to be created Do you think we should cleanup the old script file ? If the upgrade uses the same script file name, it will be overwritten right ? the token file is anyway overwritten right ? I agree we don't need to clean up old files, if it can be overwritten.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 5 new or modified test files.
          +1 mvninstall 6m 49s trunk passed
          +1 compile 0m 27s trunk passed
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 40s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 28s the patch passed
          -1 javac 0m 28s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17)
          -1 checkstyle 0m 25s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 23 new + 624 unchanged - 2 fixed = 647 total (was 626)
          +1 mvnsite 0m 28s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 56s the patch passed
          -1 javadoc 0m 16s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 3 new + 240 unchanged - 2 fixed = 243 total (was 242)
          -1 unit 14m 50s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          28m 38s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827911/YARN-5620.008.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 5619fb540838 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / bee9f57
          Default Java 1.8.0_101
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          javadoc https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13074/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13074/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 20s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 5 new or modified test files. +1 mvninstall 6m 49s trunk passed +1 compile 0m 27s trunk passed +1 checkstyle 0m 24s trunk passed +1 mvnsite 0m 27s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 40s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 28s the patch passed -1 javac 0m 28s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17) -1 checkstyle 0m 25s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 23 new + 624 unchanged - 2 fixed = 647 total (was 626) +1 mvnsite 0m 28s the patch passed +1 mvneclipse 0m 13s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 56s the patch passed -1 javadoc 0m 16s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 3 new + 240 unchanged - 2 fixed = 243 total (was 242) -1 unit 14m 50s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 28m 38s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827911/YARN-5620.008.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 5619fb540838 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / bee9f57 Default Java 1.8.0_101 findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt javadoc https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13074/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13074/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13074/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Uploading patch addressing Jian He's suggestions.

          • Refactored to use a new REINITIALIZING state
          • Handle race conditions to properly disallow relocalization and reintialization while a container is undergoing reinitialization.
          Show
          asuresh Arun Suresh added a comment - Uploading patch addressing Jian He 's suggestions. Refactored to use a new REINITIALIZING state Handle race conditions to properly disallow relocalization and reintialization while a container is undergoing reinitialization.
          Hide
          asuresh Arun Suresh added a comment -

          Missed this:

          why do we set the reInitContext to be null if once resource localization failed ?

          So once the localization required for the reinit is complete, a CONTAINER_CLEANUP event is sent to the ContainerLaunch. The ContainerLaunch can only interpret this as a FORCE_KILLED or TERMINATED.. in which case, the only event it can fire is the CONTAINER_KILLED_ON_REQUEST event.. in which case.. the only transition to handle this while container is in RUNNING state is the KilledExternallyTransition. The KilledExternallyTransition needs to know if the the kill needs result in a reinit or not... which it does by checking if reInitContext is null.

          I understand we can maybe avoid this with a new RE_INIT state (and thereby do away with the if check). But I have a feeling the state machine might get more complicated... I will give the refactor a shot.

          Show
          asuresh Arun Suresh added a comment - Missed this: why do we set the reInitContext to be null if once resource localization failed ? So once the localization required for the reinit is complete, a CONTAINER_CLEANUP event is sent to the ContainerLaunch . The ContainerLaunch can only interpret this as a FORCE_KILLED or TERMINATED.. in which case, the only event it can fire is the CONTAINER_KILLED_ON_REQUEST event.. in which case.. the only transition to handle this while container is in RUNNING state is the KilledExternallyTransition . The KilledExternallyTransition needs to know if the the kill needs result in a reinit or not... which it does by checking if reInitContext is null. I understand we can maybe avoid this with a new RE_INIT state (and thereby do away with the if check). But I have a feeling the state machine might get more complicated... I will give the refactor a shot.
          Hide
          asuresh Arun Suresh added a comment - - edited

          Jian He, thanks for thoughtful comments.

          Also, if it’s ignored here, then it appears to AM that the upgrade call is somehow ignored

          So in the ReInitializeContainerTransition, I do the following

          if (container.reInitContext != null) {
             container.addDiagnostics("Container [" + container.getContainerId()
                + "] ReInitialization already in progress !!");
            return;
          }
          

          So a new upgrade is not ignored..

          I think we can move the logic of ReInitializeContainerTransition to the upgrade API ? All it does is to resend the events to containerLauncher or ResourceLocalizationService , which can be done in the API. Also, this has the benefit of rejecting the upgrade call while the previous upgrade is in_progress.

          Actually, in one of my initial versions, I was setting the reInitContext on the container object in the upgrade API itself. But then I realized that all state change on the Container is affected via a state machine transition and I did not want to break that. I think we should keep the state change of a Container limited to state machine transitions as the code is cleaner and we can guard against other synchronization issues.

          Also, I think we should have a consistent way to letting the AM know about errors. I prefer we use the diagnostic messages (with maybe an error code) to let the AM know if failures. I guess you are already doing it for Failed Localization. This way, the checks and enforcements are done only in transitions. I agree the AM will not know immediately of an error, but subsequent 'getContainerStatus' calls should notify the AM.

          One other potential race is that the relocalize API has a chance to go through instead of rejected, while upgrading,

          I understand.. I was actually thinking we have a set of reInitilizingContainerIds in the NM context. we can check if containerId is present in the set in the upgrade and relocalize API and add to the set if not. We then remove from this set once the Container moves from LOCALIZED to RUNNING (this signifies that reinitialization is complete). Thoughts ?

          Given so many if(reinitializing) conditions in containerImpl, should we consider adding a new state?

          I tried that at first... but I preferred having more if thens than extra Container states. I can provide a version with different container states though...

          when launching the container, we need to cleanupPreviousContainerFiles as done in ContainerRelaunch, right?

          This is actually something I wanted to discuss. Do you think we should cleanup the old script file ? If the upgrade uses the same script file name, it will be overwritten right ? the token file is anyway overwritten right ?

          Show
          asuresh Arun Suresh added a comment - - edited Jian He , thanks for thoughtful comments. Also, if it’s ignored here, then it appears to AM that the upgrade call is somehow ignored So in the ReInitializeContainerTransition , I do the following if (container.reInitContext != null) { container.addDiagnostics("Container [" + container.getContainerId() + "] ReInitialization already in progress !!"); return; } So a new upgrade is not ignored.. I think we can move the logic of ReInitializeContainerTransition to the upgrade API ? All it does is to resend the events to containerLauncher or ResourceLocalizationService , which can be done in the API. Also, this has the benefit of rejecting the upgrade call while the previous upgrade is in_progress. Actually, in one of my initial versions, I was setting the reInitContext on the container object in the upgrade API itself. But then I realized that all state change on the Container is affected via a state machine transition and I did not want to break that. I think we should keep the state change of a Container limited to state machine transitions as the code is cleaner and we can guard against other synchronization issues. Also, I think we should have a consistent way to letting the AM know about errors. I prefer we use the diagnostic messages (with maybe an error code) to let the AM know if failures. I guess you are already doing it for Failed Localization. This way, the checks and enforcements are done only in transitions. I agree the AM will not know immediately of an error, but subsequent 'getContainerStatus' calls should notify the AM. One other potential race is that the relocalize API has a chance to go through instead of rejected, while upgrading, I understand.. I was actually thinking we have a set of reInitilizingContainerIds in the NM context. we can check if containerId is present in the set in the upgrade and relocalize API and add to the set if not. We then remove from this set once the Container moves from LOCALIZED to RUNNING (this signifies that reinitialization is complete). Thoughts ? Given so many if(reinitializing) conditions in containerImpl, should we consider adding a new state? I tried that at first... but I preferred having more if thens than extra Container states. I can provide a version with different container states though... when launching the container, we need to cleanupPreviousContainerFiles as done in ContainerRelaunch, right? This is actually something I wanted to discuss. Do you think we should cleanup the old script file ? If the upgrade uses the same script file name, it will be overwritten right ? the token file is anyway overwritten right ?
          Hide
          jianhe Jian He added a comment -

          Thanks Arun, some more comments and questions:

          • The reInitContext is updated asynchronously via event, but here it’s being checked synchronously in the upgrade API.
            if (!container.isRunning() || container.isReInitializing()) {
            
          • Also, if it’s ignored here, then it appears to AM that the upgrade call is somehow ignored. And user who issues the upgrade command will get confused as the call is ignored.
            if (container.reInitContext != null) {
              container.addDiagnostics("Container [" + container.getContainerId()
                  + "] ReInitialization already in progress !!");
              return; 
            }
            
            

            Overall, I think we can move the logic of ReInitializeContainerTransition to the upgrade API ? All it does is to resend the events to containerLauncher or ResourceLocalizationService , which can be done in the API. Also, this has the benefit of rejecting the upgrade call while the previous upgrade is in_progress.
            Current solution has a race condition that if previous upgrade is in_progress, the second one may be ignored instead of rejected, and user will not get notification that the previous upgrade is in_progress.
            One other potential race is that the relocalize API has a chance to go through instead of rejected, while upgrading, as the reInitContext is updated asynchronously, those requested resources will then be considered as upgrade resources.

          • why is checkAndUpdatePending method needed ?
            checkAndUpdatePending(rsrcEvent, container.resourceSet, links);
            if (container.isReInitializing()) {
              checkAndUpdatePending(
                  rsrcEvent, container.reInitContext.resourceSet, links); 
            }
            
            
          • why do we set the reInitContext to be null if once resource localization failed ?
            if (container.isReInitializing() &&
                container.reInitContext.resourceSet.getPendingResources()
                    .containsKey(failedEvent.getResource())) {
              LOG.error("Container [" + container.getContainerId() + "] Re-init" +
                  " failed !! Resource [" + failedEvent.getResource() + "] could" +
                  " not be localized !!");
              container.reInitContext = null; 
            }
            
            
          • In ResourceLocalizedWhileRunningTransition, the symlink creation part is not needed for reinit, because it will be done as part of the containerLaunch.
          • Given so many if(reinitializing) conditions in containerImpl, should we consider adding a new state?
          • when launching the container, we need to cleanupPreviousContainerFiles as done in ContainerRelaunch, right?
          Show
          jianhe Jian He added a comment - Thanks Arun, some more comments and questions: The reInitContext is updated asynchronously via event, but here it’s being checked synchronously in the upgrade API. if (!container.isRunning() || container.isReInitializing()) { Also, if it’s ignored here, then it appears to AM that the upgrade call is somehow ignored. And user who issues the upgrade command will get confused as the call is ignored. if (container.reInitContext != null ) {   container.addDiagnostics( "Container [" + container.getContainerId()       + "] ReInitialization already in progress !!" );   return ; } Overall, I think we can move the logic of ReInitializeContainerTransition to the upgrade API ? All it does is to resend the events to containerLauncher or ResourceLocalizationService , which can be done in the API. Also, this has the benefit of rejecting the upgrade call while the previous upgrade is in_progress. Current solution has a race condition that if previous upgrade is in_progress, the second one may be ignored instead of rejected, and user will not get notification that the previous upgrade is in_progress. One other potential race is that the relocalize API has a chance to go through instead of rejected, while upgrading, as the reInitContext is updated asynchronously, those requested resources will then be considered as upgrade resources. why is checkAndUpdatePending method needed ? checkAndUpdatePending(rsrcEvent, container.resourceSet, links); if (container.isReInitializing()) {   checkAndUpdatePending(       rsrcEvent, container.reInitContext.resourceSet, links); } why do we set the reInitContext to be null if once resource localization failed ? if (container.isReInitializing() &&     container.reInitContext.resourceSet.getPendingResources()         .containsKey(failedEvent.getResource())) {   LOG.error( "Container [" + container.getContainerId() + "] Re-init" +       " failed !! Resource [" + failedEvent.getResource() + "] could" +       " not be localized !!" );   container.reInitContext = null ; } In ResourceLocalizedWhileRunningTransition, the symlink creation part is not needed for reinit, because it will be done as part of the containerLaunch. Given so many if(reinitializing) conditions in containerImpl, should we consider adding a new state? when launching the container, we need to cleanupPreviousContainerFiles as done in ContainerRelaunch, right?
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 7m 23s trunk passed
          +1 compile 0m 32s trunk passed
          +1 checkstyle 0m 21s trunk passed
          +1 mvnsite 0m 29s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 44s trunk passed
          +1 javadoc 0m 18s trunk passed
          +1 mvninstall 0m 24s the patch passed
          +1 compile 0m 25s the patch passed
          +1 javac 0m 25s the patch passed
          -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 315 unchanged - 2 fixed = 318 total (was 317)
          +1 mvnsite 0m 26s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 50s the patch passed
          +1 javadoc 0m 16s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242)
          -1 unit 14m 20s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          28m 22s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
            hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827677/YARN-5620.007.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux b2842b7681d3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / b6d839a
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13055/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13055/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13055/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 7m 23s trunk passed +1 compile 0m 32s trunk passed +1 checkstyle 0m 21s trunk passed +1 mvnsite 0m 29s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 44s trunk passed +1 javadoc 0m 18s trunk passed +1 mvninstall 0m 24s the patch passed +1 compile 0m 25s the patch passed +1 javac 0m 25s the patch passed -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 315 unchanged - 2 fixed = 318 total (was 317) +1 mvnsite 0m 26s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 50s the patch passed +1 javadoc 0m 16s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) -1 unit 14m 20s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 28m 22s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager   hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827677/YARN-5620.007.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux b2842b7681d3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / b6d839a Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13055/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13055/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13055/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13055/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Fixing checkstyles, javadocs and javac

          Show
          asuresh Arun Suresh added a comment - Fixing checkstyles, javadocs and javac
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 6m 44s trunk passed
          +1 compile 0m 26s trunk passed
          +1 checkstyle 0m 20s trunk passed
          +1 mvnsite 0m 28s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 40s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 25s the patch passed
          -1 javac 0m 25s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17)
          -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 315 unchanged - 2 fixed = 318 total (was 317)
          +1 mvnsite 0m 25s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 45s the patch passed
          -1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 3 new + 242 unchanged - 0 fixed = 245 total (was 242)
          -1 unit 14m 21s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          27m 23s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827613/YARN-5620.006.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 73b1a68f56a7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 20a20c2
          Default Java 1.8.0_101
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          javadoc https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13049/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13049/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 18s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 6m 44s trunk passed +1 compile 0m 26s trunk passed +1 checkstyle 0m 20s trunk passed +1 mvnsite 0m 28s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 40s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 22s the patch passed +1 compile 0m 25s the patch passed -1 javac 0m 25s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17) -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 315 unchanged - 2 fixed = 318 total (was 317) +1 mvnsite 0m 25s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 45s the patch passed -1 javadoc 0m 14s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 3 new + 242 unchanged - 0 fixed = 245 total (was 242) -1 unit 14m 21s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 27m 23s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827613/YARN-5620.006.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 73b1a68f56a7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 20a20c2 Default Java 1.8.0_101 findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt javadoc https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13049/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13049/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13049/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Uploading patch addressing most of Varun Vasudev and Jian He suggestions. Thanks for the comments !!

          Varun Vasudev,

          Should there be a guard against calling reint if a reinit is already in progress? Could we end up with the ReInitContext in odd state?

          So there is already a guard in the ContainerManager api... but I have included an additional check in the transition in the new patch as per your suggestion.

          Instead of a launch event we should send a relaunch event - the relaunch takes care of trying to run in same work dir as the earlier attempt, etc

          I actually tried using relaunch initially... but it looks like the pid has to be running for the re launch to work correctly. Also, looks like we would need an intermediate state there too and would result in same (or more) amount of code change. I would actually prefer to use launch itself, since I am more confident of how it works. I have also updated the testcase to verify that the upgraded container has access to and is able to read files created by the previous process in the working directory.

          think an explicit commit API(with auto-commit option being the default option) should satisfy both use cases.

          Thanks.. will update the patch with it once we agree that the reinit flow is fine.

          Jian He,

          While AM issues the upgrade command, the container could exit with success or failure. in this case, should we still continue the upgrade process ?

          I am nullifying the reInitContext in the event of an explicit kill or if process completed successfully during the reInit.. the upgrade should thus be cancelled. Do take a look at the latest patch and let me know if you think i've cover all cases.

          Show
          asuresh Arun Suresh added a comment - Uploading patch addressing most of Varun Vasudev and Jian He suggestions. Thanks for the comments !! Varun Vasudev , Should there be a guard against calling reint if a reinit is already in progress? Could we end up with the ReInitContext in odd state? So there is already a guard in the ContainerManager api... but I have included an additional check in the transition in the new patch as per your suggestion. Instead of a launch event we should send a relaunch event - the relaunch takes care of trying to run in same work dir as the earlier attempt, etc I actually tried using relaunch initially... but it looks like the pid has to be running for the re launch to work correctly. Also, looks like we would need an intermediate state there too and would result in same (or more) amount of code change. I would actually prefer to use launch itself, since I am more confident of how it works. I have also updated the testcase to verify that the upgraded container has access to and is able to read files created by the previous process in the working directory. think an explicit commit API(with auto-commit option being the default option) should satisfy both use cases. Thanks.. will update the patch with it once we agree that the reinit flow is fine. Jian He , While AM issues the upgrade command, the container could exit with success or failure. in this case, should we still continue the upgrade process ? I am nullifying the reInitContext in the event of an explicit kill or if process completed successfully during the reInit.. the upgrade should thus be cancelled. Do take a look at the latest patch and let me know if you think i've cover all cases.
          Hide
          jianhe Jian He added a comment -

          Thanks Arun, few comments on the upgrade flow:

          • The addResource better be done in the upgrade API itself so that if the passed local resource is not valid, the API can throw exception directly and AM gets notified. Right now, the failure is ignored to the AM.
            Map<LocalResourceVisibility, Collection<LocalResourceRequest>>

                req = container.reInitContext.resourceSet.addResources(
    getResourcesToLocalize(event));
            
          • While AM issues the upgrade command, the container could exit with success or failure. in this case, should we still continue the upgrade process ? i.e. restart the container with the new launch context, because anyways we wanted to old process to exit.
          Show
          jianhe Jian He added a comment - Thanks Arun, few comments on the upgrade flow: The addResource better be done in the upgrade API itself so that if the passed local resource is not valid, the API can throw exception directly and AM gets notified. Right now, the failure is ignored to the AM. Map<LocalResourceVisibility, Collection<LocalResourceRequest>>
 req = container.reInitContext.resourceSet.addResources(
 getResourcesToLocalize(event)); While AM issues the upgrade command, the container could exit with success or failure. in this case, should we still continue the upgrade process ? i.e. restart the container with the new launch context, because anyways we wanted to old process to exit.
          Hide
          vvasudev Varun Vasudev added a comment -

          Thanks for the patch Arun Suresh!

          1)

          -            containermanager.container.ContainerState.RUNNING)) {
          +            containermanager.container.ContainerState.RUNNING)
          +        || container.isReInitializing()) {
          

          Minor nit - can we add a function called container.isRunning and move the running state check into that function? Then the function becomes container.isRunning() || container.isReInitializing()

          2)

          +  private void preUpgradeCheck(ContainerId containerId, String op)
          

          Maybe switch to enum instead of String for the op?

          3)

          +        container.launchContext = container.reInitContext.newLaunchContext;
          +        container.resourceSet.merge(container.reInitContext.resourceSet);
          +
          +        container.sendLaunchEvent();
          

          Instead of a launch event we should send a relaunch event - the relaunch takes care of trying to run in same work dir as the earlier attempt, etc

          4)

          +    public void transition(ContainerImpl container, ContainerEvent event) {
          +      container.reInitContext = createReInitContext(container, event);
          

          Should there be a guard against calling reint if a reinit is already in progress? Could we end up with the ReInitContext in odd state?

          5)

          +        List<String> l = resourceSet.resourceLocalized(
          +            rsrcEvent.getResource(), rsrcEvent.getLocation());
          +        if (l != null) {
          +          links.addAll(l);
          +        }
          

          Do we need to de-dup here? It’s possible that the same link gets added twice?

          6)

          How does AM determine whether the upgrade is successful (like what kind signal should AM depend on)? I feel once the container starts running, even for AM, it's hard to distinguish whether the failure is caused by upgrade or runtime. IMO, if container fails to launch on upgrade, it should be considered as upgrade failure. Once the container starts running, if the container fails, it can be considered as runtime failure. If user does want to rollback, user call the upgardeContainer/rollback command again to roll back.

          I think both Jian He and Arun Suresh raise valid points. I think an explicit commit API(with auto-commit option being the default option) should satisfy both use cases.

          Show
          vvasudev Varun Vasudev added a comment - Thanks for the patch Arun Suresh ! 1) - containermanager.container.ContainerState.RUNNING)) { + containermanager.container.ContainerState.RUNNING) + || container.isReInitializing()) { Minor nit - can we add a function called container.isRunning and move the running state check into that function? Then the function becomes container.isRunning() || container.isReInitializing() 2) + private void preUpgradeCheck(ContainerId containerId, String op) Maybe switch to enum instead of String for the op? 3) + container.launchContext = container.reInitContext.newLaunchContext; + container.resourceSet.merge(container.reInitContext.resourceSet); + + container.sendLaunchEvent(); Instead of a launch event we should send a relaunch event - the relaunch takes care of trying to run in same work dir as the earlier attempt, etc 4) + public void transition(ContainerImpl container, ContainerEvent event) { + container.reInitContext = createReInitContext(container, event); Should there be a guard against calling reint if a reinit is already in progress? Could we end up with the ReInitContext in odd state? 5) + List< String > l = resourceSet.resourceLocalized( + rsrcEvent.getResource(), rsrcEvent.getLocation()); + if (l != null ) { + links.addAll(l); + } Do we need to de-dup here? It’s possible that the same link gets added twice? 6) How does AM determine whether the upgrade is successful (like what kind signal should AM depend on)? I feel once the container starts running, even for AM, it's hard to distinguish whether the failure is caused by upgrade or runtime. IMO, if container fails to launch on upgrade, it should be considered as upgrade failure. Once the container starts running, if the container fails, it can be considered as runtime failure. If user does want to rollback, user call the upgardeContainer/rollback command again to roll back. I think both Jian He and Arun Suresh raise valid points. I think an explicit commit API(with auto-commit option being the default option) should satisfy both use cases.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 13s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 6m 57s trunk passed
          +1 compile 0m 27s trunk passed
          +1 checkstyle 0m 19s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 41s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 24s the patch passed
          +1 javac 0m 24s the patch passed
          -1 checkstyle 0m 17s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 311 unchanged - 2 fixed = 313 total (was 313)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 46s the patch passed
          +1 javadoc 0m 14s the patch passed
          -1 unit 14m 14s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          27m 19s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827504/YARN-5620.005.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux f1a1d3b82815 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 63f5948
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13042/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13042/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13042/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13042/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13042/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 13s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 6m 57s trunk passed +1 compile 0m 27s trunk passed +1 checkstyle 0m 19s trunk passed +1 mvnsite 0m 27s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 41s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 22s the patch passed +1 compile 0m 24s the patch passed +1 javac 0m 24s the patch passed -1 checkstyle 0m 17s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 311 unchanged - 2 fixed = 313 total (was 313) +1 mvnsite 0m 24s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 46s the patch passed +1 javadoc 0m 14s the patch passed -1 unit 14m 14s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 27m 19s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827504/YARN-5620.005.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux f1a1d3b82815 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 63f5948 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13042/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13042/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13042/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13042/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13042/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 6m 51s trunk passed
          +1 compile 0m 28s trunk passed
          +1 checkstyle 0m 20s trunk passed
          +1 mvnsite 0m 28s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 42s trunk passed
          +1 javadoc 0m 18s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 25s the patch passed
          +1 javac 0m 25s the patch passed
          -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 311 unchanged - 2 fixed = 313 total (was 313)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 45s the patch passed
          +1 javadoc 0m 15s the patch passed
          -1 unit 14m 13s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          27m 24s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827504/YARN-5620.005.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux c556c6baeb0f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 63f5948
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13041/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13041/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13041/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13041/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13041/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 15s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 6m 51s trunk passed +1 compile 0m 28s trunk passed +1 checkstyle 0m 20s trunk passed +1 mvnsite 0m 28s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 42s trunk passed +1 javadoc 0m 18s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 25s the patch passed +1 javac 0m 25s the patch passed -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 311 unchanged - 2 fixed = 313 total (was 313) +1 mvnsite 0m 24s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 45s the patch passed +1 javadoc 0m 15s the patch passed -1 unit 14m 13s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 27m 24s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827504/YARN-5620.005.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux c556c6baeb0f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 63f5948 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13041/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13041/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13041/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13041/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13041/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 6m 58s trunk passed
          +1 compile 0m 28s trunk passed
          +1 checkstyle 0m 20s trunk passed
          +1 mvnsite 0m 28s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 41s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 25s the patch passed
          +1 javac 0m 25s the patch passed
          -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 4 new + 312 unchanged - 2 fixed = 316 total (was 314)
          +1 mvnsite 0m 24s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 45s the patch passed
          +1 javadoc 0m 15s the patch passed
          -1 unit 1m 56s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          15m 17s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.nodelabels.TestConfigurationNodeLabelsProvider



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827501/YARN-5620.004.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 3663b840c8c5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / d355573
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13040/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13040/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13040/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13040/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13040/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 18s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 6m 58s trunk passed +1 compile 0m 28s trunk passed +1 checkstyle 0m 20s trunk passed +1 mvnsite 0m 28s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 41s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 22s the patch passed +1 compile 0m 25s the patch passed +1 javac 0m 25s the patch passed -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 4 new + 312 unchanged - 2 fixed = 316 total (was 314) +1 mvnsite 0m 24s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 45s the patch passed +1 javadoc 0m 15s the patch passed -1 unit 1m 56s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 15m 17s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.nodelabels.TestConfigurationNodeLabelsProvider Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827501/YARN-5620.004.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 3663b840c8c5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / d355573 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13040/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13040/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13040/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13040/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13040/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Uploading an updated patch with minor test case fixes

          Show
          asuresh Arun Suresh added a comment - Uploading an updated patch with minor test case fixes
          Hide
          asuresh Arun Suresh added a comment - - edited

          Jian He, As per your suggestion, I am uploading a patch with just the restart container for your review convenience. I renamed it reInitialize to signify that the restart is dependent on the container being re-initialized with new bits.

          But, as per my previous comments, I do believe that we should not expose an upgrade without a rollback to just-previous launch context (both implicit based on failure policy and well as an explicit rollback API).

          I would thus prefer to update the same JIRA with the rollback and commit calls (once you are satisfied with the restart flow) rather than open separate JIRAs.

          the slider AM (also Yarn code) will have the prior context and call the upgardeContainer with the corresponding context, and so NM does not need to remember prior context.

          Hmmmm... I still believe rollback to just prior version should be supported by the NM.. and for rolling upgrades, atleast for production environments I have had experience with, it is an absolute requirement. The AM (Slider in our case) can subsequently reinitialize to any version it chooses later on if it wants.

          Show
          asuresh Arun Suresh added a comment - - edited Jian He , As per your suggestion, I am uploading a patch with just the restart container for your review convenience. I renamed it reInitialize to signify that the restart is dependent on the container being re-initialized with new bits. But, as per my previous comments, I do believe that we should not expose an upgrade without a rollback to just-previous launch context (both implicit based on failure policy and well as an explicit rollback API). I would thus prefer to update the same JIRA with the rollback and commit calls (once you are satisfied with the restart flow) rather than open separate JIRAs. the slider AM (also Yarn code) will have the prior context and call the upgardeContainer with the corresponding context, and so NM does not need to remember prior context. Hmmmm... I still believe rollback to just prior version should be supported by the NM.. and for rolling upgrades, atleast for production environments I have had experience with, it is an absolute requirement. The AM (Slider in our case) can subsequently reinitialize to any version it chooses later on if it wants.
          Hide
          jianhe Jian He added a comment - - edited

          Arun Suresh, thanks for the explanation

          Only the AM knows if the upgrade is actually successful.

          How does AM determine whether the upgrade is successful (like what kind signal should AM depend on)? I feel once the container starts running, even for AM, it's hard to distinguish whether the failure is caused by upgrade or runtime. IMO, if container fails to launch on upgrade, it should be considered as upgrade failure. Once the container starts running, if the container fails, it can be considered as runtime failure. If user does want to rollback, user call the upgardeContainer/rollback command again to roll back.

          But, in my opinion rollback should not be provided with an explicit launchContext, it should always be the just previous context.

          I also agree AM can take care of tying the context with version. In our case, the slider AM (also Yarn code) will have the prior context and call the upgardeContainer with the corresponding context, and so NM does not need to remember prior context.

          I think for upgrade itself, it is enough work for a single jira with enough corner cases and we have consensus on that. Could you separate the patch to include only the upgrade piece in this jira ? that also makes review easier..

          Show
          jianhe Jian He added a comment - - edited Arun Suresh , thanks for the explanation Only the AM knows if the upgrade is actually successful. How does AM determine whether the upgrade is successful (like what kind signal should AM depend on)? I feel once the container starts running, even for AM, it's hard to distinguish whether the failure is caused by upgrade or runtime. IMO, if container fails to launch on upgrade, it should be considered as upgrade failure. Once the container starts running, if the container fails, it can be considered as runtime failure. If user does want to rollback, user call the upgardeContainer/rollback command again to roll back. But, in my opinion rollback should not be provided with an explicit launchContext, it should always be the just previous context. I also agree AM can take care of tying the context with version. In our case, the slider AM (also Yarn code) will have the prior context and call the upgardeContainer with the corresponding context, and so NM does not need to remember prior context. I think for upgrade itself, it is enough work for a single jira with enough corner cases and we have consensus on that. Could you separate the patch to include only the upgrade piece in this jira ? that also makes review easier..
          Hide
          leftnoteasy Wangda Tan added a comment -

          Just have an offline chat with Arun Suresh, since this is a implementation change only, and also, container change (like update binary) is orthogonal to allocation change (like update resources). I'm OK with the overall approach.

          Thanks,

          Show
          leftnoteasy Wangda Tan added a comment - Just have an offline chat with Arun Suresh , since this is a implementation change only, and also, container change (like update binary) is orthogonal to allocation change (like update resources). I'm OK with the overall approach. Thanks,
          Hide
          leftnoteasy Wangda Tan added a comment -

          Arun Suresh,

          I still think two APIs should be merged.

          In YARN-5221, it update properties of container in RM, like resource / execution-type
          So I think we should have a symmetric API in NM to update properties of container in NM.

          To me, localized resource, resource, execution type, any fields in ContainerLaunchContext are all properties of a container in NM, we should be able to use a unified API to update them.

          And should we expose a rollback API to application?

          Show
          leftnoteasy Wangda Tan added a comment - Arun Suresh , I still think two APIs should be merged. In YARN-5221 , it update properties of container in RM, like resource / execution-type So I think we should have a symmetric API in NM to update properties of container in NM. To me, localized resource, resource, execution type, any fields in ContainerLaunchContext are all properties of a container in NM, we should be able to use a unified API to update them. And should we expose a rollback API to application?
          Hide
          asuresh Arun Suresh added a comment - - edited

          Wangda Tan, I feel that YARN-5221 and this are orthogonal.

          YARN-5221 deals with the update of a container allocation, properties of the container that are of interest to the Scheduler for allocation of the resource request like Resource size, and ExecutionType.

          This JIRA is pertaining to upgrade/rollback of the of the container Process, which are of interest only to the NM for the running of the container, which are encapsulated in the ContainerLaunchContext and Resources that need to be localized. The ContainerLaunchContext and LocalResources are not even known to the RM/Scheduler.

          With regard to the AM-NM protocol, I feel we should keep both different.

          1. One for updating the container allocation which is handled by YARN-5221. Maybe we should rename those to updateContainerAllocation rather than just updateContainer ?
          2. One for updating the container process, which this JIRA proposes to call upgrade and rollback
          Show
          asuresh Arun Suresh added a comment - - edited Wangda Tan , I feel that YARN-5221 and this are orthogonal. YARN-5221 deals with the update of a container allocation , properties of the container that are of interest to the Scheduler for allocation of the resource request like Resource size, and ExecutionType. This JIRA is pertaining to upgrade/rollback of the of the container Process , which are of interest only to the NM for the running of the container, which are encapsulated in the ContainerLaunchContext and Resources that need to be localized. The ContainerLaunchContext and LocalResources are not even known to the RM/Scheduler. With regard to the AM-NM protocol, I feel we should keep both different. One for updating the container allocation which is handled by YARN-5221 . Maybe we should rename those to updateContainerAllocation rather than just updateContainer ? One for updating the container process, which this JIRA proposes to call upgrade and rollback
          Hide
          leftnoteasy Wangda Tan added a comment -

          Arun Suresh,

          In YARN-5221 we have merged update container resource and execution type. I think it's better to merge the API in NM side as well. We have updated NM for container resizing in YARN-1643/YARN-1449, and there's on pending ticket for updating cgroups: YARN-4166.

          I would suggest to have:

          • A unified API in AM-NM protocol to update container
          • A unified implementation inside to update container manager / CGroups.

          To avoid making API/Implementation fragmented.

          Thoughts?

          Show
          leftnoteasy Wangda Tan added a comment - Arun Suresh , In YARN-5221 we have merged update container resource and execution type. I think it's better to merge the API in NM side as well. We have updated NM for container resizing in YARN-1643 / YARN-1449 , and there's on pending ticket for updating cgroups: YARN-4166 . I would suggest to have: A unified API in AM-NM protocol to update container A unified implementation inside to update container manager / CGroups. To avoid making API/Implementation fragmented. Thoughts?
          Hide
          asuresh Arun Suresh added a comment - - edited

          Thanks for the review Jian He

          The COMMIT_UPGRADE API: I don’t quite get the necessity of this API. Could you explain under what scenario should the user call this API ?

          Consider an AM that upgrades a container with a new binary and the process is subsequently restarted. Now after say around 10 mins the process dies. There is no way form the NM to know if the process died because of the upgrade (memory leak ?) or due to some transient failure.. and therefore it cannot make the decision to Retry the process or Rollback the upgrade. Only the AM knows if the upgrade is actually successful. Essentially, the commit API should be used by the AM to notify the NM that upgrade is fine and any subsequent failure can be handled by the existing Retry Policy AFTER it has performed some upgrade diagnostics on the container. We can provide an autoCommit convenience method that clubs upgrade + commit. But I feel it is important we keep the explicit commit API.

          The ROLLBACK_UPGRADE API: I think it should be able to rollback to any previous version, rather than only the immediate previous one. In some sense, it’s the same as upgrade.

          I agree AM should be able to move to any previous version, but,

          1. I feel the versioning should NOT be managed by the NM, since a) the launch context is provided and managed by the AM, the AM should take care of tying the context with the version b) There are (possibly huge) storage implications the NM would have to deal with to keep track of all the earlier versions.
          2. It should not be called rollback. The AM should call upgradeContainer(launchContext) with some previous context.

          IMHO, we probably can use one API restartContainer(context) for both upgrade and downgrade

          I agree that both rollback (explicit rollback via API) and upgrade can be implemented as wrappers over restartContainer(launchContext). But, in my opinion rollback should not be provided with an explicit launchContext, it should always be the just previous context.

          Also, Forcing containers to be restarted with previous version if upgrade fails may not be suitable in all cases, User wants to troubleshoot the failure first before triggering a new wave of restarts.

          Agreed... I can include an UpgradePolicy which allows users to terminate or rollBack (implicit rollback) on failure. Also COMMIT is useful here if the user wants to verify if one wave has successfully upgraded, commit upgrade in those instances and then move on to the next wave.

          IMO, as first cut implementation, we can fail the container if upgrade fails. we can add retry, rollback, or release the container as RetryPolicy on failure later. your opinion ?

          Yup.. will include a policy, as I mentioned above. Don't think retry makes sense though.

          Show
          asuresh Arun Suresh added a comment - - edited Thanks for the review Jian He The COMMIT_UPGRADE API: I don’t quite get the necessity of this API. Could you explain under what scenario should the user call this API ? Consider an AM that upgrades a container with a new binary and the process is subsequently restarted. Now after say around 10 mins the process dies. There is no way form the NM to know if the process died because of the upgrade (memory leak ?) or due to some transient failure.. and therefore it cannot make the decision to Retry the process or Rollback the upgrade. Only the AM knows if the upgrade is actually successful. Essentially, the commit API should be used by the AM to notify the NM that upgrade is fine and any subsequent failure can be handled by the existing Retry Policy AFTER it has performed some upgrade diagnostics on the container. We can provide an autoCommit convenience method that clubs upgrade + commit. But I feel it is important we keep the explicit commit API. The ROLLBACK_UPGRADE API: I think it should be able to rollback to any previous version, rather than only the immediate previous one. In some sense, it’s the same as upgrade. I agree AM should be able to move to any previous version, but, I feel the versioning should NOT be managed by the NM, since a) the launch context is provided and managed by the AM, the AM should take care of tying the context with the version b) There are (possibly huge) storage implications the NM would have to deal with to keep track of all the earlier versions. It should not be called rollback . The AM should call upgradeContainer(launchContext) with some previous context. IMHO, we probably can use one API restartContainer(context) for both upgrade and downgrade I agree that both rollback (explicit rollback via API) and upgrade can be implemented as wrappers over restartContainer(launchContext) . But, in my opinion rollback should not be provided with an explicit launchContext, it should always be the just previous context. Also, Forcing containers to be restarted with previous version if upgrade fails may not be suitable in all cases, User wants to troubleshoot the failure first before triggering a new wave of restarts. Agreed... I can include an UpgradePolicy which allows users to terminate or rollBack (implicit rollback) on failure. Also COMMIT is useful here if the user wants to verify if one wave has successfully upgraded, commit upgrade in those instances and then move on to the next wave. IMO, as first cut implementation, we can fail the container if upgrade fails. we can add retry, rollback, or release the container as RetryPolicy on failure later. your opinion ? Yup.. will include a policy, as I mentioned above. Don't think retry makes sense though.
          Hide
          jianhe Jian He added a comment -

          Thanks Arun, some questions on the API:

          • The COMMIT_UPGRADE API: I don’t quite get the necessity of this API. Could you explain under what scenario should the user call this API ?
          • The ROLLBACK_UPGRADE API: I think it should be able to rollback to any previous version, rather than only the immediate previous one. In some sense, it’s the same as upgrade.
            IMHO, we probably can use one API restartContainer(context) for both upgrade and downgrade, if the upgrade fails, user should be notified. And the AM(in our case slider), on user’s instruction, can choose to restartContainer with different version.
          • Also, Forcing containers to be restarted with previous version if upgrade fails may not be suitable in all cases, as often times, containers are upgraded in waves. User wants to troubleshoot the failure first before triggering a new wave of restarts.
            IMO, as first cut implementation, we can fail the container if upgrade fails. we can add retry, rollback, or release the container as RetryPolicy on failure later. your opinion ?
          Show
          jianhe Jian He added a comment - Thanks Arun, some questions on the API: The COMMIT_UPGRADE API: I don’t quite get the necessity of this API. Could you explain under what scenario should the user call this API ? The ROLLBACK_UPGRADE API: I think it should be able to rollback to any previous version, rather than only the immediate previous one. In some sense, it’s the same as upgrade. IMHO, we probably can use one API restartContainer(context) for both upgrade and downgrade, if the upgrade fails, user should be notified. And the AM(in our case slider), on user’s instruction, can choose to restartContainer with different version. Also, Forcing containers to be restarted with previous version if upgrade fails may not be suitable in all cases, as often times, containers are upgraded in waves. User wants to troubleshoot the failure first before triggering a new wave of restarts. IMO, as first cut implementation, we can fail the container if upgrade fails. we can add retry, rollback, or release the container as RetryPolicy on failure later. your opinion ?
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 13s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 7m 39s trunk passed
          +1 compile 0m 28s trunk passed
          +1 checkstyle 0m 20s trunk passed
          +1 mvnsite 0m 31s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 47s trunk passed
          +1 javadoc 0m 17s trunk passed
          +1 mvninstall 0m 24s the patch passed
          +1 compile 0m 26s the patch passed
          -1 javac 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17)
          -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 7 new + 312 unchanged - 2 fixed = 319 total (was 314)
          +1 mvnsite 0m 26s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 49s the patch passed
          +1 javadoc 0m 14s the patch passed
          -1 unit 14m 16s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          28m 29s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827319/YARN-5620.003.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux ddd7fd5e9ae9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / c0e492e
          Default Java 1.8.0_101
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13030/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13030/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 13s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 7m 39s trunk passed +1 compile 0m 28s trunk passed +1 checkstyle 0m 20s trunk passed +1 mvnsite 0m 31s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 47s trunk passed +1 javadoc 0m 17s trunk passed +1 mvninstall 0m 24s the patch passed +1 compile 0m 26s the patch passed -1 javac 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17) -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 7 new + 312 unchanged - 2 fixed = 319 total (was 314) +1 mvnsite 0m 26s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 49s the patch passed +1 javadoc 0m 14s the patch passed -1 unit 14m 16s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 28m 29s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827319/YARN-5620.003.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux ddd7fd5e9ae9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / c0e492e Default Java 1.8.0_101 findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13030/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13030/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13030/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Updating patch

          • Adding more test coverage
          • fixing some javadocs and checkstyles
          • fixing the failed test cases (the TestDefaultContainerExecutor failures don't seem to be related to this patch though)
          Show
          asuresh Arun Suresh added a comment - Updating patch Adding more test coverage fixing some javadocs and checkstyles fixing the failed test cases (the TestDefaultContainerExecutor failures don't seem to be related to this patch though)
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 6m 58s trunk passed
          +1 compile 0m 31s trunk passed
          +1 checkstyle 0m 23s trunk passed
          +1 mvnsite 0m 33s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 48s trunk passed
          +1 javadoc 0m 20s trunk passed
          +1 mvninstall 0m 28s the patch passed
          +1 compile 0m 28s the patch passed
          -1 javac 0m 28s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17)
          -1 checkstyle 0m 19s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 13 new + 277 unchanged - 1 fixed = 290 total (was 278)
          +1 mvnsite 0m 29s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 58s the patch passed
          +1 javadoc 0m 16s the patch passed
          -1 unit 15m 4s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          29m 16s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
            hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRegression
            hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor
            hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827297/YARN-5620.002.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 127ebb70bd89 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5f23abf
          Default Java 1.8.0_101
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13026/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13026/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 17s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. +1 mvninstall 6m 58s trunk passed +1 compile 0m 31s trunk passed +1 checkstyle 0m 23s trunk passed +1 mvnsite 0m 33s trunk passed +1 mvneclipse 0m 13s trunk passed +1 findbugs 0m 48s trunk passed +1 javadoc 0m 20s trunk passed +1 mvninstall 0m 28s the patch passed +1 compile 0m 28s the patch passed -1 javac 0m 28s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17) -1 checkstyle 0m 19s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 13 new + 277 unchanged - 1 fixed = 290 total (was 278) +1 mvnsite 0m 29s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 58s the patch passed +1 javadoc 0m 16s the patch passed -1 unit 15m 4s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 29m 16s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.containermanager.TestContainerManager   hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRegression   hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor   hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827297/YARN-5620.002.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 127ebb70bd89 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 5f23abf Default Java 1.8.0_101 findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13026/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13026/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13026/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Uploading updated patch:

          • Added support for explicit Rollback. If upgrade has not been committed.
          • Some minor code cleanup
          Show
          asuresh Arun Suresh added a comment - Uploading updated patch: Added support for explicit Rollback. If upgrade has not been committed. Some minor code cleanup
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 8m 31s trunk passed
          +1 compile 0m 30s trunk passed
          +1 checkstyle 0m 22s trunk passed
          +1 mvnsite 0m 31s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 0m 46s trunk passed
          +1 javadoc 0m 20s trunk passed
          +1 mvninstall 0m 26s the patch passed
          +1 compile 0m 29s the patch passed
          -1 javac 0m 29s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17)
          -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 10 new + 277 unchanged - 1 fixed = 287 total (was 278)
          +1 mvnsite 0m 30s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 58s the patch passed
          +1 javadoc 0m 16s the patch passed
          -1 unit 14m 5s hadoop-yarn-server-nodemanager in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          29m 43s



          Reason Tests
          Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor
            hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827269/YARN-5620.001.patch
          JIRA Issue YARN-5620
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux e11d22996f46 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5f23abf
          Default Java 1.8.0_101
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13024/testReport/
          modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/13024/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 19s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. +1 mvninstall 8m 31s trunk passed +1 compile 0m 30s trunk passed +1 checkstyle 0m 22s trunk passed +1 mvnsite 0m 31s trunk passed +1 mvneclipse 0m 15s trunk passed +1 findbugs 0m 46s trunk passed +1 javadoc 0m 20s trunk passed +1 mvninstall 0m 26s the patch passed +1 compile 0m 29s the patch passed -1 javac 0m 29s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 16 unchanged - 1 fixed = 17 total (was 17) -1 checkstyle 0m 18s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 10 new + 277 unchanged - 1 fixed = 287 total (was 278) +1 mvnsite 0m 30s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 58s the patch passed +1 javadoc 0m 16s the patch passed -1 unit 14m 5s hadoop-yarn-server-nodemanager in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 29m 43s Reason Tests Failed junit tests hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor   hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12827269/YARN-5620.001.patch JIRA Issue YARN-5620 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux e11d22996f46 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 5f23abf Default Java 1.8.0_101 findbugs v3.0.0 javac https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13024/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13024/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13024/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          asuresh Arun Suresh added a comment -

          Attaching initial patch based on some offline ideas from Jian He, Vinod Kumar Vavilapalli etc.

          I havn't included the API changes with this patch. I have just added upgradeContainer and commitUpgrade methods to the ContainerManagerImpl to test the end to end flow via test cases.

          The patch assumes the following:

          • The container is restarted only after ALL the required resources are localized.
          • If the relaunch of the container with the new bits fails, the Container will be rollback
          • Rollback involves reverting to the old launch Context and restarting.
          • It is upto the AM to call the commitUpgrade once the container has completed to ensure that if the Container fails after the upgrade, it is not rolled back. This is required, since if the container fails for some reason after the upgrade, there is no way to distinguish if it is because of the upgrade or for some other reason.
          Show
          asuresh Arun Suresh added a comment - Attaching initial patch based on some offline ideas from Jian He , Vinod Kumar Vavilapalli etc. I havn't included the API changes with this patch. I have just added upgradeContainer and commitUpgrade methods to the ContainerManagerImpl to test the end to end flow via test cases. The patch assumes the following: The container is restarted only after ALL the required resources are localized. If the relaunch of the container with the new bits fails, the Container will be rollback Rollback involves reverting to the old launch Context and restarting. It is upto the AM to call the commitUpgrade once the container has completed to ensure that if the Container fails after the upgrade, it is not rolled back. This is required, since if the container fails for some reason after the upgrade, there is no way to distinguish if it is because of the upgrade or for some other reason.

            People

            • Assignee:
              asuresh Arun Suresh
              Reporter:
              asuresh Arun Suresh
            • Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development