  Hadoop YARN / YARN-3476

Nodemanager can fail to delete local logs if log aggregation fails

    Details

    • Hadoop Flags: Reviewed

      Description

      If log aggregation encounters an error trying to upload the file, the underlying TFile can throw an IllegalStateException, which will bubble up to the top of the thread and prevent the application logs from being deleted.
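
      To illustrate the failure mode, here is a minimal, self-contained Java sketch (the class and method names are hypothetical stand-ins, not the actual NodeManager code): an unchecked IllegalStateException thrown from the writer's close() is not handled by the IOException catch, escapes to the top of the aggregation thread, and the local log deletion step is never reached.

        import java.io.IOException;

        // Minimal sketch of the failure mode (hypothetical names, not the real NM classes).
        public class LogAggregationLeakSketch {

          // Stand-in for TFile.Writer#close() failing mid key-value insertion.
          static void closeLogWriter() throws IOException {
            throw new IllegalStateException(
                "Cannot close TFile in the middle of key-value insertion.");
          }

          static void deleteLocalLogs() {
            System.out.println("deleting local application logs");
          }

          public static void main(String[] args) throws InterruptedException {
            Thread aggregator = new Thread(() -> {
              try {
                closeLogWriter();
                deleteLocalLogs();           // never reached, so the local logs leak
              } catch (IOException e) {
                System.out.println("IO error handled: " + e);
              }
            }, "LogAggregationService #25");

            // Plays the role of YarnUncaughtExceptionHandler in the stack trace quoted below.
            aggregator.setUncaughtExceptionHandler((t, e) ->
                System.out.println("Thread " + t + " threw an Exception: " + e));

            aggregator.start();
            aggregator.join();
          }
        }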

      1. 0001-YARN-3476.patch
        2 kB
        Rohith Sharma K S
      2. 0001-YARN-3476.patch
        2 kB
        Rohith Sharma K S
      3. 0002-YARN-3476.patch
        2 kB
        Rohith Sharma K S


          Activity

          jlowe Jason Lowe added a comment -

          Snippet from an NM log:

          2015-03-15 11:34:34,671 [LogAggregationService #25] ERROR logaggregation.AppLogAggregatorImpl: Couldn't upload logs for container_e03_1424994657328_776201_02_016386. Skipping this container.
          2015-03-15 11:34:34,672 [DeletionService #3] INFO nodemanager.LinuxContainerExecutor: Deleting absolute path : null
          2015-03-15 11:34:34,751 [LogAggregationService #25] WARN logaggregation.AppLogAggregatorImpl: Aggregation did not complete for application application_1424994657328_776201
          2015-03-15 11:34:34,751 [LogAggregationService #25] ERROR yarn.YarnUncaughtExceptionHandler: Thread Thread[LogAggregationService #25,5,main] threw an Exception.
          java.lang.IllegalStateException: Cannot close TFile in the middle of key-value insertion.
                  at org.apache.hadoop.io.file.tfile.TFile$Writer.close(TFile.java:310)
                  at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.close(AggregatedLogFormat.java:454)
                  at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:285)
                  at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:415)
                  at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:380)
                  at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:387)
                  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                  at java.lang.Thread.run(Thread.java:722)
          

          Because of the TFile error we fail to do post-aggregation cleanup such as deleting the application logs. At that point we leak the logs on the local disk.

          Note the "Deleting absolute path : null" log above is probably caused by this logic in AppLogAggregatorImpl:

                  if (uploadedFilePathsInThisCycle.size() > 0) {
                    uploadedLogsInThisCycle = true;
                  }
                  this.delService.delete(this.userUgi.getShortUserName(), null,
                    uploadedFilePathsInThisCycle
                      .toArray(new Path[uploadedFilePathsInThisCycle.size()]));
          

          We check whether there are any uploaded file paths, but then go ahead and try to delete them even when there are none.
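
          As a minimal sketch (hypothetical stand-ins for the AppLogAggregatorImpl and DeletionService types, not the actual NM code), the delete call could be moved inside the non-empty check so no deletion request is issued when nothing was uploaded in the cycle:

            import java.util.HashSet;
            import java.util.Set;

            // Minimal sketch: only issue the deletion request when something was uploaded
            // in this cycle, avoiding the "Deleting absolute path : null" request with an
            // empty path list.
            public class UploadCycleCleanupSketch {

              // Stand-in for DeletionService#delete(user, subDir, paths...).
              static void delete(String user, String subDir, String... paths) {
                System.out.println("delete requested for " + paths.length + " path(s)");
              }

              static boolean finishCycle(Set<String> uploadedFilePathsInThisCycle,
                  String user) {
                boolean uploadedLogsInThisCycle = false;
                if (!uploadedFilePathsInThisCycle.isEmpty()) {
                  uploadedLogsInThisCycle = true;
                  // Only ask the deletion service to delete when there is something to delete.
                  delete(user, null, uploadedFilePathsInThisCycle.toArray(new String[0]));
                }
                return uploadedLogsInThisCycle;
              }

              public static void main(String[] args) {
                System.out.println(finishCycle(new HashSet<>(), "user"));  // no delete call
                Set<String> one = new HashSet<>();
                one.add("/local/logs/container_01");
                System.out.println(finishCycle(one, "user"));              // delete issued
              }
            }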

          rohithsharma Rohith Sharma K S added a comment -

          In tests I have sometimes seen NM recovery throw an NPE because of the 'Deleting absolute path : null' issue.

          rohithsharma Rohith Sharma K S added a comment -

          As a user, I think that if the logs did not get aggregated, the user would expect them to remain on the local disk for reference. But in the above scenario, did log aggregation complete, and are the logs available in HDFS?

          jlowe Jason Lowe added a comment -

          As a user, I think that if the logs did not get aggregated, the user would expect them to remain on the local disk for reference.

          We could leave the logs on the local disk, but then we need some kind of retention logic to handle that case. If we don't have such logic then we risk eventually filling up the disks (which is what happened in this case on a number of nodes).

          But in the above scenario, did log aggregation complete, and are the logs available in HDFS?

          Not all of the application's logs were available in HDFS because it encountered an error (token-related) trying to upload the logs.

          rohithsharma Rohith Sharma K S added a comment -

          but then we need some kind of retention logic to handle that case

          Yes, I agree. For such cases we could use the aggregated log-retention configuration, i.e. 'yarn.log-aggregation.retain-seconds', for deleting from the local disk. By default, though, this logic is disabled. The same behavior could be kept for container logs that failed to aggregate as well. Would that be fine?

          jlowe Jason Lowe added a comment -

          For such cases we could use the aggregated log-retention configuration, i.e. 'yarn.log-aggregation.retain-seconds', for deleting from the local disk.

          No, that's not going to be OK. One is the lifetime on the aggregation filesystem, while what we're considering here is the lifetime on the local disk. I can hold a lot more logs in HDFS than I can on the local disk.

          rohithsharma Rohith Sharma K S added a comment -

          Not all of the application's logs were available in HDFS because it encountered an error (token-related) trying to upload the logs.

          Is this failure caused by the IllegalStateException?

          There are two options:

          1. Do post-aggregation cleanup by handling the IllegalStateException, OR
          2. Schedule a timer for those log directories which were not uploaded and have not been deleted.

          Thinking about it, is the IllegalStateException the reason the logs are not found in HDFS? If it is not, then I think the simpler way to handle this is the first approach.
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12724974/0001-YARN-3476.patch
          against trunk revision fddd552.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7346//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7346//console

          This message is automatically generated.

          sunilg Sunil G added a comment -

          Hi Jason Lowe and Rohith Sharma K S,

          Retention logic to handle this error may become more complex when multiple failures are seen during aggregation across applications. If this happens rarely, solid retention logic with a timer is helpful.

          More generally, given that more failures are possible, cleaning up after aggregation can save the disk. That seems acceptable: we already encountered an error, and there may not be real pressure to provide 100% complete logs when aggregation itself failed.

          rohithsharma Rohith Sharma K S added a comment -

          Thanks Sunil G for sharing your thoughts.
          If we go with retention logic or a timer, then considering NM recovery, that retention state should be stored in the state store. The NM would then need to support state-store updates in LogAggregationService, similar to NonAggregatedLogHandler.

          Jason Lowe, I attached a patch with a straightforward fix that handles the exception and does the post-aggregation cleanup. Kindly share your opinion on the two approaches, i.e. 1. handling the exception and 2. retention logic.

          jlowe Jason Lowe added a comment -

          I'm OK with deleting the logs upon error uploading. It should be a rare occurrence, and log availability is already a best-effort rather than guaranteed service. Even if we try to retain the logs it has questionable benefit in practice, as the history of a job always points to the aggregated logs, not the node's copy of the logs, and thus the logs will still be "lost" from the end-user's point of view. Savvy users may realize the logs could still be on the original node, but most won't know to check there or how to form the URL to find them. If we always point to the node then that defeats one of the features of log aggregation, since loss of the node will mean the node's URL is bad and we fail to show the logs even if they are aggregated.

          So for now I say we keep it simple and just clean up the files on errors to prevent leaks. Speaking of which, I took a look at the patch. It will fix the particular error we saw with TFiles, but there could easily be other non-IOExceptions that creep out of the code, especially as it is maintained over time. Would it be better to wrap the cleanup in a finally block or something a little more broadly applicable to errors that occur?
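
          For reference, a minimal sketch of the finally-based variant (simplified and hypothetical; the method names echo the later patch discussion, but this is not the actual AppLogAggregatorImpl): the cleanup runs whether aggregation succeeds or any exception escapes, so the local logs are never leaked.

            // Minimal sketch of running the post-aggregation cleanup from a finally block.
            public class FinallyCleanupSketch {

              static void doAppLogAggregation() {
                // Simulate the TFile failure seen in this issue.
                throw new IllegalStateException(
                    "Cannot close TFile in the middle of key-value insertion.");
              }

              static void doAppLogAggregationPostCleanUp() {
                System.out.println("deleting local application log directories");
              }

              static void runAggregation() {
                try {
                  doAppLogAggregation();
                } finally {
                  // Reached whether or not doAppLogAggregation() threw.
                  doAppLogAggregationPostCleanUp();
                }
              }

              public static void main(String[] args) {
                try {
                  runAggregation();
                } catch (IllegalStateException e) {
                  System.out.println("aggregation failed: " + e.getMessage());
                }
              }
            }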

          rohithsharma Rohith Sharma K S added a comment -

          Apologies for the delay in getting back to this issue.

          Would it be better to wrap the cleanup in a finally block or something a little more broadly applicable to errors that occur?

          Makes sense to me.

          Uploading a patch that handles the exception and does the post cleanup.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 35s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace.
          +1 javac 7m 31s There were no new javac warning messages.
          +1 javadoc 9m 33s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 7m 40s The applied patch generated 1 additional checkstyle issues.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 32s The patch built with eclipse:eclipse.
          +1 findbugs 1m 1s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 yarn tests 5m 58s Tests passed in hadoop-yarn-server-nodemanager.
              48m 51s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12729459/0001-YARN-3476.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / de9404f
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/7554/artifact/patchprocess/whitespace.txt
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/7554/artifact/patchprocess/checkstyle-result-diff.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7554/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7554/testReport/
          Java 1.7.0_55
          uname Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7554/console

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          This will silently eat any exception that does occur, but we're going to need at least an entry in the NM log to understand why a log wasn't aggregated as expected.

          --- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          +++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          @@ -417,6 +417,9 @@ public Object run() throws Exception {
             public void run() {
               try {
                 doAppLogAggregation();
          +    } catch (Exception e) {
          +      // do post clean up of log directories on any exception
          +      doAppLogAggregationPostCleanUp();
               } finally {
                 if (!this.appAggregationFinished.get()) {
                   LOG.warn("Aggregation did not complete for application " + appId);
          
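          A minimal sketch (hypothetical; System.err stands in for the NM's LOG, and the method names echo the diff above rather than the actual class) of what this review comment asks for: record the cause of the failure first, then still run the post-aggregation cleanup.

            // Minimal sketch: log why aggregation failed, then clean up the local logs,
            // so the failure is visible in the NM log instead of being silently swallowed.
            public class LoggedCleanupSketch {

              static void doAppLogAggregation() {
                throw new IllegalStateException(
                    "Cannot close TFile in the middle of key-value insertion.");
              }

              static void doAppLogAggregationPostCleanUp() {
                System.out.println("deleting local application log directories");
              }

              public static void main(String[] args) {
                String appId = "application_1424994657328_776201";
                try {
                  doAppLogAggregation();
                } catch (Exception e) {
                  // Record why aggregation failed, then still clean up the local logs.
                  System.err.println("Aggregation failed for " + appId + ": " + e);
                  doAppLogAggregationPostCleanUp();
                }
              }
            }
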
          rohithsharma Rohith Sharma K S added a comment -

          Updated the patch to add a LOG entry on exception. Kindly review the updated patch.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 40s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 tests included 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 javac 7m 36s There were no new javac warning messages.
          +1 javadoc 9m 39s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 38s There were no new checkstyle issues.
          -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 2s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          -1 yarn tests 5m 47s Tests failed in hadoop-yarn-server-nodemanager.
              41m 55s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12731375/0002-YARN-3476.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f4ebbc6
          whitespace https://builds.apache.org/job/PreCommit-YARN-Build/7810/artifact/patchprocess/whitespace.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/7810/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/7810/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/7810/console

          This message was automatically generated.

          jlowe Jason Lowe added a comment -

          +1 lgtm. Test failure is unrelated, and I'll fix whitespace nit on commit.

          jlowe Jason Lowe added a comment -

          Thanks, Rohith! I committed this to trunk, branch-2, and branch-2.7.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7780/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/922/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/)
          YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
          • hadoop-yarn-project/CHANGES.txt

            People

            • Assignee: rohithsharma Rohith Sharma K S
            • Reporter: jlowe Jason Lowe
            • Votes: 0
            • Watchers: 12
