Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None

      Description

      Expose a client API that lets clients find out whether log aggregation is complete.

      1. YARN-1279.1.patch
        87 kB
        Xuan Gong
      2. YARN-1279.11.patch
        89 kB
        Xuan Gong
      3. YARN-1279.2.patch
        86 kB
        Xuan Gong
      4. YARN-1279.2.patch
        86 kB
        Xuan Gong
      5. YARN-1279.3.patch
        87 kB
        Xuan Gong
      6. YARN-1279.3.patch
        87 kB
        Xuan Gong
      7. YARN-1279.4.patch
        88 kB
        Xuan Gong
      8. YARN-1279.4.patch
        88 kB
        Xuan Gong
      9. YARN-1279.5.patch
        88 kB
        Xuan Gong
      10. YARN-1279.6.patch
        93 kB
        Xuan Gong
      11. YARN-1279.7.patch
        93 kB
        Xuan Gong
      12. YARN-1279.8.patch
        98 kB
        Xuan Gong
      13. YARN-1279.8.patch
        98 kB
        Xuan Gong
      14. YARN-1279.9.patch
        98 kB
        Xuan Gong

          Activity

          xgong Xuan Gong added a comment -

          The basic idea is that the NM notifies the RM about the log aggregation status of all its containers through the node heartbeat. When the RMNode gets the log aggregation status from the nodeUpdateEvent, it forwards the status to the related RMApp. After that, the client can get the log aggregation status by calling the related API.
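The flow described above can be sketched roughly as follows. This is a minimal illustration, not code from the patch: `LogState`, `ContainerLogStatus`, `AppLogTracker`, and `HeartbeatRouter` are hypothetical names standing in for the heartbeat payload, the RMApp, and the RM-side routing, and the "complete" rule (every reported container is COMPLETED) is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; these are not the classes from the patch.
enum LogState { NOT_STARTED, IN_PROGRESS, COMPLETED, FAILED }

// One per-container entry that an NM would attach to its heartbeat.
final class ContainerLogStatus {
    final String appId;
    final String containerId;
    final LogState state;

    ContainerLogStatus(String appId, String containerId, LogState state) {
        this.appId = appId;
        this.containerId = containerId;
        this.state = state;
    }
}

// Stand-in for the RMApp: keeps the latest state reported for each container.
final class AppLogTracker {
    private final Map<String, LogState> byContainer = new HashMap<>();

    synchronized void update(ContainerLogStatus s) {
        byContainer.put(s.containerId, s.state);
    }

    // Assumed rule: complete only when every container seen so far
    // has reported COMPLETED.
    synchronized boolean isLogAggregationComplete() {
        return !byContainer.isEmpty()
            && byContainer.values().stream().allMatch(st -> st == LogState.COMPLETED);
    }
}

// Stand-in for the RM side of the heartbeat: routes each status to its app.
final class HeartbeatRouter {
    private final Map<String, AppLogTracker> apps = new HashMap<>();

    synchronized void onNodeHeartbeat(List<ContainerLogStatus> statuses) {
        for (ContainerLogStatus s : statuses) {
            apps.computeIfAbsent(s.appId, id -> new AppLogTracker()).update(s);
        }
    }

    // The kind of question a client-facing API would answer.
    synchronized boolean isLogAggregationComplete(String appId) {
        AppLogTracker tracker = apps.get(appId);
        return tracker != null && tracker.isLogAggregationComplete();
    }
}
```

A client-facing call would then only need the application ID; everything else is fed by the heartbeats.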

          xgong Xuan Gong added a comment -

          Will split the work into two parts. This ticket tracks the work on the RM side. It will include all the changes after RMNode receives the STATUS_UPDATE event, the changes to NodeStatus, and the related PB changes.

          xgong Xuan Gong added a comment -

          Created YARN-1376 to track the changes on the NM side.

          xgong Xuan Gong added a comment -

          The patch includes all the changes on the RM side and the client side, including all the PB changes. Added a unit test to cover the functionality.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611502/YARN-1279.1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2338//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          Updated the patch based on the latest trunk.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611529/YARN-1279.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2341//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611542/YARN-1279.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings.

          -1 core tests. The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2343//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2343//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2343//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-common.html
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2343//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          Fixed the -1 on findbugs and the -1 on release audit.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611631/YARN-1279.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2348//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          Kick off the Jenkins again

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611642/YARN-1279.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapred.TestJobCleanup

          The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2349//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2349//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2349//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          Fixed the -1 on release audit.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611676/YARN-1279.4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2355//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12611865/YARN-1279.4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapred.TestJobCleanup

          The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2358//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2358//console

          This message is automatically generated.

          jianhe Jian He added a comment -
          • LogAggregationState: DISABLE -> DISABLED, NOT_START -> NOT_STARTED
          • Log Aggregation is NM side config, this is getting from RM itself.
                  if (!conf.getBoolean(YarnConfiguration.LOG_AGGREGATION_ENABLED,
                      YarnConfiguration.DEFAULT_LOG_AGGREGATION_ENABLED)) {
                    return LogAggregationState.DISABLE;
                  }
            
          • LogAggregationStatus may come via heartbeat before FinalTransition is called, inside which containerLogAggregationStatus is initialized with the containers. In this case, the log status is lost.
                  public void updateLogAggregationStatus(ContainerLogAggregationStatus status) {
                    this.writeLock.lock();
                    try {
                      if (containerLogAggregationStatus.containsKey(status.getContainerId())) {
                        LogAggregationState currentState =
                            containerLogAggregationStatus.get(status.getContainerId());
                        if (currentState != LogAggregationState.COMPLETED
                            && currentState != LogAggregationState.FAILED) {
                          if (status.getLogAggregationState() == LogAggregationState.COMPLETED) {
                            LogAggregationCompleted.getAndAdd(1);
                          } else if (status.getLogAggregationState() == LogAggregationState.FAILED) {
                            LogAggregationFailed.getAndAdd(1);
                          }
                          containerLogAggregationStatus.put(status.getContainerId(),
                              status.getLogAggregationState());
                        }
                      }
                    } finally {
                      this.writeLock.unlock();
                    }
                  }
            
          xgong Xuan Gong added a comment -

          LogAggregationState: DISABLE -> DISABLED, NOT_START -> NOT_STARTED

          Changed

          Log Aggregation is NM side config, this is getting from RM itself.

          Yes, you are right. Removed. Will rely on the containerLogAggregationState.

          LogAggregationStatus may come via heartbeat before FinalTransition is called, inside which containerLogAggregationStatus is initialized with the containers. In this case, the log status is lost.

          Removed the initialization in FinalTransition. FinalTransition now only obtains the number of finished containers.
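One way to avoid losing an early heartbeat is to accept a status for a container the map has not seen yet, while still never overwriting a terminal state. The sketch below only illustrates that idea and is not the patch code: `AggState` and `ContainerLogStatusMap` are hypothetical names, and plain `synchronized` methods stand in for the write lock used in the snippet under review.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not taken from the patch.
enum AggState { NOT_STARTED, IN_PROGRESS, COMPLETED, FAILED }

final class ContainerLogStatusMap {
    private final Map<String, AggState> states = new HashMap<>();
    private int completed;
    private int failed;

    // Accepts a report for any container, including one never seen before,
    // so a heartbeat that arrives before the app finishes is not dropped.
    synchronized void update(String containerId, AggState reported) {
        AggState current = states.get(containerId);
        if (current == AggState.COMPLETED || current == AggState.FAILED) {
            return; // terminal states are never overwritten
        }
        if (reported == AggState.COMPLETED) {
            completed++;
        } else if (reported == AggState.FAILED) {
            failed++;
        }
        states.put(containerId, reported);
    }

    synchronized int completedCount() { return completed; }

    synchronized int failedCount() { return failed; }

    // Returns null when no report has arrived for this container.
    synchronized AggState stateOf(String containerId) {
        return states.get(containerId);
    }
}
```

With this shape, the final transition only has to compare the counters against the number of finished containers; it does not need to pre-populate the map.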

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612118/YARN-1279.5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2370//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          The RMNode will receive one applicationLogStatus per application (which includes applicationId, applicationLogAggregationState, and Map<ContainerId, LogAggregationState>) through the NM heartbeat.

          Also added a TIME_OUT state to LogAggregationState. If some NMs are shut down, it is very possible that we will miss the applicationLogStatus from those NMs. Instead of continuing to show the IN_PROGRESS state, we can report TIME_OUT based on how long we have waited.
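The per-application report and the timeout rule could look roughly like this. It is a sketch under assumptions, not the patch: `AppLogState`, `AppLogReport`, and `TimeoutPolicy` are hypothetical names, container IDs are plain strings, and the rule (switch from IN_PROGRESS to TIME_OUT once no report has arrived for longer than a configured interval) is one possible reading of "based on how long we wait".

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the names and the timeout rule are assumptions.
enum AppLogState { NOT_STARTED, IN_PROGRESS, COMPLETED, FAILED, TIME_OUT }

// One report per application per heartbeat: the application ID, its overall
// state, and the state of each of its containers on the reporting node.
final class AppLogReport {
    final String applicationId;
    final AppLogState applicationState;
    final Map<String, AppLogState> containerStates;

    AppLogReport(String applicationId, AppLogState applicationState,
                 Map<String, AppLogState> containerStates) {
        this.applicationId = applicationId;
        this.applicationState = applicationState;
        this.containerStates =
            Collections.unmodifiableMap(new HashMap<>(containerStates));
    }
}

final class TimeoutPolicy {
    private final long timeoutMillis;

    TimeoutPolicy(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Reports TIME_OUT instead of IN_PROGRESS once the last report is older
    // than the timeout; every other state passes through unchanged.
    AppLogState effectiveState(AppLogReport last, long lastReportMillis, long nowMillis) {
        boolean stale = nowMillis - lastReportMillis > timeoutMillis;
        if (last.applicationState == AppLogState.IN_PROGRESS && stale) {
            return AppLogState.TIME_OUT;
        }
        return last.applicationState;
    }
}
```

Passing the current time in as a parameter keeps the rule easy to unit-test without waiting on a real clock.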

          xgong Xuan Gong added a comment -

          Uploaded an initial patch to YARN-1376 for the NM-side changes. You may want to take a look at it to get the whole picture.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612422/YARN-1279.6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.yarn.client.cli.TestYarnCLI

          The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2384//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2384//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          Fix -1 on javadoc warning
          Fix TestYarnCLI test case failure

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612450/YARN-1279.7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapred.TestJobCleanup

          The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM
          org.apache.hadoop.mapred.TestMultiFileInputFormat

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2386//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2386//console

          This message is automatically generated.

          jianhe Jian He added a comment -
          • Does it make sense to send an event to RMApp to process the app log status, instead of explicitly creating an update api of RMApp?
          • Why is it possible for a single node to first report log aggregation as succeeded and then as failed?
            currentState == LogAggregationState.COMPLETED
                && status.getLogAggregationState() == LogAggregationState.FAILED
            
          • I think we can have separate maps, one for nodes whose aggregation succeeded and one for nodes whose aggregation failed. Then we don't need the two extra counters for succeeded and failed nodes, or the increment/decrement logic that goes with them.
          • It would be good to append the failed node's log aggregation info, along with the diagnostics that come with ApplicationLogStatus, to the diagnostics of the app.
          • Do we need a separate Timeout state? Would it be enough to append the timeout diagnostics and return the state as Failed?
          • This code can be simplified to: if the timeout period is exceeded, return FAILED or Timeout; otherwise return In_Progress. We can then remove the logAggregationTimeOutDisabled boolean.
            if (this.logAggregationTimeOutDisabled) {
              return LogAggregationState.IN_PROGRESS;
            } else {
              if (System.currentTimeMillis() - this.finishTime <=
                  this.logAggregationTimeOut) {
                return LogAggregationState.IN_PROGRESS;
              }
              return LogAggregationState.TIME_OUT;
            }
            
          • containerLogAggregationFail doesn't need to be an AtomicBoolean.
          • ApplicationLogStatus would be better named ApplicationLogAggregationStatus.
          • Vinod Kumar Vavilapalli: For the time being, do we want to keep both the application-level log status and the per-container log status in ApplicationLogStatus.java, which is sent from the NM to the RM?
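The simplification suggested in the timeout bullet above could look roughly like the sketch below. It is a standalone illustration, not code from the patch: the class name, the method name, and passing the clock value in as a parameter are assumptions made so the snippet runs on its own, and it picks TIME_OUT rather than FAILED for the expired case.

```java
/** Hypothetical, self-contained stand-in for the RM-side timeout check. */
final class LogAggregationTimeoutSketch {

    /** Only the two states this check can return; names follow the thread. */
    enum LogAggregationState { IN_PROGRESS, TIME_OUT }

    /**
     * Returns IN_PROGRESS while no more than timeoutMs have passed since the
     * application finished, and TIME_OUT afterwards. There is no separate
     * "timeout disabled" flag.
     */
    static LogAggregationState stateWhileWaiting(long nowMs, long finishTimeMs, long timeoutMs) {
        if (nowMs - finishTimeMs <= timeoutMs) {
            return LogAggregationState.IN_PROGRESS;
        }
        return LogAggregationState.TIME_OUT;
    }

    public static void main(String[] args) {
        long finishTimeMs = 1_000L;
        long timeoutMs = 500L;
        // 200 ms after the app finished: still within the timeout.
        System.out.println(stateWhileWaiting(1_200L, finishTimeMs, timeoutMs));
        // 600 ms after the app finished: the timeout has been exceeded.
        System.out.println(stateWhileWaiting(1_600L, finishTimeMs, timeoutMs));
    }
}
```

Taking the current time as an argument instead of calling System.currentTimeMillis() inside the method is only there to make the check easy to test.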
          xgong Xuan Gong added a comment -

          Does it make sense to send an event to RMApp to process the app log status, instead of explicitly creating an update api of RMApp?

          We can do that. Added a new RMAppEvent (RMAppLogAggregationStatusUpdateEvent) in the FINISHED and KILLED states.

          Why is it possible for a single node to first report log aggregation as succeeded and then as failed?

          I spent some more time thinking about this question and made some changes on the NM side. The ApplicationLogAggregationStatus is now set in the NMContext when the applicationImpl receives the APPLICATION_LOG_HANDLING_FINISHED event. That way, we can make sure that the NM only sends the logAggregationStatus out once the RMApp is finished/killed. This also ensures that we will not receive two different logAggregationStatus values from the same NM.

          I think we can have separate maps, one for nodes whose aggregation succeeded and one for nodes whose aggregation failed. Then we don't need the two extra counters for succeeded and failed nodes, or the increment/decrement logic that goes with them.

          Makes sense. Changed.

          It would be good to append the failed node's log aggregation info, along with the diagnostics that come with ApplicationLogStatus, to the diagnostics of the app.

          Added

          Do we need a separate Timeout state? Would it be enough to append the timeout diagnostics and return the state as Failed? This code can be simplified to: if the timeout period is exceeded, return FAILED or Timeout; otherwise return In_Progress. We can then remove the logAggregationTimeOutDisabled boolean.

          About this, I still prefer to keep a separate status such as TIME_OUT, because saying that log aggregation timed out is different from saying that it failed. As I understand it, log aggregation itself should be quick (on the NM side), but notifying the RMApp will definitely add a lot of delay. So TIME_OUT means "we have been waiting for a long time and do not know the current state right now". That may be better than simply saying that log aggregation failed.

          ApplicationLogStatus would be better named ApplicationLogAggregationStatus.

          Renamed

          containerLogAggregationFail doesn't need to be an AtomicBoolean.

          changed
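The two-map bookkeeping agreed on above can be sketched as follows. This is an illustration only and not the code in the patch: the class and method names are made up, node IDs are plain strings, and the rules "the app is FAILED if any node failed once every expected node has reported" and "a second report from the same node is ignored" are assumptions chosen to match the discussion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical RM-side bookkeeping: one map per outcome instead of counters. */
final class NodeLogAggregationTracker {

    enum LogAggregationState { IN_PROGRESS, COMPLETED, FAILED }

    private final int expectedNodes;
    // nodeId -> diagnostics; insertion order is kept so diagnostics are stable.
    private final Map<String, String> succeededNodes = new LinkedHashMap<>();
    private final Map<String, String> failedNodes = new LinkedHashMap<>();

    NodeLogAggregationTracker(int expectedNodes) {
        this.expectedNodes = expectedNodes;
    }

    /** Records a node's final report; a later report from the same node is ignored. */
    void report(String nodeId, boolean succeeded, String diagnostics) {
        if (succeededNodes.containsKey(nodeId) || failedNodes.containsKey(nodeId)) {
            return;
        }
        (succeeded ? succeededNodes : failedNodes).put(nodeId, diagnostics);
    }

    /** Derives the app-level state from the map sizes; no counters to keep in sync. */
    LogAggregationState state() {
        if (succeededNodes.size() + failedNodes.size() < expectedNodes) {
            return LogAggregationState.IN_PROGRESS;
        }
        return failedNodes.isEmpty()
            ? LogAggregationState.COMPLETED
            : LogAggregationState.FAILED;
    }

    /** Joins the failed nodes and their diagnostics for the app diagnostics. */
    String failureDiagnostics() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : failedNodes.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(e.getKey()).append(": ").append(e.getValue());
        }
        return sb.toString();
    }
}
```

Because the state is computed from the maps on demand, there is nothing to increment or decrement when a report arrives.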

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612740/YARN-1279.8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2393//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612745/YARN-1279.8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2394//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          This code can be simplified to: if the timeout period is exceeded, return FAILED or Timeout; otherwise return In_Progress. We can then remove the logAggregationTimeOutDisabled boolean.

          Makes sense. Deleted the logAggregationTimeOutDisabled boolean. If a client sets the logAggregationTimeOut value to a negative number, the default value is used instead.
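A minimal sketch of that fallback, assuming a hypothetical default of 600 seconds (the thread does not state the real default, and the class and method names here are made up for illustration):

```java
/** Hypothetical helper showing the "negative means use the default" rule. */
final class LogAggregationTimeoutConfig {

    // Assumed placeholder value; the actual default is not given in this thread.
    static final long DEFAULT_TIMEOUT_SECONDS = 600L;

    /** Returns the configured timeout, or the default if it is negative. */
    static long effectiveTimeoutSeconds(long configuredSeconds) {
        return configuredSeconds < 0 ? DEFAULT_TIMEOUT_SECONDS : configuredSeconds;
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutSeconds(-1L));  // falls back to the default
        System.out.println(effectiveTimeoutSeconds(30L));  // keeps the configured value
    }
}
```

With this rule there is no way to disable the timeout through configuration, which is what allows the logAggregationTimeOutDisabled boolean to go away.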

          jianhe Jian He added a comment -
          • updateLogAggregationStatus doesn't need write-lock protection; the state machine already has write-lock protection.
          • LOG_AGGREGATION_WATTING_MS: the unit can be seconds instead of milliseconds, e.g. LOG_AGGREGATION_WAIT_SECONDS.
          • Rename LogAggregationState.COMPLETED to FINISHED?
          • Why doesn't the RMApp Failed state receive the LOG_AGGREGATION_STATUS_UPDATE event?
          • The wrong configuration value is used:
                  this.logAggregationTimeOut =
                      YarnConfiguration.DEFAULT_LOG_AGGREGATION_RETAIN_CHECK_INTERVAL_SECONDS;
            
          • I think better unit tests would use MockRM to submit a job and finish it. Use the MockNM.nodeHeartBeat() method, customizing the NodeStatus inside it with an ApplicationLogAggregationStatus, and call that method to interact with the RM. Also use ClientRMService.getApplicationReport to assert the expected logAggregationState.
            This way, we cover the whole picture, including the NM-side and client-side changes. You can see an example in TestRMRestart.testRMRestartSucceededApp.
            Given that YARN-1376 is not that big, we can also incorporate it into this patch.
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612890/YARN-1279.9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2397//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2397//console

          This message is automatically generated.

          xgong Xuan Gong added a comment -

          updateLogAggregationStatus doesn't need write-lock protection; the state machine already has write-lock protection.

          Removed

          LOG_AGGREGATION_WATTING_MS: the unit can be seconds instead of milliseconds, e.g. LOG_AGGREGATION_WAIT_SECONDS.

          Changed

          Rename LogAggregationState.COMPLETED to FINISHED?

          Changed

          Why doesn't the RMApp Failed state receive the LOG_AGGREGATION_STATUS_UPDATE event?

          Added

          The wrong configuration value is used.

          Fixed

          I think better unit tests would use MockRM to submit a job and finish it. Use the MockNM.nodeHeartBeat() method, customizing the NodeStatus inside it with an ApplicationLogAggregationStatus, and call that method to interact with the RM. Also use ClientRMService.getApplicationReport to assert the expected logAggregationState.

          This way, we cover the whole picture, including the NM-side and client-side changes. You can see an example in TestRMRestart.testRMRestartSucceededApp.

          changed
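As a rough picture of the event handling discussed above, the sketch below accepts a LOG_AGGREGATION_STATUS_UPDATE event only when the app is in the FINISHED, KILLED or FAILED state. It is a simplified stand-in and not the RMAppImpl state machine: the class, the RUNNING state used for contrast, and the counter are all hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical, simplified stand-in for an app that handles status-update events. */
final class AppLogStatusEventSketch {

    enum AppState { RUNNING, FINISHED, KILLED, FAILED }

    enum EventType { LOG_AGGREGATION_STATUS_UPDATE }

    // Terminal states that accept the update event, per the discussion above.
    private static final Set<AppState> ACCEPTING_STATES =
        EnumSet.of(AppState.FINISHED, AppState.KILLED, AppState.FAILED);

    private final AppState state;
    private int updatesApplied;

    AppLogStatusEventSketch(AppState state) {
        this.state = state;
    }

    /** Applies the event if the current state accepts it; returns whether it was applied. */
    boolean handle(EventType type) {
        if (type == EventType.LOG_AGGREGATION_STATUS_UPDATE
                && ACCEPTING_STATES.contains(state)) {
            updatesApplied++;
            return true;
        }
        return false;
    }

    int updatesApplied() {
        return updatesApplied;
    }
}
```

Routing the update through an event like this, rather than a dedicated update method, is what lets the existing state-machine locking cover it.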

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12613076/YARN-1279.11.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 10 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.mapreduce.v2.TestUberAM

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2407//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2407//console

          This message is automatically generated.

          acmurthy Arun C Murthy added a comment -

          Cleaning up stale PA patches.

          xgong Xuan Gong added a comment -

          Closing this ticket since YARN-1376 and YARN-1402 have been fixed.


            People

            • Assignee:
              xgong Xuan Gong
              Reporter:
              acmurthy Arun C Murthy
            • Votes:
              0
              Watchers:
              10
