  1. Hadoop YARN
  2. YARN-6872

Ensure apps could run given NodeLabels are disabled post RM switchover/restart

    Details

      Description

      After YARN-6031, some apps can fail during recovery if they had node label requirements for the AM and labels were disabled after an RM restart/switchover. As discussed in YARN-6031, it is better to let such apps run, since they may be long-running apps.

      1. YARN-6872.001.patch
        5 kB
        Sunil G
      2. YARN-6872.002.patch
        9 kB
        Sunil G
      3. YARN-6872.003.patch
        11 kB
        Sunil G
      4. YARN-6872-addendum.001.patch
        5 kB
        Sunil G

        Activity

        sunilg Sunil G added a comment -

        Attached an initial version of patch.

        cc/Jian He Wangda Tan Rohith Sharma K S Ying Zhang

        rohithsharma Rohith Sharma K S added a comment -

        Continuing on label validation failure is good if the AM is already launched. But what will happen if the app is in ACCEPTED state and is recovered?
        If the application is not killed/failed during recovery, then this needs to be handled in the RMAppAttemptImpl schedule transition, where the AM RR is sent to the scheduler.

        sunilg Sunil G added a comment -

        Thanks Rohith Sharma K S

        When a new container request comes in, it is validated in ApplicationMasterService#allocate and an InvalidResourceRequestException is thrown back to the AM. Hence the scheduler will not have a problem. At most, the AM container may never get scheduled and may remain in the scheduler queue until the admin corrects the problem with labels (enable/disable).
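
The allocate-time check described above can be sketched roughly as follows. This is only an illustration and not the ApplicationMasterService code: `LabelRequestCheck` and `validate` are hypothetical names, `IllegalArgumentException` stands in for YARN's InvalidResourceRequestException, and `NO_LABEL` is assumed to be the empty string.

```java
// Hypothetical stand-in types; this does not use the real YARN classes.
final class LabelRequestCheck {
  // Assumption: the "no label" expression is the empty string.
  static final String NO_LABEL = "";

  private LabelRequestCheck() {
  }

  /**
   * Rejects a container request that names a node label while node labels
   * are disabled. IllegalArgumentException stands in for YARN's
   * InvalidResourceRequestException.
   */
  static void validate(String labelExpression, boolean nodeLabelsEnabled) {
    boolean hasLabel =
        labelExpression != null && !labelExpression.equals(NO_LABEL);
    if (hasLabel && !nodeLabelsEnabled) {
      throw new IllegalArgumentException(
          "Node labels are disabled, but the request asks for label '"
              + labelExpression + "'");
    }
  }
}
```

Under this sketch, a request without a label expression passes even when labels are disabled, so only labeled requests are bounced back to the AM.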

        rohithsharma Rohith Sharma K S added a comment -

        Ahh I see. I submitted the patch for triggering Jenkins

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 18s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
              trunk Compile Tests
        +1 mvninstall 13m 40s trunk passed
        +1 compile 0m 34s trunk passed
        +1 checkstyle 0m 27s trunk passed
        +1 mvnsite 0m 36s trunk passed
        +1 findbugs 0m 59s trunk passed
        +1 javadoc 0m 23s trunk passed
              Patch Compile Tests
        +1 mvninstall 0m 33s the patch passed
        +1 compile 0m 31s the patch passed
        +1 javac 0m 31s the patch passed
        -0 checkstyle 0m 24s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 122 unchanged - 0 fixed = 123 total (was 122)
        +1 mvnsite 0m 34s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 13s the patch passed
        +1 javadoc 0m 21s the patch passed
              Other Tests
        -1 unit 43m 53s hadoop-yarn-server-resourcemanager in the patch failed.
        +1 asflicense 0m 18s The patch does not generate ASF License warnings.
        65m 58s



        Reason Tests
        Failed junit tests hadoop.yarn.server.resourcemanager.TestRMRestart



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:14b5c93
        JIRA Issue YARN-6872
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12878988/YARN-6872.001.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 4ada1cf81f0d 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 27a1a5f
        Default Java 1.8.0_131
        findbugs v3.1.0-RC1
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/16574/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/16574/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/16574/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/16574/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        jianhe Jian He added a comment -

        The isRecovery flag is already passed into SchedulerUtils#normalizeAndValidateRequest; I think we can use that flag directly?

        And can this block of code be removed now?

            // If null amReq has been returned, check if it is the case that
            // application has specified node label expression while node label
            // has been disabled. Reject the recovery of this application if it
            // is true and give clear message so that user can react properly.
            if (!appContext.getUnmanagedAM() &&
                (application.getAMResourceRequests() == null ||
                    application.getAMResourceRequests().isEmpty()) &&
                !YarnConfiguration.areNodeLabelsEnabled(this.conf)) {
              // check application submission context and see if am resource request
              // or application itself contains any node label expression.
              List<ResourceRequest> amReqsFromAppContext =
                  appContext.getAMContainerResourceRequests();
              String labelExp =
                  (amReqsFromAppContext != null && !amReqsFromAppContext.isEmpty()) ?
                  amReqsFromAppContext.get(0).getNodeLabelExpression() : null;
              if (labelExp == null) {
                labelExp = appContext.getNodeLabelExpression();
              }
              if (labelExp != null &&
                  !labelExp.equals(RMNodeLabelsManager.NO_LABEL)) {
                String message = "Application recovered " + appId
                    + ". NodeLabel is not enabled in cluster, but AM resource request "
                    + "contains a label expression. Consider for NO_LABEL.";
                LOG.warn(message);
              }
            }
        

        Did you verify that the labeled resource will be counted as non-labeled resource after RM restart with node label disabled?

        sunilg Sunil G added a comment -

        Thanks Jian He
        I did some manual tests, and I am seeing that the app stays in ACCEPTED state after the RM is restarted with node labels set to false. If I reset it back to true, the app completes.

        I will check whether we can improve this and show the app as RUNNING. Please share your thoughts meanwhile.

        sunilg Sunil G added a comment -

        NM work-preserving restart was off. Now I can see that resources are showing up correctly.

        However, I am seeing an issue with Cluster Metrics: the values are negative or wrong after RM restart. Even without the node-labels-disabled scenario, the metrics are wrong. I think it should be handled in another ticket, as the metrics calculation is wrong after running-app recovery and RM work-preserving restart (when labels are used).

        Please suggest whether we need to include the metrics issue here as well.
        cc/Wangda Tan and Jian He

        sunilg Sunil G added a comment -

        The isRecovery flag is already passed into SchedulerUtils#normalizeAndValidateRequest; I think we can use that flag directly?

        In normalizeAndValidateRequest, we check whether node labels are disabled before looking at the isRecovery flag. We could do some optimization here, but it is a public API. I will check all references.

        And can this block of code be removed now?

        This could be removed. Just a thought: that log is helpful, right? If we need that log, I guess some checks are needed there.

        jianhe Jian He added a comment -

        In normalizeAndValidateRequest, we check whether node labels are disabled before looking at the isRecovery flag. We could do some optimization here, but it is a public API. I will check all references.

        Basically, if we check isRecovery outside, then the isRecovery flag parameter is redundant. I was checking whether other methods called in normalizeAndValidateRequest, such as SchedulerUtils.normalizeNodeLabelExpressionInRequest, are needed for recovery.
        I think this is still required if node labels are enabled and this is a recovery?

        This could be removed. Just a thought: that log is helpful, right? If we need that log, I guess some checks are needed there.

        Just for logging, we could do it inside normalizeAndValidateRequest itself, which will be simpler.
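
One way to read the suggestion above is to keep the isRecovery decision and the warning log inside the validation method. The sketch below only illustrates that idea under stated assumptions; it is not the committed SchedulerUtils code. `RecoveryAwareValidation`, `Outcome`, and the parameter list are hypothetical, `IllegalArgumentException` stands in for YARN's exception type, and `System.err` stands in for the RM logger.

```java
// Hypothetical sketch; names and return type are illustrative, not YARN's.
final class RecoveryAwareValidation {
  // Assumption: the "no label" expression is the empty string.
  static final String NO_LABEL = "";

  enum Outcome { ACCEPTED, ACCEPTED_WITH_WARNING }

  private RecoveryAwareValidation() {
  }

  /**
   * Validates a label expression. A labeled request with node labels
   * disabled is rejected on a fresh submission, but only logged during
   * recovery so that the recovered application is not failed.
   */
  static Outcome normalizeAndValidate(String labelExpression,
      boolean nodeLabelsEnabled, boolean isRecovery) {
    boolean hasLabel =
        labelExpression != null && !labelExpression.equals(NO_LABEL);
    if (!hasLabel || nodeLabelsEnabled) {
      return Outcome.ACCEPTED;
    }
    if (!isRecovery) {
      throw new IllegalArgumentException(
          "Node labels are disabled, but the request asks for label '"
              + labelExpression + "'");
    }
    // Recovery path: warn and continue instead of failing the application.
    System.err.println("WARN: node labels are disabled, but the recovered "
        + "request contains label '" + labelExpression + "'");
    return Outcome.ACCEPTED_WITH_WARNING;
  }
}
```

Keeping both the check and the log in one method means callers such as the recovery path do not need their own label-disabled branch.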

        sunilg Sunil G added a comment -

        Attaching new patch addressing Jian's comments

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 14s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
              trunk Compile Tests
        +1 mvninstall 13m 12s trunk passed
        +1 compile 0m 33s trunk passed
        +1 checkstyle 0m 24s trunk passed
        +1 mvnsite 0m 36s trunk passed
        +1 findbugs 1m 0s trunk passed
        +1 javadoc 0m 20s trunk passed
              Patch Compile Tests
        +1 mvninstall 0m 32s the patch passed
        +1 compile 0m 32s the patch passed
        +1 javac 0m 32s the patch passed
        +1 checkstyle 0m 22s the patch passed
        +1 mvnsite 0m 32s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 7s the patch passed
        +1 javadoc 0m 18s the patch passed
              Other Tests
        -1 unit 44m 15s hadoop-yarn-server-resourcemanager in the patch failed.
        +1 asflicense 0m 13s The patch does not generate ASF License warnings.
        65m 25s



        Reason Tests
        Failed junit tests hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:14b5c93
        JIRA Issue YARN-6872
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12879794/YARN-6872.002.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux cc007828ceaf 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / b38a1ee
        Default Java 1.8.0_131
        findbugs v3.1.0-RC1
        unit https://builds.apache.org/job/PreCommit-YARN-Build/16639/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/16639/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/16639/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        sunilg Sunil G added a comment -

        During recovery of containers from the node manager, if the recovered container has a label and node labels are disabled in the cluster, we can treat that container as belonging to the default label. This helps to handle the metrics issue correctly.

        cc/Jian He Wangda Tan Rohith Sharma K S
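
The fallback described in this comment can be pictured with a small helper. This is a sketch under assumptions, not the AbstractYarnScheduler change itself: `RecoveredContainerLabel` and `effectiveLabel` are hypothetical names, and the default label is assumed to be the empty string.

```java
// Hypothetical helper; not part of the YARN code base.
final class RecoveredContainerLabel {
  // Assumption: the default ("no label") partition is the empty string.
  static final String NO_LABEL = "";

  private RecoveredContainerLabel() {
  }

  /**
   * Returns the label that a recovered container should be accounted
   * under. When node labels are disabled, any recorded label is ignored
   * and the container is counted against the default label.
   */
  static String effectiveLabel(String recoveredLabel,
      boolean nodeLabelsEnabled) {
    if (!nodeLabelsEnabled || recoveredLabel == null) {
      return NO_LABEL;
    }
    return recoveredLabel;
  }
}
```

Accounting the container under the default label keeps usage and capacity on the same partition, which is what the metrics depend on.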

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 16s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
              trunk Compile Tests
        +1 mvninstall 13m 29s trunk passed
        +1 compile 0m 33s trunk passed
        +1 checkstyle 0m 27s trunk passed
        +1 mvnsite 0m 35s trunk passed
        +1 findbugs 0m 59s trunk passed
        +1 javadoc 0m 21s trunk passed
              Patch Compile Tests
        +1 mvninstall 0m 32s the patch passed
        +1 compile 0m 31s the patch passed
        +1 javac 0m 31s the patch passed
        +1 checkstyle 0m 24s the patch passed
        +1 mvnsite 0m 33s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 6s the patch passed
        +1 javadoc 0m 18s the patch passed
              Other Tests
        -1 unit 46m 7s hadoop-yarn-server-resourcemanager in the patch failed.
        +1 asflicense 0m 14s The patch does not generate ASF License warnings.
        67m 40s



        Reason Tests
        Failed junit tests hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation
          hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
        Timed out junit tests org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:14b5c93
        JIRA Issue YARN-6872
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12879808/YARN-6872.003.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 1163462ee16e 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / b38a1ee
        Default Java 1.8.0_131
        findbugs v3.1.0-RC1
        unit https://builds.apache.org/job/PreCommit-YARN-Build/16640/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/16640/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/16640/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        bibinchundatt Bibin A Chundatt added a comment -

        During recovery of containers from the node manager, if the recovered container has a label and node labels are disabled in the cluster, we can treat that container as belonging to the default label. This helps to handle the metrics issue correctly.

        I don't see any issue with this change.

        sunilg Sunil G added a comment -

        Test case failures are known.

        hudson Hudson added a comment -

        ABORTED: Integrated in Jenkins build Hadoop-trunk-Commit #12090 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12090/)
        YARN-6872. Ensure apps could run given NodeLabels are disabled post RM (jianhe: rev 91f120f743662c6e037e8f21b1792e81d58ac664)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
        sunilg Sunil G added a comment -

        When we use non-exclusive node labels, we could hit the same issue. Uploading an addendum patch to cover that scenario as well.
        Thanks Wangda Tan and Jian He

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 20s Docker mode activated.
              Prechecks
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
              trunk Compile Tests
        +1 mvninstall 14m 42s trunk passed
        +1 compile 0m 33s trunk passed
        +1 checkstyle 0m 26s trunk passed
        +1 mvnsite 0m 35s trunk passed
        +1 findbugs 1m 0s trunk passed
        +1 javadoc 0m 21s trunk passed
              Patch Compile Tests
        +1 mvninstall 0m 33s the patch passed
        +1 compile 0m 31s the patch passed
        +1 javac 0m 31s the patch passed
        -0 checkstyle 0m 23s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 69 unchanged - 0 fixed = 70 total (was 69)
        +1 mvnsite 0m 34s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 11s the patch passed
        +1 javadoc 0m 22s the patch passed
              Other Tests
        -1 unit 43m 57s hadoop-yarn-server-resourcemanager in the patch failed.
        +1 asflicense 0m 14s The patch does not generate ASF License warnings.
        67m 1s



        Reason Tests
        Failed junit tests hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
          hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:14b5c93
        JIRA Issue YARN-6872
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12879882/YARN-6872-addendum.001.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 7f79a5e72546 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 91f120f
        Default Java 1.8.0_131
        findbugs v3.1.0-RC1
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/16648/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/16648/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/16648/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/16648/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        jianhe Jian He added a comment -

        Committed to trunk, branch-2, and branch-2.8. Thanks, Sunil!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12101 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12101/)
        YARN-6872. [Addendum patch] Ensure apps could run given NodeLabels are (jianhe: rev f9139ac8f60184a82a8bb315237bea04bdb98ec8)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
        sunilg Sunil G added a comment -

        Thank you very much, Jian He and Wangda Tan, for the review and commit!

        djp Junping Du added a comment -

        I have backported the commit to branch-2.8.2.


          People

          • Assignee:
            sunilg Sunil G
            Reporter:
            sunilg Sunil G
          • Votes:
            0
            Watchers:
            7
