
rmadmin -transitionToActive should check the state of non-target node

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels:
      None
    • Target Version/s:

      Description

      Users can make both ResourceManagers active with rmadmin -transitionToActive even when the --forceactive option is not given. HDFS's haadmin -transitionToActive checks whether non-target nodes are already active, but rmadmin -transitionToActive does not.

      1. YARN-3860.001.patch
        2 kB
        Masatake Iwasaki
      2. YARN-3860.002.patch
        2 kB
        Masatake Iwasaki
      3. YARN-3860.003.patch
        2 kB
        Masatake Iwasaki

        Activity

        iwasakims Masatake Iwasaki added a comment -

        HAAdmin#isOtherTargetNodeActive does not check whether the other nodes are active unless HAAdmin#getTargetIds is overridden, because the default getTargetIds returns a list containing only the given target id. RMAdminCLI should have a getTargetIds method that returns the list of all node ids, as DFSHAAdmin does.
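A minimal sketch of the override pattern described in this comment, written as a self-contained model rather than the actual Hadoop code. The class names SketchAdmin and AllNodesSketchAdmin, the in-memory state map, and the simplified method signatures are all assumptions made for illustration; only the method names getTargetIds and isOtherTargetNodeActive come from the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo entry point; not part of Hadoop.
class TargetIdsDemo {
    public static void main(String[] args) {
        Map<String, Boolean> states = new LinkedHashMap<>();
        states.put("rm1", false); // transition target, currently standby
        states.put("rm2", true);  // the other node is already active

        // Default getTargetIds: the active peer is never looked at.
        System.out.println(new SketchAdmin(states).isOtherTargetNodeActive("rm1"));         // prints false
        // Overridden getTargetIds: the active peer is detected.
        System.out.println(new AllNodesSketchAdmin(states).isOtherTargetNodeActive("rm1")); // prints true
    }
}

// Hypothetical stand-in for a generic HA admin tool (assumption, not HAAdmin itself).
class SketchAdmin {
    // Simulated HA state per node id; true means "active".
    protected final Map<String, Boolean> activeById;

    SketchAdmin(Map<String, Boolean> activeById) {
        this.activeById = activeById;
    }

    // Default behaviour as described in the comment: only the given target id is returned.
    protected Collection<String> getTargetIds(String target) {
        return new ArrayList<>(Arrays.asList(target));
    }

    // Returns true if any node other than the target is already active.
    boolean isOtherTargetNodeActive(String target) {
        Collection<String> ids = new ArrayList<>(getTargetIds(target));
        ids.remove(target); // with the default getTargetIds this leaves nothing to check
        for (String id : ids) {
            if (Boolean.TRUE.equals(activeById.get(id))) {
                return true;
            }
        }
        return false;
    }
}

// Hypothetical subclass that returns every configured node id, as the comment proposes for RMAdminCLI.
class AllNodesSketchAdmin extends SketchAdmin {
    AllNodesSketchAdmin(Map<String, Boolean> activeById) {
        super(activeById);
    }

    @Override
    protected Collection<String> getTargetIds(String target) {
        return new ArrayList<>(activeById.keySet());
    }
}
```

With the default method the loop body never runs, which is why the check silently passes; the override is what gives the loop something to iterate over.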

        iwasakims Masatake Iwasaki added a comment -

        I fixed the inappropriate name of the argument variable of getTargetIds in 002.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        | Vote | Subsystem | Runtime | Comment |
        | 0 | pre-patch | 15m 41s | Pre-patch trunk compilation is healthy. |
        | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
        | +1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
        | +1 | javac | 7m 33s | There were no new javac warning messages. |
        | +1 | javadoc | 9m 35s | There were no new javadoc warning messages. |
        | +1 | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
        | -1 | checkstyle | 0m 29s | The applied patch generated 1 new checkstyle issues (total was 38, now 39). |
        | -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
        | +1 | install | 1m 34s | mvn install still works. |
        | +1 | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
        | +1 | findbugs | 0m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
        | -1 | yarn tests | 19m 52s | Tests failed in hadoop-yarn-client. |
        | | | 56m 34s | |



        Reason Tests
        Timed out tests org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12742365/YARN-3860.001.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 79ed0f9
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/whitespace.txt
        hadoop-yarn-client test log https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/testrun_hadoop-yarn-client.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8365/testReport/
        Java 1.7.0_55
        uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/8365/console

        This message was automatically generated.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        | Vote | Subsystem | Runtime | Comment |
        | 0 | pre-patch | 15m 37s | Pre-patch trunk compilation is healthy. |
        | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
        | +1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
        | +1 | javac | 7m 36s | There were no new javac warning messages. |
        | +1 | javadoc | 9m 39s | There were no new javadoc warning messages. |
        | +1 | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
        | -1 | checkstyle | 0m 30s | The applied patch generated 1 new checkstyle issues (total was 38, now 39). |
        | -1 | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
        | +1 | install | 1m 33s | mvn install still works. |
        | +1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
        | +1 | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
        | +1 | yarn tests | 6m 55s | Tests passed in hadoop-yarn-client. |
        | | | 43m 42s | |



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12742367/YARN-3860.002.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 79ed0f9
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt
        whitespace https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/whitespace.txt
        hadoop-yarn-client test log https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/testrun_hadoop-yarn-client.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8366/testReport/
        Java 1.7.0_55
        uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/8366/console

        This message was automatically generated.

        iwasakims Masatake Iwasaki added a comment -

        Addressing the checkstyle and whitespace warnings.

        hadoopqa Hadoop QA added a comment -



        +1 overall



        | Vote | Subsystem | Runtime | Comment |
        | 0 | pre-patch | 15m 15s | Pre-patch trunk compilation is healthy. |
        | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
        | +1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
        | +1 | javac | 7m 31s | There were no new javac warning messages. |
        | +1 | javadoc | 9m 35s | There were no new javadoc warning messages. |
        | +1 | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
        | +1 | checkstyle | 0m 28s | There were no new checkstyle issues. |
        | +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
        | +1 | install | 1m 34s | mvn install still works. |
        | +1 | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
        | +1 | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
        | +1 | yarn tests | 6m 55s | Tests passed in hadoop-yarn-client. |
        | | | 43m 9s | |



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12742371/YARN-3860.003.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 79ed0f9
        hadoop-yarn-client test log https://builds.apache.org/job/PreCommit-YARN-Build/8367/artifact/patchprocess/testrun_hadoop-yarn-client.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8367/testReport/
        Java 1.7.0_55
        uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/8367/console

        This message was automatically generated.

        zxu zhihai xu added a comment -

        Masatake Iwasaki, thanks for working on this issue. This looks like a good catch.
        One nit: I think times(1) is used by default. Can we just use verify(haadmin).getServiceStatus();? None of the other tests use times(1) in verify.

        djp Junping Du added a comment -

        Thanks Masatake Iwasaki for the nice catch and the patch! The latest patch LGTM.
        zhihai xu, thanks for the review here. I think times(1) is quite useful here, as the number of calls shows that the loop takes the other nodes into consideration. If we had "rm1, rm2, rm3" in the config settings, we should get times(2) here. If you agree, I will go ahead and commit this patch soon.
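The counting argument above can be illustrated without a mocking library. The following is a rough sketch, assuming every non-target node reports standby so the loop does not stop early; StatusCallCountDemo, CountingStatusProbe, and statusCallsFor are hypothetical names and are not taken from TestRMAdminCLI.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class StatusCallCountDemo {

    // Hand-rolled stand-in for a mocked status call: it only counts invocations.
    static class CountingStatusProbe {
        int calls = 0;

        // Always reports "not active" so the checking loop visits every non-target node.
        boolean isActive(String nodeId) {
            calls++;
            return false;
        }
    }

    // Queries every configured node except the target and returns how many status calls were made.
    static int statusCallsFor(List<String> configuredIds, String target) {
        CountingStatusProbe probe = new CountingStatusProbe();
        for (String id : configuredIds) {
            if (id.equals(target)) {
                continue; // the target itself is not queried
            }
            if (probe.isActive(id)) {
                break; // an active peer would stop the check early
            }
        }
        return probe.calls;
    }

    public static void main(String[] args) {
        System.out.println(statusCallsFor(Arrays.asList("rm1", "rm2"), "rm1"));        // prints 1
        System.out.println(statusCallsFor(Arrays.asList("rm1", "rm2", "rm3"), "rm1")); // prints 2
    }
}
```

The expected count is the number of configured nodes minus one, which is what an explicit times(n) in the test makes visible.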

        zxu zhihai xu added a comment -

        Yes, that makes sense. Although they are equivalent, it is easier to change from times(1) to times(2) if we have "rm1, rm2, rm3" in the config settings. +1 (non-binding) for the latest patch.

        djp Junping Du added a comment -

        I have committed the 003 patch to trunk and branch-2. Thanks Masatake Iwasaki for contributing the patch and zhihai xu for the review effort!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #8082 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8082/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        • hadoop-yarn-project/CHANGES.txt
        iwasakims Masatake Iwasaki added a comment -

        Thanks, zhihai xu and Junping Du!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #243 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/243/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #973 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/973/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2171 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2171/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #232 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/232/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2189 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2189/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #241 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/241/)
        YARN-3860. rmadmin -transitionToActive should check the state of non-target node. (Contributed by Masatake Iwasaki) (junping_du: rev a95d39f9d08b3b215a1b33e77e9ab8a2ee59b3a9)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
        • hadoop-yarn-project/CHANGES.txt
        kasha Karthik Kambatla added a comment -

        I am not convinced we need this.

        The only reason we don't want both RMs to be active is to avoid the split-brain situation. Today, if both RMs become active, one of them fails to create the fencing-node and should automatically transition to standby.

        kasha Karthik Kambatla added a comment -

        Also, I wonder if this affects the latency of transitioning active by any significant amount? Masatake Iwasaki - any observations here?

        djp Junping Du added a comment -

        The only reason we don't want both RMs to be active is to avoid the split-brain situation. Today, if both RMs become active, one of them fails to create the fencing-node and should automatically transition to standby.

        Actually, this behavior could confuse users, as it sounds like the RM that stays active is chosen randomly. The expected behavior here is: if the user doesn't specify "--forceactive", which means the user may not know that the other RM is currently active, nothing should be updated by YARN, and the CLI should print a warning message if another RM is already in the active state. No?

        I wonder if this affects the latency of transitioning active by any significant amount.

        The only latency added here is reading the YARN configuration, which doesn't sound like a problem for user experience, especially since we support work-preserving recovery when an RM goes down.

        iwasakims Masatake Iwasaki added a comment -

        Today, if both RMs become active, one of them fails to create the fencing-node and should automatically transition to standby.

        That is true only if RM restart is enabled and ZKRMStateStore is used, but other settings should also be taken care of. Also, with the status check, users can learn the reason for the failure from console output such as "rmX is already active".

        iwasakims Masatake Iwasaki added a comment -

        I wonder if this affects the latency of transitioning active by any significant amount?

        I think the time needed to check the state of the other nodes is small and should not be a problem, especially when the user is doing a manual transition. getServiceState is expected to return or fail immediately if the node is down. I'll check corner cases, but it is a bug if getServiceState blocks for a long time.

        kasha Karthik Kambatla added a comment -

        That is true only if RM restart is enabled and ZKRMStateStore is used but other settings should be cared.

        getServiceState is expected to return or fail immediately if the node is down.

        Do we support other configurations? My only concern is the case when getServiceState fails because we can't reach the other node, but the node and the RM are actually active. If someone uses rmadmin -transitionToActive, they might falsely assume that a split-brain scenario is not possible.

        Just to be clear, I agree the patch is benign. I want to make sure we are clear on why we are adding it.
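The concern about an unreachable node can be made concrete with a small model. This is only a sketch under the assumption that a failed status query is treated the same as "not active", which is what the concern above implies; the Probe enum, the checkPasses method, and the class name are hypothetical and do not come from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the pre-transition check; not the Hadoop implementation.
class UnreachableNodeDemo {

    // What the CLI observes when it queries a node: a state, or no answer at all.
    enum Probe { ACTIVE, STANDBY, UNREACHABLE }

    // Returns true if the check would let the transition of `target` proceed.
    static boolean checkPasses(String target, Map<String, Probe> probes) {
        for (Map.Entry<String, Probe> entry : probes.entrySet()) {
            if (entry.getKey().equals(target)) {
                continue; // the target itself is not checked
            }
            if (entry.getValue() == Probe.ACTIVE) {
                return false; // another node reported active, so the transition is refused
            }
            // Assumption: STANDBY and UNREACHABLE are both treated as "not active".
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Probe> probes = new LinkedHashMap<>();
        probes.put("rm1", Probe.STANDBY);
        probes.put("rm2", Probe.UNREACHABLE); // rm2 may in fact be active but cannot be reached
        System.out.println(checkPasses("rm1", probes)); // prints true
    }
}
```

Under that assumption the check is a best-effort guard for operators rather than a replacement for fencing: it refuses the transition only when a peer can be reached and reports itself active.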


          People

          • Assignee:
            iwasakims Masatake Iwasaki
            Reporter:
            iwasakims Masatake Iwasaki
          • Votes:
            0
            Watchers:
            7
