Details

    • Sub-task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.1.0-beta
    • 2.3.0
    • resourcemanager

    Description

      Support HA admin operations to facilitate transitioning the RM to Active and Standby states.

      Attachments

        1. yarn-1068-prelim.patch
          9 kB
          Karthik Kambatla
        2. yarn-1068-9.patch
          26 kB
          Karthik Kambatla
        3. yarn-1068-8.patch
          26 kB
          Karthik Kambatla
        4. yarn-1068-7.patch
          16 kB
          Karthik Kambatla
        5. yarn-1068-6.patch
          16 kB
          Karthik Kambatla
        6. yarn-1068-5.patch
          16 kB
          Karthik Kambatla
        7. yarn-1068-4.patch
          16 kB
          Karthik Kambatla
        8. yarn-1068-3.patch
          16 kB
          Karthik Kambatla
        9. yarn-1068-2.patch
          10 kB
          Karthik Kambatla
        10. yarn-1068-14.patch
          48 kB
          Karthik Kambatla
        11. yarn-1068-13.patch
          47 kB
          Karthik Kambatla
        12. yarn-1068-12.patch
          41 kB
          Karthik Kambatla
        13. yarn-1068-11.patch
          31 kB
          Karthik Kambatla
        14. yarn-1068-10.patch
          29 kB
          Karthik Kambatla
        15. yarn-1068-1.patch
          9 kB
          Karthik Kambatla
        16. YARN-1068.Karthik.patch
          48 kB
          Bikas Saha

        Issue Links

          Activity

            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Mapreduce-trunk #1594 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1594/)
            YARN-1068. Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888)

            • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java
            • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java
            • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1594 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1594/ ) YARN-1068 . Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk #1568 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1568/)
            YARN-1068. Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888)

            • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java
            • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java
            • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #1568 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1568/ ) YARN-1068 . Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            hudson Hudson added a comment -

            SUCCESS: Integrated in Hadoop-Yarn-trunk #378 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/378/)
            YARN-1068. Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888)

            • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java
            • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java
            • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Yarn-trunk #378 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/378/ ) YARN-1068 . Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java

            Resolving this as bikassaha committed to trunk and branch-2.

            kkambatl Karthik Kambatla (Inactive) added a comment - Resolving this as bikassaha committed to trunk and branch-2.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12610924/YARN-1068.Karthik.patch
            against trunk revision .

            -1 patch. The patch command could not apply the patch.

            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2311//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610924/YARN-1068.Karthik.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2311//console This message is automatically generated.
            hudson Hudson added a comment -

            SUCCESS: Integrated in Hadoop-trunk-Commit #4668 (See https://builds.apache.org/job/Hadoop-trunk-Commit/4668/)
            YARN-1068. Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888)

            • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java
            • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java
            • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
            • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-trunk-Commit #4668 (See https://builds.apache.org/job/Hadoop-trunk-Commit/4668/ ) YARN-1068 . Add admin support for HA operations (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1536888 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAAdmin.java /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineTextInputFormat.java /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
            bikassaha Bikas Saha added a comment -

            Uploading patch with minor edits to add comments about dependency on YARN-1318

            bikassaha Bikas Saha added a comment - Uploading patch with minor edits to add comments about dependency on YARN-1318

            Haven't looked at the patch carefully, but okay with pushing out the service merge assuming YARN-1318 is a blocker for the next release that ships the fail-over work.

            vinodkv Vinod Kumar Vavilapalli added a comment - Haven't looked at the patch carefully, but okay with pushing out the service merge assuming YARN-1318 is a blocker for the next release that ships the fail-over work.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12610366/yarn-1068-14.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2287//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2287//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610366/yarn-1068-14.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2287//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2287//console This message is automatically generated.

            Updated patch to add a comment on why RMHAProtocolService is Private-Unstable and a pointer to YARN-1318.

            kkambatl Karthik Kambatla (Inactive) added a comment - Updated patch to add a comment on why RMHAProtocolService is Private-Unstable and a pointer to YARN-1318 .
            bikassaha Bikas Saha added a comment -

            vinodkv Does the new patch address your concerns?
            I have marked YARN-1318 as a blocker for YARN-149. We must fix that before failover is available. Karthik, in your final patch can you please include clear comments pointing to YARN-1318 near the @Private annotations for RMHAProtocolService. Thanks!

            bikassaha Bikas Saha added a comment - vinodkv Does the new patch address your concerns? I have marked YARN-1318 as a blocker for YARN-149 . We must fix that before failover is available. Karthik, in your final patch can you please include clear comments pointing to YARN-1318 near the @Private annotations for RMHAProtocolService. Thanks!

            Tested the latest patch again on a real cluster.

            bikassaha, vinodkv - I believe the patch is ready for review. Please take a look when you get a chance.

            kkambatl Karthik Kambatla (Inactive) added a comment - Tested the latest patch again on a real cluster. bikassaha , vinodkv - I believe the patch is ready for review. Please take a look when you get a chance.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12609970/yarn-1068-13.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2272//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2272//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12609970/yarn-1068-13.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2272//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2272//console This message is automatically generated.

            Added HA-related tests to TestRMAdminCLI.

            kkambatl Karthik Kambatla (Inactive) added a comment - Added HA-related tests to TestRMAdminCLI.

            Updated patch extends RMAdminCLI to support admin operations.

            Todo: Add unit tests.

            kkambatl Karthik Kambatla (Inactive) added a comment - Updated patch extends RMAdminCLI to support admin operations. Todo: Add unit tests.

            Merging the CLI now makes any subsequent changes transparent to the user. Makes sense. Will work on the patch and report back.

            kkambatl Karthik Kambatla (Inactive) added a comment - Merging the CLI now makes any subsequent changes transparent to the user. Makes sense. Will work on the patch and report back.
            bikassaha Bikas Saha added a comment -

            Spoke to Vinod offline. His suggestion is to make sure that the admin cli is merged within this patch. We may defer the AdminService merge and make that a blocker for making the RMHAProtocolService public. Also not expose port configs in the yarn-defaults.

            bikassaha Bikas Saha added a comment - Spoke to Vinod offline. His suggestion is to make sure that the admin cli is merged within this patch. We may defer the AdminService merge and make that a blocker for making the RMHAProtocolService public. Also not expose port configs in the yarn-defaults.

            Thanks Bikas. I agree with your assessment that moving forward with the remaining work on failover, while working on the merging of RMHAProtocolService and AdminService in the background makes sense to me.

            I ll post a patch shortly to reflect the same.

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks Bikas. I agree with your assessment that moving forward with the remaining work on failover, while working on the merging of RMHAProtocolService and AdminService in the background makes sense to me. I ll post a patch shortly to reflect the same.
            bikassaha Bikas Saha added a comment -

            I havent looked deeply at what it would take to make AdminService always on. From your comment in YARN-1318 is looks like its significant surgery. In the interest of getting end to end RM failover move forward with crucial integration work with ZKFC and other testing can we mark the RMHAProtocolService private and unstable and move on with the integration work? The merger of RMHAProtocolService and AdminService can be made a blocker for making RMHAProtocolService public/stable. Does this work?

            bikassaha Bikas Saha added a comment - I havent looked deeply at what it would take to make AdminService always on. From your comment in YARN-1318 is looks like its significant surgery. In the interest of getting end to end RM failover move forward with crucial integration work with ZKFC and other testing can we mark the RMHAProtocolService private and unstable and move on with the integration work? The merger of RMHAProtocolService and AdminService can be made a blocker for making RMHAProtocolService public/stable. Does this work?

            Thanks vinodkv. I do agree that we should avoid creating new RPC servers and add more ports to the configuration.

            Created YARN-1318 to make AdminService an Always-On service. Unfortunately, that requires significant changes to ResourceManager and RMContext as well. https://issues.apache.org/jira/browse/YARN-1318?focusedCommentId=13798358&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13798358 provides a gist of the required changes.

            vinodkv, bikassaha can you please take a look at YARN-1318 as well and provide your thoughts.

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks vinodkv . I do agree that we should avoid creating new RPC servers and add more ports to the configuration. Created YARN-1318 to make AdminService an Always-On service. Unfortunately, that requires significant changes to ResourceManager and RMContext as well. https://issues.apache.org/jira/browse/YARN-1318?focusedCommentId=13798358&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13798358 provides a gist of the required changes. vinodkv , bikassaha can you please take a look at YARN-1318 as well and provide your thoughts.

            IIUC, the suggestion is to use the RPC server from AdminService. AdminService currently is an Active service and not an Always-On service, so doesn't start until the RM transitions to Active. Moving the AdminService to Always-On requires defining the semantics when the RM is Standby.

            I'd do this instead of adding a new service. For now as well as long term, we want to deny all the existing AdminService operations on Standby. Doing it via not stopping the server or explicitly rejecting the requests is an implementation detail and not a big change in semantics.

            Repeating what I said, we originally added AdminService separately from client-service only for prioritizing admin operations. No need for a new server for this.

            To do this, we need to have RMAdminCLI extend HAAdmin, and augment the run() method to call super.run() when applicable, and the usage needs to be augmented to include the HAAdmin usage.

            Yes. I guess there is no argument here other than stating the obvious.

            YARN expects the actual PB/PBImpl files to be at a particular location, and can't find the corresponding files when using HAServiceProtocol from common. Hence, had to use PB interfaces.

            HAServiceProtocolPB is the PB interface and there seems to be no PBImpl as Common/HDFS follow a different pattern from YARN's and the last I heard they liked YARN's PB impl stuff. In any case, +1 to skip using YARNRPC for that reason.

            The patch primarily adds command line support for HA transitions. Have tested this manually several times on a real cluster.

            We need junit tests for everything. We can skip down to manual tests for hard-to-test race conditions or security features that are not possible to address otherwise. Manual testing is not a substitute for junit tests.

            vinodkv Vinod Kumar Vavilapalli added a comment - IIUC, the suggestion is to use the RPC server from AdminService. AdminService currently is an Active service and not an Always-On service, so doesn't start until the RM transitions to Active. Moving the AdminService to Always-On requires defining the semantics when the RM is Standby. I'd do this instead of adding a new service. For now as well as long term, we want to deny all the existing AdminService operations on Standby. Doing it via not stopping the server or explicitly rejecting the requests is an implementation detail and not a big change in semantics. Repeating what I said, we originally added AdminService separately from client-service only for prioritizing admin operations. No need for a new server for this. To do this, we need to have RMAdminCLI extend HAAdmin, and augment the run() method to call super.run() when applicable, and the usage needs to be augmented to include the HAAdmin usage. Yes. I guess there is no argument here other than stating the obvious. YARN expects the actual PB/PBImpl files to be at a particular location, and can't find the corresponding files when using HAServiceProtocol from common. Hence, had to use PB interfaces. HAServiceProtocolPB is the PB interface and there seems to be no PBImpl as Common/HDFS follow a different pattern from YARN's and the last I heard they liked YARN's PB impl stuff. In any case, +1 to skip using YARNRPC for that reason. The patch primarily adds command line support for HA transitions. Have tested this manually several times on a real cluster. We need junit tests for everything. We can skip down to manual tests for hard-to-test race conditions or security features that are not possible to address otherwise. Manual testing is not a substitute for junit tests.

            Sorry, should have been clearer - was referring primarily to the PB interface use. Let me address each point separately.

            Let's try to avoid adding one more server. We added AdminService separately from client-service only for QOS sake for admin operations. HAAdminService should be listening on the same port as all other operations.

            IIUC, the suggestion is to use the RPC server from AdminService. AdminService currently is an Active service and not an Always-On service, so doesn't start until the RM transitions to Active. Moving the AdminService to Always-On requires defining the semantics when the RM is Standby.

            Following that, we should use only one rmadmin CLI for fail-over commands too.

            To do this, we need to have RMAdminCLI extend HAAdmin, and augment the run() method to call super.run() when applicable, and the usage needs to be augmented to include the HAAdmin usage.

            RMHAProtocolService: We don't directly use PB interfaces in YARN. Let's not change that here - use YarnRPC to create servers.

            YARN expects the actual PB/PBImpl files to be at a particular location, and can't find the corresponding files when using HAServiceProtocol from common. Hence, had to use PB interfaces.

            No tests specifically for the new code added here?

            The patch primarily adds command line support for HA transitions. Have tested this manually several times on a real cluster.

            kkambatl Karthik Kambatla (Inactive) added a comment - Sorry, should have been clearer - was referring primarily to the PB interface use. Let me address each point separately. Let's try to avoid adding one more server. We added AdminService separately from client-service only for QOS sake for admin operations. HAAdminService should be listening on the same port as all other operations. IIUC, the suggestion is to use the RPC server from AdminService. AdminService currently is an Active service and not an Always-On service, so doesn't start until the RM transitions to Active. Moving the AdminService to Always-On requires defining the semantics when the RM is Standby. Following that, we should use only one rmadmin CLI for fail-over commands too. To do this, we need to have RMAdminCLI extend HAAdmin, and augment the run() method to call super.run() when applicable, and the usage needs to be augmented to include the HAAdmin usage. RMHAProtocolService: We don't directly use PB interfaces in YARN. Let's not change that here - use YarnRPC to create servers. YARN expects the actual PB/PBImpl files to be at a particular location, and can't find the corresponding files when using HAServiceProtocol from common. Hence, had to use PB interfaces. No tests specifically for the new code added here? The patch primarily adds command line support for HA transitions. Have tested this manually several times on a real cluster.

            Trying to use common code is forcing us to start a new RPC-server? I don't quite follow it, more details may be?

            vinodkv Vinod Kumar Vavilapalli added a comment - Trying to use common code is forcing us to start a new RPC-server? I don't quite follow it, more details may be?

            vinodkv, the inconsistencies with the rest of the YARN code come primarily from our attempt to re-use Common code. Our attempt has been to use the Common code as much as possible. Do you think we should keep it independent?

            kkambatl Karthik Kambatla (Inactive) added a comment - vinodkv , the inconsistencies with the rest of the YARN code come primarily from our attempt to re-use Common code. Our attempt has been to use the Common code as much as possible. Do you think we should keep it independent?

            Catching up on all the HA stuff. Starting with this patch.

            Quick comments on the patch:

            • Let's try to avoid adding one more server. We added AdminService separately from client-service only for QOS sake for admin operations. HAAdminService should be listening on the same port as all other operations.
            • Following that, we should use only one rmadmin CLI for fail-over commands too.
            • RMHAProtocolService: We don't directly use PB interfaces in YARN. Let's not change that here - use YarnRPC to create servers.
            • No tests specifically for the new code added here?

            Will keep digging into the fail-over code more. Let me know if some of my assumptions are wrong. Tx.

            vinodkv Vinod Kumar Vavilapalli added a comment - Catching up on all the HA stuff. Starting with this patch. Quick comments on the patch: Let's try to avoid adding one more server. We added AdminService separately from client-service only for QOS sake for admin operations. HAAdminService should be listening on the same port as all other operations. Following that, we should use only one rmadmin CLI for fail-over commands too. RMHAProtocolService: We don't directly use PB interfaces in YARN. Let's not change that here - use YarnRPC to create servers. No tests specifically for the new code added here? Will keep digging into the fail-over code more. Let me know if some of my assumptions are wrong. Tx.

            Per discussion on YARN-1309 and YARN-142, looks like we should throw YarnException and not IOException. However, the actual exceptions to be thrown are defined in HAServiceProtocol which doesn't have YarnException listed.

            So, I guess we will have to leave the RMHAProtocolService as is.

            kkambatl Karthik Kambatla (Inactive) added a comment - Per discussion on YARN-1309 and YARN-142 , looks like we should throw YarnException and not IOException. However, the actual exceptions to be thrown are defined in HAServiceProtocol which doesn't have YarnException listed. So, I guess we will have to leave the RMHAProtocolService as is.

            Thanks Bikas.

            There isnt any need for this to wrap the IOException in another exception. The base AdminService protocol signature already supports throwing IOException.

            Agree. I did consider leaving it as IOE. However, there are several places in AdminService where an IOE is being wrapped into a YarnException. We should probably address all of them together in another JIRA. Created YARN-1309.

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks Bikas. There isnt any need for this to wrap the IOException in another exception. The base AdminService protocol signature already supports throwing IOException. Agree. I did consider leaving it as IOE. However, there are several places in AdminService where an IOE is being wrapped into a YarnException. We should probably address all of them together in another JIRA. Created YARN-1309 .
            bikassaha Bikas Saha added a comment -

            Looks good to me. Will give a day or so for some other committers to take a look.

            There isnt any need for this to wrap the IOException in another exception. The base AdminService protocol signature already supports throwing IOException. (ResourceManagerAdministrationProtocol). If its small enough, we could fix it this here or do it in a separate jira.

               private UserGroupInformation checkAcls(String method) throws YarnException {
            -    UserGroupInformation user;
                 try {
            -      user = UserGroupInformation.getCurrentUser();
            +      return RMServerUtils.verifyAccess(adminAcl, method, LOG);
                 } catch (IOException ioe) {
            -      LOG.warn("Couldn't get current user", ioe);
            -
            -      RMAuditLogger.logFailure("UNKNOWN", method,
            -          adminAcl.toString(), "AdminService",
            -          "Couldn't get current user");
                   throw RPCUtil.getRemoteException(ioe);
                 }
            
            bikassaha Bikas Saha added a comment - Looks good to me. Will give a day or so for some other committers to take a look. There isnt any need for this to wrap the IOException in another exception. The base AdminService protocol signature already supports throwing IOException. (ResourceManagerAdministrationProtocol). If its small enough, we could fix it this here or do it in a separate jira. private UserGroupInformation checkAcls( String method) throws YarnException { - UserGroupInformation user; try { - user = UserGroupInformation.getCurrentUser(); + return RMServerUtils.verifyAccess(adminAcl, method, LOG); } catch (IOException ioe) { - LOG.warn( "Couldn't get current user" , ioe); - - RMAuditLogger.logFailure( "UNKNOWN" , method, - adminAcl.toString(), "AdminService" , - "Couldn't get current user" ); throw RPCUtil.getRemoteException(ioe); }
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12608393/yarn-1068-11.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 2 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2177//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2177//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12608393/yarn-1068-11.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2177//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2177//console This message is automatically generated.

            Patch addressing Bikas' latest round of comments.

            kkambatl Karthik Kambatla (Inactive) added a comment - Patch addressing Bikas' latest round of comments.
            bikassaha Bikas Saha added a comment -

            Lets stick to what existing yarn code looks like IMO. If existing approach can be made better then we should do probably it in a different patch. vinodkv do you know why we are not following common naming practice for the above.

            bikassaha Bikas Saha added a comment - Lets stick to what existing yarn code looks like IMO. If existing approach can be made better then we should do probably it in a different patch. vinodkv do you know why we are not following common naming practice for the above.

            New name doesnt seem to follow convention based on other names in that file. YARN_SECURITY_SERVICE_AUTHORIZATION_FOO

            Right. The property comes from Common HA code, and I am using it as is. The alternative is to add YARN_SECURITY_SERVICE_AUTHORIZATION_HA_SERVICE_PROTOCOL to YarnConfiguration and point it to the other one. Is that the preferred approach?

            Will post a patch addressing the other comments.

            kkambatl Karthik Kambatla (Inactive) added a comment - New name doesnt seem to follow convention based on other names in that file. YARN_SECURITY_SERVICE_AUTHORIZATION_FOO Right. The property comes from Common HA code, and I am using it as is. The alternative is to add YARN_SECURITY_SERVICE_AUTHORIZATION_HA_SERVICE_PROTOCOL to YarnConfiguration and point it to the other one. Is that the preferred approach? Will post a patch addressing the other comments.
            bikassaha Bikas Saha added a comment -

            This should probably creating a new conf and override it instead of changing things in the original conf.

            +      YarnConfiguration conf = (YarnConfiguration) getConf();
            +      conf.set(YarnConfiguration.RM_HA_ID, rmId);
            +      return new RMHAServiceTarget(conf);
            

            Shoudlnt it be "transitionToActive"?

            +      RMAuditLogger.logFailure(user.getShortUserName(), "transitionToStandby",
            +          adminAcl.toString(), "RMHAProtocolService",
            +          "Exception transitioning to active");
            

            We shouldnt be wrapping some unknown exception into an AccessControlException

            +  private UserGroupInformation checkAccess(String method) throws AccessControlException {
            +    try {
            +      return RMServerUtils.verifyAccess(adminAcl, method, LOG);
            +    } catch (YarnException e) {
            +      throw new AccessControlException(e);
            +    }
            

            This method isnt even throwing an accesscontrolexception. Then why are transitionToStandby() etc changing signature to throw AccessControlException.

            +  public static UserGroupInformation verifyAccess(
            +      AccessControlList acl, String method, final Log LOG)
            +      throws YarnException {
            

            New name doesnt seem to follow convention based on other names in that file. YARN_SECURITY_SERVICE_AUTHORIZATION_FOO

                 new Service(
                     YarnConfiguration.YARN_SECURITY_SERVICE_AUTHORIZATION_CONTAINER_MANAGEMENT_PROTOCOL, 
                     ContainerManagementProtocolPB.class),
            +    new Service(
            +        CommonConfigurationKeys.SECURITY_HA_SERVICE_PROTOCOL_ACL,
            +        HAServiceProtocol.class),
            
            bikassaha Bikas Saha added a comment - This should probably creating a new conf and override it instead of changing things in the original conf. + YarnConfiguration conf = (YarnConfiguration) getConf(); + conf.set(YarnConfiguration.RM_HA_ID, rmId); + return new RMHAServiceTarget(conf); Shoudlnt it be "transitionToActive"? + RMAuditLogger.logFailure(user.getShortUserName(), "transitionToStandby" , + adminAcl.toString(), "RMHAProtocolService" , + "Exception transitioning to active" ); We shouldnt be wrapping some unknown exception into an AccessControlException + private UserGroupInformation checkAccess( String method) throws AccessControlException { + try { + return RMServerUtils.verifyAccess(adminAcl, method, LOG); + } catch (YarnException e) { + throw new AccessControlException(e); + } This method isnt even throwing an accesscontrolexception. Then why are transitionToStandby() etc changing signature to throw AccessControlException. + public static UserGroupInformation verifyAccess( + AccessControlList acl, String method, final Log LOG) + throws YarnException { New name doesnt seem to follow convention based on other names in that file. YARN_SECURITY_SERVICE_AUTHORIZATION_FOO new Service( YarnConfiguration.YARN_SECURITY_SERVICE_AUTHORIZATION_CONTAINER_MANAGEMENT_PROTOCOL, ContainerManagementProtocolPB.class), + new Service( + CommonConfigurationKeys.SECURITY_HA_SERVICE_PROTOCOL_ACL, + HAServiceProtocol.class),

            Forgot to mention the testing done.

            Ran a two node cluster - each node running an RM, and used the CLI to transition each to active/standby, getServiceState, checkHealth several (~100) times for various versions of the patch put together. Verified the latest patch on a pseudo-dist cluster with configs for both RMs pointing to the same RM.

            kkambatl Karthik Kambatla (Inactive) added a comment - Forgot to mention the testing done. Ran a two node cluster - each node running an RM, and used the CLI to transition each to active/standby, getServiceState, checkHealth several (~100) times for various versions of the patch put together. Verified the latest patch on a pseudo-dist cluster with configs for both RMs pointing to the same RM.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12607623/yarn-1068-10.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 2 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2155//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2155//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12607623/yarn-1068-10.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2155//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2155//console This message is automatically generated.

            Thanks bikassaha for the detailed review. Sorry for the delay in responding, was caught up with some other issues. Uploaded a patch that addresses most of the comments:

            1. Audit logging in RMHAProtocolService
            2. Update AdminService also to use RMServerUtils#verifyAccess
            3. YarnConfiguration overrides updateConnectAddr as well to complement getSocketAddr
            4. Change argument names from nodeId to rmId

            cast not needed right?

            Currently, RMHAServiceTarget constructor takes YarnConfiguration as an argument and not Configuration. We can change this, but I think it is better to be explicit in the kind of instance needed. Leaving the constructor as is requires the cast.

            Should this be RMHAServiceProtocol address? Admins and ZKFC would be connecting on this protocol right?

            Updated the description to reflect that the RM listens at that address for both rmhaadmin CLI and the failover controller (ZKFC). I think we should leave the config name as ha.admin.address for the following reasons: (1) the user/admin understand admin better than HAProtocolService as the latter requires them to understand the protocol being used, (2) either CLI or ZKFC or actually administrating the HA state of the RM, (3) we don't use protocol names anywhere else in the configs.

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks bikassaha for the detailed review. Sorry for the delay in responding, was caught up with some other issues. Uploaded a patch that addresses most of the comments: Audit logging in RMHAProtocolService Update AdminService also to use RMServerUtils#verifyAccess YarnConfiguration overrides updateConnectAddr as well to complement getSocketAddr Change argument names from nodeId to rmId cast not needed right? Currently, RMHAServiceTarget constructor takes YarnConfiguration as an argument and not Configuration. We can change this, but I think it is better to be explicit in the kind of instance needed. Leaving the constructor as is requires the cast. Should this be RMHAServiceProtocol address? Admins and ZKFC would be connecting on this protocol right? Updated the description to reflect that the RM listens at that address for both rmhaadmin CLI and the failover controller (ZKFC). I think we should leave the config name as ha.admin.address for the following reasons: (1) the user/admin understand admin better than HAProtocolService as the latter requires them to understand the protocol being used, (2) either CLI or ZKFC or actually administrating the HA state of the RM, (3) we don't use protocol names anywhere else in the configs.
            bikassaha Bikas Saha added a comment -

            Should this be RMHAServiceProtocol address? Admins and ZKFC would be connecting on this protocol right?

            +  public static final String RM_HA_ADMIN_ADDRESS =
            +      RM_HA_PREFIX + "admin.address";
            ...
            +    <description>The address of the RM HA admin interface.</description>
            +    <name>yarn.resourcemanager.ha.admin.address</name>
            +    <value>${yarn.resourcemanager.hostname}:8034</value>
            ...
            +  private AccessControlList adminAcl;
            +  private Server haAdminServer;
            

            instanceof check before creating new conf?

            +  public void setConf(Configuration conf) {
            +    if (conf != null) {
            +      conf = new YarnConfiguration(conf);
            +    }
            +    super.setConf(conf);
            

            More clear if we name the parameter rmId instead of nodeId

            protected HAServiceTarget resolveTarget(String nodeId) {
            

            cast not needed right?

            +    try {
            +      return new RMHAServiceTarget((YarnConfiguration)getConf(), nodeId);
            

            Should this create its own copy of the conf instead of overriding the CLI's copy? If its ok to override then we should probably do it in the CLI itself instead of being the side effect of a constructor.

            public RMHAServiceTarget(YarnConfiguration conf, String nodeId)
            +      throws IOException {
            +    conf.set(YarnConfiguration.RM_HA_ID, nodeId);
            

            After this how is someone supposed to get the socker addr? conf.getSocketAddr() will do the HAUtil magic instead of directly picking what is set by this code?

            +    haAdminServer.start();
            +    conf.updateConnectAddr(YarnConfiguration.RM_HA_ADMIN_ADDRESS,
            +        haAdminServer.getListenerAddress());
            

            Good to refactor. Looks like we should change AdminService to use the same method. Also, this method returns a user value that is used by AdminService to do audit logging. Would be good to follow that pattern and do audit logging in HAService at least for the state transition operations. probably not for the health check operations.

            +    try {
            +      RMServerUtils.verifyAccess(adminAcl, method, LOG);
            

            Any notes on testing. There doesnt seem to be any new unit tests added.

            bikassaha Bikas Saha added a comment - Should this be RMHAServiceProtocol address? Admins and ZKFC would be connecting on this protocol right? + public static final String RM_HA_ADMIN_ADDRESS = + RM_HA_PREFIX + "admin.address" ; ... + <description>The address of the RM HA admin interface .</description> + <name>yarn.resourcemanager.ha.admin.address</name> + <value>${yarn.resourcemanager.hostname}:8034</value> ... + private AccessControlList adminAcl; + private Server haAdminServer; instanceof check before creating new conf? + public void setConf(Configuration conf) { + if (conf != null ) { + conf = new YarnConfiguration(conf); + } + super .setConf(conf); More clear if we name the parameter rmId instead of nodeId protected HAServiceTarget resolveTarget( String nodeId) { cast not needed right? + try { + return new RMHAServiceTarget((YarnConfiguration)getConf(), nodeId); Should this create its own copy of the conf instead of overriding the CLI's copy? If its ok to override then we should probably do it in the CLI itself instead of being the side effect of a constructor. public RMHAServiceTarget(YarnConfiguration conf, String nodeId) + throws IOException { + conf.set(YarnConfiguration.RM_HA_ID, nodeId); After this how is someone supposed to get the socker addr? conf.getSocketAddr() will do the HAUtil magic instead of directly picking what is set by this code? + haAdminServer.start(); + conf.updateConnectAddr(YarnConfiguration.RM_HA_ADMIN_ADDRESS, + haAdminServer.getListenerAddress()); Good to refactor. Looks like we should change AdminService to use the same method. Also, this method returns a user value that is used by AdminService to do audit logging. Would be good to follow that pattern and do audit logging in HAService at least for the state transition operations. probably not for the health check operations. + try { + RMServerUtils.verifyAccess(adminAcl, method, LOG); Any notes on testing. There doesnt seem to be any new unit tests added.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12607094/yarn-1068-9.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 2 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2135//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2135//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12607094/yarn-1068-9.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2135//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2135//console This message is automatically generated.

            Patch to fix the javadoc and findbugs issues. The test failures seem unrelated.

            kkambatl Karthik Kambatla (Inactive) added a comment - Patch to fix the javadoc and findbugs issues. The test failures seem unrelated.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12607039/yarn-1068-8.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 2 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

            org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections

            The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

            org.apache.hadoop.yarn.client.api.impl.TestNMClient

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2127//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2127//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2127//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12607039/yarn-1068-8.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 javadoc . The javadoc tool appears to have generated 1 warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestNMClient +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2127//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2127//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2127//console This message is automatically generated.

            Patch on top of YARN-1232, that addresses Bikas' earlier comments about ACL checking as well.

            kkambatl Karthik Kambatla (Inactive) added a comment - Patch on top of YARN-1232 , that addresses Bikas' earlier comments about ACL checking as well.
            bikassaha Bikas Saha added a comment -

            Lets see how YARN-986 and YARN-1232 develop and make a call based on that. Would like to avoid committing stuff that will soon be changed a lot.

            bikassaha Bikas Saha added a comment - Lets see how YARN-986 and YARN-1232 develop and make a call based on that. Would like to avoid committing stuff that will soon be changed a lot.

            Thanks bikassaha, agree with most of your points.

            AdminService does not use the HAServiceProtocolServerSideTranslatorPB pattern

            The reason for this is our attempt to reuse most of the common code - protos and client implementations.

            Having thought about this, it seems to me that this jira is actually blocked by YARN-986.

            To fix the admin support in entirety, I agree that we need YARN-1232 and YARN-986. That said, for ease of development, I would propose splitting the admin support into two parts (JIRAs) - basic support (this JIRA) to go in first to help testing YARN-1232 and YARN-986, and complete admin support that adds the remaining parts. Otherwise, we need applying this over those other JIRAs to test. Thoughts?

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks bikassaha , agree with most of your points. AdminService does not use the HAServiceProtocolServerSideTranslatorPB pattern The reason for this is our attempt to reuse most of the common code - protos and client implementations. Having thought about this, it seems to me that this jira is actually blocked by YARN-986 . To fix the admin support in entirety, I agree that we need YARN-1232 and YARN-986 . That said, for ease of development, I would propose splitting the admin support into two parts (JIRAs) - basic support (this JIRA) to go in first to help testing YARN-1232 and YARN-986 , and complete admin support that adds the remaining parts. Otherwise, we need applying this over those other JIRAs to test. Thoughts?
            bikassaha Bikas Saha added a comment -

            It would be educative to compare the HAAdmin server start code with existing admin RM server like the AdminService. I notice 2 things.
            1) AdminService does not use the HAServiceProtocolServerSideTranslatorPB pattern
            2) AdminService does something with HADOOP_SECURITY_AUTHORIZATION which is missing in HAAdminService. This probably defines who has access to perform the admin operations. We will likely need that for the HAAdmin right?

            Having thought about this, it seems to me that this jira is actually blocked by YARN-986. Without a concept of a logical name how can we expect the CLI etc to find the correct RM address from configuration? The client conf files would be expected to have entries for all RM instances and we would need to be able to issue admin commands to any one of them. So we need to be able to address them via a logical name, right? So the current approach that picks the RM_HA_ADMIN_SERVICE address does not seem like a viable solution. Similarly, server conf files would need to tell the server what its logical name is so that it can try to pick and instance specific configurations. This is precisely why we have the HAAdmin.resolveTarget() method.
            Again, it would be educative to look at NNHAServiceTarget for client side and the constructor for NameNode where is uses the logical name to translate and re-write server side conf.

            bikassaha Bikas Saha added a comment - It would be educative to compare the HAAdmin server start code with existing admin RM server like the AdminService. I notice 2 things. 1) AdminService does not use the HAServiceProtocolServerSideTranslatorPB pattern 2) AdminService does something with HADOOP_SECURITY_AUTHORIZATION which is missing in HAAdminService. This probably defines who has access to perform the admin operations. We will likely need that for the HAAdmin right? Having thought about this, it seems to me that this jira is actually blocked by YARN-986 . Without a concept of a logical name how can we expect the CLI etc to find the correct RM address from configuration? The client conf files would be expected to have entries for all RM instances and we would need to be able to issue admin commands to any one of them. So we need to be able to address them via a logical name, right? So the current approach that picks the RM_HA_ADMIN_SERVICE address does not seem like a viable solution. Similarly, server conf files would need to tell the server what its logical name is so that it can try to pick and instance specific configurations. This is precisely why we have the HAAdmin.resolveTarget() method. Again, it would be educative to look at NNHAServiceTarget for client side and the constructor for NameNode where is uses the logical name to translate and re-write server side conf.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12604842/yarn-1068-7.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 1 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2000//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2000//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604842/yarn-1068-7.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2000//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2000//console This message is automatically generated.

            Thanks tucu00. Updated patch to address the comment.

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks tucu00 . Updated patch to address the comment.

            One nit, in the RMHAProtocolService, the serviceStop() should be symmetric with the start in the sense it should do the if (haEnabled) check to stop the HAAdmin server (instead of doing this check in the HAAdmin service itself).

            tucu00 Alejandro Abdelnur added a comment - One nit, in the RMHAProtocolService, the serviceStop() should be symmetric with the start in the sense it should do the if (haEnabled) check to stop the HAAdmin server (instead of doing this check in the HAAdmin service itself).

            bikassaha, when you get a chance, can you review the latest patch?

            kkambatl Karthik Kambatla (Inactive) added a comment - bikassaha , when you get a chance, can you review the latest patch?
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12604708/yarn-1068-6.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 1 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1991//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1991//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604708/yarn-1068-6.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1991//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1991//console This message is automatically generated.

            Minor update to the patch - RMHAProtocolService should continue to be an AbstractService, and not a CompositeService.

            kkambatl Karthik Kambatla (Inactive) added a comment - Minor update to the patch - RMHAProtocolService should continue to be an AbstractService, and not a CompositeService.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12604406/yarn-1068-5.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 1 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1988//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1988//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604406/yarn-1068-5.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1988//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1988//console This message is automatically generated.

            The client side code should probably be invoked via RMProxy like the other protocols are accessed today. In a failover setup the following may not be valid.

            I have updated the patch to fetch the InetSocketAddress from ClientRMProxy#getRMAddress. Not using ClientRMProxy#createProxy because we want the HA related admin calls to go to the intended RM irrespective of whether it is active or not.

            I think YARN-1028 (ConfiguredFailoverProxy) should update this if need be, along with other changes it makes.

            kkambatl Karthik Kambatla (Inactive) added a comment - The client side code should probably be invoked via RMProxy like the other protocols are accessed today. In a failover setup the following may not be valid. I have updated the patch to fetch the InetSocketAddress from ClientRMProxy#getRMAddress. Not using ClientRMProxy#createProxy because we want the HA related admin calls to go to the intended RM irrespective of whether it is active or not. I think YARN-1028 (ConfiguredFailoverProxy) should update this if need be, along with other changes it makes.

            Updating the patch that moves the ha-admin rpc server to HAProtocolService.

            kkambatl Karthik Kambatla (Inactive) added a comment - Updating the patch that moves the ha-admin rpc server to HAProtocolService.
            bikassaha Bikas Saha added a comment -

            If I understand this correctly, the jira is adding RPC server support for HAServiceProtocol on the server side and client support for admins. The RPC server is meant for the HAServiceProtocol. Why do we need to create an RMHAAdminService? This RPC server should simply be embedded in the existing RMHAServiceProtocol that logically implements the HAServiceProtocol. Isnt that needed for ZKFC (and other clients like the admin cli) to connect to the RM and issue transitions commands? Other protocol servers are embedded similarly in the RM. e.g. ApplicationMasterService creates the ApplicationMasterProtocol RPC server and passes itself into the RPC server for callbacks.

            The client side code should probably be invoked via RMProxy like the other protocols are accessed today. In a failover setup the following may not be valid.

            +    RMHAServiceTarget() {
            +      YarnConfiguration targetConf = new YarnConfiguration(getConf());
            +      haAdminServiceAddress = targetConf.getSocketAddr(
            +          YarnConfiguration.RM_HA_ADMIN_ADDRESS,
            +          YarnConfiguration.DEFAULT_RM_HA_ADMIN_ADDRESS,
            +          YarnConfiguration.DEFAULT_RM_HA_ADMIN_PORT);
            
            bikassaha Bikas Saha added a comment - If I understand this correctly, the jira is adding RPC server support for HAServiceProtocol on the server side and client support for admins. The RPC server is meant for the HAServiceProtocol. Why do we need to create an RMHAAdminService? This RPC server should simply be embedded in the existing RMHAServiceProtocol that logically implements the HAServiceProtocol. Isnt that needed for ZKFC (and other clients like the admin cli) to connect to the RM and issue transitions commands? Other protocol servers are embedded similarly in the RM. e.g. ApplicationMasterService creates the ApplicationMasterProtocol RPC server and passes itself into the RPC server for callbacks. The client side code should probably be invoked via RMProxy like the other protocols are accessed today. In a failover setup the following may not be valid. + RMHAServiceTarget() { + YarnConfiguration targetConf = new YarnConfiguration(getConf()); + haAdminServiceAddress = targetConf.getSocketAddr( + YarnConfiguration.RM_HA_ADMIN_ADDRESS, + YarnConfiguration.DEFAULT_RM_HA_ADMIN_ADDRESS, + YarnConfiguration.DEFAULT_RM_HA_ADMIN_PORT);

            Another downside is having yet another port to configure on the RM. 1 for HAProtocol and 1 for HAAdmin. Do we envisage an RM where one of them are present without the other?

            Only the RMHAAdminService listens on a port, not the RMHAProtocolService. Am I missing something here?

            Let me be more elaborate. The MockRM overrides several RM#create*() methods to create services which don't start the listening servers. Similarly, I wanted to override createRMHAProtocolService in MockRM to create an instance of RMHAProtocolService which doesn't start the HA admin listener. Now, consider the two scenarios:

            • HAAdmin functionality embedded in RMHAProtocolService
              RMHAProtocolService#serviceStart() would look like the following:
              protected void serviceStart() {
                if (haEnabled) {
                  transitionToStandby(true);
              
                  // create admin server logic
                  // start admin server
              
              
                } else {
                  transitionToActive();
                }
                super.serviceStart();
              

            and the MockRM override would look like:

            @Override
            protected void createRMHAProtocolService() {
              return new RMHAProtocolService() {
                @Override
                protected void serviceStart() {
                  if (haEnabled) {
                    transitionToStandby(true);
                  } else {
                    transitionToActive();
                  }
                  super.serviceStart();
                }
              };
            }
            

            Any changes to RMHAProtocolService#serviceStart should be replicated to MockRM.

            • RMHAAdminService implements the admin functionality and is part of RMHAProtocolService (latest patch)
              RMHAProtocolService#createRMHAAdminService() creates the admin service to use. If not null, the service is added to RMHAProtocolService and is inited/started/stopped along with it. Now, all MockRM needs to do is return null in MockRM#createRMHAAdminService.
            kkambatl Karthik Kambatla (Inactive) added a comment - Another downside is having yet another port to configure on the RM. 1 for HAProtocol and 1 for HAAdmin. Do we envisage an RM where one of them are present without the other? Only the RMHAAdminService listens on a port, not the RMHAProtocolService. Am I missing something here? Let me be more elaborate. The MockRM overrides several RM#create*() methods to create services which don't start the listening servers. Similarly, I wanted to override createRMHAProtocolService in MockRM to create an instance of RMHAProtocolService which doesn't start the HA admin listener. Now, consider the two scenarios: HAAdmin functionality embedded in RMHAProtocolService RMHAProtocolService#serviceStart() would look like the following: protected void serviceStart() { if (haEnabled) { transitionToStandby( true ); // create admin server logic // start admin server } else { transitionToActive(); } super .serviceStart(); and the MockRM override would look like: @Override protected void createRMHAProtocolService() { return new RMHAProtocolService() { @Override protected void serviceStart() { if (haEnabled) { transitionToStandby( true ); } else { transitionToActive(); } super .serviceStart(); } }; } Any changes to RMHAProtocolService#serviceStart should be replicated to MockRM. RMHAAdminService implements the admin functionality and is part of RMHAProtocolService (latest patch) RMHAProtocolService#createRMHAAdminService() creates the admin service to use. If not null, the service is added to RMHAProtocolService and is inited/started/stopped along with it. Now, all MockRM needs to do is return null in MockRM#createRMHAAdminService.
            bikassaha Bikas Saha added a comment -

            Another downside is having yet another port to configure on the RM. 1 for HAProtocol and 1 for HAAdmin. Do we envisage an RM where one of them are present without the other?

            This makes it very simple to create a RMHAProtocolService without the admin parts without having to override all its service*() calls. In the current patch, MockRM does the following.

            Sorry, didnt quite follow this.

            bikassaha Bikas Saha added a comment - Another downside is having yet another port to configure on the RM. 1 for HAProtocol and 1 for HAAdmin. Do we envisage an RM where one of them are present without the other? This makes it very simple to create a RMHAProtocolService without the admin parts without having to override all its service*() calls. In the current patch, MockRM does the following. Sorry, didnt quite follow this.

            Having a separate RMHAAdminService allows handling it through the super.service*() calls in RMHAProtocolService. This makes it very simple to create a RMHAProtocolService without the admin parts without having to override all its service*() calls. In the current patch, MockRM does the following. If it was all in one RMHAService, this will be much longer and needs to be updated everytime the actual RMHAService service* methods get updated. Also, don't see particular downside to having it as a separate service except that it is class. The RM itself still sees a single service in RMHAProtocolService.

              protected RMHAProtocolService createRMHAProtocolService() {
                return new RMHAProtocolService(this) {
                  @Override
                  protected RMHAAdminService createRMHAAdminService() {
                    return null;
                  }
                };
              }
            
            kkambatl Karthik Kambatla (Inactive) added a comment - Having a separate RMHAAdminService allows handling it through the super.service*() calls in RMHAProtocolService. This makes it very simple to create a RMHAProtocolService without the admin parts without having to override all its service*() calls. In the current patch, MockRM does the following. If it was all in one RMHAService, this will be much longer and needs to be updated everytime the actual RMHAService service* methods get updated. Also, don't see particular downside to having it as a separate service except that it is class. The RM itself still sees a single service in RMHAProtocolService. protected RMHAProtocolService createRMHAProtocolService() { return new RMHAProtocolService( this ) { @Override protected RMHAAdminService createRMHAAdminService() { return null ; } }; }
            bikassaha Bikas Saha added a comment -

            I meant the first one. Was wondering whether we should have a single RMHAService that does both the protocol and the admin parts. Seems to me that they belong to one place. So a pro/con discussion would help.

            bikassaha Bikas Saha added a comment - I meant the first one. Was wondering whether we should have a single RMHAService that does both the protocol and the admin parts. Seems to me that they belong to one place. So a pro/con discussion would help.

            What are the pros and cons of making the new service part of the existing RMHAService instead of creating another RPC endpoint?

            Sorry Bikas, I am not sure I understand the question. Can you elaborate a bit. Little background on the thought-process that went in:

            1. Should RMHAAdminService be a part of RMHAProtocolService? It can be, I just moved it to a separate service to make the code easier to understand and maintain. Can definitely merge it back.
            2. Should RMHAAdminService be a part of AdminService and use the same port instead of listening on another? I initially thought of doing that, but refrained for two reasons: (1) Better to have two listeners to address two protocols - the AdminProtocol and HAAdminProtocol. (2) AdminService itself is not an Always-On service at the moment. To move it to Always-On, we need to make it HA-aware which could potentially take longer time.
            kkambatl Karthik Kambatla (Inactive) added a comment - What are the pros and cons of making the new service part of the existing RMHAService instead of creating another RPC endpoint? Sorry Bikas, I am not sure I understand the question. Can you elaborate a bit. Little background on the thought-process that went in: Should RMHAAdminService be a part of RMHAProtocolService? It can be, I just moved it to a separate service to make the code easier to understand and maintain. Can definitely merge it back. Should RMHAAdminService be a part of AdminService and use the same port instead of listening on another? I initially thought of doing that, but refrained for two reasons: (1) Better to have two listeners to address two protocols - the AdminProtocol and HAAdminProtocol. (2) AdminService itself is not an Always-On service at the moment. To move it to Always-On, we need to make it HA-aware which could potentially take longer time.
            bikassaha Bikas Saha added a comment -

            What are the pros and cons of making the new service part of the existing RMHAService instead of creating another RPC endpoint?

            bikassaha Bikas Saha added a comment - What are the pros and cons of making the new service part of the existing RMHAService instead of creating another RPC endpoint?
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12603885/yarn-1068-4.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 1 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1960//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1960//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12603885/yarn-1068-4.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1960//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1960//console This message is automatically generated.

            Thanks Alejandro for catching it, completely missed that.

            Uploading a patch that creates RMHAAdminService only when HA is enabled.

            kkambatl Karthik Kambatla (Inactive) added a comment - Thanks Alejandro for catching it, completely missed that. Uploading a patch that creates RMHAAdminService only when HA is enabled.

            Unless I'm missing something, the HAAdmin service is always on and listening, is this correct? The always on is OK, but listening to a port, do we need it?

            tucu00 Alejandro Abdelnur added a comment - Unless I'm missing something, the HAAdmin service is always on and listening, is this correct? The always on is OK, but listening to a port, do we need it?
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12603703/yarn-1068-3.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 1 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1948//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1948//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12603703/yarn-1068-3.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1948//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1948//console This message is automatically generated.

            Updated patch moves the ipc.Server creation and start to start() and MockRM to not start the server - verified these changes fix the test failures.

            kkambatl Karthik Kambatla (Inactive) added a comment - Updated patch moves the ipc.Server creation and start to start() and MockRM to not start the server - verified these changes fix the test failures.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12603434/yarn-1068-2.patch
            against trunk revision .

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no new tests are needed for this patch.
            Also please list what manual steps were performed to verify this patch.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

            org.apache.hadoop.yarn.client.api.impl.TestYarnClient
            org.apache.hadoop.yarn.client.api.impl.TestNMClient
            org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
            org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
            org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler
            org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
            org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
            org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager
            org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
            org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
            org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
            org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched
            org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
            org.apache.hadoop.yarn.server.resourcemanager.applicationmasterservice.TestApplicationMasterService
            org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler
            org.apache.hadoop.yarn.server.resourcemanager.TestRM
            org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
            org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
            org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices
            org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens
            org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1940//testReport/
            Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1940//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12603434/yarn-1068-2.patch against trunk revision . +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestYarnClient org.apache.hadoop.yarn.client.api.impl.TestNMClient org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes org.apache.hadoop.yarn.server.resourcemanager.applicationmasterservice.TestApplicationMasterService org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1940//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1940//console This message is automatically generated.

            Patch rebased on trunk and documentation of configs.

            kkambatl Karthik Kambatla (Inactive) added a comment - Patch rebased on trunk and documentation of configs.

            Updated patch to capture the updates to YARN-1027.

            kkambatl Karthik Kambatla (Inactive) added a comment - Updated patch to capture the updates to YARN-1027 .

            That is correct - the patch depends on YARN-1027, which in turn depends on YARN-1098.

            kkambatl Karthik Kambatla (Inactive) added a comment - That is correct - the patch depends on YARN-1027 , which in turn depends on YARN-1098 .

            the patch seems to depend on another patch being applied first, which one?

            tucu00 Alejandro Abdelnur added a comment - the patch seems to depend on another patch being applied first, which one?

            I am uploading a preliminary patch adds admin support for HA operations for any feedback on the approach.

            The patch is very much along the lines of HDFS admin implementation and reuses the common code for the same. Outline:

            1. RMHAProtocolService starts an RPC server for HA commands.
            2. yarn rmhaadmin command invokes RMHAdminCLI which extends HAAdmin

            I haven't figured out how to use the ClientRMProxy while using HAAdmin yet. Would love to hear any thoughts/inputs on that.

            Pending tasks: (1) yarn-site, (2) RPC server instantiation through YARNRPC calls like in AdminService.

            kkambatl Karthik Kambatla (Inactive) added a comment - I am uploading a preliminary patch adds admin support for HA operations for any feedback on the approach. The patch is very much along the lines of HDFS admin implementation and reuses the common code for the same. Outline: RMHAProtocolService starts an RPC server for HA commands. yarn rmhaadmin command invokes RMHAdminCLI which extends HAAdmin I haven't figured out how to use the ClientRMProxy while using HAAdmin yet. Would love to hear any thoughts/inputs on that. Pending tasks: (1) yarn-site, (2) RPC server instantiation through YARNRPC calls like in AdminService.

            People

              kasha Karthik Kambatla
              kasha Karthik Kambatla
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: