Details

      Description

      The balancer currently interacts directly with namenode InetSocketAddresses and makes its own IPC proxies. We need to integrate it with HA so that it uses the same client failover infrastructure.
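To make the goal concrete, here is a minimal, self-contained Java sketch of what putting failover behind a shared proxy can look like. It is not the code from the attached patches and uses no Hadoop classes: `BlockSource`, `TargetUnavailableException`, `FailoverHandler`, and `FailoverDemo` are hypothetical names invented for illustration, and the policy shown (switch to the next target immediately, up to a fixed number of failovers) is an assumption.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an RPC interface that a tool like the balancer calls.
interface BlockSource {
    String getBlocks(String datanode);
}

// Hypothetical signal that the contacted target cannot serve the request.
class TargetUnavailableException extends RuntimeException {
    TargetUnavailableException(String message) {
        super(message);
    }
}

// Forwards each call to the current target and moves to the next target on failure.
final class FailoverHandler<T> implements InvocationHandler {
    private final List<T> targets;
    private final int maxFailovers;
    private int current = 0;

    private FailoverHandler(List<T> targets, int maxFailovers) {
        this.targets = targets;
        this.maxFailovers = maxFailovers;
    }

    @Override
    public synchronized Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        int failovers = 0;
        while (true) {
            try {
                return method.invoke(targets.get(current), args);
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                boolean retriable = cause instanceof TargetUnavailableException;
                if (!retriable || failovers >= maxFailovers) {
                    throw cause; // give up: not a failover case, or budget exhausted
                }
                failovers++;
                current = (current + 1) % targets.size(); // fail over immediately
            }
        }
    }

    // Callers receive a plain T and never deal with individual target addresses.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, List<T> targets, int maxFailovers) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                new FailoverHandler<>(targets, maxFailovers));
    }
}

final class FailoverDemo {
    public static void main(String[] args) {
        BlockSource standby = dn -> {
            throw new TargetUnavailableException("standby cannot serve reads");
        };
        BlockSource active = dn -> "blocks for " + dn;
        BlockSource proxy =
                FailoverHandler.wrap(BlockSource.class, Arrays.asList(standby, active), 1);
        System.out.println(proxy.getBlocks("dn1"));
    }
}
```

The point of the design is that the caller holds one proxy object and the retry decision lives in a single handler, so every tool that goes through the proxy gets the same failover behavior.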

      1. HDFS-2592.patch
        17 kB
        Uma Maheswara Rao G
      2. HDFS-2592.patch
        18 kB
        Uma Maheswara Rao G
      3. HDFS-2592.patch
        20 kB
        Uma Maheswara Rao G
      4. HDFS-2592.patch
        23 kB
        Uma Maheswara Rao G

          Activity

          tlipcon Todd Lipcon added a comment -

          Hey Uma. Any progress on this? Would be nice to have the initial HA release support the balancer.

          umamaheswararao Uma Maheswara Rao G added a comment -

          Hi Todd, due to some urgent issues and travel, I have not been able to concentrate much on my JIRAs over the last few days.
          I will prepare the patch soon.
          When are we planning the initial HA release?

          tlipcon Todd Lipcon added a comment -

          My hope is to propose a merge within the next week or so - most of the pieces for manual failover are done. The initial release would of course be considered alpha.

          umamaheswararao Uma Maheswara Rao G added a comment -

          Balancer HA support is complete for ClientProtocol; support for the NameNodeProtocol API used in the Balancer is still pending. Currently ConfiguredFailoverProxyProvider supports ClientProtocol. Filed HDFS-2767 for NameNodeProtocol support.

          atm Aaron T. Myers added a comment -

          Hey Uma, can you post the patch you have to make the Balancer work with ClientProtocol? I'd just like to see the approach.

          umamaheswararao Uma Maheswara Rao G added a comment -

          OK Aaron, I will upload the initial patch today.

          umamaheswararao Uma Maheswara Rao G added a comment -

          Updated the patch!

          tlipcon Todd Lipcon added a comment -

          Uma, do you mind if I take this over to finish up your patch? I was planning on working on HDFS-2767 which is closely related.

          umamaheswararao Uma Maheswara Rao G added a comment -

          Hi Todd, thanks for your attention to this issue. Actually, I started work on HDFS-2767 as part of this issue as well, and I have just posted that work in HDFS-2767. With that change, the Balancer can now work with failover.

          2012-01-11 06:48:43,791 INFO  balancer.Balancer (Balancer.java:run(1390)) - p         = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
          Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
          2012-01-11 06:48:43,891 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(105)) - Exception while invoking create of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
          ............
          ...........
          2012-01-11 06:48:58,857 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(105)) - Exception while invoking getBlocks of class org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
          
          umamaheswararao Uma Maheswara Rao G added a comment -

          This patch requires HDFS-2767 to be applied first.

          tlipcon Todd Lipcon added a comment -

          This looks fairly reasonable. A few items:

          • Is it possible to move that new code out of the NameNodeConnector constructor into a static method in DFSUtil or even DFSClient?
          • Rather than duplicating the code to parse the maxFailoverAttempts, failoverBaseSleepMillis, etc, can we reuse some of the code that's in DFSClient? If we move the connection code into a static method in DFSClient, then we can instantiate a DFSClient.Conf and pull out the variables from there, for example.
          • Some lines in the new test code are too long.
          • The new test is mostly code duplicated from TestBalancer. Is it possible to reuse more of the code by refactoring into static methods, etc?
          • Similarly, much of the setup code is duplicated from HAUtil.configureFailoverFs. Can you just call that function, then grab the conf from the resulting filesystem, or refactor that method so you can reuse the configuration generating code?
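
As an illustration of the first two review items, the sketch below parses the failover settings in exactly one place and builds the connector through a static factory rather than inside a constructor. It is a hedged example, not the patch: `FailoverSettings`, `ConnectorSketch`, the configuration keys, and the default values are all hypothetical, and a plain `Map` stands in for a Hadoop configuration object.

```java
import java.util.Map;

// Hypothetical holder for failover tuning values, parsed in a single place.
final class FailoverSettings {
    // Illustrative keys and defaults only; these are not real Hadoop configuration names.
    static final String MAX_ATTEMPTS_KEY = "example.failover.max.attempts";
    static final String BASE_SLEEP_KEY = "example.failover.sleep.base.millis";
    static final int DEFAULT_MAX_ATTEMPTS = 15;
    static final long DEFAULT_BASE_SLEEP_MILLIS = 500L;

    final int maxFailoverAttempts;
    final long failoverBaseSleepMillis;

    private FailoverSettings(int maxFailoverAttempts, long failoverBaseSleepMillis) {
        this.maxFailoverAttempts = maxFailoverAttempts;
        this.failoverBaseSleepMillis = failoverBaseSleepMillis;
    }

    // The only code that knows the key names and defaults.
    static FailoverSettings fromConf(Map<String, String> conf) {
        int attempts = Integer.parseInt(
                conf.getOrDefault(MAX_ATTEMPTS_KEY, String.valueOf(DEFAULT_MAX_ATTEMPTS)));
        long sleep = Long.parseLong(
                conf.getOrDefault(BASE_SLEEP_KEY, String.valueOf(DEFAULT_BASE_SLEEP_MILLIS)));
        return new FailoverSettings(attempts, sleep);
    }
}

// Hypothetical connector that reuses the shared settings instead of re-parsing keys.
final class ConnectorSketch {
    private final FailoverSettings settings;

    private ConnectorSketch(FailoverSettings settings) {
        this.settings = settings;
    }

    // Static factory: connection setup lives here, not in the constructor body.
    static ConnectorSketch create(Map<String, String> conf) {
        return new ConnectorSketch(FailoverSettings.fromConf(conf));
    }

    String describe() {
        return "maxFailoverAttempts=" + settings.maxFailoverAttempts
                + ", failoverBaseSleepMillis=" + settings.failoverBaseSleepMillis;
    }
}
```

Any other caller that needs the same values can call `FailoverSettings.fromConf` as well, so the key names and defaults are never copied into a second class.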
          umamaheswararao Uma Maheswara Rao G added a comment -

          Thanks a lot again, Todd. I will address all your comments in the next patch. In fact, I have already started the refactoring, mainly to avoid the duplication. I was waiting for initial feedback on the approach.

          Thanks
          Uma

          umamaheswararao Uma Maheswara Rao G added a comment -

          Updated the patch to reflect HDFS-2767.

          umamaheswararao Uma Maheswara Rao G added a comment -

          Todd, the patch addresses your comments and is ready for review. Thanks.

          tlipcon Todd Lipcon added a comment -

          Looks mostly good. One small nit - can you add some javadoc to HATestUtil.setFailoverConfigurations? Also, the patch needs to be updated to apply on the current branch. Thanks Uma!

          umamaheswararao Uma Maheswara Rao G added a comment -

          Thank you very much Todd, for the nice reviews!
          Updated the patch.

          tlipcon Todd Lipcon added a comment -

          +1, will commit momentarily.

          hudson Hudson added a comment -

          Integrated in Hadoop-Hdfs-HAbranch-build #51 (See https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/51/)
          HDFS-2592. Balancer support for HA namenodes. Contributed by Uma Maheswara Rao G.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1232531
          Files :

          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java

            People

            • Assignee:
              umamaheswararao Uma Maheswara Rao G
              Reporter:
              tlipcon Todd Lipcon
            • Votes:
              0
              Watchers:
              7
