Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: ha, hdfs-client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Todd pointed out this issue with the current patch posted for HDFS-1623. It seems like the solution would be to replace "dfs.ha.namenode.addresses" with something like "dfs.ha.namenode.addresses.<logical-uri>", so as to be able to support configuring the addresses of multiple HA clusters.

      1. HDFS-2367-HDFS-1623.patch
        15 kB
        Aaron T. Myers
      2. HDFS-2367-HDFS-1623.patch
        15 kB
        Aaron T. Myers

        Issue Links

          Activity

          Hide
          hudson Hudson added a comment -

          Integrated in Hadoop-Hdfs-HAbranch-build #53 (See https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/53/)
          HDFS-2367. Enable the configuration of multiple HA cluster addresses. Contributed by Aaron T. Myers.

          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1233549
          Files :

          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HAUtil.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java
          • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java
          Show
          hudson Hudson added a comment - Integrated in Hadoop-Hdfs-HAbranch-build #53 (See https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/53/ ) HDFS-2367 . Enable the configuration of multiple HA cluster addresses. Contributed by Aaron T. Myers. atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1233549 Files : /hadoop/common/branches/ HDFS-1623 /hadoop-hdfs-project/hadoop-hdfs/CHANGES. HDFS-1623 .txt /hadoop/common/branches/ HDFS-1623 /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HAUtil.java /hadoop/common/branches/ HDFS-1623 /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ConfiguredFailoverProxyProvider.java /hadoop/common/branches/ HDFS-1623 /hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java /hadoop/common/branches/ HDFS-1623 /hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java
          Hide
          atm Aaron T. Myers added a comment -

          Thanks a lot for the reviews, Todd. I've just committed this.

          Show
          atm Aaron T. Myers added a comment - Thanks a lot for the reviews, Todd. I've just committed this.
          Hide
          tlipcon Todd Lipcon added a comment -

          +1, I tested on a cluster with HBase, seems to work.

          Show
          tlipcon Todd Lipcon added a comment - +1, I tested on a cluster with HBase, seems to work.
          Hide
          atm Aaron T. Myers added a comment -

          In theory this should also support a non-federated configuration by just specifying dfs.ha.namenodes and then dfs.namenode.rpc-address.nn1, etc.. have you tested that manually?

          I talked to Todd about this, and we concluded this isn't actually true. Some of the server side configuration is implemented such that it could deal with a null value for nameServiceId, but the configuration is necessary for client side failover to be configured correctly.

          Can you either leave this TODO in there, or file a JIRA for it?

          Filed: HDFS-2811

          Show
          atm Aaron T. Myers added a comment - In theory this should also support a non-federated configuration by just specifying dfs.ha.namenodes and then dfs.namenode.rpc-address.nn1, etc.. have you tested that manually? I talked to Todd about this, and we concluded this isn't actually true. Some of the server side configuration is implemented such that it could deal with a null value for nameServiceId, but the configuration is necessary for client side failover to be configured correctly. Can you either leave this TODO in there, or file a JIRA for it? Filed: HDFS-2811
          Hide
          tlipcon Todd Lipcon added a comment -

          One other small note:

          -  // TODO(HA): This test should probably be made to fail if a client fails over
          -  // to talk to an NN with a different block pool id. Once failover between
          -  // active/standy in a single block pool is implemented, this test should be
          -  // changed to exercise that.
          

          Can you either leave this TODO in there, or file a JIRA for it?

          Show
          tlipcon Todd Lipcon added a comment - One other small note: - // TODO(HA): This test should probably be made to fail if a client fails over - // to talk to an NN with a different block pool id. Once failover between - // active/standy in a single block pool is implemented, this test should be - // changed to exercise that. Can you either leave this TODO in there, or file a JIRA for it?
          Hide
          tlipcon Todd Lipcon added a comment -

          lgtm. In theory this should also support a non-federated configuration by just specifying dfs.ha.namenodes and then dfs.namenode.rpc-address.nn1, etc.. have you tested that manually?

          Show
          tlipcon Todd Lipcon added a comment - lgtm. In theory this should also support a non-federated configuration by just specifying dfs.ha.namenodes and then dfs.namenode.rpc-address.nn1 , etc.. have you tested that manually?
          Hide
          atm Aaron T. Myers added a comment -

          Agreed. Can we break that out into a separate JIRA?

          I went ahead and filed HDFS-2805.

          Show
          atm Aaron T. Myers added a comment - Agreed. Can we break that out into a separate JIRA? I went ahead and filed HDFS-2805 .
          Hide
          atm Aaron T. Myers added a comment -

          Rather than add the new initialize API, can we just add another argument to the "conventional constructor" for the interface? ie we already require that the implementations have a constructor which takes the interface - let's just require they also take the URI? It seems reasonable to me.

          Agreed. Done.

          Perhaps we should add a test case which sets up a federated cluster, each of the nameservices being HA? I think we have the pieces in place to do it, but would be good to verify.

          Agreed. Can we break that out into a separate JIRA?

          Show
          atm Aaron T. Myers added a comment - Rather than add the new initialize API, can we just add another argument to the "conventional constructor" for the interface? ie we already require that the implementations have a constructor which takes the interface - let's just require they also take the URI? It seems reasonable to me. Agreed. Done. Perhaps we should add a test case which sets up a federated cluster, each of the nameservices being HA? I think we have the pieces in place to do it, but would be good to verify. Agreed. Can we break that out into a separate JIRA?
          Hide
          tlipcon Todd Lipcon added a comment -

          Rather than add the new initialize API, can we just add another argument to the "conventional constructor" for the interface? ie we already require that the implementations have a constructor which takes the interface - let's just require they also take the URI? It seems reasonable to me.

          Perhaps we should add a test case which sets up a federated cluster, each of the nameservices being HA? I think we have the pieces in place to do it, but would be good to verify.

          Show
          tlipcon Todd Lipcon added a comment - Rather than add the new initialize API, can we just add another argument to the "conventional constructor" for the interface? ie we already require that the implementations have a constructor which takes the interface - let's just require they also take the URI? It seems reasonable to me. Perhaps we should add a test case which sets up a federated cluster, each of the nameservices being HA? I think we have the pieces in place to do it, but would be good to verify.
          Hide
          umamaheswararao Uma Maheswara Rao G added a comment -

          Ok Aaron, I will do that. Thanks for the confirmation.

          Show
          umamaheswararao Uma Maheswara Rao G added a comment - Ok Aaron, I will do that. Thanks for the confirmation.
          Hide
          atm Aaron T. Myers added a comment -

          @Uma - ah, so you did. Can we resolve HDFS-2796 as a duplicate, then, given that this patch contains a fix for that issue?

          Show
          atm Aaron T. Myers added a comment - @Uma - ah, so you did. Can we resolve HDFS-2796 as a duplicate, then, given that this patch contains a fix for that issue?
          Hide
          umamaheswararao Uma Maheswara Rao G added a comment -

          Hi Aaron, Yestreday i found TestDFSClientFailover TODO while executing some HA tests.
          filed HDFS-2796 for TestDFSClientFailover TODO.

          Show
          umamaheswararao Uma Maheswara Rao G added a comment - Hi Aaron, Yestreday i found TestDFSClientFailover TODO while executing some HA tests. filed HDFS-2796 for TestDFSClientFailover TODO.
          Hide
          atm Aaron T. Myers added a comment -

          The new test case doesn't really test the change

          Whoops! I should have mentioned that the new test case is to test some error handling code I put in for a case when the user misconfigures ConfiguredFailoverProxyProvider.

          Show
          atm Aaron T. Myers added a comment - The new test case doesn't really test the change Whoops! I should have mentioned that the new test case is to test some error handling code I put in for a case when the user misconfigures ConfiguredFailoverProxyProvider .
          Hide
          atm Aaron T. Myers added a comment -

          Here's a patch which addresses the issue.

          The new test case doesn't really test the change, but the change is tested by most or all of the new HA tests, all of which I ran. I also tested this patch manually on an HA cluster.

          I also took the liberty of addressing a TODO while I was in TestDFSClientFailover.

          Show
          atm Aaron T. Myers added a comment - Here's a patch which addresses the issue. The new test case doesn't really test the change, but the change is tested by most or all of the new HA tests, all of which I ran. I also tested this patch manually on an HA cluster. I also took the liberty of addressing a TODO while I was in TestDFSClientFailover.
          Hide
          tlipcon Todd Lipcon added a comment -

          We now have dfs.ha.namenodes.<nameserviceId> which should be all we need - the logical URI just becomes hdfs://nameserviceId. We just need to somehow plumb that URI into the proxy provider.

          Show
          tlipcon Todd Lipcon added a comment - We now have dfs.ha.namenodes.<nameserviceId> which should be all we need - the logical URI just becomes hdfs://nameserviceId . We just need to somehow plumb that URI into the proxy provider.

            People

            • Assignee:
              atm Aaron T. Myers
              Reporter:
              atm Aaron T. Myers
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development