Details

      Description

      HDFS-2231 started the process of adding configuration for HA, but one piece is missing. The current state of the configuration is, I believe:
      dfs.ha.namenodes - a list of identifiers for HA namenodes
      dfs.federation.nameservices - a list of federated nameservices
      dfs.namenode.rpc-address[.nameservice-id][.namenode-id] - some specific config for the given namenode. If HA or federation is disabled, the extra components can be elided for backwards compatibility.

      The issue here is that there is no easy way to discern which NN is paired with which other NN. Additionally, adding a new federated nameservice to a config will require changes to dfs.ha.namenodes which makes templating harder. It would be simpler to change dfs.ha.namenodes to be nameservice-scoped: dfs.ha.namenodes.<nameservice-id>.

      1. hdfs-2582.txt
        47 kB
        Todd Lipcon
      2. hdfs-2582-v2.txt
        49 kB
        Todd Lipcon
      3. hdfs-2582-v3.txt
        49 kB
        Todd Lipcon
      4. hdfs-2582-v4.txt
        49 kB
        Todd Lipcon

        Activity

        Hide
        tlipcon Todd Lipcon added a comment -

        Example config under the proposed scheme, for a DN which is part of two federated namespaces, each of which is HA-enabled:

        dfs.federation.nameservices=namespace1,namespace2
        dfs.ha.namenodes.namespace1=ns1-nn1,ns1-nn2
        dfs.ha.namenodes.namespace2=ns2-nn1,ns2-nn2
        dfs.namenode.http-address.namespace1.ns1-nn1=node1.example.com:50070
        

        Not entirely clear still: do namenode IDs need to be unique across federated nameservices? If they do, then why do we need the namespace1 component in the dfs.namenode.http-address... config?

        Show
        tlipcon Todd Lipcon added a comment - Example config under the proposed scheme, for a DN which is part of two federated namespaces, each of which is HA-enabled: dfs.federation.nameservices=namespace1,namespace2 dfs.ha.namenodes.namespace1=ns1-nn1,ns1-nn2 dfs.ha.namenodes.namespace2=ns2-nn1,ns2-nn2 dfs.namenode.http-address.namespace1.ns1-nn1=node1.example.com:50070 Not entirely clear still: do namenode IDs need to be unique across federated nameservices? If they do, then why do we need the namespace1 component in the dfs.namenode.http-address... config?
        Hide
        tlipcon Todd Lipcon added a comment -

        Updated description to clarify terminology ("nameservice" instead of "namespace")

        Show
        tlipcon Todd Lipcon added a comment - Updated description to clarify terminology ("nameservice" instead of "namespace")
        Hide
        eli Eli Collins added a comment -

        You're proposed scheme makes sense to me. Perhaps the namespace was left in the config so each namespace could have its own web UI at some point?

        Show
        eli Eli Collins added a comment - You're proposed scheme makes sense to me. Perhaps the namespace was left in the config so each namespace could have its own web UI at some point?
        Hide
        tlipcon Todd Lipcon added a comment -

        Here's a patch implementing the proposal.

        There's one TODO in there for fixing TestGetConf - I'm working on fixing that now.

        I haven't run all of the unit tests yet, but I did run some of the federation-related tests as well as some of the HA-related tests and they pass. I also ran a selection of general HDFS tests and they pass.

        The patch does introduce some more TODOs in places like the balancer, where we are currently doing some custom RPC connections to NNs – will file a followup JIRA for that when this is committed.

        Show
        tlipcon Todd Lipcon added a comment - Here's a patch implementing the proposal. There's one TODO in there for fixing TestGetConf - I'm working on fixing that now. I haven't run all of the unit tests yet, but I did run some of the federation-related tests as well as some of the HA-related tests and they pass. I also ran a selection of general HDFS tests and they pass. The patch does introduce some more TODOs in places like the balancer, where we are currently doing some custom RPC connections to NNs – will file a followup JIRA for that when this is committed.
        Hide
        tlipcon Todd Lipcon added a comment -

        v2 addresses GetConf

        Show
        tlipcon Todd Lipcon added a comment - v2 addresses GetConf
        Hide
        tlipcon Todd Lipcon added a comment -

        v3 addresses a small NPE I discovered building HDFS-2591 on top of this patch.

        Show
        tlipcon Todd Lipcon added a comment - v3 addresses a small NPE I discovered building HDFS-2591 on top of this patch.
        Hide
        eli Eli Collins added a comment -

        lgtm, +1 with the follow-on jiras.

        The code that uses the new map (Map<ns id, Map<nn id, addr>>) is a little unwieldy since we don't have types for the ns and nn ids. Given that this is a tiny map the readability might be worth the overhead of a class (eg in the same vain of ConfiguredNNAddress). Don't feel strongly so feel free to punt if you disagree.

        Show
        eli Eli Collins added a comment - lgtm, +1 with the follow-on jiras. The code that uses the new map (Map<ns id, Map<nn id, addr>>) is a little unwieldy since we don't have types for the ns and nn ids. Given that this is a tiny map the readability might be worth the overhead of a class (eg in the same vain of ConfiguredNNAddress). Don't feel strongly so feel free to punt if you disagree.
        Hide
        tlipcon Todd Lipcon added a comment -

        Here's v4, which I built a few more patches on top. One more slight fix to the nnId detection stuff, nothing major changed.

        I agree the map<map<..>> is kind of clunky, but I'd like to punt to a follow-on to clean this up, since I have some more patches built on top.

        I'll commit this soon based on Eli's prior +1 unless there are any other comments.

        Show
        tlipcon Todd Lipcon added a comment - Here's v4, which I built a few more patches on top. One more slight fix to the nnId detection stuff, nothing major changed. I agree the map<map<..>> is kind of clunky, but I'd like to punt to a follow-on to clean this up, since I have some more patches built on top. I'll commit this soon based on Eli's prior +1 unless there are any other comments.
        Hide
        eli Eli Collins added a comment -

        +1 to v4, the fix in the delta lgtm

        Show
        eli Eli Collins added a comment - +1 to v4, the fix in the delta lgtm
        Hide
        tlipcon Todd Lipcon added a comment -

        Committed to HDFS-1623 branch. Thanks for the reviews, Eli. If there are further review comments I'll be happy to address post-commit.

        Show
        tlipcon Todd Lipcon added a comment - Committed to HDFS-1623 branch. Thanks for the reviews, Eli. If there are further review comments I'll be happy to address post-commit.

          People

          • Assignee:
            tlipcon Todd Lipcon
            Reporter:
            tlipcon Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development