Details

    • Type: Sub-task Sub-task
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: datanode
    • Labels:
      None

      Description

      The Datanode should register multiple interfaces with the Namenode (who then forwards them to clients). We can do this by extending the DatanodeID, which currently just contains a single interface, to contain a list of interfaces. For compatibility, the DatanodeID method to get the DN address for data transfer should remain unchanged (multiple interfaces are only used where the client explicitly takes advantage of them).

      By default, if the Datanode binds on all interfaces (via using the wildcard in the dfs*address configuration) all interfaces are exposed, modulo ones like the loopback that should never be exposed. Alternatively, a new configuration parameter (dfs.datanode.available.interfaces) allows the set of interfaces can be specified explicitly in case the user only wants to expose a subset. If the new default behavior is too disruptive we could default dfs.datanode.available.interfaces to be the IP of the IPC interface which is the only interface exposed today (per HADOOP-6867, only the port from dfs.datanode.address is used today).

      The interfaces can be specified by name (eg "eth0"), subinterface name (eg "eth0:0"), or IP address. The IP address can be specified by range using CIDR notation so the configuration values are portable.

      1. hdfs-3146.txt
        15 kB
        Eli Collins

        Activity

        Hide
        Eli Collins added a comment -

        Patch attached.

        Show
        Eli Collins added a comment - Patch attached.
        Hide
        Todd Lipcon added a comment -
        +  public static InetSocketAddress[] getInterfaceAddrs(
        +      String interfaceNames[], int port) throws UnknownHostException {
        

        Sorry I missed this in the earlier review of this function, but I think it would be better to call the parameter something like interfaceSpecs – because each one may specify an interface name, an IP address, or a subnet.

        In the same function, for the subnet case, you're using port 0 instead of the specified port. Looks like a mistake?


        +      LOG.warn("Invalid address given " + addrString);
        

        Nit: add a ':' to the log message


        +    // If the datanode registered with an address we can't use
        +    // then use the address the IPC came in on instead
        +    if (NetUtils.isWildcardOrLoopback(nodeReg.getIpAddr())) {
        

        I found this comment a little unclear. Under what circumstance would the DN pass a loopback or wildcard IP? Aren't they filtered on the DN side? I think this should be at least a WARN, or maybe even throw an exception to disallow the registration.

        Edit: I got to the part later in the patch where the DN potentially sends a wildcard to the NN. I think it might be simplier to have the DN send an empty list to the NN if it's bound to wildcard – and adjust the comment here to explain why it would be registering with no addresses.


        +      // TODO: haven't determined the port yet, using default
        

        Are you planning another patch to fix this on the branch before merging? What's the backward-compatibility path with the existing configurations for bind address, etc, where the port's specified? We should be clear about which takes precedence, and throw errors on startup if both are configured, I think?

        Maybe it makes sense to change these to just be InetAddress instead of InetSocketAddress, and never fill in a port there?

        This patch should add the new config to hdfs-default, and edit the existing config's documentation to explain how the two interact.


        +      if (0 != interfaceStrs.length) {
        +        LOG.info("Using interfaces [" +
        +        Joiner.on(',').join(interfaceStrs)+ "] with addresses [" +
        +        Joiner.on(',').join(interfaceAddrs) + "]");
        +      }
        
        • need indentation for the joiner lines
        • add comment explaining how this eventually gets filled in, if it's empty?

        +   * @param addrs socket addresses to convert
        +   * @return an array of strings of IPs for the given addresses
        +   */
        +  public static String[] toIpAddrStrings(InetSocketAddress[] addrs) {
        

        javadoc should specify that ports aren't included in the stringification of addresses

        Show
        Todd Lipcon added a comment - + public static InetSocketAddress[] getInterfaceAddrs( + String interfaceNames[], int port) throws UnknownHostException { Sorry I missed this in the earlier review of this function, but I think it would be better to call the parameter something like interfaceSpecs – because each one may specify an interface name, an IP address, or a subnet. In the same function, for the subnet case, you're using port 0 instead of the specified port. Looks like a mistake? + LOG.warn( "Invalid address given " + addrString); Nit: add a ':' to the log message + // If the datanode registered with an address we can't use + // then use the address the IPC came in on instead + if (NetUtils.isWildcardOrLoopback(nodeReg.getIpAddr())) { I found this comment a little unclear. Under what circumstance would the DN pass a loopback or wildcard IP? Aren't they filtered on the DN side? I think this should be at least a WARN, or maybe even throw an exception to disallow the registration. Edit: I got to the part later in the patch where the DN potentially sends a wildcard to the NN. I think it might be simplier to have the DN send an empty list to the NN if it's bound to wildcard – and adjust the comment here to explain why it would be registering with no addresses. + // TODO: haven't determined the port yet, using default Are you planning another patch to fix this on the branch before merging? What's the backward-compatibility path with the existing configurations for bind address, etc, where the port's specified? We should be clear about which takes precedence, and throw errors on startup if both are configured, I think? Maybe it makes sense to change these to just be InetAddress instead of InetSocketAddress, and never fill in a port there? This patch should add the new config to hdfs-default, and edit the existing config's documentation to explain how the two interact. + if (0 != interfaceStrs.length) { + LOG.info( "Using interfaces [" + + Joiner.on(',').join(interfaceStrs)+ "] with addresses [" + + Joiner.on(',').join(interfaceAddrs) + "]" ); + } need indentation for the joiner lines add comment explaining how this eventually gets filled in, if it's empty? + * @param addrs socket addresses to convert + * @ return an array of strings of IPs for the given addresses + */ + public static String [] toIpAddrStrings(InetSocketAddress[] addrs) { javadoc should specify that ports aren't included in the stringification of addresses

          People

          • Assignee:
            Eli Collins
            Reporter:
            Eli Collins
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:

              Development