Description
This patch allows network configuration parameters to be added to the hadoop-site.xml file. These parameters specify a network interface name and an optional nameserver hostname, which DataNodes and TaskTrackers use to resolve their hostnames from the IP address bound to the specified network interface.
This is useful when machines on different physical or logical networks need to participate in Hadoop clusters as client nodes. The hostname and IP address reported by InetAddress.getLocalHost() are not necessarily the ones through which the JobTracker and NameNode can reach the clients, nor necessarily the ones through which DFS clients can reach the DataNodes.
The configuration parameters are:
- cluster.report.nif
- cluster.report.ns
cluster.report.nif: the name of a network interface, such as en0 or en1 (on Macs) or eth0.
cluster.report.ns: the hostname of the DNS server to use when resolving the IP address bound to the interface given by cluster.report.nif.
Both parameters are set to the value "default" by default, which preserves the current behavior of reporting InetAddress.getLocalHost().getHostName() and getHostAddress().
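The lookup described above can be sketched roughly as follows. This is not the code from the patch: the class name ReportedHostSketch and its methods are hypothetical, it uses only the JDK, and it assumes IPv4. It also goes through the system resolver, so querying a specific nameserver (what cluster.report.ns and dnsjava are for) is left out.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of the patch.
class ReportedHostSketch {

    // Builds the reverse-lookup (PTR) name for a dotted IPv4 address.
    // A resolver queries this name to map an IP back to a hostname.
    static String reverseName(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        return parts[3] + "." + parts[2] + "." + parts[1] + "." + parts[0]
                + ".in-addr.arpa.";
    }

    // Returns the IP addresses bound to the named interface, or the
    // address of InetAddress.getLocalHost() when nif is "default".
    static List<String> addressesOf(String nif)
            throws SocketException, UnknownHostException {
        List<String> ips = new ArrayList<>();
        if ("default".equals(nif)) {
            ips.add(InetAddress.getLocalHost().getHostAddress());
            return ips;
        }
        NetworkInterface iface = NetworkInterface.getByName(nif);
        if (iface == null) {
            throw new UnknownHostException("No such interface: " + nif);
        }
        for (InetAddress addr : Collections.list(iface.getInetAddresses())) {
            ips.add(addr.getHostAddress());
        }
        return ips;
    }

    // Resolves the hostname to report for the given interface, using the
    // system resolver (assumption: no per-query nameserver override).
    static String hostnameFor(String nif)
            throws SocketException, UnknownHostException {
        if ("default".equals(nif)) {
            return InetAddress.getLocalHost().getHostName();
        }
        List<String> ips = addressesOf(nif);
        if (ips.isEmpty()) {
            throw new UnknownHostException("No address bound to " + nif);
        }
        // getCanonicalHostName() does a reverse lookup and falls back to
        // the textual IP if no name can be found.
        return InetAddress.getByName(ips.get(0)).getCanonicalHostName();
    }
}
```

With such a helper, a DataNode or TaskTracker would call hostnameFor("eth0") instead of InetAddress.getLocalHost().getHostName(), while hostnameFor("default") keeps the existing behavior.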
As part of the patch, a new library, dnsjava, was added along with its license information (BSD license). The affected files are:
src
  org.apache.hadoop.dfs.DataNode
  org.apache.hadoop.mapred.TaskTracker
  org.apache.hadoop.util.NetworkUtils
conf
  hadoop-default.xml
lib
  dnsjava-2.0.2.jar
  dnsjava-2.0.2.LICENSE.txt