Description
In HDP 3, there will be support for multiple NameNode clusters (HDFS Federation), qualified by distinct namespaces, in a single Hadoop cluster.
Knox needs to provide the ability to distinguish these NN clusters in its topologies for proxying WEBHDFS; the most practical means is a service parameter.
The behavior for the following needs to be defined:
- Given multiple NN clusters and a descriptor that does not specify a namespace, which NN cluster will be chosen at discovery time?
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html#Configuration:
It seems the following config properties are available for federated namenode deployments:
- core-site.xml: fs.defaultFS
The value is the default HDFS nameservice URL (e.g., hdfs://ns1) or the default namenode endpoint (e.g., hdfs://hostname:8020)
- hdfs-site.xml: dfs.nameservices
Comma-delimited list of nameservices (e.g., "ns1,ns2")
- hdfs-site.xml: dfs.ha.namenodes.NAMESERVICE (e.g., dfs.ha.namenodes.ns1=nn1,nn3)
Identifies the namenodes associated with a given nameservice
- hdfs-site.xml: dfs.namenode.http-address.NAMESERVICE.NODENAME
Property name pattern for the actual HTTP endpoint addresses
(e.g., dfs.namenode.http-address.ns1.nn1, dfs.namenode.http-address.ns1.nn3, dfs.namenode.http-address.ns2.nn4, dfs.namenode.http-address.ns2.nn7)
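
To illustrate how the properties above fit together, here is a minimal Python sketch of how discovery could resolve the NameNode HTTP endpoints for a given nameservice. It is not Knox code. The function names are hypothetical, the configs are assumed to be already loaded into plain dicts, and falling back to fs.defaultFS when the descriptor names no namespace is only one possible answer to the open question above, not defined behavior.

```python
from urllib.parse import urlparse


def default_nameservice(core_site, hdfs_site):
    """Return the nameservice named by fs.defaultFS, or None.

    None is returned when fs.defaultFS points at a plain namenode endpoint
    (e.g. hdfs://hostname:8020) rather than a declared nameservice.
    """
    declared = [ns.strip()
                for ns in hdfs_site.get("dfs.nameservices", "").split(",")
                if ns.strip()]
    authority = urlparse(core_site.get("fs.defaultFS", "")).netloc
    return authority if authority in declared else None


def namenode_http_addresses(hdfs_site, nameservice):
    """Map each namenode ID of a nameservice to its configured HTTP address."""
    node_ids = hdfs_site.get("dfs.ha.namenodes." + nameservice, "")
    addresses = {}
    for node in (n.strip() for n in node_ids.split(",")):
        if not node:
            continue
        key = "dfs.namenode.http-address.%s.%s" % (nameservice, node)
        address = hdfs_site.get(key)
        if address:
            addresses[node] = address
    return addresses


# Sample values: nameservice and node names follow the examples above;
# the host:port values are made up for illustration.
core_site = {"fs.defaultFS": "hdfs://ns1"}
hdfs_site = {
    "dfs.nameservices": "ns1,ns2",
    "dfs.ha.namenodes.ns1": "nn1,nn3",
    "dfs.ha.namenodes.ns2": "nn4,nn7",
    "dfs.namenode.http-address.ns1.nn1": "host1.example.com:50070",
    "dfs.namenode.http-address.ns1.nn3": "host3.example.com:50070",
    "dfs.namenode.http-address.ns2.nn4": "host4.example.com:50070",
    "dfs.namenode.http-address.ns2.nn7": "host7.example.com:50070",
}

# Assumed fallback: use the descriptor's namespace if present, else fs.defaultFS.
requested = None  # stands in for a descriptor that names no namespace
nameservice = requested or default_nameservice(core_site, hdfs_site)
print(nameservice, namenode_http_addresses(hdfs_site, nameservice))
```

With a descriptor that does name a namespace (for example ns2), the same lookup would return the nn4 and nn7 addresses instead, which is the distinction the proposed service parameter would need to carry.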