In DataNodeWebHdfsMethods, the code creates a DFSClient to connect to the NN, so that it can access the files in the cluster. DataNodeWebHdfsMethods relies on the address passed in the URL to locate the NN. This implementation has two problems:
- The DFSClient only knows about the currently active NN, so it cannot fail over.
- The delegation token is based on the active NN, so the DFSClient will fail to authenticate to the standby NN in a secure HA setup.
Currently the parameter namenoderpcaddress in the URL stores the host:port pair of the active NN. To fix this bug, this jira proposes storing the name service ID in that parameter in an HA setup. In a non-HA setup the parameter is unchanged and continues to carry the NN's RPC address.
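
The proposal can be illustrated with a minimal sketch. This is not the actual patch: the class NamenodeParamSketch and both helper methods are hypothetical, and it uses only java.net.URI rather than any Hadoop class. It assumes the NN picks the parameter value when it builds the redirect URL, and that the DN turns that value into an hdfs:// URI for the DFSClient.

```java
import java.net.URI;

// Hypothetical illustration only; none of these names come from the Hadoop code base.
public class NamenodeParamSketch {

    // NN side: choose the value to place in the namenoderpcaddress parameter.
    // In an HA setup this is the name service ID; otherwise it is the NN RPC address.
    static String paramValue(boolean haEnabled, String nameServiceId, String host, int port) {
        return haEnabled ? nameServiceId : host + ":" + port;
    }

    // DN side: turn the parameter value into the URI handed to the client.
    // A name service ID yields a logical URI with no port, which leaves
    // resolution of the active NN (and failover) to the client.
    static URI toNamenodeUri(String paramValue) {
        return URI.create("hdfs://" + paramValue);
    }

    public static void main(String[] args) {
        String ha = paramValue(true, "mycluster", "nn1.example.com", 8020);
        String nonHa = paramValue(false, "mycluster", "nn1.example.com", 8020);
        System.out.println(toNamenodeUri(ha));
        System.out.println(toNamenodeUri(nonHa));
    }
}
```

In the HA case the resulting URI names the name service rather than one NN, so neither the connection nor the delegation token lookup is tied to a single NN address.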