Details
Type: Improvement
Status: Patch Available
Priority: Minor
Resolution: Unresolved
1.0.0
None
None
Description
HBASE-14280 introduced a fix for bulk load failures that occur when referring to a remote cluster's name service ID while bulk loading from an HA cluster.
The HBASE-14280 fix in FSHDFSUtils.getNNAddresses was to invoke DFSUtil.getNNServiceRpcAddressesForCluster instead of DFSUtil.getNNServiceRpcAddresses. This works for Hadoop 2.6 and above.
The proposed change here is to use DFSUtil.getRpcAddressesForNameserviceId instead, which already returns only the addresses for the specified nameservice. It has been available since Hadoop 2.4.
Sample proposal for FSHDFSUtils.getNNAddresses:
...
String nameServiceId = serviceName.split(":")[1];
if (dfsUtilClazz == null) {
  dfsUtilClazz = Class.forName("org.apache.hadoop.hdfs.DFSUtil");
}
if (getNNAddressesMethod == null) {
  getNNAddressesMethod = dfsUtilClazz.getMethod("getRpcAddressesForNameserviceId",
      Configuration.class, String.class, String.class);
}
Map<String, InetSocketAddress> nnMap = (Map<String, InetSocketAddress>) getNNAddressesMethod
    .invoke(null, conf, nameServiceId, null);
for (Map.Entry<String, InetSocketAddress> e2 : nnMap.entrySet()) {
  InetSocketAddress addr = e2.getValue();
  addresses.add(addr);
}
...
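The reflective lookup above can be tried without a Hadoop dependency using the rough sketch below. StandInDfsUtil is a hypothetical stand-in for org.apache.hadoop.hdfs.DFSUtil: only the method name matches the proposal, while the Map-based configuration and the "rpc-address.<ns>.<nn>" keys are simplifications made for illustration. The guard for a service name without a colon and the catch of ReflectiveOperationException are assumptions and not part of the proposed patch.

```java
import java.lang.reflect.Method;
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for DFSUtil; only the method name mirrors the proposal. */
class StandInDfsUtil {
  public static Map<String, InetSocketAddress> getRpcAddressesForNameserviceId(
      Map<String, String> conf, String nsId, String defaultValue) {
    Map<String, InetSocketAddress> result = new HashMap<>();
    String prefix = "rpc-address." + nsId + ".";  // simplified key layout (assumption)
    for (Map.Entry<String, String> e : conf.entrySet()) {
      if (e.getKey().startsWith(prefix)) {
        String[] hostPort = e.getValue().split(":");
        result.put(e.getKey().substring(prefix.length()),
            InetSocketAddress.createUnresolved(hostPort[0], Integer.parseInt(hostPort[1])));
      }
    }
    return result;
  }
}

class ReflectiveLookupSketch {
  // Real code would use the literal class name so there is no compile-time dependency.
  private static final String UTIL_CLASS = StandInDfsUtil.class.getName();
  private static Class<?> utilClazz;         // cached after the first lookup
  private static Method getAddressesMethod;  // cached after the first lookup

  static Set<InetSocketAddress> getNNAddresses(String serviceName, Map<String, String> conf) {
    Set<InetSocketAddress> addresses = new HashSet<>();
    String[] parts = serviceName.split(":");
    if (parts.length < 2) {
      return addresses;  // no nameservice id in the service name (assumed guard)
    }
    String nameServiceId = parts[1];
    try {
      if (utilClazz == null) {
        utilClazz = Class.forName(UTIL_CLASS);
      }
      if (getAddressesMethod == null) {
        getAddressesMethod = utilClazz.getMethod("getRpcAddressesForNameserviceId",
            Map.class, String.class, String.class);
      }
      @SuppressWarnings("unchecked")
      Map<String, InetSocketAddress> nnMap =
          (Map<String, InetSocketAddress>) getAddressesMethod.invoke(null, conf, nameServiceId, null);
      addresses.addAll(nnMap.values());
    } catch (ReflectiveOperationException e) {
      // Class or method unavailable: fall through with an empty set (assumed behavior).
    }
    return addresses;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("rpc-address.ns1.nn1", "host1:8020");
    conf.put("rpc-address.ns2.nn1", "host3:8020");
    System.out.println(getNNAddresses("ha-hdfs:ns1", conf));
  }
}
```

Caching the Class and Method objects avoids repeating the reflective lookup on every call, which is the same reason the sample keeps dfsUtilClazz and getNNAddressesMethod in fields.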
I will also add test conditions for FSHDFSUtils.isSameHdfs to verify the scenario where multiple name service IDs are defined.
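The multiple-nameservice scenario could look roughly like the sketch below. It uses plain maps in place of Hadoop Configuration objects, keyed by the standard HDFS HA properties (dfs.nameservices, dfs.ha.namenodes.<ns>, dfs.namenode.rpc-address.<ns>.<nn>). The class, the helper names, the host names and the "shares at least one NameNode address" comparison are all hypothetical; an actual test would call FSHDFSUtils.isSameHdfs directly.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MultiNameserviceScenarioSketch {

  /** Collects the RPC addresses configured for one nameservice only. */
  static Set<String> rpcAddresses(Map<String, String> conf, String nsId) {
    Set<String> out = new HashSet<>();
    String nnIds = conf.get("dfs.ha.namenodes." + nsId);
    if (nnIds == null) {
      return out;  // nameservice not defined in this configuration
    }
    for (String nnId : nnIds.split(",")) {
      String addr = conf.get("dfs.namenode.rpc-address." + nsId + "." + nnId.trim());
      if (addr != null) {
        out.add(addr);
      }
    }
    return out;
  }

  /** Assumed comparison: two nameservices match if they share a NameNode address. */
  static boolean sharesNameNode(Map<String, String> srcConf, String srcNs,
      Map<String, String> dstConf, String dstNs) {
    Set<String> common = rpcAddresses(srcConf, srcNs);
    common.retainAll(rpcAddresses(dstConf, dstNs));
    return !common.isEmpty();
  }

  /** Configuration defining two nameservices, as in the scenario to be tested. */
  static Map<String, String> twoNameserviceConf() {
    Map<String, String> conf = new HashMap<>();
    conf.put("dfs.nameservices", "ns1,ns2");
    conf.put("dfs.ha.namenodes.ns1", "nn1,nn2");
    conf.put("dfs.namenode.rpc-address.ns1.nn1", "host1:8020");
    conf.put("dfs.namenode.rpc-address.ns1.nn2", "host2:8020");
    conf.put("dfs.ha.namenodes.ns2", "nn1,nn2");
    conf.put("dfs.namenode.rpc-address.ns2.nn1", "host3:8020");
    conf.put("dfs.namenode.rpc-address.ns2.nn2", "host4:8020");
    return conf;
  }

  public static void main(String[] args) {
    Map<String, String> conf = twoNameserviceConf();
    System.out.println("ns1 vs ns1: " + sharesNameNode(conf, "ns1", conf, "ns1"));
    System.out.println("ns1 vs ns2: " + sharesNameNode(conf, "ns1", conf, "ns2"));
  }
}
```

The point of the scenario is that a lookup scoped to one nameservice ID must not pick up addresses from the other nameservice defined in the same configuration.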