When fs.default.name or fs.defaultFS in hadoop core-site.xml is configured as hdfs://ip, and hbase.rootdir is configured as hdfs://ip:port/hbaserootdir where port is the hdfs namenode's default port. the bulkload operation will not remove the file in bulk output dir. Store::bulkLoadHfile will think hdfs:://ip and hdfs:://ip:port as different filesystem and go with copy approaching instead of rename.
The root cause is that hbase master will rewrite fs.default.name/fs.defaultFS according to hbase.rootdir when regionserver started, thus, dest fs uri from the hregion will not matching src fs uri passed from client.
any suggestion what is the best approaching to fix this issue?
I kind of think that we could check for default port if src uri come without port info.