Thanks for the review. I took care of the comments, except for the ones below:
> I thought this patch wanted to make decommissioning nodes sort lower for block locations also?
If you follow the discussion, this patch no longer shuts down datanodes when they are decommissioned, which is the changed behavior. To prevent a decommissioned datanode from being used (the reason it used to be shut down), it is listed as the last location. Decommissioning nodes will continue to behave the same way as today.
> # in MiniDFSCluster.setupDatanodeAddress, you can use conf.getTrimmed instead of manually calling trim()
I need getTrimmed(name, defaultValue). I cannot pass a default value with the existing API. I do not think calling trim() outside the API is such a big deal.
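To illustrate the point, here is a minimal sketch of the lookup-with-default-then-trim pattern being discussed. The `TrimmedLookup` class and the plain `Map` stand-in for `Configuration` are hypothetical, used only so the snippet is self-contained:

```java
import java.util.HashMap;
import java.util.Map;

public class TrimmedLookup {
    // Mimics conf.get(name, defaultValue) followed by a manual trim(),
    // i.e. the behavior you would want from getTrimmed(name, defaultValue).
    static String getTrimmed(Map<String, String> conf, String name, String defaultValue) {
        String value = conf.getOrDefault(name, defaultValue);
        return value == null ? null : value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.datanode.address", "  127.0.0.1:0  ");
        // Set key: trimmed value wins.
        System.out.println(getTrimmed(conf, "dfs.datanode.address", "0.0.0.0:0"));
        // Missing key: the default is returned (and trimmed).
        System.out.println(getTrimmed(conf, "missing.key", "0.0.0.0:0"));
    }
}
```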
> Maybe a better name would be DECOMMISSIONED_AT_END_COMPARATOR or something?
I think the DECOM_COMPARATOR name is sufficient. Wouldn't adding javadoc that explains the intent, instead of a long name that describes the functionality, suffice?
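Something along these lines is what I have in mind: keep the short name and let the javadoc carry the intent. This is only a sketch; the `Node` class is a hypothetical stand-in for the real datanode type, with just the decommissioned flag the comparator cares about:

```java
import java.util.Arrays;
import java.util.Comparator;

public class DecomComparatorSketch {
    // Hypothetical stand-in for the datanode location type.
    static class Node {
        final String name;
        final boolean decommissioned;
        Node(String name, boolean decommissioned) {
            this.name = name;
            this.decommissioned = decommissioned;
        }
    }

    /**
     * Sorts decommissioned datanodes after live ones, so that clients
     * try live replicas first and a decommissioned node is only used
     * as the last location.
     */
    static final Comparator<Node> DECOM_COMPARATOR =
        (a, b) -> Boolean.compare(a.decommissioned, b.decommissioned);

    public static void main(String[] args) {
        Node[] locations = { new Node("dn1", true), new Node("dn2", false) };
        Arrays.sort(locations, DECOM_COMPARATOR);
        System.out.println(locations[0].name); // live node sorts first
    }
}
```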
> the getFreeSocketPort() trick seems like it's not likely to work repeatably - isn't there a high likelihood that two datanodes would pick the same free port, since you don't track "claimed" ports anywhere? Or that one of these ports might later get claimed by one of the many other daemons running on ephemeral ports in a mini cluster?
The code ensures the port is free and starts one datanode at a time. Each datanode is set up with its own port and will use that. The port is tracked as part of the datanode address config set up for that datanode. I am not sure how another datanode could take the free port?
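To make the argument concrete, here is a minimal sketch of the bind-to-port-0 idea behind getFreeSocketPort() (the class and method names here are illustrative, not the actual MiniDFSCluster code). While a port returned this way is bound by the first datanode, the OS cannot hand the same port out again:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortSketch {
    // Bind to port 0 so the OS assigns a currently free port, read it
    // back, then release the socket so the datanode can bind it.
    static int getFreeSocketPort() throws IOException {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int first = getFreeSocketPort();
        // The first datanode binds its port before the next one asks
        // for a port; while "first" is held, a second call cannot
        // return the same port.
        try (ServerSocket claimed = new ServerSocket(first)) {
            int second = getFreeSocketPort();
            System.out.println(first != second); // true while first is bound
        }
    }
}
```

There is still a theoretical window between releasing the probe socket and the datanode binding the port, but since datanodes are started sequentially and nothing else in the test is grabbing ports at that moment, the window is not an issue in practice.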
> when the MiniDFS cluster is constructed, shouldn't you clear out the dfs.hosts file? Otherwise you're relying on the test case itself to clean itself up between runs (which differs from the rest of minidfs's storage handling)
Having the tests clean this up is the approach taken. MiniDFSCluster storage is retained or cleared based on the format flag, but this is a configuration file outside of that kind of state. Most tests do not care about this configuration; the ones that do work with it and do the appropriate deletions.
> is there a test case anywhere that covers what happens when a decom node connects to the namenode? eg after a NN restart when a node is in both include and decom?
There is a common confusion today where people think there is a relationship between include and exclude. They are independent and unrelated. The include file defines the datanodes that are part of the cluster. The exclude file lists the datanodes that need to be decommissioned.
testDecommission() covers the case you are mentioning. It restarts the cluster to ensure a decommissioned node can connect to the NN.