BlockListAsLongs's constructor takes a list of Blocks and a list of ReplicaInfos. On the surface, the former is mildly irritating because it is a concrete class, while the latter is a greater concern due to being a File-based implementation of Replica.
On deeper inspection, BlockListAsLongs passes members of both to an internal method that accepts just Blocks, which conditionally casts them back to ReplicaInfos (this cast only happens to the latter, though this isn't immediately obvious to the reader).
Conveniently, all methods called on these objects are found in the Replica interface, and all functional (i.e. non-test) callers of this constructor pass in Replica implementations. If this constructor took Lists of Replicas instead, it would be more generally useful and its implementation would be cleaner as well.
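A minimal sketch of the proposed constructor shape, not the actual patch. The Replica interface here is a simplified stand-in with three accessors, and SimpleReplica, BlockListSketch, and the flat three-longs-per-block layout are all hypothetical names and choices made for illustration:

```java
import java.util.List;

// Simplified stand-in for the HDFS Replica interface; only three accessors are shown.
interface Replica {
  long getBlockId();
  long getNumBytes();
  long getGenerationStamp();
}

// Hypothetical minimal implementation, used only to exercise the sketch.
final class SimpleReplica implements Replica {
  private final long id;
  private final long numBytes;
  private final long genStamp;

  SimpleReplica(long id, long numBytes, long genStamp) {
    this.id = id;
    this.numBytes = numBytes;
    this.genStamp = genStamp;
  }

  @Override public long getBlockId() { return id; }
  @Override public long getNumBytes() { return numBytes; }
  @Override public long getGenerationStamp() { return genStamp; }
}

// Hypothetical class, not the real BlockListAsLongs. The encoding
// (three longs per block, no header) is invented for this example.
final class BlockListSketch {
  private static final int LONGS_PER_BLOCK = 3;
  private final long[] encoded;
  private final int finalizedCount;

  // Both parameters are typed against the interface, so no casts are needed.
  BlockListSketch(List<? extends Replica> finalized,
                  List<? extends Replica> underConstruction) {
    finalizedCount = finalized.size();
    encoded = new long[LONGS_PER_BLOCK
        * (finalized.size() + underConstruction.size())];
    int offset = 0;
    for (Replica r : finalized) {
      offset = encode(r, offset);
    }
    for (Replica r : underConstruction) {
      offset = encode(r, offset);
    }
  }

  // Writes one replica at the given offset and returns the next free offset.
  private int encode(Replica r, int offset) {
    encoded[offset] = r.getBlockId();
    encoded[offset + 1] = r.getNumBytes();
    encoded[offset + 2] = r.getGenerationStamp();
    return offset + LONGS_PER_BLOCK;
  }

  int getNumberOfBlocks() { return encoded.length / LONGS_PER_BLOCK; }
  int getNumberOfFinalizedBlocks() { return finalizedCount; }
  long getBlockId(int index) { return encoded[index * LONGS_PER_BLOCK]; }
}
```

Because the parameters are bounded wildcards, a caller can pass a List of any Replica implementation, File-based or not, without the constructor knowing which one it received.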
Fixing this indeed makes the business end of BlockListAsLongs cleaner while requiring no changes to FsDatasetImpl. As suggested by the above description, though, the HDFS tests use BlockListAsLongs differently from the production code – they pretty much universally provide a list of actual Blocks. To handle this:
- In the case of SimulatedFSDataset, providing a list of Replicas is actually less work.
- In the case of NNThroughputBenchmark, rewriting to use Replicas is fairly invasive. Instead, the patch creates a second constructor in BlockListAsLongs specifically for the use of NNThroughputBenchmark. It turns the stomach a little, but is clearer and requires less code than the alternatives (and isn't without precedent).
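
The two-constructor arrangement described above could look roughly like the following sketch. Everything here is an assumption for illustration: Replica and Block are simplified stand-ins for the HDFS types, TwoConstructorSketch is a hypothetical class, and the single-list signature of the benchmark constructor is a guess rather than something taken from the patch. The constructors differ in arity; two overloads that each took a single List would collide after generic type erasure.

```java
import java.util.List;

// Simplified stand-in for the HDFS Replica interface.
interface Replica {
  long getBlockId();
  long getNumBytes();
  long getGenerationStamp();
}

// Simplified stand-in for the concrete HDFS Block class.
class Block {
  private final long id;
  private final long numBytes;
  private final long genStamp;

  Block(long id, long numBytes, long genStamp) {
    this.id = id;
    this.numBytes = numBytes;
    this.genStamp = genStamp;
  }

  long getBlockId() { return id; }
  long getNumBytes() { return numBytes; }
  long getGenerationStamp() { return genStamp; }
}

// Hypothetical class; the three-longs-per-block layout is invented.
final class TwoConstructorSketch {
  private static final int LONGS_PER_BLOCK = 3;
  private final long[] encoded;

  // Production-style path: everything is typed against the interface.
  TwoConstructorSketch(List<? extends Replica> finalized,
                       List<? extends Replica> underConstruction) {
    encoded = new long[LONGS_PER_BLOCK
        * (finalized.size() + underConstruction.size())];
    int offset = 0;
    for (Replica r : finalized) {
      offset = put(offset, r.getBlockId(), r.getNumBytes(), r.getGenerationStamp());
    }
    for (Replica r : underConstruction) {
      offset = put(offset, r.getBlockId(), r.getNumBytes(), r.getGenerationStamp());
    }
  }

  // Benchmark-only path: accepts plain Blocks, all treated as finalized.
  TwoConstructorSketch(List<? extends Block> blocks) {
    encoded = new long[LONGS_PER_BLOCK * blocks.size()];
    int offset = 0;
    for (Block b : blocks) {
      offset = put(offset, b.getBlockId(), b.getNumBytes(), b.getGenerationStamp());
    }
  }

  // Shared encoder: writes one entry and returns the next free offset.
  private int put(int offset, long id, long numBytes, long genStamp) {
    encoded[offset] = id;
    encoded[offset + 1] = numBytes;
    encoded[offset + 2] = genStamp;
    return offset + LONGS_PER_BLOCK;
  }

  int size() { return encoded.length / LONGS_PER_BLOCK; }
  long blockId(int index) { return encoded[index * LONGS_PER_BLOCK]; }
}
```

Keeping the Block-based constructor separate means the Replica-based path needs no casts or type checks, at the cost of one extra entry point that exists only for the benchmark.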