The DataNode's UUID (the value returned by DataStorage.getDatanodeUuid()) is null at the point where the FsDataset object is created (in DataNode.initStorage()).
As the DataStorage object is an input to the FsDataset factory method, it is desirable for it to be fully populated with a UUID at this point. In particular, our FsDatasetSpi implementation relies upon the DataNode UUID as a key to access our underlying block storage device.
This also appears to be a regression compared to Hadoop 1.x: our 1.x FSDatasetInterface plugin sees a non-null UUID on startup. I haven't fully traced through the code, but I suspect the change came from the BPOfferService/BPServiceActor refactoring to support federated namenodes.
With HDFS-5448, the DataNode is now responsible for generating its own UUID, which greatly simplifies the fix: move the UUID check/generation from DataNode.createBPRegistration() to DataNode.initStorage(). This places UUID generation immediately after the read of the UUID from the DataStorage properties file, which is the more natural location for it.
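To illustrate the proposed ordering, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: StorageSketch, DataNodeSketch, checkDatanodeUuid(), createDataset() and the persistedUuid parameter are all hypothetical stand-ins, and the actual method bodies in DataNode may differ. It only shows that checking/generating the UUID inside initStorage(), right after the stored value is loaded, means the dataset factory never sees a null UUID.

```java
import java.util.UUID;

// Hypothetical, simplified stand-in for DataStorage: holds only the UUID.
class StorageSketch {
    private String datanodeUuid; // null until loaded or generated

    String getDatanodeUuid() { return datanodeUuid; }
    void setDatanodeUuid(String uuid) { this.datanodeUuid = uuid; }
}

// Hypothetical stand-in for DataNode, showing only the order of the steps.
class DataNodeSketch {
    final StorageSketch storage = new StorageSketch();
    String uuidSeenByDataset; // what the dataset factory observed

    // Assumed helper: generate a UUID only if none was loaded from disk.
    void checkDatanodeUuid() {
        String current = storage.getDatanodeUuid();
        if (current == null || current.isEmpty()) {
            storage.setDatanodeUuid(UUID.randomUUID().toString());
        }
    }

    // persistedUuid models the value read from the storage properties file
    // (null on a first start, when no UUID has been written yet).
    void initStorage(String persistedUuid) {
        storage.setDatanodeUuid(persistedUuid);      // 1. read stored UUID
        checkDatanodeUuid();                         // 2. generate if missing
        uuidSeenByDataset = createDataset(storage);  // 3. build the dataset
    }

    // Stand-in for the FsDataset factory: it just records the UUID it is given.
    String createDataset(StorageSketch s) {
        return s.getDatanodeUuid();
    }
}
```

In this sketch a first start (persistedUuid == null) yields a freshly generated UUID before the dataset is built, while a restart keeps the UUID that was already stored.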