Description
I was looking into methods associated with storages and storageTypes and found that DatanodeDescriptor.hasStorageType could be a source of bottlenecks. To check whether a specific storage type exists among the storage locations associated with a DatanodeDescriptor, hasStorageType iterates over the array of DatanodeStorageInfos returned by getStorageInfos(), which retrieves the storage information from storageMap and converts it to an array while holding a lock. As the system scales and the size of storageMap grows with more datanodes, the time spent in that synchronized block increases. The issue becomes more significant when hasStorageType is called from methods such as DatanodeDescriptor.pruneStorageMap, which may themselves iterate over a large data structure, resulting in a form of nested iteration. The combination of a repeated linear search (within hasStorageType) and iteration inside a lock can lead to significant complexity (potentially quadratic) and noticeable synchronization bottlenecks.
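To make the pattern concrete, here is a minimal, simplified sketch of the shape I am describing. This is not the actual Hadoop implementation; the class, fields, and method bodies below are only approximations of DatanodeDescriptor used for illustration.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the pattern; not the real DatanodeDescriptor.
class DatanodeDescriptorSketch {
  enum StorageType { DISK, SSD, ARCHIVE, RAM_DISK }

  static class DatanodeStorageInfo {
    final String storageId;
    final StorageType storageType;
    DatanodeStorageInfo(String storageId, StorageType storageType) {
      this.storageId = storageId;
      this.storageType = storageType;
    }
    StorageType getStorageType() { return storageType; }
  }

  // storageId -> storage info, guarded by its own monitor.
  private final Map<String, DatanodeStorageInfo> storageMap = new LinkedHashMap<>();

  // Copies every entry into a fresh array while holding the lock,
  // so each call costs time linear in the size of storageMap.
  DatanodeStorageInfo[] getStorageInfos() {
    synchronized (storageMap) {
      return storageMap.values().toArray(new DatanodeStorageInfo[0]);
    }
  }

  // Linear scan over the freshly copied array on every query.
  boolean hasStorageType(StorageType type) {
    for (DatanodeStorageInfo info : getStorageInfos()) {
      if (info.getStorageType() == type) {
        return true;
      }
    }
    return false;
  }
}
{code}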
DFSNetworkTopology.chooseRandomWithStorageType and DFSNetworkTopology.chooseRandomWithStorageTypeTwoTrial are affected because they both invoke hasStorageType. Additionally, INodeFile.assertAllBlocksComplete and BlockManager.checkRedundancy() face a similar issue, since FSNamesystem.finalizeINodeFileUnderConstruction invokes both methods while holding a writeLock.
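The amplification I am worried about looks roughly like the following hypothetical sketch (again not actual Hadoop code; the lock and caller here are stand-ins for the real callers named above): an outer loop over N candidates, each invoking hasStorageType, which in turn copies and scans S storage entries, gives O(N * S) work per acquisition of the outer lock.

{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the nested-iteration cost; the lock and
// method names below are stand-ins, not the real FSNamesystem/BlockManager.
class NestedScanSketch {
  private final ReentrantReadWriteLock coarseLock = new ReentrantReadWriteLock();

  int countNodesWithType(List<DatanodeDescriptorSketch> candidates,
                         DatanodeDescriptorSketch.StorageType type) {
    coarseLock.writeLock().lock();
    try {
      int matches = 0;
      for (DatanodeDescriptorSketch node : candidates) {
        // Each call copies the node's storageMap into an array and scans it,
        // so the total work under the coarse lock is O(N * S).
        if (node.hasStorageType(type)) {
          matches++;
        }
      }
      return matches;
    } finally {
      coarseLock.writeLock().unlock();
    }
  }
}
{code}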
This appears to be a similar issue to https://issues.apache.org/jira/browse/HDFS-17638. I’m curious whether my analysis is wrong and whether anything can be done to reduce the impact of these issues.