When the number of blocks on the DataNode grows large we start running into a few issues:
- Block reports take a long time to process on the NameNode. In testing we have seen that a block report with 6 Million blocks takes close to one second to process on the NameNode. The NameSystem write lock is held during this time.
- We start hitting the default protobuf message limit of 64MB somewhere around 10 Million blocks. While we can increase the message size limit it already takes over 7 seconds to serialize/unserialize a block report of this size.
HDFS-2832 has introduced the concept of a DataNode as a collection of storages i.e. the NameNode is aware of all the volumes (storage directories) attached to a given DataNode. This makes it easy to split block reports from the DN by sending one report per storage directory to mitigate the above problems.