Here is a first version of the locking changes that let getAdditionalBlock and addStoredBlock run without taking any global locks. I have seen that randomwriter and dfsio, which used to fail on a 1000-node cluster, now run successfully with this patch.
1. NetworkTopology has reader/writer locks. This map hardly changes but is used very frequently. Now, multiple open() calls can proceed in parallel.
2. The pending blocks and pending files are put into a new class called pendingCreates.java. This makes it easier to lock them together.
3. The BlocksMap is protected by a reader/writer lock.
4. In the common case (when the file is still in pendingCreates), addStoredBlock() does not acquire the global fsnamesystem lock.
5. The datanodeMap already had an associated lock object to protect modifications to it. This patch makes sure that the lock is used in all places where the datanodeMap is modified.
6. The Host2NodesMap has its own read/write lock. It will be merged with the datanodeMap when we move to a much finer-grained locking model in the future.
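
To make the reader/writer locking in items 1, 3 and 6 concrete, here is a minimal sketch of a map guarded by a ReentrantReadWriteLock. It is not the code in the patch: ReadMostlyMap and its methods are hypothetical names used only for illustration, and the real NetworkTopology, BlocksMap and Host2NodesMap classes have their own interfaces.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a structure that is read very often and
// modified rarely. This is an illustration, not the patch itself.
class ReadMostlyMap<K, V> {
    private final Map<K, V> map = new HashMap<K, V>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Lookups take the shared read lock, so many readers can run in parallel.
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Modifications take the exclusive write lock, which blocks readers
    // and other writers only for the duration of the update.
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    V remove(K key) {
        lock.writeLock().lock();
        try {
            return map.remove(key);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Because reads far outnumber writes for these structures, readers rarely wait on each other, and none of these calls needs the global lock.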
This patch is for code review purposes only. Some additional locking is needed for processReport (still to be done) but I would like some comments on the changes I have made.
I would have liked a finer-grained locking model that allows all filesystem methods to be highly concurrent, but that approach was deemed too complex for the short term. I am putting out this patch to get feedback on whether this medium-term approach is acceptable.