Details
Description
Whenever a datanode is restarted, the registration call that the NameNode receives after the restart lands in NetworkTopology#add via DatanodeManager#registerDatanode, which requires the write lock on NetworkTopology#netLock.
This registration thread is starved by a flood of FSNamesystem.getAdditionalDatanode calls, which are triggered by the clients that were writing to the restarted datanode.
While it waits for the write lock on NetworkTopology#netLock, the registration call holds the write lock on FSNamesystem#fsLock, so all other RPC calls that require FSNamesystem#fsLock have to wait.
We can make NetworkTopology#netLock a fair lock so that the registration thread does not starve.
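A minimal sketch of the proposed change, assuming netLock is a java.util.concurrent.locks.ReentrantReadWriteLock (the description above does not state the lock type). The class FairNetLockSketch and its methods are hypothetical stand-ins, not the actual NetworkTopology code. Passing true to the constructor selects the fair ordering policy, under which a newly arriving reader queues behind a writer that is already waiting.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for NetworkTopology; only the locking pattern matters here.
class FairNetLockSketch {
    // Assumption: netLock is a ReentrantReadWriteLock. The boolean selects fair ordering.
    private final ReentrantReadWriteLock netLock;
    private int nodeCount = 0;

    FairNetLockSketch(boolean fair) {
        this.netLock = new ReentrantReadWriteLock(fair);
    }

    // Write path, analogous to the registration call that reaches NetworkTopology#add.
    void add(String node) {
        netLock.writeLock().lock();
        try {
            nodeCount++;
        } finally {
            netLock.writeLock().unlock();
        }
    }

    // Read path, analogous to the lookups done on behalf of getAdditionalDatanode.
    int size() {
        netLock.readLock().lock();
        try {
            return nodeCount;
        } finally {
            netLock.readLock().unlock();
        }
    }

    boolean isFair() {
        return netLock.isFair();
    }

    public static void main(String[] args) {
        FairNetLockSketch topology = new FairNetLockSketch(true);
        topology.add("dn1");
        System.out.println("fair=" + topology.isFair() + " nodes=" + topology.size());
    }
}
```

Fair mode generally trades some read throughput for bounded waiting by writers, so the effect on read-heavy paths would need to be measured.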
Attachments
Issue Links
- is depended upon by: HADOOP-15509 Release Hadoop 2.7.7 (Resolved)