Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
While working on HDFS-988 I noticed that the locking in FSNamesystem and FSDirectory could be improved. Some observations:
The namesystem lock (fsLock) is always taken before the directory lock (dirLock). The directory lock therefore only adds protection when fsLock is held for reading and dirLock is taken for writing, and I don't think that ever happens. If that is right, we can probably get rid of the directory lock.
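To make the lock-ordering argument concrete, here is a minimal sketch of the three combinations. It is not the actual FSNamesystem/FSDirectory code: the class name, the method names, and the map standing in for the directory tree are all hypothetical, and only the fsLock/dirLock nesting order is taken from the observation above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the FSNamesystem/FSDirectory pair; illustration only.
class LockOrderingSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock dirLock = new ReentrantReadWriteLock();
    // Hypothetical directory state: path -> file length.
    private final Map<String, Long> dir = new HashMap<>();

    // fsLock held for writing: the caller is already exclusive,
    // so the inner dirLock adds no protection.
    void createUnderFsWriteLock(String path, long length) {
        fsLock.writeLock().lock();
        try {
            dirLock.writeLock().lock(); // redundant under the fsLock write lock
            try {
                dir.put(path, length);
            } finally {
                dirLock.writeLock().unlock();
            }
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    // fsLock held for reading, directory only read: concurrent readers are
    // safe as long as no writer can run under the fsLock read lock.
    Long lengthUnderFsReadLock(String path) {
        fsLock.readLock().lock();
        try {
            dirLock.readLock().lock();
            try {
                return dir.get(path);
            } finally {
                dirLock.readLock().unlock();
            }
        } finally {
            fsLock.readLock().unlock();
        }
    }

    // fsLock held for reading, directory mutated: the only combination in
    // which dirLock is what actually serializes the mutation.
    void updateUnderFsReadLock(String path, long length) {
        fsLock.readLock().lock();
        try {
            dirLock.writeLock().lock(); // needed here, and only here
            try {
                dir.put(path, length);
            } finally {
                dirLock.writeLock().unlock();
            }
        } finally {
            fsLock.readLock().unlock();
        }
    }
}
```

If no code path has the shape of updateUnderFsReadLock, every dirLock acquisition falls into one of the first two cases, which is the argument for removing it.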
In HDFS-988 I modified handleHeartbeat to take the read lock so that it is synchronized with registerDatanode. I also added the missing synchronization on datanodeMap to wipeDatanode, because handleHeartbeat calls getDatanode() while holding only the locks on heartbeats and datanodeMap, whereas registerDatanode mutates datanodeMap without holding either. We should revisit which locks and synchronized blocks protect which data structures; there may be other similar bugs, as well as opportunities to increase parallelism.
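The datanodeMap problem comes down to one rule: every reader and every writer of a structure must hold the same monitor. A minimal sketch of that rule follows. The class and method names are hypothetical and the map holds plain strings rather than datanode descriptors, so this illustrates the pattern and is not the HDFS-988 patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry; illustrates the locking rule only.
class DatanodeRegistrySketch {
    // Stand-in for datanodeMap: storage ID -> host name.
    private final Map<String, String> datanodeMap = new HashMap<>();

    // Mutation path, analogous to registerDatanode.
    void register(String storageId, String host) {
        synchronized (datanodeMap) {
            datanodeMap.put(storageId, host);
        }
    }

    // Mutation path, analogous to wipeDatanode.
    void wipe(String storageId) {
        synchronized (datanodeMap) {
            datanodeMap.remove(storageId);
        }
    }

    // Lookup path, analogous to getDatanode() called from handleHeartbeat.
    // It takes the same monitor as the mutators, so it never observes the
    // map in the middle of an update.
    String get(String storageId) {
        synchronized (datanodeMap) {
            return datanodeMap.get(storageId);
        }
    }
}
```

An audit along these lines would list, for each shared structure, the monitor or lock that guards it, and then check every access path against that list.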