Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 1.2.0
- None
- None
- Reviewed
Description
In branch-1, FSNamesystem#commitBlockSynchronization may call logSync() while the FSNamesystem lock is held. As with HDFS-4186, this can cause performance problems.
The following issue was observed in a cluster running a Hive job that was writing to 100,000 temporary files (each task writing to thousands of files). When the job was killed, a large number of files were left open for write. When the leases for these open files eventually expired, lease recovery started for all of them within a very short time. This caused a large number of commitBlockSynchronization calls, each performing logSync with the FSNamesystem lock held, which overloaded the namenode and slowed it down.
Since logSync is already called right after the synchronized section, we can simply remove the logSync call made while the lock is held.
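The locking pattern described above can be illustrated with a minimal, self-contained sketch. It is not the branch-1 code or the actual patch: SketchEditLog and SketchNamesystem are hypothetical stand-ins, and the real logSync() flushes edits to disk rather than clearing a list. The sketch only shows the intended shape, where the edit is buffered under the namesystem lock and the expensive sync runs once the lock has been released.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an edit log; NOT the Hadoop FSEditLog API.
class SketchEditLog {
    private final List<String> buffered = new ArrayList<>();
    private int syncCount = 0;

    // Cheap: appends the operation to an in-memory buffer.
    synchronized void logEdit(String op) {
        buffered.add(op);
    }

    // Expensive in a real system: would flush buffered edits to durable storage.
    synchronized void logSync() {
        buffered.clear();
        syncCount++;
    }

    synchronized int syncCount() {
        return syncCount;
    }

    synchronized int pending() {
        return buffered.size();
    }
}

// Hypothetical stand-in for the namesystem; "this" plays the role of the FSNamesystem lock.
class SketchNamesystem {
    private final SketchEditLog editLog = new SketchEditLog();
    private boolean syncedWhileLocked = false;

    void commitBlockSynchronization(String block) {
        synchronized (this) {
            // Only the in-memory edit is recorded while the lock is held.
            editLog.logEdit("commit " + block);
            // No logSync() here: syncing under the lock would block every other caller.
        }
        // Record whether this thread still holds the namesystem lock at sync time.
        syncedWhileLocked |= Thread.holdsLock(this);
        // The single sync happens after the lock has been released.
        editLog.logSync();
    }

    SketchEditLog editLog() {
        return editLog;
    }

    boolean syncedWhileLocked() {
        return syncedWhileLocked;
    }
}
```

With this shape, a burst of lease recoveries still queues up behind the sync, but other operations that need the namesystem lock are not blocked while edits are being flushed.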
Attachments
Issue Links
- is related to HDFS-4186: logSync() is called with the write lock held while releasing lease (Closed)