Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 2.3.0
- None
- Reviewed
- Committed to branch-2 and trunk.
Description
We recently saw an issue where the NN restarted while tens of thousands of files were open. After the restart, each commitBlockSynchronization() call took multiple seconds, with most of that time spent inside LeaseManager.findPath(). findPath currently works by looping over all files held by a given writer and traversing the filesystem for each one. This is far too slow when a single writer has tens of thousands of files open.
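To make the cost concrete, here is a minimal, self-contained sketch of the access pattern described above. It is not the HDFS code: `LeaseLookupSketch`, `Namespace`, `findPathByScan`, and `findPathByIndex` are hypothetical names invented for illustration, and the toy `resolve` method only stands in for a real filesystem traversal. The indexed lookup is shown as one possible alternative for comparison, not as the fix that was committed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the lookup pattern; not the real LeaseManager. */
public class LeaseLookupSketch {

    /** Toy namespace that maps paths to file ids and counts simulated traversals. */
    static final class Namespace {
        private final Map<String, Long> idByPath = new HashMap<>();
        long traversals = 0;

        void addFile(String path, long fileId) {
            idByPath.put(path, fileId);
        }

        /** Stand-in for resolving a path by walking the directory tree. */
        Long resolve(String path) {
            traversals++;
            return idByPath.get(path);
        }
    }

    /** Pattern from the description: scan every path held by the writer, resolving each one. */
    static String findPathByScan(Namespace ns, List<String> leasePaths, long fileId) {
        for (String path : leasePaths) {
            Long id = ns.resolve(path);
            if (id != null && id == fileId) {
                return path;
            }
        }
        return null;
    }

    /** Assumed alternative: a direct index from file id to path, with no traversal. */
    static String findPathByIndex(Map<Long, String> pathById, long fileId) {
        return pathById.get(fileId);
    }

    public static void main(String[] args) {
        Namespace ns = new Namespace();
        List<String> leasePaths = new ArrayList<>();
        Map<Long, String> pathById = new HashMap<>();

        // One writer holding many open files.
        int openFiles = 20_000;
        for (long id = 0; id < openFiles; id++) {
            String path = "/data/file-" + id;
            ns.addFile(path, id);
            leasePaths.add(path);
            pathById.put(id, path);
        }

        long target = openFiles - 1;
        String viaScan = findPathByScan(ns, leasePaths, target);
        String viaIndex = findPathByIndex(pathById, target);

        System.out.println("scan result:  " + viaScan);
        System.out.println("index result: " + viaIndex);
        System.out.println("traversals performed by scan: " + ns.traversals);
    }
}
```

In this model a single scan lookup can cost one traversal per open file, so resolving every file held by one writer grows quadratically with the number of open files, which is consistent with the multi-second calls reported above.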
Attachments
Issue Links
- duplicates
  - HDFS-4183 Throttle block recovery (Resolved)
- is duplicated by
  - HDFS-11326 FSNamesystem closeFileCommitBlocks block FSNamesystem Ops (Open)