Details
- Bug
- Status: Resolved
- Critical
- Resolution: Duplicate
- 2.1.0-beta, 0.23.9
Description
When ReplicationMonitor creates and queues replication work for a block, it records the block's INodeFile (0.23) or BlockCollection (2.x). This is done under the FSNamesystem (FSN) write lock, but the actual chooseTarget() call is made outside the lock.
chooseTarget() in turn ends up calling FSDirectory#getFullPathName(). If the INode was unlinked from its parent after ReplicationMonitor released the lock (e.g. by a delete), this call throws an NPE and crashes the NameNode.
The path name is actually unused by the existing block placement policy modules, though private implementations might use it. It would be nice to avoid calling getFullPathName() here at all.
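The failure mode described above can be sketched in isolation. The following is a minimal illustration only, not Hadoop code: `Node`, `unsafeFullPath`, and `safeFullPath` are hypothetical names invented for this example, and the real INode and FSDirectory classes differ. It assumes a path is built by walking parent links up to the root, and shows how a node detached by a concurrent delete makes the unguarded walk hit a null parent, while a guarded walk can report "no path" instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

// Hypothetical stand-in for an inode tree; not the actual Hadoop classes.
class PathRaceSketch {
    static final class Node {
        final String name;
        volatile Node parent; // set to null when the node is unlinked (deleted)

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Unguarded walk: assumes every node is still linked up to the root.
    // Throws NullPointerException if a parent link was cleared in the meantime.
    static String unsafeFullPath(Node node, Node root) {
        Deque<String> parts = new ArrayDeque<>();
        for (Node n = node; n != root; n = n.parent) {
            parts.push(n.name); // n is null here once the chain is broken
        }
        return "/" + String.join("/", parts);
    }

    // Guarded walk: returns empty when the node is no longer reachable from root.
    static Optional<String> safeFullPath(Node node, Node root) {
        Deque<String> parts = new ArrayDeque<>();
        for (Node n = node; n != root; n = n.parent) {
            if (n == null) {
                return Optional.empty(); // detached, e.g. by a concurrent delete
            }
            parts.push(n.name);
        }
        return Optional.of("/" + String.join("/", parts));
    }

    public static void main(String[] args) {
        Node root = new Node("", null);
        Node dir = new Node("dir", root);
        Node file = new Node("file", dir);

        System.out.println(safeFullPath(file, root)); // path while still linked

        dir.parent = null; // simulate a delete after the lock was released

        System.out.println(safeFullPath(file, root)); // empty instead of a crash
        try {
            unsafeFullPath(file, root);
        } catch (NullPointerException e) {
            System.out.println("unguarded walk threw NPE");
        }
    }
}
```

A guard like this only keeps the caller alive; it does not make the recorded file valid again. Skipping the path lookup entirely, as suggested above, would sidestep the race for policies that never use the path.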
Issue Links
- is duplicated by
- HDFS-4482 ReplicationMonitor thread can exit with NPE due to the race between delete and replication of same file. (Closed)