Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- 2.8.0
- None
- None
- Reviewed
Description
When the namenode performs lease recovery for a failed write, commitBlockSynchronization() fails if none of the new targets has sent a received-IBR (incremental block report). At this point the data is inaccessible, because the namenode throws a NullPointerException on getBlockLocations().
The namenode retries the lease recovery after about an hour. If the nodes are faulty (usually when there is only one new target), they may still not have sent a block report by then. In that case, lease recovery throws an AlreadyBeingCreatedException, which causes LeaseManager to simply remove the lease without finalizing the inode.
This leaves the lease state inconsistent: the inode stays under construction, but no further lease recovery is attempted, and a manual lease recovery is not allowed either.
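The sequence described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the actual NameNode code: the class LeaseLeakSketch, its fields, and the RecoveryInProgressException type (a stand-in for AlreadyBeingCreatedException) are all hypothetical names made up for illustration, and the real recovery logic involves far more state than two booleans.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the lease leak described above; NOT actual NameNode code. */
public class LeaseLeakSketch {

    /** Hypothetical stand-in for the inode state. */
    static final class FileState {
        boolean underConstruction = true;
        boolean recoveryInProgress = false; // set once a recovery attempt has started
    }

    /** Hypothetical stand-in for AlreadyBeingCreatedException. */
    static final class RecoveryInProgressException extends Exception {
        RecoveryInProgressException(String msg) { super(msg); }
    }

    private final Map<String, String> leases = new HashMap<>();   // path -> lease holder
    private final Map<String, FileState> files = new HashMap<>(); // path -> file state

    public void openForWrite(String path, String holder) {
        files.put(path, new FileState());
        leases.put(path, holder);
    }

    /** One recovery attempt; receivedIbr models whether any new target reported the block. */
    void recoverLease(String path, boolean receivedIbr) throws RecoveryInProgressException {
        FileState f = files.get(path);
        if (receivedIbr) {
            // Commit succeeds: the file is finalized and the lease released.
            f.underConstruction = false;
            f.recoveryInProgress = false;
            leases.remove(path);
            return;
        }
        if (f.recoveryInProgress) {
            // An earlier attempt failed and there is still no report (simplified condition).
            throw new RecoveryInProgressException("recovery already in progress: " + path);
        }
        // First attempt without a report: the commit fails, the file stays under construction.
        f.recoveryInProgress = true;
    }

    /** Periodic retry with the behaviour from the description: on exception, drop the lease. */
    public void retryExpiredLease(String path, boolean receivedIbr) {
        try {
            recoverLease(path, receivedIbr);
        } catch (RecoveryInProgressException e) {
            leases.remove(path); // lease removed, inode never finalized
        }
    }

    /** Leaked: still under construction, but no lease remains to trigger recovery. */
    public boolean isLeaked(String path) {
        return files.get(path).underConstruction && !leases.containsKey(path);
    }

    public static void main(String[] args) {
        LeaseLeakSketch nn = new LeaseLeakSketch();
        nn.openForWrite("/data/f", "client-1");
        nn.retryExpiredLease("/data/f", false); // first attempt: commit fails
        nn.retryExpiredLease("/data/f", false); // retry: exception, lease dropped
        System.out.println("leaked = " + nn.isLeaked("/data/f")); // prints "leaked = true"
    }
}
```

In this model, a retry that does see a block report finalizes the file normally, so the leak only appears when two consecutive attempts run without any report from the new targets.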
Attachments
Issue Links
- is duplicated by
  - HDFS-8406 Lease recovery continually failed (Resolved)
- is related to
  - HDFS-13486 Backport HDFS-11817 (A faulty node can cause a lease leak and NPE on accessing data) to branch-2.7 (Resolved)