Thanks Zesheng for reporting the issue, Ravi for working on the solution and other folks for reviewing.
I was looking into an infinite loop case when doing checkLeases myself, and figured out that the logic in FSNamesystem#internalReleaseLease
assert false : "Already checked that the last block is incomplete";
doesn't take care of the case that penultimate block is COMMITTED and final block is COMPLETE, thus caused the infinite loop. Looking at the history of this jira, I found Jing Zhao suggested the same at https://issues.apache.org/jira/browse/HDFS-4882?focusedCommentId=14207202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14207202
I did some analysis to share here (sorry for a long post).
When the final block is COMMITTED, the current implementation does the following:
+ " internalReleaseLease: Committed blocks are minimally replicated,"
+ " lease removed, file closed.");
return true; }
String message = "DIR* NameSystem.internalReleaseLease: " +
"Failed to release lease for file " + src +
". Committed blocks are waiting to be minimally replicated." +
" Try again later.";
throw new AlreadyBeingCreatedException(message);
What it does:
- For scenario#1, check minReplication for both penultimate and last block, if satisifed, finalize the block (recover lease, close file)
- For scenario#2, throw AlreadyBeingCreatedException derived from IOException (the name of this exception appears to be a misnomer, maybe we should fix later).
To solve the case that penultimate block is COMMITTED and final block is COMPLETE, I'd suggest to make some changes on top of the submitted patch (for further discussion):
For scenario#1, we can do the same as when the last block is COMMITTED, as described above.
For scenario#2, I think we have two options:
- option A, drop the code in the existing code that handles scenario#2 (not to throw the exception), let checkLeases check back again (2 second is current internal), waiting for block report to finish to change the minimal replication situation then recover the lease. The infinite loop could still happen if the minimal replication never get satisfied. But this would be rare assuming the minimal replication can be satisfied eventually.
- option B, do the similar logic as in the existing code (throwing AlreadyBeingCreatedException). There is an issue for this option too that I can see, and described below.
With option B, look at the caller side (LeaseManager#checkLeases, whenever an IOException is caught, it just go ahead removing the lease. So the possible infinite loop described in the scenario#2 comment will not happen because of the lease removal (lease recovered).
But the problem with option B is, after the lease removal, the file may still have blocks not satisfying minimal replication (scenario#2), which would be a potential issue. This situation exists in current implementation when handling the case that the last block is COMMITTED. I think we should we wait for minimal replication to be satisfied before recovering the lease.
So looks like option A is more preferable. But the original code tries to recover the lease immediately, I'm not sure whether there is any catch here.