Looking at the code, I don't see a deadlock possibility. While a call to setTimerForTokenRenewal requires a lock on DelegationTokenRenewer.class, I don't see any method holding a lock on DelegationTokenRenewer.class requiring a lock on delegationTokens or cancelled flag. Am I missing something here?
You're right. I was somehow considering removeDelegationTokenRenewalForJob to be a synchronized method. Sorry about that.
This could be fixed via the original jira (
MAPREDUCE-4860 or a new jira). The deadlock being resolved was the main issues in this jira which is already fixed. An extra renewal just leads to an additional exception message in the logs, correct ? or is it more severe than that (other than the failed unit test).
Comments on the patch itself.
The previous patch is likely better. One concern with the current patch - 'cancelled' is associated with the current RenewalTimerTask. If removeDelegationTokenRenewalForJob tries to cancel() while a token renewal is in progress - it effectively has no affect, since a new RenewalTimerTask would be scheduled. This may not be an issue since the reference to the DelegationTokenToRenew object will be removed from the list of delegationTokens. Since renew has been moved into DelegationTokenToRenew - I'd prefer having the cancel / intent to cancel associated with that as well.