Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Won't Fix
Affects Version/s: 2.8.2
Fix Version/s: None
Component/s: None
Description
We have a secure Hadoop cluster with NameNode federation.
Job submission fails after the Kerberos TGT maxLifeTime (default 24h) expires. The client log shows "failed to renew token: HDFS_DELEGATION_TOKEN...".
Checking the RM log, we found that the RM's TGT has expired, but relogin() is not triggered; the RM just retries and fails...
(For the RM log, see the screenshot.)
Digging into the code:
When the RM tries to renewToken(),
UserGroupInformation.getLoginUser() = "rm",
but UserGroupInformation.getCurrentUser() = "testUser".
Because the two users differ, Client.shouldAuthenticateOverKrb() returns false, so neither reloginFromKeytab() nor reloginFromTicketCache() can be triggered.
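
The behavior described above can be modeled roughly as follows. This is a minimal sketch with no Hadoop dependency, assuming the relogin guard only passes when the login user matches the current user (or, as an additional assumption, the real user behind a proxy user). `SketchUser` and `ReloginCheckSketch` are hypothetical names for illustration and do not reproduce Hadoop's `UserGroupInformation` or `Client` classes.

```java
// Simplified stand-in for a user identity; NOT Hadoop's UserGroupInformation.
final class SketchUser {
    final String name;
    final SketchUser realUser;            // non-null only for a proxy user (assumption)
    final boolean hasKerberosCredentials; // true if logged in via keytab or ticket cache

    SketchUser(String name, SketchUser realUser, boolean hasKerberosCredentials) {
        this.name = name;
        this.realUser = realUser;
        this.hasKerberosCredentials = hasKerberosCredentials;
    }

    // Two users are considered the same when their names match.
    boolean sameUser(SketchUser other) {
        return other != null && name.equals(other.name);
    }
}

final class ReloginCheckSketch {
    // Hypothetical model of the guard: relogin is attempted only when this returns true.
    static boolean shouldAuthenticateOverKrb(boolean kerberosAuth,
                                             SketchUser loginUser,
                                             SketchUser currentUser) {
        if (!kerberosAuth || loginUser == null || !loginUser.hasKerberosCredentials) {
            return false;
        }
        // Relogin is allowed only for the login user itself or a proxy user backed by it.
        return loginUser.sameUser(currentUser)
                || (currentUser != null && loginUser.sameUser(currentUser.realUser));
    }

    public static void main(String[] args) {
        SketchUser rm = new SketchUser("rm", null, true);
        SketchUser testUser = new SketchUser("testUser", null, false);

        // Scenario from this report: login user "rm", current user "testUser".
        System.out.println(shouldAuthenticateOverKrb(true, rm, testUser));
        // Same call made while running as the login user.
        System.out.println(shouldAuthenticateOverKrb(true, rm, rm));
    }
}
```

In this model, the renewal call running as "testUser" never passes the guard, which matches the observation that the RM retries with an expired TGT instead of logging in again.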