Hi Jason, thank you very much for the review !
It's confusing to see a MR_JOB_SEND_TOKEN_CONF_DEFAULT in MRJobConfig yet it clearly is not the default value.
Should this feature be tied to UserGroupInformation.isSecurityEnabled? I'm wondering if this can cause issues where the current cluster isn't secure but the RM needs to renew the job's tokens for a remote secure cluster or some other secure service. Seems like if this conf is set then that's all we need to know.
Currently, the RM DelegationTokenRewener will only add the tokens if security is enabled (code in RMAppManager#submitApplication), so I think with this existing implemtation, we can assume this feature is for security enabled only ?
Similarly the code explicitly fails in ClientRMService if the conf is there when security is disabled which seems like we're taking a case that isn't optimal but should work benignly and explicitly making sure it fails. Not sure that's user friendly behavior.
My intention was to prevent user from sending conf in non-secure mode(which anyways is not needed if my above reply is true), in case the conf size huge which may increase load on RM. On ther other hand, Varun chatted offline that we can add a limit config in RM to limit the size of configs, your opinion ?
Nit: For the ByteBuffer usage in parseCredentials and parseTokensConf, the rewind method calls seem unnecessary since we're throwing the buffers away immediately afterwards.
Actually, the bytebuffer is a direct reference from the containerLaunchContext, not a copy. I think this is also required because it was specifically to solve issues in
Should the Configuration constructor call in parseTokensConf be using the version that does not load defaults? If not then I recommend we at least allow a conf to be passed in to use as a copy constructor.Loading a new Configuration from scratch is really expensive and we should avoid it if possible. See the discussion on HADOOP-11223 for details.
Good point. I actually did the same in YarnRunner#setAppConf method, but missed this place.
In DelegationTokenRenewer, why aren't we using the appConf as-is when renewing the tokens?
I wasn't sure whether the mere appConf is enough for the connection - (Is there any kerberos related configs for RM itself are required for authentication?). Let me do some experiments, if this works, I'll just use appConf.
Also it looks like we're polluting subsequent app-conf renewals with prior app configurations, as well as simply leaking appConf objects as renewerConf resources infinitum. I don't see where renewerConf gets reset in-between.
My previous patch made a copy of each appConf and merge with RM's conf(for the reason I wasn't sure whether RM's own conf is required) and use that for renwer. But then I think this maybe bad because every app will have its own copy of configs, which may largely increase the memory size if the number of apps is very big. So, in the latest patch I changed it to let all apps share the same renewerConf - this is based on the assumption that "dfs.nameservices" must have distint keys for each distinct cluster, so we won't have situation where two apps use different configs for the same cluster - it is true that unnecessary configs used by 1st app will be shared by subsequent apps.
Arguably there should be a unit tests that verifies a first app with token conf key A and a second app with token conf key B doesn't leave a situation where the renewals of the second app are polluted with conf key A.
If the mere appConf works, we should be fine.
Speaking of unit tests, I see where we fixed up the YARN unit tests to pass the new conf but not a new test that verifies the specified conf is used appropriately when renewing for that app and not for other apps that didn't specify a conf.
Yep, I'll add the UT.