[YARN-5910] Support for multi-cluster delegation tokens - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.9.0, 3.0.0-alpha4
Component/s: security
Labels: None
Hadoop Flags: Reviewed

Description

As an administrator running many secure (kerberized) clusters, some of which have peer clusters managed by other teams, I am looking for a way to run jobs that may require services running on other clusters. A particular case where this arises is something as core as a distcp between two kerberized clusters (e.g. hadoop --config /home/user292/conf/ distcp hdfs://LOCALCLUSTER/user/user292/test.out hdfs://REMOTECLUSTER/user/user292/test.out.result).

Thanks to YARN-3021, a job can run for a while, but if the delegation token for the remote cluster needs renewal, the job will fail[1]. One can pre-configure the hdfs-site.xml loaded by the YARN RM to know of all possible HDFSes available, but that requires coordination that is not always feasible, especially as a cluster's peers grow into the tens of clusters or span management teams. Ideally, core systems could still be configured this way, while jobs could also specify their own handling and management of tokens when needed.

[1]: Example stack trace when the RM is unaware of a remote service:
----------------

2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
at org.apache.hadoop.security.token.Token.renew(Token.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
... 6 more
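
The "Caused by" line above says the RM's local configuration has no failover proxy provider for REMOTECLUSTER. As a minimal sketch of the pre-configuration workaround mentioned in the description, the following Python helper builds the client-side HDFS HA properties that an hdfs-site.xml would typically need for a remote nameservice. The property names are the standard HDFS HA client settings; the NameNode ids (nn1, nn2), hostnames, and port are hypothetical placeholders, and the helper functions themselves are illustrative, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Standard HDFS client-side failover proxy provider class.
FAILOVER_PROVIDER = (
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
)


def remote_nameservice_properties(nameservice, namenodes, local_nameservices):
    """Build hdfs-site.xml properties for one remote HA nameservice.

    nameservice: logical name, e.g. "REMOTECLUSTER"
    namenodes: dict of NameNode id -> "host:port" RPC address (hypothetical values)
    local_nameservices: nameservices already configured on this cluster
    """
    props = {
        # The remote nameservice must be listed alongside the local ones.
        "dfs.nameservices": ",".join(list(local_nameservices) + [nameservice]),
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        f"dfs.client.failover.proxy.provider.{nameservice}": FAILOVER_PROVIDER,
    }
    for nn_id, rpc_address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = rpc_address
    return props


def to_hadoop_xml(props):
    """Render a property dict in the Hadoop <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical NameNode ids and addresses for the remote cluster.
    props = remote_nameservice_properties(
        "REMOTECLUSTER",
        {"nn1": "nn1.remote.example.com:8020", "nn2": "nn2.remote.example.com:8020"},
        ["LOCALCLUSTER"],
    )
    print(to_hadoop_xml(props))
```

Every peer cluster needs its own set of these entries in the configuration the RM loads, which is the coordination burden the description points out.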

Attachments


YARN-5910.01.patch   20/Dec/16 19:22   29 kB   Jian He
YARN-5910.2.patch    17/Jan/17 00:32   34 kB   Jian He
YARN-5910.3.patch    17/Jan/17 22:04   35 kB   Jian He
YARN-5910.4.patch    18/Jan/17 04:28   46 kB   Jian He
YARN-5910.5.patch    18/Jan/17 23:59   56 kB   Jian He
YARN-5910.6.patch    20/Jan/17 03:43   61 kB   Jian He
YARN-5910.7.patch    20/Jan/17 19:20   61 kB   Jian He

Issue Links

Blocked: YARN-9746 RM should merge local config for token renewal (Open)
relates to: SPARK-37205 Support mapreduce.job.send-token-conf when starting containers in YARN (Resolved)
