Description
Problem
Spawning a SparkContext in Spark client mode changes the credentials of the current user's UserGroupInformation. As a result, the client (which spawned the SparkContext) no longer talks to the NameNode using its TGT, but using delegation tokens instead. It is undesirable for a library to change the JVM-wide UserGroupInformation context in this way.
Root Cause
Spark creates HDFS delegation tokens so that the Application Master it spawns can communicate with the NameNode, but while creating these tokens Spark also adds them to the current user's credentials:
setupSecurityToken(amContainer)
UserGroupInformation.getCurrentUser().addCredentials(credentials)
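The side effect of that addCredentials call can be reproduced with hadoop-common alone. The sketch below (class name and the dummy token are ours; Spark adds a real HDFS_DELEGATION_TOKEN obtained from the NameNode) shows that adding a Credentials object to the current user mutates the JVM-wide UGI that every other caller sees:

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class CredentialsLeakDemo {
    public static void main(String[] args) throws Exception {
        UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
        System.out.println("tokens before: " + ugi.getTokens().size());

        // Simulate what Spark's YARN Client does: build a Credentials object
        // (in Spark it carries the real HDFS_DELEGATION_TOKEN) and merge it
        // into the *current* user's UGI.
        Credentials credentials = new Credentials();
        credentials.addToken(new Text("demo-token"),
            new Token<>(new byte[0], new byte[0],
                new Text("HDFS_DELEGATION_TOKEN"), new Text("nn:8020")));
        UserGroupInformation.getCurrentUser().addCredentials(credentials);

        // The token is now visible to every user of this JVM's UGI,
        // including client code that had nothing to do with Spark.
        System.out.println("tokens after: " + ugi.getTokens().size());
    }
}
```

Once a delegation token for the NameNode service is present in the UGI, Hadoop RPC will select it in preference to the TGT, which is exactly the behaviour described above.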
After this addCredentials call, the client always uses the delegation token for any further communication with the NameNode. The scenario becomes dangerous when the ResourceManager cancels the delegation token 10 minutes after the SparkContext is shut down. This leads to client-side failures like:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 444 for subroto) can't be found in cache
	at org.apache.hadoop.ipc.Client.call(Client.java:1472)
	at org.apache.hadoop.ipc.Client.call(Client.java:1403)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy13.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2095)
	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1214)
	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1210)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1210)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1409)
	at Sample.main(Sample.java:85)
There are other places in the code where we perform a similar operation, for example in:
org.apache.spark.deploy.yarn.ExecutorDelegationTokenUpdater.updateCredentialsIfRequired()
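One possible client-side mitigation (a hypothetical sketch, not a fix inside Spark) is to run the Spark driver under a separate UGI via doAs, so that whatever Spark adds through UserGroupInformation.getCurrentUser().addCredentials(...) lands in that isolated UGI rather than in the caller's. On a kerberized cluster the isolated UGI would come from UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab); createRemoteUser is used here only so the sketch runs without Kerberos:

```java
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class IsolatedSparkDriver {
    public static void main(String[] args) throws Exception {
        // Create a separate UGI to run Spark under. In a secure setup,
        // replace this with loginUserFromKeytabAndReturnUGI(principal, keytab).
        UserGroupInformation sparkUgi =
            UserGroupInformation.createRemoteUser("spark-driver");

        String ranAs = sparkUgi.doAs((PrivilegedExceptionAction<String>) () -> {
            // new JavaSparkContext(...) would go here. Inside doAs,
            // getCurrentUser() resolves to sparkUgi, so any credentials
            // Spark adds mutate sparkUgi, not the caller's UGI.
            return UserGroupInformation.getCurrentUser().getShortUserName();
        });

        // Outside doAs, the original login user's UGI (and its TGT-based
        // NameNode access) is left untouched.
        System.out.println("ran as: " + ranAs);
    }
}
```

This only contains the damage from the client's side; the underlying issue remains that Spark mutates the current user's credentials as a side effect.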