Hadoop HDFS / HDFS-13419

Client can communicate with the server even if the HDFS delegation token has expired


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      I was testing the HDFS delegation token expiry problem with Spark Streaming. If I set my batch interval to less than 10 seconds, my Spark Streaming program does not die; but if the batch interval is set to more than 10 seconds, the program dies because of the HDFS delegation token expiry problem, with the following exception:

      org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 14042 for test) is expired
      	at org.apache.hadoop.ipc.Client.call(Client.java:1468)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1399)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
      	at com.sun.proxy.$Proxy11.getListing(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:554)
      	at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
      	at com.sun.proxy.$Proxy12.getListing(Unknown Source)
      	at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1969)
      	at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1952)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
      	at com.envisioncn.arch.App$2$1.call(App.java:120)
      	at com.envisioncn.arch.App$2$1.call(App.java:91)
      	at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:218)
      	at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:218)
      	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:902)
      	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:902)
      	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
      	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
      	at org.apache.spark.scheduler.Task.run(Task.scala:86)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      

      The Spark Streaming program only calls FileSystem.listStatus in every batch:

          FileSystem fs = FileSystem.get(new Configuration());
          FileStatus[] status = fs.listStatus(new Path("/"));

          for (FileStatus status1 : status) {
              System.out.println(status1.getPath());
          }
      

      I found that when the Hadoop client sends an RPC request to the server, it first gets a connection object and sets up the connection if it does not already exist. During connection setup it obtains a SaslRpcClient to connect to the server side, and the server authenticates the client in that same stage. But if a connection already exists, the client reuses it, so the authentication stage never happens again.
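      A minimal sketch of that connection-reuse pattern (illustrative names only, not the actual org.apache.hadoop.ipc.Client code): the authenticated setup runs only when no cached connection exists, so a reused connection is never re-authenticated.

          import java.util.Map;
          import java.util.concurrent.ConcurrentHashMap;

          // Illustrative only: a simplified model of how an RPC client reuses
          // connections (the real logic lives in org.apache.hadoop.ipc.Client).
          class RpcConnectionCache {
              private final Map<String, Connection> connections = new ConcurrentHashMap<>();

              Connection getConnection(String remoteId) {
                  // Authentication happens only inside the factory, i.e. only
                  // when no cached connection exists for this remoteId.
                  return connections.computeIfAbsent(remoteId, id -> {
                      Connection c = new Connection(id);
                      c.setupIOstreams(); // SASL handshake / delegation token check happens here
                      return c;
                  });
              }

              static class Connection {
                  final String remoteId;
                  Connection(String remoteId) { this.remoteId = remoteId; }
                  void setupIOstreams() {
                      // The server authenticates the client (e.g. validates the
                      // delegation token) once, at connection setup time.
                  }
              }
          }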

      The connection between client and server is closed when its idle time exceeds ipc.client.connection.maxidletime, whose default value is 10 seconds. Therefore, as long as I keep sending requests to the server at a fixed interval smaller than 10 seconds, the connection is never closed, and the delegation token expiry problem does not show up.
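      A minimal sketch that reproduces the failing case (IdleTimeoutDemo is a hypothetical class name; it assumes a core-site.xml on the classpath pointing at a secured cluster): sleeping longer than ipc.client.connection.maxidletime between calls lets the cached connection be torn down, so the next call sets up a new connection and the expired delegation token is rejected; sleeping less than that keeps the cached connection alive and the expired token goes unnoticed.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileStatus;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class IdleTimeoutDemo {
              public static void main(String[] args) throws Exception {
                  Configuration conf = new Configuration();
                  // ipc.client.connection.maxidletime defaults to 10000 ms (10 s)
                  int maxIdleMs = conf.getInt("ipc.client.connection.maxidletime", 10000);
                  System.out.println("ipc.client.connection.maxidletime = " + maxIdleMs + " ms");

                  FileSystem fs = FileSystem.get(conf);
                  while (true) {
                      // Sleeping longer than the idle timeout lets the cached RPC
                      // connection be closed, so the next call has to set up a new
                      // connection and the delegation token is validated again.
                      Thread.sleep(maxIdleMs + 5000);
                      for (FileStatus s : fs.listStatus(new Path("/"))) {
                          System.out.println(s.getPath());
                      }
                  }
              }
          }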


      People

            Assignee: Unassigned
            Reporter: saiergon wangqiang.shen
