Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-15655

Enhance KMS client retry behavior

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.2.0, 3.1.2
    • Component/s: kms
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      KMS doesn't retry upon SocketTimeoutException (the ssl connection was established, but the handshake timed out).

      6:08:55.315 PM	WARN	KMSClientProvider	
      Failed to connect to example.com:16000
      6:08:55.317 PM	WARN	LoadBalancingKMSClientProvider	
      KMS provider at [https://example.com:16000/kms/v1/] threw an IOException: 
      java.net.SocketTimeoutException: Read timed out
      	at java.net.SocketInputStream.socketRead0(Native Method)
      	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
      	at java.net.SocketInputStream.read(SocketInputStream.java:171)
      	at java.net.SocketInputStream.read(SocketInputStream.java:141)
      	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
      	at sun.security.ssl.InputRecord.read(InputRecord.java:503)
      	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
      	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
      	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
      	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
      	at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
      	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
      	at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
      	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186)
      	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:140)
      	at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
      	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:333)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:478)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:473)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:472)
      	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:788)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
      	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
      	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
      	at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
      	at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
      	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:949)
      	at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:338)
      	at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:423)
      	at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:260)
      	at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:151)
      	at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:122)
      	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:795)
      	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2036)
      	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
      	at java.lang.Thread.run(Thread.java:748)
      6:08:55.346 PM	WARN	LoadBalancingKMSClientProvider	
      Aborting since the Request has failed with all KMS providers(depending on hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1) in the group OR the exception is not recoverable
      

        Attachments

        1. HADOOP-15655.001.patch
          1 kB
          Kitti Nanasi
        2. HADOOP-15655.002.patch
          5 kB
          Kitti Nanasi
        3. HADOOP-15655.003.patch
          18 kB
          Kitti Nanasi

          Activity

            People

            • Assignee:
              knanasi Kitti Nanasi
              Reporter:
              knanasi Kitti Nanasi
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: