Hadoop Common / HADOOP-11722

Some Instances of Services using ZKDelegationTokenSecretManager go down when old token cannot be deleted

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.7.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The delete-node code in ZKDelegationTokenSecretManager is as follows:

             while(zkClient.checkExists().forPath(nodeRemovePath) != null){
                zkClient.delete().guaranteed().forPath(nodeRemovePath);
             }
      

      When instances of a service using ZKDelegationTokenSecretManager try to delete a node simultaneously, all of them may enter the while loop, in which case every peer will try to delete the node. Only one will succeed; the rest will throw an exception, which brings down that service instance.

      The exception is as follows:

      2015-03-15 10:24:54,000 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover thread received unexpected exception
      java.lang.RuntimeException: Could not remove Stored Token ZKDTSMDelegationToken_28
      	at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.removeStoredToken(ZKDelegationTokenSecretManager.java:770)
      	at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeExpiredToken(AbstractDelegationTokenSecretManager.java:605)
      	at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.access$400(AbstractDelegationTokenSecretManager.java:54)
      	at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:656)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot/DT_28
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
      	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
      	at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:238)
      	at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:233)
      	at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
      	at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
      	at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:214)
      	at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:41)
      	at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.removeStoredToken(ZKDelegationTokenSecretManager.java:764)
      	... 4 more
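
      The race described above is a check-then-act problem: the existence check and the delete are two separate calls, so a peer can remove the node in between. One common way to tolerate this, sketched below, is to attempt the delete and treat "node already gone" as success. This is only an illustration under stated assumptions and not necessarily what the attached patches do: FakeStore, NoNodeException and TokenRemover are hypothetical in-memory stand-ins written for this example, not the Curator or ZooKeeper API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for ZooKeeper's KeeperException.NoNodeException. */
class NoNodeException extends Exception {
    NoNodeException(String path) {
        super("NoNode for " + path);
    }
}

/** Hypothetical in-memory stand-in for the ZooKeeper client (not the Curator API). */
class FakeStore {
    private final Set<String> nodes = ConcurrentHashMap.newKeySet();

    void create(String path) {
        nodes.add(path);
    }

    boolean exists(String path) {
        return nodes.contains(path);
    }

    /** Mimics a ZooKeeper delete: fails if the node is already gone. */
    void delete(String path) throws NoNodeException {
        if (!nodes.remove(path)) {
            throw new NoNodeException(path);
        }
    }
}

class TokenRemover {
    /**
     * Deletes the node and treats "already deleted" as success.
     * Returns true if this caller removed the node, false if a peer got there first.
     */
    static boolean removeTolerant(FakeStore store, String path) {
        try {
            store.delete(path);
            return true;
        } catch (NoNodeException e) {
            // A peer deleted the node before our delete ran. The goal
            // (node absent) is already met, so this is not an error.
            return false;
        }
    }
}
```

      With this shape, the first caller of TokenRemover.removeTolerant(store, path) gets true and any later caller gets false instead of an exception, so a lost race no longer propagates up to the remover thread.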
      

      Attachments

        1. HADOOP-11722.1.patch
          3 kB
          Arun Suresh
        2. HADOOP-11722.2.patch
          3 kB
          Arun Suresh

          People

            Assignee: Arun Suresh (asuresh)
            Reporter: Arun Suresh (asuresh)
            Votes: 0
            Watchers: 7
