  1. HBase
  2. HBASE-12336

RegionServer failed to shutdown for NodeFailoverWorker thread

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.94.11
    • Fix Version/s: 0.98.8, 0.94.25, 0.99.2
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      After enabling hbase.zookeeper.useMulti in an HBase cluster, we found that a RegionServer failed to shut down. All other threads had exited; only a NodeFailoverWorker thread remained, blocked in a ZooKeeper multi call:

      "ReplicationExecutor-0" prio=10 tid=0x00007f0d40195ad0 nid=0x73a in Object.wait() [0x00007f0dc8fe6000]
         java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:485)
              at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
              - locked <0x00000005a16df080> (a org.apache.zookeeper.ClientCnxn$Packet)
              at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:930)
              at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:912)
              at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:531)
              at org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1518)
              at org.apache.hadoop.hbase.replication.ReplicationZookeeper.copyQueuesFromRSUsingMulti(ReplicationZookeeper.java:804)
              at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$NodeFailoverWorker.run(ReplicationSourceManager.java:612)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:662)
      
      

      We have confirmed that the executor's shutdown method is called in ReplicationSourceManager#join.
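The observation above is consistent with standard `java.util.concurrent` behavior: `ExecutorService.shutdown()` stops accepting new tasks but does not interrupt a task that is already running, whereas `shutdownNow()` interrupts the worker threads. The following is a minimal sketch of that behavior only. It is not HBase code and not necessarily the fix applied for this issue; the class name `ShutdownDemo` and its methods are hypothetical, and a plain `Object.wait()` stands in for the blocked ZooKeeper call seen in the trace.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; a stand-in for the stuck worker, not HBase code.
public class ShutdownDemo {

    /** Starts a worker that blocks in Object.wait() and returns once it is running. */
    static ExecutorService startBlockedWorker() {
        // The default thread factory creates non-daemon threads,
        // so a blocked worker keeps the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        Object monitor = new Object();
        executor.execute(() -> {
            synchronized (monitor) {
                started.countDown();
                try {
                    while (true) {
                        monitor.wait(); // never notified; only an interrupt ends this
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // leave the task
                }
            }
        });
        try {
            started.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executor;
    }

    /** Waits for termination, wrapping the checked exception for simpler callers. */
    static boolean awaitTermination(ExecutorService executor, long millis) {
        try {
            return executor.awaitTermination(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** True if the blocked worker exits after a plain shutdown(). */
    static boolean terminatesAfterShutdown() {
        ExecutorService executor = startBlockedWorker();
        executor.shutdown(); // no interrupt is sent to the running task
        boolean terminated = awaitTermination(executor, 500);
        executor.shutdownNow(); // clean up so the demo itself can exit
        return terminated;
    }

    /** True if the blocked worker exits after shutdownNow(). */
    static boolean terminatesAfterShutdownNow() {
        ExecutorService executor = startBlockedWorker();
        executor.shutdownNow(); // interrupts the worker thread
        return awaitTermination(executor, 5000);
    }

    public static void main(String[] args) {
        System.out.println("shutdown() ended worker: " + terminatesAfterShutdown());
        System.out.println("shutdownNow() ended worker: " + terminatesAfterShutdownNow());
    }
}
```

In this sketch the first call reports `false` and the second `true`. Whether an interrupt would also release the real thread depends on how the blocked ZooKeeper call handles interruption, which the sketch does not model.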

      I am still looking for the root cause; suggestions are welcome. Thanks.

        Attachments

        1. stack
          23 kB
          Liu Shaohui
        2. HBASE-12336-trunk-v1.diff
          0.8 kB
          Liu Shaohui

            People

            • Assignee:
              Liu Shaohui (liushaohui)
            • Reporter:
              Liu Shaohui (liushaohui)
            • Votes:
              0
            • Watchers:
              6
