HBase / HBASE-23261

Region stuck in transition while splitting


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.5
    • Fix Version/s: 1.6.0, 1.4.12, 1.3.7
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      During a split, a region can get stuck in transition. After the RegionServer initiates the split, ZooKeeper has the region marked under the RIT znode. The RegionServer then hits KeeperException.BadVersion on /hbase/region-in-transition/{region-name} while transitioning the node to RS_ZK_REQUEST_REGION_SPLIT, so it runs rollback/cleanup of the failed split. Even after the rollback succeeds, the region sometimes stays in transition. The logs below are listed newest first.

       

      2019-11-05 04:07:17,711 INFO [splits-1572926837064] regionserver.SplitRequest - Successful rollback of failed split of TABLE1,1572894157455.257ff8985e7a169af0514208b3b0b430.
      
      2019-11-05 04:07:17,688 INFO [splits-1572926837064] regionserver.SplitRequest - Running rollback/cleanup of failed split of TABLE1,1572894157455.257ff8985e7a169af0514208b3b0b430.; Failed getting SPLITTING znode on TABLE1,1572894157455.257ff8985e7a169af0514208b3b0b430.
      java.io.IOException: Failed getting SPLITTING znode on TABLE1,1572894157455.257ff8985e7a169af0514208b3b0b430.
          at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:203)
          at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:383)
          at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
          at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561)
          at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
          at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:153)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.io.IOException: Failed transition of splitting node TABLE1,1572894157455.257ff8985e7a169af0514208b3b0b430.
          at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.transitionSplittingNode(ZKSplitTransactionCoordination.java:132)
          at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:161)
          ... 8 more
      Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /hbase/region-in-transition/257ff8985e7a169af0514208b3b0b430
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
          at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1336)
          at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:442)
          at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:818)
          at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:871)
          at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.transitionSplittingNode(ZKSplitTransactionCoordination.java:128)
          ... 9 more
      
      2019-11-05 04:07:17,688 INFO [.Worker-pool3-t26826] master.RegionStates - Transition {257ff8985e7a169af0514208b3b0b430 state=OPEN, ts=1572923178845, server=rsserver.net,60020,1572890688075} to {257ff8985e7a169af0514208b3b0b430 state=SPLITTING, ts=1572926837688, server=rsserver.net,60020,1572890688075}
      
      2019-11-05 04:07:17,680 INFO [myid:5] [ead(sid:5 cport:-1):] server.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x36dd5dc94536a3e type:setData cxid:0x8f8a zxid:0x304fd98ef txntype:-1 reqpath:n/a Error Path:/hbase/region-in-transition/257ff8985e7a169af0514208b3b0b430 Error:KeeperErrorCode = BadVersion for /hbase/region-in-transition/257ff8985e7a169af0514208b3b0b430
      
      2019-11-05 04:07:17,668 DEBUG [.Worker-pool3-t26826] master.AssignmentManager - Handling RS_ZK_REQUEST_REGION_SPLIT, server=rsserver.net,60020,1572890688075, region=257ff8985e7a169af0514208b3b0b430, current_state={257ff8985e7a169af0514208b3b0b430 state=OPEN, ts=1572923178845, server=rsserver.net,60020,1572890688075}
      
      2019-11-05 04:07:17,661 DEBUG [splits-1572926837064] coordination.ZKSplitTransactionCoordination - Still waiting for master to process the pending_split for 257ff8985e7a169af0514208b3b0b430
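
      The BadVersion error in the trace comes from a version-checked setData on the RIT znode. As a rough illustration only, the sketch below replays one plausible interleaving in plain Java: a writer remembers the znode version, a second party updates the znode first, and the first writer's conditional write is then rejected. Everything here is hypothetical (the class VersionedZnodeSketch, its Znode and BadVersionException types, and the "MASTER_UPDATE" payload); it is an in-memory stand-in, not HBase or ZooKeeper code, and it does not claim to be the exact sequence behind this issue.

```java
/** Hypothetical in-memory model of a version-checked znode write; not HBase or ZooKeeper code. */
final class VersionedZnodeSketch {

    /** Stand-in for the idea behind KeeperException.BadVersionException. */
    static final class BadVersionException extends Exception {
        BadVersionException(String message) {
            super(message);
        }
    }

    /** A node holding data plus a version that increases by one on each successful write. */
    static final class Znode {
        private String data;
        private int version = 0;

        Znode(String data) {
            this.data = data;
        }

        synchronized int getVersion() {
            return version;
        }

        synchronized String getData() {
            return data;
        }

        /** Conditional write: succeeds only when expectedVersion matches the current version. */
        synchronized int setData(String newData, int expectedVersion) throws BadVersionException {
            if (expectedVersion != version) {
                throw new BadVersionException(
                    "BadVersion: expected " + expectedVersion + ", actual " + version);
            }
            data = newData;
            version++;
            return version;
        }
    }

    /** Replays the assumed interleaving; returns true if the stale write is rejected. */
    static boolean regionServerWriteRejected() {
        Znode rit = new Znode("RS_ZK_REQUEST_REGION_SPLIT");

        // The region server remembers the version it last saw.
        int versionSeenByRegionServer = rit.getVersion();

        try {
            // Assumption: another party (for example the master) updates the znode first.
            rit.setData("MASTER_UPDATE", rit.getVersion());

            // The region server now writes with its stale version and is rejected.
            rit.setData("RS_ZK_REQUEST_REGION_SPLIT", versionSeenByRegionServer);
            return false;
        } catch (BadVersionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("region server write rejected: " + regionServerWriteRejected());
    }
}
```

      In this model the rejected writer has no way to tell what the other party did with the znode in the meantime, which is why a rollback that only undoes the writer's own side can still leave the other side believing the region is in transition.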
      

       

       

      Attachments

        1. HBASE-23261.branch-1.3.003.patch
          4 kB
          Viraj Jasani
        2. HBASE-23261.branch-1.3.002.patch
          4 kB
          Viraj Jasani
        3. HBASE-23261.branch-1.3.001.patch
          3 kB
          Viraj Jasani
        4. HBASE-23261.branch-1.3.000.patch
          3 kB
          Viraj Jasani
        5. HBASE-23261.branch-1.3.000.patch
          3 kB
          Viraj Jasani

            People

              Assignee: Viraj Jasani (vjasani)
              Reporter: Viraj Jasani (vjasani)