  HBase / HBASE-21464

Splitting blocked with meta NSRE during split transaction

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
    • Fix Version/s: 1.4.9
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Splitting is blocked during the split transaction. The split worker is trying to update hbase:meta but cannot relocate it after a NotServingRegionException (NSRE):

      2018-11-09 17:50:45,277 INFO  [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
              at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
              at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
              at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
              at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
              at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
              at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
              at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
              at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0
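
      The log above shows the caller retrying against the same stale meta location (tries=13) instead of looking it up again. As a minimal sketch of the behavior one would expect, the retry loop below forces a fresh location lookup after an NSRE. All types here (LocationCache, MetaCall, the local NotServingRegionException) are hypothetical stand-ins, not the HBase client API, and this is not necessarily how the attached patch addresses the problem.

```java
final class MetaRelocateSketch {
    /** Hypothetical stand-in for HBase's NotServingRegionException. */
    static class NotServingRegionException extends Exception {
        NotServingRegionException(String msg) { super(msg); }
    }

    /** Hypothetical location cache: returns the server believed to host hbase:meta. */
    interface LocationCache {
        String locate(boolean forceReload);
    }

    /** Hypothetical RPC issued against the server returned by the cache. */
    interface MetaCall<T> {
        T call(String server) throws NotServingRegionException;
    }

    /** Retries the call, bypassing the cached location after every NSRE. */
    static <T> T callWithRelocate(LocationCache cache, MetaCall<T> rpc, int maxTries)
            throws NotServingRegionException {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be at least 1");
        }
        boolean forceReload = false;
        NotServingRegionException last = null;
        for (int tries = 0; tries < maxTries; tries++) {
            String server = cache.locate(forceReload);
            try {
                return rpc.call(server);
            } catch (NotServingRegionException e) {
                last = e;
                // The cached location is stale: look meta up again on the next attempt.
                forceReload = true;
            }
        }
        throw last;
    }
}
```

      Without the forceReload step, every attempt would go back to the server that no longer hosts hbase:meta, which matches the symptom in the log.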

      Clients, in this case YCSB, are hung with part of the keyspace missing:

      2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying after sleep of 20158 because: No server address listed in hbase:meta for region test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. containing row user3301635648728421323
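
      The 20158 ms sleep at attempt 14 is consistent with a capped backoff multiplier plus a small random jitter. The sketch below illustrates that calculation. The multiplier table is an assumption recalled from the branch-1 client (HConstants.RETRY_BACKOFF), and the 100 ms base pause is the assumed default of hbase.client.pause; verify both against the version in use. The class and method names are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

final class RetryBackoffSketch {
    // Assumed multipliers; check HConstants.RETRY_BACKOFF in your HBase version.
    static final int[] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    /** Base pause before jitter: the multiplier index is capped at the last table entry. */
    static long basePauseMs(long pauseMs, int tries) {
        int idx = Math.max(0, Math.min(tries, RETRY_BACKOFF.length - 1));
        return pauseMs * RETRY_BACKOFF[idx];
    }

    /** Base pause plus up to 1% random jitter. */
    static long pauseWithJitterMs(long pauseMs, int tries) {
        long base = basePauseMs(pauseMs, tries);
        long jitter = (long) (base * ThreadLocalRandom.current().nextFloat() * 0.01f);
        return base + jitter;
    }
}
```

      Under these assumptions, attempt 14 with a 100 ms base pause gives a 20000 ms base sleep plus at most 200 ms of jitter, so a stuck client waits roughly 20 seconds between each of the remaining attempts.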

      Balancing is blocked indefinitely because the split transaction is stuck:

      2018-11-09 17:49:55,478 DEBUG [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=8100] master.HMaster: Not running balancer because 3 region(s) in transition: [{ef90030b05cb02305b75e9bfbc3ee081 state=SPLITTING_NEW, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute.internal,8120,1541785626417}, {5ba6596f0050c2dab969d152829227c6 state=SPLITTING, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute....

       

        Attachments

        1. HBASE-21464-branch-1.patch
          11 kB
          Andrew Kyle Purtell
        2. HBASE-21464-branch-1.patch
          10 kB
          Andrew Kyle Purtell
        3. HBASE-21464-branch-1.patch
          3 kB
          Andrew Kyle Purtell
        4. HBASE-21464-branch-1.patch
          3 kB
          Andrew Kyle Purtell

              People

              • Assignee: Andrew Kyle Purtell (apurtell)
              • Reporter: Andrew Kyle Purtell (apurtell)
