  1. HBase
  2. HBASE-21648

[rsgroup] hbase shell "move_servers_rsgroup" or "balance_rsgroup" fails when it encounters a split region.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Balancer, rsgroup
    • Labels:
      None

      Description

      An example:

      We have a table "A" which is in RSGroup "group1". "bd806f94a53be74e65bd76e1e6e16e5a" is a region of A and is open on RS "rs1".

      Two steps reproduce this bug:

      step1: Split region bd806f94a53be74e65bd76e1e6e16e5a

      step2: Before the split parent region is cleaned up by CatalogJanitor, the client runs one of these shell commands: move_servers_rsgroup 'group2', ['rs1:60020'] or balance_rsgroup 'group1'

      The client then gets the exceptions below, and the move of the remaining regions is interrupted.

      ERROR: org.apache.hadoop.hbase.client.DoNotRetryRegionException: bd806f94a53be74e65bd76e1e6e16e5a is not OPEN
          at org.apache.hadoop.hbase.master.procedure.AbstractStateMachineTableProcedure.checkOnline(AbstractStateMachineTableProcedure.java:189)
          at org.apache.hadoop.hbase.master.assignment.MoveRegionProcedure.<init>(MoveRegionProcedure.java:71)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:755)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.move(AssignmentManager.java:560)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveServers(RSGroupAdminServer.java:349)
          at org.apache.hadoop.hbase.rsgroup.FGRSGroupAdminServer.moveServers(FGRSGroupAdminServer.java:119)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.moveServers(RSGroupAdminEndpoint.java:209)
          at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13870)
          at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:813)
          at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
          at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
      
      For usage try 'help "move_servers_rsgroup"'
      
      ERROR: org.apache.hadoop.hbase.client.DoNotRetryRegionException: bd806f94a53be74e65bd76e1e6e16e5a is not OPEN
          at org.apache.hadoop.hbase.master.procedure.AbstractStateMachineTableProcedure.checkOnline(AbstractStateMachineTableProcedure.java:189)
          at org.apache.hadoop.hbase.master.assignment.MoveRegionProcedure.<init>(MoveRegionProcedure.java:71)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:755)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:565)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.balanceRSGroup(RSGroupAdminServer.java:516)
          at org.apache.hadoop.hbase.rsgroup.FGRSGroupAdminServer.balanceRSGroup(FGRSGroupAdminServer.java:164)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.balanceRSGroup(RSGroupAdminEndpoint.java:296)
          at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13890)
          at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:813)
          at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
          at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
      
      For usage try 'help "balance_rsgroup"'

      After splitting, the parent region is no longer used and will eventually be cleaned up by CatalogJanitor. So should we skip it when running move_servers_rsgroup or balance_rsgroup?
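
      A minimal sketch of the proposed behaviour, for discussion only. It assumes the caller can tell whether a region is a split parent. The names Region, splitParent and movableRegions are hypothetical stand-ins for illustration; they are not HBase's API, and this is not a patch against RSGroupAdminServer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipSplitParents {

    // Hypothetical stand-in for a region; NOT HBase's RegionInfo.
    static final class Region {
        final String encodedName;
        // Assumed flag: true once the region has been split and is only
        // waiting to be cleaned up by CatalogJanitor.
        final boolean splitParent;

        Region(String encodedName, boolean splitParent) {
            this.encodedName = encodedName;
            this.splitParent = splitParent;
        }
    }

    // Returns the regions that should still be moved, skipping split parents
    // so that one leftover parent does not abort the whole operation.
    static List<Region> movableRegions(List<Region> regionsOnServer) {
        List<Region> result = new ArrayList<>();
        for (Region region : regionsOnServer) {
            if (region.splitParent) {
                continue; // no longer serving; nothing to move
            }
            result.add(region);
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative data: the split parent from this report plus two
        // made-up daughter names.
        List<Region> onRs1 = Arrays.asList(
            new Region("bd806f94a53be74e65bd76e1e6e16e5a", true),
            new Region("daughter-a", false),
            new Region("daughter-b", false));

        for (Region region : movableRegions(onRs1)) {
            System.out.println("would move " + region.encodedName);
        }
    }
}
```

      With a filter like this applied before the per-region move loop, the split parent would simply be left for CatalogJanitor and the remaining regions would still be moved.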

            People

            • Assignee:
              Unassigned
              Reporter:
              xuming xuming
            • Votes:
              0
              Watchers:
              4
