HBASE-17276

Reduce log spam from WrongRegionException in large multi()'s

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0, 0.98.24, 2.0.0
    • Component/s: regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The following spam drives me up a wall in the regionserver log:

      2016-12-05 05:53:05,085 WARN  [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020] regionserver.HRegion: Batch mutation had a row that does not belong to this region
      org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for doMiniBatchMutation on HRegion IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431., startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8', getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94', row='\x0C\xD2\xA5\xA3\x99\xC7\xE0Q!\x15^\xA6\x90\x1E\xA3\xAD'
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
      	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
      	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
      	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
      	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
      	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
      	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
      	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
      	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
      2016-12-05 05:53:05,086 WARN  [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020] regionserver.HRegion: Batch mutation had a row that does not belong to this region
      org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for doMiniBatchMutation on HRegion IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431., startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8', getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94', row='\x0E\xE7\xFA[\x8D\x93;\xF4\xC7F\xF9\x85\x84\x85\xF3\x0E'
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
      	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
      	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
      	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
      	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
      	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
      	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
      	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
      	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
      2016-12-05 05:53:05,087 WARN  [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020] regionserver.HRegion: Batch mutation had a row that does not belong to this region
      org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for doMiniBatchMutation on HRegion IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431., startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8', getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94', row='\x16-\xFC\x99\xF5c\x08\xFA\x1D\x84\x86\xD2\x18\xB1\x03q'
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
      	at org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
      	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
      	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
      	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
      	at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
      	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
      	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
      	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
      	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
      	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
      

      With enough replication traffic that is delayed or just slow, you can end up with a 64MB batch of updates sent to a Region, all of whose rows are hosted on a different RegionServer by the time the RS processes the batch.

      In a run of IntegrationTestReplication that was particularly "slow"/oversaturated, I saw 1.591M log lines taken up by this message out of a total of 1.597M lines (99.6% of the log). I propose that after the first WrongRegionException we see in doMiniBatchMutation, we stop printing the stacktrace for the rest of the batch (saving 13 lines for every subsequent occurrence).
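
      To illustrate the proposal, here is a minimal, self-contained sketch of "full stacktrace once per batch, message only afterwards". It is not the code from the attached patches and does not use HBase classes: `WrongRegionLogSketch`, `OutOfRangeException`, `checkRow` and `processBatch` are hypothetical stand-ins, rows are plain strings, and log lines are collected in a list instead of going through a real logger.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WrongRegionLogSketch {

    /** Hypothetical stand-in for WrongRegionException; not the real HBase class. */
    static class OutOfRangeException extends Exception {
        OutOfRangeException(String message) {
            super(message);
        }
    }

    /** Hypothetical range check: a row belongs here if startKey <= row < endKey. */
    static void checkRow(String row, String startKey, String endKey) throws OutOfRangeException {
        if (row.compareTo(startKey) < 0 || row.compareTo(endKey) >= 0) {
            throw new OutOfRangeException("Requested row out of range, startKey='" + startKey
                + "', endKey='" + endKey + "', row='" + row + "'");
        }
    }

    /** Processes one batch and returns the log entries it would have produced. */
    static List<String> processBatch(List<String> rows, String startKey, String endKey) {
        List<String> log = new ArrayList<>();
        boolean stackTraceLogged = false; // reset for every batch
        for (String row : rows) {
            try {
                checkRow(row, startKey, endKey);
                // A real implementation would apply the mutation here.
            } catch (OutOfRangeException e) {
                String prefix = "WARN Batch mutation had a row that does not belong to this region";
                if (!stackTraceLogged) {
                    // First failure in the batch: keep the full stacktrace for debugging.
                    StringWriter trace = new StringWriter();
                    e.printStackTrace(new PrintWriter(trace, true));
                    log.add(prefix + "\n" + trace);
                    stackTraceLogged = true;
                } else {
                    // Later failures: one line with the message, no stacktrace.
                    log.add(prefix + ": " + e.getMessage());
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // "a", "b" and "z" fall outside [l, y); "m" is in range.
        List<String> log = processBatch(Arrays.asList("a", "b", "m", "z"), "l", "y");
        log.forEach(System.out::println);
    }
}
```

      The flag is scoped to a single batch, so every multi() that hits the wrong region still leaves one full stacktrace in the log, while each further bad row in that batch costs one line.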

        Attachments

        1. HBASE-17276.001.patch
          8 kB
          Josh Elser
        2. HBASE-17276.002.patch
          8 kB
          Josh Elser

            People

            • Assignee:
              elserj Josh Elser
              Reporter:
              elserj Josh Elser