  1. HBase
  2. HBASE-7251

Avoid flooding the logs when a client disconnects during a batch get operation

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.94.2
    • Fix Version/s: 0.94.4
    • Component/s: None
    • Labels:

      Description

      Background:

      A developer at our company wanted to read data from HBase in batches, using code like the following (for illustration only, not runnable):

      List<Get> gets = new ArrayList<Get>();

      for (int i = 0; i < n; ++i) {
        gets.add(new Get(Bytes.toBytes("some row key here")));

        if (i % 10000 == 0) {
          Result[] results = htable.get(gets);
          // process results here (code omitted)
          // note: gets is never cleared, so it keeps growing
        }
      }

      As you can see, the developer forgot to call "gets.clear()" after each "htable.get()" operation, so the list keeps growing and every call requests all the earlier rows again.
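      For comparison, here is a minimal sketch of the batching pattern with the missing clear step added. It does not use the HBase client: the fetch function is a hypothetical stand-in for "htable.get(gets)", and the class and method names (BatchGetSketch, fetchInBatches) are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class BatchGetSketch {
    /**
     * Passes keys to fetch in batches of at most batchSize and clears the
     * pending list after every call. fetch stands in for htable.get(gets).
     */
    static <K, R> List<R> fetchInBatches(List<K> keys, int batchSize,
                                         Function<List<K>, List<R>> fetch) {
        List<R> results = new ArrayList<>();
        List<K> pending = new ArrayList<>();
        for (K key : keys) {
            pending.add(key);
            if (pending.size() == batchSize) {
                // Hand over a copy so clearing pending cannot affect the callee.
                results.addAll(fetch.apply(new ArrayList<>(pending)));
                pending.clear(); // the step missing from the original code
            }
        }
        if (!pending.isEmpty()) {
            // Flush the final partial batch.
            results.addAll(fetch.apply(new ArrayList<>(pending)));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            keys.add("row-" + i);
        }
        List<Integer> batchSizes = new ArrayList<>();
        // Fake fetch: records the batch size and echoes the keys back as results.
        List<String> results = fetchInBatches(keys, 10, batch -> {
            batchSizes.add(batch.size());
            return batch;
        });
        System.out.println(results.size() + " results, batch sizes " + batchSizes);
        // prints: 25 results, batch sizes [10, 10, 5]
    }
}
```

      With the clear in place, no request carries more than batchSize keys, whereas the original loop sent an ever-growing list on each call.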

      One region server became very slow and crashed with an OOM after 30 minutes, leaving a 15 GB log file.
      The log was flooded with entries like the following:

      ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
      org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call multi(org.apache.hadoop.hbase.client.MultiAction@49540d8d), rpc version=1, client version=29, methodsFingerPrint=-56040613 from 10.1.1.1:57933 after 3980 ms, since caller disconnected
      at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
      at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3468)
      at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3425)
      at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3449)
      at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4198)
      at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4171)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1993)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3410)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:601)
      at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
      at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1409)

      Fix:
      The server stops the current "get" but cannot exit the "for" loop, so it keeps writing the same exception and floods the log.
      My patch logs a single exception instead of a flood of them.
      More importantly, the server stops processing immediately if the client times out or disconnects.
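      The idea behind the fix can be sketched as follows. This is not the attached patch and does not use HBase classes: CallerDisconnectedException here is a local stand-in, and Action, runAll and DisconnectLoopSketch are hypothetical names. The sketch only shows a loop that logs once and stops on the first disconnect.

```java
import java.util.ArrayList;
import java.util.List;

class DisconnectLoopSketch {
    /** Local stand-in for HBase's CallerDisconnectedException. */
    static class CallerDisconnectedException extends Exception {
        CallerDisconnectedException(String message) {
            super(message);
        }
    }

    /** Hypothetical unit of work; throws once the caller has gone away. */
    interface Action {
        String run() throws CallerDisconnectedException;
    }

    /**
     * Runs the actions in order. On the first disconnect it records one log
     * line and leaves the loop, rather than logging once per remaining action.
     */
    static List<String> runAll(List<Action> actions, List<String> log) {
        List<String> results = new ArrayList<>();
        for (Action action : actions) {
            try {
                results.add(action.run());
            } catch (CallerDisconnectedException e) {
                log.add("ERROR " + e.getMessage());
                break; // stop processing: nobody is waiting for the results
            }
        }
        return results;
    }

    public static void main(String[] args) {
        int[] attempted = {0};
        List<Action> actions = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int index = i;
            actions.add(() -> {
                attempted[0]++;
                if (index >= 2) { // simulate the client going away after two gets
                    throw new CallerDisconnectedException("caller disconnected");
                }
                return "result-" + index;
            });
        }
        List<String> log = new ArrayList<>();
        List<String> results = runAll(actions, log);
        System.out.println(results.size() + " results, " + log.size()
                + " log line(s), " + attempted[0] + " actions attempted");
        // prints: 2 results, 1 log line(s), 3 actions attempted
    }
}
```

      Without the break, the same loop would attempt all five actions and log three errors, which is the flooding behaviour in miniature.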

      Test:
      I ran the faulty code (without "gets.clear()") against HBase. When fetching results became very slow, I pressed Ctrl+C. Only ONE CallerDisconnectedException was logged and the server stopped reading immediately. The last log entry was:

      WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020 caught a ClosedChannelException, this means that the server was processing a request but the client went away. The error message was: null

        Attachments

        1. HBASE-7251.patch
          3 kB
          Fengdong Yu

            People

            • Assignee: Unassigned
            • Reporter: azuryy Fengdong Yu