Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-21196

HTableMultiplexer clears the meta cache after every put operation

    XMLWordPrintableJSON

    Details

    • Hadoop Flags:
      Reviewed

      Description

      Problem: Operations which use AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, MultiResponse, int) API with tablename set to null reset the meta cache of the corresponding server after each call. One such operation is put operation of HTableMultiplexer (Might not be the only one). This may impact the performance of the system severely as all new ops directed to that server will have to go to zk first to get the meta table address and then get the location of the table region as it will become empty after every htablemultiplexer put.

      From the logs below, one can see after every other put the cached region locations are cleared. As a side effect of this, before every put the server needs to contact zk and get meta table location and read meta to get region locations of the table.

      2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
      2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
      2018-09-13 22:21:15,515 TRACE [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" request_param: true priority: 0 timeout: 60000 totalRequestSize: 137 bytes
      2018-09-13 22:21:15,515 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 137 connection: 127.0.0.1:42338 executing as root1
      2018-09-13 22:21:15,515 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 137 connection: 127.0.0.1:42338 param: region= testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 totalTime: 0
      2018-09-13 22:21:15,516 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, allocations=1
      2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, callTime: 2ms
      2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan table=hbase:meta, startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999
      2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): Advancing internal small scanner to startKey at 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999'
      2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up meta region location in ZK, connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
      

      From the minicluster logs HTableMultiplexer1000Puts.UT.txt one can see that the string "Removed all cached region locations that map" and "Looking up meta region location in ZK" are present for every put.

      Analysis:
      The problem occurs as we call the cleanServerCache method always clears the server cache in case tablename is null and exception is null. See AsyncRequestFutureImpl.java#L918

      private void cleanServerCache(ServerName server, Throwable regionException) {
          if (tableName == null && ClientExceptionsUtil.isMetaClearingException(regionException)) {
            // For multi-actions, we don't have a table name, but we want to make sure to clear the
            // cache in case there were location-related exceptions. We don't to clear the cache
            // for every possible exception that comes through, however.
            asyncProcess.connection.clearCaches(server);
          }
        }
      

      The problem isĀ  ClientExceptionsUtil.isMetaClearingException(regionException)) assumes that the caller should take care of null exception check before calling the method i.e. it will return true if the passed exception is null, which may not be a correct assumption.

        Attachments

        1. HTableMultiplexer1000Puts.UT.txt
          7.57 MB
          Nihal Jain
        2. HBASE-21196.master.001.patch
          8 kB
          Nihal Jain
        3. HBASE-21196.master.001.patch
          8 kB
          Nihal Jain
        4. HBASE-21196.master.002.patch
          8 kB
          Nihal Jain
        5. HBASE-21196-branch-1.patch
          9 kB
          Andrew Kyle Purtell

          Issue Links

            Activity

              People

              • Assignee:
                nihaljain.cs Nihal Jain
                Reporter:
                nihaljain.cs Nihal Jain
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: