HBase / HBASE-13850

Check for dead server on CallTimeoutException

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 1.2.0, 2.0.0
    • Fix Version/s: None
    • Component/s: Client, MTTR
    • Labels: None

      Description

      Warning: this may be a misconfiguration, so let me know if there is a timeout parameter I should set.

      hbase-site.xml
      zookeeper.session.timeout 10000
      hbase.regionserver.storefile.refresh.period 10000
      hbase.client.operation.timeout 5000
      hbase.client.meta.operation.timeout 5000
      hbase.client.scanner.timeout.period 10000
      hbase.regionserver.lease.period 10000
      

      I have a test that suspends a region server (RS) with kill -STOP and then tries to query it.
      With this configuration the ZooKeeper session timeout is 10 seconds; the master correctly reassigns the regions after 10 seconds, and meta is updated.

      The client keeps querying the RS for a specific row until it gets a response. The table.get(row) call in the loop throws a CallTimeoutException every 5 seconds, which matches the configured timeout. But instead of succeeding after 2 or 3 retries (more than 10 seconds in, once the master has reassigned the region), it keeps retrying for up to 60 seconds. I don't know where that 60 seconds comes from; it may be a configuration parameter that I have not been able to find.
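
As a rough check of the "2 or 3 retries" expectation, the arithmetic can be sketched in plain Java. This is an idealized estimate only: it assumes back-to-back attempts with no retry backoff and a reassignment that completes exactly when the ZooKeeper session expires. `RetryEstimate` and `timedOutAttempts` are hypothetical names used for illustration, not HBase APIs.

```java
public class RetryEstimate {
    /** Number of attempts that time out before the reassignment deadline has passed. */
    static long timedOutAttempts(long reassignAfterMs, long callTimeoutMs) {
        // Ceiling division: an attempt still in flight at the deadline also times out.
        return (reassignAfterMs + callTimeoutMs - 1) / callTimeoutMs;
    }

    public static void main(String[] args) {
        long zkSessionTimeoutMs = 10_000; // zookeeper.session.timeout from the report
        long operationTimeoutMs = 5_000;  // hbase.client.operation.timeout from the report
        long observedWindowMs = 60_000;   // retry window observed in the report

        System.out.println("expected timed-out attempts: "
                + timedOutAttempts(zkSessionTimeoutMs, operationTimeoutMs));   // prints 2
        System.out.println("attempts in the observed window: "
                + (observedWindowMs / operationTimeoutMs));                    // prints 12
    }
}
```

Under these assumptions the third attempt should already reach the new location, whereas the observed 60-second window corresponds to about twelve timed-out attempts.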

      One simple fix in the code is to handle the CallTimeoutException in RegionServerCallable and clear the meta cache entries for the RS that is not responding. (But maybe there is already a configuration setting that reduces that 60-second period.)
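
The idea behind that fix can be illustrated with a toy model. The sketch below is not the attached patch and does not use the HBase client API: `Cluster`, `Client`, `simulate`, and the local `CallTimeoutException` class are hypothetical stand-ins. It also assumes the master has already reassigned the region by the time the client retries, which in the real scenario only holds after the ZooKeeper session has expired.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleLocationSketch {
    /** Hypothetical stand-in for the client-side timeout named in the report. */
    static class CallTimeoutException extends RuntimeException {
        CallTimeoutException(String message) { super(message); }
    }

    /** Toy cluster: "meta" is the authoritative region-to-server map. */
    static class Cluster {
        final Map<String, String> meta = new HashMap<>();
        String hungServer; // simulates a region server suspended with kill -STOP

        String get(String server, String region) {
            if (server.equals(hungServer)) {
                throw new CallTimeoutException("no reply from " + server);
            }
            return region + "@" + server;
        }
    }

    /** Toy client that caches region locations and can evict them on timeout. */
    static class Client {
        final Cluster cluster;
        final boolean clearCacheOnTimeout;
        final Map<String, String> locationCache = new HashMap<>();

        Client(Cluster cluster, boolean clearCacheOnTimeout) {
            this.cluster = cluster;
            this.clearCacheOnTimeout = clearCacheOnTimeout;
        }

        /** Returns the number of attempts used, or -1 if every attempt timed out. */
        int attemptsUntilSuccess(String region, int maxAttempts) {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                // Consult meta only when nothing is cached for this region.
                String server = locationCache.computeIfAbsent(region, r -> cluster.meta.get(r));
                try {
                    cluster.get(server, region);
                    return attempt;
                } catch (CallTimeoutException e) {
                    if (clearCacheOnTimeout) {
                        // Proposed behavior: drop every cached location on the silent server.
                        locationCache.values().removeIf(server::equals);
                    }
                }
            }
            return -1;
        }
    }

    /** Scenario from the report: cache rs1, suspend rs1, reassign to rs2, then query. */
    static int simulate(boolean clearCacheOnTimeout, int maxAttempts) {
        Cluster cluster = new Cluster();
        cluster.meta.put("region-a", "rs1");
        Client client = new Client(cluster, clearCacheOnTimeout);
        client.attemptsUntilSuccess("region-a", 1); // warm the cache while rs1 is healthy
        cluster.hungServer = "rs1";
        cluster.meta.put("region-a", "rs2");        // master has already reassigned
        return client.attemptsUntilSuccess("region-a", maxAttempts);
    }

    public static void main(String[] args) {
        System.out.println("without eviction: " + simulate(false, 12)); // prints -1
        System.out.println("with eviction:    " + simulate(true, 12));  // prints 2
    }
}
```

In this model the client without eviction never leaves the suspended server, while the client that evicts on timeout re-reads the location and succeeds on its second attempt.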

        Attachments

        1. TestGetPerf.java
          19 kB
          Matteo Bertozzi
        2. HBASE-13850-v0.patch
          1 kB
          Matteo Bertozzi


              People

              • Assignee: Hua Xiang (huaxiang)
              • Reporter: Matteo Bertozzi (mbertozzi)
              • Votes: 0
              • Watchers: 5
