[HBASE-16388] Prevent client threads being blocked by only one slow region server - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.4.0, 2.0.0
Component/s: None
Labels: None
Hadoop Flags: Reviewed
Release Note:

Adds a new configuration, hbase.client.perserver.requests.threshold, that limits the maximum number of concurrent requests to a single region server. If the user issues a new request after the limit is reached, the client throws ServerTooBusyException and does not send the request to the server. This is a client-side feature: it prevents the client's threads from being blocked by one slow region server, which would otherwise make the client's availability much lower than that of the region servers.

For completeness, here is the extract on the new config from hbase-default.xml:

Property: hbase.client.perserver.requests.threshold
Default: 2147483647
Description: The max number of concurrent pending requests for one server in all client threads (process level). Exceeding requests will be thrown ServerTooBusyException immediately to prevent user's threads being occupied and blocked by only one slow region server. If you use a fix number of threads to access HBase in a synchronous way, set this to a suitable value which is related to the number of threads will help you. See https://issues.apache.org/jira/browse/HBASE-16388 for details.

Description

A common use case for HBase users is a service with several threads/handlers, each with its own Table/HTable instance. Users generally assume that the handlers are independent and won't interact with each other.

However, in an extreme case where a region server is so slow that every request to it times out, the service's handlers may all be occupied by these long-waiting requests, so that even requests to other RSs time out.

For example:
Suppose a client service has 100 handlers (with a timeout of 1000ms) and HBase has 10 region servers whose average response time is 50ms. If no region server is slow, the service can handle 2000 requests per second.
Now suppose this service's QPS is 1000 and one region server becomes so slow that all requests to it time out. Users expect that only 10% of requests fail and that the other 90% still respond in 50ms, because only 10% of requests go to the slow RS. However, each second brings 100 long-waiting requests, which occupy exactly all 100 handlers. So all handlers are blocked, and the availability of the service is almost zero.
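
The arithmetic in this example can be checked with a small sketch. It assumes requests are spread evenly over the 10 region servers and uses Little's law (busy handlers = arrival rate x time each request holds a handler); the class and method names are made up for illustration.

```java
// Back-of-the-envelope model of the example above; not part of HBase.
class HandlerOccupancy {
    /** Average number of handlers held by requests arriving at ratePerSec, each holding a handler for holdMs. */
    static double busyHandlers(double ratePerSec, double holdMs) {
        return ratePerSec * holdMs / 1000.0;
    }

    /** Maximum requests per second a pool of handlers can serve when every request takes latencyMs. */
    static double maxThroughput(int handlers, double latencyMs) {
        return handlers * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        int handlers = 100;
        int regionServers = 10;
        double qps = 1000.0;
        double normalLatencyMs = 50.0;
        double timeoutMs = 1000.0;

        // Assumption: each region server receives an equal share of the traffic.
        double qpsToSlowRs = qps / regionServers;
        double busyOnSlow = busyHandlers(qpsToSlowRs, timeoutMs);
        double busyOnHealthy = busyHandlers(qps - qpsToSlowRs, normalLatencyMs);

        System.out.println("max throughput, all healthy: " + maxThroughput(handlers, normalLatencyMs));
        System.out.println("handlers demanded by slow RS: " + busyOnSlow);
        System.out.println("handlers demanded by healthy RSs: " + busyOnHealthy);
        System.out.println("total demand vs. pool size: " + (busyOnSlow + busyOnHealthy) + " / " + handlers);
    }
}
```

Under these assumptions the slow RS alone demands as many handlers as the pool contains, which is why requests to the healthy servers are starved as well.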

To prevent this, we can limit the maximum number of concurrent requests to one RS at the process level. Requests exceeding the limit immediately throw ServerTooBusyException (which extends DoNotRetryIOException) to the user. In the above case, if we set this limit to 20, at most 20 handlers are occupied by the slow RS and the other 80 handlers can still handle requests to other RSs. The availability of this service is then 90%, as expected.
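
The idea of a process-level, per-server cap can be sketched roughly as follows. This is only an illustration of the technique, not the attached patch: PerServerLimiter and TooBusyException are hypothetical names, and the real client may track pending requests differently.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the exception described in this issue; not the HBase class. */
class TooBusyException extends IOException {
    TooBusyException(String msg) { super(msg); }
}

/** Illustrative process-wide cap on in-flight requests per server, shared by all threads. */
class PerServerLimiter {
    private final int threshold;
    private final ConcurrentHashMap<String, AtomicInteger> inFlight = new ConcurrentHashMap<>();

    PerServerLimiter(int threshold) { this.threshold = threshold; }

    /** Runs the request, or fails fast if the server already has `threshold` requests pending. */
    <T> T call(String server, Callable<T> request) throws Exception {
        AtomicInteger count = inFlight.computeIfAbsent(server, s -> new AtomicInteger());
        if (count.incrementAndGet() > threshold) {
            count.decrementAndGet(); // undo the reservation; the request is never sent
            throw new TooBusyException("too many pending requests to " + server);
        }
        try {
            return request.call();
        } finally {
            count.decrementAndGet(); // release the slot on success or failure
        }
    }

    /** Number of requests currently pending against the given server. */
    int pending(String server) {
        AtomicInteger count = inFlight.get(server);
        return count == null ? 0 : count.get();
    }

    public static void main(String[] args) throws Exception {
        PerServerLimiter limiter = new PerServerLimiter(1);
        limiter.call("rs1", () -> {
            // While one request to rs1 is pending, a second one is rejected immediately...
            try {
                limiter.call("rs1", () -> "never runs");
            } catch (TooBusyException e) {
                System.out.println("rejected: " + e.getMessage());
            }
            // ...but a request to another server still goes through.
            String reply = limiter.call("rs2", () -> "rs2 still served");
            System.out.println(reply);
            return null;
        });
    }
}
```

Because a rejected caller gets its thread back at once, a slow server can hold at most `threshold` handlers and the rest stay free for healthy servers.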

Attachments

HBASE-16388-v3.patch (14/Sep/16 05:39, 15 kB, Phil Yang)
HBASE-16388-v2.patch (16/Aug/16 09:25, 15 kB, Phil Yang)
HBASE-16388-v2.patch (17/Aug/16 04:01, 15 kB, Phil Yang)
HBASE-16388-v2.patch (08/Sep/16 06:24, 15 kB, Phil Yang)
HBASE-16388-v2.patch (08/Sep/16 09:57, 15 kB, Phil Yang)
HBASE-16388-v1.patch (11/Aug/16 09:01, 15 kB, Phil Yang)
HBASE-16388-branch-1-v2.patch (14/Sep/16 06:05, 15 kB, Phil Yang)
HBASE-16388-branch-1-v1.patch (18/Aug/16 10:06, 15 kB, Phil Yang)

Issue Links

relates to: HBASE-22307 Deprecated Preemptive Fail Fast (Closed)
