Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
2.6.0
For backward compatibility, default implementations that throw are provided for the new methods in 2.7.0 and 2.6.1. Custom implementations that want to take advantage of the new functionality should override those methods as appropriate.
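As a rough illustration of the "default, throwing implementation" pattern, here is a minimal Java sketch. The interface, class, and method names (`Interpreter`, `describePartial`, and so on) are hypothetical and are not the actual HBase types or methods touched by this ticket. It only shows how an existing implementation keeps compiling while a newer one opts in by overriding.

```java
public class DefaultThrowingSketch {

    /** Hypothetical extension point; the names are illustrative, not HBase's. */
    public interface Interpreter {
        // Method that existed before the new functionality was added.
        long parse(String raw);

        // Method added later. The throwing default means existing
        // implementations still compile, but fail if the new path is used.
        default String describePartial(long partial) {
            throw new UnsupportedOperationException("partial results not supported");
        }
    }

    /** An older custom implementation that knows nothing about the new method. */
    public static class LegacyInterpreter implements Interpreter {
        @Override
        public long parse(String raw) {
            return Long.parseLong(raw);
        }
    }

    /** A custom implementation that overrides the new method to opt in. */
    public static class UpgradedInterpreter extends LegacyInterpreter {
        @Override
        public String describePartial(long partial) {
            return "partial=" + partial;
        }
    }

    public static void main(String[] args) {
        System.out.println(new UpgradedInterpreter().describePartial(3));
        try {
            new LegacyInterpreter().describePartial(3);
        } catch (UnsupportedOperationException e) {
            System.out.println("legacy: " + e.getMessage());
        }
    }
}
```

The trade-off in this pattern is that an implementation that does not override the new method fails at call time, not at compile time.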
Description
Currently there is a gap in the coverage of HBase's quota-based workload throttling. Requests sent by [Async]AggregationClient reach AggregateImplementation, which then executes Scans in a way that bypasses the quota system. We see issues with this at HubSpot, where clusters suffer under this load and we don't have a good way to protect them.
In this ticket I'm teaching AggregateImplementation to optionally stop scanning when a throttle is violated, and send back just the results it has accumulated so far. In addition, it will send back a row key to AsyncAggregationClient. When the client gets a response with a row key, it will sleep in order to satisfy the throttle, and then send a new request with a scan starting at that row key. This will have the effect of continuing the work where the last request stopped.
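The stop-and-resume flow described above can be sketched in plain Java as follows. This is a simulation only: it does not use any HBase API, the "table" is an in-memory sorted map, the "throttle" is modeled as a per-request row budget, and every name (`PartialResult`, `serverSum`, `clientSum`, `backoffMillis`) is hypothetical. How the real client decides how long to sleep is not covered by this description, so a fixed backoff is assumed here.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PartialAggregationSketch {

    /** Hypothetical response: partial sum plus the row to resume from (null when done). */
    public static final class PartialResult {
        public final long sum;
        public final String nextRowKey;

        public PartialResult(long sum, String nextRowKey) {
            this.sum = sum;
            this.nextRowKey = nextRowKey;
        }
    }

    /**
     * Stand-in for the server side. Sums values starting at startRow and stops
     * once rowBudget rows have been read, which models a throttle violation.
     * rowBudget must be at least 1, otherwise no request makes progress.
     */
    public static PartialResult serverSum(NavigableMap<String, Long> table,
                                          String startRow, int rowBudget) {
        long sum = 0;
        int scanned = 0;
        for (Map.Entry<String, Long> e : table.tailMap(startRow, true).entrySet()) {
            if (scanned == rowBudget) {
                // Budget exhausted: return what was accumulated and where to resume.
                return new PartialResult(sum, e.getKey());
            }
            sum += e.getValue();
            scanned++;
        }
        return new PartialResult(sum, null); // scan ran to completion
    }

    /** Stand-in for the client side: sleep, then resume at the returned row key. */
    public static long clientSum(NavigableMap<String, Long> table,
                                 int rowBudget, long backoffMillis) {
        long total = 0;
        String startRow = ""; // the empty string sorts before every row key
        while (true) {
            PartialResult r = serverSum(table, startRow, rowBudget);
            total += r.sum;
            if (r.nextRowKey == null) {
                return total;
            }
            try {
                Thread.sleep(backoffMillis); // assumed fixed wait to satisfy the throttle
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while backing off", ie);
            }
            startRow = r.nextRowKey;
        }
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> table = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            table.put("row" + i, (long) i);
        }
        // Three requests are needed with a budget of two rows per request.
        System.out.println(clientSum(table, 2, 10L)); // prints 15
    }
}
```

Because each response carries the first unread row key, the client never re-reads a row and the merged total matches what a single unthrottled scan would have produced.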
This feature will be unconditionally enabled by AsyncAggregationClient once this ticket is finished. AggregateImplementation will not assume that clients support partial results, however, so it can keep supporting older clients. For clients that do not support partial results, throttles will not be respected, and results will always be complete.
This feature was first proposed on the mailing list. It builds on work in HBASE-28346.