Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
To call a coprocessor endpoint asynchronously, you start by calling AsyncTable#coprocessorService(), which gives you a CoprocessorServiceBuilder, and a few steps later you can talk to your coprocessor over the network. One argument to AsyncTable#coprocessorService() is a CoprocessorCallback object, which contains several methods that will be called during the lifecycle of a coprocessor endpoint call. AsyncTableImpl's implementation of AsyncTable#coprocessorService() wraps your CoprocessorCallback with its own that delegates the work to a thread pool. A snippet of this:
@Override
public void onRegionComplete(RegionInfo region, R resp) {
  pool.execute(context.wrap(() -> callback.onRegionComplete(region, resp)));
}
...
@Override
public void onComplete() {
  pool.execute(context.wrap(callback::onComplete));
}
The trouble with this is that your implementations of onRegionComplete() and onComplete() can end up being called in an arbitrary order, and possibly concurrently. Each call is handed to the thread pool as a separate task, and nothing waits for one task to finish before the next is submitted, so the pool is free to run them in any order. Troublingly, onComplete() can be called before the final onRegionComplete(), which is a violation of the contract specified in the CoprocessorCallback#onComplete() javadoc.
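One possible way to keep the callbacks ordered while still running them on the pool is to funnel them through a serializing executor. The sketch below is only an illustration and is not necessarily how this issue was fixed in HBase: SerialExecutor and OrderedCallbackDemo are hypothetical names, the strings stand in for the real callback invocations, and the pattern follows the serial executor example in the java.util.concurrent.Executor javadoc.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper (not part of HBase): runs submitted tasks one at a time,
// in submission order, on top of any backing Executor.
final class SerialExecutor implements Executor {
  private final Queue<Runnable> tasks = new ArrayDeque<>();
  private final Executor delegate;
  private Runnable active;

  SerialExecutor(Executor delegate) {
    this.delegate = delegate;
  }

  @Override
  public synchronized void execute(Runnable task) {
    // Wrap the task so that finishing it hands the next queued task to the pool.
    tasks.add(() -> {
      try {
        task.run();
      } finally {
        scheduleNext();
      }
    });
    if (active == null) {
      scheduleNext();
    }
  }

  private synchronized void scheduleNext() {
    active = tasks.poll();
    if (active != null) {
      delegate.execute(active);
    }
  }
}

// Hypothetical demo: records the order in which simulated callbacks run.
final class OrderedCallbackDemo {
  static List<String> runSerialized(int regions) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    Executor serial = new SerialExecutor(pool);
    List<String> events = Collections.synchronizedList(new ArrayList<>());
    CountDownLatch done = new CountDownLatch(regions + 1);
    try {
      for (int i = 0; i < regions; i++) {
        String event = "onRegionComplete-" + i; // stands in for callback.onRegionComplete(...)
        serial.execute(() -> {
          events.add(event);
          done.countDown();
        });
      }
      // Stands in for callback.onComplete(); it is queued behind every region callback.
      serial.execute(() -> {
        events.add("onComplete");
        done.countDown();
      });
      done.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return new ArrayList<>(events);
  }

  public static void main(String[] args) {
    List<String> events = runSerialized(10);
    System.out.println("last event: " + events.get(events.size() - 1));
  }
}
```

Submitting the same tasks directly to the pool, as the wrapper in AsyncTableImpl does, gives no such guarantee: with more than one worker thread, the task for onComplete() may be picked up before an earlier onRegionComplete() task has run.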
I discovered this while working on HBASE-28770. I found that AsyncAggregationClient#rowCount() returns incorrect results 5-10% of the time, and this bug is the reason. I presume the other AsyncAggregationClient methods are similarly affected.