Currently, there is no way to force an operation on the HBase client (viz. HTable) to time out after a certain amount of time has elapsed. In other words, every invocation on the HTable class is a blocking call that does not return until a response (successful or otherwise) is received.
In general, there are two ways to handle timeouts: (a) run the operation in a separate thread and wait on that thread until it returns a response or the wait times out, and (b) have the underlying socket unblock the operation if the read times out. The downside of the former approach is that it consumes more resources, since every call needs an extra thread and a callable.
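For comparison, approach (a) can be sketched with the standard java.util.concurrent classes. This is only an illustration of the thread-based alternative that the patch does not use; the class and method names (ThreadTimeoutSketch, callWithTimeout) are made up for this example and are not part of HBase.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper illustrating approach (a); not HBase code.
public class ThreadTimeoutSketch {

    /** Runs op on a worker thread and waits at most timeoutMs for its result. */
    public static <T> T callWithTimeout(Callable<T> op, long timeoutMs) throws Exception {
        // One extra thread per call: this is the resource cost mentioned above.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(op);
        try {
            // Blocks until the operation finishes or the wait times out.
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Ask the worker to stop; the operation itself may still be running.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that even after the caller gives up, the worker thread may stay blocked on the remote call until the socket itself returns, which is another reason to prefer approach (b).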
Here, we describe a way to specify and handle timeouts on the HTable client, which relies on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits before pinging the server in case of a failure. The goal is to allow clients to set that timeout on the fly through HTable. Rather than adding an optional timeout argument to every HTable operation, we chose to make it a property of HTable which effectively applies to every method that involves a remote operation.
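The socket-level mechanism this relies on can be shown with plain java.net classes. The sketch below is not HBase code: it opens a loopback connection to a server that never replies and sets a read timeout with Socket.setSoTimeout, so the blocking read is interrupted by a SocketTimeoutException. The class name SocketTimeoutSketch and the method readTimesOut are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo of a socket read timeout; not HBase code.
public class SocketTimeoutSketch {

    /**
     * Connects to a local server that never sends data and reports whether
     * the blocking read was cut short by the socket timeout.
     * Pass a positive timeoutMs: with 0 the read would block indefinitely.
     */
    public static boolean readTimesOut(int timeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            // Reads on this socket now block for at most timeoutMs milliseconds.
            client.setSoTimeout(timeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read(); // the server never writes, so this can only time out
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }
}
```

No extra thread is needed here: the calling thread is unblocked by the socket itself, which is why this approach is cheaper than wrapping each call in a separate thread.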
In order to propagate the timeout from HTable to HBaseClient, we replaced all occurrences of ServerCallable in HTable with an extension called ClientCallable. Once the region server interface has been instantiated, ClientCallable sets the timeout on it through the HConnection object. The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that it may inject the timeout at the time the invocation is made on the region server proxy. Right before the request is sent to the server, we set the timeout specified by the client on the underlying socket.
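The chain described above can be pictured with a toy model. Every class and method in the sketch below (TableSketch, ClientCallableSketch, ConnectionSketch, InvokerSketch, setOperationTimeout, setRpcTimeout) is a simplified stand-in invented for illustration; none of them is the actual HBase class or signature, and the real patch may differ in its details.

```java
import java.util.concurrent.Callable;

// Toy model of the propagation path; all names are hypothetical stand-ins.
public class TimeoutPropagationSketch {

    /** Stand-in for the RPC Invoker: remembers the timeout to put on the socket. */
    public static class InvokerSketch {
        private int rpcTimeoutMs; // 0 means wait indefinitely

        public void setRpcTimeout(int ms) { this.rpcTimeoutMs = ms; }

        /** Returns the timeout that would be set on the socket before sending. */
        public int invoke() { return rpcTimeoutMs; }
    }

    /** Stand-in for the connection, which forwards the timeout to the invoker. */
    public static class ConnectionSketch {
        private final InvokerSketch invoker = new InvokerSketch();

        public void setRpcTimeout(int ms) { invoker.setRpcTimeout(ms); }

        public InvokerSketch getInvoker() { return invoker; }
    }

    /** Stand-in for ClientCallable: applies the table's timeout, then calls. */
    public static class ClientCallableSketch implements Callable<Integer> {
        private final ConnectionSketch connection;
        private final int timeoutMs;

        public ClientCallableSketch(ConnectionSketch connection, int timeoutMs) {
            this.connection = connection;
            this.timeoutMs = timeoutMs;
        }

        @Override
        public Integer call() {
            connection.setRpcTimeout(timeoutMs);
            return connection.getInvoker().invoke();
        }
    }

    /** Stand-in for HTable: the timeout is a property of the table, not of each call. */
    public static class TableSketch {
        private final ConnectionSketch connection = new ConnectionSketch();
        private int operationTimeoutMs; // defaults to 0, i.e. no timeout

        public void setOperationTimeout(int ms) { this.operationTimeoutMs = ms; }

        /** Simulates a remote operation and returns the timeout that reached the socket layer. */
        public int get() {
            return new ClientCallableSketch(connection, operationTimeoutMs).call();
        }
    }
}
```

In this model a caller sets the timeout once on the table and every subsequent remote operation picks it up, which mirrors the choice of making the timeout a property of HTable rather than an argument of each method.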
In conclusion, this patch will afford clients the option of performing an HBase operation until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted as an infinite timeout.