Details
Improvement
Status: Closed
Major
Resolution: Won't Fix
0.94.9
None
None
Description
HBase currently determines which server to go to, then creates a delayed callable bound to that pre-determined server and sends the request there once the delay expires. For the later retries (16, 32, ... seconds) this approach is suboptimal: the cluster could have changed massively in the meantime, so the retry might be completely useless.
We should re-locate regions after the delay, at least for the longer retries. Given how grouping is currently done, this would require a bit of refactoring.
The effect is alleviated (to a degree) on trunk by server-based retries: if we fail going to the pre-delay server after the delay and then determine that the server has changed, we go to the new server immediately, so we only lose the failed round-trip time. On 0.94, as far as I can see, if the region is opened on some other server during the delay, we would go to the old server, fail, find out the region is on a different server, wait a good deal longer because it is a late-stage retry, and only THEN go to the new one.
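To make the difference concrete, here is a minimal sketch of the two strategies. It is not HBase client code: `RetryRelocationSketch`, `regionLocations`, `eagerRetry`, and `lazyRetry` are hypothetical names invented for illustration, and a plain map stands in for the region location lookup. The only point it shows is when the target server gets resolved: at scheduling time (the behavior described above) or when the delayed retry actually runs (the proposal).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names are real HBase APIs. */
class RetryRelocationSketch {
    // Stand-in for the region location lookup: region name -> server name.
    static final Map<String, String> regionLocations = new ConcurrentHashMap<>();

    /** Described behavior: the server is resolved now, before the delay starts. */
    static Supplier<String> eagerRetry(String region) {
        final String server = regionLocations.get(region); // fixed at scheduling time
        return () -> server;
    }

    /** Proposed behavior: the server is resolved when the retry actually runs. */
    static Supplier<String> lazyRetry(String region) {
        return () -> regionLocations.get(region); // looked up after the delay
    }

    public static void main(String[] args) {
        regionLocations.put("region-1", "server-A");
        Supplier<String> eager = eagerRetry("region-1");
        Supplier<String> lazy = lazyRetry("region-1");

        // The region is reopened elsewhere while the retry is waiting out its delay.
        regionLocations.put("region-1", "server-B");

        System.out.println("eager retry targets " + eager.get()); // server-A (stale)
        System.out.println("lazy retry targets " + lazy.get());   // server-B (current)
    }
}
```

In the sketch the eager retry still targets the old server after the move, which corresponds to the wasted round trip (and, on 0.94, the additional late-stage wait) described above. The sketch deliberately ignores request grouping by server, which is where the refactoring cost mentioned above would come from.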