Per the investigation in http://getkudu.io/2016/04/26/ycsb.html, the 500ms backoff that the Java client performs seems far too aggressive. Some back-of-the-envelope math shows why:
- a typical insert batch is probably on the order of 10ms (assuming a large-ish batch and a remote server... localhost as seen in the blog post is much faster)
- imagine we have a soft threshold of 60GB and hard threshold of 100GB
- if we are 1GB over the soft threshold, we are 1/40 = 2.5% of the way from the soft threshold to the hard threshold. So, 2.5% of write requests will be rejected with 'TOO_BUSY'. Put another way, on average one out of every 40 write requests will be rejected.
- With the 10ms per-request time above, this means that we'll experience a rejection approximately once every 400ms.
If each rejection causes us to block for 500ms, then we're spending more time sleeping (500ms) than sending requests (400ms), operating at <50% of our peak throughput even though we are only 2.5% of the way past the soft threshold. Once we are 10GB over the soft threshold (25% of the way to the hard threshold), we'll on average get 4 writes in (40ms) before we get rejected (500ms), and we're now spending roughly 93% of our time sleeping (500/540).
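The arithmetic above can be reproduced with a small sketch. It is only a model of the estimate, not of the actual client or server code. It assumes the rejection probability rises linearly from 0 at the soft threshold to 1 at the hard threshold, that each request takes a fixed 10ms, and that every rejection costs one fixed 500ms sleep. The function name and parameters are made up for illustration.

```python
def sleep_fraction(over_soft_gb, soft_gb=60.0, hard_gb=100.0,
                   request_ms=10.0, backoff_ms=500.0):
    """Estimate the fraction of wall-clock time a writer spends in backoff.

    Assumption: rejection probability grows linearly from 0 at the soft
    threshold to 1 at the hard threshold.
    """
    reject_prob = over_soft_gb / (hard_gb - soft_gb)
    # On average one rejection per (1 / reject_prob) requests.
    requests_per_rejection = 1.0 / reject_prob
    busy_ms = requests_per_rejection * request_ms
    return backoff_ms / (busy_ms + backoff_ms)


for over_gb in (1.0, 10.0):
    print(f"{over_gb:>4.0f}GB over soft threshold: "
          f"{sleep_fraction(over_gb):.1%} of time sleeping")
```

With the numbers used here, this gives about 56% of time sleeping at 1GB over the soft threshold and about 93% at 10GB over. Passing a smaller `backoff_ms` shows how quickly the sleeping fraction drops with a shorter backoff.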
The intent of the probabilistic rejection was always to make the insert rate "smooth" as memory fills up, but it seems the current implementation with the Java client is nearly as bad as the original "brick wall" we were trying to avoid.