Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 0.95.2
- None
- Reviewed
Description
Today, the batch algorithm is:
for Operation o: List<Op> {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
We could:
- create the final object immediately instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
for Operation o: List<Op> {
  get location
  add o to todo location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
}
send remaining
wait
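
The per-location loop above could be sketched in Java roughly as follows. This is only an illustration, not the HBase client code: the class name `PerLocationBatcher`, the `locator` function (operation to location name) and the `sender` callback (standing in for an asynchronous send to a region server) are all hypothetical, and the final wait for responses is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Hypothetical sketch: buffers operations per location and flushes each buffer on its own. */
final class PerLocationBatcher<Op> {
    private final int maxLocationSize;
    private final Function<Op, String> locator;        // assumption: maps an operation to its location
    private final BiConsumer<String, List<Op>> sender; // assumption: non-blocking send to one location
    private final Map<String, List<Op>> todo = new HashMap<>();

    PerLocationBatcher(int maxLocationSize,
                       Function<Op, String> locator,
                       BiConsumer<String, List<Op>> sender) {
        this.maxLocationSize = maxLocationSize;
        this.locator = locator;
        this.sender = sender;
    }

    void submit(List<Op> ops) {
        for (Op o : ops) {
            // get location, add o to that location's todolist
            String location = locator.apply(o);
            List<Op> list = todo.computeIfAbsent(location, k -> new ArrayList<>());
            list.add(o);
            if (list.size() > maxLocationSize) {
                // Hand the full list to the sender and drop it from the map,
                // so the next operation for this location starts a fresh list.
                todo.remove(location);
                sender.accept(location, list);
                // don't wait, continue the loop
            }
        }
        // send remaining
        for (Map.Entry<String, List<Op>> e : todo.entrySet()) {
            sender.accept(e.getKey(), e.getValue());
        }
        todo.clear();
        // A real client would now wait for all outstanding responses.
    }
}
```

Removing the list from the map on flush, rather than clearing it in place, means the sender owns the list it receives and the loop never mutates a list that is still being sent.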
It's not trivial to write once you add error management: the list of retried operations must be shared with the operations added to the todolist. But it's doable.
It's interesting mainly for 'big' writes
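
One way to picture the error-management point is to let failed operations re-enter the same per-location buffers that new operations use. The sketch below assumes a hypothetical `SharedRetryQueue` class, string-typed operations and locations, and a simple cap on failures per operation; none of these names come from the HBase code base.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: new and retried operations share one set of per-location buffers. */
final class SharedRetryQueue {
    private final int maxFailures;                       // assumption: give up after this many failures
    private final Map<String, List<String>> todo = new HashMap<>();
    private final Map<String, Integer> failures = new HashMap<>();
    private final List<String> failed = new ArrayList<>();

    SharedRetryQueue(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    /** Adds an operation (new or retried) to the buffer of its location. */
    synchronized void add(String location, String op) {
        todo.computeIfAbsent(location, k -> new ArrayList<>()).add(op);
    }

    /** Called when a send fails; the caller passes the location to retry against. */
    synchronized void onFailure(String retryLocation, List<String> ops) {
        for (String op : ops) {
            int n = failures.merge(op, 1, Integer::sum);
            if (n >= maxFailures) {
                failed.add(op);          // too many failures: report instead of retrying
            } else {
                add(retryLocation, op);  // retried op joins the shared buffer
            }
        }
    }

    /** Removes and returns everything buffered for a location, ready to send. */
    synchronized List<String> drain(String location) {
        List<String> list = todo.remove(location);
        return list == null ? new ArrayList<>() : list;
    }

    synchronized List<String> failed() {
        return new ArrayList<>(failed);
    }
}
```

The methods are synchronized because, with asynchronous sends, failure callbacks would typically arrive on a different thread from the one running the submit loop.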
Attachments
Issue Links
- is blocked by
  - HBASE-8380 NPE in HBaseClient$Connection.readResponse (Closed)
- is depended upon by
  - HBASE-8338 Latency Resilience; umbrella list of issues that will help us ride over bad disk, bad region, ec2, etc. (Closed)
- relates to
  - HBASE-5843 Improve HBase MTTR - Mean Time To Recover (Closed)