Over on hbase-dev@, Ryan writes:
we need to make hlog flush faster, it currently does only 700 ops/sec
when we flush every entry.
it'd be nice if we could do something clever, such as:
- use multiple logs
- detect multiple waiting clients and better batch their commits
- group commits for bulk import
This issue addresses the first point: using multiple logs.
While we are at it, size the pool of log writers dynamically according to some measure of concurrency. Spin up new writers on demand, up to a configurable upper bound. A simple strategy to try first might be 2 * ceil(log(load)), smoothed. Terminate excess writers at roll time to hold down unnecessary HDFS resource use.
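A rough sketch of the sizing rule, to make the idea concrete. Everything here is an assumption for illustration: the class name WriterPoolSizer is hypothetical and not existing HBase code, "load" is taken to be a count of concurrently waiting appenders, the log is taken as base 2, and "smoothed" is interpreted as an exponentially weighted moving average over load samples.

```java
/**
 * Hypothetical helper, not existing HBase code: computes a target number of
 * log writers as 2 * ceil(log2(load)), clamped to [1, maxWriters].
 * Base-2 log and EWMA smoothing are assumptions; the issue fixes neither.
 */
final class WriterPoolSizer {
    private final int maxWriters;   // configurable upper bound on the pool
    private final double alpha;     // EWMA weight of the newest sample, in (0, 1]
    private double smoothedLoad = 0.0;

    WriterPoolSizer(int maxWriters, double alpha) {
        if (maxWriters < 1) {
            throw new IllegalArgumentException("maxWriters must be >= 1");
        }
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.maxWriters = maxWriters;
        this.alpha = alpha;
    }

    /** Feed one load sample and return the target pool size. */
    int update(int load) {
        smoothedLoad = alpha * load + (1.0 - alpha) * smoothedLoad;
        return targetFor((int) Math.ceil(smoothedLoad), maxWriters);
    }

    /** 2 * ceil(log2(load)), never below 1 and never above maxWriters. */
    static int targetFor(int load, int maxWriters) {
        int l = Math.max(1, load);
        // Integer form of ceil(log2(l)); avoids floating-point rounding.
        int ceilLog2 = 32 - Integer.numberOfLeadingZeros(l - 1);
        return Math.max(1, Math.min(2 * ceilLog2, maxWriters));
    }
}
```

The result is only a target. In this sketch the caller would spin up writers when the target exceeds the current pool size, and would retire the excess only at roll time, as proposed above.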
In HLog.doWrite we write each HLogKey and KeyValue to the log, which is a SequenceFile. Should we use HFile instead? Can HFile do I/O batching? If not, I think that to group commits we would need to introduce a new Writable that bundles edits together.
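For discussion, here is a minimal sketch of what such a bundling type could look like. It is hypothetical: EditBundle is not an existing HBase class, keys and values are plain byte arrays standing in for HLogKey and KeyValue, and the class only mirrors the write(DataOutput) / readFields(DataInput) shape of Hadoop's Writable so that the sketch compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bundle of edits serialized as one record (not HBase code). */
final class EditBundle {
    /** One edit; byte arrays stand in for HLogKey and KeyValue. */
    static final class Edit {
        final byte[] key;
        final byte[] value;
        Edit(byte[] key, byte[] value) { this.key = key; this.value = value; }
    }

    private final List<Edit> edits = new ArrayList<>();

    void add(byte[] key, byte[] value) { edits.add(new Edit(key, value)); }
    int size() { return edits.size(); }
    Edit get(int i) { return edits.get(i); }

    /** Same shape as Writable.write: edit count, then length-prefixed pairs. */
    public void write(DataOutput out) throws IOException {
        out.writeInt(edits.size());
        for (Edit e : edits) {
            out.writeInt(e.key.length);
            out.write(e.key);
            out.writeInt(e.value.length);
            out.write(e.value);
        }
    }

    /** Same shape as Writable.readFields: replaces the current contents. */
    public void readFields(DataInput in) throws IOException {
        edits.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            byte[] k = new byte[in.readInt()];
            in.readFully(k);
            byte[] v = new byte[in.readInt()];
            in.readFully(v);
            edits.add(new Edit(k, v));
        }
    }

    /** Convenience for the sketch: serialize the whole bundle to bytes. */
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience for the sketch: rebuild a bundle from toBytes() output. */
    static EditBundle fromBytes(byte[] data) {
        try {
            EditBundle b = new EditBundle();
            b.readFields(new DataInputStream(new ByteArrayInputStream(data)));
            return b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the design is that the log would see one append per bundle instead of one per edit, so a single flush could cover every edit in the bundle. How edits from different clients get collected into a bundle is left open here.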
Moving into 0.21.