There are several reasons why 65KB is a better choice for the default buffer size:
1. Almost all data blocks have a size of 64KB + delta, where delta is very small and depends on the size of the last KeyValue. If we use the default hbase.ipc.server.allocator.buffer.size=64KB, then each block will be allocated as a MultiByteBuff: one 64KB DirectByteBuffer plus a HeapByteBuffer of delta bytes, and that HeapByteBuffer increases GC pressure. Ideally, each data block should be allocated as a SingleByteBuff, which has a simpler data structure, faster access, and lower heap usage.
2. In my benchmark, I found some checksum stack traces (see checksum-stacktrace.png). Since the blocks are MultiByteBuffs, we have to calculate the checksum through a temporary heap copy (see HBASE-21917), while with a SingleByteBuff we can speed up the checksum by calling Hadoop's native checksum library, which is much faster.
3. The BucketCacheWriter threads seem to stay busy because of the higher cost of copying from a MultiByteBuff to a DirectByteBuffer. For a SingleByteBuff we can use unsafe bulk array copying, while for a MultiByteBuff we have to copy byte by byte.
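The arithmetic in point 1 can be sketched as follows. This is a simplified toy model of a fixed-size buffer pool, not the actual ByteBuffAllocator code; the method name chooseBuffers and its parameters are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class AllocSketch {
    // Toy model: pooled (direct) buffers come in fixed-size chunks.
    // A block that fits in a single chunk can become a SingleByteBuff;
    // a block even one byte larger needs a second buffer, i.e. a MultiByteBuff.
    static List<Integer> chooseBuffers(int blockSize, int poolBufferSize) {
        List<Integer> pieces = new ArrayList<>();
        int remaining = blockSize;
        while (remaining > 0) {
            int piece = Math.min(remaining, poolBufferSize);
            pieces.add(piece);
            remaining -= piece;
        }
        return pieces;
    }

    public static void main(String[] args) {
        int delta = 100;                 // size of the last KeyValue (small)
        int block = 64 * 1024 + delta;   // typical data block: 64KB + delta

        // With a 64KB pool buffer the block needs two pieces -> MultiByteBuff.
        System.out.println(chooseBuffers(block, 64 * 1024).size()); // prints 2

        // With a 65KB pool buffer the same block fits in one piece -> SingleByteBuff.
        System.out.println(chooseBuffers(block, 65 * 1024).size()); // prints 1
    }
}
```

So a 65KB pool buffer leaves headroom for the delta, and the common-case block stays in one pooled direct buffer.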
Anyway, I will provide a benchmark for this.
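As a rough illustration of the copy-cost difference in point 3: with a single backing buffer the whole copy is one bulk put (which the JVM implements as an Unsafe/memcpy-style copy), while content spanning two underlying buffers must be walked segment by segment. This is a toy sketch, not the actual SingleByteBuff/MultiByteBuff code:

```java
import java.nio.ByteBuffer;

public class CopyCostSketch {
    // Single backing buffer: one bulk put copies everything at once.
    static ByteBuffer copySingle(ByteBuffer src, int len) {
        ByteBuffer dst = ByteBuffer.allocateDirect(len);
        dst.put(src.duplicate());  // bulk copy, no per-byte loop
        dst.flip();
        return dst;
    }

    // Content split across segments: the copy must walk each segment and
    // move bytes one by one, which is where the MultiByteBuff overhead lives.
    static ByteBuffer copySegments(ByteBuffer[] segs, int len) {
        ByteBuffer dst = ByteBuffer.allocateDirect(len);
        for (ByteBuffer seg : segs) {
            for (int i = seg.position(); i < seg.limit(); i++) {
                dst.put(seg.get(i));  // absolute get, relative put, per byte
            }
        }
        dst.flip();
        return dst;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};

        ByteBuffer single = copySingle(ByteBuffer.wrap(data), data.length);

        // Same bytes split into two segments of 5 and 3 bytes.
        ByteBuffer[] segs = {
            ByteBuffer.wrap(data, 0, 5),
            ByteBuffer.wrap(data, 5, 3)
        };
        ByteBuffer multi = copySegments(segs, data.length);

        // Both paths produce identical contents; only the cost differs.
        System.out.println(single.equals(multi));
    }
}
```

Both paths yield the same bytes; the point is that the segmented path cannot use a single bulk copy.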