In HFileBlock#readBlockDataInternal, we have the following:
In the read path, we still read the block from the HFile into an on-heap byte[], then asynchronously copy that byte[] into the off-heap bucket cache. In my 100% Get performance test, I also observed frequent young GCs. The largest memory footprint in the young gen should be these on-heap block byte[] arrays.
In fact, we can read an HFile block directly into a ByteBuffer instead of a byte[] to reduce young GC. We did not implement this before because the older HDFS client had no ByteBuffer read interface, but HDFS 2.7+ supports it, so I think we can fix this now.
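To illustrate the idea only, here is a minimal sketch of reading a block straight into a direct (off-heap) ByteBuffer. It is not the HBase or HDFS code path: it uses java.nio's FileChannel on a local file as a stand-in for the HDFS stream, and the class name DirectBlockRead and the helper readBlock are hypothetical names made up for this example. On HDFS, the analogous call would be the ByteBuffer read on FSDataInputStream (the ByteBufferReadable interface).

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectBlockRead {

    // Hypothetical helper: reads `length` bytes starting at `offset` into a
    // newly allocated direct buffer, so the caller never holds a byte[]
    // for the block data.
    static ByteBuffer readBlock(Path file, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(length);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long pos = offset;
            // A single read may return fewer bytes than requested, so loop
            // until the buffer is full.
            while (buf.hasRemaining()) {
                int n = ch.read(buf, pos);
                if (n < 0) {
                    throw new EOFException("file ended before the block was complete");
                }
                pos += n;
            }
        }
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    public static void main(String[] args) throws IOException {
        // Local temp file standing in for an HFile; the contents are arbitrary.
        Path tmp = Files.createTempFile("block", ".bin");
        try {
            Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7});
            ByteBuffer block = readBlock(tmp, 2, 4);
            System.out.println("direct=" + block.isDirect()
                + " remaining=" + block.remaining()
                + " first=" + block.get(0));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In a real implementation the buffer would presumably come from a reusable pool rather than a fresh allocateDirect per block, since allocating direct buffers is relatively expensive. That pooling is an assumption here, not something described above.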
Will provide a patch and some perf comparisons for this.