Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
In HFileBlock#readBlockDataInternal, we have the following:
    @VisibleForTesting
    protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
        long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics)
        throws IOException {
      // .....
      // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with BBPool (offheap).
      byte [] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
      int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
          onDiskSizeWithHeader - preReadHeaderSize, true, offset + preReadHeaderSize, pread);
      if (headerBuf != null) {
        // ...
      }
      // ...
    }
In the read path, we still read the block from the HFile into an on-heap byte[] and then copy that byte[] to the off-heap bucket cache asynchronously. In my 100% get performance test, I also observed frequent young GCs. The largest memory footprint in the young gen should be the on-heap block byte[].
In fact, we can read an HFile block directly into a ByteBuffer instead of a byte[] to reduce young GC pressure. We did not implement this before because the older HDFS client had no ByteBuffer read interface, but 2.7+ supports it, so I think we can fix this now.
Will provide a patch and some perf comparisons for this.
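To illustrate the difference between the two read strategies described above, here is a minimal sketch. It is not the HBase patch: it uses a plain java.nio FileChannel as a stand-in for the HDFS input stream, and the class and method names (DirectBlockRead, readViaHeapArray, readDirect, readFully) are hypothetical. The first method mirrors the current pattern (on-heap byte[] followed by a copy to off-heap memory). The second reads straight into a direct ByteBuffer, so no block-sized byte[] is allocated in the young generation.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; FileChannel stands in for the HDFS stream.
final class DirectBlockRead {

  /** Current pattern: read into an on-heap byte[], then copy it off-heap. */
  static ByteBuffer readViaHeapArray(FileChannel ch, long offset, int size) throws IOException {
    byte[] onHeap = new byte[size];                 // block-sized, short-lived heap garbage
    readFully(ch, ByteBuffer.wrap(onHeap), offset);
    ByteBuffer offHeap = ByteBuffer.allocateDirect(size);
    offHeap.put(onHeap);                            // extra copy from heap to off-heap
    offHeap.flip();
    return offHeap;
  }

  /** Proposed pattern: read the block straight into an off-heap ByteBuffer. */
  static ByteBuffer readDirect(FileChannel ch, long offset, int size) throws IOException {
    ByteBuffer offHeap = ByteBuffer.allocateDirect(size);
    readFully(ch, offHeap, offset);                 // no intermediate byte[]
    offHeap.flip();
    return offHeap;
  }

  /** Positional read loop that fills dst completely or fails with EOFException. */
  static void readFully(FileChannel ch, ByteBuffer dst, long offset) throws IOException {
    long pos = offset;
    while (dst.hasRemaining()) {
      int n = ch.read(dst, pos);                    // positional read, like a pread
      if (n < 0) {
        throw new EOFException("Reached end of file at position " + pos);
      }
      pos += n;
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("block", ".bin");
    try {
      byte[] data = new byte[1024];
      for (int i = 0; i < data.length; i++) {
        data[i] = (byte) i;
      }
      Files.write(file, data);
      try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
        ByteBuffer viaHeap = readViaHeapArray(ch, 128, 256);
        ByteBuffer direct = readDirect(ch, 128, 256);
        System.out.println("same content: " + viaHeap.equals(direct));
      }
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

In a real deployment the direct buffer would come from a pool rather than a fresh allocateDirect call per block, since allocating and freeing direct memory is itself expensive. The sketch allocates per call only to stay short.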
Attachments
Issue Links
- is related to
  - HDFS-14535 The default 8KB buffer in requestFileDescriptors#BufferedOutputStream is causing lots of heap allocation in HBase when using short-circut read (Resolved)
  - HDFS-14541 When evictableMmapped or evictable size is zero, do not throw NoSuchElementException (Resolved)
  - HBASE-22582 The Compaction writer may access the lastCell whose memory has been released when appending fileInfo in the final (Closed)
  - HBASE-22309 Replace Shipper Interface with Netty's ReferenceCounted; add ExtendCell#retain/ExtendCell#release (Open)
  - HDFS-14483 Backport HDFS-14111,HDFS-3246 ByteBuffer pread interface to branch-2.9 (Resolved)
  - HDFS-14585 Backport HDFS-8901 Use ByteBuffer in DFSInputStream#read to branch2.9 (Resolved)
- relates to
  - HBASE-21946 Use ByteBuffer pread instead of byte[] pread in HFileBlock when applicable (Resolved)
  - HDFS-3246 pRead equivalent for direct read path (Resolved)
  - HDFS-2834 ByteBuffer-based read API for DFSInputStream (Closed)
  - HBASE-20188 Evaluate and address performance delta between branch-1 and branch-2 (Resolved)
- links to