HBASE-21879: Read HFile's block to ByteBuffer directly instead of to byte[] for reducing young gc purpose


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0, 2.3.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Before this issue, we had made the read path 100% off-heap when the block hit the BucketCache, but on a cache miss the RegionServer still had to read the block through an on-heap API, which caused high young-GC pressure. With this issue, the block is read off-heap even when it is read directly from the filesystem. This requires Hadoop >= 2.9.3, but it also works with older Hadoop versions (everything still works fine; the block is just read on-heap, as the probe sketch after this list illustrates). We have written a detailed doc about the implementation, performance and practice here: https://docs.google.com/document/d/1xSy9axGxafoH-Qc17zbD2Bd--rWjjI00xTWQZ8ZwI_E/edit#heading=h.nch5d72p27ex; please read it for more details.
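
      The on-heap fallback mentioned above depends on whether the underlying stream can read into a ByteBuffer at all. Below is a minimal probe sketch (the helper name is hypothetical, not the committed patch): FSDataInputStream#read(ByteBuffer) throws UnsupportedOperationException unless the wrapped stream implements ByteBufferReadable, so inspecting the wrapped stream tells us whether the off-heap read path is usable.

      import java.io.InputStream;
      import org.apache.hadoop.fs.ByteBufferReadable;
      import org.apache.hadoop.fs.FSDataInputStream;

      /**
       * Hypothetical helper: true means the stream supports ByteBuffer reads and
       * the off-heap read path can be used; false means fall back to on-heap byte[].
       */
      static boolean supportsByteBufferRead(FSDataInputStream in) {
        InputStream wrapped = in.getWrappedStream();
        return wrapped instanceof ByteBufferReadable;
      }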

      Description

      In HFileBlock#readBlockDataInternal, we have the following:

      @VisibleForTesting
      protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
          long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics)
          throws IOException {
        // ...
        // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with BBPool (offheap).
        byte[] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
        int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
            onDiskSizeWithHeader - preReadHeaderSize, true, offset + preReadHeaderSize, pread);
        if (headerBuf != null) {
          // ...
        }
        // ...
      }
      

      In the read path, we still read the block from the HFile into an on-heap byte[], and then copy that byte[] into the off-heap bucket cache asynchronously. In my 100%-get performance test, I also observed frequent young GCs; the largest memory footprint in the young gen should be those on-heap block byte[] arrays.

      In fact, we can read an HFile's block into a ByteBuffer directly instead of into a byte[], to reduce young GC pressure. We did not implement this before because the older HDFS client had no ByteBuffer read interface, but 2.7+ supports it now, so I think we can fix this.
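
      A minimal sketch of that idea, using a seek-then-read loop over FSDataInputStream#read(ByteBuffer) (the ByteBufferReadable interface) with a fallback to the old on-heap copy; the method name and error handling are illustrative, not the committed patch:

      import java.io.IOException;
      import java.nio.ByteBuffer;
      import org.apache.hadoop.fs.FSDataInputStream;

      /**
       * Illustrative only: read 'len' bytes at 'offset' into 'buf', which may be
       * a direct (off-heap) ByteBuffer, so the happy path allocates no on-heap byte[].
       */
      static void readBlockInto(FSDataInputStream is, long offset, ByteBuffer buf, int len)
          throws IOException {
        buf.limit(buf.position() + len);
        is.seek(offset);
        try {
          // DFSInputStream supports ByteBufferReadable; other wrapped streams may not.
          while (buf.hasRemaining()) {
            if (is.read(buf) < 0) {
              throw new IOException("Premature EOF at offset " + is.getPos());
            }
          }
        } catch (UnsupportedOperationException e) {
          // Older Hadoop / non-HDFS stream: keep the existing on-heap path.
          byte[] onHeap = new byte[buf.remaining()];
          is.readFully(onHeap, 0, onHeap.length);
          buf.put(onHeap);
        }
      }

      A positional (pread) variant needs a positional ByteBuffer read API from a newer Hadoop client, which is presumably where the release note's Hadoop version requirement comes from.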

      Will provide a patch and some perf comparisons for this.

        Attachments

        1. gc-data-before-HBASE-21879.png (385 kB, Zheng Hu)
        2. HBASE-21879.v1.patch (76 kB, Zheng Hu)
        3. HBASE-21879.v1.patch (76 kB, Zheng Hu)
        4. QPS-latencies-before-HBASE-21879.png (279 kB, Zheng Hu)


              People

              • Assignee: openinx Zheng Hu
              • Reporter: openinx Zheng Hu
              • Votes: 0
              • Watchers: 21
