• Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1, 2.3.0
    • Component/s: BucketCache
    • Labels: None
    • Hadoop Flags: Reviewed


      There are a few reasons why 65KB is a better choice for the default buffer size:
      1. Almost all data blocks have a size of 64KB + delta, where delta is very small and depends on the size of the last KeyValue. With the default hbase.ipc.server.allocator.buffer.size=64KB, each such block is allocated as a MultiByteBuff: one 64KB DirectByteBuffer plus a delta-byte HeapByteBuffer, and that HeapByteBuffer increases GC pressure. Ideally a data block should be allocated as a SingleByteBuff: it has a simpler data structure, faster access, and a smaller heap footprint.
      2. In my benchmark I found some checksum-related stack traces (see checksum-stacktrace.png). Because the blocks are MultiByteBuffs, we have to compute the checksum through a temporary heap copy (see HBASE-21917), whereas with a SingleByteBuff we can speed up the checksum by calling Hadoop's native checksum implementation, which is much faster.
      3. The BucketCacheWriters appear to be constantly busy because of the higher cost of copying from a MultiByteBuff to a DirectByteBuffer. For a SingleByteBuff we can use a single unsafe array copy, while for a MultiByteBuff we have to copy byte by byte.
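The arithmetic behind point 1 can be sketched as follows. This is a minimal illustration, not HBase's actual ByteBuffAllocator; the delta value is hypothetical:

```java
// Sketch: why a 65KB allocator buffer lets a "64KB + delta" data block
// fit in a single buffer (SingleByteBuff) instead of spilling into a
// second one (MultiByteBuff). Not HBase code; names are illustrative.
public class BufferSizingSketch {
    static final int KB = 1024;

    // Number of fixed-size buffers needed to hold a block of blockSize bytes.
    static int buffersNeeded(int blockSize, int bufferSize) {
        return (blockSize + bufferSize - 1) / bufferSize; // ceiling division
    }

    public static void main(String[] args) {
        int delta = 200; // hypothetical small overhead from the last KeyValue
        int blockSize = 64 * KB + delta;

        // With a 64KB buffer the block spills into a second buffer,
        // i.e. it becomes a MultiByteBuff (64KB direct + delta-byte heap).
        System.out.println(buffersNeeded(blockSize, 64 * KB)); // 2

        // With a 65KB buffer the whole block fits in one DirectByteBuffer,
        // i.e. a SingleByteBuff.
        System.out.println(buffersNeeded(blockSize, 65 * KB)); // 1
    }
}
```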

      Anyway, I will give a benchmark for this.
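The copy-cost difference in point 3 can be sketched like this. It is illustrative only, using plain java.nio buffers rather than HBase's MultiByteBuff/SingleByteBuff API:

```java
import java.nio.ByteBuffer;

// Sketch: copying into one backing buffer vs. several. With a single
// destination buffer, one bulk put suffices (the JVM can intrinsify it
// into a fast memory copy); with segmented buffers the copy must walk
// segment boundaries chunk by chunk.
public class CopySketch {
    // Single backing buffer: one bulk put.
    static void bulkCopy(byte[] src, ByteBuffer dst) {
        dst.put(src, 0, src.length);
    }

    // Multiple backing buffers: split the copy across segments.
    static void segmentedCopy(byte[] src, ByteBuffer[] segments) {
        int off = 0;
        for (ByteBuffer seg : segments) {
            int n = Math.min(seg.remaining(), src.length - off);
            seg.put(src, off, n);
            off += n;
            if (off == src.length) break;
        }
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5};
        bulkCopy(src, ByteBuffer.allocate(5));
        segmentedCopy(src, new ByteBuffer[] {
            ByteBuffer.allocate(3), ByteBuffer.allocate(3)});
    }
}
```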


        Attachments:
        1. with-buffer-size-65KB.png (329 kB, Zheng Hu)
        2. with-buffer-size-64KB.png (374 kB, Zheng Hu)
        3. checksum-stacktrace.png (167 kB, Zheng Hu)
        4. BucketCacheWriter-is-busy.png (148 kB, Zheng Hu)
        5. 121240.stack (816 kB, Zheng Hu)

              Assignee: openinx (Zheng Hu)
              Reporter: openinx (Zheng Hu)