HBASE-23066: Create a config that forces to cache blocks on compaction


Details

      The configuration 'hbase.rs.cacheblocksonwrite' enables caching of blocks as they are written. However, blocks produced by compaction were deliberately not cached (doing so can be very aggressive), since caching happens as and when the writer completes a block.
      In cloud environments, which tend to have larger caches, enabling 'hbase.rs.prefetchblocksonopen' (a non-aggressive way of proactively caching blocks when the reader is created) does not help, because it takes time to cache the compacted blocks.
      This feature adds a new configuration, 'hbase.rs.cachecompactedblocksonwrite', which when set to 'true' caches the blocks created by compaction.
      Note that because this is aggressive caching, the user should have enough cache space; otherwise other active blocks may be evicted.
      From the shell this can also be enabled per column family, using the following format:
      {code}
      create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.rs.cachecompactedblocksonwrite' => 'true'}}
      {code}
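      For a table that already exists, the same per-column-family setting could likewise be applied with the shell's alter command. A minimal sketch, reusing the illustrative table 't1' and family 'f1' from the example above:
      {code}
      # Cache blocks written by compaction for family 'f1' only (illustrative names)
      alter 't1', {NAME => 'f1', CONFIGURATION => {'hbase.rs.cachecompactedblocksonwrite' => 'true'}}
      {code}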

    Description

      In cases where users care a lot about read performance for tables that are small enough to fit into a cache (or the cache is large enough), prefetchOnOpen can be enabled to make the entire table available in cache after the initial region opening is completed. Any new data can also be guaranteed to be in cache with the cacheBlocksOnWrite setting.

      However, the missing piece is that all blocks are evicted after a compaction. We found very poor performance after compactions for tables under heavy read load on a slower backing filesystem (S3): after a compaction, the prefetching threads have to compete with the threads servicing read requests and are constantly blocked as a result.

      This is a proposal to introduce a new cache configuration option that caches blocks on write during compaction for any column family that has prefetch enabled. This would virtually guarantee that all blocks remain in cache after the initial prefetch on open completes, allowing for steady read performance despite a slow backing file system.
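      As a hedged sketch of how that could look for a single column family (the table and family names are illustrative and not taken from the attached patch; PREFETCH_BLOCKS_ON_OPEN is the standard shell attribute for prefetchOnOpen), the two settings could be combined from the shell roughly as follows:

      {code}
      # Prefetch the family's blocks on region open, and keep compacted blocks in cache as they are written
      alter 't1', {NAME => 'f1', PREFETCH_BLOCKS_ON_OPEN => 'true', CONFIGURATION => {'hbase.rs.cachecompactedblocksonwrite' => 'true'}}
      {code}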

      Attachments

        1. HBASE-23066.patch
          10 kB
          Jacob LeBlanc
        2. performance_results.png
          28 kB
          Jacob LeBlanc
        3. prefetchCompactedBlocksOnWrite.patch
          12 kB
          Jacob LeBlanc


            People

              Assignee: Jacob LeBlanc
              Reporter: Jacob LeBlanc
              Votes: 0
              Watchers: 16
