  1. HBase
  2. HBASE-23887

New L1 cache : AdaptiveLRU


Details

    • Reviewed
      Introduced a new L1 cache: AdaptiveLRU. It is intended to provide better performance than the default LRU cache.
      Set the config key "hfile.block.cache.policy" to "AdaptiveLRU" in hbase-site.xml to start using the new cache.

    Description

      Hi!

      This is my first time here, so please correct me if something is wrong.

      The latest information is here:

      https://docs.google.com/document/d/1X8jVnK_3lp9ibpX6lnISf_He-6xrHZL0jQQ7hoTV0-g/edit?usp=sharing

      I want to propose a way to improve performance when the data in HFiles is much larger than the BlockCache (a common situation in big data). The idea is to cache only part of the DATA blocks. This helps because LruBlockCache starts to work effectively and a large amount of GC is avoided.

      Sometimes we have more data than can fit into the BlockCache, which causes a high rate of evictions. In this case we can skip caching block N and instead cache block N+1. Block N would be evicted quite soon anyway, which is why skipping it is good for performance.

      Some of the information below is out of date.

       

       

      Example:

      Imagine we have a small cache that can fit only 1 block, and we are trying to read 3 blocks with offsets:
      124
      198
      223

      Current behavior: we put block 124, then put 198, evict 124, put 223, evict 198. A lot of work (5 actions).

      With the feature: the last two digits of the offsets are evenly distributed from 0 to 99. Taking each offset modulo 100 gives:
      124 -> 24
      198 -> 98
      223 -> 23

      This lets us partition the blocks. Some part, for example those below 50 (if we set hbase.lru.cache.data.block.percent = 50), goes into the cache, and the others are skipped. It means we will not try to handle block 198 and can save CPU for other work. As a result, we put block 124, then put 223, evict 124 (3 actions).

      See the picture in the attachments with the test described below. Requests per second are higher and GC is lower.
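
      The example above can be replayed with a small simulation. This is only a sketch: the class SkipCacheDemo and the method countActions are hypothetical names, and the cache is modelled as a plain FIFO queue rather than the real LruBlockCache. It counts puts and evictions for the three offsets, first with the filter off (percent = 100) and then with percent = 50.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo class, not part of HBase.
class SkipCacheDemo {
    /** Counts cache actions (puts + evictions) for a cache holding `capacity` blocks. */
    static int countActions(long[] offsets, int capacity, int cacheDataBlockPercent) {
        Deque<Long> cache = new ArrayDeque<>(); // oldest block sits at the head
        int actions = 0;
        for (long offset : offsets) {
            // Same filter as in the description: skip blocks whose offset modulo 100
            // is at or above the configured percentage.
            if (cacheDataBlockPercent != 100 && offset % 100 >= cacheDataBlockPercent) {
                continue;
            }
            if (cache.size() == capacity) {
                cache.removeFirst(); // evict the oldest block
                actions++;
            }
            cache.addLast(offset); // put the new block
            actions++;
        }
        return actions;
    }

    public static void main(String[] args) {
        long[] offsets = {124, 198, 223};
        System.out.println(countActions(offsets, 1, 100)); // prints 5
        System.out.println(countActions(offsets, 1, 50));  // prints 3
    }
}
```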

       
      The key point of the code:
      A new parameter is added, hbase.lru.cache.data.block.percent, which defaults to 100.
       
      If it is set to 1-99, the following logic applies:
       
       

      public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
        // Skip a share of the data blocks when caching is limited to a percentage.
        if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
          if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
            return;
          }
        }
        ...
        // the same code as usual
      }
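
      To see what share of blocks this guard admits, here is a minimal standalone sketch. DataBlockFilter and shouldCache are hypothetical names, and the sequential offsets are an assumption made only to get an even spread modulo 100; real HFile offsets are not sequential. It also shows that non-data blocks, and the default percent = 100, bypass the filter.

```java
// Hypothetical helper, not part of HBase.
class DataBlockFilter {
    /** Returns true when the block should be put into the cache. */
    static boolean shouldCache(long offset, boolean isDataBlock, int cacheDataBlockPercent) {
        if (cacheDataBlockPercent == 100 || !isDataBlock) {
            return true; // feature off, or a non-data block: always cache
        }
        return offset % 100 < cacheDataBlockPercent;
    }

    /** Counts how many of the offsets 0..total-1 are admitted as data blocks. */
    static int countAdmitted(int total, int cacheDataBlockPercent) {
        int admitted = 0;
        for (long offset = 0; offset < total; offset++) {
            if (shouldCache(offset, true, cacheDataBlockPercent)) {
                admitted++;
            }
        }
        return admitted;
    }

    public static void main(String[] args) {
        System.out.println(countAdmitted(10_000, 50));   // prints 5000
        System.out.println(countAdmitted(10_000, 100));  // prints 10000
        System.out.println(shouldCache(198, false, 50)); // prints true (non-data block)
    }
}
```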
      

       

      Other parameters control when this logic is enabled, so that it works only while heavy reading is going on.

      hbase.lru.cache.heavy.eviction.count.limit - how many times the eviction process has to run before we start to avoid putting data into the BlockCache
      hbase.lru.cache.heavy.eviction.bytes.size.limit - how many bytes have to be evicted each time before we start to avoid putting data into the BlockCache

      By default: if more than 10 MB was evicted (each time) 10 times (100 seconds), then we start to skip 50% of data blocks.
      When the heavy eviction process ends, the new logic is switched off and all blocks are put into the BlockCache again.
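
      The switching described above could be sketched as a small state tracker. This is an illustration only, not the code from the pull request: HeavyEvictionTracker and onEvictionRun are hypothetical names, and it assumes that the heavy runs must be consecutive and that a single light run ends the heavy period.

```java
// Hypothetical sketch, not the actual implementation.
class HeavyEvictionTracker {
    private final int countLimit;     // hbase.lru.cache.heavy.eviction.count.limit
    private final long bytesLimit;    // hbase.lru.cache.heavy.eviction.bytes.size.limit
    private final int reducedPercent; // hbase.lru.cache.data.block.percent
    private int heavyRuns = 0;        // heavy eviction runs seen in a row (assumption)

    HeavyEvictionTracker(int countLimit, long bytesLimit, int reducedPercent) {
        this.countLimit = countLimit;
        this.bytesLimit = bytesLimit;
        this.reducedPercent = reducedPercent;
    }

    /** Call once per eviction run; returns the percent of data blocks to cache next. */
    int onEvictionRun(long bytesEvicted) {
        if (bytesEvicted > bytesLimit) {
            heavyRuns++;
        } else {
            heavyRuns = 0; // heavy eviction ended: cache all blocks again
        }
        return heavyRuns >= countLimit ? reducedPercent : 100;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        HeavyEvictionTracker tracker = new HeavyEvictionTracker(10, 10 * mb, 50);
        int percent = 100;
        for (int run = 1; run <= 10; run++) {
            percent = tracker.onEvictionRun(20 * mb); // 20 MB evicted per run
        }
        System.out.println(percent);                       // prints 50
        System.out.println(tracker.onEvictionRun(1 * mb)); // prints 100
    }
}
```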
       

      Description of the test:

      4 nodes E5-2698 v4 @ 2.20GHz, 700 GB memory.

      4 RegionServers

      4 tables by 64 regions by 1.88 GB data in each = 600 GB total (only FAST_DIFF)

      Total BlockCache size = 48 GB (8% of the data in HFiles)

      Random reads in 20 threads

       

      I am going to open a pull request; I hope this is the right way to contribute to this cool product.

       

      Attachments

        1. 1582787018434_rs_metrics.jpg
          91 kB
          Danil Lipovoy
        2. 1582801838065_rs_metrics_new.png
          45 kB
          Danil Lipovoy
        3. BC_LongRun.png
          183 kB
          Danil Lipovoy
        4. BlockCacheEvictionProcess.gif
          3.58 MB
          Danil Lipovoy
        5. BlockCacheEvictionProcess.gif
          3.66 MB
          Danil Lipovoy
        6. cmp.png
          28 kB
          Danil Lipovoy
        7. evict_BC100_vs_BC23.png
          39 kB
          Danil Lipovoy
        8. eviction_100p.png
          169 kB
          Danil Lipovoy
        9. eviction_100p.png
          169 kB
          Danil Lipovoy
        10. eviction_100p.png
          158 kB
          Danil Lipovoy
        11. gc_100p.png
          162 kB
          Danil Lipovoy
        12. graph.png
          87 kB
          Danil Lipovoy
        13. image-2020-06-07-08-11-11-929.png
          89 kB
          Danil Lipovoy
        14. image-2020-06-07-08-19-00-922.png
          90 kB
          Danil Lipovoy
        15. image-2020-06-07-12-07-24-903.png
          0.2 kB
          Danil Lipovoy
        16. image-2020-06-07-12-07-30-307.png
          0.2 kB
          Danil Lipovoy
        17. image-2020-06-08-17-38-45-159.png
          48 kB
          Danil Lipovoy
        18. image-2020-06-08-17-38-52-579.png
          45 kB
          Danil Lipovoy
        19. image-2020-06-08-18-35-48-366.png
          40 kB
          Danil Lipovoy
        20. image-2020-06-14-20-51-11-905.png
          8 kB
          Danil Lipovoy
        21. image-2020-06-22-05-57-45-578.png
          3.58 MB
          Danil Lipovoy
        22. image-2020-09-23-09-48-59-714.png
          113 kB
          Danil Lipovoy
        23. image-2020-09-23-10-06-11-189.png
          36 kB
          Danil Lipovoy
        24. PR#1257.diff
          18 kB
          Viraj Jasani
        25. ratio.png
          38 kB
          Danil Lipovoy
        26. ratio2.png
          32 kB
          Danil Lipovoy
        27. read_requests_100pBC_vs_23pBC.png
          48 kB
          Danil Lipovoy
        28. requests_100p.png
          183 kB
          Danil Lipovoy
        29. requests_100p.png
          189 kB
          Danil Lipovoy
        30. requests_new_100p.png
          147 kB
          Danil Lipovoy
        31. requests_new2_100p.png
          152 kB
          Danil Lipovoy
        32. scan_and_gets.png
          107 kB
          Danil Lipovoy
        33. scan_and_gets2.png
          101 kB
          Danil Lipovoy
        34. scan.png
          55 kB
          Danil Lipovoy
        35. wave.png
          49 kB
          Danil Lipovoy
        36. ycsb_logs.zip
          35 kB
          Danil Lipovoy


            People

              pustota Danil Lipovoy
              Votes: 2
              Watchers: 25
