[HBASE-23887] New L1 cache : AdaptiveLRU - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.0.0-alpha-1, 2.5.0, 2.4.2
Component/s: BlockCache, Performance
Labels:
None

Hadoop Flags:

Reviewed
Release Note:

Introduced a new L1 cache: AdaptiveLRU. It is intended to provide better performance than the default LRU cache.
To start using the new cache, set the config key "hfile.block.cache.policy" to "AdaptiveLRU" in hbase-site.xml.


Description

Hi!

This is my first time here, so please correct me if something is wrong.

All the latest information is here:

https://docs.google.com/document/d/1X8jVnK_3lp9ibpX6lnISf_He-6xrHZL0jQQ7hoTV0-g/edit?usp=sharing

I want to propose a way to improve performance when the data in HFiles is much larger than the BlockCache (a common situation in big data). The idea is to cache only part of the DATA blocks. This helps because LruBlockCache then starts to work effectively and saves a huge amount of GC.

Sometimes we have more data than can fit into the BlockCache, which causes a high rate of evictions. In this case we can skip caching block N and instead cache block N+1. Block N would be evicted quite soon anyway, which is why skipping it is good for performance.

—

Some of the information below is out of date

—

Example:

Imagine we have a small cache that can fit only 1 block, and we try to read 3 blocks with offsets:
124
198
223

Current way: we put block 124, then put 198, evict 124, put 223, evict 198. That is a lot of work (5 actions).

With the feature: the last two digits of the offsets are evenly distributed from 0 to 99. Taking each offset modulo 100, we get:
124 -> 24
198 -> 98
223 -> 23

This lets us split the blocks into two groups. Those with a remainder below a threshold, for example below 50 (if we set hbase.lru.cache.data.block.percent = 50), go into the cache, and the others are skipped. This means we will not try to handle block 198, which saves CPU for other work. As a result, we put block 124, then put 223, evict 124 (3 actions).
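
The action counts in this example can be checked with a small simulation. This is only a sketch of the example as described, not the HBase implementation: the class SkipCacheDemo and the method countActions are hypothetical names, and the cache is modeled as a simple FIFO queue that holds a fixed number of blocks.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo class, not part of HBase.
class SkipCacheDemo {
    // Counts cache actions (puts plus evictions) for a cache that holds `capacity` blocks.
    // Blocks whose offset % 100 >= percent are skipped, as in the example above.
    static int countActions(long[] offsets, int capacity, int percent) {
        Deque<Long> cache = new ArrayDeque<>();
        int actions = 0;
        for (long offset : offsets) {
            if (percent != 100 && offset % 100 >= percent) {
                continue; // skipped block: no put now, no eviction later
            }
            if (cache.size() == capacity) {
                cache.removeFirst(); // evict the oldest block to make room
                actions++;
            }
            cache.addLast(offset); // put the new block
            actions++;
        }
        return actions;
    }

    public static void main(String[] args) {
        long[] offsets = {124, 198, 223};
        System.out.println(countActions(offsets, 1, 100)); // prints 5
        System.out.println(countActions(offsets, 1, 50));  // prints 3
    }
}
```

With percent = 100 every block is cached, which gives the 5 actions of the current way. With percent = 50 block 198 is never touched, which gives 3 actions.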

See the picture in the attachments for the test described below. Requests per second are higher and GC is lower.

The key point of the code:
Added the parameter hbase.lru.cache.data.block.percent, which defaults to 100.

If it is set to a value from 1 to 99, the following logic applies:

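public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
  // Only DATA blocks are filtered, and only when the percentage is below 100.
  if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
    // Skip the block if the last two digits of its offset fall outside the cached share.
    if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
      return;
    }
  }
  ...
  // the same code as usual
}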

Other parameters control when this logic is enabled, so that it works only while heavy reading is going on.

hbase.lru.cache.heavy.eviction.count.limit - how many times the eviction process has to run before we start to avoid putting data into the BlockCache
hbase.lru.cache.heavy.eviction.bytes.size.limit - how many bytes have to be evicted each time before we start to avoid putting data into the BlockCache

By default: if more than 10 MB is evicted each time for 10 eviction runs in a row (100 seconds), we start to skip 50% of data blocks.
When the heavy eviction process ends, the new logic turns off and all blocks are put into the BlockCache again.
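
The switching rule above can be sketched as a small state machine. This is a minimal illustration of the description as written, under the assumption that the counter resets as soon as one eviction run falls below the byte limit. The class HeavyEvictionTracker and its methods are hypothetical names, and the merged code may work differently.

```java
// Hypothetical sketch, not the HBase implementation.
class HeavyEvictionTracker {
    private final int countLimit;      // stands in for hbase.lru.cache.heavy.eviction.count.limit
    private final long bytesSizeLimit; // stands in for hbase.lru.cache.heavy.eviction.bytes.size.limit
    private final int reducedPercent;  // stands in for hbase.lru.cache.data.block.percent
    private int heavyRuns = 0;
    private int cacheDataBlockPercent = 100;

    HeavyEvictionTracker(int countLimit, long bytesSizeLimit, int reducedPercent) {
        this.countLimit = countLimit;
        this.bytesSizeLimit = bytesSizeLimit;
        this.reducedPercent = reducedPercent;
    }

    // Called once per eviction run with the number of bytes freed in that run.
    void onEvictionRun(long bytesEvicted) {
        if (bytesEvicted > bytesSizeLimit) {
            heavyRuns++;
            if (heavyRuns >= countLimit) {
                cacheDataBlockPercent = reducedPercent; // start skipping data blocks
            }
        } else {
            heavyRuns = 0;               // assumption: one light run ends the heavy period
            cacheDataBlockPercent = 100; // cache all blocks again
        }
    }

    int currentPercent() {
        return cacheDataBlockPercent;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        HeavyEvictionTracker tracker = new HeavyEvictionTracker(10, 10 * mb, 50);
        for (int i = 0; i < 9; i++) {
            tracker.onEvictionRun(11 * mb);
        }
        System.out.println(tracker.currentPercent()); // prints 100: only 9 heavy runs so far
        tracker.onEvictionRun(11 * mb);
        System.out.println(tracker.currentPercent()); // prints 50: 10th heavy run in a row
        tracker.onEvictionRun(1 * mb);
        System.out.println(tracker.currentPercent()); // prints 100: heavy eviction ended
    }
}
```

The value returned by currentPercent() would then be the one compared against the offset remainder in cacheBlock.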

Description of the test:

4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.

4 RegionServers

4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)

Total BlockCache Size = 48 Gb (8 % of data in HFiles)

Random reads in 20 threads

I am going to make a Pull Request. I hope this is the right way to contribute to this cool product.

Attachments


Name | Date | Size | Author
1582787018434_rs_metrics.jpg | 27/Feb/20 07:05 | 91 kB | Danil Lipovoy
1582801838065_rs_metrics_new.png | 27/Feb/20 11:12 | 45 kB | Danil Lipovoy
BC_LongRun.png | 02/Mar/20 11:08 | 183 kB | Danil Lipovoy
BlockCacheEvictionProcess.gif | 22/Jun/20 05:57 | 3.58 MB | Danil Lipovoy
BlockCacheEvictionProcess.gif | 16/May/20 14:26 | 3.66 MB | Danil Lipovoy
cmp.png | 24/Feb/20 08:44 | 28 kB | Danil Lipovoy
evict_BC100_vs_BC23.png | 07/Mar/20 12:59 | 39 kB | Danil Lipovoy
eviction_100p.png | 31/May/20 16:43 | 169 kB | Danil Lipovoy
eviction_100p.png | 31/May/20 16:24 | 169 kB | Danil Lipovoy
eviction_100p.png | 31/May/20 16:22 | 158 kB | Danil Lipovoy
gc_100p.png | 31/May/20 16:29 | 162 kB | Danil Lipovoy
graph.png | 06/Jun/20 18:33 | 87 kB | Danil Lipovoy
image-2020-06-07-08-11-11-929.png | 07/Jun/20 05:11 | 89 kB | Danil Lipovoy
image-2020-06-07-08-19-00-922.png | 07/Jun/20 05:19 | 90 kB | Danil Lipovoy
image-2020-06-07-12-07-24-903.png | 07/Jun/20 09:07 | 0.2 kB | Danil Lipovoy
image-2020-06-07-12-07-30-307.png | 07/Jun/20 09:07 | 0.2 kB | Danil Lipovoy
image-2020-06-08-17-38-45-159.png | 08/Jun/20 14:38 | 48 kB | Danil Lipovoy
image-2020-06-08-17-38-52-579.png | 08/Jun/20 14:38 | 45 kB | Danil Lipovoy
image-2020-06-08-18-35-48-366.png | 08/Jun/20 15:35 | 40 kB | Danil Lipovoy
image-2020-06-14-20-51-11-905.png | 14/Jun/20 17:51 | 8 kB | Danil Lipovoy
image-2020-06-22-05-57-45-578.png | 22/Jun/20 05:57 | 3.58 MB | Danil Lipovoy
image-2020-09-23-09-48-59-714.png | 23/Sep/20 06:49 | 113 kB | Danil Lipovoy
image-2020-09-23-10-06-11-189.png | 23/Sep/20 07:06 | 36 kB | Danil Lipovoy
PR#1257.diff | 08/Jan/21 14:02 | 18 kB | Viraj Jasani
ratio.png | 14/Jun/20 17:42 | 38 kB | Danil Lipovoy
ratio2.png | 14/Jun/20 18:03 | 32 kB | Danil Lipovoy
read_requests_100pBC_vs_23pBC.png | 07/Mar/20 12:54 | 48 kB | Danil Lipovoy
requests_100p.png | 31/May/20 16:50 | 183 kB | Danil Lipovoy
requests_100p.png | 31/May/20 15:43 | 189 kB | Danil Lipovoy
requests_new_100p.png | 02/Jun/20 05:47 | 147 kB | Danil Lipovoy
requests_new2_100p.png | 06/Jun/20 18:38 | 152 kB | Danil Lipovoy
scan_and_gets.png | 14/Jun/20 14:00 | 107 kB | Danil Lipovoy
scan_and_gets2.png | 14/Jun/20 17:50 | 101 kB | Danil Lipovoy
scan.png | 06/Jun/20 18:06 | 55 kB | Danil Lipovoy
wave.png | 07/Jun/20 09:06 | 49 kB | Danil Lipovoy
ycsb_logs.zip | 23/Sep/20 07:09 | 35 kB | Danil Lipovoy

Issue Links

is related to

HDFS-15202 HDFS-client: boost ShortCircuit Cache

Resolved

links to

GitHub Pull Request #1257

GitHub Pull Request #2934

GitHub Pull Request #2957

Pull Request HBASE 1257
