[HBASE-18309] Support multi threads in CleanerChore - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Resolved
Affects Version/s: None
Fix Version/s: 2.0.0-beta-1, 2.0.0
Component/s: None
Labels: None
Hadoop Flags: Reviewed
Release Note:

After HBASE-18309:
1. Multiple threads can scan the archive directories (including data and oldWALs). The thread count is set through hbase.cleaner.scan.dir.concurrent.size, which accepts either an integer (the concrete number of threads, capped at the number of available CPU cores) or a double (greater than 0.0 and at most 1.0, meaning a ratio of the available CPU cores). The default is 0.25. Note that 1.0 is different from 1: the former uses 100% of the cores, while the latter uses only 1 thread for the chore to scan directories.
2. Multiple threads can clean WALs under the oldWALs directory. The thread count is set through hbase.oldwals.cleaner.thread.size and defaults to 2.
3. hbase.cleaner.scan.dir.concurrent.size and hbase.oldwals.cleaner.thread.size support online reconfiguration, as hbase.regionserver.hfilecleaner.large.thread.count and hbase.regionserver.hfilecleaner.small.thread.count do.
4. To avoid thread flooding, consider hbase.cleaner.scan.dir.concurrent.size, hbase.regionserver.hfilecleaner.large.thread.count, hbase.regionserver.hfilecleaner.small.thread.count and hbase.oldwals.cleaner.thread.size together when setting any of them.
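
The integer-versus-ratio rule for hbase.cleaner.scan.dir.concurrent.size can be sketched as follows. This is an illustrative sketch, not the code from the patch: the class name `CleanerPoolSizeSketch` and the method `resolve` are hypothetical, and two behaviors are assumptions rather than statements from the release note (a ratio that rounds down to zero is raised to 1 thread, and an unrecognized value falls back to the default of 0.25).

```java
// Hypothetical helper illustrating the release note's sizing rule.
// It is not the HBase implementation.
final class CleanerPoolSizeSketch {
    // Default value stated in the release note.
    static final String DEFAULT_SIZE = "0.25";

    /**
     * Maps a configured value to a thread count.
     *
     * @param configured     raw config string, e.g. "4" or "0.5"
     * @param availableCores number of CPU cores, assumed to be at least 1
     */
    static int resolve(String configured, int availableCores) {
        String value = configured == null ? "" : configured.trim();

        // Integer form: a concrete thread count, capped at the core count.
        // The digit limit keeps Integer.parseInt from overflowing.
        if (value.matches("[1-9][0-9]{0,8}")) {
            return Math.min(Integer.parseInt(value), availableCores);
        }

        // Double form: a ratio in (0.0, 1.0] of the available cores.
        if (value.matches("0\\.[0-9]+|1\\.0+")) {
            double ratio = Double.parseDouble(value);
            if (ratio > 0.0) {
                // Assumption: never return fewer than 1 thread.
                return Math.max(1, (int) (availableCores * ratio));
            }
        }

        // Assumption: anything else falls back to the documented default.
        return resolve(DEFAULT_SIZE, availableCores);
    }
}
```

With 8 cores, `resolve("1.0", 8)` yields 8 threads while `resolve("1", 8)` yields 1, which is the distinction the release note warns about. In practice the core count would come from `Runtime.getRuntime().availableProcessors()`.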


Description

LogCleaner has only one thread to clean oldWALs, and in our big cluster we find this is not enough. The number of files under oldWALs reaches the max-directory-items limit of HDFS and causes region servers to crash. After we switched LogCleaner to multiple threads, the crashes no longer happened.

What's more, only one thread currently iterates over the archive directory. We could use multiple threads to clean subdirectories in parallel to speed this up.
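
The idea of cleaning subdirectories in parallel can be sketched with a fork/join task per directory. This is a minimal sketch under stated assumptions, not the patch itself: it walks a local directory with `java.nio.file` rather than HDFS, the names `ParallelDirCleanerSketch`, `CleanTask` and `clean` are hypothetical, and the `deletable` predicate stands in for whatever rule decides that a file may be removed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.function.Predicate;

// Hypothetical sketch: one fork/join task per directory on a local file system.
final class ParallelDirCleanerSketch {

    /** Deletes matching files in one directory and forks a subtask per subdirectory. */
    static final class CleanTask extends RecursiveTask<Integer> {
        private final Path dir;
        private final Predicate<Path> deletable;

        CleanTask(Path dir, Predicate<Path> deletable) {
            this.dir = dir;
            this.deletable = deletable;
        }

        @Override
        protected Integer compute() {
            List<CleanTask> subTasks = new ArrayList<>();
            int deleted = 0;
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry)) {
                        // Hand the subdirectory to the pool so it is scanned concurrently.
                        CleanTask task = new CleanTask(entry, deletable);
                        task.fork();
                        subTasks.add(task);
                    } else if (deletable.test(entry)) {
                        Files.delete(entry);
                        deleted++;
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // Wait for the subdirectory tasks and add up their results.
            for (CleanTask task : subTasks) {
                deleted += task.join();
            }
            return deleted;
        }
    }

    /** Cleans the tree under root with the given parallelism and returns the number of files deleted. */
    static int clean(Path root, Predicate<Path> deletable, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return pool.invoke(new CleanTask(root, deletable));
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as `ParallelDirCleanerSketch.clean(root, p -> p.toString().endsWith(".old"), 2)` would remove every `.old` file under `root` using a pool of parallelism 2. The linked issue HBASE-22867 reports that a ForkJoinPool in CleanerChore spawned thousands of threads on a cluster with thousands of tables, so a design like this needs care with blocking joins and pool sizing.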

Attachments

- HBASE-18309.addendum.patch (17/Dec/17 09:06, 1 kB, Chia-Ping Tsai)
- HBASE-18309.branch-1.001.patch (30/Mar/18 04:52, 42 kB, Reid Chan)
- HBASE-18309.branch-1.002.patch (30/Mar/18 06:12, 42 kB, Reid Chan)
- HBASE-18309.branch-1.003.patch (30/Mar/18 07:14, 40 kB, Reid Chan)
- HBASE-18309.branch-1.004.patch (30/Mar/18 08:31, 40 kB, Reid Chan)
- HBASE-18309.branch-1.005.patch (30/Mar/18 12:19, 40 kB, Reid Chan)
- HBASE-18309.branch-1.006.patch (03/Apr/18 18:50, 40 kB, Zach York)
- HBASE-18309.master.001.patch (06/Nov/17 13:31, 15 kB, Reid Chan)
- HBASE-18309.master.002.patch (07/Nov/17 14:23, 16 kB, Reid Chan)
- HBASE-18309.master.004.patch (15/Nov/17 04:58, 31 kB, Reid Chan)
- HBASE-18309.master.005.patch (15/Nov/17 15:51, 32 kB, Reid Chan)
- HBASE-18309.master.006.patch (16/Nov/17 04:46, 32 kB, Reid Chan)
- HBASE-18309.master.007.patch (16/Nov/17 14:27, 32 kB, Reid Chan)
- HBASE-18309.master.008.patch (17/Nov/17 10:13, 32 kB, Reid Chan)
- HBASE-18309.master.009.patch (18/Nov/17 03:29, 32 kB, Reid Chan)
- HBASE-18309.master.010.patch (20/Nov/17 04:40, 32 kB, Reid Chan)
- HBASE-18309.master.011.patch (20/Nov/17 09:41, 33 kB, Reid Chan)
- HBASE-18309.master.012.patch (21/Nov/17 07:07, 33 kB, Reid Chan)
- space_consumption_in_archive.png (10/Nov/17 07:03, 202 kB, Yu Li)

Issue Links

causes

HBASE-22867 The ForkJoinPool in CleanerChore will spawn thousands of threads in our cluster with thousands table

Resolved

is depended upon by

HBASE-19306 Avoid thread flood when both LogCleaner and HFileCleaner support multi threads execution

Open

HBASE-20352 [Chore] Backport HBASE-18309 to branch-1

Resolved

is related to

HBASE-20401 Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

Resolved

relates to

HBASE-19709 Guard against a ThreadPool size of 0 in CleanerChore

Resolved

HBASE-20095 Redesign single instance pool in CleanerChore

Resolved

HBASE-18083 Make large/small file clean thread number configurable in HFileCleaner

Resolved

HBASE-14247 Separate the old WALs into different regionserver directories

Closed

links to

Review Board
