[HDFS-15610] Reduce datanode upgrade/hardlink thread - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0, 3.1.4
Fix Version/s: 3.3.1, 3.4.0
Component/s: datanode
Labels:
- pull-request-available

Description

There is kernel overhead during datanode upgrade. If a datanode has millions of blocks and 10+ disks, the block-layout migration becomes very expensive during its hardlink operation. Slowness is observed when running with a large number of hardlink threads (dfs.datanode.block.id.layout.upgrade.threads, 12 threads per disk by default), and the upgrade runs for 2+ hours.

I.e., 10 * 12 = 120 threads (for 10 disks).
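
To make the arithmetic above concrete, here is a minimal Python sketch of how the total thread count scales with the number of disks. The function and constant names are hypothetical and are not Hadoop APIs; the only values taken from this issue are the default of 12 threads per disk and the 10-disk example.

```python
# Illustrative only: these names are hypothetical, not part of Hadoop.
DEFAULT_THREADS_PER_DISK = 12  # default stated in the description above


def total_hardlink_threads(disks: int,
                           threads_per_disk: int = DEFAULT_THREADS_PER_DISK) -> int:
    """Total upgrade threads, assuming each disk gets its own pool of this size."""
    if disks < 0 or threads_per_disk < 1:
        raise ValueError("disks must be >= 0 and threads_per_disk >= 1")
    return disks * threads_per_disk


if __name__ == "__main__":
    # Compare the thread counts used in the small test, scaled to 10 disks.
    for per_disk in (12, 6, 3):
        total = total_hardlink_threads(10, per_disk)
        print(f"10 disks x {per_disk} threads/disk = {total} threads")
```

With the default setting, a 10-disk datanode ends up with 120 concurrent hardlink threads; halving the per-disk value halves that total.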

Small test:

RHEL7, 32 cores, 20 GB RAM, 8 GB DN heap

dfs.datanode.block.id.layout.upgrade.threads	Blocks	Disks	Time taken
12	3.3 Million	1	2 minutes and 59 seconds
6	3.3 Million	1	2 minutes and 35 seconds
3	3.3 Million	1	2 minutes and 51 seconds

The same test was run twice and the results were about 95% consistent (only a few seconds' difference between iterations). Using 6 threads is faster than using 12 threads because of the overhead of the additional threads.
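
For anyone who wants to try a similar comparison outside of a datanode, the following is a rough Python sketch of a hardlink micro-benchmark with a configurable thread count. It is an assumption-laden toy: it links empty files in a temporary directory rather than real block files, it does not exercise the datanode's actual upgrade code path, and the helper names (hardlink_all, run_trial) are made up for illustration. Timings from it should not be compared with the numbers in the table above.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def hardlink_all(src_dir: str, dst_dir: str, workers: int) -> float:
    """Hardlink every file in src_dir into dst_dir with `workers` threads.

    Returns the elapsed wall-clock time in seconds.
    """
    names = os.listdir(src_dir)

    def link_one(name: str) -> None:
        os.link(os.path.join(src_dir, name), os.path.join(dst_dir, name))

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any worker exception.
        list(pool.map(link_one, names))
    return time.perf_counter() - start


def run_trial(num_files: int, workers: int) -> Tuple[float, int]:
    """Create num_files empty files, hardlink them, return (seconds, links made)."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src")
        dst = os.path.join(root, "dst")
        os.mkdir(src)
        os.mkdir(dst)
        for i in range(num_files):
            # Empty placeholder files stand in for block files.
            open(os.path.join(src, f"blk_{i}"), "w").close()
        elapsed = hardlink_all(src, dst, workers)
        return elapsed, len(os.listdir(dst))


if __name__ == "__main__":
    # Same thread counts as the small test; the file count is arbitrary.
    for workers in (12, 6, 3):
        elapsed, linked = run_trial(num_files=2000, workers=workers)
        print(f"{workers:>2} threads: {linked} links in {elapsed:.3f}s")
```

Results will vary with filesystem, kernel, and disk type, so repeat each setting a few times before drawing conclusions.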

Issue Links

is duplicated by

HDFS-9536 OOM errors during parallel upgrade to Block-ID based layout

Resolved

links to

GitHub Pull Request #2365

People

Assignee: Karthik Palanisamy

Reporter: Karthik Palanisamy

Votes: 1

Watchers: 6

Dates

Created: 01/Oct/20 00:52

Updated: 29/Oct/22 22:47

Resolved: 08/Oct/20 07:23

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 1h 10m