[HDFS-14771] Backport HDFS-14617 to branch-2 (Improve fsimage load time by writing sub-sections to the fsimage index) - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.10.0
Fix Version/s: 2.10.0
Component/s: namenode
Labels:
- release-blocker

Release Note:

This change allows the inode and inode directory sections of the fsimage to be loaded in parallel. Tests on large images have shown this change to reduce the image load time to about 50% of the pre-change run time.

It works by writing sub-section entries to the image index, effectively splitting each image section into many sub-sections which can be processed in parallel. By default 12 sub-sections per image section are created when the image is saved, and 4 threads are used to load the image at startup.
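The splitting described above can be illustrated with a small sketch. This is not the Hadoop implementation: it assumes a section is a plain list of records, and `split_into_subsections`, `load_section`, and `parse_record` are hypothetical names chosen for illustration. The real fsimage index stores offsets into the image file rather than list positions.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_record(raw):
    # Hypothetical stand-in for deserializing one inode record.
    return raw.upper()


def split_into_subsections(record_count, target_sections):
    """Return (offset, length) index entries that together cover all records."""
    # Never create more sub-sections than there are records.
    sections = max(1, min(target_sections, record_count or 1))
    base, extra = divmod(record_count, sections)
    entries, offset = [], 0
    for i in range(sections):
        # Spread the remainder over the first `extra` sub-sections.
        length = base + (1 if i < extra else 0)
        entries.append((offset, length))
        offset += length
    return entries


def load_section(records, index_entries, threads):
    """Load each sub-section on a worker thread, then merge in index order."""
    def load_one(entry):
        offset, length = entry
        return [parse_record(r) for r in records[offset:offset + length]]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves the order of index_entries.
        parts = list(pool.map(load_one, index_entries))
    return [item for part in parts for item in part]
```

With the defaults quoted in this note, a section would be split with `split_into_subsections(n, 12)` and loaded with `load_section(records, entries, 4)`. Having more sub-sections than threads lets a worker that finishes early pick up another sub-section.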

The feature is disabled by default and can be enabled by setting dfs.image.parallel.load to true. Once enabled, it applies only to images with more than 1M inodes (dfs.image.parallel.inode.threshold). After the feature is enabled, the next HDFS checkpoint writes the image sub-sections, and subsequent namenode restarts can load the image in parallel.

An image with parallel sections can still be read when the feature is disabled, but HDFS versions without this change cannot load an image with parallel sections. OIV can process a parallel-enabled image without issues.

Key configuration parameters are:

dfs.image.parallel.load = false - Enables or disables the feature.

dfs.image.parallel.target.sections = 12 - The target number of sub-sections. Aim for 2 to 3 times the value of dfs.image.parallel.threads.

dfs.image.parallel.inode.threshold = 1000000 - Save and load in parallel only if the image has more than this number of inodes.

dfs.image.parallel.threads = 4 - The number of threads used to load the image. Testing has shown 4 to be optimal, but this may depend on the environment.
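How these four parameters interact can be sketched as follows. This is an illustrative model only, not namenode code: the configuration is assumed to be a plain dictionary, and `use_parallel` and `sections_follow_guideline` are hypothetical helper names.

```python
# Default values as listed in this release note.
DEFAULTS = {
    "dfs.image.parallel.load": False,
    "dfs.image.parallel.target.sections": 12,
    "dfs.image.parallel.inode.threshold": 1_000_000,
    "dfs.image.parallel.threads": 4,
}


def use_parallel(conf, inode_count):
    """Parallel save/load applies only if enabled AND the image is large enough."""
    settings = {**DEFAULTS, **conf}
    enabled = settings["dfs.image.parallel.load"]
    threshold = settings["dfs.image.parallel.inode.threshold"]
    return bool(enabled and inode_count > threshold)


def sections_follow_guideline(conf):
    """Check the '2 to 3 times the number of threads' recommendation."""
    settings = {**DEFAULTS, **conf}
    threads = settings["dfs.image.parallel.threads"]
    sections = settings["dfs.image.parallel.target.sections"]
    return 2 * threads <= sections <= 3 * threads
```

For example, `use_parallel({"dfs.image.parallel.load": True}, 5_000_000)` is true, while the same call with an empty override dictionary is false because the feature defaults to off.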

UPGRADE WARNING:
1. If this feature has never been enabled, upgrading from 2.10 to 3.* works smoothly.
2. If fsimage parallel loading is enabled, the only supported upgrade path is currently from 2.10 to 3.3.
3. To upgrade from 2.10 to a 3.* release prior to 3.3 (3.1.*/3.2.*), first disable the feature by setting dfs.image.parallel.load=false and restarting the namenode, then make sure at least one fsimage is saved with the feature disabled before starting the upgrade.
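The three upgrade rules can be restated as a small decision function. This is only a sketch of the rules in this note, not a tool shipped with Hadoop: `upgrade_allowed` and its parameters are hypothetical, and treating releases later than 3.3 the same as 3.3 is an assumption, since the note names only 3.3.

```python
def upgrade_allowed(target, ever_enabled, disabled_and_image_saved=False):
    """Decide whether upgrading from 2.10 to `target` is safe under the rules above.

    target: (major, minor) tuple of the 3.* release, e.g. (3, 2).
    ever_enabled: True if parallel fsimage loading was ever enabled on 2.10.
    disabled_and_image_saved: True if the feature was disabled, the namenode
        restarted, and at least one fsimage saved afterwards.
    """
    if not ever_enabled:
        # Rule 1: no sub-sections were ever written to the image.
        return True
    if target >= (3, 3):
        # Rule 2: 3.3 can read sub-sections (later releases: assumption).
        return True
    # Rule 3: older 3.* releases need an image saved without sub-sections.
    return disabled_and_image_saved
```

For example, `upgrade_allowed((3, 2), ever_enabled=True)` returns false until the feature has been disabled and a fresh fsimage saved.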


Description

This JIRA aims to backport HDFS-14617 to branch-2: improve fsimage load time by writing sub-sections to the fsimage index.

Attachments


HDFS-14771.branch-2.003.patch, 05/Sep/19 04:08, 44 kB, Xiaoqiao He
HDFS-14771.branch-2.002.patch, 03/Sep/19 17:21, 44 kB, Xiaoqiao He
HDFS-14771.branch-2.001.patch, 25/Aug/19 13:36, 44 kB, Xiaoqiao He

Issue Links

relates to

HDFS-14821 Make HDFS-14617 (fsimage sub-sections) off by default (Resolved)
