HDFS-14771: Backport HDFS-14617 to branch-2 (Improve fsimage load time by writing sub-sections to the fsimage index)

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.10.0
    • Fix Version/s: 2.10.0
    • Component/s: namenode
    • Labels:
    • Release Note:
      This change allows the inode and inode directory sections of the fsimage to be loaded in parallel. Tests on large images have shown this change to reduce the image load time to about 50% of the pre-change run time.

      It works by writing sub-section entries to the image index, effectively splitting each image section into many sub-sections which can be processed in parallel. By default 12 sub-sections per image section are created when the image is saved, and 4 threads are used to load the image at startup.
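The splitting and parallel loading described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: the class `SubSectionLoadSketch`, its methods, and the use of a `long[]` as a stand-in for the records of an image section are all hypothetical, and summing the records stands in for the real work of building inodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names come from the Hadoop code base.
class SubSectionLoadSketch {

    // Split recordCount records into at most targetSections contiguous
    // [start, end) ranges, playing the role of sub-section index entries.
    static List<int[]> split(int recordCount, int targetSections) {
        List<int[]> ranges = new ArrayList<>();
        if (recordCount <= 0) {
            return ranges;
        }
        int sections = Math.max(1, targetSections);
        int chunk = (recordCount + sections - 1) / sections; // ceiling division
        for (int start = 0; start < recordCount; start += chunk) {
            ranges.add(new int[] {start, Math.min(start + chunk, recordCount)});
        }
        return ranges;
    }

    // "Load" every sub-section on a fixed-size pool and combine the results.
    static long loadInParallel(long[] records, int targetSections, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int[] range : split(records.length, targetSections)) {
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = range[0]; i < range[1]; i++) {
                        sum += records[i];
                    }
                    return sum;
                };
                pending.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : pending) {
                total += result.get(); // waits for that sub-section to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a sub-section failed to load", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] records = new long[100];
        for (int i = 0; i < records.length; i++) {
            records[i] = i + 1;
        }
        // 12 sub-sections and 4 threads mirror the defaults quoted above.
        System.out.println(split(records.length, 12).size());
        System.out.println(loadInParallel(records, 12, 4));
    }
}
```

Having more sub-sections than threads, as in the defaults, lets a thread that finishes a small range pick up another one instead of idling.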

      The feature is disabled by default and can be enabled by setting dfs.image.parallel.load to true. Once enabled, it applies only to images with more than 1M inodes (dfs.image.parallel.inode.threshold). After the feature is enabled, the next HDFS checkpoint writes the image sub-sections, and subsequent namenode restarts can load the image in parallel.

      An image with parallel sections can be read even if the feature is disabled, but HDFS versions without this Jira cannot load an image with parallel sections. OIV can process a parallel-enabled image without issues.

      Key configuration parameters are:

      dfs.image.parallel.load = false - Enable or disable the feature.

      dfs.image.parallel.target.sections = 12 - The target number of subsections. Aim for 2 to 3 times the number of dfs.image.parallel.threads.

      dfs.image.parallel.inode.threshold = 1000000 - Only save and load in parallel if the image has more than this number of inodes.

      dfs.image.parallel.threads = 4 - The number of threads used to load the image. Testing has shown 4 to be optimal, but this may depend on the environment.
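How the four parameters interact can be sketched as below. This uses plain `java.util.Properties` rather than Hadoop's configuration classes, and the class `ParallelImageConfigSketch` and its methods are hypothetical names introduced for illustration. The strict "greater than" comparison against the threshold is an assumption based on the wording "more than this number of inodes".

```java
import java.util.Properties;

// Hypothetical illustration only; this is not how the namenode reads its settings.
class ParallelImageConfigSketch {
    static final String LOAD_KEY = "dfs.image.parallel.load";
    static final String SECTIONS_KEY = "dfs.image.parallel.target.sections";
    static final String THRESHOLD_KEY = "dfs.image.parallel.inode.threshold";
    static final String THREADS_KEY = "dfs.image.parallel.threads";

    // Default values as listed in the release note.
    static Properties defaults() {
        Properties conf = new Properties();
        conf.setProperty(LOAD_KEY, "false");
        conf.setProperty(SECTIONS_KEY, "12");
        conf.setProperty(THRESHOLD_KEY, "1000000");
        conf.setProperty(THREADS_KEY, "4");
        return conf;
    }

    // Parallel save/load applies only when the feature is switched on AND the
    // image has more inodes than the threshold (strictness is an assumption).
    static boolean useParallel(Properties conf, long inodeCount) {
        boolean enabled = Boolean.parseBoolean(conf.getProperty(LOAD_KEY, "false"));
        long threshold = Long.parseLong(conf.getProperty(THRESHOLD_KEY, "1000000"));
        return enabled && inodeCount > threshold;
    }

    // Checks the "2 to 3 times the number of threads" sizing advice.
    static boolean followsSizingAdvice(Properties conf) {
        int threads = Integer.parseInt(conf.getProperty(THREADS_KEY, "4"));
        int sections = Integer.parseInt(conf.getProperty(SECTIONS_KEY, "12"));
        return sections >= 2 * threads && sections <= 3 * threads;
    }

    public static void main(String[] args) {
        Properties conf = defaults();
        System.out.println(useParallel(conf, 5_000_000L)); // feature still off
        conf.setProperty(LOAD_KEY, "true");
        System.out.println(useParallel(conf, 5_000_000L)); // large image, feature on
        System.out.println(useParallel(conf, 500_000L));   // image below threshold
        System.out.println(followsSizingAdvice(conf));
    }
}
```

With the defaults, 12 target sections is exactly 3 times the 4 loader threads, so the sizing advice holds without further tuning.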

      UPGRADE WARNING:
      1. Upgrading from 2.10 to 3.* works smoothly if this feature has never been enabled.
      2. If the fsimage parallel loading feature is enabled, the only upgrade path is currently from 2.10 to 3.3.
      3. To upgrade from 2.10 to an earlier 3.* release (3.1.*/3.2.*), make sure that at least one fsimage file is saved after the feature is disabled. This requires first setting dfs.image.parallel.load=false and restarting the namenode before the upgrade.

      Description

      This JIRA aims to backport HDFS-14617 to branch-2: improve fsimage load time by writing sub-sections to the fsimage index.

        Attachments

        1. HDFS-14771.branch-2.003.patch
          44 kB
          Xiaoqiao He
        2. HDFS-14771.branch-2.002.patch
          44 kB
          Xiaoqiao He
        3. HDFS-14771.branch-2.001.patch
          44 kB
          Xiaoqiao He

              People

              • Assignee:
                hexiaoqiao Xiaoqiao He
                Reporter:
                hexiaoqiao Xiaoqiao He
              • Votes:
                0
                Watchers:
                9
