  1. Hadoop HDFS
  2. HDFS-16000

HDFS: Rename performance optimization

Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.4, 3.3.1
    • Fix Version/s: None
    • Component/s: hdfs, namenode

    Description

      Moving a large directory with rename takes a long time. For example, moving a directory with 10 million (1000W) entries takes about 40 seconds. When a large amount of data is deleted to the trash, a large-directory move occurs when the trash creates a checkpoint. Users may also trigger a large-directory move directly. In both cases the NameNode can hold its lock for too long and be killed by ZKFC. A flame graph shows that most of the time is spent creating EnumCounters objects.

       

      Rename logic optimization:

      • Regardless of the quota configuration of the source and target directories, a rename computes the quota count of the moved directory three times. The first pass checks whether the moved directory would exceed the target directory's quota, the second computes the moved directory's quota usage to update the source directory's quota, and the third computes it again to update the target directory's quota.
      • I think some of these three quota calculations are unnecessary. For example, if none of the parent directories of the source and target directories has a quota configured, there is no need to calculate quotaCount at all. Even if both the source directory and the target directory use quotas, there is no need to calculate the quota three times: the first and third calculations use the same logic, so the result only needs to be computed once.
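The idea in the two points above can be sketched roughly as follows. This is a minimal, self-contained model and not the code in the attached patch: the class RenameQuotaSketch, the Dir type, and all method names are hypothetical stand-ins for the NameNode's inode and quota structures, and only a namespace count is tracked. In this simplified model a directory's usage counts its descendants only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of quota handling during rename. */
final class RenameQuotaSketch {
    /** Number of full subtree walks performed, to make the saving visible. */
    static int subtreeWalks = 0;

    static final class Dir {
        Dir parent;
        final List<Dir> children = new ArrayList<>();
        final long quota;  // -1 means no quota is configured here
        long used = 0;     // descendants charged against this quota

        Dir(Dir parent, long quota) {
            this.parent = parent;
            this.quota = quota;
            if (parent != null) {
                parent.children.add(this);
                charge(parent, 1);
            }
        }
    }

    /** Adds delta to the usage of every quota-bearing directory from d upward. */
    static void charge(Dir d, long delta) {
        for (Dir p = d; p != null; p = p.parent) {
            if (p.quota >= 0) p.used += delta;
        }
    }

    static boolean anyQuotaOnPath(Dir d) {
        for (Dir p = d; p != null; p = p.parent) {
            if (p.quota >= 0) return true;
        }
        return false;
    }

    static boolean isAncestorOf(Dir candidate, Dir d) {
        for (Dir p = d.parent; p != null; p = p.parent) {
            if (p == candidate) return true;
        }
        return false;
    }

    /** The expensive step: a full walk of the moved subtree. */
    static long countSubtree(Dir d) {
        subtreeWalks++;
        return count(d);
    }

    private static long count(Dir d) {
        long n = 1;
        for (Dir c : d.children) n += count(c);
        return n;
    }

    /** Moves src under dstParent; returns false if a target quota would be exceeded. */
    static boolean rename(Dir src, Dir dstParent) {
        // Skip the walk entirely when no directory on either path has a quota.
        boolean needCount = anyQuotaOnPath(src.parent) || anyQuotaOnPath(dstParent);
        long delta = needCount ? countSubtree(src) : 0;  // computed once, reused below

        // Verify target quotas; ancestors shared with the source see no net change.
        for (Dir p = dstParent; p != null; p = p.parent) {
            if (p.quota >= 0 && !isAncestorOf(p, src) && p.used + delta > p.quota) {
                return false;
            }
        }
        charge(src.parent, -delta);  // update source side with the same value
        src.parent.children.remove(src);
        src.parent = dstParent;
        dstParent.children.add(src);
        charge(dstParent, delta);    // update target side with the same value
        return true;
    }
}
```

In this sketch a rename between paths without any quota performs no subtree walk, and a rename involving quotas performs exactly one.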

      Attachments

        1. HDFS-16000.patch
          37 kB
          Xiangyi Zhu
        2. 20210428-171635-lambda.svg
          102 kB
          Xiangyi Zhu
        3. 20210428-143238.svg
          145 kB
          Xiangyi Zhu

            People

              Assignee: Xiangyi Zhu (zhuxiangyi)
              Reporter: Xiangyi Zhu (zhuxiangyi)
              Votes: 0
              Watchers: 16

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m