Hadoop HDFS / HDFS-13883

Reduce memory consumption and GC of directory scan


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

Description

    When the DirectoryScanner task is triggered periodically, the scan thread scans every disk on this DataNode for every block pool and constructs a ScanInfo per block. The DataNode therefore needs a huge amount of memory to hold these ScanInfo structures when tens of millions of blocks are stored on the DataNode.
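
    For a rough sense of the scale involved, here is an illustrative estimate. The field names and the assumed ~300 bytes per record are guesses for illustration only, not the real ScanInfo layout:

{code:java}
// Illustrative only: simplified fields, not the exact ScanInfo layout used by
// the DirectoryScanner. The point is that one record per block, times tens of
// millions of blocks, adds up to multiple gigabytes held at once.
public class ScanInfoMemoryEstimate {

    /** Simplified stand-in for a per-block scan record (hypothetical). */
    static final class BlockRecord {
        final long blockId;          // 8 bytes
        final String blockFilePath;  // ~100+ bytes: object header plus path chars
        final String metaFilePath;   // ~100+ bytes
        final long blockFileLength;  // 8 bytes

        BlockRecord(long id, String blockPath, String metaPath, long len) {
            this.blockId = id;
            this.blockFilePath = blockPath;
            this.metaFilePath = metaPath;
            this.blockFileLength = len;
        }
    }

    public static void main(String[] args) {
        long blocks = 30_000_000L;   // thirty million blocks, as in the test below
        long bytesPerRecord = 300L;  // assumed average per record: headers,
                                     // references, and two path strings
        System.out.printf("~%.1f GB held at once during a full scan%n",
                blocks * bytesPerRecord / 1e9);   // prints ~9.0 GB
    }
}
{code}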

    Another problem is that the DataNode is implemented in Java and runs in a JVM, so we need to set a large -Xmx to satisfy the peak memory needs of the DirectoryScanner. But the default scan period (dfs.datanode.directoryscan.interval) is 6 hours, and the rest of the time the DataNode needs far less memory; since the JVM does not automatically return freed memory to the OS, memory utilization stays low.
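
    The following standalone demo (not DataNode code) illustrates the retention effect; run it with e.g. -Xmx2g. After a large allocation burst becomes garbage, the committed heap typically stays near its peak instead of shrinking back, though the exact behavior depends on the collector:

{code:java}
// Demonstration only: allocate a ~1 GB burst (standing in for a directory
// scan building its records), drop it, and observe that the committed heap
// usually stays grown rather than being returned to the OS.
public class HeapRetentionDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.printf("committed heap before burst: %d MB%n",
                rt.totalMemory() >> 20);

        byte[][] burst = new byte[1024][];
        for (int i = 0; i < burst.length; i++) {
            burst[i] = new byte[1 << 20];   // 1024 x 1 MB = ~1 GB
        }
        burst = null;                       // the whole burst is now garbage
        System.gc();                        // even after a GC hint...

        // ...the committed heap typically remains near its peak, which is why
        // -Xmx must be sized for the scan even though it runs every 6 hours.
        System.out.printf("committed heap after burst:  %d MB%n",
                rt.totalMemory() >> 20);
    }
}
{code}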

    Finally, we tested a DataNode storing thirty million blocks with the DirectoryScanner disabled and enabled; the required -Xmx was 16G and 32G respectively. So I think we can improve the DirectoryScanner to save memory, for example by scanning one block pool per period (see the sketch below). Thanks.
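
    Here is a minimal sketch of the one-pool-per-period idea. The class and method names are hypothetical, not the actual DirectoryScanner code:

{code:java}
import java.util.List;

// Sketch: visit one block pool per scheduled period instead of all pools in
// one pass, so peak memory is bounded by the largest pool rather than by
// every block on the DataNode.
public class OnePoolPerPeriodScanner {
    private final List<String> blockPoolIds;  // pools served by this DataNode
    private int next = 0;                     // round-robin cursor

    public OnePoolPerPeriodScanner(List<String> blockPoolIds) {
        this.blockPoolIds = blockPoolIds;
    }

    /** Invoked once per scan period instead of scanning every pool. */
    public void runOnePeriod() {
        String bpid = blockPoolIds.get(next);
        next = (next + 1) % blockPoolIds.size();
        scanBlockPool(bpid);
        // The records for bpid become garbage here, before the next period
        // touches another pool, so scan memory never accumulates across pools.
    }

    private void scanBlockPool(String bpid) {
        // Placeholder: build the per-block records for this pool only and
        // reconcile them with the DataNode's in-memory block map.
        System.out.println("scanning block pool " + bpid);
    }
}
{code}

    One tradeoff to note: with N block pools, each pool is only verified every N periods, so the scan interval may need to be shortened accordingly.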


Attachments

Issue Links

Activity

People

    Assignee: Unassigned
    Reporter: liaoyuxiangqin
    Votes: 0
    Watchers: 10

Dates

    Created:
    Updated: