Hadoop HDFS
HDFS-13883

Reduce memory consumption and GC of directory scan


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      When the DirectoryScan task is triggered periodically, the scan thread scans every disk on the DataNode for all block pools and constructs a ScanInfo per block. The DataNode therefore needs a huge amount of memory to hold all of these ScanInfo structures when tens of millions of blocks are stored on it.
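As a rough back-of-envelope illustration (the ~300 bytes per ScanInfo is an assumed figure, not a measurement from HDFS), the ScanInfo overhead alone reaches several GiB at this scale:

```java
public class ScanInfoMemoryEstimate {
    public static void main(String[] args) {
        long blocks = 30_000_000L;    // tens of millions of blocks on one DataNode
        long bytesPerScanInfo = 300L; // assumption: object header + path strings + longs
        long totalBytes = blocks * bytesPerScanInfo;
        // Integer division gives the whole number of GiB held just by ScanInfo objects.
        System.out.println(totalBytes / (1024 * 1024 * 1024) + " GiB"); // prints "8 GiB"
    }
}
```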

      Another problem is that the DataNode is implemented in Java and runs as a JVM process, so -Xmx must be set high enough to satisfy DirectoryScan's peak memory demand. But the default DirectoryScan period is 6 hours; the rest of the time the DataNode needs far less memory, and the JVM cannot automatically return free heap to the OS, so memory utilization is low.

      Finally, we tested a DataNode storing thirty million blocks with DirectoryScan disabled and enabled; the required -Xmx was 16 GB and 32 GB respectively. So I think we can improve the DirectoryScan process to save memory, for example by scanning one block pool per period. Thanks.
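The "one block pool per period" proposal can be sketched as a round-robin scanner. The class and method names below are hypothetical illustrations, not HDFS's actual DirectoryScanner API; the point is that peak ScanInfo memory becomes bounded by the largest single pool instead of all pools combined:

```java
import java.util.List;

// Hypothetical sketch: instead of scanning every block pool in one pass
// (materializing a ScanInfo for every block at once), scan a single
// block pool per period, advancing round-robin across periods.
public class RoundRobinDirectoryScan {
    private final List<String> blockPoolIds;
    private int next = 0; // index of the pool to scan in the next period

    public RoundRobinDirectoryScan(List<String> blockPoolIds) {
        this.blockPoolIds = blockPoolIds;
    }

    /** Returns the block pool scanned in this period and advances the cursor. */
    public String scanOnePeriod() {
        String bpid = blockPoolIds.get(next);
        next = (next + 1) % blockPoolIds.size();
        // Real code would walk the disks for just this pool and build
        // ScanInfo objects for its blocks only.
        return bpid;
    }

    public static void main(String[] args) {
        RoundRobinDirectoryScan s =
            new RoundRobinDirectoryScan(List.of("BP-1", "BP-2", "BP-3"));
        System.out.println(s.scanOnePeriod()); // BP-1
        System.out.println(s.scanOnePeriod()); // BP-2
        System.out.println(s.scanOnePeriod()); // BP-3
        System.out.println(s.scanOnePeriod()); // BP-1 again
    }
}
```

The trade-off is that each individual pool is scanned less often (every N periods for N pools), so inconsistencies in a given pool may go undetected longer.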



              People

              • Assignee:
                Unassigned
              • Reporter:
                liaoyuxiangqin
              • Votes:
                0
              • Watchers:
                10
