Ambari / AMBARI-18694

DataNode JVM heap settings should include CMSInitiatingOccupancy

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.2
    • Fix Version/s: 2.5.0

    Description

      As reported in HDFS-11047, DirectoryScanner performs its scan by deep-copying FinalizedReplica objects. In a deployment with 500,000+ blocks, we have seen DataNode (DN) heap usage climb to high peaks very quickly. These deep copies make DN heap usage even worse when directory scans are scheduled more frequently.

      Another factor is that a huge number of ScanInfo instances, corresponding to HDFS blocks, linger as garbage and consume a large amount of heap memory until a full GC takes place.

      This issue proposes adding JVM settings that force GC to run more frequently, releasing the DataNode heap consumed for the two reasons above, i.e. adding the following options to HADOOP_DATANODE_OPTS:

      -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:ConcGCThreads=8 -XX:+UseConcMarkSweepGC
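
      As a rough sketch, the options above might be appended in a DataNode environment file such as hadoop-env.sh. The file name and the helper variable DN_GC_OPTS are assumptions made for illustration, and the actual Ambari template change may look different. The sketch also assumes a JVM that still ships the CMS collector.

```shell
# Sketch only: DN_GC_OPTS is a hypothetical helper variable, not an Ambari or Hadoop name.
# Assumes a JVM that still includes the CMS collector.

# Select the CMS collector.
DN_GC_OPTS="-XX:+UseConcMarkSweepGC"
# Start a CMS cycle once the old generation is 70% occupied.
DN_GC_OPTS="${DN_GC_OPTS} -XX:CMSInitiatingOccupancyFraction=70"
# Always use that threshold instead of the JVM's own heuristics.
DN_GC_OPTS="${DN_GC_OPTS} -XX:+UseCMSInitiatingOccupancyOnly"
# Number of threads used for concurrent GC work.
DN_GC_OPTS="${DN_GC_OPTS} -XX:ConcGCThreads=8"

# Append to any options already set so existing settings are preserved.
export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS:-} ${DN_GC_OPTS}"
```

      Appending rather than overwriting keeps whatever heap size or logging flags a cluster already passes to the DataNode. The value 70 and the thread count 8 are the ones proposed in this issue and may need tuning for a given heap size and core count.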
      

          People

            Assignee: Xiaobing Zhou (xiaobingo)
            Reporter: Xiaobing Zhou (xiaobingo)
            Votes: 0
            Watchers: 5
