  1. Hadoop HDFS
  2. HDFS-15745

Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

Details

    • Reviewed

    Description

      When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found that many slow peers were reported even though the averageDelay of their ReportingNodes was very low, and the nodes flagged as slow were in fact healthy. I think so many slow peers are generated because the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too small (only 5ms) and is not configurable. The default slow IO warning log threshold is 300ms, i.e. DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise the NameNode will receive a lot of invalid slow peer information.
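
      To illustrate the request, here is a minimal, self-contained sketch of reading the two values from configuration instead of hard-coding them. It is not the patch attached to this issue. java.util.Properties stands in for Hadoop's Configuration, the two key names and the default of 10 for the minimum node count are assumptions, and the filter is only a simplified latency floor, not the statistical outlier detection DataNodePeerMetrics performs. Only the 5ms default comes from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class SlowPeerThresholdSketch {
    // Assumed key names, for illustration only.
    static final String LOW_THRESHOLD_KEY = "dfs.datanode.slowpeer.low.threshold.ms";
    static final String MIN_NODES_KEY = "dfs.datanode.min.outlier.detection.nodes";

    // 5ms is the hard-coded value quoted in the issue description.
    static final long LOW_THRESHOLD_DEFAULT_MS = 5L;
    // Assumed default; the description does not state it.
    static final long MIN_NODES_DEFAULT = 10L;

    // Read a long from the config, falling back to the default when unset.
    static long getLong(Properties conf, String key, long defaultValue) {
        String raw = conf.getProperty(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }

    // Keep only peers whose average latency exceeds the configured floor.
    // Returns an empty map when too few peers were sampled.
    static Map<String, Double> slowPeerCandidates(
            Properties conf, Map<String, Double> avgLatencyMs) {
        long lowThresholdMs = getLong(conf, LOW_THRESHOLD_KEY, LOW_THRESHOLD_DEFAULT_MS);
        long minNodes = getLong(conf, MIN_NODES_KEY, MIN_NODES_DEFAULT);

        Map<String, Double> candidates = new LinkedHashMap<>();
        if (avgLatencyMs.size() < minNodes) {
            return candidates;
        }
        for (Map.Entry<String, Double> e : avgLatencyMs.entrySet()) {
            if (e.getValue() > lowThresholdMs) {
                candidates.put(e.getKey(), e.getValue());
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Map<String, Double> latencies = new LinkedHashMap<>();
        latencies.put("dn-a", 8.0);   // slightly above the 5ms floor
        latencies.put("dn-b", 450.0); // above the 300ms slow IO threshold

        Properties conf = new Properties();
        conf.setProperty(MIN_NODES_KEY, "2");
        System.out.println("floor 5ms:   " + slowPeerCandidates(conf, latencies));

        conf.setProperty(LOW_THRESHOLD_KEY, "300");
        System.out.println("floor 300ms: " + slowPeerCandidates(conf, latencies));
    }
}
```

      With the floor left at 5ms, a healthy peer averaging 8ms still passes the filter; raising the floor to 300ms leaves only the peer that is slow by the slow IO warning standard.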

      Attachments

        1. image-2020-12-22-17-00-50-796.png
          429 kB
          Haibin Huang
        2. HDFS-15745-001.patch
          4 kB
          Haibin Huang
        3. HDFS-15745-002.patch
          5 kB
          Haibin Huang
        4. HDFS-15745-003.patch
          5 kB
          Haibin Huang
        5. HDFS-15745-branch-3.1.001.patch
          5 kB
          Haibin Huang
        6. HDFS-15745-branch-3.3.001.patch
          5 kB
          Haibin Huang
        7. HDFS-15745-branch-3.2.001.patch
          5 kB
          Haibin Huang

          People

            Assignee: Haibin Huang (huanghaibin)
            Reporter: Haibin Huang (huanghaibin)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 50m
