Hadoop HDFS / HDFS-16203

Discover datanodes with unbalanced block pool usage by the standard deviation


Details

    • Reviewed

    Description

      Discover datanodes with unbalanced volume usage by the standard deviation.

      Datanode disk usage can become unbalanced in several scenarios:
      1. A damaged disk is repaired and brought back online.
      2. Disks are added to some datanodes.
      3. Some disks are damaged, which slows down writes to them.
      4. A custom volume choosing policy is in use.

      When disk usage is unbalanced, a sudden increase in datanode write traffic may concentrate I/O on the volumes with low usage, keeping those disks busy and decreasing throughput across datanodes.

      We need to find these nodes promptly so that we can run the disk balancer or take other action. From the usage of each volume on a datanode, we can calculate the standard deviation of that datanode's volume usage. The more unbalanced the volumes, the higher the standard deviation.
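
      The calculation described above can be sketched as follows. This is only an illustration, not the code from the patch: the class name `VolumeUsageStdDev`, the use of usage percentages as input, and the choice of the population (rather than sample) standard deviation are all assumptions made for the example.

```java
public class VolumeUsageStdDev {

    /**
     * Population standard deviation of per-volume usage percentages.
     * Hypothetical helper for illustration; returns 0 when there are
     * fewer than two volumes, since no imbalance can be measured.
     */
    public static double usageStdDev(double[] usagePercents) {
        int n = usagePercents.length;
        if (n < 2) {
            return 0.0;
        }
        // Mean usage across all volumes of the datanode.
        double sum = 0.0;
        for (double u : usagePercents) {
            sum += u;
        }
        double mean = sum / n;

        // Sum of squared deviations from the mean.
        double squaredDiffs = 0.0;
        for (double u : usagePercents) {
            double d = u - mean;
            squaredDiffs += d * d;
        }
        return Math.sqrt(squaredDiffs / n);
    }

    public static void main(String[] args) {
        // Made-up usage percentages, one entry per volume.
        double[] balanced = {50.0, 52.0, 48.0, 50.0};
        double[] unbalanced = {90.0, 88.0, 10.0, 12.0};
        System.out.println("balanced:   " + usageStdDev(balanced));
        System.out.println("unbalanced: " + usageStdDev(unbalanced));
    }
}
```

      With these made-up inputs, the second datanode yields a much larger value than the first, which is the signal the issue proposes to surface.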

      We can display the result on the namenode web UI and sort by it to find the nodes whose volume usage is unbalanced.
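
      The effect of sorting by this value can be sketched as below. The web UI would normally sort its own table; this standalone example only shows the ordering. `DatanodeUsageRanking`, `NodeStat`, and the sample values are hypothetical and are not taken from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DatanodeUsageRanking {

    /** Hypothetical holder for one datanode and its usage standard deviation. */
    public static class NodeStat {
        public final String name;
        public final double stdDev;

        public NodeStat(String name, double stdDev) {
            this.name = name;
            this.stdDev = stdDev;
        }
    }

    /** Returns a new list ordered from most to least unbalanced. */
    public static List<NodeStat> sortByStdDevDesc(List<NodeStat> stats) {
        List<NodeStat> sorted = new ArrayList<>(stats);
        sorted.sort(Comparator.comparingDouble((NodeStat s) -> s.stdDev).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        List<NodeStat> stats = Arrays.asList(
                new NodeStat("dn1", 1.4),
                new NodeStat("dn2", 39.0),
                new NodeStat("dn3", 12.5));
        for (NodeStat s : sortByStdDevDesc(stats)) {
            System.out.println(s.name + " " + s.stdDev);
        }
    }
}
```

      The datanodes at the top of the resulting list are the first candidates for disk balancing.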

      This interface is only used to obtain metrics and does not adversely affect namenode performance.

       



          People

            Assignee: Tao Li (tomscut)
            Reporter: Tao Li (tomscut)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 6h 20m
