  1. Hadoop HDFS
  2. HDFS-12077

Implement a remaining space based balancer policy


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: balancer & mover
    • Labels: None

    Description

      Our cluster has DataNodes with 2T disks. As the cluster's storage utilization grows, we need to add new DataNodes to increase its capacity. To keep the utilization of every DataNode relatively balanced, we usually run the HDFS balancer tool every time we add new DataNodes.
      We have been facing an issue with heterogeneous disk capacity when using the HDFS balancer tool. In a production cluster, we often have to add new DataNodes with larger disk capacity than the existing DNs. Since the original balancer is implemented to balance the utilization of every DataNode, it brings every DN's utilization to within a given threshold of the cluster's average utilization.
      For example, in a cluster with two DataNodes, DN1 and DN2, where DN1 has ten disks of 2T each and DN2 has ten disks of 10T each, the original balancer may leave the cluster balanced in the following state:

      DataNode  Total Capacity  Used  Remaining  Utilization
      DN1       20T             18T   2T         90%
      DN2       100T            90T   10T        90%
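
To make the utilization-based check concrete, here is a minimal sketch of it applied to the two DataNodes above. It is not the actual balancer code: the class `UtilizationBalanceSketch`, the `Node` type, and the methods are hypothetical names for illustration, and the threshold of 10 percentage points is only an example value.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of a utilization-based balance check. Not Hadoop code; all names are hypothetical. */
class UtilizationBalanceSketch {

    /** Minimal stand-in for a DataNode's storage report (sizes in TB). */
    static final class Node {
        final String name;
        final double capacityTb;
        final double usedTb;

        Node(String name, double capacityTb, double usedTb) {
            this.name = name;
            this.capacityTb = capacityTb;
            this.usedTb = usedTb;
        }

        /** Used space as a percentage of this node's own capacity. */
        double utilizationPct() {
            return 100.0 * usedTb / capacityTb;
        }
    }

    /** Cluster-wide utilization: total used space over total capacity, in percent. */
    static double clusterUtilizationPct(List<Node> nodes) {
        double capacity = 0;
        double used = 0;
        for (Node n : nodes) {
            capacity += n.capacityTb;
            used += n.usedTb;
        }
        return 100.0 * used / capacity;
    }

    /** True if every node's utilization is within thresholdPct of the cluster average. */
    static boolean isBalanced(List<Node> nodes, double thresholdPct) {
        double average = clusterUtilizationPct(nodes);
        for (Node n : nodes) {
            if (Math.abs(n.utilizationPct() - average) > thresholdPct) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The state from the table: both nodes at 90% utilization.
        List<Node> nodes = Arrays.asList(
            new Node("DN1", 20, 18),
            new Node("DN2", 100, 90));
        System.out.println("cluster utilization: " + clusterUtilizationPct(nodes) + "%");
        System.out.println("balanced at threshold 10: " + isBalanced(nodes, 10.0));
    }
}
```

Under this check the cluster counts as balanced even though DN1 has only 2T left and DN2 has 10T left, because the check looks only at percentages.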

      Each DN has reached 90% utilization. In this state, DN1's capacity to store new blocks is far less than DN2's. When DN1 is full, all new blocks will be written to DN2 and more MR tasks will be scheduled on DN2. As a result, DN2 is overloaded and we cannot make full use of each DN's I/O capacity. In such a case, we wish the balancer could run based on the remaining space of every DN. After balancing, every DN's remaining space would be balanced, as in the following state:

      DataNode  Total Capacity  Used  Remaining  Utilization
      DN1       20T             14T   6T         70%
      DN2       100T            94T   6T         94%

      In a cluster where the remaining space of the DNs is balanced, every DN will be used when writing new blocks to the cluster, and every DN's I/O capacity can be used when running MR jobs.
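
The proposed policy could be sketched roughly as follows: compute the mean remaining space across DataNodes and, for each node, how much data it would have to move out (or could take in) to reach that mean. This is only an illustration of the idea under simplifying assumptions, not the patch itself: `RemainingSpaceBalanceSketch`, `Node`, `meanRemainingTb`, and `tbToMoveOut` are hypothetical names, and the sketch ignores thresholds, storage types, and block placement.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of a remaining-space-based balancing target. Not Hadoop code; all names are hypothetical. */
class RemainingSpaceBalanceSketch {

    /** Minimal stand-in for a DataNode's storage report (sizes in TB). */
    static final class Node {
        final String name;
        final double capacityTb;
        final double usedTb;

        Node(String name, double capacityTb, double usedTb) {
            this.name = name;
            this.capacityTb = capacityTb;
            this.usedTb = usedTb;
        }

        /** Free space left on this node. */
        double remainingTb() {
            return capacityTb - usedTb;
        }
    }

    /** Mean remaining space per node; used here as the balancing target. */
    static double meanRemainingTb(List<Node> nodes) {
        double total = 0;
        for (Node n : nodes) {
            total += n.remainingTb();
        }
        return total / nodes.size();
    }

    /**
     * Data the node must move out to reach the target remaining space.
     * A positive value means the node is a source; a negative value means it can receive data.
     */
    static double tbToMoveOut(Node node, double targetRemainingTb) {
        return targetRemainingTb - node.remainingTb();
    }

    public static void main(String[] args) {
        // The utilization-balanced state from the first table.
        List<Node> nodes = Arrays.asList(
            new Node("DN1", 20, 18),
            new Node("DN2", 100, 90));
        double target = meanRemainingTb(nodes);
        System.out.println("target remaining per node: " + target + "T");
        for (Node n : nodes) {
            System.out.println(n.name + " moves out " + tbToMoveOut(n, target) + "T");
        }
    }
}
```

For the first table this gives a target of 6T per node, with DN1 moving 4T out and DN2 taking 4T in, which matches the second table. A real policy would also have to handle a node whose total capacity is smaller than the target.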

      Please let me know what you guys think. I will attach a patch if necessary.


          People

            Assignee: Unassigned
            Reporter: yiyang liuyiyang
