Hadoop HDFS / HDFS-2202

Changes to balancer bandwidth should not require datanode restart.

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0, 0.23.0
    • Fix Version/s: 0.20.205.0, 0.23.0
    • Component/s: balancer, datanode
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      New dfsadmin command added: [-setBalancerBandwidth <bandwidth>], where <bandwidth> is the maximum network bandwidth, in bytes per second, that the balancer is allowed to use on each datanode during balancing.

      This is an incompatible change in 0.23. The versions of ClientProtocol and DatanodeProtocol are changed.
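As a rough illustration of the units involved, the Java sketch below converts a rate in mebibytes per second into the bytes-per-second value that the release note says -setBalancerBandwidth expects, and formats the resulting command line. The class and method names are hypothetical, and the `hadoop dfsadmin` launcher prefix is an assumption about the local install rather than something this issue specifies.

```java
public class BalancerBandwidthArgs {
    // Bytes in one mebibyte; the command takes a plain bytes-per-second value.
    static final long BYTES_PER_MIB = 1024L * 1024L;

    /** Converts MiB/s to bytes/s, rejecting non-positive input and overflow. */
    public static long toBytesPerSecond(long mebibytesPerSecond) {
        if (mebibytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive: " + mebibytesPerSecond);
        }
        return Math.multiplyExact(mebibytesPerSecond, BYTES_PER_MIB);
    }

    /** Builds the command line; the "hadoop dfsadmin" prefix is an assumption. */
    public static String dfsadminCommand(long mebibytesPerSecond) {
        return "hadoop dfsadmin -setBalancerBandwidth " + toBytesPerSecond(mebibytesPerSecond);
    }

    public static void main(String[] args) {
        // prints "hadoop dfsadmin -setBalancerBandwidth 10485760"
        System.out.println(dfsadminCommand(10));
    }
}
```

The helper only formats the command; running it against a cluster is left to the operator.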

      Description

      Currently, changing the value of the balancer bandwidth (dfs.datanode.balance.bandwidthPerSec) requires restarting the datanode daemon.

      The optimal value of the bandwidthPerSec parameter is almost never known at cluster startup; it becomes apparent only once a new node has been placed in the cluster and balancing has begun. If balancing takes too long (bandwidthPerSec is too low) or consumes too much bandwidth (bandwidthPerSec is too high), the cluster must go into a "maintenance window" in which it is unusable while all of the datanodes are bounced. In large clusters with thousands of nodes, this can be a real maintenance problem: each "maintenance window" can take a long time, and several may be needed while bandwidthPerSec is tuned by experiment.
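To make the "too low" case concrete, here is a back-of-the-envelope Java sketch that estimates a lower bound on the time needed to fill one new node at a given per-datanode limit. The data volume and the two limits are illustrative assumptions, not figures from this issue, and the estimate ignores concurrency and protocol overhead.

```java
import java.util.Locale;

public class BalanceTimeEstimate {
    /** Lower-bound hours to move bytesToMove at the given bytes-per-second limit. */
    public static double hoursToMove(long bytesToMove, long bandwidthBytesPerSec) {
        if (bytesToMove < 0 || bandwidthBytesPerSec <= 0) {
            throw new IllegalArgumentException("need bytesToMove >= 0 and bandwidth > 0");
        }
        return (double) bytesToMove / bandwidthBytesPerSec / 3600.0;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40;                 // illustrative amount destined for one new node
        long[] limits = {1L << 20, 10L << 20};  // 1 MiB/s and 10 MiB/s, illustrative limits
        for (long limit : limits) {
            System.out.println(String.format(Locale.ROOT,
                    "%d bytes/s -> %.1f hours", limit, hoursToMove(oneTiB, limit)));
        }
        // prints:
        // 1048576 bytes/s -> 291.3 hours
        // 10485760 bytes/s -> 29.1 hours
    }
}
```

Under these assumptions, a tenfold change in the limit is the difference between roughly twelve days and a little over one day, which is why being able to adjust the value without a restart matters.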

      A possible solution to this problem would be to add a -bandwidth parameter to the balancer tool. If a bandwidth is supplied, the balancer would pass the value to the datanodes via the OP_REPLACE_BLOCK and OP_COPY_BLOCK DataTransferProtocol requests. This would, however, require changing the DataTransferProtocol version.
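The underlying requirement, a limit that in-flight transfers pick up without a restart, can be sketched independently of how the new value reaches the datanode. The class below is a hypothetical, self-contained illustration and is not the code from the attached patches. It assumes the limit is held in a shared, atomically updated field that the transfer path reads each time it paces a chunk.

```java
import java.util.concurrent.atomic.AtomicLong;

public class AdjustableBandwidthLimit {
    // Current limit in bytes per second; updated in place, no restart needed.
    private final AtomicLong bytesPerSecond;

    public AdjustableBandwidthLimit(long initialBytesPerSecond) {
        this.bytesPerSecond = new AtomicLong(requirePositive(initialBytesPerSecond));
    }

    /** Replaces the limit; later calls to minMillisFor see the new value. */
    public void setBandwidth(long newBytesPerSecond) {
        bytesPerSecond.set(requirePositive(newBytesPerSecond));
    }

    public long getBandwidth() {
        return bytesPerSecond.get();
    }

    /** Minimum milliseconds a sender should spend on numBytes at the current limit. */
    public long minMillisFor(long numBytes) {
        return Math.multiplyExact(numBytes, 1000L) / bytesPerSecond.get();
    }

    private static long requirePositive(long value) {
        if (value <= 0) {
            throw new IllegalArgumentException("bandwidth must be positive: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        AdjustableBandwidthLimit limit = new AdjustableBandwidthLimit(1L << 20);
        System.out.println(limit.minMillisFor(1L << 20)); // prints 1000
        limit.setBandwidth(2L << 20);                     // raise the limit at runtime
        System.out.println(limit.minMillisFor(1L << 20)); // prints 500
    }
}
```

The sketch only computes pacing delays; a real throttle would also track elapsed time and sleep accordingly.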

      Attachments

      1. ant.test.0.23.out (52 kB, Eric Payne)
      2. ASF.LICENSE.NOT.GRANTED--Balancer Bandwidth MSC.jpg (52 kB, Eric Payne)
      3. HDFS-2171.patch (21 kB, Eric Payne)
      4. HDFS-2202.0.20.205.0.v1.patch (24 kB, Eric Payne)
      5. HDFS-2202.0.20.205.0.v2.patch (24 kB, Eric Payne)
      6. HDFS-2202.0.23.0.v1.patch (24 kB, Eric Payne)
      7. HDFS-2202.0.23.0.v2.patch (24 kB, Eric Payne)
      8. HDFS-2202.patch (21 kB, Eric Payne)

            People

            • Assignee: Eric Payne
            • Reporter: Eric Payne
            • Votes: 0
            • Watchers: 6
