  1. Hadoop HDFS
  2. HDFS-13032

Make AvailableSpaceBlockPlacementPolicy more adaptive



    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.8.2
    • None
    • None
    • None


      In a heterogeneous HDFS cluster, datanode capacity and usage can vary widely from node to node.

      We can now use the usage-aware block placement policy from HDFS-8131 to deal with this problem. However, the policy could be more flexible.

      1. The probability of choosing a node with higher usage is fixed once the parameter is set. That is, the probability is the same whether the node's usage is 90% or 70%. When a node is close to full, its probability of being chosen should be lower.

      2. When the difference in usage is below 5% (hard-coded), the two nodes are considered to have the same usage. I think that is OK when usage is 30% and 35%, but when usage is 93% and 98% they should not be treated equally. The probability correction could be smoother.
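
      The two points above can be illustrated with a simplified model of the behaviour as this issue describes it. This is only a sketch, not the actual Hadoop source: the class name, method name, and the `preference` parameter are hypothetical, and returning 0.5 for "same usage" is a modelling assumption.

```java
/** Simplified model of the behaviour described in this issue; not Hadoop code. */
final class FixedPreferenceChoice {
    // Threshold taken from the issue text ("below 5%, hard-coded").
    static final double SAME_USAGE_THRESHOLD_PCT = 5.0;

    /**
     * Probability that candidate A is chosen over candidate B.
     * usageA and usageB are usage percentages in [0, 100];
     * preference is the (assumed) configured chance of favouring the
     * less-used node, for example 0.6.
     */
    static double probabilityOfA(double usageA, double usageB, double preference) {
        if (Math.abs(usageA - usageB) < SAME_USAGE_THRESHOLD_PCT) {
            return 0.5; // assumption: "same usage" means neither node is favoured
        }
        // The result depends only on which node is less used,
        // not on how full either node actually is.
        return usageA < usageB ? preference : 1.0 - preference;
    }
}
```

      In this model a node at 70% and a node at 90% get the same probability against a 30% node, and 93% versus 98% is treated exactly like 30% versus 35%.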

      In my opinion, when we choose one node from two candidates (A: usage 30%, B: usage 60%), we can calculate the probability from the available storage: p(A) = 70% / (70% + 40%) ≈ 64%, p(B) = 40% / (70% + 40%) ≈ 36%. When a node is close to full, its probability becomes very small.
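
      A minimal sketch of this calculation follows. The class and method names are hypothetical and not taken from the attached patches, and the fallback when both nodes are completely full is an added assumption.

```java
/** Sketch of choosing between two candidates in proportion to free space. */
final class AvailableSpaceWeighting {
    /**
     * Probability that candidate A is chosen over candidate B,
     * proportional to each node's available space.
     * usageA and usageB are usage percentages in [0, 100].
     */
    static double probabilityOfA(double usageA, double usageB) {
        double freeA = 100.0 - usageA;
        double freeB = 100.0 - usageB;
        double totalFree = freeA + freeB;
        if (totalFree <= 0.0) {
            return 0.5; // assumption: both nodes full, so no preference
        }
        return freeA / totalFree;
    }
}
```

      For A at 30% and B at 60% this gives 70 / 110, about 0.636. For 93% versus 98% it gives 7 / 9, about 0.778, so nearly full nodes are no longer treated equally.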

      We could also add another factor to weaken this correction, so that the modification is not too aggressive.
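
      The issue does not say what this factor would look like. One possible reading, shown purely as an assumption, is a `strength` value in [0, 1] that blends between no preference (0.5) and the fully space-proportional probability. All names here are hypothetical.

```java
/** One assumed way to damp a space-proportional choice; not from the patches. */
final class DampedSpaceWeighting {
    /**
     * Probability that candidate A is chosen over candidate B.
     * strength = 0 ignores usage entirely (always 0.5);
     * strength = 1 uses the free-space ratio unchanged.
     */
    static double probabilityOfA(double usageA, double usageB, double strength) {
        double freeA = 100.0 - usageA;
        double freeB = 100.0 - usageB;
        double totalFree = freeA + freeB;
        // Assumption: when both nodes are full there is no preference.
        double proportional = totalFree <= 0.0 ? 0.5 : freeA / totalFree;
        // Clamp the damping factor so the result stays a valid probability.
        double s = Math.max(0.0, Math.min(1.0, strength));
        return 0.5 + s * (proportional - 0.5);
    }
}
```

      With A at 30%, B at 60%, and a strength of 0.5, the preference for A drops from about 0.636 to about 0.568.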

      Any thoughts? liushaohui


        1. HDFS-13032.002.patch
          13 kB
          Tao Jie
        2. HDFS-13032.001.patch
          13 kB
          Tao Jie




              Assignee: Tao Jie
              Reporter: Tao Jie