  1. Hadoop HDFS
  2. HDFS-8204

Mover/Balancer should not schedule two replicas to the same DN



    • Reviewed


      Before version 2.6, the Balancer moves blocks between Datanodes.
      Since version 2.6, the Balancer moves blocks between StorageGroups (introduced by HDFS-6584).

      class DBlock extends Locations<StorageGroup>
      DBlock.isLocatedOn(StorageGroup loc)

      is flawed and may cause 2 replicas to end up on the same node after the Balancer runs.

      For example:
      We have 2 nodes. Each node has two storages.
      We have (DN0, SSD), (DN0, DISK), (DN1, SSD), (DN1, DISK).
      We have a block with ONE_SSD storage policy.
      The block has 2 replicas, located in (DN0, SSD) and (DN1, DISK).
      The Balancer should not move the replica in (DN0, SSD) to (DN1, SSD).
      Otherwise DN1 would hold both replicas.
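
The example above can be sketched in a few lines of Java. This is not the Hadoop source: `ReplicaPlacementSketch`, `SketchStorageGroup`, `SketchBlock`, and both check methods are hypothetical names, and the sketch assumes the flawed check compares only the exact (datanode, storage type) pair. It illustrates why a storage-group-level check does not see that DN1 already holds a replica, while a datanode-level check does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types are the real Hadoop classes.
class ReplicaPlacementSketch {

    // Hypothetical stand-in for a storage group: one storage type on one datanode.
    static final class SketchStorageGroup {
        final String datanode;    // e.g. "DN0"
        final String storageType; // e.g. "SSD" or "DISK"

        SketchStorageGroup(String datanode, String storageType) {
            this.datanode = datanode;
            this.storageType = storageType;
        }
    }

    // Hypothetical stand-in for a block and the storage groups holding its replicas.
    static final class SketchBlock {
        private final List<SketchStorageGroup> locations = new ArrayList<>();

        void addLocation(SketchStorageGroup group) {
            locations.add(group);
        }

        // Assumed behavior of the flawed check: true only when a replica sits in
        // exactly this (datanode, storage type) pair.
        boolean isLocatedOnStorageGroup(SketchStorageGroup target) {
            for (SketchStorageGroup g : locations) {
                if (g.datanode.equals(target.datanode)
                        && g.storageType.equals(target.storageType)) {
                    return true;
                }
            }
            return false;
        }

        // Stricter check: true when any replica sits on the target's datanode,
        // whatever the storage type.
        boolean isLocatedOnDatanode(SketchStorageGroup target) {
            for (SketchStorageGroup g : locations) {
                if (g.datanode.equals(target.datanode)) {
                    return true;
                }
            }
            return false;
        }
    }

    // Builds the block from the example: replicas on (DN0, SSD) and (DN1, DISK).
    static SketchBlock exampleBlock() {
        SketchBlock block = new SketchBlock();
        block.addLocation(new SketchStorageGroup("DN0", "SSD"));
        block.addLocation(new SketchStorageGroup("DN1", "DISK"));
        return block;
    }

    public static void main(String[] args) {
        SketchBlock block = exampleBlock();
        SketchStorageGroup target = new SketchStorageGroup("DN1", "SSD");
        System.out.println("storage-group check: " + block.isLocatedOnStorageGroup(target));
        System.out.println("datanode check: " + block.isLocatedOnDatanode(target));
    }
}
```

For the target (DN1, SSD), the storage-group check returns false, so the move looks allowed, while the datanode check returns true, so a scheduler using it would skip that target.
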

      UPDATE (thanks to szetszwo for pointing this out):

      This bug will NOT cause 2 replicas to end up on the same node after the Balancer runs, because the Datanode rejects the move.

      However, we see many ERROR messages like the following when running the test:

      2015-04-27 10:08:15,809 ERROR datanode.DataNode (DataXceiver.java:run(277)) - host1.foo.com:59537:DataXceiver error processing REPLACE_BLOCK operation  src: / dst: /
      org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-264794661- already exists in state FINALIZED and thus cannot be created.
          at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1447)
          at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:186)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.replaceBlock(DataXceiver.java:1158)
          at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReplaceBlock(Receiver.java:229)
          at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:77)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
          at java.lang.Thread.run(Thread.java:722)

      In the test, the Balancer runs 5 to 20 iterations before it exits.
      This is inefficient.
      The Balancer should not schedule such a move in the first place, even though the move will fail anyway. In the test, it should exit after 5 iterations.


        1. HDFS-8204.001.patch
          3 kB
          Walter Su
        2. HDFS-8204.002.patch
          5 kB
          Walter Su
        3. HDFS-8204.003.patch
          5 kB
          Walter Su

              walter.k.su Walter Su