  1. Hadoop HDFS
  2. HDFS-8204

Mover/Balancer should not schedule two replicas to the same DN

    Details

    • Hadoop Flags:
      Reviewed

      Description

      In versions before 2.6, the Balancer moves blocks between DataNodes.
      In version 2.6 and later, the Balancer moves blocks between StorageGroups (introduced by HDFS-6584).
      The function

      class DBlock extends Locations<StorageGroup>
      DBlock.isLocatedOn(StorageGroup loc)
      

      is flawed and may cause two replicas to end up on the same node after the Balancer runs.

      For example:
      We have 2 nodes. Each node has two storages.
      We have (DN0, SSD), (DN0, DISK), (DN1, SSD), (DN1, DISK).
      We have a block with ONE_SSD storage policy.
      The block has 2 replicas. They are in (DN0,SSD) and (DN1,DISK).
      The Balancer should not move the replica in (DN0, SSD) to (DN1, SSD).
      Otherwise DN1 would hold 2 replicas of the same block.
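
      The example above can be sketched in Java as follows. This is a minimal, hypothetical model and not the actual Hadoop code: the StorageGroup and Block classes here are simplified stand-ins, and isLocatedOnDatanode is an illustrative name rather than a method from the patch. It assumes the flaw is that the location check compares whole storage groups, so a replica on a different storage of the same DataNode goes unnoticed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical, simplified model of the scenario; not the real Hadoop classes. */
public class ReplicaPlacementSketch {

    /** A (datanode, storage type) pair, loosely modeled on the Balancer's StorageGroup. */
    static final class StorageGroup {
        final String datanode;
        final String storageType;

        StorageGroup(String datanode, String storageType) {
            this.datanode = datanode;
            this.storageType = storageType;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof StorageGroup)) return false;
            StorageGroup other = (StorageGroup) o;
            return datanode.equals(other.datanode) && storageType.equals(other.storageType);
        }

        @Override
        public int hashCode() {
            return Objects.hash(datanode, storageType);
        }
    }

    /** A block and the storage groups that currently hold its replicas. */
    static final class Block {
        private final List<StorageGroup> locations = new ArrayList<>();

        void addLocation(StorageGroup loc) {
            locations.add(loc);
        }

        /** Storage-group comparison: misses a replica on another storage of the same datanode. */
        boolean isLocatedOn(StorageGroup target) {
            return locations.contains(target);
        }

        /** Datanode-level comparison: true if any replica already lives on the target's datanode. */
        boolean isLocatedOnDatanode(StorageGroup target) {
            for (StorageGroup loc : locations) {
                if (loc.datanode.equals(target.datanode)) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        // The block from the example: replicas in (DN0, SSD) and (DN1, DISK).
        Block block = new Block();
        block.addLocation(new StorageGroup("DN0", "SSD"));
        block.addLocation(new StorageGroup("DN1", "DISK"));

        // Candidate target for moving the (DN0, SSD) replica.
        StorageGroup target = new StorageGroup("DN1", "SSD");

        System.out.println("storage-group check: " + block.isLocatedOn(target));
        System.out.println("datanode check:      " + block.isLocatedOnDatanode(target));
    }
}
```

      In this sketch the storage-group check reports false for (DN1, SSD), so the move looks allowed, while the datanode-level check reports true and would prevent the move from being scheduled.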
      --------------
      UPDATE (thanks to Tsz Wo Nicholas Sze for pointing this out):

      This bug will NOT cause 2 replicas to end up on the same node after the Balancer runs, because the DataNode rejects the move.

      We see many ERROR messages like the following when running the test:

      2015-04-27 10:08:15,809 ERROR datanode.DataNode (DataXceiver.java:run(277)) - host1.foo.com:59537:DataXceiver error processing REPLACE_BLOCK operation  src: /127.0.0.1:52532 dst: /127.0.0.1:59537
      org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-264794661-9.96.1.34-1430100451121:blk_1073741825_1001 already exists in state FINALIZED and thus cannot be created.
          at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1447)
          at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:186)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.replaceBlock(DataXceiver.java:1158)
          at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReplaceBlock(Receiver.java:229)
          at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:77)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
          at java.lang.Thread.run(Thread.java:722)
      

      In the test, the Balancer runs 5 to 20 iterations before it exits.
      This is inefficient.
      The Balancer should not schedule such a move in the first place, even though it will fail anyway. In the test, it should exit after 5 iterations.

        Attachments

        1. HDFS-8204.001.patch
          3 kB
          Walter Su
        2. HDFS-8204.002.patch
          5 kB
          Walter Su
        3. HDFS-8204.003.patch
          5 kB
          Walter Su

              People

              • Assignee:
                walter.k.su Walter Su
                Reporter:
                walter.k.su Walter Su
              • Votes:
                0
                Watchers:
                7
