Mover/Balancer should not schedule two replicas to the same DN

Details

    • Reviewed

    Description

      Before version 2.6, the Balancer moved blocks between Datanodes.
      Since version 2.6, the Balancer moves blocks between StorageGroups (introduced by HDFS-6584).
      The function

      class DBlock extends Locations<StorageGroup>
      DBlock.isLocatedOn(StorageGroup loc)
      

      is flawed and may cause two replicas to end up on the same node after running the Balancer.

      For example:
      We have 2 nodes, each with two storages:
      (DN0, SSD), (DN0, DISK), (DN1, SSD), (DN1, DISK).
      We have a block with the ONE_SSD storage policy.
      The block has 2 replicas, located on (DN0, SSD) and (DN1, DISK).
      The replica on (DN0, SSD) should not be moved to (DN1, SSD) when the Balancer runs.
      Otherwise DN1 would hold both replicas.
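
      The scenario above can be sketched in a few lines of Java. This is only an illustration, not the attached patch and not the real Balancer code: the class ReplicaLocationSketch, its simplified StorageGroup type, and both method names are hypothetical. It assumes the flaw is that the location check compares (datanode, storage type) pairs rather than datanodes, as the example suggests.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; these are NOT the real Hadoop classes.
class ReplicaLocationSketch {

    /** Stand-in for a StorageGroup: one storage type on one datanode. */
    static final class StorageGroup {
        final String datanode;
        final String storageType;

        StorageGroup(String datanode, String storageType) {
            this.datanode = datanode;
            this.storageType = storageType;
        }
    }

    /** Storage-level check: true only if a replica is on exactly this (datanode, storage type) pair. */
    static boolean isLocatedOnStorage(List<StorageGroup> replicas, StorageGroup target) {
        for (StorageGroup r : replicas) {
            if (r.datanode.equals(target.datanode) && r.storageType.equals(target.storageType)) {
                return true;
            }
        }
        return false;
    }

    /** Node-level check: true if any replica is on the target's datanode, whatever the storage type. */
    static boolean isLocatedOnNode(List<StorageGroup> replicas, StorageGroup target) {
        for (StorageGroup r : replicas) {
            if (r.datanode.equals(target.datanode)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Replicas on (DN0, SSD) and (DN1, DISK); candidate target is (DN1, SSD).
        List<StorageGroup> replicas = Arrays.asList(
                new StorageGroup("DN0", "SSD"),
                new StorageGroup("DN1", "DISK"));
        StorageGroup target = new StorageGroup("DN1", "SSD");

        // The storage-level check misses the replica already on DN1.
        System.out.println("storage-level: " + isLocatedOnStorage(replicas, target));
        // The node-level check sees it, so the move would not be scheduled.
        System.out.println("node-level: " + isLocatedOnNode(replicas, target));
    }
}
```

      With the storage-level check, (DN1, SSD) looks like a valid target because no replica sits on that exact storage; the node-level check rules it out because DN1 already holds a replica on DISK.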
      --------------
      UPDATE (thanks to szetszwo for pointing this out):

      This bug will NOT cause two replicas to end up on the same node after running the Balancer, because the Datanode rejects the move.

      Instead, we see many ERROR entries like the following when running the test:

      2015-04-27 10:08:15,809 ERROR datanode.DataNode (DataXceiver.java:run(277)) - host1.foo.com:59537:DataXceiver error processing REPLACE_BLOCK operation  src: /127.0.0.1:52532 dst: /127.0.0.1:59537
      org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-264794661-9.96.1.34-1430100451121:blk_1073741825_1001 already exists in state FINALIZED and thus cannot be created.
          at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1447)
          at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:186)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.replaceBlock(DataXceiver.java:1158)
          at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReplaceBlock(Receiver.java:229)
          at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:77)
          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
          at java.lang.Thread.run(Thread.java:722)
      

      In the test, the Balancer runs 5 to 20 iterations before it exits, which is inefficient.
      The Balancer should not schedule such a move in the first place, even though the move would fail anyway. In the test, it should exit after 5 iterations.

      Attachments

        1. HDFS-8204.001.patch
          3 kB
          Walter Su
        2. HDFS-8204.002.patch
          5 kB
          Walter Su
        3. HDFS-8204.003.patch
          5 kB
          Walter Su

            People

              walter.k.su Walter Su
              Votes: 0
              Watchers: 7