  1. Hadoop HDFS
  2. HDFS-16333

Fix balancer bug when transferring an EC block

Details

    Description

      We set the EC policy to (6+3), and some nodes were being decommissioned when we ran the balancer.

      While the balancer was running, we found many error logs (see the attached screenshots).

      Node A tried to transfer an EC block to node B, but the block was not on node A. We used the fsck command to check the block's status.

      The relevant code is in the getBlockList function of the Dispatcher.

       

      Assume that the locations of an EC block in storageGroupMap look like this:

      indices:[0, 1, 2, 3, 4, 5, 6, 7, 8]

      node:[a, b, c, d, e, f, g, h, i]

      After the decommission operation, the internal block at indices[1] was moved to another node:

      indices:[0, 1, 2, 3, 4, 5, 6, 7, 8]

      node:[a, j, c, d, e, f, g, h, i]

      The location of indices[1] changed from node b to node j.

       

      When the balancer gets the block locations, it checks each node against storageGroupMap.

      If a node is not found in storageGroupMap, it is not added to the block locations.

      In this case, node j is not added to the block locations, but the indices are not updated.

      As a result, the block locations may look like this:

      indices:[0, 1, 2, 3, 4, 5, 6, 7, 8]

      block.location:[a, c, d, e, f, g, h, i]

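      The nodes in block.location no longer match their indices. From position 1 onward, each node is paired with the index of the preceding internal block (for example, node c is paired with index 1).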
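      The mismatch described above can be reproduced with a small stand-alone model. This is only a sketch, not the actual Dispatcher code: the class EcLocationMismatch, the method filterLocations, and the use of plain strings for nodes are all hypothetical, and the set named known stands in for storageGroupMap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the mismatch; not the actual Dispatcher code.
class EcLocationMismatch {

  // Keeps only the nodes the balancer knows about, mimicking the lookup
  // against storageGroupMap. The indices array is deliberately not touched.
  static List<String> filterLocations(List<String> reported, Set<String> known) {
    List<String> kept = new ArrayList<>();
    for (String node : reported) {
      if (known.contains(node)) {
        kept.add(node);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Locations after decommission: indices[1] now lives on node j.
    List<String> reported =
        Arrays.asList("a", "j", "c", "d", "e", "f", "g", "h", "i");
    int[] indices = {0, 1, 2, 3, 4, 5, 6, 7, 8};
    // Assumption: node j is missing from the map, as in the description.
    Set<String> known = new HashSet<>(
        Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i"));

    List<String> locations = filterLocations(reported, known);
    // Position 1 now holds node c, but indices[1] is still 1, so the
    // balancer would look for internal block 1 on node c.
    System.out.println(locations.get(1) + " is paired with index " + indices[1]);
  }
}
```

      Running main prints "c is paired with index 1", which corresponds to the failed transfer: the balancer asks a node for an internal block that the node does not hold.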

       

      Solution:

      We should update the indices so that they match the remaining nodes:

      indices:[0, 2, 3, 4, 5, 6, 7, 8]

      block.location:[a, c, d, e, f, g, h, i]
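      One way to keep the two arrays consistent is to filter them in lockstep. The following is a minimal sketch under stated assumptions, not the actual patch: the class EcIndexAlignment, the holder Aligned, and the method align are hypothetical names, nodes are modeled as strings, and the set named known again stands in for storageGroupMap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed fix; not the actual patch.
class EcIndexAlignment {

  // Holds the surviving nodes together with their internal block indices.
  static final class Aligned {
    final List<String> nodes = new ArrayList<>();
    final List<Integer> indices = new ArrayList<>();
  }

  // Filters nodes and indices together, so position k in one list always
  // describes position k in the other.
  static Aligned align(List<String> reported, int[] indices, Set<String> known) {
    if (reported.size() != indices.length) {
      throw new IllegalArgumentException("nodes and indices differ in length");
    }
    Aligned out = new Aligned();
    for (int k = 0; k < indices.length; k++) {
      if (known.contains(reported.get(k))) {
        out.nodes.add(reported.get(k));
        out.indices.add(indices[k]);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> reported =
        Arrays.asList("a", "j", "c", "d", "e", "f", "g", "h", "i");
    int[] indices = {0, 1, 2, 3, 4, 5, 6, 7, 8};
    // Assumption: node j is missing from the map, as in the description.
    Set<String> known = new HashSet<>(
        Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i"));

    Aligned result = align(reported, indices, known);
    System.out.println("indices: " + result.indices);
    System.out.println("block.location: " + result.nodes);
  }
}
```

      With the example data, this prints indices [0, 2, 3, 4, 5, 6, 7, 8] and block.location [a, c, d, e, f, g, h, i], matching the expected result above. Dropping the index together with its node means a skipped node can never shift the pairing of the nodes that follow it.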

      Attachments

        1. image-2021-11-18-17-25-13-089.png
          183 kB
          qinyuren
        2. image-2021-11-18-17-25-50-556.png
          188 kB
          qinyuren
        3. image-2021-11-18-17-28-03-155.png
          25 kB
          qinyuren

          People

            qinyuren
            Votes: 0
            Watchers: 3

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 6h 40m