[HDFS-10716] In Balancer, the target task should be removed when its size < 0. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha1
Component/s: balancer & mover
Labels: None
Hadoop Flags: Reviewed

Description

In HDFS-10602, we found a failing case in which the balancer keeps moving data between the same 2 DNs, so the balancer never finishes. While debugging this, I found what seems to be a bug in how pending blocks are chosen in Dispatcher.Source.chooseNextMove.

The code:

    private PendingMove chooseNextMove() {
      for (Iterator<Task> i = tasks.iterator(); i.hasNext();) {
        final Task task = i.next();
        final DDatanode target = task.target.getDDatanode();
        final PendingMove pendingBlock = new PendingMove(this, task.target);
        if (target.addPendingBlock(pendingBlock)) {
          // target is not busy, so do a tentative block allocation
          if (pendingBlock.chooseBlockAndProxy()) {
            long blockSize = pendingBlock.reportedBlock.getNumBytes(this);
            incScheduledSize(-blockSize);
            task.size -= blockSize;
            // If the number of bytes left to move drops below 0, the task
            // should also be removed, but this check only handles exactly 0.
            if (task.size == 0) {
              i.remove();
            }
            return pendingBlock;
            //...

The value of task.size is assigned in Balancer#matchSourceWithTargetToMove:

    long size = Math.min(source.availableSizeToMove(), target.availableSizeToMove());
    final Task task = new Task(target, size);

This value depends on the source and target nodes, and it will not always be reduced to exactly 0 while pending blocks are chosen. When that happens, the balancer keeps moving data to the target node even after the number of bytes left to move has dropped below 0. In the end this makes the data in the cluster imbalanced again, which leads to another balancer iteration.
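
The arithmetic behind this can be sketched as follows. This is only an illustration: the class and method names are hypothetical and not part of the balancer code, and the 100-byte block size is an assumption chosen to match the debug record in this issue (before: 33, after: -67).

```java
class TaskSizeOvershoot {
    /** Remaining task size after a number of moves of blockSize bytes each. */
    static long remainingAfter(long taskSize, long blockSize, int moves) {
        return taskSize - blockSize * moves;
    }

    /**
     * Subtracts blockSize once per scheduled move and reports whether the
     * remaining size is ever exactly 0, which is what an "== 0" check needs.
     */
    static boolean everReachesZero(long taskSize, long blockSize, int moves) {
        long remaining = taskSize;
        for (int m = 0; m < moves; m++) {
            remaining -= blockSize;
            if (remaining == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 33 bytes left to move and an assumed 100-byte block.
        System.out.println(remainingAfter(33, 100, 1));
        // The remaining size jumps from 33 to -67 and never equals 0.
        System.out.println(everReachesZero(33, 100, 5));
        // A size that is an exact multiple of the block size does reach 0.
        System.out.println(everReachesZero(200, 100, 5));
    }
}
```

Only when the task size happens to be an exact multiple of the block sizes chosen does the remaining size land on 0; otherwise it steps over 0 and stays negative.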

We can optimize this as the title says: remove the target task once its size drops below 0. I think this can speed up the balancer.
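
A minimal sketch of the proposed check, not the actual patch. The `Task` class here is a hypothetical stand-in that keeps only the remaining size, and the block size is passed in directly instead of being chosen by `chooseBlockAndProxy()`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class TaskRemovalSketch {
    /** Hypothetical stand-in for the balancer's Task: only the remaining size. */
    static final class Task {
        long size;
        Task(long size) { this.size = size; }
    }

    /**
     * Schedules one block of blockSize bytes against the first task and removes
     * the task once its remaining size is used up (zero or negative).
     * The real method moves on to the next task when the target is busy;
     * this sketch assumes the first task always accepts the move.
     */
    static boolean scheduleOneMove(List<Task> tasks, long blockSize) {
        for (Iterator<Task> i = tasks.iterator(); i.hasNext();) {
            Task task = i.next();
            task.size -= blockSize;
            if (task.size <= 0) { // proposed check; the quoted code uses == 0
                i.remove();
            }
            return true;
        }
        return false; // no tasks left, nothing to schedule
    }

    /** Counts the moves scheduled before the task list is empty, capped at limit. */
    static int movesUntilDone(long taskSize, long blockSize, int limit) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task(taskSize));
        int moves = 0;
        while (moves < limit && scheduleOneMove(tasks, blockSize)) {
            moves++;
        }
        return moves;
    }

    public static void main(String[] args) {
        System.out.println(movesUntilDone(33, 100, 10));  // 1
        System.out.println(movesUntilDone(250, 100, 10)); // 3
    }
}
```

With the `<= 0` check the task is dropped as soon as its budget is used up, so no further blocks are scheduled to that target. With `== 0` the same inputs would keep scheduling moves until the cap is reached.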

See the attached log for the failing case, or see HDFS-10602. Focus on the records of changes to the scheduled size of the target node. These are lines I added for debugging, like this:

2016-08-01 16:51:57,492 [pool-51-thread-1] INFO  balancer.Dispatcher (Dispatcher.java:chooseNextMove(799)) - TargetNode: 58794, bytes scheduled to move, after: -67, before: 33

Attachments

failing.log (49 kB, 02/Aug/16 13:02, Yiqun Lin)
HDFS-10716.001.patch (0.8 kB, 02/Aug/16 13:09, Yiqun Lin)

Issue Links

is duplicated by: HDFS-10859 TestBalancer#testUnknownDatanodeSimple and testBalancerWithKeytabs are flaky in branch-2.7 (Resolved)

relates to: HDFS-10602 TestBalancer runs timeout intermittently (Resolved)

People

Assignee: Yiqun Lin

Reporter: Yiqun Lin

Votes: 0

Watchers: 6

Dates

Created: 02/Aug/16 12:59

Updated: 05/Jan/17 01:19

Resolved: 04/Aug/16 16:56