
Key: HADOOP-1912
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Hairong Kuang
Reporter: Hairong Kuang
Votes: 0
Watchers: 0
Hadoop Common

Datanode should support block replacement

Created: 17/Sep/07 10:18 PM   Updated: 08/Jul/09 04:42 PM
Component/s: None
Affects Version/s: 0.14.1
Fix Version/s: 0.16.0

Time Tracking:
Not Specified

File Attachments (text files, all licensed for inclusion in ASF works):
  Name            Uploaded             Author         Size
  replace.patch   2007-09-17 11:16 PM  Hairong Kuang  66 kB
  replace1.patch  2007-09-27 10:00 PM  Hairong Kuang  28 kB
  replace2.patch  2007-10-16 10:30 PM  Hairong Kuang  35 kB
  replace3.patch  2007-10-19 07:08 PM  Hairong Kuang  37 kB
  replace4.patch  2007-10-23 04:36 AM  Hairong Kuang  38 kB
  replace5.patch  2007-10-23 11:13 PM  Hairong Kuang  38 kB
  replace6.patch  2007-10-25 06:01 PM  Hairong Kuang  38 kB
Issue Links:
Dependants

Resolution Date: 01/Nov/07 06:09 PM


Description
This jira covers the Data Node's support for rebalancing (HADOOP-1652). When a balancer decides to move a block B from source S to destination D, it also chooses a proxy source PS, which contains a replica of B, to speed up the block copy. The block replacement is carried out in the following steps:
1. A block copy command is sent to datanode PS in the format "OP_BLOCK_COPY <block_id_of_B> <source S> <destination D>". It requests PS to copy B to datanode D.
2. PS then transfers block B to datanode D with a block replacement command in the format "OP_BLOCK_REPLACEMENT <block_id_of_B> <source S> <data_of_B>".
3. Datanode D writes block B to its disk and then sends the namenode a blockReceived RPC, informing it that block B has been received and asking it to delete a replica of B from source S if there is an excess replica.
4. The namenode then adds datanode D to block B's map and removes an excess replica of B, preferring the one on datanode S.
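
The four steps above can be sketched as a small in-memory simulation. This is only an illustration of the message flow, not the code in the patch: the classes NameNode and DataNode below, their method names, and the use of strings as node identifiers are all assumptions made for the example, and blocks are kept in maps rather than sent over sockets or written to disk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory sketch of the four steps; every class and method name here is hypothetical. */
class BlockReplacementSketch {

    /** Toy namenode: maps each block id to the names of the datanodes holding a replica. */
    static class NameNode {
        final int replication;
        final Map<Long, Set<String>> blockMap = new HashMap<>();
        final Map<String, DataNode> nodes = new HashMap<>();

        NameNode(int replication) { this.replication = replication; }

        /** Step 4: record the new replica, then drop one excess replica, preferring the hint. */
        void blockReceived(long blockId, String receiver, String delHint) {
            Set<String> holders = blockMap.computeIfAbsent(blockId, k -> new TreeSet<>());
            holders.add(receiver);
            if (holders.size() <= replication) {
                return; // no excess replica, nothing to delete
            }
            String victim = null;
            if (holders.contains(delHint)) {
                victim = delHint; // honor the hint: delete from source S
            } else {
                for (String h : holders) { // otherwise pick any holder except the receiver
                    if (!h.equals(receiver)) { victim = h; break; }
                }
            }
            if (victim != null) {
                holders.remove(victim);
                nodes.get(victim).blocks.remove(blockId);
            }
        }
    }

    /** Toy datanode: stores block contents in a map instead of on disk. */
    static class DataNode {
        final String name;
        final NameNode nameNode;
        final Map<Long, byte[]> blocks = new HashMap<>();

        DataNode(String name, NameNode nameNode) {
            this.name = name;
            this.nameNode = nameNode;
            nameNode.nodes.put(name, this);
        }

        /** Setup helper: place an initial replica and register it with the namenode. */
        void addInitialReplica(long blockId, byte[] data) {
            blocks.put(blockId, data);
            nameNode.blockMap.computeIfAbsent(blockId, k -> new TreeSet<>()).add(name);
        }

        /** Steps 1-2, run on PS: handle OP_BLOCK_COPY by sending the data to D. */
        void copyBlock(long blockId, String source, DataNode destination) {
            byte[] data = blocks.get(blockId);
            if (data == null) {
                throw new IllegalStateException(name + " has no replica of block " + blockId);
            }
            destination.replaceBlock(blockId, source, data.clone());
        }

        /** Step 3, run on D: handle OP_BLOCK_REPLACEMENT, store the block, notify the namenode. */
        void replaceBlock(long blockId, String source, byte[] data) {
            blocks.put(blockId, data);
            nameNode.blockReceived(blockId, name, source);
        }
    }

    public static void main(String[] args) {
        NameNode nn = new NameNode(2);
        DataNode s = new DataNode("S", nn);
        DataNode ps = new DataNode("PS", nn);
        DataNode d = new DataNode("D", nn);
        s.addInitialReplica(1L, new byte[] {1, 2, 3});
        ps.addInitialReplica(1L, new byte[] {1, 2, 3});
        ps.copyBlock(1L, "S", d); // balancer asks PS to move block 1 from S to D
        System.out.println("Holders of block 1: " + nn.blockMap.get(1L));
    }
}
```

In this sketch the replica on S is deleted only after D has stored the block and reported it, so the replica count never drops below the target during the move.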

In addition, each data node has a limited bandwidth for rebalancing. The default value for the bandwidth is 5MB/s. Throttling is done at both the source and destination sides. Each data node limits the maximum number of concurrent data transfers (both sending and receiving) for rebalancing to 5, so in the worst case, when all 5 transfers are active, each data transfer is limited to 1MB/s. Each sender and receiver has a Throttler. The primary method of the class is "throttle( int numOfBytes )". The parameter numOfBytes is the total number of bytes that the caller has sent or received since the last call to throttle. The method calculates the caller's I/O rate. If the rate exceeds the bandwidth limit, it sleeps to slow down the data transfer. After it wakes up, it adjusts its bandwidth limit if the number of concurrent data transfers has changed.
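
A minimal sketch of the per-transfer throttling described above follows. It is not the Throttler from the patch: only the method name throttle(int numOfBytes) comes from the description, while the class name ThrottlerSketch, the helpers perTransferLimit and sleepMillis, the shared AtomicInteger counter, and the one-second accounting period are assumptions made for illustration. As a simplification, the limit is recomputed from the transfer count on every call rather than specifically after waking up.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of a per-transfer throttler; names other than throttle(int) are hypothetical. */
class ThrottlerSketch {
    private final long totalBytesPerSec;          // node-wide rebalancing budget, e.g. 5 MB/s
    private final AtomicInteger activeTransfers;  // concurrent transfers sharing that budget
    private long periodStart = System.currentTimeMillis();
    private long bytesInPeriod = 0;

    ThrottlerSketch(long totalBytesPerSec, AtomicInteger activeTransfers) {
        this.totalBytesPerSec = totalBytesPerSec;
        this.activeTransfers = activeTransfers;
    }

    /** Share of the node-wide budget given to one transfer (at least 1 byte/s). */
    static long perTransferLimit(long totalBytesPerSec, int transfers) {
        return Math.max(1, totalBytesPerSec / Math.max(1, transfers));
    }

    /** Milliseconds to sleep so that bytes / (elapsed + sleep) does not exceed the limit. */
    static long sleepMillis(long bytes, long elapsedMillis, long limitBytesPerSec) {
        long minMillis = bytes * 1000 / limitBytesPerSec; // time these bytes take at the limit
        return Math.max(0, minMillis - elapsedMillis);
    }

    /** Called with the number of bytes sent or received since the previous call. */
    synchronized void throttle(int numOfBytes) throws InterruptedException {
        bytesInPeriod += numOfBytes;
        // Re-read the transfer count on every call so the limit follows changes in concurrency.
        long limit = perTransferLimit(totalBytesPerSec, activeTransfers.get());
        long elapsed = System.currentTimeMillis() - periodStart;
        long sleep = sleepMillis(bytesInPeriod, elapsed, limit);
        if (sleep > 0) {
            Thread.sleep(sleep); // the caller was faster than its share: slow it down
        }
        // Start a fresh accounting period roughly once per second.
        long now = System.currentTimeMillis();
        if (now - periodStart >= 1000) {
            periodStart = now;
            bytesInPeriod = 0;
        }
    }
}
```

With the default 5MB/s budget and 5 active transfers, perTransferLimit gives each transfer 1MB/s, which matches the worst case stated in the description.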



Hairong Kuang added a comment - 16/Oct/07 10:29 PM
The old patch was outdated. So here is a new one.

Hairong Kuang added a comment - 19/Oct/07 07:08 PM
This patch makes changes to the implementation of I/O throttling as suggested by Raghu. I also added a junit test for testing Throttling.

Raghu Angadi added a comment - 20/Oct/07 01:29 AM
Pretty much looks fine.
  1. I could not find throttling test.
  2. Regd throttler : each connection is individually throttled. Ideally, I think we should use one throttler that is shared by all connections. This will make sure we use up the allowed b/w whenever possible. In the current scheme, the transfer between A & B cannot use the extra b/w if another connection between B & C cannot use its quota (because C has many connections). Also, when the throttler is shared, small blocks cannot slip below the radar.
    1. Please make ThrottlerBase package private so that it can be used by HADOOP-2012
  3. minor : in FSNamesystem.java :
    ///
    if (priSet.contains(delNodeHint)) {
      cur = delNodeHint;
    } else if (addedNode != null && !priSet.contains(addedNode)) {
      cur = delNodeHint;
    }
    /// Can be replaced by
    if (addedNode != null || priSet.contains(delNodeHint)) {
      cur = delNodeHint;
    }
  4. minor : it increases allocation in addBlock() in FSNamesystem.java. Is the current implementation more correct?

Hairong Kuang added a comment - 22/Oct/07 07:18 PM
Thanks Raghu!

1. Throttling test is in TestBlockReplacement.

2. Regd throttler, yes I totally agree with you. I will make ThrottlerBase package private, and it would be nice if a throttler could be shared by multiple threads. Let's see how this could be done.

3. For comment 3, your change is not exactly the same as the logic in the current code. If you'd like to merge the two checks into one, I will change the code to be if(priSet.contains(delNodeHint) || ( addedNode != null && !priSet.contains(addedNode))) { cur = delNodeHint;}

4. For addBlock, the current code may have null locations in machineSet, which causes a serialization error, so I use an ArrayList first and then convert it into an array when constructing the result.
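
To illustrate point 3, the sketch below compares the three conditions on one input where they disagree. It uses strings as stand-ins for the datanode objects, and the class and method names (DelHintConditionSketch, original, merged, proposed) are hypothetical; only the boolean expressions are taken from the comments above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Compares the delete-hint conditions from the review; strings stand in for datanodes. */
class DelHintConditionSketch {

    /** The two-branch check quoted from FSNamesystem.java in the review. */
    static boolean original(Set<String> priSet, String delNodeHint, String addedNode) {
        if (priSet.contains(delNodeHint)) {
            return true;
        } else if (addedNode != null && !priSet.contains(addedNode)) {
            return true;
        }
        return false;
    }

    /** The equivalent single condition offered in the reply. */
    static boolean merged(Set<String> priSet, String delNodeHint, String addedNode) {
        return priSet.contains(delNodeHint)
            || (addedNode != null && !priSet.contains(addedNode));
    }

    /** The simplification suggested in the review, which is not equivalent. */
    static boolean proposed(Set<String> priSet, String delNodeHint, String addedNode) {
        return addedNode != null || priSet.contains(delNodeHint);
    }

    public static void main(String[] args) {
        // The added node is in priSet but the hinted node is not: the checks disagree.
        Set<String> priSet = new HashSet<>(Arrays.asList("D"));
        System.out.println("original: " + original(priSet, "S", "D"));
        System.out.println("merged:   " + merged(priSet, "S", "D"));
        System.out.println("proposed: " + proposed(priSet, "S", "D"));
    }
}
```

The disagreement arises whenever addedNode is non-null and already in priSet while delNodeHint is not: the original and merged checks leave the hint unused, whereas the suggested simplification would select it.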


Raghu Angadi added a comment - 22/Oct/07 10:15 PM
> 3. For comment 3, your change is not exactly the same as the logic in the current code. If you'd merge two checks into one, I will change the code to be [...]

I see now. I misread the code. No need to merge the conditions. thanks.


Hairong Kuang added a comment - 23/Oct/07 04:36 AM
Thanks Raghu for providing an implementation of a Throttler that can be shared by multiple threads. This new patch incorporates his algorithm and a couple of minor changes.

Raghu Angadi added a comment - 23/Oct/07 05:35 AM
+1 code review.

Raghu Angadi added a comment - 23/Oct/07 09:57 PM
There is a warning in eclipse for the bandwidthPerSec member of Throttler since it is not read anywhere.

Hairong Kuang added a comment - 23/Oct/07 11:13 PM
This patch removes field bandwidthPerSec in Throttler.

Hadoop QA added a comment - 25/Oct/07 04:54 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12368262/replace5.patch
against trunk revision r588083.

@author +1. The patch does not contain any @author tags.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new compiler warnings.

findbugs -1. The patch appears to introduce 1 new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/981/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/981/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/981/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/981/console

This message is automatically generated.


Hadoop QA added a comment - 26/Oct/07 03:16 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12368396/replace6.patch
against trunk revision r588341.

@author +1. The patch does not contain any @author tags.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new compiler warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests -1. The patch failed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/998/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/998/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/998/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/998/console

This message is automatically generated.


Hadoop QA added a comment - 01/Nov/07 02:41 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12368396/replace6.patch
against trunk revision r590273.

@author +1. The patch does not contain any @author tags.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new compiler warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1046/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1046/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1046/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1046/console

This message is automatically generated.


dhruba borthakur added a comment - 01/Nov/07 06:09 PM
I just committed this. Thanks Hairong!

Hudson added a comment - 02/Nov/07 07:08 PM