Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
HDFS-7270 introduces a mechanism for a DataNode to signal congestion. DFSClient should be able to recognize these signals and back off.
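To make the client-side half of this concrete, here is a minimal sketch of how a writer might pick out congested nodes from the per-node replies in a pipeline acknowledgement. The `Signal` enum, the `CongestionScan` class, and the use of plain strings for node names are all hypothetical and chosen for illustration; they are not the HDFS wire format or the API touched by the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: scans per-node ack replies for a congestion signal. */
final class CongestionScan {
  /** Hypothetical per-node signal; not the actual HDFS representation. */
  enum Signal { NOT_CONGESTED, CONGESTED }

  /**
   * Returns the names of the pipeline nodes whose reply carries CONGESTED.
   * pipeline.get(i) is assumed to be the node that produced replies.get(i).
   */
  static List<String> congestedNodes(List<String> pipeline, List<Signal> replies) {
    if (pipeline.size() != replies.size()) {
      throw new IllegalArgumentException("pipeline and replies must have equal length");
    }
    List<String> congested = new ArrayList<>();
    for (int i = 0; i < replies.size(); i++) {
      if (replies.get(i) == Signal.CONGESTED) {
        congested.add(pipeline.get(i));
      }
    }
    return congested;
  }
}
```

A caller would back off only when the returned list is non-empty, and proceed at full speed otherwise.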
Attachments
- HDFS-8008.000.patch (9 kB, Haohui Mai)
- HDFS-8008.001.patch (9 kB, Haohui Mai)
- HDFS-8008.002.patch (9 kB, Haohui Mai)
- HDFS-8008.003.patch (9 kB, Haohui Mai)
Issue Links
- causes: HDFS-16293 Client sleeps and holds 'dataQueue' when DataNodes are congested (Resolved)
- is related to: HDFS-7270 Add congestion signaling capability to DataNode write protocol (Closed)
Activity
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12708532/HDFS-8008.000.patch
against trunk revision e428fea.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10132//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/10132//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10132//console
This message is automatically generated.
This patch looks good, Haohui. I have just a few small comments.
If backOffIfNecessary throws InterruptedException, then this will be passed along to setLastException and eventually propagate to the caller. This is inconsistent with existing interruption handling logic in this loop, which catches InterruptedException and then allows execution to proceed without propagating the exception. (See lines 383 and 441.) Shall we do the same here?
On a side note, there is a lot of swallowing of InterruptedException in this code. It probably ought to do Thread.currentThread().interrupt(), but that's not related to your current patch.
Minor nitpick: the hyperlink got truncated in this comment.
/**
 * This function sleeps for a certain amount of time when the writing
 * pipeline is congested. The function calculates the time based on a
 * decorrelated filter which is available at {@link http://www
 * .com/2015/03/backoff.html}.
 */
Thanks!
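The interruption-handling suggestion in the review above (catch InterruptedException rather than propagate it, and ideally restore the interrupt status instead of swallowing it) can be sketched as follows. This is an illustrative helper, not code from the patch; the class and method names are hypothetical.

```java
/** Illustrative only: a sleep that neither propagates nor swallows interruption. */
final class InterruptAwareSleep {
  /**
   * Sleeps for up to the given number of milliseconds.
   * Returns true if the sleep completed, false if it was interrupted.
   * On interruption the thread's interrupt status is set again so that
   * code further up the stack can still observe it.
   */
  static boolean sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      // Thread.sleep clears the flag when it throws; restore it.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

With this shape, the surrounding loop keeps running as the existing handling does, while a later check of `Thread.currentThread().isInterrupted()` still sees the request.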
Thanks Chris for the reviews. I uploaded v2 patch to address the comments.
v2 looks good. There is just one more little problem in that hyperlink in the comment. I'll be +1 pending Jenkins run after that's corrected. Thanks, Haohui.
* @see <a href="http://www.com/2015/03/backoff.html">http://www
* .com/2015/03/backoff.html</a>.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12708771/HDFS-8008.002.patch
against trunk revision 796fb26.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10147//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10147//console
This message is automatically generated.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12708771/HDFS-8008.002.patch
against trunk revision a3a96a0.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10149//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10149//console
This message is automatically generated.
HDFS-7471 tracks the failure in TestDatanodeManager and HDFS-7576 tracks the failure in TestPipelinesFailover. These are unrelated to the current patch.
I've committed the patch to trunk and branch-2. Thanks for the reviews.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12708786/HDFS-8008.003.patch
against trunk revision c94d594.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.web.TestWebHDFSXAttr
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10152//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10152//console
This message is automatically generated.
SUCCESS: Integrated in Hadoop-trunk-Commit #7490 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7490/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
FAILURE: Integrated in Hadoop-Yarn-trunk #885 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/885/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/151/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
FAILURE: Integrated in Hadoop-Hdfs-trunk #2083 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2083/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #142 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/142/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/151/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
FAILURE: Integrated in Hadoop-Mapreduce-trunk #2101 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2101/)
HDFS-8008. Support client-side back off when the datanodes are congested. Contributed by Haohui Mai. (wheat9: rev 6ccf4fbf8a8374c289370f67b26ac05abad30ebc)
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
This patch implements the decorrelated exponential backoff mechanism described in http://www.awsarchitectureblog.com/2015/03/backoff.html. The client backs off when it discovers that at least one DataNode in the pipeline is in a congested state.
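For readers unfamiliar with the technique, the following is a minimal sketch of decorrelated-jitter backoff in its commonly stated form: each delay is drawn uniformly between a base value and three times the previous delay, then clamped to a cap. It is not the code from the patch; the class name, method names, and any base and cap values a caller passes in are assumptions made for illustration.

```java
import java.util.Random;

/**
 * Illustrative only: decorrelated-jitter backoff.
 * next = min(cap, uniform(base, 3 * previous))
 */
final class DecorrelatedBackoff {
  private final long baseMs;
  private final long capMs;
  private final Random random;
  private long previousMs;

  DecorrelatedBackoff(long baseMs, long capMs, Random random) {
    if (baseMs <= 0 || capMs < baseMs) {
      throw new IllegalArgumentException("require 0 < baseMs <= capMs");
    }
    this.baseMs = baseMs;
    this.capMs = capMs;
    this.random = random;
    this.previousMs = baseMs;
  }

  /** Returns the next delay in milliseconds and remembers it for the next call. */
  long nextDelayMs() {
    long upper = Math.max(baseMs, previousMs * 3);
    long span = upper - baseMs + 1;
    // nextDouble() is in [0, 1), so the candidate is uniform in [baseMs, upper].
    long candidate = baseMs + (long) (random.nextDouble() * span);
    previousMs = Math.min(capMs, candidate);
    return previousMs;
  }

  /** Forgets the history, e.g. once no node reports congestion any more. */
  void reset() {
    previousMs = baseMs;
  }
}
```

Because each delay depends on the previous one rather than on an attempt counter, the delays of independent clients drift apart instead of staying synchronized, and the cap bounds the worst-case wait.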