Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The DataNode should signal congestion (i.e. "I'm too busy") in the PipelineAck using the mechanism introduced in HDFS-7270.

          Activity

          wheat9 Haohui Mai added a comment -

          The v0 patch allows the datanode to signal congestion when the system load exceeds 1.5 times the number of available cores, which means that at least 50% more processes are waiting to be scheduled by the OS than there are cores. We found it an effective metric for signaling that the DNs are undergoing a massive amount of writes.
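
A minimal sketch of the threshold check described above, assuming the load is read from the JVM's `OperatingSystemMXBean` and the core count from `Runtime.availableProcessors()`. The class name `CongestionSketch`, the `Ecn` enum, and the `classify` helper are hypothetical and are not taken from the patch itself.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical sketch; names are illustrative, not the committed DataNode code. */
final class CongestionSketch {
  /** Stand-in for the congestion states a DataNode could report in an ack. */
  enum Ecn { DISABLED, SUPPORTED, CONGESTED }

  /** Threshold from the comment above: congested when load > 1.5 x cores. */
  static final double CONGESTION_RATIO = 1.5;

  /** Pure decision function, so the threshold can be exercised without the OS. */
  static Ecn classify(boolean ecnEnabled, double loadAverage, int cores) {
    if (!ecnEnabled) {
      return Ecn.DISABLED;
    }
    // getSystemLoadAverage() returns a negative value when the load is
    // unavailable; this sketch assumes "not congested" in that case.
    if (loadAverage < 0) {
      return Ecn.SUPPORTED;
    }
    return loadAverage > cores * CONGESTION_RATIO ? Ecn.CONGESTED : Ecn.SUPPORTED;
  }

  /** Samples the current system load and core count, then classifies them. */
  static Ecn current(boolean ecnEnabled) {
    double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    int cores = Runtime.getRuntime().availableProcessors();
    return classify(ecnEnabled, load, cores);
  }
}
```

With 8 cores the threshold is a load average of 12, so `classify(true, 13.0, 8)` reports congestion while `classify(true, 12.0, 8)` does not.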

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12708538/HDFS-8009.000.patch
          against trunk revision e428fea.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The test build failed in hadoop-hdfs-project/hadoop-hdfs

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10133//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10133//console

          This message is automatically generated.

          cnauroth Chris Nauroth added a comment -

          +1 for the patch. Thank you, Haohui.

          cnauroth Chris Nauroth added a comment -

          The test failure appears to be unrelated. It passed fine on my local machine.

          wheat9 Haohui Mai added a comment -

          I've committed the patch to trunk and branch-2. Thanks Chris for the review.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #7486 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7486/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #885 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/885/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/151/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2083 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2083/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #142 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/142/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #151 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/151/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2101 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2101/)
          HDFS-8009. Signal congestion on the DataNode. Contributed by Haohui Mai. (wheat9: rev 53471d462c987e67ad73b974646a5560a4b5d424)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

          Add the missing files for HDFS-8009. (wheat9: rev 796fb268710aef8445dc97a04464a0579062f6f5)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java
          mingma Ming Ma added a comment -

          This overall DN congestion functionality seems useful.

          • System load is a useful metric, but I wonder if we plan to take fairness into account, e.g., the DN would ask only the heavy users to back off.
          • Support for read operations. Sometimes DNs can become hotspots because some applications read specific blocks, causing high NIC bandwidth utilization.
          wheat9 Haohui Mai added a comment -

          Both are very good suggestions.

          I believe that in the longer term, congestion signaling can be made pluggable, as the definitions of load and congestion vary across deployments.

          For fairness and reads, I think a somewhat more generalized design is required to address both (1) prioritizing I/O operations and (2) QoS for different users. We can address it in a separate jira.
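
One way the pluggable idea mentioned above could look is sketched below. This is purely illustrative: the `CongestionPolicy` interface and the `LoadAveragePolicy` class are hypothetical names and are not part of the patch or of any Hadoop API. The load source is injected so that a deployment could swap in its own definition of load.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical extension point: each deployment decides what "congested" means. */
interface CongestionPolicy {
  boolean isCongested();
}

/** Hypothetical policy mirroring the load-average rule: load > ratio x cores. */
final class LoadAveragePolicy implements CongestionPolicy {
  private final DoubleSupplier loadSource; // where the load figure comes from
  private final int cores;                 // number of available cores
  private final double ratio;              // e.g. 1.5, as in the v0 patch

  LoadAveragePolicy(DoubleSupplier loadSource, int cores, double ratio) {
    this.loadSource = loadSource;
    this.cores = cores;
    this.ratio = ratio;
  }

  @Override
  public boolean isCongested() {
    return loadSource.getAsDouble() > cores * ratio;
  }
}
```

A different deployment could supply another implementation, for example one based on NIC utilization for read-heavy workloads, without touching the code that consults the policy.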


            People

            • Assignee: wheat9 Haohui Mai
            • Reporter: wheat9 Haohui Mai
            • Votes: 0
            • Watchers: 11
