Hadoop HDFS / HDFS-7742

Favoring a decommissioning node for replication can cause a block to stay under-replicated for long periods

    Details

    • Hadoop Flags: Reviewed

      Description

      When choosing a source node to replicate a block from, a decommissioning node is favored. The reason for the favoritism is that decommissioning nodes aren't servicing any writes, so in theory they are less loaded.

      However, the same selection algorithm also tries to make sure it doesn't get "stuck" on any particular node:

            // switch to a different node randomly
            // this to prevent from deterministically selecting the same node even
            // if the node failed to replicate the block on previous iterations
      

      Unfortunately, the decommissioning check runs before this randomization, so the algorithm can get stuck trying to replicate from a decommissioning node. We've seen this in practice: a decommissioning datanode was failing to replicate a block for many days even though other viable replicas of the block were available.
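      The ordering problem described above can be sketched in a few lines. This is a simplified illustration, not the actual BlockManager code: the class OldSourceSelection, the Node type, and the chooseSource method are hypothetical names, and stream limits and other checks are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified model of the selection order described above.
final class OldSourceSelection {

  // Stand-in for a datanode holding a replica (illustrative only).
  static final class Node {
    final String name;
    final boolean decommissioning;

    Node(String name, boolean decommissioning) {
      this.name = name;
      this.decommissioning = decommissioning;
    }
  }

  static Node chooseSource(List<Node> replicas, Random random) {
    Node src = null;
    for (Node node : replicas) {
      // A decommissioning node always replaces the current choice.
      if (node.decommissioning || src == null) {
        src = node;
        continue;
      }
      // Once a decommissioning node is held, other nodes are skipped,
      // so the random switch below is never reached for them.
      if (src.decommissioning) {
        continue;
      }
      // switch to a different node randomly
      if (random.nextBoolean()) {
        src = node;
      }
    }
    return src;
  }

  public static void main(String[] args) {
    List<Node> replicas = Arrays.asList(
        new Node("dn1", false), new Node("dn2", true), new Node("dn3", false));
    Random random = new Random();
    for (int i = 0; i < 5; i++) {
      System.out.println(chooseSource(replicas, random).name);
    }
  }
}
```

      In this model every call returns dn2, the decommissioning node, regardless of what the random generator produces. If dn2 cannot actually serve the block, nothing in the loop ever moves the choice to dn1 or dn3.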

      Given that we limit the number of streams we'll assign to a given node (default soft limit of 2, hard limit of 4), it doesn't seem like favoring a decommissioning node has significant benefit. That is, when there is significant replication work to do, we'll quickly hit the stream limit of the decommissioning nodes and use other nodes in the cluster anyway; when there isn't significant replication work, then in theory we've got plenty of replication bandwidth available, so choosing a decommissioning node isn't much of a win.
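      The throttling argument can be made concrete with a small model. This is a sketch under stated assumptions, not the real implementation: StreamLimitSketch, underLimit, and assign are hypothetical names, node 0 stands for the decommissioning node and is always tried first, and the soft limit is assumed to be waived only for highest-priority blocks.

```java
// Hypothetical model of per-node replication stream limits.
final class StreamLimitSketch {
  static final int SOFT_LIMIT = 2; // default soft limit quoted above
  static final int HARD_LIMIT = 4; // default hard limit quoted above

  // True if a node with this many pending streams may be given more work.
  // Assumption: only highest-priority blocks may exceed the soft limit.
  static boolean underLimit(int pendingStreams, boolean highestPriority) {
    if (!highestPriority && pendingStreams >= SOFT_LIMIT) {
      return false;
    }
    return pendingStreams < HARD_LIMIT;
  }

  // Assigns normal-priority blocks, always trying node 0 (the favored,
  // decommissioning node) first. Returns the streams assigned per node.
  // Blocks that find no eligible node are left for a later pass.
  static int[] assign(int blocks, int nodes) {
    int[] pending = new int[nodes];
    for (int b = 0; b < blocks; b++) {
      for (int n = 0; n < nodes; n++) {
        if (underLimit(pending[n], false)) {
          pending[n]++;
          break;
        }
      }
    }
    return pending;
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(assign(5, 3)));
  }
}
```

      With five blocks and three nodes the favored node takes only the first two blocks and the rest spill over to the other nodes, which is the point made above: under load the preference stops mattering after a couple of assignments.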

      I see two choices:
      1) Change the algorithm to still favor decommissioning nodes but with some level of randomness that will avoid always selecting the decommissioning node
      2) Remove the favoritism for decommissioning nodes

      I prefer #2. It simplifies the algorithm, and given the other throttles we have in place, I'm not sure there is a significant benefit to selecting decommissioning nodes.

      1. HDFS-7742-v0.patch
        4 kB
        Nathan Roberts

        Activity

        nroberts Nathan Roberts added a comment -

        Attached patch. It favors decommissioning nodes slightly by allowing them to go up to the hard limit; otherwise it does not favor them at all.
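        One way to read this description is sketched below. It is an illustration only and not the contents of the attached patch: PatchedSourceSelection, Node, eligible, and chooseSource are hypothetical names, the limits are the defaults quoted in the description, and block priority handling is omitted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical model of the behavior described in the comment above.
final class PatchedSourceSelection {
  static final int SOFT_LIMIT = 2; // default soft limit from the description
  static final int HARD_LIMIT = 4; // default hard limit from the description

  // Stand-in for a datanode holding a replica (illustrative only).
  static final class Node {
    final String name;
    final boolean decommissioning;
    final int pendingStreams;

    Node(String name, boolean decommissioning, int pendingStreams) {
      this.name = name;
      this.decommissioning = decommissioning;
      this.pendingStreams = pendingStreams;
    }
  }

  // Decommissioning nodes are exempt from the soft limit only;
  // every node must stay under the hard limit.
  static boolean eligible(Node node) {
    if (!node.decommissioning && node.pendingStreams >= SOFT_LIMIT) {
      return false;
    }
    return node.pendingStreams < HARD_LIMIT;
  }

  static Node chooseSource(List<Node> replicas, Random random) {
    Node src = null;
    for (Node node : replicas) {
      if (!eligible(node)) {
        continue;
      }
      if (src == null) {
        src = node;
        continue;
      }
      // The random switch now applies to every eligible node,
      // decommissioning or not.
      if (random.nextBoolean()) {
        src = node;
      }
    }
    return src;
  }

  public static void main(String[] args) {
    List<Node> replicas = Arrays.asList(
        new Node("dn1", false, 0), new Node("dn2", true, 0));
    Random random = new Random();
    for (int i = 0; i < 5; i++) {
      System.out.println(chooseSource(replicas, random).name);
    }
  }
}
```

        In this model a decommissioning node that fails to replicate a block no longer blocks the other replicas, because the choice can move to any eligible node on a later pass.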

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12707876/HDFS-7742-v0.patch
        against trunk revision 05499b1.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.server.namenode.TestMalformedURLs

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10095//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10095//console

        This message is automatically generated.

        szetszwo Tsz Wo Nicholas Sze added a comment -

        +1 patch looks good. Is there a way to add a test?

        nroberts Nathan Roberts added a comment -

        Thanks for the review, Nicholas!

        There is a test in the patch. Are you asking for a specific test case to be added?

        The test failure from the QA bot (TestMalformedURLs) should be unrelated.

        kihwal Kihwal Lee added a comment -

        Thanks for the patch, Nathan, and for the review, Nicholas. I've committed this to trunk through branch-2.7.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7459 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7459/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

        szetszwo Tsz Wo Nicholas Sze added a comment -

        Thanks Nathan. The test in the patch looks good.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/883/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2081 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2081/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #140 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/140/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #149 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/149/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2099 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2099/)
        HDFS-7742. Favoring decommissioning node for replication can cause a block to stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)

        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Sangjin Lee backported this to 2.6.1; the patch applies cleanly.

        I just pushed the commit to 2.6.1 after running compilation and TestBlockManager, which changed in the patch.


          People

          • Assignee: Nathan Roberts (nroberts)
          • Reporter: Nathan Roberts (nroberts)
          • Votes: 0
          • Watchers: 10
