Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.17.0
    • Fix Version/s: 0.17.2
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In HADOOP-3633, the namenode was assigning some datanodes to receive hundreds of blocks in a short period, which caused the datanodes to run out of memory (threads).
      Most of them were from remote rack.

      Looking at the code,

          chooseLocalRack(results.get(1), excludedNodes, blocksize,
                          maxNodesPerRack, results);
      

      was sometimes not choosing the local rack of the writer(source).

      As a result, when a datanode went down, other datanodes on the same rack were receiving a large number of blocks from remote racks.

      1. rereplicationPolicy.patch
        10 kB
        Hairong Kuang
      2. rereplicationPolicy1.patch
        10 kB
        Hairong Kuang

          Activity

          Hairong Kuang added a comment -

          This bug was introduced by HADOOP-2559. The change there works for choosing targets for a new block, but does not work for re-replicating an underreplicated block.

          Hairong Kuang added a comment - edited

          This patch places the third replica on the rack where the source is located when re-replicating a block whose two existing replicas are on two different racks. Since the source and the target are on the same rack, only datanodes on that rack may choose to replicate an underreplicated block to it. Therefore at most twice (rack size - 1) block transfers may happen to a single target within a heartbeat interval.
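
As a rough illustration of the rule described in this comment (not the code in the attached patch), the sketch below picks the rack for a third replica and computes the stated per-target bound. The class name, method names, and rack strings are hypothetical, and the factor of two is taken from the comment as-is.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not the classes touched by the patch. */
class RereplicationSketch {

    /**
     * Picks the rack for a third replica during re-replication.
     * When the two existing replicas sit on different racks, the new replica
     * goes to the rack of the source (the datanode that will send the block).
     * Returns null when that rule does not apply.
     */
    static String chooseRackForThirdReplica(String sourceRack,
                                            List<String> existingReplicaRacks) {
        // The source holds one of the existing replicas, so its rack must be listed.
        if (!existingReplicaRacks.contains(sourceRack)) {
            throw new IllegalArgumentException("source rack holds no replica");
        }
        if (existingReplicaRacks.size() != 2) {
            return null; // the rule only covers the two-existing-replicas case
        }
        String first = existingReplicaRacks.get(0);
        String second = existingReplicaRacks.get(1);
        if (first.equals(second)) {
            return null; // both replicas on one rack: not the case described here
        }
        return sourceRack; // keep the transfer inside the source's rack
    }

    /** Bound quoted in the comment: twice (rack size - 1) transfers per target. */
    static int maxTransfersPerTarget(int rackSize) {
        return 2 * (rackSize - 1);
    }

    public static void main(String[] args) {
        List<String> racks = Arrays.asList("/rackA", "/rackB");
        System.out.println(chooseRackForThirdReplica("/rackA", racks));
        System.out.println(maxTransfersPerTarget(40));
    }
}
```

Because the target is always on the source's rack, only the other nodes of that rack can send to it, which is where the (rack size - 1) term in the bound comes from.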

          Robert Chansler added a comment -

          Need patches for both 17 and 18, if different.

          Lohit Vijayarenu added a comment -

          +1 patch looks good. Should we document this someplace? I see that we missed changing hdfs_design.html

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12385220/rereplicationPolicy.patch
          against trunk revision 674645.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2804/console

          This message is automatically generated.

          Hairong Kuang added a comment -

          Here is a patch that applies to the trunk.

          Hairong Kuang added a comment -

          The test-core and test-patch targets passed on my local machine under trunk, branch 17, and branch 18.

          Here is the test-patch result:
          [exec] +1 overall.

          [exec] +1 @author. The patch does not contain any @author tags.

          [exec] +1 tests included. The patch appears to include 3 new or modified tests.

          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.

          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.

          Hairong Kuang added a comment -

          I've just committed this.

          Hudson added a comment -

          Integrated in Hadoop-trunk #581 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/ )

            People

            • Assignee:
              Hairong Kuang
              Reporter:
              Koji Noguchi