Hadoop HDFS / HDFS-15

Rack replication policy can be violated for over replicated blocks

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.20.3
    • Fix Version/s: 0.20.3, 0.21.0
    • Component/s: None
    • Labels:
      None

      Description

      The HDFS replica placement strategy guarantees that the replicas of a block exist on at least two racks when its replication factor is greater than one. But fsck still reports that the replicas of some blocks end up on one rack.

      The cause of the problem is that decommission and corruption handling check only the block's replication factor, not the rack requirement. When an over-replicated block loses a replica due to decommission, corruption, or a lost heartbeat, the namenode does not take any action to guarantee that the remaining replicas are on different racks.

      1. hdfs-15-b20-1.patch
        34 kB
        Eli Collins
      2. HDFS-15.patch.3
        18 kB
        Jitendra Nath Pandey
      3. HDFS-15.patch.2
        17 kB
        Jitendra Nath Pandey
      4. HDFS-15.patch
        25 kB
        Jitendra Nath Pandey
      5. HDFS-15.6.patch
        19 kB
        Jitendra Nath Pandey
      6. HDFS-15.5.patch
        19 kB
        Jitendra Nath Pandey
      7. HDFS-15.4.patch
        18 kB
        Jitendra Nath Pandey

          Activity

          Hairong Kuang added a comment -

          My proposal is to include both under-replicated blocks and blocks that do not satisfy the rack requirement in the neededReplication queue. The neededReplication queue supports four priorities:
          Priority 0: Blocks that have only one replica;
          Priority 1: Blocks whose replicas are on only one rack;
          Priority 2: Blocks whose number of replicas is no greater than 1/3 of their replication factor;
          Priority 3: All other under-replicated blocks.

          In general we should have a priority 4 that includes those blocks that do not belong to priorities 0-3 and do not satisfy the HDFS rack requirement. Currently HDFS provides only a two-rack guarantee, so priority 1 covers all cases in which the rack requirement is broken.

          In the methods addStoredBlock, removeStoredBlock, startDecommission, and markBlockAsCorrupt in FSNamesystem, put both under-replicated blocks and one-rack blocks into the neededReplication queue. In addition, the replicator will create one more replica for blocks that are on only one rack but are not under-replicated.

          dhruba borthakur added a comment -

          Does this problem exist (or have been observed) on 0.20?

          Jitendra Nath Pandey added a comment -

          The proposal is as follows. The first 3 points are the same as the approach suggested in the first comment, except for a slight change: blocks that are not sufficiently replicated take higher priority over blocks that have the required replicas but violate the rack requirement.
          1. Both under-replicated blocks and blocks that do not satisfy the rack requirement should be included in the neededReplication queue.
          2. The neededReplication queue should have 4 priorities:
          Priority 0: Blocks that have only one replica.
          Priority 1: Blocks whose number of replicas is no greater than 1/3 of their replication factor.
          Priority 2: All other blocks that do not have the required number of replicas.
          Priority 3: Blocks that have the required number of replicas or more, but all of them on the same rack.
          3. In the methods addStoredBlock, removeStoredBlock, startDecommission, and markBlockAsCorrupt in FSNamesystem, put both under-replicated blocks and one-rack blocks into the neededReplication queue. In addition, the replicator will create one more replica for blocks that are on only one rack but are not under-replicated.
          4. If a block is in priority 3 of the neededReplication queue and ReplicationTargetChooser is unable to find a location for the replica that meets the rack requirement, do not schedule a replication; instead keep the block in the same queue.
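
          A minimal sketch of the priority ordering in point 2, for illustration only. The class name, the method signature, and the NOT_QUEUED constant are hypothetical and are not taken from UnderReplicatedBlocks.java. The proposal does not say how blocks with zero live replicas are ranked; the sketch lets them fall under the 1/3 rule, which is an assumption.

```java
/** Hypothetical sketch of the proposed priorities; not the actual HDFS code. */
final class ReplicationPrioritySketch {
    /** Returned when the block needs no replication work (assumed sentinel). */
    static final int NOT_QUEUED = -1;

    /**
     * @param liveReplicas     current number of live replicas
     * @param expectedReplicas the block's replication factor
     * @param racks            number of distinct racks holding a live replica
     * @return 0 (most urgent) to 3, or NOT_QUEUED
     */
    static int getPriority(int liveReplicas, int expectedReplicas, int racks) {
        if (liveReplicas < expectedReplicas) {
            if (liveReplicas == 1) {
                return 0;   // only one replica left
            }
            if (liveReplicas * 3 <= expectedReplicas) {
                return 1;   // no more than 1/3 of the replication factor
            }
            return 2;       // all other under-replicated blocks
        }
        // Enough replicas: only the rack requirement can still be violated.
        // It applies only when the replication factor is greater than one.
        if (expectedReplicas > 1 && racks < 2) {
            return 3;
        }
        return NOT_QUEUED;
    }
}
```

          With this ordering, an over-replicated block whose replicas all sit on one rack, e.g. getPriority(4, 3, 1), is queued at priority 3 instead of being ignored.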

          Jitendra Nath Pandey added a comment -

          Updated proposal:
          A different queue (neededReplicationsForRacks) is maintained for blocks that do not have sufficient racks assigned. Blocks in this queue are treated with lower priority than the blocks in neededReplications, so the priority logic is the same as in comment #3. The reasons for this change:
          1. The semantics of the neededReplications queue remain unchanged: it contains only those blocks that are really under-replicated.
          2. It keeps the code cleaner, as we maintain a separate queue for blocks without enough racks. No code change is needed in UnderReplicatedBlocks.java. The new queue can be implemented as just a TreeSet<Block>, so no new class needs to be implemented.
          Point #3 in the previous comment remains the same, except that a block is added to neededReplicationsForRacks if it doesn't have enough racks.
          Point #4 in the previous comment remains unchanged.
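
          A rough sketch of the two-queue idea, under simplifying assumptions: block IDs (Long) stand in for Block objects, neededReplications is flattened to a single TreeSet with no priority levels, and the class and method names (TwoQueueSketch, update, pollNext) are hypothetical.

```java
import java.util.TreeSet;

/** Hypothetical sketch of the separate-queue proposal; not the actual patch. */
final class TwoQueueSketch {
    // Blocks that are really under-replicated.
    final TreeSet<Long> neededReplications = new TreeSet<>();
    // Blocks with enough replicas but not enough racks.
    final TreeSet<Long> neededReplicationsForRacks = new TreeSet<>();

    /** Re-files a block after its replica count or rack count has changed. */
    void update(long blockId, int liveReplicas, int expectedReplicas, int racks) {
        neededReplications.remove(blockId);
        neededReplicationsForRacks.remove(blockId);
        if (liveReplicas < expectedReplicas) {
            neededReplications.add(blockId);
        } else if (expectedReplicas > 1 && racks < 2) {
            neededReplicationsForRacks.add(blockId);
        }
    }

    /** Under-replicated blocks are served first; returns null when both queues are empty. */
    Long pollNext() {
        Long next = neededReplications.pollFirst();
        return next != null ? next : neededReplicationsForRacks.pollFirst();
    }
}
```

          The cost of this design shows up in pollNext: every consumer has to look at two collections in a fixed order.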

          dhruba borthakur added a comment -

          > A different queue (neededReplicationsForRacks) is maintained for blocks which do not have sufficient rac

          There was a time when the Namenode was littered with plenty of ad hoc data structures, each for its own purpose. There was an effort to consolidate the functionality of these data structures into a smaller set. I am not against this patch, but is it really difficult to integrate this new data structure into neededReplication as explained in your first proposal?

          Jitendra Nath Pandey added a comment -

          > There was a time when the Namenode was littered with plenty of ad hoc data structures, each for its own purpose. There was an effort to consolidate the functionality of these data structures into a smaller set. I am not against this patch, but is it really difficult to integrate this new data structure into neededReplication as explained in your first proposal?

          It is not difficult, but it will make the code confusing. In many places it is assumed that a block in neededReplications is strictly under-replicated, and changing that will complicate the logic. The code will be cleaner if we use strict definitions of the neededReplications and neededReplicationsForRacks queues.
          It will also require significant changes to UnderReplicatedBlocks.java. If we use a separate queue, all the code changes are confined to BlockManager.java.

          The patch is attached.

          Raghu Angadi added a comment -

          I am in favor of not having another list. I looked at the patch briefly and am not sure that having an extra set reduces the amount of new code. Are you implying that the newer proposal has fewer changes?

          It is better for 'neededReplicationList' to hold all the blocks that need replication. Maybe 'UnderReplicatedBlocks' could be renamed.

          Jitendra Nath Pandey added a comment -

          If we go with only one list, the add, remove, update, and getPriority methods in UnderReplicatedBlocks.java will need another argument to pass the numberOfRacks or a boolean to denote insufficient racks, and these methods will change to take the rack policy into account. I agree, in that case we should rename this class.
          Also, since the blocks with insufficient racks need some special treatment (e.g. don't schedule replication if the target happens to be on the same rack), we would need to check the priority of the block in BlockManager if we put them in the same queue.
          Of course, if we have different lists, we have to iterate over two lists when we process these blocks to schedule replication.

          Apart from these differences, the two approaches are similar in effect. Please recommend.

          Raghu Angadi added a comment -

          > If we go with only one list, add, remove, update, getPriority methods in UnderReplicatedBlocks.java, will need to have another argument to pass the numberOfRacks or a boolean to denote insufficient racks. And these methods will change to take into account the rack policy. I agree, in that case we should rename this class.

          Not required according to the definition above: "Priority 3: All other under-replicated blocks." I.e. you don't need the extra args. Note that the underreplicated interface already takes "expectedReplicas" as an argument (required for deciding priority).

          "Special treatment" is required irrespective of which list it lies in, because we were not handling this condition before and we need to now.

          Jitendra Nath Pandey added a comment -

          New patch with approach suggested by Raghu.

          Raghu Angadi added a comment -

          What happens in the common case for most users, where there is only one rack "default"? Will all the blocks stay in the neededReplication list?

          Jitendra Nath Pandey added a comment -

          In this patch, the blocks will stay in the list but will not be scheduled for replication, because no new rack can be found to satisfy the rack requirement.
          Suggestion from Hairong:
          If the user doesn't specify a topology script for rack determination, we can ignore the check for enough racks.
          We can implement it by checking for the config variable SCRIPT_FILENAME_KEY in the blockHasEnoughRacks function. If this config key returns null, blockHasEnoughRacks will return true, which effectively eliminates the check for enough racks.
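
          A sketch of how such a check might look, not the code from the patch. Assumptions: a plain Map stands in for the Hadoop configuration object, racks are passed as strings, the signature of blockHasEnoughRacks is made up for this example, and the key value "topology.script.file.name" is the one named later in this thread. The early return for single-replica blocks follows from the rack guarantee applying only when the replication factor is greater than one.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the rack check; not the actual BlockManager code. */
final class RackCheckSketch {
    static final String SCRIPT_FILENAME_KEY = "topology.script.file.name";

    /**
     * @param conf              stand-in for the cluster configuration
     * @param replicaRacks      rack of each live replica, one entry per replica
     * @param replicationFactor the block's replication factor
     */
    static boolean blockHasEnoughRacks(Map<String, String> conf,
                                       Collection<String> replicaRacks,
                                       int replicationFactor) {
        // No topology script configured: every node is on the default rack,
        // so the rack check is skipped.
        if (conf.get(SCRIPT_FILENAME_KEY) == null) {
            return true;
        }
        // Single-replica blocks cannot span two racks.
        if (replicationFactor <= 1) {
            return true;
        }
        Set<String> distinctRacks = new HashSet<>(replicaRacks);
        return distinctRacks.size() >= 2;
    }
}
```

          On a single-rack cluster without a topology script the method always returns true, so no block is held in the queue for a rack that can never be found.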

          Jitendra Nath Pandey added a comment -

          New patch with changes mentioned in previous comment.

          Raghu Angadi added a comment -

          +1. Looks good.

          One nit: the check needs to consider blocks with single replication, so that they don't end up on the needed replication list.

          Jitendra Nath Pandey added a comment -

          Added a check so that blocks that require a single replica are not added to the neededReplications queue.

          Jitendra Nath Pandey added a comment -

          result of test-patch

          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 5 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417685/HDFS-15.5.patch
          against trunk revision 807818.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/90/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/90/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/90/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/90/console

          This message is automatically generated.

          Jitendra Nath Pandey added a comment -

          SCRIPT_FILENAME_KEY should be "topology.script.file.name"

          Jitendra Nath Pandey added a comment -

          Patch updated with "topology.script.file.name"

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417944/HDFS-15.6.patch
          against trunk revision 808670.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/96/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/96/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/96/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/96/console

          This message is automatically generated.

          Jitendra Nath Pandey added a comment -

          The failed test
          org.apache.hadoop.security.authorize.TestServiceLevelAuthorization.testServiceLevelAuthorization
          is unrelated to this patch. This is a separate issue being tracked in HDFS-568.

          Hairong Kuang added a comment -

          I've committed this. Thanks Jitendra!

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #8 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/8/)
          . All replicas end up on 1 rack. Contributed by Jitendra Nath Pandey.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #69 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/69/)

          eric baldeschwieler added a comment -

          Why was it decided that being on two racks is less important than meeting a replication goal of all replicas?

          It seems intuitive to me that making sure all blocks with 2 replicas are on two racks should be a higher priority than guaranteeing replicas > 2. In practice this may not be a big difference, but I don't understand why this priority was chosen.

          Eli Collins added a comment -

          This bug exists in branch 20 as well.

          Eli Collins added a comment -

          Here's a patch that applies against branch 20. The original patch was done after the block management refactoring, so it isn't a straightforward application to 20. I've therefore written a new patch, but the fix is in the same spirit as the code in trunk.

          The bug is hard to reproduce, as you have to have a failure/decommission on the only cross-rack replica of a block in the window of time in which this block is over-replicated.

          The patch adds the following new tests, which cover rack policy violations not covered by the existing tests. Some of them fail when looped repeatedly w/o the fix (after commenting out the asserts that check neededReplications, which will always fail). I'll forward port these tests to trunk in another jira.

          • Test that blocks that have a sufficient number of total replicas, but are not replicated cross rack, get replicated cross rack when a rack becomes available.
          • Test that new blocks for an underreplicated file will get replicated cross rack.
          • Mark a block as corrupt, and test that when it is re-replicated it is still replicated across racks.
          • Reduce the replication factor of a file, making sure that the only block that is across racks is not removed when deleting replicas.
          • Test that when a block is replicated because a replica is lost due to host failure, the rack policy is preserved.
          • Test that when the excess replicas of a block are reduced due to a node re-joining the cluster, the rack policy is not violated.
          • Test that rack policy is still respected when blocks are replicated due to node decommissioning.
          • Test that rack policy is still respected when blocks are replicated due to node decommissioning, even when the blocks are over-replicated.
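
          To illustrate the property behind the excess-replica scenarios in the list above, here is a small sketch of choosing a replica to delete without reducing the number of racks. It is not the code from the branch 20 patch; the class and method names are hypothetical, and racks are represented as plain strings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of rack-preserving excess replica selection. */
final class ExcessReplicaSketch {
    /**
     * @param replicaRacks rack of each replica, one entry per replica
     * @return index of a replica whose rack holds at least one other replica,
     *         or -1 if every replica is the only one on its rack
     */
    static int chooseReplicaToDelete(List<String> replicaRacks) {
        // Count replicas per rack.
        Map<String, Integer> perRack = new HashMap<>();
        for (String rack : replicaRacks) {
            perRack.merge(rack, 1, Integer::sum);
        }
        // Deleting a replica from a rack with more than one replica
        // leaves the set of racks unchanged.
        for (int i = 0; i < replicaRacks.size(); i++) {
            if (perRack.get(replicaRacks.get(i)) > 1) {
                return i;
            }
        }
        return -1;
    }
}
```

          For replicas on racks /r2, /r1, /r1, /r1 the sketch never picks index 0, the only cross-rack replica.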
          Eli Collins added a comment -

          Patch passes all unit tests on branch 20. test-patch results:

               [exec] 
               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 15 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.
               [exec] 
          

            People

            • Assignee:
              Jitendra Nath Pandey
              Reporter:
              Hairong Kuang
            • Votes:
              0
              Watchers:
              13
