Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.17.2
    • Fix Version/s: 0.18.3
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Suspicion that all replicas of a block were marked for deletion. (Don't panic, investigation underway.)

      1. missing-block-example.txt
        2 kB
        Raghu Angadi
      2. staleInvalidates.patch
        4 kB
        Hairong Kuang
      3. staleInvalidates1.patch
        5 kB
        Hairong Kuang
      4. staleInvalidates2.patch
        5 kB
        Hairong Kuang
      5. staleInvalidates2-br18.patch
        5 kB
        Hairong Kuang

        Issue Links

          Activity

          Hide
          Raghu Angadi added a comment -

          The "main cause" (among others) seems to be because of incomplete clean up when a datanode is marked dead.

          Specifically, the pending requests for block deletions don't seem to be removed when a node is marked dead.. These deletion requests might be sent after the datanode registers back again... but the state might be very different that time.

          Show
          Raghu Angadi added a comment - The "main cause" (among others) seems to be because of incomplete clean up when a datanode is marked dead. Specifically, the pending requests for block deletions don't seem to be removed when a node is marked dead.. These deletion requests might be sent after the datanode registers back again... but the state might be very different that time.
          Hide
          dhruba borthakur added a comment -

          > These deletion requests might be sent after the datanode registers back again

          But then this will delete a block that was anyway supposed to have been deleted much earlier. can this really cause the bug described?

          Show
          dhruba borthakur added a comment - > These deletion requests might be sent after the datanode registers back again But then this will delete a block that was anyway supposed to have been deleted much earlier. can this really cause the bug described?
          Hide
          Raghu Angadi added a comment -

          yes. E.g. :

          • P and Q have a block B. Say B is over replicated to R and P is asked to delete the block.
          • Now Q and R have the block and there is a pending delete-request to P.
          • Now P is marked dead and P comes back up again.
          • block will become over replicated and NN asks Q to delete it.
          • Now the block will be delete from both P and Q! Block is under replicated.

          Does it seem right? IOW, why do we not want to remove the pending deletion requests?

          The above example shows loss of one replica unintentionally. The real example is lot crazier in terms of how the DNs were maked dead and came back alive and we ended up losing all the replicas. I don't know yet why the heartbeat was lost.

          Note that this in no way means we should not fix HADOOP-4540. These are two different problems (or two parts of a problem).

          Show
          Raghu Angadi added a comment - yes. E.g. : P and Q have a block B. Say B is over replicated to R and P is asked to delete the block. Now Q and R have the block and there is a pending delete-request to P. Now P is marked dead and P comes back up again. block will become over replicated and NN asks Q to delete it. Now the block will be delete from both P and Q! Block is under replicated. Does it seem right? IOW, why do we not want to remove the pending deletion requests? The above example shows loss of one replica unintentionally. The real example is lot crazier in terms of how the DNs were maked dead and came back alive and we ended up losing all the replicas. I don't know yet why the heartbeat was lost. Note that this in no way means we should not fix HADOOP-4540 . These are two different problems (or two parts of a problem).
          Hide
          dhruba borthakur added a comment -

          Ok, got it. This is a race between the block report in Step 3 ("Now P is marked dead and P comes back up again") and the block-delete-call that the namenode issued in Step 1. The block report inserted a block in the blocksmap that is going to be deleted from the datanode.

          Two options come to mind:

          1. On every block report processing, the NN clears out blocks from invalidateSets that belong to that datanode. This might consume non-trivial amount of CPU on the namenode.

          2. The datanode sends its current time on every block report, let call this as the reportStamp. The NN stores the reportStamp in the DatanodeDescriptor and sends it back to the DN on every block deletion message. The Datanode process a delete-block request only if the reportStamp in the delete-block-message is greater than the reportStamp it used in the last block report that it send out.

          Show
          dhruba borthakur added a comment - Ok, got it. This is a race between the block report in Step 3 ("Now P is marked dead and P comes back up again") and the block-delete-call that the namenode issued in Step 1. The block report inserted a block in the blocksmap that is going to be deleted from the datanode. Two options come to mind: 1. On every block report processing, the NN clears out blocks from invalidateSets that belong to that datanode. This might consume non-trivial amount of CPU on the namenode. 2. The datanode sends its current time on every block report, let call this as the reportStamp. The NN stores the reportStamp in the DatanodeDescriptor and sends it back to the DN on every block deletion message. The Datanode process a delete-block request only if the reportStamp in the delete-block-message is greater than the reportStamp it used in the last block report that it send out.
          Hide
          Hairong Kuang added a comment -

          This patch fixes the stale invalidates problem by
          1. removeStoredBlock makes sure that the replica also gets removed from recentInvalidatesSet;
          2. When removing a datanode, reset its scheduled invalidate blocks;
          3. Fix a typo: proccessOverReplicatedBlock +> processOverReplicatedBlock

          Show
          Hairong Kuang added a comment - This patch fixes the stale invalidates problem by 1. removeStoredBlock makes sure that the replica also gets removed from recentInvalidatesSet; 2. When removing a datanode, reset its scheduled invalidate blocks; 3. Fix a typo: proccessOverReplicatedBlock +> processOverReplicatedBlock
          Hide
          Raghu Angadi added a comment -

          Couple of minor comments :

          • The debug comment removed while choosing excess replicas can be retained and changed to INFO level. This is a pretty useful log.
          • in removeFromInvalidates(), we need to log only if remove() returns true. Usually it will be false.

          I am trying to see one of the actual cases seen will be fixed by this patch.

          Show
          Raghu Angadi added a comment - Couple of minor comments : The debug comment removed while choosing excess replicas can be retained and changed to INFO level. This is a pretty useful log. in removeFromInvalidates(), we need to log only if remove() returns true. Usually it will be false. I am trying to see one of the actual cases seen will be fixed by this patch.
          Hide
          Raghu Angadi added a comment -

          Trace of a block that got all of its replicas deleted (courtesy Rob Chansler).

          Show
          Raghu Angadi added a comment - Trace of a block that got all of its replicas deleted (courtesy Rob Chansler).
          Hide
          Hairong Kuang added a comment -

          The patch incorporates Raghu's comment.

          Show
          Hairong Kuang added a comment - The patch incorporates Raghu's comment.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12393522/staleInvalidates1.patch
          against trunk revision 712339.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12393522/staleInvalidates1.patch against trunk revision 712339. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3554/console This message is automatically generated.
          Hide
          Raghu Angadi added a comment -

          > 2. When removing a datanode, reset its scheduled invalidate blocks.

          There is also 'recentIvalidateSets'. When NN deletes a block, it first goes into 'recentInvalidateSets', then the replication monitor thread moves this request to datanode's invalidateSet. I don't know why we have this two step process, but looks like this patch needs to clear the datanode's set in recentInvalidateSets as well.

          I don't think 'addToInvalidatesNoLog' is necessary. The extra log is fine and will make it consistent as well.

          Show
          Raghu Angadi added a comment - > 2. When removing a datanode, reset its scheduled invalidate blocks. There is also 'recentIvalidateSets'. When NN deletes a block, it first goes into 'recentInvalidateSets', then the replication monitor thread moves this request to datanode's invalidateSet. I don't know why we have this two step process, but looks like this patch needs to clear the datanode's set in recentInvalidateSets as well. I don't think 'addToInvalidatesNoLog' is necessary. The extra log is fine and will make it consistent as well.
          Hide
          Hairong Kuang added a comment -

          recentInvalidatesSet is cleared in removeStoredBlock.

          Show
          Hairong Kuang added a comment - recentInvalidatesSet is cleared in removeStoredBlock.
          Hide
          Raghu Angadi added a comment -

          > 1. removeStoredBlock makes sure that the replica also gets removed from recentInvalidatesSet;
          the patch removes it from the other set.

          So we have two sets : one in dn and one in FSNamesystem. These two are cleared in two different places (and two different reasons): dn's is cleared when a datanode is marked dead and FSNamesystem's is cleared when a block is deleted.

          I think for consistency and clarity, both should be cleared in one place in the code.

          The patch as is I think fixes the problem.. so if we don't want to change the patch, it might be ok for 0.18. But for 19 and trunk at least I think it is better to implement this policy in one place.

          Show
          Raghu Angadi added a comment - > 1. removeStoredBlock makes sure that the replica also gets removed from recentInvalidatesSet; the patch removes it from the other set. So we have two sets : one in dn and one in FSNamesystem. These two are cleared in two different places (and two different reasons): dn's is cleared when a datanode is marked dead and FSNamesystem's is cleared when a block is deleted. I think for consistency and clarity, both should be cleared in one place in the code. The patch as is I think fixes the problem.. so if we don't want to change the patch, it might be ok for 0.18. But for 19 and trunk at least I think it is better to implement this policy in one place.
          Hide
          Hairong Kuang added a comment -

          Having addToInvalidatesNoLog is to avoid an extra log and allow the calling function to log the reason why a block gets invalid.

          Show
          Hairong Kuang added a comment - Having addToInvalidatesNoLog is to avoid an extra log and allow the calling function to log the reason why a block gets invalid.
          Hide
          Hairong Kuang added a comment -

          > -1 javadoc. The javadoc tool appears to have generated 1 warning messages.
          This is not right. I ran "ant javadoc" with and without the patch. I did not see that the patch added any additional javadoc warning.

          Show
          Hairong Kuang added a comment - > -1 javadoc. The javadoc tool appears to have generated 1 warning messages. This is not right. I ran "ant javadoc" with and without the patch. I did not see that the patch added any additional javadoc warning.
          Hide
          Raghu Angadi added a comment -

          I tested the patch with an artificial test that brings datanodes down and brings them up at different times among other things.

          on trunk, the block loses all the replicas and with the patch it does not.

          Show
          Raghu Angadi added a comment - I tested the patch with an artificial test that brings datanodes down and brings them up at different times among other things. on trunk, the block loses all the replicas and with the patch it does not.
          Hide
          Hairong Kuang added a comment -

          The attached patch put the cleaning of two invalidate sets in one place when a datanode is removed.

          Show
          Hairong Kuang added a comment - The attached patch put the cleaning of two invalidate sets in one place when a datanode is removed.
          Hide
          Raghu Angadi added a comment -

          Thanks Hairong. +1. patch looks good to me.

          Show
          Raghu Angadi added a comment - Thanks Hairong. +1. patch looks good to me.
          Hide
          Hairong Kuang added a comment -

          Patch for branch 0.18.

          Show
          Hairong Kuang added a comment - Patch for branch 0.18.
          Hide
          Hairong Kuang added a comment -

          ant test-core result:
          BUILD SUCCESSFUL
          Total time: 116 minutes 32 seconds

          ant test-patch result:
          [exec] -1 overall.

          [exec] +1 @author. The patch does not contain any @author tags.

          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.

          [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.

          [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          Show
          Hairong Kuang added a comment - ant test-core result: BUILD SUCCESSFUL Total time: 116 minutes 32 seconds ant test-patch result: [exec] -1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no tests are needed for this patch. [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
          Hide
          Hairong Kuang added a comment -

          The patch is manually tested. The javadoc warning is irrelevant of this patch.

          I just committed this.

          Show
          Hairong Kuang added a comment - The patch is manually tested. The javadoc warning is irrelevant of this patch. I just committed this.

            People

            • Assignee:
              Hairong Kuang
              Reporter:
              Robert Chansler
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development