Hadoop HDFS / HDFS-7707

Edit log corruption due to delayed block removal again

    Details

    • Hadoop Flags:
      Reviewed

      Description

      Edit log corruption is seen again, even with the fix of HDFS-6825.

      Before the HDFS-6825 fix, if dirX was deleted recursively, an OP_CLOSE could still be written to the edit log for a fileY under dirX, corrupting the edit log (restarting the NN with that edit log would fail).

      HDFS-6825 fixes this by detecting whether fileY has already been deleted: it checks the ancestor directories on its path, and if any of them no longer exists, fileY is considered deleted and no OP_CLOSE is written to the edit log for it.

      For this new edit log corruption, I found that the client first deleted dirX recursively and then immediately created another directory with exactly the same name as dirX. Because HDFS-6825 relies on a namespace check (whether dirX exists in its parent directory) to decide whether a file has been deleted, the newly created dirX defeats the check, and the OP_CLOSE for the already deleted file gets into the edit log because of delayed block removal.

      What we need to do is to have a more robust way to detect whether a file has been deleted.
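
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HDFS code: `Node`, `add`, `isDeletedByName`, and `deletedAfterRace` are hypothetical names, and the model assumes the deleted directory keeps its parent reference after the delete, so that the walk reaches the directory that now holds the recreated name.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the race, NOT HDFS code. Node, add(), and both method
// names below are hypothetical and exist only for this illustration.
class NameCheckSketch {
  static class Node {
    final String name;
    final boolean root;
    Node parent; // assumption: left in place after delete (stale reference)
    final Map<String, Node> children = new HashMap<>();

    Node(String name, boolean root) {
      this.name = name;
      this.root = root;
    }

    Node add(Node child) {
      children.put(child.name, child);
      child.parent = this;
      return child;
    }
  }

  // Name-only check: the file counts as deleted if some ancestor no longer
  // lists a child under the expected name.
  static boolean isDeletedByName(Node file) {
    Node child = file;
    Node parent = file.parent;
    while (true) {
      if (parent == null || !parent.children.containsKey(child.name)) {
        return true;
      }
      if (parent.root) {
        break;
      }
      child = parent;
      parent = parent.parent;
    }
    return false;
  }

  // Deletes dirX, optionally recreates it, then asks the check about fileY.
  static boolean deletedAfterRace(boolean recreate) {
    Node root = new Node("", true);
    Node dirX = root.add(new Node("dirX", false));
    Node fileY = dirX.add(new Node("fileY", false));
    root.children.remove("dirX"); // recursive delete of dirX
    if (recreate) {
      root.add(new Node("dirX", false)); // same name, different node
    }
    return isDeletedByName(fileY);
  }

  public static void main(String[] args) {
    System.out.println("delete only:       " + deletedAfterRace(false)); // true
    System.out.println("delete + recreate: " + deletedAfterRace(true));  // false
  }
}
```

In this model the plain delete is detected, while the delete followed by an immediate recreate is not, because the lookup in the root finds an entry named dirX and cannot tell that it is a different node.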

      1. HDFS-7707.003.patch
        6 kB
        Yongjun Zhang
      2. HDFS-7707.002.patch
        6 kB
        Yongjun Zhang
      3. HDFS-7707.001.patch
        5 kB
        Yongjun Zhang
      4. reproduceHDFS-7707.patch
        0.8 kB
        Yongjun Zhang

          Activity

          yzhangal Yongjun Zhang added a comment -

          One possible solution I thought about is, whenever we need to delete a fileOrDirX, do

          • check permission recursively,
          • if it's permitted to delete fileOrDirX,
            • rename it to a unique name fileOrDirX_to_be_deleted that client won't be using,
            • delete fileOrDirX_to_be_deleted.

          This will cause some confusion in the edit log, though. Also, if deleting fileOrDirX itself is not permitted, some subdirectories or files in it may still be deletable, so the operation would need to be applied recursively at each dir/file where deletion is allowed, which may not be clean.
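
As a rough illustration of the rename-then-delete idea, here is a minimal sketch over a single directory level. It is not HDFS code: the class, the `children` map, and the use of a random UUID as the unique suffix are all assumptions made for the example, and the recursive permission check from the first bullet is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of the rename-then-delete idea; this is not HDFS code
// and the permission check from the first bullet is left out.
class RenameThenDeleteSketch {
  // One directory level: child name -> inode id.
  final Map<String, Long> children = new HashMap<>();

  // Renames the entry to a unique name, deletes it under that name, and
  // returns the unique name the deleted inode ended up with.
  String delete(String name) {
    Long id = children.remove(name);
    if (id == null) {
      throw new IllegalArgumentException("no such entry: " + name);
    }
    String doomed = name + "_to_be_deleted_" + UUID.randomUUID();
    children.put(doomed, id); // step 1: rename
    children.remove(doomed);  // step 2: delete under the new name
    return doomed;
  }

  public static void main(String[] args) {
    RenameThenDeleteSketch dir = new RenameThenDeleteSketch();
    dir.children.put("dirX", 1L);
    String lastName = dir.delete("dirX");
    dir.children.put("dirX", 2L); // client recreates dirX right away
    // The deleted inode's last name is no longer present in the namespace.
    System.out.println(dir.children.containsKey(lastName));
  }
}
```

The point of the sketch is that the deleted inode's final name can no longer collide with a directory the client recreates, so a later lookup by that name misses.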

          brahmareddy Brahma Reddy Battula added a comment -

          This seems to be the same as HDFS-7414; can you mark this as a duplicate of HDFS-7414? I did not clearly describe the scenario there.
          Please let me know your opinion.

          kihwal Kihwal Lee added a comment -

          How is the isFileDeleted() check defeated? The check walks up the tree by following parent references, not symbolically by path name. Creating another directory (i.e. a different INode) with the same name should not affect the check.

          yzhangal Yongjun Zhang added a comment -

          Hi Brahma Reddy Battula and Kihwal Lee,

          Thanks a lot for your comments!

          Currently isFileDeleted() does the following:

              if (tmpParent == null ||
                  tmpParent.searchChildren(tmpChild.getLocalNameBytes()) < 0) {
                return true;
              }
          

          This checks whether the child's name exists in the parent directory.
          That's the part I was referring to that gets defeated.

          I hope my understanding is correct. I described a possible solution in the first comment; would you please share some insight?

          Thanks.

          kihwal Kihwal Lee added a comment -

          isFileDeleted() is always called with the fsn lock held, so no modification happens while in the method, and tmpParent is obtained by calling file.getParent(). So tmpParent cannot be a newly created directory inode unless something is automatically setting the file inode's parent to the new directory inode. If isFileDeleted() is called with a wrong file inode, then it is possible to hit this condition. That would mean both the parent dir and the file were recreated and the NN got confused. Does this case also involve commitBlockSynchronization()?

          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal,

          Thanks a lot for your further comments. I did the analysis based on the edit log. I assumed commitBlockSynchronization() is involved because of the delayed block removal; it is basically the same code path as examined in HDFS-6825. I will take a look at other paths too.

          Assuming commitBlockSynchronization() is involved, the INodeFile is obtained by the following code:

              BlockCollection blockCollection = storedBlock.getBlockCollection();
              INodeFile iFile = ((INode)blockCollection).asFile();
          

          Do you mean that we could get a wrong iFile here?

          BTW, your comment rang a bell for me: when we delete a dir, what's the reason that tmpParent won't get a null at the dirX when trying to get the parent of dirX (if this happened)?

              while (true) {
                if (tmpParent == null ||
                    tmpParent.searchChildren(tmpChild.getLocalNameBytes()) < 0) {
                  return true;
                }
                if (tmpParent.isRoot()) {
                  break;
                }
                tmpChild = tmpParent;
                tmpParent = tmpParent.getParent();
              }
          

          Thanks.

          kihwal Kihwal Lee added a comment -

          "Do you mean that we could get a wrong iFile here?"

          Since the block collection of a block won't magically get updated to a new inode file, I don't see how it can be a wrong inode file. So it may not be due to delayed block removal.

          "what's the reason that tmpParent won't get a null at the dirX when trying to get the parent of dirX (if this happened)?"

          If no snapshot is involved, the parent is set to null during delete, under the fsn write lock. A missing memory barrier can cause stale values to be used in a multi-processor, multi-threaded environment, but I am not sure whether that is the cause here.

          If commitBlockSynchronization() was involved, was it initiated by the client (e.g. recoverLease() or create/append())?

          yzhangal Yongjun Zhang added a comment -

          Thank you so much Kihwal!

          What happened was that the user manually deleted the dir by issuing a hadoop fs -rm -r -skipTrash command. So it still seems related to delayed block removal. It appears that a snapshot is involved, but I will confirm.

          yzhangal Yongjun Zhang added a comment -

          I reproduced the issue by modifying the test case I added for HDFS-6825: do a mkdir of the same dir right after deleting it recursively.

          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal and other folks who are watching,

          I described a possible solution in the first comment of this jira. Now I am thinking about a possibly cleaner one: if we had a dir/file creation time, we could compare creation times to determine that a dir is newer than the file. I searched and found HADOOP-1377, which initially introduced creation time but dropped it later per the discussion there. Given the nature of delayed block removal, the scenario described in this jira is a valid case to handle. It seems that having a creation time would make detecting a deleted file much easier.

          However, when we copy a file with the -p option to preserve attributes, including creation time, the creation time of a file under a dir may be older than the dir's creation time. So this might not be foolproof. It would work only if files/dirs under a dir always have a newer creation time than the parent. Even if it works, existing clusters don't have creation time, so it would not be a backward compatible solution. The first possible solution, on the other hand, would be backward compatible.

          I'm just throwing these thoughts out here; if anyone has insight, I'd appreciate it if you could share.

          Thanks.
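
The creation-time idea and its weakness can be sketched in a few lines. This is a hypothetical illustration only: per the comment above, existing clusters have no creation time, so `looksDeleted` and the timestamps below are assumptions made for the example.

```java
// Hypothetical sketch: per the discussion above, inodes in existing
// clusters carry no creation time, so nothing here maps to a real HDFS API.
class CreationTimeSketch {
  // Heuristic: a file older than the directory that currently holds its
  // name is assumed to belong to a previous, deleted incarnation.
  static boolean looksDeleted(long fileCreationTime, long parentCreationTime) {
    return fileCreationTime < parentCreationTime;
  }

  public static void main(String[] args) {
    // Race from this issue: fileY created at t=100, dirX recreated at t=200.
    System.out.println(looksDeleted(100, 200));
    // Copy with preserved attributes: a live file keeps t=50 under a dir
    // created at t=200, so the heuristic wrongly flags it as deleted.
    System.out.println(looksDeleted(50, 200));
  }
}
```

Both calls report a deleted file, but only the first one is right; the second is the false positive that the -p case would produce.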

          kihwal Kihwal Lee added a comment -

          Yongjun Zhang Can you post the test case for reference?

          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal, thanks for your follow-up and the good suggestion. I just posted the test case change for reproduction as reproduceHDFS-7707.patch.

          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal,

          Inspired by your comment
          https://issues.apache.org/jira/browse/HDFS-7707?focusedCommentId=14299106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14299106

          I think I have a better solution now: instead of checking the name string, check the inode id. The inode id of the deleted file/dir will not match the id of a newly created inode, so the mismatch reveals that the file/dir was deleted.

          Thanks.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12696055/HDFS-7707.001.patch
          against trunk revision 8acc5e9.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9405//console

          This message is automatically generated.

          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal,

          I submitted patch rev 001 per the solution described in my last comment. Would you please help take a look? Thanks a lot!

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12696056/HDFS-7707.001.patch
          against trunk revision 8cb4731.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.server.namenode.TestCommitBlockSynchronization

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9406//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9406//console

          This message is automatically generated.

          yzhangal Yongjun Zhang added a comment -

          I found that the test failure of TestCommitBlockSynchronization is caused by incorrect behaviour of parent.addChild(file) on a mocked INodeDirectory instance parent. That is, after parent.addChild(file), parent doesn't have the child file. I wonder if anyone knows why.

          I did a workaround by creating a new INodeDirectory instance, and uploaded patch rev 002.

          Thanks.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12696100/HDFS-7707.002.patch
          against trunk revision 8cb4731.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          -1 javac. The applied patch generated 1198 javac compiler warnings (more than the trunk's current 1189 warnings).

          -1 javadoc. The javadoc tool appears to have generated 5 warning messages.
          See https://builds.apache.org/job/PreCommit-HDFS-Build/9410//artifact/patchprocess/diffJavadocWarnings.txt for details.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9410//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9410//artifact/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9410//console

          This message is automatically generated.

          yzhangal Yongjun Zhang added a comment -

          Hi Brahma Reddy Battula,

          Sorry for a late clarification here about HDFS-7414, to address your earlier comment:

          https://issues.apache.org/jira/browse/HDFS-7707?focusedCommentId=14298588&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14298588

          My comments in HDFS-7414 earlier mentioned both HDFS-6527 and HDFS-6825. The code you pasted there indicates that the release you are running has HDFS-6527 but not HDFS-6825.

          HDFS-7707 is to remedy the HDFS-6825 fix for the special case where a new dir is created with the same name as the previously deleted dir. It's possible that HDFS-6825 alone can solve your issue (you can try that now, since the HDFS-6825 fix is already committed); otherwise you need to wait for the HDFS-7707 fix and combine it with the HDFS-6825 fix.

          Thanks.

          kihwal Kihwal Lee added a comment -

          If snapshot is not involved, the parent of the file inode will be null, so the existing check should work. Your test case also reproduces the corruption only when snapshot is used. So it looks like your approach is correct. When walking up the tree for deletion check, it should look up the child with the same inode id. Since look up is only possible symbolically, the id needs to be compared if there is a hit. I will spend a bit more time on this.
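
The walk described here, a lookup by name followed by an id comparison on a hit, might look roughly like the following toy version. It is a sketch under assumptions, not the committed patch: `Node`, its fields, and the id counter are hypothetical stand-ins for the real inode classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, NOT the committed patch. Node and its fields are hypothetical.
class IdCheckSketch {
  private static long nextId = 1;

  static class Node {
    final long id = nextId++; // unique per node, never reused
    final String name;
    final boolean root;
    Node parent; // assumption: left in place after delete (stale reference)
    final Map<String, Node> children = new HashMap<>();

    Node(String name, boolean root) {
      this.name = name;
      this.root = root;
    }

    Node add(Node child) {
      children.put(child.name, child);
      child.parent = this;
      return child;
    }
  }

  // Walks up the tree. Lookup is by name only, so on a hit the id is
  // compared to make sure the entry is the same node and not a newcomer.
  static boolean isDeleted(Node file) {
    Node child = file;
    Node parent = file.parent;
    while (true) {
      if (parent == null) {
        return true;
      }
      Node hit = parent.children.get(child.name);
      if (hit == null || hit.id != child.id) {
        return true; // name missing, or reused by a different node
      }
      if (parent.root) {
        break;
      }
      child = parent;
      parent = parent.parent;
    }
    return false;
  }

  // dirX is deleted and recreated under the same name; is fileY seen as deleted?
  static boolean deletedAfterRecreate() {
    Node root = new Node("", true);
    Node dirX = root.add(new Node("dirX", false));
    Node fileY = dirX.add(new Node("fileY", false));
    root.children.remove("dirX");
    root.add(new Node("dirX", false));
    return isDeleted(fileY);
  }

  // Nothing is deleted; the check should leave the live file alone.
  static boolean liveFileDeleted() {
    Node root = new Node("", true);
    Node dirX = root.add(new Node("dirX", false));
    Node fileY = dirX.add(new Node("fileY", false));
    return isDeleted(fileY);
  }

  public static void main(String[] args) {
    System.out.println("after recreate: " + deletedAfterRecreate()); // true
    System.out.println("live file:      " + liveFileDeleted());      // false
  }
}
```

In this model the recreated dirX has a different id from the one fileY hangs off, so the walk reports the file as deleted, while an untouched file still passes.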

          kihwal Kihwal Lee added a comment -

          The patch looks good except the unnecessary imports in FSNamesystem.java.

          yzhangal Yongjun Zhang added a comment -

          Many thanks Kihwal Lee, for your review and comments! I just uploaded rev 003 to remove the unnecessary import.

          I also ran the failed test TestFailureToReadEdits locally and it was successful.

          kihwal Kihwal Lee added a comment -

          As for TestFailureToReadEdits, the test is flawed. Since the port for qjm is hard-coded, it sometimes does not work. HDFS-6054 is supposed to fix it.
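
For context, a common way to avoid this kind of hard-coded port collision in tests is to bind to port 0 and let the OS choose a free port. The sketch below shows only that general technique; it is not taken from HDFS-6054 and makes no claim about how that issue addresses the test. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// General technique only; not code from HDFS or from HDFS-6054.
class EphemeralPortSketch {
  // Binds two listeners to port 0 so the OS assigns each a free port,
  // and returns the two port numbers that were chosen.
  static int[] twoFreePorts() {
    try (ServerSocket a = new ServerSocket(0);
         ServerSocket b = new ServerSocket(0)) {
      return new int[] { a.getLocalPort(), b.getLocalPort() };
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int[] ports = twoFreePorts();
    System.out.println(ports[0] + " " + ports[1]);
  }
}
```

Because both sockets are open at the same time, the two ports differ, which is the property a test with several services on one host needs.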

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12696209/HDFS-7707.003.patch
          against trunk revision 80705e0.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9416//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9416//console

          This message is automatically generated.

          kihwal Kihwal Lee added a comment -

          +1.

          kihwal Kihwal Lee added a comment -

          I've committed this to branch-2 and trunk. Thanks for reporting and fixing the issue, Yongjun Zhang.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #6992 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6992/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal,

          Thank you so much for the quick help!

          About HDFS-6054, I took a look, will post some comment there soon.

          yzhangal Yongjun Zhang added a comment -

          Hi Kihwal,

          Thanks a lot for your help with this jira and for pointing me to HDFS-6054! I submitted a small patch to HDFS-6054. If time allows, could you take a look? Thank you so much!

          Sorry, I attached that patch to this jira by mistake; I have just removed it.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #94 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/94/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #828 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/828/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2026 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2026/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #91 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/91/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #95 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/95/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2045 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2045/)
          HDFS-7707. Edit log corruption due to delayed block removal again. Contributed by Yongjun Zhang (kihwal: rev 843806d03ab1a24f191782f42eb817505228eb9f)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Sangjin Lee backported this to 2.6.1. I just pushed the commit to 2.6.1 after running compilation and the two tests changed by the patch, TestCommitBlockSynchronization and TestDeleteRace.


            People

            • Assignee: Yongjun Zhang (yzhangal)
            • Reporter: Yongjun Zhang (yzhangal)
            • Votes: 0
            • Watchers: 17
