Hadoop Common
  1. Hadoop Common
  2. HADOOP-6631

FileUtil.fullyDelete() should continue to delete other files despite failure at any level.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: fs, util
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Ravi commented about this on HADOOP-6536. Paraphrasing...

      Currently FileUtil.fullyDelete(myDir) comes out stopping deletion of other files/directories if it is unable to delete a file/dir(say because of not having permissions to delete that file/dir) anywhere under myDir. This is because we return from method if the recursive call "if(!fullyDelete())

      {return false;}

      " fails at any level of recursion.

      Shouldn't it continue with deletion of other files/dirs continuing in the for loop instead of returning false here ?

      I guess fullyDelete() should delete as many files as possible(similar to 'rm -rf').

      1. hadoop-6631-y20s-2.patch
        6 kB
        Sreekanth Ramakrishnan
      2. hadoop-6631-y20s-1.patch
        6 kB
        Sreekanth Ramakrishnan
      3. HADOOP-6631-20100506-ydist.final.txt
        8 kB
        Vinod Kumar Vavilapalli
      4. HADOOP-6631-20100506.2.txt
        8 kB
        Vinod Kumar Vavilapalli
      5. HADOOP-6631-20100505.txt
        6 kB
        Vinod Kumar Vavilapalli
      6. HADOOP-6631.v1.patch
        4 kB
        Ravi Gummadi
      7. HADOOP-6631.patch
        4 kB
        Ravi Gummadi
      8. HADOOP-6631.patch
        4 kB
        Ravi Gummadi

        Issue Links

          Activity

          Hide
          Vinod Kumar Vavilapalli added a comment -

          I think this is a critical bug in a util method that is used throughout MAPREDUCE extensively.

          The important implication of this bug is the unreclaimed disk space because of single/few undeleteable files/dirs.

          Show
          Vinod Kumar Vavilapalli added a comment - I think this is a critical bug in a util method that is used throughout MAPREDUCE extensively. The important implication of this bug is the unreclaimed disk space because of single/few undeleteable files/dirs.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          We can simply go on with deleting other files/dirs as Ravi suggested.

          One of the most common reasons why fullyDelete() fails to delete stuff is the case of non-writable permissions on directories when issues like MAPREDUCE-896 happen. So, we can go one step further and try to set writable permissions on failing directories and then try deleting them too. Completely failure can be only when the files/dirs are themselves non-deletable due to ownerhip issues. Thoughts?

          Show
          Vinod Kumar Vavilapalli added a comment - We can simply go on with deleting other files/dirs as Ravi suggested. One of the most common reasons why fullyDelete() fails to delete stuff is the case of non-writable permissions on directories when issues like MAPREDUCE-896 happen. So, we can go one step further and try to set writable permissions on failing directories and then try deleting them too. Completely failure can be only when the files/dirs are themselves non-deletable due to ownerhip issues. Thoughts?
          Hide
          Ravi Gummadi added a comment -

          >So, we can go one step further and try to set writable permissions on failing directories and then try deleting them too. Completely failure can be only when the files/dirs are themselves non-deletable due to ownerhip issues.

          Hmm... Would that be aggressive and may be harmful at times ? May be leaving "setting permissions on failing directories" to the caller of fullyDelete() so that he is aware of(and has the flexibility of) what he wants to do on any subdir under myDir(that was given to fullyDelete()) could be safer option ?

          Show
          Ravi Gummadi added a comment - >So, we can go one step further and try to set writable permissions on failing directories and then try deleting them too. Completely failure can be only when the files/dirs are themselves non-deletable due to ownerhip issues. Hmm... Would that be aggressive and may be harmful at times ? May be leaving "setting permissions on failing directories" to the caller of fullyDelete() so that he is aware of(and has the flexibility of) what he wants to do on any subdir under myDir(that was given to fullyDelete()) could be safer option ?
          Hide
          Doug Cutting added a comment -

          Unix command line tools usually provide a good model for our tools. I think this proposal roughly amounts to changing fullyDelete() from 'rm -r' to 'rm -rf'. We might make metaphor even more explicit, having an option named "force". I've never heard of a unix comand line tool that changes protections so that it can remove things as it goes. Rather one uses 'chmod' first, no?

          Show
          Doug Cutting added a comment - Unix command line tools usually provide a good model for our tools. I think this proposal roughly amounts to changing fullyDelete() from 'rm -r' to 'rm -rf'. We might make metaphor even more explicit, having an option named "force". I've never heard of a unix comand line tool that changes protections so that it can remove things as it goes. Rather one uses 'chmod' first, no?
          Hide
          Ravi Gummadi added a comment -

          No chmod should be done explicitly. Let the caller of fullyDelete() do that if needed. If permissions are not there for some file, that file will not be deleted.

          This JIRA fixes the issue of not continuing deletion of other files/directories when some file could not deleted anywhere in the tree.

          Show
          Ravi Gummadi added a comment - No chmod should be done explicitly. Let the caller of fullyDelete() do that if needed. If permissions are not there for some file, that file will not be deleted. This JIRA fixes the issue of not continuing deletion of other files/directories when some file could not deleted anywhere in the tree.
          Hide
          Ravi Gummadi added a comment -

          Attaching patch for trunk.

          Show
          Ravi Gummadi added a comment - Attaching patch for trunk.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Patch for earlier version of hadoop, not for commit.

          Show
          Sreekanth Ramakrishnan added a comment - Patch for earlier version of hadoop, not for commit.
          Hide
          Amareshwari Sriramadasu added a comment -

          Looked at hadoop-6631-y20s-1.patch.
          Patch has a System.out statement in testFailFullyDelete(), can you remove that? All other changes look good.

          Show
          Amareshwari Sriramadasu added a comment - Looked at hadoop-6631-y20s-1.patch. Patch has a System.out statement in testFailFullyDelete(), can you remove that? All other changes look good.
          Hide
          Sreekanth Ramakrishnan added a comment -

          Removing the sys out which was accidentally added. Patch for older version not to be committed.

          Show
          Sreekanth Ramakrishnan added a comment - Removing the sys out which was accidentally added. Patch for older version not to be committed.
          Hide
          Amareshwari Sriramadasu added a comment -

          The patch : hadoop-6631-y20s-2.patch looks fine.

          Show
          Amareshwari Sriramadasu added a comment - The patch : hadoop-6631-y20s-2.patch looks fine.
          Hide
          Amareshwari Sriramadasu added a comment -

          The trunk patch also looks good.
          Ravi, Can you reattach the patch and submit for hudson?

          Show
          Amareshwari Sriramadasu added a comment - The trunk patch also looks good. Ravi, Can you reattach the patch and submit for hudson?
          Hide
          Ravi Gummadi added a comment -

          Reattaching the same patch for trunk just to make Hudson pick this up.

          Show
          Ravi Gummadi added a comment - Reattaching the same patch for trunk just to make Hudson pick this up.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443562/HADOOP-6631.patch
          against trunk revision 940527.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443562/HADOOP-6631.patch against trunk revision 940527. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/497/console This message is automatically generated.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          I have one comment on the patch. When we list the files in a dir, and then try deleting a file and fail, we return from the method. This may be right when the parent directory itself is non-writable because then going on with other files/dirs is useless anyways. But I quickly checked the man page for unlink on Linux and realized that delete of a file can fail when

          • write permissions are denied on the parent dir
          • the file is being used by some other process
          • the file doesn't exist anymore
          • or the file is on a read-only filesystem.

          The current solution is enough for the 1st and 4th cases. The 2nd and 3rd are really possible, and so should be handled gracefully as well by proceeding to the delete of other files/dirs in the parent-dir. To optimize the non-writable directory case, we may want to do a check if the parent-dir is writable or not in the beginning itself.

          Show
          Vinod Kumar Vavilapalli added a comment - I have one comment on the patch. When we list the files in a dir, and then try deleting a file and fail, we return from the method. This may be right when the parent directory itself is non-writable because then going on with other files/dirs is useless anyways. But I quickly checked the man page for unlink on Linux and realized that delete of a file can fail when write permissions are denied on the parent dir the file is being used by some other process the file doesn't exist anymore or the file is on a read-only filesystem. The current solution is enough for the 1st and 4th cases. The 2nd and 3rd are really possible, and so should be handled gracefully as well by proceeding to the delete of other files/dirs in the parent-dir. To optimize the non-writable directory case, we may want to do a check if the parent-dir is writable or not in the beginning itself.
          Hide
          Ravi Gummadi added a comment -

          Will make fullyDelete() to continue even for the other cases of failure of single file(in addition to the earlier handled case of deletion failure of directory).

          >> To optimize the non-writable directory case, we may want to do a check if the parent-dir is writable or not in the beginning itself.

          I see a case where we want to continue deletion of files under dir/subdirectory even if the files or subdirectories under a dir can not be deleted. So not doing this optimization.

          Show
          Ravi Gummadi added a comment - Will make fullyDelete() to continue even for the other cases of failure of single file(in addition to the earlier handled case of deletion failure of directory). >> To optimize the non-writable directory case, we may want to do a check if the parent-dir is writable or not in the beginning itself. I see a case where we want to continue deletion of files under dir/subdirectory even if the files or subdirectories under a dir can not be deleted. So not doing this optimization.
          Hide
          Ravi Gummadi added a comment -

          Attaching new patch for trunk with review comment of Vinod incorporated.

          Show
          Ravi Gummadi added a comment - Attaching new patch for trunk with review comment of Vinod incorporated.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          We will need a test to verify the non-deletable file case. Talked to Ravi and he came out with a brilliant idea and pointed out it CAN be done deterministically by checking the case where we want to continue deletion of files under dir/subdirectory even if the files or subdirectories under a dir can not be deleted. I'm trying to get this testcase done myself..

          Show
          Vinod Kumar Vavilapalli added a comment - We will need a test to verify the non-deletable file case. Talked to Ravi and he came out with a brilliant idea and pointed out it CAN be done deterministically by checking the case where we want to continue deletion of files under dir/subdirectory even if the files or subdirectories under a dir can not be deleted. I'm trying to get this testcase done myself..
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443688/HADOOP-6631.v1.patch
          against trunk revision 940989.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443688/HADOOP-6631.v1.patch against trunk revision 940989. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/504/console This message is automatically generated.
          Hide
          Ravi Gummadi added a comment -

          Cancelling patch as Vinod is trying to (1) add new testcase for the case of a file under a dir can not be deleted but files under a subdirectory under dir can be deleted and (2) modify the testcase of this patch sothat it doesn't depend on the order in which listStatus() returns the files.

          Show
          Ravi Gummadi added a comment - Cancelling patch as Vinod is trying to (1) add new testcase for the case of a file under a dir can not be deleted but files under a subdirectory under dir can be deleted and (2) modify the testcase of this patch sothat it doesn't depend on the order in which listStatus() returns the files.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Updated patch with testcase verifying non-deletable files also.

          Show
          Vinod Kumar Vavilapalli added a comment - Updated patch with testcase verifying non-deletable files also.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443720/HADOOP-6631-20100505.txt
          against trunk revision 940989.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 1018 javac compiler warnings (more than the trunk's current 1017 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443720/HADOOP-6631-20100505.txt against trunk revision 940989. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 1018 javac compiler warnings (more than the trunk's current 1017 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/505/console This message is automatically generated.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Updated patch fixing javac warning and including some comments from related to the test-case.

          Show
          Vinod Kumar Vavilapalli added a comment - Updated patch fixing javac warning and including some comments from related to the test-case.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443832/HADOOP-6631-20100506.2.txt
          against trunk revision 941508.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443832/HADOOP-6631-20100506.2.txt against trunk revision 941508. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/507/console This message is automatically generated.
          Hide
          Ravi Gummadi added a comment -

          Latest patch looks good.
          +1 from my side.

          Show
          Ravi Gummadi added a comment - Latest patch looks good. +1 from my side.
          Hide
          Sharad Agarwal added a comment -

          I just committed this to 0.21 and trunk. Thanks Ravi and Vinod.

          Show
          Sharad Agarwal added a comment - I just committed this to 0.21 and trunk. Thanks Ravi and Vinod.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Patch for yahoo! dist 20 security branch. Not for commit here.

          Show
          Vinod Kumar Vavilapalli added a comment - Patch for yahoo! dist 20 security branch. Not for commit here.

            People

            • Assignee:
              Ravi Gummadi
              Reporter:
              Vinod Kumar Vavilapalli
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development