Hadoop Common
HADOOP-6344

rm and rmr fail to correctly move the user's files to the trash prior to deleting when they are over quota.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.0, 0.20.1, 0.21.0, 0.22.0
    • Fix Version/s: 0.21.0, 0.22.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Trash feature notifies user of over-quota condition rather than silently deleting files/directories; deletion can be compelled with "rm -skipTrash".

    Description

    With trash turned on, if a user is over his quota and does a rm (or rmr), the file is deleted without a copy being placed in the trash.

    Attachments

    1. HDFS-740-for-Y20.patch (0.5 kB, Jakob Homan)
    2. HDFS-740.patch (0.5 kB, Jakob Homan)

    Issue Links

    • This issue is related to HADOOP-6345

    Activity

          Albert XU added a comment -

          It could be reproduced in Hadoop 1.0.4:

            bin/hadoop dfs -rmr input
            13/05/15 00:14:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
            13/05/15 00:14:15 WARN fs.Trash: Can't create trash directory: file:/C:/Documents and Settings/XU Zheng/.Trash/Current/C:/cygwin/usr/hadoop-1.0.4
            Problem with Trash.Failed to set permissions of path: C:\Documents and Settings\XU Zheng\.Trash\Current\C:\cygwin\usr\hadoop-1.0.4 to 0700. Consider using -skipTrash option
            rmr: Failed to move to trash: file:/C:/cygwin/usr/hadoop-1.0.4/input
          Albert XU added a comment -

          I could reproduce it on version 0.20.2:

            bin/hadoop dfs -rmr input
            13/05/14 02:37:12 WARN fs.Trash: Can't create trash directory: file:/C:/Documents and Settings/XU Zheng/.Trash/Current/C:/cygwin/usr/hadoop-0.20.2
            Deleted file:/C:/cygwin/usr/hadoop-0.20.2/input
          Konstantin Shvachko made changes -
          Status: Resolved [ 5 ] → Closed [ 6 ]
          Nigel Daley made changes -
          Fix Version/s 0.22.0 [ 12314296 ]
          Fix Version/s 0.21.0 [ 12313563 ]
          Robert Chansler made changes -
          Release Note Trash feature notifies user of over-quota condition rather than silently deleting files/directories; deletion can be compelled with "rm -skipTrash".
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #144 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/144/)
          HADOOP-6344. Fix rm and rmr immediately delete files rather than sending to trash, if a user is over-quota. Contributed by Jakob Homan.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #78 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/78/)
          HADOOP-6344. Fix rm and rmr immediately delete files rather than sending to trash, if a user is over-quota. Contributed by Jakob Homan.

          Suresh Srinivas made changes -
          Status: Patch Available [ 10002 ] → Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Suresh Srinivas added a comment -

          Committed the patch to trunk and branch 0.21. Thank you Jakob.

          Suresh Srinivas made changes -
          Hadoop Flags [Reviewed]
          Assignee Jakob Homan [ jghoman ]
          Component/s fs [ 12310689 ]
          Jakob Homan added a comment -

          Of note, tests passed for the Y! branch and test-patch was fine (well, it warned about the class path, but I reject that as a false positive).

          Suresh Srinivas added a comment -

          +1 for the patch.

          Konstantin Shvachko added a comment -

          There are some questionable things going on in moveToTrash(), which should be addressed in HADOOP-6345.

          As for the patch, I think it does what it is intended to do, namely not silently removing the directory if the quota is exceeded or any other exception is thrown while creating a trash directory. I first thought that the directory would still be removed in FsShell.delete() if mkdirs() returned false. But it turns out that HDFS mkdirs() never returns false: the result is either true or an exception.

          +1 for the patch.

          Tsz Wo Nicholas Sze added a comment -

          I am not sure whether this is a bug:
          To me the trash behavior is best-effort, i.e. the files being deleted will be moved to trash if possible; otherwise, they will be deleted permanently.

          The docs do not seem to define the trash behavior clearly. We should first clarify the trash behavior before changing it.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12423635/HDFS-740.patch
          against trunk revision 831070.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/116/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/116/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/116/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/116/console

          This message is automatically generated.

          Jakob Homan made changes -
          Link This issue is related to HADOOP-6345 [ HADOOP-6345 ]
          gary murry added a comment -

          I tested manually using the steps above. I got the following results:

          [hadoopqa@gsbl20230 hadoop-0.20.1.3092118000]$ bin/hadoop fs -put ../hadoop-0.20.1.3092118002/test.bin gmurry/test1.bin
          put: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hadoopqa is exceeded: quota=1 diskspace consumed=11.2g
          [hadoopqa@gsbl20230 hadoop-0.20.1.3092118000]$ bin/hadoop fs -rm gmurry/test2.bin
          09/10/29 22:28:07 WARN fs.Trash: Can't create trash directory: hdfs://gsbl20230.blue.ygrid.yahoo.com/user/hadoopqa/.Trash/Current/user/hadoopqa/gmurry
          rm: Failed to move to trash: hdfs://gsbl20230.blue.ygrid.yahoo.com/user/hadoopqa/gmurry/test2.bin

          And the files were still present.
          +1

          Also, we need a linked JIRA for fixing the unit tests so they catch this sort of thing.

          Thanks

          Jakob Homan made changes -
          Status: Open [ 1 ] → Patch Available [ 10002 ]
          Jakob Homan added a comment -

          submitting patch

          Jakob Homan made changes -
          Attachment HDFS-740-for-Y20.patch [ 12423634 ]
          Attachment HDFS-740.patch [ 12423635 ]
          Jakob Homan added a comment -

          Attaching patches to fix this code path for trunk and the Yahoo! 20 branch. The problem is as described: rather than stopping the delete, we swallow the exception and proceed. These patches fix that and toss the exception received from mkdirs up to FsShell. After the patch the file is not removed; the user is notified of the exception and the exception is logged.

          Ran tests (all fine) and test-patch on trunk version:

          [exec] +1 overall.  
          [exec] 
          [exec]     +1 @author.  The patch does not contain any @author tags.
          [exec] 
          [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
          [exec]                         Please justify why no new tests are needed for this patch.
          [exec]                         Also please list what manual steps were performed to verify this patch.
          [exec] 
          [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
          [exec] 
          [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
          [exec] 
          [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          [exec] 
          [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

          Tests and test-patch for Y! 20 release still pending. Will update.

          Reason for no tests: this is a very difficult code path to test and we're doing it manually. This entire method should be refactored (there are something like 10 different ways to get out of the method) and its tests rewritten, since this has been the source of several bugs. I'll open a JIRA shortly to do this. For now, I believe a manual test will suffice and drive the next set of automatic tests.

          Jakob Homan added a comment -

          Moved to Common, since that is where the affected code resides post-split.

          Jakob Homan made changes -
          Project: Hadoop HDFS [ 12310942 ] → Hadoop Common [ 12310240 ]
          Key: HDFS-740 → HADOOP-6344
          Affects Version/s 0.20.1 [ 12313866 ]
          Affects Version/s 0.20.0 [ 12313438 ]
          Affects Version/s 0.21.0 [ 12313563 ]
          Affects Version/s 0.22.0 [ 12314296 ]
          Affects Version/s 0.20.1 [ 12314048 ]
          Jakob Homan added a comment -

          With a quick instrumentation of the code, Gary was able to prove mkdirs is throwing an exception, but it's an odd one:

          an IOException from mkdirs: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hadoopqa is exceeded: quota=1 diskspace consumed=19.6g

          I didn't think that creating a directory would count against your DSQuota. Still poking around.
          At the very least this code should be fixed not to silently catch the exception. After Boris' fix (HADOOP-6203), the log message at least indicates there was an exception rather than a false return value, but it still swallows the exception and doesn't log what it actually was.

          Jakob Homan added a comment -

          It looks like Gary did his test slightly differently: in his test the Trash directory doesn't exist beforehand and there isn't quota available to even create it. At which point we hit:

              for (int i = 0; i < 2; i++) {
                try {
                  if (!fs.mkdirs(baseTrashPath, PERMISSION)) {      // create current
                    LOG.warn("Can't create trash directory: "+baseTrashPath);
                    return false;
                  }
                } catch (IOException e) {
                  LOG.warn("Can't create trash directory: "+baseTrashPath);
                  return false;
                }
                // ...

          This false percolates up to FsShell.delete(), which interprets the failure as an instruction to go ahead with the hard delete:

              if (trashTmp.moveToTrash(src)) {
                System.out.println("Moved to trash: " + src);
                return;
              }
            }

            if (srcFs.delete(src, true)) {

          It would probably be better to throw an exception rather than return false, so that the deletion doesn't go ahead; a sketch of that direction follows below.
          This is not new behavior, it's been around since at least January (the farthest back into the repo I went). Looks like it just hasn't been tested.
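
          For illustration only, here is a minimal sketch of that rethrow approach (a hypothetical restructuring, not the attached HDFS-740.patch; fs, baseTrashPath, PERMISSION, and the surrounding loop are as in the Trash.java excerpt above):

              for (int i = 0; i < 2; i++) {
                boolean created;
                try {
                  created = fs.mkdirs(baseTrashPath, PERMISSION);   // create current
                } catch (IOException e) {
                  // Wrap and rethrow, preserving the real cause (e.g.
                  // DSQuotaExceededException) so FsShell.delete() aborts and the
                  // user sees why, instead of falling through to a hard delete.
                  throw (IOException) new IOException(
                      "Failed to create trash directory: " + baseTrashPath).initCause(e);
                }
                if (!created) {
                  // Defensive: HDFS mkdirs never returns false in practice (see
                  // Konstantin's note above), but other FileSystems may.
                  throw new IOException("Can't create trash directory: " + baseTrashPath);
                }
                // ...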

          gary murry added a comment -

          Actually, I take that back. Turns out my code did have the fix for HDFS-677.

          gary murry added a comment -

          Yeah, I think Rob is right. It turns out the build I am testing does not have it in.

          Robert Chansler added a comment -

          Duplicate of HDFS-677?

          gary murry made changes -
          Summary: "rm and rmr can accidently delete user's data" → "rm and rmr fail to correctly move the user's files to the trash prior to deleting when they are over quota."
          gary murry added a comment -

          After much deserved ridicule... I fixed the title.

          gary murry added a comment -

          Steps to Reproduce:
          1) Make sure fs.trash.interval in core-site.xml is set to some positive number
          2) Copy a large file to your HDFS
          3) Set a low space quota on your HDFS home directory
          4) rm the large file

          Result:
          A message comes up saying that the directory could not be created in the trash, but the file was deleted anyway.

          Expected:
          If the file fails to move to trash, then it should not be deleted. (A rough automated version of these steps is sketched below.)
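
          For reference, a hypothetical JUnit-style sketch of these steps against a test cluster (assumes the MiniDFSCluster harness; writeLargeFile is an imagined helper, and exact APIs vary by version):

              Configuration conf = new Configuration();
              conf.set("fs.trash.interval", "10");                // 1) enable trash
              MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
              try {
                DistributedFileSystem dfs =
                    (DistributedFileSystem) cluster.getFileSystem();
                Path file = new Path("/user/test/big.bin");
                writeLargeFile(dfs, file);                        // 2) imagined helper
                dfs.setQuota(new Path("/user/test"),
                    Long.MAX_VALUE, 1);                           // 3) 1-byte space quota
                Trash trash = new Trash(conf);
                boolean moved;
                try {
                  moved = trash.moveToTrash(file);                // 4) what rm tries first
                } catch (IOException expected) {
                  moved = false;                                  // over-quota move fails loudly
                }
                assertFalse(moved);                               // no trash move...
                assertTrue(dfs.exists(file));                     // ...and no silent delete
              } finally {
                cluster.shutdown();
              }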

          gary murry created issue -

    People

    • Assignee: Jakob Homan
    • Reporter: gary murry
    • Votes: 0
    • Watchers: 6
