Issue Details

Key: HADOOP-6080
Type: New Feature
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Jakob Homan
Reporter: Koji Noguchi
Votes: 0
Watchers: 4
Hadoop Common

Handling of Trash with quota

Created: 18/Jun/09 07:02 PM   Updated: 01/Jul/09 11:10 AM
Component/s: fs
Affects Version/s: None
Fix Version/s: 0.20.1

Time Tracking:
Not Specified

File Attachments:
HADOOP-6080-v20.patch (text file, 9 kB, licensed for inclusion in ASF works), 2009-06-30 07:12 AM, Jakob Homan
HADOOP-6080.patch (text file, 9 kB, licensed for inclusion in ASF works), 2009-06-30 07:12 AM, Jakob Homan
javac_warnings_diff.txt (text file, 17 kB), 2009-06-30 01:31 AM, Jakob Homan

Hadoop Flags: Reviewed
Release Note: Provide a new option to rm and rmr, -skipTrash, which will immediately delete the files specified, rather than moving them to the trash.
Resolution Date: 30/Jun/09 07:02 PM


Description
Currently, with quotas turned on, a user cannot run '-rmr' on a large directory if moving it to the trash would put the user over quota.
[knoguchi src]$ hadoop dfs -rmr /tmp/net2
rmr: Failed to move to trash: hdfs://abc.def.com/tmp/net2
[knoguchi src]$ hadoop dfs -mv /tmp/net2 /user/knoguchi/.Trash/Current
mv: org.apache.hadoop.hdfs.protocol.QuotaExceededException: The quota of /user/knoguchi is exceeded: namespace
quota=37500 file count=37757, diskspace quota=-1 diskspace=1991250043353
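
One way to read the exception above: the trash lives under /user/knoguchi, so moving /tmp/net2 into .Trash adds that directory's entries to the home directory's namespace count. The Java sketch below models only that check. It is not HDFS code. The class, enum, and method names are hypothetical, and the split of 37757 into 7757 existing entries plus 30000 incoming ones is invented for illustration. Only the totals 37500 and 37757 come from the message, and treating 37757 as the count after the move is an assumption.

```java
public class TrashQuotaSketch {
    /** Outcome of a simulated rm/rmr call (hypothetical names, not Hadoop's). */
    enum Outcome { MOVED_TO_TRASH, DELETED, FAILED_OVER_QUOTA }

    /**
     * Simulates removing a directory tree that lives outside the user's home directory.
     *
     * @param objects   namespace entries (files and directories) in the tree being removed
     * @param homeUsed  entries already counted against the home directory
     * @param homeQuota namespace quota of the home directory; a negative value means
     *                  "no quota", mirroring the "diskspace quota=-1" in the message
     * @param skipTrash true to delete immediately instead of moving to .Trash
     */
    static Outcome remove(long objects, long homeUsed, long homeQuota, boolean skipTrash) {
        if (skipTrash) {
            // Nothing is added under the home directory, so its quota is never consulted.
            return Outcome.DELETED;
        }
        boolean unlimited = homeQuota < 0;
        if (!unlimited && homeUsed + objects > homeQuota) {
            // The move into .Trash would push the home directory over its namespace quota.
            return Outcome.FAILED_OVER_QUOTA;
        }
        return Outcome.MOVED_TO_TRASH;
    }

    public static void main(String[] args) {
        // Illustrative numbers: 7757 + 30000 = 37757 > 37500.
        System.out.println(remove(30000, 7757, 37500, false)); // FAILED_OVER_QUOTA
        System.out.println(remove(30000, 7757, 37500, true));  // DELETED
        System.out.println(remove(100, 7757, 37500, false));   // MOVED_TO_TRASH
    }
}
```

The sketch shows why a delete can fail on quota at all: with trash enabled, removing a tree from /tmp is really a move into the user's own quota-limited directory.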

Besides the error message being unfriendly, how should this be handled?



Koji Noguchi added a comment - 18/Jun/09 07:03 PM
1) Tell users to use -Dfs.trash.interval=0 when deleting a large directory.
2) Exclude /user/<username>/.Trash from the quota.
3) Move .Trash out of the /user directory, maybe to /Trash/<username>, and set a different quota.
4) When -rm/-rmr fails because of quota, automatically delete the files instead.
5) Introduce a separate command that does (1). Something like -rmr -skipTrash for force delete.

Raghu Angadi added a comment - 18/Jun/09 07:45 PM
+1 for (5).

Tsz Wo (Nicholas), SZE added a comment - 18/Jun/09 10:10 PM
If we do (5), we still have to do (1), i.e. add a message telling the user to use -skipTrash .

Jakob Homan added a comment - 26/Jun/09 12:10 AM
I'm going to go ahead and implement 5.

Jakob Homan added a comment - 30/Jun/09 01:14 AM
Patch adds a new option to the fsshell rm and rmr commands: -skipTrash, which deletes the specified files immediately instead of moving them to the trash. Extends the trash unit test to verify correct execution. Updates the documentation to reflect the new option. The docs suggest this option as a solution when a directory is over quota.
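
For readers who want a concrete picture of the option handling, here is a minimal Java sketch of how an rm/rmr invocation could be split into the -skipTrash flag and its paths. It is not the code from the attached patch. RmArgsSketch, RmRequest, and parse are hypothetical names, and accepting the flag at any position is an assumption of the sketch, not a statement about how the shell behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class RmArgsSketch {
    /** Parsed form of an rm/rmr invocation: the flag plus the remaining paths. */
    static final class RmRequest {
        final boolean skipTrash;
        final List<String> paths;

        RmRequest(boolean skipTrash, List<String> paths) {
            this.skipTrash = skipTrash;
            this.paths = paths;
        }
    }

    /** Parses the arguments that follow "-rm" or "-rmr", e.g. ["-skipTrash", "/tmp/net2"]. */
    static RmRequest parse(String[] args) {
        boolean skipTrash = false;
        List<String> paths = new ArrayList<>();
        for (String arg : args) {
            if ("-skipTrash".equals(arg)) {
                skipTrash = true;   // delete immediately, bypassing the trash
            } else {
                paths.add(arg);     // everything else is treated as a path
            }
        }
        if (paths.isEmpty()) {
            throw new IllegalArgumentException("rm: at least one path is required");
        }
        return new RmRequest(skipTrash, paths);
    }

    public static void main(String[] args) {
        RmRequest request = parse(new String[] {"-skipTrash", "/tmp/net2"});
        System.out.println("skipTrash=" + request.skipTrash + " paths=" + request.paths);
    }
}
```

Without the flag, the parsed request keeps skipTrash false, so the default behavior of moving files to the trash is unchanged.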

Passes all commons unit tests. Running test patch now. Will post those results when done.


Jakob Homan added a comment - 30/Jun/09 01:14 AM
submitting patch.

Tsz Wo (Nicholas), SZE added a comment - 30/Jun/09 01:20 AM
+1 patch looks good

Jakob Homan added a comment - 30/Jun/09 01:31 AM
Test-patch:
[exec] -1 overall.  
[exec] 
[exec]     +1 @author.  The patch does not contain any @author tags.
[exec] 
[exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
[exec] 
[exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
[exec] 
[exec]     -1 javac.  The applied patch generated 64 javac compiler warnings (more than the trunk's current 124 warnings).
[exec] 
[exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
[exec] 
[exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

This is weird. I've attached the javac warnings it says are new and they have nothing to do with this patch. test-patch must be broken in this regard. I believe the patch is ready to go.


Jakob Homan added a comment - 30/Jun/09 03:24 AM
Attaching two new files:
  • Updated patch. Previous patch missed updating the help text for rmr to include -skipTrash option. No change to actual code.
  • Patch for Hadoop 20 off of the Hadoop-20 branch from svn. Nothing in the patch had to change; only the file locations differ. Code is still the same. Passes unit tests.

Jakob Homan added a comment - 30/Jun/09 05:36 AM
Canceling patch to double check something.

Jakob Homan added a comment - 30/Jun/09 07:12 AM
Ran into a problem that I didn't notice with the HDFS version of TestTrash. I think there's an issue with the FileSystem.listStatus methods between LocalFileSystem and DistributedFileSystem, which I'll look into. In the meantime, modified test so that it doesn't rely on that method and works on both local and distributed file systems.
Will run full test suite tonight, report tomorrow morning. Also, deleted old patches to avoid confusion. New patches for both trunk and v20 should be good to go.

Jakob Homan added a comment - 30/Jun/09 03:02 PM
Updated patches are good to go on all commons unit tests for trunk and all tests for v20. Test-patch is fine except the incorrect javac warnings, which are not related.
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     -1 javac.  The applied patch generated 64 javac compiler warnings (more than the trunk's current 124 warnings).
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

Sanjay Radia added a comment - 30/Jun/09 04:25 PM
Should we consider excluding trash for files deleted from /tmp (i.e., make -skipTrash implicit when deleting from /tmp)?

Jakob Homan added a comment - 30/Jun/09 04:54 PM

> Should we consider excluding trash for files deleted from /tmp (ie make -skipTrash implicit when deleting from /tmp.)?

I'm not a fan of special cases for certain directories, even for /tmp, particularly when we're already straying from the POSIX world with the trash feature. Minimizing surprise seems a good goal, and I'd be very surprised if, being accustomed to explicitly skipping the trash when I want to, I discovered that something I had expected to be trashed had been helpfully nuked by the system.


Tsz Wo (Nicholas), SZE added a comment - 30/Jun/09 06:35 PM
> [exec] -1 javac. The applied patch generated 64 javac compiler warnings (more than the trunk's current 124 warnings).

The patch does not seem to have so many warnings. Filed HADOOP-6122.


Konstantin Shvachko added a comment - 30/Jun/09 07:02 PM
I just committed this. Thank you Jakob.

Jakob Homan added a comment - 30/Jun/09 07:40 PM
added release note.

Hudson added a comment - 01/Jul/09 11:10 AM
Integrated in Hadoop-Common-trunk #13 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/13/)
is going to 0.20.
. Introduce -skipTrash option to rm and rmr. Contributed by Jakob Homan.