Issue Details (XML | Word | Printable)

Key: HADOOP-3187
Type: New Feature New Feature
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Hairong Kuang
Reporter: Robert Chansler
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Quotas for name space management

Created: 05/Apr/08 12:18 AM   Updated: 06/Sep/08 01:03 AM
Return to search
Component/s: None
Affects Version/s: None
Fix Version/s: 0.18.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works quota.patch 2008-05-27 05:41 PM Hairong Kuang 68 kB
Text File Licensed for inclusion in ASF works quota1.patch 2008-05-30 11:10 PM Hairong Kuang 89 kB
Text File Licensed for inclusion in ASF works quota2.patch 2008-06-03 06:58 PM Hairong Kuang 93 kB
Text File Licensed for inclusion in ASF works quota3.patch 2008-06-03 11:31 PM Hairong Kuang 93 kB
Text File Licensed for inclusion in ASF works quota4.patch 2008-06-04 06:13 PM Hairong Kuang 93 kB
PDF File Licensed for inclusion in ASF works QuotasDesign.pdf 2008-05-02 05:54 PM Hairong Kuang 115 kB
PDF File Licensed for inclusion in ASF works QuotasDesign1.pdf 2008-05-27 11:49 PM Hairong Kuang 115 kB
PDF File Licensed for inclusion in ASF works QuotasDesign2.pdf 2008-06-05 10:53 PM Hairong Kuang 116 kB
Issue Links:
Dependants
 
Reference
 

Hadoop Flags: Reviewed
Release Note:
Introduced directory quota as hard limits on the number of names in the tree rooted at that directory. An administrator may set quotas on individual directories explicitly. Newly created directories have no associated quota. File/directory creations fault if the quota would be exceeded. The attempt to set a quota faults if the directory would be in violation of the new quota.
Resolution Date: 05/Jun/08 06:13 AM


 Description  « Hide
Create a quota mechanism for name space management.

Quota set at a directory by super user restricts that number of names in and below that directory.

Quota tested by create() and rename().



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Hairong Kuang added a comment - 02/May/08 05:54 PM
Deisgn document is attached.

Doug Cutting added a comment - 02/May/08 06:16 PM
+1 This sounds like a good design to me. I assume that one could set the quota on the root directory to something based on the amount of memory available on the namenode, as the simplest means of control, right?

Hairong Kuang added a comment - 02/May/08 09:59 PM
> I assume that one could set the quota on the root directory to something based on the amount of memory available on the namenode, as the simplest means of control, right?

Doug, yes, you are right.


Hairong Kuang added a comment - 27/May/08 05:41 PM
This patch has the following major changes:
1. Support two adminstrative commands: setQuota and clrQuota;
2. Add an option -q to the count command that prints the quota of a directory;
3. Change operations mkdirs, create, and rename to respect quota;
4. All directories save its quota to the image file.

dhruba borthakur added a comment - 27/May/08 06:14 PM
The code to load an fsimage form disk to memory should try to avoid object allocation if possible, this speeds up namenode restart times. FSEditLog.readInt() allocates an IntWritbale every time it is invoked. Is this method being used at fsimage loading time?

Hairong Kuang added a comment - 27/May/08 06:40 PM
Yes, instead of allocating a new IntWritale, I should allocate a static IntWritable and reuse this variable every time reading an integer from fsimage.

Hairong Kuang added a comment - 27/May/08 11:49 PM
A slightly modified design document that matches the quota implementation.

Konstantin Shvachko added a comment - 29/May/08 02:13 AM
  1. DFSAdmin.printHelp()
    • summary should be consistent with the detailed usage messages
      <code>
      "\t[-refreshNodes] [-setQuota <dirname> <N>] [-clrQuota <dirname>]\n" +
      </code>
    • the same in printUsage()
    • The documentation for setQuota defines parameters in the reverse order -setQuota N name.
  2. I am not sure, but my first thought was that QuotaExceededException is a partial case of AccessControlException.
    One way to treat it is: if quota is exceeded for a directory then the write access to the directory is forbidden.
  3. File QuotaExceededException.java is missing in the patch.
  4. FSNamesystem.clearQuota() should not be public.
  5. Please use **-comments for new methods (private or not) so that they appear in JavaDoc.
  6. unprotectedRenameTo() catches IOException which is never thrown.
  7. INodeDirectory.replaceChildWithSameName(newNode, oldNode) probably does not need 2 parameters.
    You can just use newNode and search for the old inode using the new one's name.
    You do a binary search anyway. And the method name (ChildWithSameName) reflects the semantics.
  8. INodeDirectoryWithQuota.count is int. This in fact imposes a restriction on the number of files in hdfs compared to current implementation.
    The restriction is imposed by making rootDir a INodeDirectoryWithQuota. The quota is not set for the root by default,
    but the count is an integer.
    Right now our name-node cannot support Integer.MAX_VALUE number of files, but may be in the future it will.
    I think it is ok to have int count for directories that have their quotas set, and I think it is all right to restrict
    the quota values (and therefore the count) by Integer.MAX_VALUE. But regular directories and especially the root
    should be able to have more entries.
  9. The same with ContentSummary. fileCount and directoryCount fields should remain long imo, while quota can be int.
  10. This is not introduced by your patch but could you please remove redundant imports of java.io.DataInput and DataOutput
  11. When creating files and directories you first increment counters for all directories with quotas then actually create
    a new directory entry at the specified path. You restore the counters for the directories only if one of the quotas
    on the path happens to exceed the high mark. If something happens before file or directory is actually
    added but after the counters are incremented the quotas will be wrong. I think it would be safer to
    1. verify that no quotas are exceeded
    2. add new directory entry
    3. increment counters for directories with quotas on the path.
      This will require splitting the updateCount() into 2 methods: one will verify quotas and and the other will increment the counters.

Hairong Kuang added a comment - 30/May/08 06:31 PM
Konstantin, thanks a lot for your comments. I agree with most of them except for
Comment 2: AccessControlException concerns about user identity. But QuotaExceededException has nothing to do with users and groups. So I would prefer to make it be an IOException.
Comments 8, 9: Currently int would be enough for counting namespace size. But if we concern about the backward compatibility and future support, I will make both quota and count to be long.
Comments 11: I prefer to make the checking and updating count to be an atomic operation in stead of breaking it into two operations. Even with your approach, there is a risk of inconsistency if there is a runtime exception in between the tree update and the count update.

Hairong Kuang added a comment - 30/May/08 11:10 PM
This patch does the followings:
1. incorporate Konstantin and Dhruba's comments;
2. allows globs and multiple directories as the arguments to setQuota and clrQuota commands;
3. has a unit test TestQuota.

Tsz Wo (Nicholas), SZE added a comment - 03/Jun/08 02:14 AM
I like the idea of defining an abstract class for all shell commands. Not able to review the whole patch yet. Here are some comments:
  • It is good to move CommandFormat to fs.shell package
  • Instead of initializing Command.args in subclasses, it is better to initialize in Command constructor.
  • In Command, the default constructor can be removed. Members like fs and args can be declared as final.
  • ContentSummary.HEADER should be final. This is probably my fault when I introduced it.
  • For parsing String to get the first line, it is more efficient to use substring(...) than split(...)
    int eol = s.indexOf('\n');
    String firstline = eol<0? s: s.substring(0, eol);
  • In QuotaExceededException, constructors QuotaExceededException(String msg) and QuotaExceededException(String path, long quota, long count) are not used anywhere. I suggest to remove them. When creating an instance by QuotaExceededException(long quota, long count), the error message does not look nice since path=="". QuotaException sounds better to me and it includes all quota related issues.
  • INodeDirectoryWithQuota should be defined in a new file INodeDirectoryWithQuota.java
  • verifyQuota(long quota, long count) should be static.
  • When setting quota, it is good to check whether the new quota is a positive integer. Right now the new quota will be compared with the count and if the new quota <0, a QuotaExceededException will be thrown after a long calculation.

Hadoop QA added a comment - 03/Jun/08 04:24 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383155/quota1.patch
against trunk revision 662569.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 24 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 412 javac compiler warnings (more than the trunk's current 410 warnings).

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2541/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2541/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2541/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2541/console

This message is automatically generated.


Hairong Kuang added a comment - 03/Jun/08 06:58 PM
This patch incorporates most of Nicholas' comments.

Tsz Wo (Nicholas), SZE added a comment - 03/Jun/08 11:01 PM
Two minor comments to the new patch. Otherwise, everything looks good to me.
  • We don't need readLongWritable(DataInputStream in), just call in.readLong().
  • In unprotectedClearQuota, I think "if (targetNode.getQuota() >= 0) {" should be "if (targetNode instanceof INodeDirectoryWithQuota) {". The assert followed can be removed.

Hairong Kuang added a comment - 03/Jun/08 11:31 PM
Long was saved to the disk using the serialization method in LongWritable so we need to use the deserialization method in LongWritable to read it out.

I incorporated the second comment in the new patch.


Hadoop QA added a comment - 04/Jun/08 12:26 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383338/quota3.patch
against trunk revision 662913.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 27 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2560/console

This message is automatically generated.


Konstantin Shvachko added a comment - 04/Jun/08 04:35 AM
  • FSDirectory.removeChild() should throw QuotaExceededException rather than IOException
  • could you please add comments to the ignored catch IOException esp, in FSDirectory.unprotectedRenameTo().
    It took me a while to realize that you just postpone processing of this till the next if statement.
    The comment can say something like // srcChild == null; goto if statement.
  • FSDirectory.updateCount() is a confusing name. Can we rename it to updateSiblingCount() or updateCountOfChildren().
  • I think INodeDirectoryWithQuota should stay inside INode.java. Otherwise all other subclasses should be factored out too.
  • Public class QuotaExceededException needs JavaDoc comments.
  • As long as the external API (ContentSummary) has long file counters I am fine with the implementation that has integer counters.
    This makes it an implementation restriction (changeable on demand) rather than a global file system restriction.

The patch looks good.


Hadoop QA added a comment - 04/Jun/08 01:30 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383338/quota3.patch
against trunk revision 662976.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 27 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 411 javac compiler warnings (more than the trunk's current 409 warnings).

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2569/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2569/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2569/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2569/console

This message is automatically generated.


Hairong Kuang added a comment - 04/Jun/08 06:03 PM
1, FSDirectory.updateCount() is changed to be updateCountForInodeWithQuota.
2. I still keep quota & count to be long. So in the future when dfs grows, we do not need to make any change.
The patch Incorporated rest of Konstantin's comments.

Tsz Wo (Nicholas), SZE added a comment - 04/Jun/08 10:56 PM - edited
> I think INodeDirectoryWithQuota should stay inside INode.java. Otherwise all other subclasses should be factored out too.

Yes, all other subclasses should be factored out too. We should fix it in HADOOP-1826.

+1 patch looks good


Hadoop QA added a comment - 05/Jun/08 12:21 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383403/quota4.patch
against trunk revision 663370.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 27 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 449 javac compiler warnings (more than the trunk's current 447 warnings).

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

-1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2578/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2578/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2578/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2578/console

This message is automatically generated.


Konstantin Shvachko added a comment - 05/Jun/08 12:25 AM - edited
> 2. I still keep quota & count to be long.
Yes keep them long, that is what I meant to say.

The contrib test failure does not seem to be related to the patch, the previous build failed the same test.

+1


Hairong Kuang added a comment - 05/Jun/08 06:13 AM
The failed tests exist in the trunk. They does not seem to be related to the patch. The increased 2 javac warnings are caused by using UTF8 to save path names into the edit log.

I just committed this.


Devaraj Das added a comment - 05/Jun/08 07:28 AM
By the way, I reverted the patch and applied it again since some files were missing in the first commit and the trunk build was broken due to the missing files.

Hairong Kuang added a comment - 05/Jun/08 10:53 PM
Updated design document.