|
[
Permlink
| « Hide
]
Robert Chansler added a comment - 12/Aug/08 07:53 PM
Draft user guide.
Patch for diskspace quotas is attached.
It works very similar to name space quotas implemented in Description of "Space quotas" in the admin guide attached pretty much serves as the spec. DFSClient is modified so that it throws the actual exception (including QuotaExceededException) when these exception occurs in other threads (most of the time). TestQuota.java tests space quotas as well. Currently there are differences in dfsadmin command names. will fix them. User guide says : "A quota of zero forces the directory tree to remain empty of files."
A directory with a space quota of '0' bytes can have zero length files (though setting the quota to 0 is not actually allowed). Somewhat more subtle than R's comment, it would not be possible to create a file in a directory with zero quota as creation requires that a block's-worth of quota is available. But one might think of setting a zero quota on a directory with only zero-length files. In general, new file creations can be prevented by setting a directory's byte quota to the sum of existing file lengths, or the name quota to the number of existing files and directories.
Also, did you want to include the updated documentation in this patch?
Actually creating a file does not require block allocations. So space quota will be not be checked. It is possible to do dfs.create(); dfs.close(), even in a directory that reached its space quota. Is that what we want? > Also, did you want to include the updated documentation in this patch? sure, but not required in this jira. Irrespective of whether it is possible or not, I think we need clarify/decide the policy : Should creating an empty file take up any space quota? My vote is that creating a empty file does not take any disk-quota. (However, it does take up a namespace quota).
Sorry, I confused the issue.
Create should not require any space quota. The quota should be checked only when a new block is requested. I'll attach a new version of the guide with the clarification! Corrects a grammar mistake and clarifies that file creation does not require space quota.
Updated patch resolves a conflict with a recent commit.
Thanks for the review Konstantin.
Regd (8), (9) etc : Could you suggest a value for quota that implies 'do not modify quota'. Updated patch incorporates detailed review from Konstantin.
> 8 , 9 : We use Long.MIN_VALUE to indicate "don't modify". This reduces quite a bit of similar looking code.
You could use the wrapper class Long and use null for "don't modify". i.e. private void unprotectedSetQuota(String src, Long nsQuota, Long dsQuota)
+1 the rest looks good to me. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389870/HADOOP-3938.patch against trunk revision 694562. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 22 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3257/testReport/ This message is automatically generated. Attached patch does the following :
Regd Nicholas' comment : Using Long object also works. Not sure if there is any real difference. From the user spec:
"Quotas are persistent with the fsimage. When starting, if the fsimage is immediately in violation of a quota (perhaps the fsimage was surreptitiously modified), the startup operation fails with an error report" So, how as an administrator could I ever fix such a problem? Are there administration commands that are going to work directly on the fsimage without a namenode? This seems like a very poor requirement. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390029/HADOOP-3938.patch against trunk revision 695690. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 22 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3265/testReport/ This message is automatically generated.
my impression as well. I am not very sure why that was a requirement. Imo, you should be able to start name-node with quotas turned off (in config) and then be able to correct quotas for the faulty directory.
Does the patch allow that? We should also check namespace quotas for the same problem. That's weird to set quotas with the configuration turned off? I hope this is implemented as Konstantin said, but some unit tests for this use case would probably help future implementors
I'm going to jump on the pile and ask that a test plan be included for this feature.
I don't see any global conf to disable quota restrictions. This is the case before and after this patch. If the image is wrong for some reason, then it won't start.. it looks like.
> I'm going to jump on the pile and ask that a test plan be included for this feature.
It is the same as in On second thought my proposal is not good since it is exactly the way to produce such incorrect state.
Instead, we should just remove the restriction and let the server start with a warning. A unit test would be hard to write for the case since there are no valid ways to reproduce the condition. I'm guilty of having caused the confusion.
In any case, the consensus seems to be for policy to be the same for both space and name quotas, and that the proper policy is to log any violations that are observed at start up, but to resume normal operation. I reviewed the patch, and the offending words about not starting are not in the new documentation. The new documentation as written only requires that the quota violation be logged.
Konstantin said
> A unit test would be hard to write for the case since there are no valid ways to reproduce the condition. you could a functional test with a vmware/xen image, though it would take a lot of work. The alternate tactic is to have a mock implementation of the code to determine disk space use, and simulate failures when a node comes up. We do have a test that uses a fixed NameNode image (used for testing upgrade from previous versions). It takes work to maintain as well. The quota violation can happen only with software bug and it is not fatal, I don't think it is a must to have a unit test for it in this jira. I will manually test of course.
Steve, what I meant is that you cannot (naturally, using hdfs methods) create an image which violates a quota.
You can obviously write an arbitrary number instead of the existing quota value directly in the fsimage file if you know where, but this is not a "valid" way. BTW, this issue is about hdfs (distributed) space quotas, not sure that determining "disk space use" helps a lot. Updated patch has following changes over the previous patch :
The handling of quota violations are tested by running a modified NameNode that does not generate a QuotaExceededException. will have another comment on this. Konstantin, could you take another look at the patch? thanks.
tes-patch output from my machine :
[exec] +1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] +1 tests included. The patch appears to include 22 new or modified tests.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
Patch updated : minor change for INodeDirectoryWithQuota.setQuota(). It verifies only if the new quota is more restrictive.
Thanks Konstantin. Updated patch modifies setQuota() as Konstantin suggested. It looks better now. setQuota() does verifies only the the quota that is modified. So if space quota is not modified then it will not be checked. This is essential to deal with problems in quota management (previous run of HDFS).
+1 this looks good now.
How to test with quota violations in fsimage :
===================================
Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/
Attaching TestPlan for Space Quota feature.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||