Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Introduced byte space quotas for directories. The count shell command was modified to report both name and byte quotas.

      Description

      Directory quotas for bytes limit the number of bytes used by files in and below the directory. Operation is independent of name quotas (HADOOP-3187), but the implementation is parallel. Each file is charged according to its length multiplied by its intended replication factor.
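As a rough illustration of the charging rule described above, the following sketch multiplies a file's length by its intended replication factor. The class and method names are hypothetical and are not taken from the patch.

```java
/** Hypothetical helper illustrating the charging rule; not part of the HADOOP-3938 patch. */
final class SpaceCharge {
    private SpaceCharge() {}

    /** Bytes charged against a space quota: file length times intended replication. */
    static long charged(long fileLength, short replication) {
        // The short is widened to long, so the product is computed in 64 bits.
        return fileLength * replication;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        // A 1 GiB file with intended replication 3 is charged 3 GiB.
        System.out.println(charged(oneGiB, (short) 3)); // prints 3221225472
    }
}
```

Under this rule, lowering or raising a file's intended replication changes what it is charged even though its length stays the same.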

      1. HADOOP-3938.patch
        103 kB
        Raghu Angadi
      2. HADOOP-3938.patch
        103 kB
        Raghu Angadi
      3. HADOOP-3938.patch
        103 kB
        Raghu Angadi
      4. HADOOP-3938.patch
        101 kB
        Raghu Angadi
      5. HADOOP-3938.patch
        91 kB
        Raghu Angadi
      6. HADOOP-3938.patch
        88 kB
        Raghu Angadi
      7. HADOOP-3938.patch
        89 kB
        Raghu Angadi
      8. HADOOP-3938.patch
        90 kB
        Raghu Angadi
      9. HADOOP-3938.patch
        89 kB
        Raghu Angadi
      10. hdfs_quota_admin_guide.pdf
        8 kB
        Robert Chansler
      11. hdfs_quota_admin_guide.xml
        5 kB
        Robert Chansler
      12. SpaceQuota.html
        17 kB
        Ravi Phulari

          Activity

          Robert Chansler added a comment -

          Draft user guide.

          Raghu Angadi added a comment -

          Patch for diskspace quotas is attached.

          It works very similarly to the name space quotas implemented in HADOOP-3187, and the design document attached applies to this jira as well.

          The description of "Space quotas" in the attached admin guide pretty much serves as the spec.

          DFSClient is modified so that it throws the actual exception (including QuotaExceededException) when these exceptions occur in other threads (most of the time).

          TestQuota.java tests space quotas as well.

          Currently there are differences in the dfsadmin command names. Will fix them.

          Raghu Angadi added a comment -

          User guide says : "A quota of zero forces the directory tree to remain empty of files."

          A directory with a space quota of '0' bytes can have zero length files (though setting the quota to 0 is not actually allowed).

          Robert Chansler added a comment -

          Somewhat more subtle than R's comment, it would not be possible to create a file in a directory with zero quota as creation requires that a block's-worth of quota is available. But one might think of setting a zero quota on a directory with only zero-length files. In general, new file creations can be prevented by setting a directory's byte quota to the sum of existing file lengths, or the name quota to the number of existing files and directories.

          Also, did you want to include the updated documentation in this patch?

          Raghu Angadi added a comment -

          > Somewhat more subtle than R's comment, it would not be possible to create a file in a directory with zero quota as creation requires that a block's-worth of quota is available. But one might think of setting a zero quota on a directory with only zero-length files.

          Actually, creating a file does not require block allocations, so the space quota will not be checked. It is possible to do dfs.create(); dfs.close(), even in a directory that has reached its space quota. Is that what we want?

          > Also, did you want to include the updated documentation in this patch?

          Sure, but not required in this jira.

          Irrespective of whether it is possible or not, I think we need to clarify/decide the policy: should creating an empty file take up any space quota?

          dhruba borthakur added a comment -

          My vote is that creating an empty file does not take any disk-quota. (However, it does take up a namespace quota).

          Robert Chansler added a comment -

          Sorry, I confused the issue.

          Create should not require any space quota. The quota should be checked only when a new block is requested.

          I'll attach a new version of the guide with the clarification!
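The policy agreed on above (create is free, the space quota is consulted only when a block is requested) can be sketched with a toy model. Everything here is hypothetical: the class name, the in-memory map, and the exception types are illustrative stand-ins, not the NameNode code, and the sketch charges exactly the requested bytes times replication.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a directory with a space quota. All names are hypothetical. */
final class ToyQuotaDir {
    private final long spaceQuota;   // bytes allowed, counting replication
    private long spaceUsed = 0;
    private final Map<String, Long> files = new HashMap<>();

    ToyQuotaDir(long spaceQuota) {
        this.spaceQuota = spaceQuota;
    }

    /** Creating an empty file consumes no bytes, so the space quota is not consulted. */
    void create(String name) {
        files.put(name, 0L);
    }

    /** The space quota is checked only here, when a new block is requested. */
    void addBlock(String name, long blockBytes, short replication) {
        if (!files.containsKey(name)) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        long delta = blockBytes * replication;
        if (spaceUsed + delta > spaceQuota) {
            throw new IllegalStateException(
                "space quota exceeded: " + (spaceUsed + delta) + " > " + spaceQuota);
        }
        spaceUsed += delta;
        files.merge(name, blockBytes, Long::sum);
    }

    long spaceUsed() {
        return spaceUsed;
    }
}
```

With a quota of 300 bytes, one 100-byte block at replication 3 fills the directory; a later create() still succeeds, while the next addBlock() is rejected.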

          Robert Chansler added a comment -

          Corrects a grammar mistake and clarifies that file creation does not require space quota.

          Raghu Angadi added a comment -

          Updated patch resolves a conflict with a recent commit.

          Raghu Angadi added a comment -

          Patch updated for trunk.

          Konstantin Shvachko added a comment - - edited
          1. It is better to implement
            public ContentSummary(long length, long fileCount, long directoryCount) {
             this(length, fileCount, directoryCount, length, -1);
            }
            
          2. In DFSClient.locateFollowingBlock() you do not need to throw the unwrapped exception in order to check whether it is a RemoteException.
            Also, you do not throw if the remote exception is not one of the 4 types considered.
            try {
             return namenode.addBlock(src, clientName);
            } catch (RemoteException re) {
              IOException ue = re.unwrapRemoteException(FileNotFoundException.class,
                                                        AccessControlException.class,
                                                        QuotaExceededException.class);
              if(re != ue)
                throw ue;
              ue = re.unwrapRemoteException(NotReplicatedYetException.class);
              if(--retries == 0 && re == ue)  // not a NotReplicatedYetException
                throw re;
              LOG.warn("NotReplicatedYetException sleeping " + src + " retries left " + retries);
              try {
                Thread.sleep(sleeptime);
                sleeptime *= 2;
              } catch (InterruptedException ie) {}
            }
            
          3. JavaDoc for QuotaExceededException should reflect new space quota related semantics.
          4. In QuotaExceededException.getMessage() " file count=" instead of " namespace count=" may sound better.
          5. FSDirectory does not need to import StringUtils.
          6. FSDirectory.unprotectedAddFile() should probably not add blocks first and then remove them
            if the space quota is violated. The blocks can be added once the node is successfully created,
            you may need to pass the size of the file to addNode().
          7. numItemsInTree() followed by diskspaceInTree() is called on several occasions: INodeDirectoryWithQuota(), addChild(), removeChild().
            This is very inefficient; the tree traversal should be done only once.
            You can use something like the TwoCounters class but with a more meaningful name, say DirectoryQuota.
            I think numItemsInTree() and diskspaceInTree() should be merged in one method preferably non-recursive.
          8. unprotectedSetQuota() should take 2 longs rather than a long and a boolean
            private void unprotectedSetQuota(String src, long nsQuota, long dsQuota)
            

            as in other places, e.g. addToParent().

          9. I am not sure we need to add an extra pair of set/clear-DiskspaceQuota(); maybe we just need
            an extra parameter in the old setQuota(src, nsQuota, dsQuota).
          10. Should we check space quota when we start an append?
          11. NameNode redundantly imports BlockCommand (not introduced by this patch).
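Item 7 in the review above asks for a single tree traversal that yields both the item count and the disk space. A minimal sketch of that idea follows; the node classes and the TreeCounts holder are hypothetical stand-ins for the INode hierarchy and the suggested DirectoryQuota-style counter, and the traversal shown is recursive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pair of counters filled in by one traversal. */
final class TreeCounts {
    long nsCount;   // number of files and directories in the subtree
    long dsCount;   // bytes consumed, counting replication
}

/** Hypothetical stand-in for an inode. */
abstract class ToyNode {
    /** Adds this subtree's name count and space usage to counts in one pass. */
    abstract void addCounts(TreeCounts counts);
}

final class ToyFile extends ToyNode {
    private final long length;
    private final short replication;

    ToyFile(long length, short replication) {
        this.length = length;
        this.replication = replication;
    }

    @Override
    void addCounts(TreeCounts counts) {
        counts.nsCount += 1;
        counts.dsCount += length * replication;
    }
}

final class ToyDir extends ToyNode {
    private final List<ToyNode> children = new ArrayList<>();

    ToyDir add(ToyNode child) {
        children.add(child);
        return this;
    }

    @Override
    void addCounts(TreeCounts counts) {
        counts.nsCount += 1;               // the directory itself counts as one name
        for (ToyNode child : children) {
            child.addCounts(counts);       // each node is visited exactly once
        }
    }
}
```

Passing one mutable counter object down the tree avoids a second walk for the space total; a non-recursive version would replace the recursion with an explicit stack.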
          Raghu Angadi added a comment -

          Thanks for the review Konstantin.
          Regd (8), (9) etc : Could you suggest a value for quota that implies 'do not modify quota'?

          Raghu Angadi added a comment -

          Updated patch incorporates detailed review from Konstantin.

          • 1, 3, 4, 5, 6, 11 : done
          • 2 : done. The actual changes are slightly different, to keep the number of lines changed to a minimum.
          • 7 : done. It is still recursive. I am not sure if a non-recursive implementation buys anything. We can avoid recursive calls to INodeFile if we want to.
          • 8, 9 : We use Long.MIN_VALUE to indicate "don't modify". This reduces quite a bit of similar looking code.
          • 10 : It is checked eventually inside startFileInternal() if the file being appended has a partially filled block.
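The "don't modify" sentinel mentioned for items 8 and 9 can be sketched as follows. The class, the constant name DONT_MODIFY, and the accessors are assumptions made for illustration; only the use of Long.MIN_VALUE as the sentinel comes from the comment above.

```java
/** Hypothetical holder for the two quotas of one directory. */
final class QuotaHolder {
    /** Assumed constant name; the value follows the comment above. */
    static final long DONT_MODIFY = Long.MIN_VALUE;

    private long nsQuota;
    private long dsQuota;

    QuotaHolder(long nsQuota, long dsQuota) {
        this.nsQuota = nsQuota;
        this.dsQuota = dsQuota;
    }

    /** One setter serves both quotas; the sentinel leaves a quota untouched. */
    void setQuota(long newNsQuota, long newDsQuota) {
        if (newNsQuota != DONT_MODIFY) {
            nsQuota = newNsQuota;
        }
        if (newDsQuota != DONT_MODIFY) {
            dsQuota = newDsQuota;
        }
    }

    long nsQuota() { return nsQuota; }
    long dsQuota() { return dsQuota; }
}
```

A caller that only wants to change the space quota would write something like holder.setQuota(QuotaHolder.DONT_MODIFY, 2000L), which is how a single two-argument method can replace separate set/clear pairs.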
          Tsz Wo Nicholas Sze added a comment -

          > 8 , 9 : We use Long.MIN_VALUE to indicate "don't modify". This reduces quite a bit of similar looking code.

          You could use the wrapper class Long and use null for "don't modify". i.e.

          private void unprotectedSetQuota(String src, Long nsQuota, Long dsQuota)
          
          Konstantin Shvachko added a comment -
          • In FSDirectory.replaceNode() you calculate diskspaceConsumed() even if you do not need to updateDiskspace.

          +1 the rest looks good to me.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12389870/HADOOP-3938.patch
          against trunk revision 694562.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 22 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3257/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3257/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3257/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3257/console

          This message is automatically generated.

          Raghu Angadi added a comment -

          Attached patch does the following :

          • includes updated hdfs_quota_admin_guide.xml from Rob
          • Konstantin's review : invoke diskspaceConsumed() only when required (it is not required only in some error cases).
          • fix findbugs warning. ant test-patch is also run.

          Regd Nicholas' comment : Using Long object also works. Not sure if there is any real difference.

          Pete Wyckoff added a comment -

          From the user spec:

          "Quotas are persistent with the fsimage. When starting, if the fsimage is immediately in violation of a quota (perhaps the fsimage was surreptitiously modified), the startup operation fails with an error report"

          So, how as an administrator could I ever fix such a problem? Are there administration commands that are going to work directly on the fsimage without a namenode? This seems like a very poor requirement.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12390029/HADOOP-3938.patch
          against trunk revision 695690.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 22 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3265/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3265/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3265/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3265/console

          This message is automatically generated.

          Raghu Angadi added a comment -

          "Quotas are persistent with the fsimage. When starting, if the fsimage is immediately in violation of a quota (perhaps the fsimage was surreptitiously modified), the startup operation fails with an error report"

          So, how as an administrator could I ever fix such a problem?

          my impression as well. I am not very sure why that was a requirement.

          Konstantin Shvachko added a comment -

          Imo, you should be able to start name-node with quotas turned off (in config) and then be able to correct quotas for the faulty directory.
          Does the patch allow that?
          We should also check namespace quotas for the same problem.

          Pete Wyckoff added a comment - - edited

          That's weird to set quotas with the configuration turned off? I hope this is implemented as Konstantin said, but some unit tests for this use case would probably help future implementors

          Nigel Daley added a comment -

          I'm going to jump on the pile and ask that a test plan be included for this feature.

          Raghu Angadi added a comment - - edited

          I don't see any global conf to disable quota restrictions. This is the case before and after this patch. If the image is wrong for some reason, then it won't start.. it looks like.

          Raghu Angadi added a comment -

          > I'm going to jump on the pile and ask that a test plan be included for this feature.
          It is the same as in HADOOP-3187. The unit test covers pretty much all of them. Does it sound good enough? Only the scale test is not done.

          Konstantin Shvachko added a comment -

          On second thought my proposal is not good since it is exactly the way to produce such incorrect state.
          Instead, we should just remove the restriction and let the server start with a warning.
          A unit test would be hard to write for the case since there are no valid ways to reproduce the condition.

          Robert Chansler added a comment -

          I'm guilty of having caused the confusion.

          In any case, the consensus seems to be for policy to be the same for both space and name quotas, and that the proper policy is to log any violations that are observed at start up, but to resume normal operation.

          Robert Chansler added a comment -

          I reviewed the patch, and the offending words about not starting are not in the new documentation. The new documentation as written only requires that the quota violation be logged.

          steve_l added a comment -

          Konstantin said
          > A unit test would be hard to write for the case since there are no valid ways to reproduce the condition.

          You could do a functional test with a vmware/xen image, though it would take a lot of work. The alternate tactic is to have a mock implementation of the code that determines disk space use, and simulate failures when a node comes up.

          Raghu Angadi added a comment -

          We do have a test that uses a fixed NameNode image (used for testing upgrade from previous versions). It takes work to maintain as well. The quota violation can happen only with a software bug, and it is not fatal, so I don't think a unit test for it is a must in this jira. I will manually test, of course.

          Konstantin Shvachko added a comment -

          Steve, what I meant is that you cannot (naturally, using hdfs methods) create an image which violates a quota.
          You can obviously write an arbitrary number instead of the existing quota value directly in the fsimage file if you know where, but this is not a "valid" way.
          BTW, this issue is about hdfs (distributed) space quotas, not sure that determining "disk space use" helps a lot.

          Raghu Angadi added a comment -

          Updated patch has the following changes over the previous patch :

          1. Quota violations are tolerated while loading filesystem during start up.
          2. updateCountForINodeWithQuota() called during initialization is rewritten.
          3. updateCountForINodeWithQuota() is now called after loading edits, rather than right after loading fsimage.
          4. Added one more test case to handle the case when quotas specified are too large.
          5. user_guide is slightly updated.

          The handling of quota violations is tested by running a modified NameNode that does not generate a QuotaExceededException. Will have another comment on this.
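The startup policy listed above (tolerate violations found while loading, report them, and carry on) might look roughly like the sketch below. It is only an illustration: the DirUsage record, the check() method, and the warning text are hypothetical, and a negative quota is assumed here to mean that no quota is set.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical startup check that reports quota violations instead of failing. */
final class StartupQuotaCheck {
    private StartupQuotaCheck() {}

    /** A directory's space quota and the usage recomputed after image and edits are loaded. */
    static final class DirUsage {
        final String path;
        final long dsQuota;
        final long dsUsed;

        DirUsage(String path, long dsQuota, long dsUsed) {
            this.path = path;
            this.dsQuota = dsQuota;
            this.dsUsed = dsUsed;
        }
    }

    /** Returns one warning per directory over quota; never throws for a violation. */
    static List<String> check(List<DirUsage> dirs) {
        List<String> warnings = new ArrayList<>();
        for (DirUsage d : dirs) {
            // Assumption of this sketch: a negative quota means "no quota set".
            if (d.dsQuota >= 0 && d.dsUsed > d.dsQuota) {
                warnings.add("Quota violation in image for " + d.path
                    + " : quota=" + d.dsQuota + " used=" + d.dsUsed);
            }
        }
        return warnings;
    }
}
```

The caller would write the returned warnings to the log and continue starting up, which matches the consensus in the thread that a violated quota in the image should not keep the NameNode from starting.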

          Raghu Angadi added a comment -

          Konstantin, could you take another look at the patch? Thanks.

          Raghu Angadi added a comment -

          test-patch output from my machine:

               [exec] +1 overall.
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec]     +1 tests included.  The patch appears to include 22 new or modified tests.
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          
          Raghu Angadi added a comment -

          Patch updated: a minor change to INodeDirectoryWithQuota.setQuota(). It now verifies the quota only if the new quota is more restrictive.

          Raghu Angadi added a comment -

          Thanks, Konstantin. The updated patch modifies setQuota() as Konstantin suggested. It looks better now. setQuota() verifies only the quota that is modified, so if the space quota is not modified it will not be checked. This is essential for dealing with problems in quota management left over from a previous run of HDFS.
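
A minimal sketch of the "verify only the quota that is modified" rule follows. It is an illustration under stated assumptions, not the patch's INodeDirectoryWithQuota: the class names are hypothetical, a quota counts as "modified" when its new value differs from the current one, -1 is assumed to mean "no quota", and the exception is an unchecked stand-in for QuotaExceededException.

```java
// Hypothetical stand-in for QuotaExceededException; unchecked here only to
// keep the sketch short.
class QuotaExceededSketch extends RuntimeException {
    QuotaExceededSketch(String message) {
        super(message);
    }
}

// Hypothetical, simplified directory with both quotas; not the patch's class.
class DirWithQuotaSketch {
    long nsQuota = -1;   // -1 is assumed to mean "no quota"
    long dsQuota = -1;
    long nsCount;        // names currently used under this directory
    long dsCount;        // bytes currently charged under this directory

    // Verifies only the quota whose value actually changes, so a violation
    // of the untouched quota does not block the update.
    void setQuota(long newNsQuota, long newDsQuota) {
        if (newNsQuota != nsQuota) {
            verify("name", newNsQuota, nsCount);
        }
        if (newDsQuota != dsQuota) {
            verify("space", newDsQuota, dsCount);
        }
        nsQuota = newNsQuota;
        dsQuota = newDsQuota;
    }

    private static void verify(String kind, long quota, long count) {
        if (quota >= 0 && count > quota) {
            throw new QuotaExceededSketch(
                    kind + " quota " + quota + " is below current usage " + count);
        }
    }

    public static void main(String[] args) {
        DirWithQuotaSketch dir = new DirWithQuotaSketch();
        dir.nsQuota = 3;
        dir.nsCount = 3;
        dir.dsQuota = 3072;
        dir.dsCount = 5L * 1024 * 1024;   // space quota already violated
        dir.setQuota(10, 3072);           // allowed: space quota is untouched
        System.out.println("name quota is now " + dir.nsQuota);
        try {
            dir.setQuota(10, 4096);       // rejected: still below current usage
        } catch (QuotaExceededSketch e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this rule an administrator can still raise a name quota on a directory whose space quota was already violated by an earlier run, instead of being blocked by the unrelated violation.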

          Konstantin Shvachko added a comment -

          +1 this looks good now.

          Raghu Angadi added a comment -

          How to test with quota violations in fsimage:
          ===================================

          • Comment out the two places where QuotaExceededException is thrown in INodeDirectoryWithQuota.
          • Run a fresh HDFS and create a few files and directories such that they exceed the quotas for a few directories. For example:
            bin/hadoop fs -mkdir quota3-3k
            bin/hadoop dfsadmin -setQuota 3 quota3-3k
            bin/hadoop fs -mkdir quota3-3k/emptyDir
            bin/hadoop fs -put /dev/null quota3-3k/emptyFile
            bin/hadoop dfsadmin -setSpaceQuota 3072 quota3-3k
            bin/hadoop fs -put ~/cws/tmp/5Mb quota3-3k/
            
            bin/hadoop fs -mkdir quota3-3k/quota2
            bin/hadoop dfsadmin -setQuota 2 quota3-3k/quota2
            bin/hadoop fs -mkdir quota3-3k/quota2/emptyDir
            bin/hadoop fs -put /dev/null quota3-3k/quota2/emptyFile
            
            bin/hadoop fs -mkdir quota3-3k/quota-5k
            bin/hadoop dfsadmin -setSpaceQuota 5120 quota3-3k/quota-5k
            bin/hadoop fs -put ~/cws/tmp/5Mb quota3-3k/quota-5k
            bin/hadoop fs -mkdir quota3-3k/quota-5k/emptyDir
            
          • Stop the cluster and start it again.
          • The cluster should restart fine. You should see some warnings for quota3-3k etc. in the NameNode log.
          • Adding any more files or data to such directories should fail.
          • You should be able to either increase the quotas or delete some files to satisfy the requirements.
          • Once these are fixed, there should not be any more warnings when NameNode restarts.
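
To see why the commands above leave the space quotas violated, recall from the issue description that each file is charged its length multiplied by its intended replication factor. The sketch below works through that arithmetic. SpaceChargeSketch is a hypothetical helper, and two values are assumptions: that the "5Mb" file is 5 * 1024 * 1024 bytes long and that the replication factor is 3.

```java
// Hypothetical helper for the arithmetic only; not part of HDFS.
class SpaceChargeSketch {
    // A file is charged its length multiplied by its intended replication factor.
    static long charged(long lengthBytes, int replication) {
        return lengthBytes * replication;
    }

    // A negative quota is assumed to mean "no quota" in this sketch.
    static boolean violates(long quotaBytes, long usedBytes) {
        return quotaBytes >= 0 && usedBytes > quotaBytes;
    }

    public static void main(String[] args) {
        long fileLength = 5L * 1024 * 1024;   // assumed size of the "5Mb" file
        int replication = 3;                  // assumed replication factor
        long used = charged(fileLength, replication);
        System.out.println("charged bytes: " + used);
        System.out.println("violates 3072-byte quota: " + violates(3072, used));
        System.out.println("violates 5120-byte quota: " + violates(5120, used));
    }
}
```

Under these assumptions the file is charged 15728640 bytes, far above both the 3072-byte and the 5120-byte quotas, so both directories end up over quota once the exception has been commented out.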
          Raghu Angadi added a comment -

          I just committed this.

          Hudson added a comment -

          Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/ )

          Ravi Phulari added a comment -

          Attaching TestPlan for Space Quota feature.


            People

            • Assignee:
              Raghu Angadi
              Reporter:
              Robert Chansler
            • Votes:
              0
              Watchers:
              3
