Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.203.0, 0.23.0, 1.0.2
    • Fix Version/s: 1.0.3
    • Component/s: namenode
    • Labels:
      None

      Description

      It appears that there is a condition under which an HDFS directory with a space quota set can reach a state where the cached size for the directory permanently differs from the computed value. When this happens, the following command:

      hadoop fs -count -q /tmp/quota-test
      

      results in the following output in the NameNode logs:

      WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Inconsistent diskspace for directory quota-test. Cached: 6000 Computed: 6072
      

      I've observed both transient and persistent instances of this happening. In the transient instances this warning goes away, but in the persistent instances every invocation of the fs -count -q command yields the above warning.

      I've seen instances where the actual disk usage of a directory is 25% of the cached value in INodeDirectory, which creates problems since the quota code uses this cached value to determine whether block write requests are permitted.
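
To make the cached-versus-computed comparison concrete, here is a minimal sketch in Java. It is a toy model, not NameNode code: the class DiskspaceCheckSketch, its FileEntry type, and the method names are all hypothetical, and it assumes consumed space is file length times replication factor, summed over the directory. The 2024-byte file is chosen only so the numbers match the warning quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; this is NOT the NameNode's INodeDirectory code.
class DiskspaceCheckSketch {
    // One file under the quota directory: length in bytes and replication factor.
    static final class FileEntry {
        final long length;
        final short replication;

        FileEntry(long length, short replication) {
            this.length = length;
            this.replication = replication;
        }
    }

    // Recompute consumed disk space from scratch (assumption: length * replication, summed).
    static long computeDiskspace(List<FileEntry> files) {
        long total = 0;
        for (FileEntry f : files) {
            total += f.length * f.replication;
        }
        return total;
    }

    // Return a message shaped like the quoted warning, or null when the values agree.
    static String check(String dir, long cached, List<FileEntry> files) {
        long computed = computeDiskspace(files);
        if (cached == computed) {
            return null;
        }
        return "Inconsistent diskspace for directory " + dir
            + ". Cached: " + cached + " Computed: " + computed;
    }

    public static void main(String[] args) {
        List<FileEntry> files = new ArrayList<>();
        files.add(new FileEntry(2024, (short) 3)); // 2024 * 3 = 6072 bytes consumed
        System.out.println(check("quota-test", 6000, files));
    }
}
```

A quota check that trusts only the cached figure will admit or reject writes based on a value that may no longer match what is actually stored, which is the failure mode described above.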

      This isn't easy to reproduce - I am able to (inconsistently) get HDFS into this state with a simple program which:

      1. Writes files into HDFS
      2. Removes all files created in step 1 when a DSQuotaExceededException is encountered
      3. Repeats step 1

      I'm going to try to come up with a more repeatable test case to reproduce this issue.

      1. QuotaTestSimple.java
        4 kB
        Alex Holmes
      2. hdfs-3061-branch-1.patch
        3 kB
        Kihwal Lee

          Activity

          Alex Holmes added a comment -

          Sample Java class which may eventually reproduce the problem.

          Todd Lipcon added a comment -

          Possible this is HDFS-1189 or HDFS-1377? Can you repro on 0.20.204 or later?

          Alex Holmes added a comment -

          We're running on 0.20.203 with HDFS-1377 and HDFS-2053 backported - sorry, I should have mentioned that in the description. Would HDFS-1189 apply, since the test case which (sometimes) reproduces this condition doesn't clear the quota?

          Kihwal Lee added a comment -

          We are seeing this in 1.0.

          Kihwal Lee added a comment -

          One example:

           
          012-04-18 00:06:04,246 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Inconsistent diskspace for directory xxxxx. Cached: 7877252111770 Computed: 4159086535
          

          The delta grows every day.

          Kihwal Lee added a comment -

          We (mostly Koji) have tracked it down to abandonBlock(). Branch-1 is missing HDFS-1487.
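
One way to picture how a missed update in abandonBlock() could make the delta grow: the sketch below assumes a block is charged to the cached counter when it is allocated and that the charge is not refunded when the block is abandoned. This is an illustrative toy model under that assumption, not the branch-1 code or the HDFS-1487 patch; QuotaDriftSketch and all of its members are hypothetical names.

```java
// Hypothetical toy model of a quota directory's counters; names are illustrative only.
class QuotaDriftSketch {
    private final long quota;
    private final boolean refundOnAbandon;
    private long cached;   // running counter that the quota check consults
    private long computed; // ground truth: bytes held by blocks that still exist

    QuotaDriftSketch(long quota, boolean refundOnAbandon) {
        this.quota = quota;
        this.refundOnAbandon = refundOnAbandon;
    }

    // Charge a new block against the quota; reject it if the cached total would exceed the limit.
    void allocateBlock(long bytes) {
        if (cached + bytes > quota) {
            throw new IllegalStateException("quota exceeded");
        }
        cached += bytes;
        computed += bytes;
    }

    // The client gives up on a block it was allocated.
    void abandonBlock(long bytes) {
        computed -= bytes;
        if (refundOnAbandon) {
            cached -= bytes; // the assumed buggy variant skips this refund
        }
    }

    long cached() { return cached; }

    long computed() { return computed; }

    // Allocate and abandon a 64-byte block repeatedly; return cached minus computed.
    static long driftAfter(int rounds, boolean refundOnAbandon) {
        QuotaDriftSketch dir = new QuotaDriftSketch(1L << 40, refundOnAbandon);
        for (int i = 0; i < rounds; i++) {
            dir.allocateBlock(64);
            dir.abandonBlock(64);
        }
        return dir.cached() - dir.computed();
    }

    public static void main(String[] args) {
        System.out.println(driftAfter(100, false)); // prints 6400: the delta grows each round
        System.out.println(driftAfter(100, true));  // prints 0: the refund keeps the counters in step
    }
}
```

With refundOnAbandon set to false, the cached value climbs with every abandoned block while the computed value stays at zero, so allocateBlock() eventually rejects writes to a directory that holds no data.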

          Kihwal Lee added a comment -

          Since HDFS-1487 is closed, I will track the work in this jira.

          Kihwal Lee added a comment -

          0.23.3, 2.0 and trunk have the fix. Only branch-1 is missing it.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525690/hdfs-3061-branch-1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2380//console

          This message is automatically generated.

          Matt Foley added a comment -

          +1. Looks like the correct port of HDFS-1487 to branch-1. Committing to branch-1 and branch-1.0. Thanks, Kihwal!

          Matt Foley added a comment -

          Closed upon release of Hadoop-1.0.3.

          Eli Collins added a comment -

          Updating summary to reflect that this patch is the HDFS-1487 patch applied to branch-1.


            People

            • Assignee:
              Kihwal Lee
              Reporter:
              Alex Holmes
            • Votes:
              0
              Watchers:
              9
