Hadoop HDFS / HDFS-1258

Clearing namespace quota on "/" corrupts FS image

    Details

    • Hadoop Flags:
      Reviewed

      Description

      The HDFS root directory starts out with a default namespace quota of Integer.MAX_VALUE. If you clear this quota (using "hadoop dfsadmin -clrQuota /"), the fsimage gets corrupted immediately. Subsequent 2NN rolls will fail, and the NN will not come back up from a restart.

      1. clear-quota-0.21.patch
        2 kB
        Aaron T. Myers
      2. clear-quota-0.20.patch
        2 kB
        Aaron T. Myers
      3. clear-quota.patch
        2 kB
        Aaron T. Myers
      4. clear-quota.patch
        2 kB
        Aaron T. Myers

          Activity

          Allen Wittenauer added a comment -

          If someone validates this, we should probably mark this as a blocker for 0.21.

          Tsz Wo Nicholas Sze added a comment -

          Let's mark this as a blocker. We have to resolve this before new releases anyway.

          Tsz Wo Nicholas Sze added a comment -

          Unfortunately, this seems true.

          1. Start HDFS.
          2. Put a file.
          3. Clear the namespace quota on "/" (hadoop dfsadmin -clrQuota /).
          4. Restart the NameNode.
            2010-06-22 22:22:16,337 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
            java.io.EOFException
                    at java.io.DataInputStream.readFully(DataInputStream.java:180)
                    at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.readBytes(FSImage.java:1588)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.readINodeUnderConstruction(FSImage.java:1227)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFilesUnderConstruction(FSImage.java:1205)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:959)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:807)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
                    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
                    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:303)
                    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
            2010-06-22 22:22:16,338 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
            2010-06-22 22:22:16,339 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
                    at java.io.DataInputStream.readFully(DataInputStream.java:180)
                    at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.readBytes(FSImage.java:1588)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.readINodeUnderConstruction(FSImage.java:1227)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFilesUnderConstruction(FSImage.java:1205)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:959)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:807)
                    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
                    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
                    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:303)
                    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
                    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
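
The trace above ends in DataInputStream.readFully, which throws EOFException when the stream runs out before the requested number of bytes has been read. The following is a minimal sketch of that behavior only, assuming a hypothetical length-prefixed record; ShortReadSketch, writeRecord, and readRecord are illustrative names and do not reflect the real fsimage layout or the cause of the corruption, which this thread does not describe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical illustration; not the HDFS image format or loader.
class ShortReadSketch {

    // Writes a 2-byte length header followed by the payload.
    // declaredLength may deliberately disagree with payload.length.
    static byte[] writeRecord(int declaredLength, byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(declaredLength);
            out.write(payload);
        }
        return bytes.toByteArray();
    }

    // Reads the header, then insists on exactly that many payload bytes.
    static byte[] readRecord(byte[] image) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(image))) {
            int length = in.readUnsignedShort();
            byte[] payload = new byte[length];
            in.readFully(payload); // EOFException if fewer than 'length' bytes remain
            return payload;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] consistent = writeRecord(3, new byte[] {1, 2, 3});
        System.out.println("consistent record: " + readRecord(consistent).length + " bytes");

        // Header claims 10 bytes but only 3 follow, so the reader runs off the end.
        byte[] inconsistent = writeRecord(10, new byte[] {1, 2, 3});
        try {
            readRecord(inconsistent);
        } catch (EOFException e) {
            System.out.println("inconsistent record: EOFException");
        }
    }
}
```

A reader that trusts on-disk lengths in this way has no means of recovering once what was written disagrees with what it expects, which is consistent with the NameNode aborting startup here.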
            
          Tom White added a comment -

          This is serious, for sure, but I think we could release 0.21.0 without it. The point of 0.21.0 is to exercise the release process, and make a Hadoop release available to people who want to try newer features and help stabilize post-20 Hadoop, so that later 0.21 releases and the 0.22 release in November will be more widely usable. 0.21 already has known issues (e.g. HDFS-875), so this one too could be called out in the release notes, so folks are made aware of its seriousness.

          Aaron T. Myers added a comment -

          This patch doesn't actually solve the root problem of clearing the root directory quota causing a corrupt FS image, but it will prevent people from accidentally borking their file system in the meantime, until that gets fixed.
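
A minimal sketch of the kind of guard described here, not the attached patch itself. RootQuotaGuardSketch, Dir, setNamespaceQuota, and the QUOTA_RESET value are hypothetical stand-ins chosen for illustration; the real HDFS classes and the exact check may differ.

```java
// Hypothetical stand-ins; not the real HDFS classes or the attached patch.
class RootQuotaGuardSketch {

    // Sentinel meaning "clear the quota"; the value is an assumption for this sketch.
    static final long QUOTA_RESET = -1L;

    // Simplified directory entry carrying only a path and a namespace quota.
    static final class Dir {
        final String path;
        long nsQuota;

        Dir(String path, long nsQuota) {
            this.path = path;
            this.nsQuota = nsQuota;
        }

        boolean isRoot() {
            return "/".equals(path);
        }
    }

    // Sets or clears the namespace quota, refusing to clear it on the root.
    static void setNamespaceQuota(Dir dir, long nsQuota) {
        if (dir.isRoot() && nsQuota == QUOTA_RESET) {
            throw new IllegalArgumentException("Cannot clear namespace quota on root.");
        }
        dir.nsQuota = nsQuota;
    }

    public static void main(String[] args) {
        Dir root = new Dir("/", Integer.MAX_VALUE);
        Dir user = new Dir("/user", 1000L);

        setNamespaceQuota(user, QUOTA_RESET); // clearing on a non-root directory is allowed
        setNamespaceQuota(root, 5000L);       // changing the root quota is allowed

        try {
            setNamespaceQuota(root, QUOTA_RESET); // clearing on root is rejected
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the request up front keeps the root quota intact, so the operation that triggers the corruption can no longer be issued, even though the underlying image-handling problem remains.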

          Todd Lipcon added a comment -

          Patch looks good. Can you reupload it with the --no-prefix option to git diff, and then change to "Patch Available" status so the Hudson QA bot runs?

          Aaron T. Myers added a comment -

          Same patch, but with the --no-prefix option to git diff.

          Aaron T. Myers added a comment -

          Patch prevents users from clearing namespace quota on "/".

          Tsz Wo Nicholas Sze added a comment -

          Thanks for providing a patch. Assigning this to you.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12448268/clear-quota.patch
          against trunk revision 957669.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/208/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/208/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/208/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/208/console

          This message is automatically generated.

          Aaron T. Myers added a comment -

          Both of those tests were already failing in trunk before I created the patch.

          Tsz Wo Nicholas Sze added a comment -

          The test report web site is not available at the moment. Will check it later.

          Aaron, could you also provide patches for 0.20 and 0.21?

          Jakob Homan added a comment -

          +1. Looks good.

          Tsz Wo Nicholas Sze added a comment -

          > Both of those test failures were failing in trunk before I created the patch.

          The failed tests were TestBlockToken and TestJspHelper. They are not related to this.

          Once the 0.20 and 0.21 patches are available, we can commit this.

          Aaron T. Myers added a comment -

          Patches for 0.20 and 0.21.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this to 0.20, 0.20-append and above. Thanks, Aaron!

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #331 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/331/)
          HDFS-1258. Clearing namespace quota on "/" corrupts fs image. Contributed by Aaron T. Myers

          Tsz Wo Nicholas Sze added a comment -

          I have merged this to 0.20-security.


            People

            • Assignee:
              Aaron T. Myers
              Reporter:
              Aaron T. Myers