Hadoop HDFS / HDFS-3127

failure in recovering removed storage directories should not stop checkpoint process

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.0.3
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When restoring a removed storage directory fails, rollEditLog() also fails, even if healthy directories remain. An exception raised while recovering removed directories should not fail the checkpoint process.
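
      The requested behavior can be sketched as follows. This is a minimal illustration, not the committed patch: `TolerantRestoreSketch`, `Restorer`, and `restoreAll` are hypothetical names, and the real NameNode code is not reproduced here. It only shows that a failure for one removed directory is logged and skipped rather than propagated to the caller that rolls the edit log.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TolerantRestoreSketch {

    /** Hypothetical stand-in for the per-directory recovery step. */
    interface Restorer {
        void restore(File dir) throws IOException;
    }

    /**
     * Tries to restore every removed directory. A failure is logged and the
     * directory stays on the "removed" list; the exception is not rethrown,
     * so the caller can still roll the edit log on the healthy directories.
     */
    static List<File> restoreAll(List<File> removed, Restorer restorer) {
        List<File> stillRemoved = new ArrayList<>();
        for (File dir : removed) {
            try {
                restorer.restore(dir);
            } catch (IOException e) {
                // One bad directory must not abort the whole checkpoint.
                System.err.println("Could not restore " + dir + ": " + e.getMessage());
                stillRemoved.add(dir);
            }
        }
        return stillRemoved;
    }

    public static void main(String[] args) {
        // Illustrative directory names only (assumption, not from the issue).
        List<File> removed = Arrays.asList(new File("name1"), new File("name2"));
        List<File> stillRemoved = restoreAll(removed, dir -> {
            if (dir.getName().equals("name1")) {
                throw new IOException("permission denied");
            }
        });
        System.out.println("Still removed: " + stillRemoved);
        System.out.println("Checkpoint can proceed on the healthy directories.");
    }
}
```

      The sketch catches only IOException; whether other exception types should also be tolerated is a design decision the description leaves open.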

      1. HDFS-3127.branch-1.patch (7 kB, Brandon Li)
      2. HDFS-3127.branch-1.patch (2 kB, Brandon Li)

          Activity

          Matt Foley made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
          Matt Foley added a comment -

          Closed upon release of Hadoop-1.0.3.

          Matt Foley made changes -
          Fix Version/s: added 1.0.3 [ 12320249 ], removed 1.0.2 [ 12320051 ]
          Tsz Wo Nicholas Sze made changes -
          Link: This issue relates to HDFS-3131
          Tsz Wo Nicholas Sze added a comment -

          Appreciate your understanding! Filed HDFS-3131.

          Aaron T. Myers added a comment -

          Haha - looks like our comments raced with each other.

          Thanks for filing the JIRA to address the comments.

          Aaron T. Myers added a comment -

          Nicholas, why weren't my comments addressed before commit?

          Tsz Wo Nicholas Sze added a comment -

          Oops, sorry Aaron. I did not see your new comment. I started committing the patch right after I posted the test-patch result. I will file a JIRA to address your comments.

          Tsz Wo Nicholas Sze made changes -
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Hadoop Flags: Reviewed [ 10343 ]
          Fix Version/s: 1.0.2 [ 12320051 ]
          Resolution: Fixed [ 1 ]
          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks Brandon!

          Aaron T. Myers added a comment -

          Patch largely looks good, Brandon. A few comments:

          1. removeStorageAccess, restoreAccess, and numStorageDirs can all be made private
          2. numStorageDirs can be made static
          3. Rather than do set(Readable/Executable/Writable), use FileUtil.chmod(...).
          4. Please put the contents of the test in a try/finally, with the calls to shutdown the cluster and the 2NN in the finally block.
          5. Some lines are over 80 chars.
          6. No need for the numDatanodes variable - it's only used in one place.
          7. Instead of "xwr" use "rwx", which I think is a more common way of describing permissions.
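
          Comment 4 asks for the usual try/finally shape in the test. A minimal sketch of that shape follows, with `Service` as a hypothetical stand-in for the mini cluster and the 2NN; the real test classes and their APIs are not reproduced here.

```java
public class TestCleanupSketch {

    /** Hypothetical stand-in for a test resource such as the cluster or the 2NN. */
    static class Service {
        boolean running = true;

        void shutdown() {
            running = false;
        }
    }

    /** Runs the test body and always shuts both services down afterwards. */
    static void runTest(Service cluster, Service secondary, Runnable body) {
        try {
            body.run();
        } finally {
            // Executed whether the body passes or throws, so a failing
            // assertion cannot leak a running cluster into later tests.
            if (secondary != null) {
                secondary.shutdown();
            }
            if (cluster != null) {
                cluster.shutdown();
            }
        }
    }

    public static void main(String[] args) {
        Service cluster = new Service();
        Service secondary = new Service();
        try {
            runTest(cluster, secondary, () -> {
                throw new IllegalStateException("simulated test failure");
            });
        } catch (IllegalStateException expected) {
            System.out.println("Test body failed: " + expected.getMessage());
        }
        System.out.println("cluster running: " + cluster.running);
        System.out.println("secondary running: " + secondary.running);
    }
}
```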
          Tsz Wo Nicholas Sze added a comment -
               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     -1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.
          

          The findbugs warnings are not related to the patch. The result is the same for an empty patch.

          Jitendra Nath Pandey added a comment -

          +1

          Brandon Li made changes -
          Link: This issue relates to HADOOP-4885
          Brandon Li made changes -
          Link: This issue relates to HDFS-3075
          Brandon Li added a comment -

          I ran test-core and it passed.

          Brandon Li made changes -
          Attachment: HDFS-3127.branch-1.patch [ 12519510 ]
          Brandon Li added a comment -

          More test cases are added in the new patch.

          Brandon Li added a comment -

          Trunk doesn't have this problem.

          Aaron T. Myers added a comment -

          Does this change not also need to be made on trunk?

          Also, please run the full HDFS test suite before committing this.

          Jitendra Nath Pandey added a comment -

          I should have mentioned earlier, please add a test for this.

          Jitendra Nath Pandey added a comment -

          +1, looks good to me.

          Brandon Li made changes -
          Attachment: HDFS-3127.branch-1.patch [ 12519486 ]
          Brandon Li created issue -

            People

            • Assignee: Brandon Li
            • Reporter: Brandon Li
            • Votes: 0
            • Watchers: 5
