  1. Hadoop HDFS
  2. HDFS-1921

Save namespace can cause NN to be unable to come up on restart


    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.22.0, 0.23.0
    • Fix Version/s: 0.22.0, 0.23.0
    • Component/s: None


      I discovered this in the course of trying to implement a fix for HDFS-1505.

      Per the comment for FSImage.saveNamespace(...), the algorithm for save namespace proceeds in the following order:

      1. rename current to lastcheckpoint.tmp for all of them,
      2. save image and recreate edits for all of them,
      3. rename lastcheckpoint.tmp to previous.checkpoint.

      The problem is that step 3 runs even if step 2 fails for every storage directory. On restart, the NN then finds current directories that are missing or corrupt and no lastcheckpoint.tmp directories to recover from, so it concludes that the storage directories are not formatted.
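
      To make the failure mode concrete, here is a minimal Java sketch of one possible guard: step 3 is applied only to directories where step 2 succeeded, and failed directories are rolled back from lastcheckpoint.tmp. This is an illustration under stated assumptions, not the actual FSImage code and not necessarily what the attached patches do. SaveNamespaceSketch, ImageWriter, and the helper methods are hypothetical names; only the directory names come from the description above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a guarded save namespace; not the real FSImage. */
final class SaveNamespaceSketch {

    /** Stand-in for step 2 (save image, recreate edits) on one directory. */
    interface ImageWriter {
        void write(File currentDir) throws IOException;
    }

    /**
     * Runs the three steps over all storage directories, but finalizes
     * (step 3) only where step 2 succeeded. Returns the directories that
     * failed step 2 and were rolled back to their old contents.
     */
    static List<File> saveNamespace(List<File> storageDirs, ImageWriter writer) {
        // Step 1: rename current to lastcheckpoint.tmp for all of them.
        for (File sd : storageDirs) {
            rename(new File(sd, "current"), new File(sd, "lastcheckpoint.tmp"));
        }

        // Step 2: save the image into a fresh current, recording failures.
        List<File> failed = new ArrayList<>();
        for (File sd : storageDirs) {
            File current = new File(sd, "current");
            try {
                if (!current.mkdir()) {
                    throw new IOException("cannot create " + current);
                }
                writer.write(current);
            } catch (IOException e) {
                failed.add(sd);
            }
        }

        // Step 3: finalize only the successful directories. For a failed
        // directory, discard the partial current and restore the old one
        // instead of renaming it to previous.checkpoint.
        for (File sd : storageDirs) {
            File tmp = new File(sd, "lastcheckpoint.tmp");
            if (failed.contains(sd)) {
                File current = new File(sd, "current");
                deleteRecursively(current);
                rename(tmp, current);
            } else {
                File previous = new File(sd, "previous.checkpoint");
                deleteRecursively(previous);
                rename(tmp, previous);
            }
        }
        return failed;
    }

    private static void rename(File from, File to) {
        if (!from.renameTo(to)) {
            throw new IllegalStateException("rename failed: " + from + " -> " + to);
        }
    }

    private static void deleteRecursively(File f) {
        File[] children = f.listFiles(); // null when f is not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        f.delete(); // returns false, harmlessly, if f does not exist
    }
}
```

      With this guard, a directory whose save failed still holds its old checkpoint under current after the operation. Rolling back is only one option; a real fix might instead mark the failed directories and exclude them from further use.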

      This issue appears to be present in both 0.22 and 0.23, and should arguably be a blocker for both releases.


        1. hdfs-1505-1-test.txt
          3 kB
          Matt Foley
        2. hdfs1921_v23.patch
          3 kB
          Matt Foley
        3. hdfs1921_v23.patch
          3 kB
          Matt Foley
        4. hdfs-1921.txt
          5 kB
          Todd Lipcon
        5. hdfs-1921-2_v22.patch
          5 kB
          Matt Foley
        6. hdfs-1921-2.patch
          5 kB
          Matt Foley


              • Assignee:
                Matt Foley (mattf)
              • Reporter:
                Aaron T. Myers (atm)
              • Votes:
                0
              • Watchers:
                4


                • Created: