Hadoop Common / HADOOP-3724

Namenode does not start due to exception thrown while saving image

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Restarting the namenode failed with this stack trace while saving the image during initialization:

      2008-07-09 00:20:21,470 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
      2008-07-09 00:20:21,493 ERROR org.apache.hadoop.dfs.NameNode: java.io.IOException: saveLeases found path /foo/bar/jambajuice but no matching entry in namespace.  
      at org.apache.hadoop.dfs.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4376)  
      at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:874)  
      at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:892)  
      at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:81)   
      at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:273)   
      at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:252)   
      at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148)   
      at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:193)   
      at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:179)   
      at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:822)  
      at org.apache.hadoop.dfs.NameNode.main(NameNode.java:831)
      

      The IOException appears to be thrown from saveFilesUnderConstruction.
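      The check that fails here can be illustrated with a minimal sketch. This is a hypothetical, simplified model and not the actual FSNamesystem code: the class name LeaseCheckSketch, the use of a plain Set for the namespace, and the List of leased paths are all assumptions made for illustration. It only shows how a lease that refers to a path missing from the namespace would abort the image save with the message seen above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the failing check; NOT the real Hadoop implementation.
public class LeaseCheckSketch {

    // Walks every leased path and fails if the namespace has no entry for it.
    public static void saveFilesUnderConstruction(Set<String> namespace,
                                                  List<String> leasedPaths)
            throws IOException {
        for (String path : leasedPaths) {
            if (!namespace.contains(path)) {
                // Same wording as the error in the stack trace above.
                throw new IOException("saveLeases found path " + path
                        + " but no matching entry in namespace.");
            }
            // A real implementation would serialize the under-construction file here.
        }
    }

    public static void main(String[] args) {
        Set<String> namespace = new HashSet<>();
        namespace.add("/foo/bar/other");   // assumed example entry
        List<String> leasedPaths = Arrays.asList("/foo/bar/jambajuice");
        try {
            saveFilesUnderConstruction(namespace, leasedPaths);
            System.out.println("image saved");
        } catch (IOException e) {
            System.out.println("save failed: " + e.getMessage());
        }
    }
}
```

      In this model a single stale lease is enough to make the whole save fail, which is consistent with the namenode refusing to start.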

      Before the restart, the NameNode was killed while some jobs were running. The namenode log from before the shutdown contained many entries like this:

      2008-07-09 00:12:55,301 INFO org.apache.hadoop.fs.FSNamesystem: Recovering lease=[Lease.  Holder: DFSClient_-510679348, pendingcreates: 1], src=/foo/bar/jambajuice
      2008-07-09 00:12:55,301 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on /foo/bar/jambajuice  file does not exist.
      

      These two lines repeat every second, indefinitely, to the point where the namenode log on a 7-node cluster had grown to nearly 41 GB.
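      To find which paths are stuck in this loop in a large log, the WARN lines can be counted per path. The following is a minimal sketch, assuming the log format matches the two lines quoted above; the class name LeaseWarningCounter and the method countByPath are hypothetical helpers, not part of Hadoop.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log-analysis helper; not part of Hadoop.
public class LeaseWarningCounter {

    // Matches the WARN message quoted in this report; other versions may log differently.
    private static final Pattern RELEASE_FAILED = Pattern.compile(
            "internalReleaseCreate: attempt to release a create lock on (\\S+)\\s+file does not exist");

    // Returns, for each path, how many failed release attempts were logged.
    public static Map<String, Integer> countByPath(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = RELEASE_FAILED.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input copied from the log excerpt above.
        List<String> sample = Arrays.asList(
                "2008-07-09 00:12:55,301 INFO org.apache.hadoop.fs.FSNamesystem: Recovering lease=[Lease.  Holder: DFSClient_-510679348, pendingcreates: 1], src=/foo/bar/jambajuice",
                "2008-07-09 00:12:55,301 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on /foo/bar/jambajuice  file does not exist.");
        System.out.println(countByPath(sample));
    }
}
```

      For a log of this size, the lines would need to be streamed from the file rather than held in memory; the method accepts any Iterable for that reason.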

      I could not find any other information about the file, as there were no earlier namenode logs.

        Attachments

        1. renameWhileOpen.patch
          13 kB
          Dhruba Borthakur
        2. renameWhileOpen2.patch
          18 kB
          Dhruba Borthakur
        3. renameWhileOpen3.patch
          18 kB
          Dhruba Borthakur

              People

              • Assignee:
                Dhruba Borthakur (dhruba)
              • Reporter:
                Lohit Vijaya Renu (lohit)