Hadoop Common / HADOOP-6774

Namenode is not able to recover from disk full condition


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.20.2, 0.21.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels: None
    • Environment: Linux sjc9-flash-grid00.ciq.com 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

    • Implemented a daemon thread that periodically monitors disk usage. If usage reaches the threshold value, the name node is put into safe mode so that no modification to the file system can occur. Once usage drops below the threshold, the name node is taken out of safe mode. Both the threshold value and the check interval are configurable.
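
The behaviour described in the note above can be sketched roughly as follows. This is a minimal illustration and is not taken from HADOOP-6774.patch: the class `DiskUsageMonitor`, the `SafeModeSwitch` interface, and all parameter names are hypothetical, and the real patch may be structured quite differently. It assumes usage is expressed as a fraction of the volume, with safe mode entered at or above the threshold and left once usage falls below it.

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.DoubleSupplier;

/** Hypothetical sketch; none of these names come from the attached patch. */
public class DiskUsageMonitor {

    /** Hypothetical stand-in for whatever toggles the name node's safe mode. */
    public interface SafeModeSwitch {
        void enter();
        void leave();
    }

    private final DoubleSupplier usedFraction; // supplies used space in [0.0, 1.0]
    private final double threshold;            // configurable threshold, e.g. 0.95
    private final SafeModeSwitch safeMode;
    private boolean inSafeMode = false;

    public DiskUsageMonitor(DoubleSupplier usedFraction, double threshold,
                            SafeModeSwitch safeMode) {
        this.usedFraction = usedFraction;
        this.threshold = threshold;
        this.safeMode = safeMode;
    }

    /** Used fraction of the volume that holds the given directory. */
    public static double usedFraction(File dir) {
        long total = dir.getTotalSpace();
        if (total == 0) {
            return 0.0; // size unknown (e.g. path does not exist)
        }
        return 1.0 - (double) dir.getUsableSpace() / total;
    }

    /** One monitoring pass: switch safe mode only when the state changes. */
    public synchronized void checkOnce() {
        double used = usedFraction.getAsDouble();
        if (!inSafeMode && used >= threshold) {
            safeMode.enter();
            inSafeMode = true;
        } else if (inSafeMode && used < threshold) {
            safeMode.leave();
            inSafeMode = false;
        }
    }

    public synchronized boolean isInSafeMode() {
        return inSafeMode;
    }

    /** Runs checkOnce() on a daemon thread at a configurable interval. */
    public ScheduledExecutorService start(long intervalSeconds) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "disk-usage-monitor");
                t.setDaemon(true); // must not keep the JVM alive on shutdown
                return t;
            });
        scheduler.scheduleWithFixedDelay(this::checkOnce, 0, intervalSeconds,
                                         TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A caller might wire it up as `new DiskUsageMonitor(() -> DiskUsageMonitor.usedFraction(nameDir), 0.95, safeModeSwitch).start(60)`, where `nameDir` and `safeModeSwitch` are again placeholders. Taking the usage reading through a supplier keeps the threshold logic testable without filling a real disk.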

    Description

      We ran an internal flow which resulted in:
      Exception in thread "main" java.lang.RuntimeException: initialization of flow executor failed

      After that we freed disk space on the Namenode server, but restarting Namenode failed.
      Here is the relevant excerpt from the Namenode log:

      2010-05-19 17:15:15,514 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: sjc1-qa-certiq1.sjc1.ciq.com/10.201.8.247:9000
      2010-05-19 17:15:15,516 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
      2010-05-19 17:15:15,518 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
      2010-05-19 17:15:15,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
      2010-05-19 17:15:15,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
      2010-05-19 17:15:15,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
      2010-05-19 17:15:15,588 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
      2010-05-19 17:15:15,590 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
      2010-05-19 17:15:15,637 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1874
      2010-05-19 17:15:16,202 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 2
      2010-05-19 17:15:16,204 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 259450 loaded in 0 seconds.
      2010-05-19 17:15:16,599 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: ""
      at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
      at java.lang.Long.parseLong(Long.java:431)
      at java.lang.Long.parseLong(Long.java:468)
      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:656)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:999)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)

      2010-05-19 17:15:16,599 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
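
The trace above shows `Long.parseLong` being reached from `FSEditLog.readLong` with an empty input string. The following sketch only reproduces that JDK behaviour in isolation and shows one way a reader could report an empty numeric field more descriptively. `EmptyFieldDemo` and `parseField` are hypothetical names; nothing here is taken from the Hadoop sources, and whether the empty field came from an edit log cut short by the full disk is an assumption.

```java
/** Hypothetical demo; not part of Hadoop. */
public class EmptyFieldDemo {

    /** Returns the message of the exception thrown when parsing "". */
    public static String messageOfParsingEmpty() {
        try {
            Long.parseLong("");
            return null; // not reached: an empty string is never a valid long
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    /**
     * Parses a numeric field, but names the field and hints at truncation
     * when the raw value is missing, instead of surfacing a bare
     * NumberFormatException.
     */
    public static long parseField(String fieldName, String raw) {
        if (raw == null || raw.isEmpty()) {
            throw new IllegalStateException(
                "Field '" + fieldName + "' is empty; the record may be truncated");
        }
        return Long.parseLong(raw);
    }

    public static void main(String[] args) {
        System.out.println(messageOfParsingEmpty());
        System.out.println(parseField("timestamp", "1274314515514"));
        try {
            parseField("timestamp", "");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check of this kind does not recover the lost data; it only makes the failure easier to diagnose than the raw parse error in the log.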

      Attachments

        1. hadoop-6774.stack
          8 kB
          Ted Yu
        2. HADOOP-6774.patch
          103 kB
          Devaraj Kavali

            People

              Assignee: Unassigned
              Reporter: Ted Yu (ted_yu)
              Votes: 3
              Watchers: 6
