Hadoop Common / HADOOP-6774

Namenode is not able to recover from disk full condition

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.20.2, 0.21.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels:
      None
    • Environment:

      Linux sjc9-flash-grid00.ciq.com 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

    • Release Note:
      Implemented a daemon thread that periodically monitors disk usage. If disk usage reaches the threshold value, the name node is put into safe mode so that no modifications to the file system occur. Once disk usage drops below the threshold, the name node leaves safe mode. Both the threshold value and the check interval are configurable.
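
The behaviour described in the release note could be sketched roughly as follows. This is a minimal illustration and not the code from HADOOP-6774.patch: the class `DiskUsageMonitor`, the `SafeModeControl` interface, and all parameter names are hypothetical, and disk usage is assumed to be expressed as a used fraction between 0.0 and 1.0.

```java
import java.io.File;
import java.util.function.DoubleSupplier;

/** Hypothetical sketch of a threshold-based disk monitor; not the actual patch. */
public class DiskUsageMonitor {

    /** Hypothetical stand-in for the name node's safe-mode switch. */
    public interface SafeModeControl {
        void enter();
        void leave();
    }

    private final DoubleSupplier usedFraction; // current disk usage, 0.0 to 1.0
    private final double threshold;            // assumed configurable limit
    private final SafeModeControl safeMode;
    private boolean inSafeMode = false;

    public DiskUsageMonitor(DoubleSupplier usedFraction, double threshold,
                            SafeModeControl safeMode) {
        this.usedFraction = usedFraction;
        this.threshold = threshold;
        this.safeMode = safeMode;
    }

    /** Builds a monitor that measures the volume holding the given directory. */
    public static DiskUsageMonitor forDirectory(File dir, double threshold,
                                                SafeModeControl safeMode) {
        DoubleSupplier usage =
            () -> 1.0 - (double) dir.getUsableSpace() / dir.getTotalSpace();
        return new DiskUsageMonitor(usage, threshold, safeMode);
    }

    /** Performs one check and toggles safe mode only when the state changes. */
    public synchronized void checkOnce() {
        boolean overThreshold = usedFraction.getAsDouble() >= threshold;
        if (overThreshold && !inSafeMode) {
            safeMode.enter();
            inSafeMode = true;
        } else if (!overThreshold && inSafeMode) {
            safeMode.leave();
            inSafeMode = false;
        }
    }

    public synchronized boolean isInSafeMode() {
        return inSafeMode;
    }

    /** Starts a daemon thread that runs checkOnce() every intervalMillis. */
    public Thread start(long intervalMillis) {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    checkOnce();
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly on interrupt
            }
        }, "disk-usage-monitor");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Tracking the current state inside the monitor means safe mode is entered or left once per threshold crossing rather than on every poll.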

      Description

      We ran an internal flow which resulted in:
      Exception in thread "main" java.lang.RuntimeException: initialization of flow executor failed

      We then freed disk space on the Namenode server, but restarting the Namenode failed.
      Here is the relevant excerpt from the Namenode log:

      2010-05-19 17:15:15,514 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: sjc1-qa-certiq1.sjc1.ciq.com/10.201.8.247:9000
      2010-05-19 17:15:15,516 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
      2010-05-19 17:15:15,518 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
      2010-05-19 17:15:15,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
      2010-05-19 17:15:15,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
      2010-05-19 17:15:15,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
      2010-05-19 17:15:15,588 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
      2010-05-19 17:15:15,590 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
      2010-05-19 17:15:15,637 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1874
      2010-05-19 17:15:16,202 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 2
      2010-05-19 17:15:16,204 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 259450 loaded in 0 seconds.
      2010-05-19 17:15:16,599 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: ""
      at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
      at java.lang.Long.parseLong(Long.java:431)
      at java.lang.Long.parseLong(Long.java:468)
      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:656)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:999)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)

      2010-05-19 17:15:16,599 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
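
The top frames of the stack trace show `FSEditLog.readLong` calling `Long.parseLong` on an empty string. The following minimal sketch reproduces only that final step, on the assumption that a numeric field in the edit log was read back as empty text (for example, because a record was cut short when the disk filled up). The class and method names here are hypothetical and are not Hadoop code.

```java
/** Hypothetical illustration of parsing an empty numeric field. */
public class EmptyFieldParse {

    /** Mimics a reader that decodes a numeric field stored as text. */
    static long readLongField(String field) {
        return Long.parseLong(field);
    }

    /** Returns the exception message for the field, or null if it parses. */
    static String failureMessage(String field) {
        try {
            readLongField(field);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // An empty field triggers the same exception type seen in the log.
        System.out.println(failureMessage(""));
    }
}
```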

      1. hadoop-6774.stack
        8 kB
        Ted Yu
      2. HADOOP-6774.patch
        103 kB
        Devaraj K

            People

            • Assignee: Unassigned
            • Reporter: Ted Yu
            • Votes: 3
            • Watchers: 6
