Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-1864

Hadoop Namenode not starting up.

    XMLWordPrintableJSON

Details

    • Task
    • Status: Resolved
    • Major
    • Resolution: Invalid
    • None
    • None
    • None
    • None

    Description

      1. Checked to make sure hadoop was running properly. Discovered that we suppose to run 'jps' and make sure there is a namenode process.

      2. Documentation said, if namenode does not exist - then run

      /etc/init.d/hadoop-0.20-namenode start

      /etc/init.d/hadoop-0.20-namenode status - namenode process fails

      EQX hdfs@hadoop-master:/usr/lib/hadoop/bin$ /etc/init.d/hadoop-0.20-namenode status
      namenode dead but pid file exists

      3. Searched for pid files. We deleted pid files.

      4. All over stats fell off. As direct of result, looking at the process list - and there appeared to be a stalled process that was killed.

      kill -9

      for the following process:

      EQX root@hadoop-master:/etc/init.d# ps aux | grep namenode
      hdfs 5038 0.2 1.0 3617440 526704 ? Sl Mar31 74:02 /usr/java/default/bin/java -Dproc_namenode -Xmx3000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/usr/lib/hadoop/logs -Dhadoop.log.file=hadoop-hdfs-namenode-hadoop-master.rockyou.com.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop/conf:/usr/java/default/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2+737.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2+737.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.4.8.jar:/usr/lib/hadoop/lib/hadoop-lzo.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.1.0.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.14.jar:/usr/lib/hadoop/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar::/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar org.apache.hadoop.hdfs.server.namenode.NameNode
      root 16449 0.0 0.0 61136 744 pts/4 S+ 16:29 0:00 grep namenode
      EQX root@hadoop-master:/etc/init.d# kill -9 5038

      We starting looking at log output - we discovered the namenode startup process is throwing a null pointer exception.

      STARTUP_MSG: build = -r 98c55c28258aa6f42250569bd7fa431ac657bdbd; compiled by 'root' on Mon Oct 11 13:14:05 EDT 2010
      ************************************************************/
      2011-04-25 21:16:47,841 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
      2011-04-25 21:16:47,949 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.ganglia.GangliaContext31
      2011-04-25 21:16:47,982 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hdfs
      2011-04-25 21:16:47,982 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=root
      2011-04-25 21:16:47,982 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
      2011-04-25 21:16:47,987 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
      2011-04-25 21:16:48,301 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.ganglia.GangliaContext31
      2011-04-25 21:16:48,302 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
      2011-04-25 21:16:48,328 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 237791
      2011-04-25 21:16:51,699 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
      2011-04-25 21:16:51,699 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 42758182 loaded in 3 seconds.
      2011-04-25 21:16:51,701 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1088)

      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1100)
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:987)

      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:974)

      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:718)

      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1034)
      at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:845)

      at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:379)
      at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)

      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:343)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:317)

      at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:214)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:394)

      at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1148)

      at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1157)

      Attachments

        Activity

          People

            Unassigned Unassigned
            ronak.shah Ronak Shah
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: