Hadoop HDFS / HDFS-9249

NPE is thrown if an IOException is thrown in NameNode constructor

    Details

    • Hadoop Flags:
      Reviewed

      Description

      This issue was found while running the test case TestBackupNode.testCheckpointNode, but on closer inspection the problem is not caused by the test itself.

      It looks like an IOException was thrown inside this block of the NameNode constructor:

      try {
        initializeGenericKeys(conf, nsId, namenodeId);
        initialize(conf);
        try {
          haContext.writeLock();
          state.prepareToEnterState(haContext);
          state.enterState(haContext);
        } finally {
          haContext.writeUnlock();
        }

      This caused the NameNode to stop before the namesystem had been instantiated, and the stop path then hit an NPE.
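The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: `Node`, `Namesystem`, `failDuringInit`, and `observedFailure` are hypothetical stand-ins chosen for illustration. It shows how cleanup that dereferences a field which was never assigned replaces the original IOException with a NullPointerException.

```java
import java.io.IOException;

public class ConstructorFailureDemo {

    /** Hypothetical stand-in for the namesystem; not the real FSNamesystem. */
    static class Namesystem {
        String getImage() {
            return "fsimage";
        }
    }

    /** Hypothetical stand-in for a node whose constructor cleans up on failure. */
    static class Node {
        private Namesystem namesystem; // stays null if initialize() fails early

        Node(boolean failDuringInit) throws IOException {
            try {
                initialize(failDuringInit);
            } catch (IOException e) {
                stop();  // cleanup runs on a half-built object
                throw e; // never reached if stop() itself throws
            }
        }

        private void initialize(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated startup failure");
            }
            namesystem = new Namesystem();
        }

        void stop() {
            // No null check: throws NullPointerException when initialize() failed.
            namesystem.getImage();
        }
    }

    /** Returns the simple class name of whatever the constructor throws. */
    static String observedFailure() {
        try {
            new Node(true);
            return "none";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The caller sees the NPE from stop(), not the original IOException.
        System.out.println(observedFailure());
    }
}
```

Because the exception thrown from `stop()` propagates out of the catch block, the original IOException and its message never reach the caller or the log, which matches the symptom in this report.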

      I tried to reproduce it locally, but to no avail.

      Because I could not reproduce the bug, and the log does not indicate what caused the IOException, I suggest making this a supportability JIRA: log the exception so that the root cause can be diagnosed if the failure occurs again.
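One way to act on this suggestion is sketched below. It is an illustration only and does not reflect the contents of the attached patches: `Node`, `Resource`, and `observedFailure` are hypothetical names, and `java.util.logging` is used only to keep the example self-contained. The sketch logs the exception before cleanup and guards the cleanup against fields that were never initialized, so the original IOException is both recorded and rethrown.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LoggedStartupDemo {
    private static final Logger LOG =
            Logger.getLogger(LoggedStartupDemo.class.getName());

    /** Hypothetical resource that is only created on successful startup. */
    static class Resource {
        void close() {
            // Release whatever the resource holds; nothing to do in this sketch.
        }
    }

    /** Hypothetical node whose constructor logs the failure before cleanup. */
    static class Node {
        private Resource resource; // null until initialize() succeeds

        Node(boolean failDuringInit) throws IOException {
            try {
                initialize(failDuringInit);
            } catch (IOException e) {
                // Record the root cause first, so it survives a faulty stop().
                LOG.log(Level.SEVERE, "Failed to start node", e);
                stop();
                throw e;
            }
        }

        private void initialize(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated startup failure");
            }
            resource = new Resource();
        }

        void stop() {
            // Null guard: safe to call on a partially constructed object.
            if (resource != null) {
                resource.close();
            }
        }
    }

    /** Returns "<exception class>: <message>" for the constructor failure. */
    static String observedFailure() {
        try {
            new Node(true);
            return "none";
        } catch (Exception e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(observedFailure());
    }
}
```

Logging before calling `stop()` is deliberate: even if the cleanup path still fails for some other reason, the original cause is already in the log.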

      Stacktrace
      java.lang.NullPointerException: null
      at org.apache.hadoop.hdfs.server.namenode.NameNode.getFSImage(NameNode.java:906)
      at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:210)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:827)
      at org.apache.hadoop.hdfs.server.namenode.BackupNode.<init>(BackupNode.java:89)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1474)
      at org.apache.hadoop.hdfs.server.namenode.TestBackupNode.startBackupNode(TestBackupNode.java:102)
      at org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:298)
      at org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpointNode(TestBackupNode.java:130)
      The last few lines of the log:
      2015-10-14 19:45:07,807 INFO namenode.NameNode (NameNode.java:createNameNode(1422)) - createNameNode [-checkpoint]
      2015-10-14 19:45:07,807 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:init(158)) - CheckpointNode metrics system started (again)
      2015-10-14 19:45:07,808 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(402)) - fs.defaultFS is hdfs://localhost:37835
      2015-10-14 19:45:07,808 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(422)) - Clients are to use localhost:37835 to access this namenode/service.
      2015-10-14 19:45:07,810 INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1708)) - Shutting down the Mini HDFS Cluster
      2015-10-14 19:45:07,810 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1298)) - Stopping services started for active state
      2015-10-14 19:45:07,811 INFO namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1228)) - Ending log segment 1
      2015-10-14 19:45:07,811 INFO namenode.FSNamesystem (FSNamesystem.java:run(5306)) - NameNodeEditLogRoller was interrupted, exiting
      2015-10-14 19:45:07,811 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(703)) - Number of transactions: 3 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 4 SyncTimes(ms): 2 1
      2015-10-14 19:45:07,811 INFO namenode.FSNamesystem (FSNamesystem.java:run(5373)) - LazyPersistFileScrubber was interrupted, exiting
      2015-10-14 19:45:07,822 INFO namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_inprogress_0000000000000000001 -> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name1/current/edits_0000000000000000001-0000000000000000003
      2015-10-14 19:45:07,835 INFO namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name2/current/edits_inprogress_0000000000000000001 -> /data/jenkins/workspace/CDH5.5.0-Hadoop-HDFS-2.6.0/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name2/current/edits_0000000000000000001-0000000000000000003
      2015-10-14 19:45:07,836 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(169)) - Shutting down CacheReplicationMonitor
      2015-10-14 19:45:07,836 INFO ipc.Server (Server.java:stop(2485)) - Stopping server on 37835
      2015-10-14 19:45:07,837 INFO ipc.Server (Server.java:run(718)) - Stopping IPC Server listener on 37835
      2015-10-14 19:45:07,837 INFO ipc.Server (Server.java:run(844)) - Stopping IPC Server Responder
      2015-10-14 19:45:07,837 INFO blockmanagement.BlockManager (BlockManager.java:run(3781)) - Stopping ReplicationMonitor.
      2015-10-14 19:45:07,838 WARN blockmanagement.DecommissionManager (DecommissionManager.java:run(78)) - Monitor interrupted: java.lang.InterruptedException: sleep interrupted
      2015-10-14 19:45:07,844 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1298)) - Stopping services started for active state
      2015-10-14 19:45:07,845 INFO namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(1386)) - Stopping services started for standby state
      2015-10-14 19:45:07,848 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@localhost:0

        Attachments

        1. HDFS-9249.006.patch (6 kB, Wei-Chiu Chuang)
        2. HDFS-9249.005.patch (6 kB, Wei-Chiu Chuang)
        3. HDFS-9249.004.patch (5 kB, Wei-Chiu Chuang)
        4. HDFS-9249.003.patch (6 kB, Wei-Chiu Chuang)
        5. HDFS-9249.002.patch (5 kB, Wei-Chiu Chuang)
        6. HDFS-9249.001.patch (2 kB, Wei-Chiu Chuang)

            People

            • Assignee:
              jojochuang Wei-Chiu Chuang
              Reporter:
              jojochuang Wei-Chiu Chuang
