Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-4302

Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 2.0.3-alpha
    • ha, namenode
    • None
    • Reviewed

    Description

      When bringing up a namenode in standby mode, where DEBUG is enabled for namenode, the namenode will hit the following code in hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java:

       if (LOG.isDebugEnabled()) {
            LOG.debug("edit log length: " + in.length() + ", start txid: "
                + expectedStartingTxId + ", last txid: " + lastTxId);
          }
      

      .

      However, if in has an EditLogFileInputStream as its streams[0], this code is hit before the EditLogFileInputStream's advertizedSize is initialized (before the HTTP client connects to the remote edit log server (i.e. the journal node)). This causes the following precondition to fail in EditLogFileInputStream:length():

            Preconditions.checkState(advertisedSize != -1,
                "must get input stream before length is available");
      

      which shuts down the namenode with the following log messages and stack trace:

      2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms
      2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream manifests:
      172.16.175.1:8485: [[1,3]]
      2012-12-11 10:45:33,351 DEBUG namenode.FSImage (FSImage.java:loadFSImage(605)) - Planning to load image :
      FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000, cpktTxId=0000000000000000000)
      2012-12-11 10:45:33,351 DEBUG namenode.FSImage (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
      2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(168)) - Loading image file /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000 using no compression
      2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(171)) - Number of files = 1
      2012-12-11 10:45:33,356 INFO  namenode.FSImage (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under construction = 0
      2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImageFormat.java:load(193)) - Image file of size 119 loaded in 0 seconds.
      2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000
      2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) - About to load edits:
        org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
      2012-12-11 10:45:33,359 INFO  namenode.FSImage (FSImage.java:loadEdits(694)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 expecting start txid #1
      2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping client
      2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from hdfs/eugenes-macbook-pro.local@EXAMPLE.COM: closed
      2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from hdfs/eugenes-macbook-pro.local@EXAMPLE.COM: stopped, remaining connections 0
      2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - Exception in namenode join
      java.lang.IllegalStateException: must get input stream before length is available
              at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
              at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.length(EditLogFileInputStream.java:405)
              at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.length(EditLogFileInputStream.java:258)
              at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256)
              at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256)
              at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:125)
              at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:88)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:697)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:642)
              at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:259)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:604)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:447)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:409)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:400)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:434)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:606)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:591)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
              at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1219)
      2012-12-11 10:45:33,470 INFO  util.ExitUtil (ExitUtil.java:terminate(84)) - Exiting with status 1
      2012-12-11 10:45:33,471 INFO  namenode.NameNode (StringUtils.java:run(620)) - SHUTDOWN_MSG:
      /************************************************************
      SHUTDOWN_MSG: Shutting down NameNode at Eugenes-MacBook-Pro.local/172.16.175.1
      ************************************************************/
      

      Attachments

        1. HDFS-4302.patch
          2 kB
          Eugene Joseph Koontz
        2. HDFS-4302.patch
          0.9 kB
          Eugene Joseph Koontz

        Issue Links

          Activity

            People

              ekoontz Eugene Joseph Koontz
              ekoontz Eugene Joseph Koontz
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: