Hadoop HDFS / HDFS-3399 BookKeeper option support for NN HA / HDFS-3769

Standby NameNode fails to become active because starting the log segment fails on shared storage


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.1.0-beta
    • Component/s: ha
    • Labels: None
    • Environment:
      3 DataNodes: 158.1.132.18, 158.1.132.19, 160.161.0.143
      2 NameNodes: 158.1.131.18, 158.1.132.19
      3 ZooKeeper servers: 158.1.132.18, 158.1.132.19, 160.161.0.143
      3 BookKeeper bookies: 158.1.132.18, 158.1.132.19, 160.161.0.143

      ensemble-size: 2, quorum-size: 2

    Description

      2012-08-06 15:09:46,264 ERROR org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper: Node /ledgers/available already exists and this is not a retry
      2012-08-06 15:09:46,264 INFO org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager: Successfully created bookie available path : /ledgers/available
      2012-08-06 15:09:46,273 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering unfinalized segments in /opt/namenodeHa/hadoop-2.0.1/hadoop-root/dfs/name/current
      2012-08-06 15:09:46,277 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest edits from old active before taking over writer role in edits logs.
      2012-08-06 15:09:46,363 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication and invalidation queues...
      2012-08-06 15:09:46,363 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all datandoes as stale
      2012-08-06 15:09:46,383 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Total number of blocks = 239
      2012-08-06 15:09:46,383 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of invalid blocks = 0
      2012-08-06 15:09:46,383 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of under-replicated blocks = 0
      2012-08-06 15:09:46,383 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of over-replicated blocks = 0
      2012-08-06 15:09:46,383 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of blocks being written = 0
      2012-08-06 15:09:46,383 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 2354
      2012-08-06 15:09:46,471 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 2354
      2012-08-06 15:09:46,472 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: starting log segment 2354 failed for required journal (JournalAndStream(mgr=org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager@4eda1515, stream=null))
      java.io.IOException: We've already seen 2354. A new stream cannot be created with it
      at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.startLogSegment(BookKeeperJournalManager.java:297)
      at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalAndStream.startLogSegment(JournalSet.java:86)
      at org.apache.hadoop.hdfs.server.namenode.JournalSet$2.apply(JournalSet.java:182)
      at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:319)
      at org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:179)
      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:894)
      at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:268)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:618)
      at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1322)
      at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
      at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
      at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1230)
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:990)
      at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
      at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
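
      The exception message in the trace above suggests a guard that rejects a new log segment whose starting txid is not strictly greater than the highest txid the journal has already recorded. The following is a minimal sketch of such a guard, not the actual BookKeeperJournalManager code. The class name MaxTxIdGuardSketch, the recordSeen method, and the in-memory counter are all hypothetical, and the report does not state how txid 2354 came to be recorded before the failover.

      ```java
      import java.io.IOException;
      import java.util.concurrent.atomic.AtomicLong;

      /** Hypothetical stand-in for a journal manager's "highest txid seen" guard. */
      class MaxTxIdGuardSketch {
          // Highest transaction id recorded so far. Hypothetical in-memory field;
          // a real shared journal would keep this value on shared storage.
          private final AtomicLong maxTxId = new AtomicLong(0);

          /** Record that a transaction id has been observed by the journal. */
          void recordSeen(long txId) {
              maxTxId.accumulateAndGet(txId, Math::max);
          }

          /** Refuse to open a segment unless its first txid is strictly newer. */
          void startLogSegment(long txId) throws IOException {
              if (txId <= maxTxId.get()) {
                  throw new IOException("We've already seen " + txId
                      + ". A new stream cannot be created with it");
              }
              recordSeen(txId);
          }

          public static void main(String[] args) throws IOException {
              MaxTxIdGuardSketch journal = new MaxTxIdGuardSketch();
              journal.recordSeen(2354);          // assume 2354 was already recorded
              try {
                  journal.startLogSegment(2354); // new writer asks for the same txid
              } catch (IOException e) {
                  System.out.println(e.getMessage());
              }
              journal.startLogSegment(2355);     // a strictly newer txid is accepted
          }
      }
      ```

      Under this assumption, the standby's attempt to take over writing at txid 2354 is rejected whenever the recorded maximum is already 2354 or higher, which matches the FATAL line in the log.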

            People

              Assignee: Unassigned
              Reporter: liaowenrui (ewenpower)