Details
- Type: Bug
- Status: Patch Available
- Priority: Blocker
- Resolution: Unresolved
- Affects Version/s: 3.2.2, 3.3.1, 3.2.3, 3.3.2
- Fix Version/s: None
- Component/s: None
Description
2021-11-22 20:36:44,440 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Using longest log: 10.65.57.133:8485=segmentState
{ startTxId: 3906965 endTxId: 3906965 isInProgress: false }lastWriterEpoch: 5
lastCommittedTxId: 3906964
2021-11-22 20:36:44,457 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering unfinalized segments in /data12/data/flashHadoopU/namenode/current
2021-11-22 20:36:44,495 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data12/data/flashHadoopU/namenode/current/edits_inprogress_0000000000003898378 -> /data12/data/flashHadoopU/namenode/current/edits_0000000000003898378-0000000000003898412
2021-11-22 20:36:44,657 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 2510934 but unable to find any edit logs containing txid 2510933
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1578)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1536)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:652)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:976)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
2021-11-22 20:36:44,660 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@pro-hadoop-dc01-057133.vm.dc01.hellocloud.tech:50070
2021-11-22 20:36:44,760 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2021-11-22 20:36:44,761 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2021-11-22 20:36:44,761 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2021-11-22 20:36:44,761 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
Old version: 2.7.3
New version: 3.2.2
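For reference, a minimal diagnostic sketch (not part of the original report) for confirming the gap on the rolled-back NameNode. It assumes only the name directory path shown in the log above and the standard hdfs oev offline edits viewer; the /tmp output path is an arbitrary placeholder.

# List the edit segments retained in the NameNode's name directory (path taken from the log above).
# The checkForGaps error means none of these files covers txid 2510933.
ls /data12/data/flashHadoopU/namenode/current/edits_*

# Optionally dump one segment to XML with the offline edits viewer to inspect its txid range:
hdfs oev -i /data12/data/flashHadoopU/namenode/current/edits_0000000000003898378-0000000000003898412 -o /tmp/edits.xml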
Steps to Reproduce
Step 1: Start NN1 as active and NN2 as standby.
Step 2: Run "hdfs dfsadmin -rollingUpgrade prepare" (the full command sequence is sketched after these steps).
Step 3: Restart NN2 as active and NN1 as standby with the "-rollingUpgrade started" option.
Step 4: Restart the DataNodes in upgrade mode as well.
Step 5: Restart the JournalNodes with the new Hadoop version.
Step 6: Run the cluster for a few days.
Step 7: Bring down both NameNodes, the JournalNodes, and the DataNodes.
Step 8: Start the JournalNodes with the old version.
Step 9: Start NN1 with the "-rollingUpgrade rollback" option. The NameNode fails to start with the ERROR above (txid 2510933, mentioned in the error, has already been deleted by the checkpoint mechanism).
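A minimal command sketch of the sequence above, assuming an HA cluster with QJM and the stock Apache CLI; <DATANODE_HOST:IPC_PORT> is a placeholder, and the daemon start/stop wrappers (hadoop-daemon.sh or hdfs --daemon) around the namenode/datanode commands are omitted:

# Step 2: prepare the rolling upgrade (writes the rollback fsimage)
hdfs dfsadmin -rollingUpgrade prepare
hdfs dfsadmin -rollingUpgrade query            # re-run until the rollback image is reported ready

# Steps 3 and 5: restart each NameNode and JournalNode on the new (3.2.2) binaries;
# the NameNodes are started with the rolling-upgrade marker:
hdfs namenode -rollingUpgrade started

# Step 4: rolling restart of each DataNode on the new binaries
hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade
hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>   # confirm it is down, then start the DN again

# Steps 7-9: stop all NameNodes, JournalNodes, and DataNodes, switch back to the
# old (2.7.3) binaries, start the JournalNodes, then roll back:
hdfs namenode -rollingUpgrade rollback
hdfs datanode -rollback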
Attachments
Issue Links
- Blocked
  - HDFS-5920 Support rollback of rolling upgrade in NameNode and JournalNodes (Resolved)