Details
Description
- After a rolling upgrade of the NN from 3.1.3/3.2.1 to 3.3.0, restarting the NN fails while it replays the edit logs.
- HDFS-14922, HDFS-14924, and HDFS-15054 introduced the modification time bits to the editLog transactions.
- When the NN is restarted and the edit logs are replayed, the NN reads the old layout version from the editLog file. When parsing the transactions, it assumes that the transactions were also written with the previous layout and hence skips parsing the modification time bits.
- This cascades into reading the wrong set of bits for the other fields and leads to the NN shutting down.
2020-09-07 19:34:42,085 | DEBUG | main | Stopping client | Client.java:1361
2020-09-07 19:34:42,087 | ERROR | main | Failed to start namenode. | NameNode.java:1751
java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
    at org.apache.hadoop.ipc.ClientId.toString(ClientId.java:56)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.appendRpcIdsToString(FSEditLogOp.java:318)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.access$700(FSEditLogOp.java:153)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$DeleteSnapshotOp.toString(FSEditLogOp.java:3606)
    at java.lang.String.valueOf(String.java:2994)
    at java.lang.StringBuilder.append(StringBuilder.java:131)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:305)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:188)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:932)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:779)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:337)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1136)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:742)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:654)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:716)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:959)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:932)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1674)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1744)
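The failure mode described above can be reproduced in miniature. The following is only a toy sketch, not the HDFS edit log format or the actual FSEditLogOp code: the record layout (name, optional modification time, length-prefixed client id), the class `LayoutMismatchDemo`, the constant `LAYOUT_WITH_MTIME`, and both helper methods are hypothetical names invented for illustration. It shows how a reader that assumes the old layout skips the modification time, reads part of it as the next field, and ends up rejecting the client id.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy model only: the field order and names are assumptions, not the HDFS format.
class LayoutMismatchDemo {
    // Hypothetical layout version from which records carry a modification time.
    static final int LAYOUT_WITH_MTIME = 2;
    static final int CLIENT_ID_LENGTH = 16;

    // Serializes one record; mtime is written only for the newer layout.
    static byte[] write(String name, long mtime, byte[] clientId, int layout) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(name);
            if (layout >= LAYOUT_WITH_MTIME) {
                out.writeLong(mtime);
            }
            out.writeInt(clientId.length);
            out.write(clientId);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses the record using the layout the reader *believes* it has.
    static byte[] readClientId(byte[] record, int assumedLayout) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
            in.readUTF();
            if (assumedLayout >= LAYOUT_WITH_MTIME) {
                in.readLong(); // consume the modification time
            }
            // If mtime was written but not consumed, this int is really
            // the upper half of mtime, so the check below fails.
            int len = in.readInt();
            if (len != CLIENT_ID_LENGTH) {
                throw new IllegalArgumentException("unexpected client id length: " + len);
            }
            byte[] clientId = new byte[len];
            in.readFully(clientId);
            return clientId;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Record written by the upgraded (new layout) writer.
        byte[] record = write("snap1", 1599507282085L, new byte[CLIENT_ID_LENGTH], 2);

        System.out.println("new-layout reader, client id length: "
                + readClientId(record, 2).length);
        try {
            readClientId(record, 1); // reader wrongly assumes the old layout
        } catch (IllegalArgumentException e) {
            System.out.println("old-layout reader failed: " + e.getMessage());
        }
    }
}
```

In this sketch the writer and reader disagree only about one optional field, yet every field after it is decoded from the wrong offset, which is the same cascade the description attributes to the skipped modification time bits.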
Attachments
Issue Links
- breaks
  - HDFS-16001 TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes (Resolved)
- is blocked by
  - HDFS-15624 Fix the SetQuotaByStorageTypeOp problem after updating hadoop (Resolved)
- is broken by
  - HDFS-14922 Prevent snapshot modification time got change on startup (Resolved)
  - HDFS-14924 RenameSnapshot not updating new modification time (Resolved)
  - HDFS-15054 Delete Snapshot not updating new modification time (Resolved)