Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Description
The OM shut down because of a RocksDB block checksum mismatch while creating a snapshot.
OM error log stack trace:
2023-11-17 01:36:32,449 INFO [OM StateMachine ApplyTransaction Thread - 0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created snapshot: 'snap-ck8h9' with snapshotId: 'f6f0dc88-5937-4722-98cc-cd6e1afc0558' under path 'voljitog/bucketjitog'
2023-11-17 01:36:33,531 ERROR [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager: Unable to create RocksDB Snapshot.
java.io.IOException: RocksDatabase[/var/lib/hadoop-ozone/om/data167990/om.db]: Failed to flush; status : Corruption; message : block checksum mismatch: stored = 3301695847, computed = 750921363, type = 1 in /var/lib/hadoop-ozone/om/data167990/om.db/000701.sst offset 0 size 317
    at org.apache.hadoop.hdds.utils.HddsServerUtil.toIOException(HddsServerUtil.java:667)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.toIOException(RocksDatabase.java:90)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.flush(RocksDatabase.java:504)
    at org.apache.hadoop.hdds.utils.db.RDBCheckpointManager.createCheckpoint(RDBCheckpointManager.java:81)
    at org.apache.hadoop.hdds.utils.db.RDBStore.getSnapshot(RDBStore.java:329)
    at org.apache.hadoop.ozone.om.OmSnapshotManager.createOmSnapshotCheckpoint(OmSnapshotManager.java:437)
    at org.apache.hadoop.ozone.om.response.snapshot.OMSnapshotCreateResponse.addToDBBatch(OMSnapshotCreateResponse.java:81)
    at org.apache.hadoop.ozone.om.response.OMClientResponse.checkAndUpdateDB(OMClientResponse.java:73)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$5(OzoneManagerDoubleBuffer.java:409)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.addToBatchWithTrace(OzoneManagerDoubleBuffer.java:237)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.addToBatch(OzoneManagerDoubleBuffer.java:408)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushBatch(OzoneManagerDoubleBuffer.java:335)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushCurrentBuffer(OzoneManagerDoubleBuffer.java:314)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:279)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.rocksdb.RocksDBException: block checksum mismatch: stored = 3301695847, computed = 750921363, type = 1 in /var/lib/hadoop-ozone/om/data167990/om.db/000701.sst offset 0 size 317
    at org.rocksdb.RocksDB.flush(Native Method)
    at org.rocksdb.RocksDB.flush(RocksDB.java:3785)
    at org.rocksdb.RocksDB.flush(RocksDB.java:3763)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.flush(RocksDatabase.java:500)
    ... 12 more
2023-11-17 01:36:33,584 ERROR [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with exit status 1: During flush to DB encountered error in OMDoubleBuffer flush thread OMDoubleBufferFlushThread when handling OMRequest: cmdType: CreateSnapshot traceID: "" success: true status: OK CreateSnapshotResponse { snapshotInfo { snapshotID { mostSigBits: -652779467798264030 leastSigBits: -7436343011912710824 } name: "snap-ck8h9" volumeName: "voljitog" bucketName: "bucketjitog" snapshotStatus: SNAPSHOT_ACTIVE creationTime: 1700184992446 deletionTime: 18446744073709551615 globalPreviousSnapshotID { mostSigBits: 6265401439073879226 leastSigBits: -4917962699912174853 } snapshotPath: "voljitog/bucketjitog" checkpointDir: "-f6f0dc88-5937-4722-98cc-cd6e1afc0558" dbTxSequenceNumber: 8220 deepClean: true sstFiltered: false } }
java.io.IOException: Rocks Database is closed
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.assertClose(RocksDatabase.java:444)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.newIterator(RocksDatabase.java:856)
    at org.apache.hadoop.hdds.utils.db.RDBTable.iterator(RDBTable.java:232)
    at org.apache.hadoop.hdds.utils.db.TypedTable.iterator(TypedTable.java:417)
    at org.apache.hadoop.hdds.utils.db.TypedTable.iterator(TypedTable.java:409)
    at org.apache.hadoop.hdds.utils.db.TypedTable.iterator(TypedTable.java:55)
    at org.apache.hadoop.ozone.om.OmSnapshotManager.deleteKeysFromDelKeyTableInSnapshotScope(OmSnapshotManager.java:637)
    at org.apache.hadoop.ozone.om.OmSnapshotManager.createOmSnapshotCheckpoint(OmSnapshotManager.java:442)
    at org.apache.hadoop.ozone.om.response.snapshot.OMSnapshotCreateResponse.addToDBBatch(OMSnapshotCreateResponse.java:81)
    at org.apache.hadoop.ozone.om.response.OMClientResponse.checkAndUpdateDB(OMClientResponse.java:73)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$5(OzoneManagerDoubleBuffer.java:409)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.addToBatchWithTrace(OzoneManagerDoubleBuffer.java:237)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.addToBatch(OzoneManagerDoubleBuffer.java:408)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushBatch(OzoneManagerDoubleBuffer.java:335)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushCurrentBuffer(OzoneManagerDoubleBuffer.java:314)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:279)
    at java.base/java.lang.Thread.run(Thread.java:833)
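
To cross-check that the CreateSnapshotResponse dumped in the termination message belongs to the snapshot created at 01:36:32, the raw protobuf fields can be decoded with the JDK alone. The following is a minimal sketch, not Ozone code: the class name SnapshotInfoDecoder and its helper methods are hypothetical, and it assumes creationTime is in epoch milliseconds and that deletionTime is printed as an unsigned 64-bit value.

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical helper for reading the snapshotInfo fields from the log; not part of Ozone.
public class SnapshotInfoDecoder {

    // Rebuild a UUID from the two signed 64-bit halves printed in the protobuf dump.
    static UUID toUuid(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }

    // Interpret a timestamp field as epoch milliseconds (assumption) and return a UTC instant.
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    public static void main(String[] args) {
        // snapshotID { mostSigBits: -652779467798264030 leastSigBits: -7436343011912710824 }
        UUID snapshotId = toUuid(-652779467798264030L, -7436343011912710824L);
        System.out.println("snapshotID   = " + snapshotId);

        // creationTime: 1700184992446
        System.out.println("creationTime = " + toInstant(1700184992446L));

        // deletionTime: 18446744073709551615 is the unsigned rendering of a signed -1
        long deletionTime = Long.parseUnsignedLong("18446744073709551615");
        System.out.println("deletionTime = " + deletionTime);
    }
}
```

The decoded snapshotID is f6f0dc88-5937-4722-98cc-cd6e1afc0558, the same ID reported by OMSnapshotCreateRequest and embedded in checkpointDir, and creationTime decodes to 2023-11-17T01:36:32.446Z, which lines up with the INFO entry and suggests the log timestamps are in UTC. The deletionTime value decodes to -1, which looks like an "unset" sentinel, although that reading is an inference from the value rather than something the log states.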