Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
A RuntimeException is encountered when generating a snapshotDiff report between two snapshots.
OM log error stack trace:
2023-11-16 22:54:06,804 INFO [IPC Server handler 32 on 9862]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager: Submitting snap diff report generation request for volume: voly6zr4, bucket: buckety6zr4, fromSnapshot: snap-wfiql and toSnapshot: snap-ohtc8
2023-11-16 22:54:06,804 INFO [snapshot-diff-job-thread-id-12]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager: Started snap diff report generation for volume: 'voly6zr4', bucket: 'buckety6zr4', fromSnapshot: 'snap-wfiql', toSnapshot: 'snap-ohtc8'
2023-11-16 22:54:06,805 INFO [snapshot-diff-job-thread-id-12]-org.apache.hadoop.ozone.om.snapshot.SnapshotCache: Loading snapshot. Table key: /voly6zr4/buckety6zr4/snap-wfiql
2023-11-16 22:54:06,805 INFO [snapshot-diff-job-thread-id-12]-org.apache.hadoop.ozone.om.snapshot.SnapshotCache: Loading snapshot. Table key: /voly6zr4/buckety6zr4/snap-ohtc8
2023-11-16 22:54:06,837 ERROR [snapshot-diff-job-thread-id-12]-org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager: Caught unchecked exception during diff report generation for volume: voly6zr4 bucket: buckety6zr4, fromSnapshot: snap-wfiql and toSnapshot: snap-ohtc8
java.lang.RuntimeException: java.io.IOException: RocksDatabase[/var/lib/hadoop-ozone/om/data159041/db.snapshots/checkpointState/om.db-b5edff31-58ff-458d-b431-141634b48006]: Failed to get /-9223372036853507328/-9223372036853506816/-9223372036853506303/vectortab_txt from ColumnFamily-directoryTable; status : IOError; message : While pread offset 0 len 518: /var/lib/hadoop-ozone/om/data159041/db.snapshots/checkpointState/om.db-b5edff31-58ff-458d-b431-141634b48006/001614.sst: Bad file descriptor
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$7(SnapshotDiffManager.java:1170)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.addToObjectIdMap(SnapshotDiffManager.java:1137)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.getDeltaFilesAndDiffKeysToObjectIdToKeyMap(SnapshotDiffManager.java:1078)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$4(SnapshotDiffManager.java:958)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.generateSnapshotDiffReport(SnapshotDiffManager.java:1014)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$2(SnapshotDiffManager.java:741)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: RocksDatabase[/var/lib/hadoop-ozone/om/data159041/db.snapshots/checkpointState/om.db-b5edff31-58ff-458d-b431-141634b48006]: Failed to get /-9223372036853507328/-9223372036853506816/-9223372036853506303/vectortab_txt from ColumnFamily-directoryTable; status : IOError; message : While pread offset 0 len 518: /var/lib/hadoop-ozone/om/data159041/db.snapshots/checkpointState/om.db-b5edff31-58ff-458d-b431-141634b48006/001614.sst: Bad file descriptor
    at org.apache.hadoop.hdds.utils.HddsServerUtil.toIOException(HddsServerUtil.java:667)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.toIOException(RocksDatabase.java:90)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.get(RocksDatabase.java:750)
    at org.apache.hadoop.hdds.utils.db.RDBTable.get(RDBTable.java:134)
    at org.apache.hadoop.hdds.utils.db.TypedTable.lambda$getFromTable$0(TypedTable.java:313)
    at org.apache.hadoop.hdds.utils.db.CodecBuffer.putFromSource(CodecBuffer.java:422)
    at org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(TypedTable.java:312)
    at org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(TypedTable.java:344)
    at org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(TypedTable.java:318)
    at org.apache.hadoop.hdds.utils.db.TypedTable.get(TypedTable.java:228)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$7(SnapshotDiffManager.java:1139)
    ... 11 more
Caused by: org.rocksdb.RocksDBException: While pread offset 0 len 518: /var/lib/hadoop-ozone/om/data159041/db.snapshots/checkpointState/om.db-b5edff31-58ff-458d-b431-141634b48006/001614.sst: Bad file descriptor
    at org.rocksdb.RocksDB.getDirect(Native Method)
    at org.rocksdb.RocksDB.get(RocksDB.java:1251)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.get(RocksDatabase.java:743)
    ... 19 more
2023-11-16 22:54:16,652 ERROR [Timer for 'OzoneManager' metrics system]-org.apache.hadoop.hdds.utils.RocksDBStoreMetrics: Failed to get property mem-table-flush-pending from rocksdb
java.io.IOException: Rocks Database is closed
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.assertClose(RocksDatabase.java:444)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.getProperty(RocksDatabase.java:807)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.getDBPropertyData(RocksDBStoreMetrics.java:214)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.getMetrics(RocksDBStoreMetrics.java:151)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
    at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
    at java.base/java.util.TimerThread.run(Timer.java:506)
2023-11-16 22:54:16,653 ERROR [Timer for 'OzoneManager' metrics system]-org.apache.hadoop.hdds.utils.RocksDBStoreMetrics: Failed to compute sst file stat
java.io.IOException: Rocks Database is closed
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.assertClose(RocksDatabase.java:444)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.getLiveFilesMetaData(RocksDatabase.java:642)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.computeSstFileStat(RocksDBStoreMetrics.java:251)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.getDBPropertyData(RocksDBStoreMetrics.java:235)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.getMetrics(RocksDBStoreMetrics.java:151)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
    at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
    at java.base/java.util.TimerThread.run(Timer.java:506)
2023-11-16 22:54:16,653 ERROR [Timer for 'OzoneManager' metrics system]-org.apache.hadoop.hdds.utils.RocksDBStoreMetrics: Failed to get latest sequence number
java.io.IOException: Rocks Database is closed
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.assertClose(RocksDatabase.java:444)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.getLatestSequenceNumber(RocksDatabase.java:834)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.getLatestSequenceNumber(RocksDBStoreMetrics.java:302)
    at org.apache.hadoop.hdds.utils.RocksDBStoreMetrics.getMetrics(RocksDBStoreMetrics.java:152)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
    at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
    at java.base/java.util.TimerThread.run(Timer.java:506)
Issue Links
- duplicates HDDS-10149 Bad file descriptor in TestOmSnapshotFsoWithNativeLib.testSnapshotCompactionDag (Resolved)