Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
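The SSTFiltering service should use the snapshot cache when it opens a snapshot's RocksDB instance.
Currently, running a snapshot diff concurrently with the SSTFiltering service fails with the following error. The log indicates that the snapshot's RocksDB is already open in the same process, so a second open cannot acquire the LOCK file: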
2023-06-26 13:05:15,385 [snapshot-diff-job-thread-id-18] ERROR org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager: Caught checked exception during diff report generation for volume: volume1 bucket: bucket1, fromSnapshot: alma2 and toSnapshot: cm-tmp-148e6230-2672-4077-b969-5ef70578f264
java.util.concurrent.ExecutionException: java.io.IOException: Failed init RocksDB, db path : /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-77982b43-6040-46d1-89c8-9a7c7c4d446c, exception :org.rocksdb.RocksDBException lock hold by current process, acquire time 1687542204 acquiring thread 139729809377024: /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-77982b43-6040-46d1-89c8-9a7c7c4d446c/LOCK: No locks available
    at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:588)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:547)
    at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:113)
    at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:240)
    at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2317)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2283)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2159)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3966)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3989)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4950)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.generateSnapshotDiffReport(SnapshotDiffManager.java:669)
    at org.apache.hadoop.ozone.om.snapshot.SnapshotDiffManager.lambda$0(SnapshotDiffManager.java:565)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed init RocksDB, db path : /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-77982b43-6040-46d1-89c8-9a7c7c4d446c, exception :org.rocksdb.RocksDBException lock hold by current process, acquire time 1687542204 acquiring thread 139729809377024: /var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-77982b43-6040-46d1-89c8-9a7c7c4d446c/LOCK: No locks available
    at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:173)
    at org.apache.hadoop.hdds.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:213)
    at org.apache.hadoop.ozone.om.OmMetadataManagerImpl.loadDB(OmMetadataManagerImpl.java:557)
    at org.apache.hadoop.ozone.om.OmMetadataManagerImpl.<init>(OmMetadataManagerImpl.java:379)
    at org.apache.hadoop.ozone.om.OmSnapshotManager$1.load(OmSnapshotManager.java:319)
    at org.apache.hadoop.ozone.om.OmSnapshotManager$1.load(OmSnapshotManager.java:1)
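The idea behind routing both services through one cache can be illustrated with a minimal, self-contained sketch. It is not the Ozone implementation or the actual fix: `SnapshotHandleCacheSketch`, `FakeSnapshotDb`, and `LOCKED_PATHS` are hypothetical names, and the exclusive per-directory LOCK that RocksDB takes is simulated with an in-memory set. Assuming every caller acquires the handle through the cache, the DB is opened once and shared, so the second caller no longer hits the lock error.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: share one open handle per snapshot DB path via a ref-counted cache. */
class SnapshotHandleCacheSketch {

  // Hypothetical stand-in for the exclusive LOCK file held per DB directory.
  static final Set<String> LOCKED_PATHS = ConcurrentHashMap.newKeySet();

  /** Hypothetical stand-in for an opened snapshot DB; not an Ozone or RocksDB class. */
  static final class FakeSnapshotDb implements AutoCloseable {
    final String path;

    FakeSnapshotDb(String path) throws IOException {
      // A second open of the same path in this process fails, like the error above.
      if (!LOCKED_PATHS.add(path)) {
        throw new IOException("lock held by current process: " + path + "/LOCK");
      }
      this.path = path;
    }

    @Override
    public void close() {
      LOCKED_PATHS.remove(path);
    }
  }

  /** Cache entry: the shared handle plus the number of callers currently using it. */
  private static final class Entry {
    final FakeSnapshotDb db;
    int refs; // only mutated inside compute()/computeIfPresent(), which are atomic per key

    Entry(FakeSnapshotDb db) {
      this.db = db;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();

  /** Returns the shared handle for the path, opening the DB only on first use. */
  FakeSnapshotDb acquire(String path) {
    return cache.compute(path, (p, entry) -> {
      if (entry == null) {
        try {
          entry = new Entry(new FakeSnapshotDb(p));
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }
      entry.refs++;
      return entry;
    }).db;
  }

  /** Drops one reference; the DB is closed when the last caller releases it. */
  void release(String path) {
    cache.computeIfPresent(path, (p, entry) -> {
      if (--entry.refs == 0) {
        entry.db.close();
        return null; // remove the mapping
      }
      return entry;
    });
  }

  public static void main(String[] args) {
    SnapshotHandleCacheSketch cache = new SnapshotHandleCacheSketch();
    String path = "/tmp/example-snapshot-db"; // illustrative path

    FakeSnapshotDb forDiff = cache.acquire(path);   // e.g. the snapshot diff job
    FakeSnapshotDb forFilter = cache.acquire(path); // e.g. the SST filtering service
    System.out.println("same handle shared: " + (forDiff == forFilter));

    cache.release(path);
    cache.release(path); // last release closes the DB and frees the lock
  }
}
```

Reference counting is one possible eviction policy for such a cache; it keeps a handle open for as long as any caller is still using it, at the cost of requiring every `acquire` to be paired with a `release`.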
Issue Links
- relates to: HDDS-9029 Intermittent failure in TestOMSnapshotDAG#testDAGReconstruction (Resolved)