Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Steps:
- Create a volume, a bucket, and a key, then create snapshot snap1
- Delete snapshot snap1
- Try to access the contents of the deleted snapshot snap1 through 'fs -ls'
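The steps above can be sketched as a shell script. This is only an illustration, not the reporter's exact commands: the volume, bucket, and snapshot names come from the log in this issue, while the key name `key1`, the `OM_SERVICE` value, and the `DRY_RUN` switch are assumptions made for the example. The FSO bucket layout is inferred from the `OMDirectoriesPurgeRequestWithFSO` frames in the stack trace and may not match the original setup. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

VOLUME=vol2
BUCKET=buck1
SNAPSHOT=snap1
KEY=key1                                # hypothetical key name
OM_SERVICE="${OM_SERVICE:-om-service}"  # hypothetical OM service ID or host
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands

# Print a command and, unless DRY_RUN=1, execute it.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

# Small local file to upload as the key's content.
payload="$(mktemp)"
echo "snapshot repro payload" > "$payload"

# Step 1: create volume, bucket, key, and snapshot snap1.
run ozone sh volume create "/${VOLUME}"
run ozone sh bucket create --layout FILE_SYSTEM_OPTIMIZED "/${VOLUME}/${BUCKET}"
run ozone sh key put "/${VOLUME}/${BUCKET}/${KEY}" "$payload"
run ozone sh snapshot create "/${VOLUME}/${BUCKET}" "${SNAPSHOT}"

# Step 2: delete the snapshot.
run ozone sh snapshot delete "/${VOLUME}/${BUCKET}" "${SNAPSHOT}"

# Step 3: list the deleted snapshot; the listing itself is expected to fail,
# so its exit status is ignored here.
run ozone fs -ls "ofs://${OM_SERVICE}/${VOLUME}/${BUCKET}/.snapshot/${SNAPSHOT}" || true

rm -f "$payload"
```

Running it with `DRY_RUN=0` against a test cluster executes the commands; after the last step, the OM log is the place to look for the `PurgeDirectories` failure shown in this issue.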
OM error stack trace:
2023-05-03 06:47:09,555 [Socket Reader #1 for port 9862] INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for om@ROOT.HWX.SITE (auth:KERBEROS) for protocol=interface org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol
2023-05-03 06:47:11,287 [OM StateMachine ApplyTransaction Thread - 0] ERROR org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine: Terminating with exit status 1: Request cmdType: PurgeDirectories clientId: "client-4D5F4A3C07A4" purgeDirectoriesRequest { snapshotTableKey: "/vol2/buck1/snap1" } failed with exception
java.lang.IllegalStateException: java.io.IOException: FILE_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Unable to load snapshot. Snapshot with table key '/vol2/buck1/snap1' is no longer active
    at org.apache.hadoop.ozone.om.request.key.OMDirectoriesPurgeRequestWithFSO.validateAndUpdateCache(OMDirectoriesPurgeRequestWithFSO.java:133)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:337)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:567)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:358)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: FILE_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Unable to load snapshot. Snapshot with table key '/vol2/buck1/snap1' is no longer active
    at org.apache.hadoop.ozone.om.OmSnapshotManager.checkForSnapshot(OmSnapshotManager.java:523)
    at org.apache.hadoop.ozone.om.request.key.OMDirectoriesPurgeRequestWithFSO.validateAndUpdateCache(OMDirectoriesPurgeRequestWithFSO.java:77)
    ... 7 more
Caused by: FILE_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Unable to load snapshot. Snapshot with table key '/vol2/buck1/snap1' is no longer active
    at org.apache.hadoop.ozone.om.OmSnapshotManager$1.load(OmSnapshotManager.java:288)
    at org.apache.hadoop.ozone.om.OmSnapshotManager$1.load(OmSnapshotManager.java:1)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3533)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2282)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2159)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3966)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3989)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4950)
    at org.apache.hadoop.ozone.om.OmSnapshotManager.checkForSnapshot(OmSnapshotManager.java:521)
    ... 8 more
2023-05-03 06:47:11,434 [shutdown-hook-0] INFO org.apache.ranger.audit.provider.AuditProviderFactory: ==> JVMShutdownHook.run()
2023-05-03 06:47:11,449 [shutdown-hook-0] INFO org.apache.ranger.audit.provider.AuditProviderFactory: JVMShutdownHook: Signalling async audit cleanup to start.
2023-05-03 06:47:11,455 [shutdown-hook-0] INFO org.apache.ranger.audit.provider.AuditProviderFactory: JVMShutdownHook: Waiting up to 30 seconds for audit cleanup to finish.
2023-05-03 06:47:11,459 [Ranger async Audit cleanup] INFO org.apache.ranger.audit.provider.AuditProviderFactory: RangerAsyncAuditCleanup: Starting cleanup
2023-05-03 06:47:11,472 [Ranger async Audit cleanup] INFO org.apache.ranger.audit.queue.AuditAsyncQueue: Stop called. name=ozone.async
2023-05-03 06:47:11,472 [Ranger async Audit cleanup] INFO org.apache.ranger.audit.queue.AuditAsyncQueue: Interrupting consumerThread. name=ozone.async, consumer=ozone.async.summary
2023-05-03 06:47:11,473 [Ranger async Audit cleanup] INFO org.apache.ranger.audit.provider.AuditProviderFactory: RangerAsyncAuditCleanup: Done cleanup
2023-05-03 06:47:11,473 [Ranger async Audit cleanup] INFO org.apache.ranger.audit.provider.AuditProviderFactory: RangerAsyncAuditCleanup: Waiting to audit cleanup start signal
2023-05-03 06:47:11,473 [org.apache.ranger.audit.queue.AuditAsyncQueue0] INFO org.apache.ranger.audit.queue.AuditAsyncQueue: Caught exception in consumer thread. Shutdown might be in progress
2023-05-03 06:47:11,474 [org.apache.ranger.audit.queue.AuditAsyncQueue0] INFO org.apache.ranger.audit.queue.AuditAsyncQueue: Exiting polling loop. name=ozone.async
2023-05-03 06:47:11,474 [shutdown-hook-0] INFO org.apache.ranger.audit.provider.AuditProviderFactory: JVMShutdownHook: Audit cleanup finished after 19 milli seconds
2023-05-03 06:47:11,474 [org.apache.ranger.audit.queue.AuditAsyncQueue0] INFO org.apache.ranger.audit.queue.AuditAsyncQueue: Calling to stop consumer. name=ozone.async, consumer.name=ozone.async.summary
2023-05-03 06:47:11,474 [shutdown-hook-0] INFO org.apache.ranger.audit.provider.AuditProviderFactory: JVMShutdownHook: Interrupting ranger async audit cleanup thread
2023-05-03 06:47:11,474 [shutdown-hook-0] INFO org.apache.ranger.audit.provider.AuditProviderFactory: <== JVMShutdownHook.run()
2023-05-03 06:47:11,474 [Ranger async Audit cleanup] INFO org.apache.ranger.audit.provider.AuditProviderFactory: RangerAsyncAuditCleanup: Interrupted while waiting for audit startCleanup signal! Exiting the thread...
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
    at org.apache.ranger.audit.provider.AuditProviderFactory$RangerAsyncAuditCleanup.run(AuditProviderFactory.java:531)
    at java.lang.Thread.run(Thread.java:748)
2023-05-03 06:47:11,474 [org.apache.ranger.audit.queue.AuditAsyncQueue0] INFO org.apache.ranger.audit.queue.AuditSummaryQueue: Stop called. name=ozone.async.summary
2023-05-03 06:47:11,481 [org.apache.ranger.audit.queue.AuditAsyncQueue0] INFO org.apache.ranger.audit.queue.AuditSummaryQueue: Interrupting consumerThread. name=ozone.async.summary, consumer=ozone.async.summary.batch
2023-05-03 06:47:11,481 [org.apache.ranger.audit.queue.AuditAsyncQueue0] INFO org.apache.ranger.audit.queue.AuditAsyncQueue: Exiting consumerThread.run() method. name=ozone.async
2023-05-03 06:47:11,481 [org.apache.ranger.audit.queue.AuditSummaryQueue0] INFO org.apache.ranger.audit.queue.AuditSummaryQueue: Caught exception in consumer thread. Shutdown might be in progress
2023-05-03 06:47:11,481 [org.apache.ranger.audit.queue.AuditSummaryQueue0] INFO org.apache.ranger.audit.queue.AuditSummaryQueue: Exiting polling loop. name=ozone.async.summary
2023-05-03 06:47:11,481 [org.apache.ranger.audit.queue.AuditSummaryQueue0] INFO org.apache.ranger.audit.queue.AuditSummaryQueue: Calling to stop consumer. name=ozone.async.summary, consumer.name=ozone.async.summary.batch
2023-05-03 06:47:11,481 [org.apache.ranger.audit.queue.AuditSummaryQueue0] INFO org.apache.ranger.audit.queue.AuditBatchQueue: Stop called. name=ozone.async.summary.batch
2023-05-03 06:47:11,486 [org.apache.ranger.audit.queue.AuditSummaryQueue0] INFO org.apache.ranger.audit.queue.AuditBatchQueue: Interrupting consumerThread. name=ozone.async.summary.batch, consumer=ozone.async.summary.batch.solr
2023-05-03 06:47:11,486 [org.apache.ranger.audit.queue.AuditSummaryQueue0] INFO org.apache.ranger.audit.queue.AuditSummaryQueue: Exiting consumerThread.run() method. name=ozone.async.summary
2023-05-03 06:47:11,491 [shutdown-hook-0] INFO org.apache.hadoop.ozone.om.OzoneManager: om1[jsinha-1.jsinha.root.hwx.site:9862]: Stopping Ozone Manager
2023-05-03 06:47:11,492 [org.apache.ranger.audit.queue.AuditBatchQueue0] ERROR org.apache.solr.client.solrj.impl.BaseCloudSolrClient: Request to collection [ranger_audits] failed due to (0) java.lang.InterruptedException, retry=0 commError=false errorCode=0
2023-05-03 06:47:11,492 [org.apache.ranger.audit.queue.AuditBatchQueue0] INFO org.apache.solr.client.solrj.impl.BaseCloudSolrClient: request was not communication error it seems
2023-05-03 06:47:11,507 [shutdown-hook-0] INFO org.apache.hadoop.ozone.om.OzoneManagerStarter: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down OzoneManager at jsinha-1.jsinha.root.hwx.site/172.27.88.82
************************************************************/
Issue Links
- is related to HDDS-8449 [snapshot] OM shuts down on trying to delete same snapshot twice before reclamation (Resolved)