Uploaded image for project: 'CloudStack'
  1. CloudStack
  2. CLOUDSTACK-9538

Deleting Snapshot From Primary Storage Fails on RBD Storage if you already delete vm's itself

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.9.0
    • Fix Version/s: None
    • Security Level: Public (Anyone can view this level - this is the default.)
    • Labels:
      None
    • Environment:
      Ubuntu 14.04 Management Server + Ubuntu 14.04 KVM

      Description

      Hi;
      We plan to store vm snapshots as vm backups on secondary storage while we still destroyed/expunged related vm. The idea is good there was a bug which blocks this idea to work and it was fixed with CLOUDSTACK-9297 bug.

      Normally with 4.9 release we expected this idea to work on our on our 4.9 ACS environment but we noticed that because we are using rbd as primary storage we need to fix one minor bug for this idea to work.

      The problem occurs because CLOUDSTACK-8302 bug fixed on 4.9 release and it block our idea to work. If you destroy a vm which is on RBD Storage as primary storage it also deletes any related snapshots of that vm on Primary RBD Storage. So after vm destroy no disk file or snapshot file over RBD Storage. This is good for cleanup purposes on primary storage but XenserverSnapshotStrategy.deleteSnapshot class method did not expect this to happen.

      org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.deleteSnapshot method receives exception. The code tries 10 times on KVM node to remove RBD snapshot but because there is no snapshot on RBD side it get exception after 10 retries, it also spends nearly 5 minutes to delete snapshots and after that it ends with an error like "Failed to delete snapshot" error.

      I think we need to disable snapshot cleanup on primary storage only for RBD type Primary Storage if its related vm was already been destroyed. (Because vm destroy stage removed all snapshots related to vm on primary storage so there is no need to take any action on primary storage.)

      We make some tests below to make this issue clear for bug.

      1) We create a vm with 3 snapshots on ACS.

      mysql> select * from snapshot_store_ref where snapshot_id in (93,94,95);
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      id store_id snapshot_id created last_updated job_id store_role size physical_size parent_snapshot_id install_path state update_count ref_cnt updated volume_id

      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      185 1 93 2016-10-12 10:13:44 NULL NULL Primary 28991029248 28991029248 0 cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/54008bf3-43dd-469d-91a7-4acd146d7b84 Ready 2 0 2016-10-12 10:13:45 4774
      186 1 93 2016-10-12 10:13:45 NULL NULL Image 28991029248 28991029248 0 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 Ready 2 0 2016-10-12 10:15:04 4774
      187 1 94 2016-10-12 10:15:38 NULL NULL Primary 28991029248 28991029248 0 cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/45fc4f44-b377-49c0-9264-5d813fefe93f Ready 2 0 2016-10-12 10:15:39 4774
      188 1 94 2016-10-12 10:15:39 NULL NULL Image 28991029248 28991029248 0 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f Ready 2 0 2016-10-12 10:16:52 4774
      189 1 95 2016-10-12 10:17:08 NULL NULL Primary 28991029248 28991029248 0 cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/1c1890bf-5a55-4060-bf3f-cf064629dcdf Ready 2 0 2016-10-12 10:17:09 4774
      190 1 95 2016-10-12 10:17:09 NULL NULL Image 28991029248 28991029248 0 snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf Ready 2 0 2016-10-12 10:18:23 4774

      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      6 rows in set (0.00 sec)

      mysql> select * from snapshots where name like '%TolgaC72%';
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      id data_center_id account_id domain_id volume_id disk_offering_id status path name uuid snapshot_type type_description size created removed backup_snap_id swift_id sechost_id prev_snap_id hypervisor_type version s3_id min_iops max_iops

      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      93 1 2 1 4774 19 BackedUp NULL S1TolgaC72Test41 0f84acd8-9a5c-4de5-a58a-b528b4e72839 0 MANUAL 28991029248 2016-10-12 10:13:42 NULL NULL NULL NULL NULL KVM 2.2 NULL NULL NULL
      94 1 2 1 4774 19 BackedUp NULL S2TolgaC72Test41 51ef5246-27ae-435e-92ac-7428bab64cb8 0 MANUAL 28991029248 2016-10-12 10:15:36 NULL NULL NULL NULL NULL KVM 2.2 NULL NULL NULL
      95 1 2 1 4774 19 BackedUp NULL S3TolgaC72Test41 e5550dd6-9a43-4dd7-959a-7aabf26f0593 0 MANUAL 28991029248 2016-10-12 10:17:06 NULL NULL NULL NULL NULL KVM 2.2 NULL NULL NULL

      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      3 rows in set (0.00 sec)

      mysql> select uuid, name from volumes where name='ROOT-4795';
      -----------------------------------------------+

      uuid name

      -----------------------------------------------+

      bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 ROOT-4795

      -----------------------------------------------+
      1 row in set (0.00 sec)

      mysql>

      #VM + 3 Snapshots
      #Ceph / Primary Storage
      root@mont1:/var/log/ceph# rbd ls -l -p cst4|grep bb9ca3c7-96d6-4465-85b5-cd01f4d635f2
      bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
      bb9ca3c7-96d6-4465-85b5-cd01f4d635f2@54008bf3-43dd-469d-91a7-4acd146d7b84 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
      bb9ca3c7-96d6-4465-85b5-cd01f4d635f2@45fc4f44-b377-49c0-9264-5d813fefe93f 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
      bb9ca3c7-96d6-4465-85b5-cd01f4d635f2@1c1890bf-5a55-4060-bf3f-cf064629dcdf 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
      root@mont1:/var/log/ceph#
      #NFS / Secondary Side
      root@cloudstackt4:/CST/S# ls -l snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
      rw-rr- 1 root root 28991029248 Oct 12 13:18 snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
      rw-rr- 1 root root 28991029248 Oct 12 13:16 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f
      rw-rr- 1 root root 28991029248 Oct 12 13:15 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84
      root@cloudstackt4:/CST/S#

      2) We destroy only vm not snapshots so 3 snapshots are active with BackedUp Status. On vm destroy stage vm destroy method removes all vm's files and its related snapshots on primary storage.

      #VM Destroyed we have 3 Snapshots on BackedUp state
      #Ceph Side / Primary Storage
      root@mont1:/var/log/ceph# rbd ls -l -p cst4|grep bb9ca3c7-96d6-4465-85b5-cd01f4d635f2
      root@mont1:/var/log/ceph#
      #NFS / Secondary Side
      root@cloudstackt4:/CST/S# ls -l snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
      rw-rr- 1 root root 28991029248 Oct 12 13:18 snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
      rw-rr- 1 root root 28991029248 Oct 12 13:16 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f
      rw-rr- 1 root root 28991029248 Oct 12 13:15 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84
      root@cloudstackt4:/CST/S#

      3) We destroy 1 Snapshot named S3TolgaC72Test41
      #Ceph Side / Primary Storage
      root@mont1:/var/log/ceph# rbd ls -l -p cst4|grep bb9ca3c7-96d6-4465-85b5-cd01f4d635f2
      root@mont1:/var/log/ceph#
      #NFS / Secondary Side
      root@cloudstackt4:/CST/S# ls -l snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
      ls: cannot access snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf: No such file or directory
      rw-rr- 1 root root 28991029248 Oct 12 13:16 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f
      rw-rr- 1 root root 28991029248 Oct 12 13:15 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84

      #SQL Side
      mysql> select * from snapshots where name like '%TolgaC72%' and name != 'StolgaC72Revert';
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      id data_center_id account_id domain_id volume_id disk_offering_id status path name uuid snapshot_type type_description size created removed backup_snap_id swift_id sechost_id prev_snap_id hypervisor_type version s3_id min_iops max_iops

      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      93 1 2 1 4774 19 BackedUp NULL S1TolgaC72Test41 0f84acd8-9a5c-4de5-a58a-b528b4e72839 0 MANUAL 28991029248 2016-10-12 10:13:42 NULL NULL NULL NULL NULL KVM 2.2 NULL NULL NULL
      94 1 2 1 4774 19 BackedUp NULL S2TolgaC72Test41 51ef5246-27ae-435e-92ac-7428bab64cb8 0 MANUAL 28991029248 2016-10-12 10:15:36 NULL NULL NULL NULL NULL KVM 2.2 NULL NULL NULL
      95 1 2 1 4774 19 Destroyed NULL S3TolgaC72Test41 e5550dd6-9a43-4dd7-959a-7aabf26f0593 0 MANUAL 28991029248 2016-10-12 10:17:06 NULL NULL NULL NULL NULL KVM 2.2 NULL NULL NULL

      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      3 rows in set (0.00 sec)

      mysql> select * from snapshot_store_ref where snapshot_id in (93,94,95);
      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      id store_id snapshot_id created last_updated job_id store_role size physical_size parent_snapshot_id install_path state update_count ref_cnt updated volume_id

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      185 1 93 2016-10-12 10:13:44 NULL NULL Primary 28991029248 28991029248 0 cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/54008bf3-43dd-469d-91a7-4acd146d7b84 Ready 2 0 2016-10-12 10:13:45 4774
      186 1 93 2016-10-12 10:13:45 NULL NULL Image 28991029248 28991029248 0 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 Ready 2 0 2016-10-12 10:15:04 4774
      187 1 94 2016-10-12 10:15:38 NULL NULL Primary 28991029248 28991029248 0 cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/45fc4f44-b377-49c0-9264-5d813fefe93f Ready 2 0 2016-10-12 10:15:39 4774
      188 1 94 2016-10-12 10:15:39 NULL NULL Image 28991029248 28991029248 0 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f Ready 2 0 2016-10-12 10:16:52 4774
      190 1 95 2016-10-12 10:17:09 NULL NULL Image 28991029248 28991029248 0 snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf Destroyed 4 0 2016-10-12 10:48:31 4774

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      5 rows in set (0.00 sec)

      mysql>

      ERROR on Cloudstack Management Server: Failed to delete snapshot
      2016-10-12 13:54:02,911 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] (API-Job-Executor-70:ctx-06858bab job-34505 ctx-a6bc52fb) (logid:a2b92099) Failed to delete snapshot:
      com.cloud.utils.exception.CloudRuntimeException: Failed to remove snapshot
      at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.deleteSnapshot(SnapshotServiceImpl.java:396)
      at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.deleteSnapshot(XenserverSnapshotStrategy.java:272)
      at com.cloud.storage.snapshot.SnapshotManagerImpl.deleteSnapshot(SnapshotManagerImpl.java:454)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
      at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
      at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
      at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
      at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
      at com.sun.proxy.$Proxy192.deleteSnapshot(Unknown Source)
      at org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd.execute(DeleteSnapshotCmd.java:103)
      at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:150)
      at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
      at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
      at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
      at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
      at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
      at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
      at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
      at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
      2016-10-12 13:54:02,918 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-70:ctx-06858bab job-34505) (logid:a2b92099) Complete async job-34505, jobStatus: FAILED, resultCode: 530,
      result: org.apache.cloudstack.api.response.ExceptionResponse/null/

      {"uuidList":[],"errorcode":530,"errortext":"Failed to delete snapshot"}

      ERROR on KVM Agent
      2016-10-12 13:49:38,227 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2'
      2016-10-12 13:49:38,227 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) Refreshing storage pool 01b375db-d3f5-33c1-9389-8bf226c887e8
      2016-10-12 13:49:41,253 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:2
      2016-10-12 13:51:53,819 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:6
      2016-10-12 13:52:26,858 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:7
      2016-10-12 13:53:33,390 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:9
      2016-10-12 13:54:03,390 ERROR [kvm.storage.KVMStorageProcessor] (agentRequest-Handler-3:null) Failed to remove snapshot , with exception: com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2'

      Thanks
      Özhan

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                ozhank Özhan Rüzgar Karaman
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated: