Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
4.9.0
-
None
-
Security Level: Public (Anyone can view this level - this is the default.)
-
None
-
Ubuntu 14.04 Management Server + Ubuntu 14.04 KVM
Description
Hi;
We plan to store vm snapshots as vm backups on secondary storage while we still destroyed/expunged related vm. The idea is good there was a bug which blocks this idea to work and it was fixed with CLOUDSTACK-9297 bug.
Normally with 4.9 release we expected this idea to work on our on our 4.9 ACS environment but we noticed that because we are using rbd as primary storage we need to fix one minor bug for this idea to work.
The problem occurs because CLOUDSTACK-8302 bug fixed on 4.9 release and it block our idea to work. If you destroy a vm which is on RBD Storage as primary storage it also deletes any related snapshots of that vm on Primary RBD Storage. So after vm destroy no disk file or snapshot file over RBD Storage. This is good for cleanup purposes on primary storage but XenserverSnapshotStrategy.deleteSnapshot class method did not expect this to happen.
org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.deleteSnapshot method receives exception. The code tries 10 times on KVM node to remove RBD snapshot but because there is no snapshot on RBD side it get exception after 10 retries, it also spends nearly 5 minutes to delete snapshots and after that it ends with an error like "Failed to delete snapshot" error.
I think we need to disable snapshot cleanup on primary storage only for RBD type Primary Storage if its related vm was already been destroyed. (Because vm destroy stage removed all snapshots related to vm on primary storage so there is no need to take any action on primary storage.)
We make some tests below to make this issue clear for bug.
1) We create a vm with 3 snapshots on ACS.
mysql> select * from snapshot_store_ref where snapshot_id in (93,94,95);
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
id | store_id | snapshot_id | created | last_updated | job_id | store_role | size | physical_size | parent_snapshot_id | install_path | state | update_count | ref_cnt | updated | volume_id |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
185 | 1 | 93 | 2016-10-12 10:13:44 | NULL | NULL | Primary | 28991029248 | 28991029248 | 0 | cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/54008bf3-43dd-469d-91a7-4acd146d7b84 | Ready | 2 | 0 | 2016-10-12 10:13:45 | 4774 |
186 | 1 | 93 | 2016-10-12 10:13:45 | NULL | NULL | Image | 28991029248 | 28991029248 | 0 | snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 | Ready | 2 | 0 | 2016-10-12 10:15:04 | 4774 |
187 | 1 | 94 | 2016-10-12 10:15:38 | NULL | NULL | Primary | 28991029248 | 28991029248 | 0 | cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/45fc4f44-b377-49c0-9264-5d813fefe93f | Ready | 2 | 0 | 2016-10-12 10:15:39 | 4774 |
188 | 1 | 94 | 2016-10-12 10:15:39 | NULL | NULL | Image | 28991029248 | 28991029248 | 0 | snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f | Ready | 2 | 0 | 2016-10-12 10:16:52 | 4774 |
189 | 1 | 95 | 2016-10-12 10:17:08 | NULL | NULL | Primary | 28991029248 | 28991029248 | 0 | cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/1c1890bf-5a55-4060-bf3f-cf064629dcdf | Ready | 2 | 0 | 2016-10-12 10:17:09 | 4774 |
190 | 1 | 95 | 2016-10-12 10:17:09 | NULL | NULL | Image | 28991029248 | 28991029248 | 0 | snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf | Ready | 2 | 0 | 2016-10-12 10:18:23 | 4774 |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)
mysql> select * from snapshots where name like '%TolgaC72%';
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
id | data_center_id | account_id | domain_id | volume_id | disk_offering_id | status | path | name | uuid | snapshot_type | type_description | size | created | removed | backup_snap_id | swift_id | sechost_id | prev_snap_id | hypervisor_type | version | s3_id | min_iops | max_iops |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
93 | 1 | 2 | 1 | 4774 | 19 | BackedUp | NULL | S1TolgaC72Test41 | 0f84acd8-9a5c-4de5-a58a-b528b4e72839 | 0 | MANUAL | 28991029248 | 2016-10-12 10:13:42 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
94 | 1 | 2 | 1 | 4774 | 19 | BackedUp | NULL | S2TolgaC72Test41 | 51ef5246-27ae-435e-92ac-7428bab64cb8 | 0 | MANUAL | 28991029248 | 2016-10-12 10:15:36 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
95 | 1 | 2 | 1 | 4774 | 19 | BackedUp | NULL | S3TolgaC72Test41 | e5550dd6-9a43-4dd7-959a-7aabf26f0593 | 0 | MANUAL | 28991029248 | 2016-10-12 10:17:06 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
mysql> select uuid, name from volumes where name='ROOT-4795';
-----------------------------------------------+
uuid | name |
-----------------------------------------------+
bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 | ROOT-4795 |
-----------------------------------------------+
1 row in set (0.00 sec)
mysql>
#VM + 3 Snapshots
#Ceph / Primary Storage
root@mont1:/var/log/ceph# rbd ls -l -p cst4|grep bb9ca3c7-96d6-4465-85b5-cd01f4d635f2
bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
bb9ca3c7-96d6-4465-85b5-cd01f4d635f2@54008bf3-43dd-469d-91a7-4acd146d7b84 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
bb9ca3c7-96d6-4465-85b5-cd01f4d635f2@45fc4f44-b377-49c0-9264-5d813fefe93f 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
bb9ca3c7-96d6-4465-85b5-cd01f4d635f2@1c1890bf-5a55-4060-bf3f-cf064629dcdf 27648M cst4/fb80c005-51d2-4fc4-8a98-39073866e826@cloudstack-base-snap 2
root@mont1:/var/log/ceph#
#NFS / Secondary Side
root@cloudstackt4:/CST/S# ls -l snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
rw-rr- 1 root root 28991029248 Oct 12 13:18 snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
rw-rr- 1 root root 28991029248 Oct 12 13:16 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f
rw-rr- 1 root root 28991029248 Oct 12 13:15 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84
root@cloudstackt4:/CST/S#
2) We destroy only vm not snapshots so 3 snapshots are active with BackedUp Status. On vm destroy stage vm destroy method removes all vm's files and its related snapshots on primary storage.
#VM Destroyed we have 3 Snapshots on BackedUp state
#Ceph Side / Primary Storage
root@mont1:/var/log/ceph# rbd ls -l -p cst4|grep bb9ca3c7-96d6-4465-85b5-cd01f4d635f2
root@mont1:/var/log/ceph#
#NFS / Secondary Side
root@cloudstackt4:/CST/S# ls -l snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
rw-rr- 1 root root 28991029248 Oct 12 13:18 snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
rw-rr- 1 root root 28991029248 Oct 12 13:16 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f
rw-rr- 1 root root 28991029248 Oct 12 13:15 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84
root@cloudstackt4:/CST/S#
3) We destroy 1 Snapshot named S3TolgaC72Test41
#Ceph Side / Primary Storage
root@mont1:/var/log/ceph# rbd ls -l -p cst4|grep bb9ca3c7-96d6-4465-85b5-cd01f4d635f2
root@mont1:/var/log/ceph#
#NFS / Secondary Side
root@cloudstackt4:/CST/S# ls -l snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf
ls: cannot access snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf: No such file or directory
rw-rr- 1 root root 28991029248 Oct 12 13:16 snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f
rw-rr- 1 root root 28991029248 Oct 12 13:15 snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84
#SQL Side
mysql> select * from snapshots where name like '%TolgaC72%' and name != 'StolgaC72Revert';
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
id | data_center_id | account_id | domain_id | volume_id | disk_offering_id | status | path | name | uuid | snapshot_type | type_description | size | created | removed | backup_snap_id | swift_id | sechost_id | prev_snap_id | hypervisor_type | version | s3_id | min_iops | max_iops |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
93 | 1 | 2 | 1 | 4774 | 19 | BackedUp | NULL | S1TolgaC72Test41 | 0f84acd8-9a5c-4de5-a58a-b528b4e72839 | 0 | MANUAL | 28991029248 | 2016-10-12 10:13:42 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
94 | 1 | 2 | 1 | 4774 | 19 | BackedUp | NULL | S2TolgaC72Test41 | 51ef5246-27ae-435e-92ac-7428bab64cb8 | 0 | MANUAL | 28991029248 | 2016-10-12 10:15:36 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
95 | 1 | 2 | 1 | 4774 | 19 | Destroyed | NULL | S3TolgaC72Test41 | e5550dd6-9a43-4dd7-959a-7aabf26f0593 | 0 | MANUAL | 28991029248 | 2016-10-12 10:17:06 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
mysql> select * from snapshot_store_ref where snapshot_id in (93,94,95);
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
id | store_id | snapshot_id | created | last_updated | job_id | store_role | size | physical_size | parent_snapshot_id | install_path | state | update_count | ref_cnt | updated | volume_id |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
185 | 1 | 93 | 2016-10-12 10:13:44 | NULL | NULL | Primary | 28991029248 | 28991029248 | 0 | cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/54008bf3-43dd-469d-91a7-4acd146d7b84 | Ready | 2 | 0 | 2016-10-12 10:13:45 | 4774 |
186 | 1 | 93 | 2016-10-12 10:13:45 | NULL | NULL | Image | 28991029248 | 28991029248 | 0 | snapshots/2/4774/54008bf3-43dd-469d-91a7-4acd146d7b84 | Ready | 2 | 0 | 2016-10-12 10:15:04 | 4774 |
187 | 1 | 94 | 2016-10-12 10:15:38 | NULL | NULL | Primary | 28991029248 | 28991029248 | 0 | cst4/bb9ca3c7-96d6-4465-85b5-cd01f4d635f2/45fc4f44-b377-49c0-9264-5d813fefe93f | Ready | 2 | 0 | 2016-10-12 10:15:39 | 4774 |
188 | 1 | 94 | 2016-10-12 10:15:39 | NULL | NULL | Image | 28991029248 | 28991029248 | 0 | snapshots/2/4774/45fc4f44-b377-49c0-9264-5d813fefe93f | Ready | 2 | 0 | 2016-10-12 10:16:52 | 4774 |
190 | 1 | 95 | 2016-10-12 10:17:09 | NULL | NULL | Image | 28991029248 | 28991029248 | 0 | snapshots/2/4774/1c1890bf-5a55-4060-bf3f-cf064629dcdf | Destroyed | 4 | 0 | 2016-10-12 10:48:31 | 4774 |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
mysql>
ERROR on Cloudstack Management Server: Failed to delete snapshot
2016-10-12 13:54:02,911 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] (API-Job-Executor-70:ctx-06858bab job-34505 ctx-a6bc52fb) (logid:a2b92099) Failed to delete snapshot:
com.cloud.utils.exception.CloudRuntimeException: Failed to remove snapshot
at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.deleteSnapshot(SnapshotServiceImpl.java:396)
at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.deleteSnapshot(XenserverSnapshotStrategy.java:272)
at com.cloud.storage.snapshot.SnapshotManagerImpl.deleteSnapshot(SnapshotManagerImpl.java:454)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at com.sun.proxy.$Proxy192.deleteSnapshot(Unknown Source)
at org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd.execute(DeleteSnapshotCmd.java:103)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:150)
at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-10-12 13:54:02,918 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-70:ctx-06858bab job-34505) (logid:a2b92099) Complete async job-34505, jobStatus: FAILED, resultCode: 530,
result: org.apache.cloudstack.api.response.ExceptionResponse/null/
ERROR on KVM Agent
2016-10-12 13:49:38,227 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2'
2016-10-12 13:49:38,227 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-3:null) Refreshing storage pool 01b375db-d3f5-33c1-9389-8bf226c887e8
2016-10-12 13:49:41,253 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:2
2016-10-12 13:51:53,819 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:6
2016-10-12 13:52:26,858 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:7
2016-10-12 13:53:33,390 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-3:null) Failed to find volume:bb9ca3c7-96d6-4465-85b5-cd01f4d635f2 due tocom.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2', retry:9
2016-10-12 13:54:03,390 ERROR [kvm.storage.KVMStorageProcessor] (agentRequest-Handler-3:null) Failed to remove snapshot , with exception: com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Could not find volume bb9ca3c7-96d6-4465-85b5-cd01f4d635f2: Storage volume not found: no storage vol with matching name 'bb9ca3c7-96d6-4465-85b5-cd01f4d635f2'
Thanks
Özhan
Attachments
Issue Links
- links to