Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Version: 4.3.0
- Security Level: Public
- Labels: None
- Environment: Build from 4.3
Description
Set up:
Advanced Zone with 2 KVM (RHEL 6.3) hosts.
2 NFS secondary stores set up.
Steps to reproduce the problem:
1. Deploy 5 VMs on each host, each with a 10 GB ROOT volume, so we start with 10 VMs.
2. Start concurrent snapshots of the ROOT volumes of all the VMs.
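Step 2 can be approximated with a small driver that submits one snapshot request per ROOT volume in parallel. This is only a sketch: `take_snapshot` is a hypothetical placeholder for whatever client call actually requests the snapshot, and the volume names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def take_snapshot(volume_id):
    # Hypothetical placeholder: replace with the real snapshot request
    # for the given ROOT volume. Here it only returns a label.
    return f"snapshot-of-{volume_id}"


def snapshot_all(volume_ids, max_workers=10):
    # Submit one snapshot request per volume and wait for all of them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(volume_ids, pool.map(take_snapshot, volume_ids)))


# Ten illustrative ROOT volumes, one per VM in the setup above.
root_volumes = [f"vol-{i}" for i in range(1, 11)]
results = snapshot_all(root_volumes)
print(len(results))
```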
One of the secondary stores, ss1, had its NFS server down for 1.5 hours.
The other secondary store, ss2, was always reachable.
Snapshot tasks that went to ss1 succeeded after the NFS server was brought back up (they halted temporarily while the NFS server was down and resumed once it was available again).
The first set of snapshot tasks that went to ss2 all succeeded.
However, some of the next hourly snapshot tasks failed with the following exception:
2013-12-11 16:33:22,427 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-64:ctx-9c70ad77 ctx-3d959fa6) Failed to create snapshot
com.cloud.utils.exception.CloudRuntimeException: Failed to backup snapshot: qemu-img: Could not delete snapshot '89eced14-9121-44a7-bb97-26b567795726': -2 (No such file or directory)Failed to delete snapshot 89eced14-9121-44a7-bb97-26b567795726 for path /mnt/c20ea198-e8ca-33c3-9f11-e361ec9b5532/71a5dce2-da7c-4692-8f25-ba37e5296886
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:275)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:135)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:294)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    at $Proxy161.takeSnapshot(Unknown Source)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
    at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1461)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
The copy to the secondary store succeeded; the failure happens after this.
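To make the failing step easier to read, the qemu-img part of the exception message can be picked apart. The sketch below extracts the snapshot ID and the numeric code from the message quoted in the log; the regular expression is an assumption about the message layout based on this one log line. The code -2 paired with "No such file or directory" is consistent with a negated errno value, since ENOENT is 2.

```python
import errno
import re

# qemu-img portion of the exception message, copied from the log above.
MESSAGE = (
    "qemu-img: Could not delete snapshot "
    "'89eced14-9121-44a7-bb97-26b567795726': -2 (No such file or directory)"
)

# Assumed layout: Could not delete snapshot '<id>': <code> (<text>)
PATTERN = re.compile(r"Could not delete snapshot '([^']+)': (-?\d+) \(([^)]*)\)")

match = PATTERN.search(MESSAGE)
snapshot_id = match.group(1)
code = int(match.group(2))
text = match.group(3)

# Map the negated code back to its symbolic errno name.
errno_name = errno.errorcode.get(-code, "unknown")

print(snapshot_id, code, errno_name, text)
```

The extracted snapshot ID matches the first file in the secondary-store listing that follows, which fits the observation that the copy had already completed when the delete failed.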
[root@Rack3Host5 118]# ls -ltr
total 10002852
-rw-r--r--. 1 root root 3637903360 Dec 11 20:33 89eced14-9121-44a7-bb97-26b567795726
-rw-r--r--. 1 root root 3638755328 Dec 11 21:37 b38d93db-4c14-45a7-9274-639ad95a3f29
-rw-r--r--. 1 root root 2956619776 Dec 11 22:24 452c8841-2025-41da-b6ec-49cea2a49da8
[root@Rack3Host5 118]#
The following are the volumes in the "CreatedOnPrimary" state for which the failure occurred:
113 | BackedUp | 2013-12-11 20:57:03 |
112 | BackedUp | 2013-12-11 20:57:03 |
110 | BackedUp | 2013-12-11 20:57:03 |
121 | CreatedOnPrimary | 2013-12-11 20:57:04 |
118 | CreatedOnPrimary | 2013-12-11 20:57:04 |
117 | BackedUp | 2013-12-11 20:57:04 |
116 | CreatedOnPrimary | 2013-12-11 20:57:04 |
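The affected entries can be pulled out of the listing above mechanically. This is a minimal sketch that assumes the three pipe-separated columns are an ID, a state, and a timestamp, as they appear in the rows; the function name `ids_in_state` is made up for illustration.

```python
# Rows copied from the listing above.
ROWS = """\
113 | BackedUp | 2013-12-11 20:57:03 |
112 | BackedUp | 2013-12-11 20:57:03 |
110 | BackedUp | 2013-12-11 20:57:03 |
121 | CreatedOnPrimary | 2013-12-11 20:57:04 |
118 | CreatedOnPrimary | 2013-12-11 20:57:04 |
117 | BackedUp | 2013-12-11 20:57:04 |
116 | CreatedOnPrimary | 2013-12-11 20:57:04 |
"""


def ids_in_state(rows, state="CreatedOnPrimary"):
    # Return the IDs (first column) of rows whose second column equals state.
    ids = []
    for line in rows.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) >= 3 and parts[1] == state:
            ids.append(int(parts[0]))
    return ids


print(ids_in_state(ROWS))  # [121, 118, 116]
```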
Attachments
Issue Links
- duplicates
  - CLOUDSTACK-4939 Failed to create snapshot (KVM, Multiple hosts, Sharedstorage) (Closed)
- is related to
  - CLOUDSTACK-5356 Xenserver - Failed to create snapshot when secondary store was made unavaibale for about 1.5 hour leaving behind snapshot in "CreatedOnPrimary" state. The subsequent scheduled snapshot also failed (Resolved)
  - CLOUDSTACK-5357 Xenserver - Failed to create snapshot due to "unable to destroy task(com.xensource.xenapi.Task@67d312d6) on host(23af93a0-93ff-40cb-ba11-a11d1b884d37)" when secondary store was unavaiable for 1 and 1/2 hours and then brought up.. (Resolved)