Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
4.3.0, 4.4.0, 4.5.0, Future
-
Security Level: Public (Anyone can view this level - this is the default.)
-
None
-
OS: CentOS 6.5, CloudStack Version: 4.3.0, Hypervisor: KVM, Primary Storage: Local Storage, Secondary Storage: S3(RADOS), Zone: Advanced
Description
When we create some instances in parallel in an environment shown above, an exception like (2) happens. After that, reference count of a cache in NFS Secondary Staging Store(SSS) becomes negative(3).
In this situcation, we can't use a template used for creating the instances because negative reference count will cause another
exception(4). Furthermore, template caches in NFS SSS aren't cleaned up.
I think this is related to CLOUDSTACK-6236. I'm using CloudStack 4.3.0 branch. The code was applied a patch of CLOUDSTACK-6236 to and also fixed about volume reference count setter problem by myself. But the reference count problem still happens.
Conditions to trigger
- A cache of a template isn't on NFS SSS.
- Choose compute offering that occupies all resources in a host.(ex. Use all cores and almost all memory)
- Create multiple instances (1).
Other behaviors
- Management server inserts multiple entries for template cache(ImageCache) to cloud.template_store_ref.
- SSVM probably downloads same template from S3 and creates multiple caches to NFS SSS concurrently. Sometimes, SSVM retries download and cache creation.
#! /bin/bash COMPUTE_OFFERING="b6fc0598-6903-48cc-b894-ead3bb0e39f5" TEMPLATE="ef1d5a8e-5951-4036-a236-fe2d47224fe4" ZONE="23156857-4722-4fc4-86d4-a12ca1028197" NETWORK="4ba56179-f7b9-4560-a00e-80c946856ff8" KEYPAIR=mykey NAME1="test-template-error-0003" NAME2="test-template-error-0004" cloudmonkey deploy virtualmachine serviceofferingid=${COMPUTE_OFFERING} templateid=${TEMPLATE} zoneid=${ZONE} networkids=${NETWORK} displayname=${NAME1} name=${NAME1} keypair=${KEYPAIR} sleep 1 cloudmonkey deploy virtualmachine serviceofferingid=${COMPUTE_OFFERING} templateid=${TEMPLATE} zoneid=${ZONE} networkids=${NETWORK} displayname=${NAME2} name=${NAME2} keypair=${KEYPAIR}
2014-09-10 17:22:51,249 DEBUG [c.c.a.t.Request] (AgentManager-Handler-5:null) Seq 171-1789133096: Processing: { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/3/330/22e69a40-0916-4d11-a07a-63d38e76887f.qcow2","id":0,"accountId":0,"hvm":false,"name":"22e69a40-0916-4d11-a07a-63d38e76887f.qcow2","size":14340259840,"physicalSize":14340259840}},"result":true,"wait":0}}] } 2014-09-10 17:22:51,249 DEBUG [c.c.a.t.Request] (Job-Executor-162:ctx-4a2eb345 ctx-fbf59f27) Seq 171-1789133096: Received: { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, { CopyCmdAnswer } } 2014-09-10 17:22:51,257 DEBUG [o.a.c.s.i.s.TemplateObject] (Job-Executor-162:ctx-4a2eb345 ctx-fbf59f27) failed to update state com.cloud.utils.fsm.NoTransitionException: Unable to transition to a new state from Allocated via OperationSuccessed at com.cloud.utils.fsm.StateMachine2.getNextState(StateMachine2.java:83) at com.cloud.utils.fsm.StateMachine2.transitTo(StateMachine2.java:100) at org.apache.cloudstack.storage.datastore.ObjectInDataStoreManagerImpl.update(ObjectInDataStoreManagerImpl.java:297) at org.apache.cloudstack.storage.image.store.TemplateObject.processEvent(TemplateObject.java:225) at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:240) at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:267) at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyObject(AncientDataMotionStrategy.java:165) at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:428) at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70) at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createBaseImageAsync(VolumeServiceImpl.java:473) at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:604) at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1192) at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1256) at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:963) at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761) at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:601) at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:228) at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:207) at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3554) at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3161) at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3147) at sun.reflect.GeneratedMethodAccessor463.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:622) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) at com.sun.proxy.$Proxy170.startVirtualMachine(Unknown Source) at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:443) at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161) at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109) at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63) at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509) at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:701)
*************************** 166. row *************************** id: 431 store_role: Image template_id: 330 ref_cnt: 0 *************************** 167. row *************************** id: 432 store_role: ImageCache template_id: 330 ref_cnt: 0 *************************** 168. row *************************** id: 433 store_role: ImageCache template_id: 330 ref_cnt: 0 *************************** 169. row *************************** id: 434 store_role: ImageCache template_id: 330 ref_cnt: 0 *************************** 170. row *************************** id: 435 store_role: ImageCache template_id: 330 ref_cnt: 1 *************************** 171. row *************************** id: 436 store_role: ImageCache template_id: 330 ref_cnt: 0 *************************** 172. row *************************** id: 437 store_role: ImageCache template_id: 330 ref_cnt: 18446744073709551615 *************************** 173. row *************************** id: 438 store_role: ImageCache template_id: 330 ref_cnt: 0
2014-09-08 17:42:48,144 DEBUG [c.c.a.t.Request] (AgentManager-Handler-6:null) Seq 171-1789133066: Processing: { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/3/320/07360420-1206-4054-946d-86567b646f98.qcow2","id":0,"accountId":0,"hvm":false,"name":"07360420-1206-4054-946d-86567b646f98.qcow2","size":14340653056,"physicalSize":14340653056}},"result":true,"wait":0}}] } 2014-09-08 17:42:48,144 DEBUG [c.c.a.t.Request] (Job-Executor-107:ctx-41d757d1 ctx-5fbbe761) Seq 171-1789133066: Received: { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, { CopyCmdAnswer } } 2014-09-08 17:42:48,146 DEBUG [o.a.c.s.i.s.TemplateObject] (Job-Executor-107:ctx-41d757d1 ctx-5fbbe761) failed to process event and answer com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@307febb8: SELECT template_store_ref.id, template_store_ref.store_id, template_store_ref.template_id, template_store_ref.store_role, template_store_ref.created, template_store_ref.last_updated, template_store_ref.download_pct, template_store_ref.size, template_store_ref.physical_size, template_store_ref.download_state, template_store_ref.local_path, template_store_ref.error_str, template_store_ref.job_id, template_store_ref.install_path, template_store_ref.url, template_store_ref.is_copy, template_store_ref.destroyed, template_store_ref.update_count, template_store_ref.updated, template_store_ref.state, template_store_ref.ref_cnt FROM template_store_ref WHERE template_store_ref.store_id = 1 AND template_store_ref.template_id = 320 AND template_store_ref.destroyed = 0 ORDER BY RAND() LIMIT 1 at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:421) at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:356) at com.cloud.utils.db.GenericDaoBase.findOneIncludingRemovedBy(GenericDaoBase.java:869) at org.apache.cloudstack.storage.image.db.TemplateDataStoreDaoImpl.findByStoreTemplate(TemplateDataStoreDaoImpl.java:320) at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:622) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:33) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) at com.sun.proxy.$Proxy109.findByStoreTemplate(Unknown Source) at org.apache.cloudstack.storage.image.store.TemplateObject.processEvent(TemplateObject.java:202) at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:240) at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:267) at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyObject(AncientDataMotionStrategy.java:165) at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:428) at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70) at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createBaseImageAsync(VolumeServiceImpl.java:473) at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:604) at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1192) at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1256) at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:963) at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761) at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:601) at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:228) at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:207) at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3554) at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3161) at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3147) at sun.reflect.GeneratedMethodAccessor463.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:622) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) at com.sun.proxy.$Proxy170.startVirtualMachine(Unknown Source) at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:443) at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161) at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109) at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63) at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509) at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:701) Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLDataException: '18446744073709551615' in column '21' is outside valid range for the datatype BIGINT. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:534) at com.mysql.jdbc.Util.handleNewInstance(Util.java:411) at com.mysql.jdbc.Util.getInstance(Util.java:386) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1026) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927) at com.mysql.jdbc.ResultSetImpl.throwRangeException(ResultSetImpl.java:7964) at com.mysql.jdbc.ResultSetImpl.parseLongAsDouble(ResultSetImpl.java:7248) at com.mysql.jdbc.ResultSetImpl.getLong(ResultSetImpl.java:2946) at com.mysql.jdbc.ResultSetImpl.getLong(ResultSetImpl.java:2911) at org.apache.commons.dbcp.DelegatingResultSet.getLong(DelegatingResultSet.java:228) at org.apache.commons.dbcp.DelegatingResultSet.getLong(DelegatingResultSet.java:228) at com.cloud.utils.db.GenericDaoBase.setField(GenericDaoBase.java:541) at com.cloud.utils.db.GenericDaoBase.setField(GenericDaoBase.java:1702) at com.cloud.utils.db.GenericDaoBase.toEntityBean(GenericDaoBase.java:1582) at com.cloud.utils.db.GenericDaoBase.toEntityBean(GenericDaoBase.java:1543) at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:417) ... 65 more