Uploaded image for project: 'CloudStack'
  1. CloudStack
  2. CLOUDSTACK-7539

[S3] Parallel deployment makes reference count of a cache in nfs secondary staging store negative(-1)

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.3.0, 4.4.0, 4.5.0, Future
    • Fix Version/s: 4.5.2, 4.6.0
    • Security Level: Public (Anyone can view this level - this is the default.)
    • Labels:
      None
    • Environment:
      OS: CentOS 6.5, CloudStack Version: 4.3.0, Hypervisor: KVM, Primary Storage: Local Storage, Secondary Storage: S3(RADOS), Zone: Advanced

      Description

      When we create some instances in parallel in an environment shown above, an exception like (2) happens. After that, reference count of a cache in NFS Secondary Staging Store(SSS) becomes negative(3).

      In this situcation, we can't use a template used for creating the instances because negative reference count will cause another
      exception(4). Furthermore, template caches in NFS SSS aren't cleaned up.

      I think this is related to CLOUDSTACK-6236. I'm using CloudStack 4.3.0 branch. The code was applied a patch of CLOUDSTACK-6236 to and also fixed about volume reference count setter problem by myself. But the reference count problem still happens.

      Conditions to trigger

      • A cache of a template isn't on NFS SSS.
      • Choose compute offering that occupies all resources in a host.(ex. Use all cores and almost all memory)
      • Create multiple instances (1).

      Other behaviors

      • Management server inserts multiple entries for template cache(ImageCache) to cloud.template_store_ref.
      • SSVM probably downloads same template from S3 and creates multiple caches to NFS SSS concurrently. Sometimes, SSVM retries download and cache creation.

      (1)

      #! /bin/bash
      
      COMPUTE_OFFERING="b6fc0598-6903-48cc-b894-ead3bb0e39f5"
      TEMPLATE="ef1d5a8e-5951-4036-a236-fe2d47224fe4"
      ZONE="23156857-4722-4fc4-86d4-a12ca1028197"
      NETWORK="4ba56179-f7b9-4560-a00e-80c946856ff8"
      KEYPAIR=mykey
      
      NAME1="test-template-error-0003"
      NAME2="test-template-error-0004"
      
      cloudmonkey deploy virtualmachine serviceofferingid=${COMPUTE_OFFERING} templateid=${TEMPLATE} zoneid=${ZONE} networkids=${NETWORK} displayname=${NAME1} name=${NAME1} keypair=${KEYPAIR}
      sleep 1
      cloudmonkey deploy virtualmachine serviceofferingid=${COMPUTE_OFFERING} templateid=${TEMPLATE} zoneid=${ZONE} networkids=${NETWORK} displayname=${NAME2} name=${NAME2} keypair=${KEYPAIR}
      
      

      (2)

      2014-09-10 17:22:51,249 DEBUG [c.c.a.t.Request] (AgentManager-Handler-5:null) Seq 171-1789133096: Processing:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/3/330/22e69a40-0916-4d11-a07a-63d38e76887f.qcow2","id":0,"accountId":0,"hvm":false,"name":"22e69a40-0916-4d11-a07a-63d38e76887f.qcow2","size":14340259840,"physicalSize":14340259840}},"result":true,"wait":0}}] }
      2014-09-10 17:22:51,249 DEBUG [c.c.a.t.Request] (Job-Executor-162:ctx-4a2eb345 ctx-fbf59f27) Seq 171-1789133096: Received:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, { CopyCmdAnswer } }
      2014-09-10 17:22:51,257 DEBUG [o.a.c.s.i.s.TemplateObject] (Job-Executor-162:ctx-4a2eb345 ctx-fbf59f27) failed to update state
      com.cloud.utils.fsm.NoTransitionException: Unable to transition to a new state from Allocated via OperationSuccessed
              at com.cloud.utils.fsm.StateMachine2.getNextState(StateMachine2.java:83)
              at com.cloud.utils.fsm.StateMachine2.transitTo(StateMachine2.java:100)
              at org.apache.cloudstack.storage.datastore.ObjectInDataStoreManagerImpl.update(ObjectInDataStoreManagerImpl.java:297)
              at org.apache.cloudstack.storage.image.store.TemplateObject.processEvent(TemplateObject.java:225)
              at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:240)
              at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:267)
              at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyObject(AncientDataMotionStrategy.java:165)
              at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:428)
              at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
              at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createBaseImageAsync(VolumeServiceImpl.java:473)
              at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:604)
              at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1192)
              at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1256)
              at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:963)
              at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761)
              at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:601)
              at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:228)
              at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:207)
              at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3554)
              at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3161)
              at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3147)
              at sun.reflect.GeneratedMethodAccessor463.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:622)
              at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
              at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
              at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
              at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
              at com.sun.proxy.$Proxy170.startVirtualMachine(Unknown Source)
              at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:443)
              at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
              at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
              at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
              at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
              at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
              at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
              at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
              at java.util.concurrent.FutureTask.run(FutureTask.java:166)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:701)
      
      

      (3)

      *************************** 166. row ***************************
               id: 431
       store_role: Image
      template_id: 330
          ref_cnt: 0
      *************************** 167. row ***************************
               id: 432
       store_role: ImageCache
      template_id: 330
          ref_cnt: 0
      *************************** 168. row ***************************
               id: 433
       store_role: ImageCache
      template_id: 330
          ref_cnt: 0
      *************************** 169. row ***************************
               id: 434
       store_role: ImageCache
      template_id: 330
          ref_cnt: 0
      *************************** 170. row ***************************
               id: 435
       store_role: ImageCache
      template_id: 330
          ref_cnt: 1
      *************************** 171. row ***************************
               id: 436
       store_role: ImageCache
      template_id: 330
          ref_cnt: 0
      *************************** 172. row ***************************
               id: 437
       store_role: ImageCache
      template_id: 330
          ref_cnt: 18446744073709551615
      *************************** 173. row ***************************
               id: 438
       store_role: ImageCache
      template_id: 330
          ref_cnt: 0
      
      

      (4)

      2014-09-08 17:42:48,144 DEBUG [c.c.a.t.Request] (AgentManager-Handler-6:null) Seq 171-1789133066: Processing:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/3/320/07360420-1206-4054-946d-86567b646f98.qcow2","id":0,"accountId":0,"hvm":false,"name":"07360420-1206-4054-946d-86567b646f98.qcow2","size":14340653056,"physicalSize":14340653056}},"result":true,"wait":0}}] }
      2014-09-08 17:42:48,144 DEBUG [c.c.a.t.Request] (Job-Executor-107:ctx-41d757d1 ctx-5fbbe761) Seq 171-1789133066: Received:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, { CopyCmdAnswer } }
      2014-09-08 17:42:48,146 DEBUG [o.a.c.s.i.s.TemplateObject] (Job-Executor-107:ctx-41d757d1 ctx-5fbbe761) failed to process event and answer
      com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@307febb8: SELECT template_store_ref.id, template_store_ref.store_id, template_store_ref.template_id, template_store_ref.store_role, template_store_ref.created, template_store_ref.last_updated, template_store_ref.download_pct, template_store_ref.size, template_store_ref.physical_size, template_store_ref.download_state, template_store_ref.local_path, template_store_ref.error_str, template_store_ref.job_id, template_store_ref.install_path, template_store_ref.url, template_store_ref.is_copy, template_store_ref.destroyed, template_store_ref.update_count, template_store_ref.updated, template_store_ref.state, template_store_ref.ref_cnt FROM template_store_ref WHERE template_store_ref.store_id = 1  AND template_store_ref.template_id = 320  AND template_store_ref.destroyed = 0  ORDER BY RAND() LIMIT 1
              at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:421)
              at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:356)
              at com.cloud.utils.db.GenericDaoBase.findOneIncludingRemovedBy(GenericDaoBase.java:869)
              at org.apache.cloudstack.storage.image.db.TemplateDataStoreDaoImpl.findByStoreTemplate(TemplateDataStoreDaoImpl.java:320)
              at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:622)
              at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
              at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:33)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
              at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
              at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
              at com.sun.proxy.$Proxy109.findByStoreTemplate(Unknown Source)
              at org.apache.cloudstack.storage.image.store.TemplateObject.processEvent(TemplateObject.java:202)
              at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:240)
              at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:267)
              at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyObject(AncientDataMotionStrategy.java:165)
              at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:428)
              at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
              at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createBaseImageAsync(VolumeServiceImpl.java:473)
              at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:604)
              at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1192)
              at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1256)
              at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:963)
              at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761)
              at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:601)
              at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:228)
              at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:207)
              at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3554)
              at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3161)
              at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3147)
              at sun.reflect.GeneratedMethodAccessor463.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:622)
              at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
              at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
              at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
              at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
              at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
              at com.sun.proxy.$Proxy170.startVirtualMachine(Unknown Source)
              at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:443)
              at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
              at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
              at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
              at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
              at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
              at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
              at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
              at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
              at java.util.concurrent.FutureTask.run(FutureTask.java:166)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:701)
      Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLDataException: '18446744073709551615' in column '21' is outside valid range for the datatype BIGINT.
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
              at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
              at com.mysql.jdbc.Util.getInstance(Util.java:386)
              at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1026)
              at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
              at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
              at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
              at com.mysql.jdbc.ResultSetImpl.throwRangeException(ResultSetImpl.java:7964)
              at com.mysql.jdbc.ResultSetImpl.parseLongAsDouble(ResultSetImpl.java:7248)
              at com.mysql.jdbc.ResultSetImpl.getLong(ResultSetImpl.java:2946)
              at com.mysql.jdbc.ResultSetImpl.getLong(ResultSetImpl.java:2911)
              at org.apache.commons.dbcp.DelegatingResultSet.getLong(DelegatingResultSet.java:228)
              at org.apache.commons.dbcp.DelegatingResultSet.getLong(DelegatingResultSet.java:228)
              at com.cloud.utils.db.GenericDaoBase.setField(GenericDaoBase.java:541)
              at com.cloud.utils.db.GenericDaoBase.setField(GenericDaoBase.java:1702)
              at com.cloud.utils.db.GenericDaoBase.toEntityBean(GenericDaoBase.java:1582)
              at com.cloud.utils.db.GenericDaoBase.toEntityBean(GenericDaoBase.java:1543)
              at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:417)
              ... 65 more
      

        Attachments

          Activity

            People

            • Assignee:
              dahn Daan
              Reporter:
              hiroki-o Hiroki Ohashi
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: