Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Won't Fix
4.2.0
Security Level: Public (Anyone can view this level - this is the default.)
None
Environment: Simulator environment with large scale set up
Description
This is mostly similar to CLOUDSTACK-3441 and CLOUDSTACK-4179. Both of those issues were fixed and verified in comparatively smaller environments, with 4K and 8K hosts and 12K VMs.
I am now trying a much larger infrastructure with 20K hosts, 20K clusters and 2K pods. This is also a special case in which we are trying to deploy one VM on each host.
I am seeing delays both while acquiring the network lock and during deployment planning.
(An ERROR was also observed in the log during deployment.)
Log snippet:
2013-09-02 22:40:52,335 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
2013-09-02 22:40:57,544 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Removing from the clusterId list these clusters from avoid set: []
..
..
2013-09-02 22:41:05,637 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Changing active number of nics for network id=204 on 1
2013-09-02 22:41:05,690 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Asking VirtualRouter to prepare for Nic[2246-1407-0d530dd3-3f25-4fde-b1fb-9ff9188f89e6-172.4.211.191]
2013-09-02 22:51:04,680 ERROR [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Failed to start instance VM[User|414aa09b-a38c-4b30-bf9c-f1d9fe51134f]
2013-09-02 22:51:04,702 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Cleaning up resources for the vm VM[User|414aa09b-a38c-4b30-bf9c-f1d9fe51134f] in Starting state
..
..
2013-09-02 22:51:17,018 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Changing active number of nics for network id=204 on 1
2013-09-02 22:51:17,074 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Asking VirtualRouter to prepare for Nic[2246-1407-159bacce-8663-477e-ab37-2d1081c0630b-172.4.211.191]
2013-09-02 22:57:56,139 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Lock is acquired for network id 204 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(975)-Cluster(9749)-Host(9750)-Storage(Volume(1407|ROOT-->Pool(9749))]
2013-09-02 22:57:56,144 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-336:job-999 = [ f437e46a-dfa4-4cea-a518-7da2f5360a89 ]) Lock is released for network id 204 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(975)-Cluster(9749)-Host(9750)-Storage(Volume(1407|ROOT-->Pool(9749))]
..
..
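To make the delays in the snippet above easier to spot, here is a minimal Python sketch that reports the time gap between consecutive log entries. It is not part of CloudStack; the helper names (parse_timestamp, find_gaps) and the 60-second threshold are assumptions chosen for illustration, and the sample messages are shortened copies of four lines from the log above. Because parts of the log are elided with "..", a gap measured across an elision is only an upper bound on any single step.

```python
from datetime import datetime

# Timestamp layout used at the start of each log line above,
# e.g. "2013-09-02 22:40:52,335" (23 characters, milliseconds after the comma).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = 23


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TIMESTAMP_LENGTH], LOG_TIME_FORMAT)
    except ValueError:
        # Elision markers ("..") and wrapped continuation lines land here.
        return None


def find_gaps(lines, threshold_seconds=60.0):
    """Yield (gap_seconds, previous_line, next_line) for gaps >= threshold."""
    previous = None
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if gap >= threshold_seconds:
                yield gap, previous[1], line
        previous = (stamp, line)


# Four entries from the snippet above, with the messages shortened.
SAMPLE_LINES = [
    "2013-09-02 22:41:05,690 DEBUG Asking VirtualRouter to prepare for Nic",
    "2013-09-02 22:51:04,680 ERROR Failed to start instance VM",
    "..",
    "2013-09-02 22:51:17,074 DEBUG Asking VirtualRouter to prepare for Nic",
    "2013-09-02 22:57:56,139 DEBUG Lock is acquired for network id 204",
]

for gap, before, after in find_gaps(SAMPLE_LINES):
    print(f"{gap:.3f}s between:\n  {before}\n  {after}")
```

Run against the full management server log, filtered to one job id, the same loop should show where a job spends its time, for example between the "Asking VirtualRouter to prepare" entry and the "Lock is acquired" entry.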