There are some VM deployment failures happening when multiple VMs are deployed at a time, failures mainly due to NetworkModel code that iterates over all the vlans in the pod. This causes each deployVM thread to hold the global lock on Network longer and cause delays. This delay in turn causes more threads to choose same host and fail since capacity is not available on that host.
Following are some changes required to be done to reduce delays during VM deployments which in turn causes some vm deployment failures when multiple VMs are launched at a time.
- In Planner, remove the clusters that do not contain a host with matching service offering tag. This will save some iterations over clusters that dont have matching tagged host
- In NetworkModel, do not query the vlans for the pod within the loop. Also optimized the logic to query the ip/ipv6
- In DeploymentPlanningManagerImpl, do not process the affinity group if the plan has hostId provided.