[CLOUDSTACK-9506] HA problem - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 4.2.0
Fix Version/s: None
Component/s: None
Security Level: Public (Anyone can view this level - this is the default.)
Labels:
Environment: CentOS

Description

Today our production CloudStack deployment had a problem: a compute node went into the Alert state, and the VMs and VRs on it stopped working. When we tried to move the VMs to another node, the attempt failed with no communication, and the VMs could not be moved. We had to do a lot of manual work to get everything back up, but we need HA to work so that VMs and VRs are moved when a problem is found.

We had this problem a few months ago, when a node went into the Disconnected state, and we had to do the same thing.

What we had to do was log in to the database and put the affected items into a stopped state, then delete the VR and restart the network to get the VMs back up and running. Has anyone seen this before, with HA not moving VMs quickly enough and the VMs getting stuck in this state? What can be done so that these VMs do not get into this state?
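
For anyone trying to follow the manual recovery, here is a rough sketch of the order of the steps. It only builds a plan and does not touch a database or the CloudStack API. The table and column names (`vm_instance`, `state`, `id`), the `%s` placeholder style, and the function names are assumptions for illustration, not something confirmed in this report; check them against your own schema and driver before running anything.

```python
# Sketch of the manual recovery described in this report.
# ASSUMPTIONS: the table/column names (vm_instance, state, id) and the
# "%s" placeholder style are illustrative; verify them for your setup.

def stop_stuck_vm_query(vm_id):
    """Return a parameterized UPDATE that marks one VM as Stopped."""
    sql = "UPDATE vm_instance SET state = %s WHERE id = %s"
    params = ("Stopped", vm_id)
    return sql, params


def recovery_plan(vm_id):
    """List the manual steps in the order the report describes them."""
    sql, params = stop_stuck_vm_query(vm_id)
    return [
        ("database", sql, params),
        ("virtual router", "delete the VR for the affected network", ()),
        ("network", "restart the network", ()),
    ]


if __name__ == "__main__":
    # vm_id=42 is a made-up example value.
    for target, action, params in recovery_plan(vm_id=42):
        print(target, "|", action, "|", params)
```

Keeping the SQL parameterized and separate from the other steps makes it easier to review the database change before applying it, since editing state directly bypasses the management server.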

Thank you

People

Assignee: Unassigned

Reporter: David Gil

Votes: 0

Watchers: 1

Dates

Created: 22/Sep/16 19:27

Updated: 22/Sep/16 19:27