  1. CloudStack
  2. CLOUDSTACK-8643

Helper for KVM High Availability



    • Improvement
    • Status: Closed
    • Major
    • Resolution: Won't Fix
    • None
    • Future
    • KVM, Management Server
    • Security Level: Public (Anyone can view this level - this is the default.)
    • KVM hypervisors


      When running KVM with NFS storage, all Agents write a heartbeat to the NFS share.

      Should an Agent go down, it will still be writing heartbeats, even if libvirt has died.

      Using these heartbeats the Management Server can ask other KVM Agents if the other server is still beating. If not, it can fence it.
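
      The heartbeat check described above could be sketched roughly as follows. This is only an illustration: the file format (a single Unix timestamp per host file) and the names `write_heartbeat`, `is_beating` and `max_age` are assumptions made for this sketch, not the Agent's actual implementation.

```python
import time


def write_heartbeat(path, now=None):
    """Write the current Unix timestamp to this host's heartbeat file."""
    stamp = int(time.time() if now is None else now)
    with open(path, "w") as f:
        f.write(str(stamp))


def is_beating(path, max_age=60, now=None):
    """Return True if the heartbeat file holds a timestamp newer than max_age seconds."""
    now = time.time() if now is None else now
    try:
        with open(path) as f:
            last = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable or garbled file: treat the host as not beating.
        return False
    return now - last <= max_age
```

      Another Agent with the same NFS share mounted could call `is_beating()` on a neighbour's file and report the result back to the Management Server.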

      While this works, I've also encountered scenarios where you run without NFS and still want investigators.

      My proposal would be an Agent Helper running NEXT to the Agent itself.

      A simple Python daemon running a Basic HTTP server which queries libvirt every X seconds about:

      • Running Instances
      • Storage pools

      It keeps this in memory, so that even when libvirt goes down it knows what the last state was.
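
      A minimal sketch of such a helper, under a few assumptions: it uses the libvirt-python bindings, and the names `StateCache`, `query_libvirt` and `serve`, the port 8080, the 10 second interval and the JSON field names are all made up for illustration. The point it tries to show is that a failed libvirt query leaves the last known state in place.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def query_libvirt(uri="qemu:///system"):
    """Return (running instance names, active storage pool names)."""
    import libvirt  # libvirt-python, imported lazily so the cache works without it

    conn = libvirt.openReadOnly(uri)
    try:
        domains = conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE)
        pools = conn.listAllStoragePools(libvirt.VIR_CONNECT_LIST_STORAGE_POOLS_ACTIVE)
        return [d.name() for d in domains], [p.name() for p in pools]
    finally:
        conn.close()


class StateCache:
    """Holds the last state libvirt reported, even after libvirt stops answering."""

    def __init__(self, query=query_libvirt):
        self._query = query
        self._lock = threading.Lock()
        self._state = {"instances": [], "pools": [],
                       "last_success": None, "libvirt_ok": False}

    def refresh(self):
        try:
            instances, pools = self._query()
        except Exception:
            # libvirt is down or unreachable: keep the old lists, only flag the failure.
            with self._lock:
                self._state["libvirt_ok"] = False
            return
        with self._lock:
            self._state = {"instances": instances, "pools": pools,
                           "last_success": time.time(), "libvirt_ok": True}

    def snapshot(self):
        with self._lock:
            return dict(self._state)


def make_handler(cache):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Every GET returns the cached state as JSON.
            body = json.dumps(cache.snapshot()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return Handler


def serve(cache, port=8080, interval=10):
    """Poll libvirt every `interval` seconds and serve the cache over HTTP."""
    def poll():
        while True:
            cache.refresh()
            time.sleep(interval)

    threading.Thread(target=poll, daemon=True).start()
    ThreadingHTTPServer(("", port), make_handler(cache)).serve_forever()
```

      Starting it would be a matter of calling `serve(StateCache())`. A real helper would also need access control on the HTTP endpoint, which this sketch leaves out.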

      Using the QEMU monitor sockets we can actually see if the guests we have in memory are still online.

      If they are, we simply keep the list.
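
      One possible way to do that check, sketched under assumptions: a guest is treated as online if something still accepts connections on its monitor UNIX socket. The function names and the `socket_for` lookup are hypothetical, and where the monitor sockets live depends on the libvirt version, so the path is passed in rather than hard-coded.

```python
import socket


def monitor_alive(sock_path, timeout=2.0):
    """Return True if a process is still listening on the given UNIX socket."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(sock_path)
        return True
    except OSError:
        # No such file, connection refused (stale socket) or timeout.
        return False
    finally:
        s.close()


def still_online(instances, socket_for):
    """Keep only the instances whose monitor socket still answers.

    `socket_for` is a hypothetical callable mapping an instance name
    to the path of its monitor socket.
    """
    return [name for name in instances if monitor_alive(socket_for(name))]
```

      This only opens and closes a connection and never speaks the monitor protocol. Whether an extra connection could interfere with libvirt's own use of the monitor would need to be checked before relying on it.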

      Now, if an investigator comes by and wants to know if the host is still up, it can ALSO ask the helper.

      The management server can ask the helper, but the other agents could as well.
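
      From the caller's side, asking the helper might look like the sketch below. It assumes the helper answers `GET /` with a JSON document; the port, the function names and the "up"/"unknown" verdicts are illustrative only.

```python
import http.client
import json
import urllib.request

# Talk to the helper directly, ignoring any proxy configured in the environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def ask_helper(host, port=8080, timeout=3.0):
    """Return the helper's JSON state as a dict, or None if it cannot be reached."""
    url = f"http://{host}:{port}/"
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        return None


def investigate(host, port=8080, timeout=3.0):
    """Return "up" if the helper answers, "unknown" otherwise."""
    state = ask_helper(host, port, timeout)
    if state is None:
        # No answer proves nothing by itself: fall back to other investigators.
        return "unknown"
    return "up"
```

      The same call works from the Management Server or from a neighbouring Agent, since it needs nothing but network access to the helper's port.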

      This doesn't work in all cases, e.g. where storage is lost, but an additional helper would be useful to catch scenarios where the Agent itself became unresponsive.





              Assignee: Unassigned
              Reporter: Wido den Hollander (widodh)