Virtualizor now supports High Availability for KVM Virtualization.
- CentOS 7.x or AlmaLinux 8.x or Ubuntu 20.04 / 22.04
- Shared Storage to create the VPS disks.
- Shared mount point for KVM XML configuration files at /etc/libvirt
- At least four nodes, including the Virtualizor Master, to create an HA cluster with Virtualizor KVM (for reliable quorum).
- A shared IP pool among the HA server group, so that the same IP keeps working on the server the VM is migrated to on failure.
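The shared-storage prerequisites above can be sanity-checked from a shell. A minimal sketch using the `mountpoint` utility from util-linux (the demo call uses `/` because it is always a mount point; whether /etc/libvirt is a separate mount depends on your setup):

```shell
# Sketch: check whether a path is a separate mount point, e.g. the
# shared /etc/libvirt mount required for the HA cluster.
# `mountpoint -q` (util-linux) exits 0 if the path is a mount point.
check_shared_mount() {
  if mountpoint -q "$1"; then
    echo "$1 is a mount point"
  else
    echo "$1 is NOT a mount point"
  fi
}

check_shared_mount /    # '/' is always a mount point on Linux
```

Run the same check against /etc/libvirt and the VPS disk path on each node before creating the cluster.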
- Available since version 2.9.9.
Log in to the Virtualizor Master with the server's root details.
Click on Servers -> Add Server Groups and check the High Availability checkbox to enable High Availability for the server group.
NOTE: You MUST add the server group with High Availability enabled before adding slave servers to the HA cluster. Otherwise, Virtualizor will not be able to create the HA cluster or install the HA utilities on new servers added under the HA-enabled server group.
Check whether the server group has HA enabled:
Click on Servers -> Server Groups/Regions
Once the HA server group is added and enabled, you are ready to add new servers to the HA group/cluster.
To add a new server under an HA server group, select the HA-enabled server group while adding the new server.
Once you have entered all the information for the new server, click on Add Server.
You can check the installation progress in the task wizard.
If the server has HA enabled, the VM will automatically be created with HA enabled.
NOTE: The above option (High Availability) is shown only if the selected server is under an HA-enabled server group.
Once you have created/added a server with HA enabled, you can monitor the resources created on the HA cluster.
To check resources and nodes, go to Admin Panel -> Virtual Servers -> High Availability
You can select the HA-enabled group from the dropdown, and it will fetch the status of that cluster.
Perform a failover with the following steps:
# pcs status
Cluster name: HA_Group_1
Stack: corosync
Current DC: ha2 (version 1.1.20-5.el7_7.1-3c4c782f70) - partition with quorum
Last updated: Wed Mar 11 04:13:42 2020
Last change: Fri Feb 7 02:09:16 2020 by root via crm_resource on ha3

3 nodes configured
1 resource configured

Online: [ ha2 ha3 ha4 ]

Full list of resources:

 resource_v1001_4csljb16ihzegay (ocf::heartbeat:VirtualDomain): Started ha2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
You can see that the status of the v1001 resource is Started on a particular node (in this example, ha2).
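If a monitoring script needs to know which node a resource is running on, it can be parsed out of the resource line shown above. A minimal sketch over that sample line (the resource name is taken from this example; in practice you would feed it live `pcs status` output):

```shell
# Sketch: extract the node a resource is started on from a `pcs status`
# resource line (sample text copied from the output above).
line='resource_v1001_4csljb16ihzegay (ocf::heartbeat:VirtualDomain): Started ha2'

# The node name is the last field of a line containing "Started".
node=$(echo "$line" | awk '/Started/ {print $NF}')
echo "$node"    # prints: ha2
```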
Shut down Pacemaker and Corosync on that machine to trigger a failover:
# pcs cluster stop ha2
A cluster command such as pcs cluster stop nodename can be run from any node in the cluster, not just the affected node.
Verify that Pacemaker and Corosync are no longer running on the ha2 server:
# pcs status
Error: cluster is not currently running on this node
Go to the other node, and check the cluster status :
# pcs status
Cluster name: HA_Group_1
Stack: corosync
Current DC: ha2 (version 1.1.20-5.el7_7.1-3c4c782f70) - partition with quorum
Last updated: Wed Mar 11 07:30:09 2020
Last change: Fri Feb 7 02:09:16 2020 by root via crm_resource on ha3

3 nodes configured
1 resource configured

Online: [ ha3 ha4 ]

Full list of resources:

 resource_v1001_4csljb16ihzegay (ocf::heartbeat:VirtualDomain): Started ha3

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
Notice that v1001 is now running on ha3.
The failover happened automatically, and no errors were reported.
You can also view it in Admin Panel -> Virtual Servers -> High Availability
Check whether the pcsd service is running:
systemctl status pcsd.service
Use corosync-cfgtool to check whether cluster communication is active.
The pcs status command should always show "partition with quorum", and no STONITH-related errors should appear; otherwise high availability may not work correctly.
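These two checks can be combined into a small script. A sketch that greps saved `pcs status` output for quorum and for STONITH errors (the one-line sample text here is hypothetical; in practice capture the real output with `pcs_out=$(pcs status)`):

```shell
# Sketch: scan `pcs status` output for the two conditions noted above.
# Saved sample text stands in for real output here.
pcs_out='Current DC: ha2 (version 1.1.20-5.el7_7.1-3c4c782f70) - partition with quorum'

# 1. The cluster must report quorum.
if echo "$pcs_out" | grep -q 'partition with quorum'; then
  echo "quorum: OK"
else
  echo "quorum: LOST"
fi

# 2. There should be no STONITH (fencing) errors in the output.
if echo "$pcs_out" | grep -qi 'stonith.*error'; then
  echo "stonith: errors found"
else
  echo "stonith: clean"
fi
```

Running such a check from cron on each node gives early warning before a failover is ever needed.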