High Availability

Virtualizor now supports High Availability (HA) for KVM virtualization.
To use it on a Proxmox + Virtualizor setup, see this guide:
https://www.virtualizor.com/docs/install/install-proxmox/#ha-live-mirgration

Requirements

A fresh server running CentOS 7.x, AlmaLinux 8.x, or Ubuntu 20.04 / 22.04.
yum or apt.
Shared storage on which to create the VPS disks.
(It must be owned by qemu:qemu on CentOS/AlmaLinux hosts and by libvirt-qemu:kvm on Ubuntu hosts.)
A shared mount point for the KVM XML configuration files at /etc/libvirt on your KVM nodes.
(Save any existing data under that directory somewhere temporarily so that it can be restored after you mount the shared directory on /etc/libvirt.)
At least four nodes, including the Virtualizor master, to create an HA cluster with Virtualizor KVM (needed for a reliable quorum).
A shared IP pool for the HA server group, so that the same IP works on whichever server the VM is migrated to on failure. The domain forwarding option will not work in this case.
Virtualizor version 2.9.9 or later.
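
The ownership and /etc/libvirt requirements above can be sketched as a few shell helpers. This is only a sketch, not Virtualizor's own procedure: the function names and the backup path are hypothetical, and the mount step is left out because it depends on your shared storage. Only the owner:group pairs come from the requirements list.

```shell
# Sketch only: function names and paths are illustrative assumptions.

# Print the owner:group that the VPS disk storage should have for a host OS.
disk_owner_for_os() {
    case "$1" in
        centos|almalinux) echo "qemu:qemu" ;;
        ubuntu)           echo "libvirt-qemu:kvm" ;;
        *)                echo "unknown OS: $1" >&2; return 1 ;;
    esac
}

# Copy the contents of a config directory (for example /etc/libvirt) to a
# backup directory, preserving ownership, permissions and timestamps.
save_config_dir() {
    local src="$1" backup="$2"
    mkdir -p "$backup"
    cp -a "$src"/. "$backup"/
}

# Copy a saved backup into the (now shared) config directory.
restore_config_dir() {
    local backup="$1" dest="$2"
    cp -a "$backup"/. "$dest"/
}

# Possible usage on a node, run as root (backup path is hypothetical):
#   save_config_dir /etc/libvirt /root/libvirt-backup
#   ... mount your shared directory on /etc/libvirt ...
#   restore_config_dir /root/libvirt-backup /etc/libvirt
#   chown "$(disk_owner_for_os almalinux)" /path/to/shared/vps/storage
```

Nothing runs until the functions are called, so the block can be sourced safely and the individual steps executed one at a time.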

Installation

Log in to the Virtualizor Master with the server's root details.
Click on Servers -> Add Server Groups and check the High Availability checkbox to enable High Availability for the server group.

Note

You MUST add the server group with High Availability enabled before adding Slave servers to the High Availability cluster. Otherwise Virtualizor will not be able to set up the HA cluster and install the HA utilities on a new server that is added under the HA-enabled server group.

Check if Server Group has HA enabled.

Click on Servers -> Server Groups/Regions

Add Server in HA enabled Server Group

Once the HA server group is added and enabled, you are ready to add new servers to the HA group/cluster.

Once you have entered all the information for the new server and selected the HA-enabled server group, click on Add Server.
You can follow the installation progress in the task wizard.

Create VPS with HA Enabled

If the server has HA enabled, the VM will automatically be created with HA enabled.
NOTE: The High Availability option is shown if the selected server is under an HA-enabled server group.

Monitor HA Cluster(s)

Once you have created/added a server with HA enabled, you can monitor the resources created on that HA cluster.

To check resources and nodes, go to Admin Panel -> Virtual Servers -> High Availability.

Select the HA-enabled group from the dropdown and it will fetch the status of that cluster.

Simulating HA

Perform a failover with the following steps:

# pcs status

Cluster name: HA_Group_1
Cluster Summary:
  * Stack: corosync
  * Current DC: ha2 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Thu Mar 14 05:09:51 2024
  * Last change: Thu Mar 14 04:58:26 2024 by hacluster via crmd on ha2
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ ha2 ha3 ha4 ]

Full List of Resources:
  * resource_v1001_rcdNkRqKh27Kh5zq (ocf::heartbeat:VirtualDomain): Started ha2
  * resource_v1002_KJGYvmpJbi473XYX (ocf::heartbeat:VirtualDomain): Started ha4

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

You can see that the v1001 resource is Started on a particular node (in this example, ha2).
Shut down Pacemaker and Corosync on that node to trigger a failover:

# pcs cluster stop ha2

A cluster command such as pcs cluster stop nodename can be run from any node in the cluster, not just the affected node.
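
Looking up the node by eye works for a small cluster; the same lookup can be scripted. The following is a minimal sketch, assuming pcs status prints resource lines in the form shown in the sample output above. The function names resource_node and stop_node_hosting are hypothetical; only pcs status and pcs cluster stop come from this guide.

```shell
# Sketch only: helper names are illustrative assumptions.

# Read `pcs status` text on stdin and print the node on which the given
# resource is reported as Started. Prints nothing if it is not Started.
resource_node() {
    awk -v res="$1" '
        {
            # Expected line shape:
            #   * <resource> (ocf::heartbeat:VirtualDomain): Started <node>
            for (i = 1; i <= NF; i++)
                if ($i == res && $(NF-1) == "Started") { print $NF; exit }
        }
    '
}

# Stop cluster services on whichever node currently runs the resource.
# This calls the real pcs CLI, so it only runs when you invoke it.
stop_node_hosting() {
    local node
    node=$(pcs status | resource_node "$1")
    if [ -z "$node" ]; then
        echo "resource $1 is not Started on any node" >&2
        return 1
    fi
    pcs cluster stop "$node"
}

# Possible usage:
#   pcs status | resource_node resource_v1001_rcdNkRqKh27Kh5zq
#   stop_node_hosting resource_v1001_rcdNkRqKh27Kh5zq
```

The parser matches the resource name as a whole field, so a name that is merely a prefix of another resource will not produce a false match.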

Verify on ha2 that Pacemaker and Corosync are no longer running:

# pcs status

Error: cluster is not currently running on this node

Go to another node and check the cluster status:

# pcs status

Cluster name: HA_Group_1
Cluster Summary:
  * Stack: corosync
  * Current DC: ha2 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Thu Mar 14 05:09:51 2024
  * Last change: Thu Mar 14 04:58:26 2024 by hacluster via crmd on ha3
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ ha3 ha4 ]

Full List of Resources:
  * resource_v1001_rcdNkRqKh27Kh5zq (ocf::heartbeat:VirtualDomain): Started ha3
  * resource_v1002_KJGYvmpJbi473XYX (ocf::heartbeat:VirtualDomain): Started ha4

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Notice that v1001 is now running on ha3.
The failover happened automatically and no errors are reported.

You can also view this in Admin Panel -> Virtual Servers -> High Availability.
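
To confirm which resources moved during a failover, the status before and after can be compared. The following is a minimal sketch, assuming you saved the pcs status output to two different files and that resource names start with resource_ as in the sample output. The function name moved_resources and the file paths are hypothetical.

```shell
# Sketch only: the function name and file paths are illustrative assumptions.

# Compare two saved `pcs status` snapshots (two different file paths) and
# print every VirtualDomain resource whose node changed, in the form:
#   <resource>: <old node> -> <new node>
moved_resources() {
    awk '
        /VirtualDomain/ && $(NF-1) == "Started" {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^resource_/) {
                    if (FILENAME == ARGV[1])
                        before[$i] = $NF          # first file: remember the node
                    else if (($i in before) && before[$i] != $NF)
                        print $i ": " before[$i] " -> " $NF
                }
            }
        }
    ' "$1" "$2"
}

# Possible usage:
#   pcs status > /tmp/before.txt     (before stopping the node)
#   pcs status > /tmp/after.txt      (afterwards, on a surviving node)
#   moved_resources /tmp/before.txt /tmp/after.txt
```

Resources that stayed on the same node produce no output, so an empty result means nothing was relocated.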

Troubleshooting HA

Check whether the pcsd service is running:

# systemctl status pcsd.service

Use corosync-cfgtool to check whether cluster communication is active:

# corosync-cfgtool -s

For High Availability to work properly, the pcs status command should always show "partition with quorum" and should not show any stonith-related errors.
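
The quorum and stonith check can be automated with a small filter over the pcs status text. This is a rough sketch under stated assumptions: the function name cluster_looks_healthy is hypothetical, and the stonith test is only a heuristic (any line that mentions stonith together with "error", "warning" or "failed"), since the exact wording of such messages is not given in this guide.

```shell
# Sketch only: the function name and the stonith pattern are assumptions.

# Read `pcs status` text on stdin. Return 0 only if the partition has quorum
# and no line looks like a stonith-related problem.
cluster_looks_healthy() {
    local status
    status=$(cat)

    if ! printf '%s\n' "$status" | grep -q 'partition with quorum'; then
        echo "pcs status does not report 'partition with quorum'" >&2
        return 1
    fi

    # Heuristic: a stonith line that also mentions an error-like keyword.
    if printf '%s\n' "$status" | grep -i 'stonith' | grep -Eqi 'error|warning|failed'; then
        echo "pcs status reports a stonith-related problem" >&2
        return 1
    fi

    return 0
}

# Possible usage:
#   pcs status | cluster_looks_healthy && echo "cluster OK"
```

A stonith device that is merely listed as a Started resource does not trip the check, because its line contains none of the error-like keywords.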

Null resource

If HA attempts to start a resource (VPS) and it fails multiple times, it sets the resource failcount to INFINITY:

# pcs resource failcount show resource_v1002_KJGYvmpJbi473XYX

Failcounts for resource 'resource_v1002_KJGYvmpJbi473XYX'
ha2 : INFINITY
ha3 : INFINITY
ha4 : INFINITY

The resource then shows up as a null resource.

To start the resource in HA again, it must first be cleaned up:

# pcs resource cleanup resource_v1002_KJGYvmpJbi473XYX

Cleaned up resource_v1002_KJGYvmpJbi473XYX on ha2
Cleaned up resource_v1002_KJGYvmpJbi473XYX on ha3
Cleaned up resource_v1002_KJGYvmpJbi473XYX on ha4
Waiting for 3 replies from the controller... OK

HA should then attempt to start the resource. It will appear as active, and the nodes will be listed for those VPSes instead of Null.
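
The check-then-clean-up sequence above can be wrapped in a small script. This is a minimal sketch, assuming the failcount output has the "node : INFINITY" shape shown earlier. The function names are hypothetical; the two pcs commands are the ones used in this section.

```shell
# Sketch only: helper names are illustrative assumptions.

# Read `pcs resource failcount show <resource>` output on stdin and print
# the nodes whose failcount is INFINITY, one per line.
nodes_with_infinite_failcount() {
    awk '$NF == "INFINITY" { sub(/:$/, "", $1); print $1 }'
}

# Clean up a resource only if at least one node reports INFINITY.
# This calls the real pcs CLI, so it only runs when you invoke it.
cleanup_if_stuck() {
    local res="$1"
    if pcs resource failcount show "$res" | nodes_with_infinite_failcount | grep -q .; then
        pcs resource cleanup "$res"
    else
        echo "no INFINITY failcount found for $res"
    fi
}

# Possible usage:
#   cleanup_if_stuck resource_v1002_KJGYvmpJbi473XYX
```

The sub() call strips a trailing colon from the node name, so the parser accepts both "ha2 : INFINITY" and "ha2: INFINITY".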
