
High Availability

Virtualizor now supports High Availability for KVM Virtualization.

Requirements

  • Fresh server with OS: CentOS 7.x, AlmaLinux 8.x, or Ubuntu 20.04 / 22.04
  • yum/apt
  • Shared storage to create the VPS disks.
    (Permissions: qemu:qemu on CentOS/AlmaLinux hosts and libvirt-qemu:kvm on Ubuntu hosts)
  • Shared mount point for KVM XML configuration files at /etc/libvirt on your KVM nodes.
    (Existing data under that directory must be saved somewhere temporarily so that it can be restored after mounting your shared directory on /etc/libvirt)
  • At least four nodes, including the Virtualizor master, to create an HA cluster with Virtualizor KVM (to get reliable quorum).
  • Shared IP pool among the HA server group, so that the same IP works on the other server to which the VM is migrated on failure.
  • Virtualizor version 2.9.9 or later
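The requirement to save existing data under /etc/libvirt before mounting the shared directory, and to restore it afterwards, could be handled roughly as sketched here. This is only an illustration: the helper name copy_tree, the backup path, and the mount source are assumptions and not part of Virtualizor. Note that restoring overwrites same-named files already present on the shared directory.

```shell
# Hypothetical helper (not part of Virtualizor).
# copy_tree SRC DEST: copy everything under SRC (including dotfiles) into
# DEST, preserving ownership, permissions, and timestamps.
copy_tree() {
    mkdir -p "$2" && cp -a "$1/." "$2/"
}

# Example sequence on a KVM node (paths and mount source are assumptions):
#   copy_tree /etc/libvirt /root/libvirt-backup    # save existing data
#   mount <your-shared-export> /etc/libvirt        # mount shared directory
#   copy_tree /root/libvirt-backup /etc/libvirt    # restore onto the mount
```

The helper is only defined here; the commented sequence shows the order in which it would be called around the mount step.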

Installation

  • Log in to the Virtualizor Master using the server's root details.
  • Click on Servers -> Add Server Groups and check the High Availability checkbox to enable High Availability for the Server Group.


Note
You MUST add the server group with High Availability enabled before adding Slave servers to the High Availability cluster. Otherwise, Virtualizor will not be able to add the HA cluster and install the HA utilities on a new server that is added under the HA-enabled server group.
  • To check whether the Server Group has HA enabled, click on Servers -> Server Groups/Regions.

Add Server in HA enabled Server Group

  • Once the HA server group is added and enabled, you are ready to add new servers to the HA group/cluster.

  • To add a new server under the HA Server Group, select the HA-enabled server group while adding the new server.

Once you have entered all the information for the new server and selected the HA-enabled server group, click on Add Server.
You can follow the installation progress in the task wizard.


Create VPS with HA Enabled

If the server has HA enabled, the VM will automatically be created with HA enabled.
NOTE: The High Availability option is shown if the selected server is under an HA-enabled server group.

Monitor HA Cluster(s)

Once you have created/added a server with HA enabled, you can monitor the resources created on those HA clusters.

To check resources and nodes, go to Admin Panel -> Virtual Servers -> High Availability


You can select the HA-enabled Group from the dropdown, and it will fetch the status of that cluster.

Simulating HA

Perform a failover with the following steps. First, check the current cluster status:

# pcs status
Cluster name: HA_Group_1

Cluster Summary:
  * Stack: corosync
  * Current DC: ha2 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Thu Mar 14 05:09:51 2024
  * Last change:  Thu Mar 14 04:58:26 2024 by hacluster via crmd on ha2
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ ha2 ha3 ha4 ]

Full List of Resources:
  * resource_v1001_rcdNkRqKh27Kh5zq     (ocf::heartbeat:VirtualDomain):  Started ha2
  * resource_v1002_KJGYvmpJbi473XYX     (ocf::heartbeat:VirtualDomain):  Started ha4

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

You can see that the v1001 resource is Started on a particular node (in this example, ha2).
Shut down Pacemaker and Corosync on that machine to trigger a failover:

# pcs cluster stop ha2

A cluster command such as pcs cluster stop nodename can be run from any node in the cluster, not just the affected node.

Verify that Pacemaker and Corosync are no longer running on the ha2 server by running pcs status on ha2:
# pcs status
 Error: cluster is not currently running on this node

Go to another node and check the cluster status:

# pcs status
Cluster name: HA_Group_1

Cluster Summary:
  * Stack: corosync
  * Current DC: ha2 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Thu Mar 14 05:09:51 2024
  * Last change:  Thu Mar 14 04:58:26 2024 by hacluster via crmd on ha3
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ ha3 ha4 ]

Full List of Resources:
  * resource_v1001_rcdNkRqKh27Kh5zq     (ocf::heartbeat:VirtualDomain):  Started ha3
  * resource_v1002_KJGYvmpJbi473XYX     (ocf::heartbeat:VirtualDomain):  Started ha4

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Notice that v1001 is now running on ha3.
Failover happened automatically and no errors are reported.
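To confirm where a VM ended up after the failover without reading the whole status report, the resource line can be extracted from the pcs status text. The sketch below assumes the resource naming seen in the sample output (resource_<vm id>_<suffix>) and the "Started <node>" layout of that output; the function name resource_node is hypothetical.

```shell
# resource_node VM_ID: read `pcs status` text on stdin and print the node
# on which the resource for that VM is reported as Started.
# Prints nothing if the resource is not listed or not Started.
resource_node() {
    awk -v pat="resource_$1_" '
        index($0, pat) && $(NF-1) == "Started" { print $NF }
    '
}

# Usage on a cluster node:
#   pcs status | resource_node v1001
```

Running it before and after pcs cluster stop shows whether the resource moved to a different node.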

You can also view this in Admin Panel -> Virtual Servers -> High Availability


Troubleshooting HA

Check whether the pcsd service is running:

# systemctl status pcsd.service

Use corosync-cfgtool to check whether cluster communication is active:

# corosync-cfgtool -s

The pcs status command should always show "partition with quorum" and no STONITH-related errors; otherwise high availability may not work correctly.
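The two conditions above can be checked mechanically against the pcs status text. The following is a deliberately coarse sketch: the function name cluster_healthy is hypothetical, and it treats any line mentioning "stonith" as a problem, so a cluster that lists fencing resources in its status would need a narrower pattern.

```shell
# cluster_healthy: read `pcs status` text on stdin. Succeeds (exit 0) only
# if the text reports "partition with quorum" and no line mentions stonith.
cluster_healthy() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'partition with quorum' || return 1
    if printf '%s\n' "$status" | grep -qi 'stonith'; then
        return 1
    fi
    return 0
}

# Usage on a cluster node:
#   pcs status | cluster_healthy && echo "cluster looks healthy"
```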

Null resource

If HA attempts to start a resource (VPS) and it fails multiple times, the resource failcount is set to INFINITY:

# pcs resource failcount show resource_v1002_KJGYvmpJbi473XYX

 Failcounts for resource 'resource_v1002_KJGYvmpJbi473XYX'
           ha2 : INFINITY
           ha3 : INFINITY
           ha4 : INFINITY

The resource then shows up as a null resource.

To start the resource in HA again, clean up the resource:

# pcs resource cleanup resource_v1002_KJGYvmpJbi473XYX

Cleaned up resource_v1002_KJGYvmpJbi473XYX on ha2
Cleaned up resource_v1002_KJGYvmpJbi473XYX on ha3
Cleaned up resource_v1002_KJGYvmpJbi473XYX on ha4
Waiting for 3 replies from the controller... OK

The cluster should then attempt to start the resource again. Once it starts, the resource appears as active, and the nodes are listed for those VPSes instead of Null.
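The check-then-cleanup steps above can be combined into a small helper. This is a sketch under assumptions: the function names infinity_nodes and cleanup_if_failed are hypothetical, and the parser expects "node : count" lines like those in the sample failcount output (it also accepts "node: count"). Only the pcs commands already shown in this section are used.

```shell
# infinity_nodes: read `pcs resource failcount show <resource>` text on
# stdin and print each node whose failcount is INFINITY, one per line.
infinity_nodes() {
    awk -F: 'NF == 2 {
        node = $1; count = $2
        gsub(/[ \t]/, "", node)
        gsub(/[ \t]/, "", count)
        if (count == "INFINITY") print node
    }'
}

# cleanup_if_failed RESOURCE: run a cleanup only when at least one node
# reports an INFINITY failcount. Must be run on a cluster node with pcs.
cleanup_if_failed() {
    failed=$(pcs resource failcount show "$1" | infinity_nodes)
    if [ -n "$failed" ]; then
        echo "INFINITY failcount on:" $failed
        pcs resource cleanup "$1"
    fi
}

# Usage:
#   cleanup_if_failed resource_v1002_KJGYvmpJbi473XYX
```

If the resource fails again right after the cleanup, the underlying start failure still needs to be investigated before repeating it.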
