How does a primary/secondary virtual IP address work in the same subnet?

Case of a mirror cluster with 2 Windows or Linux servers

When both servers of a mirror cluster are in the same subnet, the virtual IP address is set on the Ethernet card of the primary server (through IP aliasing). The virtual IP address is a third IP address, in addition to the two physical IP addresses of server 1 and server 2. Note that with SafeKit, several virtual IP addresses can be set in the cluster on the same Ethernet card or on different Ethernet cards.

If server 1 is the primary server, the virtual IP address is associated with the Ethernet MAC address of server 1 in the clients' ARP caches: mac1 in the figure. If server 1 fails and a failover to server 2 occurs, SafeKit automatically sends a gratuitous ARP to update the clients' ARP caches with the Ethernet address mac2 of server 2. Clients are then reconnected to server 2, where the SafeKit clustering mechanisms have restarted the application.
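To make the ARP rerouting concrete, here is a minimal sketch of what a gratuitous ARP frame looks like and how a client cache entry changes when it is received. It only builds the bytes and does not send anything on the network. The addresses are hypothetical, and the choice of the ARP request form (opcode 1) is an assumption of this sketch: the text above does not say which form SafeKit emits.

```python
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def ip_to_bytes(ip: str) -> bytes:
    """Convert a dotted IPv4 address into 4 raw bytes."""
    return bytes(int(part) for part in ip.split("."))


def build_gratuitous_arp(virtual_ip: str, new_mac: str) -> bytes:
    """Build a broadcast ARP frame announcing virtual_ip at new_mac."""
    sender_mac = mac_to_bytes(new_mac)
    vip = ip_to_bytes(virtual_ip)
    # Ethernet header: broadcast destination, source MAC, EtherType 0x0806 (ARP).
    ethernet = b"\xff" * 6 + sender_mac + struct.pack("!H", 0x0806)
    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, opcode 1 (request form, assumed here).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += sender_mac + vip    # sender MAC and sender IP (the virtual IP)
    arp += b"\x00" * 6 + vip   # target MAC unused, target IP equals sender IP
    return ethernet + arp


def update_client_cache(cache: dict, frame: bytes) -> None:
    """Mimic a client refreshing its ARP cache from the sender fields."""
    mac = ":".join(f"{b:02x}" for b in frame[22:28])
    ip = ".".join(str(b) for b in frame[28:32])
    cache[ip] = mac


# Hypothetical addresses: mac1 of server 1 is replaced by mac2 of server 2.
cache = {"192.168.1.100": "00:11:22:33:44:01"}
frame = build_gratuitous_arp("192.168.1.100", "00:11:22:33:44:02")
update_client_cache(cache, frame)
print(cache)
```

The key property is that the sender and target IP fields both carry the virtual IP address, so every host on the subnet that already holds an entry for that address can replace the old MAC address with the new one.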

When the two servers are in remote sites, the virtual IP address mechanism described above still works, provided the servers are connected in the same subnet through an extended LAN/VLAN. This is the simplest use case for remote sites.

How does a primary/secondary virtual IP address work in different subnets?

Case of a mirror cluster with 2 Windows or Linux servers

If the servers are in different subnets, the virtual IP address can be set at the level of a load balancer. The load balancer is configured with the two physical IP addresses of the two servers in their respective subnets, and it routes the traffic to the servers according to a health check.

The health check is based on a URL managed by the SafeKit servers, which answers OK or NOT FOUND according to the status of the server. If the server is SECOND, the SafeKit health check returns NOT FOUND, so the load balancer sends no traffic to the secondary server. If the server is PRIM, the SafeKit health check returns OK, so the load balancer sends all the traffic to the primary server. In case of failover, SafeKit changes its answers to the health check and the load balancer reroutes the traffic.
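The health check logic described above can be sketched as follows. This is an illustration under stated assumptions, not the SafeKit implementation: mapping OK and NOT FOUND to HTTP 200 and 404 is assumed, the IP addresses are hypothetical, and `UNREACHABLE` is a placeholder of this sketch for a failed server (only PRIM and SECOND come from the text).

```python
from http import HTTPStatus


def health_check(state: str) -> HTTPStatus:
    """Answer OK only when the server is primary (PRIM)."""
    return HTTPStatus.OK if state == "PRIM" else HTTPStatus.NOT_FOUND


def select_targets(states: dict) -> list:
    """Return the servers the load balancer may send traffic to."""
    return [ip for ip, state in states.items()
            if health_check(state) is HTTPStatus.OK]


# Normal operation: server 1 is primary, server 2 is secondary.
before = {"10.0.1.10": "PRIM", "10.0.2.10": "SECOND"}
# After a failover: server 1 has failed and server 2 has become primary.
after = {"10.0.1.10": "UNREACHABLE", "10.0.2.10": "PRIM"}

print(select_targets(before))
print(select_targets(after))
```

In both situations exactly one server answers OK, so the load balancer always has a single target for the virtual IP address.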

This implementation is the one used in SafeKit mirror-like solutions in the Cloud: Amazon AWS, Microsoft Azure and Google GCP.

Please note that SafeKit does not provide a load balancer; it only offers health checks. The load balancer must be supplied by the network infrastructure between the two subnets.

If needed, discuss with the network team whether an extended LAN could be configured between the two subnets instead of setting up a load balancer. Moreover, when using a load balancer, it is essential to ensure that the application supports clients connecting via the load balancer's virtual IP address and that it properly handles connections arriving through the translated physical IP address assigned by the load balancer.

This issue does not arise with an extended LAN, which also provides sufficient bandwidth and appropriate latency for real-time synchronous replication without data loss.

How does a load balanced virtual IP address work in the same subnet?

Case of a farm cluster with 2 Windows or Linux servers

In a load balancing farm cluster, a virtual IP address is required to load balance client requests and to reroute clients in case of failover. In this example, we consider only two servers, but the solution works with more than two servers.

When both servers of the cluster are in the same subnet, the virtual IP address is set on the Ethernet card of both servers (IP aliasing).

In the ARP cache of clients, the virtual IP address is associated with the Ethernet MAC address of one server: mac1 of server 1 in the figure. A filter inside the kernel of server 1 receives the traffic and splits it according to the identity of the client packets (client IP address, client TCP port).

If server 1 fails, SafeKit sends a gratuitous ARP to update the clients' ARP caches with the Ethernet address mac2 of server 2. Clients are then reconnected to server 2.

How does a load balanced virtual IP address work in different subnets?

Case of a farm cluster with 2 Windows or Linux servers

If the servers are in different subnets, the virtual IP address can be set at the level of a load balancer. The load balancer is configured with the two physical IP addresses of the two servers in their subnets. It routes the traffic to the servers according to load balancing rules (client IP address, client TCP port) and according to a health check.

The health check is based on a URL managed by the SafeKit servers, which answers OK or NOT FOUND according to the status of the server. If the server is UP, the SafeKit health check returns OK; otherwise it returns NOT FOUND. In case of failover, SafeKit no longer answers OK to the health check on the failed server, so the load balancer reroutes the traffic.
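As an illustration of how a load balancer might combine the health check with a rule on the client identity, here is a small sketch. The CRC32 hash, the addresses, and the `DOWN` placeholder for a failed server are assumptions of this example; real load balancers use their own distribution algorithms.

```python
import zlib


def healthy_servers(states: dict) -> list:
    """Keep only the servers whose health check would answer OK (state UP)."""
    return sorted(ip for ip, state in states.items() if state == "UP")


def route(client_ip: str, client_port: int, states: dict):
    """Pick one healthy server from the client identity, or None if none is UP."""
    targets = healthy_servers(states)
    if not targets:
        return None
    key = f"{client_ip}:{client_port}".encode()
    return targets[zlib.crc32(key) % len(targets)]


# Both servers are UP: traffic is spread over the two subnets.
states = {"10.0.1.10": "UP", "10.0.2.10": "UP"}
print(route("203.0.113.7", 51000, states))

# Server 1 fails: every client is rerouted to server 2.
states["10.0.1.10"] = "DOWN"
print(route("203.0.113.7", 51000, states))
```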

This implementation is the one used in SafeKit farm-like solutions in the Cloud: Amazon AWS, Microsoft Azure and Google GCP.

Virtual IP address in a farm cluster

In the previous figure, the application runs on the 3 servers (3 is an example; it can be 2 or more). Users connect to a virtual IP address.

The virtual IP address is configured locally on each server in the farm cluster.

The input traffic to the virtual IP address is received by all the servers and split among them by a network filter inside each server's kernel.

SafeKit detects hardware and software failures, reconfigures network filters in the event of a failure, and offers configurable application checkers and recovery scripts.

Load balancing in a network filter

The network load balancing algorithm inside the network filter is based on the identity of the client packets (client IP address, client TCP port). Depending on the identity of the incoming client packet, only one filter in a server accepts the packet; the filters in the other servers reject it.

Once a packet is accepted by the filter on a server, only the CPU and memory of this server are used by the application that responds to the request of the client. The output messages are sent directly from the application server to the client.

If a server fails, the SafeKit membership protocol reconfigures the filters in the network load balancing cluster to re-balance the traffic on the remaining available servers.
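The accept/reject decision and the reconfiguration after a failure can be sketched as below. This is only a model of the idea: the CRC32 hash, the `Filter` class, and the `reconfigure` helper are hypothetical names for this example and do not describe the actual SafeKit kernel filter.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Filter:
    """Per-server filter: knows its own name and the current member list."""
    server: str
    members: list

    def accepts(self, client_ip: str, client_port: int) -> bool:
        """Accept the packet only if this server owns the client identity."""
        key = f"{client_ip}:{client_port}".encode()
        owner = self.members[zlib.crc32(key) % len(self.members)]
        return owner == self.server


def reconfigure(alive: list) -> list:
    """Rebuild one filter per remaining server with the same member list."""
    members = sorted(alive)
    return [Filter(name, members) for name in members]


# Every server receives the packet; exactly one filter accepts it.
filters = reconfigure(["server1", "server2", "server3"])
print([f.server for f in filters if f.accepts("10.0.0.5", 50000)])

# server2 fails: the membership shrinks and the filters are rebuilt.
filters = reconfigure(["server1", "server3"])
print([f.server for f in filters if f.accepts("10.0.0.5", 50000)])
```

Because every filter computes the same function over the same member list, no coordination is needed per packet: agreement on the membership is enough to guarantee that one server, and only one, handles each client identity.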

Stateful or stateless applications

With a stateful application, there is session affinity. The same client must be connected to the same server on multiple TCP sessions to retrieve its context on the server. In this case, the SafeKit load balancing rule is configured on the client IP address. Thus, the same client is always connected to the same server on multiple TCP sessions. And different clients are distributed across different servers in the farm.

With a stateless application, there is no session affinity. The same client can be connected to different servers in the farm on multiple TCP sessions. There is no context stored locally on a server from one session to another. In this case, the SafeKit load balancing rule is configured on the TCP client session identity. This configuration distributes sessions between servers most evenly, but it requires a TCP service without session affinity.
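The difference between the two rules comes down to which fields feed the distribution function. The sketch below assumes a CRC32 hash and uses the hypothetical rule names `client_ip` and `client_session`; these are labels for this example, not SafeKit configuration syntax.

```python
import zlib


def pick_server(client_ip: str, client_port: int, servers: list, rule: str) -> str:
    """Choose a server from the client IP alone or from the IP and TCP port."""
    if rule == "client_ip":
        key = client_ip                      # stateful: affinity per client
    elif rule == "client_session":
        key = f"{client_ip}:{client_port}"   # stateless: one choice per session
    else:
        raise ValueError(f"unknown rule: {rule}")
    return servers[zlib.crc32(key.encode()) % len(servers)]


servers = ["server1", "server2", "server3"]
ports = range(50000, 50010)

# Same client, ten TCP sessions: the client IP rule yields a single server.
sticky = {pick_server("10.0.0.5", p, servers, "client_ip") for p in ports}
# With the session rule, the same client may land on different servers.
spread = {pick_server("10.0.0.5", p, servers, "client_session") for p in ports}
print(sticky, spread)
```

With the client IP rule the source port never enters the key, which is why all sessions of one client reach the server that holds its context.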

SafeKit Solutions and Quick Installation Guides

Key differentiators of high availability at the virtual machine level or at the application level

Key differentiators of SafeKit vs Microsoft Hyper-V cluster and VMware HA

Key differentiators of a mirror cluster with replication and failover

Evidian SafeKit mirror cluster with real-time file replication and failover
3 products in 1	The SafeKit high availability software saves, on Windows and Linux, the cost of external shared or replicated storage, load balancing boxes, and enterprise editions of OS and databases. SafeKit includes all clustering features: synchronous real-time file replication, monitoring of server / network / software failures, automatic application restart, and a virtual IP address switched in case of failure to reroute clients.
Very simple configuration	The cluster configuration is very simple and made by means of application modules. New services and new replicated directories can be added to an existing application module to complete a high availability solution. All the configuration of clusters is made using a simple centralized web administration console. There is no domain controller or Active Directory to configure as with Microsoft cluster.
Synchronous replication	The real-time replication is synchronous, with no data loss on failure. This is not the case with asynchronous replication.
Fully automated failback	After a failure, when a server reboots, the replication failback procedure is fully automatic and the failed server reintegrates the cluster without stopping the application on the only remaining server. This is not the case with most replication solutions, particularly with replication at the database level: manual operations are required for resynchronizing a failed server, and the application may even be stopped on the only remaining server during the resynchronization of the failed server.
Replication of any type of data	The replication works for databases but also for any files that need to be replicated. This is not the case for replication at the database level.
File replication vs disk replication	The replication is based on file directories that can be located anywhere (even in the system disk). This is not the case with disk replication, where special application configuration is needed to put the application data in a special disk.
File replication vs shared disk	The servers can be put in two remote sites. This is not the case with shared disk solutions.
Remote sites and virtual IP address	All SafeKit clustering features work for 2 servers in remote sites. Replication requires an extended LAN type network (latency = performance of synchronous replication, bandwidth = performance of resynchronization after failure). If both servers are connected to the same IP network through an extended LAN between two remote sites, the virtual IP address of SafeKit works with rerouting at level 2. If both servers are connected to two different IP networks between two remote sites, the virtual IP address can be configured at the level of a load balancer with the "health check" of SafeKit.
Quorum and split brain	The solution works with only 2 servers. For the quorum (network isolation between both sites), a simple split brain checker to a router is offered to support a single execution of the critical application. This is not the case for most clustering solutions, where a 3rd server is required for the quorum.
Active/active cluster	The secondary server is not dedicated to the restart of the primary server. The cluster can be active-active by running 2 different mirror modules. This is not the case with a fault-tolerant system, where the secondary is dedicated to the execution of the same application synchronized at the instruction level.
Uniform high availability solution	SafeKit implements a mirror cluster with replication and failover, but it also implements a farm cluster with load balancing and failover. Thus an N-tier architecture can be made highly available and load balanced with the same solution on Windows and Linux (same installation, configuration, administration with the SafeKit console or with the command line interface). This is unique on the market. This is not the case with an architecture mixing different technologies for load balancing, replication and failover.
RTO / RPO	SafeKit implements quick application restart in case of failure: around 1 mn or less. Quick application restart is not ensured with full virtual machine replication: in case of hypervisor failure, a full VM must be rebooted on a new hypervisor, with a recovery time depending on the OS reboot, as with VMware HA or Hyper-V cluster.

Key differentiators of a farm cluster with load balancing and failover

Evidian SafeKit farm cluster with load balancing and failover
No load balancer or dedicated proxy servers or special multicast Ethernet address	The solution does not require load balancers or dedicated proxy servers above the farm for implementing load balancing. SafeKit is installed directly on the application servers in the farm. The load balancing is based on a standard virtual IP address/Ethernet MAC address and works with physical servers or virtual machines on Windows and Linux without special network configuration. This is not the case with network load balancers, with dedicated proxies on Linux, or with a specific multicast Ethernet address on Windows.
All clustering features	The solution includes all clustering features: virtual IP address, load balancing on client IP address or on sessions, monitoring of server / network / software failures, automatic application restart with a quick recovery time, and a replication option with a mirror module. This is not the case with other load balancing solutions: they are able to make load balancing, but they do not include a full clustering solution with restart scripts and automatic application restart in case of failure, and they do not offer a replication option. The cluster configuration is very simple and made by means of application modules. There is no domain controller or Active Directory to configure on Windows. The solution works on Windows and Linux.
Remote sites and virtual IP address	If servers are connected to the same IP network through an extended LAN between remote sites, the virtual IP address of SafeKit works with load balancing at level 2. If servers are connected to different IP networks between remote sites, the virtual IP address can be configured at the level of a load balancer with the help of the SafeKit health check. Thus you can implement load balancing but also all the clustering features of SafeKit, in particular monitoring and automatic recovery of the critical application on application servers.
Uniform high availability solution	SafeKit implements a farm cluster with load balancing and failover, but it also implements a mirror cluster with replication and failover. Thus an N-tier architecture can be made highly available and load balanced with the same solution on Windows and Linux (same installation, configuration, administration with the SafeKit console or with the command line interface). This is unique on the market. This is not the case with an architecture mixing different technologies for load balancing, replication and failover.

Key differentiators of the SafeKit high availability technology

Software clustering vs hardware clustering
A simple software cluster with the SafeKit package just installed on two servers	Complex hardware clustering with external storage or network load balancers
Shared nothing vs a shared disk cluster
SafeKit is a shared-nothing cluster: easy to deploy even in remote sites	A shared disk cluster is complex to deploy
Application High Availability vs Full Virtual Machine High Availability
Application HA supports hardware failure and software failure with application checkers. Quick recovery time by restarting only the application (RTO around 1 mn or less). Application HA requires defining restart scripts per application and folders to replicate (SafeKit application modules).	Full virtual machine HA supports hardware failure and some software failures like a frozen VM. VM reboot on failure and recovery time depending on the OS reboot. No restart scripts to define with full virtual machine HA (SafeKit hyperv.safe or kvm.safe modules). Hypervisors are active/active with just multiple virtual machines.
High availability vs fault tolerance
No dedicated server with SafeKit. Each server can be the failover server of the other one. Software failure with restart in another OS environment. Smooth upgrade of application and OS possible server by server (version N and N+1 can coexist).	Secondary server dedicated to the execution of the same application synchronized at the instruction level. Software exception on both servers at the same time. Smooth upgrade not possible.
Synchronous replication vs asynchronous replication
SafeKit implements real-time synchronous replication with no data loss in case of failure	With asynchronous replication, there is data loss on failure
Byte-level file replication vs block-level disk replication
SafeKit implements real-time byte-level file replication and is simply configured with application directories to replicate, even in the system disk	Block-level disk replication is complex to configure and requires putting application data in a special disk
Heartbeat, failover and quorum to avoid 2 master nodes
To avoid 2 masters, SafeKit proposes a simple split brain checker configured on a router	To avoid 2 masters, other clusters require a complex configuration with a third machine, a special quorum disk, a special interconnect
Virtual IP address primary/secondary, network load balancing, failover
No dedicated proxy servers and no special network configuration are required in a SafeKit cluster for virtual IP addresses	Special network configuration is required in other clusters for virtual IP addresses. Note that SafeKit offers a health check adapted to load balancers

VM HA with the SafeKit Hyper-V or KVM module	Application HA with SafeKit application modules

SafeKit inside 2 hypervisors: replication and failover of full VM	SafeKit inside 2 virtual or physical machines: replication and failover at application level
Replicates more data (App+OS)	Replicates only application data
Reboot of VM on hypervisor 2 if hypervisor 1 crashes. Recovery time depending on the OS reboot. VM checker and failover (virtual machine is unresponsive, has crashed, or stopped working)	Quick recovery time with restart of App on OS2 if crash of server 1. Around 1 mn or less (see RTO / RPO). Application checker and software failover
Generic solution for any application / OS	Restart scripts to be written in application modules
Works with Windows/Hyper-V and Linux/KVM but not with VMware	Platform agnostic, works with physical or virtual machines, cloud infrastructure and any hypervisor including VMware

SafeKit with the Hyper-V module or the KVM module	Microsoft Hyper-V Cluster & VMware HA

No shared disk - synchronous real-time replication instead with no data loss	Shared disk and specific external disk bay
Remote sites = no SAN for replication	Remote sites = replicated bays of disk across a SAN
No specific IT skill to configure the system (with hyperv.safe and kvm.safe)	Specific IT skills to configure the system
Note that the Hyper-V/SafeKit and KVM/SafeKit solutions are limited to replication and failover of 32 VMs.	Note that the Hyper-V built-in replication does not qualify as a high availability solution. This is because the replication is asynchronous, which can result in data loss during failures, and it lacks automatic failover and failback capabilities.

How a virtual IP address works (Windows/Linux)?

Evidian SafeKit
