Cloud: The Simplest Load Balancing Cluster with Failover on Windows and Linux
Evidian SafeKit brings load balancing and failover in the Cloud between two or more redundant Windows or Linux servers.
This article explains how to quickly implement a Cloud cluster without specific skills.
Note that SafeKit is a generic product on Windows and Linux.
With the same product, you can implement real-time replication and failover of any file directory and service, databases, complete Hyper-V or KVM virtual machines, Docker, Kubernetes, and Cloud applications.
This platform-agnostic solution is ideal for a partner with a critical application who wants to provide an easy-to-deploy high availability option to many customers.
This clustering solution is also recognized by our partners as the simplest to implement.
We deliver quick start templates for Amazon AWS, Microsoft Azure and Google GCP.
For other Clouds, go to the Manual Installation tab.
Click on the blue button to access the Evidian SafeKit quick start template. For each Cloud, a template is available for a mirror cluster with real-time replication and failover and for a farm cluster with load balancing and failover:
- Amazon AWS
- Microsoft Azure
- Google GCP
The load balancer must be configured to periodically send health packets to the virtual machines. For that, SafeKit provides a health check which runs inside the virtual machines and which answers OK to the load balancer when the farm module is running and NOT FOUND otherwise.
You must configure the Cloud load balancer with:
For more information, see the configuration of the Cloud load balancer.
The network security must be configured to enable communications for the following protocols and ports:
On both Windows servers
On both Linux servers
The configuration is presented with the web console connected to 2 Windows servers, but it is the same with 2 Linux servers.
Important: all the configuration must be done from a single browser.
It is recommended to use the web console in the https mode by connecting to https://<IP address of 1 VM>:9453 (next image). In this case, you must first configure the https mode by using the wizard described in the User's Guide: see "11.1 HTTPS Quick Configuration with the Configuration Wizard".
Or you can use the web console in the http mode by connecting to http://<IP address of 1 VM>:9010 (next image).
Note that you can also make a configuration with DNS names, especially if the IP addresses are not static.
Enter the IP address of the first node and click on Confirm (next image)
Click on New node and enter the IP address of the second node (next image)
Click on the red floppy disk to save the configuration (previous image)
In the Configuration tab, click on farm.safe, then enter farm as the module name and click on Confirm (next images, with farm instead of xxx)
Click on Validate (next image)
Do not configure a virtual IP address (next image) because this configuration is already made in the Cloud load balancer. This section is useful for on-premise configuration only.
If a process is defined in the Process Checker section (next image), it will be monitored with the action restart in case of failure. The services will be stopped and restarted locally on the local server if this process disappears from the list of running processes. After 3 unsuccessful local restarts, the module is stopped on the local server. As a consequence, the health check answers NOT FOUND to the Cloud load balancer and the load balancing is reconfigured to balance the traffic on the remaining servers of the farm.
start_both and stop_both (next image) contain the commands to start and stop the services.
Click on Validate (previous image)
Click on Configure (previous image)
Check the green success message on both servers and click on Next (previous image)
Start the cluster on both nodes (previous image). Check that the status becomes UP (green) - UP (green) (next image).
The cluster is operational with services running on both UP nodes (previous image).
Be careful: components which are clients of the services must be configured with the virtual IP address. The configuration can be made with a DNS name (if a DNS name has been created and associated with the virtual IP address).
Check with the Windows Microsoft Management Console (MMC) or with Linux command lines that the services are started on both UP nodes. Set the services to Boot Startup Type = Manual (SafeKit controls the start of the services).
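On Linux with systemd, disabling the automatic boot start of a service can be sketched as follows. The service name myservice is a placeholder, not a name from this guide:

```shell
# Prevent the OS from starting the service at boot;
# SafeKit's start_both script will start it instead.
# "myservice" is a hypothetical unit name.
systemctl disable myservice
systemctl stop myservice
```

On Windows, the equivalent is setting the service Startup Type to Manual in the MMC Services snap-in, or `sc config "MyService" start= demand` on a command line.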
Stop one UP node by scrolling down the menu of the node and clicking on Stop. Check that the load balancing is reconfigured so that the remaining node takes all TCP connections. And check with the Windows Microsoft Management Console (MMC) or with Linux command lines that the services are stopped on the stopped node.
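To observe the reconfiguration from a client machine, you can poll the service through the load balancer while stopping a node. The frontend address and port below are assumptions for illustration only:

```shell
# Poll the service through the Cloud load balancer once per second.
# 10.0.0.100:80 is a hypothetical frontend address, not from this guide.
# While one node is stopped, responses should keep coming from the other node.
while true; do
  curl -s -o /dev/null -w "%{http_code}\n" http://10.0.0.100:80/
  sleep 1
done
```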
More information on tests in the User's Guide
"Configure boot start" (next image, on the right side) configures the automatic start of the module when the server boots. Do this configuration on both nodes once the load balancing and failover solution is running correctly.
Note that for synchronizing SafeKit at boot and at shutdown on Windows, we assume that the following command line has been run as administrator on both nodes during installation: .\addStartupShutdown.cmd in C:\safekit\private\bin (otherwise do it now).
Read the module log to understand the reasons for a failover, a waiting state on the availability of a resource, etc.
To see the module log of node 1 (next image):
Repeat the same operation to see the module log of node 2.
Read the application log to see the output messages of the start_both and stop_both scripts.
To see the application log of node 1 (next image):
Repeat the same operation to see the application log of node 2.
More information on troubleshooting in the User's Guide
In the Advanced Configuration tab (next image), you can edit the internal files of the module: bin/start_both, bin/stop_both and conf/userconfig.xml (next image on the left side). If you make changes in the internal files here, you must apply the new configuration by a right click on the blue icon/xxx on the left side (next image): the interface will allow you to redeploy the modified files on both servers.
More information on userconfig.xml in the User's Guide
For getting support on the call desk of https://support.evidian.com, take 2 Snapshots (2 .zip files), one for each server, and upload them in the call desk tool (next image).
<!DOCTYPE safe>
<safe>
<service mode="farm" maxloop="3" loop_interval="24">
<!-- Farm topology configuration -->
<!-- Names or IP addresses on the default network are set during initialization in the console -->
<farm>
<lan name="default" />
</farm>
<!-- Software Error Detection Configuration -->
<!-- Replace
* PROCESS_NAME by the name of the process to monitor
-->
<errd polltimer="10">
<proc name="PROCESS_NAME" atleast="1" action="restart" class="both" />
</errd>
<!-- User scripts activation -->
<user nicestoptimeout="300" forcestoptimeout="300" logging="userlog" />
</service>
</safe>
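As an illustration, the <errd> section below monitors a hypothetical process named myservice.exe (the name is an assumption; replace it with the process name of your own service):

```xml
<!-- Restart the services if myservice.exe disappears
     from the list of running processes -->
<errd polltimer="10">
  <proc name="myservice.exe" atleast="1" action="restart" class="both" />
</errd>
```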
start_both.cmd on Windows
@echo off
rem Script called on all servers for starting applications
rem For logging into SafeKit log use:
rem "%SAFE%\safekit" printi | printe "message"
rem stdout goes into Application log
echo "Running start_both %*"
set res=0
rem Fill with your services start call
set res=%errorlevel%
if %res% == 0 goto end
:stop
set res=%errorlevel%
"%SAFE%\safekit" printe "start_both failed"
rem uncomment to stop SafeKit when critical
rem "%SAFE%\safekit" stop -i "start_both"
:end
stop_both.cmd on Windows
@echo off
rem Script called on all servers for stopping application
rem For logging into SafeKit log use:
rem "%SAFE%\safekit" printi | printe "message"
rem ----------------------------------------------------------
rem
rem 2 stop modes:
rem
rem - graceful stop
rem call standard application stop with net stop
rem
rem - force stop (%1=force)
rem kill application's processes
rem
rem ----------------------------------------------------------
rem stdout goes into Application log
echo "Running stop_both %*"
set res=0
rem default: no action on forcestop
if "%1" == "force" goto end
rem Fill with your services stop call
rem If necessary, uncomment to wait for the real stop of services
rem "%SAFEBIN%\sleep" 10
if %res% == 0 goto end
"%SAFE%\safekit" printe "stop_both failed"
:end
<!DOCTYPE safe>
<safe>
<service mode="farm" maxloop="3" loop_interval="24">
<!-- Farm topology configuration for the membership protocol -->
<!-- Names or IP addresses on the default network are set during initialization in the console -->
<farm>
<lan name="default" />
</farm>
<!-- Software Error Detection Configuration -->
<!-- Replace
* PROCESS_NAME by the name of the process to monitor
-->
<errd polltimer="10">
<proc name="PROCESS_NAME" atleast="1" action="restart" class="both" />
</errd>
<!-- User scripts activation -->
<user nicestoptimeout="300" forcestoptimeout="300" logging="userlog" />
</service>
</safe>
start_both on Linux
#!/bin/sh
# Script called on all servers for starting applications
# For logging into SafeKit log use:
# $SAFE/safekit printi | printe "message"
# stdout goes into Application log
echo "Running start_both $*"
res=0
# Fill with your application start call
if [ $res -ne 0 ] ; then
$SAFE/safekit printe "start_both failed"
# uncomment to stop SafeKit when critical
# $SAFE/safekit stop -i "start_both"
fi
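A filled-in sketch of the "Fill with your application start call" section. Here /bin/true stands in for a real start command so the snippet is self-contained; a real script would call something like systemctl start myservice (the service name is hypothetical):

```shell
#!/bin/sh
# Stand-in for the real application start command
/bin/true            # e.g. systemctl start myservice
res=$?
if [ $res -ne 0 ] ; then
  # a real script would log with: $SAFE/safekit printe "start_both failed"
  echo "start_both failed"
fi
echo "start_both exit code: $res"
```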
stop_both on Linux
#!/bin/sh
# Script called on all servers for stopping applications
# For logging into SafeKit log use:
# $SAFE/safekit printi | printe "message"
#----------------------------------------------------------
#
# 2 stop modes:
#
# - graceful stop
# call standard application stop
#
# - force stop ($1=force)
# kill application's processes
#
#----------------------------------------------------------
# stdout goes into Application log
echo "Running stop_both $*"
res=0
# default: no action on forcestop
[ "$1" = "force" ] && exit 0
# Fill with your application stop call
[ $res -ne 0 ] && $SAFE/safekit printe "stop_both failed"
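Similarly, a filled-in sketch of stop_both covering both stop modes. /bin/true stands in for the real stop command, and the process name in the comment is hypothetical:

```shell
#!/bin/sh
# Sketch of stop_both with the two stop modes
if [ "$1" = "force" ] ; then
  # force stop: kill the application's processes
  # e.g. pkill -9 -x myservice
  exit 0
fi
# graceful stop: stand-in for the real stop command
/bin/true            # e.g. systemctl stop myservice
res=$?
[ $res -ne 0 ] && echo "stop_both failed"
echo "stop_both exit code: $res"
```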
Several modules can be deployed on the same cluster. Thus, advanced clustering architectures can be implemented, for example combining a farm module (load balancing and failover) with a mirror module (real-time replication and failover).