eviden-logo

Evidian > Products > SafeKit: All-in-One SANless High Availability & Application Clustering Software > Amazon AWS: The Simplest High Availability Cluster with Synchronous Real-Time Replication and Failover

Amazon AWS: The Simplest High Availability Cluster with Synchronous Real-Time Replication and Failover

Evidian SafeKit

How the Evidian SafeKit software simply implements a high availability cluster in Amazon AWS?

The solution in Amazon AWS

Evidian SafeKit brings high availability in Amazon AWS between two Windows or Linux redundant servers.

This article explains how to implement quickly a Amazon AWS cluster without shared disk and without specific skills.

A generic product

Note that SafeKit is a generic product on Windows and Linux.

You can implement with the same product real-time replication and failover of any file directory and service, database, complete Hyper-V or KVM virtual machines, Docker, Kubernetes, Cloud applications (see all solutions).

Architecture

SafeKit mirror cluster with real-time replication and failover in Amazon AWS?

How it works in Amazon AWS?

  • The servers are running in different availability zones.
  • The critical application is running on the PRIM server.
  • Users are connected to a primary/secondary virtual IP address which is configured in the Amazon AWS load balancer.
  • SafeKit provides a generic health check for the load balancer.
    On the PRIM server, the health check returns OK to the load balancer and NOK on the SECOND server.
  • In each server, SafeKit monitors the critical application with process checkers and custom checkers.
  • SafeKit restarts automatically the critical application when there is a software failure or a hardware failure thanks to restart scripts.
  • SafeKit makes synchronous real-time replication of files containing critical data.
  • A connector for the SafeKit web console is installed in each server.
    Thus, the high availability cluster can be managed in a very simple way to avoid human errors.

How the SafeKit mirror cluster works?

Step 1. Real-time replication

Server 1 (PRIM) runs the application. Clients are connected to a virtual IP address. SafeKit replicates in real time modifications made inside files through the network.

File replication at byte level in a mirror cluster

The replication is synchronous with no data loss on failure contrary to asynchronous replication.
You just have to configure the names of directories to replicate in SafeKit. There are no pre-requisites on disk organization. Directories may be located in the system disk.

Step 2. Automatic failover

When Server 1 fails, Server 2 takes over. SafeKit switches the virtual IP address and restarts the application automatically on Server 2.
The application finds the files replicated by SafeKit uptodate on Server 2. The application continues to run on Server 2 by locally modifying its files that are no longer replicated to Server 1.

Failover in a mirror cluster

The failover time is equal to the fault-detection time (30 seconds by default) plus the application start-up time.

Step 3. Automatic failback

Failback involves restarting Server 1 after fixing the problem that caused it to fail.
SafeKit automatically resynchronizes the files, updating only the files modified on Server 2 while Server 1 was halted.

Failback in a mirror cluster

Failback takes place without disturbing the application, which can continue running on Server 2.

Step 4. Back to normal

After reintegration, the files are once again in mirror mode, as in step 1. The system is back in high-availability mode, with the application running on Server 2 and SafeKit replicating file updates to Server 1.

Return to normal operation in a mirror cluster

If the administrator wishes the application to run on Server 1, he/she can execute a "swap" command either manually at an appropriate time, or automatically through configuration.

How to configure a SafeKit Mirror Cluster?

SafeKit Web Console: High Availability configuration dashboard showing heartbeat networks, virtual IP setup, and real-time directory replication for a mirror cluster.

The SafeKit web console provides an intuitive interface to orchestrate high availability for your critical applications. In just a few steps, you can configure a SafeKit mirror cluster to ensure business continuity:

  • Application Failover (Macros Tab): Define the specific application services to be automatically restarted in the event of a failure.
  • Heartbeat network(s): Dedicated communication path(s) used by cluster nodes to continuously monitor each other's health and availability and synchronize failover decisions.
  • Virtual IP Management: Set up the Virtual IP (VIP) for transparent client reconnection after a failover.
  • Real-Time Replication: Select the critical directories for host-based, synchronous byte-level replication.
  • Checkers: Monitor the application's health and trigger automatic recovery if a process failure is detected.

The SafeKit cluster includes a dedicated split-brain checker to resolve network isolation issues without the need for a third witness machine or an additional heartbeat network. Learn more about power outage and network isolation in a cluster.

How to monitor a SafeKit mirror cluster?

SafeKit Web Console: Real-time monitoring of a 2-node mirror cluster showing PRIM and SECOND states with active data replication.

The SafeKit management console offers a unified view of your high availability infrastructure. It allows administrators to monitor the operational state of the cluster and track data synchronization in real-time.

For a 2-node mirror cluster, the console clearly displays the roles of each server:

  • PRIM (Primary): The active node currently running the application and managing the Virtual IP. It performs writes to the local storage and real-time replication to the secondary node.
  • SECOND (Secondary): The standby node receiving synchronous byte-level updates. It is ready to take over instantly if the Primary fails.
  • ALONE State: Visually alerts you when the cluster is running on a single node (e.g., during maintenance or after a failure), indicating that redundancy is temporarily lost.
  • Resynchronization Progress: When a failed node recovers, its status turns orange during background data reintegration, ensuring no downtime during the "return to normal" phase.

Beyond simple status icons, the interface provides one-click failover orchestration, allowing you to manually swap roles (Primary/Secondary) for planned maintenance without interrupting user activity.

🔍 SafeKit High Availability Navigation Hub

Explore SafeKit: Features, technical videos, documentation, and free trial
Resource Type Description Direct Link
Key Features Why Choose SafeKit for Simple and Cost-Effective High Availability? See Why Choose SafeKit for High Availability
Deployment Model All-in-One SANless HA: Shared-Nothing Software Clustering See SafeKit All-in-One SANless HA
Partners SafeKit: The Benchmark in High Availability for Partners See Why SafeKit Is the HA Benchmark for Partners
HA Strategies SafeKit: Infrastructure (VM) vs. Application-Level High Availability See SafeKit HA & Redundancy: VM vs. Application Level
Technical Specifications Technical Limitations for SafeKit Clustering See SafeKit High Availability Limitations
Proof of Concept SafeKit: High Availability Configuration & Failover Demos See SafeKit Failover Tutorials
Architecture How the SafeKit Mirror Cluster works (Real-Time Replication & Failover) See SafeKit Mirror Cluster: Real-Time Replication & Failover
Architecture How the SafeKit Farm Cluster works (Network Load Balancing & Failover) See SafeKit Farm Cluster: Network Load Balancing & Failover
Competitive Advantages Comparison: SafeKit vs. Traditional High Availability (HA) Clusters See SafeKit vs. Traditional HA Cluster Comparison
Technical Resources SafeKit High Availability: Documentation, Downloads & Trial See SafeKit HA Free Trial & Technical Documentation
Pre-configured Solutions SafeKit Application Module Library: Ready-to-Use HA Solutions See SafeKit High Availability Application Modules