
Microsoft Azure High Availability: SafeKit Synchronous Replication & Failover

SafeKit Shared-Nothing HA in Azure: Eliminate Shared Disks with Zero Data Loss


While major cloud providers like Azure offer native redundancy, they often leave a significant gap between data replication and true high availability. Currently, cloud users must choose between two suboptimal paths: native asynchronous replication, which requires manual failover and risks data loss, or cloud shared disk, which lacks the automated failover mechanisms required for seamless business continuity.

SafeKit: The Lightweight, All-in-One Alternative to Complex Azure Cloud Clusters

  • All-in-One 2-Node Azure Cluster: SafeKit is a complete solution providing Native Multi-AZ (Availability Zone) capability with virtual IP, automatic failover, automatic failback, and synchronous real-time file replication in a single, integrated software package.
  • Zero Data Loss (RPO=0): Unlike native cloud VM replication for DR (Disaster Recovery), SafeKit uses synchronous, file-level replication to ensure data integrity for transactional applications, guaranteeing no data loss during a failure.
  • Shared-Nothing Architecture: SafeKit eliminates the need for cloud shared disks and resolves their associated "disk-locking" issues. It uses local disks for maximum speed and minimum cost. While traditional clusters often hang waiting for the cloud provider to release a storage lock from a failed node, SafeKit uses independent, synchronized folders within the local disks of each node to ensure instantaneous failover.
  • Zero Application Reconfiguration: SafeKit performs transparent, file-level replication, allowing you to protect apps and data exactly where they are installed—including on the system disk—without the need to migrate data to dedicated cloud shared disks.
  • Kubernetes Alternative: SafeKit provides high availability for both legacy and containerized applications (not designed for Kubernetes) without the operational overhead and complexity of Kubernetes orchestration.
  • Uniform Deployment (On-Prem or Cloud): Aside from configuring the Virtual IP within a Cloud Load Balancer, the SafeKit deployment process remains identical across on-premises and cloud environments.

By consolidating High Availability into a lightweight software layer, SafeKit delivers enterprise-grade redundancy and business continuity in the cloud at a fraction of the cost of traditional clustering suites.

SafeKit HA mirror cluster for Azure: virtual IP, automatic failover, automatic failback, and synchronous real-time file replication

How does SafeKit implement a shared-nothing Azure high availability cluster?

What is the SafeKit Mirror HA solution for Azure?

Evidian SafeKit brings high availability to Azure between two virtual machines in two Availability Zones (AZ).

This article explains how to quickly implement an Azure cluster without cloud shared disks and without specialized skills.

The principle of the solution is to define four things: the folders where the Azure application data resides, the application services, a virtual IP, and checkers.

SafeKit then implements real-time replication and automatic failover to ensure continuous service availability.

Why choose a unified All-in-One HA solution over fragmented tools?

Unlike "bolt-on" solutions that combine separate products for replication and clustering, SafeKit integrates Virtual IP, Automatic Failover, Automatic Failback and Synchronous Real-time File Replication into a single engine.

This eliminates the "house of cards" risk where updates break fragile links between disparate tools, provides a single point of accountability for the entire HA stack, and reduces human error by providing a single interface for Azure application HA.

How does SafeKit handle Uniform Deployment and the Cloud Virtual IP?

SafeKit is a cloud-agnostic solution, meaning the deployment process and architecture stay the same across on-premises servers and all major cloud providers. The only architectural difference lies in how the Virtual IP is presented to the network:

  • On-Premises: SafeKit manages the VIP directly by sending GARP (Gratuitous ARP) packets to local switches, moving the IP between nodes instantly.
  • In the Cloud: Since Cloud networks do not support GARP, the Virtual IP is hosted by a Cloud Load Balancer.

    SafeKit provides the health probe to determine which node is the primary one. This allows the Load Balancer to detect the active node in real-time and route traffic to it automatically, ensuring seamless failover across Availability Zones.
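
The health-probe idea can be illustrated with a small sketch. This is not SafeKit's own probe: the handler, the `is_primary` callback, and the port are hypothetical names chosen for illustration. It only shows the general pattern in which a node answers HTTP 200 while it is primary and a non-200 status otherwise, so that a load balancer with an HTTP probe routes traffic to the primary node only.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_probe_handler(is_primary: Callable[[], bool]) -> type:
    """Build a handler that answers 200 only while this node is primary.

    `is_primary` is a hypothetical callback; in a real cluster the role
    would come from the clustering software itself.
    """

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            # HTTP 200 marks the node healthy for the load balancer;
            # any other status takes it out of rotation.
            primary = is_primary()
            status = 200 if primary else 503
            body = b"PRIM\n" if primary else b"NOT PRIM\n"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format: str, *args) -> None:
            pass  # keep the sketch quiet

    return ProbeHandler


def serve_probe(is_primary: Callable[[], bool],
                host: str = "0.0.0.0", port: int = 8080) -> HTTPServer:
    """Create (but do not start) the probe server; the port is arbitrary."""
    return HTTPServer((host, port), make_probe_handler(is_primary))
```

Calling `serve_probe(check_role).serve_forever()` on each node, with `check_role` supplied by the operator, would expose the probe. After a failover the former secondary starts answering 200 and the load balancer redirects traffic to it.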

What are the distinctive advantages of SafeKit for Azure high availability compared to competitors?

SafeKit differentiates itself from traditional Azure clusters through its shared-nothing architecture and simplified deployment. While most enterprise solutions require complex management of cloud shared disks, SafeKit provides:

  • Synchronous Replication with Zero Data Loss: SafeKit implements 100% synchronous replication, ensuring total data integrity for transactional applications. In the event of a failure, there is zero data loss (RPO=0). Furthermore, SafeKit is capable of replicating not only the Azure application databases but any other data folders (logs, configuration files, etc.), ensuring the entire environment is mirrored.
  • Simplified 2-Node Clustering: Unlike standard clusters that often require a "witness" (a 3rd node, disk, or file share) to maintain a quorum, SafeKit delivers full high availability with just two redundant servers, reducing infrastructure costs and complexity.
  • Unified Management: Administrators can manage Azure application failover, data replication, and monitoring through a single SafeKit console. This makes high availability accessible to teams without specialized "cluster admin" expertise.
  • Custom Checkers: SafeKit goes beyond basic service monitoring; it offers checkers to monitor the health of the Azure application process. The system is highly extensible, allowing for the addition of custom checkers tailored to your environment.
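
As an illustration of what a custom checker might look like, here is a minimal sketch. It does not use SafeKit's checker interface; `tcp_port_is_up`, `Checker`, and the threshold of three failures are assumptions made for the example. The idea is simply to probe the application periodically and signal recovery only after several consecutive failures, so that a single transient error does not trigger a restart.

```python
import socket
from typing import Callable


def tcp_port_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Checker:
    """Signal recovery after `threshold` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], threshold: int = 3) -> None:
        self.probe = probe
        self.threshold = threshold
        self.failures = 0

    def tick(self) -> bool:
        """Run one probe; return True when recovery should be triggered."""
        if self.probe():
            self.failures = 0  # a success resets the failure counter
            return False
        self.failures += 1
        return self.failures >= self.threshold
```

A scheduler would call `tick()` at a fixed interval and restart the application, or trigger a failover, when it returns `True`.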

How does SafeKit reduce the TCO (Total Cost of Ownership) for Azure compared to standard clustering?

Unlike traditional high-availability solutions, SafeKit is designed to operate with minimal infrastructure overhead, without compromising reliability. Key savings compared to a traditional failover cluster include:

  • Zero Cloud Shared Disk Costs: SafeKit uses a shared-nothing architecture that works with local disks.
  • No Enterprise Edition Required: While native database replication mechanisms may require expensive Enterprise licensing, SafeKit does not need the Enterprise edition. It works with the standard edition and even the free edition, providing high-end availability at a fraction of the cost.
  • True 2-Node Efficiency: Unlike standard clusters that often require a "witness" (a 3rd node, disk, or file share) to maintain a quorum, SafeKit delivers full high availability with just two redundant servers.
  • No Forced Subscriptions: SafeKit offers a perpetual license. You own your software, avoiding the "subscription trap" and unpredictable annual price hikes common with modern cloud-only or subscription-based models.
  • Low Operational Expense (OPEX): As a plug-and-play solution, it requires no specialized training or expensive external consultancy for maintenance, unlike complex open-source clustering tools.
  • CPU-Independent Pricing: Licensing is independent of the number of CPUs or cores. With just two licenses for two nodes, you can protect your Azure application against failures.

Is it possible to set up an Azure mirror cluster without clustering skills?

Yes. This article explains how to quickly implement an Azure mirror cluster without the need for complex HA clustering skills. By using SafeKit’s automated failover scripts to handle the replication and restart of your Azure application, you get a robust redundancy solution that is significantly simpler to deploy and maintain than traditional clustering solutions.

Beyond Azure, which applications and environments can SafeKit protect?

SafeKit is a versatile high-availability solution for both Windows and Linux that extends far beyond the Azure mirror cluster. It enables synchronous real-time replication and automatic failover for a wide range of critical workloads, including:

  • Virtual & Physical Environments: Complete Hyper-V or KVM virtual machines.
  • Container Orchestration: Docker, Podman, and K3s (Kubernetes) environments.
  • Data & Services: Individual file directories, services, and various databases.
  • Cloud Infrastructure: High availability for Cloud applications.

SafeKit also provides Farm Clusters with native Network Load Balancing and Failover for stateless applications like Web Servers.


How does the SafeKit mirror cluster work with Azure?

Step 1. Real-time replication

Server 1 (PRIM) runs the Azure application. Clients are connected to a virtual IP address. SafeKit replicates file modifications over the network to Server 2 in real time.

File replication at byte level in a mirror Azure cluster

Replication is synchronous, so, unlike asynchronous replication, no data is lost on failure.
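
The difference between the two modes can be shown with a toy model. This is a sketch, not SafeKit's replication engine: `Node`, `Mirror`, and the `pending` queue are hypothetical names. In synchronous mode a write returns only after the secondary has applied it; in asynchronous mode the write is queued, so a crash before the queue is flushed loses data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node holding one replicated file as an in-memory byte buffer."""
    data: bytearray = field(default_factory=bytearray)

    def apply(self, offset: int, payload: bytes) -> None:
        end = offset + len(payload)
        if len(self.data) < end:
            # Grow the file with zero bytes so the write fits
            self.data.extend(b"\x00" * (end - len(self.data)))
        self.data[offset:end] = payload


@dataclass
class Mirror:
    primary: Node
    secondary: Node
    synchronous: bool = True
    pending: list = field(default_factory=list)

    def write(self, offset: int, payload: bytes) -> None:
        """Apply a byte-range write; returning models the acknowledgement."""
        self.primary.apply(offset, payload)
        if self.synchronous:
            # Acknowledge only once the secondary also holds the bytes
            self.secondary.apply(offset, payload)
        else:
            # Acknowledge immediately; the secondary lags behind
            self.pending.append((offset, payload))

    def flush(self) -> None:
        """Ship queued asynchronous writes to the secondary."""
        for offset, payload in self.pending:
            self.secondary.apply(offset, payload)
        self.pending.clear()
```

With `synchronous=True`, the two buffers are identical whenever `write` returns, which is what RPO=0 means in this model. With `synchronous=False`, anything still in `pending` at the moment of a primary crash is lost.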

You only have to configure the names of the directories to replicate in SafeKit. There are no prerequisites on disk organization. Directories may be located on the system disk.

Step 2. Automatic failover

When Server 1 fails, Server 2 takes over. SafeKit switches the virtual IP address and restarts the Azure application automatically on Server 2.

The application finds the files replicated by SafeKit up to date on Server 2. It continues to run on Server 2, modifying its files locally; these changes are no longer replicated to Server 1 while Server 1 is down.

Failover of Azure in a mirror cluster

The failover time is equal to the fault-detection time (30 seconds by default) plus the application start-up time.
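
As a worked example of this formula, the following sketch adds the two terms. The 30-second default comes from the text above; the function name and the 15-second start-up time in the usage note are hypothetical.

```python
def failover_time(app_startup_s: float, detection_s: float = 30.0) -> float:
    """Estimate failover time as fault detection plus application start-up.

    Both values are in seconds; 30 s is the default detection time
    stated in the text.
    """
    if app_startup_s < 0 or detection_s < 0:
        raise ValueError("durations must be non-negative")
    return detection_s + app_startup_s
```

For an application that takes 15 seconds to start, `failover_time(15)` gives 45 seconds with the default detection time.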

Step 3. Automatic failback

Failback involves restarting Server 1 after fixing the problem that caused it to fail.

SafeKit automatically resynchronizes the files, updating only the files modified on Server 2 while Server 1 was halted.

Failback in a mirror Azure cluster

Failback takes place without disturbing the Azure application, which can continue running on Server 2.
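
The idea of resynchronizing only what changed can be sketched as follows. This is an illustration under stated assumptions, not SafeKit's actual algorithm: file trees are modeled as dictionaries mapping a path to its content, and `modified_paths` is a hypothetical record of the paths touched on Server 2 while Server 1 was halted.

```python
def resync(source: dict[str, bytes], target: dict[str, bytes],
           modified_paths: set[str]) -> int:
    """Bring `target` up to date by touching only the modified paths.

    Returns the number of files copied. Paths in `modified_paths` that
    no longer exist on the source are deleted from the target.
    """
    copied = 0
    for path in sorted(modified_paths):
        if path in source:
            target[path] = source[path]  # changed or newly created file
            copied += 1
        else:
            target.pop(path, None)       # file deleted while target was down
    return copied
```

Files that were not modified during the outage are never read or transferred, which is why the cost of reintegration depends on the volume of changes rather than on the total size of the replicated directories.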

Step 4. Back to normal

After reintegration, the files are once again in mirror mode, as in step 1. The system is back in high-availability mode, with the Azure application running on Server 2 and SafeKit replicating file updates to Server 1.

Return to normal operation in a mirror Azure cluster

If the administrator wishes the application to run on Server 1, this can be done manually through the web console at an appropriate time, or automatically through configuration.

How to configure a SafeKit Mirror Cluster?

SafeKit Web Console: High Availability configuration dashboard showing heartbeat networks, virtual IP setup, and real-time directory replication for a mirror cluster.

The SafeKit web console provides an intuitive interface to orchestrate high availability for your critical applications. In just a few steps, you can configure a SafeKit mirror cluster to ensure business continuity:

  • Application Failover (Macros Tab): Define the specific application services to be automatically restarted in the event of a failure.
  • Heartbeat network(s): Dedicated communication path(s) used by cluster nodes to continuously monitor each other's health and availability and synchronize failover decisions.
  • Virtual IP Management: Set up the Virtual IP (VIP) for transparent client reconnection after a failover.
  • Real-Time Replication: Select the critical directories for host-based, synchronous byte-level replication.
  • Checkers: Monitor the application's health and trigger automatic recovery if a process failure is detected.

The SafeKit cluster includes a dedicated split-brain checker to resolve network isolation issues without the need for a third witness machine or an additional heartbeat network.

How to monitor a SafeKit mirror cluster?

SafeKit Web Console: Real-time monitoring of a 2-node mirror cluster showing PRIM and SECOND states with active data replication.

The SafeKit management console offers a unified view of your high availability infrastructure. It allows administrators to monitor the operational state of the cluster and track data synchronization in real-time.

For a 2-node mirror cluster, the console clearly displays the roles of each server:

  • PRIM (Primary): The active node currently running the application and managing the Virtual IP. It performs writes to the local storage and real-time replication to the secondary node.
  • SECOND (Secondary): The standby node receiving synchronous byte-level updates. It is ready to take over instantly if the Primary fails.
  • ALONE State: Visually alerts you when the cluster is running on a single node (e.g., during maintenance or after a failure), indicating that redundancy is temporarily lost.
  • Resynchronization Progress: When a failed node recovers, its status turns orange during background data reintegration, ensuring no downtime during the "return to normal" phase.
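
The states listed above can be modeled as a small state machine. This is a simplified sketch: PRIM, SECOND, and ALONE come from the text, while `RESYNC`, `DOWN`, the class name, and the method names are hypothetical labels introduced for the example. Cases such as a failure during resynchronization are left out.

```python
from enum import Enum


class State(Enum):
    PRIM = "PRIM"
    SECOND = "SECOND"
    ALONE = "ALONE"
    RESYNC = "RESYNC"  # hypothetical label for the reintegration phase
    DOWN = "DOWN"      # hypothetical label for a stopped or failed node


class MirrorCluster:
    """Toy model of the node states in a 2-node mirror cluster."""

    def __init__(self) -> None:
        self.state = {"server1": State.PRIM, "server2": State.SECOND}

    def _peer(self, node: str) -> str:
        return "server2" if node == "server1" else "server1"

    def fail(self, node: str) -> None:
        """A node fails; the surviving node runs the application ALONE."""
        self.state[node] = State.DOWN
        peer = self._peer(node)
        if self.state[peer] in (State.PRIM, State.SECOND):
            self.state[peer] = State.ALONE

    def restart(self, node: str) -> None:
        """A repaired node comes back and starts resynchronizing."""
        peer = self._peer(node)
        if self.state[node] is State.DOWN and self.state[peer] is State.ALONE:
            self.state[node] = State.RESYNC

    def resync_done(self, node: str) -> None:
        """Resynchronization ends; the cluster is redundant again."""
        if self.state[node] is State.RESYNC:
            self.state[node] = State.SECOND
            self.state[self._peer(node)] = State.PRIM
```

Calling `fail("server1")`, `restart("server1")`, and `resync_done("server1")` in sequence walks through failover, failback, and the return to normal, ending with Server 2 as PRIM and Server 1 as SECOND.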

Beyond simple status icons, the interface provides one-click failover orchestration, allowing you to manually reassign the primary role for planned maintenance while ensuring continuous availability for user activity.

Comparison: SafeKit for Azure vs. Native Cloud HA/DR Solutions

| Feature | SafeKit for Azure | Native Cloud Shared Disk | Native Cloud VM Replication for DR (Disaster Recovery) |
|---|---|---|---|
| Architecture | Shared-Nothing: Uses local disks for maximum speed and minimum cost. | Shared Storage: Dependent on cloud-managed disks. | Block-Level: Replicates entire VM disks to a passive region. |
| Data Integrity (RPO) | Zero (RPO=0): Synchronous file-level replication. | Zero: Synchronous writing to a shared disk. | Non-Zero: Asynchronous replication resulting in data lag. |
| Failover/Failback Logic | Fully Automatic: Integrated monitoring and restart. | Requires a third-party failover tool supporting cloud shared disks. | Manual: Requires activation of a disaster recovery plan. |
| Application Setup | Zero Reconfiguration: Protects applications where they are currently installed. | Reconfiguration: Application data must be migrated to a specific shared disk. | None: Captures the entire OS and application as-is. |
| Replication Scope | Complete: Application data folders (DB + Config + Logs). | Partial: Only data stored on the shared volume. | Total: Replicates the entire virtual machine. |
| VM Localization | Native Multi-AZ: Synchronous replication across Availability Zones within a region. | Provider Dependent: Requires shared storage replicated across Availability Zones. | Regional: Primarily designed for replication between distant geographical regions. |
| Deployment Time | Low: < 30 Minutes (On-prem or Cloud). | High: Days or weeks for cluster configuration. | Medium: Requires setting up DR vaults and policies. |

Is High Availability a substitute for backups?

No, High Availability and backups are complementary, not interchangeable. While SafeKit ensures business continuity by keeping applications running during a hardware crash, it does not guard against logical errors, accidental deletions, or ransomware attacks. For example, because real-time replication mirrors every change instantly, a ransomware attack on the primary node will be immediately duplicated on the secondary node. To recover from such cyber threats or accidental deletions, you need a dedicated backup solution with a robust retention policy. This allows you to "rewind" your environment to a healthy state from before the corruption occurred.

Conclusion

By adopting a shared-nothing architecture, SafeKit eliminates the complexity and cost of cloud-managed shared disks. Unlike traditional clustering, it provides an infrastructure-independent solution that requires no application reconfiguration or data migration. With native Multi-AZ support and a deployment time of under 30 minutes, SafeKit ensures your Azure environment remains resilient with zero data loss (RPO=0) and fully automated recovery.

Video Guide: Configuring a SafeKit HA mirror cluster

SafeKit Video: Application-Level Clustering (8:47)

In this video, discover how SafeKit implements a mirror HA cluster without the complexity of cloud shared disk clustering. While this demonstration uses Microsoft SQL Server, the solution works identically for other databases and applications.

Note that the virtual IP in this video is configured for an on-premises deployment, not for a Cloud Load Balancer with SafeKit health probes.

Video Highlights

  1. 2 nodes with SQL Server (0:32)
  2. Configure the cluster and the mirror.safe module (3:58)
  3. Start and test SQL replication, migration, failover on crash (4:17)
