Kubernetes K3S: the simplest high availability cluster between two redundant servers

With the synchronous replication and automatic failover provided by Evidian SafeKit

How does Evidian SafeKit implement a Kubernetes K3S high availability cluster between two redundant servers?

The solution for Kubernetes K3S

Evidian SafeKit brings high availability to Kubernetes K3S between two redundant servers. This article explains how to quickly implement a Kubernetes cluster on 2 nodes without external NFS storage, without an external configuration database and without specific skills.

Note that SafeKit is a generic product. With the same product, you can implement real-time replication and failover of directories and services, databases, Docker, Podman, full Hyper-V or KVM virtual machines, and Cloud applications (see all solutions).

Our customers and partners recognize this clustering solution as the simplest to implement. SafeKit is well suited to running Kubernetes applications on premises and on 2 nodes.

We have chosen K3S as the Kubernetes engine because it is a lightweight solution for IoT & Edge computing.

The k3s.safe mirror module implements:

  • 2 active K3S masters/agents running pods
  • replication of the K3S configuration database (MariaDB)
  • replication of persistent volumes (implemented by the NFS client dynamic provisioner storage class: nfs-client)
  • virtual IP address, automatic failover, automatic failback

How does it work?

The following table explains how the solution works on 2 nodes. Other nodes with K3S agents (without SafeKit) can be added for horizontal scalability.

Kubernetes K3S components

SafeKit PRIM node | SafeKit SECOND node
K3S (master and agent) is running pods on the primary node | K3S (master and agent) is running pods on the secondary node
NFS server is running on the primary node with a virtual IP/NFS port, an exported NFS share and the K3S persistent volumes | Persistent volumes are replicated synchronously and in real time by SafeKit on the secondary node
MariaDB server is running on the primary node with a virtual IP/MariaDB port and the K3S configuration database | The configuration database is replicated synchronously and in real time by SafeKit on the secondary node
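
Because MariaDB is reached through the virtual IP, a K3S server can keep using the same address for its configuration database whichever node is primary. The article does not show how the k3s.safe module configures this, so the following is only a sketch: it builds a value in the MySQL/MariaDB format that K3S documents for its `--datastore-endpoint` option. The function name, user, password, address and database name are hypothetical.

```python
def k3s_mysql_endpoint(user, password, virtual_ip, port=3306, database="k3s"):
    """Build a datastore endpoint string in the MySQL/MariaDB form
    that K3S documents: mysql://user:password@tcp(host:port)/database.

    All argument values used below are illustrative assumptions; the
    source article does not give the real settings of the module.
    """
    return f"mysql://{user}:{password}@tcp({virtual_ip}:{port})/{database}"


if __name__ == "__main__":
    # Hypothetical virtual IP of the mirror cluster.
    print(k3s_mysql_endpoint("k3s", "secret", "192.168.0.10"))
```

Pointing the endpoint at the virtual IP rather than at a physical server address is the design choice that matters here: after a failover, the address still resolves to the node where MariaDB has been restarted.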

A simple solution

SafeKit is the simplest high availability solution for running Kubernetes applications on 2 nodes and on premises.

SafeKit | Benefits
Synchronous real-time replication for persistent volumes | No external NAS/NFS storage for persistent volumes
Only 2 nodes for HA of Kubernetes | No need for 3 nodes like with etcd database
Same simple product for virtual IP address, replication, failover, failback, administration, maintenance | Avoid different technologies for virtual IP (metal-lb, BGP), HA of persistent volumes, HA of configuration database
Supports disaster recovery with two remote nodes | Avoid replicated NAS storage
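
The "3 nodes with etcd" row follows from majority-quorum arithmetic: a quorum-based store such as etcd needs more than half of its members alive, so a 2-member cluster cannot tolerate any failure. The short sketch below only illustrates that arithmetic; the function names are ours and are not part of etcd or SafeKit.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still alive."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} members: quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

With 2 members the quorum is 2, so losing either one stops the store; 3 members are the minimum that survive one failure. The SafeKit mirror avoids this constraint by replicating the MariaDB files between a primary and a secondary instead of relying on a majority vote.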

How does the SafeKit mirror cluster work with Kubernetes K3S?

Step 1. File replication at byte level in a mirror cluster

This step corresponds to the following figure. Server 1 (PRIM) runs the Kubernetes K3S components explained in the previous table. Clients are connected to the virtual IP address of the mirror cluster. SafeKit replicates, in real time, the files opened by the Kubernetes K3S components. Only the changes made by the components inside the files are replicated across the network, which limits traffic (byte-level file replication).

File replication at byte level in a Kubernetes K3S mirror cluster

With software data replication at the file level, only the names of the directories are configured in SafeKit. There are no prerequisites on disk organization for the two servers. Directories to replicate may be located on the system disk. SafeKit implements synchronous replication, which, unlike asynchronous replication, loses no data on failure.
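
The two properties described above (only modified bytes cross the network, and a write completes only once the secondary holds it) can be illustrated with a toy in-memory model. This is a conceptual sketch, not SafeKit's implementation: the class and its attributes are hypothetical, and the "network" is a plain method call.

```python
class MirroredFile:
    """Toy model of one file replicated synchronously at byte level."""

    def __init__(self, size: int):
        self.primary = bytearray(size)    # copy on Server 1
        self.secondary = bytearray(size)  # copy on Server 2
        self.bytes_sent = 0               # traffic counter for the demo

    def write(self, offset: int, data: bytes) -> int:
        # Apply the change on the primary copy.
        self.primary[offset:offset + len(data)] = data
        # Ship only the modified range, not the whole file.
        self._replicate(offset, data)
        # Returning here models the synchronous acknowledgement:
        # the caller resumes only after the secondary has the bytes.
        return len(data)

    def _replicate(self, offset: int, data: bytes) -> None:
        self.secondary[offset:offset + len(data)] = data
        self.bytes_sent += len(data)


if __name__ == "__main__":
    f = MirroredFile(1024 * 1024)
    f.write(4096, b"pod-state")
    print(f.primary == f.secondary, f.bytes_sent)
```

In an asynchronous scheme, `write` would return before `_replicate` had run, and a crash in that window would leave the secondary without the last changes.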

Step 2. Failover

When Server 1 fails, Server 2 takes over. SafeKit switches the cluster's virtual IP address and restarts the Kubernetes K3S components automatically on Server 2. The components find the files replicated by SafeKit up to date on Server 2, thanks to the synchronous replication between Server 1 and Server 2. The components continue to run on Server 2, modifying their files locally; these changes are no longer replicated to Server 1.

Failover in a Kubernetes K3S mirror cluster

The failover time is equal to the fault-detection time (set to 30 seconds by default) plus the components' start-up time. Unlike disk replication solutions, there is no delay for remounting the file system and running file system recovery procedures.
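
As a quick worked example of this formula, the sketch below adds the default 30-second detection time to a start-up time. The 45-second start-up value is purely illustrative; the real figure depends on how long K3S, MariaDB and the NFS server take to start on a given machine.

```python
DEFAULT_DETECTION_S = 30  # default fault-detection time stated in the article


def failover_time(startup_s: float,
                  detection_s: float = DEFAULT_DETECTION_S) -> float:
    """Failover time = fault-detection time + components' start-up time."""
    return detection_s + startup_s


if __name__ == "__main__":
    # Hypothetical start-up time of 45 s for the K3S components.
    print(failover_time(45))
```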

Step 3. Failback and reintegration

Failback involves restarting Server 1 after fixing the problem that caused it to fail. SafeKit automatically resynchronizes the files, updating only the files modified on Server 2 while Server 1 was halted. This reintegration takes place without disturbing the Kubernetes K3S components, which can continue running on Server 2.

Failback in a Kubernetes K3S mirror cluster

If SafeKit was cleanly stopped on Server 1, then at its restart, only the modified zones inside files are resynchronized, according to modification tracking bitmaps.

If Server 1 crashed (power off), the modification bitmaps are not reliable and are not used. All the files bearing a modification timestamp more recent than the last known synchronization point are resynchronized.
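
The two reintegration cases can be sketched as two selection rules. This is a simplified illustration under our own assumptions (a bitmap modeled as a list of booleans with a fixed zone size, and a sync point stored as a Unix timestamp); SafeKit's actual data structures are not described in this article.

```python
import os
from pathlib import Path


def zones_to_resync(dirty_bitmap, zone_size):
    """Clean stop: return (offset, length) for each zone flagged as modified."""
    return [(i * zone_size, zone_size)
            for i, dirty in enumerate(dirty_bitmap) if dirty]


def files_to_resync(root, last_sync_ts):
    """Crash: bitmaps are ignored, so select every file under `root`
    whose modification time is later than the last synchronization point."""
    stale = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = Path(dirpath) / name
            if path.stat().st_mtime > last_sync_ts:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Zones 1 and 2 of a file were modified while the peer was stopped.
    print(zones_to_resync([False, True, True, False], zone_size=4096))
```

The first rule copies only the flagged zones, so it moves little data; the second falls back to whole files because, after a power-off, nothing guarantees that the in-memory tracking reached the disk.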

Step 4. Return to byte-level file replication in the mirror cluster

After reintegration, the files are once again in mirror mode, as in step 1. The system is back in high-availability mode, with the Kubernetes K3S components running on Server 2 and SafeKit replicating file updates to the secondary Server 1.

Return to normal operation in a Kubernetes K3S mirror cluster

If the administrator wants the Kubernetes K3S components to run on Server 1 again, they can execute a "swap" command, either manually at an appropriate time or automatically through configuration.
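
Steps 1 to 4 and the swap can be summarized as a small state model. The PRIM and SECOND names come from the table earlier in this article; the ALONE and STOP names, the class and its methods are assumptions made for this sketch and do not describe SafeKit's real commands or API.

```python
class MirrorCluster:
    """Toy model of the roles of two servers in a mirror cluster."""

    def __init__(self, first="server1", second="server2"):
        self.state = {first: "PRIM", second: "SECOND"}

    def _peer(self, node):
        return next(n for n in self.state if n != node)

    def active(self):
        """Server currently running the K3S components."""
        return next(n for n, s in self.state.items()
                    if s in ("PRIM", "ALONE"))

    def fail(self, node):
        """Step 2: the surviving server runs alone, without replication."""
        peer = self._peer(node)
        if self.state[peer] == "STOP":
            raise RuntimeError("both servers are down")
        self.state[node] = "STOP"
        self.state[peer] = "ALONE"

    def reintegrate(self, node):
        """Steps 3 and 4: the restarted server resynchronizes and
        comes back as secondary; the survivor becomes primary."""
        if self.state[node] != "STOP":
            raise RuntimeError(f"{node} is not stopped")
        self.state[node] = "SECOND"
        self.state[self._peer(node)] = "PRIM"

    def swap(self):
        """Exchange the primary and secondary roles."""
        if sorted(self.state.values()) != ["PRIM", "SECOND"]:
            raise RuntimeError("swap needs both servers in mirror mode")
        self.state = {n: "SECOND" if s == "PRIM" else "PRIM"
                      for n, s in self.state.items()}


if __name__ == "__main__":
    cluster = MirrorCluster()
    cluster.fail("server1")         # failover to server2
    cluster.reintegrate("server1")  # server1 returns as secondary
    cluster.swap()                  # administrator moves the roles back
    print(cluster.state)
```

Note that reintegration alone does not move the components back to Server 1: the roles stay inverted until a swap is requested, which matches the behavior described in step 4.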
