What is RPO and RTO with examples?

Evidian SafeKit

What is RPO and RTO with examples of high availability and backup solutions?

Overview

This article explores RPO (Recovery Point Objective) and RTO (Recovery Time Objective) with examples of high availability and backup solutions.

High availability and backup solutions are complementary. A high availability solution provides automatic failover in the event of a failure, while a backup solution provides data recovery in the event of a disaster such as ransomware encrypting all data.

The article explains in detail the RTO and RPO of SafeKit, a high availability software product.

What is RPO?

RPO (Recovery Point Objective) is the amount of data, measured as a period of time before the failure, that can be lost in the event of a failure.

If you are looking for a high availability cluster with automatic failover, then the RPO should be 0, so that the application is restarted without data loss. To achieve this, you can choose either a hardware high availability cluster with a shared disk or a software high availability cluster with synchronous real-time replication.

If you are implementing backup solutions, then the RPO is greater than 0, because data written since the last backup is lost, and the recovery is not automatic. Administrators decide how often to back up the data and how many backups to keep.
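To make the link between backup frequency and data loss concrete, here is a minimal Python sketch. The function name `data_loss_window` and the 6-hour schedule are assumptions chosen for illustration; they do not come from any backup product.

```python
from datetime import datetime, timedelta


def data_loss_window(failure_time: datetime, backup_times: list[datetime]) -> timedelta:
    """Return how much recent work is lost when restoring the latest
    backup taken at or before the failure."""
    usable = [t for t in backup_times if t <= failure_time]
    if not usable:
        raise ValueError("no backup exists before the failure")
    return failure_time - max(usable)


# Hypothetical schedule: one backup every 6 hours, starting at midnight.
start = datetime(2024, 1, 1, 0, 0)
backups = [start + timedelta(hours=6 * i) for i in range(4)]  # 00:00, 06:00, 12:00, 18:00

# A failure at 17:30 falls back to the 12:00 backup.
loss = data_loss_window(datetime(2024, 1, 1, 17, 30), backups)
print(loss)  # 5:30:00
```

With this schedule the worst case approaches 6 hours of lost data (a failure just before the next backup), which is why the backup interval is the main factor in the RPO of a backup solution.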

What is RTO?

RTO (Recovery Time Objective) is the target for how long an application may be unavailable in the event of a failure.

For a critical application, RTO should be minimal. This requires a high availability solution that automatically restarts the application after a hardware or software failure. RTO is then approximately one minute: the failure detection time plus the automatic restart time of the application.

With a backup solution, RTO is generally several hours or more. Administrators first attempt to repair the hardware and restart the application on up-to-date data. Restoring from a backup is the last resort, used only when the previous actions fail, because it leads to data loss.

RTO with the example of a SafeKit mirror cluster

The SafeKit mirror cluster is a software high availability cluster with synchronous real-time data replication and automatic application failover.

RTO of the SafeKit mirror cluster is on the order of 1 minute and can be decreased by lowering the heartbeat timeout.

For a hardware failure, RTO = heartbeat timeout (default 30 s) + time to restart the application.

For a software failure or an administrator restart, RTO = time to stop the application + time to restart it.
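The two formulas above can be written as a short Python sketch. The function names and the application timings (25 s to restart, 10 s to stop) are hypothetical values for illustration; only the 30 s default heartbeat timeout comes from the text.

```python
DEFAULT_HEARTBEAT_TIMEOUT_S = 30.0  # default stated in this article


def mirror_rto_hardware_failure(restart_s: float,
                                heartbeat_timeout_s: float = DEFAULT_HEARTBEAT_TIMEOUT_S) -> float:
    """RTO = heartbeat timeout + time to restart the application."""
    return heartbeat_timeout_s + restart_s


def mirror_rto_software_failure(stop_s: float, restart_s: float) -> float:
    """RTO = time to stop the application + time to restart it."""
    return stop_s + restart_s


# Hypothetical application timings, in seconds.
restart_s = 25.0
stop_s = 10.0

print(mirror_rto_hardware_failure(restart_s))          # 55.0
print(mirror_rto_hardware_failure(restart_s, 10.0))    # 35.0 with a shorter heartbeat timeout
print(mirror_rto_software_failure(stop_s, restart_s))  # 35.0
```

The second call shows why tuning the heartbeat timeout matters: for a hardware failure, the timeout is added in full to the application restart time.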

With solutions that reboot a full virtual machine in case of failure, the RTO includes the reboot time of the virtual machine.

RTO with the example of a SafeKit farm cluster

The SafeKit farm cluster is a software high availability cluster with network load balancing and automatic failover.

RTO of a SafeKit farm cluster is in the order of a few seconds.

For a hardware failure, RTO = failure detection timeout through monitoring channels (default a few seconds). After the timeout the load balancing filters are reconfigured.

For a software failure or an administrator restart, RTO = time to stop the application + time to restart it.
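To illustrate what reconfiguring the load balancing filters means in a farm, the sketch below recomputes which node accepts each client once a node is declared failed. The hashing rule, the function name `owner` and the node and client names are assumptions made for this example; this is not SafeKit's actual filtering algorithm.

```python
import zlib


def owner(client_ip: str, alive_nodes: list[str]) -> str:
    """Pick the node that accepts traffic from this client.

    Illustrative rule only: a stable hash of the client address,
    taken modulo the number of nodes that are still alive.
    """
    if not alive_nodes:
        raise RuntimeError("no node is available")
    ordered = sorted(alive_nodes)
    return ordered[zlib.crc32(client_ip.encode()) % len(ordered)]


nodes = ["node1", "node2", "node3"]
clients = [f"10.0.0.{i}" for i in range(1, 9)]

before = {c: owner(c, nodes) for c in clients}

# node2 stops answering; once the detection timeout expires,
# the surviving nodes recompute their filters without it.
survivors = [n for n in nodes if n != "node2"]
after = {c: owner(c, survivors) for c in clients}

for c in clients:
    print(c, before[c], "->", after[c])
```

No application restart is involved on the surviving nodes, which is consistent with an RTO limited to the detection timeout. A plain modulo rule may also move clients that were served by a healthy node; it is used here only for brevity.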

RPO with the example of a SafeKit mirror cluster

RPO of the SafeKit mirror cluster is 0 as the replication is synchronous and real-time.

Be careful: with asynchronous replication, RPO is not 0, and data is lost when the application restarts on the secondary server after a failure.
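The difference between synchronous and asynchronous replication can be shown with a small simulation. The `ReplicatedStore` class below is a hypothetical model written for this article, not SafeKit code: in synchronous mode a write is acknowledged only after the secondary has it, while in asynchronous mode it is acknowledged first and shipped later.

```python
class ReplicatedStore:
    """Toy model of a primary server replicating writes to a secondary."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = []
        self.secondary = []
        self.pending = []  # acknowledged writes not yet shipped (async only)

    def write(self, record: str) -> None:
        """Store a record; returning means the write is acknowledged."""
        self.primary.append(record)
        if self.synchronous:
            self.secondary.append(record)  # acknowledged after the secondary copy
        else:
            self.pending.append(record)    # acknowledged before replication

    def flush(self) -> None:
        """Background replication catches up (no-op when synchronous)."""
        self.secondary.extend(self.pending)
        self.pending.clear()

    def fail_over(self) -> list:
        """The primary is lost: return acknowledged writes missing on the secondary."""
        return [r for r in self.primary if r not in self.secondary]


def run(synchronous: bool) -> list:
    store = ReplicatedStore(synchronous)
    for i in range(1, 4):
        store.write(f"order-{i}")
    store.flush()
    store.write("order-4")
    store.write("order-5")
    return store.fail_over()  # the primary crashes before the next flush


print(run(synchronous=True))   # []
print(run(synchronous=False))  # ['order-4', 'order-5']
```

In the synchronous run nothing is missing on the secondary, which corresponds to an RPO of 0. In the asynchronous run the two writes acknowledged since the last flush are lost.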

RPO with the example of a SafeKit farm cluster

Not applicable. A farm cluster does not replicate any data.

SafeKit: an ideal solution for a partner application

This platform-agnostic solution is ideal for a partner who has a critical application and wants to offer many customers a redundancy and high availability option that is easy to deploy.

Our partners also recognize this clustering solution as the simplest to implement.

What are the advantages of a mirror cluster?

  • Low Complexity
  • Plug&Play deployment with no specific skills
  • Suitable for large deployments in many sites (very simple to deploy)
  • 2 physical or virtual nodes
  • No shared storage requirement
  • No Domain Controller requirement
  • Same solution on Windows and Linux
  • Supports Windows Server and Client OS editions
  • Well documented API and support
  • Synchronous data replication (no data loss in case of failure)
  • Replicated directories can be in the system disk
  • Supports multiple heartbeats and virtual IP addresses
  • Offers configurable software, hardware and network checkers
  • For the split brain problem and the quorum, does not require a special disk or a third machine or a dedicated link between both servers
  • Automatic failover of application with a recovery time in the order of one minute
  • Automatic failback when a server comes back after a failure (no manual operation)
  • A very simple console to deploy the solution and to maintain it afterwards for the end customer
  • Supports hardware and environment failures (20% of causes of unavailability), including the complete failure of a computer room with 2 nodes in two remote sites
  • Supports software failures (40% of causes of unavailability): software bug, regression on software update (N and N+1 versions can coexist)
  • Supports human errors (40% of causes of unavailability): the simplicity of use helps avoid administration errors on the critical application

What are the advantages of a farm cluster?

  • Low Complexity
  • Plug&Play deployment with no specific skills
  • Suitable for large deployments in many sites (very simple to deploy)
  • 2 physical or virtual nodes or more
  • No network load balancers requirement
  • No proxy server requirement (above the farm cluster)
  • No Domain Controller requirement
  • No restriction in VMware due to multicast or unicast address
  • Same solution on Windows and Linux
  • Supports Windows Server and Client OS editions
  • Well documented API and support
  • Supports multiple monitoring channels on multiple networks for server failure detection
  • Supports multiple virtual IP addresses
  • Offers configurable software, hardware and network checkers
  • Offers the mirror cluster with synchronous real-time replication and failover to implement a farm+mirror 3-tiers architecture
  • Automatic failover with a recovery time in the order of a few seconds
  • Automatic failback when a server comes back after a failure (no manual operation)
  • A very simple console to deploy the solution and to maintain it afterwards for the end customer
  • Supports hardware and environment failures (20% of causes of unavailability), including the complete failure of a computer room with 2 nodes in two remote sites
  • Supports software failures (40% of causes of unavailability): software bug, regression on software update (N and N+1 versions can coexist)
  • Supports human errors (40% of causes of unavailability): the simplicity of use helps avoid administration errors on the critical application

SafeKit Modules for Plug&Play Redundancy and High Availability Solutions

Network load balancing and failover

Windows farm and Linux farm modules:

  • Generic Windows farm, Generic Linux farm
  • Microsoft IIS (Windows farm only)
  • NGINX
  • Apache
  • Amazon AWS farm
  • Microsoft Azure farm
  • Google GCP farm
  • Other cloud

Advanced clustering architectures

Several modules can be deployed on the same cluster, so advanced clustering architectures can be implemented.

SafeKit High Availability Differentiators against Competition

Demonstrations of Redundancy and High Availability Solutions

SafeKit Webinar

This webinar presents Evidian SafeKit in 2 minutes.

In this webinar, you will understand SafeKit mirror and farm clusters.

Microsoft SQL Server Cluster

This video shows a mirror module configuration with synchronous real-time replication and failover.

The file replication and the failover are configured for Microsoft SQL Server, but they work in the same manner for other databases.

Apache Cluster

This video shows a farm module configuration with load balancing and failover.

The load balancing and the failover are configured for Apache, but they work in the same manner for other web services.

Hyper-V Cluster

This video shows a Hyper-V cluster with full replication of virtual machines.

Virtual machines can run on both Hyper-V servers and they are restarted in case of failure.

SafeKit Training

Introduction

  1. Overview / pptx

    • Features
    • Architectures
    • Distinctive advantages
  2. Competition / pptx

    • Hardware vs software cluster
    • Synchronous vs asynchronous replication
    • File vs disk replication
    • High availability vs fault tolerance
    • Hardware vs software load balancing
    • Virtual machine vs application HA

Installation, Console, CLI

  1. Install and setup / pptx

    • Package installation
    • Nodes setup
    • Cluster configuration
    • Upgrade
  2. Web console / pptx

    • Cluster configuration
    • Configuration tab
    • Control tab
    • Monitor tab
    • Advanced Configuration tab
  3. Command line / pptx

    • Silent installation
    • Cluster administration
    • Module administration
    • Command line interface

Advanced configuration

  1. Mirror module / pptx

    • userconfig.xml + restart scripts
    • Heartbeat (<heartbeat>)
    • Virtual IP address (<vip>)
    • Real-time file replication (<rfs>)
  2. Farm module / pptx

    • userconfig.xml + restart scripts
    • Farm configuration (<farm>)
    • Virtual IP address (<vip>)
  3. Checkers / pptx

    • Failover machine (<failover>)
    • Process monitoring (<errd>)
    • Network and duplicate IP checkers
    • Custom checker (<custom>)
    • Split brain checker (<splitbrain>)
    • TCP, ping, module checkers

Support

  1. Support tools / pptx

    • Analyze snapshots
  2. Evidian support / pptx

    • Get permanent license key
    • Register on support.evidian.com
    • Call desk

Documentation

  1. Technical documentation

  2. Presales documentation