eviden-logo

Evidian > Products > High Availability Software - Zero Extra Hardware > SafeKit User's Guide 8.2

SafeKit User's Guide

SafeKit 8.2

Evidian SafeKit User's Guide 8.2

Overview

Subject

This document covers all the phases of the SafeKit implementation: architecture, installation, tests, administration & troubleshooting, support, and command line interface.

Intended
Readers

Architectures

Technical overview

Installation

Installation

Console

The SafeKit web console

Securing the SafeKit web service

Advanced configuration

Cluster.xml for the SafeKit cluster configuration

Userconfig.xml for a module configuration

Scripts for a module configuration

Examples of module configurations

Administration

Mirror module administration

Farm module administration

Command line interface

Advanced administration

Support

Tests

Troubleshooting

Access to Evidian support

Log Messages Index

Other

Table of Contents

Third-Party Software

Release

SafeKit 8.2

Supported OS

Windows and Linux; for a detailed list of supported OS, see here

Web Sites

Evidian marketing site:   http://www.evidian.com/safekit

Evidian support site:      https://support.evidian.com/safekit

Ref

39 A2 38MC 04

If you have any comments or questions related to this documentation, please contact us at https://www.evidian.com/company/contact-evidian/

Copyright © Evidian, 2024

The trademarks mentioned in this document are the propriety of their respective owners.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical or otherwise without the prior written permission of the publisher.

Evidian disclaims the implied warranties of merchantability and fitness for a particular purpose and makes no express warranties except as may be stated in its written agreement with and for its customer. In no event is Evidian liable to anyone for any indirect, special, or consequential damages.

The information and specifications in this document are subject to change without notice. Consult your Evidian Marketing Representative for product or service availability.



 

 

 

 

 

Table of Contents

Overview.. 3

Table of Contents. 5

1.... Technical overview.. 15

1.1..... Generalities, solutions, architectures 15

1.1.1          Introduction to SafeKit 15

1.1.2          SafeKit solutions. 15

1.1.3          SafeKit architectures. 16

1.1.4          SafeKit cluster definition. 16

1.1.5          SafeKit module definition. 17

1.1.6          SafeKit limitations. 17

1.2..... The SafeKit mirror cluster 18

1.2.1          Real time file replication and application failover 18

1.2.2          Step 1. Normal operation. 19

1.2.3          Step 2. Failover 19

1.2.4          Step 3. Failback and automatic resynchronization. 19

1.2.5          Step 4. Return to normal operation. 20

1.2.6          Synchronous replication versus asynchronous replication. 20

1.2.7          Behavior in case of network isolation. 20

1.2.8          3-node replication. 21

1.2.9          SafeKit on a single node to protect against software failures 21

1.3..... The SafeKit farm cluster 22

1.3.1          Network load balancing and application failover 22

1.3.2          Principle of a virtual IP address with network load balancing. 22

1.3.3          Load balancing for stateful or stateless web services 22

1.3.4          Chain high availability solution in a farm.. 23

1.4..... Clusters running several modules 23

1.4.1          The SafeKit farm+mirror cluster 23

1.4.2          The SafeKit active/active cluster with replication. 23

1.4.3          The SafeKit N-1 cluster 24

1.5..... The SafeKit Hyper-V or KVM cluster 25

1.5.1          Load balancing, replication, failover of entire virtual machines. 25

1.6..... SafeKit clusters in the cloud. 25

1.6.1          Mirror cluster in Azure, AWS and GCP. 25

1.6.2          Farm cluster in Azure, AWS and GCP. 26

2.... Installation. 29

2.1..... SafeKit install 29

2.1.1          Download the package. 29

2.1.2          Installation directories and disk space provisioning. 29

2.1.3          SafeKit install procedure. 30

2.1.4          Use the SafeKit web console or command line interface. 32

2.1.5          SafeKit license keys. 33

2.1.6          System specific procedures and characteristics 34

2.2..... Mirror installation recommendation. 35

2.2.1          Hardware prerequisites. 35

2.2.2          Network prerequisites. 35

2.2.3          Application prerequisites. 35

2.2.4          File replication prerequisites. 35

2.3..... Farm installation recommendation. 35

2.3.1          Hardware prerequisites. 36

2.3.2          Network prerequisites. 36

2.3.3          Application prerequisites. 36

2.4..... SafeKit upgrade. 36

2.4.1          Prepare the upgrade. 36

2.4.2          Uninstall procedure. 36

2.4.3          Reinstall and postinstall procedure. 37

2.5..... SafeKit full uninstall 39

2.5.1          Uninstall on Windows as administrator 39

2.5.2          Uninstall on Linux as root 39

2.6..... SafeKit documentation. 39

3.... The SafeKit web console. 41

3.1..... Start the web console. 41

3.1.1          Start a web browser 41

3.1.2          Connect to a SafeKit server 42

3.2..... Configure the cluster 43

3.2.1          Cluster configuration wizard. 43

3.2.2          Cluster configuration home page. 46

3.3..... Configure a module. 47

3.3.1          Select the new module to configure. 48

3.3.2          Module configuration wizard. 49

3.3.3          Modules configuration home page. 54

3.3.4          Edit the module configuration locally and then apply it 56

3.4..... Monitor a module. 57

3.4.1          Monitoring home page. 57

3.4.2          Module state. 58

3.4.3          Module control menus 60

3.4.4          Module details. 63

3.4.5          Module states timeline. 68

3.5..... Snapshots or logs of module for debug and support 69

3.6..... Secure access to the web console. 70

4.... Tests. 73

4.1..... Installation and tests after boot 73

4.1.1          Test package installation. 73

4.1.2          Test license and version. 74

4.1.3          Test SafeKit services and modules after boot 74

4.1.4          Test start of SafeKit web console. 76

4.2..... Tests of a mirror module. 77

4.2.1          Test first start of a mirror module on 2 servers STOP (NotReady). 77

4.2.2          Test start of a mirror module on 2 servers STOP (NotReady). 77

4.2.3          Test stop of a mirror module on the server PRIM (Ready). 77

4.2.4          Test start of a mirror module on the server STOP (NotReady). 78

4.2.5          Test restart of a mirror module on the server PRIM (Ready). 78

4.2.6          Test virtual IP address of a mirror module. 78

4.2.7          Test file replication of a mirror module. 79

4.2.8          Test shutdown of the server PRIM (Ready). 80

4.2.9          Test power-off of the server PRIM (Ready). 81

4.2.10       Test split-brain with a mirror module. 81

4.2.11       Continue your mirror module tests with checkers 82

4.3..... Tests of a farm module. 83

4.3.1          Test start of a farm module on all servers STOP (NotReady). 83

4.3.2          Test stop of a farm module on one server UP (Ready). 83

4.3.3          Test restart of a farm module on one server UP(Ready). 83

4.3.4          Test virtual IP address of a farm module. 83

4.3.5          Test TCP load balancing on a virtual IP address. 85

4.3.6          Test split-brain with a farm module. 86

4.3.7          Test compatibility of the network with invisible MAC address (vmac_invisible) 87

4.3.8          Test shutdown of a server UP (Ready). 88

4.3.9          Test power-off of a server UP (Ready). 88

4.3.10       Continue your farm module tests with checkers. 89

4.4..... Tests of checkers common to mirror and farm.. 89

4.4.1          Test <errd> checker with action restart or stopstart 89

4.4.2          Test <tcp> checker with action restart or stopstart 90

4.4.3          Test <tcp> checker with action wait 91

4.4.4          Test <interface check="on"> with action wait 91

4.4.5          Test <ping> checker with action wait 92

4.4.6          Test <module> checker with action wait 93

4.4.7          Test <custom> checker with action wait 94

4.4.8          Test <custom> checker with action restart or stopstart 95

5.... Mirror module administration. 99

5.1..... Operating mode of a mirror module. 99

5.2..... State automaton of a mirror module (STOP, WAIT, ALONE, PRIM, SECOND - NotReady, Transient, Ready) 101

5.3..... First start-up of a mirror module (safekit prim command) 102

5.4..... Different reintegration cases (use of bitmaps) 103

5.5..... Start-up of a mirror module with the up-to-date data  STOP (NotReady) - WAIT (NotReady). 104

5.6..... Degraded replication mode (ALONE (Ready) degraded) 105

5.7..... Automatic or manual failover 106

5.8..... Default primary server (automatic swap after reintegration) 108

5.9..... Prim command fails: why? (safekit primforce command) 109

6.... Farm module administration. 111

6.1..... Operating mode of a farm module. 111

6.2..... State automaton of a farm module (STOP, WAIT, UP - NotReady, Transient, Ready) 112

6.3..... Start-up of a farm module. 113

7.... Troubleshooting. 115

7.1..... Connection issues with the web console. 115

7.1.1          Browser check. 115

7.1.2          Browser state clear 116

7.1.3          Server check. 116

7.2..... Connection issues with the HTTPS web console. 116

7.2.1          Check server certificates. 117

7.2.2          Check certificates installed in SafeKit 118

7.2.3          Revert to HTTP configuration. 118

7.3..... How to read logs and resources of the module? 119

7.4..... How to read the commands log of the server? 119

7.5..... Stable module  (Ready) and (Ready). 120

7.6..... Degraded module (Ready)and /(NotReady). 120

7.7..... Out of service module /(NotReady) and /(NotReady). 120

7.8..... Module  STOP (NotReady): start the module. 120

7.9..... Module WAIT (NotReady): repair the resource="down" 121

7.10... Module oscillating from  (Ready) to  (Transient). 121

7.11... Message on stop after maxloop. 122

7.12... Module  (Ready) but non-operational application. 123

7.13... Mirror module ALONE (Ready) - WAIT/STOP (NotReady) 124

7.14... Farm module UP(Ready)but problem of load balancing in a farm.. 125

7.14.1       Reported network load share are not coherent 125

7.14.2       virtual IP address does not respond properly. 125

7.15... Problem after Boot 125

7.16... Analysis from snapshots of the module. 126

7.16.1       Module configuration files 126

7.16.2       Module dump files. 127

7.17... Problem with the size of SafeKit databases 129

7.18... Problem for retrieving the certification authority certificate from an external PKI 130

7.18.1       Export CA certificate(s) from public certificates 130

7.19... Still in Trouble. 133

8.... Access to Evidian support 135

8.1..... Home page of support site. 135

8.2..... Permanent license keys 136

8.3..... Create an account 137

8.4..... Access to your account 137

8.5..... Call desk to open a trouble ticket 138

8.5.1          Call desk operations 138

8.5.2          Create a call 138

8.5.3          Attach the snapshots. 139

8.5.4          Answers to a call and exchange with support 140

8.6..... Download and upload area. 141

8.6.1          Two areas of download and upload. 141

8.6.2          Product download area. 141

8.6.3          Private upload area. 142

8.7..... Knowledge base. 142

9.... Command line interface. 143

9.1..... Commands to control SafeKit services 143

9.1.1          safeadmin service. 143

9.1.2          safewebserver service. 144

9.1.3          SNMP service. 144

9.2..... Command lines to configure and monitor the cluster 145

9.3..... Command lines to control modules 147

9.4..... Command lines to monitor modules 149

9.5..... Command lines to configure modules 150

9.6..... Command lines for support 152

9.7..... Command lines distributed across multiple SafeKit servers 153

9.8..... Examples 155

9.8.1          Local and distributed command. 155

9.8.2          Cluster configuration with command line. 155

9.8.3          Module configuration with command line. 156

9.8.4          Module snapshot with command line. 156

10.. Advanced administration. 157

10.1... SafeKit environment variables and directories 157

10.1.1       Global 157

10.1.2       Module. 157

10.2... SafeKit services and daemons 159

10.2.1       SafeKit services 159

10.2.2       SafeKit daemons per module. 160

10.3... Firewall settings 160

10.3.1       Firewall settings in Linux. 161

10.3.2       Firewall settings in Windows. 161

10.3.3       Other firewalls. 162

10.4... Boot and shutdown setup in Windows 165

10.4.1       Automatic procedure. 165

10.4.2       Manual procedure. 165

10.5... Antivirus settings 166

10.6... Securing module internal communications 166

10.6.1       Configuration with the SafeKit Web console. 167

10.6.2       Configuration with the Command Line Interface. 167

10.6.3       Advanced configuration. 168

10.7... SafeKit web service configuration. 169

10.7.1       Configuration files 169

10.7.2       Connection ports configuration. 171

10.7.3       HTTP/HTTPS and user authentication configuration. 171

10.7.4       SafeKit API 171

10.8... Mail notification. 171

10.9... SNMP monitoring. 172

10.9.1       SNMP monitoring in Windows 172

10.9.2       SNMP monitoring in Linux. 173

10.9.3       The SafeKit MIB. 173

10.10. Commands log of the SafeKit server 174

10.11. SafeKit log messages in system journal 174

11.. Securing the SafeKit web service. 177

11.1... Overview. 177

11.1.1       Default setup. 178

11.1.2       Predefined setups 178

11.2... HTTP setup. 179

11.2.1       Default setup. 179

11.2.2       Unsecure setup based on identical role for all 181

11.3... HTTPS setup. 182

11.3.1       HTTPS setup using the SafeKit PKI 183

11.3.2       HTTPS setup using an external PKI 190

11.4... User authentication setup. 194

11.4.1       File-based authentication setup. 194

11.4.2       LDAP/AD authentication setup. 197

11.4.3       OpenID authentication setup. 199

12.. Cluster.xml for the SafeKit cluster configuration. 203

12.1... Cluster.xml file. 203

12.1.1       Cluster.xml example. 203

12.1.2       Cluster.xml syntax. 204

12.1.3       <lans>, <lan>, <node> attributes. 204

12.2... SafeKit cluster Configuration. 206

12.2.1       Configuration with the SafeKit web console. 206

12.2.2       Configuration with command line. 207

12.2.3       Configuration changes. 207

13.. Userconfig.xml for a module configuration. 209

13.1... Macro definition - <macro>. 210

13.1.1       <macro> example. 210

13.1.2       <macro> syntax. 210

13.1.3       <macro> attributes 210

13.2... Farm or mirror module - <service>. 211

13.2.1       <service> example. 211

13.2.2       <service> syntax. 211

13.2.3       <service> attributes 211

13.3... Heartbeats  - <heart>, <heartbeat >. 214

13.3.1       <heart> example. 214

13.3.2       <heart> syntax. 214

13.3.3       <heart>, <heartbeat > attributes. 215

13.4... Farm topology - <farm>, <lan>. 216

13.4.1       <farm> example. 216

13.4.2       <farm> syntax. 216

13.4.3       <farm>, <lan> attributes 217

13.5... Virtual IP address - <vip>. 217

13.5.1       <vip> example in a mirror module. 217

13.5.2       <vip> example in a farm module. 218

13.5.3       Alternative to <vip> for servers in different networks. 218

13.5.4       <vip> syntax. 219

13.5.5       <vip><interface_list>, <interface>, <virtual_interface>, <real_interface>, <virtual_addr> attributes. 220

13.5.6       <loadbalancing_list>, <group>, <cluster>, <host> attributes. 223

13.5.7       <vip> Load balancing description. 225

13.6... File replication - <rfs>, <replicated>. 226

13.6.1       <rfs> example. 226

13.6.2       <rfs> syntax. 227

13.6.3       <rfs>, <replicated> attributes 227

13.6.4       <rfs> description. 235

13.7... Enable module scripts - <user>, <var>. 244

13.7.1       <user> example. 244

13.7.2       <user> syntax. 244

13.7.3       <user>, <var> attributes 244

13.8... Virtual hostname - <vhost>, <virtualhostname>. 245

13.8.1       <vhost> example. 245

13.8.2       <vhost> syntax. 245

13.8.3       <vhost>, <virtualhostname> attributes 245

13.8.4       <vhost> description. 246

13.9... Process or service monitoring - <errd>, <proc>. 246

13.9.1       <errd> example. 246

13.9.2       <errd> syntax. 247

13.9.3       <errd>, <proc> attributes 247

13.9.4       <errd> commands 251

13.10. Checkers - <check>. 253

13.10.1     <check> example. 253

13.10.2     <check> syntax. 253

13.10.3     <checker> description. 254

13.11. TCP checker - <tcp>. 257

13.11.1     <tcp> example. 257

13.11.2     <tcp> syntax. 257

13.11.3     <tcp> attributes. 258

13.12. Ping checker - <ping>. 259

13.12.1     <ping> example. 259

13.12.2     <ping> syntax. 260

13.12.3     <ping> attributes. 260

13.13. Interface checker - <intf>. 262

13.13.1     <intf> example. 262

13.13.2     <intf> syntax. 262

13.13.3     <intf> attributes. 262

13.14. IP checker - <ip>. 263

13.14.1     <ip> example. 263

13.14.2     <ip> syntax. 263

13.14.3     <ip> attributes. 264

13.15. Custom checker - <custom>. 264

13.15.1     <custom> example. 264

13.15.2     <custom> syntax. 265

13.15.3     <custom> attributes. 265

13.16. Module checker - <module>. 267

13.16.1     <module> example. 267

13.16.2     <module> syntax. 268

13.16.3     <module> attributes 268

13.17. Splitbrain checker - <splitbrain>. 269

13.17.1     <splitbrain> example. 270

13.17.2     <splitbrain> syntax. 270

13.17.3     <splitbrain> attributes 270

13.18. Failover machine - <failover>. 271

13.18.1     <failover> example. 271

13.18.2     <failover> syntax. 272

13.18.3     <failover> attributes. 272

13.18.4     <failover> description. 273

14.. Scripts for a module configuration. 277

14.1... List of scripts 277

14.1.1       Start/stop scripts. 277

14.1.2       Other scripts. 278

14.2... Variables and arguments passed to scripts 278

14.3... Scripts output 279

14.3.1       Output into script log. 279

14.3.2       Output into module log. 279

14.4... Scripts execution automaton. 280

14.5... SafeKit special commands for scripts 281

14.5.1       Commands for Windows. 282

14.5.2       Commands for Linux. 282

14.5.3       Commands for Windows and Linux. 283

15.. Examples of module configurations. 285

15.1... Mirror module example with mirror.safe. 285

15.1.1       Cluster configuration with two networks. 286

15.1.2       Mirror module configurations. 287

15.1.3       Mirror Module scripts. 288

15.2... Farm module example with farm.safe. 290

15.2.1       Cluster configuration with three nodes 290

15.2.2       Farm module configurations 291

15.2.3       Farm module scripts 295

15.3... Macro and script variables example with hyperv.safe. 296

15.3.1       Module configuration with macros and var 296

15.3.2       Module scripts with var 297

15.4... Process monitoring example with softerrd.safe. 298

15.4.1       Module configuration with process monitoring. 298

15.4.2       Advanced configuration of module scripts. 299

15.5... TCP checker example. 301

15.6... Ping checker example. 302

15.7... Custom checker example with customchecker.safe. 304

15.7.1       Module configuration with custom checker 304

15.7.2       Advanced configuration of module checker script 306

15.8... Split-brain checker example. 307

15.9... Module checker examples 308

15.9.1       Example of a farm module depending on a mirror module. 308

15.9.2       Example with leader.safe and follower.safe. 310

15.10. Interface checker example. 310

15.11. IP checker example. 311

15.12. Mail notification example with notification.safe. 312

15.12.1     Notification on the start of the module. 312

15.12.2     Notification on the stop of the module. 313

15.12.3     Notification on module state changes. 314

15.13. Virtual hostname example with vhost.safe. 316

15.13.1     Module configuration with a virtual hostname. 316

15.13.2     Module scripts with a virtual hostname. 317

16.. SafeKit cluster in the cloud. 321

16.1... SafeKit cluster in Amazon AWS. 321

16.1.1       Mirror cluster in AWS. 322

16.1.2       Farm cluster in AWS. 323

16.2... SafeKit cluster in Microsoft Azure. 324

16.2.1       Mirror cluster in Azure. 325

16.2.2       Farm cluster in Azure. 327

16.3... SafeKit cluster in Google GCP. 328

16.3.1       Mirror cluster in GCP. 329

16.3.2       Farm cluster in GCP. 330

17.. Third-Party Software. 333

Log Messages Index. 337

Index. 341


1. Technical overview

*       Section 1.1Generalities, solutions, architectures

*       Section 1.2The SafeKit mirror cluster

*       Section 1.3The SafeKit farm cluster

*       Section 1.4Clusters running several modules

*       Section 1.5The SafeKit Hyper-V or KVM cluster

*       Section 1.6SafeKit clusters in the cloud

1.1             Generalities, solutions, architectures

1.1.1         Introduction to SafeKit

SafeKit, developed by Evidian, is a high availability software solution designed to ensure 24/7 uptime for business-critical applications. It supports both Windows and Linux platforms and eliminates the need for shared disks, enterprise editions of databases, or advanced technical skills, making it a cost-effective alternative to traditional clustering solutions.

Key Features:

·         Real-Time Synchronous Replication: Continuous data replication across nodes to prevent data loss.

·         Automatic Failover and Failback: Seamless switch to a secondary system during failures and reversion once the original system is operational.

·         Load Balancing: Optimizes resource use by distributing workloads across multiple servers.

·         Platform Agnostic: Compatible with physical machines, virtual machines, and public cloud infrastructures.

Key Advantages:

·         Zero Specific Skills: No specialized IT skills required for deployment.

·         Zero Hardware Overhead: No need for specific hardware like shared disks or load balancers.

·         Zero Software Overhead: Works with standard editions of Windows and Linux.

Key Solutions:

·         Application Level: High availability with restart scripts per application.

·         Hypervisor Level: High availability without restart scripts per application.

·         Container or Pod Level: High availability without restart scripts per application.

SafeKit is ideal for software publishers, resellers, and distributors looking to enhance their products with high availability features. It also offers an OEM opportunity for partners to integrate SafeKit into their own applications.

1.1.2         SafeKit solutions

 See here for a list of SafeKit solutions.

Application-level HA

In this type of solution, only application data is replicated. And only the application is restarted in case of a failure.

Integration tasks must be implemented: write restart scripts for the application, define folders for replication, configure software checkers, define a virtual IP address.

This solution is platform-independent and works with applications inside physical machines, virtual machines, in the cloud. Any hypervisor is supported (e.g., VMware, Hyper-V, etc.).

Virtual machine-level HA

In this type of solution, the entire virtual machine (VM) is replicated, including the application and OS. The complete virtual machine is restarted in case of a failure.

The advantage is that there are no restart scripts to write per application, and no virtual IP address to set. If you don’t know how an application works, this is the simplest solution.

This solution works with Windows/Hyper-V and Linux/KVM but not with VMware. This is an active/active solution with multiple virtual machines replicated and restarted between the two nodes.

Note: Applications running in containers or pods also do not require dedicated restart scripts. SafeKit provides generic restarts and real-time replication of persistent data for these environments (see the list of SafeKit solutions).

1.1.3         SafeKit architectures

SafeKit offers two basic high availability clusters for Windows and Linux:

·         the mirror cluster, with real-time file replication and failover, built by deploying a mirror module on 2 servers,

·         the farm cluster, with network load balancing and failover, built by deploying a farm module on 2 servers or more.

Several modules can be deployed on the same cluster. Thus, advanced clustering architectures can be implemented:

·         the farm+mirror cluster built by deploying a farm module and a mirror module on the same cluster,

·         the active/active cluster built by deploying several mirror modules on 2 servers,

·         the N-1 cluster built by deploying N mirror module on N+1 servers.

Specific clusters are also interesting to consider with SafeKit:

·         the Hyper-V or KVM cluster with real-time replication and failover of entire virtual machines between 2 active hypervisors,

·         mirror or farm clusters in the Cloud.

1.1.4         SafeKit cluster definition

A SafeKit cluster is a set of servers where SafeKit is installed and running.

All servers within a given SafeKit cluster share the same cluster configuration, which includes the list of servers and networks used. These servers communicate with each other to maintain a global view of the configurations of the SafeKit modules. A server cannot belong to multiple SafeKit clusters simultaneously.

Configuring the cluster is a prerequisite before the installation and configuration of SafeKit modules. This can be done using the SafeKit web console or through online commands.

1.1.5         SafeKit module definition

A module is a customization of SafeKit for a specific application or hypervisor. See here for a list of modules and their quick installation guides.

Types of Modules

·         Generic farm and mirror modules for new applications,

·         Preconfigured application modules for databases, web servers…,

·         Hypervisors modules (hyperv.safe, kvm.safe) for real-time replication and restart of entire virtual machines.

Module Contents

In practice, a module is a “.safe” file (zip type) that includes:

·         The configuration file userconfig.xml, which contains:

o    The virtual IP address (not necessary for a hypervisor module),

o    File directories to replicate in real time (for a mirror module),

o    Network load balancing criteria (for a farm module),

o    Configuration of software and hardware failures detectors,

·         The scripts to start and stop an application or a virtual machine.

Deployment Steps

Once a module is configured and tested, deployment requires no specific IT skills:

·         Install the application or the hypervisor on 2 standard servers,

·         Install the SafeKit software on both servers,

·         Install the module on both servers.

Configuring, deploying, and monitoring modules can be done using the SafeKit web console or through online commands.

1.1.6         SafeKit limitations

Typical usage with SafeKit

Replication of a few Tera-bytes

Replication < 1 million files

Replication <= 32 virtual machines

1 or 10 G/s LAN or extended LAN

Limitation

Resynchronization after a failure takes too long.

On a 1 Gb/s network, 3 Hours for 1 Tera-bytes.

On a 10 Gb/s network, 1 hour or less for 1 Tera-bytes (depends on write disk IO performances).

Resynchronization after a failure takes too long.

Time to check each file between both nodes.

In full virtual machine replication mode, and with one virtual machine in a mirror module, the limit is 32 modules per cluster.

Failover of the virtual IP address is built-in when in the same subnet.

A LAN provides adequate bandwidth for resynchronization.

A LAN provides adequate latency (typically a round-trip of less than 2ms) for synchronous replication.

Alternative

Use shared storage.

Put files in a virtual hard disk replicated by SafeKit.

Use another HA solution with shared storage.

Use backup solutions with asynchronous replication.

1.2             The SafeKit mirror cluster

1.2.1         Real time file replication and application failover

The mirror cluster is an active-passive high-availability solution, built by deploying a mirror module within a two-node cluster. The application runs on a primary server and is restarted automatically on a secondary server if the primary server fails.

With its real-time file replication function, this architecture is particularly suited to providing high availability for back-end applications with critical data to protect against failure.

Microsoft SQL Server, PostgreSQL, MariaDB, Oracle, Milestone, Nedap, Docker, Podman, Hyper-V, and KVM solutions are examples of mirror modules. You can create your own mirror module for your application based on the generic mirror.safe module. See here for a list of modules.

Note that Hyper-V and KVM mirror modules replicate entire virtual machines, including applications and operating systems. They do not require a virtual IP, as the VM restart handles the failover of the VM physical IP address.

The mirror cluster works as follows.

1.2.2         Step 1. Normal operation

Server 1 (PRIM) runs the application.

SafeKit replicates files opened by the application. Only changes made by the application in the files are replicated in real time across the network, thus limiting traffic.

For replication, only names of file directories to replicate are configured in SafeKit. There are no pre-requisites on disk organization for the two servers. Directories to replicate may be located in the system disk.

1.2.3         Step 2. Failover

When Server 1 fails, Server 2 takes over. SafeKit switches the virtual IP address and restarts the application automatically on Server 2. The application finds the files replicated by SafeKit up-to-date on Server 2, thanks to the synchronous replication between Server 1 and Server 2. The application continues to run on Server 2 by locally modifying its files that are no longer replicated to Server 1.

The switch-over time is equal to the fault-detection time (set to 30 seconds by default) plus the application start-up time. Unlike disk replication solutions, there is no delay for remounting file systems and running recovery procedures.

1.2.4         Step 3. Failback and automatic resynchronization

Failback involves restarting Server 1 after fixing the problem that caused it to fail. SafeKit automatically resynchronizes the files, updating only the files modified on Server 2 while Server 1 was halted.

This automatic reintegration takes place without stopping the application, which can continue running on Server 2. This is a major feature that differentiates SafeKit from other solutions, which require manual operations to reintegrate Server 1 in the cluster.

1.2.5         Step 4. Return to normal operation

After reintegration, the files are once again in mirror mode, as in step 1. The system is back in high-availability mode, with the application running on Server 2 and SafeKit replicating file updates to Server 1.

If administrators want the application to run on Server 1, they can execute a ‘Stop/Start’ command on the PRIM server either through the console at the appropriate time or automatically by configuring a default primary server.

1.2.6         Synchronous replication versus asynchronous replication

There is a significant difference between synchronous replication, as offered by the SafeKit mirror solution, and asynchronous replication traditionally offered by other file replication solutions.

With synchronous replication, when a disk IO is performed by the application on the primary server inside a replicated file, SafeKit waits for the IO acknowledgement from the local disk and from the secondary server, before sending the IO acknowledgement to the application. This mechanism is essential for recovery of transactional applications.

The latency of a LAN (typically a round-trip of less than 2ms) between the servers is required to implement synchronous data replication, possibly with an extended LAN in two geographically remote computer rooms.

With asynchronous replication implemented by other solutions, the IOs are placed in a log on the primary server but the primary server does not wait for the IO acknowledgments of the secondary server. Thus, all data that has not been copied over the network to the second server is lost in the event of a failure of the first server.

In particular, a transactional application may lose committed data in the event of a failure. Asynchronous replication can be used for data replication over a low-speed WAN to back up data remotely, but it is not suitable for high availability with automatic failover.

SafeKit provides a semi-synchronous solution, implementing the asynchrony not on the primary server but on the secondary one. In this solution, SafeKit always waits for the acknowledgement of the two servers before sending the acknowledgement to the application. But on the secondary, there are 2 options asynchronous or synchronous. In the asynchronous case, the secondary sends the acknowledgement to the primary upon receipt of the IO and writes to disk after. In the synchronous case, the secondary writes the IO to disk and then sends the acknowledgement to the primary. The synchronous mode is required if we consider a simultaneous double power outage of two servers, with inability to restart the former primary server and requirement to re-start on the secondary.

1.2.7         Behavior in case of network isolation

A heartbeat is a mechanism for synchronizing two servers and detecting failures by exchanging data over a shared network. If one server loses all heartbeats, it assumes the other is down and runs the application ALONE.

SafeKit supports multiple heartbeats across shared networks. A dedicated network with a second heartbeat can prevent network isolation and also be used as the replication network.

Network Isolation:

·         Upon losing all heartbeats, both servers transition to the ALONE state, running the application independently.

·         After the isolation, one server stops and resynchronizes data from the other server.

·         The cluster returns to PRIM-SECOND state.

Splitbrain Checker:

·         Uses a witness IP (usually a router) to avoid double execution during isolation.

·         Only the server with witness access goes ALONE, the other waits.

·         After isolation, the WAIT server resynchronizes and becomes SECOND.

1.2.8         3-node replication

SafeKit only supports replication between two nodes. However, it is possible to implement 3-node replication by combining SafeKit with a backup solution.

An application is made highly available between 2 nodes thanks to SafeKit with its synchronous real-time replication (no data loss) and automatic failover. Additionally, a backup solution is implemented for asynchronous replication to a third node in a disaster recovery site. Since there is data loss with an asynchronous backup solution, the failover to the third node is manual and decided by an administrator.

Note that the real-time replication of SafeKit does not eliminate the need for a backup solution. For example, a ransomware attack encrypting replicated data on the primary server will also encrypt data on the secondary server in real-time with SafeKit. Only a backup solution with a retention policy can resolve a ransomware attack. The administrator must restore the backup from before the ransomware attack.

1.2.9         SafeKit on a single node to protect against software failures

You can configure a module in "light" mode, which corresponds to a module running on a single node without synchronizing with other nodes (unlike mirror or farm modules). A light module includes the start and stop of an application, as well as SafeKit checkers that detect software errors and perform automatic restarts on a single node.

The light module interfaces with the SafeKit console, allowing an administrator to view the status of the application module and manually trigger application restarts using a button-click interface.

There is no need to define a virtual IP address or replicated directories in a light module. Note that this can also serve as a first step before transitioning to a mirror module or a farm module.

1.3             The SafeKit farm cluster

1.3.1         Network load balancing and application failover

The farm cluster is an active-active high-availability solution, built by deploying a farm module within a cluster of two or more nodes. The farm cluster provides both network load balancing, through transparent distribution of network traffic, and software and hardware failover. This architecture offers a simple solution to support the increase in system load.

The same application runs on each server, and the load is balanced by the distribution of network activity on the different servers of the farm.

Farm clusters are suited to front-end applications like web services.

Apache, Microsoft IIS, NGINX solutions are examples of farm modules. You can write your own farm module for your application, based on the generic farm.safe module. See here for a list of modules.

1.3.2         Principle of a virtual IP address with network load balancing

The virtual IP address is configured locally on each server in the farm. Input traffic for this address is distributed among all servers by a filter within each server’s kernel.

The load balancing algorithm inside the filter is based on the identity of the client packets (client IP address, client TCP port). Depending on the identity of the client packet, only one filter on a server accepts the packet. Once a packet is accepted by the filter on a server, only the CPU and memory of that server are used by the application responding to the client’s request. The output messages are sent directly from the application server to the client.

If a server fails, the SafeKit heartbeat protocol in a farm reconfigures the filters to re-balance the traffic among the remaining available servers.

1.3.3         Load balancing for stateful or stateless web services

With a stateful server, session affinity is required. The same client must connect to the same server across multiple TCP sessions to retrieve its context. In this scenario, the SafeKit load balancing rule is configured on the client IP address. This ensures that the same client always connects to the same server for multiple TCP sessions, while different clients are distributed across various servers in the farm. This configuration is used when session affinity is required.

With a stateless server, there is no session affinity. The same client can connect to different servers in the farm across multiple TCP sessions, as no context is stored locally on a server from one session to another. In this case, the SafeKit load balancing rule is configured on the TCP client session identity. This configuration is optimal for distributing sessions between servers but requires a TCP service without session affinity.

1.3.4         Chain high availability solution in a farm

What is a chain HA solution (also known as a cascading HA solution)?

·         Multiple servers are linked in a sequence: If one server fails, the next one in the chain takes over.

·         Priority-based management: A single server, the one with the highest priority in the chain and which is available, manages all requests from clients.

·         Failover process: If the server with the highest priority fails, the next available server with the highest priority takes over.

·         Reintegration: When a server comes back online and has the highest priority, it resumes handling all client requests.

·         Quick recovery time: This solution has a quick recovery time, as the application is pre-started on all servers. The recovery time is essentially the time needed to reconfigure the priorities among the servers in the farm (a few seconds).

·         Replication limitations: This solution does not support real-time replication, which is limited to mirror architecture. However, a combined farm+mirror architecture is available.

To implement a chain high availability solution, SafeKit offers a "power" variable in the load balancing rules, which is set at the level of each server in the cluster. The power variable allows you to allocate more or less traffic to a server. When the power variable is set as a multiple of 64 between servers (e.g., 1, 64, 64*64, 64*64*64, ...), the chain high availability solution is implemented.

1.4             Clusters running several modules

1.4.1         The SafeKit farm+mirror cluster

Network load balancing, file replication and application failover

You can mix farm and mirror modules on the same cluster.

This option allows you to implement a multi-tier application architecture, such as apache_farm.safe (farm architecture with load balancing and failover) and postgresql.safe (mirror architecture with file replication and failover) on the same servers.

As a result, load balancing, file replication and failover are managed coherently on the same servers.

1.4.2         The SafeKit active/active cluster with replication

Crossed replication and mutual failover

In an active / active cluster with replication, there are two servers and two mirror modules in mutual failover (appli1.safe and appli2.safe). Each application server is backup of the other server.

If one application server fails, both applications will run on the same physical server. Once the failed server is restarted, its application will return to its default primary server.

A mutual failover cluster is more cost-effective than two separate mirror clusters, as it eliminates the need for backup servers that remain idle most of the time, waiting for a primary server to fail. However, in the event of a server failure, the remaining server must be capable of handling the combined workload of both applications.

Note that:

·         Both applications, Appli1 and Appli2, must be installed on each server to enable application failover.

·         This architecture is not limited to just two applications; N application modules can be deployed across two servers.

·         Each mirror module will have its own virtual IP address, its own replicated file directories, and its own restart scripts.

1.4.3         The SafeKit N-1 cluster

Replication and application failover from N servers to 1

In an N-1 cluster, N mirror application modules are deployed across N primary servers and a single backup server.

In the event of a failure, unlike in an active/active cluster, the backup server does not need to manage a double workload when a primary server fails. This assumes only one failure occurs at a time. While the solution can support multiple primary server failures simultaneously, in such cases, the single backup server will need to handle the combined workload of all the failed servers. In a N-1 cluster, there are N mirror application modules installed between N primary servers and one backup server.

Note that:

·         All applications (Appli1, Appli2, Appli3) must be installed on the single backup server to enable application failover.

·         Each mirror module will have its own virtual IP address, its own replicated file directories, and its own restart scripts.

1.5             The SafeKit Hyper-V or KVM cluster

1.5.1         Load balancing, replication, failover of entire virtual machines

The Hyper-V or KVM cluster is an example of an active-active cluster. Multiple applications can be hosted in various virtual machines, which are replicated and restarted by SafeKit. Each virtual machine is managed by SafeKit within its own mirror module.

The solution has the following features:

·         Real-time synchronous replication of entire virtual machines with failover capabilities.

·         A centralized, user-friendly SafeKit console for managing all VMs, including the ability to migrate VMs between servers to optimize load distribution.

·         A checker for each VM to detect if it has locked up, crashed, or ceased to function, and to restart the VM if necessary.

·         An attractive solution that requires no application integration.

·         A robust architecture suitable for high-availability solutions that cannot be integrated at the application level.

A free trial of the Hyper-V cluster with SafeKit is available here.

A free trial of the KVM cluster with SafeKit is available here.

1.6             SafeKit clusters in the cloud

For a full description, refer to section 16.

1.6.1         Mirror cluster in Azure, AWS and GCP

SafeKit delivers high-availability clusters with real-time replication and failover in Azure, AWS, and GCP through the deployment of a mirror module.

The mirror solution in the cloud is similar to the on-premise one, except that the virtual IP address must be configured at the load balancer level:

·         Virtual machines are placed in different availability zones, which are in different subnets.

·         The critical application runs on the primary server.

·         Users connect to a primary/secondary virtual IP address managed by the cloud load balancer.

·         SafeKit provides a health check configured in the load balancer. On the primary server, the health check returns OK to the load balancer, while it returns nothing on the secondary server. Thus, all requests to the virtual IP address are routed to the primary server.

·         If the primary server fails or is stopped, the secondary server automatically becomes the primary one and returns OK to the health check. Thus, all requests to the virtual IP address are rerouted to the new primary server.

·         SafeKit monitors the critical application on the primary server using SafeKit checkers.

·         SafeKit automatically restarts the critical application in the event of software or hardware failure, thanks to restart scripts.

·         SafeKit performs synchronous real-time replication of files containing critical data.

For more information, refer to mirror cluster in Azure, mirror cluster in AWS or mirror cluster in GCP.

1.6.2         Farm cluster in Azure, AWS and GCP

SafeKit delivers high-availability clusters with network load balancing and failover in Azure, AWS, and GCP through the deployment of a farm module.

The farm solution in the cloud is similar to the on-premise one, except that the virtual IP address must be configured at the load balancer level:

·         Virtual machines are placed in different availability zones, which are in different subnets.

·         The critical application runs on all servers.

·         Users are connected to a virtual IP address managed by the cloud load balancer.

·         SafeKit provides a health check configured in the load balancer. The health check returns OK on all servers running the application.

·         If a server fails or is stopped, the checker returns nothing to the load balancer, which then stops routing requests to that server.

·         SafeKit monitors the critical application on all servers using SafeKit checkers.

·         SafeKit automatically restarts the critical application on a server when there is a software failure, thanks to restart scripts.

For more information, refer to farm cluster in Azure, farm cluster in AWS or farm cluster in GCP.


 

 

 


2. Installation

*       Section 2.1SafeKit install

*       Section 2.2Mirror installation recommendation

*       Section 2.3Farm installation recommendation

*       Section 2.4SafeKit upgrade

*       Section 2.5SafeKit full uninstall

*       Section 2.6SafeKit documentation

2.1             SafeKit install

2.1.1         Download the package

1.    Connect to https://support.evidian.com/safekit

2.    Go to <Version 8.2>/Platforms/<Your platform>/Current versions

3.    Download the package
In Windows, two packages are available:

·         A Windows Installer package (safekit_windows_x86_64_8_2_x_y.msi). It depends on the VS2022 C runtime which must be previously installed

·         A standalone executable bundle (safekit_windows_x86_64_8_2_x_y.exe), which includes the SafeKit installation and the VS2022 C runtime

Choose one or the other package depending on whether the VS2022 C runtime is installed or not.

2.1.2         Installation directories and disk space provisioning

SafeKit is installed in:

SAFE

·         in Windows

SAFE=C:\safekit
if
%SYSTEMDRIVE%=C:

·         in Linux

SAFE=/opt/safekit

Minimum free disk space: 97MB

 

SAFEVAR

·         in Windows

SAFEVAR= C:\safekit\var

if %SYSTEMDRIVE%=C:

·         in Linux

SAFEVAR=/var/safekit

Minimum free disk space: 20MB + at least 20MB (up to 3 GB) per module for dumps

 

 

2.1.3         SafeKit install procedure

2.1.3.1      Install on Windows as administrator

2.1.3.1.1                SafeKit package install

1.    Log-in as administrator on Windows server

2.    Locate the downloaded file safekit_windows_x86_64_8_2_x_y.msi (or safekit_windows_x86_64_8_2_x_y.exe)

3.    Install in interactive mode by double-clicking it and go through the installer wizard

Before SafeKit 8.2.3, after installation, you need to run the firewall configuration scripts (see section 10.3) and initialize the SafeKit web service (see section 11.2.1.2).

Since SafeKit 8.2.3, at the end of the SafeKit Setup, you will be asked to check or uncheck " Set console credentials and firewall rules now ".

If the box is checked, when clicking the “Finish” button:

o    it configures Microsoft Windows Firewall for SafeKit. For details or other firewalls, see section 10.3.

o    it opens a window to enter the password for the admin user of the SafeKit web console.

This step is mandatory to initialize the default configuration of the web service that requires authentication. It is initialized with the admin user and the given password pwd, for instance. It then allows to access to all the web console's features, by logging in with admin/pwd, and run distributed commands. For details, see section 11.2.1.

 

Commentaire important contour

The password must be identical on all nodes that belong to the same SafeKit cluster. Otherwise, web console and distributed commands will fail with authentication errors.

or

3.    Install in non-interactive mode, by executing:

msiexec /qn /i safekitwindows_8_2_x_y.msi

Then, the firewall setup and web service initialization must be done.

2.1.3.1.2                Firewall setup

This step is mandatory to enable communication between the nodes of the SafeKit cluster and with the web console.

No action required when firewall automatic configuration has been performed during the package install. Otherwise see section 10.3.

2.1.3.1.3                Web service initialization

This step is mandatory to initialize the default configuration of the web service, which is accessed by the web console and the global safekit command. The web service requires authentication to access the service. No action required when the web service initialization has been performed during the package install. Otherwise, see section 11.2.1.2.

2.1.3.1.4                Antivirus setup

This step is only necessary if the server's antivirus interferes with the operation of SafeKit. See section 10.5 for the list of legitimate SafeKit directories and processes that should not be affected by the antivirus.

2.1.3.2      Install on Linux as root

2.1.3.2.1                SafeKit package install

1.    Open a Shell console as root on Linux server

2.    Go to the directory that contains the downloaded file safekitlinux_x86_64_8_2_x_y.bin

auto extractible zip file

3.    Run chmod +x safekitlinux_x86_64_8_2_x_y.bin

4.    Run./safekitlinux_8_2_x86_64_x_y.bin

it extracts the package and the safekitinstall script

5.    Install in interactive mode by executing ./safekitinstall

During the installation:

·         reply to “Do you accept that SafeKit automatically configure the local firewall to open these ports (yes|no)?

If you answer yes, it configures firewalld or iptable Linux firewall for SafeKit. For details or other firewalls, see section 10.3.

·         reply to “Please enter a password or "no" if you want to set it later

This step is mandatory to initialize the default configuration of the web service. The web service requires authentication to access the service.

It initializes it with the admin user and the given password pwd, for instance. It then allows to access to all the web console's features, by logging in with admin/pwd, and run distributed commands. For details, see section 11.2.1.

Commentaire important contour

The password must be identical on all nodes that belong to the same SafeKit cluster. Otherwise, web console and distributed commands will fail with authentication errors.

or

5.    Install in non-interactive mode, by executing:

./safekitinstall -q

Use the option -nofirewall for disabling the firewall automatic setup.

Use the option -passwd pwd for initializing the web service authentication (where pwd is the password set for the admin user).

The install log is /tmp/safekitinstall_log.

2.1.3.2.2                Firewall setup

This step is mandatory to enable communication between the nodes of the SafeKit cluster and with the web console.

No action required when firewall automatic configuration has been performed during the package install. Otherwise see section 10.3.

2.1.3.2.3                Web service initialization

This step is mandatory to initialize the default configuration of the web service, which is accessed by the web console and the global safekit command. The web service requires authentication to access the service. No action required when the web service initialization has been performed during the package install. Otherwise, see section 11.2.1.2.

2.1.3.2.4                Antivirus setup

This step is only necessary if the server's antivirus interferes with the operation of SafeKit. See section 10.5 for the list of legitimate SafeKit directories and processes that should not be affected by the antivirus.

2.1.4         Use the SafeKit web console or command line interface

Once installed, the SafeKit cluster must be defined. Then modules can be installed, configured, and administered. All these actions can be done with the SafeKit console or the command line interface.

2.1.4.1      The SafeKit web console

1.    Start a web browser (Microsoft Edge, Firefox, or Chrome)

2.    Connect it to the URL http://host:9010 (where host is the name or IP address of one of the SafeKit nodes)

3.    In the login page, enter admin as user’s name and the password you gave on initialization (e.g., pwd)

4.    Once the console is loaded, the admin user can access to Monitoring and
Configuration in the navigation sidebar, as he has the default
Admin role

For details see section 3.

2.1.4.2      The SafeKit command line interface

It is based on the single safekit command located at the root of the SafeKit installation directory. Almost all safekit commands can be applied locally or on a list of nodes in the SafeKit cluster. This is called global or distributed command.

For details on the safekit command, see section 9.

 

To use the safekit command:

In Windows

1.    Open a PowerShell console as administrator

2.    Go to the root of the SafeKit installation directory SAFE (by default SAFE=C:\safekit if %SYSTEMDRIVE%=C:)

cd c:\safekit

3.    Run .\safekit.exe <arguments> for the local command

4.    Run .\safekit.exe -H "<hosts>" <arguments> for the command distributed across multiple nodes

In Linux

1.    Open a Shell console as root

2.    Go to the root of the SafeKit installation directory SAFE (by default SAFE=/opt/safekit)

cd /opt/safekit

3.    Run ./safekit <arguments> for the local command

4.    Run ./safekit -H "<hosts>" <arguments> for the command distributed across multiple nodes

 

For instance, to display the levels (SafeKit, OS…):

·         for the local host

safekit level

·         for all hosts configured in the SafeKit cluster

safekit -H "*" level

2.1.5         SafeKit license keys

License keys are determined and verified based on the Operating System (Windows or Linux) and the hostnames of machines (not the FQDN), as returned by the hostname command in a Windows command prompt or a Linux shell. They are delivered in a text file. Once the license key file is installed, there is no need for a connection to a license server.

·         If you do not install any license key file, the product will stop functioning every 3 days. However, it can be restarted for another 3 days.

·         You can download a one-month trial key file from the following address: https://www.evidian.com/products/high-availability-software-for-application-clustering/high-availability-and-load-balancing-cluster-key/

·         When a license key expires or is incorrect (e.g., wrong OS or hostname), the system falls into the 3-day behavior.

·         After placing a purchase order, you obtain a permanent key file (see section 8.2). The permanent key file can be installed without reinstalling or stopping the product.

·         The key file can contain keys for multiple hostnames. SafeKit will detect the appropriate license for the correct OS/hostname on each server.

·         Save the key file into the SAFE/conf/license.txt file (or any other file in SAFE/conf) on each server.

·         If files in SAFE/conf contain more than one key file, the most favorable key will be chosen.

·         Check the key conformance on each server with the command SAFE/safekit level or with the SafeKit web console.

2.1.6         System specific procedures and characteristics

2.1.6.1      Windows

·         Apply a special procedure to properly stop SafeKit modules at machine shutdown and to start safeadmin service at boot: see section 10.4.

·         For network interfaces with teaming and with SafeKit load balancing, it is necessary to uncheck "Vip" on physical network interfaces of teaming and keep it checked only on teaming virtual interface.

2.1.6.2      Linux

·         In Linux, the SafeKit package depends on other system packages. Most of them are installed automatically, except those specific to the implementation of load balancing  in a farm and file replication in a mirror.

·         For an updated list of required packages, see the SafeKit Release Notes.

·         The user safekit and a group safekit are created: all users belonging to the safekit group, and the user root can execute SafeKit commands

·         In a farm module with load balancing on a virtual IP address, the vip kernel module is compiled when the module is configured. To compile successfully, Linux packages must be installed. See the SafeKit Release Notes for an up-to-date list of the packages.

·         For a farm with SafeKit load balancing on a bonding interface, no ARP should be set in the bonding configuration. Otherwise, the association <virtual IP address, invisible virtual MAC address> is broken in client ARP caches with physical MAC address of the bonding interface.

2.2             Mirror installation recommendation

  ip 1.1     ip 1.2

     

virtual ip =      ip 1.10

                         

mirror(app1)=  app1

                                    

                        dir1                dir1

2.2.1         Hardware prerequisites

·         2 servers with the same Operating System

·         Supported OS: https://support.evidian.com/supported_versions/#safekit

·         Disk drive with write-back cache recommended for the performance of the IOs

2.2.2         Network prerequisites

·         1 physical IP address per server (ip 1.1 and ip 1.2)

·         If you need to set a virtual IP address (ip 1.10), both servers must be in the same IP network with the standard SafeKit configuration (LAN or extended LAN between two remote computer rooms). For setting a virtual IP address with servers in different IP networks, see section 13.5.3.

2.2.3         Application prerequisites

·         The application is installed and starts on both servers

·         Application can be started and stopped using command lines

·         On Linux, command lines like service "service" start|stop or su -user "appli-cmd"

·         On Windows, command lines like net start|stop "service"

·         If necessary, application with a procedure to recover after crash

·         Remove automatic application start at boot and configure the boot start of the module instead

2.2.4         File replication prerequisites

·         File directories that will be replicated are created on both servers

·         They are located at the same place on both servers in the file tree

·         It is better to synchronize clocks of both server for file replication (NTP protocol)

·         On Linux, align uids/gids on both servers for owners of replicated directories/files

·         See also system specific procedures and characteristics in section 2.1.6

2.3             Farm installation recommendation

ip 1.1    ip 1.2   ip 1.3

     

virtual IP =             ip 1.20     ip 1.20      ip 1.20

                                                              

farm (app2) =          app2        app2         app2

2.3.1         Hardware prerequisites

·         At least 2 servers with the same Operating System

·         Supported OS: https://support.evidian.com/supported_versions/#safekit

·         Linux: kernel compilation tools installed for vip kernel module

2.3.2         Network prerequisites

·         1 physical IP address per server (ip 1.1, ip 1.2, ip 1.3)

·         If you need to set a virtual IP address (ip 1.20), servers must be in the same IP network with the standard SafeKit configuration (same LAN or extended LAN between remote computer rooms). For setting a virtual IP address with servers in different IP networks, see section 13.5.3.

·         See also system specific procedures and characteristics in section 2.1.6

2.3.3         Application prerequisites

The same prerequisites as for a mirror module described in section 2.2.3

2.4             SafeKit upgrade

If you encounter a problem with SafeKit, see the Software Release Bulletin containing the list of fixes on the product.

If you want to take advantage of some new features, see the SafeKit Release Notes. This document also tells you if you are in the case of a major upgrade (ex. 7.5 to 8.2) which requires a different procedure from the one presented here.

The upgrade procedure consists in uninstalling the old package and then installing the new package. All nodes in the same cluster must be upgraded.

2.4.1         Prepare the upgrade

1.    Note the state "on" or "off" of SafeKit services and modules started automatically at boot safekit boot webstatus; safekit boot status -m AM (where AM is the name of the module) and in Windows: safekit boot snmpstatus;

Commentaire important contour

The start at boot of the module can be defined in its configuration file. If so, the use of the safekit boot command becomes unnecessary.

 

2.    for a mirror module

note the server in the ALONE or PRIM status to know which server holds the up-to-date replicated files

3.    optionally, take snapshots of modules

Uninstalling/reinstalling will reset logs and dumps of each module. If you want to keep this information (logs and last 3 dumps and configurations), run the command safekit snapshot -m AM /path/snapshot_xx.zip (replace AM by the module name)

2.4.2         Uninstall procedure

On Windows as administrator and on Linux as root:

1.    stop all modules using the command safekit shutdown

For a mirror in the PRIM-SECOND status, stop first the SECOND server to avoid an unnecessary failover

2.    close all editors, file explorers, shells, or terminal under SAFE and SAFEVAR (to avoid package uninstallation error)

3.    uninstall SafeKit package

In Windows

Use the Control Panel-Add/Remove Programs applet

In Linux

Use the command safekit uninstall

 

4.    undo all configurations that you have done manually for the firewall setup (see section 10.3)

Uninstalling SafeKit includes creating a backup of the installed modules in SAFE/Application_Modules/backup, then unconfiguring them.

2.4.3         Reinstall and postinstall procedure

1.    Install the new package as described in section 2.1

2.    Check with the command safekit level the installed SafeKit version and the validity of the license (which has not been uninstalled)

If you have a problem with the new package and the old key, take a temporary license: see section 2.1.5

3.    If you use the web console, clear the browser cache and refresh pages in the web browser

4.    Since SafeKit 8.2.1, previously configured modules are automatically reconfigured on upgrade.

However, you may still need to reconfigure module to apply any configuration changes coming with the new version (see the SafeKit Release Notes). Reconfigure the module either with:

o    the web console by navigating to “Configuration/Modules configuration/
Configure the module/”

o    the web console by directly entering the URL http://host:9010/console/en/configuration/modules/AM/config/

o    the command safekit config -m AM

where AM is the module name

5.    If necessary, reconfigure the automatic start of modules at boot

The start at boot of the module can be defined in its configuration file. If so, skip this step. Otherwise, run the command safekit boot -m AM on (replace AM by the module name)

6.    Restart the modules

Mirror module

The module must be started as primary on the node with the updated replicated files (former PRIM or ALONE) either with:

·         the web console by navigating to Monitoring/of the node/Force start/As primary

·         the command safekit prim -m AM (replace AM by the module name)

 

Check that the application is working properly once the module is in ALONE state, before starting the other node.

On the other node (former SECOND), the module must be started in secondary mode either with:

·         the web console by navigating to Monitoring/of the node/Force start/As secondary

·         the command safekit second -m AM (replace AM by the module name)

Once this initial start has been performed by selecting the primary and secondary nodes, subsequent starts can be performed with:

·         the web console by navigating to Monitoring/of the node/Start/

·         the command safekit start -m AM (replace AM by the module name)

Farm module

Start the module either with:

·         the web console by navigating to Monitoring/of the module/Start/

·         the command safekit start -m AM (replace AM by the module name)

 

Furthermore, in exceptional cases where you have modified the default setup of the SafeKit web service or SNMP monitoring :

1.    the SafeKit web service safewebserver

·         If its automatic start at boot had been disabled, disable it again with the command safekit boot weboff

·         If you had modified configuration files and these have evolved in the new version, your modifications are saved into SAFE/web/conf before being overwritten by the new version. Carrying over your old configuration to the new version may require some adaptations. For details on the default setup and all predefined setups, see section 11.

·         For HTTPS and login/password configurations, certificates, and user.conf / group.conf generated for the previous release should be compatible.

2.    The SafeKit SNMP monitoring

·         In Windows, if its automatic start at boot had been enabled, enable it again with the command safekit boot snmpon

·         If you had modified configuration files and these have evolved in the new version, your modifications are saved into SAFE/snmp/conf before being overwritten by the new version. Carrying over your old configuration to the new version may require some adaptations. For details, see section 10.9.

2.5             SafeKit full uninstall

For completely removing the SafeKit package, follow the procedure described below.

2.5.1         Uninstall on Windows as administrator

1.    Log-in as administrator on Windows server

2.    stop all modules using the command safekit shutdown

3.    close all editors, file explorers, shells, or cmd under SAFE and SAFEVAR (to avoid package uninstallation error)

(SAFE=C:\safekit if %SYSTEMDRIVE%=C: ; SAFEVAR=C:\safekit\var if %SYSTEMDRIVE%=C:)

4.    uninstall SafeKit using the Control Panel-Add/Remove Programs applet

5.    reboot the server

6.    delete the folder SAFE that is the installation directory of the previous install of SafeKit

7.    undo all configurations that you have done for SafeKit boot/shutdown (see section 10.4)

8.    undo all configurations that you have done for firewalls rules setting (see section 10.3)

2.5.2         Uninstall on Linux as root

1.    Open a Shell console as root on Linux server

2.    stop all modules using the command safekit shutdown

3.    close all editors, file explorers, shells, or terminal under SAFE and SAFEVAR (SAFE=/opt/safekit ; SAFEVAR=/var/safekit)

4.    uninstall SafeKit using the safekit uninstall -all command and answer yes when prompted to delete all SafeKit folders

5.    reboot the server

6.    undo all configurations that you have done for firewalls rules setting

See section 10.3

7.    delete the user/group created by the previous install (default is safekit/safekit) with the commands: 

userdel safekit

groupdel safekit

2.6             SafeKit documentation

SafeKit Solution

The SafeKit solution is fully described.

SafeKit Training

Refer to this online training for a quick start in using SafeKit.

SafeKit Release Notes

It presents:

·         latest install instructions

·         major changes

·         restrictions and known problems

·         migration instructions

Software Release Bulletin

Bulletin listing SafeKit 8.2 packages, with descriptions of changes and fixed issues.

SafeKit Knowledge Base

List of known SafeKit issues and restrictions.

Other KBs are available on the Evidian support site, but are only accessible to registered users. For more details on the support site, see section 8.

SafeKit user's guide

This is the guide. Please refer to the guide corresponding to your SafeKit version number. It is delivered with the SafeKit package and can be accessed via the web console under /User’s guide.

The link opposite takes you to the latest version of this guide.

 

 

 


3. The SafeKit web console

*       Section 3.1Start the web console

*       Section 3.2Configure the cluster”  

*       Section 3.3Configure a module

*       Section 3.4Monitor a module

*       Section 3.5Snapshots or logs of module for debug and support

*       Section 3.6Secure access to the web console

 

The SafeKit 8 web console and API have evolved from earlier versions. As a result, the console delivered with SafeKit 8 can only administer SafeKit 8 servers, which cannot be administered with an older console.

3.1             Start the web console

The web console permits to administer one SafeKit cluster. A SafeKit cluster is a set of servers where SafeKit is installed and running. All servers belonging to a given SafeKit cluster share the same cluster configuration (list of servers and networks used) and communicate with each other’s to have a global view of SafeKit modules configurations. The same server can not belong to many SafeKit clusters.

3.1.1         Start a web browser

·         The web browser runs on any allowed SafeKit nodes or workstation that can reach the SafeKit servers over the network.

·         Network, firewall and proxy configuration must allow access to all the servers that are administered with the web console

·         JavaScript must be available and enabled in the web browser

·         Tested browsers are Microsoft Edge, Firefox, and Google Chrome

·         To avoid security popups in Microsoft Edge, you may add the SafeKit servers addresses into the Intranet or Trusted zone

·         The messages in the web console are displayed in French or English languages, according to the selected language into the console

·         After SafeKit upgrade, you must clear the browser’s cache to get the new web console pages. A quick way to do this is a keyboard shortcut:

1.    Open the browser to any web page and hold CTRL and SHIFT while tapping the DELETE key

2.    A dialog box will open to clear the browser. Set it to clear everything and click Clear Now or Delete at the bottom

3.    Close the browser, stop all background processes that may be still running and re-open it fresh to reload the web console

 

3.1.2         Connect to a SafeKit server

By default, access to the web console requires the user to authenticate himself with a name and password. On SafeKit install, you had to initialize it with the user admin and assign a password. This admin name and password are sufficient to access all the console's features. For more details on this configuration, see section 11.2.1.

1.    Start a web browser (Microsoft Edge, Firefox, or Chrome)

2.    Connect it to the URL http://host:9010 (where host is the name or IP address of one of the SafeKit servers). If HTTPS is configured, there is an automatic redirection to https://host:9453.

3.    The SafeKit server to which the console is connected (host in the URL) is called the connection node. This node acts as a proxy to communicate on behalf of the console with all other SafeKit servers.

Commentaire, ajouter contour

You can connect to any node of the cluster since the console offer global view and actions. On connection error with one node, connect to another node.

4.    In the login page, enter admin as user’s name and the password you gave on initialization (e.g., pwd).

5.    The SafeKit web console is loaded

·         When the console is connected to a SafeKit server on which the cluster is configured, the name of the node corresponding to the server (as defined in the cluster configuration) is displayed in the header. This is the connection node (node1 in the example). If the cluster is not yet configured, no name is displayed.

·         Click on  to open the menu to read the SafeKit User’s Guide, select the language, enable/disable the dark mode and logout.

·         Click on  to collapse or expand the navigation sidebar.

·         Click on  “Configuration to configure the cluster and the modules. Configuration is only authorized to users that have Admin role. By default, the admin user has the Admin role.

·         (4) Click on  “Monitoring to monitor and control the configured modules. Monitoring is authorized to users that have Admin, Control and Monitor roles. With Monitor role, actions on modules (start, stop…)  are prohibited.

Commentaire, ajouter contour

The web console offers contextual help by clicking on the icon.

3.2             Configure the cluster

The SafeKit cluster must be defined before installing, configuring, or starting a SafeKit module. A Safekit cluster is defined by a set of networks and the addresses, on these networks, of a group of SafeKit servers, named nodes. These nodes implement one or more modules. Each server is not necessarily connected to all the networks, but at least one.

The cluster configuration is saved on the servers’ side into the cluster.xml file (see section 12). For a correct behavior, it is required to apply the same cluster configuration on all the nodes.

Commentaire important contour

You must fully define the cluster configuration before installing and configuring modules since the modification of the cluster can affect the configuration or the execution of installed modules.

 

The cluster configuration home page is available :

·         Directly via the URL http://host:9010/console/en/configuration/cluster

Or

·         By navigating the console via  “Configuration/Cluster configuration

 

If the cluster is not yet configured, the cluster configuration wizard is automatically opened.

3.2.1         Cluster configuration wizard

Open the configuration wizard:

·         Directly via the URL http://host:9010/console/en/configuration/cluster/config

Or

·         Navigate in the console via Configuration/Cluster configuration/
Configure the cluster/

 

The cluster configuration wizard is a step-by-step guided form:

SafeKit web console image
The cluster configuration wizard

1.    Edit cluster configuration described in section 3.2.1.1

2.    Check result described in section 3.2.1.2

3.     to “Exit cluster configuration wizard

3.2.1.1      Edit cluster configuration

SafeKit web console image
Edit the cluster configuration using the cluster configuration wizard.
Define 2 LANs and IP addresses of 2 nodes on these networks

·         (1) Fill in the form to first assign a user-friendly name for the network. This name is used for configuring heartbeat networks used by a module.

Click on  to add another node/lan or on  to remove the node/lan from the cluster.

Commentaire important contour

When a node/lan is removed from the cluster, all modules using it in its configuration may become unusable.

·         (2) Fill in the IP address of the node and then press the Tab key to check the server connectivity and automatically insert the server hostname.

The icon next to the address reflects the reachability of the node.

SafeKit web console image
Tooltip with cluster node information in the cluster configuration wizard

means that the SafeKit server is available. The tooltip gives information on the server.

SafeKit web console image
Tooltip for connection issues in the cluster configuration wizard

means that there was no reply from the server within the timeout delay. Fix the problem to be able to administer this node. It may be a bad address, a network or host failure, a bad configuration of the web browser or the firewall, the stop of the SafeKit web service on the node. For solving the problem, refer to the section 7.1.

 

·         Change the node name if necessary. This name is the one that will be used by the SafeKit administration service for uniquely identifying a SafeKit node. It is also the one displayed into the SafeKit web console.

·         (3) If you prefer, click on “Advanced configuration to switch to XML cluster editing.

Click on  to open the SafeKit User’s Guide on the configuration description in the cluster.xml file.

·         Click on “Reload to discard your current modifications and reload the original configuration.

·         (4) Once the edition is completed, click on “Save and Apply to save and apply the edited configuration to all nodes in the cluster.

 

Commentaire important contour

If required, you can reapply the configuration to all nodes without modifying it.

 

Sous-titres contour

For examples of cluster configurations with two networks  refer to section 15.1.1; with three nodes refer to section 15.2.1.

 

3.2.1.2      Check result

SafeKit web console image
Check the cluster configuration results on all nodes using the cluster configuration  wizard

·         (1) Read the result of the operation on each node:

o    Success means the configuration was successful.

o    Failure means the configuration has failed. Click to read the output of commands executed on the node and search for the error.  You may need to modify the parameters entered or connect to the node to correct the problem. Once the error has been corrected, “Save and apply again.

·         (2) Click on “Configure modules to exit the cluster configuration wizard and navigate to modules configuration.

Or

·         (3) Click on  to “Exit the cluster configuration wizard” and navigate to the cluster configuration home page

3.2.2         Cluster configuration home page

When the cluster is configured, the cluster configuration home page is available.

Open it:

·         Directly via the URL http://host:9010/console/en/configuration/cluster

Or

·         By navigating the console via Configuration/Cluster configuration

 

In this example, the console is loaded from 10.0.0.107, which corresponds to node1 in the existing cluster. This is the connection node.

SafeKit web console image
Cluster configuration home page

·         (1) Click on Configuration in the navigation sidebar

·         (2) Click on “Cluster configuration tab

Nodes configured in the cluster are listed with their configuration date.

·         (3) Click on  to display details about the node: networks name and addresses defined in the cluster configuration, SafeKit version, license key, hostname, OS.

·         (4) Click on one of the buttons:

o     to modify the cluster configuration and/or re-apply it. This opens the cluster configuration wizard and loads the cluster configuration from the connection node.

o     to download the cluster configuration in XML format from the connection node.

o     to unconfigure the cluster on one or more nodes

3.3             Configure a module

Once the cluster has been set up, you can configure a new module on the cluster. The module configuration home page is accessible :

·         Directly via the URL http://host:9010/console/en/configuration/modules

Or

·         By navigating the console via Configuration/Modules configuration

If no module has been configured, the console automatically presents the page for configuring a “New module”.

Sous-titres contour

For module configuration examples refer to section 15.

3.3.1         Select the new module to configure

In this example, the console is loaded from 10.0.0.107, which corresponds to node1 in the existing cluster. This is the connection node.

SafeKit web console image
Select a new module to configure from the configuration home page

·         (1) Click on Configuration in the navigation sidebar

·         (2) Click on “Modules configuration tab

·         (3) Click on “New Module

The page proposes to select a new module among several proposals visible by clicking on :

o    the “Main modules”, including the generic mirror.safe (refer to section 15.1.2) and farm.safe (refer to section 15.2.2) modules for integrating a new application into a mirror or farm architecture.

Here are the modules stored on the connection node, node1, under SAFE/Application_Modules/generic, SAFE/Application_Modules/demo and SAFE/Application_Modules/published.

o    Backup modules archived on the connection node, which are saved when a module is uninstalled on this node.

They are loaded from node1 under SAFE/Application_Modules/backup.

o    Other modules which are examples of SafeKit features used in modules supplied for testing purposes only. Refer to section 15 for the description some of them.

They are loaded from node1 under SAFE/Application_Modules/other.

o    A locally stored module accessible from “Upload module”.

This feature can be used to configure a module for a given application (e.g., Microsoft SQL Server, PostgreSQL…) downloaded from one of the SafeKit quick installation guides.

·         (4) Select a module to configure from those listed above. In the example, mirror.safe.

·         (5) Click on the button Configure the new module.

·         A dialog opens to give the new module name

SafeKit web console image
Enter the name of the new module

·         (6) Enter the name of the new module.

·         (7) Click on “Confirm

The module configuration wizard is opened. This is described below.

3.3.2         Module configuration wizard

The module configuration wizard is a step-by-step guided form.:

SafeKit web console image
The module configuration wizard

1.    Edit module configuration described in section 3.3.2.1

2.    Edit module scripts (Optional) described in section 3.3.2.2

3.    Enable communication encryption (Optional) described in section 3.3.2.3

4.    Save and apply described in section 3.3.2.4

5.    Check result described in section 3.3.2.5

6.     to “Exit module configuration wizard

Note that module reconfiguration can only be applied to nodes on which the module in question is not started. Therefore, stop the module before starting the configuration wizard.

Commentaire important contour

If needed, you can reapply the module configuration on all nodes without modifying it.

3.3.2.1      Edit module configuration

Below is an example of editing the mirror.safe module configuration.

SafeKit web console image
Edit the module configuration using the module configuration wizard

·         (1) Fill in the form to assign values to the various components, add or remove them. Click on  to open the detailed panel for each component.

This form is used to enter only the main module configuration parameters.

Commentaire, ajouter contour

The names of the “Heartbeat networks proposed are the names of the lans entered during cluster configuration.

·         (2) For advanced module configuration, exhaustive compared to the form, click on “Advanced configuration”. This switches to editing the module configuration file in XML format, userconfig.xml.

Click on  to open the SafeKit User’s Guide describing the configuration of the various components in the userconfig.xml file.

·         If necessary, click on “Reload to discard your modifications and reload the complete original configuration (including scripts if these were modified in the next step).

·         (3) Once you have finished editing the module configuration, click on “Next step”.

 

Sous-titres contour

For examples of mirror module configuration, refer to section 15.1.2 ; of farm module configuration, refer to section 15.2.2.

3.3.2.2      Edit module scripts

Below is an example of editing the mirror.safe module scripts.

SafeKit web console image
Edit the module scripts using the module configuration wizard

·         (1) Click on “start_prim or “stop_prim to edit it and insert your application start/stop.

Click on  to copy the content and edit it with your favorite syntax editor. Once done, paste the modified content into the input field with .

·         (2) If necessary, click on “Advanced configuration” to list the other module’s scripts and edit them (prestart, poststop, scripts for checkers…).

·         Click on  to open the SafeKit User’s Guide describing the module scripts.

·         If necessary, click on “Reload to discard your modifications and reload the complete original configuration (including the module configuration if it was modified in the previous step).

·         (3) Once you have finished editing the module scripts, click on “Next step”.

 

Sous-titres contour

For examples of mirror module scripts, refer to section 15.1.3 ; of farm module scripts, refer to section 15.2.3.

3.3.2.3      Enable communication encryption

Encryption of internal module communications between cluster nodes is enabled by default. For details, see section 10.6.

SafeKit web console image
Enable/disable encryption of module communications using the module configuration wizard

·         (1) Click “Enable to enable or disable encryption of module communications.

Commentaire important contour

When the module's encryption key is not identical on all nodes, internal communication is impossible. The configuration must be reapplied to all nodes to propagate the same key.

 

To generate new encryption keys, you need to:

1.    disable encryption, then “Save and apply configuration to all nodes

2.    enable encryption, then “Save and apply configuration to all nodes

·         If necessary, click on “Reload to discard your modifications and reload the complete original configuration (including the module configuration and scripts if these were modified in the previous steps).

·         (2) Once this step is complete, click on “Next step”.

3.3.2.4      Save and apply

Step to select the nodes affected by the configuration.

SafeKit web console image
Select the nodes on which to save and apply the module configuration using the module configuration wizard

·         (1) Check/uncheck to select/unselect nodes. Please note that the connection node (node1 in the example) is mandatory.

There are 2 cases where “Save and Apply is disabled:

SafeKit web console image
Situation where the node cannot be selected, in the module configuration wizard, because the module is running

The module on the selected node is started and, in a state, other than STOP (NotReady).

SafeKit web console image
Situation where the node cannot be selected, iin the module configuration wizard, because the node is not responding

There was no reply from the node within the timeout delay. It may be a bad address, a network or host failure, a bad configuration of the web browser or the firewall, the stop of the SafeKit web service on the node. For solving the problem, refer to the section 7.1.

In both cases, uncheck the node or click on “Save and check to apply it later, after stopping the module or solving the communication problem.

·         (2) Click on “Save and check to save the edited configuration on the connection node and check its consistency. It then proceeds to the next step to display the result of this operation.

Once this operation has been completed, any changes are saved on the connection node. The configuration wizard can be exited and relaunched later to apply the saved configuration. Until the saved configuration is applied, the last applied configuration of the module remains active.

·         (3) Click on “Save and apply to save and apply the edited configuration on selected nodes. It then proceeds to the next step to display the result of this operation.

If this operation is successful, the applied configuration becomes the active one for the module.

Commentaire important contour

On the server side, the module configuration is saved under SAFE/modules/AM(where AM is the module name). When reconfiguring a module, this directory is deleted and overwritten with the changes made in the console. Thus, on the servers’ side, you must close all editors, file explorers, shells or cmd under SAFE/modules/AM before applying the configuration (otherwise there is a risk that the apply fails).

3.3.2.5      Check result

The example below shows the result of the “Save and Apply operation. The layout for “Save and Verify is similar.

SafeKit web console image
Check the module configuration results on the nodes using the module configuration wizard

·         (1) Read the result of the operation on each node:

o    Success means the operation was successful.

o    Failure means the operation has failed.

Click to read the output of commands executed on the node and search for the error.  You may need to modify the parameters entered or connect to the node to correct the problem. Once the error has been corrected, repeat the operation from the previous step.

·         (2) Click on “Monitor modules to exit the module configuration wizard and navigate to modules monitoring.

Or

·         (3) Click on  to “Exit the module configuration wizard” and navigate to the modules configuration home page.

3.3.3         Modules configuration home page

Once the first module has been configured, the module configuration home page is available. It allows you to view the modules installed on the cluster and to access the configuration of a new module.

Open it:

·         Directly via the URL http://host:9010/console/en/configuration/modules

Or

·         By navigating the console via Configuration/Modules configuration

 

Commentaire important contour

Before each reconfiguration, deconfiguration and uninstallation, on each node, close all editors, file explorer, shells or cmd under SAFE/modules/AM (or risk the operation failing).

 

In the following example, the console is loaded from 10.0.0.107, which corresponds to node1 in the existing cluster. This is the connection node.

SafeKit web console image
Installed modules on the configuration home page

·         (1) Click on Configuration in the navigation sidebar.

·         (2) Click on “Modules configuration tab.

·         Modules installed on the cluster are listed with the date the configuration was applied and, if applicable, the date the configuration was saved but not yet applied.

·         (3) Click on one of the buttons associated with the module:

o     to modify its configuration or reapply its current configuration. This opens the module configuration wizard and loads its current configuration from the connection node.

o     to download the .safe, consisting of all module files (userconfig.xml, scripts) from the connection node.

o     to reconfigure the module from the contents of a locally stored .safe.

o     to restore a previous module configuration.

o    SafeKit keeps a copy of the last three successful configurations (stored under SAFE/modules/lastconfig on the server side). All module configuration files are packaged in a .safe file, whose name is of the type of AM_<date>_<time> (where AM is the module name).

o     to remove internal files for the module on one or more nodes, without uninstalling it. The user configuration files are kept for later re-application.

o     to completely uninstall the module on one or more nodes.

All module configuration files are packaged in a .safe file, which is archived on the server side under SAFE/Application_Modules/backup.

·         To configure a new module, click on “New module

3.3.4         Edit the module configuration locally and then apply it

You may prefer to use your favorite editor to modify the module’s configuration file and scripts or may need to add module scripts, such as custom checkers.

Follow the procedure below to modifye the module's configuration on your workstation and then apply it.

SafeKit web console image
Download the module configuration to update it locally, then upload and apply it from the modules configuration home page

·         (1) Click on Configuration in the navigation sidebar.

·         (2) Click on “Modules configuration tab.

·         (3) Click on  to download the mirror.safe on your workstation.

·         (4) Extract the content of mirror.safe, that is a zip file, to edit userconfig.xml, add/delete/edit module scripts into the bin directory (add a custom checker for instance).

·         (5) Compress the modified directory into xx.safe (or xx.zip) then upload it with  (.safe and .zip extension are accepted).

SafeKit web console image
Select a local file, with extension .safe or .zip, to upload a module configuration using the module configuration home page

·         (6) Click on  to select the file to be uploaded then “Confirm”.

 

The module configuration wizard is launched with the contents of this file. The new contents are visible into the wizard. Got to step 4 to “Save and apply” this new configuration.

3.4             Monitor a module

Once a module is configured, you can monitor its state and run actions on it (start, stop…).

The modules monitoring home page is accessible :

·         Directly via http://host:9010/console/en/monitoring

Or

·         By navigating the console via Monitoring

3.4.1         Monitoring home page

In this example, the console is loaded from 10.0.0.107, which corresponds to node1 in the existing cluster. This is the connection node. Two modules are configured: farm and mirror.

SafeKit web console image
Modules monitoring home page

·         (1) Click on Monitoring in the navigation sidebar

For each installed module, it displays:

o    the module name and nodes name on which it is installed

o    the module state on the node

o    a notification on state change if the user has allowed them, and the URL is https or http://localhost

For a description, see section 3.4.2.

·         (2) Click on to open the menu of global actions (start, stop…) on the module that apply on all nodes (node1, node2 in the example).

For a description, see section 3.4.3.1.

·         (3) Click on to open menu of actions (start, stop…) on the module that applies only to the node (node1 in the example).

For a description, see section 3.4.3.2.

·         (4) Click on the node panel (mirror>node1 in the example) to open details for the module on this node (logs, resources…). Since SafeKit 8.2.2, Click instead on Une image contenant noir, obscurité

Description générée automatiquement to open/close the details.

For a description, see section 3.4.4.

·         (5) Click on Une image contenant noir, obscurité

Description générée automatiquement to open/close the module states timeline on all nodes where it is installed. Available since SafeKit 8.2.2.

For a description, see section 3.4.5.

3.4.2         Module state

The module is represented real-time display of  its synthetic and detailed states on the left and right panels.

3.4.2.1      Synthetic state

The console displays one of the following synthetic states for the module on the node:

STOP (NotReady)(red)                    Module stopped (ready for starting)

WAIT (Transient)(orange)             Transient state of the module

ALONE (Transient)(orange)       Transient state of a mirror module, primary without secondary

ALONE (Ready)(green)                   Stable state of a mirror module, primary without secondary

PRIM (Transient)(orange)             Transient state of a mirror module, primary with secondary

PRIM (Ready)(green)                     Stable state of a mirror module, primary with secondary

SECOND (Transient)(orange)          Transient state of a mirror module, secondary         
with  primary
, during the synchronization of
replicated directories

SECOND (Ready)(green)                 Stable state of a mirror module, secondary with primary

UP (Transient)(orange)                 Transient state of a farm module

UP (Ready)(green)                        Stable state of a farm module

WAIT (NotReady) (red)                   Blocked state of the module, waiting for one or more resources

NOT CONFIGURED (grey)                    Installed module but not configured

ERROR (red)                                                  The node did not respond within the given time limit.

This may be due to an incorrect address, a network or server failure, a misconfigured web browser or firewall, or the SafeKit web service being stopped on the node (see section 7.1). It may also be due to the temporary unavailability of the connection node. In this case, reload the console from another SafeKit node.

 

For details on state changes of a mirror module, see section 5.2.

For details on state changes of a farm module, see section 6.2.

 

3.4.2.2      Detailed state

It is the state of the main resources or failover rules.

uptodate             Replicated directories of the mirror module are uptodate

             Replicated directories of the mirror module are not uptodate

not uptodate

             The mirror module is in degraded mode described in section 7.6

 degraded          

50%, 100%             The network load share of the farm module (e.g. 50% or 100% with 2 nodes)

              No load share taken by the farm module

   0%

              The module applied the failover rule (e.g., the rule named

c_checkfile         c_checkfile) which triggers the actions restart, stop, stopstart, or wait on the module due to a resource going down. View section 13.18.4.2 for details on failover rules. To analyze the issue, read the logs and resource statuses as described below.

              The module is in state ERROR (red)

connection       The node did not respond within the given time limit                                                     

 error

3.4.3         Module control menus

3.4.3.1      Global menu

The actions of global menu apply to all nodes where the module is configured.

In the example below, actions apply to the module mirror on node1 and node2.

SafeKit web console image
Start or stop the module on all nodes, download all module snapshots or logs, using the global menu of the module

·         (1) Click on to open the module's global actions menu.

·         Click on “Start” to start the module on all nodes.

For mirror module, the node with the up-to-date replicated data is started as primary.

·         Click on “Stop” to stop the module on all nodes.

For mirror module, the node that is secondary is stopped first to avoid unnecessary failover.

·         Click on “Debug” for debug and support as described in section 3.5.

3.4.3.2      Local menu

The actions of local menu apply only to the selected node.

3.4.3.2.1                Control a mirror module

In the example below, actions apply to the module mirror on node1.

SafeKit web console image
Actions using the local menu of the mirror module on the selected  node:
start, stop or restart  the module
force start as primary, as secondary, as secondary with full data synchronization

·         (1) Click on to open module's local actions menu on the desired node (e.g. node1).

·         Click on “Start” to start the module on the node.

For mirror module, the node is started as primary when replicated data are up-to-date. Otherwise, it is started as secondary. For details, see section 5.5.

·         Click on “Stop” to stop the module on the node.

·         Click on “Restart” to restart the module on the node.

It only executes only stop then start scripts to locally restart the application without leading to a failover.

·         Use “Force start submenu when you need to decide if the node should start primary or secondary:

o    Select “Force start As Primary to force the module to start as primary on this node.

For instance, on the 1st start of a mirror module as described in section 5.3, you must “Force start As primary the node which has the up-to-date replicated folders.

o    Select “Force Start As secondary to force the module to start as secondary on this node.

Data synchronization can be optimized based on the module's last internal state.

o    Select “Force Start As secondary with full data synchronization to start the module on this node as a secondary and to force a complete copy of the replicated data.

·         Click on “Disable/enable” to control error detection as described in section 3.4.3.2.3.

·         Click on “Debug to download module logs or snapshots from this node rather than from all nodes as described in section 3.5.

 

To understand and check the correct behavior of a mirror module, see section 5. To test it, see section 4.

3.4.3.2.2                Control farm module

In the example below, actions apply to the module farm on node2.

SafeKit web console image
Actions using the local menu of the farm module on the selected  node:
start, stop or restart  the module

·         (1) Click on to open module's local actions menu on the desired node (e.g. node2).

·         Click on “Start” to start the module on the node.

·         Click on “Stop” to stop the module on the node.

·         Click on “Restart” to restart the module on the node.

It only executes only stop then start scripts to restart the application without leading to a failover.

·         Click on “Disable/enable” to control error detection as described in section 3.4.3.2.3.