Overview.. 3

1... Technical overview.. 17

1.1.... Generalities, solutions, architectures. 17

1.1.1        Introduction to SafeKit 17

1.1.2        SafeKit solutions 17

1.1.3        SafeKit architectures 18

1.1.4        SafeKit cluster definition. 18

1.1.5        SafeKit module definition. 19

1.1.6        SafeKit limitations 19

1.2.... The SafeKit mirror cluster 20

1.2.1        Real time file replication and application failover 20

1.2.2        Step 1. Normal operation. 21

1.2.3        Step 2. Failover 21

1.2.4        Step 3. Failback and automatic resynchronization. 21

1.2.5        Step 4. Return to normal operation. 22

1.2.6        Synchronous replication versus asynchronous replication. 22

1.2.7        Behavior in case of network isolation. 22

1.2.8        3-node replication. 23

1.2.9        SafeKit on a single node to protect against software failures 23

1.3.... The SafeKit farm cluster 24

1.3.1        Network load balancing and application failover 24

1.3.2        Principle of a virtual IP address with network load balancing. 24

1.3.3        Load balancing for stateful or stateless web services 24

1.3.4        Chain high availability solution in a farm.. 25

1.4.... Clusters running several modules. 25

1.4.1        The SafeKit farm+mirror cluster 25

1.4.2        The SafeKit active/active cluster with replication. 25

1.4.3        The SafeKit N-1 cluster 26

1.5.... The SafeKit Hyper-V or KVM cluster 27

1.5.1        Load balancing, replication, failover of entire virtual machines 27

1.6.... SafeKit clusters in the cloud. 27

1.6.1        Mirror cluster in Azure, AWS and GCP. 27

1.6.2        Farm cluster in Azure, AWS and GCP. 28

2... Installation. 31

2.1.... SafeKit install 31

2.1.1        Download the package. 31

2.1.2        Installation directories and disk space provisioning. 31

2.1.3        SafeKit install procedure. 32

2.1.4        Use the SafeKit web console or command line interface. 34

2.1.5        SafeKit license keys 35

2.1.6        System specific procedures and characteristics 36

2.2.... Mirror installation recommendation. 36

2.2.1        Hardware prerequisites 37

2.2.2        Network prerequisites 37

2.2.3        Application prerequisites 37

2.2.4        File replication prerequisites 37

2.3.... Farm installation recommendation. 37

2.3.1        Hardware prerequisites 37

2.3.2        Network prerequisites 37

2.3.3        Application prerequisites 38

2.4.... SafeKit upgrade. 38

2.4.1        Prepare the upgrade. 38

2.4.2        Uninstall procedure. 38

2.4.3        Reinstall and postinstall procedure. 39

2.5.... SafeKit full uninstall 41

2.5.1        Uninstall on Windows as administrator 41

2.5.2        Uninstall on Linux as root 41

2.6.... SafeKit documentation. 41

3... The SafeKit web console. 43

3.1.... Start the web console. 43

3.1.1        Start a web browser 43

3.1.2        Connect to a SafeKit node. 44

3.1.3        List of connection nodes 45

3.2.... Configure the cluster 46

3.2.1        Cluster configuration wizard. 46

3.2.2        Cluster configuration home page. 49

3.3.... Configure a module. 50

3.3.1        Select the new module to configure. 51

3.3.2        Module configuration wizard. 52

3.3.3        Modules configuration home page. 57

3.3.4        Edit the module configuration locally and then apply it 59

3.4.... Monitor a module. 60

3.4.1        Monitoring home page. 60

3.4.2        Module state. 61

3.4.3        Module control menus 63

3.4.4        Module details 66

3.4.5        Module states timeline. 71

3.5.... Snapshots or logs of module for debug and support 72

3.6.... Secure access to the web console. 73

4... Tests. 75

4.1.... Installation and tests after boot 75

4.1.1        Test package installation. 75

4.1.2        Test license and version. 76

4.1.3        Test SafeKit services and modules after boot 76

4.1.4        Test start of SafeKit web console. 78

4.2.... Tests of a mirror module. 79

4.2.1        Test first start of a mirror module on 2 servers STOP (NotReady). 79

4.2.2        Test start of a mirror module on 2 servers STOP (NotReady). 79

4.2.3        Test stop of a mirror module on the server PRIM (Ready). 79

4.2.4        Test start of a mirror module on the server STOP (NotReady). 80

4.2.5        Test restart of a mirror module on the server PRIM (Ready). 80

4.2.6        Test virtual IP address of a mirror module. 80

4.2.7        Test file replication of a mirror module. 81

4.2.8        Test shutdown of the server PRIM (Ready). 82

4.2.9        Test power-off of the server PRIM (Ready). 83

4.2.10      Test split-brain with a mirror module. 83

4.2.11      Continue your mirror module tests with checkers 84

4.3.... Tests of a farm module. 85

4.3.1        Test start of a farm module on all servers STOP (NotReady). 85

4.3.2        Test stop of a farm module on one server UP (Ready). 85

4.3.3        Test restart of a farm module on one server UP (Ready). 85

4.3.4        Test virtual IP address of a farm module. 85

4.3.5        Test TCP load balancing on a virtual IP address 87

4.3.6        Test split-brain with a farm module. 88

4.3.7        Test compatibility of the network with invisible MAC address (vmac_invisible) 89

4.3.8        Test shutdown of a server UP (Ready). 90

4.3.9        Test power-off of a server UP (Ready). 91

4.3.10      Continue your farm module tests with checkers 91

4.4.... Tests of checkers common to mirror and farm.. 91

4.4.1        Test <errd> checker with action restart or stopstart 91

4.4.2        Test <tcp> checker with action restart or stopstart 92

4.4.3        Test <tcp> checker with action wait 93

4.4.4        Test <interface check="on"> with action wait 94

4.4.5        Test <ping> checker with action wait 94

4.4.6        Test <module> checker with action wait 95

4.4.7        Test <custom> checker with action wait 96

4.4.8        Test <custom> checker with action restart or stopstart 97

5... Mirror module administration. 101

5.1.... Operating mode of a mirror module. 101

5.2.... State automaton of a mirror module (STOP, WAIT, ALONE, PRIM, SECOND - NotReady, Transient, Ready) 103

5.3.... First start-up of a mirror module (safekit prim command) 104

5.4.... Different reintegration cases (use of bitmaps) 105

5.5.... Start-up of a mirror module with the up-to-date data STOP (NotReady) - WAIT (NotReady). 106

5.6.... Degraded replication mode (ALONE (Ready) degraded) 107

5.7.... Automatic or manual failover 108

5.8.... Default primary server (automatic swap after reintegration) 110

5.9.... Prim command fails: why? (safekit primforce command) 111

6... Farm module administration. 113

6.1.... Operating mode of a farm module. 113

6.2.... State automaton of a farm module (STOP, WAIT, UP - NotReady, Transient, Ready) 114

6.3.... Start-up of a farm module. 115

7... Troubleshooting. 117

7.1.... Connection issues with the web console. 117

7.1.1        Browser check. 117

7.1.2        Browser state clear 118

7.1.3        Server check. 118

7.2.... Connection issues with the HTTPS web console. 118

7.2.1        Check server certificates 119

7.2.2        Check certificates installed in SafeKit 120

7.2.3        Revert to HTTP configuration. 121

7.3.... How to read logs and resources of the module?. 121

7.4.... How to read the commands log of the server?. 122

7.5.... Stable module (Ready) and (Ready). 122

7.6.... Degraded module (Ready) and /(NotReady). 122

7.7.... Out of service module /(NotReady) and /(NotReady). 122

7.8.... Module STOP (NotReady): start the module. 123

7.9.... Module WAIT (NotReady): repair the resource="down" 123

7.10.. Module oscillating from (Ready) to (Transient). 124

7.11.. Message on stop after maxloop. 125

7.12.. Module (Ready) but non-operational application. 125

7.13.. Mirror module ALONE (Ready) - WAIT/STOP (NotReady) 126

7.14.. Farm module UP (Ready) but problem of load balancing in a farm.. 127

7.14.1      Reported network load shares are not coherent 127

7.14.2      Virtual IP address does not respond properly. 127

7.15.. Problem with the virtual IP after failover 128

7.16.. Problem after Boot 129

7.17.. Analysis from snapshots of the module. 129

7.17.1      Module configuration files 130

7.17.2      Module dump files 130

7.18.. Problem with the size of SafeKit databases. 133

7.19.. Problem for retrieving the certification authority certificate from an external PKI 134

7.19.1      Export CA certificate(s) from public certificates 134

7.20.. Issue with email sending by the SafeKit notification agent 137

7.20.1      Failed to read or parse the configuration file. 137

7.20.2      Curl errors 138

7.21.. Still in Trouble. 139

8... Access to Evidian support 141

8.1.... Home page of support site. 141

8.2.... Permanent license keys. 142

8.3.... Create an account 143

8.4.... Access to your account 143

8.5.... Call desk to open a trouble ticket 144

8.5.1        Call desk operations 144

8.5.2        Create a call 144

8.5.3        Attach the snapshots 145

8.5.4        Answers to a call and exchange with support 146

8.6.... Download and upload area. 147

8.6.1        Two areas of download and upload. 147

8.6.2        Product download area. 147

8.6.3        Private upload area. 148

8.7.... Knowledge base. 148

9... Command line interface. 149

9.1.... Commands to control and setup SafeKit 149

9.1.1        safeadmin service. 149

9.1.2        safewebserver service. 150

9.1.3        Email notification agent 151

9.1.4        SNMP service. 151

9.2.... Command lines to configure and monitor the cluster 152

9.3.... Command lines to control modules. 154

9.4.... Command lines to monitor modules. 155

9.5.... Command lines to configure modules. 156

9.6.... Command lines for support 158

9.7.... Command lines during the maintenance of the module application. 159

9.7.1        Module control for maintenance. 159

9.7.2        Running the application without the module. 161

9.8.... Command lines distributed across multiple SafeKit servers. 161

9.9.... Examples. 163

9.9.1        Local and distributed command. 163

9.9.2        Cluster configuration with command line. 163

9.9.3        Module configuration with command line. 163

9.9.4        Module snapshot with command line. 164

10. Advanced administration and setup. 165

10.1.. SafeKit environment variables and directories. 165

10.1.1      Global 165

10.1.2      Module. 165

10.2.. SafeKit services and daemons. 167

10.2.1      SafeKit services 167

10.2.2      SafeKit daemons per module. 167

10.3.. Firewall settings. 168

10.3.1      Firewall settings in Linux. 168

10.3.2      Firewall settings in Windows 169

10.3.3      Other firewalls 170

10.4.. Boot and shutdown setup in Windows. 173

10.4.1      Automatic procedure. 173

10.4.2      Manual procedure. 173

10.5.. Linux Secure boot settings for SafeKit kernel modules. 174

10.6.. Antivirus settings. 175

10.7.. Encryption of module communications 175

10.7.1      Configuration with the SafeKit web console. 176

10.7.2      Configuration with the command line interface. 176

10.7.3      Advanced configuration. 177

10.8.. SafeKit web service settings. 178

10.8.1      Configuration files 178

10.8.2      Connection ports configuration. 180

10.8.3      HTTP/HTTPS and user authentication configuration. 180

10.8.4      SafeKit API 181

10.9.. SafeKit email notification agent 181

10.9.1      SafeKit notification agent configuration. 182

10.9.2      SMTP client credentials setup for authentication. 183

10.9.3      Email sending test 183

10.9.4      SafeKit notification agent activation. 184

10.10 SNMP monitoring. 184

10.10.1     SNMP monitoring in Windows 184

10.10.2     SNMP monitoring in Linux. 185

10.10.3     The SafeKit MIB. 185

10.11 Commands log of the SafeKit server 186

10.12 SafeKit log messages in system log. 187

11. Securing the SafeKit web service. 189

11.1.. Overview. 189

11.1.1      Default setup. 190

11.1.2      Predefined setups 190

11.2.. HTTP setup. 191

11.2.1      Default setup. 191

11.2.2      Unsecure setup based on identical role for all 193

11.3.. HTTPS setup. 194

11.3.1      HTTPS setup using the SafeKit PKI 195

11.3.2      HTTPS setup using an external PKI 202

11.4.. User authentication setup. 206

11.4.1      File-based authentication setup. 206

11.4.2      LDAP/AD authentication setup. 209

11.4.3      OpenID authentication setup. 211

12. Cluster.xml for the SafeKit cluster configuration. 215

12.1.. Cluster.xml file. 215

12.1.1      Cluster.xml example. 215

12.1.2      Cluster.xml syntax. 216

12.1.3      <lans>, <lan>, <node> attributes 216

12.2.. SafeKit cluster configuration. 218

12.2.1      Configuration with the SafeKit web console. 218

12.2.2      Configuration with command line. 219

12.2.3      Configuration changes 219

13. Userconfig.xml for a module configuration. 221

13.1.. Macro definition - <macro>. 222

13.1.1      <macro> example. 222

13.1.2      <macro> syntax. 222

13.1.3      <macro> attributes 222

13.2.. Farm or mirror module - <service>. 223

13.2.1      <service> example. 223

13.2.2      <service> syntax. 223

13.2.3      <service> attributes 223

13.3.. Heartbeats - <heart>, <heartbeat>. 226

13.3.1      <heart> example. 226

13.3.2      <heart> syntax. 226

13.3.3      <heart>, <heartbeat> attributes 227

13.4.. Farm topology - <farm>, <lan>. 228

13.4.1      <farm> example. 228

13.4.2      <farm> syntax. 228

13.4.3      <farm>, <lan> attributes 229

13.5.. Virtual IP address - <vip>. 229

13.5.1      <vip> example in a mirror module. 230

13.5.2      <vip> example in a farm module. 230

13.5.3      Alternative to <vip> for servers in different networks 230

13.5.4      <vip> syntax. 231

13.5.5      <vip><interface_list>, <interface>, <virtual_interface>, <real_interface>, <virtual_addr> attributes 232

13.5.6      <loadbalancing_list>, <group>, <cluster>, <host> attributes 236

13.5.7      <vip> Load balancing description. 237

13.6.. File replication - <rfs>, <replicated>. 238

13.6.1      <rfs> example. 238

13.6.2      <rfs> syntax. 239

13.6.3      <rfs>, <replicated> attributes 239

13.6.4      <rfs> description. 247

13.7.. Enable module scripts - <user>, <var>. 256

13.7.1      <user> example. 256

13.7.2      <user> syntax. 256

13.7.3      <user>, <var> attributes 256

13.8.. Virtual hostname - <vhost>, <virtualhostname>. 257

13.8.1      <vhost> example. 257

13.8.2      <vhost> syntax. 257

13.8.3      <vhost>, <virtualhostname> attributes 258

13.8.4      <vhost> description. 258

13.9.. Process or service monitoring - <errd>, <proc>. 259

13.9.1      <errd> example. 259

13.9.2      <errd> syntax. 259

13.9.3      <errd>, <proc> attributes 260

13.9.4      <errd> commands 263

13.10 Checkers - <check>. 265

13.10.1     <check> example. 265

13.10.2     <check> syntax. 265

13.10.3     <checker> description. 266

13.11 TCP checker - <tcp>. 269

13.11.1     <tcp> example. 269

13.11.2     <tcp> syntax. 270

13.11.3     <tcp> attributes 270

13.12 Ping checker - <ping>. 272

13.12.1     <ping> example. 272

13.12.2     <ping> syntax. 272

13.12.3     <ping> attributes 272

13.13 Interface checker - <intf>. 274

13.13.1     <intf> example. 274

13.13.2     <intf> syntax. 274

13.13.3     <intf> attributes 275

13.14 IP checker - <ip>. 275

13.14.1     <ip> example. 275

13.14.2     <ip> syntax. 276

13.14.3     <ip> attributes 276

13.15 Custom checker - <custom>. 277

13.15.1     <custom> example. 277

13.15.2     <custom> syntax. 278

13.15.3     <custom> attributes 278

13.16 Module checker - <module>. 279

13.16.1     <module> example. 280

13.16.2     <module> syntax. 280

13.16.3     <module> attributes 281

13.17 Splitbrain checker - <splitbrain>. 281

13.17.1     <splitbrain> example. 282

13.17.2     <splitbrain> syntax. 282

13.17.3     <splitbrain> attributes 283

13.18 Failover machine - <failover>. 283

13.18.1     <failover> example. 284

13.18.2     <failover> syntax. 285

13.18.3     <failover> attributes 285

13.18.4     <failover> description. 285

14. Scripts for a module configuration. 289

14.1.. List of scripts. 289

14.1.1      Start/stop scripts 289

14.1.2      Other scripts 290

14.2.. Variables and arguments passed to scripts. 290

14.3.. Scripts output 291

14.3.1      Output into script log. 291

14.3.2      Output into module log. 291

14.4.. Scripts execution automaton. 292

14.5.. SafeKit special commands for scripts. 293

14.5.1      Commands for Windows 294

14.5.2      Commands for Linux. 294

14.5.3      Commands for Windows and Linux. 295

15. Examples of module configurations. 297

15.1.. Mirror module example with mirror.safe. 297

15.1.1      Cluster configuration with two networks 298

15.1.2      Mirror module configurations 298

15.1.3      Mirror module scripts 301

15.2.. Farm module example with farm.safe. 303

15.2.1      Cluster configuration with three nodes 303

15.2.2      Farm module configurations 304

15.2.3      Farm module scripts 309

15.3.. Macro and script variables example with hyperv.safe. 311

15.3.1      Module configuration with macros and var 311

15.3.2      Module scripts with var 312

15.4.. Process monitoring example with softerrd.safe. 312

15.4.1      Module configuration with process monitoring. 312

15.4.2      Advanced configuration of module scripts 314

15.5.. TCP checker example. 316

15.6.. Ping checker example. 317

15.7.. Custom checker example with customchecker.safe. 319

15.7.1      Module configuration with custom checker 319

15.7.2      Advanced configuration of module checker script 321

15.8.. Split-brain checker example. 322

15.9.. Module checker examples. 323

15.9.1      Example of a farm module depending on a mirror module. 323

15.9.2      Example with leader.safe and follower.safe. 325

15.10 Interface checker example. 325

15.11 IP checker example. 326

15.12 Virtual hostname example with vhost.safe. 327

15.12.1     Module configuration with a virtual hostname. 327

15.12.2     Module scripts with a virtual hostname. 328

16. SafeKit cluster in the cloud. 331

16.1.. SafeKit cluster in Amazon AWS. 331

16.1.1      Mirror cluster in AWS. 332

16.1.2      Farm cluster in AWS. 333

16.2.. SafeKit cluster in Microsoft Azure. 334

16.2.1      Mirror cluster in Azure. 335

16.2.2      Farm cluster in Azure. 337

16.3.. SafeKit cluster in Google GCP. 338

16.3.1      Mirror cluster in GCP. 339

16.3.2      Farm cluster in GCP. 340

17. Third-Party Software. 343

Log Messages Index. 347

Index. 351

 
