SKYBOX APPLIANCE HIGH AVAILABILITY INSTALLATION GUIDE Version 7 Contents Change Log ..................................................................................................................................................................... 3 Overview of the Skybox HA Module .......................................................................................................................... 4 What is the Skybox High Availability Module? ..................................................................................................... 4 How Does the HA Module Work? ............................................................................................................................ 4 NAT/Port Forwarding ................................................................................................................................................. 5 Load Balancing with Skybox HA ............................................................................................................................. 6 Failover Scenario Explained ..................................................................................................................................... 7 Before Installation ......................................................................................................................................................... 8 Required Dependencies/Setup: .............................................................................................................................. 8 Using Yum to Verify Installed Package ................................................................................................................. 8 Using Yum to Install Missing Packages ................................................................................................................. 9 Configuring sudo for ‘skyboxview’ ....................................................................................................................... 10 Setting up SSH Trust Between Servers .............................................................................................................. 11 Installing the Skybox HA Module ............................................................................................................................. 13 Extracting the Module Files .................................................................................................................................... 13 Configuring INCRON ................................................................................................................................................ 14 Server Configuration Files & Scripts .................................................................................................................... 15 Setting Up HA ............................................................................................................................................................... 17 Setting Up Secondary Server ................................................................................................................................ 17 Setting Up Primary Server ..................................................................................................................................... 20 Failover Recovery..................................................................................................................................................... 23 Updating to a New Version of HA ......................................................................................................................... 24 Troubleshooting ........................................................................................................................................................... 25 Primary Not Syncing to Secondary ...................................................................................................................... 25 INCRON Not Starting at Boot ................................................................................................................................ 26 Model Copies to Secondary, but doesn’t load .................................................................................................... 26 REST Ping fails due to curl error ........................................................................................................................... 27 Firewall configuration reset back to defaults ..................................................................................................... 27 FAQ – Frequently Asked Questions ......................................................................................................................... 28 Appendix ........................................................................................................................................................................ 30 ISO 10.0.203-7.1.8 Packages ............................................................................................................................... 30 skyboxsecurity.com 2 CHANGE LOG Author Nilesh Bharakhada Version Description 1.5 Latest version in PDF format 7.01 Updated Document to same version as HA release Updated documentation to reflect script changes Re-written sections to make guide more readable Re-ordered section to read more orderly Removed section ‘Disabling Firewalld on CentOS 7’ Nilesh Bharakhada 7.03 Collection of scripts renamed from HA to DR (disaster recover). Document has been updated to reflect this. Added firewalld & iptables output to confirm firewall configured correctly Nilesh Bharakhada 7.05 Naming convention reverted to HA Added TCP443 as a dependency skyboxsecurity.com 3 OVERVIEW OF THE SKYBOX HA MODULE What is the Skybox High Availability Module? The Skybox High Availability (HA) Module is a feature that allows the Skybox application to minimize downtime while reliably keeping any data and tasks intact and operational. How Does the HA Module Work? The HA Module is based on an Active/Passive model. The Secondary server monitors the connection with the Primary server via SOAP requests. If the Primary server is up and operational, data will be synchronized from the Primary to the Secondary server. See Figure 1.1 Figure 1.1 - Active/Passive Model 3 skyboxsecurity.com 4 NAT/Port Forwarding The Skybox HA setup includes port forwarding for ease of use and setup for load balancing. The Skybox HA Module consists of the following: • Primary (active) server enables a firewall that will NAT all traffic incoming to port 443 and forwards it to its internal port 8443 (the listening port of Skybox) • Secondary (passive) server disables its firewall and performs no NAT translations. Requests on port 443 will be dropped until the Secondary detects the Primary server as unreachable. Figure 1.2 NAT/Port translation setup on Primary and Secondary servers skyboxsecurity.com 5 Load Balancing with Skybox HA The customer can take advantage of using the NAT/Port forwarding to enable a load-balancing setup. In order to achieve a load-balancing setup, the customer will need an external load-balancer. For example, on F5 load balancers, the customer can create a common VIP and create a pool with the Skybox servers. Since the HA Module uses a NAT rule to port forward all traffic received on port 443 to its internal port 8443, but ONLY on the Active server, a health monitor can be created to check if a server is responding to HTTPS requests on port 443. The recommended health check is to use the Skybox native REST API health check call: https://<server_IP>:<port>/skybox/webservice/jaxrsinternal/internal/healthcheck/ping The load balancer will only forward to traffic to the server that is responding to the health check requests on port 443. If the Primary server is unreachable, the switch over will be activated and the F5 will detect the Secondary server as active and forward traffic to its port 443. See Figure 1.3 below for an example setup. Figure 1.3 – example of a load-balancer setup skyboxsecurity.com 6 Failover Scenario Explained When the Primary server stops responding according to the timeout thresholds configured, the Secondary server will initiate the failover and become Active (Primary). All tasks will then be operational on the Secondary server. When the Primary server comes back online, the Primary server will detect that the Secondary is active and remain as Passive. This is by design to allow customers/support personnel the time to troubleshoot and diagnose the reason for the Primary server’s connection issue. Once the customer/support personnel have completed their troubleshooting, the Secondary server can then be manually set back to Passive and the Primary server back to Active. Figure 1.4 – Failover scenario diagram skyboxsecurity.com 7 BEFORE INSTALLATION Before the installation of the Skybox Appliance HA package, certain modules and dependencies are required to be installed. *Note: All the below settings should be applied to both primary and secondary server Required Dependencies/Setup: a) Minimum supported for HA is Skybox version 10.0.203 (ISO) b) Ports 22/tcp, 8443/tcp and 443/tcp are opened between appliances in both directions (firewalls) c) Package: ‘sudo’ d) Package: ‘rsync’ e) Package: ‘python-suds’ f) Package: ‘incron’ The packages above have been checked and are available in the 10.0.203 ISO – See Appendix Using Yum to Verify Installed Package In order to check if the required packages (as listed above) are installed, use the ‘yum info <package name>’ command. The output should show an “installed” status for return Repo value. If the return value is “base”, then package isn’t installed. For example, for the python-suds package, would enter the following command in the command shell as the ‘root’ user: [root@skybox-pri /home/skyboxview]# yum info python-suds Loaded plugins: fastestmirror Determining fastest mirrors * base: centos.interhost.net.il * extras: centos.interhost.net.il * updates: mirror.muvhost.com Installed Packages Name : python-suds Arch : noarch Version : 0.4.1 Release : 5.el7 Size : 939 k Repo : installed From repo : anaconda Summary : A python SOAP client URL : https://fedorahosted.org/suds License : LGPLv3+ Description : The suds project is a python soap web services client lib. Suds leverages : python meta programming to provide an intuitive API for consuming web : services. Objectification of types defined in the WSDL is provided : without class generation. Programmers rarely need to read the WSDL since : services and WSDL based objects can be easily inspected. Alternatively, the command ‘rpm -qa <package name> can be used to check if the package has been installed. The output from the command will show the package version currently installed. [root@skybox-pri /opt/INIT]$ rpm -qa python-suds python-suds-0.4.1-5.el7.noarch Repeat the steps on the Secondary server. skyboxsecurity.com 8 Using Yum to Install Missing Packages To install a missing package, use the “yum install -y <package-name>” command root@skybox-pri /home/skyboxview]# yum install -y python-suds Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: centos.interhost.net.il * extras: centos.interhost.net.il * updates: mirror.muvhost.com base | 3.6 kB 00:00:00 extras | 2.9 kB 00:00:00 updates | 2.9 kB 00:00:00 (1/2): extras/7/x86_64/primary_db | 194 kB 00:00:00 (2/2): updates/7/x86_64/primary_db | 2.1 MB 00:00:04 Resolving Dependencies --> Running transaction check ---> Package python-suds.noarch 0:0.4.1-5.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ==================================================================================== Package Arch Version Repository Size ==================================================================================== Installing: python-suds noarch 0.4.1-5.el7 base 204 k Transaction Summary ==================================================================================== Install 1 Package Total download size: 204 k Installed size: 939 k Downloading packages: python-suds-0.4.1-5.el7.noarch.rpm | 204 kB 00:00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing: python-suds-0.4.1-5.el7.noarch Verifying: python-suds-0.4.1-5.el7.noarch 1/1 1/1 Installed: python-suds.noarch 0:0.4.1-5.el7 Complete! Repeat the steps on the Secondary server. skyboxsecurity.com 9 Configuring sudo for ‘skyboxview’ The account used to log into the HA will be ‘skyboxview’. In order to have the correct permissions, the account needs to run the HA module, the ‘skyboxview’ user will be defined as a sudoer. As the ‘root’ user, enter the following commands on the command line: a) Enter the command ‘visudo’ to edit the sudoer file. Use ONLY the ‘visudo’ command as it protects the correct format for the file: [root@skybox-pri /home/skyboxview]# visudo b) Uncomment the following line by removing the ‘#’ in front of it. Move the cursor in front of the ‘#’ mark and press the ‘x’ key to delete the sing character: Before: ## Same thing without a password # %wheel ALL=(ALL) NOPASSWD: ALL After: After ## Same thing without a password %wheel ALL=(ALL) NOPASSWD: ALL c) Save and quit by entering the following command. (*Note – this is lowercase, don’t use upper!): ESC : x (ESC refers to the Escape key) d) Add the “skyboxview” user to the “wheel” with the following command on the command line: [root@skybox-pri /home/skyboxview]# usermod -aG wheel skyboxview e) Verify that the “skyboxview” user is part of the wheel group with the following command on the command line. The ‘wheel’ group should be listed. See the highlight example below: [root@skybox-pri /home/skyboxview]# groups skyboxview skyboxview : skyboxview wheel Repeat the steps on the Secondary server. skyboxsecurity.com 10 Setting up SSH Trust Between Servers The Primary and Secondary servers require the ability to login to each other without the prompting of a password. To facilitate this need, the servers will use Trusted SSH keys between them in order to connect without prompting for credentials. a) Login as the skyboxview user. Use the following command on the command line: [root@skybox-pri /home/skyboxview]# su skyboxview b) First, create the ssh-keys by entering the “ssh-keygen -t dsa” command, leave the passphrase empty and use the default settings. This will create the public and private SSH keys used for the connection. [skyboxview@skybox-pri ~]$ ssh-keygen -t dsa Generating public/private dsa key pair. Enter file in which to save the key (/home/skyboxview/.ssh/id_dsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/skyboxview/.ssh/id_dsa. Your public key has been saved in /home/skyboxview/.ssh/id_dsa.pub. The key fingerprint is: SHA256:yo1Rfk8bZEg3Wqc0kfdUlSRSXFRdzdag6/maRE8BzGk skyboxview@skybox-pri The key's randomart image is: +---[DSA 1024]----+ | .+@***#| | . *EOo.B| | . o.=..+ | | o o . ..| | . S . = . | | . = . = * | | + . * . | | . o | | o.. | +----[SHA256]-----+ c) Next, copy the public key to the destination server that the host will connect to by entering the ‘ssh-copy-id -i ~/.ssh/id_dsa.pub skyboxview@<other IP or hostname>’ command. Enter ‘yes’ for the RSA key fingerprint, if prompted. Enter the ‘skyboxview’password when prompted. For example, the output below displays the output from issuing the command to connect to skybox-sec [skyboxview@skybox-pri ~]$ ssh-copy-id -i ~/.ssh/id_dsa.pub skyboxview@skybox-sec /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/skyboxview/.ssh/id_dsa.pub" /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys Authorized uses only. All activity may be monitored and reported. skyboxview@skybox-sec's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'skyboxview@skybox-sec'" and check to make sure that only the key(s) you wanted were added. skyboxsecurity.com 11 d) Verify that the ssh-keys were installed correctly by connecting via ssh to the destination server. There should NOT be prompt for a password if the ‘ssh-copy-id’ command was successfully installed: [skyboxview@skybox-pri ~]$ ssh skyboxview@skybox-sec Authorized uses only. All activity may be monitored and reported. Last failed login: Tue Jun 23 22:03:50 IDT 2020 from skybox-pri.il.skyboxsecurity.com on ssh:notty Last login: Tue Jun 23 17:49:11 2020 ======================================================= | | | ##### # # # # ###### ####### # # | | # # # # # # # # # # # # | | # # # # # # # # # # # | | ##### ### # ###### # # # | | # # # # # # # # # # | | # # # # # # # # # # # | | ##### # # # ###### ####### # # | | | ======================================================= APPLIANCE VERSION CORES RAM REDHAT RELEASE : : : : 10.1.205-7.1.135 4 32012 MB CentOS Linux release 7.7.1908 (Core) Access Skybox Web Admin at https://192.168.100.209:444 This option will place you in an un-restricted shell. It should be used only to set up the network connection according to the instructions in the Appliance Quick Start Guide. All other interactions with the Appliance should be done via the Web Administration. [skyboxview@skybox-sec ~] Repeat the steps on the Secondary server. skyboxsecurity.com 12 INSTALLING THE SKYBOX HA MODULE The Skybox HA module will be installed on both the Primary and Secondary servers in the ‘skyboxview’ home directory: /home/skyboxview/ The HA Scripts can be found on the PS EMEA Git Hub [https://dev.azure.com/SkyboxPS] Extracting the Module Files a) Copy over the ha_v*.zip file to the /home/skyboxview directory as the ‘skyboxview’ user (otherwise the file permissions will use ‘root’ instead of the ‘skyboxview’ user. Use whatever method you are comfortable with. For example, using scp: C:\SkyBox\Projects\HA>scp ha.zip skyboxview@skybox-pri:/home/skyboxview skyboxview@skybox-pri's password: ha.zip | 579 kB | 579.1 kB/s | ETA: 00:00:00 | 100% C:\SkyBox\Projects\HA> b) Extract the zip file into the ha directory: skyboxview@skybox-pri ~]$ unzip ha.zip Archive: ha.zip extracting: ha/0 inflating: ha/README.md inflating: ha/backupTicket.sh inflating: ha/conf_files_sync.sh inflating: ha/cron inflating: ha/etc/allow_rsync_policy.pp inflating: ha/etc/allow_rsync_policy.te < - - - OUTPUT OMMITED - - - > inflating: ha/loadticket.sh extracting: ha/log/0 inflating: ha/req.xml inflating: ha/setprimary.sh inflating: ha/setsecondary.sh inflating: ha/skybox_ha.conf inflating: ha/skybox_watchdog.sh extracting: ha/ticket_backups/0 inflating: ha/unsetha.sh c) Verify the file was unzipped by issuing the command ‘ll’ on the command line: [skyboxview@skybox-pri ~]$ ll total 584 drwxr-x---. 6 skyboxview skyboxview -rw-r--r--. 1 skyboxview skyboxview 4096 Jun 23 15:21 ha 2317409 Jun 23 15:14 ha.zip d) Perform the same setup on the remote server skyboxsecurity.com 13 Configuring INCRON The INCRON service monitors the directories that will be rsync’d from the Primary server to the Secondary server a) As the ‘root’ user, configure INCRON to start when the server boots: [root@skybox-pri ~]# systemctl enable incrond Created symlink from /etc/systemd/system/multi-user.target.wants/incrond.service to /usr/lib/systemd/system/incrond.service. b) Verify that INCRON was configured correctly [root@skybox-pri ~]# systemctl status incrond â— incrond.service - Inotify System Scheduler Loaded: loaded (/usr/lib/systemd/system/incrond.service; enabled; vendor preset: disabled) Active: inactive (dead) since Tue 2020-06-23 15:25:34 BST; 1s ago Main PID: 13768 (code=exited, status=0/SUCCESS) c) Start the INCRON service [root@skybox-pri skyboxview]# systemctl start incrond Repeat the steps on the Secondary server. skyboxsecurity.com 14 Server Configuration Files & Scripts The Skybox HA module uses a configuration file to pull the necessary information from. The file is the skybox_ha.conf file located in ‘/home/skyboxview/ha/’: (full path is ‘/home/skyboxview/ha/skybox_ha.conf’) and is used to configure the different options for the HA setup. The shell scripts located in the HA directory will need to have the execute bit set by the ‘skyboxview’ user to ensure that the scripts will run properly. a) As the ‘skyboxview’ user, enable the execute bit for all shell scripts by running the following command: [skyboxview@skybox-pri ~/ha]$ sudo chmod +x *.sh b) Verify that all the shell scripts have the execute bit set by using the following command: [skyboxview@skybox-pri ~/ha]$ ll total 108 -rw-r-----. 1 skyboxview skyboxview 0 Jun 23 14:14 0 -rwxr-x---. 1 skyboxview skyboxview 1282 Jun 23 14:14 backupTicket.sh -rwxr-x---. 1 skyboxview skyboxview 2275 Jun 23 14:14 conf_files_sync.sh -rw-r-----. 1 skyboxview skyboxview 265 Jun 23 14:14 cron drwxr-x---. 2 skyboxview skyboxview 4096 Jun 23 15:21 etc drwxr-x---. 2 skyboxview skyboxview 4096 Jun 23 15:21 firewall -rw-r-----. 1 skyboxview skyboxview 922 Jun 23 14:14 incron.primary -rwxr-x---. 1 skyboxview skyboxview 1316 Jun 23 14:14 loadmodel.sh -rwxr-x---. 1 skyboxview skyboxview 1153 Jun 23 14:14 loadticket.sh drwxr-x---. 2 skyboxview skyboxview 4096 Jun 23 15:21 log -rw-r-----. 1 skyboxview skyboxview 304 Jun 23 14:14 README.md -rw-r-----. 1 skyboxview skyboxview 3739 Jun 23 14:14 releasenotes.txt -rwxr-x---. 1 skyboxview skyboxview 827 Jun 23 14:14 replicate_fw_config.sh -rw-r-----. 1 skyboxview skyboxview 146 Jun 23 14:14 req.xml -rwxr-x---. 1 skyboxview skyboxview 12768 Jun 23 14:14 setprimary.sh -rwxr-x---. 1 skyboxview skyboxview 12826 Jun 23 14:14 setsecondary.sh -rw-r-----. 1 skyboxview skyboxview 1366 Jun 23 14:14 skybox_ha.conf -rwxr-x---. 1 skyboxview skyboxview 5862 Jun 23 14:14 skybox_watchdog.sh drwxr-x---. 2 skyboxview skyboxview 4096 Jun 23 15:21 ticket_backups -rwxr-x---. 1 skyboxview skyboxview 5069 Jun 23 14:14 unsetha.sh skyboxsecurity.com 15 c) Edit the skybox_ha.conf file located in the /home/skyboxview/ha directory. *NOTE: The DIRECTORIES variable should not be changed! This is used for directories that are to be synced. The OTHER_SERVER variable MUST have single quotes. # Set the server role: 1 = Primary; 2 = Secondary SERVER_ROLE=1 # Set options for REST ping OTHER_SERVER='otherserver' TIMEOUT=10 RETRIES=5 WAIT_BETWEEN_RETRIES=3 //IP or //value //enter //value # Configure REST ping intervals REST_PING_INTERVAL=10 //interval between each ping check hostname of counterpart server in seconds amount of retries before failure consideration in seconds time between each ping attempt # Set the base installation location of Skybox (case sensitive) SKYBOX_HOME=/opt/skyboxview # Set the Skybox HA script locations (case sensitive) SKYUSERHOME=/home/skyboxview/ha # Configure the following log locations WATCHDOGLOG=$SKYUSERHOME/log/watchdog.log SKYBOXLOG=$SKYUSERHOME/log/skybox_ha.log TICKETSTATUS=$SKYUSERHOME/log/ticket_statusfile.log LIVEREPORTS=$SKYUSERHOME/log/live_reports.log DATACOLLECTOR=$SKYUSERHOME/log/data_collector.log MODELSTATUS=$SKYUSERHOME/log/model-statusfile.log FWCONFIGSYNC=$SKYUSERHOME/log/fw_config_sync.log SBSERVCONFSYNC=$SKYUSERHOME/log/sb_serv_config_sync.log # Directory locations - Do not change! DIRECTORIES=($SKYBOX_HOME/data/xml_models $SKYBOX_HOME/data/sqlx_models) # Configure the following below to enable the mail settings MAIL_SENDER= MAIL_RECEIVER= # Enable this flag to disable the GUI port on the secondary # Only required configure on the secondary configuration - Values: true/false DISABLE_SECONDARY_GUI=false # Configure the following if you are using a proxy between the Primary and Secondary # Values: true/false PROXYVAR=false PROXYUSER= PROXYPASSWORD= PROXYHOST= PROXYPORT= d) Before continuing, make sure BOTH Primary and Secondary servers have the skybox_ha.conf file configure and the scripts have the execute bit set. skyboxsecurity.com 16 SETTING UP HA Now it’s time to complete the set up for the Primary and Secondary server. *IMPORTANT: These scripts must be run as the ‘skyboxview’ user!!! Setting Up Secondary Server The Secondary server must be set up first, as the Primary server will perform a ‘role’ check when its script is run. The script to set up the Secondary server is the ‘setsecondary.sh’ script located in the ‘/home/skyboxview/ha/’ directory. The script will do the following: • Create the cronjobs for the ‘skyboxview’ user, which includes the skybox_watchdog.sh and the conf_files_sync.sh scripts • Create the incron jobs for the ‘skyboxview’ user for INCROND to monitor and automatically load any model files received • Disable any port forwarding or firewalls • Set the sender/receiver email addresses for the alerts • Set the task scheduling in /opt/skyboxview/server/conf/sb_server.properties to ‘false’ a) Log in as the ‘skyboxview’ user, cd to the /home/skyboxview/ha directory and run the set primary script. *NOTE: ‘auto’ MUST be specified, otherwise the script will not configure the server properly! [skyboxview@skybox-sec ~/ha]$ ./setsecondary.sh auto ***************************************************************** ***************************************************************** ***** ***** ***** Configuring HA - This server role = SECONDARY ***** ***** ***** ***************************************************************** ***************************************************************** 2020-06-22 16:54:59 - Configuring HA - This server role = SECONDARY 2020-06-22 16:54:59 - The user is skyboxview. Proceeding to set as secondary... 2020-06-22 16:54:59 - Verifying ticket_attachments directory exists 2020-06-22 16:54:59 - Directory /opt/skyboxview/data/ticket_attachments doesn't exist. Creating ticket_attachments folder 2020-06-22 16:54:59 - Directory /opt/skyboxview/data/ticket_attachments was created successfully 2020-06-22 16:54:59 - Removing file /home/skyboxview/ha/incron.primary 2020-06-22 16:54:59 - File /home/skyboxview/ha/incron.primary was removed successfully 2020-06-22 16:54:59 - Removing contents of INCRONTAB for user skyboxview removing table for user 'skyboxview' table for user 'skyboxview' successfully removed skyboxsecurity.com 17 2020-06-22 16:54:59 - Removed contents of INCRONTAB for user skyboxview 2020-06-22 16:54:59 - Recreating & reloading INCRONTAB copying table from file '/home/skyboxview/ha/incron.secondary' requesting table reload for user 'skyboxview'... request done 2020-06-22 16:54:59 - Reloaded INCRONTAB 2020-06-22 2020-06-22 2020-06-22 2020-06-22 2020-06-22 2020-06-22 2020-06-22 16:54:59 16:54:59 16:54:59 16:54:59 16:54:59 16:54:59 16:54:59 - Auto failover configured - Resetting skybox crontab and adding entries - skybox_watchdog.sh entry added to cron - conf_files_sync.sh entry added to cron - backupTicket.sh entry doesn't exist in cron - replicate_fw_config.sh entry doesn't exist in cron - Skybox HA crontab entries added 2020-06-22 16:54:59 - Disable Secondary Gui is set to false 2020-06-22 16:54:59 - Detected firewalld, loading configuration success 2020-06-22 16:55:00 - Firewalld configuration was uploaded: firewall with secondary gui enabled successfully reloaded 2020-06-22 16:55:00 - Disabling Task Scheduling 2020-06-22 16:55:02 - Disabled Task Scheduling 2020-06-22 16:55:02 - Updating flag 'enable_collectors_auto_update' in sb.server.properties 2020-06-22 16:55:02 - 'enable_collectors_auto_update' flag is set to true, changing flag to false 2020-06-22 16:55:02 - 'enable_collectors_auto_update' flag has been successfully set to false 2020-06-22 16:55:02 - Host has been successfully set as SECONDARY *** HIGH AVAILABILITY CONFIGURED *** *** This server: SECONDARY *** b) Verify that the cron jobs were created successfully [skyboxview@skybox-sec ~/ha]$ crontab -l */10 * * * * /home/skyboxview/ha/skybox_watchdog.sh >/dev/null 2>&1 */60 * * * * /home/skyboxview/ha/conf_files_sync.sh >/dev/null 2>&1 c) Verify that the incron jobs were created successfully by using the following. skyboxview@skybox-sec ~/ha]$ incrontab -l /opt/skyboxview/data/xml_models IN_MOVED_TO /home/skyboxview/ha/loadmodel.sh $@/$# /opt/skyboxview/data/sqlx_models IN_MOVED_TO /home/skyboxview/ha/loadmodel.sh $@/$# /home/skyboxview/ha/ticket_backups/ IN_MOVED_TO /home/skyboxview/ha/loadticket.sh $@/$# skyboxsecurity.com 18 d) Verify that the firewall is disabled, and port forwarding isn’t enabled. FirewallD [skyboxview@skybox-sec ~/ha]$ sudo firewall-cmd --zone=public --list-all public (active) target: default icmp-block-inversion: no interfaces: ens160 sources: services: dhcpv6-client ssh ports: 9443/tcp 123/udp 8443/tcp 514/udp 514/tcp 161/udp 444/tcp protocols: masquerade: no forward-ports: source-ports: icmp-blocks: rich rules: iptables – there should be nothing under the ‘PREROUTING’ chain [skyboxview@skybox-sec ~/ha]$ sudo iptables --table nat --list Chain PREROUTING (policy ACCEPT) target prot opt source destination Chain INPUT (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination Chain POSTROUTING (policy ACCEPT) target prot opt source destination skyboxsecurity.com 19 Setting Up Primary Server The script to set up the Primary server is the ‘setprimary.sh’ script located in the ‘/home/skyboxview/ha/’ directory. The script will do the following: • Create the cronjobs for the ‘skyboxview’ user, which includes the skybox_watchdog.sh, backupTicket.sh, and • replicate_fw_config.sh scripts • Create the incron jobs for the ‘skyboxview’ user to rsync the file over the Secondary server • Create the port forwarding of port 443/tcp to forward to port 8443/tcp • Set the sender/receiver email addresses for the alerts • Set the task scheduling in /opt/skyboxview/server/conf/sb_server.properties to ‘true’ a) Log in as the ‘skyboxview’ user, cd to the /home/skyboxview/ha directory and run the set primary script. *NOTE: ‘auto’ MUST be specified, otherwise the script will not configure the server properly! skyboxview@skybox-pri ~/ha]$ ./setprimary.sh auto *************************************************************** *************************************************************** ***** ***** ***** Configuring HA - This server role = PRIMARY ***** ***** ***** *************************************************************** *************************************************************** 2020-06-22 17:23:31 - Configuring HA - This server role = PRIMARY 2020-06-22 17:23:31 - The user is skyboxview. Proceeding to set as primary... 2020-06-22 17:23:57 - Other server is not primary... continuing 2020-06-22 17:23:57 - Removing file /home/skyboxview/ha/incron.primary 2020-06-22 17:23:57 - File /home/skyboxview/ha/incron.primary was removed successfully 2020-06-22 17:23:57 - Removing contents of INCRONTAB for user skyboxview removing table for user 'skyboxview' table for user 'skyboxview' successfully removed 2020-06-22 17:23:57 - Removed contents of INCRONTAB for user skyboxview 2020-06-22 17:23:57 - Recreating & reloading INCRONTAB copying table from file '/home/skyboxview/ha/incron.primary' requesting table reload for user 'skyboxview'... request done 2020-06-22 17:23:57 - Reloaded INCRONTAB 2020-06-22 2020-06-22 2020-06-22 2020-06-22 2020-06-22 2020-06-22 2020-06-22 17:23:57 17:23:57 17:23:57 17:23:57 17:23:57 17:23:57 17:23:57 - Auto failover configured - Resetting skybox crontab and adding entries - skybox_watchdog.sh entry added to cron - backupTicket.sh entry added to cron - replicate_fw_config.sh entry added to cron - conf_files_sync.sh entry doesn't exist in cron - Skybox HA crontab entries added 2020-06-22 17:23:57 - Detected Firewalld service, loading configuration skyboxsecurity.com 20 success 2020-06-22 17:23:58 - Firewalld configuration was uploaded, firewall was successfully reloaded 2020-06-22 17:23:58 - No mail parameters supplied 2020-06-22 17:23:58 - Enabling Task Scheduling 2020-06-22 17:24:00 - Task Scheduling enabled 2020-06-22 17:24:00 - Updating flag 'enable_collectors_auto_update' in sb.server.properties 2020-06-22 17:24:00 - 'enable_collectors_auto_update' flag is set to true 2020-06-22 17:24:00 - Host has been successfully set as PRIMARY *** HIGH AVAILABILITY CONFIGURED *** *** This server: PRIMARY *** b) Verify that the cron jobs were created successfully [skyboxview@skybox-pri ~/ha]$ crontab -l */10 * * * * /home/skyboxview/ha/skybox_watchdog.sh >/dev/null 2>&1 */60 * * * * /home/skyboxview/ha/backupTicket.sh >/dev/null 2>&1 */60 * * * * /home/skyboxview/ha/replicate_fw_config.sh >/dev/null 2>&1 c) Verify that the incron jobs were created successfully by using the following. skyboxview@skybox-pri ~/ha]$ incrontab -l /opt/skyboxview/data/xml_models/ IN_CLOSE_WRITE /usr/bin/rsync --logfile=/home/skyboxview/ha/log/model-statusfile.log -av $@/$# skyboxsec:/opt/skyboxview/data/xml_models/ /opt/skyboxview/data/sqlx_models/ IN_CLOSE_WRITE /usr/bin/rsync --logfile=/home/skyboxview/ha/log/model-statusfile.log -av $@/$# skyboxsec:/opt/skyboxview/data/sqlx_models/ /opt/skyboxview/data/ticket_attachments/ IN_CLOSE_WRITE /usr/bin/rsync --logfile=/home/skyboxview/ha/log/ticket_statusfile.log -rav $@/$# skyboxsec:/opt/skyboxview/data/ticket_attachments/ /opt/skyboxview/data/Live/reports/ IN_CLOSE_WRITE /usr/bin/rsync --logfile=/home/skyboxview/ha/log/live_reports.log -rav $@/$# skyboxsec:/opt/skyboxview/data/Live/reports/ /opt/skyboxview/data/collector/data_collector/ IN_CLOSE_WRITE /usr/bin/rsync --logfile=/home/skyboxview/ha/log/data_collector.log -rav $@/$# skyboxsec:/opt/skyboxview/data/collector/data_collector skyboxsecurity.com 21 d) Verify that the firewall is disabled, and port forwarding isn’t enabled. FirewallD [skyboxview@skybox-pri ~/ha]$ sudo firewall-cmd --zone=public --list-all public (active) target: default icmp-block-inversion: no interfaces: ens160 sources: services: rsyncd ssh ports: 9443/tcp 8443/tcp 3306/tcp 123/udp 161/udp 8099/tcp 514/tcp 514/udp 444/tcp protocols: masquerade: yes forward-ports: port=443:proto=tcp:toport=8443:toaddr= source-ports: icmp-blocks: rich rules: iptables [skyboxview@skybox-sec ~/ha]$ sudo iptables --table nat --list Chain PREROUTING (policy ACCEPT) target prot opt source destination REDIRECT tcp -- anywhere anywhere tcp dpt:https redir ports 8443 Chain INPUT (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination Chain POSTROUTING (policy ACCEPT) target prot opt source destination e) Finally, setup the ‘Model Backup’ task on the Primary server using the required scheduling skyboxsecurity.com 22 Failover Recovery In a scenario where there is a failover event and the Primary Server has diagnosed and recovered, a failover recovery will need to be initiated to bring the primary back online as Active. Follow the steps below to recover the Primary as the Active server. Ensure the Secondary server is setup first! a) Run the ./unset.sh command on both the primary and secondary server: skyboxview@skybox-pri ~/ha]$ ./unsetha.sh ****************************************** ****************************************** ***** ***** ***** Disabling HA ***** ***** ***** ****************************************** ****************************************** 2020-06-23 2020-06-23 2020-06-23 2020-06-23 2020-06-23 2020-06-23 2020-06-23 13:40:06 13:40:06 13:40:06 13:40:06 13:40:06 13:40:06 13:40:06 - Disabling HA - Removing HA Cron entries - skybox_watchdog.sh entry doesn't exist in cron - backupTicket.sh entry doesn't exist in cron - replicate_fw_config.sh entry doesn't exist in cron - conf_files_sync.sh entry doesn't exist in cron - HA Cron entries removed 2020-06-23 13:40:06 - Clearing incron.primary copying table from file '/home/skyboxview/ha/incron.primary' 2020-06-23 13:40:06 - incron.primary cleared successfully 2020-06-23 13:40:06 - Detected Firewalld service, resetting configuration success 2020-06-23 13:40:09 - Firewalld configuration was uploaded: firewall successfully reloaded 2020-06-23 13:40:09 - HA has been unset ************************************************* ************************************************* ***** ***** ***** HA has been disabled ***** ***** ***** ************************************************* ************************************************* b) Verify the incrond service is up and running ( systemctl status incrond). If the incron service is down, restart the service (systemctl start incrond) c) Run the setsecondary.sh script on the Secondary server with the following command: ./setsecondary auto d) Run the setprimary.sh script on the Primary serve with the following command: ./setprimary auto skyboxsecurity.com 23 e) Verify the contents of crontab (crontab -l) and incrontab (incrontab -l) of both the primary and secondary servers match the contents in the ‘Setting Up Secondary Server’ and ‘Setting Up Primary Server’ sections. f) Run the model backup task. The model file will be copied to the secondary and uploaded. Verify that the model from the primary server is loaded into the secondary machine. Updating to a New Version of HA The process to update the HA installation is a manual process. In order to update HA: a) Record values from the skybox_ha.conf file for both the primary and secondary servers b) Delete the ha directory rm -rf ha/ c) Copy the zip file for the new version of HA to the directory where HA was installed d) Unzip the file using the unzip command. This will unzip the ha directory: unzip ha.v.xx.zip e) Update the skybox_ha.conf files with values from the previous configurations f) Setup the secondary server ./setsecondary auto g) Setup the primary server ./setprimary auto h) Finally check that the crontab and incrontab entries of both the primary and secondary server math the contents from the ‘Setting Up Secondary Server’ and ‘Setting Up Primary Server’ sections. skyboxsecurity.com 24 TROUBLESHOOTING Primary Not Syncing to Secondary a) Check connectivity and SSH settings (on both machines). The other server should respond to ping and you should be able to SSH to the other server without having to enter a password. ping -c 4 <other_server> ssh skyboxview@<other_server> b) Run the REST PING API request to verify the health check is operational. The command should return a “HTTP/1.1 200 OK” (highlighted below). For the primary server check ports 8443 and 443 curl --insecure --connect-timeout 3 https://<serverIP>:<port>/skybox/webservice/jaxrsinternal/internal/healthcheck/ping -v [skyboxview@skyboxapp /opt/ha]$ curl --insecure --connect-timeout 3 https://127.0.0.1:8443/skybox/webservice/jaxrsinternal/internal/healthcheck/ping -v * About to connect() to 127.0.0.1 port 8443 (#0) * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 8443 (#0) * Initializing NSS with certpath: sql:/etc/pki/nssdb * skipping SSL peer certificate verification * NSS: client certificate not found (nickname not specified) * SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 * Server certificate: * subject: CN=www.skyboxsecurity.com,OU=Skybox Security,O=Skybox Security,L=San Jose,ST=CA,C=US * start date: Apr 28 11:13:30 2015 GMT * expire date: Apr 23 11:13:30 2035 GMT * common name: www.skyboxsecurity.com * issuer: CN=www.skyboxsecurity.com,OU=Skybox Security,O=Skybox Security,L=San Jose,ST=CA,C=US > GET /skybox/webservice/jaxrsinternal/internal/healthcheck/ping HTTP/1.1 > User-Agent: curl/7.29.0 > Host: 127.0.0.1:8443 > Accept: */* > < HTTP/1.1 200 OK < Strict-Transport-Security: max-age=31622400; includeSubDomains < Date: Fri, 01 Feb 2019 01:50:36 GMT < Content-Length: 0 < Server: < * Connection #0 to host 127.0.0.1 left intact c) Ensure that the incrond service is running by issuing the command ‘ sudo systemctl status incrond’ as the skyboxview user. This will also ensure that sudo is configured correctly skyboxsecurity.com 25 INCRON Not Starting at Boot Root Cause: In the newer versions of the Skybox ISO, like the 9.0.208 ISO, the incrond service was not enabled to run on boot. Troubleshooting steps: Instead of using chkconfig, the services uses systemctl. Follow the steps listed below: a) Run the following command as the root user. Verify that the incrond service has been enabled for boot. Look for the ‘disable’ keyword (highlighted below) [root@skybox-pri /home/skyboxview]# systemctl status incrond â— incrond.service - Inotify System Scheduler Loaded: loaded (/usr/lib/systemd/system/incrond.service; disabled; vendor preset: disabled) Active: inactive (dead) b) Enable the incron service to run on server boot. The following command will create a symbolic link so that the service will start on next boot [root@skybox-pri /home/skyboxview]# systemctl enable incrond Created symlink from /etc/systemd/system/multi-user.target.wants/incrond.service to /usr/lib/systemd/system/incrond.service. c) Start the incron service using the command: systemctl start incrond d) Verify that the service has been started and enabled to run on boot root@skybox-pri /home/skyboxview]# systemctl status incrond â— incrond.service - Inotify System Scheduler Loaded: loaded (/usr/lib/systemd/system/incrond.service; enabled; vendor preset: disabled) Active: active (running) since Tue 2020-06-23 14:31:41 BST; 4s ago Process: 13767 ExecStart=/usr/sbin/incrond (code=exited, status=0/SUCCESS) Main PID: 13768 (incrond) CGroup: /system.slice/incrond.service └─13768 /usr/sbin/incrond Model Copies to Secondary, but doesn’t load a) Check that the incron service is running and enable to start on boot b) Check that the sbvserver service is running on the secondary skyboxsecurity.com 26 REST Ping fails due to curl error The REST health check API call may return a curl error. If the curl command returns a code 56, check with the customer to verify if there is a proxy enabled between the Primary and Secondary servers. Firewall configuration reset back to defaults When the unsetha.sh script is run, the Iptables and Firewalld configurations are reset. When the setprimary.sh and setsecondary.sh are ran, the configured firewall configurations replace any changes made on the server. The work around is to make the changes permanent in relevant configuration files found in the <ha_home>/firewall directory. skyboxsecurity.com 27 FAQ – FREQUENTLY ASKED QUESTIONS These are some of the frequently asked questions customers may ask about the Skybox HA Module. Which type of cluster are you providing? (hot-stand-by, cold-stand-by)? We are doing HA by using an Active/Passive in a hot-standby configuration. Is there automatic or manual failover or no failover at all? The Skybox HA package will detect a failover on the Primary and automatically switch over to the Secondary server. The HA will not fail back over to the Primary automatically and must be switched over by a manual method. This is by design to allow administrators the time required to troubleshoot/diagnose any issues with the Primary server. If there is failover, how is it triggered? There is an automatic failover from the Primary server to the Secondary server. This is triggered by a watchdog service on the Secondary server. This service monitors the Primary server by issuing SOAP requests to verify the server is operational and the responding to requests. If the Primary service fails the status check, then the watchdog service will automatically trigger the failover and set the Secondary server as the active server. Is the cluster active/passive or active/active? The HA module is set up as an active/passive model, where the Secondary server will become active if the Primary server becomes unreachable on port 443 after the configured timeouts have been reached. How are the IP addressing supposed to be setup? Is there a common IP? Are you using VRRP? If the customer would like to use load-balancing with the HA module, they will need to use an external source, like a load balancer. For example, on F5 load balancers, the customer can create a common VIP and put the servers into a pool. The HA Module uses a NAT rule to port forward all traffic received on port 443 to its internal port 8443. On the F5, a health monitor can be created to check if a server is responding to HTTPS requests on port 443. If the Primary server is unreachable, the switch over will be activated and the F5 will detect the Secondary server as active and forward traffic to its port 443. skyboxsecurity.com 28 Which data is synced between cluster members? First, the servers are not clustered. The following are synced from the Primary to the Secondary server. • Models • Sql models • Reports • Data collector • Ticket attachments • OS configurations What performance impact does syncing have on the servers? There’s no performance impact on the Primary server as it’s generating the backup models only. Only the Secondary server (which isn’t active), will be impacted as it’s uploading the model. The time is based on the size of the model. Is it recommended to use an extra interface for syncing? Yes, a second NIC is recommended. This will allow normal operations on the main NIC and allow the secondary NIC to be dedicated to HA. skyboxsecurity.com 29 APPENDIX ISO 10.0.203-7.1.8 Packages This section evidences the packages listed in the ‘Required Dependencies/Setup’ section are available in the Skybox ISO 10.0.203-7.1.8 [skyboxview@skyboxapp ~]$ get_appliance_details APPLIANCE_VERSION: 10.0.203-7.1.8 CORES: 2 FIPS_MODE: Disabled MODE: SERVER MODEL: RAM: 7803 MB REDHAT_RELEASE: CentOS Linux release 7.6.1810 (Core) SERIAL_NUMBER: SKYBOX: 10.0.203 [skyboxview@skyboxapp ~]$ yum info sudo Loaded plugins: fastestmirror Determining fastest mirrors * base: mirror.cov.ukservers.com * extras: mirror.cov.ukservers.com * updates: www.mirrorservice.org Installed Packages Name : sudo Arch : x86_64 Version : 1.8.23 Release : 3.el7 Size : 3.0 M Repo : installed From repo : anaconda Summary : Allows restricted root access for specified users URL : http://www.courtesan.com/sudo/ License : ISC Description : Sudo (superuser do) allows a system administrator to give certain : users (or groups of users) the ability to run some (or all) commands : as root while logging all commands and arguments. Sudo operates on a : per-command basis. It is not a replacement for the shell. Features : include: the ability to restrict what commands a user may run on a : per-host basis, copious logging of each command (providing a clear : audit trail of who did what), a configurable timeout of the sudo : command, and the ability to use the same configuration file (sudoers) : on many different machines. [skyboxview@skyboxapp ~]$ rpm -qa sudo sudo-1.8.23-3.el7.x86_64 skyboxsecurity.com 30 [skyboxview@skyboxapp ~]$ yum info rsync Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.cov.ukservers.com * extras: mirror.cov.ukservers.com * updates: www.mirrorservice.org Installed Packages Name : rsync Arch : x86_64 Version : 3.1.2 Release : 6.el7_6.1 Size : 815 k Repo : installed From repo : anaconda Summary : A program for synchronizing files over a network URL : http://rsync.samba.org/ License : GPLv3+ Description : Rsync uses a reliable algorithm to bring remote and host files into : sync very quickly. Rsync is fast because it just sends the differences : in the files over the network instead of sending the complete : files. Rsync is often used as a very powerful mirroring process or : just as a more capable replacement for the rcp command. A technical : report which describes the rsync algorithm is included in this : package. [skyboxview@skyboxapp ~]$ rpm -qa rsync rsync-3.1.2-6.el7_6.1.x86_64 [akyboxview@skyboxapp ~]$ yum info python-suds Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.cov.ukservers.com * extras: mirror.cov.ukservers.com * updates: www.mirrorservice.org Installed Packages Name : python-suds Arch : noarch Version : 0.4.1 Release : 5.el7 Size : 939 k Repo : installed From repo : anaconda Summary : A python SOAP client URL : https://fedorahosted.org/suds License : LGPLv3+ Description : The suds project is a python soap web services client lib. Suds leverages : python meta programming to provide an intuitive API for consuming web : services. Objectification of types defined in the WSDL is provided : without class generation. Programmers rarely need to read the WSDL since : services and WSDL based objects can be easily inspected. [skyboxview@skyboxapp ~]$ rpm -qa python-suds python-suds-0.4.1-5.el7.noarch skyboxsecurity.com 31 [skyboxview@skyboxapp ~]$ yum info incron Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.cov.ukservers.com * extras: mirror.cov.ukservers.com * updates: www.mirrorservice.org Installed Packages Name : incron Arch : x86_64 Version : 0.5.10 Release : 8.el7 Size : 249 k Repo : installed From repo : anaconda Summary : Inotify cron system URL : http://inotify.aiken.cz License : GPLv2 Description : This program is an "inotify cron" system. : It consists of a daemon and a table manipulator. : You can use it a similar way as the regular cron. : The difference is that the inotify cron handles : filesystem events rather than time periods. [skyboxview@skyboxapp ~]$ rpm -qa incron incron-0.5.10-8.el7.x86_64 skyboxsecurity.com 32