NetQoS High Availability Products
Installation and Operations

www.netqos.com

Contents

NetQoS High-Availability Products
    Before You Begin
Installing and Operating Double-Take
    Installation
        Monitoring and Notifications
    Failover Operations (Source to Target)
        Step 1 - Reconfigure the NetVoyant Master Console (NetVoyant Only)
    Failback (Target to Source) and Restoration (Target to Source) Operations
        To Manually Restore Service to the Source Server
        Step 1 - Restoration
        Step 2 - Failback (Target back to Source)
        Step 3 - Re-mirroring
        Step 4 - Reconfiguring the Master Console (NetVoyant Systems Only)
Appendix A – Starting the Mirror Manually
Appendix B – Configuring the Monitor Manually
Appendix C – Monitor Troubleshooting
Appendix D – Manually Invoking/Testing Failover

Copyright 2009 NetQoS, Inc. 877.835.9575 www.netqos.com

NetQoS High-Availability Products

Double-Take is a real-time data replication and failover software product that monitors updates received at the Source (primary) server and sends them to the Target (backup) server. The Target monitors the Source using a heartbeat (UDP broadcast on port 1100) and initiates failover when it detects that the Source is unavailable: if the heartbeat is not heard within 60 seconds, the Target takes over. This time is adjustable. When the outage is repaired, Operations can initiate failback during a designated maintenance window. Updates are restored to the Source from the Target with minimal disruption or data loss upon completing the restoration.

This document describes the steps necessary to install and operate Double-Take to provide high availability for NetQoS products. The currently supported High-Availability products are shown in the table below. Note that NPC is not supported when it is installed with another product on the same server. For example, when NPC and NV are installed together on a standalone server, NV can be supported for HA but NPC currently cannot.
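The heartbeat timeout described in the overview can be illustrated with a short sketch. This is not how Double-Take is implemented; it only models the decision rule stated above (fail over when no heartbeat has been heard for the timeout period, 60 seconds by default and adjustable). The class and method names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

# Values taken from this document; everything else below is illustrative.
HEARTBEAT_PORT = 1100          # UDP broadcast port used for the heartbeat
DEFAULT_TIMEOUT_SECONDS = 60   # default time without a heartbeat before failover


@dataclass
class HeartbeatMonitor:
    """Hypothetical model of the Target's view of the Source heartbeat."""
    timeout_seconds: float = DEFAULT_TIMEOUT_SECONDS
    last_heard: float = 0.0

    def record_heartbeat(self, now: float) -> None:
        # Remember the time (in seconds) at which the Source was last heard.
        self.last_heard = now

    def should_fail_over(self, now: float) -> bool:
        # Fail over once the silence has lasted for the full timeout period.
        return (now - self.last_heard) >= self.timeout_seconds
```

For example, a monitor that last heard the Source at t = 100 s reports no failover at t = 159.9 s and reports failover from t = 160 s onward. Passing a smaller `timeout_seconds` models the adjustable timeout.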
Component                                       High Availability Supported
SA Standalone                                   No
SA Management Console                           Yes
SA Collector                                    No
SA Aggregator                                   No
NetQoS Performance Center (NPC) Standalone      Yes
  (i.e. installed and running on its own server
  and not with any other NetQoS products)
NPC w/RA or SA distributed                      Yes
NV Standalone                                   Yes
NV Management Console                           Yes
NV Poller                                       Yes
RA Standalone                                   No
RA Console                                      Yes
RA DSA                                          Yes
RA Flow Manager                                 Yes
RA Harvester                                    Yes
UCM Standalone                                  No
UCM Management Console                          No
UCM Collector                                   No
Anomaly Detector                                No

If the Source is unavailable, the Target assumes the IP address of the Source (except in a NetVoyant environment) [...] performance information. Therefore, it is imperative that other components can communicate with the Target using the Source’s IP address. This does not apply to NetVoyant systems (i.e. the Target will not assume the IP address of the Source if the Source fails).

Before You Begin

Pre-install checklist:
- Source Server Name
- Source IP Address
- Target Server Name
- Target IP Address
- Double-Take Activation Code (one for each server)
- SMTP Server Name (for emailing alerts)
- Login User Name/Password (if required)

Double-Take uses ports 1100 and 1105 for communication. These must be enabled through firewalls, or alternative ports must be configured in the Double-Take console.

If the network is not configured to propagate UDP broadcasts, Double-Take servers must be added manually using Insert > Server.

The Double-Take service depends on the Windows Management Instrumentation (WMI) and Remote Procedure Call (RPC) services being enabled and running.

Installing and Operating Double-Take

Installation

1. Copy the Double-Take (DT) executable (e.g. DoubleTake_i386_5.1.1.867.0.exe) to both the Source and Target servers. A copy of the DT software is on the Manufacturing portal: https://oz.netqos.com/Departments/Manufacturing/OM/default.aspx.
   Double-click the “Double Take” folder, then click the EXE. If you have problems downloading the file, go to the Double-Take web site and click http://download.doubletake.com/?f=DoubleTake_i386_5.1.1.867.0.exe.
2. On the Source and Target servers, click the DT executable and click the “Unzip” button to unzip the files to the Temp directory. Then click “OK”.
3. The Double-Take Software Launcher should then appear (on both the Source and Target).
4. Click the “Double-Take for Windows” option and then click “Install Double-Take for Windows” (on both the Source and Target servers).
5. Install DT on the Source and Target servers using the installation defaults. Use 512 MB for maximum memory on Harvesters. Ignore any warning messages about missing registry keys. See the various options screens below. Make sure to have the activation code available.
6. Activation codes are in an Excel spreadsheet on our Manufacturing portal: https://oz.netqos.com/Departments/Manufacturing/OM/default.aspx. Double-click the “Double Take” folder and highlight the “Double Take Licenses” Excel spreadsheet. Select “Edit in MS Excel”. Fill in the spreadsheet, select two codes, and save. You should be asked to “Check in” the document; check the document back in. The codes will be required during the installation steps below.
7. Use 512 MB for maximum memory on Harvesters. The default of 50 MB is sufficient for other products.
8. Click “Yes” to reboot the server.

Figure: Double Take High Level Design. Primary Double Take Source: tlab54 (10.8.0.54). Back-up Double Take Target: tlab58 (10.8.0.7).

Key: MC = DT Management Console; FCC = Failover Control Center; DT Server = Double Take Server.

Double Take Ports Used (and required to be let through firewalls)

Port 1100 UDP Broadcast
- DT Server heartbeat. Sends and receives on this port. Used to tell the MC that the DT server is alive.
- FCC heartbeat. Sends and receives on this port. Used to populate the FCC console with discovered DT servers.
- MC heartbeat. Sends and receives on this port. Used to populate the MC console tree with discovered DT servers.

Port 1105 UDP Directed
- MC. Used by the MC to retrieve the status and statistics from the DT servers.
- DT Server. Updates the MC with the status and statistics of the server.

Port 1100 TCP
- FCC. Sends and receives communication to DT servers. Used by the FCC to connect to the DT servers.
- DT Server. Sends and receives communication between DT servers. Used to form a connection between DT servers.

9. Unzip the file install_ha.zip on the Source and Target systems.
10. Log in to the Source server and run install_ha.hta.
11. Enter the Source (e.g. tlab54) and Target (e.g. tlab58) Microsoft machine names (not the fully qualified domain names), and the Source server IP address (e.g. 10.8.0.54). Select your product and click the Source button.
12. Log in to the Target and run install_ha.hta. Repeat step 11 on the Target using the IP address the Source will have during the restore process (not applicable to NetVoyant applications). Select your product and click the Target button.

    If the NetQoS systems are not registered in the DNS, the Target and Source server names and IP addresses must be added to the local Hosts file (c:\windows\system32\drivers\etc\). In addition, the Target computer may need to be added to the Failover Control Center each time it is launched. See Appendix B for instructions to create the failover monitor manually.
13. Log in to the Target and run “C:\Program Files\DoubleTake\stop_services.bat” to stop MySQL and all NetQoS services.
    This ensures that all data files are closed on the Target so that the Double-Take Source can replicate changes to it. These services automatically restart on the Target when failover is initiated, and stop prior to completing the failback to the Source so that replication can resume.
14. By default, services are set to automatic startup. On the Target they need to be changed to manual startup.
    a. Log in to the Target. Select Start > Administrative Tools > Services, then right-click each NetQoS service and select Properties. Select Manual in the start-up type drop-down box.
    b. Repeat this for the MySQL service.
15. Verify that the Double-Take Management Console can communicate with the servers (on the Source). Select Start > All Programs > Double-Take > Management Console. Double-click each server icon (e.g. TLAB54) to log in to the server. If the NetQoS systems are not registered in the DNS, the Target and Source server names and IP addresses must be added to the local Hosts file.
16. Watch the status area at the bottom left of the MC pane (where “For Help, press F1” is normally displayed). When you double-click TLAB54, it should show “Login accepted” and then “Ready”. If login fails, check that the Double-Take service is running on that system. The Double-Take service depends on the Windows Management Instrumentation (WMI) and Remote Procedure Call (RPC) services being enabled and running.
17. For SuperAgent products, run regedit on the Target system and set HKLM\SOFTWARE\NetQoS\SuperAgent\Parameters\MasterDB to the IP address of the Source.
18. On the Source server, run “C:\Program Files\DoubleTake\install.bat”.
19. Open the Double-Take Management Console, select Monitor > New Message Window, and select your Source server (e.g. TLAB54). A message window for that server should open. Now double-click the Source server in the Double-Take Management Console to confirm that the backup replication set is running (look for a green tick). For an RA installation there will also be a second replication set, archive, running for the Harvester (again, look for a green tick). If a mirror does not start automatically, see Appendix A for steps to start it manually.
20. To verify that the Target is monitoring the Source, from the Double-Take Management Console, select Tools > Failover Control Center.
21. Select the Target machine in the drop-down box and click Login. The Target computer will need to be added to the Failover Control Center each time it is launched. See Appendix B for instructions to create the failover monitor manually.
22. Select the Source IP address (e.g. 10.8.0.54), then click “Edit Monitor” to verify that the “Current IP Address(es)” match the IP address of the Target system (10.8.0.7 in this example). See Appendix C for troubleshooting suggestions.

Monitoring and Notifications

Configure email notification from the Double-Take Management Console.

1. Start the Double-Take Console by selecting Start > Programs > Double-Take > Management Console.
2. Double-click the Source server to log in.
3. Right-click the server and select Properties.
4. Select the E-mail notification tab.
5. Select the enable notification check box to enable the email configuration, enter a Send to address, and click the Add button.
6. Set the events to “information” and exclude all but 5100 – event failover automatic and 5102 – manual intervention required (if enabled).
7. An SNMP trap service is also available. The Double-Take MIB is located at “C:\Program Files\DoubleTake\dt.mib”.

Failover Operations (Source to Target)

Failover occurs after the Target has received no response from the Source for 60 seconds. The Target polls the Source at a fixed interval; the monitor line shown in Appendix C uses “interval 5 timeout 60”. When a failover is detected, the Target sends an email notification (if configured). For NetQoS products (except NetVoyant), the Target system assumes the IP address of the Source system for transparent operations. NetQoS services start automatically on the Target. However, if you have a distributed NetVoyant system, one manual step is required (see below).

Step 1 - Reconfigure the NetVoyant Master Console (NetVoyant Only)

This section applies only to distributed NetVoyant systems. After a failover, configure the Target poller on the NetVoyant Console.

1. Exit the NetVoyant Console on the Master Console, if it is running.
2. Run reconfig_poller.hta.
3. Select the failed poller in the drop-down box, and then enter the name of the Target server and its IP address. Click OK.

NetVoyant Services and the database will be stopped and restarted. The NetVoyant Console now reflects the Target poller rather than the Source.

Failback (Target to Source) and Restoration (Target to Source) Operations

After a failure is resolved and the Source is accessible again, restoration of data from the Target to the Source must be initiated manually. The IP address of the Source NIC must be changed before the Source is connected to the network, to avoid an IP address conflict during the restoration phase (this step is not required for a NetVoyant installation).
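Before reconnecting the Source, it can help to confirm whether its original IP address is still answering on the network, which is expected while the Target holds that address after a failover. The following is a minimal sketch of such a check, run from any other machine. It assumes only that TCP port 1100 (listed in this document as a Double-Take port) is reachable from the machine running the check; the function name is hypothetical and is not part of Double-Take.

```python
import socket


def tcp_port_answers(host: str, port: int = 1100, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Illustrative helper only: port 1100 is the Double-Take TCP port named in
    this document; the check itself is a plain TCP connection attempt.
    """
    try:
        # create_connection raises OSError on refusal, timeout or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_port_answers("10.8.0.54")` returning True suggests that the example Source address from this document is currently in use, so the Source NIC should keep its temporary address until failback. A False result does not prove the address is free, since a firewall may simply be blocking the port.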
To Manually Restore Service to the Source Server

To manually restore service to the Source server (once it is back up and running), carry out the following steps:

1. Restoration
2. Failback
3. Re-mirror
4. Reconfigure the NV Master Console (this step is only required for distributed NetVoyant systems)

Step 1 - Restoration

1. On the Source server, stop all NetQoS services by running “C:\Program Files\doubletake\stop_services.bat”.
2. Change the IP address of the Source NIC before connecting the Source system to the network. NOTE: This is not required for NetVoyant systems.
3. On the Target server, run “C:\Program Files\DoubleTake\nqrestore.bat” to initiate restoration.
4. The rstrbackup replication process will start synchronizing files on the Target with the Source. Updates on the Target (such as new polling responses) will be replicated to the Source in real time. Note that Harvesters (for an RA installation) have a second replication set, rstrarchive, running. If the mirroring does not start automatically (the Target data state indicates “mirror required”), see Appendix A to initiate mirroring manually. Restoration is complete when the mirror status is Idle and Sent Bytes stops incrementing.

Step 2 - Failback (Target back to Source)

When restoration of the Source is complete, initiate the failback process from the Target server:

1. From the Double-Take Management Console, select Tools > Failover Control Center.
2. Select the Target machine in the drop-down box and click Login. NOTE: The Target must be added manually (Add Target) if the servers are not registered in a DNS.
3. Select the Source server in the pane and click Failback.
4. A warning appears: “The Source has not been restored. Do you want to fallback?”. Click OK. This stops NetQoS services on the Target. New updates will not be processed from this time until failback completes (so data will not be captured by the NetQoS products during this period).
5. The next prompt will be “Click Continue to restart monitoring”. This will cause the Target to release the Source IP address. Leave this box open temporarily. The process will complete in step 7, below.
6. Now restore the Source NIC to its original IP address before proceeding. This step is not required for NetVoyant systems.
7. Return to the FCC on the Target server and complete the failback process by clicking “Continue” in the prompt from step 5.
8. Exit the Double-Take Management Console on the Target.
9. NetQoS services will start on the Source machine, and the Target will resume monitoring the Source for failure.

Step 3 - Re-mirroring

The mirroring and replication processes must be restarted on the Source server.

1. Run “C:\Program Files\DoubleTake\failback.bat” to initiate mirroring and replication.
2. The Source will begin synchronizing its files and updates from the Target.
3. See Appendix A if mirroring does not start automatically.

Step 4 - Reconfiguring the Master Console (NetVoyant Systems Only)

After failback completes and NetVoyant is again running on the Source, reconfigure the Source poller in the NetVoyant Console.

1. Exit the NetVoyant Console on the Master Console, if it is running.
2. Run reconfig_poller.hta.
3. Select the (Target) backup poller in the drop-down box and enter the name of the Source server and its IP address. Click OK.

NetVoyant Services and the database will be stopped and restarted. The NetVoyant Console now reflects the Source poller rather than the Target.
Appendix A – Starting the Mirror Manually

Follow these steps if a mirror does not start automatically:

1. Right-click the mirror that has not started.
2. Select Mirroring, then Start.
3. Check “Send data only if Source is newer than Target” for archive replication sets, on Harvester systems only.
4. Check “Use block checksum” for the backup replication sets on all products.
5. Click OK.

Appendix B – Configuring the Monitor Manually

1. Start the Failover Control Center: Start > Programs > Double-Take > Failover Control Center.
2. Click Add Target for the backup server.
3. Click Login.
4. Click Add Monitor. A prompt will appear for the Source (monitored) server name.
5. Check the box next to the Source server to enable the other options. Ensure the IP Addresses box under Items to Failover is checked for all products other than NetVoyant.
6. Click Scripts… to add the failover and failback scripts.
7. Click … to browse to the files for the following scripts:
   - Post-Failover: C:\Program Files\doubletake\start_services.bat
   - Pre-Failback: C:\Program Files\doubletake\stop_services.bat
   - Source Post-Failback: C:\Program Files\doubletake\start_services.bat
8. Click OK and OK to complete the process.

Appendix C – Monitor Troubleshooting

If the monitor status in the Failover Control Center is not green, it may be because the incorrect Source NIC is being monitored.

1. Start the Failover Control Center: Start > Programs > Double-Take > Failover Control Center.
2. Click Add Target for the backup server.
3. Click Login.
4. Select the Source under Names to Monitor.
5. Click Edit Monitor.
6. Select another NIC in the Target Adapter drop-down box. Verify that the “Current IP Address(es)” matches the IP address of the Target.
7. If the Target adapter needs to change, edit c:\program files\doubletake\install.txt.
8. Find the line “monitor move 10.8.0.54 to nic 65539 use network interval 5 timeout 60;”. Replace 65539 with the number in brackets from the correct Target Adapter. Save the file.
9. Click “OK” on the Monitor Settings.

Appendix D – Manually Invoking/Testing Failover

Follow these steps to manually invoke or test failover from the Source to the Target:

1. On the Target, run the Failover Control Center (FCC), or reboot the Source server, or disconnect its network cable.
2. Select the Target machine in the drop-down box and click Login.
3. Click the Source server (e.g. TLAB54). The Failover button will no longer be grayed out.
4. Click the Failover button. The Target server will now take over automatically. If this is a distributed NetVoyant system, remember to carry out the one manual step needed to complete the failover successfully.
5. After testing, follow the steps to fail back and restore the Source server.

About NetQoS

NetQoS is the fastest growing network performance management products and services provider. NetQoS has enabled hundreds of the world’s largest organizations to take a Performance First approach to network management, the new vanguard in ensuring optimal application delivery across the WAN. By focusing on the performance of key applications running over the network and identifying where there is opportunity for improvement, IT organizations can make more informed infrastructure investments and resolve problems that impact the business. Today, NetQoS is the only vendor that can provide global visibility for the world’s largest enterprises into all key metrics necessary to take a Performance First management approach. More information is available at www.netqos.com.
NetQoS Global Headquarters
5001 Plaza On The Lake
Austin, TX, 78746
United States
Phone: 512.407.9443
Toll-Free: 877.835.9575
Fax: 512.407.8629

NetQoS EMEA
1650 Arlington Business Park
Theale
Reading, RG7 4SA
United Kingdom
Phone: +44 (0) 118 929 8032
Fax: +44 (0) 118 929 8033

NetQoS APAC
NetQoS Singapore Representative Office
Level 21, Centennial Tower
3 Temasek Ave., Singapore 039190
Phone: +65 6549 7476
Fax: +65 6549 7001

Website: www.netqos.com
E-mail: sales@netqos.com

© 2001-2008 NetQoS, Inc. All rights reserved. NetQoS, the NetQoS logo, SuperAgent, and NetVoyant are registered trademarks of NetQoS, Inc. ReporterAnalyzer and Allocate are trademarks of NetQoS, Inc. Other brands, product names and trademarks are property of their respective owners.