What Every MCT Needs to Know about Clustering and High Availability
Rodney R. Fournier
Microsoft MVP - Windows Server Clustering
Net Working America, Inc.

Agenda
- Terms you need to know
- Four Types of Clustering
- What is Clustering?
- Overview of Exchange Clustering
- Overview of SQL Server Failover Clustering
- MSDTC
- Resources

Terms you need to know
- Active/Passive vs. Active/Active vs. Instance
- Failover & Failback
- Heartbeat
- Quorum vs. Majority Node Set
- Shared Storage
- Resources vs. Resource Groups
- High-availability vs. Fault Tolerance
- Scalability vs. Availability
- Mean Time To Failure
- Mean Time To Recover
- Node, Virtual Server, IP, Name, etc.
- Cluster aware

Four Types of Clustering
- High Performance Computing
- Component Load Balancing
- Network Load Balancing
- Server Clustering

High Performance Computing (HPC)
- Super Computing
- Also called HPC Clusters or Supercluster
- As many as 256 nodes
- Strong competition for UNIX/Linux
- http://www.microsoft.com/windowsserver2003/hpc/default.mspx
- Special applications

Component Load Balancing (CLB)
- Component Object Model (COM+) components load balancing
- Calls to activate COM+ components are load balanced to different servers within the COM+ cluster
- http://www.microsoft.com/applicationcenter/techinfo/deployment/2000/AppCenterCLBTechOver.doc
- Application Center 2000

Network Load Balancing (NLB)
- Up to 32 nodes
- Layers 2 and 3 of the OSI model
- Can provide Scalability
- Provides Availability
- Supported on all versions of Windows Server 2003
- http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/nlbbp.mspx
- IIS, SharePoint Portal Server, VPN Remote Access, ISA, Terminal Server

Server Clustering
- WINS
- DHCP
- Exchange Server
- SQL Server
- File Shares
- Printers
- Message Queuing
- Distributed Transaction Coordinator
- Generic Service or Script
- Volume Shadow Copy Service Task
- Microsoft Search Service

Shared Nothing Model
[Diagram labels: External Storage Array, Node A, Node B, Public Network, Heartbeat, “Shared Nothing”]
For more information, see 293289

Basics
- Quorum = Clustering
  - Stores most current configuration data in quorum recovery logs and registry checkpoints
  - Maintains resource checkpoints
  - Provides persistent physical storage
- Recovery Logs used to
  - Enable any node to form a cluster
  - Enable nodes to maintain a cluster
  - Guarantee that a single cluster is formed
- Cluster.Log file
  - Logs cluster activity; great for troubleshooting

Server Cluster Components (Windows-based)
- Virtual server
  - From client/application perspective, the server names or IP addresses used for access
- Hardware components of server clusters:
  - Cluster nodes
  - Internal heartbeat
  - External networking
  - Shared cluster disk array: Quorum disk, Data disks
[Diagram labels: Public Network, Server Cluster, Heartbeat, Node B, Node A, Shared Disk Array]

Hardware Considerations
- Buy systems from the Windows Server Catalog: Cluster Solution – Hardware Compatibility List (HCL)
  - http://www.microsoft.com/windows/catalog/server/default.aspx?xslt=categoryproduct&subid=22&pgn=8b712458-b91c-4a7d-8695-23e9cd3ae95b
- Entire systems, not individual components
- Ask your preferred vendor for help
- Get guarantees!
- Buy a support agreement that matches your level of availability
- Remember a PSS contract, too!
- Availability requirements, budget, 8th & 9th layer

Shared Disk Configuration
- Instance-to-disk ratio: Two resources cannot share a physical disk
- Basic disks only; mount points and dynamic disks are not supported
- File compression and encryption are not supported
- Use Fibre Channel if you can; use SCSI or iSCSI if cost is a factor
- Use hardware-based RAID only; software-based RAID is not supported
- Each RAID controller is different
- Turn write-back caching off if the controller is in the server nodes

Shared Disk Configuration (continued)
- Be sure all disks are dependencies of the SQL Server/Exchange resource
- Disk is a single point of failure. Store spare drives and have a secondary form of high availability
- Data
  - Recommended: RAID 10 array of mirrored sets that are then striped
  - RAID 5 okay
- Logs
  - RAID 1 or possibly mirrored sets that are then striped; not RAID 5

Shared Disk Configuration (continued)
- Network-attached storage (NAS)
  - Not supported for clusters
- Storage area networks (SANs)
  - Only those on the HCL Cluster list or the Cluster/Multi-Cluster Device list can be used
  - Get verification that it is set up properly
  - Setup is usually done by the vendor
  - Do not accept the default configuration; it will probably be for a file system
- iSCSI is now supported with 2003 SP1

Software Considerations
- Exchange/SQL Server 2000 Enterprise Edition
- Operating systems:
  - Windows Server 2003 Enterprise Edition
  - Windows Server 2003 Datacenter Edition

Network Configurations
- Cluster nodes with Windows domains, DNS, and WINS
  - You may still need WINS for NetBIOS resolution
  - Nodes and virtual server must be able to access the domain
  - All nodes have to be in the same domain
- Network Card Settings
  - Do not set NICs to Autodetect
  - You need at least 4 static IP addresses: 1 for each node, 1 for the server cluster, 1 for Clustered Service/Application
  - Recommend 6 (additional dedicated heartbeat NICs)
- Multiple IP Addresses
  - Use separate subnets for IP addresses
- Bandwidth

Network Configuration
[Diagram labels: Public Network, Server Cluster, Heartbeat, Node B, Node A, Shared Disk Array]

Processor/Memory Configuration
- Configure each cluster node with processing power sufficient to handle the load for any process that may run on it
- Set Processor Affinity to N–1 if necessary
- Test your application before putting it into production
- Monitor processor usage. Use System Monitor
- Memory
  - Single-instance: No issues unless other services or applications are running
  - Multiple-instance: Be sure that one instance will not diminish the resources of other processes or instances in the event of a failover

Failure
[Diagram labels: External Storage Array, Node A, X (failure marker), Node B, Public Network, Heartbeat, “Shared Nothing”]
For more information, see 293289

So Why Cluster?
- Provide High-Availability
  - Failover mitigates outage when hardware failure occurs
  - Strengthened by fault tolerant design
  - Measured in 9s

    Term               Nines (%)   Downtime per Year
    Nirvana            100.00      0 seconds
    5 Nines            99.999      5 minutes
    4 Nines            99.99       52 minutes
    3 Nines            99.9        8.7 hours
    2 Nines or Fired   99          3.7 days

- Managed maintenance/upgrades
  - Rolling Upgrades

What Don’t You Get?
- Does not protect against:
  - Loss of or damage to shared storage
  - Network failures
  - Application failures or database corruption
  - Disasters
  - Human errors
- Does not load balance mailboxes
- Cannot move running applications, and shared state is lost!

Overview Of Exchange Clustering
- Exchange Virtual Server (EVS)
  - Physical Disk resource: SCSI, Fibre Channel (FC), or Internet SCSI (iSCSI)
  - IP Address resource
  - Network Name resource
  - System Attendant resource and resources created by System Attendant
  - Resources created by an administrator (for example, protocol virtual servers)

Clustering Exchange
[Diagram labels: Client PCs, Node A, Node B, EVS, Heartbeat, Passive Node, Disk cabinet A, Disk cabinet B. Callouts: "Failure Occurs!", "SCSI Reserve Broken", "New Reservation Established", "EVS fails over and is available to clients"]

Overview Of Exchange Clustering
- 1+1 Active/Passive
- 7+1 Active/Passive
- 2+0 Active/Active – Not Recommended

Requirements For Clustering Exchange 2003
- Windows Server 2003 Enterprise Edition and Datacenter Edition
  - 2-node Active/Active
  - Up to 8-node Active/Passive

Requirements For Clustering Exchange 2003: Exchange Cluster Models
- Active/Passive is the strongly preferred model
  - Fewer EVSs than nodes
  - Must use if more than two nodes
- Active/Active is the strongly discouraged model
  - Maximum of two nodes and maximum of two EVSs
  - Maximum one RSG per cluster (824126)
  - Limits number of concurrent MAPI users per node to 1,900
  - Limits average CPU utilization on each node to 40%
  - Two instances of store running in one Store.exe process; not enough contiguous virtual memory to bring resource online

Exchange Virtual Server Limits
- With two nodes, you can have up to two EVSs
- With three or more nodes, you can have n-1 EVSs, where n = number of nodes in cluster

Support For Clustering Exchange 2003
- Active/Active
  - System Attendant
  - Information Store
  - POP3, IMAP4, SMTP, HTTP
  - Microsoft Search (full-text indexing)
  - SMTP and routing group connectors
- Active/Passive
  - Message Transfer Agent

Requirements For Clustering Exchange 2003: NOT Supported
- Active Directory Connector (ADC)
- Exchange Event Service
- Foreign Mail System Connectors
- Network News Transport Protocol (NNTP)
- Site Replication Service (SRS)

Requirements For Clustering Exchange 2003
- Cluster certified hardware only
  - Windows Server Catalog – Cluster or Geographic Cluster
  - http://www.microsoft.com/windows/catalog/server
  - SCSI, FC or iSCSI external storage
  - Identical hardware for all nodes
  - Microsoft support for Exchange failover clusters (810987)
- OS – 32-bit only
  - Windows Server 2003 Enterprise Edition
  - Windows Server 2003 Datacenter Edition
- Microsoft Distributed Transaction Coordinator (MSDTC) installed.
- Exchange Server 2003 Enterprise Edition

Building An Exchange Cluster: Design storage
- Four storage group maximum on node
- Shared disks must be NTFS/BASIC (237853)
- Use Diskpart to align sectors at storage level
- Use separate disk resources for logs/databases in EVS
- Use separate resource group for quorum
- Volume mount points supported on Windows 2003 (318458)
- Some iSCSI (839686) and NAS (839687) devices are now supported for use with Exchange and Exchange clusters
- You cannot use NAS for quorum resource (cluster FAQ)
- Additional disk resources need to be added as dependency

Building An Exchange Cluster: Design network
- Use multiple networks with dedicated private networks (258750)
- Do not use teaming or DHCP (254101)
- Need an IP address and Network Name resource for
  - Each physical node
  - The cluster resource group
  - Each Exchange Virtual Server
- Use consistent naming standards

Building An Exchange Cluster
Step 1 - Prepare Hardware
- Apply latest system BIOS
- Apply latest device firmware
- Gather latest software drivers
- Disable unnecessary hardware
- Follow your hardware manufacturer recommendations to ensure you are using only drivers or firmware that have been tested for clusters

Building An Exchange Cluster
Step 2 – Install operating system and other prerequisites
- Install operating system (Windows Server 2003 preferred)
  - SMTP, W3SVC and NNTP services
- Add nodes to domain as member servers
  - Domain controllers are not supported for Exchange cluster nodes (810986)
- Windows Support Tools
- Windows Update / Security hotfixes
- If 1 GB or more of memory, tune with /3GB and /USERVA=3030 in Boot.ini

Building An Exchange Cluster
Step 3 – Prepare Nodes for Cluster Service
- Disable unnecessary services
- Configure Networks
  - Rename connections: Private Network and Public Network
  - Disable NetBIOS and DNS on private (heartbeat) interface
  - Disable Media Sense on NICs – Hard-code (258750)
  - Use 10 Mbps/half-duplex if not sure what speed to use
  - Give private network highest binding order
  - Unbind MS Client and File and Print on private network and bind IP and Network Monitor only
- Create/Select cluster service account
  - Domain account w/local Administrator rights on each node
  - Does NOT need Exchange Full Admin role
- Create Quorum partition on shared disk
  - 50MB min; 500MB-1GB recommended
- Create and format additional disks/arrays

Building An Exchange Cluster
Step 4 – Install Cluster Service on each node.
- Move TEMP/TMP folder off %Systemroot%
- Run Cluster Diagnostics and Verification Tool
Step 5 – Install Network DTC on each node (MSKB 817064, 301600)
Step 6 – Install Exchange 2003
- Unattended setup not supported
- Binaries installed locally in same location on each node
- Install one node at a time and reboot each node when finished

Building An Exchange Cluster
Step 7 – Install Exchange 2003 Service Packs and Updates
- Always update one node at a time, then the EVS via Cluster Administrator (for SP1, see 867624)
Step 8 – Create Exchange Virtual Server
- Create Resource Group
- Disk Resource
- IP Address Resource
- Network Name Resource
- Exchange System Attendant Resource

Building An Exchange Cluster
Step 9 – (Optional) Repeat Step 8 if creating additional EVSs
Step 10 – Configure EVS resources
- Increase pending time-out on Active/Active clusters
- Configure Restart and Affect the Group settings
- Configure Information Store and System Attendant resources for 1 restart
Step 11 – Bring resources online
Step 12 – Configure failover and failback (197047)

Building An Exchange Cluster: Prior to Putting into Production
- Test failover policies
- Test hardware (simulate failures)
- Exchange Server Load Simulator 2003 (LoadSim)
  - Test under heavy network, disk I/O, and services loads
  - Test under large number of simultaneous logon attempts
  - Clean up after LoadSim
    - Manually remove everything or flatten cluster and rebuild
- Exchange Server 2003 Jetstress 2004 Tool
- Microsoft Exchange Server Best Practices Analyzer Tool
  - http://www.microsoft.com/exchange/exbpa

Building An Exchange Cluster: Additional Best Practices
- Do not install applications into the default Cluster Group
- Do not delete or rename the default Cluster Group or remove any resources from that resource group
- Do not use APM/ACPI power-saving features
- Do not set the Cluster service account to be a member of the domain administrator group
- Turn off cluster event log replication if auditing is enabled and security logging is heavy, or if you do not want event log entries to be replicated (224969)

SQL Server Virtual Servers
- Virtual servers: Instances of clustered SQL Server servers
  - From client/application perspective, the server names or IP addresses used for access
- Cluster resources configured during install of a virtual server:
  - SQL Server IP Address
  - SQL Server Network Name
  - SQL Server (clustered instance of the SQL Server 2000 service)
  - SQL Server Agent
  - SQL Server Fulltext
- SQL Server virtual server administrator account

SQL Server Cluster Types
- Single-Instance Cluster
  - Only one SQL Server virtual server running; can be a default or named instance
  - Replaces term active/passive
- Multiple-Instance Cluster
  - Up to 16 SQL Server virtual servers are supported per server cluster:
    - 1 default instance + up to 15 named instances, OR
    - Up to 16 named instances only
  - Replaces term active/active

The Failover Process
- Operating-system checks
  - Heartbeat checks availability of nodes and virtual server
- SQL Server checks
  - LooksAlive check runs every five seconds
  - IsAlive check runs SELECT @@SERVERNAME query
- Failover to another node
  - Windows Clustering attempts restart on same node or fails over to another node
  - SQL Server service starts
  - Brings master online
  - Database recovery proceeds
  - End users and applications must reconnect

Illustration Of Failover
[Diagram labels: Client PCs, Node A, Node B, SQL Server, SQL Server, Heartbeat, Shared Disk Array]

Failover From A Client/Application Perspective
- Application can keep running; it doesn’t have to be aware of a new IP address or server name; only the virtual server fails over
- Failover is nearly transparent, except…
  - SQL goes through a stop/restart and connections are dropped
  - Completed transactions in log are rolled forward; incomplete transactions will be rolled back
- Plan for and manage failover:
  - Handle a failover gracefully in code, or have retry logic
  - Consider using middleware (MTS/MSMQ/BizTalk) for transactions
  - Use the Clustering API to code cluster-aware applications
  - Non-cluster-aware applications/services may have to be Generic Application or Service resources
  - Consider the network timeout value

Enhancements To Failover Clustering In SQL Server
- SQL Server Setup installs and uninstalls a cluster
- SQL Server failover clustering is a permanent option; no unclustering is possible; to remove, you must uninstall
- Service packs are applied directly to virtual servers
- SQL Server supports multiple instances and multiple network addresses
- Extensive support for recovering from a failure of a server node in the cluster, including a one-node cluster
- Number of nodes
…continued

Enhancements To Failover Clustering (Continued)
- All nodes have local copies of SQL Server tools and executables
- SQL Server failover clustering supports Microsoft Search service
- Rerunning the Setup program updates failover clustering configurations
- SQL Server Service Manager or SQL Server Enterprise Manager now start and stop SQL Server services
  - No longer have to use Cluster Administrator to perform this task

Building A SQL 2000 Cluster
Step 1 - Prepare Hardware
- Apply latest system BIOS
- Apply latest device firmware
- Gather latest software drivers
- Disable unnecessary hardware

Building A SQL 2000 Cluster
Step 2 – Install OS and Pre-Reqs
- Install Windows Server 2003
- Add Nodes to Domain as member servers
  - DCs are not recommended on clustered nodes
- Windows Update / Security Hotfixes
- Administration Tools – ADMINPAK.MSI
- Windows Support Tools
- Resource Kit Tools

Building A SQL 2000 Cluster
Step 3 – Prepare Nodes for Cluster Service
- Disable unnecessary services
- Configure Networks
  - Rename connections: Private Network and Public Network
  - Disable NetBIOS and DNS on private (heartbeat) interface
  - Disable Media Sense on NICs – Hard-code (MSKB 258750)
  - Use 10 Mbps/half-duplex if not sure what speed to use
  - Give private network highest binding order
- Create/Select cluster service account
  - Domain account w/local Administrator rights on each node
- Create Quorum partition on shared disk
  - 50MB min; 500MB-1GB recommended
- Create and format additional disks/arrays

Building A SQL 2000 Cluster
Step 4 – Install Cluster Service on each node.
Step 5 – Install Network DTC on each node (MSKB 817064, 301600)
Step 6 – Install SQL 2000 Virtual Instance
- Binaries installed locally in same location on each node
- Installs all nodes at the same time!

Building A SQL 2000 Cluster
Step 7 – Install SQL 2000 Service Pack 4 and Updates
- Always update all nodes
Step 8 – (Optional) Repeat Step 6 if using Multiple Instance model
Step 9 – Bring Resources Online

Building A SQL 2000 Cluster: Best Practices
- Do not install applications into the default Cluster Group
- Do not delete or rename the default Cluster Group or remove any resources from that resource group
- Do not use APM/ACPI power-saving features
- Give the Cluster service account full rights to administer computer objects if Kerberos authentication is enabled for virtual servers
- Do not set the Cluster service account to be a member of the domain administrator group

Failover Cluster

Failover Clustering SQL Server 2005
- Further refined in SQL Server 2005
- More nodes
  - Match operating system limits
- Unattended setup
- Support for mounted volumes (Mount Points)
- All SQL Server services participate
  - Database Engine, SQL Server Agent, Analysis Services, Full-Text Search, etc.
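
The earlier advice on failover from a client perspective (connections are dropped, so handle a failover gracefully in code or have retry logic) can be sketched as follows. This is a minimal Python illustration, not code from the session: TransientConnectionError, run_with_retry, make_flaky_query and the server name VIRTUALSQL1 are all hypothetical names, and a real client would catch whatever error its database driver raises when the virtual server restarts on another node.

```python
import time


class TransientConnectionError(Exception):
    """Hypothetical error type standing in for a dropped connection."""


def run_with_retry(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on a transient error, wait and try again.

    The delay doubles after every failed attempt (1s, 2s, 4s, ...).
    The last error is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def make_flaky_query(failures):
    """Build a fake query that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientConnectionError("connection dropped during failover")
        return "VIRTUALSQL1"  # made-up virtual server name

    return query


if __name__ == "__main__":
    waits = []  # record the delays instead of really sleeping
    result = run_with_retry(make_flaky_query(2), sleep=waits.append)
    print(result, waits)
```

The sleep function is passed in as a parameter so the backoff can be exercised without waiting. The total retry window should be sized to cover the time the clustered instance needs to restart and recover its databases, which depends on the environment.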
Database Mirroring

Database Mirroring: New for SQL Server 2005
- Instant Standby
  - Conceptually a fault-tolerant server
  - Building block for complex topologies
- Database Failover
  - Very Fast … less than three seconds
  - Zero data loss
  - Automatic or manual failover
  - Automatic re-sync after failover
  - Automatic, transparent client redirect

SQL 2005 Failover Solutions At A Glance
- Both Provide
  - Automatic detection and failover
  - Manual failover
  - Transparent client connect
  - Zero work loss
  - Database Views mitigate DBA or application error
- Failover Clustering
  - System scope
  - Certified hardware
  - Fast failover
  - No reporting on standby
  - Single copy of database
- Database Mirroring
  - Database scope
  - Standard servers
  - Fastest failover
  - Limited reporting on standby
  - Duplicate copy of database

MSDTC Best Practices
- Install Network DTC with Windows
  - http://support.microsoft.com/kb/817064
- Install Clustering
- Create MSDTC Resource within the cluster
  - http://support.microsoft.com/default.aspx?scid=kb;en-us;301600
- Exchange – requires MSDTC for installation and service packs – put into Cluster Group
- SQL – only required if an application uses it – Dedicated IP, Network Name, Group

Resources: Microsoft Windows Server Clustering MVP
- www.nw-america.com – Clustering
- msmvps.com/clustering - Blog
- https://mvp.support.microsoft.com/profile=EDD23402-0C81-4968-916C09D62BBD77F5 – MVP Profile

Resources
- Clustering newsgroup support – msnews.microsoft.com
  - Microsoft.public.exchange.clustering
  - Microsoft.public.sqlserver.clustering
  - Microsoft.public.windows.server.clustering
- Welcome to the Clustering Technologies Community
  - http://www.microsoft.com/windowsserver2003/community/centers/clustering/default.mspx
- Server Clusters: Network Configuration Best Practices for Windows 2000 and Windows Server 2003
  - http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/clustering/clstntbp.mspx

Resources
- SQL Server High Availability resources
  - http://www.microsoft.com/sql/techinfo/administration/2000/availability.asp
- Visit the SQL Server Web site: www.microsoft.com/sql
- SQL Server 2000 Failover Clustering
  - http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/failclus.mspx

Resources
- Exchange Server 2003 planning guide:
  - http://www.microsoft.com/technet/prodtechnol/exchange/Exchange2003/proddocs/library/MessSyst.asp
- Exchange Server 2003 Deployment Guide:
  - http://www.microsoft.com/technet/prodtechnol/exchange/Exchange2003/proddocs/library/DepGuide.asp
- Exchange Server 2003 Technical Documentation Library:
  - http://www.microsoft.com/exchange/library/

Resources
- Learn more about Clustering at TechEd
- Hands On Labs
  - MGT12 Microsoft System Center Data Protection Manager
  - SVR15 Clustering with Virtual Server 2005
- Cabana Talks
  - Find me and buy me a drink

Community Resources
- Attend a free chat or web cast
  - http://www.microsoft.com/communities/chats/default.mspx
  - http://www.microsoft.com/usa/webcasts/default.asp
- List of newsgroups
  - http://communities2.microsoft.com/communities/newsgroups/en-us/default.aspx
- MS Community Sites
  - http://www.microsoft.com/communities/default.mspx
- Locate Local User Groups
  - http://www.microsoft.com/communities/usergroups/default.mspx
- Community sites
  - http://www.microsoft.com/communities/related/default.mspx

Where To Learn More
Other Tech Ed Sessions:
- BAP200 Microsoft Business Solutions-Great Plains: Maximizing Your Hardware and Network Infrastructure
- CSI448 Optimizing Scalability, Performance and Availability with Systems Built on the .NET Framework
- DBA308 Ensuring Business Continuance with SQL Server 2005 Data Availability Solutions
- MGT315 Update Management and Desktop Deployment at Microsoft
- MSG300 Exchange 2003 Architecture Best Practices
- MSG360 Microsoft IT: Exchange Best Practices from Microsoft IT
- MSG383 Exchange Server 2003 Cluster Best Practices
- PRT375 SharePoint Products and Technologies: Performance and Capacity Planning Best Practices and Lessons Learned
- SVR308 Introducing Windows Server 2003, Compute Cluster Edition

Your Feedback is Important!
Please Fill Out a Survey for This Session on CommNet

© 2005 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.