BCO2874 vSphere High Availability 5.0 and SMP Fault Tolerance – Technical Overview and Roadmap
Name, Title, Company

Disclaimer
This session may contain product features that are currently under development. This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product. Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features discussed or presented have not been determined.

vSphere HA and FT Today
Minimize downtime without the cost/complexity of traditional solutions
• vSphere HA provides rapid recovery from outages
• vSphere Fault Tolerance provides continuous availability
[Chart: coverage (Application, Guest OS, VM, Infrastructure/Hardware) vs. downtime (minutes to none) — App Monitoring APIs / partner solutions, Guest Monitoring, Fault Tolerance, HA]

This Talk
1. Technical overview of vSphere HA 5.0 • Presented by Keith Farkas
2.
Technical preview of vSphere Fault Tolerance SMP • Presented by Jim Chow
[Chart: coverage vs. downtime, now showing HA 5.0 and multiple-vCPU Fault Tolerance]

vSphere HA 5.0 Objectives
• Learn about the enhancements in vSphere HA 5.0
• Understand the new architecture
• Identify questions for the breakout / expert sessions

vSphere HA 5.0
vSphere HA was completely rewritten in 5.0 to
• Simplify setting up HA clusters and managing them
• Enable more flexible and larger HA deployments
• Make HA more robust and easier to troubleshoot
• Support network partitions
The 5.0 architecture is fundamentally different
• This talk describes the three key concepts and summarizes host failure responses
• To learn more, see the other VMworld HA venues

5.0 Architecture
New vSphere HA agent
• Called the Fault Domain Manager (FDM)
• Provides all the HA on-host functionality
As in previous releases
• vCenter Server (VC) manages the cluster
• Failover operations are independent of VC
• FDMs communicate over the management network

Key Concepts – Part 1
• FDM roles and responsibilities
• Inter-FDM communication

FDM Master
One FDM is chosen to be the master
• Normally, one master per cluster
• All others assume the role of FDM slaves
Any FDM can be chosen as master
• No longer a primary / secondary role concept
• Selection is done using an election
Master-specific responsibilities
• Monitors availability of hosts / VMs in the cluster
• Manages VM restarts after VM/host failures
• Reports cluster state / failover actions to VC
• Manages persisted state

FDM Slave and Shared Responsibilities
Slave-specific responsibilities
• Forwards critical state changes to the master
• Restarts VMs when directed by the master
• If the master should fail, participates in the master election
Each FDM (master or slave)
• Monitors the state of local VMs and the host
• Implements the VM/App Monitoring feature

The Master Election
An election is held when:
• vSphere HA is enabled
• The master's host becomes inactive
• HA is reconfigured on the master's host
• A management network partition occurs
If multiple masters can communicate, all but one will abdicate
Master-election algorithm
• Takes 15 to 25 s (depends on the reason for the election)
• Elects the participating host with the greatest number of mounted datastores

Agent Communication
FDMs communicate over the
• Management networks
• Datastores
Datastores are used when the network is unavailable
• Used when hosts are isolated or partitioned
Network communication
• All communication is point to point
• The election is conducted using UDP
• All master-slave communication is via SSL-encrypted TCP

Questions Answered Using Datastore Communication
Master: Is a slave partitioned or isolated? Are its VMs running?
Slave: Is a master responsible for my VM?
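The election rule described above (elect the participating host with the greatest number of mounted datastores) can be sketched as follows. This is an illustrative model only, not FDM code; the data layout and the tie-break on host ID are assumptions.

```python
# Illustrative sketch of the master-election rule described above.
# The dict layout and the host-ID tie-break are assumptions; the real
# FDM election is an internal UDP-based protocol.

def elect_master(candidates):
    """Return the participating host with the most mounted datastores,
    breaking ties by host ID (an assumption here)."""
    return max(candidates, key=lambda h: (len(h["datastores"]), h["host_id"]))

hosts = [
    {"host_id": "esx1", "datastores": {"ds1", "ds2"}},
    {"host_id": "esx2", "datastores": {"ds1", "ds2", "ds3"}},
    {"host_id": "esx3", "datastores": {"ds1"}},
]
print(elect_master(hosts)["host_id"])  # esx2 — mounts the most datastores
```

Preferring the host with the most mounted datastores maximizes the chance that the new master can read the heartbeat files of every other host in the cluster.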
Datastores Used
Datastores selected by VC, called the Heartbeat Datastores
[Diagram: datastores containing VM config files vs. Heartbeat Datastores]
• VC chooses (by default) two datastores for each host
• You can override the selection or provide preferences — use the cluster "edit settings" dialog for this purpose

Responses to Network or Host Failures

Host Is Declared Dead
Master declares a host dead when:
• Master can't communicate with it over the network
• Host is not connected to the master
• Host does not respond to ICMP pings
• Master observes no storage heartbeats
Results in:
• Master attempts to restart all VMs from the host
• Restarts happen on network-reachable hosts and on the master's own host

Host Is Network Partitioned
Master declares a host partitioned when:
• Master can't communicate with it over the network
• Master can see its storage heartbeats
Results in:
• One master exists in each partition
• VC reports one master's view of the cluster
• Only one master "owns" any one VM
• A VM running in the "other" partition will be monitored via the heartbeat datastores and restarted if it fails (in the master's partition)
• When the partition is resolved, all but one master abdicates

Host Is Network Isolated
A host is isolated when:
• It sees no vSphere HA network traffic
• It cannot ping the isolation addresses
Results in:
Host invokes the (improved) isolation response
• Checks first whether a master "owns" a VM
• Applied if the VM is owned or the datastore is inaccessible
• Default is now Leave Powered On
Master
• Restarts those VMs that are powered off or that fail later
• Reports the host isolated if both can access its heartbeat datastores, otherwise dead

Key Concepts – Part 2
HA Protection and failure-response guarantees

vSphere HA Response to Failures
• Guest OS hangs or crashes → Reset VM
• Application heartbeats stop (with tools installed) → Reset VM
• Host fails (e.g., reboots) → Attempt VM restart
• Host isolation (VM powered off) → Attempt VM restart
• VM fails (e.g., VM crashes) → Attempt VM restart
The responding master knows which VMs are HA Protected

HA Protected Workflow
1. User issues power on for a VM
2. Host powers on the VM
3. VC learns that the VM powered on
4. VC tells the master to protect the VM
5. Master receives the directive from VC
6. Master writes the fact to a file
7. Write is done

HA Restart Guarantee
The same sequence, over time:
• Before the write completes, a restart attempt may be made if a failure occurs
• Once the write is done, an attempt will be made for failures now and in the future

vSphere HA Protection Property
• Is a new per-VM property
• Reports on whether a restart attempt is guaranteed
• Is shown on the VM summary panel and optionally in VM lists

Values of the HA Protection Property
• User issues power on for a VM → N/A
• Host powers on the VM; VC learns that the VM powered on; VC tells the master to protect the VM; master receives the directive → Unprotected
• Master writes the fact to a file; write is done; master tells VC; VC learns the VM has been protected → Protected

Wrap Up

vSphere HA Summary
The vSphere HA feature provides organizations the ability to run their critical business applications with confidence
5.0 enhancements provide
• A solid, scalable foundation upon which to build to the cloud
• Simpler management and troubleshooting
• Additional and more robust responses to failures
[Diagram: resource pool across three VMware ESXi hosts — operating server, failed server, operating server]

To Learn More About HA and HA 5.0
At VMworld
• See the demo in the VMware booth in the Solutions Exchange
• Try it out in lab HOL04 – Reducing Unplanned Downtime
• Attend group discussions GD15 and GD35 – vSphere HA and FT
• Review panel session VSP1682 – vSphere Clustering Q&A
• Talk with a knowledge expert (EXPERTS-09)
Offline
• Availability Guide
• Best Practices Guide
• Troubleshooting Guide
• Release notes

vSphere Fault Tolerance SMP Technical Preview
Objectives
• Why Fault Tolerance?
• What's new: SMP

vSphere Availability Portfolio
[Chart: coverage (Application, Guest OS, VM, Infrastructure/Hardware) vs. downtime (minutes to none) — App Monitoring APIs, Guest Monitoring, Fault Tolerance, HA]

Why Fault Tolerance?
Continuous Availability
• Zero downtime
• Zero data loss
• No loss of TCP connections
• Completely transparent to guest software
• Simple UI: Turn On Fault Tolerance
• Delegate all management to the virtual infrastructure

Background
• 2009: vSphere Fault Tolerance in vSphere 4.0
• 2010: Updates to vSphere Fault Tolerance in vSphere 4.1
• 2011: Updates to vSphere Fault Tolerance in vSphere 5.0
• Details: http://www.vmware.com/products/fault-tolerance/
Problem:
• FT only for uni-processor VMs
• Is FT for multi-processor VMs possible?
• An impressively hard problem
• Concerted effort to find an approach
Reached a milestone
• We'd like to share it

A Starting Point: vSphere FT vLockstep
[Diagram: primary and secondary VMs (application / operating system / virtualization layer) connected by an FT logging channel and a shared disk]

A Clean Slate
• Spare you the details
• See it in action

Live Demo
[Diagram: client driving an FT-protected VM pair over the FT logging channel; experimental setup, caveats]

Live Demo Summary
SMP FT in action
• Presented a good solution
• Client oblivious to FT operation — SwingBench client, SSH client
• Transparent failover — zero downtime, zero data loss
• Taste for performance / bandwidth
But that's not all

Performance Numbers
[Chart: % throughput (FT / non-FT, higher is better) for Microsoft SQL Server 2-vCPU, Microsoft SQL Server 4-vCPU, Oracle Swingbench 2-vCPU, Oracle Swingbench 4-vCPU]
• Similar configuration to the vSphere 4 FT Performance Whitepaper
• Models real-world workloads: 60% CPU utilization

vSphere FT Summary
Why Fault Tolerance
• Continuous availability
Fault Tolerance for multi-processor VMs
• A good solution to an impressively hard problem
• A new design
• Demonstrated a similar experience to existing vSphere FT, but with more vCPUs

vSphere HA and FT Future Directions

vSphere HA and FT – Technical Directions
Technical directions include
• More comprehensive coverage of failures for more applications
• A broader set of enablers for improving availability of applications
• Solidifying vSphere as the platform for running all mission-critical applications
[Chart: coverage vs. downtime — multi-tier applications and building blocks for creating available apps (partner solutions, App Monitoring APIs, API extensions); VM/Guest Monitoring; Fault Tolerance with multiple vCPUs; HA with MetroHA and protection against host component failures]

Thank you! Questions?
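The dead / partitioned / isolated determinations summarized in the HA failure-response slides can be condensed into a small decision function. This is a sketch of the published behavior only, not FDM logic; the parameter names are invented, and real FDM decisions also involve timeouts and retries not modeled here.

```python
# Sketch of how a master classifies a slave host it cannot reach, per
# the behavior described in the HA slides. Parameter names are invented
# for illustration.

def classify_host(reachable_over_network, responds_to_ping,
                  storage_heartbeats_seen, host_declared_isolated=False):
    """Return the master's view of a slave host's state."""
    if reachable_over_network:
        return "connected"
    if storage_heartbeats_seen:
        # Host is alive but cut off from the management network. It is
        # reported isolated only if it recorded that fact on a heartbeat
        # datastore that both the master and the host can access.
        return "isolated" if host_declared_isolated else "partitioned"
    if not responds_to_ping:
        # No network contact, no ICMP response, no storage heartbeats:
        # declare the host dead and restart its VMs elsewhere.
        return "dead"
    return "undetermined"  # pings but no heartbeats: not covered by the slides

print(classify_host(False, False, True))  # partitioned
```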
Additional vSphere HA 5.0 Details

Troubleshooting

Troubleshooting vSphere HA 5.0
HA issues proactive warnings about possible future conditions
• VMs not protected after powering on
• Management network discontinuities
• Isolation addresses stop working
HA host states provide granularity into error conditions
All HA conditions are reported via events; config issues/alarms for some
• Event descriptions describe the problem and the actions to take
• All event messages contain "vSphere HA", so searching for HA issues is easier
• HA alarms are more fine-grained and auto-clearing (where appropriate)
The 5.0 Troubleshooting Guide discusses the likely top issues, e.g.
• Implications of each of the HA host states
• Topics on HB datastores, failovers, admission control
• Will be updated periodically

HA Agent Logging
HA 5.0 writes operational information to a single log file called fdm.log
• A configurable number of historical copies are kept to assist with debugging
The file contains a record of, for example,
• Inventory updates relating to VMs, the host, and datastores received from the host management agent (hostd)
• Processing of configuration updates sent to a master by vCenter Server
• Significant actions taken by the HA agent, such as protecting a VM or restarting a VM
• Messages sent by a slave to a master and by a master to a slave
Default location
• ESXi 5.0: /var/log/fdm.log (historical copies in /var/run/log)
• Earlier ESX versions: /var/log/vmware/fdm (all files in the same directory)
Notes
• See the vSphere HA best practices guide for recommended log capacities
• HA log files are designed to assist VMware support in diagnosing problems, and the format may change at any time. Thus, for reporting, we recommend you rely on the vCenter Server HA-related events, alarms, config issues, and VM/host properties

Log File Format
The log file contains time-stamped rows. Many rows report the HA agent (FDM) module that logged the information. E.g.,
2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] [InventoryManagerImpl::ProcessClusterChange] Cluster state changed to Startup
Noteworthy modules are
• Cluster – module responsible for cluster functions
• Invt – module responsible for caching key inventory details
• Policy – module responsible for deciding what to do on a failure
• Placement – module responsible for placing failed VMs
• Execution – module responsible for restarting VMs
• Monitor – modules responsible for periodic health checks
• FDM – module responsible for communication with vCenter Server

Additional Datastore Details for HA 5.0
• Heartbeating and heartbeat files
• Protected VM files
• File locations

Heartbeat Datastores (HB): Purpose and Mechanisms
Used by the master for slaves not connected to it over the network
Determine if a slave is alive
• Rely on heartbeats issued to the slave's HB datastores
• Each FDM opens a file on each of its HB datastores for heartbeating purposes
• The files contain no information; on VMFS datastores, a file will have the minimum-allowed file size
• Files are named X-hb, where X is the (SDK API) moID of the host
• The master periodically reads the heartbeats of all partitioned / isolated slaves
Determine the set of VMs running on a slave
• An FDM writes a list of powered-on VMs into a file on each of its HB datastores
• The master periodically reads the files of all partitioned / isolated slaves
• Each poweron file contains at most 140 KB of info; on VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version
• The files are named X-poweron, where X is the (SDK API) moID of the host

VM Protected Files
Protected-VM files are used
• When recovering from a master failure
• To determine whether a master is responsible for a given VM
• To divvy the VMs up between masters during a partition
There is one protectedlist file per datastore per cluster using the datastore
• It stores the local paths of the protected VMs
• A VM is listed only in the file on the datastore containing its config file
• Each file is a fixed 2 MB in size

File Locations
FDMs create a directory (.vSphere-HA) in the root of each relevant datastore. Within it, they create a subdirectory for each cluster using the datastore. Each subdirectory is given a unique name called the Fault Domain ID:
<VC uuid>-<cluster entity ID>-<8 random hex characters>-<VC hostname>
• The entity ID is the number portion of the (SDK API) moID of the cluster
E.g., in /vmfs/volumes/clusterDS/.vSphere-HA/
• FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-9-d6bfc023-vc23/ — cluster 9
• FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-17-ad9fd307-vc23/ — cluster 17

UI Changes

Summary of UI Changes
Cluster Summary Screen
• Advanced Runtime Info (improved)
• Cluster Status (new)
• Configuration Issues (improved)
Cluster and datacenter
• Hosts list view (improved)
Cluster Configuration
• Datastore Heartbeating (new)
• Admission Control (improved)
Host, cluster, datacenter
• VM list view (improved, showing protected VMs)
Host Summary Screen
• HA host state (improved)
VM Summary Screen
• HA Protection (improved)
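The on-datastore layout described in the File Locations slides composes mechanically, and can be sketched as simple path construction. This is illustrative only: the host moID and helper names are invented, and the placement of the protectedlist file inside the cluster subdirectory is an assumption; only the naming pattern (.vSphere-HA, FDM- prefix, X-hb, X-poweron) comes from the text.

```python
# Illustrative sketch of the vSphere HA on-datastore layout described
# above. Identifier values are made-up examples; the protectedlist
# location inside the cluster subdirectory is an assumption.
import posixpath

def fault_domain_dir(datastore_root, vc_uuid, cluster_entity_id,
                     rand_hex, vc_hostname):
    """Cluster subdirectory named by the Fault Domain ID:
    <VC uuid>-<cluster entity ID>-<8 random hex characters>-<VC hostname>."""
    fdid = f"FDM-{vc_uuid}-{cluster_entity_id}-{rand_hex}-{vc_hostname}"
    return posixpath.join(datastore_root, ".vSphere-HA", fdid)

def host_files(fd_dir, host_moid):
    """Per-host heartbeat and powered-on-VM list files (X = host moID)."""
    return {
        "heartbeat": posixpath.join(fd_dir, f"{host_moid}-hb"),
        "poweron": posixpath.join(fd_dir, f"{host_moid}-poweron"),
        "protectedlist": posixpath.join(fd_dir, "protectedlist"),
    }

d = fault_domain_dir("/vmfs/volumes/clusterDS",
                     "C8496A0D-12D2-4933-AE02-601BCDDB9C61",
                     9, "d6bfc023", "vc23")
print(d)  # matches the cluster-9 example path from the slides
print(host_files(d, "host-42")["heartbeat"])
```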