High Availability Deep Dive
What’s New in vSphere 5
David Lane, Virtualization Engineer
High Point Solutions

Agenda
• What is High Availability
• What’s New in vSphere 5
• Core Components of High Availability in vSphere 5
• How High Availability Works in vSphere 5
• Scenarios for High Availability in vSphere 5
• Exploiting High Availability with vSphere 5
• Q&A

What is High Availability?
• The Answer to Hardware Density Concerns
• Resilient Architecture
• Automated Recovery
• Simple Setup / Familiar Interface

High Availability Prerequisites
• Minimum of 2 Hosts
• Minimum of 3GB of Host Memory
• VMware vCenter Server
• Shared Storage
• Pingable Constant Address (Gateway)
• HA Communication Firewall Ports (TCP/UDP 8182)
• Essentials Plus and Up

Configuring High Availability
• 10 Steps - 10 Minutes
• Create a Cluster
• Drag and Drop Hosts

What’s New for vSphere 5
• FDM (Fault Domain Manager) – New HA Agent
• Master / Slave Nodes
• Datastore Heartbeating
• Enhanced Isolation Validation
• No DNS Dependency
• Supports Management Network Partitions
• Enhanced Admission Control Policies

Core HA Components of vSphere 5
• FDM (Fault Domain Manager)
• VMware vCenter
• hostd

FDM
• Replaces Legato AAM (Automated Availability Manager)
• Single Process Agent with Watchdog Failsafe
• No DNS Dependency
– No DNS Limitations
• Consolidated Logging with Syslog Compatibility
• Talks Directly to hostd and vCenter
– Not Dependent on VPXA

VMware vCenter
• Deploys FDM Agents in Parallel (AAM Was Serial)
• Communicates Configuration Changes in Cluster to Master Node
• Retrieves Virtual Machine Status
• Displays Protection Status of VMs

hostd
• Required for FDM
• Runs on Host
• Relays Information About VMs on Host
• Responsible for Powering On VMs

How Does High Availability in vSphere 5 Work? The Tools
• Master / Slave Nodes
• Heartbeating
• Isolated vs. Network Partitioned
• Virtual Machine Protection

Master / Slave Nodes
• One Master Node Per Cluster (Exception: Network Partitioned)
• Master Node Monitors VM Health and Directs Slaves
• Master Node Takes Ownership of Datastores Where VM Configuration Files Are Located
• Master Node Reports VM Status to vCenter Server
• Master Node Assigned by Election
• Slaves Monitor Their Running VMs, Send Status to Master, and Perform Restarts on Master Node Requests
• Slaves Also Monitor Master Node Health

Master Node Election
• Election Held When HA Is Enabled or Reconfigured, and When the Master Node:
– Fails
– Becomes Isolated or Partitioned
– Disconnects from vCenter
– Enters Maintenance Mode
– Enters Standby
• Utilizes UDP
• Takes 15 Seconds
• Host with Most Connected Datastores Wins
• If Multiple Hosts Share the Highest Number of Datastores, the Host with the Highest Managed Object ID (MOID) Wins
• New Master Node Will Attempt to Acquire Ownership of All Datastores by Locking the “protectedlist” File (Protected VM List Inventory File, on Datastores in Cluster)
• In the Case of Master Node Isolation, File Locks Will Be Released

Heartbeating
• Network Heartbeating
• Datastore Heartbeating

Network Heartbeating
• Heartbeats Sent from Slaves to Master and from Master to Slaves
• Heartbeats Sent Every Second
• Determines the State of the Hosts

Datastore Heartbeating
• Prevents Unnecessary Restarts
• Extra Heartbeat Added to Determine State if Management Network Is Lost
• Validates Failure or Just Isolation
• Uses PowerOn File to Determine Isolation

Isolated vs. Network Partitioned
• Isolated (Host Separated from Master; VMs May Be Restarted)
– Not Receiving Heartbeats from Master
– Not Receiving Election Traffic
– Cannot Ping Isolation Address
• Partitioned (Multiple Hosts Isolated from Master but Can Communicate with Each Other Over Management Network)
– Not Receiving Heartbeats from Master
– Does Receive Election Traffic

Virtual Machine Protection
• vCenter Server Performs Protection on State Change
• Protection Guaranteed When the Master Has Committed the Change of State to Disk
• Protectedlist File Contains VM State and Protection

Scenarios for High Availability in vSphere 5: Using the Tools
• Failed Host
• Isolated Host
• Application Monitoring - Failed VM OS

Failed Host
• Failed Master Host
– Master Election Initiated
– New Master Elected
– New Master Restarts All VMs on the Protectedlist with Not Running State
• Failed Slave Host
– Master Checks Network Heartbeat
– Master Checks Datastore Heartbeat
– Master Restarts VMs Affected

Isolated Host
• Isolation Responses
– Power Off
– Leave Powered On
– Shut Down
• Isolation Detection
– Slaves Will Hold Single Server Election and Check Ping Address
– Master Will Check Ping Address
– Master Restarts VMs Affected

Application Monitoring - Failed VM OS
• Restarts Individual VM When Needed
• Configurable VM Tools Heartbeat
• Monitors Network and Storage I/O Activity as Fail-Safe

Exploiting HA with vSphere 5
• Stretched Clusters
– Storage DRS
• Blade Chassis Failure
• Larger Clusters
– Tenant Based Cloud

Q&A

THANK YOU
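
Appendix: Election and Host State as a Sketch

The rules on the Master Node Election and Isolated vs. Network Partitioned slides can be sketched in a few lines of Python. This is an illustration only, not FDM's actual code: the `Host` class, the function names, the sample MOID values, and the "undetermined" fallback are all hypothetical, and comparing MOIDs as plain strings is an assumption, since the slides say only that the "highest" MOID wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for an ESXi host taking part in the election."""
    name: str
    moid: str                  # Managed Object ID, e.g. "host-42" (made-up values)
    connected_datastores: int


def elect_master(hosts):
    """Pick the master: most connected datastores wins, ties go to the highest MOID.

    Assumption: MOIDs are compared as plain strings.
    """
    if not hosts:
        raise ValueError("an election needs at least one host")
    return max(hosts, key=lambda h: (h.connected_datastores, h.moid))


def classify_slave(master_heartbeat, election_traffic, isolation_ping):
    """Map the three signals from the slides onto a host state."""
    if master_heartbeat:
        return "connected"
    if election_traffic:
        # No master heartbeat, but other hosts are still reachable.
        return "partitioned"
    if not isolation_ping:
        # No heartbeat, no election traffic, isolation address unreachable.
        return "isolated"
    # The slides do not cover this combination; left open on purpose.
    return "undetermined"


hosts = [
    Host("esx01", "host-10", 4),
    Host("esx02", "host-22", 6),
    Host("esx03", "host-31", 6),
]
print(elect_master(hosts).name)            # esx03
print(classify_slave(False, True, True))   # partitioned
print(classify_slave(False, False, False)) # isolated
```

In the example, esx02 and esx03 tie on datastore count, so the MOID comparison decides the election. Keeping the datastore count first in the sort key mirrors the order in which the slide lists the two rules.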