Performance Tuning, Management and Optimization in a Virtual Infrastructure
Presented by David Davis, Director of Infrastructure
www.TrainSignal.com

Who Is David Davis?
• Director of Infrastructure, Train Signal, Inc. -- the leader in professional IT video training
• Over 15 years in enterprise infrastructure management and years of real-world virtualization experience
• Certifications: CCIE #9369, VCP, MCSE and CISSP
• Author of six video training courses and hundreds of articles for well-known websites such as SearchVMware.com and VirtualizationAdmin.com
• Best known for my Train Signal VMware ESX Server video training course
• Best of VMworld 2008 Awards Judge
• Company website: www.TrainSignal.com
• Personal website: www.VMwareVideos.com

Abstract
• Virtualized infrastructures can perform as well as or better than physical infrastructures, if performance tuning and management are done correctly.
• Today’s applications are complex:
  • Virtualized
  • Distributed
  • Intensive
  • Tied to SLAs
• We will cover…
  • Performance tuning and management in the virtual infrastructure
  • Best practices for virtualization performance
  • How to troubleshoot bottlenecks on existing systems
  • Tools that allow IT pros to manage virtual infrastructure properly
• What are common mistakes that can hurt performance and how can you prevent them?
• What tools are available for managing the performance of your virtual infrastructure?
• What are the best practices for configuring your virtualization infrastructure to ensure ideal performance?

What I Assume You Already Know
Assumptions…
• Good understanding of server virtualization concepts
• May or may not already be using virtualization
• Have, or will have, performance concerns (that’s everyone, right?)
By The End Of The Session, You’ll Know The Following:
• How to manage performance
• How to optimize performance
• How to troubleshoot performance issues
• How to design your VI so that you prevent performance issues

Virtualization Basics: Virtualization Guest Encapsulation

Virtualization Performance Basics
[Diagram labels: Management Agents and Interfaces, VMkernel, Service Console, Hosted, Other Peripheral I/O, UserWorlds, POSIX API, Storage Stack, Resource Management, Hardware, VM, VMM, Network Stack, Device Drivers]

Managing Performance

ESX 3.5 And Update 1
• Up to 32 logical processors per host (64 LP experimental)
• Large memory support -- 250 GB physical and 64 GB per VM
• Large page size -- 2 MB VMkernel pages can be allocated to guest OS
• 192 vCPUs per host -- Update 1

ESX “4.0”
• CPU and RAM “hot add”
• Historical performance tracking and performance alerts
• Clustered VirtualCenter Servers
• ESX host profile management
• Cross-host virtual networking
• 8-way virtual SMP
• Virtual machine fault tolerance across multiple hosts, dubbed “vlockstep”

Virtualization Overhead
• CPU -- special handling of instructions
• Memory -- additional management tasks
• Devices and resource management -- no direct access to hardware
• Typically difficult to notice -- in my experience

Memory Overhead
• Service console: 272 MB (not in ESXi)
• VMkernel: 100 MB+
• Per-VM memory overhead increases with:
  • Number of vCPUs
  • Size of guest memory
  • 64-bit guest OS

Virtual CPU Recommendations
• Single-threaded app = uniprocessor VM
• Multi-threaded app = SMP VM
• But only as many vCPUs as required
• Unused vCPU in SMP VM = scheduling overhead -- see KB 1077 and 1730
• Make sure OS HAL matches number of CPUs -- MP vs. UP HAL
• Use 64-bit guests, if possible -- more registers, larger kernel tables
• Still, remember 64-bit OS vs. app compatibility

Performance Issues Due To Interrupts
• Any controller, but usually USB
• Disable USB
• See KB 1290

Overall Device Recommendations
• Disable / remove all unused devices
• USB, CD-ROM, floppy
• Can consume CPU when idle

Large Guest Pages Backed By Host
• New in ESX 3.5
• Significant performance improvement for memory-intensive apps
• Best to allocate large pages immediately after VM boot
• Page sharing not supported for large pages

Network Performance
• Check that NIC speed and duplex are correct and hardcoded
• NIC teaming distributes load and offers passive failover
• Separate NICs avoid contention -- console, VMkernel, and VM
• Tune VM-to-VM networking and rx/tx buffers (KB 1428)
• Use 32-bit vmxnet driver instead of vlance
• To use vmxnet, install VMware Tools
• E1000 is for 64-bit guests
• Enhanced vmxnet is offered for several guests
• Use a network adapter that supports the following:
  • Checksum offload
  • TCP segmentation offload (TSO)
  • Jumbo frames (JF), available in enhanced vmxnet vNIC
  • Capability to handle high-memory DMA
  • Capability to handle multiple scatter/gather elements per Tx frame
  • 10 G

Install VMware Tools
• Vmxnet -- high-speed network driver
• Memory balloon driver
• Improved graphics
• Timer sponge for correct accounting of time
• Timesync -- syncs time with host every minute

Storage Performance
• Hardware configuration affects storage performance
  • Consult SAN Configuration Guides
  • Ensure caching is enabled
  • Consider tuning layout of LUNs across RAID sets
• Spread I/O requests across available paths
[Diagram labels: ESX Server, HBA1, HBA2, HBA3, HBA4, FC Switch, SP1, SP2, Storage array, 1, 2, 3, 4]
• Fibre Channel SAN storage best practices
  • Set LUN queue depth appropriately (KB 1267)
• Networked storage best practices (NFS, iSCSI)
  • Ensure sufficient CPU for software-initiated iSCSI and NFS
  • Avoid link oversubscription
  • Use multiple mount points with multiple VMs
  • Ensure consistent configuration across the full network path

Benchmarking With VMmark
• VMmark: a scalable benchmark for virtualized enterprise systems
  • Provides meaningful measurement of virtualization performance
  • Generates a metric that scales with underlying system capacity
  • Used to compare the performance of different hardware and virtualization platforms
  • Employs realistic, diverse workloads running on multiple operating systems

Storage Performance
• VM configuration
  • Choose placement of data disks and swap files on LUNs appropriately
    • RAID type, spindles available, concurrent access of LUNs, etc.
  • Increase VM’s max outstanding disk requests if needed (KB 1268)
• Esxtop enhancements
  • Per-device and per-path stats
  • Per-VM device stats

DRS Performance
• Ensure hosts in a cluster are VMotion compatible
• Minimize reservations if possible
• Use VM affinity and anti-affinity rules only when needed
• Migration threshold should be set less aggressively when:
  • Hosts in the cluster are not homogeneous
  • VM resource utilization is highly variable over time
  • There are more affinity and anti-affinity rules
• Use DRS to achieve max performance

Troubleshooting Performance
• Know your applications
• Have a baseline
• Esxtop
• Decent tools in VI Client
• Find the bottleneck: CPU, disk, RAM or network
• Host or guest?
• Some components may be out of your domain -- both SAN and network are critical

Designing VI To Prevent Performance Issues
• Capacity planning is key
• Know your apps
• Understand the SAN
• Use DRS / resource pools
• Don’t skimp on hardware

Prevent Common Performance Mistakes
• P2V
• Improper sizing
• Poor hardware selection
• Alerting not configured
• Not using DRS

Performance Tools
• Esxtop
• vKernel Capacity Bottleneck Analyzer
• vKernel Modeler
• Solarwinds Orion VMware Edition and free VM Monitor
• Veeam Monitor
• Nagios
• Vizioncore vCharter / vFoglight
• eG VM Monitor

Performance Tool Demo
[Demo graphics, about 5 slides, not included]

David’s Five Performance Tips
1. Know how to use esxtop -- quick and simple
2. Know your applications and environment
3. Have a baseline
4. Don’t skimp on hardware
5. Use third-party performance tools -- historical performance monitoring is required

Conclusion
• Virtualization environments continue to grow in complexity
• Managing performance doesn’t have to be difficult
• Follow best practices, know your environment and use third-party performance tools
• With that, performance can be improved and troubleshooting can be simplified

References
• Related papers for best practices and benchmarking:
  • ESX Server 3 performance tuning best practices: www.vmware.com/pdf/vi_performance_tuning.pdf
  • VMmark: www.vmware.com/pdf/vmmark_intro.pdf
• SAN Configuration Guide:
  • www.vmware.com/pdf/esx_san_cfg_technote.pdf
  • www.vmware.com/pdf/vi3_esx_san_cfg.pdf

Questions?
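
Example: checking current metrics against a baseline

The "Have a baseline" tip can be made concrete with a small comparison script. The Python sketch below is illustrative only: the metric names, the sample values, and the three-standard-deviation threshold are all assumptions, not part of the presentation or of any VMware tool. In practice the samples might be collected with esxtop or a third-party monitoring tool.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize baseline samples per metric as (mean, standard deviation)."""
    return {name: (mean(values), stdev(values)) for name, values in samples.items()}


def find_deviations(baseline, current, k=3.0):
    """Return metrics whose current value exceeds the baseline mean by more than k std devs."""
    flagged = {}
    for name, value in current.items():
        if name not in baseline:
            continue  # no baseline recorded for this metric
        avg, sd = baseline[name]
        if value > avg + k * sd:
            flagged[name] = (value, avg)
    return flagged


# Hypothetical samples gathered during normal operation (names and values are made up)
baseline_samples = {
    "cpu_ready_pct": [1.0, 1.5, 2.0, 1.5, 1.0],
    "disk_latency_ms": [5.0, 6.0, 5.5, 6.5, 5.0],
}
baseline = build_baseline(baseline_samples)

# Hypothetical current readings taken while investigating a complaint
current = {"cpu_ready_pct": 12.0, "disk_latency_ms": 6.0}

for name, (value, avg) in find_deviations(baseline, current).items():
    print(f"{name}: {value} vs baseline mean {avg:.2f}")
```

With these made-up numbers only the CPU metric is flagged, which points the investigation at CPU rather than disk. A fixed multiple of the standard deviation is just one simple way to define "abnormal"; the right threshold depends on how variable the workload is.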