HP Discover 2012 TB#3258: The benefits and best practices of 10GbE networking with VMware vSphere 5
June 6, 2012, 2:45 PM

Agenda
– The transition
– What's new with VMware vSphere 5
– Design considerations
– Deployment scenarios
– Summary

The Transition

Emulex Implementers Lab
– A website focused on technical content
– A collection of technical whitepapers covering actual work done in Emulex labs or partner labs
– A resource for how-to content, blogs, application notes, technical videos and technical webcast recordings
– Recently selected as one of the best technical websites by Network Products Guide (May 2012)

Ethernet Physical Media
– Mid 1980s: 10Mb – UTP Cat 3
– Mid 1990s: 100Mb – UTP Cat 5
– Early 2000s: 1Gb – UTP Cat 5, SFP fiber
– Late 2000s: 10Gb – SFP+ Cu and fiber, X2, Cat 6/7

Technology | Cable | Distance | Power (each side) | Transceiver latency (link)
– SFP+ direct attach: Twinax, < 10 m, ~0.1 W, ~0.1 µs
– SFP+ SR (short reach): MM 62.5 µm / MM 50 µm, 82 m / 300 m, 1 W, ~0

10GbE PHY Standards
10GBASE- | Media | Reach | Module cost
– LR: SMF, 10 km, high
– SR: MMF, 26 m*, medium
– LX4: MMF/SMF, 300 m*, high
– CX4: Twinax, 15 m, low
– LRM: MMF, 220-300 m*, medium
– T: UTP, 30-100 m, not applicable
*FDDI-grade MMF

– L – 1310 nm laser
– S – 850 nm laser
– X – uses 8B/10B encoding
– R – uses 64B/66B encoding
– T – unshielded twisted pair
– C – copper (Twinax), transmits over 4 lanes in each direction
– 4 (as in LX4) – information is transmitted over 4 wavelengths using coarse Wavelength Division Multiplexing

Upgrade Current Network or Greenfield Build
If upgrading the current network:
– Consider the bandwidth requirements of servers
– Implement 10GbE uplinks for 1Gb devices
– Attach high-priority devices directly to 10GbE
Greenfield 10Gb data center:
– 10GbE to top of rack
– Edge/core switching at 10Gb
– 10GbE convergence (NIC/iSCSI/FCoE) throughout

10GbE Enterprise-Class iSCSI/LAN – Optimized Server Virtualization
– Before: 1GbE LOM and multiple 1Gb adapters, with dedicated ports for LAN 1-4, iSCSI 1-2, administration and vMotion
– After: 10GbE LOM and a 10Gb dual-port adapter carrying administration, vMotion and converged iSCSI/LAN traffic

What's New in vSphere Networking with 10GbE

VMware Networking Improvements
vMotion
– Throughput improved significantly for a single vMotion:
  • ESX 3.5 – ~1.0 Gbps
  • ESX 4.0 – ~2.6 Gbps
  • ESX 4.1 – max 8 Gbps
  • ESX 5.0 – multiple 8+ Gbps connections
– Elapsed time reduced by 50%+ in 10GbE tests
Tx worldlet
– VM-to-VM throughput improved by 2x, up to 19 Gbps
– VM-to-native throughput improved by 10%
LRO (Large Receive Offload)
– Receive tests indicate a 5-30% improvement in throughput
– 40-60% decrease in CPU cost

Jumbo MTU with 10GbE
Configure jumbo MTU:
– VM virtual adapter – change the jumbo packet size value to 9k
– ESX host – change the vSwitch MTU to 9k
– The NICs at both ends and all intermediate hops/routers/switches must also support jumbo frames
Example: configure a 9k packet size on a W2k8 VM virtual adapter
Example: configure a 9k MTU on vSwitch1 on the ESX host

Jumbo Frames Are Even More Important with 10GbE
Performance improvement in iSCSI and NFS read/write throughput using jumbo frames on 10GbE (64KB block size):
– SW iSCSI: read throughput +11%, write throughput +39%
– NFS: read throughput +9%, write throughput +32%
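As a companion to the jumbo MTU examples above, here is a minimal command-line sketch for the ESX host side. The vSwitch and VMkernel interface names are placeholders, and the esxcli lines show the ESXi 5.0 syntax; verify the option names against your build before using them.

  # Classic ESX/esxcfg syntax: set the vSwitch MTU to 9000, then list vSwitches to confirm
  esxcfg-vswitch -m 9000 vSwitch1
  esxcfg-vswitch -l
  # ESXi 5.0 esxcli equivalents for the vSwitch and for the VMkernel port used by iSCSI/NFS/vMotion
  esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
  esxcli network ip interface set --interface-name=vmk1 --mtu=9000

As noted above, the guest virtual adapter and every physical switch port in the path must also be configured for jumbo frames, or the larger frames will be dropped or fragmented.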
NetQueues

What are NetQueues?
– Provide 8 queues on each 10Gb port for incoming network traffic
– Offload incoming packet sorting/routing from the hypervisor to the adapter
– Free up CPU resources, improving platform efficiency and overall system performance
– Traffic is balanced across multiple CPU cores for processing using MSI-X
– Message Signaled Interrupts (MSI-X) support:
  • Doubles the number of device interrupts generated
  • Balances interrupts across all CPUs in a symmetric multiprocessing (SMP) platform
– Optimized throughput between multiple virtual machines and the physical network
– Best benefit with jumbo MTU on the vSwitch, vNIC and physical switch

Without NetQueues enabled, there is only a single TX and RX queue for processing all network traffic generated by multiple virtual machines. Enable NetQueues on the ESX host or through the vSphere Client; configuring NetQueues from the ESX host command line is shown in the sketch that follows the performance numbers below.

Performance Benefits of Enabling NetQueues and Jumbo MTU
– Throughput is increased by 2.50 Gbps, to approximately NIC line rate
– Reduces average CPU utilization
– 35-50% reduction in virtual machine CPU utilization
– 1500 MTU, 8 VMs, no NetQueue: receive throughput = 7.36 Gbps (aggregate across VMs)
– 9000 MTU, 8 VMs, NetQueue enabled: receive throughput at roughly 10GbE line rate, a gain of about 2.5 Gbps (aggregate across VMs)
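The slide's "configuring NetQueues on the ESX host" screenshot is not reproduced here, so the following is only a sketch of the commonly documented approach for classic ESX hosts; the kernel option name, the TRUE/1 value format and any per-driver queue parameters vary by release and NIC driver, so treat this as an assumption to verify. On ESX/ESXi 4.x and later, NetQueue is generally enabled by default when the driver supports it.

  # Enable the NetQueue VMkernel option (equivalent to VMkernel.Boot.netNetqueueEnabled in the
  # vSphere Client advanced settings); typically requires a host reboot to take effect
  esxcfg-advcfg -k TRUE netNetqueueEnabled
  # After the reboot, inspect NIC statistics; whether per-queue counters appear depends on the driver
  ethtool -S vmnic2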
Network Traffic Management with 10GbE
(Diagram: iSCSI, FT, vMotion, NFS and TCP/IP traffic on dedicated 1GigE NICs, versus the same traffic types converged through the vSwitch onto 10GigE NICs.)
– With 1GigE, bandwidth is assured by dedicated physical NICs for each traffic type (e.g. vMotion, IP storage)
– With 10GigE, traffic is typically converged over two 10GbE NICs
– Traffic types then compete: who gets what share of the vmnic?
– Some traffic types and flows could dominate others through oversubscription

NetIOC: Guaranteed Network Resources
– NetIOC disabled: NFS, VM and FT traffic take a dip during vMotion
– NetIOC enabled: shares and limits are enforced as configured, and NFS, VM and FT traffic are not affected by a concurrent vMotion

802.1p Tagging (QoS)
802.1p is an IEEE standard for enabling Quality of Service at the MAC level. 802.1p tagging helps provide end-to-end Quality of Service when:
– All network switches in the environment treat traffic according to the tags
– Tags are added based on the priority of the application/workload
With vSphere 5 you can now tag any traffic flowing out of the vSphere infrastructure.

VMware vMotion
Benefits of vMotion:
– Eliminates planned downtime
– Enables dynamic load balancing
– Reduces power consumption
– Essential to managing the virtualized datacenter
Challenges:
– Larger VMs require more time to vMotion
– Larger environments require more vMotions to remain balanced

vSphere 5 vMotion – Multi-NIC Support
(Chart: vMotion elapsed time in seconds for 1 VM and 2 VMs, vSphere 4.1 vs. vSphere 5.0.)
– vSphere 5.0 out of the box provides a 25% to 30% reduction in vMotion time
– Using 2 NICs provides linear scaling and a 50% reduction in vMotion time
– Elapsed time reduced by 50%+ in 10GbE tests

VMware Fault Tolerance
– Identical VMs running in lockstep on separate hosts
– Zero-downtime, zero-data-loss failover from hardware failures
– No complex clustering or specialized hardware
– Single common mechanism for all applications and operating systems
10GbE benefits:
– 20% better VM performance
– Easier to manage, fewer NICs required
– 10GbE is highly recommended for FT

vSphere 5 Networking Improvements: SplitRxMode
– Greatly reduces packet loss for multicast processing
– vSphere 5 introduces a new way of doing network packet receive processing: the cost of receive packet processing is split across multiple contexts
– For each VM you can specify whether receive packet processing happens in the network queue context or in a separate context
– Results with more than 24 VMs powered on: without splitRxMode, up to 20% packet loss; with splitRxMode enabled, less than 0.01% packet loss
– Enable splitRxMode on a per-vNIC basis by editing the VM's .vmx file and setting ethernetX.emuRxMode = "1" for the Ethernet device
– Note: this is only available with vmxnet3

Network Latency
– VM round-trip latency overhead is 15-20 microseconds; DirectPath reduces round-trip latency by 10 µs
How to reduce network latency:
– Consider disabling power management
– Consider disabling physical and virtual NIC interrupt coalescing
– Reduce contention on the physical NIC
– Utilize 10Gb NICs
– Utilize the VMXNET3 paravirtualized device driver
– Utilize DirectPath (if really needed)
http://www.vmware.com/files/pdf/techpaper/network-io-latency-perf-vsphere5.pdf

Networking & 10GbE Best Practices
– Turn on hyper-threading in the BIOS
– Confirm that the BIOS is set to enable all populated sockets and all cores
– Enable "Turbo Mode" for processors that support it
– Confirm that hardware-assisted virtualization features are enabled in the BIOS
– Disable any other power-saving modes in the BIOS
– Disable any unneeded devices in the BIOS, such as serial and USB ports
– For Windows VMs, use VMXNET3 enhanced virtual adapters for best performance
– Adjust the network heap size for heavy network traffic (see the sketch below):
  • By default the ESX server network stack allocates 64MB of buffers to handle network data
  • Increase the buffer allocation from 64MB to 128MB to handle more network data
  • To change the heap size on the ESX host: esxcfg-advcfg -k 128 netPktHeapMaxSize
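As a quick way to sanity-check the items above from the ESX host, here is a minimal shell sketch. The vSwitch name is a placeholder and only the heap-size command comes from the slide itself; the two list commands are standard esxcfg tools used here to confirm the earlier settings.

  # List physical NICs: confirm the 10Gb uplinks, their link speed and driver
  esxcfg-nics -l
  # List vSwitch configuration: confirm the 9000 MTU configured earlier on vSwitch1
  esxcfg-vswitch -l
  # Increase the network packet heap from the 64MB default to 128MB (value cited on the slide)
  esxcfg-advcfg -k 128 netPktHeapMaxSize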
Networking & 10GbE Best Practices, Cont'd
– Be mindful of converged networks; storage load can affect the network
  • Use NIOC to balance and control network workloads
– Use distributed switches for cross-ESX-host networking convenience; there is no significant performance difference between a dvSwitch and a vSwitch
  • A dvSwitch is needed for NIOC and advanced network shaping
– Utilize the latest NIC features
  • Jumbo frames for iSCSI and NFS workloads
  • Utilize 10Gb Ethernet hardware
– Utilize NIC teams for failover (HA) and NIC load balancing
  • Use multiple NICs for improved vMotion speeds
– Keep an eye out for packet loss and network latency

Design Considerations

Design Considerations
Separate infrastructure traffic from VM traffic:
– VMs should not be able to see infrastructure traffic (a security violation)
Methods of traffic separation:
1. Separate logical networks (VLANs)
  • Create one VDS and connect all pNICs
  • Create port groups with different VLANs; place vmknics and VM vNICs on different port groups
  • 2 pNICs are sufficient
2. Separate physical networks
  • Create one VDS for each physical network
  • Create port groups; put vmknics and VM vNICs on port groups from different vSwitches
  • Requires at least 4 pNICs (2 per network for redundancy)

Design Considerations, Cont'd
Avoid a single point of failure:
– Connect two or more physical NICs to an uplink port group
– Preferably connect those physical NICs to separate physical switches
Understand your virtual infrastructure traffic flows:
– Use the NetFlow feature to monitor traffic flows over time
– Use the data to come up with appropriate traffic shares to guide the NIOC configuration
Prioritize traffic for important workloads and infrastructure traffic:
– Use QoS tagging

Design Considerations – Virtual Infrastructure Traffic Types
Design Considerations – Traffic Characteristics / NFS

Deployment Scenarios

Deploying 10GbE
Customer infrastructure needs:
– Blade server deployment: two 10Gb interfaces
– Rack server deployment: two 10Gb interfaces
Physical port configurations:
– Multiple 1Gb NICs
– Two 10Gb NICs
Physical switch capabilities:
– Switch clustering
– Link state tracking

Rack Server Deployment – 10GbE Interface
Static – Port Group to NIC Mapping
Dynamic – Use NIOC, Shares and Limits
Bandwidth information is needed for the different traffic types:
– Consider using NetFlow
Bandwidth assumptions:
– Management – less than 1Gb
– vMotion – 2Gb
– iSCSI – 2Gb
– FT – 1Gb
– VMs – 2Gb
Share calculation (a worked example follows the pros and cons below):
– Equal shares to vMotion, iSCSI and virtual machine traffic
– Lower shares for management and FT

Blade Server Deployment – 10GbE Interface
Static – Port Group to NIC Mapping
Dynamic – Use NIOC, Shares and Limits

Pros and Cons: Static
Pros:
– Deterministic traffic allocation
– Administrators have control over which traffic flows through which uplinks
– Physical separation of traffic through separate physical interfaces
Cons:
– Underutilized I/O resources
– Higher CAPEX and OPEX
– Resiliency only through active-standby paths

Pros and Cons: Dynamic
Pros:
– Better-utilized I/O resources through traffic management
– Logical separation of traffic through VLANs
– Traffic SLAs maintained through NIOC shares
– Resiliency through active-active paths
Cons:
– Dynamic traffic movement across the physical infrastructure needs all paths to be available to handle any traffic characteristics
– Requires VLAN expertise
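To make the share calculation above concrete, here is an illustrative example; the specific share values are assumptions chosen for illustration, not figures from the session. Suppose vMotion, iSCSI and virtual machine traffic are each given 100 shares and management and FT are each given 50 shares, for a total of 400 shares on a 10GbE uplink. Under full contention, NIOC guarantees each traffic type its proportional slice of the link: vMotion, iSCSI and VM traffic each get 100/400 × 10 Gbps = 2.5 Gbps, while management and FT each get 50/400 × 10 Gbps = 1.25 Gbps, which covers the assumed demands of roughly 2 Gbps and 1 Gbps respectively. When the link is not congested, any traffic type can use idle bandwidth up to its configured limit, which is why the dynamic design uses the hardware more efficiently than static port-group-to-NIC mapping.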
Emulex is HP's Leading I/O Supplier
– Partnership across HP's ESSN (Enterprise Server, Storage and Networking) business units
– Broadest offering across operating systems and form factors
– Industry Standard Systems: ProLiant DL, ML and SL servers and ProLiant BladeSystems, including LOM-down (G7) and BLOM (Gen8) designs
– Business Critical Systems: Superdome 2, Integrity blades, rx2800
– Storage Division: P4000, P10000, x9000, D2D
– Networking: 5820
– Adapter technologies across these platforms: 10GbE uCNAs, 10GbE NICs, Tiger LOM, 4Gb/8Gb/16Gb FC and 8Gb/10GbE combo adapters

Resources
– Emulex Implementers Lab: www.implementerslab.com – a website containing technical whitepapers, application notes, videos and technical blogs
– HP VMware Solutions: www.hp.com/go/vmware
– VMware Technical Resource Center: http://www.vmware.com/resources/techresources/
– www.hp.com/go/ProLiantGen8

Summary
– Migrating from 1GbE to 10GbE does require new hardware, but nearly all servers and top-of-rack switches being deployed today are 10GbE
– The VMware distributed virtual switch enables datacenter-like infrastructure
– VMware ESXi 5.0, complemented by 10GbE adapters provided by Emulex, delivers the tools and technologies required to successfully deploy 10GbE networks
– HP adapters provided by Emulex deliver all the necessary technology to support VMware vSphere 5.0 10GbE network deployments

Emulex Breakout Sessions
Wednesday, June 6th, 2:45 PM – The benefits and best practices of using 10Gb Ethernet with VMware vSphere 5.0 (session #TB3258)
Join Emulex and VMware for a technical discussion to help understand the effects of transitioning from 1Gb to 10Gb Ethernet. It is no longer just about speeds and feeds: the transition requires knowledge of new network architectures and management tools in order to set up and monitor bandwidth allocation and traffic prioritization. Storage and network architects need to understand the new storage technologies and networking features VMware brings with VMware ESXi 5.0.
Speakers: Alex Amaya (Emulex) and VMware

Thursday, June 7th, 11:15 AM – Converged I/O for HP ProLiant Gen8 (session #BB3259)
Join this interactive session with IT Brand Pulse to understand the benefits of HP converged networking by exploring a customer's real-world deployment. HP has just unveiled its next-generation ProLiant Gen8 platform, which accelerates server market transformation: in addition to being self-sufficient, the new platform is enabled with industry-leading I/O, making it incredibly fast and flexible. The HP ProLiant Gen8 platform delivers converged I/O that includes FlexFabric, Flex-10, iSCSI and Fibre Channel over Ethernet, all at the price of 10GbE.
Speaker: Frank Berry, CEO of IT Brand Pulse

Get a Starbucks card! Enter to win a $100 Visa gift card! Enter to win the 2012 Ducati Monster motorcycle!

Emulex Booth – Demos and Grand Prize
– Management: OneCommand® Manager, Emulex OCM vCenter plug-in
– 8Gb FC HBAs – stand-up and mezzanine
– HP Converged Infrastructure – Path to the Cloud: FCoE, iSCSI, 10GbE Ethernet
Grand prize: 2012 Ducati Monster
Requirements for bike entry:
– Listen to a booth demo
– Participate in an Emulex breakout session
Winner announced on 6/7 at 1:30 PM!

© 2012 Emulex Corporation