Microsoft-Citrix Virtual Desktop Infrastructure
Technical Whitepaper

Primary Contributors
Mike Truitt, Microsoft Corp.
Loay Shbeilat, Microsoft Corp.
Jesus Alvarez, Microsoft Corp.
Christopher Fife, Citrix Systems, Inc.
Robert Briggs, NetApp Corp.
Mike Brennan, Cisco

Table of Contents

Table of Figures ... vi
Introduction ... 1
  Audience ... 1
  Purpose ... 1
  Scope ... 1
  Summary of Findings ... 2
Microsoft-Citrix VDI Program Description ... 3
  Business Benefits ... 3
  Microsoft Corporation ... 4
    Virtual Machine Manager ... 4
      Fabric Management ... 4
    Windows Server 2008 R2 SP1 ... 5
      Active Directory Domain Services ... 5
      File Services ... 5
      Failover Clustering ... 6
      Hyper-V ... 6
    Hyper-V Server ... 6
    Deeper Dive into Hyper-V Features ... 6
      Dynamic Memory (SP1) ... 6
      Live Migration ... 6
      Hardware Support for Hyper-V Virtual Machines ... 7
      Cluster Shared Volumes ... 7
      Performance and Power Consumption ... 7
      Networking Support ... 7
    SQL Server 2008 R2 ... 7
    Windows 7 ... 8
  FlexPod with Microsoft Private Cloud ... 8
    Cisco UCS 5108 Blade Chassis ... 8
    Cisco UCS B200 M2 Blade Servers ... 8
    Cisco UCS B230 M2 Blade Servers ... 8
    Cisco Nexus 5010 Switch ... 9
    Cisco UCS 6248 Fabric Interconnect ... 9
    NetApp FAS3240 Controller ... 9
    NetApp DS2246 Disk Shelves and Storage Media ... 10
  Citrix ... 10
    Citrix FlexCast Technology ... 10
    Citrix XenDesktop 5.6 ... 11
Microsoft-Citrix VDI Solution Description ... 13
  Conceptual Software Architecture ... 13
  Conceptual Hardware Architecture ... 14
    Scale Units ... 14
    Compute ... 14
    Storage ... 14
    Networking ... 15
Microsoft-Citrix 2,000 VDI Architecture ... 16
  Capacity Planning ... 17
  Server Reference Architecture ... 17
  Networking Reference Architecture ... 17
    Server-Side Networking ... 17
  SAN and Fabric Reference Architecture ... 18
  Software Reference Architecture ... 21
    Windows Server 2008 R2 Enterprise SP1 ... 23
      Hyper-V Role ... 23
      Failover Cluster Feature ... 23
      Active Directory Domain Services ... 24
      File Server Role ... 24
      PXE Boot ... 25
    Virtual Machine Manager ... 25
    SQL Server 2008 R2 ... 26
    Windows 7 ... 26
    Citrix XenDesktop 5.6 ... 27
      XenDesktop Delivery Controller ... 27
      Web Interface ... 27
      Licensing ... 27
      Virtual Desktop Provisioning Model ... 27
      PVS Pooled Desktops ... 29
      Network Booting a PVS Target Device ... 29
      PVS Master Image Creation ... 30
      PVS Load Balancing ... 31
  High Availability ... 31
Performance Characteristics ... 33
  End-User Experience ... 33
  Capacity Planning ... 35
  Performance Data ... 36
    Power Management ... 36
      2,000 Virtual Desktop Boot Storm ... 36
        Hyper-V Host Performance Analysis ... 37
        PVS Server Performance Analysis ... 40
        VMM Server Performance Analysis ... 42
        Boot Storm Analysis Summary ... 43
      2,000 Virtual Desktops Shutdown (Power Off) ... 43
    Runtime Analysis (Login VSI) ... 43
      Single-Server Scalability Analysis ... 44
      Full 2,000 Desktop Environment VSI Runtime ... 47
      Single-Server Scalability with Half the Host RAM (128 GB RAM) ... 54
Appendix A ... 58
  Power Management ... 58
    Startup ... 58
    Shutdown ... 59
  Master Image Update ... 60
Appendix B ... 61
  Infrastructure Deployment ... 61

Table of Figures

Figure 1. Conceptual Microsoft-Citrix VDI ... 13
Figure 2. Conceptual hardware architecture ... 16
Figure 3. Storage architecture ... 19
Figure 4. Conceptual software architecture ... 22
Figure 5. End-user connectivity metrics ... 34
Figure 6. Boot storm startup with 15-75-10 throttle ... 37
Figure 7. Boot storm Microsoft Hyper-V CPU utilization ... 38
Figure 8. Boot storm hypervisor disk queue length ... 38
Figure 9. Boot storm hypervisor memory available ... 39
Figure 10. Boot storm hypervisor network-bandwidth utilization ... 40
Figure 11. Boot storm PVS CPU utilization ... 41
Figure 12. Boot storm PVS network-bandwidth utilization ... 41
Figure 13. Boot storm Microsoft System Center 2012 Virtual Machine Manager component CPU utilization ... 42
Figure 14. Boot storm Microsoft System Center 2012 Virtual Machine Manager component memory available ... 43
Figure 15. 150 desktop VMs with 256 GB RAM ... 44
Figure 16. 150 desktop VM hypervisor CPU runtime utilization ... 45
Figure 17. 150 desktop hypervisor runtime available memory ... 46
Figure 18. 150 desktop VM physical memory allocated ... 46
Figure 19. 2,000 desktop VMs (Run 167) ... 48
Figure 20. 2,000 desktop hypervisor CPU runtime utilization ... 48
Figure 22. 2,000 desktop VMs: hypervisor I/O ... 50
Figure 23. 2,000 desktop VMs: hypervisor disk queue length ... 50
Figure 24. 2,000 desktop VMs: hypervisor I/O latency ... 51
Figure 25. 2,000 desktop VM storage I/O time ... 52
Figure 26. 2,000 desktop VM storage CPU utilization ... 53
Figure 27. 2,000 desktop storage I/O latency ... 53
Figure 28. 125 desktop VMs (Run 183) ... 54
Figure 29. 125 desktop VM hypervisor CPU runtime utilization ... 55
Figure 30. 125 desktop VM hypervisor runtime available memory ... 56
Figure 31. 125 desktop VM physical memory allocated ... 57
Figure 33. Power management: shutdown ... 59
Figure 34. Master-image update ... 60
Figure 35. Two-node failover cluster ... 61

Introduction

This Microsoft-Citrix virtual desktop infrastructure (VDI) technical whitepaper describes the results of a study conducted at the Microsoft Enterprise Engineering Center (EEC) to evaluate the scalability of a Citrix XenDesktop 5.6 and Provisioning Services 6.1 (PVS) environment on Cisco UCS B-Series Blade Servers running Windows Server 2008 R2 Service Pack 1 (SP1) with the Microsoft Hyper-V role, connected to a NetApp storage array. The solution is managed by Microsoft System Center 2012 and constitutes the hardware environment that was validated in the Microsoft Private Cloud Fast Track (Fast Track) program in 2011, and revalidated against the Fast Track requirements in 2012 using System Center 2012. The solution is designed to stream 2,000 virtual desktops as an enterprise-class reference implementation, architected through a joint development relationship between Microsoft, Citrix, Cisco, and NetApp.

Audience

To take full advantage of this document, the reader should have a general understanding of private cloud VDI concepts and a technical understanding of enterprise-level server, storage, and networking technology.
Purpose

The purpose of this document is to articulate the design considerations and validation efforts required to deploy the Microsoft-Citrix software stack on Cisco UCS B-Series Blade Servers with NetApp storage. We specifically set out to accomplish the following:

- Assist enterprise-level IT professionals with capacity planning and performance analysis for the deployment of a Microsoft-Citrix VDI solution
- Describe the core functionality of the Microsoft-Citrix VDI and what administrators can expect with provisioning, power operations, image patching, and virtual machine (VM) runtime
- Validate the interaction between the Virtual Machine Manager (VMM) component of System Center 2012 and XenDesktop 5.6 from a software perspective

Scope

This document provides a high-level description of the hardware and software layers involved in the Microsoft-Citrix VDI solution, in addition to a detailed description of the 2,000 virtual-desktop deployment used to test the solution at the Microsoft EEC. For testing purposes, a scale of 2,000 virtual desktops was chosen, and no attempt was made to exceed this number. The solution successfully validated the entire VDI stack using the latest versions of products from each vendor: Cisco, Citrix, Microsoft, and NetApp. To reach scales greater than 2,000 virtual desktops, customers can deploy multiple "pods" of Hyper-V clusters, each managed by an instance of Virtual Machine Manager. XenDesktop can aggregate multiple VMM deployments and their hosts to achieve even greater scale.

This document does not cover all integration points between XenDesktop 5.6 and the Microsoft virtualization and management platform. For example, it does not cover application virtualization (App-V), the Operations Manager component of System Center 2012, System Center 2012 Configuration Manager, or Microsoft RemoteFX desktop sessions.

Summary of Findings

Microsoft, Citrix, Cisco, and NetApp successfully validated XenDesktop streaming 2,000 concurrent Windows 7 desktops to virtual machines hosted on Hyper-V. The deployment was built on a 16-node FlexPod with Microsoft Private Cloud infrastructure that meets the hardware requirements of the Microsoft Private Cloud Fast Track program. The hardware, in conjunction with Windows Server 2008 R2 Enterprise SP1, Hyper-V, System Center 2012, and Citrix XenDesktop 5.6:

- Validated that a single instance of Virtual Machine Manager and a single instance of the XenDesktop delivery controller successfully hosted 2,000 VDI workloads.
- Verified single-server scalability: 143 Windows 7 desktops with 512 MB of dynamic memory (expandable to 2 GB) running a knowledge-worker load on a single Cisco UCS B230 M2 Blade Server with 20 cores and 256 GB of random access memory (RAM).
- Validated a fully virtualized environment in which all VDI components (such as Active Directory Domain Services, Microsoft SQL Server, VMM, XenDesktop, and Provisioning Server) ran in virtual machines.
- Observed an 80 percent storage savings in comparison to a similar VDI environment without NetApp storage efficiencies.

This whitepaper fully outlines the hardware and software environments, the capacity planning used to achieve these results, and the performance characteristics observed.
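These findings also imply the desktop host count used in the reference architecture. As a quick back-of-the-envelope check, the following minimal PowerShell sketch estimates the number of desktop hosts needed; the figures come from the findings above, and only the rounding logic is ours:

    # Rough host-count estimate from the measured single-server density.
    $desktopsRequired = 2000   # target concurrent desktops
    $desktopsPerHost  = 143    # measured Login VSI density per B230 M2 host
    $hostCount = [math]::Ceiling($desktopsRequired / $desktopsPerHost)
    $hostCount                 # 14

The result, 14 hosts, matches the 14 Cisco UCS B230 M2 blade servers deployed for desktop VMs in this architecture; in practice you would also reserve headroom for host failures and demand peaks.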
Microsoft-Citrix VDI Program Description

The Microsoft-Citrix VDI solution combines Microsoft software, consolidated guidance, and validated configurations with partner hardware technology, including computing power, network and storage architectures, and value-added software components. This solution constitutes the hardware environment that was validated in the Microsoft Private Cloud Fast Track program in 2011 and revalidated against the Fast Track requirements in 2012 using System Center 2012.

The Microsoft-Citrix VDI solution provides a turnkey approach for delivering validated implementations of the private cloud for virtual desktops. System Center provides local control over data and operations, helping IT professionals to dynamically pool, allocate, and manage resources for agile infrastructure as a service (IaaS). When deployed on the Microsoft private cloud, XenDesktop transforms Windows desktops and applications into on-demand services available to any user, any device, anywhere. XenDesktop quickly and securely delivers any type of virtual desktop, and Windows, web, and software-as-a-service (SaaS) applications, to PCs, Apple Macs, tablets, smartphones, laptops, and thin clients, all with a high-definition user experience.

The following sections provide general information about each software and hardware component employed in the Microsoft-Citrix VDI architecture. For specific information about the Microsoft-Citrix VDI deployment, see "Microsoft-Citrix 2,000 VDI Architecture" on page 16.

Business Benefits

The Microsoft-Citrix VDI solution represents an opportunity for enterprise-class businesses to realize dramatic cost savings and improvements in performance over the traditional client-server IT model. The Microsoft-Citrix VDI software and hardware architecture provides state-of-the-art virtualization technology for building private clouds that stream operating systems to virtual desktops on demand or at predetermined times during the day. This solution helps organizations implement private clouds with increased ease and confidence. The following are some of the key benefits of the Microsoft-Citrix VDI program:

- Enable virtual work styles to increase workforce productivity from anywhere
- Take advantage of the latest mobile devices to drive innovation throughout the business
- Rapidly adapt to change with fast, flexible desktop and application delivery for offshoring, mergers and acquisitions (M&A), branch expansion, and other business initiatives
- Transform desktop computing with centralized delivery, management, and security

Microsoft and Citrix deliver end-to-end business value through an integrated desktop virtualization solution built on Microsoft Hyper-V Server 2008 R2 SP1 and managed by System Center. The solution has been enhanced in Service Pack 1 for Windows Server 2008 R2 and Hyper-V Server 2008 R2 for even greater impact.

Dynamic Memory helps you pool available physical memory on the host and dynamically allocate it to virtual machines based on workload needs, making more efficient use of physical memory resources to reduce server requirements. By enabling more efficient usage of physical memory, Dynamic Memory allows more virtual machines (VMs) to run simultaneously on a virtualization host without noticeable performance impact, reducing the number of hypervisor-capable servers needed to host large numbers of virtual desktops.
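To illustrate, the following sketch applies the Dynamic Memory configuration tested in this paper (512 MB startup memory, expandable to 2 GB) to a single desktop VM through the Virtual Machine Manager PowerShell module. The VM name is hypothetical, and the cmdlet parameter names shown are assumptions that should be verified against your VMM 2012 installation:

    # Sketch: apply the tested Dynamic Memory settings to one desktop VM
    # via the VMM 2012 PowerShell module (parameter names assumed).
    Import-Module virtualmachinemanager

    $vm = Get-SCVirtualMachine -Name "Win7-Desktop-0001"   # hypothetical name

    $memorySettings = @{
        DynamicMemoryEnabled   = $true
        MemoryMB               = 512    # startup memory per desktop
        DynamicMemoryMaximumMB = 2048   # ceiling a desktop can grow to
    }
    Set-SCVirtualMachine -VM $vm @memorySettings

In a real deployment these settings would typically be defined once in the VMM hardware profile or template from which the pooled desktops are created, rather than applied per VM.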
Microsoft® RemoteFX™, a feature of the Remote Desktop Services (RDS) server role, provides users with a rich experience when accessing virtual desktops from a broad range of endpoint devices. For example, using RemoteFX in a VDI-based desktop environment, users can work remotely in a Windows® Aero® desktop environment, watch full-motion video, enjoy Microsoft Silverlight® animations, and run three-dimensional (3-D) applications from virtually any endpoint device.

By using Citrix and Microsoft for desktop virtualization, you can take advantage of the enhanced VM density capabilities of Hyper-V® Server 2008 R2 SP1 to provide users with the rich experience of Windows 7 using RemoteFX, while enabling IT to securely manage both physical and virtual infrastructures using System Center. Integrated with Windows Server® 2008 R2 and System Center management capabilities, and complemented with partner technology from Citrix, Hyper-V Server 2008 R2 SP1 delivers unique end-to-end business value for desktop virtualization.

Microsoft Corporation

For a growing number of businesses, the journey to cloud computing starts with a private cloud implementation. A Microsoft private cloud dramatically changes the way your business produces and consumes IT services by creating a layer of abstraction over your pooled IT resources. This helps your data center offer true infrastructure-service capability, in addition to optimally managed application services.

The VDI software solution is built on Windows Server 2008 R2, Hyper-V Server 2008 R2, System Center 2012, and SQL Server 2008 R2. The solution deployed Windows 7 Enterprise as the virtual desktop of choice.

Virtual Machine Manager

The Virtual Machine Manager component of System Center 2012 helps enable centralized management of physical and virtual IT infrastructure, increased server utilization, and dynamic resource optimization across multiple virtualization platforms. Virtual Machine Manager includes end-to-end capabilities such as planning, deploying, managing, and optimizing the virtual infrastructure. It can centrally create and manage VMs across data centers, easily consolidate multiple physical servers onto virtual hosts, rapidly provision and optimize VMs, and dynamically manage virtual resources through management packs.

Fabric Management

In System Center 2012, the Virtual Machine Manager component offers a robust set of new features to enable better management of the fabric supporting the virtual infrastructure:

- Storage management: Storage arrays using the Storage Management Initiative Specification (SMI-S) protocol can be added to Virtual Machine Manager for management. VMM can then discover, classify, and provision storage to Hyper-V hosts or clusters. The FlexPod solution using NetApp FAS storage includes SMI-S capabilities.
- Bare-metal provisioning: Virtual Machine Manager integrates with new or existing deployments of Windows Deployment Services (WDS) to deploy Windows Server 2008 R2 to bare-metal machines using boot from virtual hard disk (VHD). VMM communicates with the bare-metal machines via a baseboard management controller (BMC).
  In order for VMM to communicate with the host, the host BMC must support one of the following protocols:
  - Data Center Management Interface (DCMI)
  - Systems Management Architecture for Server Hardware (SMASH)
  - Intelligent Platform Management Interface (IPMI)
- Hyper-V cluster provisioning: By taking advantage of the storage-management features of Virtual Machine Manager, shared storage can be provisioned to Hyper-V hosts and used to create host clusters from within VMM. Combined with bare-metal provisioning, this feature lets users go from bare-metal machines to Hyper-V clusters entirely from within VMM.
- Fabric updates: Virtual Machine Manager integrates with Windows Server Update Services (WSUS) to manage updates to the VMM fabric servers. VMM fabric servers include Hyper-V hosts, Hyper-V clusters, library servers, pre-boot execution environment (PXE) servers, WSUS servers, and the VMM management server. In the case of Hyper-V host clusters, VMM can orchestrate the update process. If the cluster supports live migration, running VMs are automatically evacuated from the host being patched. If live migration is not supported, running VMs are placed in a saved state and brought back online once the host has been updated.

Windows Server 2008 R2 SP1

Windows Server 2008 R2 SP1 is a multi-purpose server operating system designed to increase the reliability and flexibility of your server or private cloud infrastructure, helping you save time and reduce costs. It provides powerful tools to react to business needs faster than ever, with greater control and confidence. Multiple components of Windows Server have been implemented in the VDI solution.

Active Directory Domain Services

Active Directory® Domain Services (AD DS) helps you manage corporate identities, credentials, information protection, and system and application settings. AD DS is the central location for configuration information, authentication requests, and information about all of the objects stored within your forest. Using AD DS, you can efficiently manage users, computers, groups, printers, applications, and other directory-enabled objects from one secure, centralized location.

File Services

Windows Server 2008 R2 offers a cost-effective, enterprise-ready file-serving platform for Windows and mixed environments. It offers an abundance of capabilities that IT organizations have requested over the years. Your organization can take advantage of the extensive file-serving, data-management, and data-protection capabilities found in Windows Server 2008 R2.

Failover Clustering

Server availability is a higher priority than ever. The demands of a 24 hours a day, seven days a week (24/7) global marketplace mean downtime can equate to lost customers, revenue, and productivity. Windows Server 2008 brought many new or enhanced configuration, management, and diagnostic features to failover clustering that made setting up and managing clusters easier for IT staff. Windows Server 2008 R2 builds on that work with improvements aimed at enhancing the validation process for new or existing clusters, simplifying the management of clustered virtual machines (which run on Hyper-V), providing a Windows PowerShell interface, and providing more options for migrating settings from one cluster to another. These enhancements combine to provide a near turn-key solution for making applications and services highly available.
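As a minimal illustration of that PowerShell interface, the sketch below validates two hosts, creates a cluster, and registers a VM as a highly available clustered role using the FailoverClusters module that ships with Windows Server 2008 R2. All names and the IP address are hypothetical:

    # Validate candidate nodes, build a cluster, and make a VM highly available.
    Import-Module FailoverClusters

    # Run the cluster validation tests against the two infrastructure hosts.
    Test-Cluster -Node "HV-INFRA01", "HV-INFRA02"

    # Create the two-node failover cluster used for the infrastructure VMs.
    New-Cluster -Name "VDI-Infra-CL" -Node "HV-INFRA01", "HV-INFRA02" `
        -StaticAddress "10.0.0.50"

    # Register an existing Hyper-V virtual machine as a clustered role.
    Add-ClusterVirtualMachineRole -VMName "VMM01" -Cluster "VDI-Infra-CL"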
Hyper-V

Hyper-V is an integral part of Windows Server and provides a foundational virtualization platform that helps you transition to the cloud. With Windows Server 2008 R2 you get a compelling solution for core virtualization scenarios: production server consolidation, dynamic data center, business continuity, VDI, and test and development. Hyper-V provides greater flexibility through features such as live migration and cluster shared volumes. More details are provided in the following sections.

Hyper-V Server

Microsoft Hyper-V Server 2008 R2 is the hypervisor-based server-virtualization product that allows you to consolidate workloads onto a single physical server. It is a stand-alone product that provides a reliable and optimized virtualization solution, enabling organizations to improve server utilization and reduce costs. Because Hyper-V Server is a dedicated, stand-alone product that contains only the Windows hypervisor, the Windows Server driver model, and virtualization components, it provides a small footprint and minimal overhead.

Deeper Dive into Hyper-V Features

Windows Server 2008 R2 provides new virtualization technology in Hyper-V, helping you deliver more advanced capabilities to your business for increased IT efficiency and agility.

Dynamic Memory (SP1)

Dynamic Memory, new in Windows Server 2008 R2 SP1, enables customers to better utilize the memory resources of Hyper-V hosts by balancing how memory is distributed between running virtual machines. Memory can be dynamically reallocated between virtual machines in response to their changing workloads. Dynamic Memory enables more efficient use of memory while maintaining consistent workload performance and scalability. Implementing Dynamic Memory means that higher levels of server consolidation can be achieved with minimal impact on performance. It also enables larger numbers of virtual desktops per Hyper-V host in VDI scenarios. The net result for both scenarios is more efficient use of expensive server-hardware resources, which can translate into easier management and lower costs.

Live Migration

Windows Server 2008 R2 with Hyper-V includes the live migration feature. With live migration, data centers with multiple Hyper-V physical hosts can move running virtual machines to the best physical computer for performance, scaling, or optimal consolidation without affecting users, thereby reducing costs and increasing productivity. Servicing and maintenance can be done in a controlled fashion during business hours, increasing productivity for users and server administrators. Data centers can also reduce power consumption by dynamically increasing consolidation ratios and powering off unused physical hosts during lower-demand times.

Hardware Support for Hyper-V Virtual Machines

Windows Server 2008 R2 supports up to 64 logical processors in the host processor pool. This is a significant upgrade from previous versions that allows not only greater VM density per host, but also gives IT administrators more flexibility in assigning CPU resources to VMs. The new Hyper-V processor compatibility mode for live migration allows live migration across different CPU versions within the same processor family (for example, from Intel Core 2 to Intel Pentium 4, or from AMD Opteron to AMD Athlon), enabling migration across a broader range of server-host hardware.
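On clustered hosts, a live migration can be initiated from the same FailoverClusters PowerShell module shown earlier; a minimal sketch with hypothetical names follows:

    # Move a running clustered VM to another node with no user-visible downtime.
    # Note: -Name is the clustered VM role (group) name, not the guest hostname.
    Import-Module FailoverClusters

    Move-ClusterVirtualMachineRole -Name "Win7-Desktop-0001" `
        -Node "HV-HOST02" -MigrationType Live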
Cluster Shared Volumes

With Windows Server 2008 R2, Hyper-V uses cluster shared volumes (CSVs) to simplify and enhance shared-storage usage. CSVs enable multiple servers running Windows Server to access SAN storage using a single consistent namespace for all volumes on all hosts, so multiple hosts can access the same logical unit number (LUN) on SAN storage. CSVs enable faster live migration and easier storage management for Hyper-V when used in a cluster configuration. CSVs are available as part of the failover clustering feature of Windows Server 2008 R2.

Performance and Power Consumption

Hyper-V in Windows Server 2008 R2 adds enhancements that reduce virtual machine power consumption. Hyper-V supports Second Level Address Translation (SLAT), which uses features of current CPUs to improve VM performance while reducing the processing load on the Windows hypervisor. Hyper-V VMs can also consume less power by virtue of the new core parking feature implemented in Windows Server 2008 R2.

Networking Support

Windows Server 2008 R2 introduces new networking features that improve the performance of virtual networks. Support for jumbo frames, previously available in non-virtual environments, has been extended to work with VMs. This feature enables VMs to use jumbo frames of up to 9,014 bytes if the underlying physical network supports it. Supporting jumbo frames reduces the network-stack overhead incurred per byte and increases throughput. In addition, CPU utilization is significantly reduced because the network stack makes fewer calls to the network driver. The Virtual Machine Queue (VMQ) feature allows physical network interface cards (NICs) to use direct memory access (DMA) to place the contents of packets directly into VM memory, increasing I/O performance.

SQL Server 2008 R2

SQL Server® 2008 R2 includes a number of new services, including PowerPivot for Microsoft Excel and Microsoft SharePoint®, Master Data Services, StreamInsight, Report Builder 3.0, the Reporting Services Add-in for SharePoint, a data-tier function in Microsoft Visual Studio® that enables packaging of tiered databases as part of an application, and Utility Control Point, part of application and multiserver management (AMSM), which is used to manage multiple SQL Server databases.

Windows 7

Windows 7 is designed to meet the evolving needs of end users and IT professionals, both in and out of the office. Windows 7 delivers an industry-leading virtualized desktop through strong integration with Hyper-V Server 2008 R2 SP1 and advances in CPU and I/O efficiency over previous Windows operating systems, enabling higher virtual-desktop densities.

FlexPod with Microsoft Private Cloud

NetApp and Cisco solutions for the Microsoft Private Cloud Fast Track program are tightly integrated with the software management stack based on System Center and related Windows components. The goal is to provide an integrated management experience that allows customers to easily implement and manage private clouds based on Hyper-V.

Cisco UCS 5108 Blade Chassis

The Cisco UCS 5100 Series Blade Server Chassis is a crucial building block of the Cisco Unified Computing System, delivering a scalable and flexible architecture for current and future data center needs, while helping reduce total cost of ownership.
The first Cisco blade-server chassis offering, the Cisco UCS 5108 Blade Server Chassis, is six rack units (6RU) high, can mount in an industry-standard 19-inch rack, and uses standard front-to-back cooling. A chassis can accommodate up to eight half-width or four full-width Cisco UCS B-Series Blade Server form factors. The Cisco UCS 5108 Blade Server Chassis revolutionizes the use and deployment of blade-based systems. By incorporating unified fabric and fabric-extender technology, the Cisco Unified Computing System enables the chassis to:

- Have fewer physical components.
- Require no independent management.
- Be more energy-efficient than traditional blade-server chassis.

This simplicity eliminates the need for dedicated chassis management and blade switches, reduces cabling, and allows scalability to 40 chassis without adding complexity. The Cisco UCS 5108 Blade Server Chassis is a critical component in delivering the simplicity and IT responsiveness of the data center as part of the Cisco Unified Computing System.

Cisco UCS B200 M2 Blade Servers

The Cisco UCS B200 M2 Blade Server is a half-width, two-socket blade server. The system uses two Intel Xeon 5600 series processors, up to 96 GB of DDR3 memory, two optional hot-swappable small form factor (SFF) serial attached SCSI (SAS) disk drives, and a single mezzanine connector for up to 20 Gbps of I/O throughput. The server balances simplicity, performance, and density for production-level virtualization and other mainstream data center workloads.

Cisco UCS B230 M2 Blade Servers

The Cisco UCS B230 M2 Blade Server extends the capabilities of the Cisco Unified Computing System and delivers new levels of performance, efficiency, and reliability. With the Intel Xeon processor E7-2800 family, the two-socket Cisco UCS B230 M2 Blade Server platform delivers high performance and density in a compact, half-width form factor.

Cisco Nexus 5010 Switch

Your next-generation data center has specific server networking needs. The Cisco Nexus 5010, a one-rack-unit (1RU) switch, provides an Ethernet-based unified fabric designed to meet those needs. The Cisco Nexus 5010 Switch foundation is built upon:

- High-performance 10 Gigabit Ethernet
- IEEE Data Center Bridging (DCB) for lossless Ethernet
- Fibre Channel over Ethernet (FCoE)
- Virtual machine-optimized networking

The switch delivers more than 500 Gbps of switching capacity with 20 fixed wire-speed 10 Gigabit Ethernet ports that support Data Center Bridging and FCoE. In addition, one expansion port supports any of the following modules:

- Eight-port 1/2/4 Gigabit Fibre Channel
- Six-port 1/2/4/8 Gigabit Fibre Channel
- Four-port 10 Gigabit Ethernet (DCB and FCoE) plus four-port 1/2/4 Gigabit Fibre Channel
- Six-port 10 Gigabit Ethernet (DCB and FCoE)

Cisco UCS 6248 Fabric Interconnect

The Cisco UCS 6248UP 48-Port Fabric Interconnect is a core part of the Cisco Unified Computing System. Typically deployed in redundant pairs, the Cisco UCS 6248UP Fabric Interconnect provides uniform access to both networks and storage.
Benefit from a low total cost of ownership (TCO) with enhanced features and capabilities, including the following:

- Increased bandwidth up to 960 Gbps
- Higher port density: up to 48 ports in one rack unit (1RU), including one expansion module with 16 unified ports
- High-performance, flexible, unified ports capable of line-rate, low-latency, lossless 1/10 Gigabit Ethernet, FCoE, and 4/2/1 and 8/4/2 Gigabit Fibre Channel
- Reduced port-to-port latency from 3.2 microseconds to 2 microseconds
- Centralized unified management with Cisco UCS Manager
- Efficient cooling and serviceability: front-to-back cooling, redundant front-plug fans and power supplies, and rear cabling

NetApp FAS3240 Controller

With storage capacity of up to 1,200 TB, the FAS3240 can handle data consolidation in heterogeneous IT environments at all application levels. The NetApp FAS3240 is configured with Data ONTAP software as standard. Data ONTAP is a highly advanced and scalable operating system delivering unmatched value through greater flexibility, availability, and automation. The FAS3240 includes the following features:

- 1,200 TB maximum raw capacity
- 600 maximum total disk drives
- Single-enclosure high availability (HA): two controllers in a single 3U chassis
- Up to 1 TB of Flash Cache memory
- Expandable with PCIe expansion slots
- Up to 500 FlexVol volumes per controller
- FCP, IP SAN (iSCSI), NFS, CIFS, FCoE, HTTP, and FTP storage protocols
- 4,096 maximum LUNs

NetApp DS2246 Disk Shelves and Storage Media

NetApp leads enterprise network-storage vendors in the adoption of SAS and SFF hard-disk drives. The NetApp DS2246 disk shelf holds 24 SAS 2.5-inch drives in only 2U of rack space, achieving both storage density and performance density concurrently. Compared to a 4U-high DS4243 disk shelf with SAS drives of the same capacity, the DS2246 doubles the storage density, can increase performance density by 60 percent, and can reduce power consumption by 30 to 50 percent.

Citrix

Citrix XenDesktop is a desktop virtualization solution that transforms Windows desktops and applications into an on-demand service available to any user, any device, anywhere. XenDesktop quickly and securely delivers any type of virtual desktop, or Windows, web, and SaaS application, to PCs, Macs, tablets, smartphones, laptops, and thin clients, all with a high-definition user experience. Citrix FlexCast delivery technology enables IT to go beyond VDI and deliver virtual desktops to any type of user, including task workers, mobile workers, power users, and contractors. XenDesktop also helps IT rapidly adapt to business initiatives such as offshoring, M&A, and branch expansion by simplifying desktop delivery and enabling user self-service. The open, scalable, and proven architecture dramatically simplifies virtual desktop management, support, and systems integration, optimizing performance, improving security, and lowering costs.

Citrix FlexCast Technology

Different types of workers across the enterprise have varying performance and personalization requirements. Some require offline mobility of laptops, others need simplicity and standardization, while still others need high performance and fully personalized desktops. XenDesktop can meet all these requirements in a single solution with the unique Citrix FlexCast delivery technology. With FlexCast, IT can deliver every type of virtual desktop and application, hosted or local, optimized to meet the performance, security, and mobility requirements of each individual user.

The FlexCast delivery technologies can be broken down into the following categories:

- Hosted shared desktops provide a locked-down, streamlined, and standardized environment with a core set of applications, ideally suited for task workers where personalization is not required or appropriate.
- Hosted VDI desktops offer a personalized Windows desktop experience for office workers, which can be securely delivered over any network to any device.
The FlexCast delivery technologies can be broken down into the following categories: 10 Hosted shared desktops provide a locked-down, streamlined, and standardized environment with a core set of applications that are ideally suited for task workers where personalization is not required or appropriate. Hosted VDI desktops offer a personalized Windows desktop experience for office workers, which can be securely delivered over any network to any device. Microsoft-Citrix VDI Technical Whitepaper Streamed VHD desktops take advantage of the local processing power of rich clients, while providing centralized single-image management of the desktop. These types of desktops are often used in computer labs and training facilities, and when users require local processing for certain applications or peripherals. Local VM desktops extend the benefits of centralized, single-instance management to mobile workers who need to use their laptops offline. When they are able to connect to a suitable network, changes to the operating system, applications, and user data are automatically synchronized with the data center. On-demand applications allow any Windows application to be centralized and managed in the data center, hosted either on multi-user terminal servers or on VMs, and instantly delivered as a service to physical and virtual desktops. Optimized for each user device, network, and location, applications are delivered through a highspeed protocol for use while connected or streamed through Citrix application virtualization or Microsoft App-V directly to the endpoint for use when offline. A complete overview of FlexCast technology can be found at Citrix.com. For the VDI testing and validation represented in this whitepaper, Windows 7 was streamed to shared desktops hosted on the Cisco UCS hardware and NetApp storage solutions. User connections to the virtual desktops were made using the Citrix High-Definition User Experience (Citrix HDX) technology. Citrix XenDesktop 5.6 Citrix XenDesktop transforms Windows desktops to an on-demand service for any user, any device, anywhere. XenDesktop quickly and securely delivers any type of virtual desktop or Windows, web, and SaaS application to all the latest PCs, Macs, tablets, smartphones, laptops, and thin clients—all with a high-definition user experience. The following describes the strategic features of Citrix XenDesktop 5.6: Any device, anywhere with Citrix Receiver. XenDesktop 5.6 includes new highperformance Citrix Receiver clients, which can increase interactive performance by as much as 300 percent. Also new are Receiver clients to cover the Google Chrome operating system, HP WebOS, and Blackberry PlayBook, in addition to Windows, Mac, Linux, Google Android, and Apple iPhone and iPad. All combined, over a billion consumer devices across the globe can access secure desktops and Windows, web, or SaaS applications through Citrix Receiver. HDX. The industry-leading user experience of XenDesktop gets a turbo-boost while reducing deployment costs and improving quality of service at the same time. New breakthrough WAN performance, tight integration with Citrix Branch Repeater 6, and protocol updates that support five different priority streams on the network allow administrators to as much as double the number of users on a given network and extend an HDX experience to users on the most difficult networks while average bandwidth on any network is reduced by a third. 
HDX real-time audio quality is also improved, with a dedicated real-time voice stream that can be prioritized over all other traffic. Graphics and video also get a huge boost with intelligent multimedia and graphics-command redirection technologies that take advantage of client-side GPUs to increase server density ten-fold while providing a local-like experience for users halfway around the world.

Beyond VDI with FlexCast. Different types of workers across the enterprise have varying performance and personalization requirements. Some require offline mobility of laptops, others need simplicity and standardization, while still others need a high-performance, fully personalized desktop. XenDesktop meets all these requirements in a single solution with its unique FlexCast delivery technology. With FlexCast, IT can deliver every type of virtual desktop, hosted or local, optimized to meet the performance, security, and mobility requirements of each individual user.

Any Windows, web, or SaaS application. XenDesktop includes all the on-demand application delivery capabilities of Citrix XenApp, which is used every day by 100 million users in over 230,000 organizations worldwide. New features include Instant App Access for both hosted and streamed virtual applications delivered into a virtual desktop, providing a quick and seamless user experience as virtual applications are accessed within a virtual desktop environment. Citrix Receiver makes it easier than ever for end users to access all their corporate, web-based, and SaaS apps from a single interface. New features also include integration with the upcoming NetScaler Cloud Gateway, a combined remote-access and cloud-application provisioning solution that includes automated application-provisioning workflows, identity federation, single sign-on for Windows, web, and SaaS applications, monitoring, and metering functionality, and will be supported on over a billion devices.

Open, scalable, proven. With numerous awards, industry-validated scalability, and over 10,000 Citrix Ready products from third-party vendors that enhance Citrix environments and solve business needs, XenDesktop 5.6 provides a powerful desktop computing infrastructure that is easier than ever to manage. XenDesktop 5.6 builds on the success of the Desktop Studio admin console with integrated support for advanced provisioning services and Hyper-V Server 2008 R2 SP1 Dynamic Memory for up to 40 percent greater VM density. Desktop Director, a tool created exclusively for the helpdesk, has been expanded to include advanced troubleshooting for both desktops and applications, and now includes HDX Monitor for real-time views of virtual desktop performance.

Single-instance management. XenDesktop enables IT to separate the device, operating system, applications, and user personalization, and maintain single master images of each. Instead of juggling thousands of static desktop images, IT can manage and update the operating system and applications once, from one location, and can centrally upgrade the entire enterprise to Windows 7 in a weekend instead of months. Single-instance management can dramatically reduce ongoing patch and upgrade maintenance efforts, and cut data center storage costs by up to 90 percent by eliminating redundant copies.

Data security and access control. With XenDesktop, users can access desktops and applications from any location or device, while IT uses policies that control where data is kept.
XenDesktop can prevent data from residing on endpoints, centrally controlling information in the data center. In addition, XenDesktop can ensure that any application data that must reside on the endpoint is protected with Citrix XenVault technology. Extensive access control and security policies ensure that intellectual property is protected and regulatory compliance requirements are met.

Enterprise-class infrastructure. XenDesktop includes application, desktop, and server virtualization infrastructure that scales to meet the demanding requirements of global enterprises. Proactive monitoring and reporting enables rapid problem resolution, while intelligent load and capacity management helps ensure that problems rarely arise in the first place. Virtualization management features such as live migration, high availability, and bare-metal server provisioning make the XenDesktop infrastructure robust and resilient.

Microsoft-Citrix VDI Solution Description

This section provides a high-level overview of the entire solution, leaving more in-depth information regarding hardware, software, and other configuration details for later in this paper. Hardware and software layers are broken into their components, with the hardware layer providing an overview of the compute, storage, and networking infrastructure design based on the Cisco-NetApp Fast Track Bill of Materials (BoM). The software layer touches upon the hypervisor, management, and brokering and streaming layers.

Conceptual Software Architecture

Figure 1 provides a high-level representation of the Microsoft-Citrix VDI:

Figure 1. Conceptual Microsoft-Citrix Virtual Desktop Infrastructure

Callout 1: Citrix layer
- Access: The layer consisting of the Citrix Receiver client software and web-based portal; it is the user interface for accessing the virtual desktops.
- VDI: The layer that brokers connections between users and desktops, and streams desktop master images to VMs running on the Microsoft Hyper-V servers.

Callout 2: Microsoft layer
- Management: VDI with the Virtual Machine Manager component of Microsoft System Center 2012.
- Virtual Desktops: Windows 7 Enterprise.
- OS and Additional Components: Windows Server and Microsoft Hyper-V Server.

Callout 3: Hardware layer
- Network: The layer where thin clients establish their connections to the virtual desktops.
- Compute: The layer where the virtual desktops run.
- Storage: The layer where virtual hard disks are stored.

Conceptual Hardware Architecture

One of the key drivers of the layered approach to infrastructure architecture presented here is to enable complex workflow and automation to be developed over time: first by creating a collection of simple automation tasks, then by assembling them into procedures managed by the management layer, and finally by creating workflows and process automation controlled by the orchestration layer.

Scale Units

In a modular architecture, the concept of a scale unit refers to the point to which a module in the architecture can scale before another module is required. For example, an individual server is a scale unit: it can be expanded to a certain point in terms of CPU and RAM, but beyond its maximums, an additional server is required to continue scaling. Each scale unit also has an associated amount of physical installation labor, configuration labor, and so on.
With large scale units, such as a preconfigured full rack of servers, the labor overhead can be minimized. It is critical to know the scale limits of all components, both hardware and software, to determine the optimum scale units as input to the overall architecture. Scale units enable the documentation of all the requirements (such as space, power, HVAC, and connectivity) needed for implementation.

Compute

The compute layer of the VDI architecture represents the physical servers that host the infrastructure VMs and desktop VMs. All of the software-based VDI components run in virtual machines. The hardware-architecture choices available to data center architects are constantly evolving, ranging from rack-mounted servers, to tightly integrated, highly redundant blade systems, to container models. The same spectrum exists for storage and networking equipment. Server scale limits are well published and include the number and speed of CPU cores, the maximum amount and speed of RAM, and the number and type of expansion slots. Particularly important are the number and type of onboard I/O ports and of supported I/O cards. Both Ethernet and Fibre Channel expansion cards often provide multiport options, where a single card can have four ports. Additionally, blade-server architectures often limit the number of I/O cards or the supported combinations. It is important to be aware of these limitations, as well as of the oversubscription ratio between blade I/O ports and any blade-chassis switch modules.

Storage

Storage is one of the most important components of any private cloud solution, and VDI is no exception. The number of I/O operations per second (IOPS) has a direct impact on VDI performance, because the storage layer of the VDI architecture handles the virtual disk for each VM. Storage architecture is a critical design consideration for private cloud solutions. The topic is challenging because it is rapidly evolving in terms of new standards, protocols, and implementations. Storage and the supporting storage network are critical to the overall performance of the environment, but storage also tends to be one of the more costly items.

Storage architectures today have several layers, including the storage arrays, the storage network, the storage protocol, and, for virtualization, the file system utilizing the physical storage. One of the primary objectives of the private cloud solution is to enable rapid provisioning and deprovisioning of virtual machines. Doing so at large scale requires tight integration with the storage architecture and robust automation. Provisioning a new virtual machine on an existing LUN is a simple operation; however, provisioning a new LUN and adding it to a host cluster are relatively complicated tasks that can greatly benefit from automation.

Networking

Many network architectures include a tiered design with three or more tiers, such as core, distribution, and access. Designs are driven by the port bandwidth and quantity required at the edge, in addition to the ability of the distribution and core tiers to provide higher-speed uplinks to aggregate traffic. Additional considerations include Ethernet broadcast boundaries and limitations, spanning tree, and other loop-avoidance technologies. A dedicated management network is a frequent feature of advanced data center virtualization solutions.
Most virtualization vendors recommend that hosts be managed via a dedicated network so that management traffic does not compete with guest traffic, and to provide a degree of separation for security and ease of management. This typically implies dedicating one NIC per host and one port per network device to the management network. With advanced data center virtualization, a frequent use case is to provide isolated networks in which different owners, such as particular departments or applications, are given their own dedicated networks. Multi-tenant networking refers to using technologies such as VLANs or IP security (IPsec) isolation techniques to provide dedicated networks that utilize a single network infrastructure or wire.

Managing the network environment in an advanced data center virtualization solution can present challenges that must be addressed. Ideally, network settings and policies are defined centrally and applied universally by the management solution. In the case of IPsec-based isolation, this can be accomplished using Active Directory Domain Services and Group Policy to control firewall settings across the hosts and guests, in addition to the IPsec policies controlling network communication. For VLAN-based network segmentation, several components, including the host servers, host clusters, VMM, and the network switches, must be configured correctly to enable both rapid provisioning and network segmentation. With Hyper-V and host clusters, identical virtual networks must be defined on all nodes in order for a virtual machine to be able to fail over to any node and maintain its connection to the network. At large scale, this can be accomplished with PowerShell scripting.

Microsoft-Citrix 2,000 VDI Architecture

The purpose of this chapter is to describe the software and hardware layers of the Microsoft-Citrix VDI as deployed and tested at the Microsoft Enterprise Engineering Center (EEC) in Redmond, Washington, from November 2011 through March 2012. In addition to describing the VDI's software and hardware layers and how they interact, this chapter also provides general guidance for capacity planning, notes throughout the chapter to assist you when considering a Microsoft-Citrix VDI for your organization, and performance graphs from data collected during testing at the EEC.

Figure 2 shows the relationship between each hardware component deployed with the Microsoft-Citrix VDI:

Figure 2. Conceptual hardware architecture

Capacity Planning

In a VDI environment, capacity planning is driven by one fundamental question: what is the maximum number of virtual desktops required at peak utilization? The answer to this question provides the information required to determine compute, storage, and networking requirements. The smallest unit of capacity for any solution is Single Server Scalability (SSS): determining how many virtual machines of a given workload can be hosted on a single, representative server without impacting the users' experience. There is no simple formula for determining SSS in an environment because each hardware and software environment is unique. However, there are tools available, including System Monitor, the Microsoft Assessment and Planning Toolkit, and Login VSI.
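Once an SSS figure is established, the host count falls out of simple arithmetic. The following is a minimal PowerShell sketch using the figures from this reference architecture; the density value is an example, and you should substitute your own SSS results:

```powershell
# Hypothetical capacity estimate based on single-server scalability (SSS) results
$peakDesktops    = 2000  # maximum virtual desktops required at peak utilization
$desktopsPerHost = 150   # measured SSS density for the chosen host hardware

# Hosts needed to carry the peak load, rounded up
$hostsNeeded = [math]::Ceiling($peakDesktops / $desktopsPerHost)

# Running 2,000 desktops on 14 hosts (~143 each) leaves roughly one host's
# worth of spare capacity for host failure, as described later in this paper
"Hosts required: $hostsNeeded"                                         # 14
"Desktops per host at peak: {0:N0}" -f ($peakDesktops / $hostsNeeded)  # ~143
```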
After deploying an initial hardware layer, you can dramatically increase the maximum number of virtual desktops by adding server, storage, and network hardware as needed and configuring the Microsoft-Citrix software layer to support it. For capacity planning details and analysis, see Performance Characteristics on page 33.

Server Reference Architecture

The server architecture includes two Cisco UCS 5108 blade chassis, two Cisco UCS B200 M2 blade servers for infrastructure VMs, and 14 Cisco UCS B230 M2 blade servers for desktop VMs. The two Cisco B200 servers were configured as a Windows failover cluster and hosted the VDI servers. The 14 Cisco B230 servers were not clustered, and hosted the VDI workloads. The implication of non-clustered hosts is that the user desktops cannot be live migrated between hosts. This imposes an additional administrative burden when patching each Hyper-V host, because user desktop sessions must be manually drained before the server can be taken down for maintenance. However, this burden is counterbalanced by reducing the complexity of the initial installation and reducing the overhead that failover clustering places on the hosts. In the pooled desktop scenario, user desktops are still highly available on non-clustered hosts because each pooled desktop is identical; in the event of a host failure, affected users are simply redirected to another desktop on a different host.

Networking Reference Architecture

The networking architecture includes two Cisco 6248 Fabric Interconnect switches and two Cisco Nexus 5010 switches with an eight-port Fibre Channel module. For this deployment, we used a layered architecture that includes server-side networking and fabric/switch-side networking.

Server-Side Networking

The Cisco UCS blades each have a dual-mezzanine 10 Gb Converged Network Adapter (CNA). To optimize the management, operations, and user traffic in the VDI, the network was segmented at the hardware layer into six logical adapters (virtual interface cards) that allowed the solution to isolate traffic and drive performance. Those network adapters map to the following networks:

- Live-migration VLAN is used for server patching and maintenance. In order to use this network efficiently, we recommend that you use a dedicated network.
- CSV VLAN is used for CSV traffic and for providing high availability in the event that a server loses Fibre Channel connectivity to the CSV.
- Client-access (public) VLAN is used for the bulk of communication for the entire VDI solution. This VLAN is used by infrastructure servers and cluster servers to communicate with each other and with the VDI hosts (Hyper-V servers), virtual desktop VMs, and incoming thin-client traffic.
- Provisioning services (PVS) VLAN is used by the Citrix Provisioning Server to stream the operating system to the virtual desktops.
- iSCSI network is used to facilitate the failover clustering feature within VMs, and to expose the shared storage to the clustered VMs.
- Management VLAN is the recommended network for connecting to any of the clustered physical hosts for management purposes.

We sliced the mezzanine card on the B230s into three network adapters, which map to the following networks:

- Public VLAN
- PVS VLAN
- Management VLAN

SAN and Fabric Reference Architecture

The NetApp unified storage architecture provides customers with an agile and scalable storage platform.
Innovative NetApp storage solutions provide customers with new alternatives and expanded possibilities over traditional storage vendors. All NetApp storage systems utilize the Data ONTAP operating system to provide SAN (FCoE, Fibre Channel, and iSCSI), NAS (CIFS, NFS), primary storage, and secondary storage within a single unified platform, so that all virtual desktop data components can be hosted on the same storage array. A single process for activities such as installation, provisioning, mirroring, backup, and upgrading is used throughout the entire product line, from entry-level to enterprise-class controllers. Having a single set of software and processes brings great simplicity to even the most complex enterprise data-management challenges. Unifying storage and data management software and processes reduces the complexity of data ownership, enables companies to adapt to their changing business needs without interruption, and results in a dramatic reduction in total cost of ownership.

For large, scalable VDI environments, the NetApp solution provides the following unique benefits:

- Cost and efficiency savings in storage, power, and cooling requirements
- Agile and operationally efficient storage solutions
- Best-in-class data protection and business continuance solutions to address any level of data availability demands

The storage layer of the VDI architecture, often one of the most important components because of how the number of IOPS affects VDI performance, manages the virtual disk for each VM desktop.

Figure 3 shows the storage architecture in the context of the Microsoft-Citrix 2,000 VDI:

Figure 3. Storage architecture

When planning your storage implementation, you should take into account that VDI environments are extremely I/O intensive. IOPS range from majority reads to majority writes depending on the system state. During a boot storm, the storage back end will see a steady increase in read IOPS. In production, heavy write IOPS might be observed, especially during high end-user workloads. NetApp recommends sizing the storage for a high number of IOPS with small I/O sizes.

NetApp provides a scalable, unified storage and data management solution for VDI. The unique benefits of the NetApp solution are:

Storage efficiency: Significant cost savings with multiple levels of storage efficiency for all the virtual machine data components. These storage efficiencies include:

- NetApp Thin Provisioning, a way of logically presenting more storage to hosts than is physically available.
- NetApp Deduplication, which saves space on primary storage by removing redundant copies of blocks within a volume.
- NetApp FlexClones, which provide hardware-assisted rapid creation of space-efficient, writeable, point-in-time images of individual files, LUNs, or flexible volumes.

Performance: Enhanced user experience with transparent read and write I/O optimization that strongly complements NetApp's storage efficiency capabilities. NetApp provides performance enhancements with:

- NetApp Transparent Storage Cache Sharing, which allows customers to benefit from storage efficiency and at the same time significantly increase I/O performance.
- NetApp Flash Cache, which increases the amount of available cache to help reduce virtual desktop storm activities and drastically improve read I/O.
- NetApp Write Optimization, which optimizes write operations in RAID-DP.
- NetApp Flexible Volumes and Aggregates, which allow performance and capacity to be shared by all desktops in the volume or aggregate.

Data protection: Enhanced protection of both the virtual desktop operating system data and the user data, with very low overhead in both cost and operations. Superior NetApp data protection is achieved with RAID-DP. NetApp RAID-DP is an advanced RAID technology that provides the default RAID level on all storage systems. RAID-DP protects against the simultaneous loss of two drives in a single RAID group. It is very economical to deploy; the overhead with default RAID groups is a mere 12.5 percent. This level of resiliency and storage efficiency makes data residing on RAID-DP safer than data residing on RAID 5, and more cost-effective than RAID 10.

Controllers: The FAS3240 HA-pair array has two controllers running in active-active mode, which provides redundancy and high availability. Deployed solutions in this environment run Data ONTAP 8.0.1. In our environment, the controllers are set to the active-active state; if one controller fails, the other carries the load. Each controller was fitted with a 256 GB Flash Cache module to boost the read performance of the array, and has two redundant FC connections to the FCoE switches.

Aggregates: An aggregate is the physical storage made up of one or more RAID groups of physical disks, and is a container of volumes, LUNs, and qtrees. In our environment, we created one aggregate for the controllers (WorkAggr). NetApp controllers come stock with an aggregate preinstalled (aggr0), which contains the Data ONTAP operating system on which the controller runs. In order to use the controller and disk shelves, you need to create at least one additional aggregate, because the operating system uses the default aggregate. We used two RAID-DP groups. Each RAID-DP group includes 16 physical drives: two of these drives are dedicated to parity, and the remaining 14 are dedicated to data. Each aggregate therefore has access to 32 physical drives.

Volumes: To ensure that each controller maintained peak performance throughout all of the lifecycles of a VDI environment, volumes were created symmetrically across both controllers. The volumes were also created as thin-provisioned FlexVol volumes with storage efficiencies enabled. The following is a brief list of these volumes and their purposes:

- UCS Boot Volume: Contains all boot-from-SAN operating system LUNs for each physical UCS server.
- VDI Host Write Cache Volume: Contains write-cache LUNs for each VDI Hyper-V Server host.
- Infrastructure VM Clustered Volume: Contains a CSV for clustered infrastructure VMs (PVS, VMM, XenDesktop Desktop Controller, File Services).
- Quorum Volume: Contains quorum disks for infrastructure VMs.

LUNs: Within the UCS boot volume, each controller contains eight LUNs. These LUNs were set up as SAN-bootable LUNs for each of the physical UCS blades and their operating systems. Within the VDI host write-cache volume, each controller has seven LUNs, one for each VDI host. Within the infrastructure CSV volume, one LUN was created for the CSV.

Software Reference Architecture

This solution is based on an enterprise-class software stack, including Windows Server 2008 R2 Enterprise SP1, the Virtual Machine Manager component of System Center 2012, and XenDesktop 5.6 with Provisioning Services 6.1. It has been designed from the ground up to provide scalable VDI environments.
The software layer is specifically designed to take advantage of the latest improvements in VMM and the virtual desktop brokering and streaming features in XenDesktop and Provisioning Services. Microsoft and Citrix have worked closely together over the past 12 months to integrate VMM with XenDesktop 5.6 and Provisioning Services 6.1 to create a solution that offers a best-in-class VDI software architecture.

Figure 4 shows the relationship between each software component in the Microsoft-Citrix VDI solution:

Figure 4. Conceptual software architecture

Callout Description
1 Client
  Thin/Thick Client: This is the device running the Citrix Receiver client, from which users access their hosted virtual desktops.
  Web Interface SSL VPN: This is the web-based portal that authenticates users and secures the network connection to the virtual desktop.
2 Virtual Desktop Infrastructure
  Three Provisioning Services 6.1 VMs: These load-balanced servers stream the desktop master image to each virtual machine.
  Two Citrix XenDesktop 5.6 VMs: These load-balanced servers broker user connections to the virtual desktops, manage VM power state, and provision new VMs.
  One VM running the Virtual Machine Manager component of Microsoft System Center 2012: This HA server provides the Microsoft virtualization management interfaces.
  Two SQL Server VMs: These HA servers hosted the XenDesktop, Provisioning Services, and Virtual Machine Manager databases.
3 Microsoft Hyper-V and Hardware
  14 Windows Server 2008 R2 SP1 Hosts: These servers hosted the 2,000 Windows 7 virtual desktops.
  Two Windows Server 2008 R2 SP1 Hosts: These servers hosted the VDI components (AD, SQL, VMM, XD, and PVS).
  SAN: This system provided storage for the entire VDI solution, including the boot volumes for each hypervisor.

Windows Server 2008 R2 Enterprise SP1

Windows Server 2008 R2 Enterprise SP1 was used to create the infrastructure VMs. The validation was performed on the graphical user interface (GUI) version because the native Microsoft virtualization management consoles were needed. The partnership focused on the management of the solution rather than on density. Scaling out can be done through capacity planning, and from an operating system perspective, more RAM and CPUs are supported to drive density higher as needed. The configuration utilized an installation with the Hyper-V role and the failover cluster feature enabled on the two-node infrastructure servers (B200) and on all infrastructure VMs.

The 14 VDI hosts ran Microsoft Hyper-V Server 2008 R2 SP1, which offers a smaller disk footprint and lower RAM and processor consumption for higher densities. It also presents a smaller attack surface. It has no local UI, but can be managed remotely using remote management tools. Hyper-V Server is a free server offering from Microsoft.

Hyper-V Role

We installed the Hyper-V role on Windows Server 2008 R2 Enterprise SP1, which is installed on the infrastructure servers. The Hyper-V role comes preinstalled on Microsoft Hyper-V Server 2008 R2 SP1.

Failover Cluster Feature

Windows Server Failover Clustering is an important feature of the Windows Server platform that can help improve your server availability. When one server fails, another server begins to provide service in its place. This process is called failover.
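To make the clustering setup concrete, the following is a minimal sketch using the FailoverClusters PowerShell module that ships with Windows Server 2008 R2, reflecting the quorum and CSV configuration described below. The node names, IP address, and disk resource names are hypothetical placeholders, not the values used in the lab:

```powershell
# Build the two-node infrastructure cluster and set the quorum model
Import-Module FailoverClusters

# Create the cluster from the two B200 infrastructure hosts (names/IP are examples)
New-Cluster -Name "VDI-InfraCluster" -Node "INFRA-B200-1","INFRA-B200-2" `
    -StaticAddress 10.0.10.50

# Node and Disk Majority quorum, using the small quorum LUN as the witness disk
Set-ClusterQuorum -Cluster "VDI-InfraCluster" -NodeAndDiskMajority "Cluster Disk 1"

# Enable Cluster Shared Volumes (a one-time opt-in on Windows Server 2008 R2),
# then promote the two large LUNs that hold the infrastructure VHDs to CSVs
(Get-Cluster "VDI-InfraCluster").EnableSharedVolumes = "Enabled"
Add-ClusterSharedVolume -Cluster "VDI-InfraCluster" -Name "Cluster Disk 2"
Add-ClusterSharedVolume -Cluster "VDI-InfraCluster" -Name "Cluster Disk 3"
```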
Failover clusters in Windows Server 2008 R2 provide you with seamless, easy-to-deploy high availability for important databases, messaging servers, file and print services, and virtualized workloads.

Note For this deployment, only the servers that hosted the VDI infrastructure VMs were in a failover cluster. The servers hosting the user desktops were not clustered.

With failover clustering you can help build redundancy into your network and eliminate single points of failure. The improvements to failover clustering in Windows Server 2008 R2 are aimed at simplifying clusters, making them more secure, and enhancing cluster stability. For the cluster disk configuration, we used the Node and Disk Majority quorum type for this cluster. The quorum disk was a 50 GB LUN. Two additional 1.5 TB LUNs were provisioned to host the VHDs for the infrastructure VMs. Those two LUNs were set up as CSVs.

Networks:

- Live-migration VLAN is used for server patching and maintenance. In order to use this network efficiently, we recommend that you use a dedicated network.
- Client-access (public) VLAN is used for the bulk of communication for the entire VDI solution. This VLAN is used by infrastructure servers and cluster servers to communicate with each other and with the VDI hosts (Hyper-V servers), virtual desktop VMs, and incoming user thin-client traffic.
- Provisioning-services (PVS) VLAN is used by the Citrix Provisioning Server to stream the operating system to the virtual desktops.
- iSCSI network is used to facilitate the failover clustering feature within VMs and to expose the shared storage to the clustered VMs.
- Management VLAN is the recommended network for connecting to any of the clustered physical hosts for management purposes.
- CSV VLAN is used for CSV traffic, in addition to providing availability if a server loses FC connectivity to the CSV volume.

Active Directory Domain Services

We used Active Directory Domain Services (Windows Server 2008 R2 SP1 in a stand-alone forest) as our network directory system to manage computers and user accounts and to authenticate credentials when users log on. Based on experience, we estimated a maximum of 2,000 users per domain controller, so we configured two Active Directory Domain Services domain controllers in our environment to perform logon and management operations for 2,000 virtual desktops. Test results confirmed that this was more than sufficient to support 2,000 virtual desktop users.

File Server Role

The purpose of the file server role is to host roaming user profiles. XenDesktop allows users to log on to virtual desktops from different devices using roaming profiles, which are stored on a common file server. Based on our testing, we found that the performance of the server hosting user roaming profiles was critical to meeting our requirement that all users be able to log on in less than 60 seconds, even during peak load times. The following factors affected the performance of the user profile file server, thereby impacting the length of time it took for users to access their virtual desktops:

- Server memory: Sufficient memory was required to support server operations.
- Server hard drive performance: The speed of access to user profiles affected how quickly users could log on to virtual desktops. Sufficient disk performance was needed to process concurrent user read and write requests for user profile data.
- Network speed: Network bandwidth requirements for user profile file servers depend on the size of the profiles. More profiles or larger profile sizes require more bandwidth and might require additional servers. Because user profile size can vary greatly depending on user function, role, and environment settings, you should perform sizing tests based on your specific environment and goals.

During testing, we found that a user roaming profile server with 8 GB RAM, SATA local storage, and a 1 Gbps network (HP BL280C) did not perform well enough to meet the success criterion of a logon time under 60 seconds for 2,000 users. We found that the profile server required 26 GB of RAM and 8 Gbps of network bandwidth to support 2,000 user profiles.

PXE Boot

Microsoft Hyper-V has two types of network adapters. The first is referred to as the "Legacy Network Adapter" in the Hyper-V management console, and as the "Emulated Network Adapter" in the VMM Administrative console. The other adapter is referred to simply as the "Network Adapter." The legacy network adapter is tied directly to the BIOS of the virtual machine. Using the legacy adapter increases processor overhead because device access requires context switching for communication. The legacy network adapter is required to support any PXE boot, such as that used with Citrix Provisioning Services. Contrary to popular belief, the legacy network adapter is not limited to 100 Mbps, but can run at higher speeds if supported by the host's physical network interface. The synthetic network adapter is loaded by the host integration services after the operating system loads inside the virtual machine. As such, the synthetic network adapter is not available for any PXE operations. Because the synthetic network adapter is integrated directly with the virtual machine, it can take advantage of the high-speed VMBus for communication and reduce processor overhead by avoiding the context switches that the legacy network adapter requires.

Virtual Machine Manager

In this solution, XenDesktop issues create, start, stop, and delete commands to Virtual Machine Manager for virtual desktop provisioning and lifecycle management through PowerShell commandlets. When VMM receives a PowerShell commandlet call from XenDesktop, the VMM task engine translates the requested task and sends it to the server with Hyper-V that hosts the virtual desktops. VMM then reports the state of the virtual desktop after the command execution is complete. For VMM and XenDesktop to communicate using PowerShell, the VMM console must be installed on the same server as the XenDesktop console. When using Provisioning Services to create VMs, the VMM console must be installed on the same server as the Provisioning Services console.

Note The create, start, stop, and delete commands can also be initiated directly in VMM.

In a VDI environment, the following VMM features are heavily used:

- Three refreshers: VM Lite, VM Full, and Host are used to keep the status of VMs updated to reflect changes made through VMM or out of band. Each refresher runs on a preset interval. VM Lite polls the host every two minutes to check for any VMs that might have been created out of band and adds them to the VMM database. VM Full runs every 30 minutes on each VM and is used to check for changes to desktop VMs that might have been executed out of band. Host also runs every 30 minutes and retrieves the most recent information for each VMM host.
These refreshers enable VMM to accurately report the state and performance of the desktop VMs and host VMs, in addition to hardware utilization and other system information.

- PowerShell snapin: XenDesktop issues operations to VMM by using the VMM PowerShell snapin. XenDesktop invokes the commandlets, and VMM executes the requested task. During task execution, VMM keeps track of progress, orchestrates the individual actions needed depending on the task, and then reports back the result of the task. The majority of task actions performed on Hyper-V hosts and desktops are issued by means of WMI commands; the VMM task engine issues the required WMI calls during commandlet execution.

VMM also includes enhancements that are specifically targeted at VDI deployments:

- The VMM jobs engine has been improved with a parallel jobs enhancement to handle a higher number of jobs.
- VMM task status reporting has been optimized to better communicate the results of the requested operations.
- VMM also has improved job queuing and an improved interface geared toward cloud management tasks and VDI scalability.

Virtual Machine Manager can be installed as a highly available cluster service. The VMM agent is installed automatically when you add a Hyper-V host for management. For better performance, we recommend installing VMM with a remote database.

SQL Server 2008 R2

For a production environment, we recommend database mirroring or clustering to make the XenDesktop database highly available. At least two SQL Server instances should be deployed in a production environment to provide database services to the XenDesktop site. In this architecture, one SQL Server instance was installed as a highly available VM on a Windows cluster. Citrix recommends configuring two SQL Server instances for high availability.

Windows 7

In this architecture, we used Windows 7 Enterprise SP1 (32-bit edition) as the operating system for the virtual desktops. With lower I/O and RAM requirements, Windows 7 has been optimized for streaming in a VDI solution. In this architecture, we chose to disable certain features to make the image lighter on I/O and memory consumption. The image also had additional applications installed: Microsoft Office 2010 and Adobe Reader 10. These applications were used to run the knowledge worker workload using the Login VSI tool.

Citrix XenDesktop 5.6

XenDesktop consists of several services: the controller or broker, Web Interface, Provisioning Services, and license enforcement. The following sections describe how each of these XenDesktop services was configured to host 2,000 Windows 7 desktops.

XenDesktop Delivery Controller

The XenDesktop Delivery Controller (DDC) provides services that manage users' access to resources, controls the assembly of users' virtual desktop environments, and brokers connections between users and their virtual desktops. It controls the state of the desktops, starting and stopping them based on demand and administrative configuration. During the validation of this reference architecture at the Microsoft EEC, all testing was conducted on a single virtualized DDC with 4 vCPUs and 8 GB of RAM. This VM successfully serviced 2,000 user desktops with capacity to spare. For a highly available XenDesktop site, we recommend two DDCs.

Web Interface

Web Interface is the HTML portal that authenticates users and connects them to the virtual desktops they are authorized to use.
Citrix Web Interface is installed by default on all servers on which you install the DDC. The default configuration of Web Interface was used for the reference architecture, so the Web Interface service was hosted on the same VM as the DDC. For secure remote access through Access Gateway, you need to create a new Web Interface site. For information about creating sites, and details of how to modify the site's user interface to refer to desktops rather than applications, see the Web Interface documentation.

Licensing

Before configuring a XenDesktop site, you will need a Citrix License Server and a SQL Server database to host the XenDesktop site's database. If a license server does not exist, it can be installed along with the DDC. For the purposes of this reference architecture, the Citrix License service was hosted on the same VM as the DDC. See Licensing your product in Citrix eDocs for more information about Citrix License Server and configuring license servers.

Virtual Desktop Provisioning Model

XenDesktop 5.6 Platinum has two primary provisioning models: Machine Creation Services (MCS) and Provisioning Services (PVS). Machine Creation Services is part of the XenDesktop Studio management console, but is limited in that it is meant for hosted VDI desktops only, pooled or dedicated. Organizations looking to utilize a streamed VHD model or a hosted VDI model with blade PCs require the use of PVS. However, PVS requires a separate server, and potentially multiple servers, within the infrastructure.

An additional aspect to consider is the requirement for dedicated private desktops. Private desktops allow users to have complete control of their virtual desktop. With private desktops, the initial delivery of the desktop is identical; once deployed, each desktop becomes unique as changes persist across reboots. Within the hosted VDI desktop FlexCast model, this level of personalization can be achieved with installed images, MCS images, and PVS images.

Machine Creation Services uses built-in technology to provide each desktop with a unique identity, and also thin provisions each desktop from a master image. Only changes made to the desktop consume additional disk space. Provisioning Services uses built-in technology to provide each desktop with a unique identity, and it utilizes a complete copy of the base desktop image in read-write mode. Each copy consumes disk space, which also increases as additional items are added to the desktop image by the user. Based on these characteristics, most organizations utilize MCS images when required to deliver a dedicated desktop, due to the additional requirements of PVS.

If MCS and PVS use dynamic disks for the differencing/write-cache disks, the first boot of a virtual desktop requires expansion of the disk to accommodate the changes required during startup. During a boot storm, this expansion on the SAN requires significant resources. However, subsequent reboots do not impact PVS, because the same disk, which was already expanded, is still used. MCS, however, disconnects the old disk, creates a new disk, expands the disk, boots the virtual desktop, and eventually deletes the old disk. This impacts the storage every time a desktop is rebooted. Due to the impact of disk expansion on a SAN infrastructure, most recommendations are to utilize fixed disks instead of dynamic disks. Although a fixed disk allocates all defined space, which can be wasteful, it does significantly reduce the impact of a boot storm.
MCS can only use dynamic disks, which incurs the expansion penalty. PVS is flexible, and allows the use of dynamic or fixed disks. It is a NetApp best practice to use thin-provisioned, fixed-size VHDs throughout an enterprise virtual environment. Engineers have discovered that there are substantial performance hits when using dynamic VHDs in a SAN environment, due to block-level misalignment. In light of this, it is a NetApp preference that PVS be used when deploying Citrix XenDesktop and provisioning virtual desktops. Due to its use of dynamic VHDs, MCS is not recommended as a means of provisioning virtual desktops at this time. For the purposes of this reference architecture at the EEC, we used PVS to stream 2,000 pooled desktops to hosted virtual machines with 3 GB fixed disks for the write cache. For more information about choosing the appropriate image delivery option, please see "XenDesktop Planning Guide: Image Delivery."

Provisioning Services 6.1

Provisioning Services streaming technology allows computers and virtual machines to be provisioned and reprovisioned in real time from a single shared-disk image. In doing so, administrators can completely eliminate the need to manage and patch individual systems. Instead, all image management is done on the master image. After installing and configuring PVS components, a vDisk is created from a device's hard drive by taking a snapshot of the operating system and application image, and then storing that image as a vDisk file on the network. A device that is used during this process is referred to as a "master target device." The devices that use those vDisks are called "target devices." When a target device is turned on, it is set to boot from the network and to communicate with a provisioning server. The target device downloads the boot file from a provisioning server, and then boots. Based on the device boot-configuration settings, the appropriate vDisk is located, then mounted on the provisioning server. The software on that vDisk is streamed to the target device as needed. To the target device, it appears like a regular hard drive on the system.

PVS Pooled Desktops

Provisioning Services desktops can be deployed as pooled or private:

- Private Desktop: A private desktop is assigned to one distinct user.
- Pooled Desktop: A pooled virtual desktop uses PVS to stream a standard desktop image to multiple desktop instances upon boot-up.

When considering a PVS deployment, there are some design decisions that need to be made regarding the write cache for the virtual desktop device. The write cache is a cache of all data that the target device has written. If data is written to the provisioning server vDisk in a caching mode, the data is not written back to the base vDisk. Instead, it is written to a write-cache file in one of the following locations:

- Cache on local HD: Cache on local HD is stored in a file on a secondary local hard drive of the device. It gets created as an invisible file in the root folder of the local HD. The cache file size grows as needed, but never gets larger than the original vDisk, and rarely larger than the free space on the original vDisk.
- RAM cache: Cache is stored in client RAM. The maximum cache size is fixed by a setting in the vDisk properties. All written data can be read from local RAM instead of going back to the server. RAM cache is faster than server cache and works in a high-availability environment.
- Server cache: Server cache is stored in a file on the server, a share, SAN, or other network storage device. The file size grows as needed, but never gets larger than the original vDisk, and rarely larger than the free space on the original vDisk. It is slower than RAM cache because all reads and writes have to go to the server and be read from a file. The cache gets deleted when the device reboots; in other words, on every boot the device reverts to the base image. Changes remain only during a single boot session.
- Difference disk mode: Difference cache is stored in a file on the server, a share, SAN, or other network storage device. The cache file size grows as needed, but never gets larger than the original vDisk, and rarely larger than the free space on the original vDisk. It is slower than both RAM cache and server cache.

For optimal performance and scalability, the cache on local HD option was used. A 3 GB fixed virtual disk was assigned to the virtual machine template used in the virtual desktop creation process. The PVS target-device agent was installed in the Windows 7 image and, with this mode enabled, automatically placed the Windows swap file on the write-cache drive.

Network Booting a PVS Target Device

Provisioning Services uses a special bootstrap program that initializes the streaming session between the target device and the provisioning server. After this session starts, the operating system is streamed and loaded from the assigned vDisk. There are three ways that a target device can load the bootstrap program:

- Over the network, via PXE
- From a boot device stored on attached media
- From a BIOS-embedded bootstrap (OEM versions only)

The most common method of delivering the bootstrap program is over the network. When network booting a target device, there are three methods of delivering the bootstrap program to consider: DHCP, the PXE service, or the bootstrap protocol (BOOTP). The DHCP service delivers IP configurations to a target device. It can also deliver the bootstrap file location using options 67, and 60 or 66. Consider delivering the bootstrap file location with a DHCP service to reduce the number of services and increase reliability. The PXE service can deliver the bootstrap file location to a target device according to the PXE specification version 2.1. Use this service if a DHCP service exists and cannot be changed, and another PXE service is not used.

Note The BOOTP service can deliver IP configuration to a target device according to the BOOTP tab. It can also deliver the boot program location using optional fields. Use of this service is no longer typical; use it only if DHCP does not meet your requirements.

Because there was a subnet dedicated to PVS streaming traffic, DHCP services were installed on one of the PVS servers. DHCP was then configured to deliver IP addresses to the target devices, and DHCP options 66 and 67 were used to specify the location and name of the bootstrap program.

PVS Master Image Creation

When creating a PVS master image, the Windows desktop operating system is converted into a vDisk (.vhd) image. This is then locked in a shared (read-only) mode and hosted on the PVS server's local disk or on a shared file location. Virtual desktops are then configured to PXE boot on the hypervisor server. PVS streams the vDisk image to the hypervisor at startup, and the image is loaded into RAM.
PVS injects a security identifier (SID) and host name as each desktop boots to make them unique in Active Directory Domain Services. These object mappings are maintained and managed within the PVS server and are visible in the PVS console under the "collections" view. They are initially created and mapped by the XenDesktop Setup tool.

Note Provisioning Services vDisks can be stored on SAN, SMB 2.1 NAS/Windows file shares, or local storage on the PVS servers themselves. Because the PVS servers are assigned 8 GB of RAM, the image remains cached in RAM after each server has served it for the first time.

In the EEC virtual desktop environment, we used the Login VSI 3.0 benchmarking and performance testing tool from Login Consultants to automate and simulate user activity and interaction with virtual desktops. We created a master VM with the following features:

- One virtual CPU
- Dynamic RAM—startup 512 MB, expandable to 2,048 MB
- 40 GB virtual hard drive
- Two virtual NICs—one "legacy" NIC for PXE boot and one "synthetic" NIC for user data once the VM has booted
- Windows 7 Enterprise (x86)
- Office 2010
- Login VSI "medium" workload (equivalent to a knowledge worker)

We optimized the virtual desktops' performance according to best practices. The following list outlines some of the Windows settings that were used:

- Uninstall Windows Search
- Configure the virtual memory paging file to a static size of twice the system RAM, in this case 3 GB
- Disable System Restore
- Disable some Windows scheduled tasks (such as backup)

PVS Load Balancing

PVS servers can be configured in a farm to provide high availability and resilience; connections automatically fail over to working servers within the farm in the event of a failure, without interruption to the desktop. At the EEC, we installed three instances of PVS and configured them into a load-balanced farm. We found that two PVS servers were capable of streaming all 2,000 desktops; however, they were a bottleneck during boot storms. A third PVS server alleviated the problem. Each server was configured to use its local disk to store the vDisk rather than using shared storage.

High Availability

High availability describes a VDI that has no single point of failure. Each component of the Microsoft-Citrix 2,000 VDI architecture is highly available. This architecture includes three levels of high availability:

- The first level of high availability includes two Cisco chassis, two fabric interconnects connected to two Cisco Nexus switches, and two NetApp controllers.
- The second level of high availability includes two physical infrastructure servers (Cisco UCS B200s).
- The third level of high availability includes the infrastructure VMs:
  o All infrastructure VMs exist on a two-node failover cluster that provides high availability against a single physical server failure.
  o We deployed three PVS servers that are highly available and load-balanced, and two XenDesktop controllers that point at each other for high availability and load balancing.
  o The Virtual Machine Manager and SQL Server database VMs use failover clustering at the VM level.
  o Virtual desktops are overprovisioned by one physical server (10 percent) to accommodate failure, and are pooled, so that in the event of a host failure high availability amounts to simply logging on again to receive a new desktop.

Availability management defines the processes necessary to achieve the perception of continuous availability.
Continuity management defines how risk will be managed in a disaster scenario to make sure minimum service levels are maintained. The principles of resiliency and automation are fundamental here.

Performance Characteristics

The VDI solution in this whitepaper is enabled by Microsoft, Cisco, and NetApp products, with Citrix XenDesktop 5.6 layered as a workload on top of the infrastructure. At its core, XenDesktop is a desktop virtualization solution that transforms Windows desktops and applications into an on-demand service available to any user, anywhere, on any device. For more information on how components interact based on a scenario, workload, or task, see Appendix A on page 58.

End-User Experience

Citrix XenDesktop helps you deliver on-demand virtual desktops and applications anywhere your users work, wherever your business takes you, on any type of device, bringing unprecedented flexibility and mobility to your workforce. XenDesktop unlocks the full productivity and creativity of every worker while helping the entire organization adapt rapidly to new challenges and opportunities. To put the best talent to work for your organization, you need to be able to deliver their desktops wherever they are in the world, wherever they go, any time they need them. With XenDesktop, your users can take advantage of virtual work styles to integrate computing more seamlessly into their lives, never losing productivity just because they are away from the office.

Today's workers are more savvy than ever when it comes to the latest mobile devices. XenDesktop empowers them to use smartphones, tablets, personal laptops, and nearly any device they choose as a seamless part of their corporate desktop experience. That makes your organization the kind of place the best workers of the digital generation want to work. By providing maximum options for their work preferences, you help to drive further innovation in everything they do. Competitive advantage depends on your ability to get work done the right way, in the right place, at the right time. Fast, flexible virtual desktop delivery with XenDesktop helps you adapt quickly and cost-effectively to business changes, from mergers to growth initiatives to strategies like work shifting and offshoring. As global markets demand more fluid and responsive virtual organizations, XenDesktop helps you deploy full desktop computing resources in seconds wherever they are needed.

By transforming complex, distributed desktops into a simple, on-demand service, XenDesktop frees you from the costs and constraints of traditional computing architectures. Centralized delivery, management, and control of virtual desktops deliver new levels of efficiency to your IT organization while streamlining security and compliance. Self-service application provisioning, simplified helpdesk support, and support for mobile and virtual work styles give you a foundation to take advantage of a new generation of IT models and strategies.

Figure 5 shows the interaction between each VDI component from the end-user connectivity perspective.

Figure 5. End-user connectivity metrics

Callout Description
1 The end user launches an Internet browser to access the Web Interface.
2 The Web Interface prompts the user for Active Directory Domain Services credentials and passes the credentials to the Citrix XenDesktop Delivery Controller (DDC) XML service.
3 The DDC authenticates the user's credentials against Active Directory Domain Services, encrypts the credentials, and temporarily stores them for later use.
4 The DDC contacts the data store to determine virtual-desktop availability.
5 Virtual desktop information is sent back to the Web Interface, which presents the available desktops to the user.
6 The user clicks a desktop icon.
7 The Web Interface passes the user's request to the DDC.
8 The DDC starts the VM if it is off or suspended, and instructs the high-definition user experience protocol stack in the VM to start listening for connections.
9 The DDC passes the connection token back to the user.
10 The user connects directly to the virtual desktop using the connection token.
11 The Virtual Desktop Agent (VDA) tells the DDC that the user has connected, and a XenDesktop license is checked out. After the connection is approved, the VDA uses the stored encrypted credentials to log on to the desktop against Active Directory Domain Services and applies the profile configuration.

Capacity Planning

As discussed earlier in this paper, capacity planning in a VDI environment is driven by one fundamental question: what is the maximum number of virtual desktops required at peak utilization? The answer provides the information required to determine compute, storage, and networking requirements, and after deploying an initial hardware layer, you can dramatically increase the maximum number of virtual desktops by adding server, storage, and network hardware as needed and configuring the Microsoft-Citrix software layer to support it. The following are some specific considerations to keep in mind when capacity planning for a Microsoft-Citrix VDI:

- Metrics should be set for memory and CPU in correlation to the storage and networking infrastructure.
- Operations measurement should be considered in the context of specific workloads. Users' workloads are likely to vary from one group to another; for example, some users will consume more CPU than others.
- Basic operations to measure should include upper and lower limits for the following:
  o Pool spin-ups for the "9 a.m." scenario
  o Logon storms for the "9 a.m." scenario
  o Simulated user workload within the VMs
  o Shutdown of VMs at the end of the day
  o Image updates for "patch Tuesday"
- Dynamic memory in Windows Server 2008 R2 SP1 helps increase density on servers, moving bottlenecks from memory to storage IOPS and CPU.
- When estimating storage requirements, consider both capacity and IOPS.
- Currently, Microsoft supports 12 virtual cores per logical core. You will need to determine your own ratio for your specific workload.
- Clusters are limited to 1,000 VMs per cluster. Hyper-V is limited to 384 VMs per host.
- Suggested 2,000 VMs per VMM server, with 4 cores and 16 GB of RAM.
- With these limits in mind, we opted to use two clusters of 1,000 VMs, with each node hosting about 143 VMs. Each VM can have from 512 MB to 2,048 MB of RAM. Each cluster has two CSVs. Each CSV is attached to a single disk controller.
- Before launching a full-scale deployment, you should check single-server scalability:
  o Focus on the CPU limit relative to the physical memory on the server, and eliminate IOPS limits for this test.
  o Single-server scalability will not measure IOPS, so do not place high emphasis on results from a single server, because scaling is non-linear when you add more servers.
  o Create a theoretical capacity model for the environment to determine the server limit. Use one-node startups on the VMM and XenDesktop layers.
- Consider the number of hosts, specifically with scale-up and scale-out in mind. Scale-up uses a small number of relatively powerful servers; scale-out uses a large number of relatively less powerful servers. We recommend scale-out for this reference architecture.
- Capacity must be managed to meet existing and future peak demand while controlling under-utilization. Business relationship and demand management are key inputs into effective capacity management and require a service provider's approach. Predictability and optimization of resource usage are primary principles in achieving capacity management objectives.

Performance Data

This section focuses on analyzing the scenarios that are most commonly exercised in a VDI environment: desktop provisioning, power management ("Green IT"), and desktop runtime (Login VSI).

Power Management

Power management provides automatic shutdown and startup of virtual desktops. For the purposes of testing and data collection for this reference architecture, we assumed that all 2,000 desktops had been shut down or suspended at the close of business. Each of the following sections shows the time and resources consumed during the power-management lifecycle.

2,000 Virtual Desktop Boot Storm

Prior to the "9 a.m." logon storm, we recommend that the virtual desktops be started and ready for the logon storm. This provides users with a faster logon experience. We tested a boot storm with the assumption that a business will shut down all desktop VMs at the end of the business day and start up the desktop VMs again in the morning. This section analyzes the time required to start all the VMs and the resources consumed in the process.

Figure 6 analyzes the time required to boot 2,000 virtual desktops before the "9 a.m." logon storm at the start of business.

Figure 6. Boot storm startup with 15-75-10 throttle

This graph shows the following significant points:

- Boot storm startup took approximately 29 minutes to complete.
- The graph clearly shows the 75 virtual-desktop "step" intervals that you would expect with the 15-75-10 throttle.
- The graph is roughly linear over the total boot-up time.
- To optimize the user experience, we recommend adding a 30-minute buffer after startup of the virtual desktops.

Hyper-V Host Performance Analysis

A boot storm creates significant load for Hyper-V hosts, so you should monitor Hyper-V performance closely during a boot storm. In this analysis, we used one Hyper-V host for performance data analysis and checked the other 13 hosts to ensure that they all had the same performance profile. For Hyper-V host analysis, we focused on four metrics: CPU, disk, memory, and network bandwidth.

Note Each of the Hyper-V hosts was configured to handle 143 virtual desktops (2,000 virtual desktops divided by 14 Hyper-V hosts).

Figure 7 shows the single Hyper-V host CPU utilization during the boot storm.

Figure 7. Boot storm Microsoft Hyper-V CPU utilization

This graph shows the following significant points:

- With all VMs in the shutdown state, the Hyper-V host consumed approximately 3 percent of the CPU, which illustrates significant power savings.
- The boot-up is CPU intensive, as shown by the spike between approximately minute 4 and minute 36.
- But overall, the host was able to handle the load using less than 60 percent of the available CPU.
- After the 143 virtual machines are fully booted and idle prior to the logon storm, the CPU utilization settles at approximately 10 percent.

Figure 8 shows the queue lengths for the physical disk hosting the write-cache VHDs.

Figure 8. Boot storm hypervisor disk queue length

This graph illustrates the following significant points:

- The maximum queue length during boot-up was three disk I/Os from the host perspective.
- There is minimal load on the physical disks. This clearly demonstrates the advantage of the PVS server in lowering the disk I/O.

Figure 9 shows the memory utilization by the Hyper-V host during the boot storm.

Figure 9. Boot storm hypervisor memory available

This graph shows the following significant points:

- With all virtual desktops shut down, available memory is approximately 242 GB (14 GB consumed).
- Between minute 4 and minute 31, the virtual desktops are starting up and memory is being consumed at a constant rate.
- With all virtual desktops started, the available memory settles at approximately 167 GB, with approximately 89 GB consumed. The 143 virtual desktops consumed approximately 75 GB in total. This clearly highlights the advantage of dynamic memory, as each virtual desktop consumed approximately 537 MB at idle time.

In a non-PVS solution, network bandwidth would not be a significant metric to monitor during boot storms; however, in this PVS solution, the virtual desktop operating system is streamed over the network, so it is critical to understand network capacity on the Hyper-V host.

Figure 10 shows the network bandwidth utilization by the Hyper-V host on the PVS network.

Figure 10. Boot storm hypervisor network-bandwidth utilization

This graph illustrates the following significant points:

- Before the boot storm, network bandwidth utilization is at 0 Mbps.
- During the boot-up, network bandwidth utilization remains between approximately 100 Mbps and 350 Mbps. This range helps to define the bandwidth that needs to be allocated for the PVS network on the Hyper-V host.

PVS Server Performance Analysis

A boot storm is the heaviest task you can perform against the PVS server, so you should monitor PVS performance closely during a boot storm. In this analysis, we used one PVS server for performance data analysis and checked the other two PVS servers to ensure that they had the same performance profile. For PVS server performance analysis, we focused on two metrics: CPU and network bandwidth utilization.

Note Each PVS server handled 667 virtual desktops (2,000 virtual desktops divided by three PVS servers).

Figure 11 shows the CPU utilization by the PVS server during the boot storm.

Figure 11. Boot storm PVS CPU utilization

This graph shows the following significant points:

- Between approximately minute 4 and minute 31, the virtual desktops are starting up, and CPU is being consumed at a constant rate of about 55 percent utilization.
- The graph clearly shows that the PVS server CPU has sufficient remaining capacity to support a higher start rate, which would shorten the total boot time.

Figure 12 shows the network bandwidth utilization by the PVS server during the boot storm.

Figure 12. Boot storm PVS network-bandwidth utilization

This graph illustrates the following significant points:

- Before the boot storm, network bandwidth utilization is at 0 Mbps.
- During the boot-up, network bandwidth utilization remains between approximately 800 Mbps and 1,200 Mbps. This range helps to define the bandwidth that needs to be allocated for the PVS network on the PVS server.
- The aggregate network data rate between the PVS servers and the Hyper-V hosts should be roughly comparable for send and receive. In this case, the graphs show approximately 3,000 Mbps (1,000 Mbps x 3 PVS servers) when sending data, as shown in Figure 12, and approximately 3,150 Mbps (225 Mbps x 14 Hyper-V hosts) when receiving data, as shown in Figure 10.

VMM Server Performance Analysis

A boot storm creates significant load on VMM servers. In this environment, we had a single VMM server managing the 14 Hyper-V hosts with 2,000 virtual desktops. For VMM server performance analysis, we focused on two metrics: CPU and memory.

Note The rate of start-VM jobs hitting VMM is controlled by the XenDesktop throttles. In this case, the throttle was 75 virtual desktop starts per minute.

Figure 13 shows the VMM CPU utilization during the boot storm.

Figure 13. Boot storm Microsoft System Center 2012 Virtual Machine Manager component CPU utilization

This graph illustrates the following significant points:

- The average peak load was approximately 60 percent CPU utilization during the boot storm.
- The graph clearly shows that the VMM server CPU has sufficient capacity to handle incoming jobs at a higher rate.

Figure 14 shows the available memory on the VMM server during the boot storm.

Figure 14. Boot storm Microsoft System Center 2012 Virtual Machine Manager component memory available

This graph illustrates the following significant points:

- Throughout the boot storm, memory is not a major factor.
- At the beginning of the boot storm, the VMM server had 14 GB of available memory; by the end, it had approximately 13 GB of available memory. This range helps to define the memory needed to run VMM at peak load. We suggest using 8 GB of memory.

Boot Storm Analysis Summary

The result of the boot storm analysis shows that the memory, CPU, and network bandwidth used in this VDI have sufficient headroom to start up 2,000 virtual desktops in less than 30 minutes. Although memory and network bandwidth could easily handle a higher throttle, the Hyper-V servers, VMM server, and PVS servers would reach CPU saturation first.

2,000 Virtual Desktops Shutdown (Power Off)

The validation team opted not to include any detailed analysis in this section, because it was very clear that the overall system was able to handle shutdowns at a very fast rate. With a shutdown rate of 500 per minute, all desktops were successfully shut down within 4-5 minutes with no significant load on any of the system components.

Runtime Analysis (Login VSI)

Runtime is the core of the testing: it is what the users experience on a daily basis. The Login VSI 3.0 tool was used for this analysis. It measures response times for opening applications and response times for clicking in the UI. Applications used include Word, Excel, PowerPoint, Microsoft Internet Explorer, and Adobe Flash video. The specific Login VSI workload that we used is "medium." For a complete list of Login VSI workload settings, see http://www.loginvsi.com/en/admin-guide/workloads#h0-1-medium-and-mediumnoflashworkload.
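The boot storm and runtime analyses in this chapter are based on host-side counters sampled with Performance Monitor. For readers who want to collect similar data, the following is a rough PowerShell equivalent; the counter set and output path are illustrative, and the sample interval and duration should be adjusted to your test window:

```powershell
# Sample the host metrics analyzed in this chapter (CPU, memory, disk queue, network)
$counters = @(
    '\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time',  # host CPU
    '\Memory\Available MBytes',                                        # available memory
    '\PhysicalDisk(_Total)\Current Disk Queue Length',                 # disk queue length
    '\Network Interface(*)\Bytes Total/sec'                            # network bandwidth
)

# 720 samples at 5-second intervals covers a one-hour test window
Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 720 |
    Export-Counter -Path 'C:\PerfLogs\vdi-bootstorm.blg' -FileFormat BLG
```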
2,000 Virtual Desktops Shutdown (Power Off)

The validation team opted not to include detailed analysis in this section because it was clear that the overall system could handle shutdowns at a very fast rate. With a shutdown rate of 500 per minute, all desktops were successfully shut down within 4–5 minutes with no significant load on any of the system components.

Runtime Analysis (Login VSI)

Runtime is the core of the testing; it is what users experience on a daily basis. The Login VSI 3.0 tool was used for this analysis. It measures response times for opening applications and for clicking in the UI. The applications exercised include Microsoft Word, Excel, PowerPoint, Internet Explorer, and Adobe Flash video. The specific Login VSI workload that we used is "medium." For a complete list of Login VSI workload settings, see http://www.loginvsi.com/en/admin-guide/workloads#h0-1-medium-and-mediumnoflashworkload.

Login VSI uses two modes to analyze the runs: VSIMax Classic and VSIMax Dynamic. For VSIMax Classic analysis, anything under four seconds is considered a "pass." For VSIMax Dynamic analysis, the tool establishes a baseline and uses that for its analysis; it also accommodates some transient spikes in response time, as long as the overall test shows good performance. Full details can be found at http://www.loginvsi.com/en/admin-guide/analyzing-results

Note that we used VSIMax Dynamic analysis on the recommendation from Login VSI: http://www.loginvsi.com/en/admin-guide/analyzing-results#h0-1-1-vsimax-classic

In our analysis, we started with single-server scalability and expanded to the full environment for testing the 2,000-desktop deployment. In addition, we performed a single-server scalability analysis with 128 GB RAM to test the effectiveness of Hyper-V dynamic memory.

Single-Server Scalability Analysis

To understand the capacity of a VDI environment, it is important to understand the capacity of a single Hyper-V host server. With a goal of 2,000 desktops on 14 Hyper-V hosts, it was necessary to understand whether a single server could handle 150 virtual desktops. In the specific test below, we started 150 virtual desktops on a single Hyper-V host and initiated the Login VSI test at a logon rate of one user every 30 seconds.

In our analysis, we first ran the VSI analyzer to get user response times and pass-fail data for the test run. Next, we analyzed the following metrics on the Hyper-V host: CPU utilization, memory utilization, and a sample of virtual desktop memory demand.

Figure 15 shows the output of the Login VSI analyzer.

Figure 15. 150 desktop VMs with 256 GB RAM

This graph illustrates the following significant points:
- All response times are between 2 and 4 seconds.
- According to VSIMax Dynamic, "The VSI max is not reached." We monitored the test to validate that all sessions were logged on and logged off successfully.

The above data points clearly show that the Hyper-V host was able to handle a medium workload against 150 virtual desktops.

Figure 16 shows the Hyper-V host CPU utilization at runtime.

Figure 16. 150 desktop VM hypervisor CPU runtime utilization

This graph illustrates the following significant points:
- At a rate of one user logon every 30 seconds (for a total of 150 users), the test took approximately 75 minutes to complete.
- CPU utilization increases at a relatively constant rate during runtime between minute 0 and 1 hour 15 minutes. The drop after approximately 1 hour 15 minutes shows when users begin to log off.
- Before the test begins, utilization is at approximately 10 percent, which shows that the 150 started and idling virtual desktops consumed 10 percent. At the end of the test, utilization returns to 10 percent.
- At peak, CPU utilization stayed below 90 percent, which indicates that there is still some headroom remaining.
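Host-side counters like the ones behind these graphs can be captured with the standard Windows typeperf utility. The sketch below shows one minimal way to drive it from Python; the counter paths, interval, and file name are our illustrative choices, not the validation team's exact tooling.

    # Sample the Hyper-V host metrics analyzed in this section every 30
    # seconds and write them to a CSV for later charting. Counter paths
    # and intervals are illustrative assumptions.
    import subprocess

    counters = [
        r"\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time",
        r"\Memory\Available MBytes",
        r"\PhysicalDisk(_Total)\Current Disk Queue Length",
    ]

    subprocess.run(
        ["typeperf", *counters,
         "-si", "30",              # 30-second sample interval
         "-sc", "180",             # 180 samples, roughly 90 minutes
         "-f", "CSV", "-o", "hyperv_host_runtime.csv"],
        check=True,
    )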
Figure 17 shows the memory available on the Hyper-V host at runtime.

Figure 17. 150 desktop hypervisor runtime available memory

This graph illustrates the following significant points:
- We start with approximately 165 GB of available memory; at peak workload, available memory is 115 GB. The 150 virtual desktops consumed an additional 50 GB at runtime.
- The inflection points show dynamic memory in action: memory is consumed when a new user logs on to a desktop and initiates the workload; afterward, there are points where the virtual desktop is idle and dynamic memory reclaims some of the RAM.
- With dynamic memory, there is more than sufficient memory available to handle the 150 virtual desktops.

Figure 18 shows the physical memory allocated on a random virtual desktop on the Hyper-V host. Analysis showed that all 150 desktops had a similar profile.

Figure 18. 150 desktop VM physical memory allocated

This graph illustrates the following significant points:
- Desktops start at 512 MB RAM and end with 512 MB RAM. They peak at approximately 900 MB as applications are opened, used, and closed.
- The graph also shows that in-VM memory increases and decreases dynamically depending on the RAM required at runtime.

The analysis above clearly illustrates that Windows Server 2008 R2 running on a Cisco UCS B230 with 20 cores (40 threads) can comfortably handle 150 VMs per server. Note that this is not the high watermark for the server; we did not run further tests to find the server's breaking point.

Another major point is the advantages and efficiencies of dynamic memory. Some might argue that dynamic memory adds CPU overhead. In our internal tests, we found that the overhead is negligible and that the benefits outweigh it. Aside from the enhanced densities that dynamic memory provides in memory-constrained environments, it is still beneficial in environments with more memory. Take the 143 VMs per blade (2,000 VMs per pod) as an example: if static memory were used, each VM would be set at 1.5 GB RAM (a total of approximately 215 GB). This is a fine configuration; however, it is statistically unlikely that all VMs run at peak load at the same time. Some users will be at maximum (probably needing more than 1.5 GB RAM), some will be at minimum (idle), and some will be somewhere in the middle, at about 1 GB RAM utilization. In contrast, if dynamic memory is used with minimum memory set to 512 MB and maximum memory set to 2 GB (similar to the scenario that we tested), the system can accommodate all user types and ends up giving power users 2 GB RAM instead of limiting them to 1.5 GB, as the sketch below illustrates.
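To make that trade-off concrete, the following Python sketch compares the static 1.5 GB-per-VM budget against a dynamic-memory budget for an assumed mix of idle, typical, and power users. The mix proportions are illustrative assumptions, not measured data; the 512 MB minimum and 2 GB maximum match the tested configuration.

    # Compare static vs. dynamic memory budgets for 143 desktops per blade.
    # The user-mix proportions below are assumptions for illustration only.
    vms = 143
    static_budget_gb = vms * 1.5
    print(f"Static budget: {static_budget_gb:.0f} GB")        # ~215 GB

    # Assumed mix: 20% idle (512 MB), 60% typical (1 GB), 20% power (2 GB).
    mix = {0.5: 0.20, 1.0: 0.60, 2.0: 0.20}
    dynamic_gb = vms * sum(gb * share for gb, share in mix.items())
    print(f"Dynamic budget at this mix: {dynamic_gb:.0f} GB")  # ~157 GB

Under this assumed mix, dynamic memory serves power users a full 2 GB while still consuming roughly a quarter less host memory than the static plan.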
Full 2,000 Desktop Environment VSI Runtime

After completing the single-server scalability test successfully and reaching our target of 150 VMs, we went on to test the end-to-end environment with a target of 2,000 VMs, which comes to roughly 143 VMs on each of the 14 hosts. In the specific test below, we started the 2,000 virtual desktops across the 14 Hyper-V hosts, gave the environment a 15-minute rest after all VMs started, and initiated the Login VSI test at a login rate of one user every 1.8 seconds, which maps to roughly one login every 25 seconds (1.8 × 14) on each host.

In our analysis, we first ran the VSI analyzer to get user response times and pass-fail data for the test run. Next, we used one Hyper-V host for performance data analysis and checked the other 13 hosts to ensure that they all had the same performance profile. We analyzed the following metrics on the Hyper-V host: CPU utilization, memory utilization, and physical disk performance. We also analyzed the SAN logs to make sure the end-to-end system successfully handled the job.

Figure 19 shows the output of the Login VSI analyzer.

Figure 19. 2,000 desktop VMs—Run167

This graph illustrates the following significant points:
- All response times are between 2 and 4 seconds.
- According to VSIMax Dynamic, "The VSI max is not reached." We monitored the test to validate that all sessions logged on and logged off successfully.

The above data points clearly show that the deployed environment, across all its components, was able to handle a medium workload against 2,000 VMs.

Figure 20 shows one Hyper-V host's CPU utilization at runtime.

Figure 20. 2,000 desktop hypervisor CPU runtime utilization

This graph illustrates the following significant points:
- At a rate of one user logon every 25 seconds (for a total of 143 users), the test took approximately 60 minutes to complete.
- CPU utilization increases at a relatively constant rate during runtime between minute 0 and 1 hour. The drop after approximately 1 hour shows users beginning to log off.
- Before the test begins, utilization is at approximately 10 percent, which shows that the 143 started and idling virtual desktops consumed 10 percent. At the end of the test, utilization returns to 10 percent. The later part of the graph (after 1 hour 10 minutes) was captured while the VMs were being restarted and is not relevant to the test run.
- At peak, CPU utilization stayed below 90 percent, which indicates that there is still some headroom remaining.

Figure 21 shows the memory available on one Hyper-V host at runtime.

Figure 21. 2,000 desktop hypervisor runtime available memory

This graph illustrates the following significant points:
- We start with approximately 165 GB of available memory; at peak workload, available memory is 122 GB. The 143 virtual desktops consumed an additional 43 GB at runtime.
- The inflection points show dynamic memory in action: memory is consumed when a new user logs on to a desktop and initiates the workload; afterward, there are points where the virtual desktop is idle and dynamic memory reclaims some of the RAM.
- With dynamic memory, there is more than sufficient memory available to handle the 143 virtual desktops.

Figure 22 shows the I/O incurred by the physical disk hosting the write-cache VHDs.

Figure 22. 2,000 desktop VMs—hypervisor I/O

Before analyzing the physical disk performance, it is important to understand the load. The graph illustrates:
- IOPS increase linearly over the duration of the test. At peak, with all 143 VMs running, the host was generating around 1,400 write IOPS (and insignificant read IOPS). Given that all the servers had a similar profile, the system as a whole was pushing around 20,000 IOPS.
- Read IOPS were insignificant during the test. This is due to the Citrix PVS technology: all read operations are streamed over the network by PVS, and the remaining reads are served from the profile server.
- The spike in read IOPS at the end of the test, at approximately minute 58, is due to logoff: each VM copies the user profile from the write-cache disk back to the profile file-share server.
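The pod-wide figure follows from simple extrapolation. The Python sketch below reproduces it and derives a per-desktop write I/O figure; the per-desktop number is our derived arithmetic, not a separately measured value.

    # Extrapolate the measured per-host write IOPS to the full pod and
    # derive a per-desktop figure (values from this section).
    write_iops_per_host = 1400
    hosts = 14
    desktops_per_host = 143

    print(f"Pod-wide write IOPS: {write_iops_per_host * hosts}")  # ~19,600
    per_desktop = write_iops_per_host / desktops_per_host
    print(f"Write IOPS per desktop: {per_desktop:.1f}")           # ~10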
Figure 23 shows the queue lengths for the physical disk hosting the write-cache VHDs.

Figure 23. 2,000 desktop VMs—hypervisor disk queue length

This graph illustrates the following significant points:
- On average, the queue length during runtime was below five disk I/Os from the host perspective.
- After the test completed, there was a spike of 32 queued I/Os, most likely related to the logoff and reboot of the VMs.
- There was no significant load on the physical disks. This clearly demonstrates the advantage of the PVS server in lowering disk I/O, and of the NetApp storage in sustaining excellent performance.

Figure 24 shows the latency for the physical disk hosting the write-cache VHDs.

Figure 24. 2,000 desktop VMs—hypervisor I/O latency

Disk queue length is usually a good indicator of the performance and responsiveness of the system; however, to get the full picture, read and write latency need to be analyzed. The graph illustrates the following points:
- Write latency is approximately static throughout the test at about 3 ms. This is excellent performance, keeping in mind that the SAN was handling around 20,000 write IOPS at peak time. This is due to the storage efficiencies implemented by NetApp.
- Read latency is slightly higher, at around 5–6 ms. This is still considered very good in this environment.

Figure 25 shows read/write IOPS on the NetApp SAN.

Figure 25. 2,000 desktop VM storage I/O time

Before analyzing the SAN performance, it is important to understand the load at the time. The graph illustrates:
- Both controllers show balanced load, highlighting that the LUNs were balanced across the two controllers.
- IOPS increase linearly over the duration of the test. At peak, with all 2,000 VMs running, the SAN was incurring about 10,000 write IOPS per controller (and insignificant read IOPS). This is in line with the measurement from the host side: 14 hosts generating approximately 20,000 IOPS, and the SAN serving 20,000 IOPS.
- Read IOPS were insignificant during the test. This is due to the Citrix PVS technology: all read operations are streamed over the network by PVS, and the remaining reads are served from the profile server.
- The spike in read IOPS at the end of the test, at approximately minute 58, is due to logoff: the VMs copy the user profiles from the write-cache disk back to the profile file-share server.

Figure 26 shows the CPU utilization on the NetApp SAN.

Figure 26. 2,000 desktop VM storage CPU utilization

This graph illustrates the following significant points:
- At peak time, the controllers were at approximately 50 percent utilization, leaving 50 percent headroom for processing. This clearly illustrates the efficiency of the NetApp storage utilized in the architecture.

Figure 27 shows read/write latency on the NetApp SAN.

Figure 27. 2,000 desktop storage I/O latency

This graph illustrates the following significant points:
- At peak time, the read latency across both controllers was about 3–4 ms. This is close to the host-side performance, keeping in mind that the host-side time includes the round trip to and from the SAN.
- At peak time, the write latency across both controllers was about 1–1.5 ms. The host side measured about 3 ms; again, the host-side time includes the round trip to and from the SAN.

These latencies show that the SAN was able to handle the 20,000-plus IOPS without significant degradation in performance. To highlight the math:
- 64 x 15,000 RPM SAS disks were used. Realistically, those disks can each provide about 200 IOPS, which yields an approximate raw maximum of 12,800 IOPS (64 × 200).
- The disks could not possibly have handled this amount of IOPS without the optimization technologies deployed by the NetApp SAN. This clearly illustrates the built-in storage efficiencies of the NetApp storage utilized in the architecture.
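The gap between raw spindle capability and observed load is easy to quantify. A minimal Python sketch using the figures above (the shortfall line is our derived arithmetic):

    # Raw spindle ceiling vs. observed peak load (values from this section).
    disks = 64
    iops_per_disk = 200            # realistic figure for a 15K RPM SAS disk
    raw_ceiling = disks * iops_per_disk
    observed_iops = 20000

    print(f"Raw spindle ceiling: {raw_ceiling} IOPS")   # 12,800
    print(f"Observed peak load:  {observed_iops} IOPS")
    shortfall = observed_iops - raw_ceiling
    print(f"Covered by NetApp storage efficiencies: {shortfall} IOPS")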
Single-Server Scalability with Half the Host RAM (128 GB RAM)

This test was run to further assess dynamic memory in a more memory-constrained environment. In the test below, we removed half the memory from the server, leaving it with 128 GB, and ran 125 VMs with minimum memory set to 512 MB and maximum memory set to 2 GB. Note that we do not advise running such a configuration in production; it was devised simply to highlight the robustness of dynamic memory in Windows Server 2008 R2 SP1.

In the specific tests described, we initiated the Login VSI test at a logon rate of one user every 30 seconds. In our analysis, we first ran the Login VSI analyzer to get user response times and pass-fail data for the test run. Next, we analyzed the following metrics on the Hyper-V host: CPU utilization, memory utilization, and a sample virtual desktop's memory demand.

Figure 28 shows the output of the Login VSI analyzer.

Figure 28. VSI medium with 125 virtual desktops on the host

This graph illustrates the following significant points:
- Response times are excellent, ranging between 2 and 3 seconds.
- According to VSIMax Dynamic, "The VSI max is not reached." We monitored the test to validate that all sessions were logged on and logged off successfully.

The above data points clearly show that the Hyper-V host with 128 GB RAM was able to handle a medium workload against 125 virtual desktops.

Figure 29 shows the Hyper-V host CPU utilization at runtime.

Figure 29. 125 desktop VM hypervisor CPU runtime utilization

This graph illustrates the following significant points:
- At a rate of one user logon every 30 seconds (for a total of 125 users), the test took approximately 62 minutes to complete.
- CPU utilization increased at a relatively constant rate during runtime between minute 7 and 1 hour 10 minutes. The drop after approximately 1 hour 12 minutes shows when users began to log off.
- At peak time, host CPU hovered between 70 and 80 percent utilization, which indicates that there is still some headroom remaining. In addition, it clearly illustrates that dynamic memory has little impact on CPU utilization.

Figure 30 shows the memory available on the Hyper-V host at runtime.

Figure 30. 125 desktop VM hypervisor runtime available memory

This graph illustrates the following significant points:
- We start with approximately 55 GB of available memory; at peak workload, available memory is 15 GB. The 125 virtual desktops consumed an additional 40 GB at runtime.
- The inflection points show dynamic memory in action: memory is consumed when a new user logs on to a desktop and initiates the workload. After the workload initiates, there are points where the virtual desktop is idle and dynamic memory reclaims some of the RAM.
- Without dynamic memory, it would not have been feasible to start 125 virtual desktops with a 1 GB RAM minimum requirement in such a memory-constrained environment. With dynamic memory, it was not only possible, but the performance was excellent. The arithmetic behind this is sketched below.
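A minimal Python sketch of that feasibility check, assuming the roughly 14 GB of host overhead observed earlier in the boot-storm analysis (the overhead figure is carried over as an assumption):

    # Why a 1 GB static minimum per VM cannot fit on the 128 GB host,
    # while a 512 MB dynamic-memory startup allocation can.
    vms = 125
    host_ram_gb = 128
    host_overhead_gb = 14          # assumed, based on the boot-storm data
    available = host_ram_gb - host_overhead_gb

    for label, per_vm_gb in [("Static 1 GB minimum", 1.0),
                             ("Dynamic 512 MB startup", 0.5)]:
        need = vms * per_vm_gb
        verdict = "fits" if need <= available else "does not fit"
        print(f"{label}: needs {need:.1f} GB of {available} GB -> {verdict}")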
Figure 31 shows the physical memory allocated on a random virtual desktop on the Hyper-V host. Analysis showed that all 125 desktops had a similar profile.

Figure 31. 125 desktop VM physical memory allocated

This graph illustrates the following significant points:
- The profile is similar to that of the run in which the host had 256 GB of RAM, illustrating that even with less than 16 GB of RAM to spare, the host does not degrade VM memory performance.
- Desktops start at 512 MB RAM and end with 512 MB RAM. They peak at approximately 900 MB as applications are opened, used, and closed.
- The graph also shows that in-VM memory increases and decreases dynamically depending on the RAM required at runtime.

Appendix A Power Management

Power management provides a way to manage VMs to minimize power consumption. The following sections describe the interaction between software components during startup and shutdown from a power-management perspective.

Startup

Figure 32 illustrates the interaction between software components during startup.

Figure 32. Power management—startup

Callout descriptions:
1. Citrix XenDesktop Delivery Controller (DDC) contacts the data store to determine the virtual desktop's ID from the Virtual Machine Manager component of Microsoft System Center 2012.
2. XenDesktop issues the VM start command to VMM.
3. VMM issues the VM start command to Microsoft Hyper-V.
4. Hyper-V starts the required VM.
5. The virtual desktop boots through network PXE boot, and then contacts the DHCP server to find the IP address and location of the boot file.
6. The virtual desktop contacts the provisioning server and provides its MAC address.
7. The provisioning server contacts the data store to identify the correct virtual desktop disk based on the MAC address.
8. The provisioning server sends the virtual desktop the portions of the virtual disk required to boot the operating system.
9. After the virtual desktop starts, the virtual desktop agent reports to XenDesktop that the desktop is available.
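The following Python sketch models this startup sequence as ordered steps. The component labels and structure are ours, purely to illustrate the message flow in Figure 32; they do not represent any product API.

    # Illustrative model of the power-management startup flow (Figure 32).
    # Component names and step wording mirror the callouts above.
    STARTUP_FLOW = [
        ("DDC",        "look up the desktop's VMM ID in the data store"),
        ("DDC",        "issue the VM start command to VMM"),
        ("VMM",        "issue the VM start command to Hyper-V"),
        ("Hyper-V",    "start the required VM"),
        ("Desktop VM", "PXE boot; query DHCP for IP address and boot file location"),
        ("Desktop VM", "contact the provisioning server with its MAC address"),
        ("PVS",        "look up the desktop's virtual disk by MAC address"),
        ("PVS",        "stream the disk portions needed to boot the OS"),
        ("VDA",        "report to XenDesktop that the desktop is available"),
    ]

    for step, (component, action) in enumerate(STARTUP_FLOW, start=1):
        print(f"{step}. {component}: {action}")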
Shutdown

Figure 33 illustrates the interaction between software components during shutdown.

Figure 33. Power management—shutdown

Callout descriptions:
1. Citrix XenDesktop Delivery Controller (DDC) contacts the data store to determine the ID for the virtual desktop from the Virtual Machine Manager component of Microsoft System Center 2012.
2. XenDesktop issues the VM shutdown command to VMM.
3. VMM issues the VM shutdown command to Microsoft Hyper-V.
4. Hyper-V shuts down the required VM.
5. VMM sends notification to XenDesktop that the VM has shut down.

Master Image Update

Using the master image, administrators can apply operating system and software patches and install additional software components. Figure 34 shows the interaction between components as a master image is updated.

Figure 34. Master-image update

Callout descriptions:
1. The image is set in private mode.
2. The image is assigned to a template VM.
3. Changes are made to the VM.
4. The image is shut down and the changes are saved on Citrix Provisioning Services (PVS).
5. The new image is assigned to production VMs.

Appendix B Infrastructure Deployment

Figure 35 shows a two-node failover cluster hosting infrastructure VMs.

Figure 35. Two-node failover cluster

© 2012 Microsoft Corporation. All rights reserved. This document is provided "as-is." Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any Microsoft product.