Maximizing Storage Efficiency in Windows Server 2012 R2 and Windows Storage Server 2012 R2

Published January 2014

Abstract: This document examines features in Windows Server 2012 R2 and Windows Storage Server 2012 R2 that can be used to maximize storage efficiency. Topics include:
• How growing data volumes are driving the need for greater storage efficiency.
• An overview of the Microsoft storage stack, including its performance and cost-effectiveness compared to a traditional storage area network (SAN).
• Built-in features that can be quickly deployed to improve storage efficiency, including Data Deduplication, Storage Spaces (storage virtualization), and Thin Provisioning and Trim.
• Supported deployment scenarios, including data center and remote office or branch office.

© 2014 Microsoft Corporation. All rights reserved. This document is provided “as-is.” Information and views expressed in this document, including URL and other Internet website references, may change without notice. You bear the risk of using it. Some examples are for illustration only and are fictitious. No real association is intended or inferred. This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

Contents
Executive Summary
The Need for More Efficient Storage
Enterprise-Class Storage in Windows Server
All You Need for Highly Efficient Storage is In-the-Box
Data Deduplication
Storage Spaces
Thin Provisioning and Trim
Related Features
Deployment Scenarios
Conclusion and Additional Resources
Appendix A: SAN Performance at a Fraction of the Cost

Executive Summary

Few would disagree that affordable, reliable storage is an essential component of any technology-enabled business. When it comes to data storage, two facts hold true:
• For most companies, data volumes are continuing to grow at a rapid pace.
• Many of those companies are adopting ways of using storage more efficiently in order to minimize storage-related costs.

Windows Server 2012 R2 Standard and Datacenter (and by extension, Windows Storage Server 2012 R2 Standard) provide everything that you need for highly efficient storage that can scale to support the largest workloads.
Key technologies that contribute to this capability include:
• Data Deduplication, which can reduce the amount of disk space required for common storage workloads by 30-90 percent based on Microsoft internal testing.
• Storage Spaces, a technology that enables you to virtualize storage by grouping industry-standard disks into storage pools, and then create virtual disks (called storage spaces) from the available capacity in the storage pools.
• Thin Provisioning and Trim, which enable you to deploy only the disk space you need today, expand dynamically when needed, and automatically reclaim storage that is no longer needed.

The benefits of using Windows Server 2012 R2 to maximize storage efficiency include:
• Reduced/deferred storage costs. With the technologies in Windows Server 2012 R2, you can store more logical data in less physical disk space, purchase only the physical storage you need today, expand dynamically as needed, and avoid operating costs associated with supporting unused disk capacity until it is actually needed.
• Low acquisition costs. All features needed to maximize storage efficiency are included in-the-box with Windows Server 2012 R2 (Standard and Datacenter) and Windows Storage Server 2012 R2 Standard, and can be used without any additional hardware, software, or licensing fees.
• Fast, easy deployment. With Windows Server 2012 R2, there’s no new hardware or software to deploy. All technologies for maximizing storage efficiency are built into the operating system and can be turned on and configured in just a few minutes.
• Ease of management. Server Manager in Windows Server 2012 R2 can provide a single view of all your storage, across your data center. System administration tasks can also be performed using Windows PowerShell, enabling you to automate provisioning of new storage spaces and virtual disks whether you have one server or multiple servers.
The Need for More Efficient Storage

According to recent studies, more than 40 percent of information technology (IT) staffs in the world today are encountering significant increases in data volumes.[1] At the same time, traditional SAN solutions are becoming more expensive to deploy and expand. Fortunately, new technology advances are providing a partial answer to these challenges:
• Today’s powerful, industry-standard servers provide an attractive alternative to costly, proprietary SAN storage controllers.
• 10 Gbps and faster network cards are enabling companies to use cost-effective Ethernet technology to remotely access shared storage.

Leading technology companies have recognized this opportunity to reinvent enterprise storage and are bringing new solutions to market. Of course, making this work requires a good deal of software “secret sauce” to turn a commodity server into a full-featured storage controller. But what exactly is needed from that software? Or, more specifically, which storage features can help you control storage costs in the face of growing data volumes?

Again, the answer can be found in the market, among the many companies that are adopting the latest ways of using storage more efficiently. These technologies include:
• Data deduplication, which makes storage more efficient by minimizing redundant data on a disk.
• Storage virtualization, which enhances storage scalability through the abstraction of logical storage from physical storage.
• Thin provisioning and trim, which enable you to create virtual disks that appear larger than their physical storage capacity, provision additional storage as needed, and reclaim that storage when no longer needed.

If you’re not already thinking about adopting a highly efficient storage infrastructure based on industry-standard hardware, you may want to consider one.
But where can you get the technologies needed to maximize storage efficiency, how do they work, how are they deployed, and what do they cost? The remainder of this paper examines these questions from the perspective of Windows Server 2012 R2 and Windows Storage Server 2012 R2, which is based on Windows Server 2012 R2.

[1] Source: Agile BI, Complementing Traditional BI to Address the Shrinking Business-Decision Window, November 2011, Aberdeen Group, Inc.

Enterprise-Class Storage in Windows Server

With Windows Server 2012 R2, you can take advantage of a proven, enterprise-class data center and cloud platform that can scale to run the largest workloads. Delivered as a dynamic, available, and cost-effective cloud solution, Windows Server 2012 R2 provides automated protection and cost-effective business continuity to keep your business up and running, all while simplifying storage management and protecting your existing storage investments.

The Microsoft Storage Stack

Windows Server 2012 R2 integrates SAN features with the power and familiarity of Windows Server, enabling you to easily scale up to meet growing storage needs on low-cost, industry-standard hardware. Even better, you can achieve SAN-like performance and reliability while significantly reducing storage costs in terms of $/IOPS and $/TB. Appendix A compares the performance and cost of the Microsoft storage stack to a traditional SAN, as determined by a study performed by ESG Lab.

Figure 1 shows the Microsoft storage stack and its key components:
• Hyper-V workloads and SQL Server databases that access storage through existing networking infrastructure over the enhanced SMB 3.0 protocol
• Storage exposed through Windows Server-based Scale-Out File Servers
• Storage based on industry-standard disks and JBOD enclosures, provisioned using Storage Spaces

Figure 1. Diagram of the Microsoft storage stack.
All You Need for Highly Efficient Storage is In-the-Box

The Microsoft storage stack is more than just enterprise-ready; it’s also ready to help you maximize storage efficiency. Windows Server 2012 R2 Standard and Datacenter (and Windows Storage Server 2012 R2 Standard) provide all that you need for this in-the-box, ready to use, without additional product keys or licensing fees. When combined with the low acquisition costs for Windows Server 2012 R2 or Windows Storage Server 2012 R2 running on industry-standard hardware, this provides a level of value that is hard to match with any other storage platform.

Key technologies in Windows Server 2012 R2 that can help maximize storage efficiency include:
• Data Deduplication, which uses sophisticated data reduction functionality to reduce the amount of physical disk space required to store a given amount of logical data. Data Deduplication can be used on any server, by itself, or with Microsoft BranchCache to extend branch office storage capabilities.
• Storage Spaces, which is Microsoft’s implementation of storage virtualization. It enables you to group industry-standard disks into storage pools, and then create virtual disks (called storage spaces) from the available capacity in those storage pools. When needed, additional capacity can be added to a storage space by bringing new disks into the underlying storage pool or pools.
• Thin Provisioning and Trim. When you create a virtual disk (including a storage space), you can choose either thin or fixed provisioning. With thin provisioning, you can create virtual disks that appear larger than the current storage pool capacity and then provision additional storage as needed. Similarly, trim enables you to reclaim storage that is no longer needed.
These features in Windows Server 2012 R2 are designed to work with its many other storage features, including SMB 3.0, Hyper-V, Failover Clustering, Cluster Shared Volumes (CSV), Storage Quality-of-Service (QoS), and Hyper-V Replica. All storage features are accessible through a single, integrated management interface, making it simple to deploy scalable, highly available, easily managed storage that can support both traditional file server roles and application workloads, all on cost-effective, industry-standard hardware.

You can also choose how to buy: build (or spec) your own system using industry-standard components and Windows Server 2012 R2, or purchase a storage appliance with Windows Storage Server 2012 R2 preinstalled. Either way, all you need for highly efficient storage is in-the-box, so you’ll be all set to maximize storage efficiency and reduce your storage costs.

Data Deduplication

Data Deduplication, a storage efficiency feature first introduced in Windows Server 2012, helps address the ever-growing demand for file storage. Instead of expanding the storage used to host data, Windows Server automatically scans through your disks, identifying duplicate chunks and saving space by storing these chunks only once. This functionality saves you money by optimizing your existing storage infrastructure. In addition, deduplication offers even greater savings by extending the lifespan of current storage investments.

How Data Deduplication Works

Data Deduplication runs in the background on a file server, inspecting “cold” files that are not currently in use. Data Deduplication can also be used to optimize virtual disks for running VDI workloads, provided that the storage and compute nodes for the VDI infrastructure are connected remotely via the SMB protocol.
During the deduplication process, the deduplication engine:
• Examines and segments files into small, variable-sized “chunks” of 32 KB to 128 KB.
• Identifies duplicate chunks that appear in more than one file.
• Maintains a single copy of each chunk in a compressed format in a central repository, which is called a “chunk store” and resides in the System Volume Information folder.
• Replaces each deduplicated file with a much smaller reference (called a reparse point) that indicates which chunks are used by the file.

When a deduplicated file is read, a filter in the read path reassembles the file in a manner that is transparent to the calling application or user. Deduplication has a cache to avoid going to disk for repeatedly accessed chunks. (If multiple users are accessing deduplicated files that contain the same chunks at the same time, the caching of these chunks at the file reassembly level will speed up access times for all users.)

Data Deduplication also throttles CPU and memory usage, enabling implementation of large volumes without impacting server performance. In addition, metadata and preview redundancy help prevent data loss due to unexpected power outages. Checksums, along with data integrity and consistency checks, help prevent corruption for volumes configured to use Data Deduplication. Routine compression run times can also be scheduled for off-peak times to reduce any impact those operations might have on data access.

Where Data Deduplication Delivers Significant Results

When Data Deduplication is employed, the resulting increase in storage efficiency depends on the type of data being stored. From both internal testing and that performed by ESG Lab, Data Deduplication has shown storage savings of 25-60 percent for general file shares and 90 percent for OS VHDs.
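To make the chunking steps and the savings figures above more concrete, the following is a minimal Python sketch of a chunk-based deduplication store. It is an illustration only, not the Windows Server implementation: the shift-and-add boundary hash, the `BOUNDARY_MASK` value, the use of SHA-256 to identify chunks, zlib for compression, and the in-memory `ChunkStore` class (whose per-file digest list stands in for a reparse point) are all assumptions chosen for brevity. Only the 32 KB and 128 KB chunk-size bounds come from the text.

```python
import hashlib
import zlib

# Chunk-size bounds taken from the text (32 KB to 128 KB).
MIN_CHUNK = 32 * 1024
MAX_CHUNK = 128 * 1024
# Hypothetical boundary mask: a chunk ends where the low bits of the rolling
# value are all zero. This is not the value Windows Server uses.
BOUNDARY_MASK = (1 << 15) - 1


def split_into_chunks(data, min_size=MIN_CHUNK, max_size=MAX_CHUNK,
                      mask=BOUNDARY_MASK):
    """Split data into variable-sized chunks at content-defined boundaries."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        # Toy shift-and-add hash; only the most recent 32 bytes influence it.
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        length = i - start + 1
        if length < min_size:
            continue
        if (rolling & mask) == 0 or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # final chunk may be shorter than min_size
    return chunks


class ChunkStore:
    """In-memory stand-in for a chunk store plus per-file chunk references."""

    def __init__(self):
        self.chunks = {}        # digest -> compressed chunk, stored once
        self.files = {}         # file name -> ordered list of chunk digests
        self.logical_bytes = 0  # total size of all files as the user sees them

    def add_file(self, name, data, **chunking):
        refs = []
        for chunk in split_into_chunks(data, **chunking):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = zlib.compress(chunk)
            refs.append(digest)
        self.files[name] = refs
        self.logical_bytes += len(data)

    def read_file(self, name):
        # Reassemble the file from its chunk references, as a read filter would.
        return b"".join(zlib.decompress(self.chunks[d])
                        for d in self.files[name])

    def savings_percent(self):
        # Counts compressed chunk bytes only; reference metadata is ignored.
        if self.logical_bytes == 0:
            return 0.0
        physical = sum(len(c) for c in self.chunks.values())
        return 100.0 * (1 - physical / self.logical_bytes)
```

Adding the same data under two file names stores each chunk only once, so `savings_percent()` approaches 50 percent for incompressible data. Real savings depend on how much content actually repeats across files, which is why the figures quoted above vary so widely by workload.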
These savings are far above what was possible with Single Instance Storage (SIS) or New Technology File System (NTFS) compression. Figure 2 shows the increase in storage efficiency provided by Data Deduplication for various storage workloads.

Figure 2. Gains in storage efficiency provided by Data Deduplication for various workloads.

Microsoft supports the use of Data Deduplication for most cold files. (SQL Server and Exchange Server files, even if cold, will not benefit much from Data Deduplication. Using Data Deduplication on these types of files is not recommended, nor is it supported by Microsoft due to performance considerations.)

Data Deduplication is also supported for the optimization of virtual disks in VDI deployments. Data Deduplication was tested to ensure that it performs correctly on general virtualization workloads; however, efforts were focused on ensuring that the performance of optimized files is adequate for VDI scenarios. For non-VDI scenarios (general Hyper-V VMs), Microsoft cannot provide the same performance guarantees. As a result, Microsoft does not support deduplication of arbitrary, in-use VHDs with Windows Server 2012 R2. However, because Data Deduplication is a core part of the storage stack, there is no explicit block in place that prevents it from being enabled on arbitrary workloads.

Storage Spaces

Storage Spaces in Windows Server 2012 R2 gives you the ability to consolidate all of your Serial Attached SCSI (SAS) and Serial Advanced Technology Attachment (SATA) connected disks, regardless of whether they are solid-state drives (SSDs) or traditional hard disk drives (HDDs), into storage pools. After you have created these pools, you can then create logical storage devices from them, called storage spaces. Figure 3 provides a conceptual view of Storage Spaces and storage pools.

Figure 3. Storage Spaces conceptual view.
Storage Spaces virtual disks work the same as regular Windows disks. However, they can be configured for different resiliency schemes, such as mirroring and parity. Storage Spaces is compatible with other Windows Server 2012 R2 storage features, including SMB Direct and Failover Clustering, so you can use simple, inexpensive storage devices to create powerful and resilient storage infrastructures on a limited budget. At the same time, you can use industry-standard hardware to supply high-performance, feature-rich storage to servers, clusters, and applications alike.

Storage Tiers

Windows Server 2012 R2 introduces a new, policy-based, tiered storage mechanism for Storage Spaces. Storage tiers provide more flexibility for hot and cold workloads and support industry-standard enclosures with SATA, SAS, and SSD devices. Storage Spaces assigns data to storage tiers within a tiered storage space based on how frequently the data is accessed:
• Storage Spaces automatically moves hot data (data that is accessed frequently) to the faster, but more expensive, SSD media. All data starts as hot data.
• Storage Spaces moves cold data (data that is accessed infrequently) to the slower, less expensive hard disk drives.
• If cold data becomes hot, it automatically moves to the SSD media. If hot data becomes cold, it moves to the hard disk drives.

Thin Provisioning and Trim

When you create a virtual disk (including a storage space), you can choose either thin or fixed provisioning. With thin provisioning, you can create virtual disks larger than the current storage pool capacity and then add disks later to support needed growth. As a result, administrators need to purchase only the physical storage required today and can expand dynamically when necessary.
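The behavior described above, together with the trim behavior this section goes on to describe, can be modeled with a short Python sketch. This is a conceptual model under stated assumptions, not the Storage Spaces implementation: the `StoragePool` and `ThinDisk` classes, the idea of allocating in fixed-size slabs, and the slab size used in the usage note are all hypothetical, and sizes are plain integers in whatever unit you choose (for example, GB).

```python
class StoragePool:
    """Toy storage pool: tracks physical capacity and allocated space."""

    def __init__(self):
        self.capacity = 0
        self.allocated = 0

    def add_disk(self, size):
        # Bringing a new physical disk into the pool grows its capacity.
        self.capacity += size

    @property
    def free(self):
        return self.capacity - self.allocated


class ThinDisk:
    """Thinly provisioned virtual disk backed by a StoragePool.

    Physical space is taken from the pool one slab at a time, and only when
    a region of the disk is first written. The slab mechanism is an
    assumption made for this sketch.
    """

    def __init__(self, pool, logical_size, slab_size):
        self.pool = pool
        self.logical_size = logical_size  # size advertised to the consumer
        self.slab_size = slab_size
        self.slabs = set()                # indices of slabs with backing space

    def write(self, offset, length):
        if length <= 0 or offset < 0 or offset + length > self.logical_size:
            raise ValueError("write outside the virtual disk")
        first = offset // self.slab_size
        last = (offset + length - 1) // self.slab_size
        needed = [s for s in range(first, last + 1) if s not in self.slabs]
        if len(needed) * self.slab_size > self.pool.free:
            raise RuntimeError("pool exhausted: add physical disks")
        self.slabs.update(needed)
        self.pool.allocated += len(needed) * self.slab_size

    def trim(self, offset, length):
        """Return slabs that lie entirely inside the trimmed range."""
        first = -(-offset // self.slab_size)       # round up to a slab start
        end = (offset + length) // self.slab_size  # exclusive slab index
        freed = [s for s in range(first, end) if s in self.slabs]
        self.slabs.difference_update(freed)
        self.pool.allocated -= len(freed) * self.slab_size
```

For example, a `ThinDisk` with a logical size of 1000 and a slab size of 10 can be created on a pool holding a single disk of size 100. Writes succeed until the pool is exhausted, `add_disk` lets them continue, and `trim` hands whole slabs back to the pool for reuse.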
Use of storage tiers requires fixed provisioning. Use of the Thin Provisioning feature in Windows Server is only supported for standalone, non-clustered configurations. (This does not apply to the use of thin provisioning functionality that is built into a storage array.)

Trim provides a mechanism that enables applications to give up storage when it is no longer needed, so that physical capacity is not held by data that no longer exists. For example, assume a company stores its Hyper-V VMs on logical disks created with Storage Spaces. With trim, when a VM deletes a large file, it communicates the deletion to the host, which then communicates it to the storage space. As a result, the storage space automatically reclaims this space, making it available for use again.

Related Features

The following features, all built into Windows Server 2012 R2 (Standard and Datacenter) and Windows Storage Server 2012 R2 Standard, are also applicable to common scenarios that can benefit from improved storage efficiency.

Hyper-V Replica

Hyper-V Replica provides asynchronous replication of VMs for purposes of business continuity and disaster recovery. If a failure occurs at a primary site, the administrator can manually fail over production VMs to a Hyper-V server at a recovery site. The VMs recover to a consistent point in time and are accessible to the rest of the network in near real time. After the primary site comes back, the administrator can manually revert the VMs to the Hyper-V server at the primary site.

Online VHDX Resize

Virtual machines that host applications with service-level agreements must be able to increase and decrease the size of dynamic disks while the virtual machine is running, rather than having to take the virtual machine and its applications offline. Online VHDX Resize in Windows Server 2012 R2 offers a way to perform online VHDX resize operations (both expanding and shrinking).
With Online VHDX Resize, you can grow a SCSI virtual disk with no downtime, and expand and shrink a volume within a guest without downtime.

Storage Quality-of-Service (QoS)

When you virtualize databases or certain virtual machine workloads that are storage heavy, you need to be sure that the databases get the IO bandwidth they need and that you have the ability to monitor storage bandwidth usage. With storage QoS, a new feature of Hyper-V in Windows Server 2012 R2, you can set the maximum IOPS that the virtual machine can use. This can prove useful if you have an extract, transform, load (ETL) job that runs during normal business hours but that you want to prevent from using all available IO on the virtual switch. QoS can throttle the available bandwidth to the guest network adapter when it reaches the limit that you set. You can also set a minimum amount of bandwidth so that it’s available even if other processes need resources.

Deployment Scenarios

The technologies for maximizing storage efficiency in Windows Server 2012 R2 are designed to work equally well in your main data center or a remote office or branch office.

Data Center

Figure 4 shows a typical “standalone” configuration for a main data center.

Figure 4. Typical data center deployment.

Remote Office or Branch Office

Figure 5 shows a typical deployment configuration for a remote office or branch office.

Figure 5. Typical remote office or branch office deployment.

In the remote office or branch office scenario, two additional technologies that are built into Windows Server 2012 R2 are often useful: BranchCache and DFS Replication, both of which you can think of as “efficient synchronization” technologies.

BranchCache

BranchCache is a wide area network (WAN) bandwidth optimization technology.
When users access content on remote servers, BranchCache copies that content from your main office or hosted cloud servers and caches it at branch office locations, allowing client computers at branch offices to access the content locally rather than over the WAN. BranchCache works seamlessly with Data Deduplication, Storage Spaces, and other Windows Server 2012 R2 storage features.

DFS Replication

DFS Replication is a multi-master replication engine that supports replication scheduling and bandwidth throttling. It uses an algorithm known as remote differential compression (RDC) to efficiently update files over a limited-bandwidth network. RDC detects insertions, removals, and rearrangements of data in files, enabling DFS Replication to replicate only the changed file blocks when files are updated. DFS Replication also works seamlessly with Data Deduplication, Storage Spaces, and other Windows Server 2012 R2 storage features.

Conclusion and Additional Resources

The Microsoft storage stack can handle most enterprise workloads at a fraction of the price per IOPS and per TB of traditional SANs. With shrinking budgets and increased demand for storage, organizations need to take a fresh look at Windows Server 2012 R2 as a storage solution for workloads that do not require the advanced features and capabilities of traditional SAN storage. Windows Server 2012 R2 provides a new, less expensive option for high-performance, resilient, enterprise-grade storage. Many companies have already begun to move toward more efficient storage infrastructures, and with the Microsoft storage stack, you can get the storage efficiency, reliability, and manageability you need, all at an affordable price.
Additional Resources

• More information about Windows Server 2012 R2 File and Storage Services can be found at http://technet.microsoft.com/en-us/library/hh831487.aspx
• More information on Data Deduplication can be found at http://technet.microsoft.com/en-us/library/hh831602.aspx
• More information on Storage Spaces can be found at http://technet.microsoft.com/en-us/library/hh831739.aspx
• More information on Hyper-V (including related features in this paper) can be found at http://technet.microsoft.com/en-us/library/hh831531.aspx
• More information on BranchCache can be found at http://technet.microsoft.com/en-us/library/hh831696.aspx
• More information on DFS Replication can be found at http://technet.microsoft.com/en-us/library/jj127250.aspx

Appendix A: SAN Performance at a Fraction of the Cost

To illustrate the cost and performance differences between a traditional SAN storage solution and one built on the Microsoft stack, the Enterprise Strategy Group (ESG) tested response times of virtual machines (VMs) using different storage topologies (Fibre Channel-SAN, Internet Small Computer System Interface (iSCSI)-SAN, Storage Spaces over SMB with RDMA NICs). The results (see Figure 6) measure SQL Server response times for transactions in milliseconds when running 2, 4, 6, and 8 instances of SQL Server with each type of remote storage.

Figure 6. Response time comparison. (Chart title: SQL Server Response Time Comparison, OLTP workload, Windows Server 2012, SQL Server 2012; less is better. Horizontal axis: number of Hyper-V virtual machines, 2 to 8. Vertical axis: average transaction response (s), 0.00 to 0.10. Series: Storage Spaces over SMB, FC SAN, iSCSI SAN, Storage Spaces over SMB with RDMA.)

The time comparison between the different instances illustrates several interesting conclusions:
• Performance scales linearly as the number of VMs increases from two to eight.
• The performance differences between the storage configuration test scenarios are negligible.
• The performance of Storage Spaces over SMB with RDMA is slightly faster (1-4 percent) than iSCSI/FC SAN test scenarios.
• The performance of Storage Spaces over SMB is slightly slower (averaging 3 percent) than iSCSI/FC SAN.

ESG Lab also converted the raw capacity from TBs to gigabytes (GBs) and divided the total cost of acquisition by the capacity in GB, resulting in the commonly used metric of $/GB (see Figure 7). Similar to the overall cost of acquisition, the iSCSI and Fibre Channel SANs proved to be almost twice the cost of the Microsoft file-based storage offering with Storage Spaces and RDMA.

Figure 7. A comparison of acquisition cost between model scenarios. (Chart title: $/GB Cost of Acquisition Analysis; 14.4 TB of raw capacity from 24 10K 600 GB SAS drives. Categories, in the order extracted: FC SAN; iSCSI SAN; File-based Storage with Spaces, SMB, RDMA, SAS JBOD. Data labels, in the order extracted: $6.65, $6.19, $3.33.)

ESG Lab did not factor in the cost of management and maintenance. Because many organizations have Microsoft and Windows experts, as well as storage experts, they can manage the Windows storage without the need for additional training. Many SAN vendors require vendor-specific storage specialists to provision, manage, and monitor the storage infrastructure.

The full ESG Lab report can be found here.
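The $/GB metric used in this appendix is simple arithmetic, and a short Python sketch can make the calculation explicit. The drive count, drive size, and $/GB values come from Figure 7. The mapping of each value to a storage configuration assumes the data labels appear in the same order as the category labels, and the helper names are hypothetical. The total acquisition costs are not stated in this paper, so the sketch only derives what the published $/GB values would imply.

```python
# Inputs from Figure 7: 24 SAS drives of 600 GB each.
DRIVE_COUNT = 24
DRIVE_SIZE_GB = 600

# $/GB values from Figure 7. ASSUMPTION: the values map to the configurations
# in the order in which the labels appear in the figure.
REPORTED_DOLLARS_PER_GB = {
    "FC SAN": 6.65,
    "iSCSI SAN": 6.19,
    "File-based storage (Spaces, SMB, RDMA, SAS JBOD)": 3.33,
}


def raw_capacity_gb(drive_count, drive_size_gb):
    """Raw capacity in decimal GB, before any resiliency overhead."""
    return drive_count * drive_size_gb


def cost_per_gb(total_cost, capacity_gb):
    """The $/GB metric: total acquisition cost divided by raw capacity."""
    return total_cost / capacity_gb


def implied_total_cost(dollars_per_gb, capacity_gb):
    """Invert the metric to estimate the acquisition cost it implies."""
    return dollars_per_gb * capacity_gb


if __name__ == "__main__":
    capacity = raw_capacity_gb(DRIVE_COUNT, DRIVE_SIZE_GB)
    print(f"Raw capacity: {capacity} GB ({capacity / 1000} TB)")
    baseline = REPORTED_DOLLARS_PER_GB[
        "File-based storage (Spaces, SMB, RDMA, SAS JBOD)"]
    for name, value in REPORTED_DOLLARS_PER_GB.items():
        total = implied_total_cost(value, capacity)
        ratio = value / baseline
        print(f"{name}: ${value:.2f}/GB, implied total ${total:,.0f}, "
              f"{ratio:.2f}x the file-based figure")
```

Under that ordering assumption, both SAN figures come out at between 1.8 and 2.0 times the file-based figure, which is consistent with the "almost twice the cost" statement above.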