Microsoft® Exchange Characterization on Dell™ PowerEdge™ x6xx Servers and the Dell™|EMC® CX600 FC Storage Array

Enterprise Systems Group (ESG)
Dell White Paper
By Richard Hou
September 2002

Contents

Executive Summary
Introduction
    Microsoft Exchange 2000 Architecture
    Dell/EMC Fibre Channel Products for Storage Area Networks
Test Methodology
    Hardware and Software
Exchange On PowerEdge x6xx Servers
    Host Side Memory Tuning
    Data File Tuning
        Optimize Transaction Log
        Optimize Storage Group
        Optimize Database File
    RAID level and Stripe size
    Which Server to Use
    Other Performance factors
Conclusion
    Contacts
Appendix A: Reference Documents

Figures

Figure 1: Design flow for Exchange Environment
Figure 2: Exchange 2000 Server Configuration
Figure 3: PowerPath Manager Console
Figure 4: Memory Configuration on the CX600 Storage Processor
Figure 5: 73GB 10K rpm Drive Throughput
Figure 6: CX600 Latency Performance Based on Number of Users
Figure 7: RAID5 Enhancements on the Dell/EMC CX600
Figure 8: Response time on RAID10 and RAID5
Figure 9: Response Time on different Stripe size
Figure 10: LoadSim Score for the PowerEdge 6650 and PowerEdge 2650
Figure 11: PowerEdge 6650 versus 2650: CPU Load and Number of Users
Figure 12: The Speed and Topology of the HBA card
Figure 13: Load Balancing Policy with PowerPath

Tables

Table 1: Configurations of the Test Exchange Servers
Table 2: Feature Comparison Between the Dell/EMC FC4700-2 and the CX600
Table 3: ATF and PowerPath Feature Comparison

Section 1
Executive Summary

This paper provides a set of data for analyzing the performance of the Dell™ PowerEdge™ 6650 and PowerEdge 2650 servers running Microsoft® Exchange 2000 Server with the new Dell|EMC® CX600 Fibre Channel RAID storage array. It also reviews the data to provide guidelines for Exchange deployment from the perspective of storage consolidation and server consolidation. By analyzing the structures of the CX600 storage array and the Exchange 2000 database, the paper discusses the key factors that affect the design and performance of the Exchange storage backend. These considerations and suggestions can help to achieve a more efficient storage I/O environment.

Section 2
Introduction

The advent of new technology continuously expands the possibilities for better solutions. The goal of this trend is not only to increase efficiency and return on investment (ROI), but also to achieve better performance. Since Microsoft released Exchange 2000, it has become very popular because of its reliability and efficiency in providing a critical messaging service for the growing needs of user information and collaboration. But because of the business-critical nature of this application, designing and deploying an Exchange environment requires considerable planning. This includes everything from Active Directory design to disk group configuration. Figure 1 shows a typical flow chart for designing an Exchange environment. Every factor that theoretically affects the environment should be studied prior to initiating the production environment.
[Figure 1: Design flow for Exchange Environment. The flow chart covers four phases (Design, Staging, Production, Improve) with the steps Environment Survey, Design Plan, Testing Plan, Review, Setup, Testing, Data Analysis, Pilot, Pilot Testing, Move to Production Environment, Monitoring Performance, and Future Upgrade or Implementation.]

This paper focuses on the storage design and characterization of Exchange 2000 on Dell's newly released PowerEdge 6650 and PowerEdge 2650 servers and the Dell/EMC CX600 Storage Area Network (SAN) I/O subsystem.

To plan the storage strategy for Microsoft® Exchange 2000, four criteria usually apply: capacity, availability, performance, and recovery. Different environments have different priorities, so for each of these criteria, it is important to apply the flow chart and make sure the deployment will meet the requirements of the messaging environment. The criteria significantly affect the cost, architecture, maintenance, and disaster recovery of the Exchange 2000 environment.

Capacity is typically the easiest of the criteria to start with. The Exchange 2000 database includes three parts (discussed in detail in “Running an Optimized Microsoft Exchange 2000 Server with Dell|EMC SAN” at http://www.dell.com/us/en/esg/topics/power_ps2q02-hourun.htm): a private database for the information store, a public database for public folders, and a transaction log for log files. The total storage requirement for capacity is:

Capacity = Information Store + Public Store + Transaction Log

The size of the information store varies from organization to organization. The total size is typically equal to the number of users times the size of the mailbox for each user. Capacity will significantly affect disaster recovery time (discussed in the Dell paper “Backup and Restore of Microsoft Exchange 2000 Using Dell|EMC SAN” at http://www.dell.com/us/en/esg/topics/power_ps2q02-houback.htm).

Every organization has a different SLA (Service Level Agreement).
The SLA defines the level of availability required by the organization, and even how the requirements may differ from site to site. The messaging environment may be categorized as mission critical, business critical, regular, or non-essential, although the last is rare because modern organizations depend heavily on messaging systems.

Availability can be achieved at different levels. On the OS level, clustering can provide application/CPU redundancy for an Exchange environment. On the I/O level, RAID (Redundant Array of Independent Disks) and Dell/EMC’s no-single-point-of-failure environment help ensure there is always a redundant path to access data. Availability can affect the cost of the implementation and increase the complexity of the environment, so it must be planned for carefully.

Performance is a very broad topic. It can affect the environment at all levels, from the OS to a single disk in an array. A detailed profile of the current environment is a great help in understanding the implementation of the future solution. For this paper, the focus is on the throughput of the disk backend. Most companies profile their environment by I/Os per user per second. Reflected at the backend, this is equivalent to how many transfers (total reads and writes) the I/O system handles every second.

Recovery is another important metric in the SLA. A company without a recovery plan is like a company with no messaging system. Without a fallback plan, the damage may be greater than for an organization that has never used an Exchange environment. Recovery time is closely related to the size of the information store. By carefully distributing the mailboxes across multiple databases, it is possible to greatly reduce the backup/restore time.

Dell has partnered with Microsoft for over 15 years.
This relationship has helped the two companies join together to provide integrated solutions with high availability, reliability, scalability, and high performance. Microsoft Exchange 2000 on Dell PowerEdge Servers and Dell/EMC Storage Area Network (SAN) products not only offers redundancy, but also provides excellent price/performance for small, medium and enterprise customers.

This paper discusses the issues mentioned above and presents best practices for configuring Dell servers and storage for Exchange 2000. Although the focus of this paper is on the Information Store on a stand-alone server, the best practices also apply to clusters and public folders.

Microsoft Exchange 2000 Architecture

With significantly enhanced features and newly developed functions, Microsoft Exchange 2000 is a major revision of Exchange 5.5. Architecturally, Exchange 2000 does not use a self-contained messaging system like Exchange 5.5. Exchange 2000 separates many components and tightly integrates them with the underlying Windows® 2000 operating system and Microsoft Internet Information Server (IIS). These improvements are designed to reduce the overhead on the Exchange servers, enable consolidation of the users onto many fewer servers, and unify the network and messaging administration, all of which can help contribute to lower total cost of ownership (TCO). An article in the Power Solutions Journal, Issue 3, 2001, discussed the new architecture. In summary, the key features of Exchange 2000 that bring scalability, reliability and availability include:

- Active Directory integration
- Multiple message database support
- Reliable SMTP support
- Active/Active Clustering Support

Dell/EMC Fibre Channel Products for Storage Area Networks

Dell’s Storage Area Networks (SANs) use fibre channel and RAID (redundant array of independent disks) technologies to provide the data protection, speed, flexibility, and storage capacity necessary to meet the performance requirements of complex server networks.
EMC developed its first full fibre channel storage solution in 1997. By partnering with EMC, Dell’s fibre channel storage subsystems can support large or small data storage networks. Dell provides end-to-end enterprise solutions including software, hardware and service, as well as a single point of contact and the ability to upgrade to future products and features to help protect customer investments.

SANs provide an excellent means for storage consolidation. Multiple servers (hosts) with different operating systems can share the same backend fabric on a separate network, which can help remove traffic from the Ethernet LAN and result in better network throughput. The current fibre channel standard, FC2, delivers up to 200 megabytes per second (MB/sec) of bandwidth for half-duplex configurations, and up to 400 MB/sec for full-duplex configurations.

Dell SANs also provide a high availability environment through the use of redundant paths from the front-end servers that share the storage equipment to the redundant switches in between the storage subsystem(s) and the servers. This architecture helps ensure I/O access if one component fails. Battery backup is used to increase the integrity of data written to the RAID controller caches.

Dell/EMC products also provide the horsepower to drive I/O performance. The recently released Dell/EMC CX600 is a completely new enterprise storage product, appropriate for enterprise users looking to implement modular storage. For a shared environment or an application that is I/O intensive, the CX600 has two 2 GHz Pentium® processors on each storage processor. This configuration is designed to deliver optimal I/O performance. With the Dell/EMC CX600, up to 240 disks are supported by adding Disk Assembly Enclosures (DAEs), which house only drives.
These features help make this Dell/EMC solution a great choice for large organizations looking to consolidate and centralize storage with modular systems that can grow along with user needs.

Section 3
Test Methodology

The goal of testing was to determine the best and most economical configuration for an environment, and to understand how many users can be supported on both the servers and the backend storage. The test Dell developed simulated an environment with multiple users accessing the server simultaneously to stress the Exchange server. The I/O activities were then logged and analyzed to understand Exchange I/O patterns. To simulate the user workload, the Microsoft Messaging Benchmark 2 (MMB2) specification was used to create the user database and Microsoft’s LoadSim program was used to generate messaging requests. Information stores for 500 to 5,000 users were created on RAID10 and RAID5 arrays with different parameters.

Hardware and Software

The focus of the test was to identify the best configuration for the backend I/O system, and to determine how to eliminate factors that degrade performance before requests reach the storage backend. For a 5,000-user test, the Active Directory Service was likely to be very busy. To eliminate this potential bottleneck, Active Directory (AD) was installed on a PowerEdge 6450 as the Domain Controller (DC). The AD database and log file were put on different physical drives than the operating system. The PowerEdge 6450 was also used as the performance monitoring system, while fourteen Optiplex™ GX110s were used to create multiple virtual users for different scenarios. For validation, one Optiplex was configured with 20 users as a control system. The results of the control system were compared with those of the rest of the clients in order to determine how many users create a bottleneck and stress the server.
Two servers, a PowerEdge 6650 and a PowerEdge 2650, were used as the Exchange 2000 servers. Both servers use Intel® Xeon™ processors. The PowerEdge 6650 Exchange server was configured with four Intel Xeon 1.6GHz CPUs and 4GB memory, while the PowerEdge 2650 had two Intel Xeon 2.4GHz processors with 4GB memory. Table 1 shows the configurations of the test servers.

Note: The primary differences between the dual (DP) and quad (MP) Xeon processors are in the clock speed and on-die cache. Currently, the DP Xeon processors support up to 512KB L2 cache, with clock speeds that range from 1.8 to 2.4GHz. The MP Xeon processors not only support 256KB L2 cache, but also up to 1MB L3 cache, with clock speeds of 1.4, 1.5 and 1.6GHz. Because of greater cache support, the MP processors generally perform and scale better than DP processors, and are used on mid-range to high-end servers.

Feature | PowerEdge 2650 | PowerEdge 6650
Processor/Cache | Dual 2.4GHz Xeon DP, 512KB L2 cache | Quad 1.6GHz Xeon MP, 256KB L2 and 1MB L3 cache
Front Side Bus (FSB) | 400 MHz FSB | 400 MHz FSB
Chipset | ServerWorks GC-LE | ServerWorks GC-HE
Memory | 4GB SDRAM (can be configured up to 6GB) | 4GB SDRAM (can be configured up to 16GB)
SCSI | Integrated dual channel Ultra3 (Ultra160) | Integrated single channel Ultra3 (Ultra160)
RAID | Embedded PERC 3/DI with 128MB cache (optional feature)
Hot Plug PCI/PCI-X Slots | 3 PCI-X (1 x 64/133, 2 x 64/100) | 7 PCI-X; 1 PCI (7 x 64/100, 1 x 32/33)
Hot Plug Hard Disk Drives | 5 x 73GB hot plug | 5 x 73GB hot plug
NICs | Embedded Gigabit NICs (see footnote 1) | Dual embedded Broadcom Gigabit Network Interface Cards
Remote Management | Embedded Remote Access (ERA)

Table 1: Configurations of the Test Exchange Servers

Another major processor improvement is Hyper-Threading. By efficiently splitting the use of on-die registers, Hyper-Threading allows a single CPU to act like two, which benefits applications that are not licensed by the number of CPUs.
However, because there is only one path to high-speed cache memory and one connection to the rest of the computer, performance is not the same as with a true dual-processor server. The Dell System Performance Analysis (SPA) team conducted a test on a PowerEdge 6650 with Hyper-Threading both on and off, and the performance gain with it on was about 20 percent. Even so, this technology helps improve the performance of the PowerEdge 6650 and PowerEdge 2650. With this in mind, tests were run on both platforms to understand whether the PowerEdge 2650 can be used for entry-level and mid-level Exchange environments, which could help reduce the cost of hardware.

Footnote 1: This term indicates compliance with IEEE standard 802.3ab for Gigabit Ethernet, and does not connote actual operating speed of 1 Gb/sec. For high speed transmission, connection to a Gigabit Ethernet server and network infrastructure is required.

A Dell/EMC CX600 SPE and two CX600 DAEs were used as the fibre channel storage subsystem. The CX600 is a new product that extends the Dell/EMC Fibre Channel offering into a higher performance market. Table 2 shows the differences between the FC4700-2 and the CX600.

FC4700-2 | CX600
4 direct attached servers | 8 direct attached servers
Up to 64 servers in a SAN | Up to 128 servers in a SAN
Up to 21.7TB / 120 disks; 18, 36, 73, 181GB 1.6-inch drives | Up to 17.5TB / 240 disks; 36, 73GB 1-inch drives
360MB/sec | 1,100MB/sec
50,000 IOPS | 140,000 IOPS
RAID 0, 1, 1/0, 3, 5 | RAID 0, 1, 1/0, 3, 5
Maximum 2GB cache | Maximum 8GB cache
Multipath I/O | Multipath I/O
Non Disruptive Upgrade | Non Disruptive Upgrade
Snapshots and FC mirroring | Snapshots and FC mirroring
4 x 2Gb front-end, 4 x 1Gb back-end | 8 x 2Gb front-end, 4 x 2Gb back-end

Table 2: Feature Comparison Between the Dell/EMC FC4700-2 and the CX600

Unlike the FC4700-2 Disk Processor Enclosure (DPE) with 10 drives inside, the CX600 Storage Processor Enclosure (SPE) does not have any drives inside.
Each DAE was populated with fifteen FC2-interface disks and uses only 3U of rack space. For storage subsystem connectivity, the Exchange server had two Emulex 9002 Host Bus Adapters (HBAs). Two Dell/EMC DS-16B2 switches were used to create a switched fabric. Figure 2 shows the complete configuration tested, with fully redundant paths for high availability.

[Figure 2: Exchange 2000 Server Configuration. The diagram shows the workload generators, DC, Exchange server and management station on the TCP/IP network (LAN), and the Exchange server connected to the CX600 SAN storage system over the Fibre Channel network, with a network management connection to the array.]

Note from Table 2 that the CX600 has four FC2 back-end and eight FC2 front-end connections. These can significantly widen the pipe for data movement. The CX600 uses the same means as the FC4700-2 for connections between the processor enclosure and the disk enclosures. Please refer to the SAN deployment poster for a diagram of how this is done: http://www.dell.com/us/en/esg/topics/power_ps2q02-poster.htm.

Other differences between the CX600 and FC4700-2 are on the software side. Disks can be managed through a web browser: the user interface (UI) runs on the array and serves downloadable Java applets as the front end. Backend storage I/O systems can also be configured in different domains for ease of management, as in an Ethernet environment. The management station can be any system with JRE 1.3.1.01 or later installed. For security, the web-based management interface uses both HTTP and HTTPS for connections. Three kinds of users can be created: administrator, manager, and monitor. The last of these has read-only access.

On the host side, PowerPath™ has replaced Application Transparent Failover (ATF) software. PowerPath automatically manages multiple paths between host and LUN. Unlike ATF, which is only for failover, PowerPath supports failover and load balancing, as well as restore.
In a multiple path environment, if one path fails, the I/O is automatically load balanced across the remaining paths, and the previous operation is restored after the failure has been fixed. Table 3 shows the differences between the two applications.

Feature | ATF | PowerPath
Connectivity | Up to 4 paths | Up to 16 paths
Fail Over & Recovery | Automatic Fail Over, Manual Recovery | Automatic Fail Over and Recovery
Load Balancing | Static & Round Robin | Round Robin, Least I/Os, Least Blocks, Dell/EMC Optimized
Host/Array types | W2K, NT, AIX, Solaris, with Dell/EMC arrays | W2K, NT, AIX, Solaris, Unix & Linux with Dell/EMC & Symmetrix arrays

Table 3: ATF and PowerPath Feature Comparison

Figure 3 shows the multiple paths to the LUN with PowerPath.

[Figure 3: PowerPath Manager Console]

To ensure that the server itself is not the bottleneck, best practices from previous testing still apply:

- Ensure that the server has sufficient memory, fast disk drives, and a fast network adapter(s).
- Run Exchange server as a member server, not a domain controller, to reduce the traffic to the memory and disks.
- Run the Exchange server and domain controller or Global Catalog Server on the same network.
- Install Microsoft Windows 2000 Advanced Server and Exchange 2000 on RAID 1 internal drives.
- Reconfigure the page file to be on a different physical drive than the operating system.
- Change the server mode to “Optimized for Server Application.”
- Enable the /3GB switch in the boot.ini file when using 4GB memory or more.

The last item is also called 4GT (4-gigabyte tuning). By default, Windows 2000 reserves 2GB of the 4GB virtual address space for the kernel, even though kernel usage is under 1GB most of the time. 4GT tuning limits the kernel to 1GB so that user-mode processes can address 3GB.

It is also important to create a monitoring plan to continuously monitor the environment.
The performance of the environment is not the same all the time, especially when new factors are introduced into the scenario. Also see the following URL for additional information about deploying Exchange: http://www.microsoft.com/exchange/default.asp

Section 4
Exchange On PowerEdge x6xx Servers

As mentioned in Section 2, four factors affect Exchange 2000 Server’s database implementation: capacity, availability, performance, and recovery. Although it is possible to use the capacity formula in Section 2 to determine how much storage space is necessary, this only establishes the minimum storage requirement. Capacity is different from server sizing. Sizing is about how many mailbox users are supported per server. Although one server might support enough storage space, other resources may limit the number of users on it. Differences in the SLAs of different databases can result in the requirement to put them in different storage groups. Understanding capacity combined with performance data will determine the right number of servers for a given environment.

Host Side Memory Tuning

Of all the factors that affect the supported number of users on the host side, the most important one is memory. In order to operate at peak efficiency, sufficient physical RAM must be installed to prevent page file swapping to disk. It is also critical that the STORE.EXE process does not run out of virtual memory. If this happens, memory allocation fails and the Information Store service will not respond to requests. By default, half of the 4GB virtual address space is reserved for the OS, leaving the other half available for applications. This is unnecessary, since the OS does not use nearly that much memory. When 4GT tuning is enabled, 3GB of virtual memory (instead of 2GB) is reserved for applications and 1GB is left for the OS and pagefile.
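To make the memory budget concrete, the following sketch works through the host-side sizing arithmetic using the figures this paper quotes for a standalone server: 2.8GB of virtual memory for Store.exe, about 128MB for the OS, 150MB per storage group, and 300KB per user. It assumes binary units (1GB = 1024MB), and the constant and function names are illustrative only; they are not part of any Exchange or Windows tool.

```python
import math

# Planning figures quoted in this paper; treat them as assumptions, not hard limits.
STORE_VIRTUAL_MB = 2.8 * 1024   # assumed virtual memory ceiling for Store.exe
OS_OVERHEAD_MB = 128            # memory the OS usually takes out of that budget
PER_STORAGE_GROUP_MB = 150      # memory reserved for each storage group
PER_USER_KB = 300               # memory needed by a typical user


def max_users(storage_groups: int = 1) -> int:
    """Hypothetical helper: estimate how many users fit in the remaining memory."""
    if storage_groups < 1:
        raise ValueError("at least one storage group is required")
    free_mb = (STORE_VIRTUAL_MB
               - OS_OVERHEAD_MB
               - PER_STORAGE_GROUP_MB * storage_groups)
    # Convert the remaining MB to KB and divide by the per-user footprint.
    return max(0, math.floor(free_mb * 1024 / PER_USER_KB))


if __name__ == "__main__":
    for groups in (1, 2, 4):
        print(f"{groups} storage group(s): about {max_users(groups)} users")
```

With one storage group the estimate is a little over 8,800 users, and under these assumptions each additional storage group lowers the ceiling by roughly 500 users.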
With 4GB of physical RAM installed and the use of the /3GB switch, the majority of virtual memory resides in physical RAM. Assume that the maximum virtual memory for Store.exe is 2.8GB on a standalone server, that the OS usually uses about 128MB of this memory, and that the server usually reserves 150MB for each storage group created. A typical user requires 300KB of memory. If the Exchange server has only one storage group, then the maximum number of users on a server is:

(2.8GB - 128MB - 150MB) / 300KB ≈ 8,800 users

This means that even if the backend storage is sufficient, the server still cannot support more users unless each user needs less than 300KB of memory.

As a best practice, putting all databases into one storage group is not a good plan. During an online backup, databases inside the same storage group can only be backed up serially, whereas databases from different storage groups can be backed up in parallel. When more storage groups are created, the number of supported users drops because more memory is reserved for the groups. The advantage of using multiple storage groups is that the system can recover the user database efficiently, because it has been split into manageable sizes. And because the data is divided into storage groups, the databases, as well as the transaction log file for each storage group, can be put on different Logical Units (LUNs) that offer more disk spindles for better disk performance.

Data File Tuning

Optimize Transaction Log

Transaction logs are important, as they are key to the recovery of the Exchange database. Because the transaction log is 100 percent sequential write and I/O intensive, write operations should be optimized. For a small user environment (a few hundred users) the effect of the transaction log is not significant.
For a large user environment, however, optimizing the transaction log file can bring performance gains. Here are some best practices for optimizing the transaction log:

- Use a dedicated LUN for transaction log files. Do not put multiple transaction log files on the same LUN. Even if each log is sequential, multiple sequential streams on one LUN can actually create a random I/O pattern.
- Enable the write-through mode (disable write cache) on the LUN.
- Use RAID1 or RAID10 for the log file, because the transaction log is 100 percent write. Dell used RAID1 for the transaction logs for all tests.
- Make sure the backup power source is functioning normally. This is very important. As with the FC4700, the CX600 comes with two SPSs (standby power sources) for the storage processors. In a power failure, the SPS will flush all data in the cache to the drive to ensure no data is lost. Lack of backup power can cause data loss or file corruption because the most recent changes have not yet been committed to the database. Storage Processor cache is also protected by mirroring the write cache between the two processors.
- On Exchange 5.5, the circular log is enabled by default. Although this can save log space, it is not recommended because it limits recovery from log files. Make sure the circular log is disabled to enable data recovery. Also, monitor the environment to ensure that the capacity for log files is large enough between full backups.
- Inside the event viewer, change the Application and System event properties to “Overwrite events as needed.” Otherwise, if the event log is full, it will stop the service.

Optimize Storage Group

Multiple storage groups are essential for server consolidation. When Exchange backs up data, it backs up at the storage group level, so multiple groups can create multiple parallel backup processes. A restore, on the other hand, can be done at the database level within the storage group.
Thus users can be categorized into different databases, for example by name, by SLA or by business function. Because backup windows are reduced, more users can be placed on one server that uses the common backend storage device for storage consolidation.

Optimize Database File

Unlike transaction logs, which are defined at the storage group level, database files live inside the storage group and are defined at the data-store level. The I/O pattern for the Exchange database (random reads and writes with about a 21:9 ratio) is different from that of the transaction log. Database files are critical since they are the repositories of all messages. Factors like size, RAID type, stripe size and memory usage can affect performance and backup/restore time.

Because of the random access pattern, the system does not need a lot of read cache. As long as there is enough space for pre-fetching, the rest of the memory should be assigned to write cache to improve performance. The CX600 caching algorithms use both the read cache and the write cache to service read I/O. This is very efficient, since most read I/Os tend to read the most recent data, and the write cache usually holds the most recent data. The CX600 supports a total of 8GB of onboard memory. For this test, Dell used 50MB for read cache and assigned 2048MB to write cache. (Although theoretically up to 4GB can be used for write cache, Dell was running beta core software that only allowed 2048MB to be assigned to the write cache. The official release will fix this.) Figure 4 shows the configuration window for SP cache.

[Figure 4: Memory Configuration on the CX600 Storage Processor]

Another factor to consider is the database for key users in the organization. Usually these users need fast recovery time, so they should be put into a relatively small database.
On the other hand, they should not all be put within one database, so that in the case of a database failure, not all of these users will lose their email.

The number of drives inside the RAID group not only determines the available capacity, but is also an important factor for performance. The number of users and the mailbox sizes determine the required capacity. The resulting drive count should then be compared to the spindle count derived from the I/O performance of the drives. Administrators should have an idea of the typical I/O workload for their environment (that is, IOPS for the Exchange environment), and use the following formula to determine the total I/O requirement for the backend I/O system:

Total I/O = IOPS per user x number of users

Dividing the total I/O by the number of I/Os that each spindle can handle gives the number of working spindles needed in the RAID group. In addition, different drives have different performance rates. For the CX600 in this test, the default drives were 73GB 10K rpm drives. By using the Intel IOMeter, Dell determined the typical throughput for each drive, as shown in Figure 5.

[Figure 5: 73GB 10K rpm Drive Throughput. Y axis: IOPS (0 to 250); X axis: access I/O size (2K, 4K, 8K).]

The result of running LoadSim to simulate Exchange users on a PowerEdge 6650 is close to a production environment. For the test, Dell used a 5-spindle RAID5 group with a 32KB stripe size. Dell derived this configuration from previous tests as a good starting configuration. (The results of testing with a different stripe size are shown in the next section.) As shown in Figure 6, the latency is acceptable until the load exceeds 3,000 users. Beyond that point, response time starts to grow exponentially. This is quite different from, and an improvement on, the results from the FC4700, where a 5-drive RAID5 group only supported up to 2,000 users.
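The spindle-count rule given earlier in this section (total I/O equals IOPS per user times the number of users, divided by what one spindle can sustain) can be sketched as follows. This is a minimal illustration with hypothetical inputs: the per-user IOPS, per-spindle IOPS and mailbox size in the example are placeholders, not measurements from this paper, and the sketch ignores RAID write overhead and array cache effects.

```python
import math


def total_io(iops_per_user: float, users: int) -> float:
    """Total I/O = IOPS per user x number of users."""
    return iops_per_user * users


def spindles_for_io(iops_per_user: float, users: int,
                    iops_per_spindle: float) -> int:
    """Working spindles needed to absorb the total I/O (rounded up)."""
    if iops_per_spindle <= 0:
        raise ValueError("iops_per_spindle must be positive")
    return math.ceil(total_io(iops_per_user, users) / iops_per_spindle)


def information_store_mb(users: int, mailbox_mb: int) -> int:
    """Information store size: number of users x mailbox size per user."""
    return users * mailbox_mb


if __name__ == "__main__":
    # Hypothetical profile: 1,000 users at 0.5 IOPS each, 125 IOPS per drive,
    # 50MB mailboxes. Replace these with values profiled in your environment.
    print("total I/O:", total_io(0.5, 1000))
    print("spindles for I/O:", spindles_for_io(0.5, 1000, 125))
    print("information store (MB):", information_store_mb(1000, 50))
```

The larger of the capacity-driven and I/O-driven drive counts is the one to plan for, which is why the text recommends comparing the two.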
This is due to new features on the CX600, including large cache support, a true FC2 interface on the hard drives, and more room in the drive memory for buffering.

Figure 6: CX600 Latency Performance Based on Number of Users (chart: latency in seconds, 0 to 0.18, for 500 to 3,750 users)

A LoadSim score is a response-time statistic taken at the 95th percentile. It averages the data collected from different actions (send, receive, schedule, public folder, etc.) and creates a more user-centric view. Figure 7 shows the LoadSim score for the test. The graph shows that at a load above 3,000 users, the response time for a front-end user goes above half a second and becomes unacceptable.

Figure 7: LoadSim Test Score (chart: LoadSim score, 95%, in msec, 0 to 10,000, for 500 to 3,750 users)

RAID level and Stripe size

Storage consolidation is not a good idea unless the data at the backend is protected and highly available. At the disk group level, Dell uses RAID for data protection. Both RAID5 and RAID10 provide data protection, but come with different costs and benefits. For the same number of disks, RAID5 provides more space than RAID10. RAID5 and RAID10 have the same read performance, but on write I/O, RAID10 traditionally outperforms RAID5 because RAID5 needs to read the old data and parity from the drives, recalculate the parity, then write the new data and parity to disk. This is called the “RAID5 penalty” since it uses extra I/O for the parity update. The CX600 changes this. Writes are coalesced inside the cache, so that instead of writing each piece of data to the drives separately, the storage controller gathers enough data to fill a stripe line and then writes the data and parity at the same time. Figure 8 shows the flow of this mechanism.
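
The traditional RAID5 penalty described above can be made concrete with a rough, cache-free estimate. The sketch below is illustrative only. It applies the sizing formula given earlier (total I/O = IOPS per user × number of users), splits the host I/O using the 21:9 read/write ratio quoted for the database, and charges each host write the textbook number of disk I/Os: four for a RAID5 read-modify-write (read old data, read old parity, write new data, write new parity) and two for a RAID10 mirrored write. The function names, the 0.5 IOPS per user and the 120 IOPS per spindle are hypothetical placeholders, not results from these tests, and the model deliberately ignores the CX600 write cache and coalescing, so it should be read as a worst case rather than a prediction.

```python
import math

# Textbook disk I/Os generated by one small host write (assumption, not a
# measurement from this paper): RAID5 read-modify-write = 2 reads + 2 writes,
# RAID10 = 2 mirrored writes.
WRITE_PENALTY = {"RAID5": 4, "RAID10": 2}

# Read/write mix quoted in the paper for the Exchange database (about 21:9).
READ_PARTS = 21
WRITE_PARTS = 9


def backend_disk_iops(iops_per_user, users, raid_level):
    """Estimate disk-level IOPS for a given user load, ignoring all caching."""
    host_iops = iops_per_user * users  # Total I/O = IOPS/user x number of users
    total_parts = READ_PARTS + WRITE_PARTS
    reads = host_iops * READ_PARTS / total_parts
    writes = host_iops * WRITE_PARTS / total_parts
    return reads + writes * WRITE_PENALTY[raid_level]


def working_spindles(disk_iops, iops_per_spindle):
    """Number of working spindles needed to sustain the given disk IOPS."""
    return math.ceil(disk_iops / iops_per_spindle)


if __name__ == "__main__":
    # Hypothetical inputs: 0.5 IOPS per user, 3,000 users, 120 IOPS per drive.
    for level in ("RAID5", "RAID10"):
        disk_iops = backend_disk_iops(0.5, 3000, level)
        print(level, disk_iops, working_spindles(disk_iops, 120))
```

With these placeholder inputs, the cache-free model asks for noticeably more RAID5 spindles than RAID10 spindles for the same user load, which is the penalty that the cache coalescing on the CX600 is intended to reduce.
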
Figure 8: RAID5 Enhancements on the Dell/EMC CX600

Modified RAID5 runs like RAID3, so it is very efficient for large block I/O operations. Exchange typically issues I/O in blocks of 5K to 6K, so this implementation is less efficient for Exchange: I/Os are flushed to the drives before the stripe line can be filled, so old data and parity information still need to be read in first. Dell tests compared a 3,000-user test on a 5-spindle RAID5 group with one on a 10-spindle RAID10 group (Figure 9). RAID10 still performs better, but RAID5 is more economical.

Figure 9: Response time on RAID10 and RAID5 (chart: LoadSim score, 0 to 350, for RAID5 and RAID10)

Stripe size is another factor that can be tuned after the RAID level is decided. Although Dell’s previous tests and papers show that a 32KB stripe outperforms the default 64KB stripe, this was not true on the CX600. Figure 10 shows the test results. The test used 3,000 simulated users driven by Microsoft LoadSim software on the same 5-spindle RAID5 group. When binding the LUN there is an option for element size, which defaults to 128. The default element size of 128 equals a 64KB stripe size because on Dell|EMC storage an element is 512 bytes. This means that the element size must be divided by two to get the size in KB, because it takes two 512-byte elements to equal 1KB. Test results show that keeping the default offers the best results for an Exchange environment.

Figure 10: Response Time on different Stripe sizes (chart: latency in seconds, 0 to 0.07, for 16K, 32K and 64K RAID5)

Which Server to Use

Is it necessary to use the most powerful server in an Exchange environment? The answer is no, for several reasons. First, Exchange is not a CPU-intensive application. If the front-end server is highly utilized, there will already be a bottleneck somewhere else: memory, network or backend I/O subsystem.
Second, Hyper-Threading significantly improves two-way system performance. Third, the speed increases of CPUs are significant. A less powerful system today can handle a greater workload than a “powerful” system from just a few years ago. Finally, storage I/O subsystems have greatly improved in capability. The main reason to use a SAN is for storage consolidation. Since Exchange is not a CPU hog, if the backend can handle the traffic, then the front-end server should not be a big concern. With this in mind, Dell also tested a Dell PowerEdge 2650 at a 3,000-user level versus the PowerEdge 6650. Figure 11 shows that the results are very close.

Figure 11: LoadSim Score for the PowerEdge 6650 and PowerEdge 2650 (chart: LoadSim score in msec, 280 to 370, for PE6650 and PE2650)

Dell used one more test to compare the two systems. For this test, the focus was on the CPU utilization of both systems. The results show that for the PowerEdge 2650, CPU utilization increases linearly with the number of users. This means that for more than 5,000 users, an organization should use the PowerEdge 6650 instead of the PowerEdge 2650, since it is best practice to always leave a 25 percent margin on the server CPU. This makes the PowerEdge 2650 a very attractive candidate for small to mid-sized Exchange environments, especially based on cost. For the enterprise-level environment, the PowerEdge 6650 is a better choice for both scalability and performance.

Figure 12: PowerEdge 6650 versus 2650: CPU Load and Number of Users (chart: CPU utilization in percent, 0 to 70, for 2,000 to 5,000 users)

Note that the users shown here are not running in one single RAID5 group. For the above data, four RAID5 groups were used within one storage group.

Other Performance factors

There are other factors that can affect performance.
On the I/O pipeline, any component can create a performance bottleneck. The following are recommendations from EMC and lessons learned from previous tests completed at Dell.

Figure 13: The Speed and Topology of the HBA card

First, ensure that the host bus adapters (HBAs) are taking advantage of FC2. As shown in Figure 13 (based on an Emulex controller), make sure each part of the topology runs at 2Gb speed, and that the right topology is selected. This will help optimize the communication between devices.

The second parameter to check is the load balancing policy in the PowerPath application. PowerPath has different ways to distribute I/Os across different paths. It is important to consider and choose the appropriate policy for the environment. Figure 14 shows the configuration window.

Figure 14: Load Balancing Policy with PowerPath

Finally, if possible, dedicate a LUN to each database, and spread the databases across different storage groups. This helps reduce overhead on the same RAID group, and increases backup/restore efficiency while the system is online. Also, use a dedicated LUN for the transaction log. Since the transaction log workload is pure write, use RAID1 or RAID10 instead of RAID5.

Section 5
Conclusion

The tests that Dell conducted simulating a Microsoft Exchange environment have shown that Dell’s PowerEdge 2650 and PowerEdge 6650 servers are both great candidates for an Exchange server. The PowerEdge 2650 is an excellent choice for a small to mid-sized environment, while the PowerEdge 6650 is good for mid-sized to enterprise organizations. The Dell/EMC CX600 provides enough power to drive the storage backend, with more enhanced features than its predecessors, and Exchange 2000 Server takes full advantage of the high availability and scalability of the CX600. With the new database design, more users can use a single server, which helps reduce the number of servers needed.
With the right design, configuration, and continuous monitoring, Dell servers and storage products can be combined with Microsoft Exchange 2000 to create a robust messaging environment.

Contacts

For questions about this paper or the implementation of Microsoft Exchange 2000 Server with Dell products, please contact a Dell sales representative. For comments and feedback about this paper, please send email to us_technology_showcase@Dell.com.

Solution Enablement Lab and Showcase
Enterprise Systems Group/Storage Systems Group
Dell Computer Corporation
One Dell Way
Round Rock, Texas USA 78682
+1-(800) WWW-DELL (999-3355) or +1-(512) 338-4400
us_technology_showcase@Dell.com
www.dell.com

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

Dell, PowerEdge, and PowerVault are trademarks of Dell Computer Corporation. EMC is a registered trademark of EMC Corporation. Microsoft and Windows are registered trademarks of Microsoft Corporation. Intel is a registered trademark and Xeon is a trademark of Intel Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell disclaims proprietary interest in the marks and names of others.

©Copyright 2002 Dell Computer Corporation. All rights reserved. Reproduction in any manner whatsoever without the express written permission of Dell is strictly forbidden. For more information, contact Dell. Dell cannot be responsible for errors in typography or photography.

Information in this document is subject to change without notice.

Section 7
Appendix A: Reference Documents

1. Microsoft Exchange Web Site: http://www.microsoft.com/exchange
2. Dell online: http://www.dell.com
3. Microsoft Exchange 2000 Server Administrator’s Pocket Consultant, William R. Stanek, Microsoft Press, 2000
4. Exchange 2000 Server Black Book, Marcus Goncalves, The Coriolis Group, LLC, 2002
5. Exchange 2000 Resource Kit, Microsoft Press, 2002
6. Tuning and Sizing Windows 2000, Curt Aubley, Prentice Hall PTR, 2001
7. Running Microsoft Exchange 2000 on Dell SAN 4.0, Richard Hou, Power Solution, Issue 3, 2001
8. Running an Optimized Microsoft Exchange 2000 server with Dell|EMC SAN, Richard Hou, Power Solution, May 2002
9. Exchange 2000 on EMC Symmetrix: A guide to Best Practices, EMC white paper, March 2002
10. The EMC CLARiiON Data Integrity Difference, EMC white paper, May 2001