Redpaper

IBM SAN Volume Controller 8A4 Hardware: Performance and Scalability with the IBM System Storage DS3400

Ed Moffatt

In this IBM® Redpaper™ publication, we discuss the performance and scalability of a storage area network (SAN) environment that is based around IBM SAN Volume Controller (SVC) 8A4 hardware and IBM System Storage™ DS3400 storage controllers. We discuss the performance, in terms of I/O per second (IOPS) and MB per second (MBPS), that can be achieved from a single I/O group of SVC 8A4 nodes. By focusing on cache hit I/Os, we demonstrate that this entry-level hardware can achieve significant performance benefits in a SAN, thereby offering a suitable alternative to the 8G4 for mid-sized businesses.

In addition, we discuss the performance of IBM DS3400 storage controllers virtualized by an 8A4 cluster. We consider the best RAID configuration for logical unit numbers (LUNs) on the back end and look at the performance that can be achieved on disk reads and writes (cache misses). We examine the relative performance of a SAN with one, two, and three DS3400 controllers managed by the SVC, and thereby demonstrate the effectiveness of the SVC 8A4 and DS3400 as a scalable solution.

Background

8A4 is the name for the SVC Entry Edition model that was announced in October 2008. The SVC software that it runs is the same code that runs on all SVC nodes. The hardware has been modified to make it a more viable and cost-effective solution for mid-sized businesses. It uses a single-socket variant of the Intel® Xeon platform that all other SVC hardware uses. Each node has 8 GB of cache and a dual-core 3.0 GHz Xeon CPU. The SAN interface remains the same as for other node types, with four 4 Gbps Fibre Channel ports. Because the software running on an 8A4 is the same as on other node types, the dual-core CPU achieves the same benefits of binding, lock elimination, and performance enhancements, thereby boosting the I/O throughput of a SAN and reducing response times.
© Copyright IBM Corp. 2010. All rights reserved.

In comparison with the 8G4 node, the most notable difference of the 8A4 is a reduced internal memory bandwidth. The 8A4 uses a single-socket Intel Xeon® platform, where the 8G4 has a dual-socket platform, meaning that the 8A4 can provide about 60% of the bandwidth of an 8G4, at roughly 60% of the cost. On the basis of these hardware differences, we expect 8A4 nodes to be capable of less impressive performance than 8G4s or CF8s (the other two current SVC hardware types), but still to offer considerable benefit to mid-sized businesses. By benefit, we mean that installing an 8A4 cluster in your SAN can lead to considerable performance increases (as well as the management and maintenance benefits that we do not measure quantitatively in these tests). In addition, by installing such a cluster in your SAN, you gain the ability to scale out the storage that is managed underneath the SVC, while observing improved performance when you add storage controllers.

Hardware and test details

All results presented in this paper were obtained through tests by using the same two-node cluster of 8A4 SVC nodes, running SVC 5.1.0.0 code. The I/O was driven by an IBM System p® host with thirteen IBM POWER5™ processors, 26 GB of physical memory, and eight 4 Gbps FC ports. All of the DS3400s used were the dual-controller model, with three EXP3000 expansion drawers attached, for a total of 48 15,000 RPM SAS drives each. The drives were a mixture of 73.4 GB and 146.8 GB hard disk drives (HDDs).

The I/Os were run at the cluster with three different block sizes: 512 B, 4 KB, and 64 KB. Queue depths were increased until we saw response times go over 30 ms. Anything with a longer response was judged inappropriate for practical customer use cases.
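The relationship between the two metrics and the validity threshold can be sketched as follows. This is our own illustrative arithmetic, not code from the test harness; the tables later in this paper appear to follow the convention MBPS = IOPS × block size in KB / 1000.

```python
RESPONSE_LIMIT_MS = 30.0  # validity threshold used in this paper

def mbps(iops: float, block_kb: float) -> float:
    """Throughput in MBPS, assuming the tables' apparent convention:
    MBPS = IOPS * block size in KB / 1000."""
    return iops * block_kb / 1000

def valid(response_ms: float) -> bool:
    """A result only counts if its response time is within the threshold."""
    return response_ms <= RESPONSE_LIMIT_MS

# Cache hit read, 4 KB blocks, from Table 1: 363,454 IOPS -> 1,453.816 MBPS
print(mbps(363454, 4))   # 1453.816
print(valid(27.065))     # True: within the 30 ms threshold
print(valid(31.2))       # False: would be judged impractical
```

This makes explicit why the large-block tests are reported in MBPS and the small-block tests in IOPS: at 512 B, even hundreds of thousands of IOPS move relatively little data.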
In such a case, any increases in IOPS or MBPS that we saw might not be true performance improvements for a data center, because we might wait too long for the I/O to complete. I/O can be either sequential or random. For cache hit tests, sequential I/O can be used. However, we used random I/O for cache misses, because the SVC starts to prefetch data if it detects that sequential operations are performed.

Terminology

The following terminology is useful in understanding how physical SAS drives in back-end storage relate to the devices to which the host performs I/O and to the role of the SVC in virtualizing and managing these devices. We begin at the lowest level (individual HDDs) and move up to the highest level (devices used by the host).

Physical disks

A DS3400 enclosure contains one or two controllers and several SAS drives. The number of disks available to a controller can be increased by attaching expansion drawers to the controller.

RAID arrays

The DS3400s present their disks to the SVC as RAID arrays (RAID10 or RAID5 in this testing).

Note: Because we are not interested in a configuration that does not provide both striping and HDD failure protection in a data center, RAID10 and RAID5 are the most viable RAID types to use.

The controller uses the relevant RAID algorithm to arrange extents from one or more disks into logical units (LUs). Each controller in our testing presented the SVC with 11 LUs.

MDisks

The SVC refers to the LUs that it has been presented with as managed disks (MDisks). The MDisk is distinct from the LU because it is the SVC reference to the LU rather than the array that exists on the storage controller.

MDisk groups

Administrators can use the SVC to arrange MDisks into MDisk groups. These groups are pools of storage that can span some or all of the MDisks that are presented by one or more controllers in the back end.
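The layering just described can be sketched as a small data model. This is purely illustrative (the class and field names are our own, not an SVC interface): a controller arranges drives into RAID arrays and presents them as LUs; the SVC references each LU as an MDisk and pools MDisks into MDisk groups.

```python
from dataclasses import dataclass, field

@dataclass
class LogicalUnit:
    """An LU presented by a DS3400 controller (a RAID array of SAS drives)."""
    name: str
    raid_type: str   # "RAID10" or "RAID5" in this testing
    drives: int      # physical SAS drives backing the array

@dataclass
class MDisk:
    """The SVC's reference to an LU, distinct from the array itself."""
    lu: LogicalUnit

@dataclass
class MDiskGroup:
    """A pool of storage spanning one or more MDisks."""
    name: str
    mdisks: list = field(default_factory=list)

# Each controller in the tests presented the SVC with 11 LUs.
lus = [LogicalUnit(f"lu{i}", "RAID10", drives=4) for i in range(11)]
pool = MDiskGroup("group0", [MDisk(lu) for lu in lus])
print(len(pool.mdisks))   # 11
```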
For the one and two controller tests, we used four MDisk groups. When the third controller was added, we increased the number of MDisk groups to six to ensure a wider striping of extents across the back end.

VDisks

Virtualized disks (VDisks) are created from the MDisk groups (pools of capacity). The SVC presents the VDisks to the host.

HDisks

We used AIX® hosts. The VDisks that the SVC presents through Fibre Channel to the host are referenced by AIX as HDisks. AIX manages volume groups, logical volumes, and multipathing to these devices, but an explanation of these concepts is not within the scope of this paper.

RAID5 scalability comparisons

In this section, we consider the performance of reads and writes, first for RAID5 and then for RAID10. The I/O was randomized to avoid the SVC prefetching data and boosting performance that way. In doing so, we were able to gain a clearer idea of how the back end was performing. Performance is measured by two metrics, which are considered in separate charts: we can aim to optimize either the number of IOPS or the throughput in MBPS. The following charts show the best results that were achieved. For detailed results and a comparison of performance with different block sizes, see “Tables of the results” on page 9.

Figure 1 on page 4 shows the RAID5 disk read IOPS. This chart shows that the throughput is reaching its optimal point for one and two controllers. There is a point on the curve where the second differential reaches its maximum. After this point, we incur more of a cost in terms of increased response time for every throughput gain that we manage to achieve from the SAN. For one controller, this point happens at around 18,000 or 19,000 IOPS. For two controllers, we just reach it at the highest tested queue depths, at around 38,000 IOPS.
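The "optimal point" described above can be located numerically. The following sketch, using made-up data rather than the paper's measurements, finds the point on a response-time versus throughput curve where the discrete second difference is largest, after which each IOPS gain costs disproportionately more response time.

```python
def knee_index(throughput, response):
    """Index of the point with the largest discrete second difference
    of response time with respect to throughput (the curve's knee)."""
    best_i, best_d2 = None, float("-inf")
    for i in range(1, len(response) - 1):
        # slopes of the two segments adjacent to point i
        left = (response[i] - response[i - 1]) / (throughput[i] - throughput[i - 1])
        right = (response[i + 1] - response[i]) / (throughput[i + 1] - throughput[i])
        d2 = right - left   # change in slope approximates the second differential
        if d2 > best_d2:
            best_i, best_d2 = i, d2
    return best_i

# Hypothetical single-controller curve: IOPS vs. response time (ms)
iops = [5000, 10000, 15000, 18000, 19000, 19500]
resp = [2.0, 3.0, 5.0, 9.0, 18.0, 29.0]
i = knee_index(iops, resp)
print(iops[i])   # 19000
```

Pushing queue depths past this knee, as the charts show, buys little extra throughput for a steep rise in response time.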
At our highest queue depth, we had not yet managed to reach this point for three controllers, but a continuation of the graph’s trend suggests another linear increase with the addition of the third controller. Therefore, the chart in Figure 1 on page 4 shows evidence of the linear performance scaling that we expect. It also indicates that the addition of more controllers will allow for much higher I/O queue depths before we start seeing any noticeable impact to our response times.

Figure 1 RAID5 disk read IOPS

Figure 2 shows RAID5 disk read MBPS. Larger block sizes are less tolerant of over-queuing, but this chart still shows similarly shaped curves to the IOPS chart. We can achieve about 560 MBPS from a single controller, 1000 MBPS from two controllers, and 1500 MBPS from three controllers, demonstrating neat and linear performance scalability on disk reads when a data center must obtain high-volume throughput. Fewer results are in the data set for three controllers simply because of time constraints. We went straight to the optimal queue depths rather than testing all of the lower ones to demonstrate a smooth curve, because this had already been shown for one and two controllers.

Figure 2 RAID5 disk read MBPS

Figure 3 shows RAID5 disk write IOPS. As we might expect, throughput for disk writes is significantly lower than on reads. We also observe that fewer I/Os can be queued up before response times become large. The chart clearly shows a progressive and even performance increase as we add controllers. The best result with one controller was 3761 IOPS with a response time of 8.506 ms. Because I/O was random in these tests, we cannot be sure of even, wide usage of our SAS drives in each result set.
We might expect somewhere around 11,000 IOPS from the three controllers to demonstrate totally linear scaling. We observed 10,876 IOPS, which is achievable with a response time of 1.469 ms. We do not see such ideal curve shapes (“hockey-stick-shaped” curves) in our write results, because of the nature of the testing that was done. Our host simply drives the I/O as hard as it possibly can. Therefore, even with the minimum queue depth tested, it was already driving the I/O hard enough to have passed the point of the curve where performance was rapidly increasing. If we wanted to show curve shapes similar to our read results, we would need to force the host to drive fewer writes.

Figure 3 RAID5 disk write IOPS

Figure 4 shows the RAID5 disk write MBPS. Emphasis on MBPS performance is achieved by performing the I/O with a larger block size. In this chart, we see scaling similar to what we observed with IOPS. The scaling indicates that adding DS3400 controllers underneath an 8A4 cluster offers performance scalability on writes, regardless of the metric that we use. Notice that there are fewer results for one controller because we achieved the expected result quickly. Therefore, there seemed no point in ramping up our queues because we were already close to the 30 ms response time threshold that we set for validity of the results. The reasons for not seeing “hockey-stick-shaped” curves are the same as for write IOPS.

Figure 4 RAID5 disk write MBPS

RAID10 scalability comparisons

For RAID10 disk reads, we have good results for two controllers. Yet we must remember that the I/Os are random and that performance depends on load distribution across the SAS drives. However, we can see that performance is scaling linearly as we add the additional DS3400s. Figure 5 on page 7 shows the RAID10 disk read IOPS.
The data set looks slightly unusual for two controllers (less spread out than the others). This result is because the test was run when we already had a shrewd idea of the optimal queue depth, and therefore, it was not necessary to test such a wide range of values. For reads, RAID10 and RAID5 are a lot closer to each other than on writes. With one controller, we see that RAID5 marginally outperforms RAID10. Again, we attribute this to the randomness of the I/O. However, as we add to the back end, RAID10 overtakes RAID5 as the true performance differences are scaled up accordingly, meaning that we are about 15% up on the RAID5 number of reads when testing with three DS3400s.

Figure 5 RAID10 disk read IOPS

Figure 6 shows the RAID10 disk read MBPS. For MBPS read performance, RAID5 again comes close, but RAID10 still slightly outperforms it. This chart clearly shows both the linear performance scaling that we expect and the optimal point on each curve before response time increases start to outweigh performance gains. For all three data sets, the optimal point is achieved with a response of around 15 ms and queues of roughly 10 to 20 I/Os.

Figure 6 RAID10 disk read MBPS

Figure 7 shows the RAID10 disk write IOPS. Configuring RAID10 arrays on our DS3400 controllers for this test allows for almost 175% of the throughput of a RAID5-configured back end. Therefore, RAID10 is clearly a better option for high IOPS throughput, a conclusion that is consistent with what we saw on read I/Os. We see that our best results for this test are with low queue depths (between one and eight I/Os queued). It is also clear from the chart that performance increases linearly as controllers are added, as we expected.
As with the RAID5 results, we might need to limit the writes that our host is driving to see the steep performance increases in the first part of the curves and give us the ideal shapes that we expect.

Figure 7 RAID10 disk write IOPS

Figure 8 shows RAID10 disk write MBPS. For MBPS measurements on disk writes, we again see that RAID10 significantly outperforms RAID5. With three DS3400s, we see an 89% performance increase.

Figure 8 RAID10 disk write MBPS

Tables of the results

Our results suggest that it is best to have your storage configured in RAID10 arrays to push for maximum performance. However, there might be situations in which the I/O block size in a data center cannot be easily controlled. Therefore, choosing RAID5 might be beneficial in some situations. The tables in the following sections show the results that we collected for all three of our tested block sizes (512 B, 4 KB, and 64 KB) and demonstrate that there are some cases when RAID5 benefits performance.

Cache hit results

Regardless of the back-end storage, a SAN can take advantage of the caching capabilities of the SVC to boost performance. The results in Table 1 on page 10 show the performance that can be seen on cache hit reads and writes with our hardware setup as explained earlier in this paper. It is important to note that we are limited in our results by how much I/O the host is capable of driving. For example, a two-node 8A4 cluster might be able to report significantly higher values than this if we had the processing power to drive more workload to it.

Table 1 shows the cache hit results.
Table 1 Cache hit results

Workload                              Response (ms)       IOPS        MBPS
Read hit, 512 B blocks, sequential            0.644    385,263     192.632
Read hit, 4 KB blocks, sequential             0.687    363,454   1,453.816
Read hit, 64 KB blocks, sequential            3.078     41,545   2,658.885
Write hit, 512 B blocks, sequential           5.585    109,786      54.393
Write hit, 4 KB blocks, sequential            3.425    176,869     707.478
Write hit, 64 KB blocks, sequential          27.065     21,279   1,361.868

Results with one DS3400 controller

Unlike the results shown in Table 1, in this section we show the results of doing cache miss I/Os. As you can see in the results, we start to be limited by the speed that we can drive from the spindles in our back-end storage. Nevertheless, the wide striping of the SVC for miss-type workloads such as this means that we can activate more spindles and boost the performance of the controller. We ran tests with the disks configured by the DS3400 controller into RAID10 and RAID5 arrays to compare the performance of the two configurations. In general, it seems that RAID10 will provide more performance benefits, but you might want to choose RAID5 for specific workloads. See Table 2 (for RAID10) and Table 3 (for RAID5) for a comparison.

Table 2 shows the RAID10 results.

Table 2 RAID10

Workload                              Response (ms)       IOPS        MBPS
Read miss, 512 B blocks, random              28.406     20,275      10.138
Read miss, 4 KB blocks, random               28.187     20,433      81.375
Read miss, 64 KB blocks, random              27.876      9,183     587.733
Write miss, 512 B blocks, random             21.252      6,022       3.011
Write miss, 4 KB blocks, random              20.517      6,237      24.952
Write miss, 64 KB blocks, random             26.335      2,430     155.523

Table 3 shows the RAID5 results.
Table 3 RAID5

Workload                              Response (ms)       IOPS        MBPS
Read miss, 512 B blocks, random              25.770     22,350      11.175
Read miss, 4 KB blocks, random               26.089     22,076      88.307
Read miss, 64 KB blocks, random              29.315      8,732     558.861
Write miss, 512 B blocks, random              8.506      3,761       1.880
Write miss, 4 KB blocks, random               9.251      3,458.3    13.833
Write miss, 64 KB blocks, random             25.904      1,235      79.053

Results with two DS3400 controllers

Table 4 shows the RAID10 results.

Table 4 RAID10

Workload                              Response (ms)       IOPS        MBPS
Read miss, 512 B blocks, random              14.466     39,812      19.906
Read miss, 4 KB blocks, random               17.168     42,865     171.4632
Read miss, 64 KB blocks, random              23.430     16,388   1,048.829
Write miss, 512 B blocks, random             20.508     12,481.9     6.241
Write miss, 4 KB blocks, random              20.209     12,665.8    50.663
Write miss, 64 KB blocks, random             28.156      4,545.8   290.928

Table 5 shows the RAID5 results.

Table 5 RAID5

Workload                              Response (ms)       IOPS        MBPS
Read miss, 512 B blocks, random              15.411     37,371      18.686
Read miss, 4 KB blocks, random               15.377     37,454     149.817
Read miss, 64 KB blocks, random              24.630     15,589     997.725
Write miss, 512 B blocks, random             17.556      7,289       3.645
Write miss, 4 KB blocks, random              19.367      6,608      26.434
Write miss, 64 KB blocks, random             27.070      2,364.1   151.301

Results with three DS3400 controllers

Table 6 shows the RAID10 results.

Table 6 RAID10

Workload                              Response (ms)       IOPS        MBPS
Read miss, 512 B blocks, random              11.288     51,019.7    25.510
Read miss, 4 KB blocks, random               16.227     63,095.2   252.3809
Read miss, 64 KB blocks, random              24.516     23,492.8 1,503.54
Write miss, 512 B blocks, random             27.551     18,576.2     9.288
Write miss, 4 KB blocks, random               6.772     18,894.6    75.578
Write miss, 64 KB blocks, random              4.734      6,756.3   432.4021

Table 7 shows the RAID5 results.
Table 7 RAID5

Workload                              Response (ms)       IOPS        MBPS
Read miss, 512 B blocks, random              11.090     54,812.6    27.406
Read miss, 4 KB blocks, random               11.049     55,016     220.0641
Read miss, 64 KB blocks, random              24.664     23,352.6 1,494.569
Write miss, 512 B blocks, random              1.469     10,876       5.438
Write miss, 4 KB blocks, random               1.642      9,731.5    38.926
Write miss, 64 KB blocks, random              4.454      3,590.8   229.809

Conclusions

The cache hit results show the performance levels that our test configuration is capable of. In summary, we can drive 385K read IOPS and 177K write IOPS (values rounded to three significant figures). If we focus on larger block sizes, and on driving a large amount of data transfer rather than completed operations, we see performance of 2,659 MBPS on reads and 1,362 MBPS on writes (values rounded to the nearest full MB). As mentioned in “Tables of the results” on page 9, the limiting factor is not the SVC cluster but the System p host that we used. We reached the limits of how much throughput the host could drive before we reached the limit of how much the SVC could handle. With more processing power, the same two-node 8A4 cluster might be able to comfortably handle 550K IOPS and over 3,000 MBPS on reads.

As mentioned earlier, the 8A4 hardware provides roughly 60% of the performance of an 8G4, which is offset in terms of business value by the fact that it costs roughly 60% as much as the 8G4. In fact, based on recent performance data for the 8G4, the lowest percentage of 8G4 performance that an 8A4 delivers is on read hits, where the 8A4 hits over 65% of the capability of the 8G4. This result shows that an 8A4 cluster can deliver impressive performance despite the lower price tag and the reduced internal memory bandwidth. In addition, we showed the scalability of performance that is available by adding DS3400 controllers (and extra HDDs) to a SAN underneath the 8A4 cluster.
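The linear scaling claimed above can be checked directly from the tables. The following is our own quick arithmetic, using the RAID10 random-read 4 KB IOPS values from Tables 2, 4, and 6: scaling efficiency is the observed IOPS divided by n controllers times the single-controller IOPS.

```python
# RAID10 random read, 4 KB blocks, IOPS from Tables 2, 4, and 6
SINGLE = 20433.0                       # one controller (Table 2)
observed = {2: 42865.0, 3: 63095.2}    # two and three controllers (Tables 4, 6)

for n, iops in observed.items():
    # Efficiency of 100% would be perfectly linear scaling.
    efficiency = iops / (n * SINGLE)
    print(f"{n} controllers: {efficiency:.0%} of ideal linear scaling")
```

Both values come out at or slightly above 100%, consistent with the linear scaling shown in the charts; the small excesses are plausibly within the variation that the random I/O distribution across SAS drives introduces.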
In “RAID5 scalability comparisons” on page 3 and “RAID10 scalability comparisons” on page 6, our charts demonstrate that performance scales linearly as we add storage controllers and disks to the back end. When looking at these charts, note that any results with a response time of more than 30 ms are artificial, because such responses might not be acceptable in a data center.

The team who wrote this IBM Redpaper

This paper was produced at IBM Hursley Park, in Hursley, UK.

Ed Moffatt is a software tester who has five years of computing and programming experience, including a year of working on software for the IBM accounting department in North Harbour. He joined the IBM Hursley Labs in September 2008, after completing a Bachelor of Science degree in mathematics at Bath University. Before joining the SVC performance test team, he worked on regression and CVT testing and is continuing his SVC career with a role in Level 3 support.

Jon Tate, project manager for this paper, is a Project Manager for IBM System Storage SAN, DCN, and Virtualization Solutions at the International Technical Support Organization (ITSO). Before joining the ITSO in 1999, he worked in the IBM Technical Support Center, providing Level 2 and 3 support for IBM storage products. Jon has 24 years of experience in storage software and management, services, and support, and is both an IBM Certified IT Specialist and an IBM SAN Certified Specialist. He is also the UK Chairman of the Storage Networking Industry Association.

Thanks to Barry Whyte of IBM Hursley for his contributions to this project.

Notices

This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries.
Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites.
The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

© Copyright International Business Machines Corporation 2010. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

This document REDP-4631-00 was created or updated on January 8, 2010.
Send us your comments in one of the following ways:

- Use the online Contact us review Redbooks form found at: ibm.com/redbooks
- Send your comments in an email to: redbooks@us.ibm.com
- Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. HYTD, Mail Station P099, 2455 South Road, Poughkeepsie, NY 12601-5400 U.S.A.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: AIX®, IBM®, POWER5™, Redpaper™, Redbooks (logo)®, System p®, System Storage™

The following terms are trademarks of other companies: Intel Xeon, Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other company, product, or service names may be trademarks or service marks of others.