The increasing demand for information has created the need for storage system performance improvements and new performance levels for communications between servers and storage.
Two main storage paradigms enable connections between a server and a storage system: network-attached storage (NAS) and storage area networks (SANs).
NAS is data storage that is connected to a traditional Ethernet network. Standard protocols make NAS accessible to clients on those networks. The most common NAS solution is the Network File System (NFS). NFS simplifies data transport and reduces the management complexity of large storage installations.
SANs rely on a set of storage-specific transport protocols: iSCSI and FCP. Because iSCSI uses TCP/IP as its transport, it can pass information over existing IP-based Ethernet host connections. FCP, by contrast, uses Fibre Channel to connect a server and storage.
Currently, Ethernet bandwidth is eclipsing that of Fibre Channel, and Ethernet is becoming more affordable than Fibre Channel. One question that has arisen is how NFS and iSCSI perform in comparison to FCP.
This publication provides a performance comparison and tuning recommendations for three transport protocols (FCP, iSCSI, and NFS) when IBM® DB2® V9 and AIX® 5L™ 5.3 TL04 are run on a 2-way IBM System p5™ 520 server using an IBM System Storage™ N5500. An online transaction processing (OLTP) workload was used to study the performance. The results show that the throughput achieved for NFS and iSCSI was within 7% of each other, with FCP holding an advantage over the other two protocols.
ibm.com/redbooks © Copyright IBM Corp. 2007. All rights reserved.
The data server used for this study was an IBM System p5 520 server running AIX 5L 5.3 TL04 with 8 GB of physical RAM and two 1.5 GHz processors running DB2 9 Enterprise Server Edition. For data storage, we used an IBM System Storage N5500 with two nodes running Data ONTAP® 7G. Two 1 Gb Ethernet network adapters established the NFS and iSCSI connections, and two 2 Gb FC-AL adapters were used for the FCP connection between the data server and storage system; these speeds reflected the bandwidth available at the time.
Figure 1 shows the high-level system architecture of the data server and the IBM System Storage N series.
Figure 1 System configuration
Our test environment consisted of a DB2 9 Enterprise Server Edition database running an OLTP workload. The database was approximately 20 GB. The database table spaces were created with the NO FILE SYSTEM CACHING clause, which is available only in the IBM Enhanced Journaled File System (JFS2).
By default, the operating system caches file data that is read from and written to disk. For OLTP workloads, caching at the file system level and in the DB2 buffer pools causes performance degradation because extra CPU cycles are required to perform double caching. DB2 uses the Concurrent I/O (CIO) feature to disable file system caching when either the CREATE TABLESPACE or ALTER TABLESPACE statement is used with the NO FILE SYSTEM CACHING clause. For more information about CIO, see "Improve database performance on file system containers in IBM DB2 UDB using Concurrent I/O on AIX" at: http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0408lee/
2 IBM DB2 9 on AIX 5L with NFS, iSCSI, and FCP using IBM System Storage N series
Achieving efficient transaction response times depends on the database management system (DBMS). Fast access to data and the transaction logs is also crucial. IBM System Storage N series offers efficient connectivity to the data server by supporting various transport protocols such as FCP, iSCSI, and NFS. In this section, we outline the storage layout on the IBM System Storage N5500 and make some recommendations for optimizing the storage system for OLTP workloads.
Before you can use IBM System Storage N series to store tablespace containers, you must first configure it and create the appropriate storage objects, such as aggregates, flexible volumes, and LUNs. The steps required to configure an IBM System Storage N series for a DB2 environment are described in detail in the technical report called "DB2 9 for UNIX®: Integrating with a NetApp® Storage System", found at: www.netapp.com/library/tr/3531.pdf
In our environment, we created two aggregates, one on each node of the clustered storage system. We used the following command to create the aggregates:
    aggr create aggr0 -r 20 40@36g -t RAID_DP
Here, aggr0 is the name of the aggregate, which was created over 40 disks, each 36 GB in size, with dual parity and a RAID group size of 20. Dual parity (RAID-DP™) allows greater data protection in the event of double disk failures. (For more information, see the IBM Redpaper called "IBM System Storage N Series Implementation of RAID Double Parity for Data Protection", REDP4169.) Figure 2 illustrates the aggregate that was created on the IBM System Storage N5500.
Figure 2 Aggregate on the IBM System Storage N5500
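The disk accounting behind this aggregate can be sketched in a few lines of Python. This is an illustrative calculation only, not output from the storage system: it assumes that RAID-DP dedicates two parity disks to each RAID group, and it ignores disk right-sizing and file system overhead, so the usable capacity reported by Data ONTAP will be lower. The function name raid_dp_layout is hypothetical.

```python
import math

def raid_dp_layout(total_disks: int, raid_group_size: int, disk_gb: int) -> dict:
    """Estimate RAID group count, parity disks, data disks, and raw data capacity."""
    groups = math.ceil(total_disks / raid_group_size)
    parity_disks = 2 * groups  # assumption: RAID-DP uses two parity disks per RAID group
    data_disks = total_disks - parity_disks
    return {
        "raid_groups": groups,
        "parity_disks": parity_disks,
        "data_disks": data_disks,
        "raw_data_gb": data_disks * disk_gb,  # before right-sizing and file system overhead
    }

# Values taken from the aggregate described above: 40 disks of 36 GB, RAID group size 20.
layout = raid_dp_layout(total_disks=40, raid_group_size=20, disk_gb=36)
print(layout)
```

Under these assumptions, the 40 disks form two RAID groups, with four disks holding parity and 36 disks holding data.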
The Data ONTAP operating system on IBM System Storage N series supports a virtual storage layer that is known as a flexible volume or FlexVol™. A FlexVol volume is created in an aggregate and provides greater performance and manageability than traditional volumes.
A FlexVol can also grow and shrink as needed and spans all the disk spindles in a single aggregate. To create a flexible volume, issue:
    vol create [VolName] [AggrName] [VolSize]
where:
[VolName] identifies the name of the volume to be created.
[AggrName] identifies the name of the aggregate that contains the volume.
[VolSize] identifies the size of the volume in KB, MB, or GB.
For example, to create a volume named dbstorage_data1 in an aggregate named aggr0 and assign it a size of 2 GB, you issue the command in Example 1.
Example 1 vol create
    vol create dbstorage_data1 aggr0 2G
For better performance and manageability, we recommend that transaction logs and tablespace containers be housed in separate volumes. For this study, we created two volumes (dbstorage_data1 and dbstorage_data2) for the data containers and one volume (dbstorage_log) for the transaction logs. The database directory was created in a volume named dbstorage (see Figure 1). IBM JFS2 file systems were created in the data volumes.
To obtain rapid communication between the data server and the storage system, we enabled jumbo frames on both. If the connection topology involves a switch, the switch must also have support for jumbo frames enabled.
At the lowest "physical" layer, electrical signals are exchanged in the network, representing bits and bytes. Just above that layer, the network exchanges frames. A standard Ethernet frame carries up to 1500 bytes of user data, plus its headers and trailer. By contrast, a jumbo frame carries up to 9000 bytes of user data, so the headers and trailer account for a much smaller share of each frame and data-transfer rates can be higher.
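The effect of frame size on overhead can be illustrated with a short Python sketch. The 38-byte per-frame overhead used here is an assumption that is not stated in this paper: it counts the preamble, Ethernet header, frame check sequence, and inter-frame gap of an untagged Ethernet frame, and it ignores the IP and TCP headers carried inside the payload. The function names are hypothetical.

```python
# Assumed per-frame overhead for untagged Ethernet:
# 8 bytes preamble + 14 bytes header + 4 bytes FCS + 12 bytes inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 8 + 14 + 4 + 12

def overhead_fraction(payload_bytes: int) -> float:
    """Share of the bytes on the wire that is framing overhead, not payload."""
    return ETHERNET_OVERHEAD_BYTES / (payload_bytes + ETHERNET_OVERHEAD_BYTES)

def frames_needed(total_bytes: int, payload_bytes: int) -> int:
    """Number of full-size frames needed to carry total_bytes (ceiling division)."""
    return -(-total_bytes // payload_bytes)

for payload in (1500, 9000):
    print(payload, f"{overhead_fraction(payload):.2%}", frames_needed(1_000_000, payload))
```

With these assumptions, framing overhead falls from roughly 2.5% of the bytes on the wire to roughly 0.4%, and the same transfer needs about one sixth as many frames, which also means fewer frames for the host to process.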
On IBM System Storage N series, jumbo frames can be enabled by issuing the command in Example 2.
Example 2 ifconfig
    ifconfig e10 mtusize 9000 up
To make the setting persistent, include the ifconfig statement in the /etc/rc file on the storage system. The ifconfig line in the /etc/rc file on the IBM System Storage N5500 used in these tests was:
    ifconfig e5 [hostname] -e5 netmask 255.255.255.0 mtusize 9000 flowcontrol full
There are certain options that can be set to achieve good performance from IBM System Storage N series when you are running database workloads. We used the following options:
Automatic snapshots: Disabled. When a volume is created, it has a default snapshot schedule set for it. Normally, a database is backed up based on a user-defined schedule; therefore, there is no need for volumes that are used by databases to use the default snapshot schedule. We used the following command to disable automatic snapshots for each of the volumes that were used by the database:
    vol options [VolName] nosnap on
Read-ahead feature: Disabled. By default, the minra option is off, which causes the storage system to perform speculative file read-ahead when needed. Because OLTP workloads do not benefit from prefetching the data, we used the following command to disable the read-ahead feature for our N5500 volumes:
    vol options [VolName] minra on
Access time updates: Disabled. If the no_atime_update option is on, it prevents the update of the access time on an inode when a file is read (each file has an inode that stores information about that file). For volumes used by databases, the DBMS manages the correct access time for inodes; therefore, we used the following command to enable this option for each volume:
    vol options [VolName] no_atime_update on
Snapshot™ directory display: Disabled. By default, the snapshot directory is visible to users at the client mount points. If the nosnapdir option is on, the display of the snapshot directory at client mount points is disabled. We used the following command to disable the snapshot directory display for each volume:
    vol options [VolName] nosnapdir on
Set nvfail option on: If this option is on, the IBM System Storage N series performs additional status checking when the system is started to verify that the NVRAM of the storage system is in a valid state. This option is useful for databases because, if any problems with NVRAM are found, the database instances shut down and an error message is sent to the console to alert database administrators. We used the following command to set the nvfail option for each volume:
    vol options [VolName] nvfail on
Set the snapshot reserve to zero: By default, when a new volume is created, Data ONTAP reserves 20% of its space for snapshots, and that space cannot be used for data. Because we wanted to make better use of the storage space, we set the snapshot reserve to 0 by executing the following command:
    snap reserve -V [VolName] 0
For all of the commands given here, [VolName] is the name of the volume.
To change all of the options mentioned above for a volume named dbstorage_data2, you would use the set of commands in Example 3.
Example 3 Changing options
    vol options dbstorage_data2 nosnap on
    vol options dbstorage_data2 minra on
    vol options dbstorage_data2 no_atime_update on
    vol options dbstorage_data2 nosnapdir on
    vol options dbstorage_data2 nvfail on
    snap reserve -V dbstorage_data2 0
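Because the same six commands are repeated for every database volume, the command list can be generated rather than typed. The following Python sketch only prints the command text shown in Example 3 for each volume named in this paper; it does not connect to the storage system, and the helper name volume_tuning_commands is hypothetical.

```python
# Volume options that this study switched on for each database volume.
VOLUME_OPTIONS = ["nosnap", "minra", "no_atime_update", "nosnapdir", "nvfail"]

def volume_tuning_commands(volume: str) -> list[str]:
    """Build the Data ONTAP command lines from Example 3 for one volume."""
    commands = [f"vol options {volume} {option} on" for option in VOLUME_OPTIONS]
    commands.append(f"snap reserve -V {volume} 0")  # release the snapshot reserve
    return commands

# Volume names used in this study.
for volume in ("dbstorage_data1", "dbstorage_data2", "dbstorage_log"):
    print("\n".join(volume_tuning_commands(volume)))
```

The printed lines can then be reviewed and entered on the storage system console.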
For other configuration details and suggestions for the physical design of the storage system, see "DB2 9 for UNIX: Integrating with a NetApp Storage System" at: www.netapp.com/library/tr/3531.pdf
There are several operating system parameters that can be tuned to gain better performance from the data server. In our setup, the DB2 9 database runs on AIX 5L 5.3 TL04. As outlined before, IBM JFS2 file systems were created in the data volumes, so that we could use higher performing file system options such as CIO. In addition to this, we controlled the buffer-cache paging activity on the data server and performed AIO tuning.
The behavior of the AIX 5L file buffer cache manager can have a significant effect on performance because excessive paging activity can decrease performance substantially. It can also cause an I/O bottleneck, resulting in lower overall system throughput. On AIX 5L, tuning buffer-cache paging activity must be done carefully and infrequently.
Setting the lru_file_repage parameter to 0 instructs the system to steal only file memory pages when it determines what type of memory to steal. This means that the file system pages are reused in preference to the database pages, resulting in better database performance. This behavior can be set with the command in Example 4.
Example 4 lru_file_repage parameter
    vmo -o lru_file_repage=0
For more information about this parameter, see the IBM System p™ and AIX Information
Center at: http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp
An application that uses synchronous I/O cannot continue until the I/O operation that it is waiting for is complete. In contrast, asynchronous I/O (AIO) operations run in the background and do not block user applications. AIO improves performance because I/O operations and application processing can run simultaneously. The actual performance, however, depends on how many server processes that handle I/O requests are running.
For this study, we used the commands in Example 5 to update the maxservers and maxreqs AIO parameters.
Example 5 AIO parameters
    aioo -o maxservers=20
    aioo -o maxreqs=32768
For more details about AIO performance benefits, see the IBM Redbooks® publication called "Database Performance Tuning on AIX", SG24-5511.
We recommend using a dedicated Gb Ethernet network to connect the storage system and the data server to obtain high bandwidth. The Gb Ethernet driver can also play an important role in network performance. In our environment, we used driver V5.3.0.40 and one 1 Gb Ethernet network adapter per storage system for the NFS and iSCSI protocols.
We enabled jumbo frames for these tests. For an Ethernet adapter, you can enable jumbo frames on the data server with the command in Example 6.
Example 6 Enabling jumbo frames
    chdev -l '[AdapterName]' -a jumbo_frames='yes' -a mtu='9000'
where [AdapterName] identifies the name of the Ethernet adapter on which jumbo frames are enabled.
For example, to enable jumbo frames and set the MTU size to 9000 for an Ethernet adapter named en02, you would execute the command in Example 7.
Example 7 Enable jumbo frames
    chdev -l 'en02' -a jumbo_frames='yes' -a mtu='9000'
There are several AIX 5L protocol configurations that can be tuned to gain better performance. In our setup, we tuned the AIX 5L configurations for NFS, iSCSI, and FCP.
For the NFS protocol, you must tune at both the data server and the storage system levels.
We used the mount options in Example 8 for the data file systems.
Example 8 Mount options
    options=rw,bg,hard,nointr,proto=tcp,vers=3,rsize=32768,wsize=32768,timeo=600
Additionally, we enabled the nfs_rfc1323 parameter. This parameter allows TCP window sizes that are greater than 64 KB, which helps minimize the wait for TCP acknowledgments. The nfs_rfc1323 parameter can be set on the data server by issuing the command in Example 9.
Example 9 nfs_rfc1323 parameter
    nfso -o nfs_rfc1323=1
For more information, see "AIX 5L NFS Client Performance Improvements for Databases on
NAS" at: www-03.ibm.com/servers/aix/whitepapers/aix_nfs.pdf
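The reason a larger TCP window helps can be shown with a rough bound: a sender can have at most one window of unacknowledged data in flight per round trip. The Python sketch below applies that bound. The 1 ms round-trip time and the 256 KB scaled window are hypothetical values chosen for illustration; neither was measured in this study.

```python
def window_limited_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000

MAX_UNSCALED_WINDOW = 65_535  # largest TCP window without RFC 1323 window scaling
ASSUMED_RTT = 0.001           # hypothetical 1 ms round trip, not a measured value

unscaled = window_limited_mbps(MAX_UNSCALED_WINDOW, ASSUMED_RTT)
scaled = window_limited_mbps(256 * 1024, ASSUMED_RTT)  # hypothetical 256 KB window
print(f"{unscaled:.0f} Mb/s without window scaling, {scaled:.0f} Mb/s with a 256 KB window")
```

Under that assumed round trip, a 64 KB window would cap one connection at roughly 524 Mb/s, below the rate of a 1 Gb link, while a larger window moves the cap above the link rate. Shorter round trips weaken the effect.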
Example 10 shows the option settings used for the N series.
Example 10 N series options
    nfs.tcp.enable on
    nfs.tcp.recvwindowsize 65536
    nfs.tcp.xfersize 65536
The first option controls whether the storage system accepts NFS connections over TCP. The next two options set the receive window size and the NFS transfer size to their maximum values. These options can be set using the options command, as in Example 11.
Example 11 Accepting TCP connections option
    options nfs.tcp.enable on
The AIX 5L iSCSI driver has a default queue_depth of 1, which is too low for database workloads. In this setup, we changed the queue_depth to 256 for each hdisk available with the command in Example 12 (for example, for hdisk1).
Example 12 queue_depth setting
    chdev -l hdisk1 -a queue_depth=256
We also changed the maximum transfer size to 512 KB for each hdisk available with the command in Example 13.
Example 13 Maximum transfer size setting
    chdev -l hdisk1 -a max_transfer=0x80000
In addition, we used the network parameters in Example 14.
Example 14 Network parameters
    no -p -o tcp_recvspace=262144
    no -p -o tcp_sendspace=262144
    no -p -o tcp_nagle_limit=0
    no -p -o ipsrcrouterecv=1
    no -p -o sb_max=1310720
For more information about these parameters, visit the IBM System p and AIX 5L Information
Center at: http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp
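The byte values in Examples 13 and 14 are easier to review when converted to KB. The short Python sketch below does that conversion and checks that both TCP socket buffer sizes fit within sb_max, on the understanding that AIX treats sb_max as the upper limit for socket buffer sizes. It is a sanity check on the numbers in this paper, not a tuning tool.

```python
KB = 1024

# Values copied from Example 13 (max_transfer) and Example 14 (network parameters).
settings = {
    "max_transfer": 0x80000,
    "tcp_recvspace": 262144,
    "tcp_sendspace": 262144,
    "sb_max": 1310720,
}

in_kb = {name: value // KB for name, value in settings.items()}
print(in_kb)

# Both TCP buffer sizes should not exceed the sb_max ceiling.
buffers_fit = all(
    settings[name] <= settings["sb_max"] for name in ("tcp_recvspace", "tcp_sendspace")
)
print("socket buffers fit within sb_max:", buffers_fit)
```

The conversion confirms that 0x80000 is the 512 KB transfer size named in the text, that the TCP buffers are 256 KB each, and that sb_max is 1280 KB.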
For the FCP protocol, we changed the queue depth for each disk to 256 using the command in Example 12. We also changed num_cmd_elems, the maximum number of commands that can be queued to the FC-AL adapter, to 2048 with the command in Example 15.
Example 15 Maximum number of commands to queue to the FC-AL adapter setting
    chdev -l fcs0 -a num_cmd_elems=2048
We determined our results from the average throughput (transactions per minute) measured over a 20-minute steady state that followed a 15-minute warm-up while running an OLTP workload. The total number of users was 250, which ensured that most of the CPU in the data server was used. Figure 3 shows the relative throughput that was attained with the FCP, iSCSI, and NFS protocols. The throughput achieved with NFS and iSCSI was within 20% of the FCP throughput, which was the highest. NFS yielded 7 percentage points more relative throughput than iSCSI (87% versus 80% of the FCP result).
Figure 3 Relative throughput for each protocol (FCP 100%, NFS 87%, iSCSI 80%)
Figure 4 shows the CPU usage during the workload runs for each of the protocols.
Figure 4 CPU usage during OLTP workload runs for each protocol (stacked System and User bars; data labels as extracted: FCP 56% and 32%, NFS 42% and 49%, iSCSI 44% and 47%)
A higher percentage of time is spent in the kernel for NFS and iSCSI. The path length of a transaction is longer in these cases, when compared to FCP, because more kernel components are involved in processing the transaction. An explanation of the extra kernel components that are involved is beyond the scope of this paper.
We also derived the scaled relative throughput (Figure 5) by dividing the actual throughput that was attained for each of the protocols by the percentage of user plus kernel CPU that each protocol used. We then used the number that was obtained for FCP as the 100% point. The results echo those in Figure 3: FCP delivers the best throughput per CPU percentage, so it retains its advantage over NFS and iSCSI when CPU utilization is taken into account.
Figure 5 Scaled relative throughput for each protocol (FCP 100%, NFS 84%, iSCSI 77%)
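The scaling step can be reproduced with a small Python sketch. The relative throughput values come from Figure 3. The total CPU figures are an assumption: they pair the Figure 4 data labels as 56% + 32% for FCP, 42% + 49% for NFS, and 44% + 47% for iSCSI, which is the pairing that is consistent with the Figure 5 values.

```python
# Relative throughput from Figure 3 (FCP = 100%).
relative_throughput = {"FCP": 100, "NFS": 87, "iSCSI": 80}

# Assumed user-plus-kernel CPU totals, paired from the Figure 4 data labels.
total_cpu = {"FCP": 56 + 32, "NFS": 42 + 49, "iSCSI": 44 + 47}

# Throughput per CPU percentage point, then rescaled so that FCP is the 100% point.
per_cpu = {p: relative_throughput[p] / total_cpu[p] for p in relative_throughput}
scaled = {p: round(100 * per_cpu[p] / per_cpu["FCP"]) for p in per_cpu}
print(scaled)
```

With that pairing, the calculation yields 100% for FCP, 84% for NFS, and 77% for iSCSI.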
This paper demonstrates that with IBM DB2 9 and AIX 5L 5.3 TL04, using an IBM N5500 as system storage, the throughput attained with NFS is comparable to that of the iSCSI protocol. FCP yielded higher throughput than the other two protocols because its kernel CPU usage was lower. However, an advantage of NFS over a SAN is that clients can access the storage through traditional networks. FCP connectivity is currently a slightly more expensive solution, and it does not offer this added accessibility.
This IBM Redpaper was originally produced by a team of specialists from around the world working at IBM and Network Appliance™ as a performance study. It was converted to an IBM
Redpaper by Alex Osuna at the International Technical Support Organization (ITSO) in
Tucson, Arizona.
Alex Osuna is a project leader at the ITSO, Tucson Center. He writes extensively and teaches IBM classes on all areas of storage. He has over 28 years in the IT industry, working for the United States Air Force, IBM, and Tivoli® in computer maintenance, field engineering, service planning, Washington Systems Center, business and product planning, advanced technical support, and systems engineering, focusing on storage hardware and software. He has more than 10 certifications from IBM, Microsoft®, and Red Hat.
Nailah Bissoon holds a Master of Science degree in Computing Science from Queen's
University, Canada. She is currently a member of the DB2 solutions development and benchmarking team, which provides DB2 solutions for business needs and helps publish leading DB2 benchmark performance results. Her tasks also include promoting new and existing DB2 features. She has been with IBM since 2004.
Sunil Kamath is the technical manager for DB2 performance and solutions development. He has been working on DB2 performance for more than seven years and has successfully led many world-record TPC-C and SAP® benchmarks. In addition, he has also designed and implemented many high performance and leading client solutions. His interest and responsibilities also include exploring and exploiting key hardware (including virtualization), operating system, and compiler technologies that help improve data server performance.
Jawahar Lal is an alliance engineer for IBM DB2 at Network Appliance, Inc., in Research Triangle Park, North Carolina. He holds a Master of Science degree in Computer Science from the University of Rajasthan, Jaipur, India, and is working on an MBA degree from the University of North Carolina. He has more than 11 years of experience in the areas of database programming, modeling, designing, administration, performance tuning, and storage integration. He writes extensively about storage and database integration. He is an IBM DB2 Certified DBA and holds four other certifications from IBM and Oracle®.
Augie Mena is a senior technical staff member in the AIX/System p performance area. One of his areas of expertise is AIX kernel performance analysis and tuning, specifically for file systems. His current responsibilities range from driving performance improvements into future System p products to resolving client-reported performance issues for existing products. Augie has been an IBM employee since 1985.
Roger Sanders is a senior manager, IBM Alliance Engineering, at Network Appliance. He has been designing and developing databases and database applications for more than 20 years and has been working with DB2 Universal Database™ and its predecessors since it was first introduced on the IBM PC. He has written articles for several magazines, authored
DB2 tutorials for the IBM developerWorks® Web site, presented at several international and regional DB2 user group conferences, taught classes on DB2 fundamentals and database administration, and is the author of nine books about DB2 and one book about ODBC. Roger is also a member of the DB2 Certification Exam development team and the author of a regular column (Distributed DBA) in DB2 Magazine.
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
© Copyright International Business Machines Corporation 2007. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by
GSA ADP Schedule Contract with IBM Corp.
Send us your comments in one of the following ways:
Use the online "Contact us" review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to: redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400 U.S.A.
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
Redbooks (logo)®
developerWorks®
AIX 5L™
AIX®
DB2 Universal Database™
DB2®
IBM®
Redbooks®
System p™
System p5™
System Storage™
Tivoli®
The following terms are trademarks of other companies:
SAP, and SAP logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries.
Oracle, JD Edwards, PeopleSoft, Siebel, and TopLink are registered trademarks of Oracle Corporation and/or its affiliates.
Snapshot, RAID-DP, FlexVol, Network Appliance, Data ONTAP, NetApp, and the Network Appliance logo are trademarks or registered trademarks of Network Appliance, Inc. in the U.S. and other countries.
Microsoft, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.