Redbooks Paper John Foley Jr Alex R. Osuna Chapter 1. IBM System Storage N series File System Design for an NFS File Server The IBM® System Storage™ N series is specifically designed to be used as Network File System (NFS) file servers. The requirements for a file system operating in an NFS file server environment are different from those for a general purpose file system. That is, NFS access patterns are different from local access patterns, and the special-purpose nature of an appliance also affects the design. The Write Anywhere File Layout (WAFL) is the file system used in N series storage systems. WAFL is designed to work in an NFS filer and to meet four primary requirements: To provide fast NFS service To support large file systems that grow dynamically as disks are added To provide high performance while supporting Redundant Array of Independent Disks (RAID) To restart quickly, even after an unclean shutdown due to power failure or system crash © Copyright IBM Corp. 2006. All rights reserved. ibm.com/redbooks 1 The requirement for a fast NFS service is obvious, given WAFL's intended use in an NFS file server. Support for large file systems simplifies system administration by allowing all disk space to belong to a single large partition. Large file systems make RAID desirable because the probability of disk failure increases with the number of disks. Large file systems require special techniques for fast restart, because the file system consistency checks, for normal UNIX® file systems, become unacceptably slow as file systems grow. NFS and RAID both strain write performance. NFS does this because servers must store data safely before replying to NFS requests. And RAID strains write performance because of the read-modify-write sequence RAID uses to maintain parity. Strained write performance has led to using non-volatile RAM to reduce NFS response time. Strained write performance has also led to using a write-anywhere design that allows WAFL to write to disk locations that minimize RAID's write performance penalty. The write-anywhere design enables Snapshots, which in turn eliminate the requirement for time-consuming consistency checks after power loss or system failure. This paper covers the basic N series storage system architecture, WAFL file system, and RAID-4 implementation. We explain how WAFL uses Snapshots to eliminate the need for file system consistency checking after an unclean shutdown. The primary focus of this redbook is to explain the algorithms and data structures that WAFL uses to implement Snapshots. 1.1 Snapshots WAFL's primary distinguishing characteristics are Snapshots, which are read-only copies of the entire file system. WAFL creates and deletes Snapshots automatically at prescheduled times, and WAFL keeps up to 255 Snapshots online at one time to provide easy access to old versions of files. Snapshots use a copy-on-write technique to avoid duplicating disk blocks that are the same in a Snapshot as in the active file system. Only when blocks in the active file system are modified or removed, do Snapshots that contain those blocks begin to consume disk space. You can access Snapshots through the NFS to recover files that users have accidentally changed or removed. System administrators can use Snapshots to create backups safely from a running system. In addition, WAFL uses Snapshots internally so that WAFL can restart quickly, even after an unclean system shutdown. 2 IBM TotalStorage N series File System Design for an NFS File Server 1.1.1 User access to Snapshots Every directory in the file system contains a hidden Snapshot subdirectory that allows users to access the contents of Snapshots over NFS. Suppose that you accidentally remove a file named "todo" and want to recover it. The following example, Figure 1-1, shows how to list all the versions of "todo" that are saved in Snapshots: spike% ls -lut .snapshot/*/todo -rw-r--r-- 1 hitz 52880 Oct 15 00:00 .snapshot/nightly.0/todo -rw-r--r-- 1 hitz 52880 Oct 14 19:00 .snapshot/hourly.0/todo -rw-r--r-- 1 hitz 52829 Oct 14 15:00 .snapshot/hourly.1/todo ... -rw-r--r-- 1 hitz 55059 Oct 10 00:00 .snapshot/nightly.4/todo -rw-r--r-- 1 hitz 55059 Oct 9 00:00 .snapshot/nightly.5/todo Figure 1-1 Snapshot saved example With the -u option, ls shows todo's access time, which is set to the time when the Snapshot containing it was created. You can recover the most recent version of todo by copying it back into the current directory: Figure 1-2. spike% cp .snapshot/hourly.0/todo . Figure 1-2 Snapshot recovery The .snapshot directories are "hidden" in the sense that they do not show up in directory listings. If .snapshot were visible, commands like fin would report many more files than expected, and commands such as rm -rf would fail because files in Snapshots are read-only and cannot be removed. 1.1.2 Snapshot administration The N series storage system has commands to let system administrators create and delete Snapshots, but it creates and deletes most Snapshots automatically. By default, the N series storage system creates four hourly Snapshots at various times during the day, and a nightly Snapshot every night at midnight. It keeps the Chapter 1. IBM System Storage N series File System Design for an NFS File Server 3 hourly Snapshots for two days, and the nightly Snapshots for a week. It also creates a weekly Snapshot at midnight on Sunday, which it keeps for two weeks. For file systems that change quickly, this schedule may consume too much disk space, and Snapshots may need to be deleted sooner. A Snapshot is useful even if it is kept for just a few hours, because users usually notice immediately when they have removed an important file. For file systems that change slowly, it may make sense to keep Snapshots online longer. In typical environments, keeping Snapshots for one week consumes 10 to 20 percent of disk space. 1.2 WAFL implementation WAFL is a UNIX compatible file system optimized for network file access. In many ways, WAFL is similar to other UNIX file systems such as the Berkeley Fast File System (FFS) and TransArc's Episode file system. WAFL is a block-based file system that uses inodes to describe files. It uses 4 KB blocks with no fragments. Each WAFL inode contains 16 block pointers to indicate which blocks belong to the file. Unlike FFS, all the block pointers in a WAFL inode refer to blocks at the same level. Thus, inodes for files smaller than 64 KB use the 16 block pointers to point to data blocks. Inodes for files larger than 64 MB point to indirect blocks which point to actual file data. Inodes for larger files point to doubly indirect blocks. For very small files, data is stored in the inode itself in place of the block pointers. 1.2.1 Meta-data lives in files Like Episode, WAFL stores meta-data in files. WAFL's three meta-data files are the inode file, which contains the inodes for the file system, the block-map file, which identifies free blocks, and the inode-map file, which identifies free inodes. The term map is used instead of bit map, because these files use more than one bit for each entry. The block-map file's format is described in detail below. See Figure 1-3. 4 IBM TotalStorage N series File System Design for an NFS File Server Figure 1-3 The WAFL file system as a tree of blocks with the root inode at the top, and meta-data files and regular files underneath Keeping meta-data in files allows WAFL to write meta-data blocks anywhere on disk. This is the origin of the name WAFL, which stands for Write Anywhere File Layout. The write-anywhere design allows WAFL to operate efficiently with RAID by scheduling multiple writes to the same RAID stripe whenever possible to avoid the 4-to-1 write penalty that RAID incurs when it updates just one block in a stripe. Keeping meta-data in files makes it easy to increase the size of the file system on the fly. When a new disk is added, the FAServer automatically increases the sizes of the meta-data files. The system administrator can increase the number of inodes in the file system manually if the default is too small. Finally, the write-anywhere design enables the copy-on-write technique used by Snapshots. For Snapshots to work, WAFL must be able to write all new data, including meta-data, to new locations on disk, instead of overwriting the old data. If WAFL stored meta-data at fixed locations on disk, this would not be possible. 1.2.2 Tree of blocks A WAFL file system is best thought of as a tree of blocks. At the root of the tree is the root inode, as shown in Figure 1-3 on page 5. The root inode is a special inode that describes the inode file. The inode file contains the inodes that describe the rest of the files in the file system, including the block-map and inode-map files. The leaves of the tree are the data blocks of all the files. Figure 1-4 on page 6 is a more detailed version of Figure 1-1 on page 3. It shows that files are made up of individual blocks and that large files have additional layers of indirection between the inode and the actual data blocks. In order for WAFL to boot, it must be able to find the root of this tree, so the one exception to WAFL's write-anywhere rule is that the block containing the root inode must exist at a fixed location on disk where WAFL can find it. Chapter 1. IBM System Storage N series File System Design for an NFS File Server 5 Figure 1-4 A more detailed view of WAFL's tree of blocks 1.2.3 Snapshots Understanding that the WAFL file system is a tree of blocks rooted by the root inode is the key to understanding Snapshots. To create a virtual copy of this tree of blocks, WAFL simply duplicates the root inode. Figure 1-3 on page 5 shows you how this works. Figure 1-5 on page 7 (a. Before Snapshot) is a simplified diagram of the file system shown in Figure 1-2 on page 3 that leaves out internal nodes in the tree, such as inodes and indirect blocks. Figure 1-5 on page 7 (b. After Snapshot) shows how WAFL creates a new Snapshot by making a duplicate copy of the root inode. This duplicate inode becomes the root of a tree of blocks representing the Snapshot, just as the root inode represents the active file system. When the Snapshot inode is created, it points to exactly the same disk blocks as the root inode, so a new Snapshot consumes no disk space except for the Snapshot inode itself. Figure 1-5 on page 7 (c. After Block Update) shows what happens when you modify data block D. WAFL writes the new data to block D on disk and changes the active file system to point to the new block. The Snapshot still references the original block D, which is unmodified on disk. Over time, as files in the active file system are modified or deleted, the Snapshot references more blocks that are no longer used in the active file system. The rate at which files change determines how long Snapshots can be kept online before they consume an unacceptable amount of disk space. 6 IBM TotalStorage N series File System Design for an NFS File Server Figure 1-5 WAFL creating a Snapshot by duplicating the root inode that describes the inode file It is interesting to compare WAFL's Snapshots with Episode's fileset clones. Instead of duplicating the root inode, Episode creates a clone by copying the entire inode file. This generates considerable disk input/output (I/O) and consumes a lot of disk space. For instance, a 10 GB file system with one inode for every 4 KB of disk space would have 320 MB of inodes. In such a file system, creating a Snapshot by duplicating the inodes would generate 320 MB of disk I/O and consume 320 MB of disk space. Creating 10 such Snapshots would consume almost one-third of the file system's space even before any data blocks were modified. By duplicating the root inode, WAFL creates Snapshots quickly and with little disk I/O. Snapshot performance is important, because WAFL creates a Snapshot every few seconds to allow quick recovery after unclean system shutdowns. Figure 1-6 on page 7 shows the transition from Figure 1-5 before and after Snapshot in more detail. When a disk block is modified, and its contents are written to a new location, the block's parent must be modified to reflect the new location. The parent's parent or ancestor, in turn, must also be written to a new location, and so on, up to the root of the tree. Figure 1-6 Writing a block to a new location requires the pointers in the block's ancestors to be updated and written to new locations Chapter 1. IBM System Storage N series File System Design for an NFS File Server 7 WAFL would be inefficient if it wrote this many blocks for each NFS write request. Instead, WAFL gathers up many hundreds of NFS requests before scheduling a write episode. During a write episode, WAFL allocates disk space for all the dirty data in the cache and schedules the required disk I/O. As a result, commonly modified blocks, such as indirect blocks and blocks in the inode file, are written once per write episode, instead of once per NFS request. 1.2.4 File system consistency and non-volatile RAM WAFL avoids the need for file system consistency checking after an unclean shutdown by creating a special Snapshot called a consistency point every few seconds. Unlike other Snapshots, a consistency point has no name, and it is not accessible through NFS. Like all Snapshots, a consistency point is a completely self-consistent image of the entire file system. When WAFL restarts, it simply reverts to the most recent consistency point. This allows a filer to reboot in about a minute even with 20 GB or more of data in its single partition. Between consistency points, WAFL writes data to disk, but it writes only to blocks that are not in use. Therefore, the tree of blocks representing the most recent consistency point remains completely unchanged. WAFL processes hundreds or thousands of NFS requests between consistency points. The on-disk image of the file system remains the same for many seconds until WAFL writes a new consistency point. At that time, the on-disk image advances atomically to a new state that reflects the changes made by the new requests. Although this technique is unusual for a UNIX file system, it is well-known for databases. Even in databases, it is unusual to write as many operations at one time as WAFL does in its consistency points. WAFL uses non-volatile remote access memory (NVRAM) to keep a log of NFS requests that it has processed since the last consistency point. (NVRAM is special memory with batteries that allow it to store data even when system power is off.) After an unclean shutdown, WAFL replays any requests in the log to prevent them from being lost. When an N series storage system shuts down normally, it creates one last consistency point after suspending the NFS service. Therefore, on a clean shutdown, the NVRAM does not contain any unprocessed NFS requests, and it is turned off to conserve its battery life. WAFL actually divides the NVRAM into two separate logs. When one log becomes full, WAFL switches to the other log and starts writing a consistency point to store the changes from the first log safely on disk. WAFL schedules a consistency point every 10 seconds, even if the log is not full, to prevent the on-disk image of the file system from becoming outdated. Logging NFS requests to NVRAM has several advantages over the more common technique of using NVRAM to cache writes at the disk drive layer. 8 IBM TotalStorage N series File System Design for an NFS File Server Lyon and Sandberg describe the NVRAM write cache technique, which Legato's Prestoserve NFS accelerator uses1. Processing an NFS request and caching the resulting disk writes generally take much more NVRAM than simply logging the information required to replay the request. For instance, to move a file from one directory to another, the file system must update the contents and inodes of both the source and target directories. In FFS, where blocks are 8 KB each, this uses 32 KB of cache space. WAFL uses about 150 bytes to log the information needed to replay a rename operation. Rename, with its factor of 200 difference in NVRAM usage, is an extreme case. Even a simple 8 KB write caching of disk blocks will consume 8 KB for the data, 8 KB for the inode update, and for large files, another 8 KB for the indirect block. WAFL logs the 8 KB of data along with about 120 bytes of header information. With a typical mix of NFS operations, WAFL can store more than 1,000 operations per megabyte of NVRAM. Using NVRAM as a cache of unwritten disk blocks turns it into an integral part of the disk subsystem. An NVRAM failure can corrupt the file system in ways that fsck cannot detect or repair. If something goes wrong with WAFL's NVRAM, WAFL may lose a few NFS requests, but the on-disk image of the file system remains completely self-consistent. This is important, because NVRAM is reliable, but not as reliable as a RAID disk array. A final advantage of logging NFS requests is that it improves NFS response times. To reply to an NFS request, a file system without any NVRAM must update its in-memory data structures, allocate disk space for new data, and wait for all modified data to reach disk. A file system with an NVRAM write cache does all the same steps, except that it copies modified data into NVRAM instead of waiting for the data to reach disk. WAFL can reply to an NFS request more quickly because it needs only to update its in-memory data structures and log the request. It does not allocate disk space for new data or copy modified data to NVRAM. See Figure 1-7. 1 Lyon, Bob and Sandberg, Russel. "Breaking Through the NFS Performance Barrier." SunTech Journal 2(4), Autumn 1989: 21-27. Chapter 1. IBM System Storage N series File System Design for an NFS File Server 9 CLIENT Virtualization layer • Client RAM-to-NVRAM data path ⇒ Fast, predictable client response time • Permits WAFL® layer to optimize physical access (fewer writes, faster writes) ⇒ Better disk subsystem throughput 31 © 2005 IBM Corporation Figure 1-7 NVRAM assists with performance 1.2.5 Write allocation Write performance is especially important for network file servers. Ousterhout and Douglis observed that as read caches become larger for both the client and server, writes begin to dominate the I/O subsystem2. This effect is especially pronounced with NFS, which allows little client-side write caching. The result is that the disks on an NFS server may have five times as many write operations as reads. WAFL's design was motivated largely by a desire to maximize the flexibility of its write allocation policies. This flexibility takes three forms: WAFL can write any file system block, except the one containing the root inode, to any location on disk. – In FFS, such meta-data as inodes and bit maps, is kept in fixed locations on disk. This prevents FFS from optimizing writes by, for example, placing 2 Ousterhout, John and Douglis, Fred. "Beating the I/O Bottleneck: A Case for Log-Structured File Systems." ACM SIGOPS, 23 January 1989. 10 IBM TotalStorage N series File System Design for an NFS File Server both the data for a newly updated file and its inode right next to each other on disk. Since WAFL can write meta-data anywhere on disk, it can optimize writes more creatively. WAFL can write blocks to disk in any order. – FFS writes blocks to disk in a carefully determined order so that fsck(8) can restore file system consistency after an unclean shutdown. WAFL can write blocks in any order because the on-disk image of the file system changes only when WAFL writes a consistency point. The one constraint is that WAFL must write all the blocks in a new consistency point before it writes the root inode for the consistency point. WAFL can allocate disk space for many NFS operations at one time in a single write episode. – FFS allocates disk space as it processes each NFS request. WAFL gathers hundreds of NFS requests before scheduling a consistency point, at which time it allocates blocks for all requests in the consistency point at once. Deferring write allocation improves the latency of NFS operations by removing disk allocation from the processing path of the reply. It avoids wasting time allocating space for blocks that are removed before they reach disk. These features give WAFL extraordinary flexibility in its write allocation policies. The ability to schedule writes for many requests at once enables more intelligent allocation policies. And the fact that blocks can be written to any location and in any order allows a wide variety of strategies. It is easy to try new block allocation strategies without any change to WAFL's on-disk data structures. The details of WAFL's write allocation policies are outside the scope of this paper. In short, WAFL improves RAID performance by writing to multiple blocks in the same stripe. WAFL reduces seek time by writing blocks to locations that are near each other on disk. And WAFL reduces head-contention when reading large files by placing sequential blocks in a file on a single disk in the RAID array. Optimizing write allocation is difficult because these goals often conflict. 1.3 Snapshot data structures and algorithms Snapshot uses unique data structures and algorithms to preserve its write performance and self-consistency. 1.3.1 The block-map file write allocation Most file systems keep track of free blocks using a bit map with one bit per disk block. If the bit is set, then the block is in use. This technique does not work for Chapter 1. IBM System Storage N series File System Design for an NFS File Server 11 WAFL because many Snapshots can reference a block at the same time. WAFL's block-map file contains a 32-bit entry for each 4 KB disk block. Bit 0 is set if the active file system references the block, bit 1 is set if the first Snapshot references the block, and so on. A block is in use if any of the bits in its block-map entry are set. Figure 1-8 shows the life cycle of a typical block-map entry. At time t1, the block-map entry is completely clear, indicating that the block is available. At time t2, WAFL allocates the block and stores file data in it. When Snapshots are created, at times t3 and t4, WAFL copies the active file system bit into the bit indicating membership in the Snapshot. The block is deleted from the active file system at time t5. This can occur either because the file containing the block is removed, or because the contents of the block are updated and the new contents are written to a new location on disk. The block cannot be reused, however, until no Snapshot references it. In Figure 1-5 on page 7, this occurs at time t8 after both Snapshots that reference the block have been removed. Figure 1-8 Life cycle of a block-map file entry 1.3.2 Creating a Snapshot The challenge in writing a Snapshot to disk is to avoid locking out incoming NFS requests. The problem is that new NFS requests may need to change cached data that is part of the Snapshot and which must remain unchanged until it reaches disk. An easy way to create a Snapshot would be to suspend NFS processing, write the Snapshot, and then resume NFS processing. However, writing a Snapshot can take over a second, which is too long for an NFS server to stop responding. Remember that WAFL creates a consistency point Snapshot at least every 10 seconds, so performance is critical. 12 IBM TotalStorage N series File System Design for an NFS File Server WAFL's technique for keeping Snapshot data self-consistent is to mark all the dirty data in the cache as IN_SNAPSHOT. The rule during Snapshot creation is that data marked IN_SNAPSHOT must not be modified, and data not marked IN_SNAPSHOT must not be flushed to disk. NFS requests can read all file system data, and they can modify data that is not marked as IN_SNAPSHOT, but processing for requests that need to modify IN_SNAPSHOT data must be deferred. To avoid locking out NFS requests, WAFL must flush IN_SNAPSHOT data as quickly as possible. To do this, WAFL performs the following steps: 1. Allocates disk space for all files with IN_SNAPSHOT blocks. – WAFL caches inode data in two places: in a special cache of in-core inodes and in disk buffers belonging to the inode file. When it finishes write allocating a file, WAFL copies the newly updated inode information from the inode cache into the appropriate inode file disk buffer and clears the IN_SNAPSHOT bit on the in-core inode. When this step is complete, no inodes for regular files are marked IN_SNAPSHOT, and most NFS operations can continue without blocking. Fortunately, this step can be done quickly because it requires no disk I/O. 2. Updates the block-map file. – For each block-map entry, WAFL copies the bit for the active file system to the bit for the new Snapshot. 3. Writes all IN_SNAPSHOT disk buffers in cache to their newly-allocated locations on disk. – As soon as a particular buffer is flushed, WAFL restarts any NFS requests waiting to modify it. 4. Duplicates the root inode to create an inode that represents the new Snapshot, and turns off the root inode's IN_SNAPSHOT status. The new Snapshot inode must not reach disk until after all other blocks in the Snapshot have been written. If this rule is not followed, an unexpected system shutdown might leave the Snapshot in an inconsistent state. When the new Snapshot inode has been written, no more IN_SNAPSHOT data exists in cache, and any NFS requests that are still suspended can be processed. Under normal loads, WAFL performs these four steps in less than a second. Step 1 can be done generally in a few hundredths of a second, and after WAFL completes it, few NFS operations need to be delayed. Deleting a Snapshot is trivial. WAFL simply zeroes the root inode representing the Snapshot and clears the part that represents the Snapshot in each block-map entry. Chapter 1. IBM System Storage N series File System Design for an NFS File Server 13 1.4 Performance It is difficult to compare WAFL's performance to other file systems directly. Since WAFL runs only in an NFS appliance, it can be benchmarked against other file systems only in the context of NFS. The best NFS benchmark available today is the SPEC System File Server (SFS) benchmark, also known as LADDIS. The name LADDIS stands for the group of companies of Legato, Auspex, Digital, Data General, Interphase, and Sun™ that originally developed the benchmark. LADDIS tests NFS performance by measuring a server's response time at various throughput levels. Servers typically handle requests quickly at low load levels; as the load increases, so does the response time. Figure 1-9 compares the N series storage system LADDIS performance with that of other well-known NFS servers. Figure 1-9 SPECnfs_A93 operations per second, or for clusters, SPECnfs_A93 cluster operations per second Using a system-level benchmark, such as LADDIS, to compare file system performance can be misleading. You may argue, for instance, that the graph in Figure 1-9 underestimates WAFL's performance because the N series storage system cluster has only eight file systems, while the other servers all have dozens. On a per-file-system basis, WAFL outperforms the other file systems by almost eight to one. Furthermore, the N series storage system uses RAID, which typically degrades file system performance substantially for the small request sizes characteristic of NFS, where the other servers do not use RAID. You may also argue that the benchmark overestimates WAFL's performance. The entire N series storage system is designed specifically for NFS, and much 14 IBM TotalStorage N series File System Design for an NFS File Server of its performance comes from NFS-specific tuning of the whole system, not just to WAFL. Given WAFL's special purpose nature, there is probably no fair way to compare its performance to general purpose file systems, but it clearly satisfies the design goal of performing well with NFS and RAID. 1.5 Conclusion WAFL was developed, and has become stable, surprisingly quickly for a new file system. It has been in use as a production file system for over a year, and we know of no case where it has lost user data. We attribute this stability in part to WAFL's use of consistency points. Processing file system requests is simple because WAFL updates only in-memory data structures and the NVRAM log. Consistency points eliminate ordering constraints for disk writes, which are a significant source of bugs in most file systems. The code that writes consistency points is concentrated in a single file, it interacts little with the rest of WAFL, and it executes relatively infrequently. More importantly, we believe that it is much easier to develop high quality, high performance system software for an appliance than for a general purpose operating system. Compared to a general purpose file system, WAFL handles a very regular and simple set of requests. A general purpose file system receives requests from thousands of different applications with a wide variety of different access patterns, and new applications are added frequently. By contrast, WAFL receives requests only from the NFS client code of other systems. There are few NFS client implementations, and new implementations are rare. Of course, applications are the ultimate source of NFS requests, but the NFS client code converts file system requests into a regular pattern of network requests, and it filters out error cases before they reach the server. The small number of operations that WAFL supports makes it possible to define and test the entire range of inputs that it is expected to handle. These advantages apply to any appliance, not just to file server appliances. A network appliance only makes sense for protocols that are well-defined and widely used, but for such protocols, an appliance can provide important advantages over a general purpose computer. 1.6 Bibliography Lyon, Bob and Sandberg, Russel. "Breaking Through the NFS Performance Barrier." SunTech Journal 2(4), Autumn 1989: 21-27. Chapter 1. IBM System Storage N series File System Design for an NFS File Server 15 Ousterhout, John and Douglis, Fred. "Beating the I/O Bottleneck: A Case for Log-Structured File Systems." ACM SIGOPS, 23 January 1989. 16 IBM TotalStorage N series File System Design for an NFS File Server Notices This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces. © Copyright International Business Machines Corporation 2006. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. 17 This document created or updated on February 8, 2006. Send us your comments in one of the following ways: Use the online Contact us review redbook form found at: ibm.com/redbooks Send your comments in an email to: redbook@us.ibm.com ® Trademarks The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: Eserver® Redbooks (logo) ™ IBM® Redbooks™ System Storage™ TotalStorage® The following terms are trademarks of other companies: Snapshot, WAFL, FAServer, and NetApp logo are trademarks or registered trademarks of NetApp Corporation or its subsidiaries in the United States, other countries, or both. Sun, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. Other company, product, or service names may be trademarks or service marks of others. 18 IBM TotalStorage N series File System Design for an NFS File Server