Cluster File System Performance
Ellard Roush, Availability Products Group
Sun BluePrints™ OnLine—February 2003
http://www.sun.com/blueprints
Sun Microsystems, Inc.
4150 Network Circle
Santa Clara, CA 95054 U.S.A.
650 960-1300
Part No. 817-1593-10
Revision 1.0, 2/10/03
Edition: February 2003
Copyright 2003 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, California 95054 U.S.A. All rights reserved.
Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in the product that is described in this document. In
particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://
www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries.
This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation.
No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors,
if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in
the United States and other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, Sun BluePrints, Sun Cluster, Sun Sigma, and Solaris are trademarks or registered trademarks of Sun
Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered
trademarks of SPARC International, Inc. in the US and other countries. Products bearing SPARC trademarks are based upon an architecture
developed by Sun Microsystems, Inc.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges
the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun
holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN
LOOK GUIs and otherwise comply with Sun’s written license agreements.
U.S. Government Rights—Commercial use. Government users are subject to the Sun Microsystems, Inc. standard license agreement and
applicable provisions of the FAR and its supplements.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT,
ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Cluster File System Performance
The Sun™ Cluster product provides a highly available cluster platform for hosting
applications across a broad spectrum, from relational databases to web services.
The file system is an important feature for most of these applications, both in terms
of functionality and performance. An examination of these applications readily
reveals that their file system needs are equally diverse. This examination also
reveals certain usage patterns, which we use to categorize the following file system
workloads:
■ Database
■ General file system
■ Network file system (NFS) [1]
■ Availability
■ Backup
The requirements in each of these areas provide the criteria for selecting a
comprehensive set of performance benchmarks.
This Sun BluePrints™ OnLine article explains how we selected, measured, and
analyzed performance benchmarks. In addition, we validate performance
enhancement proposals with these benchmarks. While the work is still under way,
we have already validated a number of performance enhancements whose gains are
backed by actual measurements. This article discusses these changes and their
impact on the Sun Cluster file system.
Note – The preceding tasks map well to a Sigma project, and this investigation is
being undertaken as a registered Sun™ Sigma project. The rigor of numerical
measurements and the emphasis on satisfying real customer needs are especially relevant.
This article contains the following sections:
■ “Proxy File System” describes the proxy file system (PxFS).
■ “Benchmarks” presents the file system performance benchmarks.
■ “Performance Improvements” identifies ways to improve performance.
Proxy File System
Sun Cluster [4] [8] software creates a cluster from a collection of machines running
the Solaris™ Operating Environment (Solaris OE) [7]. Sun Cluster software builds on
the Solaris OE to produce a cluster operating system. This article focuses on the
portion of Sun Cluster software that provides cluster-wide file system support.
The Solaris OE has a component that supports multiple specific file systems
concurrently. A specific file system communicates with the Solaris OE using the vfs/
vnode interface [5]. The PxFS [6] interposes between the Solaris OE and the specific
file system. PxFS connects to the Solaris OE at the vfs/vnode interface, while
simultaneously connecting to the specific file system using the vfs/vnode interface.
PxFS is not itself a file system; it is a cluster infrastructure software component
that extends a single-machine file system across the cluster.
PxFS introduces two externally visible features. PxFS provides location-transparent
access to file system capabilities across the cluster and shields the customer from
many hardware failures. Sun Cluster software enables the customer to configure the
cluster to survive a specified number of machine failures.
Therefore, the system automatically recovers from a hardware failure, and work in
progress is not lost, as long as the number of failures does not exceed the configured
capability. The PxFS design calls for preserving the semantics of the underlying file
system.
The following high-level overview of the PxFS architecture provides the foundation
for explaining PxFS operations, and sets the context for proposed changes to
improve performance.
Sun Cluster software divides PxFS into two layers, client and server. A PxFS client
resides on each cluster node and supports global file system operations for
applications on that node. PxFS presents a proxy file system (vfs) and proxy files
(vnodes) to the Solaris OE, which are used as if they constituted the actual file
system. Applications accessing the file system execute standard Solaris OE system
calls, and the Solaris OE in turn issues vfs/vnode operations on the proxy vfs and
vnodes. The PxFS client layer examines its own caches and satisfies the request from
cached information, when possible. When the PxFS client layer cannot satisfy a
request, it forwards the request to the PxFS server layer.
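
The following user-space C fragment is a rough sketch of that client-side flow:
satisfy the request from a local cache when possible, and otherwise invoke the
server layer. All names (px_proxy_read, px_invoke_server, and so on) are invented
for illustration; the real PxFS client operates on kernel vnodes and uses the Sun
Cluster invocation framework rather than these placeholder calls.

/*
 * Hypothetical user-space sketch of the PxFS client read path:
 * check the local data cache first, and only invoke the server
 * layer on a miss.  Names and structures are illustrative only.
 */
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8
#define BLOCK_SIZE  16

struct cache_block {
    int  valid;
    long offset;
    char data[BLOCK_SIZE];
};

static struct cache_block data_cache[CACHE_SLOTS];

/* Stand-in for a location-transparent invocation to the PxFS server. */
static void px_invoke_server(long offset, char *buf)
{
    snprintf(buf, BLOCK_SIZE, "block@%ld", offset);
}

/* Return cached data when possible; otherwise forward to the server. */
static void px_proxy_read(long offset, char *buf)
{
    struct cache_block *cb = &data_cache[(offset / BLOCK_SIZE) % CACHE_SLOTS];

    if (cb->valid && cb->offset == offset) {
        memcpy(buf, cb->data, BLOCK_SIZE);      /* cache hit: no round trip */
        return;
    }
    px_invoke_server(offset, buf);              /* cache miss: remote call  */
    cb->valid = 1;
    cb->offset = offset;
    memcpy(cb->data, buf, BLOCK_SIZE);
}

int main(void)
{
    char buf[BLOCK_SIZE] = "";

    px_proxy_read(0, buf);   /* miss: forwarded to the server layer */
    printf("%s\n", buf);
    px_proxy_read(0, buf);   /* hit: satisfied from the client cache */
    printf("%s\n", buf);
    return 0;
}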
The Sun Cluster software infrastructure provides a location-transparent invocation
mechanism. Location transparency applies both when the client and server are
co-located on the same node and when they are on different nodes. If a recoverable
communication error occurs, such as an error on one of the communication paths
between nodes, the Sun Cluster infrastructure automatically and transparently
retries the client's request to the server.
The PxFS primary server manages coherency issues between different client caches,
preventing users from receiving stale information. In conjunction with the high
availability framework portion of the Sun Cluster infrastructure, the PxFS server
provides the ability for operations to complete despite the occurrence of node
failures. The PxFS primary server sends critical information to one or more PxFS
secondary servers on other nodes, in what is referred to as a checkpoint operation. If
a node failure occurs, a PxFS secondary server becomes the new PxFS primary server
and completes the operation using the information from checkpoints. Except for a
small increase in latency for the affected operation, the system hides the failure.
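
The following single-process C sketch illustrates the checkpoint-and-replay idea in
miniature: record an intent with the secondary before applying an operation, retire
the record when the operation completes, and have the promoted secondary redo
anything still pending. The structures and function names are invented; real PxFS
checkpoints travel over the cluster interconnect through the high-availability
framework.

/*
 * Simplified, single-process sketch of the primary/secondary checkpoint
 * idea.  Everything here is illustrative only.
 */
#include <stdio.h>

enum { CKPT_MAX = 16 };

struct ckpt_record {
    int pending;            /* 1 = operation not yet known to be complete */
    int op_id;
    long arg;
};

static struct ckpt_record secondary_log[CKPT_MAX];
static int ckpt_count;

static int checkpoint(int op_id, long arg)
{
    secondary_log[ckpt_count] = (struct ckpt_record){1, op_id, arg};
    return ckpt_count++;
}

static void checkpoint_done(int slot)
{
    secondary_log[slot].pending = 0;
}

static void do_operation(int op_id, long arg)
{
    printf("apply op %d(%ld)\n", op_id, arg);
}

/* On failover, the new primary replays any operation still pending. */
static void promote_and_replay(void)
{
    for (int i = 0; i < ckpt_count; i++)
        if (secondary_log[i].pending)
            do_operation(secondary_log[i].op_id, secondary_log[i].arg);
}

int main(void)
{
    int slot = checkpoint(42, 7);   /* checkpoint before acting          */
    do_operation(42, 7);
    checkpoint_done(slot);          /* normally retires the record       */

    checkpoint(43, 8);              /* primary "fails" before this op... */
    promote_and_replay();           /* ...so the new primary redoes it   */
    return 0;
}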
The PxFS primary server issues vfs/vnode operations to the underlying single-machine file system, such as the UNIX® file system (UFS) or the VERITAS file system
(VxFS). The PxFS server returns the response to the PxFS client, which caches
information before returning the response to the Solaris OE for delivery to the
application accessing the file system.
FIGURE 1 displays the overall PxFS architecture using the example of the
components supporting an individual file. Each active file has three components in
each of the client and server layers: a file, file attributes, and file data. Relevant
details are addressed when specific performance-improvement techniques are
discussed later in this article.
FIGURE 1    PxFS Architecture Overview
[Diagram: The Solaris OE passes application requests as VOP calls to the PxFS client (file proxy, attribute cache, data cache). The client exchanges invocation requests and replies with the PxFS server primary (file server, attribute provider, data provider), which issues VOP requests to the underlying single-machine file system and sends checkpoints to a PxFS server secondary (file server, attribute provider, data provider) on another node.]
Benchmarks
A good set of performance benchmarks is essential for performance work. We have
chosen a set that covers important workloads for PxFS and that gives us insight into
the product's strengths and weaknesses. Performance work often involves trade-offs,
and this broad-spectrum test suite exposes trade-offs whose consequences reach into
other important areas. The performance benchmarks are also being used to enhance
the software engineering development and maintenance processes.
These performance benchmarks provide not only a picture of PxFS performance, but
also transform performance work into a quantitative discipline, whose rigor can be
applied to both software development and maintenance.
Database
Databases use file system features in patterns that differ considerably from those of
other file system clients. We selected the online transaction processing benchmark
TPC-C and the decision support system benchmark TPC-H. These industry-standard
benchmarks [11] have an appropriate mix of operations for their areas.
When running a database on UFS, directio reads and writes are very important
operations. We execute Iozone [3] to measure these critical operations. The
directio performance of PxFS on UFS versus UFS appears in FIGURE 2 for reads
and FIGURE 3 for writes. These results show that PxFS on UFS roughly matches
the performance of UFS in this area, demonstrating that Sun Cluster software can
provide good support for databases.
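
For reference, the fragment below shows the Solaris directio(3C) call that requests
unbuffered I/O on a UFS file, which is the mode these Iozone runs exercise (UFS can
also be mounted with the forcedirectio option). This is a minimal sketch; the file
name is arbitrary and error handling is reduced to the essentials.

/*
 * Minimal Solaris sketch: open a database file and request direct
 * (unbuffered) I/O with directio(3C).  The path is arbitrary.
 */
#include <sys/types.h>
#include <sys/fcntl.h>      /* DIRECTIO_ON, directio() */
#include <fcntl.h>          /* open(), O_RDWR */
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int fd = open("/global/db/datafile", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (directio(fd, DIRECTIO_ON) != 0) {   /* bypass the page cache */
        perror("directio");
        close(fd);
        return 1;
    }
    /* ... aligned read(2)/write(2) calls go here ... */
    close(fd);
    return 0;
}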
FIGURE 2    directio Read Performance
[Chart: throughput comparison (UFS directio / PxFS directio, percent, roughly 60 to 160) versus file size from 64 KB to 131072 KB, for record sizes of 4 KB through 4096 KB.]
FIGURE 3    directio Write Performance
[Chart: throughput comparison (UFS directio / PxFS directio, percent, roughly 60 to 180) versus file size from 64 KB to 131072 KB, for record sizes of 4 KB through 4096 KB.]
General File System and NFS
Sun Cluster software runs existing Solaris OE applications unchanged. Therefore,
PxFS supports the Solaris OE file system semantics for which an important standard
is the Portable Operating System Interface (POSIX) [2]. In addition to hosting
applications using file systems, Sun Cluster software also can export a global file
system for NFS access.
Both types of workloads use a wide range of file system features. We selected two
benchmarks that execute a mix of operations. This demonstrates how PxFS performs
when supporting a reasonable mix of operations. We execute SPECsfs [10], which is
the industry-standard performance benchmark for NFS. We also execute PostMark
[9], which is a widely used file system performance benchmark that runs entirely
within the cluster. Read/write performance is so important that we also execute
Iozone to measure read/write operations. The Bigdir benchmark measures file system
performance when working with a directory that contains very large numbers of files.
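
The sketch below captures the spirit of a Bigdir-style measurement, with invented
names and paths: populate a single directory with many files, then time per-file
lookups. Pointing it at a global file system mount exercises the PxFS lookup path.

/*
 * Sketch in the spirit of the Bigdir benchmark (paths and names are
 * made up): populate one directory with many files, then time how long
 * per-file lookups (stat) take.
 */
#include <sys/stat.h>
#include <sys/time.h>       /* gethrtime() on Solaris */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define NFILES 10000

int main(int argc, char **argv)
{
    const char *dir = (argc > 1) ? argv[1] : "/global/test/bigdir";
    char path[1024];
    struct stat st;

    (void) mkdir(dir, 0755);

    for (int i = 0; i < NFILES; i++) {          /* populate the directory */
        (void) snprintf(path, sizeof (path), "%s/f%05d", dir, i);
        int fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd >= 0)
            close(fd);
    }

    hrtime_t start = gethrtime();
    for (int i = 0; i < NFILES; i++) {          /* time the lookups */
        (void) snprintf(path, sizeof (path), "%s/f%05d", dir, i);
        (void) stat(path, &st);
    }
    hrtime_t end = gethrtime();

    printf("%d lookups: %.1f us each\n", NFILES,
        (end - start) / 1000.0 / NFILES);
    return 0;
}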
Availability
High availability is an important Sun Cluster feature. All high-availability clusters
must perform some processing to recover from failures. We developed tests to
measure recovery time after node failure, switchover time to move a file system
primary between nodes, and the time to transfer state to a node joining the cluster.
Backup
Most sites perform incremental backups. Ordinarily, workloads perform many
operations on any particular file. Incremental backups examine large numbers of
files and copy only a very small percentage of them, which reverses the usual
pattern. The find
and ls -lR commands behave similarly to backup operations when processing large
numbers of files. We developed a performance benchmark for incremental backups.
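
The following C sketch models that incremental-backup access pattern: walk a tree
with nftw(3C), stat every entry, and select only files modified since a cutoff time.
It only counts the files a real backup would copy; the root path and cutoff are
placeholders.

/*
 * Simplified sketch of the incremental-backup access pattern: examine
 * everything, select only what changed since the last backup.
 */
#include <ftw.h>
#include <stdio.h>
#include <time.h>

static time_t last_backup;          /* cutoff supplied by the caller */
static long   scanned, selected;

static int visit(const char *path, const struct stat *st,
    int type, struct FTW *ftwbuf)
{
    (void) path;
    (void) ftwbuf;
    scanned++;
    if (type == FTW_F && st->st_mtime > last_backup)
        selected++;                 /* would be copied by a real backup */
    return 0;
}

int main(int argc, char **argv)
{
    const char *root = (argc > 1) ? argv[1] : "/global/data";

    last_backup = time(NULL) - 24 * 60 * 60;    /* "yesterday" */
    if (nftw(root, visit, 64, FTW_PHYS) != 0) {
        perror("nftw");
        return 1;
    }
    printf("scanned %ld entries, %ld changed since last backup\n",
        scanned, selected);
    return 0;
}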
Performance Improvements
The investigation of methods to improve performance is still under way. However,
we have already identified a number of changes that can significantly improve
performance. This section describes some of these improvements, as well as the
results of experiments validating the proposed changes. Space limitations prevent us
from covering all of our findings.
UFS Logging
Performance can be improved in many ways, including leveraging better technology
provided by groups at Sun other than the Availability Products Group. We tested
PxFS with an improved UFS logging subsystem from the UFS organization within Sun.
When PxFS ran with this logging subsystem, its PostMark performance improved
20 to 80 percent, depending upon the transaction mix.
Sync on UFS Log
The UFS logging capability benefits Sun Cluster software by making it possible to
avoid using the fsck command to clean up a file system prior to mounting that file
system. On today's very large file systems, fsck can execute for a correspondingly
long time. This negatively impacts the availability goal, and can prevent you from
quickly remounting a file system after a node failure.
The UFS logging subsystem records file metadata changes to a log file before the
actual file metadata is changed on persistent storage, which is usually disk storage.
The UFS logging subsystem synchronously processes the metadata changes for just a
few operations, such as file sync. For other operations, the UFS logging subsystem
batches file metadata changes to improve performance.
PxFS needs to be able to access information to recover from a node failure. The Sun
Cluster product uses shared persistent storage. PxFS issues a sync operation on the
UFS log file to make file metadata available if a node failure occurs, but file sync
operations are expensive. Many file operations cause file metadata changes, so sync
operations occur quite frequently. Because there is only one UFS log file per file
system, these sync operations can serialize many file operations even when they
work on different files.
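
A quick way to see why per-operation syncs hurt is to time small writes with and
without an fsync() after each one, as in the sketch below. This is a stand-alone
measurement, not PxFS code; gethrtime() is the Solaris high-resolution timer and
the file name is arbitrary.

/*
 * Rough illustration of the cost of syncing on every operation: time a
 * series of small writes with and without an fsync() after each one.
 */
#include <sys/types.h>
#include <sys/time.h>       /* gethrtime() on Solaris */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define NOPS 200

static hrtime_t run(int fd, int do_sync)
{
    char byte = 'x';
    hrtime_t start = gethrtime();

    for (int i = 0; i < NOPS; i++) {
        (void) pwrite(fd, &byte, 1, (off_t)i);
        if (do_sync)
            (void) fsync(fd);       /* force the update to stable storage */
    }
    return gethrtime() - start;
}

int main(void)
{
    int fd = open("/tmp/synctest", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("no sync: %lld ns, fsync per op: %lld ns\n",
        (long long)run(fd, 0), (long long)run(fd, 1));
    close(fd);
    return 0;
}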
Fortunately, sync operations are not the only possible solution. The PxFS primary
server checkpoints information to the PxFS secondary servers, and this information
can be used to redo the operation in case of node failures. To use this method, the
checkpointed information must be retained until the UFS log has been written and
until the file operation has completed.
We prototyped this alternative solution and achieved a 73 to 100 percent performance
improvement, depending upon the mix of operations and the operation type. The
following table compares PxFS performance with the UFS log sync (“Yes”) and
without the UFS log sync (“No”) on the PostMark benchmark with 10,000 initial files,
20,000 transactions, and a file size range of 500 bytes to 9.77 kilobytes. The units are either
operations-per-second or the amount of data read or written per second. The last
row shows the relative performance of not using a file sync command versus using a
file sync command for file operations that change file meta-information.
TABLE 1    UFS Log Sync Versus No Sync Comparison

UFS Log Sync  Transactions  Files Created  Files Read  Files Deleted  Files Appended  MB Read  MB Written
Yes           16            11             8           11             8               32.50    67.57
No            32            19             16          19             16              57.38    119.28
No/Yes        200%          173%           200%        173%           173%            177%     177%
Attribute Cache
PxFS does client-side caching of the file attributes, such as file size and modify time.
File attributes are frequently accessed, and the attribute cache can significantly
reduce the number of times that the PxFS client has to contact the PxFS server for file
attributes. Even with today's fast interconnects, round trips between nodes are
relatively expensive. When the PxFS client and PxFS server are not co-located,
the client-side cache can avoid many of these expensive round trips.
Caches come with a cost. A synchronization protocol provides cache coherency.
POSIX semantics result in frequent changes to file attributes, because many file
operations change one or more fields of a file's attributes. The synchronization
protocol serializes many file operations on any particular file to support the goal that
everyone sees the latest data. The underlying single-machine file system on which
PxFS runs already has its own copy of the file attributes in memory. While one set
of file attributes is not large, a system can easily have very large numbers of active
files, whose attributes are all memory resident. This means that attributes consume
memory twice in the co-located case. When the PxFS client and PxFS server are
co-located, the client can obtain the file attributes from the underlying file system
inexpensively.
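
One way to picture that co-located shortcut is the following hypothetical getattr
path: in the co-located case, bypass the client cache and ask the underlying file
system directly; only the remote case pays for caching and its coherency protocol.
The names and structures are placeholders, since the real code operates on kernel
vnodes.

/*
 * Hypothetical sketch of a getattr path that caches only in the remote
 * case.  All names are invented for illustration.
 */
#include <stdio.h>

struct file_attr {
    long size;
    long mtime;
};

static int colocated = 1;                   /* set from cluster topology */
static struct file_attr attr_cache;         /* per-file client cache     */
static int attr_cache_valid;

static void underlying_fs_getattr(struct file_attr *a)
{
    *a = (struct file_attr){4096, 1044835200};   /* local, cheap */
}

static void server_getattr(struct file_attr *a)
{
    *a = (struct file_attr){4096, 1044835200};   /* interconnect round trip */
}

static void px_getattr(struct file_attr *a)
{
    if (colocated) {                 /* no cache, no coherency traffic */
        underlying_fs_getattr(a);
        return;
    }
    if (!attr_cache_valid) {         /* remote: cache to avoid round trips */
        server_getattr(&attr_cache);
        attr_cache_valid = 1;
    }
    *a = attr_cache;
}

int main(void)
{
    struct file_attr a;
    px_getattr(&a);
    printf("size=%ld mtime=%ld\n", a.size, a.mtime);
    return 0;
}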
Another cost of caching stems from the high-availability property. The current
design checkpoints information about which clients have a particular file's attributes
in order to preserve correct cache coherency despite node failures.
We developed a prototype in which the PxFS client did not cache attributes and
always obtained the file attributes from the underlying file system. Our first
experiment used a two-node cluster where the PxFS client and PxFS server were
co-located. FIGURE 4 shows that throughput performance improved by close to
30 percent with no attribute caching versus the existing design with attribute
caching.
FIGURE 4    SPECsfs Throughput With and Without Attribute Caching
[Chart: actual operations per second (roughly 700 to 1300) versus requested operations per second (800 to 1800) for sfssum.no_attr_cache and sfssum.wait.]
FIGURE 5 shows an even larger improvement in the average latency per operation.
In both figures, the legend sfssum.no_attr_cache identifies the no-attribute-cache
prototype, and the other legend represents the existing system with attribute
caching. The SPECsfs benchmark looks for the highest operation throughput that can
be sustained relative to the requested load: it attempts to execute a certain number
of operations per second, measures the number that the system actually processed,
and repeats this process over a range of requested load values.
Note – This figure only demonstrates the advantage of not caching in the local case;
SPECsfs has numerous requirements that must be met when it is used to represent a
system's performance capability.
FIGURE 5    SPECsfs Latency With and Without Attribute Caching
[Chart: latency in milliseconds per operation (roughly 6 to 28) versus requested operations per second (800 to 1800) for sfssum.no_attr_cache and sfssum.wait.]
Next, we experimented using the same two-node cluster configured such that the
client and server were not co-located. Not caching file attributes resulted in serious
performance degradation for the remote case. We would like to get the best of both
worlds, and are planning to experiment with a system that caches attributes in the
remote case only. Even the fastest interconnects have a cost, so in most cases, Sun
Cluster product configurations co-locate applications with their associated file
system. This means that improvements in the local case generally provide a
significant performance benefit to most cluster configurations.
Checkpointing
A valuable feature provided by PxFS is that file system operations complete
successfully even with hardware failures that do not exceed the configured
capability. Overcoming failures requires replicating information, and PxFS primarily
uses two techniques to do so:
■ PxFS stores information on persistent storage, usually disk storage, that is
accessible from multiple nodes and uses some form of replication, such as disk
mirrors.
■ PxFS “checkpoints” information that enables PxFS to redo an operation that was
interrupted by a node failure.
The PxFS primary server and PxFS secondary servers are always located on different
nodes. Therefore, checkpoints always cross the interconnect. The primary server
cannot proceed until it receives confirmation that the checkpoint has safely reached
the secondary servers. The checkpoint processing by secondary servers can happen
later, asynchronously. Sun Cluster software sends an acknowledgement message for
checkpoints that travel over Ethernet-based interconnects. The round trip makes
checkpointing relatively costly. Some newer interconnect technologies have
hardware-based acknowledgements, which can reduce the time. Modern
interconnect technologies have reduced the wire time to the point where other
factors, such as marshalling data, can dominate the communication cost. Therefore,
checkpointing is relatively expensive when compared to local operations other than
disk accesses.
It was no surprise when our investigation showed that checkpointing has a
substantial performance impact. We identified several improvement techniques,
which we placed into two categories: faster checkpoints and reducing the number of
checkpoints.
Faster Checkpoints
We conducted an experiment aimed at determining the impact of faster checkpoints.
In this experiment, the system did not transmit the checkpoints, and instead
introduced a busy wait with varied times. While there was considerable variation,
on average a 1 microsecond reduction yielded a roughly 1 operation-per-second
improvement in SPECsfs. Each checkpoint consumes many microseconds, which
means that there is room for useful improvement in this area.
Today, each checkpoint requires multiple memory allocations for different data
structures, and a measurement of the code path showed that these multiple memory
allocations consumed a significant portion of the checkpoint processing time. The
time required for one memory allocation generally does not vary according to the
number of bytes allocated. An alternative design would combine these various data
structures and perform one memory allocation. This same technique improves all
inter-node communications, not just checkpoints.
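
As an illustration of the single-allocation idea, the sketch below sizes the
checkpoint header, argument block, and payload together and carves them out of one
buffer. The structure and function names are invented; the kernel code would use
its own allocator rather than malloc.

/*
 * Sketch of the single-allocation idea: one buffer replaces separate
 * allocations for the header, the arguments, and the data payload.
 */
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

struct ckpt_header { int op; int seq; };
struct ckpt_args   { long fileid; long offset; long length; };

struct ckpt_msg {
    struct ckpt_header hdr;
    struct ckpt_args   args;
    char               payload[];   /* flexible array member (C99) */
};

static struct ckpt_msg *ckpt_build(int op, long fileid,
    const void *data, size_t len)
{
    /* One allocation replaces three separate allocation calls. */
    struct ckpt_msg *m = malloc(sizeof (*m) + len);
    if (m == NULL)
        return NULL;
    m->hdr  = (struct ckpt_header){op, 0};
    m->args = (struct ckpt_args){fileid, 0, (long)len};
    memcpy(m->payload, data, len);
    return m;
}

int main(void)
{
    const char data[] = "metadata delta";
    struct ckpt_msg *m = ckpt_build(7, 1234, data, sizeof (data));

    if (m != NULL) {
        printf("op %d, file %ld, %ld bytes\n",
            m->hdr.op, m->args.fileid, m->args.length);
        free(m);
    }
    return 0;
}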
The Sun Cluster software infrastructure flow-controls checkpoints to prevent
flooding a node with them. Our investigation found that increasing the
allowed number of outstanding checkpoints, while still within a safe range,
improved the peak SPECsfs performance by roughly five percent.
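
Conceptually, that flow control behaves like a counting semaphore whose initial
count is the permitted number of outstanding checkpoints, as in this simplified
sketch (the constant and function names are invented):

/*
 * Sketch of checkpoint flow control: the semaphore count is the number
 * of checkpoints allowed to be outstanding at once.
 */
#include <semaphore.h>
#include <stdio.h>

#define OUTSTANDING_CKPTS 8     /* tunable window */

static sem_t ckpt_window;

static void ckpt_send(int seq)
{
    sem_wait(&ckpt_window);     /* block if too many are in flight */
    printf("checkpoint %d sent\n", seq);
}

static void ckpt_acked(void)
{
    sem_post(&ckpt_window);     /* acknowledgement frees a slot */
}

int main(void)
{
    sem_init(&ckpt_window, 0, OUTSTANDING_CKPTS);
    for (int i = 0; i < 4; i++) {
        ckpt_send(i);
        ckpt_acked();
    }
    sem_destroy(&ckpt_window);
    return 0;
}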
Reducing Number of Checkpoints
While checkpoints can be used to support high availability, checkpoints are not the
only possible solution. Some information is already replicated by PxFS in the form of
information maintained in each of the PxFS clients. After a node failure eliminates the
PxFS primary server, the newly promoted PxFS primary server can obtain the
information from the surviving clients. Node failures are extremely rare when
compared to the huge numbers of file operations that are performed per second on a
busy system. This follows the classic model of moving costly operations from the
common code path to the rare code path. The newly promoted PxFS primary server
can obtain this information on a per file basis, and can do this lazily without
impacting file system availability time.
For historical reasons, PxFS has multiple server components that support each active
file. The system checkpoints the creation of these components to the PxFS secondary
servers so they are prepared to receive checkpoints for subsequent file operations,
such as write. The file lookup operation spends more time checkpointing than
anything else. The file object, file attribute object, and data provider object can be
combined, which reduces the number of checkpoints. Currently, these objects are
created before a file can be actively used, which means that this is a frequently
occurring operation. Each highly available object has a number of Sun Cluster
infrastructure support data structures. Consolidating the PxFS file objects
consolidates the infrastructure support. Given the enormous numbers of active files,
this can be a substantial memory savings.
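
The consolidation can be pictured as folding the three per-file server objects into
one structure, so that bringing a file into active use requires a single creation
checkpoint, as in this invented-name sketch:

/*
 * Illustration of the object-consolidation idea: the file object, the
 * attribute provider, and the data provider become members of one
 * structure, so one creation checkpoint replaces three.
 */
#include <stdio.h>
#include <stdlib.h>

struct attr_state { long size; long mtime; };
struct data_state { long cached_pages; };

/* One combined object with one set of HA support structures. */
struct pxfs_file_server {
    long              fileid;
    struct attr_state attr;
    struct data_state data;
};

static int checkpoints_issued;

static struct pxfs_file_server *file_server_create(long fileid)
{
    struct pxfs_file_server *fs = calloc(1, sizeof (*fs));
    if (fs == NULL)
        return NULL;
    fs->fileid = fileid;
    checkpoints_issued++;       /* one creation checkpoint, not three */
    return fs;
}

int main(void)
{
    struct pxfs_file_server *fs = file_server_create(42);

    printf("creation checkpoints: %d\n", checkpoints_issued);
    free(fs);
    return 0;
}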
About the Author
Dr. Ellard Roush currently leads the PxFS Performance Team. He has worked for
over six years on the Sun Cluster project, first on the cluster infrastructure, and now
on the cluster file system. On the OPUS distributed operating system at Unisys, he
achieved a 50+ percent time reduction for local messages and a 68 percent time
reduction for remote messages. His Ph.D. thesis work at the University of Illinois at
Urbana-Champaign demonstrated a new process migration algorithm that performs
20 times faster than the world's previous record. Prior to that, he worked for many
years on U.S. government distributed systems.
References
[1] B. Callaghan. NFS Illustrated. Addison Wesley Longman, Inc., 2000.
[2] IEEE. IEEE Standard for Information Technology - Portable Operating System
Interface (POSIX) - Part 1: System Application Interface (API) - Amendment 1:
Realtime Extension. IEEE Std 1003.1b-1993.
[3] Iozone. Iozone file system benchmark at: http://www.iozone.org/.
[4] Y. Khalidi, J. Bernabeu-Auban, V. Matena, K. Shirriff, and M. Thadani. Solaris MC:
A Multi-Computer OS. In the 1996 USENIX Conference, 1996.
[5] S. Kleiman. Vnodes: an architecture for multiple file system types in Sun UNIX®.
In Summer USENIX, pages 238–247, 1986.
[6] V. Matena, Y. Khalidi, and K. Shirriff. Solaris MC file system framework. In ACM
Conference on Operating Systems Design and Implementation, 1996.
[7] J. Mauro and R. McDougall. Solaris Internals: Core Kernel Architecture. Sun
Microsystems Press, 2001.
[8] Sun Microsystems. Sun Cluster architecture: A white paper. In IEEE Computer
Society International Workshop on Cluster Computing, pages 331–338, 1999.
[9] Network Appliance. PostMark benchmark at: http://www.netapp.com/tech_library/
3022.html.
[10] Standard Performance Evaluation Corporation. SPECsfs benchmark at:
http://www.specbench.org/.
[11] Transaction Processing Performance Council database performance benchmarks
at: http://www.tpc.org.
Ordering Sun Documents
The SunDocsSM program provides more than 250 manuals from Sun Microsystems,
Inc. If you live in the United States, Canada, Europe, or Japan, you can purchase
documentation sets or individual manuals through this program.
Accessing Sun Documentation Online
The docs.sun.com web site enables you to access Sun technical documentation
online. You can browse the docs.sun.com archive or search for a specific book title
or subject. The URL is http://docs.sun.com/
To reference Sun BluePrints OnLine articles, visit the Sun BluePrints OnLine Web site at:
http://www.sun.com/blueprints/online.html