Redbooks Paper
Carlos P. Sosa
Balaji V. Atyam
Peter Heyrman
Naresh Nayar
Jeri Hilsabeck
Life Sciences Applications on IBM
POWER5 and AIX 5L Version 5.3:
Virtualization Exploitation through
Micro-Partitioning Implementation
Abstract
In this study, we present a series of benchmarks that exploit virtualization
features using IBM® POWER5™ technology and IBM AIX® 5L™ Version 5.3.
We define virtual benchmarks (VBs) based on new functionality introduced in the
IBM Eserver® iSeries™ and pSeries® POWER5 technology-based systems
and AIX 5L Version 5.3. The benchmarks selected rely on virtualization
exploitation through Micro-Partitioning™. The applications tested in this study
correspond to Gaussian 03 Rev. C.01, BLAST 2.2.6, and AMBER 7. We show
that throughput benchmarks running on a system with Micro-Partitioning can
take full advantage of a pool of shared processors. In other words, virtual
processors improve the time to solution.
© Copyright IBM Corp. 2005. All rights reserved.
Introduction
The idea of virtualization is currently being exploited in many areas within
different groups at IBM. In the area of storage solutions, virtualization is
considered as a way to help reduce the complexity and costs of managing
SAN-based storage. With the IBM TotalStorage® Virtualization family, you can
manage your storage infrastructure from a single point of control with centralized
volume, file, and device management. Together, these products can help you
drive down the cost and complexity of managing your storage infrastructures,
while providing the flexibility to address rapidly changing storage needs. Another
example is the Virtual Loan Program (VLP). In this case, the
idea is to provide ISVs with access to pSeries systems on an on demand basis. This
eliminates the costly proposition of providing each ISV with its own computer
system, which also delays the availability of applications on a given release.
The IBM virtualization engine is one of the most recent efforts being carried out
to exploit virtualization technology at multiple levels. It ties in well with the on
demand business model. The virtualization engine is composed of services and
IBM technologies. The key idea is for resources available on IBM servers to
function as a single pool that can be more easily managed across the
organization [1].
In this study, we look at the part of the virtualization engine that corresponds to
logical partition (LPAR) and Micro-Partitioning [1]. Partitioning capabilities have
been improved on POWER5 technology-based systems to provide
sub-processor partitioning [2,3]. On pSeries POWER4™ technology-based
systems, partitions were constrained to physical processor boundaries. Now, this
limitation has been removed, and fractions of physical processors can be
assigned to partitions from a shared pool of resources [1,2].
In particular, we explore the usability of this new technology to improve
performance on a series of throughput benchmarks. These throughput
benchmarks, of course, require the system to be partitioned through
Micro-Partitioning. The benchmarks were carried out with three of the most
popular applications in Life Sciences and they reflect actual benchmarks
requested by customers.
Micro-Partitioning
The benchmarks carried out here make use of many of the new “virtual” features
of the IBM POWER5 technology-based systems, namely, Micro-Partitioning
technology [1]. Micro-Partitioning enables multiple LPARs to run on a physical
processor in a time-sliced fashion (see Figure 1). The POWER™
Hypervisor™ manages the time-slicing of LPARs according to Hardware
Management Console (HMC)-defined parameters [1]. Micro-Partitions are
assigned CPU entitlements at a granularity of 1/100 of a CPU (with a minimum
of 1/10 of a CPU per LPAR). The CPU entitlement is defined as part of a
Micro-Partition’s profile definition, but it can be changed dynamically.
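The two entitlement rules quoted above (steps of 1/100 of a CPU, minimum of 1/10 of a CPU) can be expressed as a short check. The following is a minimal sketch only; the function name is hypothetical and is not part of any HMC or AIX interface.

```python
from fractions import Fraction

# Values taken from the text: entitlements come in steps of 1/100 of a CPU,
# with a minimum of 1/10 of a CPU per LPAR.
GRANULARITY = Fraction(1, 100)
MINIMUM = Fraction(1, 10)


def is_valid_entitlement(cpu_units: str) -> bool:
    """Return True if cpu_units (a decimal string such as "0.25") satisfies
    both rules. The function name is hypothetical, chosen for illustration."""
    value = Fraction(cpu_units)  # exact arithmetic, no binary rounding
    return value >= MINIMUM and value % GRANULARITY == 0


if __name__ == "__main__":
    for candidate in ("0.05", "0.10", "0.125", "0.25", "8"):
        print(candidate, is_valid_entitlement(candidate))
```

Exact fractions are used instead of floating-point numbers so that a value such as 0.10 is compared without rounding error.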
Micro-Partitions run with “virtual” processors. The number of virtual processors is
defined as part of a Micro-Partition’s image definition. The number of virtual
processors can be dynamically changed, and virtual processors are scheduled to
physical processors of the shared physical processor pool. On POWER5
technology-based systems, there is a single shared pool that provides the
physical processors for all Micro-Partitions. The shared pool size can be
dynamically resized by adding or removing physical processors. A virtual
processor can be dispatched on any physical processor in the pool. The
Hypervisor will attempt to maintain virtual-to-physical processor affinity when it
dispatches virtual processors.
Figure 1 An example of a system configured with Micro-Partitioning
A Micro-Partition can be capped or uncapped [1,2]. A capped Micro-Partition
cannot exceed its entitled CPU capacity. Conversely, an uncapped
Micro-Partition can use CPU resources beyond its entitlement, as long as there
are excess cycles in the pool. When a virtual processor in a shared
Micro-Partition reaches its idle loop, it gives up the remaining cycles in its entitled
capacity to the Hypervisor so that the cycles can be used by other
Micro-Partitions. It is important to note that dedicated processor LPARs continue
to be supported; here CPU resources are dedicated and not shared between
LPARs. LPAR isolation is maintained for Micro-Partitions.
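The difference between capped and uncapped Micro-Partitions can be illustrated with a toy model. This is a sketch under stated assumptions, not the Hypervisor's dispatching algorithm: the function name is hypothetical, the uncapped weight is not modelled, and the limit of one physical processor per virtual processor is our assumption rather than a statement from this section.

```python
def usable_capacity(entitlement: float, virtual_processors: int,
                    capped: bool, spare_in_pool: float) -> float:
    """Processing units a Micro-Partition could consume in this toy model.

    entitlement        -- entitled capacity in processor units
    virtual_processors -- number of virtual processors defined for the partition
    capped             -- True if the partition is capped
    spare_in_pool      -- cycles (in processor units) not used by other partitions
    """
    if capped:
        # A capped partition cannot exceed its entitled capacity.
        return float(entitlement)
    # An uncapped partition may use excess cycles in the pool.
    # Assumption: each virtual processor uses at most one physical processor.
    return min(float(virtual_processors), entitlement + spare_in_pool)


if __name__ == "__main__":
    # Entitlement 8, 12 virtual processors, 8 spare processor units in the pool.
    print(usable_capacity(8, 12, capped=False, spare_in_pool=8))
    print(usable_capacity(8, 12, capped=True, spare_in_pool=8))
```

With an entitlement of 8 and 12 virtual processors, the model gives 12 processor units when uncapped and 8 when capped.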
Resources
In this section, we describe the resources we used for this study.
Hardware
To carry out this study, we used one of the recently announced IBM Eserver
pSeries POWER5 technology-based servers. The p5 Model 570 server used here had
16 processors running at 1.9 GHz. The system memory consisted of 512 GB (DDR1). The
POWER5 processor supports the 64-bit PowerPC® architecture. Each chip
contains two identical processor cores, where each core supports two identical
threads by simultaneous multi-threading (SMT). With SMT, the chip appears as a
4-way processor to the operating system. In this study, we did not use the SMT
feature. The two cores on each chip share a 1.92 MB L2 cache. On the POWER5
technology-based system, the L3 cache directory is on-chip for the off-chip 36
MB L3 cache. Also, the memory controller is integrated on-chip.
On POWER5 technology-based systems, the logical partitioning of the machine
is substantially different from POWER4 technology-based systems. On POWER5
technology-based systems, physical processors are abstracted into virtual
processors. Provided that the system has been configured that way, the physical
processors can be shared by multiple logical partitions.
Scientific applications
We selected three Life Sciences applications that rely on different computational
methods to carry out molecular simulations. These applications are in the areas
of quantum chemistry, molecular mechanics and molecular dynamics, and
bioinformatics.
Gaussian [4] is a connected series of programs that can be used for performing a
variety of electronic structure calculations: molecular mechanics, semi-empirical,
ab initio, and density functional theory. Gaussian consists of a collection of
programs commonly known as links. Links communicate through disk files
and are grouped into overlays [5]. Links are independent executables located in
the g03 directory and labeled as lxxx.exe, where xxx is the unique number of
each link. In general, overlay zero is responsible for starting the program, which
includes reading the input file. After the input file is read, the route card
(keywords and options that specify all the Gaussian parameters) is translated
into a sequence of links. Overlay 99 (l9999.exe) terminates the run; in most
cases, l9999.exe finishes with an archive entry (brief summary of the
calculation).
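The naming convention for links can be sketched in a few lines. The helper names are hypothetical, and the rule that the overlay number is the link number without its last two digits is an assumption; it is consistent with the text (l9999.exe belongs to overlay 99) but is not stated there.

```python
def link_executable(link_number: int) -> str:
    """File name of a link in the g03 directory, following the lxxx.exe pattern."""
    return f"l{link_number}.exe"


def overlay_of(link_number: int) -> int:
    """Overlay that a link belongs to.

    Assumption: the overlay is the link number with its last two digits
    dropped, so link 9999 is in overlay 99 and link 1 is in overlay 0.
    """
    return link_number // 100


if __name__ == "__main__":
    for number in (1, 9999):
        print(link_executable(number), "overlay", overlay_of(number))
```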
The theoretical methods chosen in this study have been extensively discussed in
the literature [6], and it is beyond the scope of this work to describe these
methods. The approximations used in this work correspond to Hartree-Fock [6].
The case used in this study corresponds to one of the cases from previous
studies [7-9]. The molecule α-pinene at the HF level of theory using the
6-311G(df,p) basis set was used as our benchmark. We selected this case
because the I/O for this particular calculation is minimal. The I/O capabilities will
be tested in a future study [10].
AMBER (Assisted Model Building with Energy Refinement) is a flexible suite of
programs for performing molecular mechanics and molecular dynamics
calculations based on force fields [11]. Sander is the primary program used for
molecular dynamics simulations and is the only program considered in our
current study. Sander carries out energy minimization, molecular dynamics, and
NMR refinements. AMBER is a floating point-intensive FORTRAN code. The
version used in this study corresponds to AMBER 7 for IBM systems [12]. The
test that we selected to run AMBER is the JAC benchmark, a joint
AMBER-CHARMM benchmark. It considers the protein DHFR (dihydrofolate
reductase) in an explicit water bath with cubic periodic boundary conditions.
The system size and simulation conditions are: 23,558 atoms; a cubic periodic
box of 62.23 Å; a 9 Å nonbonded cutoff with a 2 Å buffer (that is, a list
cutoff of 11 Å); a 1 fs time step; 1000 steps; the NVE ensemble (constant energy,
constant volume); and bonds to hydrogen constrained (SHAKE). The particle mesh Ewald
(PME) method was used for calculating the Lennard-Jones (LJ) and electrostatic
interactions with a 64x64x64 grid; the equilibration temperature was 300 K.
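The simulation conditions translate into a few derived quantities. The sketch below is simple arithmetic on the values quoted above; the helper names are hypothetical, and the 574-second elapsed time used in the example is the single-job, case I value reported in Table 4.

```python
# Values quoted in the text for the JAC benchmark.
STEPS = 1000
TIME_STEP_FS = 1.0
NONBOND_CUTOFF_ANGSTROM = 9.0
BUFFER_ANGSTROM = 2.0


def simulated_time_ps(steps: int, time_step_fs: float) -> float:
    """Total simulated time in picoseconds (1 ps = 1000 fs)."""
    return steps * time_step_fs / 1000.0


def list_cutoff(cutoff: float, buffer: float) -> float:
    """Pair-list cutoff: nonbonded cutoff plus buffer, in angstroms."""
    return cutoff + buffer


def ns_per_day(steps: int, time_step_fs: float, elapsed_seconds: float) -> float:
    """Simulated nanoseconds per day of wall-clock time (1 ns = 1e6 fs)."""
    simulated_ns = steps * time_step_fs / 1.0e6
    return simulated_ns * 86400.0 / elapsed_seconds


if __name__ == "__main__":
    print(simulated_time_ps(STEPS, TIME_STEP_FS), "ps simulated")
    print(list_cutoff(NONBOND_CUTOFF_ANGSTROM, BUFFER_ANGSTROM), "angstrom list cutoff")
    print(round(ns_per_day(STEPS, TIME_STEP_FS, 574.0), 4), "ns/day at 574 s elapsed")
```

The benchmark therefore covers 1 ps of simulated time, which at 574 seconds per run corresponds to roughly 0.15 ns per day for a single job.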
BLAST (Basic Local Alignment Search Tool) is a set of similarity search
programs designed to explore all of the available sequence databases regardless
of whether the query is protein or nucleic acid [13]. The BLAST programs have
been designed for speed, with a minimal sacrifice of sensitivity to distant
sequence relationships. The scores assigned in a BLAST search have a
well-defined statistical interpretation, making real matches easier to distinguish
from random background hits. BLAST uses a heuristic algorithm that seeks local,
as opposed to global, alignments and is therefore able to detect relationships
among sequences that share only isolated regions of similarity [13].
Life Sciences virtual benchmarks
In the area of Life Sciences and for this set of applications, these benchmarks
illustrate that Micro-Partitioning allows for increased overall utilization of system
resources by automatically making use of additional processors that are part of a
shared pool. These processors that are part of the shared processor pool are not
associated with dedicated partitions.
The partition profiles that we selected for this study are summarized in Table 1.
Case I tries to simulate two separate machines, where each machine has eight
processors. In this particular case, by definition, there is no shared processor
pool. Case II corresponds to a partition (LPAR1) that has been configured with
four virtual processors. However, LPAR2 is running in dedicated mode. This case
should provide information about differences due to virtual processors. Case III is
similar to case II. The main difference is that LPAR2 was shut down. Cases IV
through VI test the benefit of virtual processors on a throughput benchmark. Case
IV had four virtual processors. Here, LPAR2 is defined as shared. Case IV is
characterized by the fact that no jobs were submitted on LPAR2 while the
throughput benchmark was running and completed on LPAR1. Cases V and VI
are similar to the previous case, except that 4 and 8 jobs were submitted to
LPAR2, respectively.
Table 1 Profile used for LPAR1 and LPAR2 for cases I-VI

Parameters                    Case I   Case II   Case III   Case IV   Case V   Case VI
LPAR1 profile
  Desired processor capacity  N/A      8         8          8         8        8
  Desired virtual processors  8        12        12         12        12       12
  Capped                      N/A      No        No         No        No       No
  Variable capacity weight    N/A      128       128        128       128      128
LPAR2 profile
  Desired processor capacity  N/A      N/A       N/A        8         8        8
  Desired virtual processors  8        8         8          8         8        8
  Capped                      N/A      N/A       N/A        No        No       No
  Variable capacity weight    N/A      N/A       N/A        128       128      128

Partitions mode:
  Case I    LPAR1 dedicated and LPAR2 dedicated.
  Case II   LPAR2 up and running in dedicated mode.
  Case III  LPAR2 was shut down.
  Case IV   LPAR2 was shared, up and idle, empty.
  Case V    LPAR2 was shared, up and 4 jobs running.
  Case VI   LPAR2 was shared, up and 8 jobs running.

Table 2 summarizes the elapsed time for our Gaussian 03 example as a function of
the previously defined cases. Column 1 corresponds to the number of jobs that
were submitted at the same time. Once submitted, only these jobs ran on the
corresponding partition. All the times for two or more jobs are compared to a
single job running on a dedicated system. Clearly, in this table, we see that as we
increase the number of jobs running on the machine, the elapsed time increases.
Of course, the elapsed time will depend on the particular case. The times
reported in Table 2 correspond to the average time computed over the number of
jobs for each throughput run.
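A throughput run of this kind (N copies of the same job submitted at the same time, with the elapsed time averaged over the jobs) could be driven by a small harness such as the following. This is a minimal sketch and not the scripts used in the study; the function names are hypothetical, and the placeholder command stands in for an actual Gaussian 03, AMBER, or BLAST invocation.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def _timed_run(command):
    """Run one job to completion and return its elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_throughput(command, n_jobs):
    """Start n_jobs copies of command at (nearly) the same time and
    return the list of per-job elapsed times in seconds."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(_timed_run, [command] * n_jobs))


if __name__ == "__main__":
    # Placeholder job: a Python interpreter that exits immediately.
    placeholder = [sys.executable, "-c", "pass"]
    for n_jobs in (1, 2, 4):
        times = run_throughput(placeholder, n_jobs)
        print(n_jobs, "jobs, average elapsed", round(mean(times), 3), "s")
```

One thread per job is used only to wait on the child process, so all jobs start together and each job is timed individually.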
Table 2 Average elapsed times for the Gaussian 03 application
(average elapsed time in seconds)

Number of jobs   Case I   Case II   Case III   Case IV   Case V   Case VI
1                516      499       504        501       498      511
2                516      497       499        506       500      508
4                517      505       502        503       504      510
8                539      530       503        503       514      542
10               657      660       509        508       520      745
16               1053     1076      663        659       680      1085

The information presented in Table 3 summarizes the percentage
difference of each case compared to case I. In other words, we want to know
the effect of having a pool of shared virtual processors. We compute the
percentage difference (∆%) of all the other cases compared to case I. A positive
number represents a slowdown and a negative number represents a speedup. We
chose case I as the baseline because this case was defined without a pool of
shared virtual processors.
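The percentage difference can be computed directly from the elapsed times. The sketch below assumes ∆% = 100 × (t_case − t_caseI) / t_caseI, rounded to the nearest integer; the exact formula and rounding are not spelled out in the text, but this choice matches the Gaussian 03 entries of Table 3 that are checked here. The elapsed times are copied from Table 2, and the function names are hypothetical.

```python
# Average elapsed times in seconds for Gaussian 03 (Table 2).
# Each row lists cases I through VI for a given number of jobs.
GAUSSIAN_ELAPSED = {
    1: [516, 499, 504, 501, 498, 511],
    2: [516, 497, 499, 506, 500, 508],
    4: [517, 505, 502, 503, 504, 510],
    8: [539, 530, 503, 503, 514, 542],
    10: [657, 660, 509, 508, 520, 745],
    16: [1053, 1076, 663, 659, 680, 1085],
}


def percent_difference(elapsed: float, baseline: float) -> int:
    """Percentage difference relative to the baseline (case I).
    Positive means slowdown, negative means speedup.
    Assumption: the result is rounded to the nearest integer."""
    return round(100.0 * (elapsed - baseline) / baseline)


def percent_table(elapsed_by_jobs):
    """Apply percent_difference to every case, using case I as the baseline."""
    return {jobs: [percent_difference(t, times[0]) for t in times]
            for jobs, times in elapsed_by_jobs.items()}


if __name__ == "__main__":
    for jobs, row in percent_table(GAUSSIAN_ELAPSED).items():
        print(jobs, row)
```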
In the case of Gaussian 03, case I simulates a stand-alone system with eight
processors. The benchmark that we use is a throughput benchmark with up to 2N jobs,
where N is the number of processors. In other words, we oversubscribe the
machine by at most a factor of two. Case II shows that defining virtual processors
introduces a slight slowdown when running 10 and 16 jobs. This is not
surprising because it has been previously reported [2]. However, our measurements
quantify it: for this case, the slowdown is on the order of 2%, and that only
in the extreme case. The difference between case II and case III is that in
case III LPAR1 had access to all the resources on the machine because LPAR2 was
shut down. This example clearly illustrates the effect of the virtual processors
when they are available. Case IV behaves similarly: this time, LPAR2 was idle;
thus, all the resources could be made available to LPAR1 when needed.
Cases V and VI try to simulate production environments where the second
partition (LPAR2) might be partially busy or totally busy. Case V corresponds to
an environment where LPAR2 is partially busy and some of the resources might
be available for LPAR1, which is fully subscribed. In this case, we see a behavior
consistent with the previous case. However, in case VI, we clearly see that there
are no additional resources, because both partitions are fully used. In this case,
the performance is similar to what we saw in case I: slightly slower for
8, 10, and 16 jobs, with the 10-job run being abnormally slower in case VI.
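A back-of-the-envelope model helps to read these trends. The sketch below is our own idealization, not a model used in the study: it assumes single-threaded, CPU-bound jobs and perfect time-slicing, so that the elapsed time stays flat until the jobs outnumber the available processors and then grows in proportion. The function name is hypothetical.

```python
def estimated_elapsed(single_job_seconds: float, n_jobs: int,
                      processors: int) -> float:
    """Idealized average elapsed time for n_jobs identical jobs that
    share a given number of processors."""
    if n_jobs <= processors:
        # Every job has a processor to itself.
        return float(single_job_seconds)
    # Oversubscribed: jobs time-share the processors evenly.
    return single_job_seconds * n_jobs / processors


if __name__ == "__main__":
    # Case I: 16 jobs on 8 dedicated processors (single job: 516 s).
    print(estimated_elapsed(516, 16, 8))
    # Case III: 16 jobs with 12 virtual processors (single job: 504 s).
    print(estimated_elapsed(504, 16, 12))
```

For 16 jobs, the estimate is 1032 seconds on 8 processors (1053 seconds measured in case I) and 672 seconds on 12 processors (663 seconds measured in case III, where LPAR1 has 12 virtual processors according to Table 1).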
Table 3 Percentage difference for cases II through VI when compared to case I for Gaussian 03 application
(∆% percentage difference)

Number of jobs   Case I   Case II   Case III   Case IV   Case V   Case VI
1                0        -3        -2         -3        -3       -1
2                0        -4        -3         -2        -3       -2
4                0        -2        -3         -3        -3       -1
8                0        -2        -7         -7        -5       1
10               0        0         -23        -23       -21      13
16               0        2         -37        -37       -35      3

Table 4 presents the same kind of information for AMBER 7 as Table 2 does for
Gaussian 03: it summarizes all the average elapsed times for the AMBER 7
throughput runs. The trends are basically the same as in Table 2. As the number
of jobs running in a throughput benchmark increases,
the elapsed time increases. The patterns observed for each of the cases in the
previous application are reflected here as well.
Table 4 Average elapsed timings for the AMBER 7 application
(average elapsed time in seconds)

Number of jobs   Case I   Case II   Case III   Case IV   Case V   Case VI
1                574      571       572        571       571      572
2                574      576       575        574       577      585
4                574      577       576        577       577      596
8                592      619       575        575       587      644
10               710      880       579        580       589      882
16               1197     1207      743        747       772      1265

The percentage differences observed for AMBER 7 are similar to Gaussian 03
(see Table 5). We see that for case II, the throughput runs with 10 and 16
instances of this sequential input show slowdowns of 24% and 1%,
respectively. However, when virtual processors are available, we see an
improvement in performance for cases III, IV, and V. This is reflected in the
negative numbers for throughput runs with 10 and 16 jobs.
Table 5 Percentage difference for cases II through VI when compared to case I for AMBER 7 application
(∆% percentage difference)

Number of jobs   Case I   Case II   Case III   Case IV   Case V   Case VI
1                0        -1        0          -1        -1       0
2                0        0         0          0         1        2
4                0        1         0          1         1        4
8                0        5         -3         -3        -1       9
10               0        24        -18        -18       -17      24
16               0        1         -38        -38       -35      6

The last application that we tested is BLAST, which is different from
the two previous applications. BLAST reads a database through mmap and does a
pattern search. The BLAST family of tools, which searches for similarities
between pairs of sequences, is developed at the National Center for Biotechnology
Information (NCBI). BLAST is part of the development of software tools for
analyzing genome data. The BLAST family of tools is capable of searching databases
regardless of whether the query is a sequence of amino acids or nucleotides.
BLAST uses a heuristic algorithm to carry out local alignments. This type of
alignment or search can be carried out with different programs available in
BLAST. Table 6 illustrates the programs available in BLAST.
Table 6 BLAST programs [footnote 1]

Programs   Query                   Database
blastp     amino acid              protein
blastn     nucleotide              nucleotide
blastx     nucleotide translated   protein
tblastn    amino acid              nucleotide translated
tblastx    nucleotide translated   nucleotide translated

BLAST can be considered as a three-step algorithm [14]: in step 1, the program
compiles a list of high-scoring strings; in step 2, the program searches for hits,
where for each successful hit it generates a seed; and in step 3, it extends the
seeds. The version of BLAST we used is BLAST 2.2.6 [13]. NCBI BLAST has a
wrapper called blastall. This wrapper then calls each of the programs in Table 6.
Throughout this work, we invoked blastn.
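The three steps described above can be illustrated with a toy seed-and-extend search. This is a deliberately simplified sketch and not the NCBI implementation: every word of the query is used as an exact-match word, extension is ungapped, and the scoring values, the drop-off threshold, and all function and parameter names are assumptions chosen for illustration.

```python
def _extend(query, subject, qi, sj, step, match, mismatch, x_drop):
    """Ungapped extension from (qi, sj) in direction step (+1 or -1).
    Returns (best score gain, length of the extension achieving it)."""
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break  # score fell too far below the best seen so far
        qi += step
        sj += step
    return best, best_len


def seed_and_extend(query, subject, word_size=4, match=1, mismatch=-1, x_drop=2):
    """Return sorted (score, query_start, subject_start, length) tuples."""
    # Step 1: compile the list of words (here: every word of the query).
    words = {}
    for i in range(len(query) - word_size + 1):
        words.setdefault(query[i:i + word_size], []).append(i)
    # Step 2: scan the subject for hits; each hit becomes a seed.
    seeds = [(i, j)
             for j in range(len(subject) - word_size + 1)
             for i in words.get(subject[j:j + word_size], [])]
    # Step 3: extend each seed in both directions.
    alignments = set()
    for i, j in seeds:
        left_gain, left_len = _extend(query, subject, i - 1, j - 1, -1,
                                      match, mismatch, x_drop)
        right_gain, right_len = _extend(query, subject, i + word_size,
                                        j + word_size, 1,
                                        match, mismatch, x_drop)
        score = word_size * match + left_gain + right_gain
        alignments.add((score, i - left_len, j - left_len,
                        left_len + word_size + right_len))
    return sorted(alignments, key=lambda a: (-a[0], a[1], a[2]))


if __name__ == "__main__":
    print(seed_and_extend("GATTACA", "CCGATTACACC"))
```

Several seeds that extend to the same alignment are merged by collecting the results in a set.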
We used the gi|5706771|gb|AC007518.16|AC007518 Mus musculus
chromosome 6 clone 345_D_4 map6 as our query. The database used is the
human genome DNA sequence from the Sanger Centre’s Ensembl server [15].
The version used in this work contains 44,521 sequences and 3,200,338,544
letters.
Table 7 Average elapsed times for the BLAST application
(average elapsed time in seconds)

Number of jobs   Case I   Case II   Case III   Case IV   Case V   Case VI
1                458      480       484        461       460      461
2                461      458       462        462       463      487
4                464      475       481        469       469      498
8                511      528       489        492       513      550
10               611      776       510        511       526      833
16               989      2003      670        693       697      2405

Footnote 1 (Table 6): For more information, see reference 13.
As before, Table 7 summarizes the elapsed time as a
function of the cases selected in this study. The trends are similar to those
shown previously. However, as we shall see from Table 8, BLAST shows a larger
performance difference than Gaussian 03 and AMBER 7.
Table 8 Percentage difference for cases II through VI when compared to case I for the BLAST application
(∆% percentage difference)

Number of jobs   Case I   Case II   Case III   Case IV   Case V   Case VI
1                0        5         6          1         0        1
2                0        -1        0          0         0        6
4                0        2         4          1         1        7
8                0        3         -4         -4        0        8
10               0        27        -17        -16       -14      36
16               0        103       -32        -30       -30      143

In the case of BLAST, Table 8 shows that BLAST experiences larger differences
when running 10 or 16 jobs in throughput mode. However, it shows similar
benefits when running with an available pool of shared virtual processors. The
larger difference seen in case II and case VI with 10 and 16 jobs is characteristic
of BLAST when compared against the two previous applications. In the case of
Gaussian 03 or AMBER 7, the largest slowdown for 16 jobs in any of the cases
tested was approximately 6%. However, in the case of BLAST, as shown in Table 8,
the corresponding value for case VI with 16 jobs is one order of magnitude larger.
These unusually large deviations from the baseline when there are no virtual
processors have been discussed previously [16]. Here, we mention the main
reason. AIX tends to add additional system time when executing an mmap. The
system time, of course, is reflected in the elapsed time. This explains why the
deviation is larger by a factor of 2 or 3 when compared to the other two applications.
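The two ways of accessing a database file, mapping it into memory or reading it into a buffer, can be contrasted with a small example. This is a sketch for illustration only: it is not BLAST code, it does not measure system time, and the function names and the test data are hypothetical.

```python
import mmap
import os
import tempfile


def count_occurrences(buffer, pattern: bytes) -> int:
    """Count (possibly overlapping) occurrences of pattern in a bytes-like
    object that provides find(), such as bytes or an mmap object."""
    count = 0
    position = buffer.find(pattern)
    while position != -1:
        count += 1
        position = buffer.find(pattern, position + 1)
    return count


def count_with_read(path: str, pattern: bytes) -> int:
    """Copy the whole file into a buffer with read(), then search it."""
    with open(path, "rb") as handle:
        return count_occurrences(handle.read(), pattern)


def count_with_mmap(path: str, pattern: bytes) -> int:
    """Map the file into the address space and search it in place;
    pages are brought in by the operating system as they are touched."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return count_occurrences(mapped, pattern)


def demo():
    """Search a small temporary file with both access methods."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"ACGT" * 1000 + b"GATTACA" + b"ACGT" * 1000)
        path = handle.name
    try:
        return (count_with_read(path, b"GATTACA"),
                count_with_mmap(path, b"GATTACA"))
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Both functions return the same count; they differ only in how the file contents reach the process, which is where the additional system time mentioned above arises.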
Summary
In this study, we have tried to provide information about the performance of a
series of throughput benchmarks on a system that has been configured with
Micro-Partitioning. To show the benefit of Micro-Partitioning, we ran our set of
benchmarks with and without Micro-Partitioning. In other words, we tried to
simulate a cluster of multiprocessor workstations that cannot make use of
Micro-Partitioning versus a shared-memory system with Micro-Partitioning. A
shared-memory system with logical partitions can simulate a cluster of
multiprocessor workstations with the added benefit of Micro-Partitioning.
This series of benchmarks has clearly shown that for three different applications
in Life Sciences, the availability of a pool of virtual processors improves the time
to solution. A partition that has exhausted its resources can take advantage of a
pool of shared virtual processors provided that they are not required by other
partitions.
References
1. Advanced POWER Virtualization on IBM Eserver p5 Servers: Introduction
and Basic Configuration, SG24-7940
2. Browning, L., IBM Eserver p5 AIX 5L Support for Micro-Partitioning and
Simultaneous Multi-threading, July 2004, available at:
http://www.ibm.com/servers/aix/whitepapers/aix_support.pdf
3. Tsao, H-F. and B. Olszewski, IBM Eserver p5 570 Server Consolidation
Using POWER5 Virtualization White Paper, July 2004, available at:
http://callisto.bstoke.uk.ibm.com/unixsolutions/white/p5consol.pdf
4. Gaussian 03, Revision C.01, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E.
Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K.
N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B.
Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M.
Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima,
Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B.
Cross, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J.
Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A.
Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D.
Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari,
J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B.
B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. L. Martin, D. J.
Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe,
P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, and J. A.
Pople, Gaussian, Inc., Wallingford CT, 2004
5. Frisch, A. E. and M. J. Frisch, Gaussian 03 User’s Reference, 2nd Edition,
Gaussian, Inc., available at:
http://www.gaussian.com
6. Hehre et al., Ab Initio Molecular Orbital Theory, Wiley-Interscience, 1986,
ISBN 0471812412
7. Sosa, C. P., et al., “Ab-initio quantum chemistry on a cc-NUMA architecture
using OpenMP,” Parallel Computing, 26, pages 843-856, 2000
8. Sosa, C. P. and S. Andersson, Gaussian benchmarks put the pSeries 690
server through its paces, February 2002, available at:
http://www.ibm.com/servers/esdd/articles/gauss_bench/index.html
9. Ab Initio Quantum Chemistry on the IBM pSeries 690: A Comparison
Between Turbo 1.3 GHz and Turbo 1.1 GHz, REDP-0444
10. Sosa, C. P., B. V. Atyam, and J. Hilsabeck, in preparation
11. Case et al., AMBER 7 User’s Manual, University of California
12. For more information about AMBER on IBM systems, visit:
http://www.msi.umn.edu/~cpsosa/ChemApps/MolMech/amber/amber.html
13. Altschul, S. F., et al., “Basic Local Alignment Search Tool,” Journal of
Molecular Biology, Volume 215, Issue 3, pages 403-410, October 5, 1990
14. Setubal, C., and J. Meidanis, Introduction to Computational Molecular Biology,
PWS Publishing, 1997, ISBN 0534952623
15. Sanger Institute Human Genome Server, available at:
http://www.ensembl.org/Homo_Sapiens/
16. BLAST Throughput Benchmarks: mmap versus read, REDP-3692
The team that wrote this Redpaper
This Redpaper was produced by a team of specialists.
Carlos P. Sosa (IBM and University of Minnesota Supercomputing Institute,
IBM Eserver Solutions Enablement) is a Senior Technical Staff Member in the
Systems Group, where he has been a member of the IBM Chemistry and Life
Sciences high-performance effort since 2001. For the last 18 years, he has
focused on scientific applications with an emphasis in Life Sciences, parallel
programming, benchmarking, and performance tuning. Carlos received a Ph.D.
degree in Physical Chemistry from Wayne State University. He completed his
post-doctoral work at the Pacific Northwest National Laboratory. His research
interests are in the area of new pSeries architectures, Blue Gene, and molecular
cell biology. He is currently working with researchers at the University of
Minnesota trying to classify the vertebrate secretome.
Balaji V. Atyam has been a Senior Software Engineer in the Systems and Technology
Group since 2000. His responsibilities are porting, benchmarking, performance
tuning, parallel programming, and technical consulting services to key
independent software vendors (ISV) in the area of High Performance Computing
on IBM Eserver. He received his Ph.D. in Applied Mathematics from Indian
Institute of Technology, Roorkee, India. He was a Scientist/Engineer in Indian
Space Research Organization (ISRO) prior to joining IBM.
Peter Heyrman is a Senior Technical Staff Member in Rochester, MN. He has
worked at IBM for 24 years. He previously worked on iSeries TPC-C performance
and currently works on the IBM Eserver Hypervisor.
Naresh Nayar is a Senior Technical Staff Member with the Systems and
Technology Group at IBM in Rochester, MN. He joined IBM in 1992 and has
worked on many i5/OS kernel projects with a focus on synchronization primitives
and task dispatching. His most recent work has been in the area of LPAR for
iSeries and pSeries systems, and he holds numerous patents in the area of
partitioning and kernel design. He holds a Bachelor of Technology degree in
Electrical Engineering from the Indian Institute of Technology, New Delhi, India,
and M.S. and Ph.D. degrees in Computer Science from Iowa State University.
Jeri Hilsabeck is the Manager of Integrated and Sector Solutions Enablement.
She has 19 years of experience in the computer industry. She holds a B.A. in
Computer Science from The University of Texas at Austin. Prior to nine years of
management, her area of expertise was in development of the AIX operating
system.
Thanks to the following people for their contributions to this project:
CPS would like to give special thanks to Sam Ellis from IBM Rochester for
facilitating our interaction with the Hypervisor team at IBM Rochester (P.
Heyrman and N. Nayar).
We also would like to thank Scott Vetter for his help and the use of Figure 1 on
page 3, which is part of the IBM Redbook Advanced POWER Virtualization on
IBM Eserver p5 Servers: Introduction and Basic Configuration, SG24-7940.
We thank Dr. Joel Tendler for valuable discussions and suggestions about how to
present some of our results.
CPS would like to thank Bruce Hurley for encouraging and supporting this effort.
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results
obtained in other operating environments may vary significantly. Some measurements may have been made
on development-level systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurement may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy,
modify, and distribute these sample programs in any form without payment to IBM for the purposes of
developing, using, marketing, or distributing application programs conforming to IBM's application
programming interfaces.
Send us your comments in one of the following ways:
- Use the online Contact us review redbook form found at:
ibm.com/redbooks
- Send your comments in an email to:
redbook@us.ibm.com
- Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. JN9B Building 905, 11501 Burnet Road
Austin, Texas 78758-3493 U.S.A.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
AIX 5L™
AIX®
Eserver®
Hypervisor™
ibm.com®
IBM®
iSeries™
Micro-Partitioning™
POWER4™
POWER5™
PowerPC®
POWER™
pSeries®
Redbooks (logo)
TotalStorage®
Other company, product, and service names may be trademarks or service marks of others.