Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion
University of Virginia
Anand Natrajan
Marty A. Humphrey
Anthony D. Fox
Andrew S. Grimshaw
Scripps (TSRI)
Michael Crowley
Charles L. Brooks III
SDSC
Nancy Wilkins-Diehr
http://legion.virginia.edu
anand@virginia.edu
• CHARMM
– Issues
• Legion
• The Run
– Results
– Lessons
• Portals
• Summary
• Routine exploration of folding landscapes helps in search for protein folding solution
• Understanding folding critical to structural genomics, biophysics, drug design, etc.
• Key to understanding cell malfunctions in Alzheimer’s, cystic fibrosis, etc.
• CHARMM and Amber benefit majority (>80%) of bio-molecular scientists
• Structural genomic & protein structure predictions
[Figure: Molecular Dynamics Simulations. 100-200 structures to sample in (r, R_gyr) space; axes r and R_gyr.]
• Immunoglobulin-binding protein
– 62 residues (small), 585 atoms
– 6500 water molecules, total 20085 atoms
– Each parameter point requires O(10^6) dynamics steps
– Typical folding surfaces require 100-200 sampling runs
• CHARMM uses the most accurate physics available for classical molecular dynamics simulation
• Multiple 16-way parallel runs - maximum efficiency
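The system size quoted above can be checked with a short calculation. This is a minimal sketch that assumes each water molecule is modelled explicitly with three atoms; the slide does not name the water model, so `ATOMS_PER_WATER` is an assumption and the function name is illustrative only.

```python
# Sizes quoted on the slide.
PROTEIN_ATOMS = 585
WATER_MOLECULES = 6500

# Assumption: explicit three-site water (one oxygen, two hydrogens).
ATOMS_PER_WATER = 3


def total_atoms(protein_atoms, water_molecules, atoms_per_water=ATOMS_PER_WATER):
    """Return the atom count of the solvated system."""
    return protein_atoms + water_molecules * atoms_per_water


print(total_atoms(PROTEIN_ATOMS, WATER_MOLECULES))  # 20085
```

Under that assumption the result matches the 20085 atoms stated on the slide.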
• Parameter-space study
– Parameters correspond to structures along & near folding path
• Path unknown - could be many or broad
– Many places along path sampled for determining local low free energy states
– Path is valley of lowest free energy states from high free energy state of unfolded protein to lowest free energy state (folded native protein)
• Many independent runs
– 200 sets of data to be simulated in two sequential runs
• Equilibration (4-8 hours)
• Production/sampling (8 to 16 hours)
• Each point has task name, e.g., pl_1_2_1_e
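The scale of the study can be estimated from the figures above. The sketch below assumes that all 200 points use a 16-way parallel run and that each point goes through both phases; the slides give only hour ranges, so the result is a range of processor-hours rather than a measured total.

```python
# Figures taken from the slides.
POINTS = 200                  # sets of data to simulate
PROCS_PER_RUN = 16            # 16-way parallel runs
EQUILIBRATION_HOURS = (4, 8)  # (low, high) wall-clock hours
PRODUCTION_HOURS = (8, 16)    # (low, high) wall-clock hours


def processor_hours(points, procs, *phases):
    """Return (low, high) processor-hours for runs with the given phases.

    Assumption: every point runs every phase on `procs` processors.
    """
    low = points * procs * sum(phase[0] for phase in phases)
    high = points * procs * sum(phase[1] for phase in phases)
    return low, high


low, high = processor_hours(POINTS, PROCS_PER_RUN,
                            EQUILIBRATION_HOURS, PRODUCTION_HOURS)
print(low, high)  # 38400 76800
```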
Complete, Integrated Infrastructure for Secure Distributed Resource Sharing
• Wide-area
• High Performance
• Complexity Management
• Extensibility
• Security
• Site Autonomy
• Input / Output
• Heterogeneity
• Fault-tolerance
• Scalability
• Simplicity
• Single Namespace
• Resource Management
• Platform Independence
• Multi-language
• Legacy Support
• Provide improved response time
• Access large set of resources transparently
– geographically distributed
– heterogeneous
– different organisations
6 organisations, 6 queue types, 10 queues, 6 architectures, ~1000 processors
• HP SuperDome, CalTech: 440 MHz PA-8700, 128/128
• IBM SP3, UMich: 375 MHz Power3, 24/24
• DEC Alpha, UVa: 533 MHz EV56, 32/128
• IBM Blue Horizon, SDSC: 375 MHz Power3, 512/1184
• Sun HPC 10000, SDSC: 400 MHz SMP, 32/64
• IBM Azure, UTexas: 160 MHz Power2, 32/64
• Binaries for each type
• Script for dispatching jobs
• Script for keeping track of results
• Script for running binary at site
– optional feature in Legion
• Abstract interface to resources
– queues, accounting, firewalls, etc.
• Binary transfer (with caching)
• Input file transfer
• Job submission
• Status reporting
• Output file transfer
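The stages listed above can be illustrated with a small model of one job's lifecycle. This is a sketch only: it does not use any Legion command or API, and `Site`, `run_job`, the site and architecture names, and the second task name are all hypothetical. It shows how binary caching lets a second job at the same site skip the binary transfer.

```python
from dataclasses import dataclass, field

# Stage names follow the slide; the order is the one listed there.
STAGES = [
    "binary transfer",
    "input file transfer",
    "job submission",
    "status reporting",
    "output file transfer",
]


@dataclass
class Site:
    """Hypothetical stand-in for one remote resource."""
    name: str
    architecture: str
    cached_binaries: set = field(default_factory=set)


def run_job(task, site, log):
    """Append one (task, site, stage) record per stage to `log`."""
    for stage in STAGES:
        if stage == "binary transfer":
            if site.architecture in site.cached_binaries:
                # The binary is already at the site, so nothing is sent.
                log.append((task, site.name, "binary transfer (cache hit)"))
                continue
            site.cached_binaries.add(site.architecture)
        log.append((task, site.name, stage))


log = []
site = Site(name="example-site", architecture="example-arch")
run_job("pl_1_2_1_e", site, log)           # task name from the slides
run_job("hypothetical_task_2", site, log)  # made-up second task
for record in log:
    print(record)
```

The first job transfers the binary; the second finds it cached and proceeds directly to input file transfer.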
Register binaries
Dispatch specification
[Figure: pie chart by resource. Legend: SDSC IBM, CalTech HP, UTexas IBM, UVa DEC, SDSC Cray, SDSC Sun, UMich IBM. Slice values: 24%, 1%, 2%, 1%, 1%, 0%, 71%.]
• Network slowdowns
– Slowdown in the middle of the run
– 100% loss for packets of size ~8500 bytes
• Site failures
– LoadLeveler restarts
– NFS/AFS failures
• Legion
– No run-time failures
– Archival support lacking
– Must address binary differences
• Science accomplished faster
– 1 month on 128 SGI Origins @Scripps
– 1.5 days on national grid with Legion
• Transparent access to resources
– User didn’t need to log on to different machines
– Minimal direct interaction with resources
• Problems identified
• Legion remained stable
– Other Legion users unaware of large runs
• Large grid application run on powerful resources by one person from a local resource
• Collaboration between natural and computer scientists
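The turnaround figures above imply a rough speedup. This sketch assumes "1 month" means about 30 days; the slides do not give exact durations, so the ratio is approximate.

```python
# Assumption: "1 month" is taken as 30 days.
DAYS_PER_MONTH = 30


def speedup(baseline_days, grid_days):
    """Return how many times faster the grid run finished."""
    return baseline_days / grid_days


# 1 month at Scripps versus 1.5 days on the grid with Legion.
print(speedup(1 * DAYS_PER_MONTH, 1.5))  # 20.0
```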
Easy Interface to Grid
• Simple point-and-click interface to Grids
– Familiar access to distributed file system
– Enables & encourages sharing
• Application portal model for HPC
– AmberGrid
– RenderGrid
– Accounting
Transparent Access to Remote Resources
Intended Audience is Scientists
[Figure labels: Distributed File System, Legion, Chime]
• CHARMM Run
– Succeeded in starting big runs
– Encountered problems
– Learnt lessons for future
• AmberGrid
– Showed proof-of-concept - grid portal
– Need to resolve licence issues