“Grid Platform for Drug Discovery” Project

Mitsuhisa Sato
Center for Computational Physics, University of Tsukuba, Japan
UK-Japan N+N Meeting, 2003/10/3

Our Grid Project

• JST-ACT program: “Grid platform for drug discovery”, funded by JST (Japan Science and Technology Corporation), $1.3M over 3 years, started in 2001
  – Tokushima University, Toyohashi Institute of Technology, University of Tsukuba, Fuji Research Institute Corporation
• ILDG: International Lattice QCD Data Grid
  – CCP (University of Tsukuba), EPCC (UK), SciDAC (US)
  – Design of QCDML
  – QCD metadata database via web services; QCD data sharing via SRM and Globus replica …

High-throughput computing for drug discovery

• Exhaustive parallel conformation search and docking over the Grid
• Accumulation of computed results into a large-scale database, and their reuse
• High-performance ab initio MO (molecular orbital) calculation for large molecules on clusters
• In short: “combinatorial computing” using the Grid

Grid applications of our drug discovery

• Conformation search: find possible conformations
• Docking search: compute the energy of combinations of molecules
• Quantitative Structure-Activity Relationship (QSAR) analysis: find rules for drug design

[Workflow diagram: drug libraries → conformation search (CONFLEX-G, a Grid-enabled conformation search application) → conformations → docking search against a target, using ab initio MO calculation on clusters (job submission for MO; coarse-grain MO for the Grid: REMD, FMO) → computation results → QSAR analysis, with an XML design for results and a web service interface.]

CONFLEX

• Algorithm: tree search
  – Local conformation changes
  – Initial conformation selection
• We are implementing it with OmniRPC
  – Note that the tree-search behavior is dynamic!

[Figure: a conformation search tree; local perturbations are corner flap, edge flip, and stepwise rotation, producing conformers gauche and gauche+ (each ∆E = 0.9 kcal/mol) and anti (∆E = 0.0 kcal/mol).]

Grid platform for drug discovery

[Diagram: four partner sites exchanging requests over a wide-area network.]
• Univ. of Tsukuba: control and monitoring (scheduling and monitoring of computations, distributed database management, design of the Grid middleware)
• AIST: development of a large-scale ab initio MO program; database of MO calculation results
• Toyohashi Inst. of Tech.: development of the conformation search program (CONFLEX); cluster for CONFLEX
• Tokushima Univ.: 3D structure database for drug design; database of CONFLEX results

What can the Grid do? Parallel applications, programming, and our view

• “Typical” Grid applications
  – Parametric execution: execute the same program with different parameters, using a large amount of computing resources
  – Master-workers style parallel programs
• “Typical” Grid resources, in our view
  – A cluster of clusters: several PC clusters are available
  – Dynamic resources: load and status change from time to time

[Diagram: several PC clusters inside a Grid environment.]

Parallel programming in the Grid

• Using a Globus shell (GSH)
  – Submit batch job scripts to remote nodes
  – Staging and workflow
• Grid MPI (MPICH-G, PACX-MPI, …)
  – General-purpose, but difficult and error-prone
  – No support for dynamic resources or fault tolerance
  – No support for firewalls or clusters on private networks
• Grid RPC
  – A good, intuitive programming interface
  – Ninf, NetSolve, …, OmniRPC
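To make the “difficult and error-prone” point concrete, below is a minimal master-workers parameter sweep written directly in MPI. This is an illustrative sketch only, not code from CONFLEX or OmniRPC; run_task and all constants are hypothetical. Even this toy version must hand-code work distribution and termination, which is exactly the bookkeeping a Grid RPC system hides:

#include <mpi.h>

#define NTASKS 100                /* number of parameter points to evaluate */

/* Hypothetical per-parameter computation (e.g., one geometry optimization). */
static double run_task(int task) { return (double)task * (double)task; }

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {              /* master: hand out task ids on demand */
        int next = 0, active = size - 1;
        while (active > 0) {
            double result;
            MPI_Status st;
            /* tag 0 = "give me work", tag 1 = result of a finished task */
            MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE,
                     MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            /* (a real master would record results arriving with tag 1) */
            int task = (next < NTASKS) ? next++ : -1;   /* -1 = stop */
            MPI_Send(&task, 1, MPI_INT, st.MPI_SOURCE, 0, MPI_COMM_WORLD);
            if (task < 0) active--;
        }
    } else {                      /* worker: request, compute, report, repeat */
        double result = 0.0;
        int tag = 0;              /* first message is a plain work request */
        for (;;) {
            int task;
            MPI_Send(&result, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD);
            MPI_Recv(&task, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            if (task < 0) break;
            result = run_task(task);
            tag = 1;
        }
    }
    MPI_Finalize();
    return 0;
}

With OmniRPC, described next, the same pattern collapses to a loop of asynchronous remote calls.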
Overview of OmniRPC

A Grid RPC system for parallel computing.

• Provides a seamless parallel programming environment from clusters to the Grid
  – It uses “rsh” for a cluster, “GRAM” for a grid managed by Globus, and “ssh” for conventional remote nodes
  – Programs can be developed and tested on a PC cluster
  – Production runs then move to the Grid to exploit huge computing resources
  – Users can switch configurations with a “host file”, without any modification to the program
• Makes remote PC/SMP clusters usable as Grid computing resources
  – Supports clusters behind a firewall or on private addresses

Example host file:

<?xml version="1.0" ?>
<OmniRpcConfig>
  <Host name="dennis.omni.hpcc.jp">
    <Agent invoker="globus" mxio="on"/>
    <JobScheduler type="rr" maxjob="20"/>
  </Host>
</OmniRpcConfig>

[Diagram: a client PC driving several PC clusters in a Grid environment.]

Overview of OmniRPC (cont.)

• Easy-to-use parallel programming interface
  – A Grid RPC based on the Ninf Grid RPC
  – Parallel programming using an asynchronous call API
  – The thread-safe RPC design allows the use of OpenMP in client programs
• Supports master-workers parallel programs for parametric-search grid applications
  – Persistent data support in remote workers, for applications that require large data
• Monitoring and performance tools

Example client program (the slide’s fragment, lightly reconstructed; the truncated final argument of the call is assumed to be C[i]):

OmniRpcRequest reqs[100];

int main(int argc, char **argv)
{
    int i, A[100][100], B[100][100][100], C[100][100][100];

    OmniRpcInit(&argc, &argv);
    for (i = 0; i < 100; i++)
        reqs[i] = OmniRpcCallAsync("mul", 100, B[i], A, C[i]);
    OmniRpcWaitAll(100, reqs);
    OmniRpcFinalize();
    return 0;
}
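The slide shows only the client side. Behind the “mul” entry point, the remote worker ultimately runs an ordinary routine; the sketch below is a hypothetical reconstruction of such a kernel, with the argument order mirroring the client call above. In OmniRPC the routine would be exported through an interface definition for the remote module, which is not reproduced here:

/* Hypothetical body of the "mul" worker entry: C = B x A for n-by-n
 * matrices, matching OmniRpcCallAsync("mul", n, B[i], A, C[i]).
 * The real CONFLEX-G workers run geometry optimizations, not this kernel. */
void mul(int n, int b[n][n], int a[n][n], int c[n][n])
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            int s = 0;
            for (int k = 0; k < n; k++)
                s += b[i][k] * a[k][j];
            c[i][j] = s;
        }
}

Because OmniRPC can keep worker processes alive between calls (the persistent data support above), an input shared by all calls, such as the matrix A here, need not be resent for every request.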
OmniRPC features

• Do we need Globus?
  – No: you can use “ssh” as well as “globus”, which is very useful for application people. “ssh” also solves the “firewall” problem.
• What is the data persistence model for?
  – Parametric-search applications need to share their initial data; OmniRPC supports this.
• Can we use many (remote) clusters?
  – Yes, OmniRPC supports a “cluster of clusters”.
• How do we use it on a different machine or environment?
  – You can switch the configuration with a “config file”, without modifying the source program.
• Why not the Grid RPC standard?
  – OmniRPC provides a higher-level interface, to keep “scheduling” and “fault tolerance” away from users.

OmniRPC home page

http://www.omni.hpcc.jp/omnirpc/

CONFLEX from Cluster to Grid

• For large biomolecules, the number of combinatorial trial structures becomes huge!
• Geometry optimization of large molecular structures takes more time to compute!
• The geometry optimization phase accounts for more than 90% of total execution time
• So far, CONFLEX has been executed on a PC cluster using MPI
• The Grid lets us use huge computing resources to overcome these problems!

Our Grid Platform

• Univ. of Tsukuba: Dennis cluster (dual P4 Xeon 2.4 GHz, 10 nodes) and Alice cluster (dual Athlon 1800+, 14 nodes), connected by the Tsukuba WAN
• AIST: UME cluster (dual P3 1.4 GHz, 32 nodes)
• Toyohashi Univ. of Tech.: Toyo cluster (dual Athlon 2000+, 8 nodes)
• Tokushima Univ.: Toku cluster (P3 1.0 GHz, 8 nodes)
• The sites are connected via SINET

Summary of Our Grid Environment

Cluster  Machine              # of nodes  RTT (ms)*  Throughput (MB/s)*
Dennis   Dual P4 Xeon 2.4GHz  10          -          -
Alice    Dual Athlon 1800+    14          0.18       11.22
Toyo     Dual Athlon 1800+    8           13.00      0.55
Toku     P3 1GHz              8           24.40      0.69
UME      Dual P3 1.4GHz       32          2.73       2.12

*Round-trip time and throughput were measured between the Dennis cluster and each other cluster.

CONFLEX-G: Grid-enabled CONFLEX

• Parallelizes the molecular geometry optimization phase using the master/worker model
• The OmniRPC persistent data model (the automatic initializable remote module facility) allows workers to be reused for each call
  – This eliminates re-initializing the worker program at every RPC

[Diagram: selection of initial structure → local perturbation → geometry optimization, farmed out to PC clusters A, B, and C → comparison & store → conformation database.]

Experiment Setting

• CONFLEX version: 402q
• Test data: two molecular samples, C17 (51 atoms) and AlaX16a (181 atoms)
• Authentication method: SSH
• The CONFLEX-G client program was executed on the server node of the Dennis cluster
• We used all nodes in the clusters of our grid

Sample Molecules

Molecule             Parallelism*  Avg. opt. time (s)  # of opt. structures  Est. serial total (s)
C17 (51 atoms)       48            1.6                 522                   835
AlaX16a (181 atoms)  160           300                 320                   96,000 (= 26.7 h)

*Parallelism = number of trial structures at one optimization phase. “Est. serial total” is the estimated execution time for all trial structures on a single Dennis CPU.

Comparison between OmniRPC and MPI on the Dennis Cluster

[Chart: total execution time (s) versus number of workers (1, 2, 4, 8, 16, 20) for C17 (51 atoms, degree of parallelism 48), comparing sequential, MPI, and OmniRPC runs. OmniRPC achieves about a 10x speedup; its remaining gap against MPI is the overhead of on-demand initialization of the worker program.]

Execution time of AlaX16a (181 atoms, degree of parallelism 160)

[Chart: total execution time (s), from 0 to 3500, for Dennis MPI (20 workers), Dennis (20w), Alice (28w), UME (64w), Dennis+Alice (48w), Dennis+UME (84w), Alice+UME (92w), and Dennis+Alice+UME (112w). The largest configuration achieves a 64x speedup.]

Discussion

• The performance of CONFLEX-G was observed to be almost equal to that of CONFLEX with MPI
  – An overhead for initializing workers was found; it will need to be improved
• We achieved a performance improvement by using multiple clusters
  – A speedup of 64 on 112 workers for AlaX16a (181 atoms)
  – However, in our experiment:
    • each worker received only one or two trial structures, which is too few, and
    • load imbalance occurred because the execution time of each optimization varies
  – We expect more speedup for larger molecules

Discussion (cont’d)

• Possible improvements:
  – Exploit more parallelism: parallelize the outer loop to increase the number of structure optimizations done at a time
  – Efficient job scheduling: heavy jobs to fast machines, light jobs to slow machines (can we estimate execution times? a greedy sketch of this idea appears after the summary below)
  – Parallelize the worker program on SMP nodes with OpenMP: this increases per-worker performance and reduces the number of workers

Summary and Future Work

• CONFLEX-G: Grid-enabled molecular conformation search
  – We used OmniRPC to make it Grid-enabled
  – We are now doing production runs
• For MO simulation (docking), we are working on coarse-grain MO as well as job submission
  – REMD (a replica exchange program using NAMD)
  – FMO (fragment MO)
• For QSAR
  – Design of a markup language (XML) to describe computation results
  – A web service interface to access the database
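As referenced in the Discussion, here is a minimal sketch of the “heavy jobs to fast machines” heuristic: sort jobs by estimated cost, then greedily give each job to the machine that would finish it earliest. All job costs, machine speeds, and names are illustrative assumptions, not CONFLEX-G measurements:

#include <stdio.h>
#include <stdlib.h>

/* Greedy "longest job first, earliest finisher wins" scheduling sketch.
 * Costs and speeds are hypothetical; a real scheduler would use measured
 * or predicted optimization times. */

typedef struct { int id; double cost; } Job;

static int by_cost_desc(const void *a, const void *b)
{
    double d = ((const Job *)b)->cost - ((const Job *)a)->cost;
    return (d > 0) - (d < 0);
}

int main(void)
{
    Job jobs[] = { {0, 300}, {1, 835}, {2, 522}, {3, 40}, {4, 610} };
    double speed[] = { 2.4, 1.4, 1.0 };   /* relative machine speeds */
    double busy[]  = { 0.0, 0.0, 0.0 };   /* accumulated time per machine */
    int njobs = (int)(sizeof jobs / sizeof jobs[0]);
    int nmach = (int)(sizeof speed / sizeof speed[0]);

    qsort(jobs, njobs, sizeof jobs[0], by_cost_desc);  /* heaviest first */

    for (int i = 0; i < njobs; i++) {
        int best = 0;
        for (int m = 1; m < nmach; m++)   /* pick the earliest completion */
            if (busy[m] + jobs[i].cost / speed[m] <
                busy[best] + jobs[i].cost / speed[best])
                best = m;
        busy[best] += jobs[i].cost / speed[best];
        printf("job %d (cost %.0f) -> machine %d (busy until %.1f)\n",
               jobs[i].id, jobs[i].cost, best, busy[best]);
    }
    return 0;
}

Such a scheme presupposes per-structure execution-time estimates, which, as noted in the discussion, is itself an open question.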