Molecular Simulation at MCSR Brian W. Hopkins Mississippi Center for Supercomputing Research 9 October 2008 What We’re Doing Here • Talk about molecular simulation procedures. • Talk about available simulation programs. • Discuss user simulation needs and program options. Why Simulation • NOT so we can watch pretty pictures of atoms moving around! • Many systems have very large numbers of internal DoF. • It’s simply not possible to determine what is “the minimum” or whatever on these surfaces. • To get an accurate thermodynamic & kinetic picture of these systems, it’s necessary to sample a medium->large region of phase space. Simulation: Anatomy of A Program • Read in sim paramters, atomic coordinates and topology. • Begin Timestepping – – – – – – Apply periodic boundary conditions. Build neighbor list. Compute bonded forces. Compute short-rang nonbonded forces. Compute electrostatic forces. Integrate forward in time. • Write atomic positions, velocities and forces. • Take data for statistical accumulators. – Repeat. • End timestepping. • Print restart data. Program Anatomy: Read Control nafion #restart temperature steps timestep 298.00 200000 0.00050 ps ensemble nve evbconv 1e-6 movlp 1 cutoff rvdw delr 10.0 angstrom 9.0 angstrom 0.00 angstrom Program Anatomy: Read Config hcl in water 2 1 2000000 21.5518556898 0 0 21.5518556898 0 0 …… OW 7 15.994900 -0.820000 4.5231E-01 4.3198E+00 -6.6975E+00 4.7263E+00 3.4690E-01 -6.3309E+00 -1.9170E+04 5.2043E+03 -2.1576E+03 HW 8 1.007800 0.410000 5.1477E-01 5.3695E+00 -6.6887E+00 1.0886E+01 -6.6502E+00 2.6571E+01 -6.1635E+02 -8.2402E+03 2.6958E+03 HW 9 1.007800 0.410000 -5.5940E-01 4.0344E+00 -6.5708E+00 7.6145E+00 -7.4611E-01 -1.0319E+01 1.8438E+04 1.8167E+03 8.6264E+02 … 0.001 0 0 21.5518556898 Program Anatomy: Read Topology UNITS kcal MOLECULAR TYPES 4 FLEXIBLE-TIP3P-Water NUMMOLS 332 atoms 3 OW 15.9949 -0.8200 HW 1.0078 0.4100 HW 1.0078 0.4100 BONDS 2 harm 1 2 1059.162 1.012 harm 1 3 1059.162 1.012 ANGLES 1 harm 2 1 3 75.900 FINISH …… vdw 9 OW OW lj 0.155425 OT OW lj 0.1238000 HT OW lj 0.0025115 113.240 3.16549 3.1420000 1.5827460 Program Anatomy: Set up Cell Program Anatomy: Build Neighbor List • • • • Apply PBC to cell Begin at an atom Step out, forming thin shells. When an atom falls within the thin shell, add it to the neighbor list. • Continue until cutoff is reached. • Move on to the next atom. Pairlisting Bonded Forces • Identify atoms that are connected by bonds, angles, dihedrals and impropers. • Calculate energies and forces from functions, parameters in input: – Bonds: harmonic, morse – Angles: harmonic – Dihedrals: cosine – Impropers: cosine Short Range Forces • • • • Loop over atoms. For each: Loop over pairs Calculate van der Waals potential, force Calculate short-range (real) ES potential, force • Scale for 1-4 interactions? Long-Range Forces • Because vdW forces obey a 6-12 potential, they fall away very quickly wrt distance. • Consequently, as you move away from an atom through a simulation cell you quickly come to a point where including additional vdW terms is irrelevant. Hence, the cutoff. • Coulomb forces and energies, however, fall off with r2 and r respectively. • As a result, no reasonable cutoff will be satisfactory; results will always be strange. Solution: Ewald Summation • The coulomb equations converged much more quickly in Fourier space. • So, we can transform all atomic coordinates to Fourier space, compute ES energies until some convergence criterion is reached, then transform the forces and energies back to real space. • This is called an Ewald sum; almost all modern simulations use some form of it. Ewald Sums Are Expensive • Ewald sums still require computing interactions between atoms and many layers of periodic images. • Fourier transforms are demanding. • As much as 80% of a modern simulation is spent in the Ewald routines. – 15% in pairlisting/vdw and just 5% everywhere else Speeding up the Ewald sum • Computational approaches: – Smarter decompositions – Faster comms – Single precision FFT • Physical approximations: – “Multiple timestepping” – Particle mesh approaches The Particle Mesh Ewald Sum • Atomic charges are first mapped onto a regular grid. • The gridpoints are transformed, worked on, and transformed back. • Forces are then decomposed from grid back onto atoms. PME Costs and Benefits • Scales with NlogN rather than N2 • Does entail extra cost to map/unmap atomic charges to grid. – Crossover point? • Some energetic slop is associated with these methods. • Some flavors (PPPM, &c.) may interfere with thermo and barostats. Integration • Once a full set of forces has been calculated, the atoms are allowed to move in response to those forces • This is done by calculating integral forms of Newton’s Laws. – Verlet, leapfrog, velocity verlet, &c. • How far can you integrate forward? Accumulation • Recall that the whole point here is to take large scale statistical averages of thermdynamic properties. • Some of these properties are a lot easier to calculate inside the program than they will be to calculate from trajectories later. • So, every so often, the program needs to take a second and compute – Properties in the current step – Rolling average from the previous steps. Output: Accumulators ---------------------------------------------------------------------------------------------------------------------------------step time(ps) cpu (s) eng_tot eng_pv volume temp_tot temp_rot temp_shl eng_cfg vir_cfg eng_shl eng_vdw vir_vdw vir_shl eng_cou vir_cou alpha eng_bnd vir_bnd beta eng_ang vir_ang gamma eng_dih vir_con vir_pmf eng_tet vir_tet press eng_kin conserv ---------------------------------------------------------------------------------------------------------------------------------1 8.7047E+01 8.9004E+02 -3.4994E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.2204E+02 0.001 8.7047E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.98 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 9.0000E+01 9.0000E+01 9.0000E+01 0.0000E+00 0.0000E+00 rolling averages 8.7047E+01 8.9004E+02 -3.4994E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.2204E+02 8.7047E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 9.0000E+01 9.0000E+01 9.0000E+01 0.0000E+00 0.0000E+00 ---------------------------------------------------------------------------------------------------------------------------------- Output: Trajectory timestep OW 0.0000E+00 4.4176E-01 -1.1626E+04 HW 0.0000E+00 1.7204E-01 -4.1073E+03 HW 8.9567E-01 7.7140E+01 3.2399E+04 OW 2.4602E+00 -1.1793E+01 -3.5980E+04 HW 3.0701E+00 9.2834E+00 2.2288E+04 1 48 2 1 15.994900 -0.820000 0.0000E+00 0.0000E+00 -2.3695E+00 -4.4226E+00 -3.2094E+04 -1.7497E+04 2 1.007800 0.410000 0.0000E+00 9.4700E-01 1.3940E+01 5.0345E+01 1.5125E+03 3.6299E+04 3 1.007800 0.410000 0.0000E+00 -3.1667E-01 4.7094E+00 -3.8531E+01 -1.4520E+03 -1.4832E+04 4 15.994900 -0.820000 -1.0000E-06 -9.0073E-01 -1.9654E-01 -5.1510E-01 7.3873E+02 8.1414E+03 5 1.007800 0.410000 -1.0000E-06 -1.7237E-01 -2.3632E+00 6.3046E+01 -5.2163E+01 2.1425E+04 0 0.001000 Anatomy of a Simulation Project • Identify a problem – Molecular system – Region of phase space • • • • • Build a system Minimize Equilibrate Run MD Analyze Choosing a Program • How intense is your job? • Do you need particular features? – Force functions – Thermostats – Barostat – Sampling methods – &c. • Ask for help!! Building Your System • It’s nontrivial to place 1.5 million atoms in a cubic simulation cell. • Biomolecules: PDB, homology modelling, &c. • Others: replication of small boxes. • Lots of tools for more complicated cases. • Sometimes this is the hardest part. Minimization • You probably didn’t do a very good job building a box. • If anything about your box makes the forcefield angry, it can explode under dynamics. • Solution is to run minimization for some time to eliminate bad contacts. Equilibration • Minimization brings your box down to ~0K. • You need to heat it back up to the intersting temperature. • Assign velocities to atoms from a Boltzmann distribution. • This will be wacky; temperature and energy will be unstable. • To fix, run some “equilibration time” – Don’t take statistics – Periodically scale atomic velocities Run • Typical MD runs use a 0.5-2.0 fs timestep and run for 100ps-2ns. • This means anywhere from 50,000 to 4 million timesteps. • Trajectories are typically output every100-1000 timesteps. • Statistics are taken every 100 or so timesteps. How Do I Know My Simulation Doesn’t Suck? • Is it conserving what it’s supposed to be conserving? • Is it conserving what it’s not really supposed to be conserving? • Are you in a region of phase space that interests you? • Have you checked for common failures? – Flying icecube, &c. My Simulation is Over. What Now? • Now you have large amounts of – thermochemical data – atomistic trajectories • From this you can calculate stuff – rates of reactions – diffusion coefficients – structural properties (rdf, sf, &c.) How Do I Do That? • Canned analysis programs – VMD! • Write your own programs – Everybody does this! • We may talk about this another time. What We Can Do For You • Help you select simulation software – Almost all MD software is free, so we can get and install most packages you could want. • Help you set up your simulation cell. • Help you manage your jobs as they run. • Help you write, parallelize, and run analysis code.