Chemistry in Parallel Computing
Brian W. Hopkins
Mississippi Center for Supercomputing Research
29 January 2009

What We’re Doing Here
• Discuss the importance of computational chemistry in the HPC field.
• Define some common terms in computational chemical sciences.
• Discuss the two major branches of computational chemistry.
• Discuss the particular needs of various computational chemistry applications and methodologies.

Why We’re Doing It
• Computational chemistry is one of the driving forces for continued investment in HPC infrastructure.
  – Better System Use
  – More Production
  – Less Money
  – More Opportunities
  – &c.
• Stop me if there are questions!

The Primacy of Computational Chemistry in HPC
• Nationwide, computational chemistry and molecular biology consume a very large share of HPC resources.

  Scientific Discipline            % of Total Allocation   % of Total Usage
  Molecular Biosciences                     19                    23
  Physics                                   17                    23
  Astronomical Sciences                     14                    19
  Chemistry                                 12                    21
  Materials Research                        10                     4
  Chemical, Thermal Systems                  8                     6
  Atmospheric Sciences                       7                     7
  Advanced Scientific Computing              3                     2
  All Others                                10                   109

  54% Total Use!
• Here at UM and MCSR, CC is even more important:

Computational Chemistry at UM/MCSR
• Quantum programs are the biggest consumer of resources at MCSR by far:
  – Redwood: 99% (98 of 99 jobs)
  – Mimosa: 100% (86 of 86 jobs)
  – Sweetgum: 100% (24 of 24 jobs)
• The one job in this snapshot that was not a QC job was an AMBER MD simulation.
• This is typical.

Computational Chemistry: A Sort-Of Dichotomy
• Quantum chemistry is the attempt to solve the molecular electronic Schrödinger equation, and to compute chemical properties therefrom.
• Molecular dynamics is the attempt to simulate the motion of atoms and molecules in space over short (1-10 ns) timespans.
• There is actually some overlap between the two.

Quantum Chemistry: Overview
• The equations that describe chemical behavior are known: $E\Psi = \hat{H}\Psi$
• While known, these equations are not solvable by any analytic approach.
• The basic problem: interdependence of a very large number of electronic coordinates.
• While analytic solutions are not available, approximate numerical solutions are.

The Polynomial Scaling Problem
• Because of the complexity of the Schrödinger equation, the baseline QC method (HF theory) scales with the system size $N$ as $O(N^4)$.
• More accurate methods scale from $O(N^4)$ to $O(N^8)$.
• The very best method scales with (get this) $O(N!)$.
• “System size” here is some cooked-up number generated by hashing the number of electrons, the number of orbitals, symmetry, &c.
• The polynomial scaling problem applies to every resource used by a job: CPU time, memory, disk, everything.

A Word on Alphabet Soup
• Always remember that the Schrödinger equation cannot be solved exactly; we’re always working at some level of approximation:
  – HF
  – DFT
  – MP2
  – MP4
  – CCSD, CISD
  – CCSD(T)
  – CCSDT, CISDT
  – CCSDTQ, CISDTQ
  – …
  – FCC, FCI
  (Moving down the list: increasing accuracy, increasing expense, decreasing scalability, decreasing availability.)
• The fewer approximations we make, the better the results (and the more the calculation costs).

Iteration in Quantum Chemistry
• To solve the interdependence of coordinates, QC programs rely on iteration.
• A guess is made for the location of each electron; that guess is processed; lather, rinse, repeat.
• When the solution stops changing, you’re done.
• The converged solution gives both a total energy of the molecule and a wavefunction that describes its state.

So…What, Exactly, Is This Program Doing?
• Building a guess wavefunction, represented by a huge 4D matrix of double-precision numbers.
• Processing that matrix in a variety of ways (mostly matrix multiplies and inversions).
• Diagonalizing the matrix.
• Using the resulting eigenvectors to build a new guess.
• Iterating until self-consistency.

Common Chemical Properties
• Many common chemical properties are computed by building derivatives of the molecular electronic wavefunction.
  – molecular structures
  – harmonic vibrational frequencies
  – polarizabilities
  – &c.
• These derivatives can be calculated analytically or numerically.

Geometry Optimization
• One extremely common job type is the geometry optimization.
• Procedure:
  – Start with a guess set of nuclear coordinates
  – Compute the wavefunction for the molecule
  – Compute the derivative of the wavefunction with respect to the nuclear coordinates
  – Adjust the nuclear coordinates
  – Repeat until the derivative is within tolerance of zero in every dimension
• Note that this is a nested iteration: we’re iterating to build a wavefunction,

    Requested convergence on RMS density matrix=1.00D-08 within 64 cycles.
    Requested convergence on MAX density matrix=1.00D-06.
    SCF Done:  E(RHF) = -565.830259809  A.U. after 16 cycles
               Convg  = 0.7301D-08      -V/T = 2.0017
               S**2   = 0.0000

• then we’re iterating again to find a geometry

    Item                      Value      Threshold   Converged?
    Maximum Force             0.000289   0.000015    NO
    RMS Force                 0.000078   0.000010    NO
    Maximum Displacement      0.037535   0.000060    NO
    RMS Displacement          0.006427   0.000040    NO

Analytic vs. Numerical Derivatives
• Computing derivatives can be done two ways:
  – analytically, if the relevant functional form is in the code
      • adds significant expense relative to the underlying energy point
      • often not as scalable as the corresponding energy point calculation
  – numerically, by finite displacements of the relevant properties
      • always available; just do lots (and lots, and lots) of energy points ($3N-6$ internal coordinates, or $3N-5$ for a linear molecule)
      • embarrassingly parallel

Scaling a Quantum Chemistry App
• QC apps tend not to be very scalable.
• There’s often no really good way to decompose the problem for workers.
  – symmetry blocks excepted
• As a result, these codes are extremely talky.
• Talkiness is mitigated somewhat by use of specialized MP libs (TCGMSG).
• Also, the biggest jobs tend to be I/O intensive, which murders performance.
• SMP is better than MP, but limited by machine size (watch out!)
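
The finite-displacement route to derivatives described above is the one part of a QC workload that decomposes cleanly, because every displaced energy point is independent of the others. The sketch below illustrates the idea with central differences. It is only an illustration under stated assumptions: `toy_energy` is a hypothetical stand-in for a real energy calculation, the displacements are applied to plain coordinates rather than true internal coordinates, and a thread pool stands in for what would in practice be separate batch jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_energy(coords):
    """Hypothetical stand-in for one QC energy point (a simple quadratic well)."""
    return sum((x - 1.0) ** 2 for x in coords)


def displaced(coords, index, step):
    """Return a copy of coords with one coordinate shifted by step."""
    new = list(coords)
    new[index] += step
    return new


def numerical_gradient(energy, coords, step=1e-4, workers=4):
    """Central-difference gradient: two independent energy points per coordinate."""
    # Build every displaced geometry up front; none depends on another.
    geometries = []
    for i in range(len(coords)):
        geometries.append(displaced(coords, i, +step))
        geometries.append(displaced(coords, i, -step))

    # Farm the energy points out to workers (assumption: threads here,
    # separate jobs on a few procs each in a real setting).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        energies = list(pool.map(energy, geometries))

    # Combine each +/- pair into one gradient component.
    return [
        (energies[2 * i] - energies[2 * i + 1]) / (2.0 * step)
        for i in range(len(coords))
    ]


if __name__ == "__main__":
    print(numerical_gradient(toy_energy, [0.0, 1.0, 2.5]))
```

The only communication is the final gather of energies, which is why this pattern sidesteps the talkiness of a single parallel QC job.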

The Scaling Wall
• Gaussian scales to ~8 procs in the very best cases; many jobs will not scale at all.
• NWChem will scale for most jobs to a few dozen procs; some jobs to just a handful.
• MPQC will scale to many procs, but functionality is limited.
• All parallel QC programs show some limited soft scaling.
• Always consult program manuals for scalability of a new method.
• For most quantum chemists, the principal utility of a big machine like Redwood is for running a large number of jobs on a few procs each.

Quantum Chemistry and the Computing Specialist
• User-set parameters:
  – the molecule to be studied
  – the set of orbitals used to describe the molecule (i.e., basis set)
  – the level of approximation used to compute
• Opportunities for user guidance:
  – what program to use?
  – how to build/optimize that program for a platform
  – how to effectively run the program on the machine
  – identification of common pitfalls (and pratfalls, too)
• PARALLEL PROJECTS, SERIAL JOBS

Molecular Simulation Methods
• Basic idea: do a very rudimentary energy calculation for a very large number of atomic configurations; translate these energies into thermodynamic properties via the molecular partition function.
• Configurations can be determined either deterministically (MD) or stochastically (MC), but that doesn’t matter.
  – we’ll look at MD as an example

The Molecular Dynamics Procedure
• Begin with a set of molecules in a periodic box
  – like Asteroids, only geekier
• Compute instantaneous forces on every atom in the box
  – bonds, angles, dihedrals, impropers within molecules
  – vdW and coulomb forces for proximal atoms
  – kspace electrostatic forces for distal atoms
• Allow the atoms to move for 0.5 to 2 fs.
• Repeat from 100,000 to 10,000,000 times.
• Occasionally print out atomic positions, velocities, forces, and thermodynamic properties.
• Most analysis done post-hoc.
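
The procedure above can be sketched as a short loop. This is a minimal illustration, not how any production MD code is written: it assumes a Lennard-Jones pair force in reduced units (epsilon = sigma = mass = 1) as the only interaction, a velocity Verlet integrator, and a cubic periodic box with the minimum-image convention. Bonded terms and kspace electrostatics are left out, and the names `lj_forces` and `run_md` are made up for this example.

```python
def minimum_image(d, box):
    """Wrap one displacement component to the nearest periodic image."""
    return d - box * round(d / box)


def lj_forces(pos, box, cutoff=2.5):
    """Short-range pair forces and potential energy with a plain cutoff."""
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    potential = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = [minimum_image(pos[i][k] - pos[j][k], box) for k in range(3)]
            r2 = sum(c * c for c in d)
            if r2 >= cutoff * cutoff:
                continue  # pair is beyond the cutoff: excluded
            inv6 = 1.0 / r2 ** 3
            potential += 4.0 * (inv6 * inv6 - inv6)
            scale = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
            for k in range(3):
                forces[i][k] += scale * d[k]
                forces[j][k] -= scale * d[k]
    return forces, potential


def run_md(pos, vel, box, dt=0.002, steps=1000, print_every=100):
    """Velocity Verlet loop; pos and vel are updated in place."""
    n = len(pos)
    forces, potential = lj_forces(pos, box)
    log = []
    for step in range(1, steps + 1):
        # Half-kick the velocities, then move the atoms (wrapping into the box).
        for i in range(n):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k]
                pos[i][k] = (pos[i][k] + dt * vel[i][k]) % box
        # Recompute forces at the new positions and finish the kick.
        forces, potential = lj_forces(pos, box)
        for i in range(n):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k]
        # Occasionally record a thermodynamic property (here: total energy).
        if step % print_every == 0:
            kinetic = 0.5 * sum(v * v for atom in vel for v in atom)
            log.append((step, potential + kinetic))
    return log


if __name__ == "__main__":
    positions = [[4.25, 5.0, 5.0], [5.75, 5.0, 5.0]]
    velocities = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for step, energy in run_md(positions, velocities, box=10.0):
        print(step, energy)
```

Even in this toy form the cost structure is visible: the double loop over pairs dominates, and it is this force evaluation that a spatial decomposition would split across processors.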

A Snapshot of Computational Demands
• Bonded forces:
  – bonds
  – angles
  – dihedrals
  – impropers
• Short-range pair forces:
  – van der Waals forces
  – coulomb forces
• Long-range pair forces:
  – kspace Ewald sum for electrostatics

What’s an Ewald Sum?
• The energy of interaction for a van der Waals pair falls off as $r^{-6}$.
• Consequently there’s a point (~10 Å) where these interactions are negligible and can be excluded.
• We set this point as a cutoff and exclude vdW pairs beyond it.
• By contrast, the electrostatic (ES) potential falls off only as $r^{-1}$.
• Over the years, it’s been demonstrated that imposing even a very long cutoff on the ES part is an unacceptable approximation.
• The Ewald sum is the current solution.

Electrostatics in Fourier Space
• When transformed into Fourier space (aka kspace), the ES part converges more rapidly.
• Thus, the Ewald sum:
  – Real space ES for pairs within the vdW cutoff
  – Fourier (k-)space ES for longer range pairs until convergence is achieved (usually 5-7 layers of periodic images).
  – Fourier space ES for short range pairs as a correction against double counting.
• And the particle-mesh Ewald sum:
  – Real space as in Ewald
  – Fourier space done by projecting atomic forces onto a grid, computing kspace part, projecting forces from grid back onto atoms

The Cost of an Ewald Sum
• An ordinary, real-space long-range electrostatics (LRES) calculation would eventually converge, but require >15 periodic images to do so.
• The kspace LRES of Ewald is similarly pairwise and similarly scales with $N^2$.
• However, the more rapid convergence means we only need 5-7 periodic images.
• On the other hand, we now have to do some extra stuff:
  – 3D FFT to generate kspace part
  – extra short-range (SR) calcs for double counting correction
  – 3D FFT to transform Ewald forces back to real space

Particle-Mesh Ewald
• The expense of an Ewald sum can be reduced by mapping atomic (particle) charges onto a grid (mesh) with far fewer points.
• This grid can be transformed into kspace, the Ewald sum done, and transformed back.
• The Ewald forces are then reverse-mapped onto the atoms that made up the grid.
• This is an approximation, of which there are various flavors:
  – PME
  – SPME
  – PPPM
  – &c.
• Scales as $N \log N$ rather than $N^2$.
• Reduced scaling + extra mapping expense = crossover point?
• In practice, crossover point is so low as to not matter.

Scaling an MD Simulation
• MD programs lend themselves moderately well to MP parallelism.
• Typical approach is spatial decomposition; i.e., each PE computes forces & integrates for a region within the simulation cell.
• Talk points are limited:
  – Pairlist builds
  – 3D FFT for Ewald sum
  – Mapping and unmapping for PME
  – New atomic positions in integrator
• Thus, the main issue is whether or not there’s enough work to go around, so soft scaling is very pronounced.
• I/O can be an issue, esp. for large boxes; should be limited to once every 100-1000 steps, though.

So, How Do MD Programs Do?
• As usual, there’s a compromise to be made between “full featured” and “high performance”.
• Old, heavily developed platforms like Amber have lots of features but only scale moderately well (to a few dozen procs).
• New, high tech platforms like NAMD scale to hundreds of procs but lack common features (NH, &c.)
• Again, all MD programs exhibit pronounced soft scaling:
  – bigger problems are more accessible
  – smaller problems are no faster.

Moving Between MD Programs
• Trickier than moving between QC programs.
• There are a lot of subtle things that must be considered:
  – available PME approaches
  – different pairlisting algorithms
  – trajectory synchronization
  – scaling of 1-3 and 1-4 interactions
  – &c.
• It’s almost never possible to build a single research project on simulations run with two different programs.
• Thus, it’s critical to choose the right program for the whole job at the beginning.

Molecular Simulation and the Computing Specialist
• User-set parameters:
  – the size of the box
  – the length of the timestep
  – the number of steps needed
  – the forcefield features needed
• Opportunities for guidance:
  – which program to use
  – building/optimizing for a platform
  – scaling limits for a job of given size
  – &c.

Summary
• Quantum Chemistry:
  – listen to what your users need
  – help the user organize jobs into “parallel projects”
  – go shopping for the best-scaling program to do individual job types
  – programs are more or less perfectly interchangeable, with a little care
• Molecular Simulation:
  – listen to what your users need
  – help the user shop for the best program for their sim
  – be careful about what you choose, because you’ll be stuck with it