Exploring Algorithm Space: Variations on the Exchange Theme
Daniel M. Zuckerman
Department of Computational Biology, School of Medicine, University of Pittsburgh

Goal
• More efficient atomistic sampling, consistent with statistical mechanics
• Take care with the meaning of "efficiency"

Outline
• Protein fluctuations in biology
• Replica exchange simulation -- a second look
• Resolution exchange simulation
  – Initial results
  – How to approach larger systems?
• Exchange variants
• Assessing sampling

Transport Proteins Fluctuate - I
Transport Proteins Fluctuate - II
Motor Proteins Fluctuate
Signalling Proteins Fluctuate
[figure slides]

Conformational Change Requires Fluctuation
• Either the ligand leaves a free-like bound structure, or the ligand binds a bound-like free structure (or nearly so)
[figure: free and ligand-bound structures]

Biology Take-Home Message
• Fluctuations are ubiquitous and essential
  – They are not a sideshow; they are the show!
• Experimental structures are only snapshots -- just the beginning of the story

Key for medicinal chemists especially
• Drug design via "docking" is a key practical use of molecular modeling
  – Typically, drug candidate molecules are fitted into static protein structures
  – Common lament: need to know protein fluctuations
• Necessary for free energy calculations
  – e.g., binding affinity

Questioning low RMSD in MD
• Is 1.3 Å right? What is nature's average RMSD?
[figure: RMSD vs. time, plateauing at 1–1.5 Å]

A Physical View of Fluctuations
• Rough, high-dimensional energy landscape U(x)

Simplest Physical Picture: Bistable system
• Most phenomena can be understood from a toy picture
[figures: double-well U(x), bimodal p(x), trajectory x(t) hopping between states]

Defining the Problem
• We want a good sample of p(x)
  – "Equilibrium distribution"
  – "Complete canonical ensemble"
  – Probability density function
  – x is a vector in configuration space -- i.e., the vector of all coordinates: (x1, y1, z1, x2, y2, z2, …)
• In English: we want a set of structures distributed according to their probability of occurrence at the specified temperature
• Hard because we access p(x) only indirectly
  – Like a blind person feeling an elephant

It's NOT optimization/search/minimization!
• However, undiscovered sampling algorithms may be similar to search algorithms!

The Problem with the Problem
• It's too hard!!
• Present methods, implemented on standard computers, are inadequate by orders of magnitude -- think timescales
  – Simulations access nsec - msec timescales
  – Proteins fluctuate on nsec - sec timescales
  – 3-9 orders of magnitude short!
• Today: taking steps toward the solution

Theoretical/Computational Basics
• Boltzmann factor:
    p(x) ∝ exp[-U(x) / kB T]
• "Forcefield" (potential energy function)
  – Maps the configuration vector to a real number:
    U(x) = ½ k1 (l1 - l1⁰)² + ½ k2 (l2 - l2⁰)² + … + ½ κ1 (θ1 - θ1⁰)² + ½ κ2 (θ2 - θ2⁰)² + …
  – Terms not shown: sterics, electrostatics, four-body (e.g., dihedral)
[diagrams: chain with bond lengths l1, l2, l3 and angles θ1, θ2; harmonic bond potential U(l1) with minimum at l1⁰]

Exchange Schemes
• Original idea: use higher temperature to facilitate barrier crossing [Swendsen, 1986]
  – Barriers are the real problem
• Arrhenius law: rate ~ barrier's Boltzmann factor
    k ∝ exp[-ΔU_fwd / kB T]
[figure: landscape U(x) with forward barrier ΔU_fwd]

Exchange Ladder
• High-temperature hops percolate down via configuration swaps (≡ temperature swaps)
  – Independent simulations with occasional exchange attempts
[figure: ladder of replicas from 300K up to "hot", with exchange attempts over time]

How does replica exchange work?
• It's just Monte Carlo
• Physics view of Metropolis
  – Accept trial move xold → xtry with probability min[1, exp(-ΔU/kT)]
  – ΔU = U(xtry) - U(xold)
• Probability view:
  – Accept with min[1, prob(try)/prob(old)]

Exchange as simple Monte Carlo
• Exchanges are only attempted in pairs
• Two independent simulations
  – Probability for the combined system is a simple product: p = p1 * p2
  – Metropolis criterion: min[1, ptry / pold]
    pold = p(x1; T1) p(x2; T2)
    ptry = p(x2; T1) p(x1; T2)
[figure: two replicas at T1 = 300K and T2 = hot, exchanging over time]

Does replica exchange really help?
• For a given investment of CPU time, is better fixed-T sampling achieved?
  – Compared to an equal-time direct simulation -- e.g., for a 20-level ladder, a simulation 20 times as long
• To my knowledge, no convincing evidence yet
• Key: sampling is limited by the top level
• Worry 1: high T does not help with entropic barriers
  – Hard-to-find low-energy pathways
• Worry 2: high T is not so helpful for low barriers
  – Simulations and experiments suggest barriers are low
  – Even for a 600K simulation, only a moderate speedup:
    speedup = exp[-ΔU / kB(600K)] / exp[-ΔU / kB(300K)]
    • ΔU = 2kT → 2.7× speedup
    • ΔU = 4kT → 7.4× speedup
    • ΔU = 6kT → 20.1× speedup

Summary of Concerns re Replica Exchange
• Efficiency limited by the top level (highest T)
• Highest T may not be fast enough for biomolecules
  – High T does not affect entropic barriers
  – Energy barriers may be low
• Should work for sufficiently high energy barriers

Can replica exchange be fixed?
• Yes
• Two improvements today
• Plus a sketch of other variants

Improvement (1): Pseudo-exchanges
• Key: need complete sampling at the top level (highest T)
• Work from the top down … if we can "pseudo-exchange"
• The top level can be generated with multiple simulations
[figure: top-down ladders, hot and 300K, over time]

Anatomy of a Pseudo-Exchange
• Point 1: normal exchanges need not be performed at identical intervals
  – Not required in the derivation of the Metropolis criterion
  – Imagine one fast CPU and one slow CPU
• Point 2: imagine the top-level CPU is extremely fast
  – Long intervals → no correlations → equilibrium distribution
  – Alternatively, view the top level as "perfect" Monte Carlo drawing from the equilibrium distribution
• Conclusion: no need to continue the top-level simulation from the exchanged configuration → can pull randomly from the top level each time

Two Ways to Use Pseudo-Exchange
• Same ladder
• More widely spaced ladder
  – Lower acceptance is OK since trials are cheap (serial)
  – No need for frequent attempts in parallel since there are few high-T hops
• Essentially guaranteed to be more efficient than standard parallel replica exchange
[figure: ladders with a hotter top level above 300K, over time]

Top-down test: Di-leucine Peptide
• Two amino-acid peptide with two main conformations
• 50 atoms (144 degrees of freedom)
• Langevin dynamics; GBSA continuum solvent model
  – ALL SIMULATIONS

Example: Di-leucine via two-level ladder
• Di-leucine, a 50-atom peptide: two levels only
[figure: trajectories at T=500K and T=298K, using pseudo-exchanges with a shuffled 500K trajectory]

Not really efficient
• The boost to 500K only modestly increases the hop rate
  – In 300 nsec: 488 hops at 500K vs. 300 at 298K
  – Barriers are too low
• Ordinary trajectories shown (no exchange)
• Still should be better than a parallel exchange simulation
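The exchange acceptance rule and the Arrhenius speedup estimates from the replica-exchange slides can be sketched numerically. This is a minimal illustration in reduced units, not the code behind the talk's results; `KB = 1` and the specific temperatures are assumptions for the example.

```python
import math

KB = 1.0  # Boltzmann constant in reduced units (assumption for illustration)

def swap_acceptance(u1, u2, t1, t2):
    """Metropolis probability for swapping configurations between two
    replicas: min[1, exp((1/kT1 - 1/kT2) * (U1 - U2))], which follows from
    p(x; T) ~ exp(-U(x)/kT) applied to ptry/pold of the combined system."""
    delta = (1.0 / (KB * t1) - 1.0 / (KB * t2)) * (u1 - u2)
    return min(1.0, math.exp(delta))

def arrhenius_speedup(barrier, t_low, t_high):
    """Ratio of barrier-crossing rates at t_high vs. t_low,
    using the Arrhenius law: rate ~ exp(-barrier / kT)."""
    return math.exp(barrier / (KB * t_low) - barrier / (KB * t_high))

# Speedup of a 600K simulation over 300K, for barriers of 2, 4, 6 kT (T = 300K)
for n in (2, 4, 6):
    barrier = n * KB * 300.0
    print(n, round(arrhenius_speedup(barrier, 300.0, 600.0), 1))
# prints 2.7, 7.4, 20.1 -- the modest speedups quoted in the slides
```

Note that a swap between equal-energy configurations, or between equal temperatures, is always accepted, which is a quick sanity check on the criterion.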
Improvement (2): Resolution Exchange
• Coarse ↔ detailed
• Canonical sampling in the detailed model

Dreams of multi-scale modeling
• (At least) since Levitt and Warshel, Nature (1975)
• Warshel -- free energy for a detailed model based on a coarse-grained reference (1999)
• Brandt and collaborators -- complex multi-level formulation
• Vendruscolo and coworkers -- ad hoc addition of atomic detail onto coarse structures
• Resolution exchange is concrete, simple, and general

Improvement (2): Resolution Exchange
• Qualitative picture
[figure: COARSE and detailed simulations with exchange attempts over time]

Implementing Resolution Exchange
• Need to:
  – Formulate it as an exchange process
  – Derive the acceptance criterion
• The coarse model will use a subset of coordinates
  – Detailed (regular) model: x = (l1, l2, l3, …, θ1, θ2, …, φ1, φ2, …)
  – Coarse model is a subset, e.g., φ = (φ1, φ2, …)
  – Arbitrary potential Ucoarse(φ) -- i.e., pcrs(φ) ∝ exp[-Ucoarse(φ) / kT]
  – Simply exchange the common coordinates

Key Point: Subsets are natural for coarse models
• Examples
  – Dihedrals only (fixed angles, lengths)
  – Backbone coordinates only
  – Side-chains by beta carbons
• Proteins are branched chains

Res-Ex Metropolis Criterion
• The trial exchange (coarse ↔ detailed)
  – From: (la, θa, φa) and φb ["old"]
  – To: (la, θa, φb) and φa ["try"]
• Metropolis: min[1, ptot(try) / ptot(old)]
• Final criterion: min[1, R] with
    R = [pdtl(la, θa, φb) pcrs(φa)] / [pdtl(la, θa, φa) pcrs(φb)]
• CANONICAL SAMPLING FOR ALL COORDS, ALL LEVELS!!!

Downside of Res-ex: more work!
• The ladder needs to be engineered
• Analogy to replica exchange: a limit on the difference between models
  – Simple solution (later)
• Implicit solvent: still a hard and important problem

You can recycle!
• The top-down approach (pseudo-exchanges) permits old trajectories to be exchanged into new ones
  – New temperature
  – New forcefield
• Same or different numbers of coordinates
• Minimal CPU cost, if the original trajectory already crossed barriers

Initial Results
• Still early stages
• Verifying the algorithm
• Efficiency in a 50-atom di-peptide
• [A penta-peptide]
• Reduced models of proteins are reasonable

Algorithm Check: Butane
• Butane is C4H10
[figure: distribution of the central dihedral φ; line is from direct simulation]

Real Molecular Test: Di-leucine Peptide
• Two amino-acid peptide with two main conformations
• Exchange all-atom to united-atom (GBSA "solvent")
  – Eliminate non-polar H
  – 50 atoms to 24 "united atoms"

Initial Results: Res-ex really works
• CPU savings: factor of 15 (including the united-atom cost)

Leucine free energy difference via Res-Ex
• ΔGab measures whether the correct time is spent in each state
• Increased precision indicates speedup (first report??)
• Cost of the united-atom simulation is included in the graph
[figure: comparison with a long brute-force simulation]

Comments
• Results obtained from a two-level ladder
• Faster sampling should be possible with more levels
  – Requires forcefield engineering
• Can use higher temperature also
  – AND/OR softer parameters

Spin Systems Too
• Absolute spins …
• … or block spins as coarse variables
  – Relative spins as detailed coordinates (+/–)
[figures: spin configurations at coarse and detailed resolution]

How do we progress from here?
• Need an exchangeable ladder
  – But we have design criteria
• The top level needs to explore the important fluctuations

A Possible Ladder
1. Backbone only (Go interactions)
2. Backbone + beta-carbon "side-chains"
3. United groups (quasi-rigid)
4. United atom
5. All atom
• Each level omits specific internal coordinates
• Other levels may be needed

Key Point: Resolution Difference is Tunable
• Can (de)coarsen part of a molecule at a time
  – e.g., groups of 3 residues
[figure: gradations from all-coarse to all-detailed]
• Initial results: Met-enkephalin
  – Less overall CPU time for de-coarsening one residue at a time vs.
whole molecule (for a fixed number of "hops")
  – An order of magnitude more efficient than single-step de-coarsening
  – Poster by Ed Lyman

Resolution Exchange Variants
• Switching
  – Coarse simulation as an MC trial
• Decorating
  – Sample coarse and detailed coordinates separately
  – Re-weight by the true Boltzmann factor:
    pfull(l, θ, φ) vs. pcrs(φ) paddl(l, θ)
• "Algorithm space" has not been fully sampled!

Annealing based approach: replica exchange variant
• Anneal from hot to cold
• Can be re-weighted for canonical sampling at low T [Neal, 2001]
• Equivalent to Jarzynski (exactly)
[figure: annealing paths from λ=0 to λ=1]

So you've got a new method … how do we judge sampling quality?
• Without an enumerative technique, it is generally impossible to guarantee full sampling
  – Can't know about unseen regions
• The best we can hope for: the proper distribution among visited states
  – Very difficult [new approach under study]
• We can show: lack of convergence, even among visited states

Previous Approaches
• Stare at an RMSD vs. time plot
• Principal components
  – Mostly 2D visual inspection
  – How to quantify?
• Van Gunsteren and co-workers: cluster counting
  – Fails to account for relative populations

New Approach: cluster, then classify
1. Cluster via (e.g.) an RMSD threshold
2. Choose a reference structure from each cluster
3. Re-analyze the trajectory, classifying (binning) each structure with the closest reference
• Classification is statistically "rigorous"
• A simple 1D histogram results
• Easy to implement for large proteins

Met-enkephalin: the old view
• Is it converged?

Evolution of Distribution
[figure: histograms at 2, 4, 10, 50, and 198 nsec]

Self-referential comparison: 1st vs. 2nd half
[figure: comparisons at 4, 20, 100, and 198 nsec]

Conclusions
• Sampling matters -- life runs on fluctuations
• Parallel replica exchange has key limitations
• Resolution exchange (+ top-down) offers hope
  – Good results using only two levels, single T
  – Much work to be done in completing a ladder
  – BUT: a concrete path to ever-increasing efficiency
• Res-ex applies to molecular and spin systems and …?
• Algorithm space is large -- many variants
• Semi-systematic convergence analysis

Acknowledgments
• Edward Lyman
• Marty Ytreberg, Svetlana Aroutiounian
• Ivet Bahar, Robert Swendsen, Hagai Meirovitch, Carlos Camacho, Eva Meirovitch
• Funding
  – NIH
  – Depts. of Computational Biology, Environmental & Occupational Health

A more complete picture
• In configuration space
[figure: landscape in (x1, x2)]

If barriers are low, why are dynamics slow?
• Too many barriers!
[figure: rugged U(x)]

Buildup Schemes
• Stochastic growth of the molecule
  – Not dynamics
  – Re-weighting using the Boltzmann factor and the distribution used for construction

Multiple-histogram view of Replica Exchange
• Temperature increments are constructed to minimize overlap
  – Just enough to permit exchange
  – WHAM just fills out the high-energy tail of the coldest distribution
[figure: energy histograms from coldest (target) to hottest]

Top-down: how much CPU time saved?
• Optimal: the time spent at low T is tiny
  – Cost is the same as for high T
• Downside: limited by the Arrhenius factor

Top-down vs. Parallel: Rough Comparison
• Typical standard replica exchange
  – 20 levels tuned to a 20% exchange-acceptance ratio
  – 1 nsec each (10⁶ snapshots/energy calls)
  – No need to attempt frequent exchanges due to relatively slow top-level dynamics/hopping
• Compare to top-down + pseudo-exchange
  – 20 levels; only the top level is 1 nsec
    • Attempt an exchange every 10 steps
    • 10⁴ steps → 200 acceptances (many hops!)
  – Also can use a higher-T ladder (lower acceptance)

Systematically checking convergence
• Ambiguous results
  – Energy
  – RMSD vs. starting config

Can we afford to climb down the ladder?
• How many energy calls?
  – Depends on the desired ensemble size: say 10⁴
  – Assume 100 ladder levels; only 1% exchange acceptance (conservative)
• Assume 10⁸ energy calls
  – Top level: 9×10⁷ (cheapest!)
  – 10⁵ calls per lower level
  – Attempt every 10th step → 10⁴ attempts → 100 exchanges (hops)
  – Almost every exchange will yield a new basin: good sampling!
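The resolution-exchange acceptance criterion min[1, R] can be sketched with toy potentials. The quadratic/cosine potentials, the coordinate values, and `KT = 1` below are invented for illustration; only the form of R follows the slides.

```python
import math

KT = 1.0  # kT in reduced units (assumption)

def u_detailed(l, theta, phi):
    """Toy detailed potential over all coordinates (hypothetical form)."""
    return 0.5 * (l - 1.0) ** 2 + 0.5 * theta ** 2 + math.cos(phi)

def u_coarse(phi):
    """Toy coarse potential over the shared subset phi (hypothetical form)."""
    return 1.2 * math.cos(phi)

def resex_acceptance(l_a, theta_a, phi_a, phi_b):
    """min[1, R] with
    R = pdtl(l_a, theta_a, phi_b) pcrs(phi_a)
      / (pdtl(l_a, theta_a, phi_a) pcrs(phi_b)),
    i.e., the common coordinates phi are swapped between the detailed
    and coarse levels, and both Boltzmann factors enter the ratio."""
    log_r = (-(u_detailed(l_a, theta_a, phi_b) - u_detailed(l_a, theta_a, phi_a))
             - (u_coarse(phi_a) - u_coarse(phi_b))) / KT
    return min(1.0, math.exp(log_r))

# Swapping identical phi values is always accepted
print(resex_acceptance(1.0, 0.2, 0.7, 0.7))  # prints 1.0
```

Because the coarse model here is an arbitrary function of the subset φ, the criterion preserves canonical sampling at the detailed level regardless of how crude the coarse potential is, which is the point of the final criterion on the slides.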
Some Resolution Exchange Statistics
• Di-leucine (UA to AA; OPLS)
  – Modified: 0.16% acceptance
  – Unmodified: 0.14%
  – Incremental (one residue at a time): ~2.5% (UA to mixed), ~0.25% (mixed to UA)
• Met-enk (UA to AA; OPLS)
  – Whole molecule (75 atoms to 57): 0.09% acceptance -- modified UA
  – Incremental (one residue at a time): so far, 10% acceptance for 3/5 levels -- modified UA -- ongoing
• Comparison: replica exchange: 15-20%
• Met-enk (top-down temperature exchange)
  – ~2% with T ladder: 200K, 270, 367, 505, 950, 1305, 1810
  – Comparison: max 700K with 15% exchange

What's wrong with NMR "ensembles"?
• Determined by search/minimization approaches
• Peak, not tails, of the distribution

Need proper distribution
• 10 equi-probable regions, e.g.
  – 10 structures: 1 per region
  – 20 structures: 2 per region
[figure: p(x) divided into equi-probable regions]

Hope at the top of the ladder
• Reduced models capture large-scale fluctuations
[figure: tendamistat]

Closer look at top (most reduced) level
• Inexpensive "smart" models can be built with lookup tables (dihedrals/orientations)
  – Ramachandran propensities
  – Peptide-plane sterics
  – Backbone H-bonding
  – Beta carbon (hydrophobicity)
• Go interactions can stabilize any model
  – Canonical sampling is preserved by the res-ex criterion
[Dickerson & Geis]

More Go-model fluctuations: ferredoxin
[figure: ferredoxin]

More Go-Model Fluctuations: Protein G
[figure: Protein G]

Sampling Strategies [to get p(x)]
• "Direct" dynamics
• [Build-up schemes]
• Exchange dynamics
  – Temperature
  – Resolution

Direct Dynamics
• Dynamical trajectory x(t) → histogram → p(x)
• Varieties of dynamics
  – All embody U(x); f = -dU/dx; Boltzmann distribution
  – Newtonian ("Molecular Dynamics")
  – Langevin/Brownian -- fully stochastic
    • TODAY'S DATA
  – Monte Carlo -- fully stochastic (dynamical??)
• All lead to the Boltzmann distribution:
    p(x) ∝ exp[-U(x) / kB T]

Research
• Free energy calculations (fluctuations)
  – ΔF, absolute F
• Rare dynamic events / path sampling (fluctuations)
  – Theory and molecular applications
• Equilibrium sampling
  – Today
• Non-traditional coarse-grained model design
  – Discretization; different "resolution levels"
• Overall goal: make biologically relevant desktop computations possible
  – Stay true to statistical mechanics
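The cluster-then-classify convergence check described in the sampling-quality slides (cluster by a distance threshold, choose a reference per cluster, bin every frame by its closest reference, then compare first- and second-half histograms) can be sketched on a toy one-dimensional "trajectory". The greedy clustering rule, the 1D distance in place of RMSD, the threshold, and the synthetic data are all assumptions made for this illustration.

```python
import random

def cluster_references(frames, threshold):
    """Greedy threshold clustering: a frame starts a new cluster (and
    becomes its reference) if it lies farther than `threshold` from
    every existing reference."""
    refs = []
    for x in frames:
        if all(abs(x - r) > threshold for r in refs):
            refs.append(x)
    return refs

def classify(frames, refs):
    """Bin each frame with its closest reference; returns counts per bin
    (the simple 1D histogram of the slides)."""
    counts = [0] * len(refs)
    for x in frames:
        i = min(range(len(refs)), key=lambda j: abs(x - refs[j]))
        counts[i] += 1
    return counts

random.seed(0)
# Toy trajectory hopping between two "states" near 0.0 and 3.0
frames = [random.gauss(0.0 if random.random() < 0.7 else 3.0, 0.3)
          for _ in range(2000)]

refs = cluster_references(frames, threshold=1.0)
half = len(frames) // 2
first = classify(frames[:half], refs)
second = classify(frames[half:], refs)
print(first, second)  # similar histograms suggest, but cannot prove, convergence
```

As the slides note, agreement between the two halves only demonstrates consistency among visited states; it says nothing about unseen regions.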