...of ligands from umbrella sampling and steered MD simulations; ...applications to ions, small molecules and toxin peptides.

Contents of this presentation I | Introduction
  General principles of free energy calculations
  Illustrations by simple examples
  Theoretical underpinnings
  Toy models: ions in small water boxes
  Ways to affirm and void the fidelity of simulation models to reality
  Advanced examples
  Ligands in gramicidin-A channel, K+ channels

Contents of this presentation II | Introduction
  Full analysis of an absolute free energy of binding calculation
  Using toxin peptides (my work)
  Summary and notes
  Implications of various assumptions
  Warning signs to watch out for

...let’s go!

Goal of FE calculations | Introduction
  Accuracy
  Chief motivation of conducting calculations is to predict/replicate experimental values
  We must not lose sight of this
  Therefore, it is worthwhile to test and verify our foundations
  Especially for calculations that last over a month

Mechanics of FE calculations | Introduction
  Umbrella sampling and steered MD are path-dependent calculations
    e.g. the ligand must physically move out of the receptor
  A well-constructed simulation can:
    Take information from known reaction mechanisms
    Provide information on how those mechanisms occur

Foundation of FE calculations | Introduction
  F = ma
  All simulations are models
  Classical approximation of QM phenomena
  System proceeds via forces imposed on atoms
  Energy calculations occur via manipulation of these forces
  Small loss in accuracy must be granted

...motivated?

Steered molecular dynamics

A simple manipulation
  $U = \tfrac{1}{2} k \left[ r - (r_0 + v t) \right]^2$ (spring potential; the force on the pulled atoms is $F = -k \left[ r - (r_0 + v t) \right]$)
  SMD is a “moving spring”
  Force is imposed on an atom or set of atoms
  Equilibrium position moves to create a path
  Work done by this spring is used to find the local free energy surface.

A simple model | Steered Molecular Dynamics
  A small box, containing: ~400 water atoms, one neon atom.
  Suppose that we create an artificial Gaussian barrier (of ~5 kT) for the Ne atom.
  How do I measure the free energy surface of that barrier, using the Ne as a probe?

Simulation design | Steered Molecular Dynamics
  Constrain position of Ne in (x,y,z) to a starting location
  Define path along water box
  Measure the total amount of work done
    $W = \int F(z)\, dz$
  W → manipulations → free energy surface

Work done(?) | Steered Molecular Dynamics
  Resulting potential of mean force
  Includes environmental influences
    i.e. action of water around the atom
  NB: notice that PMF does not return to zero readily

Averaging | Steered Molecular Dynamics
  A single trajectory does not “sample” the reaction sufficiently
    i.e. a free energy surface requires comprehensive knowledge of the sub-states, à la possible trajectories
  Trajectories contain influence from the SMD pulling itself
    Some of the work is directed at displacing the solvent around Ne
    This needs to be accounted for

Jarzynski’s Equality | Steered Molecular Dynamics
  ∴ Take multiple trajectories
    The average of multiple trajectories should even out random influences on the PMF
    $\langle e^{-W/kT} \rangle \approx \frac{1}{n} \sum_{i=1}^{n} e^{-W_i/kT}$
  ∴ Apply Jarzynski’s Equality (given certain assumptions)
    $e^{-\Delta G/kT} = \langle e^{-W/kT} \rangle$

Average work | Steered Molecular Dynamics
  The Boltzmann average of 10 calculations
  v = 10 Å ns⁻¹
  A very fast calculation, converges because neon does not interact with environment

...next: theory

Criteria of SMD-based FE calculations | SMD Assumptions
  Reaction coordinate must be well chosen
    This coordinate measures all contributions to the real ΔG, and only those contributions.
  The simulation must be near-equilibrium
    Jarzynski's Equality holds when no dissipative work is done by the pulling

Reaction coordinates | SMD Assumptions
  In SMD, the dimension(s) controlled by the constraining potential is a reaction coordinate
  The reaction coordinate can be:
    Distance
    Center of mass (collective variable)
    RMSD to target structure
  The path traced through the SMD simulation is a reaction path

Theoretical interpretation | SMD Assumptions
  Our systems are canonical ensembles: Z(n,P,T)
  We wish to measure particular sub-states of the system
    e.g. ligand-bound and ligand-unbound.
  Thus, steered molecular dynamics (SMD):
    Constrains the system to certain sub-states according to some coordinate
    Drives the system along this coordinate to new sub-states by application of forces.

Implications on path design | SMD Assumptions
  Both the reaction path and the simulations must sufficiently sample the required sub-space
    May not be complete, but must be representative
  In a ligand-unbinding process:
    Translation must be primary
    Rotation may be secondary
  [Figure labels: Phase space, Bulk, Bound]

Implications on path design | SMD Assumptions
  Sampling and stability of a system suffer where large barriers must be crossed
  Therefore, choose path of least resistance
  Simply using the distance between the ligand and the whole receptor may not be a good choice…

Criteria of SMD-based FE calculations | SMD Assumptions
  Reaction coordinate must be well chosen
    This coordinate measures all contributions to the real ΔG, and only those contributions.
  The simulation must be conducted in a near-equilibrium state
    Jarzynski's Equality holds when no dissipative work is done by the pulling

Theoretical implications | SMD Assumptions
  SMD –(Jarzynski's Equality)→ Free energy
  JE comes with certain qualifications:
    Dissipative work can be done on the system
    Pulling velocity
    i.e. system must equilibrate around SMD perturbation, else perturbation will also be measured.
    (We will show this later.)
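
The exponential work average behind Jarzynski's Equality, as used in the SMD slides above, can be sketched in a few lines. This is a minimal illustration, not the code used for the talk: the temperature (kT ≈ 0.596 kcal/mol, about 300 K), the function name and the ten work values are assumptions made up for the example.

```python
import math

KT_KCAL_MOL = 0.596  # kT at ~300 K in kcal/mol (assumed temperature)


def jarzynski_free_energy(works, kT=KT_KCAL_MOL):
    """Estimate dG from work values via dG = -kT ln < exp(-W/kT) >."""
    if not works:
        raise ValueError("need at least one work value")
    # Shift by the smallest work value so the exponentials cannot overflow.
    w_min = min(works)
    mean_exp = sum(math.exp(-(w - w_min) / kT) for w in works) / len(works)
    return w_min - kT * math.log(mean_exp)


# Hypothetical work values (kcal/mol) from ten pulls, for illustration only.
example_works = [3.1, 2.8, 3.5, 2.9, 3.3, 3.0, 2.7, 3.6, 3.2, 2.9]
dG = jarzynski_free_energy(example_works)
mean_W = sum(example_works) / len(example_works)
print(f"Jarzynski estimate: {dG:.2f} kcal/mol; plain mean work: {mean_W:.2f} kcal/mol")
```

The estimate always lies between the smallest work value and the plain mean, and it is dominated by the low-work trajectories. That is why rare "negative work" pulls matter so much when the pulling is fast.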
...next: Umbrella Sampling

Collecting local information
  $U_i = \tfrac{1}{2} k (r - r_i)^2$, with window centres $r_i$ placed along the path
  US is a static potential
  Force is also imposed on an atom or set of atoms
  Multiple overlapping states are constructed to cover reaction path.
  Each state provides information about local surface
  Link to derive complete surface ...

Second toy model | Umbrella sampling
  A box containing two ions
    sodium and chloride
  Solution known
    FE surface related to radial distribution function
    This was done ab-initio yesterday

Second toy model | Umbrella sampling
  Reaction coordinate
    Na–Cl separation
  Umbrella potential
    1 Å apart, 2.5-9.5 Å
    k = 10 kcal mol⁻¹ Å⁻²
  Derive original by WHAM analysis

PMF convergence | Umbrella sampling
  [Figure labels: Phase space, Bulk, Bound]

PMF | Umbrella sampling
  [Figure labels: Phase space, Bulk, Bound]
  ...pretty...

Criteria of US-based FE calculations | US assumptions
  Reaction coordinate must be well chosen
    This coordinate measures all contributions to the real ΔG, and only those contributions.
  Sufficient sampling over the entire path:
    Convergence of PMF curve means that environmental variables are well sampled.
    Overlap between adjacent windows.

Reaction coordinate | US assumptions
  Same arguments as for SMD (underlying physics identical)
  Umbrella sampling is capable of treating two/three dimensions
    As long as all dimensions are properly sampled

Criteria of US-based FE calculations | US assumptions
  Reaction coordinate must be well chosen
    This coordinate measures all contributions to the real ΔG, and only those contributions.
  Sufficient sampling over the entire path:
    Convergence of PMF curve means that environmental variables are well sampled
    Overlap between adjacent windows

Environmental variables | US assumptions
  All coordinates not included in your reaction coordinate(s) must be well sampled
  This means that the simulation has visited dimensions perpendicular to the reaction path that may contribute to FEbind
  [Figure labels: Bound, Bulk]

Window overlap | US assumptions
  Require enough sampling to accurately interpolate between windows
  [Figure labels: Bulk, Bound]

Window overlap | US assumptions
  Define measure of overlap between two distributions
  When underlying surface is flat, harmonic potential produces Gaussian distributions
  Theoretical overlap: $\Omega = 1 - \mathrm{erf}\left( d / (\sqrt{8}\,\sigma) \right)$, for window spacing $d$ and distribution width $\sigma = \sqrt{k_B T / k}$ ($k$ is the umbrella force constant)

$\Omega = 1 - \mathrm{erf}\left( d / (\sqrt{8}\,\sigma) \right)$ | US assumptions
  In practice, minimum overlap is ~2%
  Overlap should agree with theoretical value when in bulk
  k = 20 kcal mol⁻¹ Å⁻², d = 0.5 Å: Ω = 15%
  k = 40 kcal mol⁻¹ Å⁻², d = 0.5 Å: Ω = 4%

Summary | Steered MD vs. Umbrella Sampling
  Constructing FE surfaces via SMD
    Straightforward construction
    Relies on JE conditions: difficult to achieve in practice
  Constructing FE surfaces via US
    Additional checks and balances
  Both dependent on sufficient sampling

...take a break?

JE/US comparisons | J. Chem. Phys. 128:155104 (2008)
  Using several test cases of ions and molecules in channel systems
    Ion transit through membrane
    Ion binding to gramicidin exterior
    Organic-cation binding to gramicidin-A

JE/US comparisons | J. Chem. Phys. 128:155104 (2008)
  Comparing PMFs
  Tests balanced by equalising the total simulation time
    SMD setup @ v = 5 Å ns⁻¹ ~ US setup
  SMD: also use different pulling velocities to test reversibility of JE

Ion transit (nanotube) | J. Chem. Phys. 128:155104 (2008)
  Results equivalent between two methods
  Energy surface not equal at both openings, resulting from system setup
  JE valid

Ion transit (gA) | J. Chem. Phys.
128:155104 (2008)
  Umbrella sampling, not SMD, gives symmetric surface
  Pulling at different velocities does not seem to help
  Can potentially use v < 1 Å ns⁻¹
    But more time consuming than equivalent US setup

JE: Practical problems? | J. Chem. Phys. 128:155104 (2008)
  Equilibration time sharply increases for peptide environments
    Nanotube highly ordered ∴ fast dissipation
  Reliance on sampling “negative work” trajectories
    High v: low probability for the environment to push SMD particle

K+-binding to gA | J. Chem. Phys. 128:155104 (2008)
  (next test case:)
  gA has a weak ion binding site at the entrances
  v = 2.5 Å ns⁻¹
  Smaller potentials
  Perhaps using a smaller k will reduce the perturbations

K+-binding to gA | J. Chem. Phys. 128:155104 (2008)
  What about using different force constants?
  No significant help in repairing JE assumptions
  v = 2.5 Å ns⁻¹
  Using small k may reduce perturbations on system
  However, binding site shape lost
  k must be greater than binding well ‘potential’

K+-binding to gA | J. Chem. Phys. 128:155104 (2008)
  k = 2 kcal mol⁻¹ Å⁻²
  There are hard limits to varying parameters
  k = 20 kcal mol⁻¹ Å⁻²

EA and TEA binding | J. Chem. Phys. 128:155104 (2008)
  Ethylammonium (EA) and tetra-ethylammonium (TEA) bind weakly to gA
  v = 2.5 Å ns⁻¹
  k = 20 kcal mol⁻¹ Å⁻²
  Reducing barrier height produces no difference here

CnErg1 toy comparison | J. Chem. Phys. 128:155104 (2008)
  If it doesn’t work for small cations, it won’t work for a peptide
  Test for a purported binding of CnERG1 toxin to hERG channel
  Since rate of dissipation to environment is less than rate of work done…
  SMD-PMFs essentially measure work done to move solvent

Mechanisms | J. Chem. Phys.
128:155104 (2008)
  Input: work done on system
    Carried out by imposed SMD forces (irreversible)
    Contribution from underlying FE surface (reversible)
  Output: dissipation to heatbath (NPT systems)
  Equilibration occurs by two means:
    Forces bleed out to atoms far from SMD location
    Temperature coupling to thermostat
  JE maintained in O < I conditions (only in nanotube)

Mechanisms | J. Chem. Phys. 128:155104 (2008)
  The time for equilibration is such that v << 2.5 Å ns⁻¹ is required for JE condition to hold
  This velocity requirement becomes more stringent as:
    ligand size increases
    interactions increase
  Not as efficient as umbrella sampling.

...paper finished.

Organic cations-gA | J. Phys. Chem. B 111:11303 (2007)
  Block of gA by various small molecular ligands
  Energetics dependent on ligand size and partial charges
  Comparison between ligands, and with extant experimental data

Organic cations | J. Phys. Chem. B 111:11303 (2007)
  Use Autodock3 to find potential binding sites of molecules
  MD, umbrella sampling to find PMF and free energy of binding
    COM-coordinates of ligands
    10 kcal mol⁻¹ Å⁻²
    0.5 Å ns⁻¹

Molecule list | J. Phys. Chem. B 111:11303 (2007)
  Use of six different molecules
  Varying sizes and polarity
  Determines strength of binding, and whether molecules can permeate through gA

MA and EA | J. Phys. Chem. B 111:11303 (2007)

FMI and GNI | J. Phys. Chem. B 111:11303 (2007)

TMA and TEA | J. Phys. Chem. B 111:11303 (2007)

Comparison of FEbind | J. Phys. Chem. B 111:11303 (2007)
  Molecule   z (Å)        r (Å)       FEbind (kT)   K (M⁻¹) (expt.)
  MA         10.7 ± 0.8   0.7 ± 0.4   -1.4          4.1 (4.4)
  EA         12.5 ± 0.6   1.4 ± 0.6    1.6          0.2 (~0)
  FMI        12.6 ± 0.5   1.5 ± 0.7    0.5          0.6 (23)
  GNI        12.8 ± 0.3   2.3 ± 0.5   -2.2          8.9
  TMA        13.2 ± 0.6   1.4 ± 0.6    0            1
  TEA        14.1 ± 0.5   2.2 ± 0.8    0.9          0.4

FMI and GNI binding
  Channel lifetimes increase many-fold
  Binding must influence the center of pocket
  A binding site likely exists in the centre of the channel, not in simulation

...wait, how did we get FE?
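
As a quick consistency check on the comparison table above, the K column appears to follow from the FEbind column through K = exp(-FEbind/kT). The sketch below assumes a 1 M standard concentration and free energies expressed in kT; the function name is a placeholder chosen for this example.

```python
import math

# (molecule, FEbind in kT, K in M^-1) as listed in the comparison table.
table = [
    ("MA", -1.4, 4.1),
    ("EA", 1.6, 0.2),
    ("FMI", 0.5, 0.6),
    ("GNI", -2.2, 8.9),
    ("TMA", 0.0, 1.0),
    ("TEA", 0.9, 0.4),
]


def binding_constant(fe_bind_kT):
    """K in M^-1 for a binding free energy given in kT (1 M standard state assumed)."""
    return math.exp(-fe_bind_kT)


for name, fe, k_listed in table:
    k_calc = binding_constant(fe)
    print(f"{name:4s} FEbind = {fe:+.1f} kT -> K = {k_calc:.2f} M^-1 (listed: {k_listed})")
```

The recomputed values agree with the listed ones to within the rounding of the FEbind column.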
From PMF to free energy | How did we finally derive FEbind?
  Assumption: x-y variations in the PMF are “small” in the region sampled
  $K_{eq} = \iiint e^{-W(z)/kT}\, dx\, dy\, dz = \pi R^2 \int e^{-W(z)/kT}\, dz$
  $\Delta G_b = -kT \ln (K_{eq} C_0)$
  $C_0$ is standard concentration
  What is R?

Integration volume | How did we finally derive FEbind?
  The measured PMF occurs over a certain volume
  This represents the size of the entire binding pocket
    Larger binding pocket => larger effective $\Delta G_b$
  This is then standardised to 1 M of ligand
    Equivalent to 1 ligand per 1661 Å³
  [Figure labels: Bulk, Bound]

Integration volume | How did we finally derive FEbind?
  One-dimensional PMF merely hides the fact that binding sites have volume
  Assumption essentially states that the PMF value at some position is the average PMF value of the local slice
  R is the radius of these local slices
    This depends on the actual area sampled
  [Figure labels: Bulk, Bound]

Integration volume | How did we finally derive FEbind?
  In practice, R does not change significantly over the length of the bound area
    Therefore, can pick uniform average R at a minimal loss of accuracy
  $e^{-W(z)/kT}$ means that only the sampling around binding site is critical
    The rest of the path connects this to bulk
  Set R from the average area visited by the ligand in the site
  [Figure labels: Bulk, Bound]

From PMF to free energy | How did we finally derive FEbind?
  $\Delta G_b = -kT \ln (K_{eq} C_0)$
  Other notes:
    These derivations assume a simple two-state mechanism
      $[L] + [B] \rightleftharpoons [LB]$
    In cases of e.g. cooperative binding, the relationship needs a different derivation

From PMF to free energy | How did we finally derive FEbind?
  $\Delta G_b = -kT \ln (K_{eq} C_0)$
  Other notes:
    We also ignore possibilities of multiple binding sites
    Secondary binding pockets within the same site may contribute – this depends on your sampling and reaction paths

...next paper...

K+ permeation | Biophys. J.
(2011)
  PMFs can be used to study permeation processes subject to classical assumptions
  The K+ channel conducts ions at a near-diffusion rate
  This implies that only small barriers exist along the path
  How does this occur?

K+ permeation | Biophys. J. (2011)
  K+ ions occupy the filter at all times in-vivo
    S1/S3, S0/S2/S4
  If fewer ions are present, the channel closes
  ∴ conduction must involve concerted movement

Setup | Biophys. J. (2011)
  Kv1.2 (Shaker)
  Reaction coordinate along channel-axis
  Define various PMFs for the different conditions that may exist:
    One K+ approaching two ions within filter
    Two ions moving along filter
    Three ions moving along filter

One ion movt. | Biophys. J. (2011)
  Approach from left

One ion movt. | Biophys. J. (2011)
  Approach from right
  Significant difference between the two occupancy states

Barrier-less permeation? | Biophys. J. (2011)
  The “barrier-less” transport involves
    Movement of the two ions in the filter
    Filling the hole left behind
  Every pair must be separated by 1 water
    Adjacent K+ states are unfavourable
    States like S1/S3/S4 should not exist
  We further test the cohesive movement

K+ permeation | Biophys. J. (2011)
  2-ion movement
    S2/S4 → S1/S3: barrier-less
    S1/S3 → S0/S2: large barrier?

K+ permeation | Biophys. J. (2011)
  3-ion movement
    S1/S3/‘S5’ → S0/S2/S4: also large barrier

Barrier-less permeation? | Biophys. J. (2011)
  The large barrier cannot physically exist
  But simulation well converged: there must be a problem with the simulation itself
  Classical forcefields used here are not polarisable
  Leads to difference between protein behaviour near solvent (S0) and within filter (S1-S4)?

Barrier-less permeation? | Biophys. J. (2011)
  The large barrier cannot physically exist
  Similar problem with permeation
  All due to polarisation of molecules in different media
  Stay tuned for next generation forcefields

...take a break?
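
Before moving on, the PMF-to-binding-free-energy recipe from the "From PMF to free energy" slides can be sketched numerically. This is a minimal illustration under stated assumptions: the PMF is supplied in units of kT, the integral runs only over the bound region, and the harmonic well, the radius of 1.5 Å and the function name are hypothetical values chosen for the example.

```python
import math

STANDARD_VOLUME_A3 = 1661.0  # volume per ligand at 1 M, in cubic angstroms


def binding_free_energy_kT(z, pmf_kT, radius):
    """Return (Keq in cubic angstroms, dG_b in kT) from a 1-D PMF W(z) in kT.

    Keq = pi * R^2 * integral of exp(-W(z)/kT) dz (trapezoid rule),
    dG_b = -ln(Keq * C0) with C0 = 1 / 1661 per cubic angstrom.
    """
    boltz = [math.exp(-w) for w in pmf_kT]
    integral = sum(
        0.5 * (boltz[i] + boltz[i + 1]) * (z[i + 1] - z[i])
        for i in range(len(z) - 1)
    )
    keq = math.pi * radius ** 2 * integral
    return keq, -math.log(keq / STANDARD_VOLUME_A3)


# Hypothetical bound region: a harmonic well, 5 kT deep, centred at z = 12 A.
z = [11.3 + 0.1 * i for i in range(15)]
pmf = [-5.0 + 10.0 * (zi - 12.0) ** 2 for zi in z]
keq, dG = binding_free_energy_kT(z, pmf, radius=1.5)
print(f"Keq = {keq:.1f} A^3, dG_b = {dG:.2f} kT")
```

Because the Boltzmann factor weights the deepest part of the well most heavily, the result is far more sensitive to the PMF near the binding site than to where the integration limits are drawn.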
Binding of Charybdotoxin to KcsA
  Current paper in submission
  Various aspects shown above in illustrations
  We will now cover the project in detail
  Highlighting the tests that affirm accuracy
    i.e. sampling sufficiency, coordinate choices, control…

Criteria of US-based FE calculations | US assumptions
  Reaction coordinate must be well chosen
    This coordinate measures all contributions to the real ΔG, and only those contributions.
  Sufficient sampling over the entire path:
    Convergence of PMF curve means that environmental variables are well sampled.
    Overlap between adjacent windows.
  Complexity from collective variables:
    Influence of internal coordinates on PMF.

Ligand complexity | US assumptions
  Reaction coordinates work by contracting the entire freedom of the ligand
    A ligand molecule has 3N degrees of freedom
    A fraction of these are important
    Maximum of ~3 coordinates is reasonable.
  In complex ligands with multiple sites of interaction, must either:
    Select all important coordinates as reaction coordinates, e.g. charge-centers.
    Contract them further, e.g. center-of-mass
      Then must deal with how well reaction coordinate corresponds to the interaction sites

Ligand complexity
  Small ligands have dozens of atoms
  Relatively few internal degrees of freedom
  Choosing COM does not introduce complications

Ligand complexity
  Proteins contain 100s of atoms with tertiary structure
    many internal degrees of freedom
    low-frequency vibrational modes.
  Choosing COM: internal modes may respond to umbrella potential

Background | The binding of charybdotoxin
  Scorpion toxins
    Selectively target neuronal ion channels and block conduction.
    Specificity can be altered by mutations
  Modified toxins for therapeutic targets
  Potential for:
    therapeutics
    in-vivo studies of channel distribution

Background | The binding of charybdotoxin
  Potassium channel targets
    Kv1 family, calcium-activated channels
    Structure difficult to obtain.
  Bacterial K+ channel, KcsA, is easier
    Mutate residues and bind scorpion toxins
    KcsA-ChTX complex obtained by NMR + previous crystallography
    Park et al. (2005)

MD system setups | The binding of charybdotoxin
  Reaction coordinate: toxin backbone center-of-mass.
    Path extends along channel-axis.
  Harmonic constraints on residues 3-35 and 17-20
    Prevent unfolding of protein during sampling.
  ~60,000 atoms
    Match experimental ionic concentration (150 mM)
    NPT ensemble
  KcsA and membrane lightly constrained
    Prevent unlikely case of drifting

The center-of-mass coordinate | The binding of charybdotoxin
  Represents the translational freedoms of the ligand.
  Sampling must integrate rotational effects and internal modes
  Works for ions and small ligands with single important site of interaction
  May not work alone for peptides with multiple sites of interaction

Cartoon example | The binding of charybdotoxin
  A ligand is pulled from its binding site to the bulk
  However, ligand unfolds over the path of reaction coordinate.
  Then the chosen path must include the energy of unfolding.
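
The center-of-mass reaction coordinate described above, together with the harmonic umbrella term applied to it, can be sketched as follows. This is an illustration only: the function names, the three "backbone" atoms and the choice of the z axis as the channel axis are assumptions for the example, not details taken from the actual setup.

```python
import math


def com_along_axis(positions, masses, axis_origin, axis_direction):
    """Project the centre of mass of a group of atoms onto a channel axis (angstroms)."""
    total_mass = sum(masses)
    com = [
        sum(m * p[k] for m, p in zip(masses, positions)) / total_mass
        for k in range(3)
    ]
    norm = math.sqrt(sum(c * c for c in axis_direction))
    unit = [c / norm for c in axis_direction]
    return sum((com[k] - axis_origin[k]) * unit[k] for k in range(3))


def umbrella_energy(xi, xi_window, k):
    """Harmonic umbrella energy 0.5 * k * (xi - xi_window)^2 for one window."""
    return 0.5 * k * (xi - xi_window) ** 2


# Hypothetical backbone atoms (positions in angstroms, masses in amu).
positions = [[0.5, 0.0, 24.0], [-0.5, 0.2, 25.0], [0.0, -0.2, 26.0]]
masses = [14.0, 12.0, 12.0]
xi = com_along_axis(positions, masses, axis_origin=[0.0, 0.0, 0.0], axis_direction=[0.0, 0.0, 1.0])
print(f"COM along axis: {xi:.2f} A; restraint energy: {umbrella_energy(xi, 25.0, 20.0):.3f} kcal/mol")
```

Only the projected coordinate enters the umbrella term, so rotation and internal motion of the ligand are left free and have to be sampled within each window.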
Prior work | The binding of charybdotoxin
  We’ve carried out a direct umbrella sampling procedure for a scorpion toxin
    COM coordinate
  Results in partial unfolding of alpha helix
  Trapped in alternate conformation

Solution | The binding of charybdotoxin
  Parts of the protein are restrained in order to prevent unfolding
  As a check, calculate the free energy of restraining and unrestraining

Solution | The binding of charybdotoxin
  Contribution of thermodynamic cycle needs to be calculated
    (site) 1.43
    (bulk) 1.47
  Negligible in this case
  Not negligible if restraints applied to functional residues

Spoiler | The binding of charybdotoxin
  Parts of the protein are restrained in order to prevent unfolding
  A different “path” is sampled when toxin is restrained

Sampling overlap | The binding of charybdotoxin
  Plot overlap between successive windows
  Gaps likely at transitional barriers
    These require additional windows
  A junction can introduce errors of up to 0.5 kcal mol⁻¹

Sampling overlap | The binding of charybdotoxin

Spoiler II | The binding of charybdotoxin
  WHAM analysis interpolates data to connect two adjacent windows.
  Gaps in this case introduce ~0.5 kcal mol⁻¹ differences
  Important w.r.t.
  accuracy

Spoiler II - note | The binding of charybdotoxin
  Angular peaks within an otherwise smooth PMF curve
    Location of barrier
    Potential indication of insufficient sampling

PMF convergence (work computer died)

PMF
  [Figure legend] A: non-restrained PMFs; B & C: toxin-restrained PMFs

Path-dependence of PMF | The binding of charybdotoxin

Path-dependence of PMF | The binding of charybdotoxin

k-dependence of PMF | The binding of charybdotoxin

k-dependence of PMF | The binding of charybdotoxin
  There is no dependence of the binding free energy on k
  The storage of elastic energy in the peptide is conservative
  (Error is calculated from standard deviations of subsets of sampling data)
  Source set    FE of binding
  A20           -17.1 ± 0.9
  A40           -16.8 ± 2.3
  B20            -8.7 ± 1.7
  C40            -7.6 ± 1.1
  Experiment     -8.3

Kinetic analyses | The binding of charybdotoxin
  Rotational freedom of ChTX increases in steps
  Significant increase after contact dissociation
  Clearly there should be free rotation in bulk

Kinetic analyses | The binding of charybdotoxin
  Translational freedom only achieved outside binding pocket
  Effective binding site is actually rather small
  Transition region with some charge contacts

Kinetic analyses | The binding of charybdotoxin
  Although ligand is 100’s of atoms, sampling is not as hard as one might imagine

Kinetic analyses | The binding of charybdotoxin
  Binding site is small due to contacts
  Multiple charge interactions “lock” the toxin to the pore
  Therefore a very narrow and deep binding pocket

Kinetic analyses | The binding of charybdotoxin
  Significant residues identifiable
    R25, K27, R34, Y36
    Not K11 or other charges
  Correlates with existing mutational data

Take home message | Addendum
  Many types of ligand interactions can be explored
  Important caveats exist, but not extraordinarily challenging
  Most useful in explaining why a process occurs
  Complementary with experimental data
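
As a closing aside on the window-overlap checks, the theoretical overlap between two adjacent umbrella windows on a flat surface can be evaluated directly. The sketch assumes the standard overlap of two equal-width Gaussians, Ω = 1 - erf(d / (√8 σ)) with σ = √(kT/k), and a temperature of about 300 K (kT ≈ 0.596 kcal/mol). Under those assumptions it reproduces the ~15% and ~4% values quoted for k = 20 and 40 kcal mol⁻¹ Å⁻² at d = 0.5 Å. The function name is a placeholder.

```python
import math

KT_KCAL_MOL = 0.596  # kT at ~300 K in kcal/mol (assumed temperature)


def theoretical_overlap(k, d, kT=KT_KCAL_MOL):
    """Overlap of two Gaussian window distributions whose centres are d apart.

    k is the umbrella force constant (kcal/mol/A^2), d the window spacing (A).
    """
    sigma = math.sqrt(kT / k)  # width of each window distribution on a flat surface
    return 1.0 - math.erf(d / (math.sqrt(8.0) * sigma))


for k in (20.0, 40.0):
    print(f"k = {k:.0f} kcal/mol/A^2, d = 0.5 A: overlap = {100 * theoretical_overlap(k, 0.5):.1f}%")
```

Doubling the force constant at fixed spacing cuts the overlap by more than a factor of three, so stiffer windows need to be placed closer together to stay above the ~2% practical minimum.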