Understanding Protein Electrostatics Using Boundary-Integral Equations

Jaydeep P. Bardhan
Dept. of Physiology and Molecular Biophysics
Rush University Medical Center, Chicago IL

Joint work with
• M. Knepley (Computation Institute, U. Chicago)
• P. Brune (Math and Computer Science Division, Argonne)
• A. Hildebrandt (J. Gutenberg U., Mainz, Germany)

Outline:
• Preliminaries:
    Biomolecule electrostatics
    Continuum theory and boundary-integral methods
    Numerical simulation
1. Fast Poisson approximation
2. Nonlocal continuum model

[Diagram: "My research" at the overlap of Applied Math, Biophysics, and Computer science (HPC)]
Emphasizing the interdisciplinary nature of computational biophysics

Fact: Water Makes Life Possible
[Images; credits: Fox L. Freberg Vander Kass ‘05]

A Crucial Consequence of Solvation
• Molecular binding involves sacrificing solute--solvent interactions for solute--solute interactions:
[Figure labels: d=0, d=1]

Basic Continuum Electrostatic Theory
• 100-1000 times faster than MD
• Protein model:
    Shape: “union of spheres” (atoms)
    Point charges at atom centers
    Not very polarizable: ε = 2-4
• Water model: no fixed charges
    Single water: sphere of radius 1.4 Angstrom
    Highly polarizable: ε = 80
• In total: mixed-dielectric Poisson
Modeling ions in solution is critical! But today’s focus is on the simpler math of “pure” water.

Solving the PDE Directly is Possible, But…
The idea: Just throw down a finite-difference grid or a finite-element mesh and go to town!
PDE Complications
1. Boundary conditions are at infinity
2. Point charges must be spread onto the grid
3. The dielectric interface is approximated

Green’s Representation Formula
• Well known: boundary values of a harmonic function determine it uniquely
    Ex. (for exterior domain D!):
    Dirichlet: potential given on the boundary
    Neumann: normal derivative given on the boundary
• The challenge is determining the solution given the boundary values (BV).
    Separation of variables
    Numerical: finite elements, finite differences, ...
In 3 dimensions, solve for a 3-dimensional unknown
• Alternatively: if you knew BOTH conditions,
    [Equation lost in extraction. Surviving annotations: "Potential anywhere in D", "Surface integrals ONLY", "(3 dimensions)"]
Thus: finding the other boundary condition gives you the answer directly

Deriving a Boundary Integral Equation: Exterior Neumann Problem
• Given the Neumann data (normal derivative on S), need to find the potential on S
    r is in the domain D
    r’ is on the boundary S
• Let r approach the surface
[Figure labels: given data, D, S]

Addressing the Singularities
• Single-layer potential: continuous as the evaluation point approaches the surface
• Dipole-layer: limit depends on WHICH SIDE of the surface your point is approaching!!
[Figures: layer of + charges and layer of +/- dipoles; axes x, y, z; labels a, r, D, S. A further annotation beginning "Also continuous as" lost its symbols in extraction]

Deriving a Boundary Integral Equation: Exterior Neumann Problem
• Given the Neumann data, need to find the potential on S
[Equation lost in extraction; figure labels: D, S]

Why Bother With Integral Equations?
Easy problem: [figure]
Medium problem: [figure]
Exterior problems? To infinity
Problems with mostly empty, uninteresting space

Practical Advantages of PDE Approaches
PDE
• Accessible (many codes)
• Reliable, durable
• Versatile, does OK job
BIE
• Less accessible (few codes)
• Hard to convince it to run
• Does one thing really well
1. Much more general (nonlinearity, etc.)
2. Easier to parallelize (that’s different from “easy”)
3. Often easier to explore model space (see point 1)
4. PDE solvers give sparse systems; BIE, dense systems!

Similarity Between FEM and BEM
• Both weighted residual methods:
[Equations lost in extraction. Surviving labels: BEM; FEM; "1 on panel i, 0 elsewhere"; "Enforce ... (Galerkin method)"; "Enforce ..."; "Galerkin: ..."]

Differences Between BEM and FEM
1. Extra freedom in choosing test functions
    Collocation: test = delta functions
    Centroids of elements
2. Matrix elements are harder to compute
    Galerkin FEM: Smooth integrand: Easily computed with quadrature!
    Galerkin BEM: Double integral of a singular function!!

Fast Solvers for Integral Equations
1. Solve Ax=b approximately using Krylov-subspace iterative methods such as GMRES:
    Storing matrix: O(N^2) time and memory
    Each multiply: O(N^2) time
2. Compute dense matrix-vector product using an O(N) method (fast multipole; tree code; precorrected FFT; FFTSVD)
    Storing compressed matrix: O(N) time and memory
    Each multiply: O(N) time
3. Improve iterative convergence with preconditioning
    Iteration converges faster if matrix eigenvalues are “well clustered”
    P “looks like” A^-1
4. For many problems, use diagonal entries!

A Boundary Integral Method For Biomolecule Electrostatics
[Figure: + and - charges at the dielectric interface; equations labeled "Conservation law" and "Constitutive relation" lost in extraction]
1. Boundary conditions handled exactly
2. Point charges are treated exactly
3. Meshing emphasis can be placed directly on the interface

BIBEE: A New, Rigorous Model of Continuum Electrostatics for Proteins
“Boundary Integral Based Electrostatics Estimation”
• Idea: Use preconditioner to approximate inverse
    No need to compute sparsified operator (saves time and memory)
    No need for Krylov solve
• Test of elementary charges in a 20-Angstrom sphere:
    Single +1 charge
    +1, -1 charges 3 A apart

BIBEE: Introducing Different Variants
• The preconditioning approximation takes into account the singular character of the electric-field kernel: [equation lost in extraction]
• The Coulomb-field approximation ignores the operator entirely: [equation lost in extraction]
[Plot annotations: "CFA seems better here…", "…and worse here."]

BIBEE Approximates the Eigenvalues of the Boundary Integral Operator
• The integral operator has to be split into two terms
    Sphere: analytical
    A hundred years of analysis!
• Eigenvalues are real in [-1/2,+1/2)
• -1/2 is always an eigenvalue
• Left, right eigenvectors of -1/2 are constants
• BIBEE approximates E’s eigenvalues
    P uses 0 (limit for sphere, prolate spheroid)
    CFA uses -1/2 (known extremal)
[Figure: eigenvalue axis, index i, with marks at -1/2, -1/6, -1/10]

BIBEE Clarifies an Empirical, Heuristic Model
• “Effective Born radius”: the radius of a sphere with the same solvation energy
[Figure labels: R1, R2, R3]
• Coulomb-field approximation: corresponds exactly to ignoring the integral operator.
    Totally nonphysical equation: [equation lost in extraction]
    Still the basis of Generalized Born (GB) models
• BIBEE approx. charge includes all contributions
• BIBEE/CFA is the extension of CFA to multiple charges!
    No ad hoc parameters, no heuristic interpolation
    Same approach taken by Borgis et al. in variational CFA

BIBEE/CFA Energy Is a Provable Upper Bound
Feig et al. test set, > 600 proteins
• BIBEE/P is an effective lower bound, provable in some cases but not all
• Another variant (BIBEE/LB) is a provable lower bound but too loose to be useful
Bardhan, Knepley, Anitescu (2009)

BIBEE: Improve by Analyzing the Sphere
[Figure: eigenvalue axis, index i, with marks at -1/2, -1/6, -1/10]
• Get first mode (monopole) analytically correct, other modes are bounded from below: tighter lower bound!
• Impact on sphere is better than impact on proteins (Feig et al. test set)
Bardhan+Knepley, J. Chem. Phys. (in press)

BIBEE: Accurate One-parameter Model
[Figure: eigenvalue axis, index i, with marks at -1/2, -1/6, -1/10]
Dominant energies come from dominant modes: try to capture dipole/quadrupole modes approximately!
• This effective parameter is expected to be rigorously determined by approximating protein as ellipsoid (Onufriev+Sigalov, ‘06)
Bardhan+Knepley, J. Chem. Phys. (in press)

BIBEE: A New, Rigorous Model
Test problems: Tripeptide, Protein-Drug

# boundary elements             3968     7564   15,212   32,022    49,708
Total BEM time                18.368   24.493   87.647  515.256   735.092
(Matrix compression time)    (3.271)  (6.665) (18.274) (62.217) (109.040)
SGB/CFA (heuristic) time       1.198    2.623    7.070   14.611    28.861
BIBEE time                     2.974    6.540   18.125   39.066    77.205

• BIBEE is 3-5 times faster than full solve (including large setup time for both)
• Unoptimized implementation (will save big on setup time)
• Modern FMM implementation (Yokota, Knepley, Barba, et al.) gives 10-20X speedup

Reaction-Potential Operator Eigenvectors Have Physical Meaning
• Eigenvectors from distinct eigenvalues are orthogonal
• Thus: the eigenvectors correspond to charge distributions that do not interact via solvent polarization (weird, huh?)
• If an approximate method generates a solvation matrix, its eigenvectors should “line up” well with the actual eigenvectors, i.e. [expression lost in extraction; surviving condition: i=j]

“Getting the Modes Right” Is Important
• Modes from small eigenvalues still contribute significantly to the total energy
[Plot: Cumulative Electrostatic Free Energy (kcal/mol) vs. Eigenvalue Index; label: "Projection of charge distribution onto eigenvector"]
• Here, 25% of the total energy comes from modes with eigenvalues smaller than 1% of the maximum eigenvalue
[Plot: Eigenvalue Magnitude (log scale, 10^-2 to 10^4) vs. Eigenvalue Index]

BIBEE Is An Accurate, Parameter-Free Model
• Peptide example: Met-enkephalin, snapshots from MD
[Panels: SGB/CFA, GBMV, BIBEE/CFA]
    All models look essentially the same here.
    BIBEE’s stronger “diagonal” appearance indicates superior reproduction of the eigenvectors of the operator.

BIBEE: A New, Rigorous Model of Continuum Electrostatics for Proteins
Design systematic approximation
Have proved that the model:
• Gives upper and lower bounds
• Preserves important physics
Relate empirical models to strong math
Leverages existing algorithms (e.g. fast multipole methods, parallel codes)
Next: Apply to other physics problems
[Diagram: BIBEE at the overlap of Applied Math, Biophysics, and Computer science (HPC)]

Nonlocal Continuum Electrostatics: Adding molecular realism “the right way”
First look for ways to extend existing models--don’t just give up and reinvent everything!
KNOWN weaknesses of Poisson model:
1. Linear response assumption
    Test with all-atom molecular dynamics
    Relatively small deviation!
    Caveat: Nonlinearity IS important for more highly charged species!
2. Violates continuum-length-scale assumption
[Figure label: Oxygen]
    Water molecules have finite size
    Water molecules form semi-structured networks
[Figure labels: Lone pair electrons; Hydrogen bonds; Hydrogens. Plot note: y=x denotes exactly linear response]
Nina, Beglov, Roux ‘97

Nonlocal Continuum Electrostatics: Demonstrating the Failure Mode
Run all-atom molecular dynamics: ion surrounded by water
    Significant structuring of charge density!
Consequence: ion energies are wrong
[Plot: x-axis "Ion radius in nanometers". Data points: radii from molecular simulation (Aqvist 1990) and energies from experimental data]

Nonlocal Continuum Modeling: A Classical Multiscale Theory
• Studied since the 1970s in numerous domains
    Problems whose length scales are NOT well separated from those of the constituent molecules!
[Figures from: de Abajo ‘08; Duan et al. ‘07; Schatz et al. ‘01; Park ‘06; Scott et al. ‘04; Gao et al., ‘09]
Expect nonlocal theory to play major roles in nanoscale science and engineering modeling…

Nonlocal Continuum Electrostatics: Nonlocal Dielectric Response
• Polarization charge as a function of distance from the ion: not simple
    Short-range: electronic response
    Long-range: bulk behavior
• Local: bulk everywhere
• Nonlocal: simple function that captures asymptotes
    Smoothly interpolates between known limits
    Supported by experiments and detailed simulations
[Plot: x-axis "Wave number (inverse distance)"; curve label "Local response"]

Nonlocal Continuum Electrostatics: Lorentzian Model and Promising Tests
• Nonlocal response: [equation lost in extraction]
• Now [equation lost in extraction]
• Integrodifferential Poisson equation
    Green’s function for [symbol lost in extraction]
Single parameter fit for [symbol lost in extraction] gives much better agreement with experiment!!

Nonlocal Continuum Electrostatics: Reformulation for Fast Simulations
• Integrodifferential equations in complex geometries?
• Result: No progress on nonlocal model for DECADES
    Spherical ions, charges near planar half-spaces… nothing else.
• Breakthrough in 2004 (Hildebrandt et al.):
1. Define an auxiliary field: the displacement potential
2. Approximate the nonlocal boundary condition
3. Double reciprocity leads to a boundary-integral method
[Figure: molecular surface; “Licorice” and “Cartoon” representations]

Nonlocal Continuum Electrostatics: 1. Introduce an Auxiliary Potential
• Use Helmholtz decomposition: [equation lost in extraction]
• Electrostatic potential now satisfies a Yukawa equation: [equation lost in extraction]
    Yukawa/linearized Poisson-Boltzmann equation
    Displacement potential acts as a volume source

Nonlocal Continuum Electrostatics: 2. Approximate Nonlocal B.C.
• Original boundary conditions: [equations lost in extraction]
• Exact normal deriv. of solvent potentials satisfy
    Nonlocal boundary condition: [equation lost in extraction]
    Choose to drop [term lost in extraction]
• The actual PDEs complete the local formulation: [equations lost in extraction]

Nonlocal Continuum Electrostatics: 3. Green’s Theorem + Double Reciprocity
• Electric potential: Green’s theorem gives a volume integral
• The displacement potential is harmonic: [equation lost in extraction; right-hand side 0]
• Defining single- and double-layer operators

Nonlocal Continuum Electrostatics: Purely BIE Formulation
• Three surface variables, two types of Green’s functions, and a mixed first-second kind problem
• Fasel et al. have recently derived a purely second-kind method
Hildebrandt et al. 2005, 2007

Nonlocal Continuum Electrostatics: Analytical Solution for Sphere
For sphere, these operators share a common eigenbasis: spherical harmonics
    All of these are diagonal
• Solve each mode independently and presto!
• Note: This is not about matching interior and exterior expansions--unlike the Kirkwood solution for local model
• This decomposition may provide further analytical insights (e.g., eigenvectors of reaction-potential operator)
Bardhan and Brune, to be submitted

Nonlocal Continuum Electrostatics: Charge Burial and the pKa Problem
• Understanding charge burial energetics is important!
    For protein folding, misfolding (Alzheimer’s), etc.
    For two molecules binding (drug-protein, protein-protein, etc.)
    For change in environment (pH, temperature, concentration, etc.)
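
For a rough sense of scale in the local theory, here is a minimal sketch using the classical Born formula for a single spherical ion in a uniform dielectric. The Born formula is not taken from these slides; the 2-Angstrom radius, the function names, and the treatment of the protein interior as an infinite uniform medium are all illustrative assumptions. The dielectric constants (2-4 for protein, 80 for water) are the ones quoted earlier in the talk.

```python
# Illustrative sketch only: Born model of an ion in an infinite uniform dielectric.
# Assumptions (not from the slides): the Born formula itself, the ion radius,
# and modelling "buried in protein" as an infinite medium with eps_protein.

COULOMB_KCAL = 332.06  # approximate Coulomb constant, kcal*Angstrom/(mol*e^2)


def born_energy(charge, radius, eps_medium):
    """Energy (kcal/mol) to move an ion from vacuum into a uniform dielectric.

    charge is in units of e, radius in Angstrom.
    """
    return -0.5 * COULOMB_KCAL * charge ** 2 / radius * (1.0 - 1.0 / eps_medium)


def burial_penalty(charge, radius, eps_protein=4.0, eps_water=80.0):
    """Energy (kcal/mol) to move the ion from water into the low-dielectric medium."""
    return born_energy(charge, radius, eps_protein) - born_energy(charge, radius, eps_water)


# Hypothetical +1 ion of radius 2 Angstrom, for several protein dielectric constants.
for eps_p in (2.0, 4.0, 20.0):
    print(f"eps_protein = {eps_p:5.1f}: penalty = {burial_penalty(1.0, 2.0, eps_p):6.2f} kcal/mol")
```

In this idealized model the penalty for eps_protein = 4 is roughly 20 kcal/mol, and it shrinks as eps_protein is raised. A real protein calculation needs the full mixed-dielectric geometry rather than this single-sphere estimate.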
[Figure: ion or charged chemical group, alone in water; ion or charged chemical group, buried in protein]
Local theory needs unrealistically large dielectric constants to match experiment!
    Measured protein dielectric constants suggest ε = 2-5
[Plot: Error in pKa value (RMSD), y-axis ticks 0 to 3; x-axis ticks 5, 20, 40, 60, 80. Demchuk+Wade, 1996]

Nonlocal Continuum Electrostatics: Charge Burial and the pKa Problem
• Nonlocal theory with realistic dielectric constant predicts similar energies as (widely successful) local theories with unrealistic dielectric constants!
Bardhan, J. Chem. Phys. (in press)

Nonlocal Continuum Electrostatics: Fast Solver is a Must for Accurate Studies
• O(N^2) memory limitation: big discretization errors
• O(N) fast solver: only way to get accurate energies

# of boundary elements    1000    10,000    100,000    1,000,000
Memory needed (GB)        0.07         7        700       70,000

[Figure: illustration of surface representations for memory-constrained dense BEM and fast BEM]

Nonlocal Continuum Electrostatics: Fast BIE Solver Performance
• Time and memory scale linearly in the number of unknowns
• Unoptimized code still allows a laptop to solve 10X larger problems than is possible on a cluster
• Preconditioning is vital (use diagonal entries of blocks)
[Plot label: Required accuracy]
Dense methods used previously could not achieve useful accuracy!
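
The memory figures in the table above (0.07 GB up to 70,000 GB) can be reproduced to one significant figure with a short back-of-the-envelope sketch. It assumes, as a guess consistent with the "three surface variables" of the purely BIE formulation, that the dense matrix is 3N x 3N for N boundary elements and stores 8-byte double-precision entries. The function name and both default parameters are hypothetical.

```python
# Back-of-the-envelope sketch of O(N^2) dense BEM storage.
# Assumptions (not stated on the slides): 3 unknowns per boundary element,
# 8-byte double-precision entries, 1 GB = 1e9 bytes, fully dense storage.


def dense_bem_memory_gb(n_elements, unknowns_per_element=3, bytes_per_entry=8):
    """Memory in GB needed to store the full dense system matrix."""
    n_unknowns = unknowns_per_element * n_elements
    return n_unknowns ** 2 * bytes_per_entry / 1e9


for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,d} elements: {dense_bem_memory_gb(n):>10,.2f} GB")
```

Each tenfold increase in the number of elements multiplies the dense storage by one hundred, which is why an O(N) compressed representation is the only route to the finer meshes.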
Bardhan and Hildebrandt, DAC ‘11

Nonlocal Continuum Electrostatics: Fast BIE Solver Enables Tests on Proteins
[Panels: Local Model, Nonlocal Model]
• Observe reduced “electrostatic focusing”
• Next step: compare to molecular dynamics
Bardhan and Hildebrandt, DAC ‘11

Nonlocal Continuum Electrostatics: Adding molecular realism “the right way”
Extend the space of models that are supported by good theory
Derive fast analytical methods for testing the new theories
Test on important open questions
Build high-performance solvers for realistic, accurate simulations
[Diagram: Nonlocal model at the overlap of Applied Math, Biophysics, and Computer science (HPC)]

Summary:
• Improve understanding of existing models
• Develop new models on strong foundations
• Stringent tests of new models
• Leverage HPC expertise by re-using computational primitives
• “Think computationally” to gain new insights into model development
• Identify critical model weaknesses
• Explain previously unresolved phenomena
[Diagram: "My research" at the overlap of Applied Math, Biophysics, and Computer science (HPC)]

Acknowledgments
• Support:
    Wilkinson Fellowship at Argonne National Lab
    Partial support from a Rush University New Investigator award
• Colleagues:
    Ridgway Scott (U. Chicago)
    Bob Eisenberg, Dirk Gillespie (Rush)
    Mala Radhakrishnan (Wellesley)
    Nathan Baker (Pacific Northwest Nat’l Lab)