Title: Geometry optimization, Binding of molecules
Name: Gero Friesecke, Florian Theil
Affil./Addr. 1: Center of Mathematics, TU Munich, gf@ma.tum.de
Affil./Addr. 2: Mathematics Institute, University of Warwick, f.theil@warwick.ac.uk
Mathematics Subject Classification: 81V55, 70Cxx, 92C40

Short Definition

Geometry optimization is a method to predict the three-dimensional arrangement of the atoms in a molecule by minimizing a model energy. The phenomenon of binding, that is to say the tendency of atoms and molecules to conglomerate into stable larger structures, as well as the emergence of specific structures depending on the constituent elements, can be explained, at least in principle, as a result of geometry optimization.

Phenomena

Two atoms are said to be linked by a bond if there is an opposing force against pulling them apart. Associated with a bond is a binding energy, which is the total energy required to separate the atoms. Except at very high temperature, atoms form bonds with each other and conglomerate into molecules and larger aggregates such as atomic or molecular chains, clusters, and crystals. The ensuing molecular geometries, that is to say the 3D arrangements of the atoms, and the binding energies of the different bonds crucially influence physical and chemical behaviour. Predicting them theoretically therefore forms a large and important part of contemporary research in chemistry, materials science, and molecular biology.

A major difficulty is that binding energies, preferred partners, and local geometries are highly chemically specific, that is to say they depend on the elements involved. For instance, the experimental binding energies of the diatomic molecules Li₂, Be₂, and N₂ (i.e. the dimers of elements number 3, 4, and 7 in the periodic table) are roughly in the ratio 10 : 1 : 100. And CH₂ is bent, whereas CO₂ is straight.

When atoms form bonds, their electronic structure, that is to say the probability cloud of electrons around their atomic nuclei, rearranges. Chemists distinguish phenomenologically between different types of bonds, depending on the type of rearrangement: covalent, ionic, and metallic bonds, as well as weak bonds such as hydrogen bonds or van der Waals bonds. A covalent bond corresponds to a substantial rearrangement of the electron cloud into the space between the atoms while each atom remains charge neutral, as in the C–C bond. In an ionic bond, one electron migrates almost fully to the other atom, as in the dimer Na–Cl. The metallic bond between atoms in a solid metal is pictured as the formation of a “sea” of free electrons, no longer associated with any particular atom, surrounding a lattice of ionic cores. These distinctions, albeit a helpful guide, should not be taken too literally and are often not so clear-cut in practice.

A unifying theoretical viewpoint of the 3D molecular structures resulting from interatomic bonding, regardless of the type of bonds, is to view them as geometry optimizers, i.e. as locally or globally optimal spatial arrangements of the atoms which minimize overall energy. For a mathematical formulation see Section 2.

If the number of atoms or molecules is large ($\gtrsim 100$), then the system will start behaving in a thermodynamic way. At sufficiently low temperature, identical atoms or molecules typically arrange themselves into a crystal, that is to say the positions of the atomic nuclei are given approximately by a subset of a crystal lattice. A crystal lattice $L$ is a finite union of discrete subsets of $\mathbb{R}^3$ of the form $\{ie + jf + kg \mid i, j, k \in \mathbb{Z}\}$, where $e, f, g$ are linearly independent vectors in $\mathbb{R}^3$. Near the boundaries of crystals, the underlying lattice is often distorted. Closely related effects are the emergence of defects such as vacancies, interstitial atoms, dislocations, and continuum deformations.
Vacancies are missing atoms, and interstitial atoms are additional atoms. Dislocations are topological crystallographic defects which can sometimes be visualized as being caused by the termination of a plane of atoms in the middle of a crystal. Continuum deformations are small long-wavelength distortions of the underlying lattice arising from external loads, as in an elastically bent macroscopic piece of metal.

A unifying interpretation of the above structures arises by extending the term ‘geometry optimization’, which is typically used in connection with single molecules, to large-scale systems as well. The spatial arrangements of the atoms can again be understood, at least locally and subject to holding the atomic positions in an outer region fixed, as geometry optimizers, i.e. minimizers of energy.

Geometry optimization and binding energy prediction

Geometry optimization, in its basic all-atom form, predicts the 3D spatial arrangement of the atoms in a molecule by a two-step procedure. Suppose the system consists of $M$ atoms, with atomic numbers $Z_1, \dots, Z_M$.

Step A: Specify a model energy, or potential energy surface (PES), that is to say a function $\Phi : \mathbb{R}^{3M} \to \mathbb{R} \cup \{+\infty\}$ which gives the system’s potential energy as a function of the vector $X = (X_1, \dots, X_M) \in \mathbb{R}^{3M}$ of the atomic positions $X_j \in \mathbb{R}^3$.

Step B: Compute (local or global) minimizers $(X_1, \dots, X_M)$ of $\Phi$.

Basic physical quantities of the molecule correspond to mathematical quantities of the energy surface as follows:

Physical quantity       Mathematical quantity
binding energy          difference between minimum energy and sum of energies of subsystems
stable configuration    local minimizer
transition state        saddle point
bond length/angle       parameter in minimizing configuration

More precisely, the theoretical binding energy $\Delta E$ of the minimizer obtained in Step B with respect to decomposition into two subsystems, say of the first $K$ atoms and the last $M - K$ atoms, is defined as

\[ \Delta E = \min \Phi(X) - \lim_{R \to \infty} \min\bigl\{ \Phi(X) : \mathrm{dist}(\{X_1, \dots, X_K\}, \{X_{K+1}, \dots, X_M\}) \ge R \bigr\}. \]

Potential energy surfaces have the general property of Galilean invariance, that is to say

\[ \Phi(X_1, \dots, X_M) = \Phi(RX_1 + a, \dots, RX_M + a) \]

for any translation vector $a \in \mathbb{R}^3$ and any rotation matrix $R \in SO(3)$. Thus a one-atom surface $\Phi(X_1)$ is independent of $X_1$, and a two-atom surface $\Phi(X_1, X_2)$ equals $\varphi(|X_1 - X_2|)$ for some function $\varphi$ of interatomic distance. In particular, for a diatomic molecule, the geometry optimization Step B reduces to computing the bond length, $r_* := \operatorname{argmin}_r \varphi(r)$.

Model energies

A wide range of model energies are in use, depending on the type of system and the desired level of understanding. To obtain quantitatively accurate and chemically specific predictions, one uses ab initio energy surfaces, that is to say surfaces obtained from a quantum mechanical model for the system’s electronic structure which requires as input only atomic numbers. For large systems, one often uses classical potentials. The latter are particularly useful for predicting the 3D structure of systems composed of many identical copies of just a few basic units, such as crystalline clusters, carbon nanotubes, or nucleic acids.

Born-Oppenheimer potential energy surface

The gold standard model energy of a system of $M$ atoms, which in principle contains the whole range of phenomena described in Section 1, is the ground state Born-Oppenheimer PES of non-relativistic quantum mechanics. With $X = (X_1, \dots, X_M) \in \mathbb{R}^{3M}$ denoting the vector of nuclear positions, it has the general mathematical form

\[ \Phi^{BO}(X) = \min_{\Psi \in \mathcal{A}_N} E(X, \Psi), \tag{1} \]

where $E$ is an energy functional depending on an infinite-dimensional field $\Psi$, the electronic wavefunction. For a molecule with $N$ electrons, the latter is a function on the configuration space $(\mathbb{R}^3 \times \mathbb{Z}_2)^N$ of the electron positions and spins.
More precisely,

\[ \mathcal{A}_N = \bigl\{ \Psi \in L^2((\mathbb{R}^3 \times \mathbb{Z}_2)^N; \mathbb{C}) \bigm| \|\Psi\|_{L^2} = 1,\ \nabla\Psi \in L^2,\ \Psi \text{ antisymmetric} \bigr\}, \]

where antisymmetric means, with $x_i, s_i$ denoting the position and spin of the $i$th electron, $\Psi(\dots, x_i, s_i, \dots, x_j, s_j, \dots) = -\Psi(\dots, x_j, s_j, \dots, x_i, s_i, \dots)$ for all $i < j$. The functional $E$ is given, in atomic units, by

\[ E(X, \Psi) = \int_{(\mathbb{R}^3 \times \mathbb{Z}_2)^N} \Psi^* H \Psi, \quad \text{where} \quad H = \sum_{i=1}^{N} \Bigl( -\tfrac{1}{2} \nabla_{x_i}^2 + v_X(x_i) \Bigr) + \sum_{1 \le i < j \le N} W_{ee}(x_i - x_j) + W_{nn}(X) \tag{2} \]

and

\[ v_X(r) = -\sum_{\alpha=1}^{M} \frac{Z_\alpha}{|r - X_\alpha|}, \qquad W_{ee}(r) = \frac{1}{|r|}, \qquad W_{nn}(X) = \sum_{1 \le \alpha < \beta \le M} \frac{Z_\alpha Z_\beta}{|X_\alpha - X_\beta|}; \tag{3} \]

see also H. Yserentant’s entry Schrödinger equation for chemistry in this encyclopedia. Note that the energy functional captures chemical specificity by depending on the nuclear charges $Z_1, \dots, Z_M \in \mathbb{N}$ (e.g. 1 for hydrogen, 6 for carbon, 8 for oxygen).

Fig. 1. Numerical geometry optimizer for water, H₂O, for the Born-Oppenheimer energy surface (1), (2), (3). Water corresponds to $M = 3$, $Z_1 = Z_2 = 1$, $Z_3 = 8$, $N = 10$. The positions of the atomic nuclei are visualized as spheres. Data as predicted in Ref. [CC05]: O–H bond lengths 0.95870 Å, H–O–H bond angle 104.411°. The high dimensionality of Step A (solving the underlying Schrödinger partial differential equation on $\mathbb{R}^{30}$) is tackled by a method far beyond the scope of this article (internally contracted multi-reference configuration interaction with an aug-cc-pV6Z basis set).

Numerically computing the PES and the ensuing molecular geometry from (1)–(3) is already highly nontrivial for a small system as in Figure 1, and becomes infeasible for large systems because of the curse of dimension: the unknown field $\Psi$ is a function on a $3N$-dimensional space. For more information see C.J. Garcia-Cervera’s entry Linear scaling methods.

Coarse-graining

A key method for reducing the complexity of $E$ is coarse-graining. In the simplest case the minimization is performed over a low-dimensional subset obtained via some ansatz.
Examples are the Hartree-Fock method, which makes an antisymmetrized tensor product ansatz for the electronic wavefunction, or the Cauchy-Born rule (see eq. (9)). Such methods generate controlled approximations in the sense that minimizing the energy over all trial configurations yields an upper bound for the true energy minimum.

Ansatz-free methods involve a modification of the energy $E$ to account implicitly for eliminated degrees of freedom. Such methods, ingenious as they may be, provide uncontrolled approximations. Key examples are density functional theory (Sec. 3.3) and classical potentials (Sec. 3.4), as well as related intermediate methods. For example, one may eliminate only core electrons and model their impact on valence electrons by pseudopotentials; or one may model chemically active sites of a molecule quantum mechanically and the remainder classically, as in the quantum mechanics/molecular mechanics (QM/MM) method (see e.g. [ST09]).

Density functional theory models

A great many geometry optimization calculations in the chemistry and physics literature are based on DFT models, introduced by Hohenberg and Kohn (1964) and Kohn and Sham (1965). Such models describe the electronic structure in terms of a single scalar function on $\mathbb{R}^3$, the single-electron density $\rho : \mathbb{R}^3 \to \mathbb{R}$, thereby eliminating the curse of dimension from (1). The associated PES are of the form

\[ \Phi^{DFT}(X) = \min_{\rho} E^{DFT}(X, \rho), \tag{4} \]

for some functional $E^{DFT}$, a number of different functionals being used in practice. For more information see the entry Density Functional Theory by Rafael Benguria. For examples of optimal DFT geometries of molecules with up to 100 atoms see e.g. [RK04]. It occasionally happens that DFT fails to get the most favorable geometries right, even when the best available functionals are used, as for a system of 20 carbon atoms, which is small by DFT standards [MEF96, Table 4].
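
In contrast to such uncontrolled models, the upper-bound property of ansatz-based minimization noted under Coarse-graining can be checked on the smallest possible case. The following sketch is illustrative only and is not taken from the references. It assumes the hydrogen atom ($M = 1$, $N = 1$, $Z_1 = 1$, nucleus at the origin, spin ignored) and a one-parameter Gaussian trial family $\psi_\alpha(x) = (2\alpha/\pi)^{3/4} e^{-\alpha |x|^2}$, for which the functional (2) evaluates in closed form to $E(\alpha) = \tfrac{3}{2}\alpha - 2\sqrt{2\alpha/\pi}$. The function names and the choice of a golden-section search are assumptions made for this example. Minimizing over $\alpha$ gives $\alpha = 8/(9\pi)$ and the energy $-4/(3\pi) \approx -0.424$, which lies above the exact hydrogen ground state energy $-1/2$ (atomic units), as an upper bound should.

```python
import math


def gaussian_trial_energy(alpha):
    """Energy (2) of hydrogen for the normalized Gaussian exp(-alpha*|x|^2).

    Kinetic part: 3*alpha/2; nuclear attraction: -2*sqrt(2*alpha/pi).
    (Assumed trial family, chosen because both integrals are explicit.)
    """
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)


def golden_section_minimize(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            # Minimum lies in [lo, d]; reuse c as the new right probe.
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            # Minimum lies in [c, hi]; reuse d as the new left probe.
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


# Minimize over the one-dimensional ansatz family (search interval is a guess).
alpha_opt = golden_section_minimize(gaussian_trial_energy, 0.01, 5.0)
energy_opt = gaussian_trial_energy(alpha_opt)
exact_ground_state = -0.5  # hydrogen ground state energy in atomic units

print(f"optimal alpha = {alpha_opt:.6f}")
print(f"ansatz energy = {energy_opt:.6f}")
print(f"exact energy  = {exact_ground_state:.6f}")
```

Enlarging the trial family, for instance to sums of several Gaussians, can only lower the ansatz energy towards the exact value; this is the sense in which such approximations are controlled.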

Classical potentials

For large systems, one often uses classical potentials, in which the energy as a function of the atomic position vector is given by an explicit expression. A basic example is the pair potential energy

\[ \Phi^{classical}(X_1, \dots, X_M) = \sum_{1 \le i < j \le M} \varphi(|X_i - X_j|) \tag{5} \]

with the Lennard-Jones (6,12) potential

\[ \varphi(r) = a r^{-12} - b r^{-6}, \tag{6} \]

which provides a good description of noble gases and noble metals such as argon or copper. Here $a > 0$, $b > 0$ are empirical parameters. For a variety of monatomic systems, more sophisticated classical potentials containing three-body and higher interactions,

\[ \Phi^{classical}(X_1, \dots, X_M) = \sum_{i<j} V_2(X_i, X_j) + \sum_{i<j<k} V_3(X_i, X_j, X_k) + \dots, \tag{7} \]

have been developed, well-known examples being the Tersoff, Brenner, and Stillinger-Weber potentials for carbon. An example of a geometry optimizer for the model (5), (6) is shown in Figure 2.

Classical potentials for biomolecules (customarily called “force fields” in biochemistry) are considerably more subtle. In particular, they require not just a significant number of empirical constants, but also prior knowledge of the molecule’s topology (i.e., which atom is covalently bonded to which; in biochemistry language, the primary structure). In some cases, one also needs to know the hydrogen bonds (the secondary structure). Software packages such as CHARMM [B09] can specify an all-atom potential given a molecule’s primary structure, and provide built-in geometry optimization routines. The accuracy of the potentials has improved significantly since the package’s first release in 1983, but systematic improvement by building in empirical or ab initio information about subunits bigger than a few atoms is impeded by the combinatorial growth of possibilities.

Fig. 2. Numerical 70-atom geometry optimizer for the Lennard-Jones energy (5), (6), plotted from the results of [No87].
The high dimensionality of Step B (the configuration space is 210-dimensional) is tackled by first generating a good set of initial configurations before relaxing them under the Lennard-Jones energy. The initial configurations are subsets of plausible crystal lattices, and are found via a stochastic search algorithm based on the number of ‘bonds’ (pairs of particles with close to optimal distance).

Methods and Mathematical Aspects

Rigorous results

On the rigorous level, very little is known about binding of molecules and geometry optimization in ab initio models. In fact, it is far from mathematically obvious that interatomic binding occurs, i.e. that the energy difference $\Delta E$ defined in Sec. 2 is negative, and that ab initio potential energy surfaces possess minimizers. The latter properties for general neutral molecules essentially follow from results by Lieb and Thirring [LT86] for the Born-Oppenheimer PES, and were fully proved by Catto and Lions [CL93] for density functional models such as the Thomas-Fermi-Weizsäcker model. For classical models like (5), (6), the fact that binding occurs and geometry optimizers exist is mathematically obvious, but the basic numerical fact that optimizers have a crystalline structure has not been explained by any mathematical argument. Rigorous insights into global optimality of crystalline arrangements are currently limited to even further simplified models and two space dimensions [HR80, FT02, Th06, EL09].

Numerical methods

Numerical computation of binding energies and equilibrium geometries for specific systems has a huge physics, chemistry, biochemistry, and materials science literature. One has to face curse-of-dimension phenomena and the multiscale structure of the energy landscapes. Tiny energy differences (in relation to the system’s total energy) between competing electronic states or atomic configurations often lead to very different minimizers.
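
As a toy illustration of Step B for the classical model (5), (6), far simpler than the methods used in practice, the following sketch relaxes a four-atom cluster by gradient descent with a backtracking step size. It is not the procedure of [No87]. The parameter values $a = 1$, $b = 2$, the starting configuration, the step-size rule, the tolerances, and all function names are assumptions made for the example. With these parameters the pair potential attains its minimum at $r_* = (2a/b)^{1/6} = 1$ with $\varphi(r_*) = -b^2/(4a) = -1$, so a regular tetrahedron with unit edges, in which all six pairs sit at the optimal distance, has the lowest possible energy $-6$.

```python
import numpy as np


def lj_energy_and_gradient(X, a=1.0, b=2.0):
    """Pair energy (5) with phi(r) = a*r**-12 - b*r**-6, and its gradient.

    X has shape (M, 3); the gradient has the same shape.
    """
    diff = X[:, None, :] - X[None, :, :]       # diff[i, j] = X_i - X_j
    r2 = np.sum(diff**2, axis=-1)              # squared distances
    np.fill_diagonal(r2, np.inf)               # drop the i = j terms
    inv6 = r2**-3                              # r**-6
    energy = 0.5 * np.sum(a * inv6**2 - b * inv6)   # each pair counted twice
    # phi'(r) / r = (-12*a*r**-12 + 6*b*r**-6) / r**2
    coeff = (-12.0 * a * inv6**2 + 6.0 * b * inv6) / r2
    grad = np.sum(coeff[:, :, None] * diff, axis=1)
    return energy, grad


def relax(X0, step0=0.1, tol=1e-5, max_iter=2000):
    """Gradient descent with backtracking (hypothetical helper, local search only)."""
    X = np.array(X0, dtype=float)
    energy, grad = lj_energy_and_gradient(X)
    for _ in range(max_iter):
        g2 = float(np.sum(grad**2))
        if g2 < tol**2:
            break
        step = step0
        for _halving in range(60):
            X_try = X - step * grad
            e_try, g_try = lj_energy_and_gradient(X_try)
            if e_try <= energy - 1e-4 * step * g2:   # sufficient decrease
                break
            step *= 0.5
        else:
            break  # no acceptable step found; stop at the current point
        X, energy, grad = X_try, e_try, g_try
    return X, energy


# Assumed starting configuration: four atoms in general (non-planar) position.
X_start = np.array([[0.0, 0.0, 0.0],
                    [1.3, 0.0, 0.0],
                    [0.4, 1.1, 0.0],
                    [0.5, 0.3, 1.2]])
X_opt, E_opt = relax(X_start)
pair_distances = np.array([np.linalg.norm(X_opt[i] - X_opt[j])
                           for i in range(4) for j in range(i + 1, 4)])
print("relaxed energy:", E_opt)
print("pair distances:", np.round(pair_distances, 6))
```

For larger clusters the energy landscape has many local minimizers, and a local relaxation of this kind only returns one that is determined by the starting configuration. This is why the choice of initial configurations matters in the 70-atom computation of Figure 2.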
The large and sophisticated array of methods used in practice, while fitting into the general framework described in Sec. 3.2, relies both on model reduction via physical and chemical intuition and on algorithmic ideas, and cannot be reviewed here. For small molecules, generation of ab initio PES based on these methods and subsequent geometry optimization lies within the capabilities of software packages such as Gaussian [Ga09]. For more information on algorithmic issues for large molecules see the entry by S. Redon.

Passage to larger scales

Let us give two examples where empirical assumptions on atomistic geometry optimizers directly lead to widely used continuum theories on larger scales.

Cluster shapes. Assume that the $M$-atom ground states of a PES are, to good approximation, subsets of a crystal lattice. As $M$ gets large, the ground state energy decomposes into a shape-independent $O(M)$ contribution and an $O(M^{2/3})$ surface energy which depends on the overall cluster shape $\Omega \subset \mathbb{R}^3$,

\[ \Phi(X_1, \dots, X_M) \approx M \cdot E_\infty + \int_{\partial\Omega} e(\nu)\, dS \tag{8} \]

(for a rigorous version for a simple 2D model see [AFS12]). Here $E_\infty$ is the asymptotic energy per particle, $\lim_{M \to \infty} M^{-1} \min_{X_1, \dots, X_M} \Phi(X_1, \dots, X_M)$, and $e(\nu)$ is an energy density per unit surface area which depends on the normal direction $\nu$ of the surface with respect to the lattice. The minimizers of such surface functionals are surprisingly simple, and can be found explicitly (so-called Wulff shapes).

Cauchy-Born rule. This rule postulates that when a crystal is subjected to a small linear displacement of its boundary, all atoms will follow this displacement. For a crystal with overall shape $\Omega \subset \mathbb{R}^3$ subjected to a continuum deformation $u : \Omega \to \mathbb{R}^3$, locally applying this rule leads to an elastic energy,

\[ \Phi(X) \approx I_{cont}(u) = \int_\Omega W(\nabla u(x))\, dx, \]

with stored-energy function $W$ given, for $\Phi$ as in (7), by

\[ W(F) = \frac{1}{v(L)} \Bigl[ \frac{1}{2} \sum_{\ell \in L} V_2(0, F\ell) + \frac{1}{6} \sum_{\ell, \ell' \in L} V_3(0, F\ell, F\ell') + \dots \Bigr]. \tag{9} \]

Here $v(L)$ denotes the volume of a lattice cell, and the map $u : \Omega \to \mathbb{R}^3$ is a continuum approximation of the map from the atomic positions in the undeformed crystal to the new positions. Computationally, the passage from $\Phi$ to $I_{cont}$ is a dramatic simplification, because the discrete (and expensive) sums have been replaced with integrals, which can be re-discretized on a much larger scale. Closely related are hybrid methods such as the quasicontinuum method, which retain atomistic resolution in some regions [TOP96].

Temperature

Geometry optimization is a zero temperature method. At finite temperature $T$, the system is more accurately described by the Boltzmann-Gibbs distribution

\[ \rho_T(X, P) = \frac{1}{Z(T)}\, e^{-\frac{1}{k_B T} H(X, P)}, \]

where $P$ is the vector of particle momenta, $H(X, P)$ is the Hamiltonian of the system, $k_B$ the Boltzmann constant, and $Z(T)$ a normalization constant. The Boltzmann-Gibbs distribution provides a unified treatment of entropic and energetic effects, and concentrates near the ground state of $\Phi$ if $T$ is sufficiently small in relation to the closest critical temperature at which a phase transition occurs. Many small molecules and most solids are well within this regime at 300 K, but large biomolecules often are not. A numerical finite-temperature analogue of geometry optimization for such molecules is to sample trajectories of a thermostatted molecular dynamics model with initial conditions given by zero-temperature geometry optimizers.

References

AFS12. Y. Au Yeung, G. Friesecke, B. Schmidt, Minimizing atomic configurations for short range pair potentials in two dimensions: crystallization in the Wulff shape, Calc. Var. PDE 44, 81-100, 2012
B09. B. R. Brooks et al., CHARMM: The Biomolecular Simulation Program, J. Comput. Chem. 30, 1545-1614, 2009
CC05. A. G. Császár et al., On equilibrium structures of the water molecule, J. Chem. Phys. 122, 214305, 2005
CL93. I. Catto and P.-L. Lions.
Binding of atoms in Hartree and Thomas-Fermi type theories. Part 3: Binding of neutral subsystems. Commun. PDE 18, 381-429, 1993
EL09. W. E and D. Li, On the Crystallization of 2D Hexagonal Lattices, Commun. Math. Phys. 286, 1099-1140, 2009
FT02. G. Friesecke and F. Theil, Validity and failure of the Cauchy-Born hypothesis in a two-dimensional mass-spring lattice, J. Nonl. Sci. 12 No. 5, 445-478, 2002
Ga09. Gaussian 09, Revision A.1, M. J. Frisch et al., Gaussian, Inc., Wallingford CT, 2009
HR80. R. C. Heitmann and C. Radin, The ground state for sticky discs, J. Stat. Phys. 22, 281-287, 1980
KS65. W. Kohn and L. J. Sham, Self-consistent equations including exchange and correlation effects, Phys. Rev. 140, A1133-A1138, 1965
MEF96. J. M. L. Martin, J. El-Yazal, and J.-P. Francois, On the structure and vibrational frequencies of C24, Chem. Phys. Letters 255, 7-14, 1996
No87. J. A. Northby, Structure and binding of Lennard-Jones clusters: 13≤N≤147, J. Chem. Phys. 87, 6166-6177, 1987
RK04. J. U. Reveles and A. M. Köster, Geometry Optimization in Density Functional Methods, J. Comput. Chem. 25, 1109-1116, 2004
SO96. A. Szabo and N. S. Ostlund, Modern Quantum Chemistry, Dover Publications, 1996
ST09. H. M. Senn and W. Thiel, QM/MM Methods for Biomolecular Simulation, Angew. Chem. Int. Ed. 48, 1198-1229, 2009
Th06. F. Theil, A proof of crystallization in two dimensions, Comm. Math. Phys. 262, 209-236, 2006
TOP96. E. B. Tadmor, M. Ortiz and R. Phillips, Quasicontinuum analysis of defects in solids, Phil. Mag. A 73, 1529-1563, 1996