RUHR-UNIVERSITÄT BOCHUM FAKULTÄT FÜR CHEMIE UND BIOCHEMIE LEHRSTUHL FÜR THEORETISCHE CHEMIE High-Dimensional Neural Network Potentials for Solids and Surfaces: Applications to Copper and Zinc Oxide Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften Nongnuch Artrith Die vorliegende Dissertation wurde in der Zeit von Juni 2008 bis November 2012 am Lehrstuhl für Theoretische Chemie an der Fakultät für Chemie und Biochemie der Ruhr-Universität Bochum angefertigt. Leiter der Arbeit: Dr. Jörg Behler Referent: Dr. Jörg Behler Koreferent: Prof. Dr. Dominik Marx Dekan: Prof. Dr. Wolfram Sander วิทยานิ พนธเลมนี้ ขอมอบใหแด บิดา-มารดาของขาพเจา นายประยุทธ-นางสังวาลย อาจฤทธิ ์ นางสาวบุษบา อาจฤทธิ ์ (นองสาว) ตลอดจนครอบครัว อาจฤทธิ ์ ศรีกะกุล และ คุณพี่ดารา-คุณพี่สมบูรณ พิลาโสภา ทุกคนคอยสงกําลังใจ ใหความสําคัญกับการศึกษา และใหการสนับสนุนทุกๆ อยางแกขาพเจาตลอดมา โดยเฉพาะชวงเวลาที่ขาพเจาอยูหางไกลคนละทวีปของโลกใบนี้ ขอกราบขอบพระคุณดวยความเคารพและรักสุดหัวใจ นางสาวนงคนุช อาจฤทธิ ์ (2012) To my parents and my family (Artrith and Srikakul) for their love, support and patience. A special thanks for my mother's words: “Keep studying as much as you can, and study well, because no one can take from you what you have already learned.” Meinen Eltern und meiner Familie Artrith und Srikakul für ihre Liebe, Unterstützung und Geduld Einen besonderen Dank für die Worte meiner Mutter: „Lerne, soviel du kannst, und lerne gut, denn niemand kann dir nehmen, was du schon gelernt hast.” Zusammenfassung Die Simulation großer, realistischer Oberflächen nicht-idealisierter heterogener Katalysatoren erfordert die Modellierung von Systemen mehrerer Tausend Atome. Insbesondere die Zuverlässigkeit von Molekulardynamik-Simulationen großer Systeme ist hierbei stark abhängig von einer genauen Beschreibung der zugrunde liegenden Potentialenergiefläche (PES). 
Methoden basierend auf first principles, wie zum Beispiel Dichtefunktionaltheorie (DFT), erlauben zwar die genaue Vorhersage von Energien und atomaren Kräften, jedoch sind die notwendigen Systemgrößen aufgrund des hohen Rechenaufwands derzeit nicht mittels DFT zugänglich. In dieser Dissertation wird gezeigt, dass hochdimensionale neuronale Netzwerke (NNs), die mit first principles Daten trainiert wurden, die PES von Kupfer/Zinkoxid-Grenzflächen akkurat darstellen können. Das System aus Kupferclustern auf Zinkoxid ist ein wichtiger heterogener Katalysator für die industrielle Methanolsynthese. Im Vergleich zu DFT-Rechnungen ist die Auswertung von NN-Potentialen um einige Größenordnungen schneller. Darüber hinaus skaliert der Rechenaufwand linear mit der Anzahl der Atome. Die Konstruktion eines akkuraten Kupfer/Zinkoxid-Potentials erforderte eine Reihe von Zwischenschritten, die in dieser Arbeit erörtert werden: Zunächst wurde die allgemeine Anwendbarkeit der NN-Methode auf metallische Systeme anhand des Fallbeispiels Kupfer demonstriert. Zudem wurde eine Erweiterung der NN-Methode basierend auf umgebungsabhängigen atomaren Ladungen vorgestellt, welche die zuverlässige Beschreibung von Ladungstransport in Mehrkomponentenstrukturen ermöglicht. Diese Methodik wird anhand eines NN-Potentials für Zinkoxid erläutert. Die Erkenntnisse aus den ersten beiden Schritten erlauben anschließend die Konstruktion eines ersten NN-Potentials für das ternäre System aus Kupfer und Zinkoxid. Zuletzt wird die Genauigkeit der NN-Methode für molekulare Strukturen an dem Beispiel des Methanolmoleküls untersucht. Jedes in dieser Arbeit vorgestellte NN-Potential wurde sorgfältig getestet. Hierfür wurden vielfältige Eigenschaften, wie strukturelle Energieunterschiede, atomare Kräfte, Fehlstellenbildungsenergien, elastische Eigenschaften und Oberflächenenergien verschiedener Kupfer- und Zinkoxidoberflächen untersucht. 
Die vorhergesagten Geometrien, Energien, atomaren Kräfte und Ladungen sind in hervorragender Übereinstimmung mit den DFT Referenzwerten.

Abstract

The simulation of large, realistic surfaces of non-ideal heterogeneous catalysts requires the modeling of systems containing several thousand atoms. Molecular dynamics simulations of large systems, in particular, depend critically on an accurate description of the underlying potential energy surface (PES). First-principles methods such as density-functional theory (DFT) can provide very accurate energies and forces, but the simulation of such system sizes with DFT is currently infeasible because of the high computational cost. In this thesis it is demonstrated that high-dimensional neural networks (NNs) trained to first-principles data are able to accurately represent the PES of zinc oxide supported copper clusters, an important heterogeneous catalyst for methanol synthesis. The evaluation of NN potentials is several orders of magnitude faster than DFT calculations, and the computational cost scales linearly with the number of simulated atoms. The construction of an accurate copper/zinc oxide potential required several steps, which are discussed in the thesis. First, the general applicability of the NN method to metallic systems has been investigated using the example of a copper NN potential. Second, it is demonstrated that an extension of the NN methodology based on environment-dependent atomic charges allows the accurate description of multicomponent systems exhibiting charge transfer. Third, a working ternary potential for copper/zinc oxide interface structures is presented. Finally, the accuracy of NN potentials for molecular structures is reported for the methanol molecule as a benchmark example. Each constructed NN potential has been carefully tested.
Several properties, e.g., structural energy differences, atomic forces, vacancy formation energies, elastic properties, and surface energies for different copper and zinc oxide surfaces, have been examined. The predicted geometries, energies, atomic forces, and atomic charges are in excellent agreement with reference DFT calculations.

Associated Publications

Some of the results presented in this thesis have already been published in the following articles:

N. Artrith and J. Behler, “High-dimensional neural network potentials for metal surfaces: A prototype study for copper”, Phys. Rev. B 85 (2012) 045439.

N. Artrith, T. Morawietz and J. Behler, “High-dimensional neural-network potentials for multicomponent systems: Applications to zinc oxide”, Phys. Rev. B 83 (2011) 153101.

N. Artrith, B. Hiller and J. Behler, “Neural network potentials for metals and oxides – First applications to copper clusters at zinc oxide”, Phys. Status Solidi B 1–13 (2012), accepted (invited feature article).

K. V. J. Jose, N. Artrith and J. Behler, “Construction of high-dimensional neural network potentials using environment-dependent atom pairs”, J. Chem. Phys. 136 (2012) 194111.

Contents

Zusammenfassung
Abstract
Associated Publications

I Introduction
1 Introduction
  1.1 Heterogeneous catalysis
  1.2 Accurate and efficient atomistic potentials for materials
  1.3 Structure of the thesis

II Theoretical Background
2 Electronic Structure Calculations
  2.1 The Born–Oppenheimer approximation
  2.2 The electronic structure problem
  2.3 Density-functional theory
3 The FHI-aims Code
  3.1 Basis set expansion
4 Molecular dynamics simulations
5 Neural Network Potentials
  5.1 Artificial neural networks
  5.2 High-dimensional neural network potentials
  5.3 High-Dimensional Neural Networks for Multicomponent Systems
  5.4 Benchmark of activation functions and symmetry functions
  5.5 Molecular dynamics simulations employing NN potentials

III Computational Details
6 Computational Details
  6.1 DFT calculations
  6.2 Construction of reference data sets
  6.3 Optimization of the neural network architecture

IV Results
7 A Neural Network Potential for Copper
  7.1 Reference data set
  7.2 A neural network potential for copper
  7.3 Reliability of the neural network potential for a large realistic structure
8 A Multicomponent Neural Network Potential for Zinc Oxide
  8.1 Neural network potentials for multicomponent systems
  8.2 Reference data set
  8.3 Neural network potential for zinc oxide
9 Neural Network Potentials for Ternary Systems
  9.1 Construction of a neural network potential-energy surface for copper/zinc oxide
  9.2 Reference data set
10 A Neural Network Potential for the Methanol Molecule

V Summary and Outlook
11 Summary and Outlook

VI Appendix
A Symmetry Function Parameters
  A.1 Copper potential
  A.2 Zinc oxide potential
  A.3 Copper/zinc oxide potential
B Calculation of Elastic Constants of Cubic Lattices

Bibliography
Acknowledgements

List of Figures

5.1 Small example of a feed-forward neural network
5.2 High-dimensional neural network
5.3 Demonstration of a NN fit without force information
5.4 Demonstration of a NN fit including forces
5.5 An example of the radial symmetry functions $G_i^2$
5.6 An example of the angular symmetry functions $G_i^4$
5.7 High-dimensional NN for multicomponent systems
5.8 Flow chart of the RuNNer–TINKER interface
6.1 A systematic approach to construct neural network potentials
7.1 Energy vs. volume curves for Cu crystal structures
7.2 DFT energies of Cu bulk structures in the reference data set
7.3 Comparison of NN energies of two different Cu potentials
7.4 Comparison of RMSEs of copper bulk systems
7.5 Comparison of the fitting error of different copper NNs
7.6 Comparison of errors for the Cu training and test sets
7.7 Comparison of NN and DFT energies for a random Cu30 cluster
7.8 Comparison of NN and DFT forces for atoms in a Cu14 cluster
7.9 Comparison of NN and DFT energies of 16-atom Cu bulk structures
7.10 Comparison of NN and DFT forces acting on atoms in Cu bulk structures
7.11 Energy profiles of a diffusing Cu surface adatom
7.12 Slab model of a realistic Cu(111) surface
7.13 Atomic benchmark environments in a realistic Cu surface
7.14 Clusters extracted from the realistic Cu surface model
7.15 Comparison of NN and DFT atomic forces for Cu clusters (6 Å)
7.16 Comparison of NN and DFT atomic forces for Cu clusters (12 Å)
7.17 Convergence of the NN forces with increasing cutoff
8.1 Energy density of states for the ZnO data set
8.2 Comparison of NN and DFT energies of random Zn40O40 clusters
8.3 Comparison of NN and DFT forces in a Zn15O15 cluster
8.4 Comparison of NN and DFT force components in a random Zn15O15 cluster
8.5 Comparison of NN and DFT atomic charges for the same Zn15O15 cluster
8.6 Comparison of NN and DFT energies of ideal ZnO crystal structures
8.7 Comparison of NN and DFT energies of random ZnO bulk structures
8.8 Comparison of NN and DFT energies of thermally distorted surface structures
9.1 Realistic Cu surface model with imperfections
9.2 Cu clusters extracted from a large surface model
9.3 Comparison of NN and DFT forces acting on the central atoms of Cu clusters
9.4 Comparison of NN and DFT energies of random ZnO bulk structures
9.5 Comparison of NN and DFT forces in a Zn15O15 cluster
9.6 Comparison of NN and DFT energies of random CuO bulk structures
9.7 Comparison of NN and DFT energies of random CuZn bulk structures
9.8 Comparison of NN and DFT energies of random Cu27Zn20O20 clusters
9.9 Comparison of NN and DFT energies for random Cu/ZnO slabs
9.10 A model of a large Cu cluster on a ZnO surface
9.11 Snapshot of an MD simulation of a large Cu cluster on a ZnO surface
9.12 Comparison of NN and DFT atomic forces in a large Cu/ZnO interface structure
9.13 Convergence of the NN forces with respect to the cutoff radius
10.1 Comparison of NN and MM3 energies for the methanol molecule
10.2 Dihedral potential for the CH3 rotation in methanol
10.3 Comparison of NN and MM3 energies during an MD simulation of methanol

List of Tables

5.1 RMSEs of NN fits for the Cu dimer
7.1 Cu clusters in the training and the test set
7.2 Cu bulk structures in the training and the test set
7.3 Cu surface structures in the training and the test set
7.4 RMSEs for different copper NN potentials
7.5 Comparison of NN and DFT Cu lattice parameters
7.6 Comparison: NN vs. DFT for Cu elastic constants
7.7 Cu vacancy formation energies
7.8 DFT and NN copper surface energies
7.9 Vacancy formation energies at various Cu surfaces
7.10 Number of atoms in Cu clusters of increasing diameter
8.1 Composition of the training and the test set for the ZnO system
8.2 Energies and charges in the ZnO reference data set
8.3 RMSEs of energies and forces of different ZnO NN potentials (A)
8.4 RMSEs of energies and forces of different ZnO NN potentials (B)
8.5 Lattice parameters and bulk moduli of ZnO crystal structures
9.1 Composition of the Cu/ZnO reference data set
9.2 RMSEs for energies and forces of various Cu/ZnO NN fits (A)
9.3 RMSEs for energies and forces of various Cu/ZnO NN fits (B)
9.4 Cu lattice parameters as obtained using the Cu/ZnO potential
9.5 Cu surface energies as obtained using the Cu/ZnO potential
9.6 ZnO lattice parameters as obtained using the Cu/ZnO potential
10.1 Symmetry function parameters for the methanol NN potential (A)
10.2 Symmetry function parameters for the methanol NN potential (B)
10.3 RMSEs of energies and forces for the methanol potential
A.1 Radial symmetry functions for Cu
A.2 Angular symmetry functions for Cu
A.3 Radial symmetry functions for O in ZnO
A.4 Radial symmetry functions for Zn in ZnO
A.5 Angular symmetry functions for O in ZnO
A.6 Angular symmetry functions for Zn in ZnO
A.7 Radial symmetry functions used in the Cu/ZnO potential
A.8 Angular symmetry functions used in the Cu/ZnO potential

Part I Introduction

1 Introduction

The combustion of fossil fuels leads to climate change and to environmental pollution through the emission of greenhouse gases such as carbon dioxide (CO2). Nuclear electric power, on the other hand, not only bears high and potentially uncontrollable safety risks, but also leads to the unsolved problem of nuclear waste disposal. However, most sustainable energy sources, such as solar radiation, wind or tidal wave energy, do not provide a continuous energy output. Both solar radiation and wind depend on the weather, the time of day, and the season. It is therefore essential to store the energy when it is abundant for consumption at a later point in time.
Synthetic fuels, such as alcohols, are a particularly appealing option for energy storage, as they are highly transportable, have a high energy density, and can be used with available technology (e.g., combustion engines and fuel cells). Olah has pictured a prospective methanol-based economy, in which methanol, an important precursor in the chemical industry, is used as the main energy storage medium [1, 2]. The methanol synthesis involves the conversion of carbon monoxide (CO) and CO2, so that the net emission of greenhouse gases is close to zero, even though the combustion of methanol releases CO2. Today’s standard synthesis route to methanol is based on heterogeneous catalysis over a catalyst of oxide-supported copper clusters (Cu/ZnO/Al2O3) [3].

1.1 Heterogeneous catalysis

Heterogeneous catalytic chemical reactions, involving a solid catalyst and gaseous or liquid reactants, are at the core of many energy and environment related challenges. The conversion of toxic exhaust gases to less harmful substances is achieved with catalytic converters. The high overpotential of the electrochemical oxygen reduction reaction (ORR), the fundamental reaction in fuel cells, is lowered by a heterogeneous electrocatalyst (traditionally Pt/C) [4, 5]. In fact, the industrial production of the majority of chemical compounds is based on heterogeneous catalysis. The most prominent example is the ammonia synthesis following the Haber–Bosch process, which accounts for 1.4 % of the world’s consumption of fossil fuels [6] and has been recognized in three Nobel Prizes (Fritz Haber 1918, Carl Bosch 1931, and Gerhard Ertl 2007). The work that is the subject of this thesis was done as part of the collaborative research center SFB 558 at the University of Bochum, which has focused on the experimental and theoretical investigation of the methanol synthesis for the past 12 years (2000–2012) [7].
Many experimental studies within the SFB 558 and in other research groups have improved the understanding of the catalytic methanol synthesis, and of heterogeneous catalysis over oxide-supported metal clusters in general [3, 8–10]. In particular, the interaction between the copper clusters and the support has been found to result in complex phenomena, such as the formation of copper–zinc alloys under reducing conditions [11, 12], the migration of copper atoms [13] and small copper clusters [14] into the zinc oxide surface, the thermal oxidation of copper [15], and the sensitive dependence of the shape of the adsorbed copper clusters on the gaseous environment [16]. All these effects of the strong metal–support interaction (SMSI) have a significant influence on the activity of the catalyst. The inspiration for this thesis came from the surprising dependence of the shapes of copper clusters at zinc oxide surfaces on the gas phase of the environment, as discovered by Hansen and co-workers [16]. The motivation of the studies presented here has been to obtain an understanding of the dynamical and structural properties of this system. As a first step, this thesis is focused on copper clusters at zinc oxide surfaces in vacuum. The gaseous environment was not included. A recent study of copper clusters on non-polar zinc oxide surfaces by Köhler et al. [17] using scanning tunneling microscopy (STM) has shown evidence for the penetration of the copper clusters into the support surface. In their experiment the zinc oxide surfaces were heated to different temperatures (290 K and 650 K) before copper was deposited by means of molecular beam epitaxy. Subsequently, footprints of the copper clusters were revealed when individual clusters were removed from the support surface using the STM tip. The ambitious goal of this thesis was the realistic theoretical modeling of the Cu/ZnO interactions that lead to the observed effects.
Since the catalytic reaction itself is governed by processes at the atomic and electronic length scales, several theoretical studies based on electronic structure calculations have provided additional insight into the reaction mechanism of the pure zinc oxide catalyst [18, 19], the thermodynamic properties of copper clusters and surfaces [20–22], and zinc oxide surfaces [23–25]. Because of the inherent complexity of the system, only a few theoretical studies have addressed model calculations of the entire copper/zinc oxide catalyst [3, 26, 27]. Electronic structure calculations are, however, limited to small, idealized model structures containing at most a few hundred atoms. The simulation of a realistic active catalyst including the effects of the SMSI would make it necessary to model not only nanometer-scale copper clusters adsorbed on non-ideal zinc oxide surfaces, but also the surrounding gaseous atmosphere. In general, the active site of solid catalysts is often related to imperfections, such as step edges, defects, or adatoms, which break the periodicity of the ideal surface and make it necessary to simulate large supercells [3]. A computational model of such a realistic solid catalyst has to contain several thousand or even tens of thousands of atoms, which is well beyond the feasibility of density-functional theory. It is therefore necessary to turn to a more approximate theoretical method that should, however, still be sufficiently accurate to describe the complex geometries of a non-ideal catalyst surface.

1.2 Accurate and efficient atomistic potentials for materials

For the reliable simulation of non-ideal catalyst surfaces a method is needed that is sufficiently accurate to allow for quantitative predictions, not restricted to a particular class of materials, and efficient enough to allow the simulation of long time scales. Atomic interaction potentials allow the simulation of large structures containing tens of thousands of atoms.
Usually, these potentials have been developed for a certain application, such as the simulation of solid insulators and molecules [28–31], metals [32, 33, 40], and large biological entities (e.g., proteins, DNA fragments) [28, 34–36]. However, despite the vast number of different atomic interaction potentials, none of the conventional potentials is suitable for the accurate simulation of non-ideal oxide-supported metal clusters. To give just a few specific examples: molecular force fields (FF) [37–39] are specialized for the description of the chemical bonds in large organic or biological molecules, and are in general not suited to describe solids or surfaces. This class of potentials also relies on the definition of atomic connectivities and therefore does not allow the formation or the cleavage of bonds during the simulation. The functional form of embedded atom models (EAM) [40], on the other hand, has been derived from the electronic energy expression in metals and cannot in general be applied to covalent insulators or molecular materials, even though extended EAM potentials have been suggested [41, 42]. The most general class of atomistic potentials are bond order potentials (BOP), such as the Tersoff potential for solids [31], or potentials based on the reactive force field approach by van Duin and co-workers [43]. BOPs are applicable to a wide range of structures and have previously been used to simulate zinc oxide [44]. However, the functional form of BOPs, which is based on physical approximations, is not flexible enough to reach the accuracy that is necessary for reliable predictions of structural and dynamical properties. Recently, two new potential types have been suggested that could be a remedy to the restrictions discussed above: Gaussian Approximation Potentials (GAP) [45, 46] and potentials based on artificial Neural Networks (NN) [47–49].
Both approaches are purely mathematically motivated and allow the interpolation of the potential energy surface using a flexible, non-linear functional form based on a set of reference structures. In the case of GAP, all reference structures enter the energy expression, and the efficiency of the potential therefore depends on the number of reference structures. NN potentials, on the other hand, are fitted once to reproduce the reference energies and forces and can subsequently be used in large-scale simulations without any information about the initial set of reference structures. In recent years, NN potentials have emerged that promise to overcome the shortcomings of the conventional potentials [50–52]. Behler and Parrinello have shown that symmetry-adapted artificial neural networks can be trained to accurately represent the high-dimensional potential energy surface (PES) of atomic structures [51, 53, 54].

Within the scope of this thesis the applicability of NN potentials for the simulation of zinc oxide supported copper clusters is explored. Previously, the method had been used for the simulation of silicon crystal phases, i.e., for a covalent insulator of a single atomic species [55, 56]. As a first step it is thus mandatory to assess the capability of the NN potential method for the description of metals, in particular for copper. To simulate the Cu/ZnO system it was furthermore necessary to extend the methodology to multicomponent systems of more than a single chemical element. Eventually, for the simulation of the actual methanol synthesis, the NN potentials need to provide an accurate description of molecules.

1.3 Structure of the thesis

In the first part of the thesis a brief review of the employed theoretical and computational methods is provided. The NN potential methodology and its extension to multicomponent structures are discussed in detail.
The second part of the thesis focuses on the steps outlined in the previous section: in Chapter 7 the first application of the NN potential method to a metal is presented for the example of copper. The first application of a multicomponent NN potential, the construction of a zinc oxide potential, is discussed in Chapter 8. The results of these two chapters are combined in a ternary Cu/ZnO potential in Chapter 9. Finally, the applicability of the NN method to molecular structures is evaluated for a single methanol molecule in Chapter 10. The final part of the thesis provides a summary of the results and an outlook on prospective simulations using the constructed potentials.

Part II Theoretical Background

2 Electronic Structure Calculations

The investigation of the structural, energetic, and dynamical properties of complex condensed matter and materials requires a reliable description of the atomic interactions. In this project density-functional theory (DFT) has been used, which provides an accurate description of many complex systems, in particular of solids and surfaces like copper/zinc oxide. The FHI-aims code (Fritz Haber Institute ab initio molecular simulations) [57], an all-electron code that employs numerical atomic orbitals as basis functions, has been used for all production calculations to set up training sets for neural network potentials. Additionally, some tests have been carried out with the PWSCF code [58], a pseudopotential program that employs plane waves as basis functions. This code was used to verify the accuracy of the FHI-aims electronic structure calculations.

2.1 The Born–Oppenheimer approximation

From quantum mechanics we know that the static eigenstate of an atomic system (time-dependent properties are not discussed in this work) can be described by a wave function $\Psi$, which depends on all electronic coordinates $\{\vec{r}\}$ and all ionic coordinates $\{\vec{R}\}$. The Schrödinger equation [59]

\begin{equation}
\hat{H}[\{\vec{r}\}, \{\vec{R}\}]\, \Psi[\{\vec{r}\}, \{\vec{R}\}] = E\, \Psi[\{\vec{r}\}, \{\vec{R}\}]
\tag{2.1}
\end{equation}

relates the eigenstate $\Psi$ to an energy $E$. The Hamilton operator $\hat{H}$, which also depends on all electronic and ionic coordinates, is given by the sum of the kinetic and potential energy operators $\hat{T}$ and $V$

\begin{equation}
\hat{H} = \hat{T} + V = \hat{T}_e + \hat{T}_n + V_{ee} + V_{en} + V_{nn} ,
\tag{2.2}
\end{equation}

where the indices “e” and “n” refer to the electrons and the nuclei. For an atomic system of $N_e$ electrons and $N_n$ nuclei the solution of Schrödinger’s equation (2.1) thus involves $3N_e + 3N_n$ independent variables. In 1927 Born and Oppenheimer suggested separating the degrees of freedom of the electrons and nuclei [60]

\begin{equation}
\Psi[\{\vec{r}\}, \{\vec{R}\}] \approx \Psi_e[\{\vec{r}\}] \cdot \Psi_n[\{\vec{R}\}] .
\tag{2.3}
\end{equation}

The justification of this approximation lies in the very different time scales of the electronic and ionic dynamics. The “light and fast” electrons are expected to adjust adiabatically to the “slow” changes of the ionic positions. In that case the purely ionic contributions to the Hamiltonian (2.2), namely the kinetic energy of the nuclei and the ion–ion repulsion, can also be treated separately, and the electronic structure problem can be reformulated as an electronic Schrödinger equation

\begin{equation}
\hat{H}_e \Psi_e = E_e \Psi_e \quad \text{with} \quad \hat{H}_e = \hat{T}_e + V_{ee} + V_{en} .
\tag{2.4}
\end{equation}

Note that, by convention, the ionic repulsion $V_{nn}$ is kept in the electronic Hamiltonian, but it is usually treated classically. In general the Born–Oppenheimer approximation (2.3) is a very good approximation. There are, however, situations in which the separation of the electronic and ionic degrees of freedom is not justified. An example of such non-adiabatic problems is the avoided crossing of two quantum states close in energy. In this work only atomic structures within the Born–Oppenheimer approximation were considered. We will therefore usually drop the index “e” in the following sections, and we implicitly refer to the electronic Hamiltonian and wave function by $\hat{H}$ and $\Psi$.

2.2 The electronic structure problem

In section 2.1 the electronic Schrödinger equation was introduced. Our objective is to determine the electronic ground-state energy for a system with $N$ electrons and $N_n$ nuclei that is described by a given Hamiltonian of the form of equation (2.2). The different contributions in the position representation of the Hamilton operator (all expressions are given in Hartree atomic units) are the kinetic energy of the electrons

\begin{equation}
\hat{T} = -\frac{1}{2} \sum_{i}^{N} \vec{\nabla}_i^2 ,
\tag{2.5}
\end{equation}

the electron–electron repulsion

\begin{equation}
V_{ee} = \sum_{j<i} \frac{1}{|\vec{r}_j - \vec{r}_i|} ,
\tag{2.6}
\end{equation}

the electron–ion attraction

\begin{equation}
V_{en} = -\sum_{i}^{N} \sum_{\alpha}^{N_n} \frac{z_\alpha}{|\vec{r}_\alpha - \vec{r}_i|} ,
\tag{2.7}
\end{equation}

where $z_\alpha$ is the ionic charge at nucleus $\alpha$, and the classical ion–ion repulsion

\begin{equation}
V_{nn} = \sum_{\beta<\alpha}^{N_n} \frac{z_\alpha z_\beta}{|\vec{r}_\beta - \vec{r}_\alpha|} .
\tag{2.8}
\end{equation}

Note that, especially in the literature related to density-functional theory (see section 2.3), it is common to combine the electron–ion interactions with further external field contributions (for example from electric fields) in a more general energy due to an external potential $V_{\mathrm{ext}} = V_{en} + \ldots$, which governs the electronic degrees of freedom. The external potential can then be written as a sum of the interactions of the individual electrons with an external potential that depends on the coordinates of all nuclei $\{\vec{r}_\alpha\}$

\begin{equation}
V_{\mathrm{ext}} = \sum_{i} v_{\mathrm{ext}}(\{\vec{r}_\alpha\}, \vec{r}_i) .
\tag{2.9}
\end{equation}

The energy of any normalized quantum state $\Psi$ is then given by the expectation value of the Hamiltonian

\begin{equation}
E[\Psi] = \langle \hat{H} \rangle = \langle \Psi | \hat{H} | \Psi \rangle = \int \Psi^* \hat{H} \Psi \, d\tau ,
\tag{2.10}
\end{equation}

where the integration is over all electronic coordinates. The ground-state wave function minimizes the energy functional of Eq. (2.10) (variational principle)

\begin{equation}
E_0 = \min_{\Psi} E[\Psi] .
\tag{2.11}
\end{equation}

Note that an ansatz for the wave function $\Psi$ is necessary to actually evaluate the expectation value (2.10) and make use of the variational principle (2.11).

2.2.1 The electronic wave function

The many-electron wave function $\Psi$ is a function of all electronic coordinates.
As in the Born–Oppenheimer approximation, Eq. (2.3), the problem of finding an ansatz for the wave function simplifies if the total function is decomposed into a product. The simplest such ansatz for an $N$-electron wave function $\Psi$ is the Hartree product

$$\Psi(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) \approx \Psi_{\text{Hartree}}(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) = \psi_1(\vec{r}_1) \cdot \psi_2(\vec{r}_2) \cdot \ldots \cdot \psi_N(\vec{r}_N) \tag{2.12}$$

of single-electron functions $\psi$. The Hartree product is, however, not a suitable representation of an electronic state. Electrons are fermions, and any ansatz for an electronic wave function therefore has to obey the antisymmetry (Pauli) principle with respect to the exchange of two particles,

$$\Psi(\ldots, i, \ldots, j, \ldots) = -\Psi(\ldots, j, \ldots, i, \ldots) . \tag{2.13}$$

The antisymmetrization of the Hartree product (2.12) leads to the Slater determinant

$$\Psi(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) \approx \Psi_{\text{SD}}(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) = \frac{1}{\sqrt{N!}} \begin{vmatrix} \psi_1(\vec{r}_1) & \cdots & \psi_1(\vec{r}_N) \\ \vdots & \ddots & \vdots \\ \psi_N(\vec{r}_1) & \cdots & \psi_N(\vec{r}_N) \end{vmatrix} , \tag{2.14}$$

which satisfies the Pauli principle (2.13). The prefactor $1/\sqrt{N!}$ provides the normalization, given that the one-electron functions themselves are normalized and orthogonal, i.e., $\langle \psi_i | \psi_j \rangle = \delta_{ij}$, so that the probabilistic interpretation of the squared norm of the wave function is possible.

The use of a single Slater determinant as ansatz for the all-electron wave function to exploit the variational principle is the foundation of the Hartree–Fock method. The approximation of the many-electron wave function by a Slater determinant is, however, not always sufficiently accurate. A hierarchy of theoretical chemistry methods exists that improves on this approximation, either by means of perturbation theory (MP2, MP4), by including further electronic configurations that correspond to excited states (CI, MCSCF, CASSCF, CC), or by combining the two approaches (CASPT2).
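The difference between the Hartree product (2.12) and the Slater determinant (2.14) can be made concrete for two electrons. The following Python sketch is only an illustration: the two one-dimensional orbitals `psi_a` and `psi_b` are hypothetical, unnormalized model functions chosen for simplicity, and none of the names are taken from the thesis.

```python
import math


def psi_a(x):
    # Hypothetical 1D model orbital (even Gaussian); not normalized.
    return math.exp(-x * x)


def psi_b(x):
    # Hypothetical 1D model orbital (odd function); not normalized.
    return x * math.exp(-x * x)


def hartree_product(x1, x2):
    # Eq. (2.12) for N = 2: simple product of one-electron functions.
    return psi_a(x1) * psi_b(x2)


def slater_determinant(x1, x2):
    # Eq. (2.14) for N = 2: 2x2 determinant with prefactor 1/sqrt(2!).
    det = psi_a(x1) * psi_b(x2) - psi_a(x2) * psi_b(x1)
    return det / math.sqrt(2.0)


if __name__ == "__main__":
    x1, x2 = 0.3, -0.7
    print("Hartree product:", hartree_product(x1, x2), hartree_product(x2, x1))
    print("Slater determinant:", slater_determinant(x1, x2), slater_determinant(x2, x1))
    print("Both electrons at the same position:", slater_determinant(x1, x1))
```

Exchanging the two coordinates flips the sign of the determinant, as required by Eq. (2.13), and the determinant vanishes when both electrons occupy the same position. The Hartree product shows neither property.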
2.3 Density-functional theory

In the previous section the Hartree–Fock method was mentioned, which is based on a Slater-determinant ansatz for the all-electron wave function. The results of Hartree–Fock calculations are often not satisfactory, and one way to cure the shortcomings is to choose a more complex ansatz for the wave function. Density-functional theory (DFT) [76], however, takes a conceptually different route, in which the description of the correlated all-electron wave function is completely avoided [61].

The fundamental quantity of DFT is the electron density $n(\vec{r})$, which is related to the $N$-electron wave function $\Psi$ by

$$n(\vec{r}) = N \int \ldots \int \Psi^{*}(\vec{r}, \vec{r}_2, \ldots, \vec{r}_N)\, \Psi(\vec{r}, \vec{r}_2, \ldots, \vec{r}_N) \, d\vec{r}_2 \ldots d\vec{r}_N , \tag{2.15}$$

so that $n(\vec{r})$ is the spatial distribution function of all electrons that are described by $\Psi$, and the integration over the whole space consequently yields the number of electrons

$$\int n(\vec{r}) \, d\vec{r} = N . \tag{2.16}$$

The goal of DFT is to express the total energy functional (2.11) directly in terms of the electron density (2.15),

$$E = E[n(\vec{r})] , \tag{2.17}$$

instead of the wave function $\Psi$. This not only has the advantage of avoiding any ansatz for the all-electron wave function, it also means that the number of degrees of freedom of the electronic structure problem for $N$ electrons is reduced from $3N$ to just 3. That such a (rather counter-intuitive) density functional exists has been shown by Hohenberg and Kohn, who could additionally prove that the variational principle (2.11) also applies to DFT. The Hohenberg–Kohn theorems and their proofs are discussed in the following section 2.3.1.

2.3.1 The Hohenberg–Kohn theorems

In section 2.2 it was explained that the electronic (Born–Oppenheimer) Hamilton operator is entirely determined by the knowledge of an external potential $V_{ext}$ (which in the simplest case is just determined by the ionic positions) and the number of electrons $N$.
The Hamilton operator in turn determines the all-electron states via the Schrödinger equation (2.1) and in particular the ground-state wave function $\Psi_0$, from which the ground-state electron density $n_0$ can be derived:

$$\{V_{ext}(\vec{r}), N\} \rightarrow \hat{H}(\vec{r}_1, \ldots, \vec{r}_{N_e}) \rightarrow \Psi_0(\vec{r}_1, \ldots, \vec{r}_{N_e}) \rightarrow n_0(\vec{r}) . \tag{2.18}$$

In their first theorem Hohenberg and Kohn [62] showed that the above relation can be reversed, i.e., a given ground-state electron density $n_0(\vec{r})$ can only be the result of exactly one particular external potential $V_{ext}(\vec{r})$ and one particular number of electrons $N$, which in turn determine the ground-state wave function:

$$n_0(\vec{r}) \rightarrow \{V_{ext}(\vec{r}), N\} \rightarrow \hat{H}_e(\vec{r}_1, \ldots, \vec{r}_{N_e}) \rightarrow \Psi_0(\vec{r}_1, \ldots, \vec{r}_{N_e}) . \tag{2.19}$$

If the ground-state wave function is uniquely determined by the ground-state electron density, the wave function itself is a functional of the density, and there must also be a density functional for the expectation value of any operator $\hat{O}$,

$$O[n_0] = \langle \Psi_0[n_0(\vec{r})] | \hat{O} | \Psi_0[n_0(\vec{r})] \rangle . \tag{2.20}$$

The second Hohenberg–Kohn theorem derives the applicability of the variational principle to the ground-state electron density. The one-to-one mapping of the electron density and the wave function, which is the result of the first theorem, immediately transfers the minimum-energy principle to the density:

$$\langle \Psi_0 | \hat{H} | \Psi_0 \rangle \leq \langle \tilde{\Psi} | \hat{H} | \tilde{\Psi} \rangle \;\Leftrightarrow\; \langle \Psi[n_0] | \hat{H} | \Psi[n_0] \rangle \leq \langle \Psi[\tilde{n}] | \hat{H} | \Psi[\tilde{n}] \rangle \;\Leftrightarrow\; E[n_0] \leq E[\tilde{n}] . \tag{2.21}$$

For the variational minimization of the energy, it must additionally be guaranteed that the ground state is a stationary point of the energy functional with respect to variations of the density. With the constraint that the integration of the density over the whole space must yield the number of electrons, the problem can be formulated with a Lagrangian function,

$$\frac{\delta}{\delta n} \left\{ E[n] + \mu \left( \int n(\vec{r}) \, d\vec{r} - N_e \right) \right\} = 0 , \tag{2.22}$$

where $\mu$ is a Lagrange multiplier.

2.3.2 Kohn–Sham density-functional theory

The Hohenberg–Kohn theorems demonstrate that an energy functional $E[n]$ of the electron density exists and that its minimum is the ground-state energy. However, it is not obvious how such a functional can be obtained. Of the individual contributions to the total electronic energy (see Sec. 2.2)

$$E[n] = T[n] + V_{ee}[n] + V_{ext} \tag{2.23}$$

neither the kinetic energy functional $T[n]$ nor the functional of the electronic interaction potential $V_{ee}[n]$ is known. Kohn and Sham [63] identified the known contributions to the unknown quantities in the energy functional and substituted the classical electrostatic Hartree energy

$$V_H[n] = \frac{1}{2} \int n(\vec{r}) \, v_H(\vec{r}) \, d\vec{r} \quad\text{with}\quad v_H(\vec{r}) = \int \frac{n(\vec{r}\,')}{|\vec{r} - \vec{r}\,'|} \, d\vec{r}\,' \tag{2.24}$$

for the electron–electron interaction $V_{ee}[n]$. The kinetic energy of the correlated electrons $T[n]$ is replaced by the kinetic energy $T_s[n]$ of a fictitious auxiliary system of non-interacting electrons. The presumably small missing energy contributions due to these approximations are captured by an additional term, the exchange–correlation energy $E_{xc}[n]$. The Kohn–Sham total energy functional thus reads

$$E^{KS}[n] = T_s[n] + V_H[n] + E_{xc}[n] + V_{ext} , \tag{2.25}$$

which is formally equivalent to the energy functional in Eq. (2.23) if the functional $E_{xc}$ is known.

To obtain the kinetic energy of the auxiliary Kohn–Sham system, it is still necessary to solve a Schrödinger equation. However, in the case of non-interacting electrons the many-electron wave function $\Psi$ is exactly given by a Slater determinant, Eq. (2.14), of one-electron wave functions $\{\psi_i\}$ (Kohn–Sham orbitals). Thus, as in the Hartree–Fock method, a set of one-electron eigenvalue problems has to be solved,

$$\hat{h} \psi_i = \varepsilon_i \psi_i \quad\text{with}\quad \hat{h} = -\frac{1}{2} \vec{\nabla}^2 + v_{eff}(\vec{r}) , \tag{2.26}$$

where $\hat{h}$ is the one-electron Hamiltonian, and $v_{eff}(\vec{r})$ is an effective potential, which is discussed below.
The kinetic energy of the auxiliary system is then given as the sum of the one-electron kinetic energies

$$T_s = \sum_{i} -\frac{1}{2} \langle \psi_i | \vec{\nabla}^2 | \psi_i \rangle , \tag{2.27}$$

and the density of the non-interacting electrons is

$$n_s(\vec{r}) = \sum_{i} |\psi_i(\vec{r})|^2 . \tag{2.28}$$

As an additional condition, Kohn and Sham require the ground-state electron density $n_0$ of the auxiliary system to be equal to that of the real system. Thus, at the ground state $n = n_s = n_0$, the variations $n \rightarrow n + \delta n$, Eq. (2.22), of the energy of the fictitious and of the real system must both become zero and can be set equal. This leads to an expression for the effective potential

$$v_{eff}[n](\vec{r}) = v_{ext}(\vec{r}) + v_H[n] + v_{xc}[n] , \tag{2.29}$$

where the exchange–correlation potential $v_{xc}$ is defined as

$$v_{xc} = \left. \frac{\delta E_{xc}[n]}{\delta n} \right|_{n=n_0} . \tag{2.30}$$

The Kohn–Sham equations, Eq. (2.26), have to be solved self-consistently, since the effective potential depends on the density, which in turn depends on the Kohn–Sham orbitals that belong to one specific effective potential. A good initial guess for the ground-state density is the superposition of atomic densities.

Note that the exact exchange–correlation functional $E_{xc}$ is not known. However, good approximations of $E_{xc}$ derived from the local density of the homogeneous electron gas (local-density approximation, LDA) and density-gradient-corrected approximations (generalized gradient approximation, GGA) are available. More advanced approximations include a dependence on the second derivative of the density or even depend on the Kohn–Sham orbitals.

3 The FHI-aims Code

FHI-aims (Fritz Haber Institute ab initio molecular simulations) [57] is an efficient computer program package for the calculation of physical and chemical properties of condensed matter and materials (such as molecules, clusters, solids, surfaces, and liquids) based on first-principles descriptions of the electronic structure (e.g., using DFT).
The program implements all-electron methods that use numerical atom-centered orbitals as the basis functions for the Kohn–Sham orbitals. This enables accurate all-electron and full-potential calculations at a computational cost that is competitive with, for example, plane-wave methods. Furthermore, calculations can be carried out with and without periodic boundary conditions.

3.1 Basis set expansion

The general solution of the self-consistent Kohn–Sham equations (2.26) for arbitrary molecular Kohn–Sham orbitals is infeasible for structures containing many atoms. In practice, the orbital space is restricted by expanding the wave functions in a set of suitable basis functions $\{\phi_\mu\}$,

$$|\psi_i\rangle = \sum_{\mu} c_{i\mu} |\phi_\mu\rangle . \tag{3.1}$$

This technique was independently suggested by Roothaan and Hall [64, 65] for the iterative solution of the Hartree–Fock equations. The expansion of Eq. (3.1) transforms the abstract operator eigenvalue problem (2.26) into a generalized matrix–vector eigenvalue problem

$$\hat{h} |\psi_i\rangle = \varepsilon_i |\psi_i\rangle \;\longrightarrow\; \sum_{\nu} H_{\mu\nu} c_{i\nu} = \varepsilon_i \sum_{\nu} S_{\mu\nu} c_{i\nu} \;\leftrightarrow\; \mathbf{H} \vec{c}_i = \varepsilon_i \mathbf{S} \vec{c}_i , \tag{3.2}$$

where $\mathbf{H}$ is the matrix representation of the one-electron Hamiltonian in the chosen basis set with matrix elements $H_{\mu\nu} = \langle \phi_\mu | \hat{h} | \phi_\nu \rangle$. The eigenvectors $\{\vec{c}_i\}$ correspond to the Kohn–Sham orbitals $\{\psi_i\}$. The elements of the overlap matrix $S_{\mu\nu} = \langle \phi_\mu | \phi_\nu \rangle$ depend only on the choice of the basis functions and are structurally independent. If the basis functions are chosen to be pairwise orthogonal, the overlap matrix becomes the identity matrix, and Eq. (3.2) simplifies to a regular matrix–vector eigenvalue problem. Note that Hermitian eigenvalue problems of the form (3.2) can be solved numerically with high efficiency [66].

Most electronic structure methods at their core rely on a basis set expansion of some kind.
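As a small worked illustration of the generalized eigenvalue problem of Eq. (3.2), the following Python sketch solves the two-basis-function case in closed form via the secular equation $\det(\mathbf{H} - \varepsilon \mathbf{S}) = 0$. The matrix entries are hypothetical numbers chosen for the example, and the function name is illustrative. Production codes such as FHI-aims rely on dedicated eigensolvers instead.

```python
import math


def solve_generalized_2x2(H, S):
    """Solve H c = eps S c for real symmetric 2x2 H and positive definite S.

    Returns [(eps, (c1, c2)), ...] sorted by eps, with c normalized so that
    c^T S c = 1. Assumes that the first row of (H - eps S) is not all zero.
    """
    h11, h12, h22 = H[0][0], H[0][1], H[1][1]
    s11, s12, s22 = S[0][0], S[0][1], S[1][1]
    # det(H - eps S) = a eps^2 + b eps + c = 0
    a = s11 * s22 - s12 * s12
    b = -(h11 * s22 + h22 * s11 - 2.0 * h12 * s12)
    c = h11 * h22 - h12 * h12
    root = math.sqrt(b * b - 4.0 * a * c)
    solutions = []
    for eps in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        # The first row of (H - eps S) c = 0 fixes the direction of c.
        c1 = -(h12 - eps * s12)
        c2 = h11 - eps * s11
        norm = math.sqrt(c1 * c1 * s11 + 2.0 * c1 * c2 * s12 + c2 * c2 * s22)
        solutions.append((eps, (c1 / norm, c2 / norm)))
    return solutions


if __name__ == "__main__":
    # Hypothetical matrix elements for two equivalent, overlapping basis functions.
    H = [[-0.5, -0.4], [-0.4, -0.5]]
    S = [[1.0, 0.3], [0.3, 1.0]]
    for eps, coeffs in solve_generalized_2x2(H, S):
        print(eps, coeffs)
```

For this symmetric model the two eigenvalues reduce to $(H_{11} + H_{12})/(1 + S_{12})$ and $(H_{11} - H_{12})/(1 - S_{12})$, which shows directly how a non-zero overlap modifies the result of an ordinary eigenvalue problem.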
Differences stem from the choice of the functional form of the basis $\{\phi_\mu\}$, for which many different approaches are employed: common examples are Slater-type orbitals (STOs), Gaussian-type orbitals (GTOs), plane waves, wavelets, and grid-distributed or atom-centered numerical basis functions.

All electronic structure calculations presented in this work have been performed using the FHI-aims program [57], which employs a basis set of numerical atom-centered orbitals.

3.1.1 Atomic orbitals

The analytic solutions $\psi^{H}_{nlm}(\vec{r})$ of the Schrödinger equation (2.1) for the hydrogen atom H are, in spherical coordinates, given by the product of a radial function $R^{H}_{nl}(r)$ and a spherical harmonic function $Y_{lm}(\vartheta, \varphi)$,

$$\psi^{H}_{nlm}(\vec{r}) = \psi^{H}_{nlm}(r, \vartheta, \varphi) = R^{H}_{nl}(r) \cdot Y_{lm}(\vartheta, \varphi) , \tag{3.3}$$

and are often called atomic orbitals (in contrast to Hartree–Fock or Kohn–Sham orbitals, which are called molecular orbitals). The concept of atomic orbitals can be extended to arbitrary atom types, where the all-electron wave function is approximated by a combination of one-electron atomic orbitals, the simplest of which is a Slater determinant. The radial functions $R^{\alpha}_{nl}(r)$ for the atom type of atom $\alpha$ are the solutions of the radial Schrödinger equation (in Hartree atomic units)

$$\left[ -\frac{1}{2} \left( \frac{1}{r^2} \frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} - \frac{l(l+1)}{r^2} \right) + v^{\alpha}_{eff}(r) \right] R^{\alpha}_{nl}(r) = \varepsilon^{\alpha}_{n} R^{\alpha}_{nl}(r) , \tag{3.4}$$

which can equivalently be expressed as a one-dimensional Schrödinger equation with transformed eigenfunctions,

$$\left( -\frac{1}{2} \frac{\partial^2}{\partial r^2} + \frac{l(l+1)}{2 r^2} + v^{\alpha}_{eff}(r) \right) r R^{\alpha}_{nl}(r) = \varepsilon^{\alpha}_{n} \, r R^{\alpha}_{nl}(r) . \tag{3.5}$$

Note that the spherical effective atomic potential $v^{\alpha}_{eff}(r)$ depends on the atom type, i.e., on the core charge and the number of electrons, but the angular part $Y_{lm}(\vartheta, \varphi)$ of the atomic orbitals is independent of the atomic species.

3.1.2 Linear combination of atomic orbitals (LCAO)

Atoms are the building blocks of larger structures, such as molecules and crystals.
An intuitive basis set $\{\phi_\mu\}$ for the wave functions of many-atom structures is therefore the set of atomic orbitals $\{\psi^{\alpha}_{nlm}\}$ of the atoms in the system. In such a case the basis set expansion of the molecular orbitals, Eq. (3.1), corresponds to a superposition (linear combination) of atomic orbitals [67, 68],

$$\psi(\vec{r}) = \sum_{\mu} c_\mu \, \psi^{\alpha}_{nlm}(\vec{r} - \vec{r}_\alpha) \quad\text{with}\quad \mu = (\alpha, n, l, m) , \tag{3.6}$$

where $\alpha$ enumerates the atoms. This ansatz is the starting point of optimized atomic-local basis sets, which mainly differ in the representation of the structurally dependent radial functions $R^{\alpha}_{nl}(r)$ of the atomic orbitals. Numerical basis functions directly employ the numerical solutions of the radial Schrödinger equation, Eq. (3.5), as radial functions. Motivated by the analytic solution for the hydrogen atom, linear combinations of Slater functions, i.e., exponential functions multiplied by polynomials, have been used. A common choice in quantum chemistry is linear combinations of Gaussian functions, which make it possible to solve the one- and two-electron integrals of the HF method analytically.

All DFT calculations performed within the scope of this thesis employed numerical atomic-local basis sets as implemented in the FHI-aims program [69].

3.1.3 Numerical basis functions

In general, the direct solutions of the radial Schrödinger equation of free atoms are not good basis functions for the representation of molecular or crystal wave functions. The atomic wave functions have, in principle, an infinite range, whereas the orbitals in compounds experience an effective screening due to the other atoms in the structure. Numerical basis functions are therefore often constructed from a confined radial Schrödinger equation, where a confining potential $v_{cut}(r)$ is added to the effective potential $v^{\alpha}_{eff}(r)$ of Eq. (3.5) [69, 70]. The confining potential is constructed in such a way that the solutions decay smoothly to zero at a given cutoff radius $r_{cut}$. Blum and coworkers [69] employ a confining potential with a second-order pole at the cutoff radius,

$$v_{cut}(r) = \begin{cases} 0 & \text{if } r \leq r_{onset} \\ s \cdot \exp\!\left( \dfrac{w}{r - r_{onset}} \right) \cdot \dfrac{1}{(r - r_{cut})^2} & \text{if } r_{onset} < r < r_{cut} \\ \infty & \text{if } r \geq r_{cut} \end{cases} , \tag{3.7}$$

which ensures that the first derivative of the radial function also vanishes at the cutoff radius ($s$ and $w$ are adjustable parameters). Using the potential form of Eq. (3.7), the solutions of the confined Schrödinger equation are equal to the solutions of the unconfined one for radii $r$ smaller than the onset radius $r_{onset}$.

The range restriction of the atomic orbitals has the additional advantage that integrals involving orbitals, for example the elements of the Hamilton matrix, also become localized in space and can be evaluated more efficiently.

4 Molecular dynamics simulations

Molecular dynamics (MD) simulations are tools to compute structural and dynamical properties of classical many-body systems, and they allow experimental results to be analyzed on an atomic level. Classical, here, means that the movement of the nuclei follows the laws of classical mechanics [71].

Computational simulations are based on statistical mechanics. In this framework, the physical behavior of macroscopic systems is related to ensemble averages over micro-states $M$, which are characterized by the positions and momenta of all particles in the system [72]. In an MD simulation the equations of motion of a many-body system are solved numerically in consecutive time intervals $\tau$, which generates a time sequence of micro-states. This time sequence represents a trajectory in phase space. According to the ergodic hypothesis the ensemble average $\langle A \rangle$ can be replaced by a time average $\bar{A}$ if one allows the system to evolve infinitely in time [73]:

$$\langle A \rangle \equiv \lim_{M \to \infty} \frac{1}{M} \sum_{i}^{M} A_i = \lim_{\tau \to \infty} \frac{1}{\tau} \int_{t_0}^{t_0 + \tau} A(t) \, dt \equiv \bar{A} . \tag{4.1}$$

Therefore, experimental observables can be approximated by time averages obtained from MD simulations if the simulation time is sufficient.
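The time average of Eq. (4.1) can be illustrated with a minimal Python sketch for a single one-dimensional harmonic oscillator. This is a toy model under stated assumptions, not the setup used in this work: the oscillator, its parameters, the time step, and all function names are chosen purely for illustration, and the trajectory is generated with the velocity Verlet scheme, a widely used integrator for Newton's equations of motion.

```python
def force(x, k=1.0):
    # Harmonic model potential V(x) = 0.5 * k * x**2, so F = -dV/dx = -k * x.
    return -k * x


def time_averages(x, v, dt, n_steps, m=1.0, k=1.0):
    """Propagate with velocity Verlet and return the time-averaged
    kinetic and potential energies over the trajectory."""
    f = force(x, k)
    kinetic_sum = 0.0
    potential_sum = 0.0
    for _ in range(n_steps):
        v += 0.5 * dt * f / m      # half step for the velocity
        x += dt * v                # full step for the position
        f = force(x, k)            # force at the new position
        v += 0.5 * dt * f / m      # second half step for the velocity
        kinetic_sum += 0.5 * m * v * v
        potential_sum += 0.5 * k * x * x
    return kinetic_sum / n_steps, potential_sum / n_steps


if __name__ == "__main__":
    # Start at x = 1 with zero velocity: total energy 0.5 (arbitrary units).
    avg_kin, avg_pot = time_averages(x=1.0, v=0.0, dt=0.01, n_steps=100_000)
    print("time-averaged kinetic energy:  ", avg_kin)
    print("time-averaged potential energy:", avg_pot)
```

For a harmonic oscillator both averages approach half of the total energy as the trajectory becomes long, which gives a simple check that the discrete sum over time steps approximates the integral in Eq. (4.1).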
In order to perform an MD simulation, one has to carry out two steps: first, the forces acting on each particle have to be calculated. In the second step Newton's equations of motion are integrated numerically based on the forces calculated in the first step:

$$F_i = m_i a_i \;\Longleftrightarrow\; m_i \frac{d^2 r_i}{dt^2} = -\frac{\partial V(\{r_i\})}{\partial r_i} . \tag{4.2}$$

A widely used numerical algorithm to integrate the equations of motion is the velocity Verlet algorithm [74].

In ab initio molecular dynamics (AIMD) [75] the atomic forces are computed by approximately solving the time-independent Schrödinger equation on the fly. These methods, unfortunately, are computationally too expensive to allow for configurational sampling or long-time-scale MD simulations of the kind of structures this work is dealing with (typically several thousand atoms). More efficient but sufficiently accurate potentials are required.

Within the scope of this work, methods for the construction of efficient and accurate high-dimensional potential energy surfaces (PES) have been employed, which are based on artificial neural networks (NN) [47, 48]. These potentials allow systems to be studied on experimental length and time scales beyond the capabilities of conventional DFT implementations.

5 Neural Network Potentials

5.1 Artificial neural networks

Artificial neural networks (NN) represent a general fitting scheme that in principle allows any function to be approximated to arbitrary accuracy [49]. The NN algorithm is inspired by the biological neural network of the brain. In contrast to other regression methods, the functional form of the underlying problem does not need to be known when using NNs. They constitute a flexible class of fitting functions that are able to learn unknown target functions to high accuracy using a training set of known function values. The training set is presented to the NN in order to find the best values for the rather large number of parameters using an optimization algorithm.
Out of the many types of neural networks, the class of multilayer feed-forward neural networks has proven to be a particularly useful tool for the representation of potential-energy surfaces (PES) [78].

5.1.1 Feed-forward neural networks

The feed-forward neural network shown in Fig. 5.1 consists of one input layer, one hidden layer and one output layer. In each layer there are several nodes, represented in the figure by squares (input layer and output layer) and circles (hidden layer). For use as an atomistic potential the input nodes define the atomic configuration (e.g., in the form of bond lengths and bond angles) and may be given by a set of atomic coordinates: $G_i = (X_i, Y_i, Z_i)$. The output layer consists of only a single node, whose value is the predicted energy of the atomic configuration. The nodes in the hidden layer do not have any physical meaning, but provide the functional flexibility of the NN. All nodes in each layer are connected to the nodes in the adjacent layers by weight parameters, $w^{k}_{ij}$, represented by the black arrows in Fig. 5.1. They are the fitting parameters of the neural network.

Figure 5.1 A small example of a two-dimensional feed-forward neural network (NN) representing a functional relation between the energy $E$ (output) and the coordinates $G_1$ and $G_2$ describing the atomic configuration (input). The analytic expression for this atomic NN is given in equation 5.1.

The output value of such a neural network is calculated in the following process: first, the coordinates of an atomic configuration are provided in the input nodes of the NN, where each input node refers to one particular degree of freedom. The coordinates are then passed to the nodes in the first hidden layer by multiplying their numerical values by the connection weight values. On each node in the hidden layer these products are summed up and an activation function $f^{i}_{a}$ is applied to the sum,

$$E_{atom} = f^{2}_{a} \left[ w^{2}_{01} + \sum_{j=1}^{3} w^{2}_{j1} \, f^{1}_{a} \left( w^{1}_{0j} + \sum_{\mu=1}^{2} w^{1}_{\mu j} \, G^{\mu}_{i} \right) \right] . \tag{5.1}$$

In general, the activation function is a non-linear function that introduces the capability to fit non-linear functions into the NN. Typical examples are the sigmoid function

$$f(x) = \frac{1}{1 + e^{-x}} , \tag{5.2}$$

which has a form similar to the activation functions of biological neurons, the hyperbolic tangent

$$f(x) = \tanh(x) \equiv \frac{e^{2x} - 1}{e^{2x} + 1} , \tag{5.3}$$

and the Gaussian functions

$$f(x) = e^{-\alpha x^2} . \tag{5.4}$$

Sometimes periodic functions, such as the cosine function, can be useful for fitting periodic potentials. To avoid any constraint on the range of output values, a linear activation function $f(x) = x$ is used in the output layer.

There are several advantages of NN potentials over conventional atomistic potentials: the functional form of the potential energy surface (PES) does not need to be known beforehand, as NNs are highly flexible and can represent any arbitrary function. Moreover, a systematic improvement of the NN potential is possible if new data for the training set becomes available. Also note that an NN fit to data from any electronic structure method (DFT, HF, MP2, etc.) is possible. The main target of the NN potentials is the total energy. Nevertheless, the analytic derivatives of the functional form of the neural network are readily available, which allows for the fast computation of forces and therefore for the speedup of molecular dynamics (MD) simulations.

A number of conceptual problems prevent a direct application of potentials based on conventional feed-forward neural networks to condensed systems. First, there is a fixed number of input nodes that is defined for a certain number of atoms (or degrees of freedom). An NN fit is therefore only valid for one system size. Second, the energy expression is not invariant with respect to rotation, translation, and the permutation of equivalent atoms. Therefore, a new NN scheme is required to deal with high-dimensional systems.
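The forward pass of Eq. (5.1) can be written down in a few lines. The following Python sketch assumes the 2-3-1 architecture of Fig. 5.1 with a hyperbolic tangent in the hidden layer and a linear output node. The weight values and the function name `atomic_nn_energy` are hypothetical and serve only to illustrate the evaluation order.

```python
import math


def atomic_nn_energy(G, w1, b1, w2, b2):
    """Evaluate Eq. (5.1) for one hidden layer.

    G  : input values G^mu
    w1 : w1[mu][j], weight connecting input mu to hidden node j
    b1 : b1[j], bias weight of hidden node j (w^1_{0j})
    w2 : w2[j], weight connecting hidden node j to the output node
    b2 : bias weight of the output node (w^2_{01})
    """
    hidden = []
    for j in range(len(b1)):
        weighted_sum = b1[j] + sum(w1[mu][j] * G[mu] for mu in range(len(G)))
        hidden.append(math.tanh(weighted_sum))   # f_a^1: hyperbolic tangent
    # f_a^2: linear activation in the output layer
    return b2 + sum(w2[j] * hidden[j] for j in range(len(hidden)))


if __name__ == "__main__":
    # Hypothetical weights for a 2-3-1 network.
    w1 = [[0.1, -0.2, 0.3],
          [0.4, 0.5, -0.6]]
    b1 = [0.0, 0.1, -0.1]
    w2 = [1.0, -1.0, 0.5]
    b2 = 0.2
    print(atomic_nn_energy((0.5, -1.0), w1, b1, w2, b2))
    # Swapping the two inputs generally changes the output.
    print(atomic_nn_energy((-1.0, 0.5), w1, b1, w2, b2))
```

The second call illustrates one of the conceptual problems named above: because every input node has its own weights, exchanging two inputs generally changes the predicted energy, so the plain feed-forward network is not invariant under permutations.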
5.2 High-dimensional neural network potentials

To be useful as a general class of atomistic potentials, it is necessary to overcome the limitations of the conventional low-dimensional NNs, so that the method becomes applicable to systems with a significantly larger number of atoms. Independently, Smith and coworkers [79, 80] and Behler and Parrinello [53] suggested decomposing the total structural energy $E$ into a sum of atomic energies

$$E = \sum_{i=1}^{N} E_i . \tag{5.5}$$

Figure 5.2 A high-dimensional neural network potential consisting of $N$ atomic neural networks. The total energy output $E$ of the high-dimensional network is given as the sum of the individual atomic energies $E_i$.

Each atomic energy $E_i$ is in turn represented by a conventional feed-forward neural network that takes the local chemical environment into account. While Smith et al. employ a description of the atomic environment using chains of atoms and coordination functions, Behler and Parrinello have developed universal many-body functions, the symmetry functions $G_i$, that are able to capture a local structural fingerprint.

A schematic depiction of the Behler–Parrinello approach is shown in Fig. 5.2. The atomic configuration is given by a set of Cartesian coordinates $R_i = (X_i, Y_i, Z_i)$ (red squares in Fig. 5.2), which are transformed to rotationally and translationally invariant coordinates in the form of symmetry functions $G_i$. These many-body functions depend only on the relative positions of the atoms in the structure [81]. Additionally, the radial extension of the symmetry functions is confined by a cutoff function $f_c$, so that only the local atomic environment of atom $i$ contributes to $G_i$. The values of the symmetry functions are the input vectors of the atomic neural networks, which yield environment-dependent atomic energy contributions. Note that the atomic NNs of all atoms of the same species are identical, which implicitly imposes a permutational symmetry for equivalent atoms.
It is furthermore straightforward to adapt the high-dimensional NN of Fig. 5.2 to any arbitrary number of atoms, simply by including the corresponding number of atomic NNs.

High-dimensional NNs of the Behler–Parrinello type have been successfully employed in studies of various kinds of materials, such as silicon [55, 56], sodium [82], carbon [83, 84], and copper [85]. For all applications in this thesis the Behler–Parrinello method as implemented in the RuNNer code has been used [78].

5.2.1 Training of the atomic neural networks

The optimization of the weight parameters of the atomic NNs proceeds by an iterative minimization of the error for a given set of reference energies. In this work, the adaptive Kalman filter optimization algorithm has been used [77, 86–88]. Reference energies can be obtained from electronic structure calculations, for example using density-functional theory. Note that it is not necessary to decompose the reference total energy into atomic contributions, as the training of the atomic NNs can be done simultaneously, using the total energies as target values.

Not only the potential energy surface itself, but also its gradient, i.e., the atomic force components, can be used as reference data for the NN optimization [87, 89, 90]. A structure containing $N$ atoms thus provides $3N + 1$ pieces of information, namely the total energy and the $3N$ force components. The force $\vec{F}_k$ acting on atom $k$ is given by the negative gradient of the NN function, i.e., for the Cartesian components $F_{\alpha,k}$ (with $\alpha = x, y, z$)

$$F_{\alpha,k} = -\frac{\partial E}{\partial \alpha_k} = -\sum_{i=1}^{N} \frac{\partial E_i}{\partial \alpha_k} = -\sum_{i=1}^{N} \sum_{j=1}^{M_i} \frac{\partial E_i}{\partial G_{i,j}} \frac{\partial G_{i,j}}{\partial \alpha_k} , \tag{5.6}$$

where $M_i$ is the total number of symmetry functions for atom $i$. The derivative of the symmetry function with respect to the Cartesian direction, $\partial G_{i,j} / \partial \alpha_k$, only depends on the functional form of the symmetry function $G_{i,j}$, whereas the specific NN function enters the derivative of the atomic energy, $\partial E_i / \partial G_{i,j}$.
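The chain rule of Eq. (5.6) can be checked numerically with a small Python sketch. It is a toy model under several assumptions: each atom carries a single Gaussian radial symmetry function with a cosine cutoff (these functions are introduced in Sec. 5.2.2), the "atomic NN" is reduced to one hidden tanh node, and all parameter values, weights and function names are hypothetical. The analytic forces are compared with central finite differences of the total energy.

```python
import math

R_C, ETA, R_S = 6.0, 0.5, 1.0   # hypothetical symmetry-function parameters
W, B, V = 0.8, -0.2, 1.5        # hypothetical weights of a one-node atomic NN


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def f_c(r):
    # Cosine cutoff function and its radial derivative.
    if r > R_C:
        return 0.0, 0.0
    arg = math.pi * r / R_C
    return 0.5 * (math.cos(arg) + 1.0), -0.5 * math.pi / R_C * math.sin(arg)


def pair_term(r):
    # Gaussian radial term g(r) = exp(-eta (r - R_s)^2) f_c(r) and dg/dr.
    fc, dfc = f_c(r)
    gauss = math.exp(-ETA * (r - R_S) ** 2)
    return gauss * fc, gauss * (-2.0 * ETA * (r - R_S) * fc + dfc)


def symmetry_function(i, pos):
    return sum(pair_term(distance(pos[i], pos[j]))[0]
               for j in range(len(pos)) if j != i)


def total_energy(pos):
    # E = sum_i E_i(G_i), with the same atomic NN for every atom.
    return sum(V * math.tanh(W * symmetry_function(i, pos) + B)
               for i in range(len(pos)))


def forces(pos):
    # Eq. (5.6): F_k = - sum_i dE_i/dG_i * dG_i/d(alpha_k)
    F = [[0.0, 0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        t = math.tanh(W * symmetry_function(i, pos) + B)
        dE_dG = V * W * (1.0 - t * t)
        for j in range(len(pos)):
            if j == i:
                continue
            r = distance(pos[i], pos[j])
            dg = pair_term(r)[1]
            for a in range(3):
                dG_dai = dg * (pos[i][a] - pos[j][a]) / r   # dG_i/d(alpha_i)
                F[i][a] -= dE_dG * dG_dai
                F[j][a] += dE_dG * dG_dai   # dG_i/d(alpha_j) = -dG_i/d(alpha_i)
    return F


def numerical_force(pos, k, a, h=1e-5):
    plus = [list(p) for p in pos]
    minus = [list(p) for p in pos]
    plus[k][a] += h
    minus[k][a] -= h
    return -(total_energy(plus) - total_energy(minus)) / (2.0 * h)


if __name__ == "__main__":
    positions = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.5, 1.8, 0.3]]
    analytic = forces(positions)
    for k in range(3):
        for a in range(3):
            print(k, a, analytic[k][a], numerical_force(positions, k, a))
```

Because the descriptor depends only on interatomic distances, the forces on all atoms sum to zero and the energy is unchanged when the whole structure is translated. Both properties follow from the construction and are not specific to the chosen numbers.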
A comparison of the NN training procedure with and without the use of the force information is shown in Figs. 5.3 and 5.4 for a model potential.

Figure 5.3 (panels a–d) Demonstration of the neural network optimization process: A feed-forward NN with one hidden layer and two nodes is trained to a one-dimensional model potential using the reference energies (black diamonds) only. The NN output (red line) as well as the individual contributions to the total NN energy from both nodes (blue and green lines) and the bias weights (dashed blue line) are shown for different epochs.

Figure 5.4 (panels a–d) Demonstration of the neural network optimization process: A feed-forward NN with one hidden layer and two nodes is trained to a one-dimensional model potential using reference energies (black diamonds) and forces (black line). The NN output (red line) as well as the individual contributions to the total NN energy from both nodes (blue and green lines) and the bias weights (dashed blue line) are shown for different epochs. The NN forces are represented by the orange line.

5.2.2 Symmetry functions

A number of many-body functions that are suitable for use as symmetry functions in high-dimensional NNs are given in Ref. [81]. For the discussion in this section we will restrict ourselves to the three most common symmetry functions. The notation of Ref. [81] is followed.

In general, the symmetry function values $G_i$ depend on the positions of all atoms in the local environment of atom $i$ defined by a cutoff radius $R_c$, as indicated by the dotted arrows in Fig. 5.2, and the radial cutoff is imposed using a cosine cutoff function

$$f_c(R_{ij}) = \begin{cases} 0.5 \times \left[ \cos\!\left( \dfrac{\pi R_{ij}}{R_c} \right) + 1 \right] & \text{for } R_{ij} \leq R_c , \\ 0 & \text{for } R_{ij} > R_c . \end{cases} \tag{5.7}$$

The simplest radial symmetry function, $G^{1}_{i}$, is given by a sum over the cutoff function values for each atom $j$,

$$G^{1}_{i} = \sum_{j \neq i}^{N} f_c(R_{ij}) . \tag{5.8}$$

The alternative radial symmetry function $G^{2}_{i}$ is defined as

$$G^{2}_{i} = \sum_{j \neq i}^{N} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij}) , \tag{5.9}$$

where the parameters $\eta$ and $R_s$ define the width and the center of the Gaussian functions, respectively. An example of the radial symmetry functions $G^{2}_{i}$ is shown in Fig. 5.5.

While the radial symmetry functions are constructed for each pair of atoms, the angular symmetry functions depend on all triplets of atoms in the structure by combining the cosines of the angles $\theta_{ijk}$ centered at atom $i$, $\cos\theta_{ijk} = \vec{R}_{ij} \cdot \vec{R}_{ik} / (R_{ij} R_{ik})$, with $\vec{R}_{ij} = \vec{R}_i - \vec{R}_j$. In all NN potentials constructed for this thesis the angular symmetry function $G^{4}_{i}$ has been used, which is defined as

$$G^{4}_{i} = 2^{1-\zeta} \sum_{j \neq i} \sum_{k \neq i} \left( 1 + \lambda \cdot \cos\theta_{ijk} \right)^{\zeta} \cdot e^{-\eta \left( R_{ij}^2 + R_{ik}^2 + R_{jk}^2 \right)} \cdot f_c(R_{ij}) \cdot f_c(R_{ik}) \cdot f_c(R_{jk}) , \tag{5.10}$$

where the parameter $\lambda$ can assume the values $+1$ and $-1$, and the value of $\zeta$ determines the angular resolution. An example of the angular symmetry functions $G^{4}_{i}$ is plotted in Fig. 5.6.

Figure 5.5 An example of the radial symmetry functions $G^{2}_{i}$ with different $\eta$ parameters. (Plot of the symmetry function value as a function of $R_{ij}$ in Å for $\eta$ = 0.0009, 0.0100, 0.0200, 0.0350, 0.0600, 0.1000, 0.2000, and 0.4000 Bohr$^{-2}$.)

Figure 5.6 An example of the angular symmetry functions $G^{4}_{i}$ with different $\lambda$ parameters. (Plot of $G^{4}(\theta_{ijk})$ as a function of $\theta_{ijk}$ in degrees for $\lambda = \pm 1$ with $\zeta = 1$ and $\zeta = 4$.)

5.3 High-Dimensional Neural Networks for Multicomponent Systems

Structures containing different atomic species may exhibit charge transfer that leads to long-ranged non-local electrostatic interactions, which may not be well represented as a function of a local atomic environment.
The long-range electrostatic energy is dominated by the interaction between charges (which decays slowly as $1/r$), while electrostatic interactions from higher multipoles such as dipoles and quadrupoles decay faster with the atomic distance and can therefore already be covered well by the short-range part. The high-dimensional NN method can be extended by an additional NN that allows the prediction of atomic charges [94, 95] and enables the application of the NN method to multicomponent systems. The choice of the charge partitioning method is arbitrary; examples are the Mulliken [91], Hirshfeld [92], and Bader [93] schemes.

Figure 5.7 A high-dimensional neural network potential for multicomponent systems. An additional NN is employed to predict the atomic charges, which in turn can be used to compute the electrostatic energy of the structure.

A high-dimensional NN for multicomponent systems thus consists of two high-dimensional NNs: one NN for the short-range energy and one NN for the atomic charges (Fig. 5.7). Standard methods, such as the Ewald summation, can then be used to evaluate the electrostatic energy for periodic systems. The short-range energy contribution can be easily obtained from the reference calculation by subtracting the electrostatic energy, as computed using the reference charges, from the total reference energy:

$$E_{short,ref} = E_{tot,ref} - E_{elec,ref} . \tag{5.11}$$

The short-range energy and the reference charges can then be used to train the short-range and the charge NN independently of each other. However, if one wants to use the atomic forces for the optimization process, this procedure cannot be applied anymore, since the electrostatic force contribution is not directly accessible from the reference calculation. In order to obtain the electrostatic forces related to the reference charges, the derivatives of the charges with respect to the atomic positions $\alpha_k$, $\partial q_i / \partial \alpha_k$, have to be known.
Because we are dealing with environment-dependent charges, this derivative is non-zero. Unfortunately, the dependence of the charges on the environment is not available from the reference calculations. In order to still make use of the atomic forces for the weight optimization, the following approach is therefore taken:

1. The electrostatic NN is trained to the reference charges. (This does not require knowledge of the reference charge derivatives.)

2. The electrostatic energy and force contributions are computed using the electrostatic NN. The derivatives of the charges with respect to the atomic positions are given by the NN architecture and the definition of the symmetry functions:

$$\frac{\partial q_i}{\partial \alpha_k} = \sum_{j=1}^{M_i} \frac{\partial q_i}{\partial G_{i,j}} \frac{\partial G_{i,j}}{\partial \alpha_k} . \tag{5.12}$$

3. The electrostatic energy and force components given by the electrostatic NN are subtracted from the reference total energy and forces, and the short-range NN is trained to the remaining short-range part of the potential.

5.4 Benchmark of activation functions and symmetry functions

In order to select the most appropriate activation functions for the NN fitting, a number of options have been tested, such as the hyperbolic tangent (t) of Eq. (5.3), Gaussian functions (g), linear functions (l), and cosine functions (c). Several quantities can be monitored during the iterative fitting process to check the accuracy of the NN fits. The most commonly used one is the root mean squared error (RMSE), which is given for a reference set of M data points as

$$E_{\mathrm{RMSE}} = \sqrt{\frac{\sum_{i=1}^{M} \left(E_{\mathrm{NN},i} - E_{\mathrm{ref},i}\right)^2}{M}} . \tag{5.13}$$

For the benchmark, copper dimer structures with 110 different interatomic distances have been used as training set. In all cases a network architecture with a single hidden layer containing five nodes has been used (1-5-1), with a radial symmetry function of type $G_i^1$ or $G_i^2$. Table 5.1 shows the results of the benchmark for the different activation function types (c, g, l, and t), using the RMSE to monitor the accuracy of the fit. The hyperbolic tangent as activation function in combination with the symmetry function of type $G_i^2$ yields the lowest RMSE in the benchmark. Several further benchmarks, which are not shown here, have confirmed this finding. Therefore, these settings were used for all NN potentials constructed in this thesis.

Table 5.1 RMSEs of neural network (NN) fits for the Cu dimer (110 data points) after 100 iterations (epochs). The NN architecture is 1-5-1 (five nodes in the hidden layer) with a radial symmetry function of type $G_i^1$ or $G_i^2$. The symbols min, max, and avg refer to the minimum, maximum, and average RMSE values obtained with different random seeds. σ(train) is the standard deviation of the training set RMSE. The activation function types (act) are discussed in the text.

function type        RMSE / meV
symm   act     min (train)   min (test)   max (train)   avg (train)   σ (train)
1      c       0.31          0.48         14.10         2.12          1.65
1      g       0.54          0.40         11.39         2.70          1.78
1      l       55.90         75.79        55.90         55.90         0.00
1      t       0.23          0.27         21.05         1.86          1.47
2      c       0.16          0.34         2.31          0.49          0.25
2      g       0.16          0.26         3.26          0.55          0.35
2      l       9.03          1.49         9.03          9.03          0.00
2      t       0.13          0.12         2.45          0.38          0.21

Neural networks can provide total energies that are close to reference electronic structure energies, for example from DFT calculations, and the corresponding derivatives (forces). Neural networks cannot directly provide electronic properties of molecular systems, i.e., NNs cannot access information about the electronic states, and no charge density is available from the neural network output. Therefore, the main use of neural network potentials is to compute energies and forces to speed up molecular dynamics simulations, so that longer time scales and larger system sizes can be simulated and structural and dynamical properties become accessible while remaining very close to DFT accuracy [48, 53, 56, 96].
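The RMSE of Eq. (5.13), which is used to monitor all fits in this benchmark, can be sketched in a few lines. This is a minimal illustration, not the RuNNer implementation; the function name `rmse` and the sample numbers are made up for the example.

```python
import math


def rmse(e_nn, e_ref):
    """Root mean squared error of Eq. (5.13) for two equal-length sequences."""
    if len(e_nn) != len(e_ref) or len(e_nn) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(a - b) ** 2 for a, b in zip(e_nn, e_ref)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up energies, only to show the call; these are not data from the thesis.
e_ref = [0.0, 1.0, 2.0, 3.0]
e_nn = [0.1, 0.9, 2.1, 2.9]
print(f"RMSE = {rmse(e_nn, e_ref):.3f}")
```

Because the errors are squared before averaging, a few large outliers raise the RMSE more strongly than they would raise a mean absolute error.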
5.5 Molecular dynamics simulations employing NN potentials

In order to employ NN potentials in molecular dynamics (MD) simulations to study the structural and dynamical properties of complex systems, the NN code RuNNer [78] has been interfaced with the TINKER MD program [97]. TINKER is a general package for molecular mechanics and dynamics simulations. The original TINKER code has been extended to allow the use of the RuNNer program for the evaluation of the potential energy. TINKER offers a convenient interface for the implementation of new potentials, which made the implementation straightforward. The flow chart in Fig. 5.8 schematically shows the interaction of the two programs. It depicts how the added subroutines interact with the TINKER program:

• nninput (nninput.f): This subroutine writes the atomic coordinates and lattice vectors to a RuNNer input file with the name input.data. The units are converted automatically (e.g., Angstrom to Bohr).

• nnenergy (nnenergy.f): This routine reads the current value of the energy (in Hartree atomic units) from the RuNNer output file energy.out, converts it to kcal/mol, and stores it in the corresponding TINKER data structure.

• nnforces (nnforces.f): The final additional subroutine reads the atomic forces (in Ha/Bohr) from the RuNNer output file forces.out, converts them to kcal/mol/Angstrom, and returns the negative forces (i.e., the gradients) to TINKER.

Figure 5.8 A simple visualization of the interaction of RuNNer [78] and TINKER [97]; see the text for a detailed explanation.

Part III Computational Details

6 Computational Details

6.1 DFT calculations

The reference DFT calculations to train the NN potential for copper and zinc oxide have been carried out using the Fritz Haber Institute ab initio molecular simulations (FHI-aims) code discussed in Chapter 3 [57].
At the beginning of each calculation, the basis functions, which are given numerically on spherical grids centered at the nuclei, are determined by solving the Kohn-Sham equations for the free atoms. In the present work we have used the “tier 1” basis set for copper and zinc and the “tier 2” basis set for oxygen. These basis sets are part of the default basis set library provided by FHI-aims, and they contain a minimal basis set of atomic orbitals as well as a set of additional basis functions constructed as hydrogen-like orbitals, i.e., they are obtained from a fictitious hydrogen atom with a modified nuclear charge and a specific set of quantum numbers. In total, 40 basis functions have been used for each atom. For all periodic structures, dense k-point meshes have been used with k-point densities approximately equivalent to a 12×12×12 mesh for a conventional four-atom fcc unit cell. The calculated total energies are converged to about 1 meV per atom, the forces to about 10 meV/Å. The PBE functional [98] has been used in all calculations to describe electronic exchange and correlation. Relativistic effects have been included via the scaled zeroth order regular approximation (ZORA) [99]. The local, atom-centered basis functions make it possible to treat systems with and without periodic boundary conditions in a consistent way. This has been exploited by using periodic systems for bulk and slab structures, and non-periodic structures for clusters. The DFT calculations yield the total energies and atomic forces, both of which have been used to construct the NN potential. A DFT calculation for an N-atom system provides 3N + 1 pieces of information for the fitting, as there is one total energy and 3N force components per structure.

6.2 Construction of reference data sets

With the exception of the data discussed for the methanol molecule in Sec. 10, DFT calculations have been carried out to construct the reference data sets for NN potentials.
In general, the configurations in the reference set include bulk structures, slabs, and clusters of variable size comprising up to 100 atoms. Approximately 10 % of these structures have been selected as test set to check the generalization properties of the potential for unknown structures. The remaining 90 % have been used to determine the NN parameters. A detailed discussion of the reference data sets is given in the chapters for each individual system. In each case, the reference data set has been generated in a self-consistent, iterative procedure. A number of schemes have been proposed in the literature to add data points step by step in important regions of the potential energy surface (PES) in the context of empirical potentials [100, 101] and also in the field of neural networks [102]. A first approximate potential has been constructed based on an initial data set derived from ideal crystal structures such as face-centered cubic (fcc), body-centered cubic (bcc), hexagonal close-packed (hcp), and simple cubic (sc) in a unit cell volume range of about 50 Bohr³ per atom. The six-dimensional space of lattice vectors has been mapped systematically for these structures employing a primitive unit cell containing one atom, except for the hcp structure, which has a lattice basis of two atoms. This has been done by systematically varying the lattice parameters a, b, c, α, β, and γ. Consequently, a large number of bulk cells containing only one or two atoms are included in the reference data sets. In particular, the bulk structures containing just one atom provide valuable information for the fit, because in these structures all atoms have the same chemical environment. Therefore, a unique energy can be assigned to each vector of symmetry functions describing the atomic environments. Further bulk structures have been generated for supercells containing two or four atoms.
In these structures, which were derived from fcc, bcc, sc, and hcp (super)cells, the symmetry has been broken by displacing the atoms randomly by up to 0.5 Bohr from the ideal lattice sites, which corresponds to a thermal distortion. Apart from the bulk structures, the initial data set also contains surface structures, which have been represented by slabs. For example, small slabs with 10 or 12 atoms (one atom per layer) have been constructed for the low-index surfaces of fcc, bcc, and sc copper. Here, too, the volume has been varied and a number of structures with randomly displaced atoms have been generated.

Figure 6.1 A systematic approach to construct neural network potentials

6.2.1 A systematic approach to construct neural network potentials

The construction of neural network potentials follows the systematic approach shown in Fig. 6.1. The figure shows a flow chart of the general procedure that is applied. In a first step, a number of DFT calculations for random structures (the first training set) are performed. Based on the results, a first regression leads to a preliminary potential-energy surface. The generated fit is then used to determine further structures to be included in the training set in order to improve the quality of the NN potential. This is done, e.g., by performing molecular dynamics simulations or structural optimizations using the preliminary potential. Subsequently, those structures occurring during these calculations that lie outside the fitted regions of the PES are calculated with DFT and, in the case of poor agreement between DFT and NN energies, included in an extended training set. For the new training set the whole scheme is repeated until good agreement between DFT and the NN potential is achieved for all occurring structures.

6.2.2 Iterative refinement of the data sets

A disadvantage of the procedure described so far is the high number of DFT calculations for similar structures that will not necessarily improve the reference data set.
To avoid such redundant calculations we have followed a different approach to refine the data set. Based on the initial data set, several NN fits have been generated. They have then been employed in MD simulations of larger bulk and surface systems containing up to a few hundred atoms using the NVT ensemble. While a few of the smaller structures generated in this way have been recalculated directly by DFT and were added to the training set to improve the fit, for most atomic environments emerging in these simulations a more efficient approach has been followed. Since the energy contribution of each atom depends only on its local chemical environment, a new atomic environment can be added to the training set by cutting a cluster centered at the respective atom from systems, which are too large for DFT calculations. The advantage of this procedure is that only comparably small clusters need to be calculated by DFT. Most of the clusters in the reference data sets have been generated in this way. Following the same spirit, a large number of bulk and surface structures, which have been generated in MD simulations at temperatures between 300 and 1000 K, have been searched systematically for appropriate clusters to be added to the data set. This can be achieved by exploiting the large flexibility of NN potentials. Two NN fits of about the same quality in terms of the RMSE of the energy and the forces can be used to identify missing points in the reference set. If an atom has an environment that is very different from the environments already included in the training set, the two NN fits are likely to predict very different energy contributions for this atom. By comparing the predicted atomic energies of two approximate potentials, the atomic environments missing in the training set can thus be systematically determined. Then, clusters centered at these atoms are recalculated by DFT to improve the reference data set. 
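The comparison of two fits described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented for the thesis: the function name `flag_missing_environments`, the threshold value, and the sample atomic energies are hypothetical.

```python
def flag_missing_environments(e_atom_fit1, e_atom_fit2, threshold):
    """Return the indices of atoms whose predicted atomic energies
    differ by more than `threshold` between two NN fits."""
    if len(e_atom_fit1) != len(e_atom_fit2):
        raise ValueError("both fits must provide one energy per atom")
    return [
        index
        for index, (e1, e2) in enumerate(zip(e_atom_fit1, e_atom_fit2))
        if abs(e1 - e2) > threshold
    ]


# Hypothetical atomic energies (eV) from two fits of similar overall quality.
fit1 = [-3.50, -3.48, -3.10, -3.52]
fit2 = [-3.51, -3.47, -3.45, -3.52]

# Atom 2 is predicted very differently by the two fits, so its environment
# is a candidate for a new cluster to be recalculated by DFT.
print(flag_missing_environments(fit1, fit2, threshold=0.05))  # → [2]
```

The threshold is a free choice in this sketch. In practice it would have to be set relative to the RMSE that both fits reach on the existing reference set.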
This procedure is repeated until the root mean squared errors of the training and the test set have converged. This two-fits approach makes it possible to search a large region of the configuration space for missing reference points without the need for redundant, expensive DFT calculations [85]. Examples of such analyses are given in the following chapters for specific NN potentials.

6.3 Optimization of the neural network architecture

As discussed in Chapter 5, apart from the numerical values of the weights, the NN architecture is also important for the accuracy of the potential, because a large number of hidden layers and nodes increases the flexibility of the NN. If the NN is too small, it is not able to represent all subtle features of the PES and some details may be missing in the final potential. If the NN is too large, it has a very high flexibility and over-fitting can occur, i.e., the training structures are well represented, while atomic configurations in between the training points can have a drastically reduced accuracy. Both situations can be detected by monitoring the RMSEs of the training set and the test set. In the initial stage of the fit, the RMSEs of both sets decrease, since the NN learns the overall topology of the PES. If the errors of both sets remain similar in the course of the fitting process, but the RMSEs are still high, the NN size should be increased. If, on the other hand, the RMSE of the training set is low, but the test set RMSE is much larger, then over-fitting is present. In general, the smallest possible NN should be used that provides the desired accuracy and similar training and test errors. In practice we found it most efficient to determine the optimum NN architecture in an empirical way. A number of NN PESs with different architectures are constructed, and the one with the best generalization properties, i.e., the lowest test set error, is selected for applications.
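The two diagnostic rules and the final selection step can be summarized in a short sketch. The function names, the ratio `max_ratio` that decides when a test error counts as "much larger", the target accuracy, and all RMSE values below are assumptions made for illustration; the text itself does not prescribe numerical thresholds.

```python
def diagnose_fit(rmse_train, rmse_test, target_rmse, max_ratio=1.5):
    """Classify a fit from its training and test RMSE (heuristic sketch)."""
    if rmse_test > max_ratio * rmse_train:
        # Training error low, test error much larger: over-fitting.
        return "over-fitting"
    if rmse_train > target_rmse:
        # Both errors similar but still high: the NN is too small.
        return "too small"
    return "acceptable"


def select_architecture(results):
    """Return the label of the architecture with the lowest test-set RMSE."""
    return min(results, key=lambda name: results[name][1])


# Made-up (training RMSE, test RMSE) pairs in meV/atom, for illustration only.
results = {
    "net-small": (6.0, 6.2),
    "net-medium": (3.5, 3.8),
    "net-large": (1.0, 7.5),
}
for name, (train, test) in results.items():
    print(name, diagnose_fit(train, test, target_rmse=4.0))
print("selected:", select_architecture(results))
```

With these made-up numbers the medium network is selected: the small one misses the target accuracy, and the large one reproduces the training set well but generalizes poorly.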
Finally, it should be noted that the determination of the NN weights represents a very high-dimensional optimization problem, and there is no realistic prospect of finding the global minimum, since there are typically several thousand weight parameters to be optimized. Still, in most cases very accurate local minima can be found, which represent all physical properties of the system with good accuracy. For a given NN architecture there are many local minima, and the final result depends on several initial settings, such as the choice of the initial values of the weights, the order of the training points, and the optimization algorithm.

Part IV Results

7 A Neural Network Potential for Copper

The interactions present in a metal are substantially different from those observed in covalent insulators: in contrast to covalent structures, the electronic wave functions in metals are very long-ranged and delocalized. Empirical interatomic potentials are therefore usually specialized for either metallic or covalent systems. In this chapter the first application of neural network potentials to a metal, namely copper, is discussed. It is demonstrated that the high flexibility of the NN is able to capture metallic interactions with the same accuracy that is known for covalent insulators.

7.1 Reference data set

Altogether, 37,763 DFT calculations have been performed to construct the reference data set. These structures comprise 8,419 clusters, 15,448 bulk structures, and 13,896 slabs. The structure sizes range from one atom (bulk structures with different lattice vectors) up to about 144 atoms. In total, there are 617,475 atomic environments in these structures. Each DFT calculation provides the total energy and three force components per atom. Therefore, this data set contains 1,890,188 pieces of information that can be used to construct the NN potential.
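These totals follow directly from the 3N + 1 rule of Sec. 6.1, summed over all structures: one energy per structure plus three force components per atomic environment. The symbols $N_{\mathrm{struct}}$, $N_{\mathrm{env}}$, and $N_{\mathrm{info}}$ are introduced here only for this worked check.

```latex
\begin{align}
N_{\mathrm{struct}} &= 8{,}419 + 15{,}448 + 13{,}896 = 37{,}763 , \\
N_{\mathrm{info}} &= N_{\mathrm{struct}} + 3\,N_{\mathrm{env}}
  = 37{,}763 + 3 \times 617{,}475
  = 37{,}763 + 1{,}852{,}425
  = 1{,}890{,}188 .
\end{align}
```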
The structures have been distributed randomly into a training set (33,963 structures), which is used to optimize the weights of the NN, and an independent test set (3,800 structures, approximately 10 % of the reference data set), which is used to check the transferability of the potential. Lists of the copper clusters, bulk structures, and slab structures in the training and test sets are given in Tables 7.1, 7.2, and 7.3, respectively. The reference data set has been generated following the self-consistent iterative procedure described in Sec. 6.2. A first approximate potential has been constructed using only the ideal crystal structures fcc, hcp, bcc, sc, and diamond cubic in the volume range from about 50 Bohr³ per atom to about 130 Bohr³ per atom, which is shown in Fig. 7.1. The equilibrium volume per atom for copper in the fcc structure, the ground state structure, is 80.707 Bohr³. Further reference structures have been generated systematically by varying the lattice parameters, as discussed in Sec. 6.2. For the ideal one-atomic crystal structures fcc, bcc, and sc there are six lattice parameters: the lengths of the lattice vectors a, b, and c and the angles α, β, and γ. For the two-atomic hcp structure the c/a ratio is an additional parameter. The resulting large number of bulk cells containing only one or two atoms is shown in Table 7.2. Additionally, bulk structures generated for two- and four-atom unit cells have been derived from fcc, bcc, sc, and hcp supercells.

Figure 7.1 DFT energy vs. volume curves for several crystal structures of copper (bcc, diamond, fcc, hcp, and sc). The relative energy (meV/atom) is plotted against the volume (Bohr³/atom). The atomic energies are relative to the most stable structure at the lattice constant of its energy minimum. The most stable fcc structure is represented by black diamonds.
The symmetry of these structures has been broken by randomly displacing the atoms by 0.2 up to 0.5 Bohr from the ideal lattice sites (see Fig. 7.2). Apart from the bulk structures, surface structures were also included in the data set. Surfaces were represented by slab models, which were generated from the bulk equilibrium lattices by truncation. In particular, small slabs with 10 or 12 atoms (one atom per layer) have been constructed for the low-index surfaces of fcc, bcc, and sc copper. Here, too, the volume has been varied and a number of structures with randomly displaced atoms have been generated. Moreover, vacancy structures have been included, in which one atom was removed from the bulk or slab structures in larger supercells. Both the fixed and the optimized geometries of the vacancy structures were considered.

Figure 7.2 DFT energies of the copper bulk structures (ideal and distorted structures) included in the reference data set (a). Panel (b) enlarges the region marked by the black rectangle in panel (a), which contains the structures that are close to the minimum energy. Both panels show the relative energy (meV/atom) as a function of the volume (Bohr³/atom).

Based on this first input data set, various NN fits were generated. They have then been employed in MD simulations of large bulk and surface systems containing many hundreds or thousands of atoms using the NVT ensemble at temperatures between 300 and 1000 K. A few of the smaller structures generated in this way were recalculated directly by DFT and were added to the training set to improve the fit, but for most atomic environments emerging in these simulations the more efficient two-fit approach of Sec. 6.2.2 has been followed. The comparison of the predicted atomic energies of two different NN fits has been used to identify missing reference data points. Figure 7.3 shows a typical example of such an analysis.
As reasoned in Sec. 6.2, the energy contribution of each atom depends only on the local chemical environment. Therefore, a new atomic environment can be added to the training set by cutting a small cluster, with a radius equal to the range of the symmetry functions (6 Å in the case of the copper potential) and centered at the respective atom, from those systems that are too large for DFT calculations. This iterative procedure was repeated until the root mean squared errors of the training and the test set had converged. The advantage is that only relatively small clusters have to be calculated by DFT, and therefore most of the clusters in Table 7.1 have been generated in this way. The remaining clusters in the training set have been generated randomly. The atomic positions have been specified by defining minimum and maximum Cu–Cu distances and the cluster radius; all of the Cu3 clusters have been created in this way.

Figure 7.3 Comparison of the neural network (NN) energies (relative energy in meV/atom) along a molecular dynamics (MD) trajectory. The MD has been carried out employing the NN1 potential and the energies have been recalculated using the NN2 potential. These potentials have been constructed using the same training set [85]. For most configurations the energies of the two NN potentials are very close. This indicates that these structures are similar to the structures in the training set. In the gray regions, however, the two fits predict significantly different energies, which means that the configurations in those regions are missing in the training set. Those structures should be added to the training set.

Table 7.1 A list of all copper cluster structures in the training and the test set (number of structures per cluster size).

No. of Cu atoms:  3 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
training points:  447 66 75 94 92 90 88 88 94 88 92 91 89 83 88 89 127 93 87 114 92 104 98 90 106 88 101 113 91
test points:      53 6 14 5 7 9 12 10 6 12 8 9 9 17 12 12 17 7 13 13 7 12 6 12 17 13 17 11 10

No. of Cu atoms:  41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
training points:  108 109 119 97 101 116 120 131 145 113 77 34 43 48 105 59 44 39 16 126 58 50 50 25 109 41 41 38 44
test points:      16 13 17 12 12 15 21 15 16 14 8 4 3 12 9 9 5 2 2 16 7 7 5 3 7 5 5 4 4

No. of Cu atoms:  70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 94 95 98 99 100
training points:  86 41 46 52 68 139 39 34 42 45 111 84 98 140 116 295 170 160 59 65 3 24 1 16 160 20 13 154
test points:      9 3 5 3 7 11 4 2 2 11 6 17 14 17 43 18 21 4 6 2 5 20 5 4 11

Table 7.2 A list of all copper bulk structures in the training and the test set (number of structures per cell size).

Cu atoms   training points   test points
1          5145              531
2          3499              377
3          2                 -
4          4938              558
5          1                 -
7          2                 1
8          287               37
11         2                 -
12         2                 -
15         6                 -
16         3                 1
17         2                 -
18         1                 1
23         6                 -
24         4                 -
26         1                 -
27         2                 -
31         3                 -
32         2                 -
35         4                 2
36         4                 -
47         2                 1
48         1                 1
53         5                 -
54         4                 -
71         2                 1
72         1                 1
107        3                 -
108        2                 -

Table 7.3 A list of copper surface structures in the training and the test set (number of structures per slab size).

Cu atoms   training points   test points
3          9                 1
4          11                -
5          19                3
6          7                 2
7          1                 -
8          15                1
9          1                 -
10         8550              956
11         9                 2
12         3598              421
15         12                2
16         17                2
17         8                 3
18         7                 1
24         1                 -
26         9                 2
27         8                 -
31         14                3
32         16                2
33         106               7
36         1                 -
47         14                -
48         14                2
63         2                 -
64         2                 -
71         12                2
72         13                2
95         2                 -
96         2                 -
143        1                 -
144        1                 -

7.2 A neural network potential for copper

Several NN potentials for the copper system have been constructed using different sets of initial weight parameters and different NN architectures in order to identify the optimum functional form. A subset of 15,448 copper bulk structures was used to determine the first settings for the NN training. At this stage, the atomic forces were not included in the fit. We have compared the root mean squared errors (RMSEs) for NN architectures with different numbers of hidden layers. One to four hidden layers with different numbers of nodes per layer (2, 5, 10, 15, 20, 30, and 40) have been tested, using hyperbolic tangent activation functions at the nodes of the hidden layers and a linear activation function at the output node. The number of weights of each architecture is listed in Table 7.4. The RMSEs of the energies for the networks with three or four hidden layers are not significantly lower than for the networks with two hidden layers; the differences are less than 0.1–0.2 meV/atom, as also shown in Table 7.4. An important factor for the selection of a potential for applications is the RMSE of the forces (for both the training and the test set). Usually we take the NN fit with low RMSEs of the energies and with the lowest RMSEs of the forces. These tests have shown that the NN with 2521 weight parameters, i.e., two hidden layers with 30 nodes each, results in the lowest error in the forces, as shown in Fig. 7.4. The computation time to optimize the weight parameters of larger networks (e.g.
30 and 40 nodes per layer) with three and four hidden layers, on the other hand, is about two or three times higher than for NN architectures with two hidden layers. For the following applications we have therefore only used network architectures with two hidden layers and have also included the forces in the optimization of the weights.

Table 7.4 The RMSEs of energies (meV/atom) and forces (meV/Bohr) for the neural network (NN) training set for copper bulk structures with different sets of initial parameters. The errors have been observed at the 100th iteration for NN architectures with 51 symmetry functions and different numbers of NN weights. Each entry lists the number of weight parameters, followed by (energy RMSE, force RMSE).

Nodes per layer   1 layer             2 layers            3 layers            4 layers
2                 107 (3.1, 20.4)     113 (2.7, 18.4)     119 (2.5, 9.6)      125 (2.8, 18.6)
5                 266 (2.7, 13.7)     296 (1.9, 11.3)     326 (1.8, 15.1)     356 (1.8, 13.1)
10                531 (2.3, 13.5)     641 (1.5, 14.6)     751 (1.4, 9.8)      861 (1.5, 9.6)
15                796 (2.0, 9.7)      1036 (1.5, 8.8)     1276 (1.3, 13.6)    1516 (1.3, 10.0)
20                1061 (2.0, 9.1)     1481 (1.5, 9.9)     1901 (1.4, 11.1)    2321 (1.4, 11.7)
30                1591 (1.9, 8.9)     2521 (1.4, 8.4)     3451 (1.5, 11.6)    4381 (1.7, 11.6)
40                2121 (1.8, 9.4)     3761 (1.8, 10.0)    5401 (1.8, 11.4)    7041 (1.7, 10.6)

Figure 7.4 Comparison of the RMSEs of the forces (meV/Bohr) for the neural network (NN) training set for the copper bulk system, plotted against the number of nodes in the hidden layers for two, three, and four hidden layers. The errors have been observed at the 100th iteration for NN architectures with 51 symmetry functions and different numbers of NN weights. The atomic forces were not used in the optimization of the weights.

Also for the full data set (copper clusters, bulk structures, and slabs), several NN architectures with two hidden layers and different numbers of nodes per layer have been tested. We have used hyperbolic tangent activation functions at the nodes of the two hidden layers and a linear activation function at the output node. It has been found that a small network with a 51-10-10-1 architecture is able to provide rather low energy RMSEs of 4.870 meV/atom for the training set and 4.592 meV/atom for the test set. The training set and test set forces obtained with this NN have RMSEs of 41.286 meV/Bohr and 41.519 meV/Bohr, respectively. The RMSE values of the training set energies of the first 20 iterations are shown for four different network architectures in Fig. 7.5. It can be seen that the reduction of the error is only marginal if the NN size is increased beyond the 51-30-30-1 and 51-40-40-1 architectures; therefore, these two were chosen for a closer investigation.

Figure 7.5 Comparison of the fitting errors (energy RMSE in meV/atom as a function of the iteration) of the training set for several neural network (NN) architectures: 51-10-10-1, 51-20-20-1, 51-30-30-1, and 51-40-40-1.

Not only the values of the RMSEs of the energies and forces are important for the selection of an NN potential for applications; the predictive power also needs to be checked. It has to be verified that the NN potential can represent and predict properties such as geometries, energies, and forces of unknown structures with the same accuracy as for the reference data.
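The numbers of weight parameters listed in Table 7.4 can be reproduced by a simple count. The sketch below assumes a fully connected feed-forward network with one bias weight per hidden and output node; this assumption is not stated explicitly in the text, but it reproduces the tabulated values. The function name `count_weights` is chosen for illustration.

```python
def count_weights(layers):
    """Number of connecting weights plus bias weights for a fully connected
    feed-forward NN given as a list of layer sizes, e.g. [51, 30, 30, 1]."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))


# Architectures from Table 7.4 (51 symmetry functions in the input layer).
for architecture in ([51, 2, 1], [51, 10, 10, 1], [51, 30, 30, 1],
                     [51, 40, 40, 40, 40, 1]):
    label = "-".join(str(n) for n in architecture)
    print(label, count_weights(architecture))
```

For the 51-30-30-1 architecture this gives 52 × 30 + 31 × 30 + 31 × 1 = 2521 weights, in agreement with the table.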
The fit that has been selected for production calculations was obtained using a 51-30-30-1 NN architecture with 2521 weight parameters and including the atomic forces in the fitting. The RMSEs of the energies are 3.63 meV/atom and 3.98 meV/atom for the training and the test set, respectively. The errors of the forces are 42.79 meV/Bohr (training set) and 42.03 meV/Bohr (test set). For this fit the mean absolute errors (MAEs) of the energies are 2.09 meV/atom (training set) and 2.23 meV/atom (test set). The MAEs of the forces are 29.29 and 29.43 meV/Bohr for the training set and the test set, respectively. Note that the MAEs are expected to be smaller than the RMSEs, as outliers have a smaller impact on their value. Overall, these errors suggest that there is no over-fitting, because the training and the test set have almost the same errors for both the energies and the forces.

Figure 7.6 (a) Comparison of the DFT and neural network (NN) binding energies (eV/atom) of the structures in the training set and the independent test set. All points are very close to the line with a slope of 45° corresponding to a perfect fit. For clarity, binding energies obtained by removing the energies of free atoms are shown instead of total energies. (b) Fitting error distribution (number of points per energy error interval, in meV per atom) in the training and the test set. In total the training set contains 33,963 structures, and the test set consists of 3,800 structures.

In Fig. 7.6(a) the NN binding energies of the training and the test set are plotted as a function of the DFT energies. As can be seen in the graph, all points are very close to a line with a slope of 45° (blue solid line), which corresponds to an optimal fit. In Fig. 7.6(b) the distribution of errors is shown; most of the data points have an error of less than about 3 meV/atom.

7.2.1 Copper clusters

The first copper systems we study are copper clusters. In order to investigate the accuracy of the NN potential, random Cu30 clusters from the independent test set have been selected and the NN total energies have been compared to the DFT total energies. Fig. 7.7 demonstrates the high accuracy of the NN potential: the NN energies reproduce the DFT energies very accurately. Apart from total energies, the absolute forces of a Cu14 cluster have also been checked. Fig. 7.8 shows the absolute forces acting on the copper atoms of this Cu14 cluster, which has been taken randomly from an MD simulation at 300 K and is not included in the training set. The NN atomic forces and the DFT atomic forces are in good agreement. We are more interested in applying the NN potential to real copper systems than to random clusters. Therefore, in the next sections, the accuracy and reliability of the NN potential for bulk structures and real surfaces will be checked.

Figure 7.7 Comparison of the neural network (NN) energies and the DFT energies (relative energy in meV/atom) for random Cu30 clusters.

7.2.2 Bulk copper

A reliable description of bulk copper is a necessary requirement for the application of the NN potential to surfaces, because every surface model includes features of the bulk structure. The simplest model of a surface structure is that of the truncated bulk (the ideal surface). The properties of different crystal structures of copper have therefore been investigated first.
The most important energetic quantity is the cohesive energy, which is defined as

$$E_{\mathrm{coh}} = \frac{1}{N} \cdot \left( E_{\mathrm{bulk}} - N \cdot E_{\mathrm{atom}} \right), \tag{7.1}$$

where $N$ is the number of atoms in the bulk unit cell, and $E_{\mathrm{atom}}$ is the energy of a free copper atom in its electronic ground state, which is a constant determined by DFT. The energy of an isolated atom is not included in the NN training set.

Figure 7.8 Comparison of the DFT and neural network (NN) forces acting on the copper atoms in a Cu14 cluster. The structure has been chosen randomly from a molecular dynamics trajectory at 300 K.
[Axes: |F| (eV/Bohr) versus Cu atom number.]

The equilibrium lattice constants and bulk moduli $B$ of the different bulk structures, i.e., fcc, bcc, sc, and hcp, have also been calculated. Table 7.5 shows the results that have been obtained with the NN potential and DFT. The NN predicts that fcc copper is the most stable crystal structure, which is in agreement with DFT and experiment. The equilibrium fcc lattice constant obtained by the NN potential is 3.630 Å, compared to 3.630 Å in DFT and 3.615 Å in experiment [103]. The NN bulk moduli of the fcc, bcc, and sc structures of 138, 135, and 108 GPa are also very close to the DFT values of 140, 137, and 103 GPa, respectively.

Additionally, the elastic constants have been calculated (the calculation of elastic constants of cubic lattices is outlined in Sec. B). For fcc copper the NN elastic constants are c11 = 177 GPa (DFT: 173 GPa), c12 = 119 GPa (DFT: 123 GPa), and c44 = 83 GPa (DFT: 80 GPa). The agreement between DFT and the NN potential is excellent, and both are comparable to the experimental values. For the other investigated crystal structures, i.e., the bcc and sc structures, the NN potential also predicts elastic constants similar to the DFT reference. All results are shown in Table 7.6.
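As a consistency check on the numbers above, the bulk modulus of a cubic crystal can be obtained from its elastic constants via the standard relation B = (c11 + 2 c12)/3. The following minimal sketch applies this relation to the fcc and bcc values quoted in the text and in Table 7.6; it is not the procedure used to compute the bulk moduli in this work, and the function and variable names are illustrative only.

```python
def bulk_modulus_cubic(c11: float, c12: float) -> float:
    """Bulk modulus of a cubic crystal, B = (c11 + 2*c12) / 3 (same units as the input)."""
    return (c11 + 2.0 * c12) / 3.0


# Elastic constants (c11, c12) in GPa, copied from the text and Table 7.6.
elastic_constants = {
    ("fcc", "DFT"): (173.0, 123.0),
    ("fcc", "NN"): (177.0, 119.0),
    ("bcc", "DFT"): (138.0, 137.0),
    ("bcc", "NN"): (135.0, 135.0),
}

bulk_moduli = {
    key: bulk_modulus_cubic(c11, c12)
    for key, (c11, c12) in elastic_constants.items()
}

for (structure, method), value in bulk_moduli.items():
    print(f"{structure} {method}: B = {value:.1f} GPa")
```

Rounded to integers, these values coincide with the fcc and bcc bulk moduli listed in Table 7.5 (140 and 138 GPa for fcc, 137 and 135 GPa for bcc).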
Further, the vacancy formation energies in different crystal structures have been investigated for several vacancy concentrations, which are determined by the supercell size.

Table 7.5 Comparison of the lattice parameters, cohesive energies (eV), and bulk moduli (GPa) of four crystal structures of copper obtained with the neural network (NN) and density-functional theory (DFT).

| Structure | Ecoh DFT (eV) | Ecoh NN (eV) | Lattice parameters DFT | Lattice parameters NN | Bulk modulus DFT (GPa) | Bulk modulus NN (GPa) |
|---|---|---|---|---|---|---|
| fcc | 3.533 | 3.526 | a = 3.630 Å | a = 3.630 Å | 140 | 138 |
| bcc | 3.489 | 3.486 | a = 2.885 Å | a = 2.887 Å | 137 | 135 |
| sc | 3.051 | 3.052 | a = 2.407 Å | a = 2.407 Å | 103 | 108 |
| hcp | 3.511 | 3.511 | a = 4.862 Å, c/a = 1.627 | a = 4.856 Å, c/a = 1.631 | - | - |

Table 7.6 Elastic constants (GPa) of the fcc, bcc, and simple cubic structures of copper obtained with the neural network (NN) and density-functional theory (DFT). The experimental values for the fcc structure [104] are c11 = 170.0 GPa, c12 = 122.5 GPa, and c44 = 75.8 GPa.

| Structure | c11 DFT | c11 NN | c12 DFT | c12 NN | c44 DFT | c44 NN |
|---|---|---|---|---|---|---|
| fcc | 173 | 177 | 123 | 119 | 80 | 83 |
| bcc | 138 | 135 | 137 | 135 | 103 | 89 |
| sc | 294 | 251 | 36 | 8 | -38 | -22 |

In order to identify the effect of the atomic relaxations close to the vacancies and the description of these relaxations by the NN potential, the vacancy formation energies were determined for two different cases. First, the vacancies were generated by removing one copper atom, and a single-point energy calculation was performed without subsequent relaxation of the resulting structures (columns “fixed” in Table 7.7). Second, the structures were fully relaxed after removing the atom. This was done independently for DFT and the NN, i.e., the relaxed DFT structures were determined by minimizing the DFT forces, and the relaxed NN structures were obtained by geometry optimizations employing the NN potential. In all cases, the lattice constants were kept fixed at the values of the ideal crystal structures and only the atomic positions were optimized.
The vacancy formation energy in the bulk is defined as

$$E_{\mathrm{vac,bulk}} = E_{\mathrm{bulk,v}}(N-1) - \frac{N-1}{N} \, E_{\mathrm{bulk}}(N), \tag{7.2}$$

where $E_{\mathrm{bulk}}(N)$ is the energy of the defect-free unit cell containing $N$ atoms and $E_{\mathrm{bulk,v}}(N-1)$ is the energy of the system containing the vacancy. As shown in Table 7.7, the bulk vacancy formation energies are well reproduced, and the average absolute deviation between DFT and the NN potential is about 43 meV. This error is higher than the energy RMSE of the fit. The reason for the large error is that structures with vacancies are not well represented in the training set. For example, we did not include all vacancy sites of the ideal structures, and vacancy structures do not form spontaneously in molecular dynamics simulations, which have been used to generate the majority of the training points. The large fcc supercells, (2 × 2 × 2), (2 × 2 × 3), (2 × 3 × 3), and (3 × 3 × 3), contain 32, 48, 72, and 108 atoms, respectively. Therefore the error per atom is substantially lower. Nevertheless, not all atoms contribute equally to this large error. The atoms responsible for the deviation must be the ones that are close to the vacancy site and are not well represented in the training set. This can be concluded from the finding that the error is on average the same for all supercells, i.e., the error does not increase with system size. If the NN potential were further refined by adding structures with vacancies to the reference data set, improved results for the vacancy formation energies could be expected.

Table 7.7 Vacancy formation energies (eV) in different crystal structures of bulk copper. Different supercells have been used to represent various vacancy concentrations. The fixed data correspond to bulk-like atomic positions.
The relaxed data have been determined by optimizing the structure with the respective method, i.e., the DFT energy has been determined by relaxing the structure using DFT, and the NN energy has been obtained by a geometry optimization using the NN potential.

| Structure | Supercell | Evac,bulk fixed DFT | Evac,bulk fixed NN | Evac,bulk relaxed DFT | Evac,bulk relaxed NN |
|---|---|---|---|---|---|
| fcc | (2 × 2 × 2) | 1.196 | 1.195 | 1.164 | 1.174 |
| fcc | (2 × 2 × 3) | 1.180 | 1.214 | 1.143 | 1.180 |
| fcc | (2 × 3 × 3) | 1.171 | 1.232 | 1.130 | 1.194 |
| fcc | (3 × 3 × 3) | 1.147 | 1.250 | 1.108 | 1.214 |
| bcc | (2 × 2 × 2) | 1.069 | 1.021 | 0.982 | 0.958 |
| bcc | (2 × 2 × 3) | 1.074 | 1.055 | 0.874 | 0.788 |
| bcc | (2 × 3 × 3) | 1.068 | 1.076 | 0.864 | 0.802 |
| bcc | (3 × 3 × 3) | 1.070 | 1.084 | 0.925 | 0.944 |
| hcp | (2 × 2 × 2) | 1.030 | 1.064 | 1.022 | 1.045 |
| hcp | (2 × 2 × 3) | 1.059 | 1.072 | 1.045 | 1.036 |
| hcp | (2 × 3 × 3) | 1.108 | 1.068 | 1.087 | 1.041 |
| hcp | (3 × 3 × 3) | 1.128 | 1.046 | 1.103 | 1.026 |

Figure 7.9 Comparison of the DFT and neural network (NN) energies of several 16 atom bulk structures selected from a molecular dynamics trajectory of bcc copper at 500 K.
[Axes: rel. energy (meV/atom) versus number of structure.]

Next, we have investigated the accuracy of the NN potential for disordered bulk structures. MD simulations of several crystal structures over a wide range of temperatures within the NVT ensemble have been carried out using the NN potential. Representative structures were then selected, and the energies and forces were determined by DFT calculations for comparison. Figure 7.9 shows the NN and DFT energies of distorted bcc structures containing 16 atoms, which were extracted randomly from a molecular dynamics simulation at 500 K. The typical deviation between the DFT and the NN energies is only a few meV per atom. The absolute forces acting on the atoms of one of these structures are compared in Fig. 7.10. It can be seen that the agreement between the NN and DFT forces is also excellent.
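The average absolute deviation of about 43 meV quoted for the bulk vacancy formation energies can be reproduced directly from the entries of Table 7.7. The sketch below also includes a small helper implementing Eq. 7.2; the helper name and the energies passed to it in the usage example are hypothetical and only serve as an illustration.

```python
def vacancy_formation_energy(e_defect: float, e_perfect: float, n_atoms: int) -> float:
    """Eq. 7.2: E_vac,bulk = E_bulk,v(N-1) - (N-1)/N * E_bulk(N)."""
    return e_defect - (n_atoms - 1) / n_atoms * e_perfect


# (DFT, NN) pairs in eV from Table 7.7: fixed and relaxed values for
# fcc, bcc and hcp, each with the four supercell sizes.
pairs = [
    (1.196, 1.195), (1.180, 1.214), (1.171, 1.232), (1.147, 1.250),  # fcc fixed
    (1.164, 1.174), (1.143, 1.180), (1.130, 1.194), (1.108, 1.214),  # fcc relaxed
    (1.069, 1.021), (1.074, 1.055), (1.068, 1.076), (1.070, 1.084),  # bcc fixed
    (0.982, 0.958), (0.874, 0.788), (0.864, 0.802), (0.925, 0.944),  # bcc relaxed
    (1.030, 1.064), (1.059, 1.072), (1.108, 1.068), (1.128, 1.046),  # hcp fixed
    (1.022, 1.045), (1.045, 1.036), (1.087, 1.041), (1.103, 1.026),  # hcp relaxed
]

mean_abs_dev = sum(abs(dft - nn) for dft, nn in pairs) / len(pairs)
print(f"mean absolute deviation: {1000.0 * mean_abs_dev:.1f} meV")

# Hypothetical usage of Eq. 7.2: a 32 atom cell with -1.0 eV/atom and a
# defect cell of 31 atoms with a total energy of -30.0 eV.
example = vacancy_formation_energy(e_defect=-30.0, e_perfect=-32.0, n_atoms=32)
```

Averaging over all 24 entries gives a value between 42 and 43 meV, consistent with the number stated in the text.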
7.2.3 Copper surfaces

Ideal low-index surfaces

Having established the reliable description of bulk copper, we discuss various properties of copper surfaces in this section. The most fundamental property of a surface is the surface energy γ, which is the energy required to create a surface by cleaving the bulk material. It is defined as

$$\gamma = \frac{1}{2A} \left( E_{\mathrm{slab}} - N \cdot E_{\mathrm{bulk}} \right), \tag{7.3}$$

where $E_{\mathrm{slab}}$ is the energy of a slab containing $N$ atoms and $E_{\mathrm{bulk}}$ is the energy of an atom in the bulk material. $A$ is the surface area of the slab, and the factor 2 takes into account that two surfaces are present in a slab calculation.

Figure 7.10 Comparison of the DFT and neural network (NN) forces acting on the copper atoms in a 16 atom bulk structure. The atomic configuration has been selected randomly from a molecular dynamics trajectory of bcc copper at 500 K.
[Axes: |F| (eV/Bohr) versus Cu atom number.]

The surface energies obtained with the NN potential and DFT for a variety of surfaces of three different crystal structures (fcc, bcc, and sc) are compared in Table 7.8. The surface energies have been calculated employing both bulk-truncated slabs and fully relaxed slabs in order to investigate the influence of differences in the final relaxed structures. The bulk structures of copper were cleaved along the (111), (100), and (110) planes to construct surfaces that are represented by slabs with a thickness of 10–12 Å, containing eight atomic layers with the central four layers constrained to the bulk lattice positions. For all calculations, a vacuum spacing of about 15 Å in the direction of the surface normal was used. For the DFT calculations it is important to ensure highly converged k-point meshes, since the slab and the bulk structures in Eq. 7.3 have different Brillouin zones.
In the case of the NN, which has been trained on energies obtained with well-converged k-point meshes for both bulk and surface training structures, the energies correspond to dense k-point meshes by construction.

Table 7.8 Comparison of the DFT and neural network (NN) surface energies γ (in meV/Å²) of different copper surfaces. Eight metal layers have been used in the slab models of the surfaces; fcc(110)mr is the missing-row reconstruction of the Cu(110) surface.

| Surface | γ DFT (fixed) | γ NN (fixed) | γ DFT (relaxed) | γ NN (relaxed) |
|---|---|---|---|---|
| fcc(111) | 93.601 | 92.867 | 93.159 | 92.743 |
| fcc(100) | 101.251 | 101.767 | 100.532 | 100.995 |
| fcc(110) | 105.146 | 106.158 | 102.387 | 103.921 |
| fcc(110)mr | 113.364 | 114.484 | 109.927 | 111.689 |
| bcc(111) | 104.765 | 103.013 | 101.458 | 99.551 |
| bcc(100) | 97.742 | 95.309 | 96.972 | 94.602 |
| bcc(110) | 86.519 | 87.973 | 86.364 | 87.438 |
| sc(111) | 77.397 | 78.969 | 77.318 | 78.723 |
| sc(100) | 59.282 | 60.250 | 59.263 | 60.064 |
| sc(110) | 74.173 | 75.895 | 73.842 | 75.658 |

The agreement between the DFT and NN surface energies is very good for both the bulk-truncated and the relaxed surfaces. For each crystal structure the energetic sequence of the investigated low-index surfaces is the same in DFT and the NN, and the average absolute error of the surface energies is as small as 1.34 meV/Å². The structural properties of the relaxed surfaces determined using the NN potential are also in very good agreement with the DFT results. For example, the first layer of the fcc Cu(111) surface exhibits an inward relaxation of about 0.023 Å, corresponding to 1.083 % of the bulk interlayer distance, in DFT, while the NN values are 0.021 Å or 0.995 %.

Surface vacancies and adatoms

Many structural features are important for setting up a model of a real copper surface. Therefore, the NN potential should be able to reliably predict properties of model surfaces containing vacancies and adatoms. In the case of vacancies, the surface vacancy formation energy $E_{\mathrm{vac,surf}}$ needs to be checked for a variety of surface orientations.
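Returning briefly to the surface energies of Table 7.8, the two statements made about them (an average absolute error of 1.34 meV/Å² and an identical energetic sequence in DFT and the NN) can be checked with a few lines of code. This is only a sketch based on the tabulated values; the variable and function names are illustrative, and the ordering is compared within each crystal structure separately.

```python
# (surface, DFT fixed, NN fixed, DFT relaxed, NN relaxed) in meV/Å^2, from Table 7.8.
rows = [
    ("fcc(111)", 93.601, 92.867, 93.159, 92.743),
    ("fcc(100)", 101.251, 101.767, 100.532, 100.995),
    ("fcc(110)", 105.146, 106.158, 102.387, 103.921),
    ("fcc(110)mr", 113.364, 114.484, 109.927, 111.689),
    ("bcc(111)", 104.765, 103.013, 101.458, 99.551),
    ("bcc(100)", 97.742, 95.309, 96.972, 94.602),
    ("bcc(110)", 86.519, 87.973, 86.364, 87.438),
    ("sc(111)", 77.397, 78.969, 77.318, 78.723),
    ("sc(100)", 59.282, 60.250, 59.263, 60.064),
    ("sc(110)", 74.173, 75.895, 73.842, 75.658),
]

# Mean absolute error over fixed and relaxed surfaces together.
abs_errors = [abs(r[1] - r[2]) for r in rows] + [abs(r[3] - r[4]) for r in rows]
mae = sum(abs_errors) / len(abs_errors)


def ranking(column: int, crystal: str) -> list:
    """Surface labels of one crystal structure, sorted by increasing surface energy."""
    subset = [r for r in rows if r[0].startswith(crystal)]
    return [r[0] for r in sorted(subset, key=lambda r: r[column])]


# Compare the DFT and NN orderings for fixed (columns 1, 2) and relaxed (3, 4) slabs.
same_order = all(
    ranking(dft_col, crystal) == ranking(nn_col, crystal)
    for crystal in ("fcc", "bcc", "sc")
    for dft_col, nn_col in ((1, 2), (3, 4))
)

print(f"MAE = {mae:.2f} meV/Å^2, same ordering: {same_order}")
```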
The surface vacancy formation energy is defined as the energy needed to remove a copper atom from the surface,

$$E_{\mathrm{vac,surf}} = E_{\mathrm{slab,vac}} + E_{\mathrm{atom}} - E_{\mathrm{slab}}, \tag{7.4}$$

where $E_{\mathrm{slab,vac}}$ is the energy of the supercell containing the surface vacancy, and $E_{\mathrm{slab}}$ is the energy of the defect-free surface supercell. In order to investigate whether the NN potential predicts the correct relaxed surface structures, and to assess the effect of the relaxation on the vacancy formation energies, we have determined $E_{\mathrm{vac,surf}}$ for the bulk-truncated (unrelaxed or fixed) surfaces as well as for the relaxed surfaces. The values obtained for a number of surfaces of different copper modifications are listed in Table 7.9. The average error is about 85 meV, and there is no dependence of the error on the supercell size. The vacancy formation energies of the relaxed and the fixed structures are represented equally well, and the optimized structures obtained with DFT and the NN are very similar.

Second, additional copper atoms, adsorbed or diffusing at the surface, also need to be described properly. The correct description of the potential experienced by an adatom is relevant for gaining mechanistic insights into atomic rearrangements at surfaces. This is very important for processes such as reconstructions, growth, and adsorption. Models with a copper atom diffusing along copper surfaces have therefore been investigated. For this purpose, the potential energy has been calculated along a number of high-symmetry paths for a copper atom at a vertical distance of 1.85 Å above the Cu(111) surface and the Cu(100) surface. These paths are shown as insets in Fig. 7.11, together with the corresponding NN and DFT energies, in panel (a) for the Cu(111) surface and in panel (b) for the Cu(100) surface.
The NN energy profiles for both surfaces represent the DFT energy profiles very accurately, and the deviations are on the order of a few meV only, which is the typical order of magnitude of the NN potential RMSE. We found results of similar quality also for other distances from the surface.

Table 7.9 Vacancy formation energies (eV) at different copper surfaces represented by eight layer slabs. The fixed data correspond to bulk-like atomic positions. The relaxed data were determined by optimizing the structure with the respective method.

| Surface | Supercell | Evac,surf fixed DFT | Evac,surf fixed NN | Evac,surf relaxed DFT | Evac,surf relaxed NN |
|---|---|---|---|---|---|
| fcc(111) | (2 × 1) | 4.252 | 4.346 | 4.193 | 4.284 |
| fcc(111) | (2 × 2) | 4.413 | 4.596 | 4.367 | 4.550 |
| fcc(111) | (2 × 3) | 4.471 | 4.606 | 4.423 | 4.561 |
| fcc(111) | (3 × 3) | 4.482 | 4.607 | 4.439 | 4.562 |
| fcc(100) | (2 × 1) | 4.023 | 4.123 | 3.977 | 4.075 |
| fcc(100) | (2 × 2) | 4.195 | 4.255 | 4.143 | 4.209 |
| fcc(100) | (2 × 3) | 4.210 | 4.301 | 4.163 | 4.255 |
| fcc(100) | (3 × 3) | 4.224 | 4.332 | 4.189 | 4.287 |
| fcc(110) | (2 × 1) | 4.069 | 4.073 | 4.043 | 4.052 |
| fcc(110) | (2 × 2) | 4.076 | 4.062 | 4.046 | 4.053 |
| fcc(110) | (2 × 3) | 4.093 | 4.070 | 4.056 | 4.058 |
| fcc(110) | (3 × 3) | 4.110 | 4.112 | 4.078 | 4.097 |
| bcc(111) | (2 × 1) | 3.729 | 3.783 | 3.437 | 3.335 |
| bcc(111) | (2 × 2) | 3.695 | 3.831 | 3.433 | 3.504 |
| bcc(111) | (2 × 3) | 3.683 | 3.834 | 3.394 | 3.096 |
| bcc(111) | (3 × 3) | 3.653 | 3.835 | 3.386 | 3.515 |
| bcc(100) | (2 × 1) | 3.858 | 3.988 | 3.755 | 3.862 |
| bcc(100) | (2 × 2) | 3.982 | 4.122 | 3.867 | 3.981 |
| bcc(100) | (2 × 3) | 4.008 | 4.147 | 3.768 | 3.902 |
| bcc(100) | (3 × 3) | 4.033 | 4.166 | 3.803 | 3.932 |
| bcc(110) | (2 × 1) | 4.496 | 4.465 | 4.256 | 4.230 |
| bcc(110) | (2 × 2) | 4.440 | 4.461 | 4.118 | 4.159 |
| bcc(110) | (2 × 3) | 4.416 | 4.463 | 4.205 | 4.140 |
| bcc(110) | (3 × 3) | 4.442 | 4.456 | - | 4.119 |

Figure 7.11 Comparison of the DFT and neural network (NN) energy profiles of a copper atom moving at a distance of 1.85 Å above a clean Cu (2 × 2) surface. The energy profiles along the path given in the inset are shown for a (111) surface (a) and a (100) surface (b).
[Axes: rel. energy (meV/atom) along the path; the paths connect top, fcc, hcp, and bridge sites in (a) and top, bridge, and hollow sites in (b).]
The most stable adsorption site of an additional copper atom at the Cu(111) surface in a (2 × 2) supercell is the fcc site. The binding energy in DFT is 2.903 eV (NN: 2.890 eV), and the optimum vertical distance with respect to the first metal layer is 1.75 Å (DFT) and 1.79 Å (NN).

7.3 Reliability of the neural network potential for a large realistic structure

A slab model for a realistic Cu(111) surface containing various defects like vacancies, adatoms, kinks, and steps has been set up. In total the surface contains 29,443 atoms, as shown in Fig. 7.12. The reliable description of such a realistic and large structure using the NN potential is challenging. This raises the question of how the accuracy of the NN potential can be checked and improved for such systems, which are not directly accessible by DFT calculations. Since a direct comparison of the NN energy and forces with DFT results for the full system is impossible, we have to reduce the system size, and we need to use physical properties that depend only on the local atomic environments.

Figure 7.12 A slab model for a “real” copper (111) surface (29,443 atoms) with a number of defects, such as vacancies, adatoms, steps, and kinks.

Figure 7.13 “Real” copper surface with a number of defects, such as vacancies, adatoms, steps, and kinks [85]. The atomic environments of twelve representative atoms are shown as blue spheres. They are defined by the cutoff radius of the symmetry functions and include all atoms determining the atomic energies. The corresponding clusters are shown in Fig. 7.14.

It is one of the main advantages of NN potentials that they can be trained using DFT data for rather small system sizes, because the atomic energy contributions only depend on the local chemical environment defined by the cutoff radius of the symmetry functions. Once constructed, the NN potentials can still be applied to very large systems containing many thousands of atoms.
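To give a feeling for the size of such a local environment, the following minimal sketch counts the neighbors of an atom in ideal bulk fcc copper within a cutoff of 6 Å, using the DFT lattice constant of 3.630 Å. It is an illustration only: the function name is hypothetical, and the actual environments at the defective surface contain fewer atoms because part of the cutoff sphere lies in the vacuum.

```python
import itertools
import math


def fcc_neighbors_within(a: float, r_cut: float) -> int:
    """Count the fcc lattice sites within r_cut of a site at the origin (origin excluded)."""
    basis = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
    n_cells = int(math.ceil(r_cut / a)) + 1  # enough periodic images to cover r_cut
    count = 0
    for i, j, k in itertools.product(range(-n_cells, n_cells + 1), repeat=3):
        for bx, by, bz in basis:
            x, y, z = a * (i + bx), a * (j + by), a * (k + bz)
            distance = math.sqrt(x * x + y * y + z * z)
            if 0.0 < distance <= r_cut:
                count += 1
    return count


n_bulk = fcc_neighbors_within(a=3.630, r_cut=6.0)
print(n_bulk)  # 78: the first five neighbor shells (12 + 6 + 24 + 12 + 24)
```

In the ideal bulk, the atomic energy of a copper atom would thus depend on the positions of 78 neighbors, independent of the total number of atoms in the system.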
In order to confirm this scalability of the NN potential, a number of representative atoms of the slab model of the realistic surface have first been selected. Then, we have extracted clusters centered around these atoms employing the cutoff radius of the symmetry functions (6 Å), which defines the atomic environments highlighted as blue spheres in Fig. 7.13. The resulting clusters are shown in Fig. 7.14 and contain all atoms that determine the atomic energies of their central atoms. Unfortunately, there is no uniquely defined atomic energy in DFT, and consequently the atomic energies obtained from the NN cannot be compared to DFT values. Nevertheless, while the energies of the clusters cannot be used to assess the accuracy of the NN potential, the forces can. The NN and DFT forces acting on the central atoms of these clusters can be directly compared to estimate the accuracy of the NN potential for the large surface structure.

A comparison of the absolute forces acting on the central atoms of the twelve test clusters is shown in Fig. 7.15. The NN potential is able to predict the forces in very good qualitative agreement with DFT. Quantitatively, there are still some differences between the NN and the DFT forces. A further analysis of this discrepancy revealed that the forces depend on a larger chemical environment than the atomic energy contributions. This is a necessary consequence of the functional relation between the potential energy and the atomic positions. By definition, the atomic energy $E_i$ depends only on the positions of the atoms inside the cutoff radius of the symmetry functions, while the forces also depend on all neighbors of these atoms. The reason is that the force acting on atom $i$ along the Cartesian direction $\alpha \in \{x, y, z\}$ is given as the negative derivative of the total energy $E$ with respect to the coordinate $R_{i,\alpha}$, which is the sum of the derivatives of all atomic energies,

$$F_{i,\alpha} = -\frac{\partial}{\partial R_{i,\alpha}} E = -\frac{\partial}{\partial R_{i,\alpha}} \sum_j E_j . \tag{7.5}$$

Therefore, the derivatives of the energies of all atoms $j$ having atom $i$ inside their cutoff sphere enter the force $F_{i,\alpha}$. Each such atom $j$ is at most a distance $R_c$ away from atom $i$, and its energy $E_j$ in turn depends on all atoms within $R_c$ of atom $j$. Consequently, the force $F_{i,\alpha}$ depends on the positions of all atoms inside a sphere of radius $2 \cdot R_c$ around atom $i$. This means that the clusters shown in Fig. 7.14 are not large enough to provide converged NN forces at the central atoms, and that clusters with a radius of 6 Å are not suitable to represent the atomic environments of the extended surface. Having established this, we have calculated the NN and DFT forces for clusters with an extended radius of 12 Å. The corresponding forces acting on the central atoms are shown in Fig. 7.16. It can be seen that the agreement between the NN and DFT forces is excellent and improved with respect to the smaller 6 Å clusters.

To verify the convergence of the NN forces, the NN forces for clusters with radii of 6, 9, 12, and 30 Å and for the full slab are compared in Fig. 7.17. The NN forces on the central atoms for cluster radii below 12 Å, i.e., 6 and 9 Å, differ slightly from those obtained for radii of 12 Å and larger. Contributions from atoms that are more than 12 Å away from the central atom do not enter $F_{i,\alpha}$. Therefore, the NN forces obtained for the 12 Å, 30 Å, or larger clusters are exactly the same as the NN forces in the full slab. Consequently, we have shown using smaller subsystems that the NN potential is able to describe the PES of large systems very accurately. The numbers of copper atoms within the different cluster radii are presented in Table 7.10.

The remaining difference between the DFT and NN forces in the 12 Å clusters can be ascribed to small inaccuracies in the NN fits. One reason is that these clusters have not been included in the reference data set. The deviations between DFT and the NN could be decreased by including them in the training set.
This approach offers a systematic way to improve NN potentials for large systems, which as a whole are impossible to access by DFT.

Finally, it should be noted that although the atomic energies and the forces have a different effective dependence on the neighboring atoms, the total energy and the forces are still fully consistent, as the forces are the exact analytic derivative of the total energy expression, as given in Eq. 7.5. In practical applications the different effective range is relevant only if the cutoffs used are too small. For the present case it can be seen that the forces acting on the central atoms obtained for clusters with 6 and 12 Å radii are not substantially different. The remaining differences are mainly important for checking the quality of the potential. For the construction of the NN potential this is not relevant, because the training set contains a wide range of sufficiently large periodic and non-periodic structures anyway. Choosing a larger cutoff would yield only minor improvements at a higher computational cost. Here, the excellent quality of the reported copper PES shows that the employed cutoff of 6 Å is appropriate for this system.

Figure 7.14 Structures of the 12 clusters (numbered 1 to 12) extracted from the large surface in Fig. 7.13 using the cutoff radius of 6 Å [85]. The forces acting on the central atoms shown in blue can be used to estimate the accuracy of the neural network (NN) potential for the extended system. A comparison of the forces obtained in DFT calculations and from the NN potential is shown in Fig. 7.15.

Figure 7.15 Comparison of the DFT and neural network (NN) forces acting on the central atoms of the 12 clusters shown in Fig. 7.14. The clusters contain all atoms within a radius of 6 Å around the central atom.
[Axes: |F| (eV/Bohr) versus cluster number.]
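The statement that the forces depend on all atoms within 2·Rc, while the atomic energies depend only on atoms within Rc, can be illustrated with a one-dimensional toy model. The sketch below is not the NN potential of this work: the cutoff function, the many-body atomic energy E_j = ρ_j² and all names are assumptions chosen only to make the effective range visible, and the forces are obtained by numerical differentiation.

```python
import math

R_C = 2.5  # cutoff radius of the toy model (arbitrary length units)


def f_cut(r: float) -> float:
    """Smooth cutoff function that vanishes for r >= R_C."""
    return 0.5 * (math.cos(math.pi * r / R_C) + 1.0) if r < R_C else 0.0


def total_energy(xs: list) -> float:
    """E = sum_j E_j with E_j = rho_j**2 and rho_j = sum_k f_cut(|x_j - x_k|)."""
    energy = 0.0
    for j, xj in enumerate(xs):
        rho = sum(f_cut(abs(xj - xk)) for k, xk in enumerate(xs) if k != j)
        energy += rho * rho
    return energy


def force_on(xs: list, i: int, h: float = 1e-5) -> float:
    """Force on atom i from a central finite difference of the total energy."""
    plus, minus = list(xs), list(xs)
    plus[i] += h
    minus[i] -= h
    return -(total_energy(plus) - total_energy(minus)) / (2.0 * h)


# Linear chain with unit spacing; the atom at x = 0 plays the role of the central atom.
chain = [float(x) for x in range(-8, 9)]
center = chain.index(0.0)
f_ref = force_on(chain, center)

# Displace an atom between R_C and 2*R_C from the central atom ...
inner = list(chain)
inner[chain.index(4.0)] += 0.2
f_inner = force_on(inner, center)

# ... and an atom beyond 2*R_C.
outer = list(chain)
outer[chain.index(6.0)] += 0.2
f_outer = force_on(outer, center)

print(f"change for atom inside 2*R_C:  {f_inner - f_ref:+.4f}")
print(f"change for atom outside 2*R_C: {f_outer - f_ref:+.1e}")
```

Displacing the atom at a distance of 4 (between Rc and 2·Rc) changes the force on the central atom, whereas displacing the atom at a distance of 6 (beyond 2·Rc = 5) leaves it unchanged up to numerical noise. This mirrors the convergence of the NN forces for cluster radii of 12 Å and larger.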
Figure 7.16 Comparison of the DFT and neural network (NN) forces acting on the central atoms of 12 clusters cut from the slab shown in Fig. 7.13 using a radius of 12 Å.
[Axes: |F| (eV/Bohr) versus cluster number.]

Figure 7.17 Convergence of the neural network (NN) forces acting on the central atoms of the 12 clusters cut from the slab shown in Fig. 7.13 with radii of 6, 9, 12, and 30 Å, and the NN forces of the slab. The number of copper atoms within the different cutoff radii (Rc) is listed in Table 7.10.
[Axes: |F| (eV/Bohr) versus cluster number.]

Table 7.10 Numbers of copper atoms within the different cutoff radii (Rc) used in Fig. 7.17 for the 12 copper clusters cut from the slab shown in Fig. 7.13.

| Cluster | 6 Å | 9 Å | 12 Å | 30 Å |
|---|---|---|---|---|
| 1 | 35 | 106 | 255 | 3201 |
| 2 | 43 | 128 | 306 | 3534 |
| 3 | 27 | 82 | 199 | 3077 |
| 4 | 40 | 113 | 263 | 3416 |
| 5 | 42 | 124 | 291 | 3455 |
| 6 | 38 | 110 | 259 | 3213 |
| 7 | 43 | 128 | 303 | 3231 |
| 8 | 50 | 144 | 328 | 3023 |
| 9 | 37 | 106 | 243 | 3111 |
| 10 | 29 | 92 | 225 | 3008 |
| 11 | 43 | 126 | 293 | 2894 |
| 12 | 40 | 124 | 300 | 2744 |
| average | 39 | 115 | 272 | 3159 |

8 A Multicomponent Neural Network Potential for Zinc Oxide

The next logical step towards an accurate atomistic Cu/ZnO NN potential is the first application of the NN method to a multicomponent system, namely zinc oxide. Multicomponent systems may exhibit significant charge transfer resulting in long-ranged electrostatic interactions. This chapter presents the first application of an extended multicomponent NN method.

8.1 Neural network potentials for multicomponent systems

Today, the proper theoretical description of realistic ZnO structures, especially of the polar oxygen-terminated and zinc-terminated surfaces, O–ZnO and Zn–ZnO, is still a challenging task that has not yet been completely accomplished. Polar surfaces are electrostatically unstable by nature and must undergo a reconstruction that removes the surface dipole moment.
Various theoretical and experimental studies during the last decade have contributed to our understanding of the reconstruction of the polar zinc oxide surfaces [105–109]. For the Zn-terminated polar ZnO surface it has been suggested by Dulub, Diebold, and Kresse that the electrostatic instability can be removed by introducing triangular-shaped islands with step edges exhibiting a particular orientation [108, 109]. The step edges change the stoichiometry and thus the total charge of the polar surface. As a result, the instability is effectively removed. These previous reports show the significance of charge transfer in ZnO systems, and the construction of an NN potential for the ZnO system should take charge transfer into account.

Therefore, we have to assume that the standard high-dimensional NN scheme of Ref. [53], in which the atomic charges do not enter the training data, cannot adequately describe multicomponent systems with significant charge transfer. This assumption rests on the fact that in systems of arbitrary chemical composition, charge transfer and the resulting long-range electrostatic interactions can play an important role and cannot be captured by the short-range energy $E_{\mathrm{short}}$ alone. Consequently, a second set of atomic NNs has been constructed that has been trained to predict environment-dependent charges $Q_i$ at the atomic sites. The total energy expression of the NN potential now consists of two parts: a short-range energy $E_{\mathrm{short}}$ describing the effects of local electronic structure changes due to chemical bonding, and the long-range electrostatic energy contribution $E_{\mathrm{elec}}$. More details about the multicomponent extension of the NN method can be found in Chapter 5.

8.2 Reference data set

To construct a neural network (NN) potential for the multicomponent ZnO system, a reference data set was generated in a similar fashion as discussed previously for the case of the copper structures, i.e., following the self-consistent iterative two-fit process of Sec. 6.2.
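As a rough illustration of the energy partition introduced in Sec. 8.1, the sketch below evaluates the electrostatic energy of a non-periodic cluster by a direct Coulomb sum over atomic point charges and subtracts it from a total energy to obtain the remaining short-range part. Several assumptions are made here that are not taken from this work: bare point charges without any screening or damping, positions in Å, energies in eV, and purely hypothetical numbers in the usage example. Periodic structures would require Ewald summation instead.

```python
import math

COULOMB_EV_ANGSTROM = 14.399645  # e^2 / (4 * pi * eps0) in eV * Å


def coulomb_energy(positions: list, charges: list) -> float:
    """Direct pairwise Coulomb energy (eV) of point charges (e) at the given positions (Å)."""
    energy = 0.0
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            distance = math.dist(positions[i], positions[j])
            energy += COULOMB_EV_ANGSTROM * charges[i] * charges[j] / distance
    return energy


def short_range_energy(e_total: float, positions: list, charges: list) -> float:
    """E_short = E_total - E_elec, i.e. the part left for the short-range NNs."""
    return e_total - coulomb_energy(positions, charges)


# Hypothetical Zn-O pair: charges of +/-0.3535 e at a separation of 2.0 Å
# and an arbitrary total energy of -5.0 eV.
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
charges = [0.3535, -0.3535]
e_elec = coulomb_energy(positions, charges)
e_short = short_range_energy(-5.0, positions, charges)
print(f"E_elec = {e_elec:.4f} eV, E_short = {e_short:.4f} eV")
```

Because the charges are environment-dependent in the actual method, the corresponding forces contain additional terms from the derivatives of the charges, which this energy-only sketch does not cover.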
The first NN potentials were based on ZnO clusters, crystalline and amorphous bulk structures, and surface models. The first ZnO clusters of stoichiometry Zn$_N$O$_N$ with $N = 1, \ldots, 40$ were created randomly: the atomic positions of the clusters were generated for a given cluster radius based on the definition of reasonable atomic separations, i.e., minimum and maximum bond lengths for O–O, O–Zn, and Zn–Zn bonds. Bulk structures were derived from the ideal wurtzite, zincblende (ZnS), sodium chloride (NaCl), and cesium chloride (CsCl) crystal structures. Apart from these ideal crystal structures, structures have also been generated by systematically varying the lattice parameters (atomic volumes from 50 to 110 Bohr³).

Next, ideal surfaces have been included in the reference data set. They have been represented by slab models that were generated by truncation of the bulk equilibrium wurtzite lattice. Most of the included surface structures have been based on the ground state wurtzite lattice parameters a, c, and u as calculated using DFT with the PBE functional (see also Ref. [95]). The equilibrium unit cell volume was found to be 83.064 Bohr³ per atom. Additionally, some surface models have been based on scaled wurtzite crystals with corresponding unit cell volumes between 70 and 100 Bohr³ per atom. In particular, slab models of the ideal $(10\bar{1}0)$ and $(11\bar{2}0)$ surfaces containing 10 bulk ZnO layers and a vacuum region of at least 12 Å were constructed and included in the data set.

Several NN fits have been generated using this first input data. The NN-Two-Fit technique has been employed to improve the quality of the NN fit by searching for additional structures that were not accounted for by the training set.
The good fits, e.g., fits with energy RMSEs of less than about 10 meV/atom, have subsequently been used to perform MD simulations of large bulk and surface structures in the NVT statistical ensemble at temperatures between 300 and 3000 K (imposed by a Berendsen thermostat). As discussed in Sec. 6.2, the energy contribution of each atom only depends on the local (atomic) chemical environment. Consequently, a new atomic environment can be added to the training set in the form of a small cluster whose radius corresponds to the range of the symmetry functions (6 Å, see Sec. 8.3.1). Structures generated according to this procedure were automatically recalculated with DFT if the comparison of the energies and forces of the two NN fits showed significantly different values, i.e., if the difference of the two energies of the same structure was much larger than 10 meV/atom (see also Sec. 6.2.2). These structures were then added to the training set.

All reference DFT calculations have been carried out with the FHI-aims package [57] employing a basis set of numerical atomic orbitals, the PBE functional [98], and the computational set-up described in Sec. 6.1. The final reference data set comprises DFT calculations of 38,750 ZnO structures spanning a total energy range of 1.012 eV/atom, which is visualized as the energy density of states (EDOS) of the data set in Fig. 8.1. The atomic forces in the data set span a range of 8.805 eV/Bohr for the O atoms and of 7.494 eV/Bohr for the Zn atoms with respect to the optimized structure. In detail, the reference data set contains 7,366 clusters, 27,287 bulk structures (including crystal structures, random structures, and snapshots from MD simulations), and 4,097 slab models. This data set was split into a training set (90 %) to fit the weight parameters of the NNs and an independent test set (10 %) to check the predictive power. The compositions of the training set and the test set are given in Table 8.1.
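The selection step of the two-fit procedure described above can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the function name, the default threshold, and the example energies are hypothetical, only the energy criterion is shown (the comparison of the forces is omitted), and the text only states that the difference should be much larger than 10 meV/atom.

```python
def select_for_dft(energies_fit_a: list, energies_fit_b: list, threshold: float = 0.010) -> list:
    """Indices of structures whose two NN energies (eV/atom) differ by more than threshold."""
    return [
        index
        for index, (e_a, e_b) in enumerate(zip(energies_fit_a, energies_fit_b))
        if abs(e_a - e_b) > threshold
    ]


# Hypothetical energies (eV/atom) of three MD snapshots predicted by two NN fits.
fit_a = [-3.070, -3.100, -2.950]
fit_b = [-3.072, -3.050, -2.951]

candidates = select_for_dft(fit_a, fit_b)
print(candidates)  # only the second structure (index 1) exceeds the threshold
```

Structures flagged in this way are the ones that would be recalculated with DFT and added to the training set, since the disagreement of the two fits indicates a poorly sampled region of configuration space.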
The number of atoms and the number of atomic force components in the training set, i.e., the total numbers of training data points, are 602,050 and 1,806,150, respectively.

Figure 8.1 Energy density of states (EDOS) for the reference data set of zinc oxide structures.
[Axes: number of structures versus energy range (eV/atom); bulk structures, clusters, and slabs are shown separately.]

8.3 Neural network potential for zinc oxide

8.3.1 Construction of the neural network potential for zinc oxide

In order to accurately calculate the energy contributions using the analytic expression of the neural network, symmetry functions with a cutoff radius of 6 Å were used as input nodes for the environment-dependent atomic energies and atomic charges. The symmetry functions used to describe the local atomic environments for the short-range energy and for the atomic charges have been chosen to be identical. For oxygen and zinc atoms, the vector of symmetry functions has been set up in terms of radial functions and angular functions. The same number of radial symmetry functions was employed for both the short-range and the long-range energies for the three different bond types O–O, Zn–Zn, and O–Zn, namely 14 functions for oxygen and 13 for zinc. The angular symmetry functions were also chosen to be independent of the combination of elements and sum up to 128 functions for oxygen and 116 functions for zinc. In total, the number of symmetry functions, i.e., input nodes, is 142 for oxygen and 129 for zinc. The values of the symmetry functions are provided in Tables A.3, A.4, A.5, and A.6 in the appendix.

Table 8.1 Composition of the training and the test set for the ZnO system.

| | Training set | Test set |
|---|---|---|
| clusters | 6,694 | 672 |
| bulk | 24,514 | 2,773 |
| slabs | 3,692 | 405 |
| No. of structures | 34,900 | 3,850 |
| No. of atoms | 602,050 | 62,640 |
| No. of forces | 1,806,150 | 187,920 |

In this work, atomic charges obtained from DFT calculations using the Hirshfeld method [92] have been used as reference data for the NN potential. However, any charge partitioning scheme can be applied, and higher multipoles could, in principle, also be included. The atomic charges are then used to calculate the long-range electrostatic energy of the system by standard methods, i.e., by direct evaluation of Coulomb’s law for molecules or clusters and by Ewald summation [110] for periodic structures.

In the reference DFT calculations the short-range energy is not directly accessible. Instead, in order to optimize the weight parameters of the short-range atomic NNs, the long-range electrostatic contribution to the total energy must first be removed from the DFT reference data. For this purpose, the electrostatic energies have been computed from the atomic charges and subtracted from the respective total DFT values. However, for the separation of the forces into a short-range and an electrostatic part it has to be taken into account that the atomic charges are not fixed but depend on the chemical environments, and this dependence is not available from DFT calculations.

Details of the energies and charges in the training set of the zinc oxide system are presented in Table 8.2. The energy ranges of the short-range energy $E_{\mathrm{short}}$, the long-range electrostatic energy $E_{\mathrm{elec}}$, and the total energy $E_{\mathrm{tot}}$ are reported for the NN schemes with and without electrostatics. For the NN including electrostatics, the energy range of $E_{\mathrm{short}}$ is 1.399 eV/atom, and that of $E_{\mathrm{elec}}$ is 0.649 eV/atom. The total energy range is naturally identical for both NNs (with and without $E_{\mathrm{elec}}$) and has a value of 1.012 eV/atom. The average atomic charges of the zinc and oxygen atoms are +0.3535 e and −0.3535 e, respectively.

Table 8.2 Details of the energies (eV/atom) and charges (e) in the training set of the zinc oxide system.
Eshort , Eelec , and Etot are the short-range energy, the long-range electrostatic energy, and the total energy, respectively. The reference atomic charges have been derived from the DFT calculations employing the Hirshfeld partitioning scheme [92]. The atomic charges are used to calculate the electrostatic energy of the system by standard methods such as Ewald summation [110]. Average Range -2.516 -3.073 1.012 0.000 0.000 0.000 0.000 -3.528 -2.516 -3.073 1.012 Eshort -2.862 -1.463 -2.407 1.399 Eelec -1.068 -0.419 -0.667 0.649 Etot -3.528 -2.516 -3.073 1.012 Qmin Qmax Average Range O -0.4995 0.1919 -0.3535 0.6914 Zn -0.0564 0.5870 0.3535 0.6434 Emin Emax Eshort -3.528 Eelec Etot NN without electrostatics NN with electrostatics Atomic charges (e) 82 8.3.2 A neural network potential for zinc oxide In order to determine the most suitable NN, various fits for different network architectures and with different initial parameters have been performed. As in the case of copper in Sec. 7, different numbers of hidden layers and nodes per layer have been tested. Further, the initial weight parameters have been initialized randomly and have been normalized applying the scheme of Nguyen and Widrow [111]. Here, we also compared the RMSE values of NN potentials constructed according to different NN methodologies, namely with and without regarding atomic forces for the fitting. The four different NN approaches thus are the following: • NN potentials not using atomic forces: (i) original high-dimensional NN scheme, (ii) high-dimensional NN scheme with electrostatic extension. • NN potentials using atomic forces: (i) original high-dimensional NN scheme, (ii) high-dimensional NN scheme with electrostatic extension. Numbers of nodes of 15, 20, 25, and 30 nodes per hidden layer have been evaluated. Again, both NN schemes have used the same set of symmetry functions with a number of 142 functions for oxygen, and 129 for zinc. 
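The separation of the DFT reference energy into a short-range and an electrostatic part, as described in Sec. 8.3.1, can be sketched for a non-periodic cluster as follows. This is a minimal illustration under stated assumptions: positions are taken to be in Å, charges in units of e, the function names are hypothetical, and periodic structures would require an Ewald summation that is not shown here.

```python
import math

# e^2 / (4 pi eps0) expressed in eV * Angstrom (physical constant)
COULOMB_EV_ANGSTROM = 14.399645


def coulomb_energy_ev(positions, charges):
    """Direct pairwise evaluation of Coulomb's law for a cluster.

    positions: list of (x, y, z) tuples in Angstrom (assumed unit)
    charges:   list of atomic charges in units of e, e.g. Hirshfeld charges
    """
    energy = 0.0
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r_ij = math.dist(positions[i], positions[j])
            energy += COULOMB_EV_ANGSTROM * charges[i] * charges[j] / r_ij
    return energy


def short_range_reference_ev(e_dft_total_ev, positions, charges):
    """Short-range training target: total DFT energy minus electrostatic energy."""
    return e_dft_total_ev - coulomb_energy_ev(positions, charges)


if __name__ == "__main__":
    # Toy Zn-O pair carrying the average Hirshfeld charges of Table 8.2.
    # The distance and the total energy are made-up illustration values.
    positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    charges = [0.3535, -0.3535]
    print(coulomb_energy_ev(positions, charges))
    print(short_range_reference_ev(-6.0, positions, charges))
```

Because the electrostatic energy of oppositely charged neighbours is negative, removing it shifts the short-range targets upwards, in line with the different ranges of Eshort and Etot reported in Table 8.2.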
The RMSE values of the energies and forces of the NN potentials fitted without forces are shown in Table 8.3. The energy RMSEs for the NN without long-range electrostatic interactions are comparable to those of the NN with electrostatic extension and differ only by a few meV/atom, while the RMSEs of the forces are even smaller for the potential without electrostatic extension. The force errors differ by about 40 meV/Bohr. For the data in Table 8.4 the optimization of the weight parameters was done for the training set including the atomic forces. The energy RMSEs for the NNs with and without electrostatic energy are almost identical. However, the RMSE values of the forces are smaller for the NN without electrostatic energy. In general, the energy and force RMSEs are smaller for the NN fits using the forces for the optimization.

Table 8.3 Root mean squared errors (RMSEs) of the energies and forces for the zinc oxide system for different neural network (NN) architectures obtained with and without the electrostatic part of the atom-based approach. The optimization of the weight parameters was done for the training set not including the atomic forces. X represents the number of input nodes, which depends on the specific element as discussed in the text. See Table 8.4 for the same analysis including forces in the NN training.

                 NN without electrostatics              NN with electrostatics
Network          ERMSE (meV/atom)  FRMSE (meV/Bohr)     ERMSE (meV/atom)  FRMSE (meV/Bohr)
architecture     Train    Test     Train     Test       Train    Test     Train     Test
X-15-15-1        1.82     2.62      98.26    97.59      2.14     3.11     141.09    142.38
X-20-20-1        2.10     3.43     106.69    15.02      2.00     2.83     140.83    141.02
X-25-25-1        2.09     2.85      96.41    97.06      2.17     2.89     133.16    133.14
X-30-30-1        1.85     2.71      96.30    97.46      2.44     3.21     137.78    138.04

Table 8.4 Root mean squared errors (RMSEs) of the energies and forces for the zinc oxide system for different neural network (NN) architectures obtained with and without the electrostatic part of the atom-based approach. The optimization of the weight parameters was done for the training set including the atomic forces. X represents the number of input nodes, which depends on the specific element as discussed in the text. See Table 8.3 for the same analysis without using the force information for the NN training.

                 NN without electrostatics              NN with electrostatics
Network          ERMSE (meV/atom)  FRMSE (meV/Bohr)     ERMSE (meV/atom)  FRMSE (meV/Bohr)
architecture     Train    Test     Train     Test       Train    Test     Train     Test
X-15-15-1        2.30     2.80     93.00     94.85      2.96     3.53     136.96    137.21
X-20-20-1        2.02     2.73     91.70     94.06      2.60     2.90     135.16    134.51
X-25-25-1        2.08     2.58     90.07     89.94      2.51     3.49     136.37    138.74
X-30-30-1        1.93     2.67     91.67     94.84      2.59     3.33     133.82    134.36

The results in Tables 8.3 and 8.4 show that electrostatic interactions due to charge transfer are not significant for the ZnO system. The electrostatic energy part increases the complexity of the NN optimization. The energy range spanned by the short-range energies in the reference set is larger than the energy range of the total energies (Table 8.2), which additionally makes the fit more difficult. Also note that the Hirshfeld charge partitioning yields average atomic charges (±0.3535 e) that are far from the chemically intuitive charges in zinc oxide (±2.0 e). All these observations might be reasons for the higher RMSEs of the NN with electrostatic energy. The electrostatic extension of the NN method might be improved by choosing a different charge partitioning scheme. Additionally, instead of using the total atomic charges, screening functions may be used to exclude the short-ranged part of the electrostatic interactions from the Ewald summation [96].

Since the quality of the NN fit including electrostatic interactions is almost equivalent to that of the NN fit without the charge part, the NN with electrostatic energy has been selected for the further studies.
Structures that are not contained in the reference data set may exhibit stronger charge transfer and could not be properly described without electrostatic interactions. The selected NN fit employs an X-30-30-1 architecture and has been optimized including the force information. The RMSEs for the short-range energies and forces are about 2.59 meV/atom and 133.82 meV/Bohr for the training set, and about 3.33 meV/atom and 134.36 meV/Bohr for the test set, respectively.

8.3.3 Zinc oxide clusters

The accuracy of the atomic forces is essential for performing MD simulations. In Fig. 8.3 the NN forces of a Zn15O15 cluster are compared with the DFT values. A similar agreement has also been found for bulk systems. The precise representation of the atomic charges is illustrated in Fig. 8.5, which shows a comparison of the NN and DFT Hirshfeld charges of the zinc and oxygen atoms in the same cluster.

Figure 8.2 Comparison of the NN potential and the DFT energies of random cluster structures of the composition Zn40O40.

Figure 8.3 Comparison of the absolute forces of DFT and the neural network (NN) acting on the atoms in a Zn15O15 cluster. The cluster has been chosen randomly from a molecular dynamics simulation at 1000 K. (Two panels, zinc atoms and oxygen atoms: |F| in eV/Bohr versus atom number, for DFT and NN.)

Figure 8.4 Comparison of the (a) x, (b) y and (c) z force components of the DFT and the neural network (NN) acting on the atoms in a Zn15O15 cluster. The cluster has been chosen randomly from a molecular dynamics simulation at 1000 K. (Each panel: force component in eV/Bohr versus Zn and O atom number, for DFT and NN.)

Figure 8.5 Comparison of the DFT and neural network (NN) atomic charges for the same Zn15O15 cluster as in Fig. 8.3. (Two panels, zinc atoms and oxygen atoms: atomic charge in e versus atom number, for DFT and NN.)

Table 8.5 Comparison of the lattice parameters, bulk moduli (GPa), and cohesive energies per formula unit of several crystal structures of ZnO obtained with the neural network (NN) and density-functional theory (DFT).

              Ecoh (eV)         lattice parameters               bulk moduli (GPa)
              DFT     NN        DFT             NN               DFT     NN
CsCl          5.584   5.588     a = 2.688 Å     a = 2.680 Å      159     160
NaCl          6.747   6.745     a = 4.328 Å     a = 4.344 Å      165     169
zincblende    7.043   7.041     a = 4.616 Å     a = 4.612 Å      129     133
wurtzite      7.057   7.054     a = 3.278 Å     a = 3.278 Å      129     129
                                c/a = 1.614     c/a = 1.614
                                u = 0.379       u = 0.379

8.3.4 Bulk zinc oxide structures

The obtained NN potential is applicable to simulations of crystalline and amorphous bulk structures. The accuracy for these structure types has been checked by investigating a wide range of systems. In Fig. 8.6 the energy vs. volume curves for several crystal structures of ZnO are shown. As in DFT and experiment, the wurtzite structure is found to be energetically most stable, and even the tiny energy difference to the zincblende structure is correctly resolved. The potential is also valid for compressed structures, as the smallest investigated volumes in Fig. 8.6 correspond to pressures of about 70 GPa. The obtained lattice parameters and cohesive energies of the investigated phases are given in Table 8.5. As a benchmark, Fig. 8.6 compares the relative DFT energies (diamond symbols) of the different ideal crystal structures of ZnO with the NN energies (solid lines). The NN and DFT energies of 40 random bulk structures containing 4 zinc and 4 oxygen atoms are shown in Fig. 8.7.
These structures, which have not been included in the training set, illustrate the reliability for amorphous systems and very high temperatures. The performance of the NN for random ZnO bulk structures with up to 24 atoms per unit cell and random cluster structures with 80 atoms is depicted in Figs. 8.7 and 8.2. Note that the computational time necessary for DFT calculations of such structures is about 6.5 hours and 10.5 hours (on 8 cores), respectively. The time used for the NN computations, on the other hand, is just one second for the bulk structures and two seconds for the cluster structures (on a single CPU).

8.3.5 Zinc oxide surfaces

The relative energies of a thermally distorted ZnO(10-10) surface slab with 8 layers, resulting from MD simulations at temperatures between T = 300 K and 3,000 K, as calculated using the NN and DFT, are shown in Fig. 8.8. The results demonstrate that the zinc oxide potential is in good agreement with the reference DFT data. Moreover, the CPU time for the NN calculations is much lower than for the corresponding DFT calculations.

Figure 8.6 Comparison of the NN potential and the DFT energies of ideal crystal structures of ZnO.

Figure 8.7 Comparison of the NN potential and the DFT energies of random bulk structures of the composition Zn12O12.

Figure 8.8 Comparison of the NN potential and the DFT energies of some thermally distorted ZnO(10-10) surface structures.

9 Neural Network Potentials for Ternary Systems

With the extension of the NN potential method to general multicomponent systems in Chapter 8, it is now possible to turn to the ternary Cu/Zn/O system, which lies at the center of our interests. Since the ZnO potential was the first application of the NN method to multicomponent systems, the Cu/ZnO potential of this chapter is the first ternary NN potential.
The configuration space that has to be represented and the number of parameters for a multicomponent potential both grow as O(N!), where N is the number of chemical species. It therefore has to be determined whether the same accuracy can be achieved for a ternary potential as for a binary one.

9.1 Construction of a neural network potential-energy surface for copper/zinc oxide

The construction of generally applicable NN potentials for ternary systems, like Cu/Zn/O, is a difficult task due to the large configuration space. Therefore, the NN potential reported here is restricted to the description of copper clusters at zinc oxide surfaces. In order to use the potential in MD simulations of clusters of a size comparable to experiment, it should be able to describe very large systems containing tens of thousands of atoms. Additionally, it needs to be reliable for many subsystems, like copper particles, zinc oxide, a variety of surfaces of both systems, and their interfaces, including a manifold of defects. All these subsystems need to be described accurately at different temperatures to ensure that MD simulations yield accurate results under different conditions. Moreover, chemical processes can take place at the interface, resulting in structures that differ strongly from the combination of the ideal subsystems. Oxygen and zinc atoms could diffuse into the copper cluster and give rise to oxidation of the cluster and alloy formation. Additionally, copper clusters could also penetrate the ZnO surface, causing substantial structural changes in the ZnO support including significant mass transport [14].

9.2 Reference data set

The training data points used for the construction of the copper potential of Chapter 7 and for the zinc oxide potential of Chapter 8 have been reused for the training of the ternary potential [85, 95]. In addition, the data set has been extended by configurations of the binary subsystems Cu/Zn and Cu/O, and by ternary Cu/Zn/O structures.
Since the application of the ternary potential will focus on the Cu/ZnO interface, no attempt has been made to construct a universal atomistic potential for the three elements. Consequently, we have restricted ourselves to Cu/ZnO structures derived from interface models, and molecular structures, such as the dioxygen molecule (O2), have not been included in the training set. Both the construction of the reference data set and its composition are similar to what has been discussed for copper and zinc oxide in Chapters 7 and 8. The data set comprises bulk structures (both ideal and thermally distorted), slab models, and clusters that were derived from MD snapshots. See Chapter 6.2 for a detailed explanation of the iterative two-fit technique used for the refinement of the training set, which ensures that only relevant structures are added to the data set. The total number of data points is around 100,000, where the individual structures consist of up to 100 atoms. The exact composition of the data set is listed in Table 9.1. About 90% of the data points have entered the training set. The remaining structures form an independent test set that has been used to assess the quality of the fit.

As for the copper and the zinc oxide potentials, various NN architectures have been explored for the Cu/ZnO potential. However, in each case the same architecture has been used for the three different chemical species in order to reduce the number of possibilities to a feasible limit. Additionally, the same set of symmetry functions has been employed for copper, zinc and oxygen, although it may be possible to tune the symmetry function set to reduce the number necessary for a given accuracy. In this chapter, two sets of symmetry functions as input nodes have been tested, i.e., 132 and 156 functions, with different NN architectures. The energy and force RMSEs of the 132 symmetry functions are slightly higher than the values for the 156 symmetry functions.
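For reference, the two error measures quoted throughout this chapter can be sketched as follows. The exact normalization used for the fits is not restated in this section, so the sketch assumes that energy errors are divided by the number of atoms of each structure before averaging and that force errors are averaged over all Cartesian force components; the function names are hypothetical.

```python
import math


def energy_rmse_mev_per_atom(e_nn_ev, e_dft_ev, n_atoms):
    """RMSE of per-atom energies in meV/atom.

    e_nn_ev, e_dft_ev: total energies per structure in eV
    n_atoms:           number of atoms in each structure
    """
    squared = [((e_nn - e_dft) / n) ** 2
               for e_nn, e_dft, n in zip(e_nn_ev, e_dft_ev, n_atoms)]
    return 1000.0 * math.sqrt(sum(squared) / len(squared))


def force_rmse(f_nn, f_dft):
    """RMSE over flat lists of Cartesian force components (same unit as input)."""
    squared = [(a - b) ** 2 for a, b in zip(f_nn, f_dft)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up numbers for two structures with 2 and 4 atoms.
    print(energy_rmse_mev_per_atom([-10.0, -20.0], [-10.02, -20.0], [2, 4]))
    print(force_rmse([0.1, -0.1, 0.0], [0.0, 0.0, 0.0]))
```

Evaluating both functions separately on the training set and on the independent test set gives the kind of paired values used to judge overfitting.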
The errors are listed in Table 9.2 and Table 9.3, respectively.

Table 9.1 Composition of the training set and the test set for the construction of a neural network potential for the ternary system Cu/ZnO.

             Training set                     Test set
System       clusters    bulk      slabs      clusters    bulk     slabs
Cu              5,810   10,604    12,484           670   1,125     1,405
CuO                37      909         -             5     106         -
CuZn                4      588     1,122             -      61       128
ZnO             7,105   24,572     3,695           833   2,715       402
CuZnO          18,356        -     2,317         2,033       -       263

The resulting RMSEs for the cohesive energies and atomic forces as a function of the NN architecture are given in Table 9.3. The selected NN fit (highlighted in bold font), i.e., the fit with the smallest errors for energies and forces in the test set, has been achieved for an architecture of three hidden layers with 15 nodes per layer. The resulting RMSE values for the cohesive energies are 4.84 meV/atom for the training set and 5.13 meV/atom for the test set. The corresponding RMSEs for the atomic forces are 93.06 meV/Bohr and 88.61 meV/Bohr. Note that the small difference between the errors in the training and the test set indicates that virtually no overfitting is present. The potential depends on 2,851 weight parameters per element (Table 9.3), which were fitted using approximately eight million data points of energies and force components.

Before the quality of the ternary Cu/ZnO potential is assessed, it has to be confirmed that the new copper and zinc oxide sub-networks are as accurate as the system-specific potentials of the previous chapters. For a number of benchmark cases, the new potential has therefore been compared to the potentials described in Chapters 7 and 8. Since the configuration space of a ternary system is much larger than that of a binary system, it is not obvious that a ternary NN potential can be constructed with the same accuracy.
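The number of weight parameters quoted above follows directly from the network architecture. The short sketch below assumes a fully connected feed-forward network with one bias weight per node in every hidden and output layer; the function name is hypothetical. Under this assumption the counts reproduce the "weights per element" values of Tables 9.2 and 9.3.

```python
def count_weights(layer_sizes):
    """Number of weights, including bias weights, of a fully connected network.

    layer_sizes: nodes per layer from input to output, e.g. [156, 15, 15, 15, 1]
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    print(count_weights([156, 15, 15, 15, 1]))  # 2851
    print(count_weights([132, 2, 2, 1]))        # 275
```

For the selected 156-15-15-15-1 architecture this gives (156+1)·15 + (15+1)·15 + (15+1)·15 + (15+1)·1 = 2,851 weights for one atomic network. If each of the three elements has its own atomic network of this size, the complete potential would contain three times as many parameters.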
The remainder of the chapter seeks to determine the quality of the potential for ternary structures, focusing on the Cu/ZnO interface and on large copper clusters.

Table 9.2 Root mean squared errors (RMSEs) of the neural network (NN) energies and forces obtained for the training set and the test set of the ternary Cu/ZnO system with different NN architectures with 132 symmetry functions.

Network            Weights per    ERMSE (meV/atom)            FRMSE (meV/Bohr)
architecture       element        Training set   Test set     Training set   Test set
132-2-2-1             275         10.74          10.82        187.06         182.82
132-2-2-2-1           281          8.72           9.05        148.35         144.89
132-5-5-1             701          5.99           6.24        135.68         134.81
132-5-5-5-1           731          5.54           5.94        113.81         108.52
132-10-10-1          1451          5.43           5.87        116.63         115.05
132-10-10-10-1       1561          5.00           5.33        108.79         102.04
132-15-15-1          2251          5.46           5.80        114.95         111.88
132-20-20-1          3101          5.87           6.06        112.85         109.21
132-30-30-1          4951          7.22           7.41        132.59         127.57
132-40-40-1          7001          8.73           8.67        156.23         150.09

9.2.1 Neural network potential for copper

In this section the performance of the new NN potential is compared to the specialized copper potential of Chapter 7 and Ref. [85]. The lattice parameters, cohesive energies and bulk moduli of several copper crystal structures, as predicted using the two different NN potentials, are compared to their DFT reference values in Table 9.4. Both NN potentials yield very similar results, which are in very good agreement with DFT for all structures. As a second benchmark, the surface energies of ideal low-index copper surfaces obtained using the two NN potentials are compared in Table 9.5. The proper description of these surfaces is essential for the modeling of large copper clusters. As is evident from Table 9.5, the quality of the two NN fits is comparable, and they accurately reproduce the DFT surface energies. Finally, the forces acting on the individual atoms within ten copper clusters (Fig. 9.2) that were extracted from a realistic surface model (Fig. 9.1) are compared in Fig. 9.3.
Table 9.3 Root mean squared errors (RMSEs) of the neural network (NN) energies and forces obtained for the training set and the test set of the ternary Cu/ZnO system with different NN architectures. The fit used in this chapter is shown in bold (156-15-15-15-1). The number of weight parameters in the atomic NNs is also given for each architecture.

Network            Weights per    ERMSE (meV/atom)            FRMSE (meV/Bohr)
architecture       element        Training set   Test set     Training set   Test set
156-2-2-1             323          8.13           8.22        137.22         133.27
156-2-2-2-1           329          8.67           8.73        134.58         131.47
156-5-5-1             821          5.88           6.20        112.18         110.86
156-5-5-5-1           851          5.54           5.91        102.85         100.19
156-10-10-1          1691          5.32           5.68         98.96          95.11
156-10-10-10-1       1801          4.99           5.36         95.58          93.14
156-15-15-1          2611          5.29           5.64         99.25          96.71
156-15-15-15-1       2851          4.84           5.13         93.06          88.61
156-20-20-1          3581          5.91           6.23        109.08         105.21
156-20-20-20-1       4001          5.35           5.68         99.17          95.33
156-30-30-1          5671          7.11           7.35        117.41         112.61
156-40-40-1          7961          8.91           8.91        139.03         133.38

Table 9.4 Comparison of the lattice parameters, cohesive energies and bulk moduli of various crystal structures of copper obtained with two neural network (NN) potentials and density-functional theory (DFT).

Property        NN (CuZnO fit)   NN (Cu fit)   DFT
fcc structure
a0 (Å)          3.628            3.630         3.630
Ecoh (eV)       3.524            3.526         3.533
B (GPa)         142              138           140
bcc structure
a0 (Å)          2.885            2.887         2.885
Ecoh (eV)       3.492            3.486         3.489
B (GPa)         139              135           137
simple cubic structure
a0 (Å)          2.407            2.407         2.407
Ecoh (eV)       3.051            3.052         3.051
B (GPa)         100              108           103
hcp structure
a0 (Å)          2.568            2.570         2.573
c/a0            1.629            1.631         1.627
Ecoh (eV)       3.525            3.511         3.511

Table 9.5 Surface energies of low-index copper surfaces obtained from DFT and the neural network (NN) potential for the ternary Cu/ZnO system. For comparison, the NN surface energies obtained from a potential for pure copper [85] are also listed. All energies are given in meV/Å²; "mr" is the missing row reconstruction.

Surface     NN (CuZnO fit)   NN (Cu fit)   DFT
(111)        88.7             92.7          93.2
(100)        98.3            101.0         100.5
(110)       102.6            103.9         102.4
(110)mr     109.8            111.7         109.9

Figure 9.1 Slab model of a Cu(111) surface with several defects like vacancies, adatoms, steps, and kinks [112]. Ten representative atoms shown in blue have been selected to investigate the accuracy of the neural network (NN) potential-energy surface. The NN forces acting on these atoms depend on all atoms inside the transparent spheres with a radius of approximately 12 Å, which corresponds to twice the cutoff Rc of the symmetry functions. The atoms enclosed in these spheres form clusters which are small enough to be calculated by DFT. They are shown in Fig. 9.2.

These clusters are shown as transparent blue spheres (radius of 12 Å, i.e., twice Rc). They contain on average about 260 atoms and are sufficiently large to ensure that the central atoms have approximately the same chemical environment as in the full slab. A similar benchmark was used in Sec. 7.3 to estimate the accuracy of the copper NN potential for structures that are too large to be computed with DFT. With the exception of a single cluster, for which the error is about 0.1 eV/Bohr, the two NN potentials are able to predict the DFT values with high accuracy.

9.2.2 Neural network potential for zinc oxide

As for the copper subsystem, the new ternary potential must also be reliable for zinc oxide structures. In this section, the new potential is therefore compared to the ZnO potential of Chapter 8 and Ref. [95]. Table 9.6 presents the lattice parameters, cohesive energies and bulk moduli of a number of zinc oxide crystal structures, as calculated using the two different NN potentials and DFT. As in the case of copper, the new potential proves to be of equally high accuracy for these properties as the specialized ZnO potential. In Fig. 9.4 the NN predictions of the energies of randomly generated (amorphous) zinc oxide bulk structures are compared to their DFT references.
In this benchmark test the new ternary potential even performs slightly better than the potential of Chapter 8.

Figure 9.2 Geometries of the 10 clusters extracted from the large slab model of the Cu(111) surface in Fig. 9.1. The forces acting on the central atoms shown in blue can be used to assess the accuracy of the neural network (NN) potential for the full system. A comparison of the DFT forces in these clusters and the NN forces in the full slab is shown in Fig. 9.3.

Figure 9.3 Comparison of the DFT and neural network (NN) forces acting on the central atoms of the 10 clusters shown in Fig. 9.2. The clusters used for the DFT calculations contain all atoms within a radius of about 12 Å around the central atom. By construction the NN forces in these clusters are identical to the NN forces in the full slab shown in Fig. 9.1. (Plot of |F| in eV/Bohr versus cluster number, for DFT and NN(CuZnO).)

Table 9.6 Comparison of the lattice parameters, bulk moduli, and cohesive energies per formula unit of several crystal structures of ZnO obtained with two neural network (NN) potentials and density-functional theory (DFT).

Property        NN (CuZnO fit)   NN (ZnO fit)   DFT
CsCl structure
a0 (Å)          2.692            2.680          2.688
Ecoh (eV)       5.584            5.588          5.584
B (GPa)         156              160            159
NaCl structure
a0 (Å)          4.332            4.344          4.328
Ecoh (eV)       6.751            6.745          6.747
B (GPa)         167              169            165
zincblende structure
a0 (Å)          4.624            4.612          4.616
Ecoh (eV)       7.042            7.041          7.043
B (GPa)         125              133            129
wurtzite structure
a0 (Å)          3.278            3.278          3.278
c/a0            1.614            1.614          1.614
u               0.379            0.379          0.379
Ecoh (eV)       7.062            7.054          7.057

Note that the Cu/ZnO potential, in contrast to the pure ZnO potential, does not contain an atomic charge subnetwork for electrostatic interactions. Since the training sets employed for the construction of both potentials were very similar, we have to conclude that long-range electrostatic interactions are not of high importance for the studied ZnO structures.
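The cluster benchmark of Figs. 9.1 to 9.3 relies on cutting an atom-centered sphere out of a large model. The radius of twice the cutoff arises because the force on the central atom depends on the atomic energies of all neighbours within Rc, each of which in turn depends on its own neighbours within Rc. A minimal Python sketch of the extraction step is given below; it is an illustration only, the function name is hypothetical, and periodic images of the slab are ignored for simplicity.

```python
import math


def extract_cluster(positions, center_index, radius):
    """Indices of all atoms within `radius` of the chosen central atom.

    positions: list of (x, y, z) tuples in Angstrom; the central atom itself
    is included in the result. Periodic boundary conditions are not handled.
    """
    center = positions[center_index]
    return [i for i, pos in enumerate(positions)
            if math.dist(pos, center) <= radius]


if __name__ == "__main__":
    cutoff = 6.0  # symmetry-function cutoff Rc in Angstrom
    # Toy chain of atoms along x, used only to illustrate the selection.
    atoms = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (11.9, 0.0, 0.0), (12.1, 0.0, 0.0)]
    print(extract_cluster(atoms, center_index=0, radius=2.0 * cutoff))
```

The DFT force on the central atom of such a cluster can then be compared with the NN force on the same atom in the full model.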
As a final measure of the quality of the new potential for ZnO structures, the predicted forces acting on the zinc and oxygen atoms of Zn15O15 clusters are compared to the reference DFT values in Fig. 9.5. The clusters have been generated from snapshots of molecular dynamics simulations at 1000 K and were not included in the training set. Nevertheless, the agreement between the NN potential and DFT is very good.

Figure 9.4 Comparison of the DFT and neural network (NN) energies of random bulk structures of the composition Zn4O4. The NN energies have been obtained using two different potentials, the NN potential for zinc oxide [95] and the NN potential for the Cu/ZnO system [112]. (Plot of the relative energy in eV/atom versus number of structure, for DFT, NN (ZnO), and NN (CuZnO).)

Figure 9.5 Comparison of the absolute forces acting on the atoms in a Zn15O15 cluster obtained by DFT and two neural network (NN) potentials. The cluster has been chosen randomly from a molecular dynamics simulation at 1000 K. The NN forces have been obtained using two different potentials, a previously published NN potential for zinc oxide [95] and the NN potential for the ternary Cu/ZnO system [112]. (Two panels: |F| in eV/Bohr versus Zn and O atom number.)

9.2.3 The binary subsystems Cu/O and Cu/Zn

For the additional binary subsystems Cu/O and Cu/Zn, the DFT and NN energies of bulk structures of the compositions Cu10O2 and Cu10Zn2 are compared in Fig. 9.6 and Fig. 9.7, respectively. As is apparent from the diagrams, the NN potential is able to accurately reproduce the DFT reference values. The insets show random (3 × 3 × 1) supercell structures for both systems.

Figure 9.6 Comparison of the DFT and neural network (NN) energies of random CuO bulk structures of the composition Cu10O2. The inset shows a (3 × 3 × 1) supercell of a CuO bulk structure. (Plot of the relative energy in eV/atom versus number of structure.)

Figure 9.7 Comparison of the DFT and neural network (NN) energies of random CuZn bulk structures of the composition Cu10Zn2. The inset shows a (3 × 3 × 1) supercell of a CuZn bulk structure. (Plot of the relative energy in eV/atom versus number of structure.)

9.2.4 Neural network potential for copper clusters at zinc oxide

It should be emphasized again that the Cu/ZnO potential presented in this section has been explicitly constructed for copper clusters at zinc oxide surfaces. The ternary Cu/Zn/O structures that have been included in the reference data set were therefore extracted from large interface models. In Fig. 9.8 the NN and DFT energies of a number of such clusters of the composition Cu27Zn20O20 are compared after they had been used to refine the NN potential. As can be clearly seen, the NN potential is able to reproduce the fitted energies very well. The energies of random slabs of the composition Cu12Zn6O6 have also been checked, and the agreement of the NN and DFT energies is excellent, as shown in Fig. 9.9.

Figure 9.8 Comparison of the DFT and neural network (NN) energies of random clusters of the composition Cu27Zn20O20. (Plot of the relative energy in meV/atom versus number of structure.)

Figure 9.9 Comparison of the DFT and neural network (NN) energies of random slabs of the composition Cu12Zn6O6. (Plot of the relative energy in meV/atom versus number of structure.)

In order to verify the accuracy of the preliminary potential for Cu/ZnO, a test model has been constructed: a copper cluster containing 612 atoms adsorbed with its (111) surface at a ZnO(10-10) surface, as shown in Fig. 9.10. In total, this model contains 7524 atoms. An NVT MD simulation at 1000 K has been carried out with the molecular dynamics program TINKER [97], employing the NN potential for the Cu/ZnO system, to obtain highly distorted structures at the interface.

Figure 9.10 Model of a Cu612 cluster at the ZnO(10-10) surface (panels (a) and (b)). In total this system contains 7524 atoms.

Since the system is too large to be accessible by DFT calculations, representative atoms that are close to the interface have been selected, and the reliability of the forces acting on those atoms has been investigated. Atoms far away from the interface are already accurately described, because they are embedded in chemical environments similar to those of subsystems like pure copper or pure zinc oxide. As has been pointed out, the NN forces in principle depend on the positions of the atoms within a distance of up to twice the cutoff radius from the central atom. However, it can be seen that for all atoms the forces in the smallest clusters are already a very good approximation to the converged forces.

In Fig. 9.11 a snapshot of the MD simulation is shown in panel (B). To visualize the interface, the underside of the copper cluster is shown in panel (A), while a top view of the ZnO surface is shown in panel (C), including information about the positions of the copper atoms in the first layer of the cluster. Five representative atoms of each element (Cu, O, Zn) have been selected, as labeled in Fig. 9.11(A) and (C). Fifteen atom-centered clusters have been extracted from the slab model to determine the forces by DFT calculations. A radius of about 12 Å, which corresponds to twice the cutoff radius of the symmetry functions, results in an average of about 450 atoms per cluster, which is still too large for DFT.
For the DFT calculations, smaller cluster radii of 6 Å and 9 Å (on average 65 and 200 atoms, respectively) have therefore been investigated instead. Using these two cluster sizes makes it possible to determine the level of convergence of the DFT forces acting on the central atoms as a function of the cluster radius. The results are presented in Fig. 9.12. There are only small differences between the DFT forces at the central atoms in the 6 Å and 9 Å clusters, and both data sets are very similar to the NN forces calculated for these atoms in the slab model shown in Fig. 9.11. No differences in the quality of the NN forces are found for the elements copper, zinc and oxygen. The DFT forces are thus already converged at a cluster radius of 9 Å. To support this, the effective convergence of the NN forces as a function of the cluster radius is shown in Fig. 9.13.

Figure 9.11 Snapshot of a molecular dynamics simulation at 1000 K of a Cu612 cluster at the ZnO(101̄0) surface (B). This structure has been used to analyze the quality of the neural network (NN) potential for the description of the interface atoms. In (A) a bottom view of the cluster is shown, and five copper atoms have been selected to compare the NN forces with the DFT forces. The comparison is shown in Fig. 9.12. In panel (C) a top view of the ZnO(101̄0) surface is shown. Five oxygen and five zinc atoms have been chosen for a closer investigation of the forces. In order to illustrate the position of these atoms with respect to the copper cluster, the copper atoms of the first metal layer, which are directly in contact with the ZnO surface, are shown as small spheres.

Figure 9.12 Comparison of the neural network (NN) forces acting on selected atoms at the Cu(111)/ZnO(101̄0) interface shown in Fig. 9.11 and forces obtained by DFT in calculations for clusters centered at these atoms with radii of 6 Å and 9 Å. [Plot: |F| (eV/Bohr) for atoms 1 to 5 of each element (Cu, Zn, O); series DFT 6 Å, DFT 9 Å, and NN slab.]

Figure 9.13 Convergence of the neural network (NN) forces acting on selected atoms at the Cu(111)/ZnO(101̄0) interface shown in Fig. 9.11. The forces have been obtained from clusters centered at these atoms with increasing radii of 6 Å, 9 Å, and 12 Å. For comparison, the forces obtained directly for the full system are also shown (“slab”). While there are still small differences between the 6 Å clusters and the slab, the forces are well converged for a cluster radius of 9 Å. [Plot: |F| (eV/Bohr) for atoms 1 to 5 of each element (Cu, Zn, O); series NN 6 Å, NN 9 Å, NN 12 Å, and NN slab.]

10 A Neural Network Potential for the Methanol Molecule

Recently, a first application of full-dimensional NN potentials to binary molecular systems, namely hydrogen and oxygen (H, O) in water monomers and dimers, has been presented [96]. For this thesis the methanol molecule has been selected as a benchmark for a ternary molecular system (H, C, and O). The construction of an NN potential for molecular applications is in principle similar to the construction of the solid-state potentials described in the previous chapters. In the following, the differences between the approaches are described.

A cutoff radius Rc of 12 Bohr has been used for the atomic NNs, which ensures that each atom interacts with all other atoms in the molecule. Due to the chemical composition of the methanol molecule, only a subset of the possible element combinations of the ternary (H, C, O) system is present, which has to be taken into account in the construction of the symmetry functions. For example, carbon atoms cannot have other carbon atoms in their chemical environment, as there is only a single carbon atom in methanol. The symmetry functions that have been used and their parameters are listed in Table 10.1 and Table 10.2.
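How the composition of methanol restricts the element combinations can be sketched with a short enumeration. The helper names below are hypothetical, and the rule applied (one radial function per neighboring element present, plus one angular function per possible neighbor pair for each of the four (λ, ζ) combinations used in Tables 10.1 and 10.2) is inferred from those tables rather than taken from the original implementation.

```python
from itertools import combinations_with_replacement

COMPOSITION = {"H": 4, "C": 1, "O": 1}  # methanol, CH3OH
N_ANGULAR_PARAMETER_SETS = 4  # (lambda, zeta) = (-1, 1), (1, 1), (-1, 4), (1, 4)


def neighbor_counts(central):
    """Atoms of each element available as neighbors of one `central` atom."""
    counts = dict(COMPOSITION)
    counts[central] -= 1
    return counts


def radial_neighbors(central):
    """Elements that can appear as a single neighbor (type G2)."""
    return [el for el, n in neighbor_counts(central).items() if n >= 1]


def angular_pairs(central):
    """Unordered element pairs that can appear as two neighbors (type G4)."""
    counts = neighbor_counts(central)
    pairs = []
    for a, b in combinations_with_replacement(counts, 2):
        needed_a = 2 if a == b else 1  # a like pair needs two atoms of that element
        if counts[a] >= needed_a and counts[b] >= 1:
            pairs.append(a + b)
    return pairs


def n_symmetry_functions(central):
    return (len(radial_neighbors(central))
            + N_ANGULAR_PARAMETER_SETS * len(angular_pairs(central)))


for element in COMPOSITION:
    print(element, angular_pairs(element), n_symmetry_functions(element))
```

For hydrogen this gives the pairs HH, HC, HO and CO; for carbon only HH and HO; for oxygen only HH and HC.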
In short, in the atomic NNs the chemical environments of the H, C, and O atoms have been described by 19, 10, and 10 atom-centered symmetry functions, respectively. The fitting accuracy that can be obtained depends on the energy range that needs to be covered by the potential. The training data set has been constructed from MD simulations using the MM3 force field at temperatures between 10 K and 1000 K. We have extracted 45,600 structures from the MD trajectories, spanning an energy range of 0.08 eV/atom with respect to the optimized structure. Of these, 41,047 structures have been selected randomly for the training set; the remaining 4,553 structures form the test set.

Table 10.1 Parameters defining the atom-centered symmetry functions used to describe the local atomic environments for methanol (hydrogen atom).

Symmetry functions of type G2, element H:

| No. | Neighbor | η (Bohr⁻²) | Rs (Bohr) | Rc (Bohr) |
|---|---|---|---|---|
| 1 | H | 0.010 | 0.0 | 12.0 |
| 2 | C | 0.010 | 0.0 | 12.0 |
| 3 | O | 0.010 | 0.0 | 12.0 |

Symmetry functions of type G4, element H:

| No. | Neighbors | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|---|
| 4 | HH | 0.008 | -1.0 | 1.0 | 12.0 |
| 5 | HC | 0.008 | -1.0 | 1.0 | 12.0 |
| 6 | HO | 0.008 | -1.0 | 1.0 | 12.0 |
| 7 | CO | 0.008 | -1.0 | 1.0 | 12.0 |
| 8 | HH | 0.008 | 1.0 | 1.0 | 12.0 |
| 9 | HC | 0.008 | 1.0 | 1.0 | 12.0 |
| 10 | HO | 0.008 | 1.0 | 1.0 | 12.0 |
| 11 | CO | 0.008 | 1.0 | 1.0 | 12.0 |
| 12 | HH | 0.008 | -1.0 | 4.0 | 12.0 |
| 13 | HC | 0.008 | -1.0 | 4.0 | 12.0 |
| 14 | HO | 0.008 | -1.0 | 4.0 | 12.0 |
| 15 | CO | 0.008 | -1.0 | 4.0 | 12.0 |
| 16 | HH | 0.008 | 1.0 | 4.0 | 12.0 |
| 17 | HC | 0.008 | 1.0 | 4.0 | 12.0 |
| 18 | HO | 0.008 | 1.0 | 4.0 | 12.0 |
| 19 | CO | 0.008 | 1.0 | 4.0 | 12.0 |

Table 10.2 Parameters defining the atom-centered symmetry functions used to describe the local atomic environments for methanol (carbon and oxygen atoms).

Symmetry functions of type G2, element C:

| No. | Neighbor | η (Bohr⁻²) | Rs (Bohr) | Rc (Bohr) |
|---|---|---|---|---|
| 1 | H | 0.010 | 0.0 | 12.0 |
| 2 | O | 0.010 | 0.0 | 12.0 |

Symmetry functions of type G4, element C:

| No. | Neighbors | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|---|
| 3 | HH | 0.008 | -1.0 | 1.0 | 12.0 |
| 4 | HO | 0.008 | -1.0 | 1.0 | 12.0 |
| 5 | HH | 0.008 | 1.0 | 1.0 | 12.0 |
| 6 | HO | 0.008 | 1.0 | 1.0 | 12.0 |
| 7 | HH | 0.008 | -1.0 | 4.0 | 12.0 |
| 8 | HO | 0.008 | -1.0 | 4.0 | 12.0 |
| 9 | HH | 0.008 | 1.0 | 4.0 | 12.0 |
| 10 | HO | 0.008 | 1.0 | 4.0 | 12.0 |

Symmetry functions of type G2, element O:

| No. | Neighbor | η (Bohr⁻²) | Rs (Bohr) | Rc (Bohr) |
|---|---|---|---|---|
| 1 | H | 0.010 | 0.0 | 12.0 |
| 2 | C | 0.010 | 0.0 | 12.0 |

Symmetry functions of type G4, element O:

| No. | Neighbors | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|---|
| 3 | HH | 0.008 | -1.0 | 1.0 | 12.0 |
| 4 | HC | 0.008 | -1.0 | 1.0 | 12.0 |
| 5 | HH | 0.008 | 1.0 | 1.0 | 12.0 |
| 6 | HC | 0.008 | 1.0 | 1.0 | 12.0 |
| 7 | HH | 0.008 | -1.0 | 4.0 | 12.0 |
| 8 | HC | 0.008 | -1.0 | 4.0 | 12.0 |
| 9 | HH | 0.008 | 1.0 | 4.0 | 12.0 |
| 10 | HC | 0.008 | 1.0 | 4.0 | 12.0 |

Table 10.3 Root mean squared errors (RMSEs) of the energies and forces for the methanol molecule for different NN architectures. X represents the number of input nodes, which depends on the element as discussed in the text.

| Architecture | Energy RMSE, training set (meV/atom) | Energy RMSE, test set (meV/atom) | Force RMSE, training set (meV/Bohr) | Force RMSE, test set (meV/Bohr) |
|---|---|---|---|---|
| X-5-5-1 | 0.426 | 0.449 | 26.45 | 26.30 |
| X-10-10-1 | 0.112 | 0.115 | 7.20 | 7.19 |
| X-15-15-1 | 0.069 | 0.073 | 4.48 | 4.51 |
| X-20-20-1 | 0.053 | 0.060 | 3.65 | 3.65 |
| X-30-30-1 | 0.032 | 0.038 | 2.51 | 2.57 |
| X-40-40-1 | 0.032 | 0.038 | 2.69 | 2.64 |

As for the Cu/Zn/O potentials in Chapters 7, 8 and 9, we have tested various NN architectures with two hidden layers and five to 40 nodes per layer for the construction of the NN potential. The RMSE values are listed in Table 10.3. The optimum architecture is the one that yields low RMSE values for both the training set and the test set. The best fit has been obtained for an architecture with 30 nodes per layer. A larger NN does not reduce the energy RMSE further, while the RMSE of the forces increases slightly. The energy RMSEs of this NN are 0.032 meV/atom and 0.038 meV/atom for the training and the test set, respectively.
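The choice of architecture from Table 10.3 can be reproduced with a small sketch. The selection rule coded here (lowest test-set energy RMSE, with ties broken by the test-set force RMSE) is an assumption that formalizes the argument in the text; the name `best_architecture` is hypothetical.

```python
# Rows of Table 10.3:
# (architecture, energy RMSE train, energy RMSE test [meV/atom],
#  force RMSE train, force RMSE test [meV/Bohr])
TABLE_10_3 = [
    ("X-5-5-1", 0.426, 0.449, 26.45, 26.30),
    ("X-10-10-1", 0.112, 0.115, 7.20, 7.19),
    ("X-15-15-1", 0.069, 0.073, 4.48, 4.51),
    ("X-20-20-1", 0.053, 0.060, 3.65, 3.65),
    ("X-30-30-1", 0.032, 0.038, 2.51, 2.57),
    ("X-40-40-1", 0.032, 0.038, 2.69, 2.64),
]


def best_architecture(rows):
    """Pick the row with the lowest test-set energy RMSE.

    Ties are broken by the test-set force RMSE (assumed rule).
    """
    return min(rows, key=lambda row: (row[2], row[4]))


best = best_architecture(TABLE_10_3)
print(best[0])
```

With the tabulated values, the two largest networks tie in the energy RMSE, and the smaller force RMSE favors the network with 30 nodes per layer.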
The total energy of the six-atom methanol molecule is thus reproduced with an RMSE of about 0.2 meV (6 × 0.032 meV/atom ≈ 0.19 meV for the training set and 6 × 0.038 meV/atom ≈ 0.23 meV for the test set), which is only a small fraction of the 480 meV range (6 × 0.08 eV/atom) of the total energies in the training set. In Fig. 10.1 the NN energies of the training and test points are plotted against the reference force field energies. All points are very close to the 45° line, indicating an almost perfect correlation. The training set and test set forces have RMSEs of 2.51 meV/Bohr and 2.57 meV/Bohr, respectively. Although the forces have not been included in the fitting process, the agreement of the NN predictions and the MM3 forces is excellent. In summary, the NN potential provides an extremely accurate fit of the energies, with low test set RMSEs and low errors of the forces. This indicates that a smooth and reliable NN PES has been obtained.

Figure 10.1 Comparison of the MM3 and neural network (NN) energies for the methanol molecule. All training and test points are very close to the 45° line, corresponding to a very good fit. [Plot: NN energies (meV/atom) vs. MM3 energies (meV/atom); series training points and test points.]

In order to test the applicability of the NN PES to MD simulations, the dihedral potential for a rotation about the C-O bond obtained with the NN has been compared to the MM3 force field reference data. The energies of the MM3 force field and of the NN approach shown in Fig. 10.2 are basically indistinguishable. The most stringent test is the comparison of the energetics along the configurations visited in molecular dynamics simulations. In Fig. 10.3 one trajectory obtained in the NVE ensemble with an average temperature of about 350 K is shown. The forces propagating the simulation have been calculated using the atom-based NN, and the energies of the configurations have been recalculated using the MM3 force field. The agreement between the reference force field and the NN PES is excellent.
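The scan coordinate of a dihedral potential, here the H-C-O-H angle for the rotation about the C-O bond, can be computed from four atomic positions as in the following sketch. The function `dihedral_deg` and the idealized test coordinates are illustrative assumptions and not part of the original work; the sign convention of the angle is that of this particular formula.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle p0-p1-p2-p3 in degrees, mapped to [0, 360)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    norm_b2 = math.sqrt(_dot(b2, b2))
    m1 = _cross(n1, tuple(c / norm_b2 for c in b2))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2))) % 360.0


# Idealized, hypothetical coordinates: H-C-O-H with the C-O bond along z.
h_methyl, carbon, oxygen = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = dihedral_deg(h_methyl, carbon, oxygen, (1.0, 0.0, 1.0))
trans = dihedral_deg(h_methyl, carbon, oxygen, (-1.0, 0.0, 1.0))
print(cis, trans)
```

Evaluating the potential energy on a grid of such angles, with the remaining internal coordinates held fixed, yields a dihedral profile of the kind discussed above.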
Figure 10.2 Dihedral potential of methanol for a rotation about the C-O bond. The MM3 force field reference data are shown to assess the quality of the NN potential, but these points have not been included in the training set. The initial structure of methanol has been optimized, which results in two different lengths of C-H bonds in the methyl group. Since the internal structure of the methyl group and the length of the O-H bond have been frozen for this plot, the two types of C-H bonds result in different heights of the maxima in the dihedral potential. [Plot: potential energy (eV) vs. dihedral angle (°); series FF MM3 and NN.]

Note that in some cases the NN potentials may be further improved by using an alternative structural description in the form of atom pairs instead of atomic contributions, which directly reflects the atomic interactions and takes the chemical environments into account. However, this approach will not be discussed in detail in this work, and the interested reader is referred to Ref. [113], in which an implementation of the NN method based on atom pairs has been reported.

Figure 10.3 Comparison of the MM3 and neural network (NN) energies along a molecular dynamics (MD) trajectory of the methanol molecule in the NVE statistical ensemble with an average temperature of about 350 K and a time step of 1 fs. The MD has been carried out employing the NN potential, and the energies have been recalculated using the MM3 force field. [Plot: potential energy (eV) vs. structures; series FF MM3 and NN.]

Part V Summary and Outlook

11 Summary and Outlook

The ultimate goal of this thesis was to develop an accurate and efficient atomistic potential to study large Cu clusters on ZnO surfaces, a system that is an important catalyst for methanol synthesis.
Accurate electronic structure methods like density-functional theory (DFT) are computationally too demanding to simulate the Cu/ZnO system under realistic conditions. This realization was the major motivation for this project. On the other hand, efficient interatomic potentials based on physically motivated functional forms are not able to give a reliable and sufficiently accurate description of such systems. In order to overcome these limitations, the high-dimensional neural network (NN) method has been used to construct efficient potentials based on reference data obtained from DFT. Prior to this thesis, the NN method had been applied neither to metallic systems nor to compounds containing more than a single chemical species. The previous use of the methodology had been limited to periodic structures of covalent insulators.

In Chapter 7 a potential for Cu clusters, bulk structures and surface slabs has been constructed. The NN and DFT energies for the reference data set have been shown to differ by only 4 meV/atom on average, even for challenging benchmark structures. Bulk properties, such as cohesive energies, equilibrium lattice constants, and bulk moduli, obtained with the NN have been found to be in excellent agreement with the DFT values for various crystal structures (fcc, bcc, sc and hcp). The reliable description of Cu surfaces has been demonstrated by a comparison of surface energies, vacancy formation energies, and energy profiles of copper adatoms diffusing along paths on different copper surfaces. Using a realistic, non-ideal slab model for a Cu(111) surface containing various defects like vacancies, adatoms, kinks and steps, it has been shown that the atomic forces acting on representative atoms can be predicted with very high precision by the NN potential. This demonstrates that NN potentials are capable of describing metallic materials and can be confidently used to simulate large realistic systems with a quality that is very close to the DFT reference.

The next step was the construction of a potential for the multicomponent ZnO system in Chapter 8 that is applicable to bulk structures, clusters, and surface slabs. In order to achieve the desired accuracy for this system, the long-range electrostatic energy resulting from charge transfer has to be properly described. For this purpose, the atomic NN method has been extended by a second set of high-dimensional NNs representing environment-dependent point charges. The predicted NN charges then allow the electrostatic interactions to be computed. We have demonstrated that this approach makes it possible to accurately reproduce the cohesive energies, the lattice parameters, and the bulk moduli of arbitrary ZnO structures using the NN potential. Additionally, the agreement of the NN atomic forces and charges with their DFT references is excellent. However, a careful analysis of the long-range interactions in ZnO has shown that electrostatic interactions due to charge transfer may play a smaller role for ZnO than assumed. An NN potential that was constructed without the electrostatic extension proved to be only marginally less accurate than the extended potential. Nevertheless, we are confident that the extended NN method is useful for structures with more strongly varying charge transfer.

Finally, a potential for the ternary Cu/ZnO system has been successfully constructed in Chapter 9. To confirm the reliability of the potential, properties predicted by the NN have been compared with DFT reference data. Even though the ternary potential has been fitted to a much larger configuration space than the previous NNs, we have demonstrated that the Cu and ZnO subsystems can be described without significant loss of accuracy by the combined potential.
This observation shows that NN potentials can be constructed in a straightforward way using the reference data sets of previously constructed NN potentials. Additionally, the accuracy of the ternary potential for Cu/ZnO interface structures has been shown to be very high.

To demonstrate the general capability of the NN method to describe molecular systems, a highly accurate potential for the methanol molecule has been presented in Chapter 10, trained to energies obtained with the MM3 force field.

In summary, this work has shown that high-dimensional NN potentials can be constructed for a variety of complex systems including metals and insulators, isolated clusters, bulk structures, large surfaces and molecules. Due to the flexible form of the NN, a large number of training structures is required. However, the construction of the training set can be done in a very systematic and unbiased way, as explained in Section 6.2. It has been shown that reference energies, forces, and charges can be reproduced very accurately not only for structures included in the training set but also for the independent test set. The analysis of many different properties derived from the PESs has shown that NN potentials are a reliable alternative to computationally demanding ab initio methods like DFT. Once the construction is done, NN potentials can be applied to simulate large systems that are not accessible by DFT. Preliminary MD simulations for a slab model of a Cu(111) cluster at a ZnO(101̄0) surface containing more than 7,500 atoms could be performed in a few seconds per time step on a regular eight-core workstation. In contrast to electronic structure methods, NN potentials scale linearly with the system size and are well suited for massively parallel simulations.

Continuing work employing the constructed NN potential for Cu/ZnO is already in progress. In order to further improve the potential, additional structures may be included in the training set.
The choice of the regions of the PES that need to be refined strongly depends on which properties of the PES are to be investigated. Additionally, it has yet to be explored whether the electrostatic extension would enhance the accuracy of the Cu/ZnO potential.

Based on a refined NN potential, a number of important applications can be realized. The NN potential can be employed in molecular dynamics simulations to study the structural and dynamical properties of copper clusters at ZnO surfaces. Of particular interest are the structure and stabilization mechanisms of different surfaces and of defects at these surfaces, the structures and preferred adsorption sites of Cu clusters at ZnO surfaces, and possibly also some aspects of the growth mechanism of the clusters. Further, the structure of the Cu/ZnO interface and diffusion processes at this interface, which might give rise to CuZn alloying, can be studied. Including further elements such as hydrogen in the Cu/ZnO potential (in the form of, e.g., hydrogen molecules or water molecules) will enable the study of Cu cluster shapes as a function of the gaseous environment. Ultimately, this could help to gain a better understanding of the Cu/ZnO catalyst.

Part VI Appendix

A Symmetry Function Parameters

A.1 Copper potential

Table A.1 Parameters of the radial symmetry functions used to describe the local atomic environments for copper. The parameters refer to the definition in Eqn. (5.9) in Chapter 5.

Symmetry functions of type G2:

| No. | η (Bohr⁻²) | Rshift (Bohr) | Rc (Bohr) |
|---|---|---|---|
| 1 | 0.001 | 0.000 | 11.338 |
| 2 | 0.010 | 0.000 | 11.338 |
| 3 | 0.020 | 0.000 | 11.338 |
| 4 | 0.035 | 0.000 | 11.338 |
| 5 | 0.060 | 0.000 | 11.338 |
| 6 | 0.100 | 0.000 | 11.338 |
| 7 | 0.200 | 0.000 | 11.338 |
| 8 | 0.400 | 0.000 | 11.338 |

Table A.2 Parameters of the angular symmetry functions used to describe the local atomic environments for copper. The parameters refer to the definition in Eqn. (5.10) in Chapter 5.

Symmetry functions of type G4:

| No. | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|
| 9 | 0.0001 | -1.0 | 1.0 | 11.338 |
| 10 | 0.0001 | 1.0 | 1.0 | 11.338 |
| 11 | 0.0001 | -1.0 | 2.0 | 11.338 |
| 12 | 0.0001 | 1.0 | 2.0 | 11.338 |
| 13 | 0.0030 | -1.0 | 1.0 | 11.338 |
| 14 | 0.0030 | 1.0 | 1.0 | 11.338 |
| 15 | 0.0030 | -1.0 | 2.0 | 11.338 |
| 16 | 0.0030 | 1.0 | 2.0 | 11.338 |
| 17 | 0.0080 | -1.0 | 1.0 | 11.338 |
| 18 | 0.0080 | 1.0 | 1.0 | 11.338 |
| 19 | 0.0080 | -1.0 | 2.0 | 11.338 |
| 20 | 0.0080 | 1.0 | 2.0 | 11.338 |
| 21 | 0.0150 | -1.0 | 1.0 | 11.338 |
| 22 | 0.0150 | 1.0 | 1.0 | 11.338 |
| 23 | 0.0150 | -1.0 | 2.0 | 11.338 |
| 24 | 0.0150 | 1.0 | 2.0 | 11.338 |
| 25 | 0.0150 | -1.0 | 4.0 | 11.338 |
| 26 | 0.0150 | 1.0 | 4.0 | 11.338 |
| 27 | 0.0150 | -1.0 | 16.0 | 11.338 |
| 28 | 0.0150 | 1.0 | 16.0 | 11.338 |
| 29 | 0.0250 | -1.0 | 1.0 | 11.338 |
| 30 | 0.0250 | 1.0 | 1.0 | 11.338 |
| 31 | 0.0250 | -1.0 | 2.0 | 11.338 |
| 32 | 0.0250 | 1.0 | 2.0 | 11.338 |
| 33 | 0.0250 | -1.0 | 4.0 | 11.338 |
| 34 | 0.0250 | 1.0 | 4.0 | 11.338 |
| 35 | 0.0250 | -1.0 | 16.0 | 11.338 |
| 36 | 0.0250 | 1.0 | 16.0 | 11.338 |
| 37 | 0.0450 | -1.0 | 1.0 | 11.338 |
| 38 | 0.0450 | 1.0 | 1.0 | 11.338 |
| 39 | 0.0450 | -1.0 | 2.0 | 11.338 |
| 40 | 0.0450 | 1.0 | 2.0 | 11.338 |
| 41 | 0.0450 | -1.0 | 4.0 | 11.338 |
| 42 | 0.0450 | 1.0 | 4.0 | 11.338 |
| 43 | 0.0450 | -1.0 | 16.0 | 11.338 |
| 44 | 0.0450 | 1.0 | 16.0 | 11.338 |
| 45 | 0.0800 | -1.0 | 1.0 | 11.338 |
| 46 | 0.0800 | 1.0 | 1.0 | 11.338 |
| 47 | 0.0800 | -1.0 | 2.0 | 11.338 |
| 48 | 0.0800 | 1.0 | 2.0 | 11.338 |
| 49 | 0.0800 | -1.0 | 4.0 | 11.338 |
| 50 | 0.0800 | 1.0 | 4.0 | 11.338 |
| 51 | 0.0800 | 1.0 | 16.0 | 11.338 |

A.2 Zinc oxide potential

Table A.3 Parameters of the radial symmetry functions used to describe the local atomic environments around oxygen atoms in zinc oxide.

Symmetry functions of type G2:

| No. | Neighboring element | η (Bohr⁻²) | Rc (Bohr) |
|---|---|---|---|
| 1 | O | 0.001 | 11.338 |
| 2 | Zn | 0.001 | 11.338 |
| 3 | O | 0.010 | 11.338 |
| 4 | Zn | 0.010 | 11.338 |
| 5 | O | 0.035 | 11.338 |
| 6 | Zn | 0.035 | 11.338 |
| 7 | O | 0.060 | 11.338 |
| 8 | Zn | 0.060 | 11.338 |
| 9 | O | 0.100 | 11.338 |
| 10 | Zn | 0.100 | 11.338 |
| 11 | O | 0.200 | 11.338 |
| 12 | Zn | 0.200 | 11.338 |
| 13 | O | 0.400 | 11.338 |
| 14 | Zn | 0.400 | 11.338 |

Table A.4 Parameters of the radial symmetry functions used to describe the local atomic environments around zinc atoms in zinc oxide.

Symmetry functions of type G2:

| No. | Neighboring element | η (Bohr⁻²) | Rc (Bohr) |
|---|---|---|---|
| 1 | O | 0.001 | 11.338 |
| 2 | Zn | 0.001 | 11.338 |
| 3 | O | 0.010 | 11.338 |
| 4 | Zn | 0.010 | 11.338 |
| 5 | O | 0.035 | 11.338 |
| 6 | Zn | 0.035 | 11.338 |
| 7 | O | 0.060 | 11.338 |
| 8 | Zn | 0.060 | 11.338 |
| 9 | O | 0.100 | 11.338 |
| 10 | Zn | 0.100 | 11.338 |
| 11 | O | 0.200 | 11.338 |
| 12 | Zn | 0.200 | 11.338 |
| 13 | O | 0.400 | 11.338 |

Table A.5 Parameters of the angular symmetry functions used to describe the local atomic environments around oxygen atoms in zinc oxide. For each set of parameters there are up to three functions, one for each possible combination of elements in the neighboring atom pairs (O-O, O-Zn, Zn-Zn), as indicated by the number ranges.

Symmetry functions of type G4:

| No. | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|
| 15-17 | 0.000 | -1.0 | 1.0 | 11.338 |
| 18-20 | 0.000 | 1.0 | 1.0 | 11.338 |
| 21-23 | 0.000 | -1.0 | 2.0 | 11.338 |
| 24-26 | 0.000 | 1.0 | 2.0 | 11.338 |
| 27-29 | 0.003 | -1.0 | 1.0 | 11.338 |
| 30-32 | 0.003 | 1.0 | 1.0 | 11.338 |
| 33-35 | 0.003 | -1.0 | 2.0 | 11.338 |
| 36-38 | 0.003 | 1.0 | 2.0 | 11.338 |
| 39-41 | 0.008 | -1.0 | 1.0 | 11.338 |
| 42-44 | 0.008 | 1.0 | 1.0 | 11.338 |
| 45-47 | 0.008 | -1.0 | 2.0 | 11.338 |
| 48-50 | 0.008 | 1.0 | 2.0 | 11.338 |
| 51-53 | 0.015 | -1.0 | 1.0 | 11.338 |
| 54-56 | 0.015 | 1.0 | 1.0 | 11.338 |
| 57-59 | 0.015 | -1.0 | 2.0 | 11.338 |
| 60-62 | 0.015 | 1.0 | 2.0 | 11.338 |
| 63-65 | 0.015 | -1.0 | 4.0 | 11.338 |
| 66-68 | 0.015 | 1.0 | 4.0 | 11.338 |
| 69-71 | 0.015 | -1.0 | 16.0 | 11.338 |
| 72-74 | 0.015 | 1.0 | 16.0 | 11.338 |
| 75-77 | 0.025 | -1.0 | 1.0 | 11.338 |
| 78-80 | 0.025 | 1.0 | 1.0 | 11.338 |
| 81-83 | 0.025 | -1.0 | 2.0 | 11.338 |
| 84-86 | 0.025 | 1.0 | 2.0 | 11.338 |
| 87-89 | 0.025 | -1.0 | 4.0 | 11.338 |
| 90-92 | 0.025 | 1.0 | 4.0 | 11.338 |
| 93-95 | 0.025 | -1.0 | 16.0 | 11.338 |
| 96-98 | 0.025 | 1.0 | 16.0 | 11.338 |
| 99-101 | 0.045 | -1.0 | 1.0 | 11.338 |
| 102-104 | 0.045 | 1.0 | 1.0 | 11.338 |
| 105-107 | 0.045 | -1.0 | 2.0 | 11.338 |
| 108-110 | 0.045 | 1.0 | 2.0 | 11.338 |
| 111-113 | 0.045 | -1.0 | 4.0 | 11.338 |
| 114-116 | 0.045 | 1.0 | 4.0 | 11.338 |
| 117-119 | 0.045 | -1.0 | 16.0 | 11.338 |
| 120-122 | 0.045 | 1.0 | 16.0 | 11.338 |
| 123-125 | 0.080 | -1.0 | 1.0 | 11.338 |
| 126-128 | 0.080 | 1.0 | 1.0 | 11.338 |
| 129-131 | 0.080 | -1.0 | 2.0 | 11.338 |
| 132-134 | 0.080 | 1.0 | 2.0 | 11.338 |
| 135-137 | 0.080 | -1.0 | 4.0 | 11.338 |
| 138-140 | 0.080 | 1.0 | 4.0 | 11.338 |
| 141-142 | 0.080 | 1.0 | 16.0 | 11.338 |

Table A.6 Parameters of the
angular symmetry functions used to describe the local atomic environments around zinc atoms in zinc oxide. For each set of parameters there are up to three functions, one for each possible combination of elements in the neighboring atom pairs (O-O, O-Zn, Zn-Zn), as indicated by the number ranges.

Symmetry functions of type G4:

| No. | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|
| 14-16 | 0.000 | -1.0 | 1.0 | 11.338 |
| 17-19 | 0.000 | 1.0 | 1.0 | 11.338 |
| 20-22 | 0.000 | -1.0 | 2.0 | 11.338 |
| 23-25 | 0.000 | 1.0 | 2.0 | 11.338 |
| 26-28 | 0.003 | -1.0 | 1.0 | 11.338 |
| 29-31 | 0.003 | 1.0 | 1.0 | 11.338 |
| 32-34 | 0.003 | -1.0 | 2.0 | 11.338 |
| 35-37 | 0.003 | 1.0 | 2.0 | 11.338 |
| 38-39 | 0.008 | -1.0 | 1.0 | 11.338 |
| 40-42 | 0.008 | 1.0 | 1.0 | 11.338 |
| 43-44 | 0.008 | -1.0 | 2.0 | 11.338 |
| 45-46 | 0.008 | 1.0 | 2.0 | 11.338 |
| 47-49 | 0.015 | -1.0 | 1.0 | 11.338 |
| 50-52 | 0.015 | 1.0 | 1.0 | 11.338 |
| 53-55 | 0.015 | -1.0 | 2.0 | 11.338 |
| 56-58 | 0.015 | 1.0 | 2.0 | 11.338 |
| 59-61 | 0.015 | -1.0 | 4.0 | 11.338 |
| 62-64 | 0.015 | 1.0 | 4.0 | 11.338 |
| 65-67 | 0.015 | -1.0 | 16.0 | 11.338 |
| 68-70 | 0.015 | 1.0 | 16.0 | 11.338 |
| 71-73 | 0.025 | -1.0 | 1.0 | 11.338 |
| 74-76 | 0.025 | 1.0 | 1.0 | 11.338 |
| 77-79 | 0.025 | -1.0 | 2.0 | 11.338 |
| 80-82 | 0.025 | 1.0 | 2.0 | 11.338 |
| 83-85 | 0.025 | -1.0 | 4.0 | 11.338 |
| 86-88 | 0.025 | 1.0 | 4.0 | 11.338 |
| 89-91 | 0.025 | -1.0 | 16.0 | 11.338 |
| 92-94 | 0.025 | 1.0 | 16.0 | 11.338 |
| 95-97 | 0.045 | -1.0 | 1.0 | 11.338 |
| 98-100 | 0.045 | 1.0 | 1.0 | 11.338 |
| 101-103 | 0.045 | -1.0 | 2.0 | 11.338 |
| 104-106 | 0.045 | 1.0 | 2.0 | 11.338 |
| 107-108 | 0.045 | -1.0 | 4.0 | 11.338 |
| 109-111 | 0.045 | 1.0 | 4.0 | 11.338 |
| 112 | 0.045 | -1.0 | 16.0 | 11.338 |
| 113-115 | 0.045 | 1.0 | 16.0 | 11.338 |
| 116-117 | 0.080 | -1.0 | 1.0 | 11.338 |
| 118-120 | 0.080 | 1.0 | 1.0 | 11.338 |
| 121-122 | 0.080 | -1.0 | 2.0 | 11.338 |
| 123-124 | 0.080 | 1.0 | 2.0 | 11.338 |
| 125 | 0.080 | -1.0 | 4.0 | 11.338 |
| 126-127 | 0.080 | 1.0 | 4.0 | 11.338 |
| 128-129 | 0.080 | 1.0 | 16.0 | 11.338 |

A.3 Copper/zinc oxide potential

Table A.7 Parameters of the radial symmetry functions used to describe the local atomic environments (copper/zinc oxide system).

Symmetry functions of type G2:

| No. | Neighboring element | η (Bohr⁻²) | Rc (Bohr) |
|---|---|---|---|
| 1 | Cu | 0.0009 | 12.0 |
| 2 | Zn | 0.0009 | 12.0 |
| 3 | O | 0.0009 | 12.0 |
| 4 | Cu | 0.010 | 12.0 |
| 5 | Zn | 0.010 | 12.0 |
| 6 | O | 0.010 | 12.0 |
| 7 | Cu | 0.020 | 12.0 |
| 8 | Zn | 0.020 | 12.0 |
| 9 | O | 0.020 | 12.0 |
| 10 | Cu | 0.035 | 12.0 |
| 11 | Zn | 0.035 | 12.0 |
| 12 | O | 0.035 | 12.0 |
| 13 | Cu | 0.060 | 12.0 |
| 14 | Zn | 0.060 | 12.0 |
| 15 | O | 0.060 | 12.0 |
| 16 | Cu | 0.100 | 12.0 |
| 17 | Zn | 0.100 | 12.0 |
| 18 | O | 0.100 | 12.0 |
| 19 | Cu | 0.200 | 12.0 |
| 20 | Zn | 0.200 | 12.0 |
| 21 | O | 0.200 | 12.0 |
| 22 | Cu | 0.400 | 12.0 |
| 23 | Zn | 0.400 | 12.0 |
| 24 | O | 0.400 | 12.0 |

Table A.8 Parameters of the angular symmetry functions used to describe the local atomic environments (copper/zinc oxide system). For each set of parameters there are six functions referring to the six possible combinations of elements in the neighboring atom pairs.

Symmetry functions of type G4:

| No. | η (Bohr⁻²) | λ | ζ | Rc (Bohr) |
|---|---|---|---|---|
| 25-30 | 0.0001 | -1.0 | 1.0 | 12.0 |
| 31-36 | 0.0001 | 1.0 | 1.0 | 12.0 |
| 37-42 | 0.0001 | -1.0 | 2.0 | 12.0 |
| 43-48 | 0.0001 | 1.0 | 2.0 | 12.0 |
| 49-54 | 0.003 | -1.0 | 1.0 | 12.0 |
| 55-60 | 0.003 | 1.0 | 1.0 | 12.0 |
| 61-66 | 0.003 | -1.0 | 2.0 | 12.0 |
| 67-72 | 0.003 | 1.0 | 2.0 | 12.0 |
| 73-78 | 0.008 | 1.0 | 1.0 | 12.0 |
| 79-84 | 0.008 | 1.0 | 2.0 | 12.0 |
| 85-90 | 0.015 | 1.0 | 1.0 | 12.0 |
| 91-96 | 0.015 | 1.0 | 2.0 | 12.0 |
| 97-102 | 0.015 | 1.0 | 4.0 | 12.0 |
| 103-108 | 0.015 | 1.0 | 16.0 | 12.0 |
| 109-114 | 0.025 | 1.0 | 1.0 | 12.0 |
| 115-120 | 0.025 | 1.0 | 2.0 | 12.0 |
| 121-126 | 0.025 | 1.0 | 4.0 | 12.0 |
| 127-132 | 0.025 | 1.0 | 16.0 | 12.0 |
| 133-138 | 0.045 | 1.0 | 1.0 | 12.0 |
| 139-144 | 0.045 | 1.0 | 2.0 | 12.0 |
| 145-150 | 0.045 | 1.0 | 4.0 | 12.0 |
| 151-156 | 0.045 | 1.0 | 16.0 | 12.0 |

B Calculation of Elastic Constants of Cubic Lattices

The elasticity (or stiffness) tensor $\overleftrightarrow{C}$ relates the strain $\overleftrightarrow{\varepsilon}$ acting on a material to the resulting stress $\overleftrightarrow{\sigma}$ via Hooke's law

$$\overleftrightarrow{\sigma} = \overleftrightarrow{C}\,\overleftrightarrow{\varepsilon} . \tag{B.1}$$

In general, the tensor $\overleftrightarrow{C}$ has 21 independent components, the elasticity constants. However, for cubic lattices this number reduces, due to the high symmetry, to only three independent constants, $C_{11}$, $C_{12}$, and $C_{44}$, and the elasticity tensor is given by

$$\overleftrightarrow{C} = \begin{pmatrix}
C_{11} & C_{12} & C_{12} & 0 & 0 & 0 \\
C_{12} & C_{11} & C_{12} & 0 & 0 & 0 \\
C_{12} & C_{12} & C_{11} & 0 & 0 & 0 \\
0 & 0 & 0 & C_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & C_{44} & 0 \\
0 & 0 & 0 & 0 & 0 & C_{44}
\end{pmatrix} . \tag{B.2}$$

Note that the bulk modulus $B$ can be related to the elasticity constants by

$$B = \frac{1}{3}\left(C_{11} + 2C_{12}\right) . \tag{B.3}$$

For the calculation of the three elastic constants for copper we closely follow the method suggested by Mehl et al. [114], which we briefly outline in the following. The relaxed lattice shall be described by the vectors $\vec{a}_1$, $\vec{a}_2$ and $\vec{a}_3$. We define the strain tensor $\overleftrightarrow{\varepsilon}$ such that the deformation of the lattice under strain is described by

$$\begin{pmatrix} \vec{a}_1' \\ \vec{a}_2' \\ \vec{a}_3' \end{pmatrix} = \begin{pmatrix} \vec{a}_1 \\ \vec{a}_2 \\ \vec{a}_3 \end{pmatrix} \left( \overleftrightarrow{I} + \overleftrightarrow{\varepsilon} \right) , \tag{B.4}$$

where $\overleftrightarrow{I}$ is the 3 × 3 identity matrix. The strain can then be represented by a symmetric tensor with six independent components

$$\overleftrightarrow{\varepsilon} = \begin{pmatrix}
e_1 & e_6/2 & e_5/2 \\
e_6/2 & e_2 & e_4/2 \\
e_5/2 & e_4/2 & e_3
\end{pmatrix} . \tag{B.5}$$

As a direct result of Hooke's law (B.1), the total energy changes under strain according to

$$\Delta E(e_i) = -p(V)\,\Delta V + V \sum_{i=1}^{6} \sum_{j=1}^{6} \frac{1}{2} C_{ij}\, e_i e_j + O(e_i^3) , \tag{B.6}$$

where $V$ and $p$ are the unit cell volume and pressure of the undistorted lattice. The calculation of the elastic constants using expression (B.6) simplifies for the case of volume-conserving strain, i.e., $\Delta V = 0$. For cubic lattices the volume-conserving orthorhombic strain with

$$e_1 = -e_2 = x, \qquad e_3 = \frac{x^2}{1 - x^2}, \qquad e_4 = e_5 = e_6 = 0 \tag{B.7}$$

results in a symmetric change of the total energy,

$$\Delta E(x) = V \left(C_{11} - C_{12}\right) x^2 + O(x^4) , \tag{B.8}$$

and allows for simple access to the difference $C_{11} - C_{12}$. The volume-conserving strain

$$e_6 = x, \qquad e_3 = \frac{x^2}{4 - x^2}, \qquad e_1 = e_2 = e_4 = e_5 = 0 \tag{B.9}$$

on the other hand, yields a change of energy as a function of the strain $x$ of

$$\Delta E(x) = \frac{1}{2} V C_{44}\, x^2 + O(x^4) , \tag{B.10}$$

which only depends on the elastic modulus $C_{44}$. If the bulk modulus $B$ is known, the three independent elastic constants $C_{11}$, $C_{12}$ and $C_{44}$ can thus be calculated using the relations (B.3), (B.8) and (B.10).

Bibliography

[1] G. A. Olah, “Beyond oil and gas: The methanol economy”, Angew. Chem. Int. Ed., 44, 2636 – 2639, 2005.
[2] G. A. Olah, A. Goeppert, and G. K. S.
Prakash, “Chemical recycling of carbon dioxide to methanol and dimethyl ether: From greenhouse gas to renewable, environmentally carbon neutral fuels and synthetic hydrocarbons”, J. Org. Chem., 74, 487 – 498, 2009.
[3] M. Behrens, F. Studt, I. Kasatkin, S. Kuhl, M. Havecker, F. Abild-Pedersen, S. Zander, F. Girgsdies, P. Kurr, B. L. Kniep, M. Tovar, R. W. Fischer, J. K. Nørskov, and R. Schlögl, “The active site of methanol synthesis over Cu/ZnO/Al2O3 industrial catalysts”, Science, 336, 893 – 897, 2012.
[4] M. Escudero-Escribano, A. Verdaguer-Casadevall, P. Malacrida, U. Grønbjerg, B. P. Knudsen, A. K. Jepsen, J. Rossmeisl, I. E. L. Stephens, and I. Chorkendorff, “Pt5Gd as a highly active and stable catalyst for oxygen electroreduction”, J. Am. Chem. Soc., 134, 16476 – 16479, 2012.
[5] B. Lim, M. Jiang, P. H. C. Camargo, E. C. Cho, J. Tao, X. Lu, Y. Zhu, and Y. Xia, “Pd-Pt bimetallic nanodendrites with high activity for oxygen reduction”, Science, 324, 1302 – 1305, 2009.
[6] M. Appl, Ammonia, 1. Introduction, 3, 107 – 137. Wiley-VCH Verlag GmbH & Co. KGaA, 2000.
[7] SFB 558, http://www.sfb558.de/, 2012.
[8] J. D. Grunwaldt, A. M. Molenbroek, N. Y. Topsøe, H. Topsøe, and B. S. Clausen, “In situ investigations of structural changes in Cu/ZnO catalysts”, J. Catal., 194, 452 – 460, 2000.
[9] K. H. Ernst, A. Ludviksson, R. Zhang, J. Yoshihara, and C. T. Campbell, “Growth-model for metal-films on oxide surfaces: Cu on ZnO(0001)-O”, Phys. Rev. B, 47, 13782 – 13796, 1993.
[10] I. Kasatkin, P. Kurr, B. Kniep, A. Trunschke, and R. Schlögl, “Role of lattice strain and defects in copper particles on the activity of Cu/ZnO/Al2O3 catalysts for methanol synthesis”, Angew. Chem. Int. Ed., 46, 7324 – 7327, 2007.
[11] N. Y. Topsøe and H. Topsøe, “On the nature of surface structural changes in Cu/ZnO methanol synthesis catalysts”, Topics In Catalysis, 8, 267 – 270, 1999.
[12] J. B. Wagner, P. L. Hansen, A. M. Molenbroek, H. Topsøe, B. S. Clausen, and S. Helveg, “In situ electron energy loss spectroscopy studies of gas-dependent metal-support interactions in Cu/ZnO catalysts”, J. Phys. Chem. B, 107, 7753 – 7758, 2003.
[13] H. S. Qiu, F. Gallino, C. Di Valentin, and Y. M. Wang, “Shallow donor states induced by in-diffused Cu in ZnO: A combined HREELS and hybrid DFT study”, Phys. Rev. Lett., 106, 066401-1 – 066401-4, 2011.
[14] M. Kroll, T. Löber, V. Schott, C. Wöll, and U. Köhler, “Thermal behavior of MOCVD-grown Cu-clusters on ZnO(101̄0)”, Phys. Chem. Chem. Phys., 14, 1654 – 1659, 2012.
[15] K. Ozawa, Y. Oba, and K. Edamoto, “Oxidation of copper clusters on ZnO(101̄0): Effect of temperature and preadsorbed water”, Surf. Sci., 601, 3125 – 3132, 2007.
[16] P. L. Hansen, J. B. Wagner, S. Helveg, J. R. Rostrup-Nielsen, B. S. Clausen, and H. Topsøe, “Atom-resolved imaging of dynamic shape changes in supported copper nanocrystals”, Science, 295, 2053 – 2055, 2002.
[17] M. Kroll and U. Köhler, Private communication, 2012.
[18] J. Kiss, J. Frenzel, N. N. Nair, B. Meyer, and D. Marx, “Methanol synthesis on ZnO(0001). III. Free energy landscapes, reaction pathways, and mechanistic insights”, J. Chem. Phys., 134, 064710-1 – 064710-14, 2011.
[19] J. Kossmann, G. Rossmüller, and C. Hättig, “Prediction of vibrational frequencies of possible intermediates and side products of the methanol synthesis on ZnO(0001) by ab initio calculations”, J. Chem. Phys., 136, 034706-1 – 034706-12, 2012.
[20] X. Duan, O. Warschkow, A. Soon, B. Delley, and C. Stampfl, “Density functional study of oxygen on Cu(100) and Cu(110) surfaces”, Phys. Rev. B, 81, 075430-1 – 075430-15, 2010.
[21] A. Soon, M. Todorova, B. Delley, and C. Stampfl, “Oxygen adsorption and stability of surface oxides on Cu(111): A first-principles investigation”, Phys. Rev. B, 73, 165424-1 – 165424-12, 2006.
[22] A. Soon, M. Todorova, B. Delley, and C. Stampfl, “Surface oxides of the oxygen-copper system: Precursors to the bulk oxide phase?”, Surf. Sci., 601, 5809 – 5813, 2007.
[23] B. Meyer and D. Marx, “Density-functional study of the structure and stability of ZnO surfaces”, Phys. Rev. B, 67, 035403-1 – 035403-11, 2003.
[24] R. Kováčik, B. Meyer, and D. Marx, “F centers versus dimer vacancies on ZnO surfaces: Characterization by STM and STS calculations”, Angew. Chem. Int. Ed., 46, 4894 – 4897, 2007.
[25] M. Valtiner, M. Todorova, G. Grundmeier, and J. Neugebauer, “Temperature stabilized surface reconstructions at polar ZnO(0001)”, Phys. Rev. Lett., 103, 065502-1 – 065502-4, 2009.
[26] O. Warschkow, K. Chuasiripattana, M. J. Lyle, B. Delley, and C. Stampfl, “Cu/ZnO(0001) under oxidating and reducing conditions: A first-principles survey of surface structures”, Phys. Rev. B, 84, 125311-1 – 125311-25, 2011.
[27] B. Meyer and D. Marx, “Density-functional study of Cu atoms, monolayers, films, and coadsorbates on polar ZnO surfaces”, Phys. Rev. B, 69, 235420-1 – 235420-7, 2004.
[28] G. C. Abell, “Empirical chemical pseudopotential theory of molecular and metallic bonding”, Phys. Rev. B, 31, 6184 – 6196, 1985.
[29] J. Tersoff, “New empirical approach for the structure and energy of covalent systems”, Phys. Rev. B, 37, 6991 – 7000, 1988.
[30] D. W. Brenner, “Empirical potential for hydrocarbons for use in simulating the chemical vapor deposition of diamond films”, Phys. Rev. B, 42, 9458 – 9471, 1990.
[31] D. G. Pettifor and I. I. Oleinik, “Analytic bond-order potentials beyond Tersoff-Brenner. I. Theory”, Phys. Rev. B, 59, 8487 – 8499, 1999.
[32] M. W. Finnis and J. E. Sinclair, “A simple empirical n-body potential for transition metals”, Phil. Mag. A, 50, 45 – 55, 1984.
[33] M. S. Daw and M. I. Baskes, “Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals”, Phys. Rev. B, 29, 6443 – 6453, 1984.
[34] W. R. P. Scott, P. H. Hünenberger, I. G. Tironi, A. E. Mark, S. R. Billeter, J. Fennen, A. E. Torda, T. Huber, P. Krüger, and W. F.
van Gunsteren, “The GROMOS biomolecular simulation program package”, J. Phys. Chem. A, 103, 3596 – 3607, 1999.
[35] M. T. Nelson, W. Humphrey, A. Gursoy, A. Dalke, L. V. Kalé, R. D. Skeel, and K. Schulten, “NAMD: a parallel, object-oriented molecular dynamics program”, Int. J. High Perform. Comput., 10, 251 – 268, 1996.
[36] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman, “A second generation force field for the simulation of proteins, nucleic acids, and organic molecules”, J. Am. Chem. Soc., 117, 5179 – 5197, 1995.
[37] D. Van Der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, and H. J. C. Berendsen, “GROMACS: fast, flexible, and free”, J. Comput. Chem., 26, 1701 – 1718, 2005.
[38] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman, “A second generation force field for the simulation of proteins, nucleic acids, and organic molecules”, J. Am. Chem. Soc., 117, 5179 – 5197, 1995.
[39] B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, “A program for macromolecular energy, minimization, and dynamics calculations”, J. Comput. Chem., 4, 187 – 217, 1983.
[40] M. S. Daw and M. I. Baskes, “Semiempirical, quantum mechanical calculation of hydrogen embrittlement in metals”, Phys. Rev. Lett., 50, 1285 – 1288, 1983.
[41] M. I. Baskes, “Modified embedded-atom potentials for cubic materials and impurities”, Phys. Rev. B, 46, 2727 – 2742, 1992.
[42] S. Ryu, C. R. Weinberger, M. I. Baskes, and W. Cai, “Improved modified embedded-atom method potentials for gold and silicon”, Modelling Simul. Mater. Sci. Eng., 17, 075008-1 – 075008-14, 2009.
[43] A. C. T. van Duin, S. Dasgupta, F. Lorant, and W. A. Goddard, “ReaxFF: A reactive force field for hydrocarbons”, J. Phys. Chem. A, 105, 9396 – 9409, 2001.
[44] D. Raymand, A. C. T. van Duin, M.
Baudin, and K. Hermansson, “A reactive force field (ReaxFF) for zinc oxide”, Surf. Sci., 602, 1020 – 1031, 2008.
[45] A. P. Bartók, Gaussian Approximation Potential: An Interatomic Potential Derived from First Principles Quantum Mechanics. PhD thesis, University of Cambridge, Cambridge, 2009.
[46] A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, “Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons”, Phys. Rev. Lett., 104, 136403-1 – 136403-4, 2010.
[47] T. B. Blank, S. D. Brown, A. W. Calhoun, and D. J. Doren, “Neural-network models of potential-energy surfaces”, J. Chem. Phys., 103, 4129 – 4137, 1995.
[48] S. Lorenz, A. Gross, and M. Scheffler, “Representing high-dimensional potential-energy surfaces for reactions at surfaces by neural networks”, Chem. Phys. Lett., 395, 210 – 215, 2004.
[49] J. Hertz, A. Krogh, and R. G. Palmer, Introduction to the Theory of Neural Computation. Addison-Wesley, Reading, 1996.
[50] C. M. Handley and P. L. A. Popelier, “Potential energy surfaces fitted by artificial neural networks”, J. Phys. Chem. A, 114, 3371 – 3383, 2010.
[51] J. Behler, “Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations”, Phys. Chem. Chem. Phys., 13, 17930 – 17955, 2011.
[52] L. Raff, R. Komanduri, M. Hagan, and S. Bukkapatnam, Neural Networks in Chemical Reaction Dynamics. Oxford University Press, New York, 2012.
[53] J. Behler and M. Parrinello, “Generalized neural-network representation of high-dimensional potential-energy surfaces”, Phys. Rev. Lett., 98, 146401 – 146405, 2007.
[54] J. Behler, “Neural network potential-energy surfaces for atomistic simulations”, Chemical Modelling Applications and Theory, 7, 1 – 41, The Royal Society of Chemistry, 2010.
[55] J. Behler, R. Martoňák, D. Donadio, and M. Parrinello, “Metadynamics simulations of the high-pressure phases of silicon employing a high-dimensional neural network potential”, Phys. Rev.
Lett., 100, 185501-1 – 185501-4, 2008.
[56] J. Behler, R. Martoňák, D. Donadio, and M. Parrinello, “Pressure-induced phase transitions in silicon studied by neural network-based metadynamics simulations”, Phys. Status Solidi B, 245, 2618 – 2629, 2008.
[57] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. G. Ren, K. Reuter, and M. Scheffler, “Ab initio molecular simulations with numeric atom-centered orbitals”, Comput. Phys. Commun., 180, 2175 – 2196, 2009.
[58] P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli, G. L. Chiarotti, M. Cococcioni, I. Dabo, A. Dal Corso, S. de Gironcoli, S. Fabris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougoussis, A. Kokalj, M. Lazzeri, L. Martin-Samos, N. Marzari, F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello, L. Paulatto, C. Sbraccia, S. Scandolo, G. Sclauzero, A. P. Seitsonen, A. Smogunov, P. Umari, and R. M. Wentzcovitch, “QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials”, J. Phys.: Condens. Matter, 21, 395502-1 – 395502-19, 2009.
[59] E. Schrödinger, “Quantisierung als Eigenwertproblem”, Ann. Phys. (Berlin), 384, 361 – 376, 1926.
[60] M. Born and R. Oppenheimer, “Zur Quantentheorie der Molekeln”, Ann. Phys. (Berlin), 389, 457 – 484, 1927.
[61] R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules. International Series of Monographs on Chemistry, Oxford University Press, 1989.
[62] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas”, Phys. Rev., 136, B864 – B871, 1964.
[63] W. Kohn and L. J. Sham, “Self-consistent equations including exchange and correlation effects”, Phys. Rev., 140, A1133 – A1138, 1965.
[64] C. C. J. Roothaan, “New developments in molecular orbital theory”, Rev. Mod. Phys., 23, 69 – 89, 1951.
[65] G. G. Hall, “The molecular orbital theory of chemical valency. VIII. A method of calculating ionization potentials”, Proceedings of the Royal Society of London. Series A.
Mathematical and Physical Sciences, 205, 541 – 552, 1951.
[66] T. Auckenthaler, V. Blum, H.-J. Bungartz, T. Huckle, R. Johanni, L. Krämer, B. Lang, H. Lederer, and P. R. Willems, “Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations”, Parallel Computing, 37, 783 – 794, 2011.
[67] L. Pauling, “The application of the quantum mechanics to the structure of the hydrogen molecule and hydrogen molecule-ion and to related problems”, Chem. Rev., 5, 173 – 213, 1928.
[68] J. E. Lennard-Jones, “The electronic structure of some diatomic molecules”, Trans. Faraday Soc., 25, 668 – 686, 1929.
[69] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren, K. Reuter, and M. Scheffler, “Ab initio molecular simulations with numeric atom-centered orbitals”, Comput. Phys. Commun., 180, 2175 – 2196, 2009.
[70] J. Ortega, “First-principles methods for tight-binding molecular dynamics”, Comp. Mat. Sci., 12, 192 – 209, 1998.
[71] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications. Academic Press, 2nd edition, 2001.
[72] D. A. McQuarrie, Statistical Mechanics. Harper & Row, London, 1976.
[73] M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids. Clarendon Press, Oxford, 1987.
[74] W. C. Swope, H. C. Andersen, P. H. Berens, and K. R. Wilson, “A computer-simulation method for the calculation of equilibrium-constants for the formation of physical clusters of molecules - application to small water clusters”, J. Chem. Phys., 76, 637 – 649, 1982.
[75] D. Marx and J. Hutter, Ab Initio Molecular Dynamics. Cambridge University Press, Cambridge, 2009.
[76] W. Koch and M. C. Holthausen, A Chemist’s Guide to Density Functional Theory. Wiley-VCH Verlag GmbH, Weinheim, 2000.
[77] S. Lorenz, Reactions on Surfaces with Neural Networks. PhD Thesis, Technische Universität, Berlin, 2001.
[78] J. Behler, 2012.
RuNNer – A Neural Network Code for High-Dimensional Potential-Energy Surfaces, Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum.
[79] A. Bholoa, S. Kenny, and R. Smith, “A new approach to potential fitting using neural networks”, Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms, 255, 1 – 7, 2007.
[80] E. Sanville, A. Bholoa, R. Smith, and S. D. Kenny, “Silicon potentials investigated using density functional theory fitted neural networks”, J. Phys.: Condens. Matter, 20, 285219-1 – 285219-10, 2008.
[81] J. Behler, “Atom-centered symmetry functions for constructing high-dimensional neural network potentials”, J. Chem. Phys., 134, 074106-1 – 074106-13, 2011.
[82] H. Eshet, R. Z. Khaliullin, T. D. Kühne, J. Behler, and M. Parrinello, “Ab initio quality neural-network potential for sodium”, Phys. Rev. B, 81, 184107-1 – 184107-8, 2010.
[83] R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello, “Graphite-diamond phase coexistence study employing a neural-network mapping of the ab initio potential energy surface”, Phys. Rev. B, 81, 100103-1 – 100103-4, 2010.
[84] R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello, “Nucleation mechanism for the direct graphite-to-diamond phase transition”, Nature Mater., 10, 693 – 697, 2011.
[85] N. Artrith and J. Behler, “High-dimensional neural network potentials for metal surfaces: A prototype study for copper”, Phys. Rev. B, 85, 045439-1 – 045439-13, 2012.
[86] T. B. Blank and S. D. Brown, “Adaptive, global, extended Kalman filters for training feedforward neural networks”, J. Chemometrics, 8, 391 – 407, 1994.
[87] J. B. Witkoskie and D. J. Doren, “Neural network models of potential energy surfaces: Prototypical examples”, J. Chem. Theory Comput., 1, 14 – 23, 2005.
[88] S. Shah, F. Palmieri, and M. Datum, “Optimal filtering algorithms for fast learning in feedforward neural networks”, Neural Networks, 5, 779 – 787, 1992.
[89] A.
Pukrittayakamee, M. Malshe, M. Hagan, L. M. Raff, R. Narulkar, S. Bukkapatnam, and R. Komanduri, “Simultaneous fitting of a potential-energy surface and its corresponding force fields using feedforward neural networks”, J. Chem. Phys., 130, 134101-1 – 134101-10, 2009.
[90] H. M. Le and L. M. Raff, “Molecular dynamics investigation of the bimolecular reaction BeH + H2 → BeH2 + H on an ab initio potential-energy surface obtained using neural network methods with both potential and gradient accuracy determination”, J. Phys. Chem. A, 114, 45 – 53, 2010.
[91] R. S. Mulliken, “Electronic population analysis on LCAO-MO molecular wave functions. I”, J. Chem. Phys., 23, 1833 – 1840, 1955.
[92] F. L. Hirshfeld, “Bonded-atom fragments for describing molecular charge-densities”, Theor. Chem. Acc., 44, 129 – 138, 1977.
[93] R. Bader, Atoms in Molecules: A Quantum Theory. Oxford University Press, New York, 1990.
[94] C. M. Handley and P. L. A. Popelier, “Dynamically polarizable water potential based on multipole moments trained by machine learning”, J. Chem. Theory Comput., 5, 1474 – 1489, 2009.
[95] N. Artrith, T. Morawietz, and J. Behler, “High-dimensional neural-network potentials for multicomponent systems: Applications to zinc oxide”, Phys. Rev. B, 83, 153101-1 – 153101-4, 2011.
[96] T. Morawietz, V. Sharma, and J. Behler, “A neural network potential-energy surface for the water dimer based on environment-dependent atomic energies and charges”, J. Chem. Phys., 136, 064103-1 – 064103-11, 2012.
[97] J. W. Ponder, TINKER, Version 5.1, Software Tools for Molecular Design, Biochemistry and Molecular Biophysics, Washington University, St. Louis, USA.
[98] J. P. Perdew, K. Burke, and M. Ernzerhof, “Generalized gradient approximation made simple”, Phys. Rev. Lett., 77, 3865 – 3868, 1996.
[99] E. van Lenthe, E. J. Baerends, and J. G. Snijders, “Relativistic total-energy using regular approximations”, J. Chem. Phys., 101, 9783 – 9792, 1994.
[100] J.
Ischtwan and M. A. Collins, “Molecular-potential energy surfaces by interpolation”, J. Chem. Phys., 100, 8080 – 8088, 1994.
[101] R. Dawes, D. L. Thompson, A. F. Wagner, and M. Minkoff, “Interpolating moving least-squares methods for fitting potential energy surfaces: A strategy for efficient automatic data point placement in high dimensions”, J. Chem. Phys., 128, 084107-1 – 084107-10, 2008.
[102] L. M. Raff, M. Malshe, M. Hagan, D. I. Doughan, M. G. Rockley, and R. Komanduri, “Ab initio potential-energy surfaces for complex, multichannel systems using modified novelty sampling and feedforward neural networks”, J. Chem. Phys., 122, 084104-1 – 084104-16, 2005.
[103] D. R. Lide, Handbook of Chemistry and Physics. CRC Press, Boca Raton, 90th ed., 2009.
[104] G. Simons and H. Wang, Single Crystal Elastic Constants and Calculated Aggregate Properties. MIT Press, Cambridge, MA, 1977.
[105] C. Noguera, “Polar oxide surfaces”, J. Phys.: Condens. Matter, 12, R367 – R410, 2000.
[106] A. Wander, F. Schedin, P. Steadman, A. Norris, R. McGrath, T. S. Turner, G. Thornton, and N. M. Harrison, “Stability of polar oxide surfaces”, Phys. Rev. Lett., 86, 3811 – 3814, 2001.
[107] V. Staemmler, Theoretical Aspects of Transition Metal Catalysis, The Cluster Approach for the Adsorption of Small Molecules on Oxide Surfaces, 219 – 256. Springer Berlin/Heidelberg, 2003.
[108] O. Dulub, U. Diebold, and G. Kresse, “Novel stabilization mechanism on polar surfaces: ZnO(0001)-Zn”, Phys. Rev. Lett., 90, 016102-1 – 016102-4, 2003.
[109] G. Kresse, O. Dulub, and U. Diebold, “Competing stabilization mechanism for the polar ZnO(0001)-Zn surface”, Phys. Rev. B, 68, 245409-1 – 245409-15, 2003.
[110] P. P. Ewald, “Die Berechnung optischer und elektrostatischer Gitterpotentiale”, Ann. Phys., 369, 253 – 287, 1921.
[111] D. H. Nguyen, “Neural networks for self-learning control systems”, IEEE Control Systems Magazine, 10, 18 – 23, 1990.
[112] N. Artrith, B. Hiller, and J. Behler, Phys.
Status Solidi B, 2012 (invited feature article, accepted).
[113] K. V. J. Jose, N. Artrith, and J. Behler, “Construction of high-dimensional neural network potentials using environment-dependent atom pairs”, J. Chem. Phys., 136, 194111-1 – 194111-15, 2012.
[114] J. H. Westbrook and R. L. Fleischer, eds., First principles calculations of elastic properties of metals, Ch. 9, 195 – 210. John Wiley and Sons, London, 1993.

Acknowledgements

I would like to thank all those whose support and motivation contributed to the completion of my PhD studies and this thesis. First of all, I wish to thank my supervisor Dr. Jörg Behler for giving me the opportunity to join his research group, for introducing me to a fascinating project at the Department of Theoretical Chemistry, Ruhr University, Bochum, Germany (TheoChem@RUB), and for his support over the past years and with my future plans. I am grateful to TheoChem@RUB for providing excellent facilities during my PhD studies. Special thanks go to Björn Hiller for the great discussions about the research project and for his kind help with computer problems, to Tobias Morawietz for exchanging experiences with neural networks, and to Dr. Volker Blum for stimulating discussions about the FHI-aims code. I would also like to thank Prof. Dr. Ulrich Köhler and Martin Kroll for sharing the STM results of the Cu@ZnO structures and for all the discussions. Furthermore, I would like to thank all of my present and former colleagues (especially the Behler group and the Marx group) for the excellent working environment at TheoChem@RUB. I am much indebted to Dr. Holger Langer and Dr. Harald Forbert for their continuous support whenever a technical problem occurred. Additionally, I very much appreciate the excellent and kind administrative support of Mrs. Sylke Kohlpoth, Mrs. Doris Fischer-Niess, and Mrs. Gundula Talbot during my stay at TheoChem@RUB. Finally, I would like to express my gratitude to Dr.
Alexander Urban for proofreading my thesis and for his support at times when my motivation was low. This work was financially supported by the Deutsche Forschungsgemeinschaft (DFG) through the collaborative research center SFB 558 “Metal-substrate interactions in heterogeneous catalysis” and an Emmy Noether program.