High-dimensional neural network potentials for solids and surfaces

RUHR-UNIVERSITÄT BOCHUM
FAKULTÄT FÜR CHEMIE UND BIOCHEMIE
LEHRSTUHL FÜR THEORETISCHE CHEMIE
High-Dimensional Neural Network Potentials
for Solids and Surfaces:
Applications to Copper and Zinc Oxide
Dissertation
submitted for the degree of Doctor of Natural Sciences
Nongnuch Artrith
This dissertation was prepared between June 2008 and November 2012 at the
Chair of Theoretical Chemistry, Faculty of Chemistry and Biochemistry,
Ruhr-Universität Bochum.
Thesis supervisor:
Dr. Jörg Behler
First referee:
Dr. Jörg Behler
Second referee:
Prof. Dr. Dominik Marx
Dean:
Prof. Dr. Wolfram Sander
วิทยานิ พนธเลมนี้ ขอมอบใหแด บิดา-มารดาของขาพเจา นายประยุทธ-นางสังวาลย อาจฤทธิ ์
นางสาวบุษบา อาจฤทธิ ์ (นองสาว) ตลอดจนครอบครัว อาจฤทธิ ์ ศรีกะกุล และ คุณพี่ดารา-คุณพี่สมบูรณ พิลาโสภา
ทุกคนคอยสงกําลังใจ ใหความสําคัญกับการศึกษา และใหการสนับสนุนทุกๆ อยางแกขาพเจาตลอดมา
โดยเฉพาะชวงเวลาที่ขาพเจาอยูหางไกลคนละทวีปของโลกใบนี้ ขอกราบขอบพระคุณดวยความเคารพและรักสุดหัวใจ
นางสาวนงคนุช อาจฤทธิ ์ (2012)
To my parents and my family (Artrith and Srikakul)
for their love, support and patience.
A special thanks for my mother's words:
“Keep studying as much as you can, and study well,
because no one can take from you what you have already learned.”
Meinen Eltern und meiner Familie Artrith und Srikakul
für ihre Liebe, Unterstützung und Geduld
Einen besonderen Dank für die Worte meiner Mutter:
„Lerne, soviel du kannst, und lerne gut,
denn niemand kann dir nehmen, was du schon gelernt hast.”
Zusammenfassung
The simulation of large, realistic surfaces of non-idealized heterogeneous catalysts
requires the modeling of systems comprising several thousand atoms. In particular, the
reliability of molecular dynamics simulations of large systems depends strongly on
an accurate description of the underlying potential energy surface (PES). First-principles
methods such as density-functional theory (DFT) allow the accurate prediction of
energies and atomic forces, but the required system sizes are currently not accessible
with DFT because of the high computational cost.
This dissertation shows that high-dimensional neural networks (NNs) trained on
first-principles data can accurately represent the PES of copper/zinc oxide interfaces.
Copper clusters supported on zinc oxide are an important heterogeneous
catalyst for industrial methanol synthesis. Compared to DFT calculations, the
evaluation of NN potentials is several orders of magnitude faster. Moreover, the
computational cost scales linearly with the number of atoms.
The construction of an accurate copper/zinc oxide potential required a series of
intermediate steps, which are discussed in this work. First, the general applicability
of the NN method to metallic systems was demonstrated using copper as a case study.
In addition, an extension of the NN method based on environment-dependent atomic
charges was introduced, which enables the reliable description of charge transfer in
multicomponent structures. This methodology is illustrated by means of an NN potential for
zinc oxide. The insights from these first two steps then allow
the construction of a first NN potential for the ternary system of copper and zinc oxide.
Finally, the accuracy of the NN method for molecular structures is examined using the
methanol molecule as an example.
Every NN potential presented in this work was carefully tested. For this purpose, a
variety of properties were investigated, such as structural energy differences, atomic
forces, vacancy formation energies, elastic properties, and surface energies of various
copper and zinc oxide surfaces. The predicted geometries, energies, atomic forces,
and charges are in excellent agreement with the DFT reference values.
Abstract
The simulation of large, realistic surfaces of non-ideal heterogeneous catalysts makes it
necessary to model systems containing several thousand atoms. In particular, molecular
dynamics simulations of large systems critically depend on the accurate description
of the underlying potential energy surface (PES). First-principles methods such as
density-functional theory (DFT) can provide very accurate energies and forces, but
the simulation of such system sizes with DFT is currently infeasible due to the high
computational cost.
In this thesis it is demonstrated that high-dimensional neural networks (NNs) trained on
first-principles data are able to accurately represent the PES of zinc oxide-supported
copper clusters, an important heterogeneous catalyst for methanol synthesis. The
evaluation of NN potentials is several orders of magnitude faster than DFT calculations,
and their computational cost scales linearly with the number of simulated atoms.
The construction of an accurate copper/zinc oxide potential required several steps, which are discussed in the thesis. First, the general applicability of the NN method
to metallic systems has been investigated using the example of a copper NN potential.
Second, it is demonstrated that an extension of the NN methodology based on environment-dependent atomic charges allows the accurate description of multicomponent
systems exhibiting charge transfer. Third, a working ternary potential for copper/zinc
oxide interface structures is presented. Finally, the accuracy of NN potentials for
molecular structures is reported for the methanol molecule as a benchmark example.
Each constructed NN potential has been carefully tested. Several properties, e.g., structural energy differences, atomic forces, vacancy formation energies, elastic properties,
and surface energies for different copper and zinc oxide surfaces, are presented
here. The predicted geometries, energies, atomic forces, and atomic charges are in
excellent agreement with reference DFT calculations.
Associated Publications
Some of the results presented in this thesis have already been published in the following
articles:
N. Artrith and J. Behler,
“High-dimensional neural network potentials for metal surfaces:
A prototype study for copper”,
Phys. Rev. B 85 (2012) 045439.
N. Artrith, T. Morawietz and J. Behler,
“High-dimensional neural-network potentials for multicomponent systems:
Applications to zinc oxide”,
Phys. Rev. B 83 (2011) 153101.
N. Artrith, B. Hiller and J. Behler,
“Neural network potentials for metals and oxides – First applications to
copper clusters at zinc oxide”,
Phys. Status Solidi B 1–13 (2012), accepted (invited feature article).
K. V. J. Jose, N. Artrith and J. Behler,
“Construction of high-dimensional neural network potentials using
environment-dependent atom pairs”,
J. Chem. Phys. 136 (2012) 194111.
Contents
Zusammenfassung
Abstract
Associated Publications
I Introduction
  1 Introduction
    1.1 Heterogeneous catalysis
    1.2 Accurate and efficient atomistic potentials for materials
    1.3 Structure of the thesis
II Theoretical Background
  2 Electronic Structure Calculations
    2.1 The Born–Oppenheimer approximation
    2.2 The electronic structure problem
    2.3 Density-functional theory
  3 The FHI-aims Code
    3.1 Basis set expansion
  4 Molecular dynamics simulations
  5 Neural Network Potentials
    5.1 Artificial neural networks
    5.2 High-dimensional neural network potentials
    5.3 High-Dimensional Neural Networks for Multicomponent Systems
    5.4 Benchmark of activation functions and symmetry functions
    5.5 Molecular dynamics simulations employing NN potentials
III Computational Details
  6 Computational Details
    6.1 DFT calculations
    6.2 Construction of reference data sets
    6.3 Optimization of the neural network architecture
IV Results
  7 A Neural Network Potential for Copper
    7.1 Reference data set
    7.2 A neural network potential for copper
    7.3 Reliability of the neural network potential for a large realistic structure
  8 A Multicomponent Neural Network Potential for Zinc Oxide
    8.1 Neural network potentials for multicomponent systems
    8.2 Reference data set
    8.3 Neural network potential for zinc oxide
  9 Neural Network Potentials for Ternary Systems
    9.1 Construction of a neural network potential-energy surface for copper/zinc oxide
    9.2 Reference data set
  10 A Neural Network Potential for the Methanol Molecule
V Summary and Outlook
  11 Summary and Outlook
VI Appendix
  A Symmetry Function Parameters
    A.1 Copper potential
    A.2 Zinc oxide potential
    A.3 Copper/zinc oxide potential
  B Calculation of Elastic Constants of Cubic Lattices
Bibliography
Acknowledgements
List of Figures
5.1 Small example of a feed-forward neural network
5.2 High-dimensional neural network
5.3 Demonstration of a NN fit without force information
5.4 Demonstration of a NN fit including forces
5.5 An example of the radial symmetry functions $G_i^2$
5.6 An example of the angular symmetry functions $G_i^4$
5.7 High-dimensional NN for multicomponent systems
5.8 Flow chart of the RuNNer–TINKER interface
6.1 A systematic approach to construct neural network potentials
7.1 Energy vs. volume curves for Cu crystal structures
7.2 DFT energies of Cu bulk structures in the reference data set
7.3 Comparison of NN energies of two different Cu potentials
7.4 Comparison of RMSEs of copper bulk systems
7.5 Comparison of the fitting error of different copper NNs
7.6 Comparison of errors for the Cu training and test sets
7.7 Comparison of NN and DFT energies for a random Cu30 cluster
7.8 Comparison of NN and DFT forces for atoms in a Cu14 cluster
7.9 Comparison of NN and DFT energies of 16-atom Cu bulk structures
7.10 Comparison of NN and DFT forces acting on atoms in Cu bulk structures
7.11 Energy profiles of a diffusing Cu surface adatom
7.12 Slab model of a realistic Cu (111) surface
7.13 Atomic benchmark environments in a realistic Cu surface
7.14 Clusters extracted from the realistic Cu surface model
7.15 Comparison of NN and DFT atomic forces for Cu clusters (6 Å)
7.16 Comparison of NN and DFT atomic forces for Cu clusters (12 Å)
7.17 Convergence of the NN forces with increasing cutoff
8.1 Energy density of states for the ZnO data set
8.2 Comparison of NN and DFT energies of random Zn40O40 clusters
8.3 Comparison of NN and DFT forces in a Zn15O15 cluster
8.4 Comparison of NN and DFT force components in a random Zn15O15 cluster
8.5 Comparison of NN and DFT atomic charges for the same Zn15O15 cluster
8.6 Comparison of NN and DFT energies of ideal ZnO crystal structures
8.7 Comparison of NN and DFT energies of random ZnO bulk structures
8.8 Comparison of NN and DFT energies of thermally distorted surface structures
9.1 Realistic Cu surface model with imperfections
9.2 Cu clusters extracted from a large surface model
9.3 Comparison of NN and DFT forces acting on the central atoms of Cu clusters
9.4 Comparison of NN and DFT energies of random ZnO bulk structures
9.5 Comparison of NN and DFT forces in a Zn15O15 cluster
9.6 Comparison of NN and DFT energies of random CuO bulk structures
9.7 Comparison of NN and DFT energies of random CuZn bulk structures
9.8 Comparison of NN and DFT energies of random Cu27Zn20O20 clusters
9.9 Comparison of NN and DFT energies for random Cu/ZnO slabs
9.10 A model of a large Cu cluster on a ZnO surface
9.11 Snapshot of an MD simulation of a large Cu cluster on a ZnO surface
9.12 Comparison of NN and DFT atomic forces in a large Cu/ZnO interface structure
9.13 Convergence of the NN forces with respect to the cutoff radius
10.1 Comparison of NN and MM3 energies for the methanol molecule
10.2 Dihedral potential for the CH3 rotation in methanol
10.3 Comparison of NN and MM3 energies during an MD simulation of methanol
List of Tables
5.1 RMSEs of NN fits for the Cu dimer
7.1 Cu clusters in the training and the test set
7.2 Cu bulk structures in the training and the test set
7.3 Cu surface structures in the training and the test set
7.4 RMSEs for different copper NN potentials
7.5 Comparison of NN and DFT Cu lattice parameters
7.6 Comparison: NN vs. DFT for Cu elastic constants
7.7 Cu vacancy formation energies
7.8 DFT and NN copper surface energies
7.9 Vacancy formation energies at various Cu surfaces
7.10 Number of atoms in Cu clusters of increasing diameter
8.1 Composition of the training and the test set for the ZnO system
8.2 Energies and charges in the ZnO reference data set
8.3 RMSEs of energies and forces of different ZnO NN potentials (A)
8.4 RMSEs of energies and forces of different ZnO NN potentials (B)
8.5 Lattice parameters and bulk moduli of ZnO crystal structures
9.1 Composition of the Cu/ZnO reference data set
9.2 RMSEs for energies and forces of various Cu/ZnO NN fits (A)
9.3 RMSEs for energies and forces of various Cu/ZnO NN fits (B)
9.4 Cu lattice parameters as obtained using the Cu/ZnO potential
9.5 Cu surface energies as obtained using the Cu/ZnO potential
9.6 ZnO lattice parameters as obtained using the Cu/ZnO potential
10.1 Symmetry function parameters for the methanol NN potential (A)
10.2 Symmetry function parameters for the methanol NN potential (B)
10.3 RMSEs of energies and forces for the methanol potential
A.1 Radial symmetry functions for Cu
A.2 Angular symmetry functions for Cu
A.3 Radial symmetry functions for O in ZnO
A.4 Radial symmetry functions for Zn in ZnO
A.5 Angular symmetry functions for O in ZnO
A.6 Angular symmetry functions for Zn in ZnO
A.7 Radial symmetry functions used in the Cu/ZnO potential
A.8 Angular symmetry functions used in the Cu/ZnO potential
Part I
Introduction
1 Introduction
The combustion of fossil fuels leads to climate change and to environmental pollution
due to the emission of greenhouse gases, such as carbon dioxide (CO2). Nuclear
electric power, on the other hand, not only bears high and potentially uncontrollable
safety risks, but also leads to the unsolved problem of nuclear waste disposal. However,
most sustainable energy sources, such as solar radiation, wind or tidal wave energy,
do not provide a continuous energy output. Both solar radiation and wind depend on
the weather, the hour of the day, and the season. It is therefore necessary to store the
energy when it is abundant for consumption at a later point in time.
Synthetic fuels, such as alcohols, are a particularly appealing option for energy storage,
as they are highly transportable, have a high energy density, and can be used with
available technology (e.g., combustion engines and fuel cells). Olah has pictured a
prospective methanol-based economy, in which methanol, an important precursor in the
chemical industry, is used as the main energy storage medium [1, 2]. Methanol synthesis
involves the conversion of carbon monoxide (CO) and CO2, so that the net emission of
greenhouse gases is close to zero, even though the combustion of methanol releases
CO2. Today’s standard synthesis route to methanol is based on heterogeneous
catalysis over an oxide-supported copper cluster catalyst (Cu/ZnO/Al2O3) [3].
1.1 Heterogeneous catalysis
Heterogeneous catalytic chemical reactions, involving a solid catalyst and gaseous or
liquid reactants, are at the core of many energy- and environment-related challenges.
The conversion of toxic exhaust gases to less harmful substances is achieved with
catalytic converters. The high overpotential of the electrochemical oxygen reduction
reaction (ORR), the fundamental reaction in fuel cells, is lowered by a heterogeneous
electrocatalyst (traditionally Pt/C) [4, 5]. In fact, the industrial production of the
majority of chemical compounds is based on heterogeneous catalysis. The most
prominent example is the ammonia synthesis following the Haber–Bosch process,
which accounts for 1.4 % of the world’s consumption of fossil fuels [6] and has been
recognized in three Nobel Prizes (Fritz Haber 1918, Carl Bosch 1931, and Gerhard
Ertl 2007).
The work that is the subject of this thesis was done as part of the collaborative research
center SFB 558 at the University of Bochum, which has focused on the experimental
and theoretical investigation of the methanol synthesis for the past 12 years (2000–
2012) [7]. Many experimental studies within the SFB 558 and in other research
groups have improved the understanding of catalytic methanol synthesis, and
of heterogeneous catalysis over oxide-supported metal clusters in general [3, 8–10].
In particular, the interaction between the copper clusters and the support has been found
to result in complex phenomena, such as the formation of copper–zinc alloys under
reducing conditions [11, 12], the migration of copper atoms [13] and small copper
clusters [14] into the zinc oxide surface, the thermal oxidation of copper [15], and
the sensitive dependence of the shape of the adsorbed copper clusters on the gaseous
environment [16].
All these effects of the strong metal–support interaction (SMSI) have a significant
influence on the activity of the catalyst. The inspiration for this thesis came from the
surprising dependence of the shapes of copper clusters at zinc oxide surfaces on the
gas phase of the environment, as discovered by Hansen and co-workers [16]. It has
been the motivation of the studies presented here to obtain an understanding of the
dynamical and structural properties of this system. As a first step, this thesis is focused
on copper clusters at zinc oxide surfaces in vacuum. The gaseous environment was not
included.
A recent study of copper clusters on non-polar zinc oxide surfaces by Köhler et al. [17]
using scanning tunneling microscopy (STM) has shown evidence for the penetration of
the copper clusters into the support surface. In their experiment the zinc oxide surfaces
were heated to different temperatures (290 K and 650 K) before copper was deposited
by means of molecular beam epitaxy. Subsequently, footprints of the copper clusters
were revealed when individual clusters were removed from the support surface using
the STM tip. The ambitious goal of this thesis was the realistic theoretical modeling of
the Cu/ZnO interactions that lead to the observed effects.
Since the catalytic reaction itself is governed by processes at the atomic and electronic
length scales, several theoretical studies based on electronic structure calculations
have provided additional insight into the reaction mechanism of the pure zinc oxide
catalyst [18, 19] and the thermodynamic properties of copper clusters and surfaces [20–
22] and zinc oxide surfaces [23–25]. Because of the inherent complexity of the system,
only a few theoretical studies have addressed model calculations of the entire copper/zinc
oxide catalyst [3, 26, 27].
Electronic structure calculations are, however, limited to small, idealized model structures containing a maximum of a few hundred atoms. The simulation of a realistic
active catalyst including the effects of the SMSI would make it necessary to model not only
nanometer-scale copper clusters adsorbed on non-ideal zinc oxide surfaces, but
also the surrounding gaseous atmosphere. In general, the active site of solid catalysts is
often related to imperfections, such as step edges, defects, or adatoms, which break
the periodicity of the ideal surface and make it necessary to simulate large supercells [3]. A computational model of such a realistic solid catalyst has to contain several
thousand or even tens of thousands of atoms, which is well beyond the feasibility
of density-functional theory. It is therefore necessary to turn to a more approximate
theoretical method that should, however, still be sufficiently accurate to describe the
complex geometries of a non-ideal catalyst surface.
1.2 Accurate and efficient atomistic potentials for materials
For the reliable simulation of non-ideal catalyst surfaces a method is needed that is at
the same time sufficiently accurate to allow for quantitative predictions, not restricted
to a particular class of materials, and efficient enough to allow the simulation of long time
scales.
Atomic interaction potentials allow the simulation of large structures containing tens
of thousands of atoms. Usually, these potentials have been developed for a certain
application, such as the simulation of solid insulators and molecules [28–31],
metals [32, 33, 40], and large biological entities (e.g., proteins, DNA fragments) [28,
34–36]. However, despite the vast number of different atomic interaction potentials,
none of the conventional potentials is suitable for the accurate simulation of non-ideal oxide-supported metal clusters. To give just a few specific examples: molecular
force fields (FF) [37–39] are specialized for the description of the chemical bonds
in large organic or biological molecules, and are in general not suited to describe
solids or surfaces. This class of potentials also relies on the definition of atomic
connectivities and therefore does not allow the formation or the cleavage of bonds
during the simulation. The functional form of embedded atom models (EAM) [40],
on the other hand, has been derived from the electronic energy expression in metals,
and cannot in general be applied to covalent insulators or molecular materials, even
though extended EAM potentials have been suggested [41, 42]. The most general class
of atomistic potentials are bond order potentials (BOP), such as the Tersoff potential
for solids [31], or potentials based on the reactive force field approach by van Duin and
co-workers [43]. BOPs are applicable to a wide range of structures and have previously
been used to simulate zinc oxide [44]. However, the functional form of BOPs, which
is based on physical approximations, is not flexible enough to reach the accuracy that
is necessary for reliable predictions of structural and dynamical properties.
Recently, two new potential types have been suggested that could be a remedy to
the restrictions discussed above: Gaussian Approximation Potentials (GAP) [45, 46]
and potentials based on artificial Neural Networks (NN) [47–49]. Both approaches
are purely mathematically motivated, and allow the interpolation of the potential
energy surface using a flexible, non-linear functional form based on a set of reference
structures. In the case of GAP, all reference structures enter the energy expression, and the
efficiency of the potential therefore depends on the number of reference structures. The
NN potentials are, on the other hand, fitted once to reproduce the reference energies and
forces and can be subsequently used in large-scale simulations without any information
about the initial set of reference structures.
In recent years NN potentials have emerged that promise to overcome the shortcomings of the conventional potentials [50–52]. Behler and Parrinello have shown that
symmetry-adapted artificial neural networks can be trained to accurately represent
the high-dimensional potential energy surface (PES) of atomic structures [51, 53, 54].
Within the scope of this thesis the applicability of NN potentials for the simulation
of zinc oxide supported copper clusters is explored. Previously, the method had been
used for the simulation of silicon crystal phases, i.e., for a covalent insulator of a single
atomic species [55, 56]. As a first step it is thus mandatory to assess the capability
of the NN potential method for the description of metals, in particular for copper. To
simulate the Cu/ZnO system it was furthermore necessary to extend the methodology
to multicomponent systems of more than a single chemical element. Finally, for
the simulation of the actual methanol synthesis the NN potentials need to provide an
accurate description of molecules.
1.3 Structure of the thesis
In the first part of the thesis a brief review of the employed theoretical and computational methods is provided. The NN potential methodology and its extension to
multicomponent structures are discussed in detail.
The second part of the thesis focuses on the steps outlined in the previous section: in
Chapter 7 the first application of the NN potential method for a metal is presented
for the example of copper. The first application of a multicomponent NN potential
for the construction of a zinc oxide potential is discussed in Chapter 8. The results of
these two chapters are combined in a ternary Cu/ZnO potential in Chapter 9. Finally,
the applicability of the NN method for molecular structures is evaluated for a single
methanol molecule in Chapter 10.
The final part of the thesis provides a summary of the results and an outlook to the
prospective simulations using the constructed potentials.
Part II
Theoretical Background
2 Electronic Structure Calculations
The investigation of the structural, energetic, and dynamical properties of complex
condensed matter and materials requires a reliable description of the atomic interactions.
In this project density-functional theory (DFT) has been used, which provides an
accurate description of many complex systems, in particular for solids and surfaces
like copper/zinc oxide. The FHI-aims code (Fritz Haber Institute ab initio molecular
simulations) [57], an all-electron code that employs numerical atomic orbitals as basis
functions, has been used for all production calculations to set up training sets for
neural network potentials. Additionally, some tests have also been carried out with
the PWSCF code [58], a pseudopotential program that employs plane waves as basis
functions. This code was used to verify the accuracy of the FHI-aims electronic
structure calculations.
2.1 The Born–Oppenheimer approximation
From quantum mechanics we know that the static eigenstate of an atomic system can
be described by a wave function $\Psi$, which depends on all electronic coordinates $\{\vec{r}\}$
and all ionic coordinates $\{\vec{R}\}$ (time-dependent properties are not discussed in this work).
The Schrödinger equation [59]
$$\hat{H}[\{\vec{r}\}, \{\vec{R}\}]\, \Psi[\{\vec{r}\}, \{\vec{R}\}] = E\, \Psi[\{\vec{r}\}, \{\vec{R}\}] \tag{2.1}$$
relates the eigenstate $\Psi$ to an energy $E$. The Hamilton operator $\hat{H}$, which also depends
on all electronic and ionic coordinates, is given by the sum of the kinetic and potential
energy operators $\hat{T}$ and $V$
$$\hat{H} = \hat{T} + V = \hat{T}_e + \hat{T}_n + V_{ee} + V_{en} + V_{nn} , \tag{2.2}$$
where the indices “e” and “n” refer to the electrons and the nuclei.
For an atomic system of $N_e$ electrons and $N_n$ nuclei the solution of Schrödinger’s
equation (2.1) thus involves $3N_e + 3N_n$ independent variables. In 1927 Born and
Oppenheimer suggested separating the degrees of freedom of the electrons and nuclei
[60]
$$\Psi[\{\vec{r}\}, \{\vec{R}\}] \approx \Psi_e[\{\vec{r}\}] \cdot \Psi_n[\{\vec{R}\}] , \tag{2.3}$$
which is justified by the very different time scales of the
electronic and ionic dynamics. The “light and fast” electrons are expected to adjust
adiabatically to the “slow” changes of the ionic positions. In that case the purely ionic
contributions to the Hamiltonian (2.2), namely the kinetic energy of the nuclei and the
ion–ion repulsion, can also be treated separately and the electronic structure problem
can be reformulated in an electronic Schrödinger equation
$$\hat{H}_e \Psi_e = E_e \Psi_e \quad \text{with} \quad \hat{H}_e = \hat{T}_e + V_{ee} + V_{en} . \tag{2.4}$$
Note that, by convention, the ionic repulsion $V_{nn}$ is kept in the electronic Hamiltonian, but is usually treated classically.
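As a brief aside, the link between Eq. (2.4) and the potential energy surface (PES) discussed in the introduction can be written out explicitly. The following is a sketch under the assumption that the electronic Hamiltonian is taken exactly as written in Eq. (2.4), i.e., without $V_{nn}$. The symbol $E_{\mathrm{PES}}$ is introduced here for illustration only and does not appear in the original text. The electronic energy depends parametrically on the ionic coordinates, and adding the classical ion–ion repulsion gives the surface on which the nuclei move:
$$E_{\mathrm{PES}}(\{\vec{R}\}) = E_e(\{\vec{R}\}) + V_{nn}(\{\vec{R}\})$$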
In general the Born–Oppenheimer approximation (2.3) is a very good approximation.
There are, however, situations in which the separation of the electronic and ionic degrees of
freedom is not justified. An example of such non-adiabatic problems is the avoided
crossing of two quantum states close in energy. In this work, however, only atomic
structures within the Born–Oppenheimer approximation were considered. We will
therefore usually drop the index “e” in the following sections, and we implicitly refer
to the electronic Hamiltonian and wave function by $\hat{H}$ and $\Psi$.
2.2 The electronic structure problem
In section 2.1 the electronic Schrödinger equation was introduced. Our objective shall
be to determine the electronic ground-state energy for a system with $N$ electrons and
$N_n$ nuclei that is described by a given Hamiltonian of the form of equation (2.2). The
different contributions in the position representation of the Hamilton operator (all
expressions are given in Hartree atomic units) are the kinetic energy of the electrons
$$\hat{T} = -\frac{1}{2} \sum_{i}^{N} \vec{\nabla}_i^2 , \tag{2.5}$$
the electron–electron repulsion
$$V_{ee} = \sum_{j<i} \frac{1}{|\vec{r}_j - \vec{r}_i|} , \tag{2.6}$$
the electron–ion attraction
$$V_{en} = -\sum_{i}^{N} \sum_{\alpha}^{N_n} \frac{z_\alpha}{|\vec{r}_\alpha - \vec{r}_i|} , \tag{2.7}$$
where $z_\alpha$ is the ionic charge at nucleus $\alpha$, and the classical ion–ion repulsion
$$V_{nn} = \sum_{\beta<\alpha}^{N_n} \frac{z_\alpha z_\beta}{|\vec{r}_\beta - \vec{r}_\alpha|} . \tag{2.8}$$
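The classical ion–ion repulsion (2.8) is the only one of these terms that can be evaluated directly from a set of nuclear positions, so it makes a convenient illustration of the pair sums above. The following is a minimal sketch, not code from this work: the function name `ion_ion_repulsion` and the two-proton example geometry are assumptions chosen for illustration, and all quantities are in Hartree atomic units.

```python
import numpy as np

def ion_ion_repulsion(positions, charges):
    """Classical V_nn = sum_{beta<alpha} z_alpha z_beta / |r_beta - r_alpha| in Hartree."""
    positions = np.asarray(positions, dtype=float)
    charges = np.asarray(charges, dtype=float)
    energy = 0.0
    for a in range(len(charges)):
        for b in range(a):  # beta < alpha, so every pair is counted exactly once
            distance = np.linalg.norm(positions[b] - positions[a])
            energy += charges[a] * charges[b] / distance
    return energy

# Hypothetical example: two protons separated by 1.4 bohr
h2_positions = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]]
h2_charges = [1.0, 1.0]
v_nn_h2 = ion_ion_repulsion(h2_positions, h2_charges)
print(f"V_nn = {v_nn_h2:.6f} Hartree")
```

The electron–electron term (2.6) has the same pair structure with unit charges, but there the coordinates are quantum-mechanical variables rather than fixed parameters.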
Note, that especially in the literature related to density-functional theory (see section
2.3) it is common to combine the electron–ion interactions with further external field
contributions (for example from electric fields) in a more general energy due to an
external potential Vext = Ve,n + . . ., which governs the electronic degrees of freedom.
The external potential can then be written as a sum of the interactions of the individual
electrons with an external potential that depends on the coordinates of all nuclei {~rα }
\[ V_{ext} = \sum_{i} v_{ext}(\{\vec{r}_\alpha\}, \vec{r}_i) . \tag{2.9} \]
The energy of any normalized quantum state Ψ is then given by the expectation value
of the Hamiltonian
\[ E[\Psi] = \langle \hat{H} \rangle = \langle \Psi | \hat{H} | \Psi \rangle = \int \Psi^{*} \hat{H} \Psi \, d\tau , \tag{2.10} \]
where the integration is over all electronic coordinates. The ground state wave function
minimizes the energy functional of Eq. (2.10) (variational principle)
\[ E_0 = \min_{\Psi} E[\Psi] . \tag{2.11} \]
Note that an ansatz for the wave function Ψ is necessary to actually evaluate the
expectation value (2.10) and make use of the variational principle (2.11).
2.2.1 The electronic wave function
The many-electron wave function Ψ is a function of all electronic coordinates. As
in the Born–Oppenheimer approximation, Eq. (2.3), the problem of finding an ansatz
for the wave function simplifies if the total function is decomposed into a product ansatz.
The simplest such ansatz for an N-electron wave function Ψ is the Hartree product
\[ \Psi(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) \approx \Psi_{Hartree}(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) = \psi_1(\vec{r}_1) \cdot \psi_2(\vec{r}_2) \cdot \ldots \cdot \psi_N(\vec{r}_N) \tag{2.12} \]
of single-electron functions ψ. The Hartree product is, however, not a suitable representation of an electronic state. Electrons are fermions, and any ansatz for an electronic
wave function therefore has to obey the antisymmetry (Pauli) principle with respect to
the exchange of two particles
\[ \Psi(\ldots, i, \ldots, j, \ldots) = -\Psi(\ldots, j, \ldots, i, \ldots) . \tag{2.13} \]
The antisymmetrization of the Hartree product (2.12) leads to the Slater determinant
\[ \Psi(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) \approx \Psi_{SD}(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) = \frac{1}{\sqrt{N!}} \begin{vmatrix} \psi_1(\vec{r}_1) & \cdots & \psi_1(\vec{r}_N) \\ \vdots & \ddots & \vdots \\ \psi_N(\vec{r}_1) & \cdots & \psi_N(\vec{r}_N) \end{vmatrix} , \tag{2.14} \]
which satisfies the Pauli principle (2.13). The prefactor of $1/\sqrt{N!}$ provides for normalization,
given the one-electron functions themselves are normalized and orthogonal, i.e.,
$\langle \psi_i | \psi_j \rangle = \delta_{ij}$, so that the probabilistic interpretation of the squared norm of the wave
function is possible.
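The difference between the Hartree product (2.12) and the Slater determinant (2.14) can be illustrated for N = 2 in one dimension. The following is a minimal sketch; the two one-electron functions `psi_1` and `psi_2` are hypothetical (unnormalized) choices made only for this illustration.

```python
import math

def psi_1(x):
    """Hypothetical one-electron function (even Gaussian), illustration only."""
    return math.exp(-0.5 * x * x)

def psi_2(x):
    """Hypothetical one-electron function (odd, x times Gaussian), illustration only."""
    return x * math.exp(-0.5 * x * x)

def hartree_product(x1, x2):
    """Eq. (2.12) for two electrons."""
    return psi_1(x1) * psi_2(x2)

def slater_determinant(x1, x2):
    """Eq. (2.14) for two electrons: 2x2 determinant with prefactor 1/sqrt(2!)."""
    det = psi_1(x1) * psi_2(x2) - psi_1(x2) * psi_2(x1)
    return det / math.sqrt(math.factorial(2))

x1, x2 = 0.3, -1.1
# The determinant changes sign when the two particles are exchanged ...
print(slater_determinant(x1, x2), slater_determinant(x2, x1))
# ... and vanishes if both electrons are at the same position,
print(slater_determinant(0.7, 0.7))
# whereas the Hartree product has no such symmetry.
print(hartree_product(x1, x2), hartree_product(x2, x1))
```

Since the example orbitals are not normalized, only the symmetry properties, not the absolute values, are meaningful here.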
The use of a single Slater determinant as ansatz for the all-electron wave function to
exploit the variational principle is the foundation of the Hartree–Fock method.
The approximation of the many-electron wave function by a Slater determinant is,
however, not always sufficiently accurate. A hierarchy of theoretical chemistry methods
exists that improves on this approximation, either by means of perturbation theory
(MP2, MP4), by including further electronic configurations that correspond to excited
states (CI, MCSCF, CASSCF, CC), or by combining the two approaches (CASPT2).
2.3 Density-functional theory
In the previous section the Hartree–Fock method was mentioned, which is based on a
Slater-determinant ansatz for the all-electron wave function. The results of Hartree–
Fock calculations are often not satisfactory and one way to cure the shortcomings is
to choose a more complex ansatz for the wave function. However, density-functional
theory (DFT) [76] takes a conceptually different route, where the description of the
correlated all-electron wave function is completely avoided [61].
The fundamental quantity of DFT is the electron density n(~r), which is related to the
N-electron wave function Ψ by
\[ n(\vec{r}) = N \int \cdots \int \Psi^{*}(\vec{r}, \vec{r}_2, \ldots, \vec{r}_N) \, \Psi(\vec{r}, \vec{r}_2, \ldots, \vec{r}_N) \, d\vec{r}_2 \ldots d\vec{r}_N , \tag{2.15} \]
so that n(~r) is the spatial distribution function of all electrons that are described by Ψ
and the integration over the whole space consequently yields the number of electrons
\[ \int n(\vec{r}) \, d\vec{r} = N . \tag{2.16} \]
The goal of DFT is to express the total energy functional (2.11) directly in terms of the
electron density (2.15)
\[ E = E[n(\vec{r})] \tag{2.17} \]
instead of the wave function Ψ. This not only has the advantage of avoiding
any ansatz for the all-electron wave function; it would also mean that the number of
degrees of freedom of the electronic structure problem for N electrons is reduced from
3N to just 3. That such a (rather counter-intuitive) density functional exists has been
shown by Hohenberg and Kohn, who additionally proved that the variational
principle (2.11) also applies to DFT. The Hohenberg–Kohn theorems and their proofs
are discussed in the following section 2.3.1.
2.3.1 The Hohenberg–Kohn theorems
In section 2.2 it was explained that the electronic (Born–Oppenheimer) Hamilton
operator is entirely determined by the knowledge of an external potential Vext (which in
the simplest case is just determined by the ionic positions) and the number of electrons
N. The Hamilton operator in turn determines the all-electron states via the Schrödinger
equation (2.1) and in particular the ground-state wave function Ψ0 , from which the
ground-state electron density n0 can be derived:
\[ \{V_{ext}(\vec{r}), N\} \rightarrow \hat{H}(\vec{r}_1, \ldots, \vec{r}_{N_e}) \rightarrow \Psi_0(\vec{r}_1, \ldots, \vec{r}_{N_e}) \rightarrow n_0(\vec{r}) . \tag{2.18} \]
In their first theorem Hohenberg and Kohn [62] showed that the above relation can
be reversed, i.e., a given ground-state electron density n0 (~r) can only be the result of
exactly one particular external potential Vext (~r) and one particular number of electrons
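N, which in turn determine the ground-state wave function
\[ n_0(\vec{r}) \rightarrow \{V_{ext}(\vec{r}), N\} \rightarrow \hat{H}_e(\vec{r}_1, \ldots, \vec{r}_{N_e}) \rightarrow \Psi_0(\vec{r}_1, \ldots, \vec{r}_{N_e}) . \tag{2.19} \]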
If the ground-state wave function is uniquely determined by the ground-state electron
density, the wave function itself is a functional of the density and there must also be a
density functional for the expectation value of any operator $\hat{O}$
\[ O[n_0] = \langle \Psi_0[n_0(\vec{r})] | \hat{O} | \Psi_0[n_0(\vec{r})] \rangle . \tag{2.20} \]
The second Hohenberg–Kohn theorem derives the applicability of the variational
principle to the ground-state electron density. The one-to-one mapping of the electron
density and the wave function, which is the result of the first theorem, immediately
transfers the minimum-energy principle to the density:
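\[ \langle \Psi_0 | \hat{H} | \Psi_0 \rangle \le \langle \tilde{\Psi} | \hat{H} | \tilde{\Psi} \rangle \;\Leftrightarrow\; \langle \Psi[n_0] | \hat{H} | \Psi[n_0] \rangle \le \langle \Psi[\tilde{n}] | \hat{H} | \Psi[\tilde{n}] \rangle \;\Leftrightarrow\; E[n_0] \le E[\tilde{n}] . \tag{2.21} \]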
For the variational minimization of the energy, it must be additionally guaranteed
that the ground-state is a stationary point of the energy functional with respect to the
variation of the density. With the constraint that the integration of the density over the
whole space must yield the number of electrons, the problem can be formulated as a
Lagrangian function
\[ \frac{\delta}{\delta n} \left\{ E[n] + \mu \left( \int n(\vec{r}) \, d\vec{r} - N_e \right) \right\} = 0 , \tag{2.22} \]
where µ is a Lagrange multiplier.
2.3.2 Kohn–Sham density-functional theory
The Hohenberg–Kohn theorems demonstrate that an energy functional E[n] of the
electron density exists and that its minimum is the ground state energy. However, it is
not obvious how such a functional can be obtained. Of the individual contributions to
the total electronic energy (see Sec. 2.2)
\[ E[n] = T[n] + V_{ee}[n] + V_{ext} \tag{2.23} \]
neither the kinetic energy functional T [n] nor the functional of the electronic interaction
potential Vee [n] is known.
Kohn and Sham [63] identified the known contributions to the unknown quantities in
the energy functional and substituted the classical electrostatic Hartree energy
\[ V_H[n] = \frac{1}{2} \int n(\vec{r}) \, v_H(\vec{r}) \, d\vec{r} \quad \text{with} \quad v_H(\vec{r}) = \int \frac{n(\vec{r}')}{|\vec{r} - \vec{r}'|} \, d\vec{r}' \tag{2.24} \]
for the electron–electron interaction Vee [n]. The kinetic energy of the correlated
electrons T [n] is replaced by the kinetic energy Ts [n] of a fictitious auxiliary system
of non-interacting electrons. The presumably small missing energy contributions due
to these approximations are captured by an additional term, the exchange–correlation
energy Exc [n]. The Kohn–Sham total energy functional thus reads
\[ E^{KS}[n] = T_s[n] + V_H[n] + E_{xc}[n] + V_{ext} , \tag{2.25} \]
which is formally equivalent to the energy functional in Eq. (2.23), if the functional
Exc is known. To obtain the kinetic energy of the auxiliary Kohn–Sham system, it is
still necessary to solve a Schrödinger equation. However, in the case of non-interacting
electrons the many-electron wave function Ψ is exactly given by a Slater determinant,
Eq. (2.14), of one-electron wave functions {ψi } (Kohn–Sham orbitals). Thus, as in the
Hartree–Fock method, a set of one-electron eigenvalue problems has to be solved
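\[ \hat{h} \psi_i = \varepsilon_i \psi_i \quad \text{with} \quad \hat{h} = -\frac{1}{2} \vec{\nabla}^2 + v_{eff}(\vec{r}) , \tag{2.26} \]
where $\hat{h}$ is the one-electron Hamiltonian, and veff (~r) is an effective potential, which
will be discussed below. The kinetic energy of the auxiliary system is then given as the sum of
the one-electron kinetic energies
\[ T_s = \sum_{i} -\frac{1}{2} \langle \psi_i | \vec{\nabla}^2 | \psi_i \rangle , \tag{2.27} \]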
and the density of the non-interacting electrons is
\[ n_s(\vec{r}) = \sum_{i} |\psi_i(\vec{r})|^2 . \tag{2.28} \]
As an additional condition, Kohn and Sham require the ground state electron density
n0 of the auxiliary system to be equal to the one of the real system. Thus, at the ground
state n = ns = n0 , the variation n → n + δ n, Eq. (2.22), of the energy of the fictitious
and of the real system must both become zero and can be set equal. This leads to an
expression for the effective potential
\[ v_{eff}[n](\vec{r}) = v_{ext}(\vec{r}) + v_H[n] + v_{xc}[n] , \tag{2.29} \]
where the exchange–correlation potential vxc is defined as
\[ v_{xc} = \left. \frac{\delta E_{xc}[n]}{\delta n} \right|_{n=n_0} . \tag{2.30} \]
The Kohn–Sham equations, Eq. (2.26), have to be solved self-consistently, since the
effective potential depends on the density, which in turn depends on the Kohn–Sham
orbitals that belong to one specific effective potential. A good initial guess for the
ground state density is the superposition of atomic densities.
Note that the exact exchange–correlation functional Exc is not known. However, good
approximations of Exc derived from the local density of the homogeneous electron gas
(local-density approximation, LDA) and density gradient corrected approximations
(generalized gradient approximation, GGA) are available. More advanced approximations include a dependence on the second derivative of the density or even depend on
the Kohn–Sham orbitals.
3 The FHI-aims Code
FHI-aims (Fritz Haber Institute ab initio molecular simulations) [57] is an efficient
computer program package for the calculation of physical and chemical properties
of condensed matter and materials (such as molecules, clusters, solids, surfaces, and
liquids) based on first-principles descriptions of the electronic structure (e.g., using
DFT). The program implements all-electron methods that use numerical atom-centered
orbitals as the basis functions for the Kohn–Sham orbitals. This enables accurate all-electron and full-potential calculations at a computational cost that is competitive
with, for example, plane wave methods. Further, it allows calculations to be carried out
with and without periodic boundary conditions.
3.1 Basis set expansion
The general solution of the self-consistent Kohn–Sham equations (2.26) for arbitrary
molecular Kohn–Sham orbitals is infeasible for structures containing many atoms. In
practice, the orbital space is restricted by expanding the wave functions in a set of
suitable basis functions {φµ }
\[ |\psi_i\rangle = \sum_{\mu} c_{i\mu} |\phi_\mu\rangle . \tag{3.1} \]
This technique was independently suggested by Roothaan and Hall [64, 65] for the
iterative solution of the Hartree–Fock equations. The expansion of Eq. (3.1) transforms the abstract operator eigenvalue problem (2.26) to a generalized matrix–vector
eigenvalue problem
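\[ \hat{h} |\psi_i\rangle = \varepsilon_i |\psi_i\rangle \;\longrightarrow\; \sum_{\nu} H_{\mu\nu} c_{i\nu} = \varepsilon_i \sum_{\nu} S_{\mu\nu} c_{i\nu} \;\leftrightarrow\; \mathbf{H} \vec{c}_i = \varepsilon_i \mathbf{S} \vec{c}_i , \tag{3.2} \]
where H is the matrix representation of the one-electron Hamiltonian in the chosen
basis set with matrix elements $H_{\mu\nu} = \langle \phi_\mu | \hat{h} | \phi_\nu \rangle$. The eigenvectors {~ci } correspond
to the Kohn–Sham orbitals {ψi }. The elements of the overlap matrix $S_{\mu\nu} = \langle \phi_\mu | \phi_\nu \rangle$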
depend only on the choice of the basis functions and are structurally independent. If the
basis functions are chosen to be pairwise orthogonal, the overlap matrix becomes the
identity matrix, and Eq. (3.2) simplifies to a regular matrix–vector eigenvalue problem.
Note that Hermitian eigenvalue problems of the form (3.2) can be solved numerically
with high efficiency [66].
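As an illustration of the generalized eigenvalue problem (3.2), the following minimal sketch solves H c = ε S c for a basis of only two non-orthogonal functions by finding the roots of det(H − ε S) = 0. The matrix elements `alpha`, `beta` and the overlap `s` are hypothetical numbers chosen for this example; production codes rely on optimized linear algebra libraries instead.

```python
import math

def generalized_eigenvalues_2x2(H, S):
    """Eigenvalues eps of H c = eps S c for real symmetric 2x2 matrices H and S."""
    # det(H - eps*S) = 0 is a quadratic equation a*eps^2 + b*eps + c = 0
    a = S[0][0] * S[1][1] - S[0][1] ** 2
    b = -(H[0][0] * S[1][1] + H[1][1] * S[0][0] - 2.0 * H[0][1] * S[0][1])
    c = H[0][0] * H[1][1] - H[0][1] ** 2
    root = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])

# Hypothetical two-function basis (values for illustration only)
alpha, beta, s = -0.5, -0.4, 0.5
H = [[alpha, beta], [beta, alpha]]   # Hamilton matrix H_mu_nu
S = [[1.0, s], [s, 1.0]]             # overlap matrix S_mu_nu

eps_low, eps_high = generalized_eigenvalues_2x2(H, S)
# For this symmetric case the roots are (alpha+beta)/(1+s) and (alpha-beta)/(1-s)
print(eps_low, eps_high)
```

For s = 0 the overlap matrix becomes the identity matrix and the same routine solves the regular eigenvalue problem.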
Most electronic structure methods at their core rely on a basis set expansion of some
kind. Differences stem from the choice of the functional form of the basis {φµ },
for which many different approaches are employed: common examples are Slater
type orbitals (STOs), Gaussian type orbitals (GTOs), plane waves, wavelets, and grid
distributed or atom centered numerical basis functions. All electronic structure calculations presented in this work have been performed using the FHI-aims program [57],
which employs a basis set of numerical atom-centered orbitals.
3.1.1 Atomic orbitals
The analytic solutions $\psi^{H}_{nlm}(\vec{r})$ of the Schrödinger equation (2.1) for the hydrogen atom
are, in spherical coordinates, given by the product of a radial function $R^{H}_{nl}(r)$ and a
spherical harmonic function Ylm (ϑ , ϕ)
\[ \psi^{H}_{nlm}(\vec{r}) = \psi^{H}_{nlm}(r, \vartheta, \varphi) = R^{H}_{nl}(r) \cdot Y_{lm}(\vartheta, \varphi) , \tag{3.3} \]
and are often called atomic orbitals (in contrast to Hartree–Fock or Kohn–Sham
orbitals, which are called molecular orbitals).
The concept of atomic orbitals can be extended to arbitrary atom types, where the
all-electron wave function is approximated by a combination of one-electron atomic
orbitals, the simplest of which is a Slater determinant. The radial functions Rαnl (r)
for the atom type of atom α are the solutions of the radial Schrödinger equation (in
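Hartree atomic units)
\[ \left[ -\frac{1}{2} \left( \frac{1}{r^2} \frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} - \frac{l(l+1)}{r^2} \right) + v_{eff}(r) \right] R^{\alpha}_{nl}(r) = \varepsilon^{\alpha}_{n} R^{\alpha}_{nl}(r) , \tag{3.4} \]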
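which can equivalently be expressed as a one-dimensional Schrödinger equation with
transformed eigenfunctions
\[ \left( -\frac{1}{2} \frac{\partial^2}{\partial r^2} + \frac{l(l+1)}{2 r^2} + v^{\alpha}_{eff}(r) \right) r R^{\alpha}_{nl}(r) = \varepsilon^{\alpha}_{n} \, r R^{\alpha}_{nl}(r) . \tag{3.5} \]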
Note that the spherical effective atomic potential vαeff (r) depends on the atom type, i.e.,
on the core charge and the number of electrons, but the angular part Ylm (ϑ , ϕ) of the
atomic orbitals is independent of the atomic species.
3.1.2 Linear combination of atomic orbitals (LCAO)
Atoms are the building blocks of larger structures, such as molecules and crystals. An
intuitive basis set {φµ } for the wave functions of many-atom structures is therefore
the set of atomic orbitals $\{\psi^{\alpha}_{nlm}\}$ of the atoms in the system. In such a case the basis set
expansion of the molecular orbitals, Eq. (3.1), corresponds to a superposition (linear
combination) of atomic orbitals [67, 68]
\[ \psi(\vec{r}) = \sum_{\mu} c_{\mu} \, \psi^{\alpha}_{nlm}(\vec{r} - \vec{r}_\alpha) \quad \text{with} \quad \mu = (\alpha, n, l, m) , \tag{3.6} \]
where α enumerates the atoms. This ansatz is the starting point of optimized atomic-local basis sets, which mainly differ in the representation of the structurally dependent
radial functions Rαnl (r) of the atomic orbitals. Numerical basis functions directly employ the numerical solutions of the radial Schrödinger equation, Eq. (3.5), as radial
functions. Motivated by the analytic solution for the hydrogen atom, linear combinations of Slater functions, i.e., exponential functions multiplied by polynomials,
have also been used. A common choice in quantum chemistry is linear combinations of
Gaussian functions, which make it possible to solve the one- and two-electron integrals
of the HF method analytically.
All DFT calculations performed within the scope of this thesis employed numerical
atomic-local basis sets as implemented in the FHI-aims program [69].
3.1.3 Numerical basis functions
In general, the direct solutions of the radial Schrödinger equation of free atoms are
not good basis functions for the representation of molecular or crystal wave functions.
The atomic wave functions have, in principle, an infinite range, whereas the orbitals in
compounds experience an effective screening due to the other atoms in the structure.
Numerical basis functions are therefore often constructed from a confined radial
Schrödinger equation, where a confining potential vcut (r) is added to the effective
potential vαeff (r) of Eq. (3.5) [69, 70]. The confining potential is constructed in such a
way that the solutions decay smoothly to zero at a given cutoff radius rcut . Blum and
coworkers, Ref. 69, employ a confining potential with a second order pole at the cutoff
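radius
\[ v_{cut}(r) = \begin{cases} 0 & \text{if } r \le r_{onset} \\ s \cdot \exp\left( \dfrac{w}{r - r_{onset}} \right) \cdot \dfrac{1}{(r - r_{cut})^2} & \text{if } r_{onset} < r < r_{cut} \\ \infty & \text{if } r \ge r_{cut} \end{cases} , \tag{3.7} \]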
which ensures that also the first derivative of the radial function vanishes at the cutoff
radius (s and w are adjustable parameters). Using the potential form of Eq. (3.7),
the solutions of the confined Schrödinger equation are equal to the solutions of the
unconfined one for radii r smaller than the onset radius ronset .
The range restriction of the atomic orbitals has the additional advantage that integrals
involving orbitals, for example the elements of the Hamilton matrix, also become
localized in space and can be more efficiently evaluated.
4 Molecular dynamics simulations
Molecular dynamics (MD) simulations are tools to compute structural and dynamical
properties of classical many-body systems, and allow experimental results to be analyzed
on an atomic level. Classical, here, means that the movement of the nuclei follows the
laws of classical mechanics [71]. Computational simulations are based on statistical
mechanics. In this framework, the physical behavior of macroscopic systems is related
to ensemble averages over micro-states M, which are characterized by the positions
and momenta of all particles in the system [72]. In an MD simulation the equations of
motion of a many-body system are solved numerically in consecutive time intervals
τ which generates a time sequence of micro-states. This time sequence represents a
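trajectory in phase space. According to the ergodic hypothesis the ensemble average
$\langle A \rangle$ can be replaced by a time average $\bar{A}$ if one allows the system to evolve infinitely in
time [73]:
\[ \langle A \rangle \equiv \lim_{M \to \infty} \frac{1}{M} \sum_{i}^{M} A_i = \lim_{\tau \to \infty} \frac{1}{\tau} \int_{t_0}^{t_0 + \tau} A(t) \, dt \equiv \bar{A} . \tag{4.1} \]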
Therefore, experimental observables can be approximated by time averages obtained
from MD simulations if the simulation time is sufficient.
In order to perform an MD simulation, one has to carry out two steps: first the forces
acting on each particle have to be calculated. In the second step Newton’s equations of
motion are integrated numerically based on the forces calculated in the first step:
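\[ \mathbf{F}_i = m_i \mathbf{a}_i \quad \Longleftrightarrow \quad m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\frac{\partial V(\{\mathbf{r}_i\})}{\partial \mathbf{r}_i} . \tag{4.2} \]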
A widely used numerical algorithm to integrate the equations of motion is the Velocity
Verlet algorithm [74].
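The two steps of an MD simulation, force evaluation and integration of Eq. (4.2), can be sketched for a single particle in one dimension. The example below is a minimal illustration of the velocity Verlet scheme; the harmonic model potential, the parameters `k`, `m`, and the time step are assumptions made for this sketch and are unrelated to the potentials used in this work.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = F(x) with the velocity Verlet scheme."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update (old force)
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # step 1: forces at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory

# Hypothetical model system: harmonic oscillator with V(x) = k x^2 / 2
k, m = 1.0, 1.0

def harmonic_force(x):
    return -k * x

def total_energy(x, v):
    return 0.5 * m * v * v + 0.5 * k * x * x

trajectory = velocity_verlet(1.0, 0.0, harmonic_force, m, dt=0.01, n_steps=1000)
e_start = total_energy(*trajectory[0])
e_end = total_energy(*trajectory[-1])
print(e_start, e_end)   # the total energy should be conserved to good accuracy
```

Only one force evaluation per time step is required, which matters when the forces are the expensive part of the simulation.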
In ab initio molecular dynamics (AIMD) [75] the atomic forces are computed by
approximately solving the time-independent Schrödinger equation on-the-fly. These
methods, unfortunately, are computationally too expensive to allow for configurational sampling or MD simulations of the kind of structures this work is dealing with
(typically several thousand atoms) over long time scales.
More efficient but sufficiently accurate potentials are required. Within the scope of
this work, methods for the construction of efficient and accurate high-dimensional
potential energy surfaces (PES) have been employed, which are based on artificial
neural networks (NN) [47, 48]. These potentials allow systems to be studied at experimental
length and time scales beyond the capabilities of conventional DFT implementations.
5 Neural Network Potentials
5.1 Artificial neural networks
Artificial neural networks (NN) represent a general fitting scheme that in principle
allows any function to be approximated to arbitrary accuracy [49]. The NN algorithm is
inspired by the biological neural network of the brain. In contrast to other regression
methods the functional form of the underlying problem does not need to be known
when using NNs. They constitute a flexible class of fitting functions that are able to learn
unknown target functions to high accuracy using a training set of known function
values. The training set is presented to the NN in order to find the best values for
the rather large number of parameters using an optimization algorithm. Out of the
many types of neural networks, the class of multilayer feed-forward neural networks
has particularly proven to be a useful tool for the representation of potential-energy
surfaces (PES) [78].
5.1.1 Feed-forward neural networks
The feed-forward neural network shown in Fig. 5.1 consists of one input layer, one
hidden layer and one output layer. In each layer there are several nodes represented in
the figure by squares (input layer and output layer) and circles (hidden layer). For the
use as atomistic potential the input nodes define the atomic configuration (e.g., in form
of bond lengths and bond angles) and may be given by a set of atomic coordinates:
Gi = (Xi ,Yi , Zi ). The output layer consists of only a single node, whose value is the
predicted energy of the atomic configuration. The nodes in the hidden layer do not
have any physical meaning, but provide the functional flexibility of the NN. All nodes
in each layer are connected to the nodes in the adjacent layers by weight parameters,
$w^{k}_{ij}$, represented by the black arrows in Fig. 5.1. They are the fitting parameters of the
neural network.
Figure 5.1 A small example of a two-dimensional feed-forward neural network (NN) presenting
a functional relation between the energy E (output) and the coordinates G1 and G2 describing
the atomic configuration (input). The analytic expression for this atomic NN is given in
equation 5.1.
The output value of such a neural network is calculated in the following process: first,
the coordinates of an atomic configuration are provided in the input nodes of the NN,
where each input node refers to one particular degree of freedom. The coordinates are
then passed to the nodes in the first hidden layer by multiplying their numerical values
by the connection weight values. On each node in the hidden layer these products are
summed up and an activation function $f_a^i$ is applied to the sum
\[ E_{atom} = f_a^2 \left[ w^{2}_{01} + \sum_{j=1}^{3} w^{2}_{j1} \, f_a^1 \left( w^{1}_{0j} + \sum_{\mu=1}^{2} w^{1}_{\mu j} \, G_i^{\mu} \right) \right] . \tag{5.1} \]
In general, the activation function is a non-linear function that introduces the capability
to fit non-linear functions into the NN. Typical examples are the sigmoid function
\[ f(x) = \frac{1}{1 + e^{-x}} , \tag{5.2} \]
which has a form similar to the activation functions of biological neurons, the hyperbolic tangent
\[ f(x) = \tanh(x) \equiv \frac{e^{2x} - 1}{e^{2x} + 1} , \tag{5.3} \]
and the Gaussian functions
\[ f(x) = e^{-\alpha x^2} . \tag{5.4} \]
Sometimes periodic functions, such as the cosine function, can be useful for fitting
periodic potentials. To avoid any constraint on the range of output values, a linear
activation function f (x) = x is used in the output layer.
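The evaluation of Eq. (5.1) can be sketched in a few lines. The following is a minimal illustration for the 2-3-1 architecture of Fig. 5.1 with a hyperbolic tangent in the hidden layer and a linear output node; the argument names (`w1`, `b1`, `w2`, `b2`) and all weight values are hypothetical and only serve to demonstrate the order of operations.

```python
import math

def atomic_nn(G, w1, b1, w2, b2, f_hidden=math.tanh):
    """Evaluate Eq. (5.1) for one hidden layer and a linear output node.

    G  : input values G^mu
    w1 : w1[mu][j], weight between input node mu and hidden node j
    b1 : b1[j], bias weight w^1_0j of hidden node j
    w2 : w2[j], weight between hidden node j and the output node
    b2 : bias weight w^2_01 of the output node
    """
    hidden = []
    for j in range(len(b1)):
        weighted_sum = b1[j] + sum(w1[mu][j] * G[mu] for mu in range(len(G)))
        hidden.append(f_hidden(weighted_sum))
    # linear activation f(x) = x in the output layer
    return b2 + sum(w2[j] * hidden[j] for j in range(len(hidden)))

# Hypothetical weights for a network with 2 inputs and 3 hidden nodes
w1 = [[0.5, -0.3, 0.8],
      [0.1, 0.7, -0.4]]
b1 = [0.0, 0.2, -0.1]
w2 = [1.2, -0.6, 0.3]
b2 = -0.5

energy = atomic_nn([0.4, 1.1], w1, b1, w2, b2)
print(energy)
```

Other activation functions, e.g. the sigmoid of Eq. (5.2), can be passed via `f_hidden` without changing the rest of the evaluation.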
There are several advantages of NN potentials over conventional atomistic potentials:
the functional form of the potential energy surface (PES) does not need to be known
beforehand, as NNs are highly flexible and can represent arbitrary functions. Moreover, a systematic improvement of the NN potential is possible if new data for the
training set becomes available. Also note that an NN fit to reference data from any electronic structure
method (DFT, HF, MP2, etc.) is possible. The main target of the NN potentials is
the total energy. Nevertheless, the analytic derivatives of the functional form of the
neural network are readily available, which allows for the fast computation of forces
and therefore for the speedup of molecular dynamics (MD) simulations.
A number of conceptual problems prevent a direct application of potentials based
on conventional feed-forward neural networks to condensed systems. First, there
is a fixed number of input nodes that is defined for a certain number of atoms (or
degrees of freedom). An NN fit is therefore only valid for one system size. Second,
the energy expression is not invariant with respect to rotation, translation, and the
permutation of equivalent atoms. Therefore, a new NN scheme is required to deal with
high-dimensional systems.
5.2 High-dimensional neural network potentials
To be useful as general class of atomistic potentials, it is necessary to overcome the
limitations of the conventional low-dimensional NNs, so that the method becomes
applicable to systems with a significantly larger number of atoms. Independently,
Smith and coworkers [79, 80] and Behler and Parrinello [53] suggested decomposing
the total structural energy E into a sum of atomic energies
\[ E = \sum_{i=1}^{N} E_i . \tag{5.5} \]
Figure 5.2 A high-dimensional neural network potential consisting of N atomic neural networks.
The total energy output E of the high-dimensional network is given as sum of the individual
atomic energies Ei .
Each atomic energy Ei is in turn represented by a conventional feed-forward neural
network that takes the local chemical environment into account. While Smith et al.
employ a description of the atomic environment using chains of atoms and coordination functions, Behler and Parrinello have developed universal many-body functions,
the symmetry functions Gi , that are able to capture a local structural fingerprint. A
schematic depiction of the Behler–Parrinello approach is shown in Fig. 5.2. The
atomic configuration is given by a set of Cartesian coordinates Ri = (Xi ,Yi , Zi ) (red
squares in Fig. 5.2), which are transformed to rotationally and translationally invariant
coordinates in form of symmetry functions Gi . These many-body functions depend
only on the relative positions of the atoms in the structure [81]. Additionally, the radial
extension of the symmetry functions is confined by a cutoff function fc , so that only
the local atomic environment of atom i contributes to Gi . The values of the symmetry
functions are the input vectors of the atomic neural networks, which yield environment
dependent atomic energy contributions. Note that the atomic NNs of all atoms of the
same species are identical, which implicitly imposes a permutational symmetry for
equivalent atoms. It is furthermore straightforward to adapt the high-dimensional NN
of Fig. 5.2 to an arbitrary number of atoms, simply by including the corresponding
number of atomic NNs.
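The construction of the total energy from atomic contributions, Eq. (5.5), can be sketched as follows. The example assumes that the symmetry function values of each atom are already available; the helper `make_atomic_nn`, the species labels, and all weight values are hypothetical and do not correspond to any fitted potential.

```python
import math

def make_atomic_nn(w_in, b_hidden, w_out, b_out):
    """Return a small one-hidden-layer network G -> E_i (hypothetical weights).

    w_in[j] holds the input weights of hidden node j.
    """
    def nn(G):
        hidden = [math.tanh(b + sum(w * g for w, g in zip(weights, G)))
                  for weights, b in zip(w_in, b_hidden)]
        return b_out + sum(w * h for w, h in zip(w_out, hidden))
    return nn

# One atomic NN per chemical species; all atoms of a species share its weights
atomic_nns = {
    "Zn": make_atomic_nn([[0.4, -0.2], [0.1, 0.3]], [0.0, 0.1], [0.5, -0.7], -1.0),
    "O": make_atomic_nn([[-0.3, 0.6], [0.2, 0.2]], [0.05, -0.1], [0.9, 0.4], -0.5),
}

def total_energy(species, symmetry_functions):
    """Eq. (5.5): sum of the atomic energies E_i of all atoms."""
    return sum(atomic_nns[s](G) for s, G in zip(species, symmetry_functions))

species = ["Zn", "O", "O"]
G_values = [[0.2, 0.5], [0.1, 0.9], [0.3, 0.4]]
print(total_energy(species, G_values))

# Exchanging the two equivalent O atoms leaves the total energy unchanged
print(total_energy(["Zn", "O", "O"], [G_values[0], G_values[2], G_values[1]]))
```

Adding an atom only appends one more term to the sum, so the same set of weights can be applied to structures of any size.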
High-dimensional NNs of the Behler–Parrinello type have been successfully employed
in studies of various kinds of materials, such as silicon [55, 56], sodium [82], carbon [83, 84], and copper [85]. For all applications in this thesis the Behler–Parrinello
method as implemented in the RuNNer code has been used [78].
5.2.1 Training of the atomic neural networks
The optimization of the weight parameters of the atomic NNs proceeds by an iterative
minimization of the error for a given set of reference energies. In this work, the
adaptive Kalman filter optimization algorithm has been used [77, 86–88]. Reference
energies can be obtained from electronic structure calculations, for example using
density-functional theory. Note that it is not necessary to decompose the reference
total energy into atomic contributions, as the training of the atomic NNs can be done
simultaneously, using the total energies as target values. Not only the potential energy
surface itself, but also its gradient, i.e., the atomic force components, can be used as
reference data for the NN optimization [87, 89, 90]. A structure containing N atoms
thus provides 3 N + 1 pieces of information, namely the total energy and the 3 N force
components. The force ~Fk acting on atom k is given by the negative gradient of the NN
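function, i.e., for the Cartesian components $F_{\alpha k}$ (with α = x, y, z)
\[ F_{\alpha k} = -\frac{\partial E}{\partial \alpha_k} = -\sum_{i=1}^{N} \frac{\partial E_i}{\partial \alpha_k} = -\sum_{i=1}^{N} \sum_{j=1}^{M_i} \frac{\partial E_i}{\partial G_{i,j}} \frac{\partial G_{i,j}}{\partial \alpha_k} , \tag{5.6} \]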
where Mi is the total number of symmetry functions for atom i. The derivative of the
symmetry function with respect to the Cartesian direction, $\partial G_{i,j} / \partial \alpha_k$, only depends on the
functional form of the symmetry function Gi, j , whereas the specific NN function enters
the derivative of the atomic energy, $\partial E_i / \partial G_{i,j}$.
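The chain rule of Eq. (5.6) can be checked numerically for a toy system. The sketch below considers two atoms on a line, a single Gaussian-type descriptor per atom, and a one-node "network"; the descriptor, the constants `ETA`, `A`, `W`, `B`, and the function names are assumptions made for this illustration only.

```python
import math

ETA, A, W, B = 0.5, -1.0, 2.0, 0.1   # hypothetical descriptor and NN parameters

def descriptor(x1, x2):
    """Toy descriptor G = exp(-eta R^2) with R = |x2 - x1| (illustration only)."""
    r = abs(x2 - x1)
    return math.exp(-ETA * r * r)

def atomic_energy(g):
    """Toy atomic 'network' E_i(G) = A tanh(W G + B)."""
    return A * math.tanh(W * g + B)

def d_atomic_energy_dg(g):
    """dE_i/dG, determined by the NN function only."""
    return A * W * (1.0 - math.tanh(W * g + B) ** 2)

def d_descriptor_dx1(x1, x2):
    """dG/dx1, determined by the functional form of the descriptor only."""
    return 2.0 * ETA * (x2 - x1) * descriptor(x1, x2)

def total_energy(x1, x2):
    g = descriptor(x1, x2)   # both atoms have the same descriptor value here
    return atomic_energy(g) + atomic_energy(g)

def force_on_atom_1(x1, x2):
    """Eq. (5.6): sum over both atoms i, one descriptor per atom."""
    g = descriptor(x1, x2)
    return -sum(d_atomic_energy_dg(g) * d_descriptor_dx1(x1, x2) for _ in range(2))

def numerical_force_on_atom_1(x1, x2, h=1e-6):
    """Central finite difference of the total energy, used as a cross-check."""
    return -(total_energy(x1 + h, x2) - total_energy(x1 - h, x2)) / (2.0 * h)

print(force_on_atom_1(0.0, 1.2), numerical_force_on_atom_1(0.0, 1.2))
```

The analytic force requires no additional energy evaluations, in contrast to the finite-difference estimate.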
A comparison of the NN training procedure with and without the use of the force
information is shown in Figs. 5.3 and 5.4 for a model potential.
Figure 5.3 Demonstration of the neural network optimization process (panels a–d): A feed-forward NN
with one hidden layer and two nodes is trained to a one-dimensional model potential using the
reference energies (black diamonds) only. The NN output (red line) as well as the individual
contributions to the total NN energy from both nodes (blue and green lines) and the bias weights
(dashed blue line) are shown for different epochs.
Figure 5.4 Demonstration of the neural network optimization process (panels a–d): A feed-forward NN
with one hidden layer and two nodes is trained to a one-dimensional model potential using
reference energies (black diamonds) and forces (black line). The NN output (red line) as well
as the individual contributions to the total NN energy from both nodes (blue and green lines)
and the bias weights (dashed blue line) are shown for different epochs. The NN forces are
represented by the orange line.
5.2.2 Symmetry functions
A number of many-body functions that are suitable for the use as symmetry functions
in high-dimensional NNs are given in Ref. [81]. For the discussion in this section we
will restrict ourselves to the three most common symmetry functions. The notation of
Ref. [81] is followed.
In general, the symmetry function values Gi depend on the positions of all atoms in the
local environment of atom i defined by a cutoff radius Rc as indicated by the dotted
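arrows in Fig. 5.2, and the radial cutoff is imposed using a cosine cutoff function
\[ f_c(R_{ij}) = \begin{cases} 0.5 \times \left[ \cos\left( \dfrac{\pi R_{ij}}{R_c} \right) + 1 \right] & \text{for } R_{ij} \le R_c , \\ 0 & \text{for } R_{ij} > R_c . \end{cases} \tag{5.7} \]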
The simplest radial symmetry function, $G_i^1$, is given by the sum of the cutoff
function values of all neighboring atoms j
\[ G_i^1 = \sum_{j \ne i}^{N} f_c(R_{ij}) . \tag{5.8} \]
The alternative radial symmetry function $G_i^2$ is defined as
\[ G_i^2 = \sum_{j \ne i}^{N} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij}) , \tag{5.9} \]
where the parameters η and Rs define the width and the center of Gaussian functions,
respectively. An example of the radial symmetry functions G2i is shown in Fig. 5.5.
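The cutoff function (5.7) and the radial symmetry function (5.9) translate directly into code. The following minimal sketch evaluates both for a small non-periodic cluster; the coordinates and parameter values are hypothetical, and periodic boundary conditions are not considered.

```python
import math

def cutoff(r, r_c):
    """Cosine cutoff function, Eq. (5.7)."""
    if r > r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def g2(i, positions, eta, r_s, r_c):
    """Radial symmetry function G^2 of atom i, Eq. (5.9)."""
    total = 0.0
    for j, position_j in enumerate(positions):
        if j == i:
            continue
        r_ij = math.dist(positions[i], position_j)
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return total

# Hypothetical three-atom cluster (arbitrary length unit)
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
shifted = [(x + 3.0, y - 1.0, z + 2.0) for x, y, z in positions]

# The value depends only on interatomic distances: translation does not change it
print(g2(0, positions, eta=0.2, r_s=0.0, r_c=6.0))
print(g2(0, shifted, eta=0.2, r_s=0.0, r_c=6.0))
```

For η = 0 the Gaussian equals one, and the function reduces to the symmetry function of Eq. (5.8).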
While the radial symmetry functions are constructed from pairs of atoms, the angular
symmetry functions depend on all triplets of atoms in the structure by combining the
cosines of the angles θi jk , $\cos\theta_{ijk} = \vec{R}_{ij} \cdot \vec{R}_{ik} / (R_{ij} R_{ik})$, centered at atom i, with $\vec{R}_{ij} = \vec{R}_i - \vec{R}_j$. In all NN
potentials constructed for this thesis the angular symmetry function $G_i^4$ has been used,
which is defined as
\[ G_i^4 = 2^{1-\zeta} \sum_{j} \sum_{k} \left( 1 + \lambda \cdot \cos\theta_{ijk} \right)^{\zeta} \cdot e^{-\eta \left( R_{ij}^2 + R_{ik}^2 + R_{jk}^2 \right)} \cdot f_c(R_{ij}) \cdot f_c(R_{ik}) \cdot f_c(R_{jk}) , \tag{5.10} \]
Figure 5.5 An example of the radial symmetry functions G2i with different η parameters (η = 0.0009, 0.0100, 0.0200, 0.0350, 0.0600, 0.1000, 0.2000, and 0.4000 Bohr^-2), plotted as a function of Rij (Å).
Figure 5.6 An example of the angular symmetry functions G4i with different λ parameters (λ = 1 and λ = -1, each for ζ = 1 and ζ = 4), plotted as a function of θijk (°).
where the parameter λ can assume the values +1 and -1, and the value of ζ determines
the angular resolution. An example of the angular symmetry functions G4i is plotted in
Fig. 5.6.
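A direct implementation of Eq. (5.10) is sketched below for a non-periodic cluster. It is assumed here that both sums run over all neighbors j ≠ i and k ≠ i with k ≠ j; the coordinates and parameter values are hypothetical.

```python
import math

def cutoff(r, r_c):
    """Cosine cutoff function, Eq. (5.7)."""
    if r > r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def g4(i, positions, eta, zeta, lam, r_c):
    """Angular symmetry function G^4 of atom i, Eq. (5.10).

    Assumption of this sketch: the double sum runs over j != i, k != i, k != j.
    """
    total = 0.0
    n_atoms = len(positions)
    for j in range(n_atoms):
        if j == i:
            continue
        for k in range(n_atoms):
            if k == i or k == j:
                continue
            r_ij = math.dist(positions[i], positions[j])
            r_ik = math.dist(positions[i], positions[k])
            r_jk = math.dist(positions[j], positions[k])
            dot = sum((a - b) * (a - c) for a, b, c in
                      zip(positions[i], positions[j], positions[k]))
            # clamp against rounding errors slightly outside [-1, 1]
            cos_theta = max(-1.0, min(1.0, dot / (r_ij * r_ik)))
            total += ((1.0 + lam * cos_theta) ** zeta
                      * math.exp(-eta * (r_ij ** 2 + r_ik ** 2 + r_jk ** 2))
                      * cutoff(r_ij, r_c) * cutoff(r_ik, r_c) * cutoff(r_jk, r_c))
    return 2.0 ** (1.0 - zeta) * total

# Hypothetical cluster and the same cluster rotated by 90 degrees about the z axis
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0), (0.5, 0.5, 1.0)]
rotated = [(-y, x, z) for x, y, z in positions]

print(g4(0, positions, eta=0.01, zeta=4, lam=1.0, r_c=6.0))
print(g4(0, rotated, eta=0.01, zeta=4, lam=1.0, r_c=6.0))
```

Because only distances and angles enter, the value is unchanged by a rotation of the whole structure.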
5.3 High-Dimensional Neural Networks for Multicomponent Systems
Structures containing different atomic species may exhibit charge transfer that leads to
long-ranged non-local electrostatic interactions, which may not be well represented
as a function of the local atomic environment. The long-range electrostatic energy
is dominated by the interaction between charges (which decays slowly as 1/r),
while electrostatic interactions from higher multipoles like dipoles and quadrupoles
decay faster with the atomic distance and can therefore already be covered well by the
short-range part.
Figure 5.7 A high-dimensional neural network potential for multicomponent systems. An
additional NN is employed to predict the atomic charges, which in turn can be used to compute
the electrostatic energy of the structure.
The high-dimensional NN method can be extended by an additional NN that predicts the atomic charges [94, 95] and thus enables the application of the NN method to multicomponent systems. The choice of the charge partitioning method is arbitrary; for instance, Mulliken [91], Hirshfeld [92], or Bader [93] charges can be used.
A high-dimensional NN for multicomponent systems thus consists of two high-dimensional NNs: one NN for the short-range energy and one NN for the atomic charges (Fig. 5.7). Standard methods such as the Ewald summation can then be used to evaluate the electrostatic energy for periodic systems. The short-range energy contribution can easily be obtained from the reference calculation by subtracting the electrostatic energy, computed using the reference charges, from the total reference energy:

$$E_{\mathrm{short,ref}} = E_{\mathrm{tot,ref}} - E_{\mathrm{elec,ref}}. \tag{5.11}$$
The short-range energy and the reference charges can then be used to train the short-range NN and the charge NN independently of each other. However, if one wants to use the atomic forces in the optimization process, this procedure cannot be applied anymore, since the electrostatic force contribution is not directly accessible from the reference calculation. In order to obtain the electrostatic forces related to the reference charges, the derivatives of the charges with respect to the atomic positions $\alpha_k$, $\partial q_i / \partial \alpha_k$, have to be known. Because we are dealing with environment-dependent charges, this derivative is non-zero. Unfortunately, the dependence of the charges on the environment is not available from the reference calculations. In order to still make use of the atomic forces for the weight optimization, the following approach is therefore taken:
1. the electrostatic NN is trained to the reference charges (this does not require knowledge of the reference charge derivatives),
2. the electrostatic energy and force contributions are computed using the electrostatic NN. The derivatives of the charges with respect to the atomic positions are given by the NN architecture and the definition of the symmetry functions:

$$\frac{\partial q_i}{\partial \alpha_k} = \sum_{j=1}^{M_i} \frac{\partial q_i}{\partial G_{i,j}} \frac{\partial G_{i,j}}{\partial \alpha_k}, \tag{5.12}$$

3. the electrostatic energy and force components given by the electrostatic NN are subtracted from the reference total energy and forces, and the short-range NN is trained to the remaining short-range part of the potential.
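The chain rule of Eq. (5.12) can be illustrated with a toy model. The sketch below is not the charge NN used in this work: the single-node "network" `charge`, the Gaussian "symmetry functions" of one coordinate, and all parameter values are hypothetical and serve only to show how the two partial derivatives are combined.

```python
import math

ETAS = [0.1, 0.5, 1.0]       # hypothetical symmetry-function parameters
WEIGHTS = [0.3, -0.2, 0.7]   # hypothetical NN weights


def symmetry_functions(alpha):
    """Toy symmetry functions G_j of a single coordinate alpha."""
    return [math.exp(-eta * alpha**2) for eta in ETAS]


def symmetry_function_derivatives(alpha):
    """Analytic derivatives dG_j / d(alpha) of the toy symmetry functions."""
    return [-2.0 * eta * alpha * math.exp(-eta * alpha**2) for eta in ETAS]


def charge(g):
    """Toy charge model: one tanh node acting on the weighted sum of the G_j."""
    return math.tanh(sum(w * x for w, x in zip(WEIGHTS, g)))


def charge_gradient_wrt_g(g):
    """Partial derivatives dq / dG_j of the toy charge model."""
    s = sum(w * x for w, x in zip(WEIGHTS, g))
    return [w * (1.0 - math.tanh(s) ** 2) for w in WEIGHTS]


def charge_derivative(alpha):
    """dq / d(alpha) as the sum over j of (dq/dG_j) * (dG_j/d(alpha)), cf. Eq. (5.12)."""
    g = symmetry_functions(alpha)
    dq_dg = charge_gradient_wrt_g(g)
    dg_dalpha = symmetry_function_derivatives(alpha)
    return sum(a * b for a, b in zip(dq_dg, dg_dalpha))


print(charge_derivative(0.8))
```

A finite-difference derivative of `charge(symmetry_functions(alpha))` offers a simple consistency check of such analytic derivatives.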
5.4 Benchmark of activation functions and symmetry functions
In order to select the most appropriate activation functions for the NN fitting, a number of options have been tested, such as the hyperbolic tangent (t) of Eq. (5.3), Gaussian functions (g), linear functions (l), and cosine functions (c). During the iterative fitting process several quantities can be monitored to check the accuracy of the NN fits. The most commonly used one is the root mean squared error (RMSE), which is given for a reference set of M data points as

$$E_{\mathrm{RMSE}} = \sqrt{\frac{1}{M}\sum_{i}^{M}\left(E_{\mathrm{NN}} - E_{\mathrm{ref}}\right)^2}. \tag{5.13}$$
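As a minimal sketch, the RMSE of Eq. (5.13) can be evaluated as follows. The function name `rmse` and the example numbers are hypothetical and only illustrate the formula.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between NN and reference energies, cf. Eq. (5.13)."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("both lists must be non-empty and of equal length")
    m = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / m)


# Hypothetical energies in meV, for illustration only.
e_nn = [10.2, 19.7, 30.4]
e_ref = [10.0, 20.0, 30.0]
print(rmse(e_nn, e_ref))
```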
For the benchmark, copper dimer structures with 110 different interatomic distances have been used as training set. In all cases a network architecture with a single hidden layer containing five nodes (1-5-1) has been used, with a radial symmetry function of type $G_i^1$ or $G_i^2$ as input. Table 5.1 shows the results of the benchmark for the different activation function types (c, g, l and t) using the RMSE to monitor the accuracy of the fit.

Table 5.1 RMSEs of neural network (NN) fits for the Cu dimer (110 data points) after 100 iterations (epochs). The NN architecture is 1-5-1 (five nodes in the hidden layer) with a radial symmetry function of type $G_i^1$ or $G_i^2$. The symbols min, max, and avg (different random seeds) in the table refer to the minimum, maximum, and average RMSE values, respectively. σ_train is the standard deviation of the training set. The activation function types (act) are discussed in the text.

function type       RMSE / meV
symm    act         min_train   min_test   max_train   avg_train   σ_train
1       c           0.31        0.48       14.10       2.12        1.65
1       g           0.54        0.40       11.39       2.70        1.78
1       l           55.90       75.79      55.90       55.90       0.00
1       t           0.23        0.27       21.05       1.86        1.47
2       c           0.16        0.34       2.31        0.49        0.25
2       g           0.16        0.26       3.26        0.55        0.35
2       l           9.03        1.49       9.03        9.03        0.00
2       t           0.13        0.12       2.45        0.38        0.21
The hyperbolic tangent as activation function in combination with the symmetry function of type G2i yields the lowest RMSE in the benchmark. Several further benchmarks,
which are not shown here, have confirmed this finding. Therefore, these settings were
used for all NN potentials constructed in this thesis.
Neural networks can provide total energies that are close to reference electronic structure energies, for example from DFT calculations, together with the corresponding derivatives (forces). Neural networks cannot directly provide electronic properties of molecular systems, i.e., NNs have no access to information about the electronic states, and no charge density is available from the neural network output.
Therefore, the main use of neural network potentials is to compute energies and forces to speed up molecular dynamics simulations, so that longer time scales and larger system sizes can be simulated and structural and dynamical properties become accessible while still remaining very close to DFT accuracy [48, 53, 56, 96].
5.5 Molecular dynamics simulations employing NN potentials
In order to employ NN potentials in molecular dynamics (MD) simulations to study
the structural and dynamical properties of complex systems, the NN code RuNNer [78]
has been interfaced with the TINKER MD program [97].
TINKER is a general package for molecular mechanics and dynamics simulations. The
original TINKER code has been extended to allow the use of the RuNNer program for
the evaluation of the potential energy. TINKER offers a convenient interface for the
implementation of new potentials, which made the implementation straightforward. In
order to clarify the combination of TINKER and RuNNer, the flow chart in Fig. 5.8
schematically shows the interaction of the two programs.
Figure 5.8 depicts how the added subroutines interact with the TINKER program:
• nninput (nninput.f):
This subroutine writes the atomic coordinates and lattice vectors to a RuNNer input file with the name input.data. The units are automatically converted (e.g., Angstrom to Bohr).
• nnenergy (nnenergy.f):
The routine reads the current value of the energy (in Hartree atomic units) from
the RuNNer output file energy.out, converts it to kcal/mol and stores it in the
corresponding TINKER data structure.
• nnforces (nnforces.f):
The final additional subroutine reads the atomic forces (in Ha/Bohr) from the
RuNNer output file forces.out, converts them to kcal/mol/Angstrom and
returns the negative forces (i.e., the gradients) to TINKER.
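The unit conversions performed by the three subroutines above can be sketched in Python as follows. This is an illustration only, not the Fortran code of the interface: the function names are hypothetical, and the conversion factors are standard literature values that may differ in the last digits from the constants used inside TINKER or RuNNer.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509474   # 1 Hartree in kcal/mol
BOHR_TO_ANGSTROM = 0.529177211         # 1 Bohr in Angstrom


def coordinates_to_bohr(coords_angstrom):
    """Convert Cartesian coordinates from Angstrom to Bohr (as done when writing the input)."""
    return [[c / BOHR_TO_ANGSTROM for c in atom] for atom in coords_angstrom]


def energy_to_kcal_per_mol(energy_hartree):
    """Convert an energy from Hartree to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


def forces_to_gradients(forces_ha_per_bohr):
    """Convert forces in Ha/Bohr to gradients (negative forces) in kcal/mol/Angstrom."""
    factor = HARTREE_TO_KCAL_PER_MOL / BOHR_TO_ANGSTROM
    return [[-f * factor for f in atom] for atom in forces_ha_per_bohr]


# Hypothetical values for a single atom.
print(coordinates_to_bohr([[1.0, 0.0, 0.0]]))
print(energy_to_kcal_per_mol(-0.5))
print(forces_to_gradients([[0.01, -0.02, 0.0]]))
```

The sign change in `forces_to_gradients` reflects that the MD program expects the gradient of the potential energy, which is the negative of the force.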
Figure 5.8 A simple visualization of the interaction of RuNNer [78] and TINKER [97]; see the text for a detailed explanation.
Part III
Computational Details
6 Computational Details
6.1 DFT calculations
The reference DFT calculations to train the NN potential for copper and zinc oxide
have been carried out using the Fritz Haber Institute ab initio molecular simulations
(FHI-aims) code discussed in Chapter 3 [57]. At the beginning of each calculation, the
basis functions, which are given numerically on spherical grids centered at the nuclei,
are determined by solving the Kohn-Sham equations for the free atoms. In the present work we have used the “tier 1” basis set for copper and zinc and the “tier 2” basis set for oxygen. These basis sets are part of the default basis set library provided by FHI-aims,
and they contain a minimal basis set of atomic orbitals as well as a set of additional
basis functions constructed as hydrogen-like orbitals, i.e., they are obtained from a
fictitious hydrogen atom with a modified nuclear charge and a specific set of quantum
numbers. In total, 40 basis functions have been used for each atom.
Dense k-point meshes have been used with k-point densities approximately equivalent to a 12×12×12 mesh of the conventional four-atom fcc unit cell for all periodic structures. The calculated total energies are converged to about 1 meV per atom, the forces to about 10 meV/Å. The PBE functional [98] has been used in all calculations to describe
electronic exchange and correlation. Relativistic effects have been included via the
scaled zeroth order regular approximation (ZORA) [99].
The local, atom-centered basis functions make it possible to treat systems with and without periodic boundary conditions in a consistent way. This has been exploited by using periodic systems for bulk and slab structures, and non-periodic structures for clusters.
The DFT calculations yield the total energies and atomic forces, both of which have been used to construct the NN potential. A DFT calculation for an N-atom system provides 3N + 1 pieces of information for the fitting, as there is one total energy and 3N force components per structure.
6.2 Construction of reference data sets
With the exception of the data discussed for the methanol molecule in Sec. 10, DFT
calculations have been carried out to construct the reference data sets for NN potentials.
In general, the configurations in the reference set include bulk structures, slabs, and
clusters of variable size comprising up to 100 atoms. Approximately 10 % of these
structures have been selected as test set to check the generalization properties of the
potential for unknown structures. The remaining 90 % have been used to determine the
NN parameters. A detailed discussion of the reference data sets is given in the chapters
for each individual system.
In each case, the reference data set has been generated in a self-consistent, iterative
procedure. A number of schemes have been proposed in the literature to add data
points step by step in important regions of the potential energy surface (PES) in the
context of empirical potentials [100, 101] and also in the field of neural networks [102].
A first approximate potential has been constructed based on an initial data set derived
from ideal crystal structures such as face-centered cubic (fcc), body-centered cubic
(bcc), hexagonal close-packed (hcp), and simple cubic (sc) in a unit cell volume range
of about 50 Bohr³ per atom. The six-dimensional space of lattice vectors has been
mapped systematically for these structures employing a primitive unit cell containing
one atom, except for the hcp structure, which has a lattice basis of two atoms. This
has been done by systematically varying the lattice parameters a, b, c, α, β and γ.
Consequently, a large number of bulk cells containing only one or two atoms are
included in the reference data sets. In particular the bulk structures containing just
one atom provide valuable information for the fit, because in these structures all atoms
have the same chemical environment. Therefore, a unique energy can be assigned to
each vector of symmetry functions describing the atomic environments.
Further bulk structures have been generated for supercells containing two or four atoms.
In these structures, which were derived from fcc, bcc, sc, and hcp (super)cells, the
symmetry has been broken by displacing the atoms randomly by up to 0.5 Bohr from
the ideal lattice sites, which corresponds to a thermal distortion.
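A random distortion of this kind can be sketched as follows. The text only states the maximum displacement; the way the displacement is drawn here (a uniformly random direction combined with a uniformly random length up to the maximum) and the function name `displace_atoms` are assumptions made for illustration.

```python
import math
import random


def displace_atoms(positions, max_displacement=0.5, seed=None):
    """Shift every atom by a random vector of length <= max_displacement (same units as positions)."""
    rng = random.Random(seed)
    displaced = []
    for x, y, z in positions:
        # Rejection sampling inside the unit sphere gives an unbiased random direction.
        while True:
            dx, dy, dz = (rng.uniform(-1.0, 1.0) for _ in range(3))
            norm = math.sqrt(dx * dx + dy * dy + dz * dz)
            if 0.0 < norm <= 1.0:
                break
        scale = rng.uniform(0.0, max_displacement) / norm
        displaced.append((x + dx * scale, y + dy * scale, z + dz * scale))
    return displaced


# Hypothetical two-atom cell with coordinates in Bohr.
ideal = [(0.0, 0.0, 0.0), (3.4, 3.4, 0.0)]
print(displace_atoms(ideal, max_displacement=0.5, seed=1))
```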
Apart from the bulk structures, the initial data set also contains surface structures, which have been represented by slabs. For example, small slabs with 10 or 12 atoms (one atom per layer) have been constructed for the low-index surfaces of fcc, bcc and sc copper. Here, too, the volume has been varied and a number of structures with randomly displaced atoms have been generated.
Figure 6.1 A systematic approach to construct neural network potentials
6.2.1 A systematic approach to construct neural network potentials
The construction of neural network potentials follows the systematic approach shown in Fig. 6.1. The figure shows a flow chart of the general procedure. In a first step, a number of DFT calculations for random structures (the first training set) are performed. Based on the results, a first regression leads to a preliminary potential-energy surface. The generated fit is then used to determine further structures to be included in the training set in order to improve the quality of the NN potential. This is done, e.g., by performing molecular dynamics simulations or structural optimizations using the preliminary potential. Subsequently, those structures occurring during these calculations that lie outside the fitted regions of the PES are calculated with DFT and, in the case of poor agreement between DFT and NN energies, included in an extended training set. For the new training set the whole scheme is repeated until good agreement between the DFT and NN potentials is achieved for all occurring structures.
6.2.2 Iterative refinement of the data sets
A disadvantage of the procedure described so far is the high number of DFT calculations
for similar structures that will not necessarily improve the reference data set. To avoid
such redundant calculations we have followed a different approach to refine the data
set.
Based on the initial data set, several NN fits have been generated. They have then
been employed in MD simulations of larger bulk and surface systems containing
up to a few hundred atoms using the NVT ensemble. While a few of the smaller
structures generated in this way have been recalculated directly by DFT and were
added to the training set to improve the fit, for most atomic environments emerging
in these simulations a more efficient approach has been followed. Since the energy
contribution of each atom depends only on its local chemical environment, a new
atomic environment can be added to the training set by cutting a cluster centered at the respective atom from systems that are too large for DFT calculations. The advantage
of this procedure is that only comparably small clusters need to be calculated by DFT.
Most of the clusters in the reference data sets have been generated in this way.
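Cutting a cluster around a given atom can be sketched as follows. The sketch assumes a non-periodic list of Cartesian coordinates; for periodic bulk or slab systems the periodic images of the atoms would have to be taken into account as well, which is omitted here. The function name `cut_cluster` is hypothetical.

```python
import math


def cut_cluster(positions, center_index, radius):
    """Return all atoms (including the central one) within `radius` of the chosen atom."""
    center = positions[center_index]
    return [p for p in positions if math.dist(p, center) <= radius]


# Hypothetical linear chain of atoms with a spacing of 2.5 length units.
chain = [(2.5 * n, 0.0, 0.0) for n in range(10)]
cluster = cut_cluster(chain, center_index=4, radius=6.0)
print(len(cluster))  # 5
```

Choosing the radius equal to the range of the symmetry functions ensures that the environment of the central atom seen by the NN is the same in the cluster as in the large system.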
Following the same spirit, a large number of bulk and surface structures, which have
been generated in MD simulations at temperatures between 300 and 1000 K, have
been searched systematically for appropriate clusters to be added to the data set. This
can be achieved by exploiting the large flexibility of NN potentials. Two NN fits of
about the same quality in terms of the RMSE of the energy and the forces can be
used to identify missing points in the reference set. If an atom has an environment
that is very different from the environments already included in the training set, the
two NN fits are likely to predict very different energy contributions for this atom. By
comparing the predicted atomic energies of two approximate potentials, the atomic
environments missing in the training set can thus be systematically determined. Then,
clusters centered at these atoms are recalculated by DFT to improve the reference data
set. This procedure is repeated until the root mean squared errors of the training and
the test set have converged.
This two-fits approach makes it possible to search a large region of the configuration
space for missing reference points, without the need of redundant expensive DFT
calculations [85]. Examples of such analyses are given in the following chapters for
specific NN potentials.
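The comparison of two fits can be sketched in a few lines. The threshold used to decide that two predictions differ "significantly" is not specified in the text; the value below, like the function name `find_missing_environments`, is a hypothetical choice for illustration.

```python
def find_missing_environments(energies_fit_1, energies_fit_2, threshold):
    """Return the indices of atoms whose atomic energies from two NN fits differ by more than `threshold`."""
    if len(energies_fit_1) != len(energies_fit_2):
        raise ValueError("both fits must provide one energy per atom")
    return [index
            for index, (e1, e2) in enumerate(zip(energies_fit_1, energies_fit_2))
            if abs(e1 - e2) > threshold]


# Hypothetical atomic energies (in eV) of four atoms predicted by two fits.
fit_1 = [-3.50, -3.48, -3.10, -3.52]
fit_2 = [-3.51, -3.47, -2.60, -3.52]
print(find_missing_environments(fit_1, fit_2, threshold=0.1))  # [2]
```

Clusters centered at the returned atoms would then be recalculated with the reference method and added to the data set.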
6.3 Optimization of the neural network architecture
As discussed in Chapter 5, apart from the numerical values of the weights, also the NN
architecture is important for the accuracy of the potential, because a large number of
hidden layers and nodes increases the flexibility of the NN. If the NN is too small, it is
not able to represent all subtle features of the PES and some details may be missing in
the final potential. If the NN is too large, it has a very high flexibility and over-fitting
can occur, i.e., the training structures are well represented, while atomic configurations
in between the training points can have a drastically reduced accuracy.
Both situations can be detected by monitoring the RMSEs of the training set and the
test set. In the initial stage of the fit, the RMSEs of both sets decrease, since the NN learns the overall topology of the PES. If the error of both sets remains similar in the
course of the fitting process, but the RMSEs are still high, the NN size should be
increased. If, on the other hand, the RMSE of the training set is low, but the test set
RMSE is much larger, then over-fitting is present.
In general, the smallest possible NN should be used that provides the desired accuracy
and similar training and test errors. In practice we found it most efficient to determine
the optimum NN architecture in an empirical way. A number of NN PESs with different
architectures is constructed and the one with the best generalization properties, i.e., the
lowest test set error, is selected for applications.
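The empirical selection described above can be sketched as follows. The architecture labels, the RMSE values, and the ratio used to flag over-fitting are hypothetical and only illustrate the selection rule.

```python
def select_architecture(results):
    """Return the architecture with the lowest test-set RMSE.

    `results` maps an architecture label to a (train_rmse, test_rmse) tuple.
    """
    return min(results, key=lambda name: results[name][1])


def is_overfitted(train_rmse, test_rmse, max_ratio=1.5):
    """Flag a fit whose test RMSE exceeds max_ratio times the training RMSE (hypothetical criterion)."""
    return test_rmse > max_ratio * train_rmse


# Hypothetical RMSE values in meV/atom, for illustration only.
results = {
    "51-5-5-1": (6.0, 6.2),
    "51-20-20-1": (3.9, 4.1),
    "51-60-60-1": (2.0, 7.5),
}
best = select_architecture(results)
print(best)  # 51-20-20-1
```

In this made-up example the largest network has the lowest training error but a much larger test error, so it would be discarded as over-fitted.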
Finally, it should be noted that the determination of the NN weights represents a
very high-dimensional optimization problem, and there is no hope to find the global
minimum, since there are typically several thousand weight parameters to be optimized.
Still, in most cases very accurate local minima can be found, which represent all
physical properties of the system with good accuracy. For a given NN architecture
there are many local minima, and the final result depends on several initial settings,
like the choice of the initial values of the weights, the order of the training points and
the optimization algorithm.
Part IV
Results
7 A Neural Network Potential for Copper
The interactions present in a metal are substantially different from what is observed in
covalent insulators: in contrast to covalent structures, the electronic wave functions
are very long ranged and delocalized in metals. Empirical interatomic potentials are
therefore usually specialized for either metallic or covalent systems. In this chapter the
first application of neural network potentials for a metal, namely for a copper potential,
is discussed. It is demonstrated that the high flexibility of the NN is able to capture
metallic interactions with the same accuracy that is known for covalent insulators.
7.1 Reference data set
Altogether, 37,763 DFT calculations have been performed to construct the reference data set. These structures comprise 8,419 clusters, 15,448 bulk structures, and 13,896 slabs. The structure sizes range from one atom (bulk structures with different lattice vectors) up to about 144 atoms. In total, there are 617,475 atomic environments in these structures. Each DFT calculation provides the total energy and three force components per atom. Therefore, this data set contains 1,890,188 pieces of information that can be used to construct the NN potential. The structures have been distributed randomly into a training set (33,963 structures), which is used to optimize the weights of the NN, and an independent test set (3,800 structures, approximately 10 % of all structures), which is used to check the transferability of the potential. Lists of the copper clusters, bulk and slab structures in the training and test sets are given in Tables 7.1, 7.2, and 7.3, respectively.
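The numbers quoted above are mutually consistent, which can be checked with a few lines of arithmetic. The variable names are hypothetical; all values are taken from the text.

```python
n_clusters, n_bulk, n_slabs = 8_419, 15_448, 13_896
n_environments = 617_475          # total number of atoms in all structures
n_train, n_test = 33_963, 3_800

n_structures = n_clusters + n_bulk + n_slabs
# One total energy per structure plus three force components per atom.
n_information = n_structures + 3 * n_environments
test_percentage = 100.0 * n_test / n_structures

print(n_structures)                # 37763
print(n_train + n_test)            # 37763
print(n_information)               # 1890188
print(round(test_percentage, 1))   # 10.1
```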
The reference data set has been generated following the self-consistent iterative procedure described in Sec. 6.2. A first approximate potential has been constructed using only the ideal crystal structures fcc, hcp, bcc, sc, and diamond cubic in the volume range from about 50 Bohr³ per atom to about 130 Bohr³ per atom, which is shown in Fig. 7.1. The equilibrium volume per atom for copper in the fcc structure, the ground state structure, is 80.707 Bohr³.
Figure 7.1 DFT energy (relative energy in meV/atom) vs. volume (Bohr³/atom) curves for several crystal structures of copper (bcc, diamond, fcc, hcp, and sc). The atomic energies are relative to the most stable structure at the minimum lattice constant. The most stable fcc structure is represented by black diamonds.
Further reference structures have been generated systematically by varying the lattice
parameters, as discussed in Sec. 6.2. For the ideal one-atomic crystal structures fcc,
bcc, and sc there are six lattice parameters: the lengths of the lattice vectors a, b, and
c and the angles α, β , and γ. For the two-atomic hcp structure the c/a ratio is an
additional parameter. The resulting large number of bulk cells containing only one or
two atoms is shown in Table 7.2.
Additionally, bulk structures generated for two and four atom unit cells have been
derived from fcc, bcc, sc, and hcp supercells. The symmetry of these structures has
been broken by randomly displacing the atoms by 0.2 up to 0.5 Bohr from the ideal
lattice sites (see Fig. 7.2).
Apart from the bulk structures, surface structures were also included in the data set. Surfaces were represented by slab models, which were generated from the bulk equilibrium lattices by truncation. In particular, small slabs with 10 or 12 atoms (one atom per layer) have been constructed for the low-index surfaces of fcc, bcc and sc copper. Here, too, the volume has been varied and a number of structures with randomly displaced atoms have been generated.
Figure 7.2 DFT energies (relative energy in meV/atom) as a function of the volume (Bohr³/atom) of the copper bulk structures (ideal and distorted structures) included in the reference data set (a). Panel (b) enlarges the region marked by the black rectangle in panel (a), which contains the structures close to the minimum energy.
Moreover, vacancy structures have also been included, in which one atom was removed from the bulk or slab structures in larger supercells. Both the fixed and the optimized geometries of the vacancy structures were considered.
Based on this first input data set, various NN fits were generated. They have then been employed in MD simulations of large bulk and surface systems containing many hundreds or thousands of atoms using the NVT ensemble at temperatures between 300 and 1000 K. A few of the smaller structures generated in this way were recalculated directly by DFT and added to the training set to improve the fit, but for most atomic environments emerging in these simulations the more efficient two-fit approach of Sec. 6.2.2 has been followed. The comparison of the atomic energies predicted by two different NN fits has been used to identify missing reference data points. Figure 7.3 shows a typical example of such an analysis. As reasoned in Sec. 6.2, the energy contribution of each atom depends only on the local chemical environment. Therefore, a new atomic environment can be added to the training set by cutting a small cluster, with a radius equal to the range of the symmetry functions (6 Å in the case of the copper potential) and centered at the respective atom, from those systems that are too large for DFT calculations. This iterative procedure was repeated until the root mean squared errors of the training and the test set had converged self-consistently. The advantage is that only relatively small clusters have to be calculated by DFT, and therefore most of the clusters in Table 7.1 have been generated in this way. The remaining clusters in the training set have been generated randomly. The atomic positions have been specified by defining minimum and maximum Cu–Cu distances and the cluster radius; all of the Cu3 clusters have been created in this way.
Figure 7.3 Comparison of the neural network (NN) energies (relative energy in meV/atom) along a molecular dynamics (MD) trajectory. The MD simulation has been carried out employing the NN1 potential and the energies have been recalculated using the NN2 potential. Both potentials have been constructed using the same training set [85]. For most configurations the energies of the two NN potentials are very close, which indicates that these structures are similar to the structures in the training set. In the gray regions, however, the two fits predict significantly different energies, which means that the configurations in those regions are missing from the training set. Those structures should be added to the training set.
                    number of structures
No. of Cu atoms     training points     test points
3                   447                 53
13                  66                  6
14                  75                  14
15                  94                  5
16                  92                  7
17                  90                  9
18                  88                  12
19                  88                  10
20                  94                  6
21                  88                  12
22                  92                  8
23                  91                  9
24                  89                  9
25                  83                  17
26                  88                  12
27                  89                  12
28                  127                 17
29                  93                  7
30                  87                  13
31                  114                 13
32                  92                  7
33                  104                 12
34                  98                  6
35                  90                  12
36                  106                 17
37                  88                  13
38                  101                 17
39                  113                 11
40                  91                  10
                    number of structures
No. of Cu atoms     training points     test points
41                  108                 16
42                  109                 13
43                  119                 17
44                  97                  12
45                  101                 12
46                  116                 15
47                  120                 21
48                  131                 15
49                  145                 16
50                  113                 14
51                  77                  8
52                  34                  4
53                  43                  3
54                  48                  12
55                  105                 9
56                  59                  9
57                  44                  5
58                  39                  2
59                  16                  2
60                  126                 16
61                  58                  7
62                  50                  7
63                  50                  5
64                  25                  3
65                  109                 7
66                  41                  5
67                  41                  5
68                  38                  4
69                  44                  4
                    number of structures
No. of Cu atoms     training points
70                  86
71                  41
72                  46
73                  52
74                  68
75                  139
76                  39
77                  34
78                  42
79                  45
80                  111
81                  84
82                  98
83                  140
84                  116
85                  295
86                  170
87                  160
88                  59
89                  65
90                  3
91                  24
92                  1
94                  16
95                  160
98                  20
99                  13
100                 154
test points (25 entries for the 28 rows above, in printed order; three rows have no test entry and the layout does not show which): 9, 3, 5, 3, 7, 11, 4, 2, 2, 11, 6, 17, 14, 17, 43, 18, 21, 4, 6, 2, 5, 20, 5, 4, 11
Table 7.1 A list of all copper cluster structures in the training and the test set.
Table 7.2 A list of all copper bulk structures in the training and the test set.
                    number of structures
Cu atoms            training points     test points
1                   5145                531
2                   3499                377
3                   2                   -
4                   4938                558
5                   1                   -
7                   2                   1
8                   287                 37
11                  2                   -
12                  2                   -
15                  6                   -
16                  3                   1
17                  2                   -
18                  1                   1
23                  6                   -
24                  4                   -
26                  1                   -
27                  2                   -
31                  3                   -
32                  2                   -
35                  4                   2
36                  4                   -
47                  2                   1
48                  1                   1
53                  5                   -
54                  4                   -
71                  2                   1
72                  1                   1
107                 3                   -
108                 2                   -
Table 7.3 A list of copper surface structures in the training and the test set.
                    number of structures
Cu atoms            training points     test points
3                   9                   1
4                   11                  -
5                   19                  3
6                   7                   2
7                   1                   -
8                   15                  1
9                   1                   -
10                  8550                956
11                  9                   2
12                  3598                421
15                  12                  2
16                  17                  2
17                  8                   3
18                  7                   1
24                  1                   -
26                  9                   2
27                  8                   -
31                  14                  3
32                  16                  2
33                  106                 7
36                  1                   -
47                  14                  -
48                  14                  2
63                  2                   -
64                  2                   -
71                  12                  2
72                  13                  2
95                  2                   -
96                  2                   -
143                 1                   -
144                 1                   -
7.2 A neural network potential for copper
Several NN potentials for the copper system have been constructed using different sets of initial weight parameters and different NN architectures in order to identify the optimum functional form. The subset of 15,448 copper bulk structures was used to determine the first settings for the NN training. At this stage, the atomic forces were not included in the fit. We have compared the root mean squared errors (RMSEs) obtained for NN architectures with different numbers of hidden layers. One to four hidden layers with different numbers of nodes per layer (2, 5, 10, 15, 20, 30 and 40 nodes) have been tested, using hyperbolic tangent activation functions at the nodes of the hidden layers and a linear activation function at the output node. The number of weights of each architecture is listed in Table 7.4. The RMSEs of the energies for the networks with three or four hidden layers are not significantly lower than for the networks with two hidden layers; the differences are less than 0.1 to 0.2 meV/atom, as also shown in Table 7.4. An important factor for the selection of a potential for applications is the RMSE of the forces (for both the training and the test set). Usually we take the NN fit with low RMSEs of the energies and with the lowest RMSEs of the forces. These tests have shown that the NN with two hidden layers of 30 nodes each (2521 weight parameters) results in the lowest error in the forces, as shown in Fig. 7.4. The computation time required to optimize the weight parameters of larger networks (e.g., 30 and 40 nodes per layer) with three and four hidden layers, on the other hand, is about two to three times higher than for NN architectures with two hidden layers. For the following applications we have therefore used only network architectures with two hidden layers and have also included the forces in the optimization of the weights.
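The numbers of weight parameters quoted for the different architectures can be reproduced with a short sketch. It assumes a fully connected feed-forward network with one bias weight per node in the hidden and output layers, an assumption that reproduces the 2521 weights of the 51-30-30-1 network; the function name `count_weights` is hypothetical.

```python
def count_weights(layer_sizes):
    """Number of connecting weights plus bias weights of a fully connected feed-forward NN."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


print(count_weights([51, 30, 30, 1]))  # 2521
print(count_weights([51, 10, 10, 1]))  # 641
print(count_weights([51, 2, 1]))       # 107
```

For the 51-30-30-1 network the count decomposes into 52 × 30 + 31 × 30 + 31 × 1 = 1560 + 930 + 31 = 2521.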
Several NN architectures with two hidden layers and different numbers of nodes per layer have also been tested for the full data set (copper clusters, bulk structures and slabs). Again, hyperbolic tangent activation functions have been used at the nodes of the two hidden layers and a linear activation function at the output node. It has been found that a small network with a 51-10-10-1 architecture already provides rather low energy RMSEs of 4.870 meV/atom for the training set and 4.592 meV/atom for the test set. The training set and test set forces obtained with this NN have RMSEs of 41.286 meV/Bohr and 41.519 meV/Bohr, respectively. The RMSE values of the training set energies for the first 20 iterations are shown for four different network architectures in Fig. 7.5. It can be seen that the reduction of the error is only marginal if the NN size is increased beyond the 51-30-30-1 and 51-40-40-1 architectures; these were therefore chosen for a closer investigation. Not only the values of the RMSEs of the energies and forces are important for the selection of an NN potential for applications, but the predictive power also needs to be checked. It has to be verified that the NN potential can represent and predict properties such as geometries, energies and forces of unknown structures with the same accuracy as for the reference data.

Table 7.4 The RMSEs of energies (meV/atom) and forces (meV/Bohr) for the neural network (NN) training set for copper bulk structures with different sets of initial parameters. The errors have been observed at the 100th iteration for NN architectures with 51 symmetry functions and different numbers of NN weights. Each entry gives the number of weight parameters followed by (energy RMSE, force RMSE).

Nodes per layer   1 layer              2 layers             3 layers             4 layers
2                 107 (3.1, 20.4)      113 (2.7, 18.4)      119 (2.5, 9.6)       125 (2.8, 18.6)
5                 266 (2.7, 13.7)      296 (1.9, 11.3)      326 (1.8, 15.1)      356 (1.8, 13.1)
10                531 (2.3, 13.5)      641 (1.5, 14.6)      751 (1.4, 9.8)       861 (1.5, 9.6)
15                796 (2.0, 9.7)       1036 (1.5, 8.8)      1276 (1.3, 13.6)     1516 (1.3, 10.0)
20                1061 (2.0, 9.1)      1481 (1.5, 9.9)      1901 (1.4, 11.1)     2321 (1.4, 11.7)
30                1591 (1.9, 8.9)      2521 (1.4, 8.4)      3451 (1.5, 11.6)     4381 (1.7, 11.6)
40                2121 (1.8, 9.4)      3761 (1.8, 10.0)     5401 (1.8, 11.4)     7041 (1.7, 10.6)

Figure 7.4 Comparison of the RMSEs of the forces (meV/Bohr) for the neural network (NN) training set for the copper bulk system as a function of the number of nodes in the hidden layers, for two, three, and four hidden layers. The errors have been observed at the 100th iteration for NN architectures with 51 symmetry functions and different numbers of NN weights. The NN potentials have not used atomic forces for the optimization of the weights.
Figure 7.5 Comparison of the fitting errors (energy RMSE in meV/atom) of the training set during the first 20 iterations for several neural network (NN) architectures (51-10-10-1, 51-20-20-1, 51-30-30-1, and 51-40-40-1).
The fit that has been selected for the production calculations was obtained using a 51-30-30-1 NN architecture with 2521 weight parameters and including atomic forces in the fit. The RMSEs of the energies are 3.63 meV/atom and 3.98 meV/atom for the training and the test set, respectively. The errors of the forces are 42.79 meV/Bohr (training set) and 42.03 meV/Bohr (test set).
For this fit the mean absolute errors (MAEs) of the energies are 2.09 (training set) and 2.23 (test set) meV/atom. The MAEs of the forces are 29.29 and 29.43 meV/Bohr for the training set and test set, respectively. Note that the MAEs are expected to be smaller than the RMSEs, as outliers have a smaller impact on their value. Overall, these errors suggest that there is no overfitting, because the training and the test set have almost the same errors for both the energies and the forces.
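The different sensitivity of the MAE and the RMSE to outliers can be illustrated with a small sketch. The error values below are hypothetical and not taken from the fit.

```python
import math


def mae(errors):
    """Mean absolute error of a list of signed errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical energy errors in meV/atom with a single outlier.
errors = [1.0, -1.0, 1.0, -1.0, 10.0]
print(mae(errors))   # 2.8
print(rmse(errors))
```

Because the errors are squared before averaging, the single outlier dominates the RMSE, while it enters the MAE only linearly; the MAE can therefore never exceed the RMSE.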
[Plots: (a) binding energy NN (eV/atom) versus binding energy DFT (eV/atom) for train and test points; (b) number of points versus energy error per atom (meV), binned from <1 to >9, for train and test points]
Figure 7.6 (a) Comparison of the DFT and neural network (NN) energies of the structures in the training set and the independent test set. All points are very close to the line with a slope of 45° corresponding to a perfect fit. For clarity, binding energies obtained by removing the energies of free atoms are shown instead of total energies. (b) Fitting error distribution in the training and the test set. In total the training set contains 33,963 structures, and the test set consists of 3,800 structures.
In Fig. 7.6(a) the NN binding energies of the training and the test set are plotted as a function of the DFT energies. As can be seen in the graph, all points are very close to a line with a slope of 45° (blue solid line), which corresponds to an optimal fit. In Fig. 7.6(b) the distribution of errors is shown; most of the data points have an error of less than about 3 meV/atom.
7.2.1 Copper clusters
The first copper systems we study are copper clusters. In order to investigate the accuracy of the NN potential, random Cu30 clusters from the independent test set have been selected and the NN total energies have been compared to the DFT total energies. Fig. 7.7 shows that the NN energies reproduce the DFT energies very accurately.
Apart from total energies, the absolute forces of a Cu14 cluster have also been checked. Fig. 7.8 shows the absolute forces acting on the copper atoms of a Cu14 cluster that was chosen randomly from an MD simulation at 300 K and is not included in the training set. The NN atomic forces and the DFT atomic forces are in good agreement.
[Plot: rel. energy (meV/atom), DFT and NN, versus number of structure (0 to 40)]
Figure 7.7 Comparison of the neural network (NN) energies and the DFT energies for random Cu30 clusters.
We are more interested in applying the NN potential to real copper systems than to random clusters. Therefore, in the next sections, the accuracy and reliability of the NN potential for bulk structures and real surfaces will be checked.
7.2.2 Bulk copper
A reliable description of bulk copper is a necessary requirement for the application of the NN potential to surfaces, because every surface model includes features of the bulk structure. The simplest model of a surface structure is that of the truncated bulk (the ideal surface). The properties of different crystal structures of copper have therefore been investigated first. The most important energetic quantity is the cohesive energy, which is defined as
\[
E_{\mathrm{coh}} = \frac{1}{N} \cdot \left( E_{\mathrm{bulk}} - N \cdot E_{\mathrm{atom}} \right) \tag{7.1}
\]
where N is the number of atoms in the bulk unit cell, and Eatom is the energy of a free copper atom in its electronic ground state, which is a constant determined by DFT. The energy of an isolated atom is not included in the NN training set. The equilibrium lattice constants and bulk moduli B of the different bulk structures, i.e., fcc, bcc, sc and hcp, have also been calculated. Table 7.5 shows the results that have been obtained with the NN potential and DFT.
[Plot: |F| (eV/Bohr), DFT and NN, versus Cu atom number (1 to 14)]
Figure 7.8 Comparison of the DFT and neural network (NN) forces acting on the copper atoms in a Cu14 cluster. The structure has been chosen randomly from a molecular dynamics trajectory at 300 K.
The NN predicts that fcc copper is the most stable crystal structure, which is in agreement with DFT and experiment. The equilibrium fcc lattice constant obtained by the NN potential is 3.630 Å, compared to 3.630 Å in DFT and 3.615 Å in experiment [103]. The NN bulk moduli of the fcc, bcc, and sc structures, 138, 135, and 108 GPa, are also very close to the DFT values of 140, 137, and 103 GPa, respectively.
Additionally, the elastic constants have been calculated (the calculation of elastic constants of cubic lattices is outlined in Sec. B). For fcc copper the NN elastic constants are c11 = 177 GPa (DFT: 173 GPa), c12 = 119 GPa (DFT: 123 GPa), and c44 = 83 GPa (DFT: 80 GPa). The agreement between DFT and the NN potential is excellent, and both are comparable to the experimental values. For the other investigated crystal structures, i.e., the bcc and sc structures, the NN potential also predicts elastic constants similar to the DFT reference. All results are shown in Table 7.6.
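The elastic constants and the bulk moduli can be cross-checked against each other. The sketch below uses the standard relation B = (c11 + 2·c12)/3 for cubic crystals, which is textbook elasticity theory and is not stated in this section; it is applied only to the fcc and bcc entries of Table 7.6.

```python
def bulk_modulus_cubic(c11, c12):
    """Bulk modulus of a cubic crystal, B = (c11 + 2*c12) / 3, in the units of the inputs."""
    return (c11 + 2.0 * c12) / 3.0


# (c11, c12) in GPa for fcc and bcc copper, copied from Table 7.6.
elastic = {
    ("fcc", "DFT"): (173, 123),
    ("fcc", "NN"): (177, 119),
    ("bcc", "DFT"): (138, 137),
    ("bcc", "NN"): (135, 135),
}
for (structure, method), (c11, c12) in elastic.items():
    print(structure, method, round(bulk_modulus_cubic(c11, c12)), "GPa")
```

For these four entries the rounded results coincide with the fcc and bcc bulk moduli listed in Table 7.5.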
Further, the vacancy formation energies in different crystal structures have been investigated for several vacancy concentrations, which are determined by the supercell size.
Table 7.5 Comparison of the lattice parameters, cohesive energies and bulk moduli (GPa) of four crystal structures of copper obtained with the neural network (NN) and density-functional theory (DFT).

| structure | Ecoh / eV (DFT) | Ecoh / eV (NN) | lattice parameters (DFT) | lattice parameters (NN) | bulk modulus (DFT) | bulk modulus (NN) |
|---|---|---|---|---|---|---|
| fcc | 3.533 | 3.526 | a = 3.630 Å | a = 3.630 Å | 140 | 138 |
| bcc | 3.489 | 3.486 | a = 2.885 Å | a = 2.887 Å | 137 | 135 |
| sc | 3.051 | 3.052 | a = 2.407 Å | a = 2.407 Å | 103 | 108 |
| hcp | 3.511 | 3.511 | a = 4.862 Å, c/a = 1.627 | a = 4.856 Å, c/a = 1.631 | - | - |
Table 7.6 Elastic constants (GPa) of fcc, bcc, and simple cubic structures of copper obtained with the neural network (NN), density-functional theory (DFT) and experimental data for the fcc structure [104].

| structure | c11 (DFT) | c11 (NN) | c12 (DFT) | c12 (NN) | c44 (DFT) | c44 (NN) |
|---|---|---|---|---|---|---|
| fcc | 173 | 177 | 123 | 119 | 80 | 83 |
| bcc | 138 | 135 | 137 | 135 | 103 | 89 |
| sc | 294 | 251 | 36 | 8 | -38 | -22 |

fcc (expt.): c11 = 170.0, c12 = 122.5, c44 = 75.8.
In order to identify the effect of the atomic relaxations close to the vacancies and the
description of these relaxations by the NN potential, the vacancy formation energies
were determined for two different cases.
First, the vacancies were generated by removing one copper atom and a single-point energy calculation was performed without subsequent relaxation of the resulting structures (columns “fixed” in Table 7.7). Second, the structures were fully relaxed after removing the atom. This was done independently for DFT and the NN, i.e., the relaxed DFT structures were determined by minimizing the DFT forces, and the relaxed NN structures were obtained by geometry optimizations employing the NN potential. In all cases, the lattice constants were kept fixed at the values of the ideal crystal structures and only the atomic positions were optimized.
The vacancy formation energy in the bulk is defined as
\[
E_{\mathrm{vac,bulk}} = E_{\mathrm{bulk,v}}(N-1) - \frac{N-1}{N}\, E_{\mathrm{bulk}}(N) , \tag{7.2}
\]
where Ebulk (N) is the energy of the defect-free unit cell containing N atoms and
Ebulk,v (N − 1) is the energy of the system containing the vacancy.
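Eq. 7.2 translates directly into a small helper. The sketch below is illustrative only: the function name and the total energies are hypothetical placeholders, not values from the DFT or NN calculations.

```python
def vacancy_formation_energy(e_defect, e_bulk, n_atoms):
    """Eq. 7.2: E_vac,bulk = E_bulk,v(N-1) - (N-1)/N * E_bulk(N)."""
    return e_defect - (n_atoms - 1) / n_atoms * e_bulk


# Hypothetical total energies in eV for a 32-atom supercell (not thesis data).
n_atoms = 32
e_bulk = -3.5 * n_atoms   # defect-free cell with N atoms
e_defect = -107.4         # same cell after removing one atom
e_vac = vacancy_formation_energy(e_defect, e_bulk, n_atoms)
print(f"E_vac,bulk = {e_vac:.3f} eV")
```

The prefactor (N − 1)/N rescales the energy of the perfect cell to the same number of atoms as the defective cell, so that a cell which loses an atom without any energy penalty per remaining atom yields zero.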
As Table 7.7 shows, the bulk vacancy formation energies are well reproduced. The average absolute deviation between DFT and the NN potential is about 43 meV, which is higher than the energy RMSE of the fit. The reason for this larger error is that structures with vacancies are not well represented in the training set. For example, we did not include all vacancy sites of the ideal structures, and vacancy structures do not form spontaneously in molecular dynamics simulations, which have been used to generate the majority of the training points.
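The quoted average deviation can be recomputed from the entries of Table 7.7. The sketch below assumes that each row of the table lists the DFT value before the NN value, first for the fixed and then for the relaxed structures, and that the three blocks of the table belong to fcc, bcc and hcp in this order.

```python
# Vacancy formation energies in eV copied from Table 7.7, per supercell:
# (fixed DFT, fixed NN, relaxed DFT, relaxed NN).
table_7_7 = {
    "fcc": [(1.196, 1.195, 1.164, 1.174), (1.180, 1.214, 1.143, 1.180),
            (1.171, 1.232, 1.130, 1.194), (1.147, 1.250, 1.108, 1.214)],
    "bcc": [(1.069, 1.021, 0.982, 0.958), (1.074, 1.055, 0.874, 0.788),
            (1.068, 1.076, 0.864, 0.802), (1.070, 1.084, 0.925, 0.944)],
    "hcp": [(1.030, 1.064, 1.022, 1.045), (1.059, 1.072, 1.045, 1.036),
            (1.108, 1.068, 1.087, 1.041), (1.128, 1.046, 1.103, 1.026)],
}

# One |DFT - NN| value for every fixed and every relaxed entry.
deviations = [
    abs(row[k] - row[k + 1])
    for rows in table_7_7.values()
    for row in rows
    for k in (0, 2)
]
mean_dev_mev = 1000.0 * sum(deviations) / len(deviations)
print(f"{len(deviations)} pairs, mean absolute deviation = {mean_dev_mev:.1f} meV")
```

With this reading of the table, the mean over all 24 DFT/NN pairs reproduces the value of about 43 meV given in the text.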
The large fcc supercells, (2 × 2 × 2), (2 × 2 × 3), (2 × 3 × 3) and (3 × 3 × 3), contain 32, 48, 72, and 108 atoms, respectively. Therefore the error per atom is substantially lower. Nevertheless, not all atoms contribute equally to this error. The atoms responsible for the deviation must be the ones that are close to the vacancy site and are not well represented in the training set. This can be concluded from the finding that the error is on average the same for the supercells, i.e., the error does not increase with system size. If the NN potential were further refined by adding structures with vacancies to the reference data set, improved results for the vacancy formation energies could be expected.
Table 7.7 Vacancy formation energies Evac,bulk (eV) in different crystal structures of bulk copper. Different supercells have been used to represent various vacancy concentrations. The fixed data correspond to bulk-like atomic positions. The relaxed data have been determined by optimizing the structure with the respective method, i.e., the DFT energy has been determined by relaxing the structure using DFT, and the NN energy has been obtained by a geometry optimization using the NN potential.

| structure | supercell | fixed (DFT) | fixed (NN) | relaxed (DFT) | relaxed (NN) |
|---|---|---|---|---|---|
| fcc | (2 × 2 × 2) | 1.196 | 1.195 | 1.164 | 1.174 |
| fcc | (2 × 2 × 3) | 1.180 | 1.214 | 1.143 | 1.180 |
| fcc | (2 × 3 × 3) | 1.171 | 1.232 | 1.130 | 1.194 |
| fcc | (3 × 3 × 3) | 1.147 | 1.250 | 1.108 | 1.214 |
| bcc | (2 × 2 × 2) | 1.069 | 1.021 | 0.982 | 0.958 |
| bcc | (2 × 2 × 3) | 1.074 | 1.055 | 0.874 | 0.788 |
| bcc | (2 × 3 × 3) | 1.068 | 1.076 | 0.864 | 0.802 |
| bcc | (3 × 3 × 3) | 1.070 | 1.084 | 0.925 | 0.944 |
| hcp | (2 × 2 × 2) | 1.030 | 1.064 | 1.022 | 1.045 |
| hcp | (2 × 2 × 3) | 1.059 | 1.072 | 1.045 | 1.036 |
| hcp | (2 × 3 × 3) | 1.108 | 1.068 | 1.087 | 1.041 |
| hcp | (3 × 3 × 3) | 1.128 | 1.046 | 1.103 | 1.026 |
[Plot: rel. energy (meV/atom), DFT and NN, versus number of structure (0 to 40)]
Figure 7.9 Comparison of the DFT and neural network (NN) energies of several 16 atom bulk
structures selected from a molecular dynamics trajectory of bcc copper at 500 K.
Next, we have investigated the accuracy of the NN potential for disordered bulk structures. MD simulations of several crystal structures over a wide range of temperatures within the NVT ensemble have been carried out using the NN potential. Representative structures were then selected, and the energies and forces were determined by DFT calculations for comparison. Figure 7.9 shows the NN and DFT energies of distorted bcc structures containing 16 atoms, which were extracted randomly from a molecular dynamics simulation at 500 K. The typical deviation between the DFT and the NN energies is only a few meV per atom. The absolute forces acting on the atoms for one of these structures are compared in Fig. 7.10. The agreement between the NN and DFT forces is also excellent.
7.2.3 Copper surfaces
Ideal low-index surfaces
In this section various properties of copper surfaces are discussed, now that the reliable description of bulk copper has been established. The most fundamental property of a surface is the surface energy γ, which is the energy required to create a surface by cleaving the bulk material. It is defined as
\[
\gamma = \frac{1}{2A} \left( E_{\mathrm{slab}} - N \cdot E_{\mathrm{bulk}} \right) , \tag{7.3}
\]
where Eslab is the energy of a slab containing N atoms and Ebulk is the energy of an atom in the bulk material. A is the surface area of the slab, and the factor 2 takes into account that in a slab calculation two surfaces are present.
[Plot: |F| (eV/Bohr), DFT and NN, versus Cu atom number (1 to 16)]
Figure 7.10 Comparison of the DFT and neural network (NN) forces acting on the copper atoms in a 16 atom bulk structure. The atomic configuration has been selected randomly from a molecular dynamics trajectory of bcc copper at 500 K.
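The surface energy of Eq. 7.3 can be written as a short function. This is a sketch with hypothetical input numbers chosen only to make the arithmetic easy to follow; they are not results of the DFT or NN calculations.

```python
def surface_energy(e_slab, e_bulk_per_atom, n_atoms, area):
    """Eq. 7.3: gamma = (E_slab - N * E_bulk) / (2 * A).

    Energies in eV and area in Å^2 give gamma in eV/Å^2.
    """
    return (e_slab - n_atoms * e_bulk_per_atom) / (2.0 * area)


# Hypothetical inputs (not thesis data): an 8-atom slab with a 5 Å^2 surface cell.
gamma = surface_energy(e_slab=-27.0, e_bulk_per_atom=-3.5, n_atoms=8, area=5.0)
print(f"gamma = {1000.0 * gamma:.1f} meV/Å^2")
```

A slab whose energy equals N times the bulk energy per atom has zero surface energy, which is a quick sanity check for any implementation of this formula.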
The surface energies obtained with the NN potential and DFT for a variety of surfaces of three different crystal structures (fcc, bcc, and sc) are compared in Table 7.8. The surface energies have been calculated for both bulk-truncated and fully relaxed slabs to investigate the influence of differences in the final relaxed structures. The bulk structures of copper were cleaved along the (111), (100), and (110) planes to construct surfaces that are represented by slabs with a thickness of 10–12 Å, containing eight atomic layers with the central four layers constrained to the bulk lattice positions. For all calculations, a vacuum spacing of about 15 Å in the direction of the surface normal was used. For the DFT calculations it is important to ensure highly converged k-point meshes, since the slab and the bulk structures in Eq. 7.3 have different Brillouin zones. In the case of the NN, which has been trained to energies obtained with well-converged k-point meshes for both bulk and surface training structures, the energies correspond to dense k-point meshes by construction.
Table 7.8 Comparison of the DFT and neural network (NN) surface energies γ (in meV/Å²) of different copper surfaces. In the slab models of the surfaces eight metal layers have been used; fcc(110)mr is the missing row reconstruction of the Cu(110) surface.

| surface | fixed γDFT | fixed γNN | relaxed γDFT | relaxed γNN |
|---|---|---|---|---|
| fcc(111) | 93.601 | 92.867 | 93.159 | 92.743 |
| fcc(100) | 101.251 | 101.767 | 100.532 | 100.995 |
| fcc(110) | 105.146 | 106.158 | 102.387 | 103.921 |
| fcc(110)mr | 113.364 | 114.484 | 109.927 | 111.689 |
| bcc(111) | 104.765 | 103.013 | 101.458 | 99.551 |
| bcc(100) | 97.742 | 95.309 | 96.972 | 94.602 |
| bcc(110) | 86.519 | 87.973 | 86.364 | 87.438 |
| sc(111) | 77.397 | 78.969 | 77.318 | 78.723 |
| sc(100) | 59.282 | 60.250 | 59.263 | 60.064 |
| sc(110) | 74.173 | 75.895 | 73.842 | 75.658 |
The agreement between the DFT and NN surface energies is very good for both the bulk-truncated and the relaxed surfaces. For all crystal structures and all investigated low-index surfaces the energetic sequence is the same, and the average absolute error of the surface energies is as small as 1.34 meV/Å². The structural properties of the relaxed surfaces determined using the NN potential are also in very good agreement with the DFT results. For example, in DFT the first layer of the fcc Cu(111) surface exhibits an inward relaxation of about 0.023 Å, corresponding to 1.083 % of the bulk interlayer distance, while the NN values are 0.021 Å and 0.995 %.
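Both statements about the surface energies, the average absolute error and the identical energetic sequence, can be checked against the numbers in Table 7.8. The sketch below assumes the column order fixed DFT, fixed NN, relaxed DFT, relaxed NN; the helper `ranking` is an illustrative name.

```python
# Surface energies in meV/Å^2 copied from Table 7.8:
# surface: (fixed DFT, fixed NN, relaxed DFT, relaxed NN)
gamma = {
    "fcc(111)": (93.601, 92.867, 93.159, 92.743),
    "fcc(100)": (101.251, 101.767, 100.532, 100.995),
    "fcc(110)": (105.146, 106.158, 102.387, 103.921),
    "fcc(110)mr": (113.364, 114.484, 109.927, 111.689),
    "bcc(111)": (104.765, 103.013, 101.458, 99.551),
    "bcc(100)": (97.742, 95.309, 96.972, 94.602),
    "bcc(110)": (86.519, 87.973, 86.364, 87.438),
    "sc(111)": (77.397, 78.969, 77.318, 78.723),
    "sc(100)": (59.282, 60.250, 59.263, 60.064),
    "sc(110)": (74.173, 75.895, 73.842, 75.658),
}

# Average |DFT - NN| over all fixed and relaxed entries.
pairs = [(row[k], row[k + 1]) for row in gamma.values() for k in (0, 2)]
mean_abs_error = sum(abs(dft - nn) for dft, nn in pairs) / len(pairs)
print(f"average absolute error = {mean_abs_error:.2f} meV/Å^2")


def ranking(lattice, column):
    """Surfaces of one lattice sorted by increasing surface energy in one column."""
    names = [s for s in gamma if s.startswith(lattice)]
    return sorted(names, key=lambda s: gamma[s][column])


# Compare the DFT ordering with the NN ordering for each lattice and each case.
same_order = all(
    ranking(lattice, dft_col) == ranking(lattice, dft_col + 1)
    for lattice in ("fcc", "bcc", "sc")
    for dft_col in (0, 2)
)
print("same energetic sequence in DFT and NN:", same_order)
```

The check covers the fixed and the relaxed slabs separately, so a change of ordering in either case would be detected.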
Surface vacancies and adatoms
Many structural features are important for setting up a model of a real copper surface. Therefore, the NN potential should be able to reliably predict properties of model surfaces containing vacancies and adatoms. In the case of vacancies, the surface vacancy formation energy Evac,surf needs to be checked for a variety of surface orientations. The surface vacancy formation energy is defined as the energy needed to remove a copper atom from the surface,
\[
E_{\mathrm{vac,surf}} = E_{\mathrm{slab,vac}} + E_{\mathrm{atom}} - E_{\mathrm{slab}} , \tag{7.4}
\]
where Eslab,vac is the energy of the supercell containing the surface vacancy, and Eslab is the energy of the defect-free surface supercell. In order to investigate whether the NN potential predicts the correct relaxed surface structures, and to assess the effect of the relaxation on the vacancy formation energies, we have determined Evac,surf for the bulk-truncated (unrelaxed or fixed) surfaces as well as for the relaxed surfaces. The values obtained for a number of surfaces of different copper modifications are listed in Table 7.9. The average error is about 85 meV, and there is no dependence of the error on the supercell size. The vacancy formation energies of both relaxed and fixed structures are represented equally well, and the optimized structures obtained with DFT and the NN are very similar.
Second, additional copper atoms adsorbed or diffusing at the surface also need to be described properly. The correct description of the potential experienced by an adatom is relevant for gaining mechanistic insights into atomic rearrangements at surfaces. This is very important for processes such as reconstructions, growth, and adsorption. Models with a copper atom diffusing along copper surfaces have therefore been investigated. For this purpose, the potential energy along a number of high-symmetry paths has been calculated for a copper atom at a vertical distance of 1.85 Å above the Cu(111) surface and the Cu(100) surface. The paths are shown as insets in Fig. 7.11(a) for the Cu(111) surface and in Fig. 7.11(b) for the Cu(100) surface, together with the corresponding NN and DFT energies. The NN energy profiles for both surfaces represent the DFT energy profiles very accurately, and the deviations are on the order of a few meV only, which is the typical order of magnitude of the NN potential RMSE. We found results of similar quality also for other distances from the surface.

Table 7.9 Vacancy formation energies Evac,surf (eV) at different copper surfaces represented by eight layer slabs. The fixed data correspond to bulk-like atomic positions. The relaxed data were determined by optimizing the structure with the respective method.

| surface | supercell | fixed (DFT) | fixed (NN) | relaxed (DFT) | relaxed (NN) |
|---|---|---|---|---|---|
| fcc(111) | (2 × 1) | 4.252 | 4.346 | 4.193 | 4.284 |
| fcc(111) | (2 × 2) | 4.413 | 4.596 | 4.367 | 4.550 |
| fcc(111) | (2 × 3) | 4.471 | 4.606 | 4.423 | 4.561 |
| fcc(111) | (3 × 3) | 4.482 | 4.607 | 4.439 | 4.562 |
| fcc(100) | (2 × 1) | 4.023 | 4.123 | 3.977 | 4.075 |
| fcc(100) | (2 × 2) | 4.195 | 4.255 | 4.143 | 4.209 |
| fcc(100) | (2 × 3) | 4.210 | 4.301 | 4.163 | 4.255 |
| fcc(100) | (3 × 3) | 4.224 | 4.332 | 4.189 | 4.287 |
| fcc(110) | (2 × 1) | 4.069 | 4.073 | 4.043 | 4.052 |
| fcc(110) | (2 × 2) | 4.076 | 4.062 | 4.046 | 4.053 |
| fcc(110) | (2 × 3) | 4.093 | 4.070 | 4.056 | 4.058 |
| fcc(110) | (3 × 3) | 4.110 | 4.112 | 4.078 | 4.097 |
| bcc(111) | (2 × 1) | 3.729 | 3.783 | 3.437 | 3.335 |
| bcc(111) | (2 × 2) | 3.695 | 3.831 | 3.433 | 3.504 |
| bcc(111) | (2 × 3) | 3.683 | 3.834 | 3.394 | 3.096 |
| bcc(111) | (3 × 3) | 3.653 | 3.835 | 3.386 | 3.515 |
| bcc(100) | (2 × 1) | 3.858 | 3.988 | 3.755 | 3.862 |
| bcc(100) | (2 × 2) | 3.982 | 4.122 | 3.867 | 3.981 |
| bcc(100) | (2 × 3) | 4.008 | 4.147 | 3.768 | 3.902 |
| bcc(100) | (3 × 3) | 4.033 | 4.166 | 3.803 | 3.932 |
| bcc(110) | (2 × 1) | 4.496 | 4.465 | 4.256 | 4.230 |
| bcc(110) | (2 × 2) | 4.440 | 4.461 | 4.118 | 4.159 |
| bcc(110) | (2 × 3) | 4.416 | 4.463 | 4.205 | 4.140 |
| bcc(110) | (3 × 3) | 4.442 | 4.456 | - | 4.119 |

[Plots: rel. energy (meV/atom), DFT and NN, along high-symmetry paths for (a) the (111) surface with top, fcc, hcp and bridge sites and (b) the (100) surface with top, bridge and hollow sites]
Figure 7.11 Comparison of the DFT and neural network (NN) energy profiles of a copper atom moving at a distance of 1.85 Å above a clean Cu (2 × 2) surface. The energy profiles along the path given in the inset are shown for a (111) surface (a) and a (100) surface (b).
The most stable adsorption site of an additional copper atom at the Cu(111) surface
in a (2 × 2) supercell is the fcc site. The binding energy in DFT is 2.903 eV (NN:
2.890 eV) and the optimum vertical distance with respect to the first metal layer is
1.75 Å (DFT) and 1.79 Å (NN).
7.3 Reliability of the neural network potential for a large realistic structure
A slab model for a realistic Cu(111) surface containing various defects like vacancies, adatoms, kinks and steps has been set up. In total the surface contains 29,443 atoms, as shown in Fig. 7.12. The reliable description of such a realistic and large structure using the NN potential is challenging. This raises the question of how the accuracy of the NN potential can be checked and improved for such systems that are not directly accessible by DFT calculations.
Since a direct comparison of the NN energy and forces with DFT results for the full system is impossible, we have to reduce the system size and use physical properties that depend only on the local atomic environments. It is one of the main advantages of NN potentials that they can be trained using DFT data for rather small system sizes, because the atomic energy contributions depend only on the local chemical environment defined by the cutoff radius of the symmetry functions. Once constructed, the NN potentials can still be applied to very large systems containing many thousands of atoms.
Figure 7.12 A slab model for a “real” copper (111) surface (29,443 atoms) with a number of defects, such as vacancies, adatoms, steps, and kinks.
Figure 7.13 “Real” copper surface with a number of defects, such as vacancies, adatoms, steps, and kinks [85]. The atomic environments of twelve representative atoms are shown as blue spheres. They are defined by the cutoff radius of the symmetry functions and include all atoms determining the atomic energies. The corresponding clusters are shown in Fig. 7.14.
In order to confirm this scalability of the NN potential, first a number of representative atoms of the slab model of a realistic surface have been selected. Then, we have extracted clusters centered around these atoms employing the cutoff radius of the symmetry functions (6 Å), which defines the atomic environments highlighted as blue spheres in Fig. 7.13. The obtained clusters are shown in Fig. 7.14 and contain all atoms that determine the atomic energies of their central atoms. Unfortunately, there is no uniquely defined atomic energy in DFT, and consequently the atomic energies obtained from the NN cannot be compared to DFT values. Nevertheless, while the energies of the clusters cannot be used to assess the accuracy of the NN potential, this can be done using the forces. The NN and DFT forces acting on the central atoms of these clusters can be directly compared to estimate the accuracy of the NN potential for the large surface structure. A comparison of the absolute forces acting on the central atoms of the twelve test clusters is shown in Fig. 7.15. The NN potential is able to predict the forces in very good qualitative agreement with DFT.
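The cluster extraction described above amounts to a distance filter around the chosen central atom. The following is a minimal sketch with NumPy, assuming Cartesian coordinates in Å and ignoring periodic images; the function name and the cubic test grid are hypothetical and only serve to make the example self-contained.

```python
import numpy as np


def extract_cluster(positions, center_index, radius):
    """Return the indices of all atoms within `radius` of the chosen central atom.

    Assumption: `positions` is an (N, 3) array of Cartesian coordinates in Å
    and periodic images are ignored. The central atom itself is included.
    """
    distances = np.linalg.norm(positions - positions[center_index], axis=1)
    return np.flatnonzero(distances <= radius)


# Hypothetical test geometry: a simple cubic grid with 2.5 Å spacing.
grid = np.arange(-5, 6) * 2.5
positions = np.array([[x, y, z] for x in grid for y in grid for z in grid])
center = int(np.argmin(np.linalg.norm(positions, axis=1)))  # atom at the origin

for radius in (6.0, 12.0):
    cluster = extract_cluster(positions, center, radius)
    print(f"radius {radius:4.1f} Å: {len(cluster)} atoms")
```

For a periodic slab the same filter would have to be applied to the minimum-image distances instead of the plain Cartesian distances.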
Quantitatively, there are still some differences between the NN and the DFT forces. A
further analysis of this discrepancy revealed that the forces depend on a larger chemical
environment than the atomic energy contributions. This is a necessary consequence
of the functional relation between the potential energy and the atomic positions. By
definition, the atomic energy Ei depends only on the positions of the atoms inside the
cutoff radius of the symmetry functions, while the forces also depend on all neighbors
of these atoms. The reason is that the force acting on Cartesian coordinate Ri,α of atom
i in direction α = {x, y, z} is given as the derivative of the total energy E with respect
to Ri,α , which is the sum of the derivatives of all atomic energies,
\[
F_{i,\alpha} = -\frac{\partial E}{\partial R_{i,\alpha}} = -\frac{\partial}{\partial R_{i,\alpha}} \sum_{j} E_{j} . \tag{7.5}
\]
Therefore, the derivatives of the energies of all atoms j having atom i inside their cutoff sphere enter the force Fi,α. Each such atom j is at most Rc away from atom i, and its energy Ej in turn depends on all atoms within Rc of atom j. Consequently, the force Fi,α depends on the positions of all atoms inside a sphere of radius 2 · Rc around atom i. This means that the clusters shown in Fig. 7.14 are not large enough to provide converged NN forces at the central atoms, and clusters with a radius of 6 Å are not suitable to represent the atomic environments of the extended surface.
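The 2 · Rc range of the forces can be demonstrated with a toy model. The sketch below is not the NN potential: it assumes a one-dimensional chain, a cosine cutoff function with a toy cutoff of 1.0 (arbitrary units), and a simple nonlinear atomic energy as a stand-in for the atomic NN. Forces are obtained by central finite differences of the total energy.

```python
import math

R_C = 1.0  # toy cutoff radius (arbitrary units), not the 6 Å used in the text


def cutoff(r):
    """Smooth cosine cutoff that vanishes for r >= R_C."""
    return 0.5 * (math.cos(math.pi * r / R_C) + 1.0) if r < R_C else 0.0


def atomic_energy(x, i):
    """Toy atomic energy E_i: a nonlinear function of the neighbours inside R_C."""
    g = sum(cutoff(abs(x[i] - xj)) for j, xj in enumerate(x) if j != i)
    return g * g


def total_energy(x):
    """Total energy as the sum of all atomic energies."""
    return sum(atomic_energy(x, i) for i in range(len(x)))


def force(x, i, h=1e-6):
    """Force on atom i as the negative central-difference derivative of E."""
    plus, minus = list(x), list(x)
    plus[i] += h
    minus[i] -= h
    return -(total_energy(plus) - total_energy(minus)) / (2.0 * h)


# Distances from atom 0: 0.8 (< R_C), 1.7 (between R_C and 2 R_C), 2.5 (> 2 R_C).
chain = [0.0, 0.8, 1.7, 2.5]
moved_2 = [0.0, 0.8, 1.75, 2.5]   # shift the atom between R_C and 2 R_C
moved_3 = [0.0, 0.8, 1.7, 2.55]   # shift the atom beyond 2 R_C

print("E_0:", atomic_energy(chain, 0), atomic_energy(moved_2, 0))
print("F_0:", force(chain, 0), force(moved_2, 0), force(moved_3, 0))
```

In this toy model, moving the atom that lies between Rc and 2 · Rc leaves the atomic energy of atom 0 unchanged but alters the force on it, because the energy of the intermediate atom 1 changes. Moving the atom beyond 2 · Rc affects neither quantity.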
Having confirmed this, the NN and DFT forces have been calculated for clusters with an extended radius of 12 Å. The corresponding forces acting on the central atoms are shown in Fig. 7.16. It can be seen that the agreement between the NN and DFT forces is excellent, and improved with respect to the smaller 6 Å clusters.
To verify the convergence of the NN forces, the NN forces for clusters with radii of 6, 9, 12, and 30 Å and for the full slab have also been compared. In Fig. 7.17 the NN forces on the central atoms for cluster radii below 12 Å, i.e., 6 and 9 Å, differ slightly from those for radii of 12 Å and above. Contributions from atoms that are more than 12 Å away from the central atom do not enter Fi,α; therefore, the NN forces obtained for the 12 Å and 30 Å clusters are exactly the same as the NN forces in the full slab. Consequently, we have shown using smaller subsystems that the NN potential is able to describe the PES of large systems very accurately. The numbers of copper atoms within the different radii are presented in Table 7.10.
The remaining difference between the DFT and NN forces in the 12 Å clusters can be ascribed to small inaccuracies in the NN fits. One reason is that these clusters have not been included in the reference data set. The deviations between DFT and the NN could be decreased by including them in the training set. This approach offers a systematic way to improve NN potentials for large systems, which as a whole are impossible to access by DFT.
Finally, it should be noted that although the atomic energies and forces have a different effective dependence on the neighboring atoms, the total energy and forces are still fully consistent, as the forces are the exact analytic derivative of the total energy expression, as given in Eq. 7.5. In practical applications the different effective range is relevant only if too small cutoffs are used. For the present case, the forces acting on the central atoms obtained for clusters with 6 and 12 Å radii are not substantially different. The remaining differences are mainly important for checking the quality of the potential. For the construction of the NN potential this is not relevant, because the training set contains a wide range of sufficiently large periodic and non-periodic structures. Choosing a larger cutoff would produce minor improvements at higher computational cost. Here, the excellent quality of the reported copper PES shows that the employed cutoff of 6 Å is appropriate for this system.
[Image: structures of the 12 extracted clusters, labelled 1 to 12]
Figure 7.14 Structures of the 12 clusters extracted from the large surface in Fig. 7.13 using the cutoff radius of 6 Å [85]. The forces acting on the central atoms shown in blue can be used to estimate the accuracy of the neural network (NN) potential for the extended system. A comparison of the forces obtained in DFT calculations and from the NN potential is shown in Fig. 7.15.
[Plot: |F| (eV/Bohr), DFT and NN, versus cluster number (1 to 12)]
Figure 7.15 Comparison of the DFT and neural network (NN) forces acting on the central
atoms of the 12 clusters shown in Fig. 7.14. The clusters contain all atoms within a radius of
6 Å around the central atom.
[Plot: |F| (eV/Bohr), DFT and NN, versus cluster number (1 to 12)]
Figure 7.16 Comparison of the DFT and neural network (NN) forces acting on the central
atoms of 12 clusters cut from the slab shown in Fig. 7.13 using a radius of 12 Å.
[Plot: |F| (eV/Bohr) versus cluster number (1 to 12) for NN 6 Å, NN 9 Å, NN 12 Å, NN 30 Å, and the slab]
Figure 7.17 Convergence of the neural network (NN) forces acting on the central atoms of the 12 clusters cut from the slab shown in Fig. 7.13 with the radii of 6, 9, 12, 30 Å, and NN forces of the slab. The number of copper atoms in different cutoff radii (Rc) is listed in Table 7.10.
Table 7.10 Numbers of copper atoms in different cutoff radii (Rc), Fig. 7.17, of the 12 copper clusters cut from the slab shown in Fig. 7.13.

| Cluster | 6 Å | 9 Å | 12 Å | 30 Å |
|---|---|---|---|---|
| 1 | 35 | 106 | 255 | 3201 |
| 2 | 43 | 128 | 306 | 3534 |
| 3 | 27 | 82 | 199 | 3077 |
| 4 | 40 | 113 | 263 | 3416 |
| 5 | 42 | 124 | 291 | 3455 |
| 6 | 38 | 110 | 259 | 3213 |
| 7 | 43 | 128 | 303 | 3231 |
| 8 | 50 | 144 | 328 | 3023 |
| 9 | 37 | 106 | 243 | 3111 |
| 10 | 29 | 92 | 225 | 3008 |
| 11 | 43 | 126 | 293 | 2894 |
| 12 | 40 | 124 | 300 | 2744 |
| average | 39 | 115 | 272 | 3159 |
8 A Multicomponent Neural Network Potential for Zinc Oxide
The next logical step towards an accurate atomistic Cu/ZnO NN potential is the application of the NN method to a multicomponent system, namely zinc oxide. Multicomponent systems may exhibit significant charge transfer resulting in long-ranged electrostatic interactions. This chapter presents the first application of an extended multicomponent NN method.
8.1 Neural network potentials for multicomponent systems
Today, the proper theoretical description of realistic ZnO structures, especially the
description of the polar oxygen-terminated and zinc-terminated surfaces, O–ZnO and
Zn–ZnO, is still a challenging task which has not yet been completely accomplished.
Polar surfaces are electrostatically unstable by nature and must undergo a reconstruction
that removes the surface dipole moment. Various theoretical and experimental studies
during the last decade have contributed to our understanding of the reconstruction of
the polar zinc oxide surfaces [105–109].
For the Zn-terminated polar ZnO surface it has been suggested by Dulub, Diebold, and Kresse that the electrostatic instability can be removed by introducing triangular-shaped islands with step edges exhibiting a particular orientation [108, 109]. The step edges change the stoichiometry and thus the total charge of the polar surface. As a result, the instability is effectively removed.
These previous reports show the significance of charge transfer in ZnO systems, so the construction of the NN potential for ZnO should take charge transfer into account. We therefore have to assume that the standard high-dimensional NN scheme of Ref. [53], in which the atomic charges do not enter the training data, cannot adequately describe multicomponent systems with significant charge transfer. The reasoning behind this assumption is that in systems of arbitrary chemical composition, charge transfer and the resulting long-range electrostatic interactions can play an important role and cannot be captured by the short-range energy Eshort alone.
Consequently, a second set of atomic NNs has been constructed that has been trained
to predict environment-dependent charges Qi at the atomic sites. The total energy
expression of the NN potential now consists of two parts: a short-range energy Eshort
describing the effects of local electronic structure changes due to chemical bonding
and the long-range electrostatic energy contribution Eelec . More details about the
multicomponent extension of the NN method can be found in Chapter 5.
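The structure of this energy expression can be sketched in a few lines. The example below is a simplified illustration, not the implementation of Chapter 5: it assumes an isolated (non-periodic) cluster, point charges, a plain pairwise Coulomb sum in atomic units, and placeholder numbers in place of the outputs of the two sets of atomic NNs. Periodic systems would require a lattice summation technique instead.

```python
import math


def coulomb_energy(charges, positions):
    """Pairwise Coulomb energy of point charges in atomic units (Hartree, Bohr)."""
    energy = 0.0
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            energy += charges[i] * charges[j] / math.dist(positions[i], positions[j])
    return energy


def total_energy(atomic_energies, charges, positions):
    """E = E_short + E_elec, with E_short the sum of the atomic energy contributions."""
    return sum(atomic_energies) + coulomb_energy(charges, positions)


# Hypothetical Zn-O dimer: all numbers are placeholders, not NN output.
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 4.0)]   # Bohr
charges = [0.5, -0.5]                            # e, would come from the charge NNs
atomic_energies = [-0.10, -0.20]                 # Hartree, would come from the short-range NNs
print(total_energy(atomic_energies, charges, positions))
```

The separation makes explicit that the atomic charges Qi enter only the electrostatic term, while the short-range term remains a sum of local atomic contributions.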
8.2 Reference data set
To construct a neural network (NN) potential for a multicomponent ZnO system, a
reference data set was generated in a similar fashion as discussed previously for the
case of copper structures, i.e., following the self-consistent iterative two-fit process of
Sec. 6.2.
The first NN potentials were based on ZnO clusters, crystalline and amorphous bulk structures, and surface models. The first ZnO clusters of stoichiometry Zn_N O_N with N = 1, . . . , 40 were created randomly: the atomic positions of the clusters were generated based on the definition of reasonable atomic separations, i.e., minimum and maximum bond lengths for O–O, O–Zn and Zn–Zn bonds, and for a given cluster radius. Bulk structures were derived from the ideal wurtzite, zincblende (ZnS), sodium chloride (NaCl) and cesium chloride (CsCl) crystal structures. Apart from these ideal crystal structures, structures have also been generated by systematically varying the lattice parameters (atomic volumes from 50 to 110 Bohr³). Next, ideal surfaces have been included in the reference data set, which have been represented by slab models that were generated by truncation of the bulk equilibrium wurtzite lattice. Most included surface structures have been based on the ground state wurtzite structure lattice parameters a, c, and u as calculated using DFT with the PBE functional (see also Ref. [95]). The equilibrium unit cell volume was found to be 83.064 Bohr³ per atom.
Additionally, some surface models have been based on scaled wurtzite crystals with corresponding unit cell volumes between 70 and 100 Bohr³ per atom. In particular, slab models of the ideal $(10\bar{1}0)$ and $(11\bar{2}0)$ surfaces containing 10 bulk ZnO layers and a vacuum region of at least 12 Å were constructed and included in the data set.
Several NN fits have been generated using this first input data. The NN-Two-Fit technique has been employed to improve the quality of the NN fit by searching for additional structures that were not accounted for by the training set. The good fits, e.g., fits with energy RMSEs of less than about 10 meV/atom, have subsequently been used to perform MD simulations of large bulk and surface structures in the NVT statistical ensemble at temperatures between 300 and 3000 K (imposed by a Berendsen thermostat). As discussed in Sec. 6.2, the energy contribution of each atom depends only on the local (atomic) chemical environment. Consequently, a new atomic environment can be added to the training set in the form of a small cluster whose radius equals the range of the symmetry functions (6 Å, see Sec. 8.3.1). Structures generated according to this procedure were automatically recalculated with DFT if the comparison of energies and forces of the two NN fits showed significantly different values, i.e., if the difference of the two energies of the same structure was much larger than 10 meV/atom (see also Sec. 6.2.2). These structures were then added to the training set. All reference DFT calculations have been carried out with the FHI-aims package [57], employing a basis set of numerical atomic orbitals, the PBE functional [98], and the computational set-up described in Sec. 6.1.
The final reference data set comprises DFT calculations of 38,750 ZnO structures,
spanning a total energy range of 1.012 eV/atom, which is visualized as energy density
of states (EDOS) for the data set in Fig. 8.1. The atomic forces in the data set span a
range of 8.805 eV/Bohr for O atoms and of 7.494 eV/Bohr for Zn atoms with respect
to the optimized structure. In detail, the reference data set contains 7,366 clusters,
27,287 bulk structures (including crystal structures, random structures and snapshots
from MD simulations), and 4,097 slab models. This data set was split into a training
set (90 %) to fit the weight parameters of the NNs and an independent test set (10 %)
to check the predictive power. The compositions of the training set and the test set
are given in Table 8.1. The number of atoms and the number of atomic forces in the
training set, i.e., the total number of training data points, are 602,050 and 1,806,150,
respectively.
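The bookkeeping of the data set can be checked with a few lines of arithmetic. The counts below are those quoted in the text and in Table 8.1; the variable names are chosen here for illustration only.

```python
# Structure counts quoted in the text and in Table 8.1.
train = {"clusters": 6694, "bulk": 24514, "slabs": 3692}
test = {"clusters": 672, "bulk": 2773, "slabs": 405}

n_train = sum(train.values())
n_test = sum(test.values())
n_total = n_train + n_test
train_fraction = n_train / n_total

# Every atom contributes three Cartesian force components.
atoms_train = 602050
force_components_train = 3 * atoms_train
```

The sums reproduce the 34,900 training and 3,850 test structures (38,750 in total, about 90 % in the training set), and three force components per atom give the 1,806,150 force data points.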
Figure 8.1 Energy density of states (EDOS) for the reference data set of zinc oxide structures.
(Plot of the number of structures versus the energy range in eV/atom, shown separately for bulk, cluster, and slab structures.)
8.3 Neural network potential for zinc oxide
8.3.1 Construction of the neural network potential for zinc oxide
To evaluate the energy contributions from the analytic expression
of the neural network, symmetry functions with a cutoff radius of 6 Å were used as
input nodes for the environment-dependent atomic energies and atomic charges. The
symmetry functions used to describe the local atomic environments for the short-range
energy and for the atomic charges have been chosen to be identical.
For oxygen and zinc atoms, the vector of symmetry functions has been set up
in terms of radial (bond) functions and angular functions. The same number of radial
symmetry functions was employed for both the short-range and the long-range energies and for
the three different bonds O–O, Zn–Zn and O–Zn, namely 14 functions for oxygen
and 13 for zinc. The angular symmetry functions were likewise chosen to be
independent of the combination of elements and sum up to 128 functions
for oxygen and 116 functions for zinc. In total, the number of symmetry functions,
i.e., input nodes, is 142 for oxygen and 129 for zinc. The values of the symmetry
functions are provided in Tables A.3, A.4, A.5, and A.6 in the appendix.
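As an illustration of how such input values can be computed, the following sketch implements the commonly used Behler-Parrinello forms of a cosine cutoff function, a radial symmetry function, and an angular symmetry function. It is an assumption that these are the functional forms meant here; the parameter names (eta, r_s, zeta, lam) are illustrative, element resolution is omitted, and the angular sum runs over unique neighbor pairs, a convention that differs between implementations.

```python
import math

def cutoff(r, r_c=6.0):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_sf(center, neighbors, eta, r_s=0.0, r_c=6.0):
    """Radial function: sum of Gaussians of the neighbor distances."""
    total = 0.0
    for pos in neighbors:
        r = math.dist(center, pos)
        total += math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c)
    return total

def angular_sf(center, neighbors, eta, zeta, lam, r_c=6.0):
    """Angular function summed over all unique neighbor pairs (j, k)."""
    total = 0.0
    for j in range(len(neighbors)):
        for k in range(j + 1, len(neighbors)):
            r_ij = math.dist(center, neighbors[j])
            r_ik = math.dist(center, neighbors[k])
            r_jk = math.dist(neighbors[j], neighbors[k])
            # Cosine of the angle j-i-k from the scalar product.
            dot = sum((a - c) * (b - c)
                      for a, b, c in zip(neighbors[j], neighbors[k], center))
            cos_theta = dot / (r_ij * r_ik)
            total += ((1.0 + lam * cos_theta) ** zeta
                      * math.exp(-eta * (r_ij**2 + r_ik**2 + r_jk**2))
                      * cutoff(r_ij, r_c) * cutoff(r_ik, r_c) * cutoff(r_jk, r_c))
    return 2.0 ** (1.0 - zeta) * total
```

Evaluating a set of such functions with different parameters for one atom yields the input vector of its atomic NN. Atoms beyond the cutoff radius do not contribute.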
Table 8.1 Composition of the training and the test set for the ZnO system.
                     Training set     Test set
clusters                    6,694          672
bulk                       24,514        2,773
slabs                       3,692          405
No. of structures          34,900        3,850
No. of atoms              602,050       62,640
No. of forces           1,806,150      187,920
In this work, atomic charges obtained from DFT calculations using the Hirshfeld
method [92] have been used as reference data for the NN potential. However, any
charge partitioning scheme can be applied and also higher multipoles could, in principle,
be included. The atomic charges are then used to calculate the long-range electrostatic
energy of the system by standard methods, i.e., by direct evaluation of Coulomb’s
law for molecules or clusters and by Ewald summation [110] for periodic structures.
In the reference DFT calculations the short-range energy is not directly accessible.
Instead, for the optimization of the weight parameters of the short-range atomic NNs, the
long-range electrostatic contribution to the total energy must first be removed from the
DFT reference data. For this purpose, the electrostatic energies have been computed
from the atomic charges and subtracted from the respective total DFT values.
However, for the separation of the forces into a short-range and an electrostatic part it
has to be taken into account that the atomic charges are not fixed, but depend on the
chemical environments, and this dependence is not available from DFT calculations.
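The energy separation described above can be sketched for a non-periodic cluster as follows. The sketch assumes point charges and atomic units (charges in e, distances in Bohr, energies in Hartree); the function names are hypothetical, and periodic systems would require an Ewald summation instead of the direct sum.

```python
import math

def coulomb_energy(charges, positions):
    """Direct pairwise Coulomb sum for a non-periodic cluster.

    Charges in units of e, positions in Bohr, result in Hartree
    (atomic units, so no prefactor is needed).
    """
    energy = 0.0
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            energy += charges[i] * charges[j] / math.dist(positions[i], positions[j])
    return energy

def short_range_target(e_total_dft, charges, positions):
    """Short-range training target: total energy minus electrostatic energy."""
    return e_total_dft - coulomb_energy(charges, positions)
```

The corresponding separation of the forces is more involved, because the derivative of the environment-dependent charges enters the electrostatic forces.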
Details of the energies and charges in the training set of the zinc oxide system are
presented in Table 8.2. Different values of energy ranges of the short-range energy
Eshort , the long-range electrostatic energy Eelec , and total energy Etot are reported for
the NN schemes with and without electrostatics. For the NN including electrostatics,
the energy range of Eshort is 1.399 eV/atom, and of Eelec is 0.649 eV/atom. The total
energy for both NNs (with and without Eelec ) naturally is identical and has a value of
1.012 eV/atom. The averaged atomic charge of zinc and oxygen atoms is ±0.3535 e.
Table 8.2 Details of the energies (eV/atom) and charges (e) in the training set of the zinc oxide system. Eshort, Eelec, and Etot are the short-range
energy, the long-range electrostatic energy, and the total energy, respectively. The reference atomic charges have been derived from the
DFT calculations employing the Hirshfeld partitioning scheme [92].
The atomic charges are used to calculate the electrostatic energy of the
system by standard methods such as Ewald summation [110].
NN without electrostatics
              Emin       Emax       Average    Range
Eshort       -3.528     -2.516     -3.073      1.012
Eelec         0.000      0.000      0.000      0.000
Etot         -3.528     -2.516     -3.073      1.012
NN with electrostatics
              Emin       Emax       Average    Range
Eshort       -2.862     -1.463     -2.407      1.399
Eelec        -1.068     -0.419     -0.667      0.649
Etot         -3.528     -2.516     -3.073      1.012
Atomic charges (e)
              Qmin       Qmax       Average    Range
O            -0.4995     0.1919    -0.3535     0.6914
Zn           -0.0564     0.5870     0.3535     0.6434
8.3.2 A neural network potential for zinc oxide
In order to determine the most suitable NN, various fits for different network architectures and with different initial parameters have been performed. As in the case of
copper in Sec. 7, different numbers of hidden layers and nodes per layer have been
tested. Further, the weight parameters have been initialized randomly and
normalized according to the scheme of Nguyen and Widrow [111]. Here, we also
compared the RMSE values of NN potentials constructed according to different NN
methodologies, namely with and without using atomic forces in the fitting. The
four different NN approaches are thus the following:
• NN potentials not using atomic forces:
(i) original high-dimensional NN scheme,
(ii) high-dimensional NN scheme with electrostatic extension.
• NN potentials using atomic forces:
(i) original high-dimensional NN scheme,
(ii) high-dimensional NN scheme with electrostatic extension.
Hidden layers with 15, 20, 25, and 30 nodes each have been evaluated.
Again, both NN schemes have used the same set of symmetry functions, comprising
142 functions for oxygen and 129 for zinc. The RMSE values of the energies and
forces of the NN potentials fitted without forces are shown in Table 8.3. The energy
RMSEs for the NN without long-range electrostatic interactions are comparable to those of the
NN with electrostatic extension and differ only by a few meV/atom, while the RMSEs
of the forces are even smaller for the potential without electrostatic extension. The
differences of the force errors are in a range of about 40 meV/Bohr.
For the data in Table 8.4 the optimization of the weight parameters was done for the
training set including the atomic forces. The energy RMSEs for the NNs with and
without electrostatic energy are almost identical. However, the RMSE values of the
forces are smaller for the NN without electrostatic energy.
In general, the energy and force RMSEs are smaller for the NN fits using the forces
for the optimization.
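For reference, the two error measures used in Tables 8.3 and 8.4 can be sketched as follows. The exact definitions are not repeated in this section, so it is an assumption that the energy RMSE is taken over per-atom energy errors of the structures and the force RMSE over all Cartesian force components; the function names are illustrative.

```python
import math

def energy_rmse(e_ref, e_nn, n_atoms):
    """RMSE of per-atom energies (unit of the inputs, per atom)."""
    sq = [((a - b) / n) ** 2 for a, b, n in zip(e_ref, e_nn, n_atoms)]
    return math.sqrt(sum(sq) / len(sq))

def force_rmse(f_ref, f_nn):
    """RMSE over all Cartesian force components of all atoms."""
    sq = [(a - b) ** 2
          for atom_ref, atom_nn in zip(f_ref, f_nn)
          for a, b in zip(atom_ref, atom_nn)]
    return math.sqrt(sum(sq) / len(sq))
```

Evaluating both functions separately for the training set and the test set gives the four numbers reported per architecture and NN scheme.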
Table 8.3 Root mean squared errors (RMSEs) of the energies and forces for the
zinc oxide system for different neural network (NN) architectures obtained with
and without electrostatic part of the atom-based approach. The optimization of the
weight parameters was done for the training set not including the atomic forces. X
represents the number of input nodes, which depends on the specific element as
discussed in the text. See Table 8.4 for the same analysis including forces in the NN
training.
                 NN without Electrostatics                NN with Electrostatics
Network          ERMSE (meV/atom)   FRMSE (meV/Bohr)      ERMSE (meV/atom)   FRMSE (meV/Bohr)
Architecture     Train    Test      Train     Test        Train    Test      Train     Test
X-15-15-1        1.82     2.62       98.26     97.59      2.14     3.11      141.09    142.38
X-20-20-1        2.10     3.43      106.69     15.02      2.00     2.83      140.83    141.02
X-25-25-1        2.09     2.85       96.41     97.06      2.17     2.89      133.16    133.14
X-30-30-1        1.85     2.71       96.30     97.46      2.44     3.21      137.78    138.04
Table 8.4 Root mean squared errors (RMSEs) of the energies and forces for the
zinc oxide system for different neural network (NN) architectures obtained with
and without electrostatic part of the atom-based approach. The optimization of
the weight parameters was done for the training set including the atomic forces. X
represents the number of input nodes, which depends on the specific element as
discussed in the text. See Table 8.3 for the same analysis without using the force
information for the NN training.
                 NN without Electrostatics                NN with Electrostatics
Network          ERMSE (meV/atom)   FRMSE (meV/Bohr)      ERMSE (meV/atom)   FRMSE (meV/Bohr)
Architecture     Train    Test      Train     Test        Train    Test      Train     Test
X-15-15-1        2.30     2.80      93.00     94.85       2.96     3.53      136.96    137.21
X-20-20-1        2.02     2.73      91.70     94.06       2.60     2.90      135.16    134.51
X-25-25-1        2.08     2.58      90.07     89.94       2.51     3.49      136.37    138.74
X-30-30-1        1.93     2.67      91.67     94.84       2.59     3.33      133.82    134.36
The results of Tables 8.3 and 8.4 show that electrostatic interactions due to charge
transfer are not significant for the ZnO system.
The electrostatic energy part increases the complexity of the NN optimization. The
energy range spanned by the short-range energies in the reference set is larger than
the energy range of the total energies (Table 8.2), which additionally makes the fit
more difficult. Also note that the Hirshfeld charge partitioning yields average atomic
charges (±0.3535 e) that are far away from the chemically intuitive charges in zinc oxide
(±2.0 e). All these observations might be reasons for the higher RMSEs of the NN with
electrostatic energy. The electrostatic energy extension of the NN method might be
improved by choosing a different charge partitioning scheme. Additionally, instead
of using the total atomic charges, screening functions may be used to exclude the
short-ranged part of the electrostatic interactions from the Ewald summation [96].
Since the quality of the NN fit including electrostatic interactions is almost equivalent
to that of the NN fit without the charge part, the NN with electrostatic energy has been selected
for the further studies. Structures that are not contained in the reference data set
may exhibit stronger charge transfer and might not be properly described without
electrostatic interactions. The selected NN fit employs an X-30-30-1 architecture and
has been optimized including the force information. The RMSEs for the short-range
energies and forces are about 2.59 meV/atom and 133.82 meV/Bohr for the training
set, and about 3.33 meV/atom and 134.36 meV/Bohr for the test set, respectively.
8.3.3 Zinc oxide clusters
The accuracy of the atomic forces is essential for performing MD simulations. In
Fig. 8.3 the NN forces of a Zn15 O15 cluster are compared with the DFT values. A
similar agreement has been found also for bulk systems. The precise representation of
the atomic charges is illustrated in Fig. 8.5 showing a comparison of the NN and DFT
Hirshfeld charges of the zinc and oxygen atoms in the same cluster.
Figure 8.2 Comparison of the NN potential and the DFT energies of random cluster structures
of the composition Zn40 O40.
Figure 8.3 Comparison of the absolute forces of DFT and the neural network (NN) acting
on the atoms in a Zn15 O15 cluster. The cluster has been chosen randomly from a molecular
dynamics simulation at 1000 K.
(Two panels, zinc atoms and oxygen atoms; |F| in eV/Bohr versus atom number; series: DFT, NN.)
Figure 8.4 Comparison of the (a) x, (b) y and (c) z force components of the DFT and the neural
network (NN) acting on the atoms in a Zn15 O15 cluster. The cluster has been chosen randomly
from a molecular dynamics simulation at 1000 K.
(Panels for Zn and O atoms; force components in eV/Bohr versus atom number; series: DFT, NN.)
Figure 8.5 Comparison of the DFT and neural network (NN) atomic charges for the same
Zn15 O15 cluster as in Fig. 8.3.
(Two panels, zinc atoms and oxygen atoms; atomic charge in e versus atom number; series: DFT, NN.)
Table 8.5 Comparison of the lattice parameters, bulk moduli (GPa), and cohesive energies per formula unit of several crystal structures of ZnO obtained
with the neural network (NN) and density-functional theory (DFT).
              Ecoh /eV            lattice parameters              bulk moduli
              DFT      NN         DFT             NN              DFT    NN
CsCl          5.584    5.588      a = 2.688 Å     a = 2.680 Å     159    160
NaCl          6.747    6.745      a = 4.328 Å     a = 4.344 Å     165    169
zincblende    7.043    7.041      a = 4.616 Å     a = 4.612 Å     129    133
wurtzite      7.057    7.054      a = 3.278 Å     a = 3.278 Å     129    129
                                  c/a = 1.614     c/a = 1.614
                                  u = 0.379       u = 0.379
8.3.4 Bulk zinc oxide structures
The obtained NN potential is applicable to simulations of crystalline and amorphous
bulk structures. The accuracy for these structure types has been checked by investigating a wide range of systems. In Fig. 8.6 the energy vs. volume curves for several
crystal structures of ZnO are shown. As in DFT and experiment, the wurtzite structure
is found to be energetically most stable and even the tiny energy difference to the
zincblende structure is correctly resolved. It can be seen that the potential is also valid
for compressed structures, as the smallest investigated volumes in Fig. 8.6 correspond
to pressures of about 70 GPa. The obtained lattice parameters and cohesive energies of
the investigated phases are given in Table 8.5.
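A bulk modulus can be estimated from an energy versus volume curve through the relation B0 = V0 · d²E/dV² at the equilibrium volume V0. The following sketch uses a central finite difference; this is an illustrative choice and not necessarily the procedure used to obtain the values in Table 8.5, which may instead rely on a fit to an equation of state. The function name and arguments are hypothetical.

```python
# Conversion factor from eV/Å^3 to GPa.
EV_PER_A3_TO_GPA = 160.21766

def bulk_modulus_fd(energy, v0, h):
    """Estimate B0 = V0 * d2E/dV2 at V0 with a central finite difference.

    energy: callable E(V); v0: equilibrium volume; h: volume step.
    The result has units of energy per volume.
    """
    second_derivative = (energy(v0 + h) - 2.0 * energy(v0) + energy(v0 - h)) / h**2
    return v0 * second_derivative
```

If energies are given in eV and volumes in Å³, multiplying the result by EV_PER_A3_TO_GPA converts it to GPa.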
Figure 8.6 shows the potential-energy curves for the relative energy of the different
ideal crystal structures of ZnO. The graph shows the comparison of the DFT energies
(diamond symbols) versus the NN energies (solid lines) as a benchmark case.
The NN and DFT energies of 40 random bulk structures containing 4 zinc and 4 oxygen
atoms are shown in Fig. 8.7. These structures, which have not been included in the
training set, illustrate the reliability for amorphous systems and very high temperatures.
The performance of the NN for random ZnO bulk structures with up to 24 atoms per
unit cell and random cluster structures with 80 atoms is depicted in Figs. 8.7 and 8.2.
Note that the computational time necessary for DFT calculations for such structures
is about 6.5 hours and 10.5 hours (on 8 cores), respectively. The time used for the
NN computations on the other hand is just one second for the bulk structures and two
seconds for the cluster structures (on a single CPU).
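The timings quoted above can be turned into a rough speed-up estimate. The sketch compares core-seconds, which assumes ideal parallel scaling of the DFT calculations and comparable hardware; both are simplifying assumptions, and the function name is hypothetical.

```python
def core_seconds(hours, cores):
    """Wall-clock hours on a number of cores, expressed in core-seconds."""
    return hours * 3600.0 * cores

speedup_bulk = core_seconds(6.5, 8) / 1.0      # NN: 1 s on one core
speedup_cluster = core_seconds(10.5, 8) / 2.0  # NN: 2 s on one core
```

Under these assumptions the NN evaluation is about five orders of magnitude cheaper than the reference DFT calculation for both structure types.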
8.3.5 Zinc oxide surfaces
The relative energies of a thermally distorted ZnO(101̄0) surface slab with 8 layers,
resulting from MD simulations at temperatures between T = 300 K and 3,000 K, as
calculated using the NN and DFT, are shown in Fig. 8.8. The results demonstrate that
the zinc oxide potential is in good agreement with the reference DFT data. Moreover,
the CPU time for the NN calculations is much lower than for the corresponding DFT
calculations.
Figure 8.6 Comparison of the NN potential and the DFT energies of ideal crystal structures of
ZnO.
Figure 8.7 Comparison of the NN potential and the DFT energies of random bulk structures of
the composition Zn12 O12.
Figure 8.8 Comparison of the NN potential and the DFT energies of some thermally distorted
ZnO(101̄0) surface structures.
9 Neural Network Potentials for Ternary Systems
With the extension of the NN potential method to general multicomponent systems
in Chapter 8, it is now possible to turn to the ternary Cu/Zn/O system, which lies at
the center of our interests. Since the ZnO potential was the first application of the
NN method to multicomponent systems, the Cu/ZnO potential of this chapter is the
first ternary NN potential. The configuration space that has to be represented and the
number of parameters for a multicomponent potential both grow as O(N!), where N
is the number of chemical species. It therefore has to be determined whether a ternary
potential can achieve the same accuracy as a binary one.
9.1 Construction of a neural network potential-energy surface for
copper/zinc oxide
The construction of generally applicable NN potentials for ternary systems, like Cu/Zn/O,
is a difficult task due to the large configuration space. Therefore, the NN potential
reported here is restricted to the description of copper clusters at zinc oxide surfaces.
In order to use the potential in MD simulations to study clusters of a size comparable
to experiment, it should be able to describe very large systems containing tens of
thousands of atoms. Additionally, it needs to be reliable for many subsystems, like
copper particles, zinc oxide, a variety of surfaces of both systems, and their interfaces,
including a manifold of defects. All these subsystems need to be described accurately
at different temperatures to ensure that MD simulations yield accurate results under
different conditions. Moreover, chemical processes can also take place at the interface,
resulting in structures that differ strongly from the combination of the ideal subsystems.
Oxygen and zinc atoms could diffuse into the copper cluster and give rise to
oxidation of the cluster and alloy formation. Additionally, copper clusters could
penetrate the ZnO surface, causing substantial structural changes in the ZnO support
including significant mass transport [14].
9.2 Reference data set
The training data points used for the construction of the copper potential of Chapter 7
and for the zinc oxide potential of Chapter 8 have been reused for the training of the
ternary potential [85, 95]. In addition, the data set has been extended by configurations
of the binary subsystems Cu/Zn and Cu/O, and ternary Cu/Zn/O structures. Since the
application of the ternary potential will focus on the Cu/ZnO interface, no attempt
has been undertaken to construct a universal atomistic potential for the three elements.
Consequently, we have restricted ourselves to Cu/ZnO structures derived from interface
models, and molecular structures, such as the dioxygen molecule (O2 ), have not been
included in the training set.
Both the construction of the reference data set and its composition are similar to
what has been discussed for copper and zinc oxide in Chapters 7 and 8. The data set
comprises bulk structures (both ideal and thermally distorted), slab models, and clusters
that were derived from MD snapshots. See Sec. 6.2 for a detailed explanation
of the iterative two-fit technique used for the refinement of the training set, which
ensures that only relevant structures are added to the data set. The total number of data points
is around 100,000, where the individual structures consist of up to 100 atoms. The
exact composition of the data set is listed in Table 9.1. About 90% of the data points
have entered the training set. The remaining structures form an independent test set
that has been used to assess the quality of the fit.
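A split of this kind can be sketched as follows. It is an assumption that the structures are assigned at random; the text only states the approximate 90:10 ratio. The function name, the default fraction, and the seed are illustrative.

```python
import random

def split_data(items, train_fraction=0.9, seed=0):
    """Randomly split items into a training and an independent test list."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]
```

Keeping the test structures out of the weight optimization is what allows the test-set RMSE to serve as a measure of the predictive power of the fit.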
As for the copper and the zinc oxide potentials, various NN architectures have
been explored for the Cu/ZnO potential. However, in each case the same architecture
has been used for the three different chemical species, in order to reduce the number of
possibilities to a feasible limit. Additionally, the same set of symmetry functions has
been employed for copper, zinc and oxygen, although it may be possible to tune the
symmetry function setup to reduce the number necessary for a given accuracy.
In this chapter, two sets of symmetry functions as input nodes have been tested, i.e.,
132 and 156 functions, with different NN architectures. The energy and force RMSEs
obtained with 132 symmetry functions are slightly higher than the values for 156 symmetry
functions. The errors are listed in Table 9.2 and Table 9.3, respectively.
Table 9.1 Composition of the training set and the test set for the construction of a neural network potential for the ternary system Cu/ZnO.
              Training set                      Test set
System        clusters    bulk      slabs       clusters    bulk     slabs
Cu               5,810    10,604    12,484           670    1,125    1,405
CuO                 37       909         -             5      106        -
CuZn                 4       588     1,122             -       61      128
ZnO              7,105    24,572     3,695           833    2,715      402
CuZnO           18,356         -     2,317         2,033        -      263
The resulting RMSEs for the cohesive energies and atomic forces as a function of
the NN architecture are given in Table 9.3. The selected NN fit (the 156-15-15-15-1
architecture), i.e., the fit with the smallest errors for energies and forces in the test set, has
been achieved for an architecture of three hidden layers with 15 nodes per layer. The
resulting RMSE values for the cohesive energies are 4.84 meV/atom for the training
set and 5.13 meV/atom for the test set. The corresponding RMSEs for the
atomic forces are 93.06 meV/Bohr and 88.61 meV/Bohr, respectively. Note that the small difference
between the errors in the training and the test set indicates that virtually no overfitting is
present. The atomic NNs have 2,851 weight parameters per element (Table 9.3), which were fitted
using approximately eight million data points of energies and force components.
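The number of weight parameters per atomic NN follows directly from the architecture. The sketch below assumes fully connected feed-forward layers with one bias weight per node outside the input layer, which reproduces the weight counts listed in Tables 9.2 and 9.3; the function name is illustrative.

```python
def count_weights(layers):
    """Number of weights and biases in a fully connected feed-forward NN.

    layers: node counts per layer, e.g. [156, 15, 15, 15, 1].
    Each node outside the input layer has one bias weight.
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))
```

For the 156-15-15-15-1 architecture this gives 157·15 + 16·15 + 16·15 + 16·1 = 2,851 parameters.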
Before the quality of the ternary Cu/ZnO potential is assessed, it shall be confirmed
that the new copper and zinc oxide sub-networks are of equal accuracy as the system
specific potentials of the previous chapters. For a number of benchmark cases, the new
potentials therefore have been compared to the potentials that have been described in
Chapters 7 and 8. Since the configuration space of a ternary system is much larger
than the one of a binary system, it is not directly obvious that it is possible to construct
a ternary NN potential with the same accuracy. The remainder of the chapter seeks
to determine the quality of the potential for ternary structures, albeit focusing on the
Cu/ZnO interface and on large copper clusters.
Table 9.2 Root mean squared errors (RMSEs) of the neural network (NN) energies
and forces obtained for the training set and the test set of the ternary Cu/ZnO system
with different NN architectures with 132 symmetry functions.
Network            Weights per    ERMSE (meV/atom)            FRMSE (meV/Bohr)
Architecture       Element        Training Set   Test Set     Training Set   Test Set
132-2-2-1               275          10.74         10.82         187.06        182.82
132-2-2-2-1             281           8.72          9.05         148.35        144.89
132-5-5-1               701           5.99          6.24         135.68        134.81
132-5-5-5-1             731           5.54          5.94         113.81        108.52
132-10-10-1            1451           5.43          5.87         116.63        115.05
132-10-10-10-1         1561           5.00          5.33         108.79        102.04
132-15-15-1            2251           5.46          5.80         114.95        111.88
132-20-20-1            3101           5.87          6.06         112.85        109.21
132-30-30-1            4951           7.22          7.41         132.59        127.57
132-40-40-1            7001           8.73          8.67         156.23        150.09
9.2.1 Neural network potential for copper
In this section the performance of the new NN potential is compared to the specialized
copper potential of Chapter 7 and Ref. [85]. The lattice parameters, cohesive energies
and bulk moduli of several copper crystal structures, as predicted using the two different
NN potentials, are compared to their DFT reference values in Table 9.4. Both NN
potentials yield very similar results, which are in very good agreement with DFT for
all structures.
As a second benchmark, the surface energies of ideal low index copper surfaces
obtained using the two NN potentials are compared in Table 9.5. The proper description
of these surfaces is essential for the modeling of large copper clusters. As is evident
from Table 9.5, the quality of the two NN fits is comparable, and they accurately
reproduce the DFT surface energies.
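Surface energies such as those in Table 9.5 are commonly obtained from slab calculations as the excess energy of the slab over the same number of bulk atoms, divided by the total surface area. The sketch below assumes a symmetric slab with two equivalent surfaces; this is the standard definition and is given for illustration, since the exact procedure is not restated in this section. The function and argument names are hypothetical.

```python
def surface_energy(e_slab, n_atoms, e_bulk_per_atom, area):
    """Surface energy of a symmetric slab with two equivalent surfaces.

    e_slab: total slab energy; n_atoms: atoms in the slab;
    e_bulk_per_atom: bulk energy per atom; area: area of one surface.
    """
    return (e_slab - n_atoms * e_bulk_per_atom) / (2.0 * area)
```

With energies in eV and areas in Å², the result is in eV/Å² and has to be multiplied by 1000 to obtain meV/Å².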
Finally, the forces acting on the individual atoms within ten copper clusters (Fig. 9.2)
that were extracted from a realistic surface model (Fig. 9.1) are compared in Fig. 9.3.
Table 9.3 Root mean squared errors (RMSEs) of the neural network (NN) energies
and forces obtained for the training set and the test set of the ternary Cu/ZnO system
with different NN architectures. The fit used in this chapter (156-15-15-15-1) is marked with an asterisk. The
number of weight parameters in the atomic NNs is also given for each architecture.
Network            Weights per    ERMSE (meV/atom)            FRMSE (meV/Bohr)
Architecture       Element        Training Set   Test Set     Training Set   Test Set
156-2-2-1               323           8.13          8.22         137.22        133.27
156-2-2-2-1             329           8.67          8.73         134.58        131.47
156-5-5-1               821           5.88          6.20         112.18        110.86
156-5-5-5-1             851           5.54          5.91         102.85        100.19
156-10-10-1            1691           5.32          5.68          98.96         95.11
156-10-10-10-1         1801           4.99          5.36          95.58         93.14
156-15-15-1            2611           5.29          5.64          99.25         96.71
156-15-15-15-1 *       2851           4.84          5.13          93.06         88.61
156-20-20-1            3581           5.91          6.23         109.08        105.21
156-20-20-20-1         4001           5.35          5.68          99.17         95.33
156-30-30-1            5671           7.11          7.35         117.41        112.61
156-40-40-1            7961           8.91          8.91         139.03        133.38
Table 9.4 Comparison of the lattice parameters, cohesive energies and bulk moduli of
various crystal structures of copper obtained with two neural network (NN) potentials
and density-functional theory (DFT).
Structure        Property     NN (CuZnO fit)   NN (Cu fit)   DFT
fcc              a0 (Å)            3.628          3.630      3.630
                 Ecoh (eV)         3.524          3.526      3.533
                 B (GPa)           142            138        140
simple cubic     a0 (Å)            2.407          2.407      2.407
                 Ecoh (eV)         3.051          3.052      3.051
                 B (GPa)           100            108        103
bcc              a0 (Å)            2.885          2.887      2.885
                 Ecoh (eV)         3.492          3.486      3.489
                 B (GPa)           139            135        137
hcp              a0 (Å)            2.568          2.570      2.573
                 c/a0              1.629          1.631      1.627
                 Ecoh (eV)         3.525          3.511      3.511
Table 9.5 Surface energies of low index copper surfaces obtained from DFT and the neural
network (NN) potential for the ternary Cu/ZnO system. For comparison also the NN surface
energies obtained from a potential for pure copper [85] are listed. All energies are given
in meV/Å², “mr” is the missing row reconstruction.
Surface      NN (CuZnO fit)   NN (Cu fit)   DFT
(111)             88.7            92.7       93.2
(100)             98.3           101.0      100.5
(110)            102.6           103.9      102.4
(110)mr          109.8           111.7      109.9
Figure 9.1 Slab model of a Cu(111) surface with several defects like vacancies, adatoms, steps,
and kinks [112]. Ten representative atoms shown in blue have been selected to investigate the
accuracy of the neural network (NN) potential-energy surface. The NN forces acting on these
atoms depend on all atoms inside the transparent spheres with a radius of approximately 12 Å,
which corresponds to twice the cutoff Rc of the symmetry functions. The atoms enclosed in
these spheres form clusters which are small enough to be calculated by DFT. They are shown
in Fig. 9.2.
These clusters are shown as transparent blue spheres (radius 2Rc ≈ 12 Å). They contain
on average about 260 atoms and are sufficiently large to ensure that the central atoms
have approximately the same chemical environment as in the full slab. A similar
benchmark had been used in Sec. 7.3 to estimate the accuracy of the copper NN
potential for structures that are too large to be computed with DFT. With the exception
of a single cluster, for which the error is about 0.1 eV/Bohr, the two NN potentials are
able to predict the DFT values with high accuracy.
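The extraction of such clusters can be sketched as follows. The radius of twice the cutoff reflects that the force on the central atom involves the atomic energies of all neighbors within Rc, each of which in turn depends on atoms within Rc of that neighbor. The function name and arguments are hypothetical, and periodic boundary conditions of the slab are ignored in this sketch.

```python
import math

def extract_cluster(positions, center_index, radius=12.0):
    """Return indices of all atoms within `radius` of the chosen central atom.

    positions: list of (x, y, z) tuples in the same length unit as radius.
    Periodic images are not considered in this sketch.
    """
    center = positions[center_index]
    return [i for i, pos in enumerate(positions)
            if math.dist(center, pos) <= radius]
```

The returned indices include the central atom itself, so the selected atoms can be passed directly to a reference calculation of the cluster.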
9.2.2 Neural network potential for zinc oxide
As for the copper subsystem, the new ternary potential must also be reliable for zinc
oxide structures. In this section, the new potential is therefore compared to the ZnO
potential of Chapter 8 and Ref. [95]. Table 9.6 presents the lattice parameters, cohesive
energies and bulk moduli of a number of zinc oxide crystal structures, as calculated
using the two different NN potentials and DFT. As in the case of copper, the new
potential proves to be of equally high accuracy for these properties as the specialized
ZnO potential.
In Fig. 9.4 the NN predictions of energies of randomly generated (amorphous) zinc
oxide bulk structures are compared to their DFT references. In this benchmark test
the new ternary potential even performs slightly better than the potential of Chapter 8.
Figure 9.2 Geometries of the 10 clusters extracted from the large slab model of the Cu(111)
surface in Fig. 9.1. The forces acting on the central atoms shown in blue can be used to assess
the accuracy of the neural network (NN) potential for the full system. A comparison of the
DFT forces in these clusters and the NN forces in the full slab is shown in Fig. 9.3.
Figure 9.3 Comparison of the DFT and neural network (NN) forces acting on the central atoms
of the 10 clusters shown in Fig. 9.2. The clusters used for the DFT calculations contain all
atoms within a radius of about 12 Å around the central atom. By construction the NN forces in
these clusters are identical to the NN forces in the full slab shown in Fig. 9.1.
(|F| in eV/Bohr versus cluster number; series: DFT, NN(CuZnO).)
Table 9.6 Comparison of the lattice parameters, bulk moduli, and cohesive energies per
formula unit of several crystal structures of ZnO obtained with two neural network (NN)
potentials and density-functional theory (DFT).
Structure        Property     NN (CuZnO fit)   NN (ZnO fit)   DFT
CsCl             a0 (Å)            2.692          2.680       2.688
                 Ecoh (eV)         5.584          5.588       5.584
                 B (GPa)           156            160         159
zincblende       a0 (Å)            4.624          4.612       4.616
                 Ecoh (eV)         7.042          7.041       7.043
                 B (GPa)           125            133         129
NaCl             a0 (Å)            4.332          4.344       4.328
                 Ecoh (eV)         6.751          6.745       6.747
                 B (GPa)           167            169         165
wurtzite         a0 (Å)            3.278          3.278       3.278
                 c/a0              1.614          1.614       1.614
                 u                 0.379          0.379       0.379
                 Ecoh (eV)         7.062          7.054       7.057
Note that, in contrast to the pure ZnO potential, the Cu/ZnO potential does not
contain an atomic charge subnetwork for electrostatic interactions. Since the training
sets employed for the construction of both potentials were very similar, we
conclude that long-range electrostatic interactions are not of high importance for the
studied ZnO structures.
As a final measure for the quality of the new potential for ZnO structures, the predicted
forces acting on the central oxygen or zinc atoms of Zn15 O15 clusters are compared to
the reference DFT values in Fig. 9.5. The clusters have been generated from snapshots
of molecular dynamics simulations at 1000 K and were not included in the training set.
Nevertheless, the agreement between the NN potential and DFT is very good.
9.2.3 The binary subsystems Cu/O and Cu/Zn
For the additional binary subsystems Cu/O and Cu/Zn, the DFT and NN energies of
bulk structures of the compositions Cu10 O2 and Cu10 Zn2 are compared in
Fig. 9.6 and Fig. 9.7, respectively. As apparent from the diagrams, the NN potential
is able to accurately reproduce the DFT reference values. The insets show random
(3 × 3 × 1) supercell structures for both systems.
Figure 9.4 Comparison of the DFT and neural network (NN) energies of random bulk structures
of the composition Zn4 O4 . The NN energies have been obtained using two different potentials,
the NN potential for zinc oxide [95] and the NN potential for the Cu/ZnO system [112].
(rel. energy in eV/atom versus number of structure; series: DFT, NN (ZnO), NN (CuZnO).)
Figure 9.5 Comparison of the absolute forces acting on the atoms in a Zn15 O15 cluster obtained
by DFT and two neural network (NN) potentials. The cluster has been chosen randomly
from a molecular dynamics simulation at 1000 K. The NN forces have been obtained using
two different potentials, a previously published NN potential for zinc oxide [95] and the NN
potential for the ternary Cu/ZnO system [112].
(Two panels, Zn and O atoms; |F| in eV/Bohr versus atom number; series: DFT, NN (ZnO), NN (CuZnO).)
Figure 9.6 Comparison of the DFT and neural network (NN) energies of random CuO bulk
structures of the composition Cu10 O2 . The inset shows a (3 × 3 × 1) supercell of a CuO bulk
structure.
(rel. energy in eV/atom versus number of structure; series: DFT, NN(CuZnO).)
9.2.4 Neural network potential for copper clusters at zinc oxide
It should be emphasized again that the Cu/ZnO potential presented in this section has
been explicitly constructed for copper clusters at zinc oxide surfaces. The ternary
Cu/Zn/O structures that have been included in the reference data set were therefore
extracted from large interface models. In Fig. 9.8 the NN and DFT energies of a
number of examples for such clusters of the composition Cu27Zn20O20 are compared
after they had been used to refine the NN potential. As can be clearly seen, the NN
potential is able to reproduce the fitted energies very well.
Also the energies of random slabs of the composition Cu12Zn6O6 have been checked, and the agreement of the NN and DFT energies is excellent, as shown in Fig. 9.9.

In order to verify the accuracy of the preliminary potential for Cu/ZnO, a test model has been constructed: a copper cluster containing 612 atoms adsorbed with its (111) surface at a ZnO(10$\bar{1}$0) surface, as shown in Fig. 9.10. In total, this model contains 7524 atoms. An NVT MD simulation at 1000 K has been carried out with the molecular dynamics program TINKER [97], employing the NN potential for the Cu/ZnO system, to obtain highly distorted structures at the interface. Since the system is too large to access via DFT calculations, representative atoms close to the interface have been selected, and the reliability of the forces acting on those atoms has been investigated. Atoms far away from the interface are already accurately described, because they are embedded in chemical environments similar to those in the subsystems pure copper and pure zinc oxide.

[Figure 9.7: plot of the relative energy (eV/atom) versus the number of the structure; curves: DFT, NN (CuZnO).]
Figure 9.7 Comparison of the DFT and neural network (NN) energies of random CuZn bulk structures of the composition Cu10Zn2. The inset shows a (3 × 3 × 1) supercell of a CuZn bulk structure.

[Figure 9.8: plot of the relative energy (meV/atom) versus the number of the structure; curves: DFT, NN (CuZnO).]
Figure 9.8 Comparison of the DFT and neural network (NN) energies of random clusters of the composition Cu27Zn20O20.

[Figure 9.9: plot of the relative energy (meV/atom) versus the number of the structure; curves: DFT, NN (CuZnO).]
Figure 9.9 Comparison of the DFT and neural network (NN) energies of random slabs of the composition Cu12Zn6O6.

[Figure 9.10: structure image.]
Figure 9.10 Model of a Cu612 cluster at the ZnO(10$\bar{1}$0) surface. In total, this system contains 7524 atoms.
As has been pointed out, the NN forces in principle depend on the positions of the
atoms within a distance of up to twice the cutoff radius from the central atom. However,
it can be seen that for all atoms the forces in the smallest clusters are already a very
good approximation to the converged forces.
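The factor of two can be made explicit with a short derivation. This is only a sketch: it assumes the usual high-dimensional NN energy expression, a sum of atomic energies $E_i$ that each depend on the neighbors within the cutoff radius $R_c$, and the notation below is introduced here for illustration.

```latex
E = \sum_{i} E_i\bigl(\{\vec{R}_j : R_{ij} \le R_c\}\bigr),
\qquad
\vec{F}_k = -\frac{\partial E}{\partial \vec{R}_k}
          = -\sum_{i \,:\, R_{ik} \le R_c} \frac{\partial E_i}{\partial \vec{R}_k} .
```

Each derivative $\partial E_i / \partial \vec{R}_k$ is itself a function of all neighbors $j$ of atom $i$ with $R_{ij} \le R_c$, so the triangle inequality gives $R_{kj} \le R_{ki} + R_{ij} \le 2R_c$ for every atom $j$ that can influence the force on atom $k$.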
In Fig. 9.11 a snapshot of the MD simulation is shown in panel (B). To visualize the interface, the underside of the copper cluster is shown in panel (A), while a top view of the ZnO surface is shown in panel (C), including the positions of the copper atoms in the first layer of the cluster. Five representative atoms of each element (Cu, O, Zn) have been selected as labeled in Fig. 9.11(A) and (C). For these 15 atoms, atom-centered clusters have been extracted from the slab model to determine the forces by DFT calculations. A radius of about 12 Å, which corresponds to twice the cutoff radius of the symmetry functions, results in an average of about 450 atoms per cluster, which is still too large for DFT. Therefore, cluster radii of 6 Å and 9 Å (on average 65 and 200 atoms, respectively) have been investigated instead. Using these two cluster sizes makes it possible to assess the convergence of the DFT forces acting on the central atoms as a function of the cluster radius. The results are presented in Fig. 9.12. There are only small differences between the DFT forces at the central atoms in the 6 Å and 9 Å clusters, and both data sets are very similar to the NN forces calculated for these atoms in the slab model shown in Fig. 9.11. There are no differences in the quality of the NN forces for the elements copper, zinc and oxygen. The DFT forces thus appear to be essentially converged already at cluster radii of 9 Å. To support this, the convergence of the NN forces as a function of the cluster radius is shown in Fig. 9.13.
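The extraction of atom-centered clusters described above can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure used in this work: the function name `extract_cluster` and the toy coordinates are assumptions, and periodic images of the slab are ignored for simplicity.

```python
import numpy as np

def extract_cluster(positions, center_index, radius):
    """Return the indices of all atoms within `radius` of the central atom.

    Assumption: non-periodic coordinates; a real slab model would also
    need the periodic images along the two surface directions.
    """
    positions = np.asarray(positions, dtype=float)
    distances = np.linalg.norm(positions - positions[center_index], axis=1)
    return np.flatnonzero(distances <= radius)

# Toy example: seven atoms on a line, spaced 3 Å apart (hypothetical data).
toy_positions = np.array([[3.0 * i, 0.0, 0.0] for i in range(7)])
for radius in (6.0, 9.0):
    indices = extract_cluster(toy_positions, center_index=3, radius=radius)
    print(f"radius {radius} Å: {len(indices)} atoms, indices {indices.tolist()}")
```

Running the same selection with increasing radii around one atom gives the series of clusters whose central-atom forces can then be compared.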
[Figure 9.11: structure images, panels (A), (B) and (C).]
Figure 9.11 Snapshot of a molecular dynamics simulation at 1000 K of a Cu612 cluster at the ZnO(10$\bar{1}$0) surface (B). This structure has been used to analyze the quality of the neural network (NN) potential for the description of the interface atoms. In (A) a bottom view of the cluster is shown, and five copper atoms have been selected to compare the NN forces with the DFT forces. The comparison is shown in Fig. 9.12. In panel (C) a top view of the ZnO(10$\bar{1}$0) surface is shown. Five oxygen and five zinc atoms have been chosen for a closer investigation of the forces. In order to illustrate the position of these atoms with respect to the copper cluster, the copper atoms of the first metal layer, which are directly in contact with the ZnO surface, are shown as small spheres.

[Figure 9.12: plot of |F| (eV/Bohr) for the Cu, Zn and O atoms 1 to 5; series: DFT 6 Å, DFT 9 Å, NN slab.]
Figure 9.12 Comparison of the neural network (NN) forces acting on selected atoms at the Cu(111)/ZnO(10$\bar{1}$0) interface shown in Fig. 9.11 and forces obtained by DFT in calculations for clusters centered at these atoms with radii of 6 Å and 9 Å.

[Figure 9.13: plot of |F| (eV/Bohr) for the Cu, Zn and O atoms 1 to 5; series: NN 6 Å, NN 9 Å, NN 12 Å, NN slab.]
Figure 9.13 Convergence of the neural network (NN) forces acting on selected atoms at the Cu(111)/ZnO(10$\bar{1}$0) interface shown in Fig. 9.11. The forces have been obtained from clusters centered at these atoms with increasing radii of 6 Å, 9 Å, and 12 Å. For comparison, the forces obtained directly for the full system are also shown ("slab"). While there are still small differences between the 6 Å clusters and the slab, the forces are well converged for a cluster radius of 9 Å.
10 A Neural Network Potential for the Methanol Molecule
Recently, a first application of full-dimensional NN potentials to binary molecular systems, namely water monomers and dimers containing hydrogen and oxygen (H, O), has been presented [96]. For this thesis the methanol molecule has been selected as a benchmark for a ternary molecular system (H, C, and O). The construction of an NN potential for molecular applications is in principle similar to the construction of the solid-state potentials described in the previous chapters. In the following, the differences between the approaches are elaborated.

A cutoff radius Rc of 12 Bohr has been used for the atomic NNs, which ensures that each atom interacts with all other atoms in the molecule. Due to the chemical composition of the methanol molecule, only a subset of the possible element combinations of the ternary (H, C, O) system is present, which has to be taken into account in the construction of the symmetry functions. For example, carbon atoms cannot have other carbon atoms in their chemical environment, as there is only a single carbon atom in methanol. The specific symmetry functions that have been used and their parameters are listed in Table 10.1 and Table 10.2. In short, the chemical environments of the H, C, and O atoms have been described in the atomic NNs by 19, 10, and 10 atom-centered symmetry functions, respectively.
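The counts of 19, 10, and 10 symmetry functions follow from the composition of methanol and can be checked with a short enumeration. The sketch below is illustrative only; the names `METHANOL`, `N_G2_SETS`, `N_G4_SETS`, and `count_symmetry_functions` are assumptions introduced here, with one radial parameter set per neighbor element and four angular parameter sets (λ = ±1, ζ = 1 or 4) per neighbor pair, as listed in Tables 10.1 and 10.2.

```python
from itertools import combinations_with_replacement

METHANOL = {"H": 4, "C": 1, "O": 1}  # composition of CH3OH
N_G2_SETS = 1  # one (eta, Rs) parameter set per neighbor element
N_G4_SETS = 4  # lambda in {-1, +1} combined with zeta in {1, 4}

def pair_is_possible(others, a, b):
    """A neighbor pair (a, b) needs two atoms of the element if a == b."""
    if a == b:
        return others[a] >= 2
    return others[a] >= 1 and others[b] >= 1

def count_symmetry_functions(center):
    """Count the G2 and G4 functions available to a central atom in methanol."""
    others = dict(METHANOL)
    others[center] -= 1  # the central atom is not its own neighbor
    radial = [e for e, n in others.items() if n >= 1]
    angular = [(a, b) for a, b in combinations_with_replacement(others, 2)
               if pair_is_possible(others, a, b)]
    return len(radial) * N_G2_SETS + len(angular) * N_G4_SETS

for element in METHANOL:
    print(element, count_symmetry_functions(element))
```

For hydrogen, for instance, the possible neighbor pairs are HH, HC, HO, and CO, while CC and OO cannot occur because the molecule contains only one carbon and one oxygen atom.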
The fitting accuracy that can be obtained depends on the energy range that needs
to be covered by the potential. The training data set has been constructed by MD
simulations using the MM3 force field at temperatures between 10 K and 1000 K. We
have extracted 45,600 structures from the MD trajectories, spanning an energy range
of 0.08 eV/atom with respect to the optimized structure. 41,047 of these structures
have been selected randomly for the training set, the remaining 4,553 structures form
the test set.
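A random partition of this kind can be sketched as follows. The procedure and random number generator actually used are not stated in the text, so the function `split_train_test` and the seed are purely illustrative assumptions; only the set sizes are taken from the paragraph above.

```python
import numpy as np

def split_train_test(n_structures, n_test, seed=0):
    """Randomly partition structure indices into a training and a test set."""
    rng = np.random.default_rng(seed)  # fixed seed only for reproducibility
    permutation = rng.permutation(n_structures)
    test_indices = np.sort(permutation[:n_test])
    train_indices = np.sort(permutation[n_test:])
    return train_indices, test_indices

train, test = split_train_test(n_structures=45600, n_test=4553)
print(len(train), len(test))
```

With these sizes, roughly 10% of the structures are withheld from the fit and serve as an independent check for overfitting.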
Table 10.1 Parameters defining the atom-centered symmetry functions used to describe the local atomic environments for methanol (hydrogen atom).

Symmetry functions of type G2: element H
No.   neighbor    η (Bohr⁻²)   Rs (Bohr)   Rc (Bohr)
1     H           0.010        0.0         12.0
2     C           0.010        0.0         12.0
3     O           0.010        0.0         12.0

Symmetry functions of type G4: element H
No.   neighbors   η (Bohr⁻²)   λ      ζ     Rc (Bohr)
4     HH          0.008        -1.0   1.0   12.0
5     HC          0.008        -1.0   1.0   12.0
6     HO          0.008        -1.0   1.0   12.0
7     CO          0.008        -1.0   1.0   12.0
8     HH          0.008         1.0   1.0   12.0
9     HC          0.008         1.0   1.0   12.0
10    HO          0.008         1.0   1.0   12.0
11    CO          0.008         1.0   1.0   12.0
12    HH          0.008        -1.0   4.0   12.0
13    HC          0.008        -1.0   4.0   12.0
14    HO          0.008        -1.0   4.0   12.0
15    CO          0.008        -1.0   4.0   12.0
16    HH          0.008         1.0   4.0   12.0
17    HC          0.008         1.0   4.0   12.0
18    HO          0.008         1.0   4.0   12.0
19    CO          0.008         1.0   4.0   12.0
Table 10.2 Parameters defining the atom-centered symmetry functions used to describe the local atomic environments for methanol (carbon and oxygen atoms).

Symmetry functions of type G2: element C
No.   neighbor    η (Bohr⁻²)   Rs (Bohr)   Rc (Bohr)
1     H           0.010        0.0         12.0
2     O           0.010        0.0         12.0

Symmetry functions of type G4: element C
No.   neighbors   η (Bohr⁻²)   λ      ζ     Rc (Bohr)
3     HH          0.008        -1.0   1.0   12.0
4     HO          0.008        -1.0   1.0   12.0
5     HH          0.008         1.0   1.0   12.0
6     HO          0.008         1.0   1.0   12.0
7     HH          0.008        -1.0   4.0   12.0
8     HO          0.008        -1.0   4.0   12.0
9     HH          0.008         1.0   4.0   12.0
10    HO          0.008         1.0   4.0   12.0

Symmetry functions of type G2: element O
No.   neighbor    η (Bohr⁻²)   Rs (Bohr)   Rc (Bohr)
1     H           0.010        0.0         12.0
2     C           0.010        0.0         12.0

Symmetry functions of type G4: element O
No.   neighbors   η (Bohr⁻²)   λ      ζ     Rc (Bohr)
3     HH          0.008        -1.0   1.0   12.0
4     HC          0.008        -1.0   1.0   12.0
5     HH          0.008         1.0   1.0   12.0
6     HC          0.008         1.0   1.0   12.0
7     HH          0.008        -1.0   4.0   12.0
8     HC          0.008        -1.0   4.0   12.0
9     HH          0.008         1.0   4.0   12.0
10    HC          0.008         1.0   4.0   12.0
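To make the role of the parameters η, Rs, λ, ζ, and Rc concrete, the following sketch evaluates one radial (G2) and one angular (G4) symmetry function. The definitions used in this work are given in Chapter 5 and are not reproduced here, so the code assumes the commonly used Behler forms with a cosine cutoff function; the function names are hypothetical, and element-resolved functions would restrict the sums to neighbors of the listed elements.

```python
import numpy as np

def cutoff(r, rc):
    """Cosine cutoff function: 1 at r = 0, decaying smoothly to 0 at r = rc."""
    return np.where(r < rc, 0.5 * (np.cos(np.pi * r / rc) + 1.0), 0.0)

def g2(r_ij, eta, r_s, rc):
    """Radial function: sum of shifted Gaussians over neighbor distances."""
    r_ij = np.asarray(r_ij, dtype=float)
    return float(np.sum(np.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, rc)))

def g4(pos_i, neighbors, eta, lam, zeta, rc):
    """Angular function: sum over all ordered neighbor pairs (j, k), j != k.

    Assumption: conventions differ on whether each pair is counted once
    or twice; this sketch counts both orderings.
    """
    pos_i = np.asarray(pos_i, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    total = 0.0
    for j in range(len(neighbors)):
        for k in range(len(neighbors)):
            if j == k:
                continue
            v_ij = neighbors[j] - pos_i
            v_ik = neighbors[k] - pos_i
            r_ij = np.linalg.norm(v_ij)
            r_ik = np.linalg.norm(v_ik)
            r_jk = np.linalg.norm(neighbors[j] - neighbors[k])
            # Clip to guard against rounding slightly outside [-1, 1].
            cos_theta = np.clip(np.dot(v_ij, v_ik) / (r_ij * r_ik), -1.0, 1.0)
            total += float((1.0 + lam * cos_theta) ** zeta
                           * np.exp(-eta * (r_ij**2 + r_ik**2 + r_jk**2))
                           * cutoff(r_ij, rc) * cutoff(r_ik, rc) * cutoff(r_jk, rc))
    return 2.0 ** (1.0 - zeta) * total

# Hypothetical geometry in Bohr: central atom at the origin, two neighbors.
neighbors = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
print(g2([2.0, 2.0], eta=0.010, r_s=0.0, rc=12.0))
print(g4([0.0, 0.0, 0.0], neighbors, eta=0.008, lam=1.0, zeta=1.0, rc=12.0))
```

With λ = -1 the angular term vanishes for θ = 0° and is largest for θ = 180°, while λ = +1 reverses this, and larger ζ narrows the angular resolution.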
Table 10.3 Root mean squared errors (RMSEs) of the energies and forces for the methanol molecule for different NN architectures. X represents the number of input nodes, which depends on the element, as discussed in the text.

               Energy RMSE (meV/atom)       Force RMSE (meV/Bohr)
Architecture   Training set   Test set      Training set   Test set
X-5-5-1        0.426          0.449         26.45          26.30
X-10-10-1      0.112          0.115          7.20           7.19
X-15-15-1      0.069          0.073          4.48           4.51
X-20-20-1      0.053          0.060          3.65           3.65
X-30-30-1      0.032          0.038          2.51           2.57
X-40-40-1      0.032          0.038          2.69           2.64
As for the Cu/Zn/O potentials in Chapters 7, 8 and 9, for the construction of the NN
potential we have tested various NN architectures employing two hidden layers with
five to 40 nodes per layer. The RMSE values are listed in Table 10.3. The optimum
architecture corresponds to a fit with a low RMSE value for the training set and the
test set.
The best fit has been obtained for an architecture with 30 nodes per layer. A larger NN does not yield a further reduced energy RMSE, while the RMSE of the forces increases slightly. The RMSEs of the energies of this NN are 0.032 meV/atom and 0.038 meV/atom for the training and the test set, respectively. The total energy of the six-atom methanol molecule is thus reproduced with an RMSE of about 0.2 meV (6 × 0.032 meV ≈ 0.19 meV for the training set and 6 × 0.038 meV ≈ 0.23 meV for the test set), which is only a small fraction of the 480 meV range of the total energies in the training set.
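For reference, error measures of this kind can be computed as in the following sketch. The exact normalization used in this work is not spelled out here, so the code assumes a per-atom energy RMSE for a molecule of fixed size and a force RMSE taken over all Cartesian force components; the function names and the toy numbers are illustrative.

```python
import numpy as np

def energy_rmse_per_atom(e_ref, e_nn, n_atoms):
    """RMSE of the total energies, normalized per atom."""
    diff = (np.asarray(e_nn, dtype=float) - np.asarray(e_ref, dtype=float)) / n_atoms
    return float(np.sqrt(np.mean(diff ** 2)))

def force_rmse(f_ref, f_nn):
    """RMSE over all Cartesian force components of all atoms and structures."""
    diff = np.asarray(f_nn, dtype=float) - np.asarray(f_ref, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy total energies in meV for two six-atom structures (hypothetical values).
e_ref = [0.0, 120.0]
e_nn = [0.3, 119.7]
per_atom = energy_rmse_per_atom(e_ref, e_nn, n_atoms=6)
print(per_atom, 6 * per_atom)  # per-atom RMSE and the corresponding total-energy RMSE
```

Because every structure contains the same six atoms, the RMSE of the total energy is simply six times the per-atom value.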
In Fig. 10.1 the NN energies of the training and test points are plotted against the reference force field energies. All points are very close to the diagonal with a slope of one (45°), indicating an almost perfect correlation. The training set and test set forces have RMSEs of 2.51 meV/Bohr and 2.57 meV/Bohr, respectively. Although the forces have not been included in the fitting process, the agreement of the NN predictions and the MM3 forces is excellent. In summary, the NN potential is able to provide an extremely accurate fit of the energies, and the low test set RMSEs as well as the low errors of the forces indicate that a smooth and reliable NN PES has been obtained.

[Figure 10.1: correlation plot of the NN energies (meV/atom) versus the MM3 energies (meV/atom), both from 0 to 100 meV/atom; symbols: training points, test points.]
Figure 10.1 Comparison of the MM3 and neural network (NN) energies for the methanol molecule. All training and test points are very close to the diagonal with a slope of one (45°), corresponding to a very good fit.
In order to test the applicability of the NN PES to MD simulations, the dihedral potential for a rotation about the C-O bond obtained using the NN has been compared with the MM3 force field reference data. The energies of the MM3 force field and of the NN approach shown in Fig. 10.2 are basically indistinguishable.
The most stringent test is the comparison of the energetics along the configurations visited in molecular dynamics simulations. In Fig. 10.3 one trajectory obtained employing the NVE ensemble with an average temperature of about 350 K is shown. The forces propagating the simulation have been calculated using the atom-based NN, and the energies of the configurations have been recalculated using the MM3 force field. The agreement between the reference force field and the NN PES is excellent.
[Figure 10.2: plot of the potential energy (eV) versus the dihedral angle (0° to 360°); curves: FF MM3, NN.]
Figure 10.2 Dihedral potential of methanol for a rotation about the C-O bond. The MM3 force field reference data are shown to assess the quality of the NN potential, but these points have not been included in the training set. The initial structure of methanol has been optimized, which results in two different lengths of C-H bonds in the methyl group. Since the internal structure of the methyl group and the length of the O-H bond have been frozen for this plot, the two types of C-H bonds result in different heights of the maxima in the dihedral potential.
Note that in some cases the NN potentials may be further improved by using an alternative structural description in the form of atom pairs instead of atomic contributions, which directly reflect the atomic interactions and take the chemical environments into account. However, this approach will not be discussed in detail in this work, and the interested reader is referred to Ref. [113], in which an implementation of the NN method based on atom pairs has been reported.
[Figure 10.3: plot of the potential energy (eV) versus the structure number (0 to 80) along the trajectory; curves: FF MM3, NN.]
Figure 10.3 Comparison of the MM3 and neural network (NN) energies along a molecular dynamics (MD) trajectory of the methanol molecule in the NVE statistical ensemble with an average temperature of about 350 K and a time step of 1 fs. The MD has been carried out employing the NN potential, and the energies have been recalculated using the MM3 force field.
Part V
Summary and Outlook
11 Summary and Outlook
The ultimate goal of this thesis was to develop an accurate and efficient atomistic potential to study large Cu clusters on ZnO surfaces, a system that is an important catalyst for methanol synthesis. Accurate electronic structure methods like density-functional theory (DFT) are computationally too demanding to simulate the Cu/ZnO system under realistic conditions, which was the major motivation for this project. On the other hand, efficient interatomic potentials based on physically motivated functional forms are not able to give a reliable and sufficiently accurate description of such systems. In order to overcome these limitations, the high-dimensional neural network (NN) method has been used to construct efficient potentials based on reference data obtained from DFT.
Prior to this thesis, the NN method had been applied neither to metallic systems nor to compounds containing more than a single chemical species. The previous use of the methodology had been limited to periodic structures of covalent insulators.
In Chapter 7 a potential for Cu clusters, bulk structures and surface slabs has been constructed. The NN and DFT energies for the reference data set have been shown to differ by only 4 meV/atom on average, even for challenging benchmark structures. Bulk properties obtained with the NN, such as cohesive energies, equilibrium lattice constants, and bulk moduli, have been found to be in excellent agreement with the DFT values for various crystal structures (fcc, bcc, sc and hcp). The reliable description of Cu surfaces has been demonstrated by a comparison of surface energies, vacancy formation energies, and energy profiles of copper adatoms diffusing along paths on different copper surfaces. Using a realistic, non-ideal slab model for a Cu(111) surface containing various defects like vacancies, adatoms, kinks and steps, it has been shown that the atomic forces acting on representative atoms can be predicted with very high precision by the NN potential. This demonstrates that NN potentials are capable of describing metallic materials and can be confidently used to simulate large realistic systems with a quality that is very close to the DFT reference.
The next step was the construction of a potential for the multicomponent ZnO system in Chapter 8 that is applicable to bulk structures, clusters, and surface slabs. In order to achieve the desired accuracy for this system, the long-range electrostatic energy resulting from charge transfer has to be properly described. For this purpose, the atomic NN method has been extended by a second set of high-dimensional NNs representing environment-dependent point charges. The predicted NN charges then allow the electrostatic interactions to be computed. We have demonstrated that this approach makes it possible to accurately reproduce the cohesive energies, the lattice parameters, and the bulk moduli of arbitrary ZnO structures using the NN potential. Additionally, the agreement of the NN atomic forces and charges with their DFT references is excellent. However, a careful analysis of the long-range interactions in ZnO has shown that electrostatic interactions due to charge transfer may play a smaller role for ZnO than assumed. An NN potential that was constructed without the electrostatic extension proved to be only marginally less accurate than the extended potential. Nevertheless, we are confident that the extended NN method is useful for systems in which the charge transfer varies more strongly.
Finally, a potential for the ternary Cu/ZnO system has been successfully constructed in Chapter 9. To confirm the reliability of the potential, properties predicted by the NN have been compared with DFT reference data. Even though the ternary potential has been fitted to a much larger configuration space than the previous NNs, we have demonstrated that the Cu and ZnO subsystems can be described by the combined potential without significant loss of accuracy. This observation shows that NN potentials can be constructed in a straightforward way, reusing the reference data sets of previously constructed NN potentials. Additionally, the accuracy of the ternary potential for Cu/ZnO interface structures has been shown to be very high.
To demonstrate the general capability of the NN method to describe molecular systems, a highly accurate potential for the methanol molecule has been presented in Chapter 10, trained to energies obtained with the MM3 force field.
In summary, this work has shown that high-dimensional NN potentials can be constructed for a variety of complex systems including metals and insulators, isolated clusters, bulk structures, large surfaces and molecules. Due to the flexible form of the NN, a large number of training structures is required. However, the construction of the training set can be done in a very systematic and unbiased way, as explained in Section 6.2. It has been shown that reference energies, forces, and charges can be reproduced very accurately not only for structures included in the training set but also for the independent test set. The analysis of many different properties derived from the PESs has shown that NN potentials are a reliable alternative to computationally demanding ab initio methods like DFT. Once the construction is done, NN potentials can be applied to simulate large systems that are not accessible by DFT.
Preliminary MD simulations for a slab model of a Cu(111) cluster at a ZnO(10$\bar{1}$0) surface containing more than 7,500 atoms could be performed in a few seconds per time step on a regular eight-core workstation. In contrast to electronic structure methods, NN potentials scale linearly with the system size and are well suited for massively parallel simulations.
Continuing work employing the constructed NN potential for Cu/ZnO is already in progress. In order to further improve the potential, additional structures may be included in the training set. The choice of the regions of the PES that need to be refined strongly depends on which properties are to be studied. Additionally, it has yet to be explored whether the electrostatic extension would enhance the accuracy of the Cu/ZnO potential.
Based on a refined NN potential, a number of important applications can be realized. The NN potential can be employed in molecular dynamics simulations to study the structural and dynamical properties of copper clusters at ZnO surfaces. Of particular interest are the structure and stabilization mechanisms of different surfaces and defects at these surfaces, the structures and preferred adsorption sites of Cu clusters at ZnO surfaces, and possibly also some aspects of the growth mechanism of the clusters.
Further, the structure of the Cu/ZnO interface and diffusion processes at this interface, which might give rise to CuZn alloying, can be studied.
Including further elements, such as hydrogen, in the Cu/ZnO potential (in the form of, e.g., hydrogen molecules or water molecules) will enable the study of Cu cluster shapes depending on the gaseous environment. Ultimately, this could help to gain a better understanding of the Cu/ZnO catalyst.
Part VI
Appendix
A Symmetry Function Parameters
A.1 Copper potential
Table A.1 Parameters of the radial symmetry functions used to describe the local atomic environments for copper. The parameters refer to the definition in Eqn. (5.9) in Chapter 5.

Symmetry functions of type G2
No.   η (Bohr⁻²)   Rshift (Bohr)   Rc (Bohr)
1     0.001        0.000           11.338
2     0.010        0.000           11.338
3     0.020        0.000           11.338
4     0.035        0.000           11.338
5     0.060        0.000           11.338
6     0.100        0.000           11.338
7     0.200        0.000           11.338
8     0.400        0.000           11.338
Table A.2 Parameters of the angular symmetry functions used to describe the local atomic environments for copper. The parameters refer to the definition in Eqn. (5.10) in Chapter 5.

Symmetry functions of type G4
No.   η (Bohr⁻²)   λ      ζ      Rc (Bohr)
9     0.0001       -1.0    1.0   11.338
10    0.0001        1.0    1.0   11.338
11    0.0001       -1.0    2.0   11.338
12    0.0001        1.0    2.0   11.338
13    0.0030       -1.0    1.0   11.338
14    0.0030        1.0    1.0   11.338
15    0.0030       -1.0    2.0   11.338
16    0.0030        1.0    2.0   11.338
17    0.0080       -1.0    1.0   11.338
18    0.0080        1.0    1.0   11.338
19    0.0080       -1.0    2.0   11.338
20    0.0080        1.0    2.0   11.338
21    0.0150       -1.0    1.0   11.338
22    0.0150        1.0    1.0   11.338
23    0.0150       -1.0    2.0   11.338
24    0.0150        1.0    2.0   11.338
25    0.0150       -1.0    4.0   11.338
26    0.0150        1.0    4.0   11.338
27    0.0150       -1.0   16.0   11.338
28    0.0150        1.0   16.0   11.338
29    0.0250       -1.0    1.0   11.338
30    0.0250        1.0    1.0   11.338
31    0.0250       -1.0    2.0   11.338
32    0.0250        1.0    2.0   11.338
33    0.0250       -1.0    4.0   11.338
34    0.0250        1.0    4.0   11.338
35    0.0250       -1.0   16.0   11.338
36    0.0250        1.0   16.0   11.338
37    0.0450       -1.0    1.0   11.338
38    0.0450        1.0    1.0   11.338
39    0.0450       -1.0    2.0   11.338
40    0.0450        1.0    2.0   11.338
41    0.0450       -1.0    4.0   11.338
42    0.0450        1.0    4.0   11.338
43    0.0450       -1.0   16.0   11.338
44    0.0450        1.0   16.0   11.338
45    0.0800       -1.0    1.0   11.338
46    0.0800        1.0    1.0   11.338
47    0.0800       -1.0    2.0   11.338
48    0.0800        1.0    2.0   11.338
49    0.0800       -1.0    4.0   11.338
50    0.0800        1.0    4.0   11.338
51    0.0800        1.0   16.0   11.338
A.2 Zinc oxide potential
Table A.3 Parameters of the radial symmetry functions used to describe the local atomic environments around oxygen atoms in zinc oxide.

Symmetry functions of type G2
No.   Neighboring element   η (Bohr⁻²)   Rc (Bohr)
1     O                     0.001        11.338
2     Zn                    0.001        11.338
3     O                     0.010        11.338
4     Zn                    0.010        11.338
5     O                     0.035        11.338
6     Zn                    0.035        11.338
7     O                     0.060        11.338
8     Zn                    0.060        11.338
9     O                     0.100        11.338
10    Zn                    0.100        11.338
11    O                     0.200        11.338
12    Zn                    0.200        11.338
13    O                     0.400        11.338
14    Zn                    0.400        11.338
Table A.4 Parameters of the radial symmetry functions used to describe the local atomic environments around zinc atoms in zinc oxide.

Symmetry functions of type G2
No.   Neighboring element   η (Bohr⁻²)   Rc (Bohr)
1     O                     0.001        11.338
2     Zn                    0.001        11.338
3     O                     0.010        11.338
4     Zn                    0.010        11.338
5     O                     0.035        11.338
6     Zn                    0.035        11.338
7     O                     0.060        11.338
8     Zn                    0.060        11.338
9     O                     0.100        11.338
10    Zn                    0.100        11.338
11    O                     0.200        11.338
12    Zn                    0.200        11.338
13    O                     0.400        11.338
Table A.5 Parameters of the angular symmetry functions used to describe the local atomic environments around oxygen atoms in zinc oxide. For each set of parameters there are up to three functions referring to the possible combinations of elements in the neighboring atom pairs.

Symmetry functions of type G4
No.       η (Bohr⁻²)   λ      ζ      Rc (Bohr)
15-17     0.000        -1.0    1.0   11.338
18-20     0.000         1.0    1.0   11.338
21-23     0.000        -1.0    2.0   11.338
24-26     0.000         1.0    2.0   11.338
27-29     0.003        -1.0    1.0   11.338
30-32     0.003         1.0    1.0   11.338
33-35     0.003        -1.0    2.0   11.338
36-38     0.003         1.0    2.0   11.338
39-41     0.008        -1.0    1.0   11.338
42-44     0.008         1.0    1.0   11.338
45-47     0.008        -1.0    2.0   11.338
48-50     0.008         1.0    2.0   11.338
51-53     0.015        -1.0    1.0   11.338
54-56     0.015         1.0    1.0   11.338
57-59     0.015        -1.0    2.0   11.338
60-62     0.015         1.0    2.0   11.338
63-65     0.015        -1.0    4.0   11.338
66-68     0.015         1.0    4.0   11.338
69-71     0.015        -1.0   16.0   11.338
72-74     0.015         1.0   16.0   11.338
75-77     0.025        -1.0    1.0   11.338
78-80     0.025         1.0    1.0   11.338
81-83     0.025        -1.0    2.0   11.338
84-86     0.025         1.0    2.0   11.338
87-89     0.025        -1.0    4.0   11.338
90-92     0.025         1.0    4.0   11.338
93-95     0.025        -1.0   16.0   11.338
96-98     0.025         1.0   16.0   11.338
99-101    0.045        -1.0    1.0   11.338
102-104   0.045         1.0    1.0   11.338
105-107   0.045        -1.0    2.0   11.338
108-110   0.045         1.0    2.0   11.338
111-113   0.045        -1.0    4.0   11.338
114-116   0.045         1.0    4.0   11.338
117-119   0.045        -1.0   16.0   11.338
120-122   0.045         1.0   16.0   11.338
123-125   0.080        -1.0    1.0   11.338
126-128   0.080         1.0    1.0   11.338
129-131   0.080        -1.0    2.0   11.338
132-134   0.080         1.0    2.0   11.338
135-137   0.080        -1.0    4.0   11.338
138-140   0.080         1.0    4.0   11.338
141-142   0.080         1.0   16.0   11.338
Table A.6 Parameters of the angular symmetry functions used to describe the local atomic environments around zinc atoms in zinc oxide. For each set of parameters there are up to three functions referring to the possible combinations of elements in the neighboring atom pairs.

Symmetry functions of type G4
No.       η (Bohr⁻²)   λ      ζ      Rc (Bohr)
14-16     0.000        -1.0    1.0   11.338
17-19     0.000         1.0    1.0   11.338
20-22     0.000        -1.0    2.0   11.338
23-25     0.000         1.0    2.0   11.338
26-28     0.003        -1.0    1.0   11.338
29-31     0.003         1.0    1.0   11.338
32-34     0.003        -1.0    2.0   11.338
35-37     0.003         1.0    2.0   11.338
38-39     0.008        -1.0    1.0   11.338
40-42     0.008         1.0    1.0   11.338
43-44     0.008        -1.0    2.0   11.338
45-46     0.008         1.0    2.0   11.338
47-49     0.015        -1.0    1.0   11.338
50-52     0.015         1.0    1.0   11.338
53-55     0.015        -1.0    2.0   11.338
56-58     0.015         1.0    2.0   11.338
59-61     0.015        -1.0    4.0   11.338
62-64     0.015         1.0    4.0   11.338
65-67     0.015        -1.0   16.0   11.338
68-70     0.015         1.0   16.0   11.338
71-73     0.025        -1.0    1.0   11.338
74-76     0.025         1.0    1.0   11.338
77-79     0.025        -1.0    2.0   11.338
80-82     0.025         1.0    2.0   11.338
83-85     0.025        -1.0    4.0   11.338
86-88     0.025         1.0    4.0   11.338
89-91     0.025        -1.0   16.0   11.338
92-94     0.025         1.0   16.0   11.338
95-97     0.045        -1.0    1.0   11.338
98-100    0.045         1.0    1.0   11.338
101-103   0.045        -1.0    2.0   11.338
104-106   0.045         1.0    2.0   11.338
107-108   0.045        -1.0    4.0   11.338
109-111   0.045         1.0    4.0   11.338
112       0.045        -1.0   16.0   11.338
113-115   0.045         1.0   16.0   11.338
116-117   0.080        -1.0    1.0   11.338
118-120   0.080         1.0    1.0   11.338
121-122   0.080        -1.0    2.0   11.338
123-124   0.080         1.0    2.0   11.338
125       0.080        -1.0    4.0   11.338
126-127   0.080         1.0    4.0   11.338
128-129   0.080         1.0   16.0   11.338
A.3 Copper/zinc oxide potential
Table A.7 Parameters of the radial symmetry functions used to describe the local atomic environments (copper/zinc oxide system).

Symmetry functions of type G2
No.   Neighboring element   η (Bohr⁻²)   Rc (Bohr)
1     Cu                    0.0009       12.0
2     Zn                    0.0009       12.0
3     O                     0.0009       12.0
4     Cu                    0.010        12.0
5     Zn                    0.010        12.0
6     O                     0.010        12.0
7     Cu                    0.020        12.0
8     Zn                    0.020        12.0
9     O                     0.020        12.0
10    Cu                    0.035        12.0
11    Zn                    0.035        12.0
12    O                     0.035        12.0
13    Cu                    0.060        12.0
14    Zn                    0.060        12.0
15    O                     0.060        12.0
16    Cu                    0.100        12.0
17    Zn                    0.100        12.0
18    O                     0.100        12.0
19    Cu                    0.200        12.0
20    Zn                    0.200        12.0
21    O                     0.200        12.0
22    Cu                    0.400        12.0
23    Zn                    0.400        12.0
24    O                     0.400        12.0
Table A.8 Parameters of the angular symmetry functions used to describe the local atomic environments. For each set of parameters there are six functions referring to the six possible combinations of elements in the neighboring atom pairs (copper/zinc oxide system).

Symmetry functions of type G4
No.       η (Bohr⁻²)   λ      ζ      Rc (Bohr)
25-30     0.0001       -1.0    1.0   12.0
31-36     0.0001        1.0    1.0   12.0
37-42     0.0001       -1.0    2.0   12.0
43-48     0.0001        1.0    2.0   12.0
49-54     0.003        -1.0    1.0   12.0
55-60     0.003         1.0    1.0   12.0
61-66     0.003        -1.0    2.0   12.0
67-72     0.003         1.0    2.0   12.0
73-78     0.008         1.0    1.0   12.0
79-84     0.008         1.0    2.0   12.0
85-90     0.015         1.0    1.0   12.0
91-96     0.015         1.0    2.0   12.0
97-102    0.015         1.0    4.0   12.0
103-108   0.015         1.0   16.0   12.0
109-114   0.025         1.0    1.0   12.0
115-120   0.025         1.0    2.0   12.0
121-126   0.025         1.0    4.0   12.0
127-132   0.025         1.0   16.0   12.0
133-138   0.045         1.0    1.0   12.0
139-144   0.045         1.0    2.0   12.0
145-150   0.045         1.0    4.0   12.0
151-156   0.045         1.0   16.0   12.0
B Calculation of Elastic Constants of Cubic Lattices
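The elasticity (or stiffness) tensor $\overleftrightarrow{C}$ relates the strain $\overleftrightarrow{\varepsilon}$ acting on a material to the resulting stress $\overleftrightarrow{\sigma}$ via Hooke's law

$$\overleftrightarrow{\sigma} = \overleftrightarrow{C}\,\overleftrightarrow{\varepsilon} . \tag{B.1}$$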
In general, the tensor $\overleftrightarrow{C}$ has 21 independent components, the elasticity constants. However, for cubic lattices this number reduces, due to the high symmetry, to only three independent constants, $C_{11}$, $C_{12}$, and $C_{44}$, and the elasticity tensor is given by

$$\overleftrightarrow{C} =
\begin{pmatrix}
C_{11} & C_{12} & C_{12} & 0 & 0 & 0 \\
C_{12} & C_{11} & C_{12} & 0 & 0 & 0 \\
C_{12} & C_{12} & C_{11} & 0 & 0 & 0 \\
0 & 0 & 0 & C_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & C_{44} & 0 \\
0 & 0 & 0 & 0 & 0 & C_{44}
\end{pmatrix} . \tag{B.2}$$
Note that the bulk modulus $B$ can be related to the elasticity constants by

$$B = \frac{1}{3}\left(C_{11} + 2C_{12}\right) . \tag{B.3}$$
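For the calculation of the three elastic constants for copper we closely follow the method suggested by Mehl et al. [114], which we will briefly outline in the following. The relaxed lattice shall be described by the vectors $\vec{a}_1$, $\vec{a}_2$ and $\vec{a}_3$. We define the strain tensor $\overleftrightarrow{\varepsilon}$ such that the deformation of the lattice under strain is described by

$$\begin{pmatrix} \vec{a}_1' \\ \vec{a}_2' \\ \vec{a}_3' \end{pmatrix}
= \begin{pmatrix} \vec{a}_1 \\ \vec{a}_2 \\ \vec{a}_3 \end{pmatrix}
\left( \overleftrightarrow{I} + \overleftrightarrow{\varepsilon} \right) , \tag{B.4}$$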
where $\overleftrightarrow{I}$ is the 3 × 3 identity matrix. The strain can then be represented by a symmetric tensor with six independent components

$$\overleftrightarrow{\varepsilon} =
\begin{pmatrix}
e_1 & e_6/2 & e_5/2 \\
e_6/2 & e_2 & e_4/2 \\
e_5/2 & e_4/2 & e_3
\end{pmatrix} . \tag{B.5}$$
As a direct result of Hooke's law (B.1), the total energy changes under strain according to

$$\Delta E(e_i) = -p(V)\,\Delta V + V \sum_{i=1}^{6} \sum_{j=1}^{6} \frac{1}{2}\, C_{ij}\, e_i e_j + O(e_i^3) , \tag{B.6}$$
where $V$ and $p$ are the unit cell volume and pressure of the undistorted lattice. The calculation of the elastic constants using expression (B.6) simplifies for the case of volume-conserving strain, i.e. $\Delta V = 0$. For cubic lattices the volume-conserving orthorhombic strain with

$$e_1 = -e_2 = x, \qquad e_3 = \frac{x^2}{1 - x^2}, \qquad e_4 = e_5 = e_6 = 0 \tag{B.7}$$
results in a symmetric change of the total energy as

$$\Delta E(x) = V \left(C_{11} - C_{12}\right) x^2 + O(x^4) \tag{B.8}$$
and allows for simple access to the difference $C_{11} - C_{12}$. The volume-conserving strain

$$e_6 = x, \qquad e_3 = \frac{x^2}{4 - x^2}, \qquad e_1 = e_2 = e_4 = e_5 = 0 \tag{B.9}$$
on the other hand, yields a change of energy as a function of the strain $x$ as

$$\Delta E(x) = \frac{1}{2}\, V C_{44}\, x^2 + O(x^4) , \tag{B.10}$$
which only depends on the elastic modulus C44 .
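The two strains (B.7) and (B.9) can be checked numerically. The sketch below, under stated assumptions, verifies that both conserve the volume, i.e. that $\det(\overleftrightarrow{I} + \overleftrightarrow{\varepsilon}) = 1$, and that inserting them into the second-order term of Eq. (B.6) reproduces the leading coefficients of Eqs. (B.8) and (B.10). The function names (including the label `shear_strain` for the strain (B.9)) and the placeholder constants are assumptions introduced for this example.

```python
import numpy as np


def strain_tensor(e):
    """Symmetric 3x3 strain tensor of Eq. (B.5)."""
    e1, e2, e3, e4, e5, e6 = e
    return np.array([[e1, e6 / 2, e5 / 2],
                     [e6 / 2, e2, e4 / 2],
                     [e5 / 2, e4 / 2, e3]])


def cubic_elasticity_tensor(c11, c12, c44):
    """6x6 matrix of Eq. (B.2)."""
    C = np.zeros((6, 6))
    C[:3, :3] = c12
    idx = np.arange(3)
    C[idx, idx] = c11
    C[idx + 3, idx + 3] = c44
    return C


def orthorhombic_strain(x):
    """Strain components (e1, ..., e6) of Eq. (B.7)."""
    return np.array([x, -x, x**2 / (1.0 - x**2), 0.0, 0.0, 0.0])


def shear_strain(x):
    """Strain components (e1, ..., e6) of Eq. (B.9)."""
    return np.array([0.0, 0.0, x**2 / (4.0 - x**2), 0.0, 0.0, x])


def volume_ratio(e):
    """V'/V = det(I + eps), following Eq. (B.4)."""
    return np.linalg.det(np.eye(3) + strain_tensor(e))


def harmonic_energy_density(C, e):
    """Second-order term of Eq. (B.6) divided by V: (1/2) sum_ij C_ij e_i e_j."""
    return 0.5 * e @ C @ e


# Placeholder constants, for illustration only.
c11, c12, c44 = 170.0, 120.0, 75.0
C = cubic_elasticity_tensor(c11, c12, c44)
x = 1.0e-3

ratio_ortho = volume_ratio(orthorhombic_strain(x))
ratio_shear = volume_ratio(shear_strain(x))

# For small x these approach C11 - C12 and C44 / 2, respectively.
coeff_ortho = harmonic_energy_density(C, orthorhombic_strain(x)) / x**2
coeff_shear = harmonic_energy_density(C, shear_strain(x)) / x**2
```

The small $e_3$ component in both strains only serves to keep the volume fixed; it enters the energy at order $x^4$ and therefore does not affect the quadratic coefficients.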
If the bulk modulus $B$ is known, the three independent elastic constants $C_{11}$, $C_{12}$ and $C_{44}$ can thus be calculated using the relations (B.3), (B.8) and (B.10).
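The last step can be sketched as follows, assuming that energy changes per unit volume have been tabulated on a grid of strains $x$ for both deformations. The least-squares fit in even powers of $x$, the function names, and the synthetic input data (generated from placeholder constants with a harmonic model) are assumptions for illustration; they do not reproduce the actual fitting procedure or data used in this work.

```python
import numpy as np


def fit_quadratic_coefficient(x, energy_density):
    """Least-squares fit of dE/V = a*x^2 + b*x^4; returns the coefficient a."""
    A = np.column_stack([x**2, x**4])
    coeffs, *_ = np.linalg.lstsq(A, energy_density, rcond=None)
    return coeffs[0]


def cubic_constants(B, a_ortho, a_shear):
    """Combine Eqs. (B.3), (B.8) and (B.10).

    a_ortho = C11 - C12   (quadratic coefficient of Eq. (B.8), per volume)
    a_shear = C44 / 2     (quadratic coefficient of Eq. (B.10), per volume)
    """
    c11 = B + 2.0 * a_ortho / 3.0
    c12 = B - a_ortho / 3.0
    c44 = 2.0 * a_shear
    return c11, c12, c44


# Synthetic input generated from placeholder constants (illustration only).
c11_ref, c12_ref, c44_ref = 170.0, 120.0, 75.0
B = (c11_ref + 2.0 * c12_ref) / 3.0
x = np.linspace(-0.03, 0.03, 13)
e3_ortho = x**2 / (1.0 - x**2)
e3_shear = x**2 / (4.0 - x**2)
dE_ortho = (c11_ref - c12_ref) * x**2 + 0.5 * c11_ref * e3_ortho**2
dE_shear = 0.5 * c44_ref * x**2 + 0.5 * c11_ref * e3_shear**2

a_ortho = fit_quadratic_coefficient(x, dE_ortho)
a_shear = fit_quadratic_coefficient(x, dE_shear)
c11, c12, c44 = cubic_constants(B, a_ortho, a_shear)
```

Including the $x^4$ term in the fit keeps the higher-order contributions from contaminating the quadratic coefficient; with real data the strain range and polynomial order would have to be tested for convergence.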
Bibliography
[1] G. A. Olah, “Beyond oil and gas: The methanol economy”, Angew. Chem. Int. Ed., 44,
2636 – 2639, 2005.
[2] G. A. Olah, A. Goeppert, and G. K. S. Prakash, “Chemical recycling of carbon dioxide to
methanol and dimethyl ether: From greenhouse gas to renewable, environmentally
carbon neutral fuels and synthetic hydrocarbons”, J. Org. Chem., 74, 487 – 498,
2009.
[3] M. Behrens, F. Studt, I. Kasatkin, S. Kühl, M. Hävecker, F. Abild-Pedersen, S. Zander,
F. Girgsdies, P. Kurr, B. L. Kniep, M. Tovar, R. W. Fischer, J. K. Nørskov, and
R. Schlögl, “The active site of methanol synthesis over Cu/ZnO/Al2 O3 industrial
catalysts”, Science, 336, 893 – 897, 2012.
[4] M. Escudero-Escribano, A. Verdaguer-Casadevall, P. Malacrida, U. Grønbjerg, B. P.
Knudsen, A. K. Jepsen, J. Rossmeisl, I. E. L. Stephens, and I. Chorkendorff, “Pt5Gd
as a highly active and stable catalyst for oxygen electroreduction”, J. Am. Chem.
Soc., 134, 16476 – 16479, 2012.
[5] B. Lim, M. Jiang, P. H. C. Camargo, E. C. Cho, J. Tao, X. Lu, Y. Zhu, and Y. Xia, “Pd-Pt
bimetallic nanodendrites with high activity for oxygen reduction”, Science, 324,
1302 – 1305, 2009.
[6] M. Appl, Ammonia, 1. Introduction, 3, 107 – 137. Wiley-VCH Verlag GmbH & Co.
KGaA, 2000.
[7] SFB 558, http://www.sfb558.de/, 2012.
[8] J. D. Grunwaldt, A. M. Molenbroek, N. Y. Topsøe, H. Topsøe, and B. S. Clausen, “In situ
investigations of structural changes in Cu/ZnO catalysts”, J. Catal., 194, 452 – 460,
2000.
[9] K. H. Ernst, A. Ludviksson, R. Zhang, J. Yoshihara, and C. T. Campbell, “Growth-model
for metal-films on oxide surfaces: Cu on ZnO(0001)-O”, Phys. Rev. B, 47, 13782 –
13796, 1993.
[10] I. Kasatkin, P. Kurr, B. Kniep, A. Trunschke, and R. Schlögl, “Role of lattice strain and
defects in copper particles on the activity of Cu/ZnO/Al2 O3 catalysts for methanol
synthesis”, Angew. Chem. Int. Ed., 46, 7324 – 7327, 2007.
[11] N. Y. Topsøe and H. Topsøe, “On the nature of surface structural changes in Cu/ZnO
methanol synthesis catalysts”, Topics In Catalysis, 8, 267 – 270, 1999.
[12] J. B. Wagner, P. L. Hansen, A. M. Molenbroek, H. Topsøe, B. S. Clausen, and S. Helveg,
“In situ electron energy loss spectroscopy studies of gas-dependent metal-support
interactions in Cu/ZnO catalysts”, J. Phys. Chem. B, 107, 7753 – 7758, 2003.
[13] H. S. Qiu, F. Gallino, C. Di Valentin, and Y. M. Wang, “Shallow donor states induced by
in-diffused Cu in ZnO: A combined HREELS and hybrid DFT study”, Phys. Rev.
Lett., 106, 066401-1 – 066401-4, 2011.
[14] M. Kroll, T. Löber, V. Schott, C. Wöll, and U. Köhler, “Thermal behavior of MOCVD-grown Cu-clusters on ZnO(10$\bar{1}$0)”, Phys. Chem. Chem. Phys., 14, 1654 – 1659,
2012.
[15] K. Ozawa, Y. Oba, and K. Edamoto, “Oxidation of copper clusters on ZnO(10$\bar{1}$0): Effect
of temperature and preadsorbed water”, Surf. Sci., 601, 3125 – 3132, 2007.
[16] P. L. Hansen, J. B. Wagner, S. Helveg, J. R. Rostrup-Nielsen, B. S. Clausen, and H. Topsøe,
“Atom-resolved imaging of dynamic shape changes in supported copper nanocrystals”,
Science, 295, 2053 – 2055, 2002.
[17] M. Kroll and U. Köhler, Private communication, 2012.
[18] J. Kiss, J. Frenzel, N. N. Nair, B. Meyer, and D. Marx, “Methanol synthesis on ZnO(0001).
III. Free energy landscapes, reaction pathways, and mechanistic insights”, J. Chem.
Phys., 134, 064710-1 – 064710-14, 2011.
[19] J. Kossmann, G. Rossmüller, and C. Hättig, “Prediction of vibrational frequencies of
possible intermediates and side products of the methanol synthesis on ZnO(0001) by
ab initio calculations”, J. Chem. Phys., 136, 034706-1 – 034706-12, 2012.
[20] X. Duan, O. Warschkow, A. Soon, B. Delley, and C. Stampfl, “Density functional study of
oxygen on Cu(100) and Cu(110) surfaces”, Phys. Rev. B, 81, 075430-1 – 075430-15,
2010.
[21] A. Soon, M. Todorova, B. Delley, and C. Stampfl, “Oxygen adsorption and stability
of surface oxides on Cu(111): A first-principles investigation”, Phys. Rev. B, 73,
165424-1 – 165424-12, 2006.
[22] A. Soon, M. Todorova, B. Delley, and C. Stampfl, “Surface oxides of the oxygen-copper
system: Precursors to the bulk oxide phase?”, Surf. Sci., 601, 5809 – 5813, 2007.
[23] B. Meyer and D. Marx, “Density-functional study of the structure and stability of ZnO
surfaces”, Phys. Rev. B, 67, 035403-1 – 035403-11, 2003.
[24] R. Kovácik, B. Meyer, and D. Marx, “F centers versus dimer vacancies on ZnO surfaces:
Characterization by STM and STS calculations”, Angew. Chem. Int. Ed., 46, 4894 –
4897, 2007.
[25] M. Valtiner, M. Todorova, G. Grundmeier, and J. Neugebauer, “Temperature stabilized
surface reconstructions at polar ZnO(0001)”, Phys. Rev. Lett., 103, 065502-1 –
065502-4, 2009.
[26] O. Warschkow, K. Chuasiripattana, M. J. Lyle, B. Delley, and C. Stampfl, “Cu/ZnO(0001)
under oxidating and reducing conditions: A first-principles survey of surface structures”, Phys. Rev. B, 84, 125311-1 – 125311-25, 2011.
[27] B. Meyer and D. Marx, “Density-functional study of Cu atoms, monolayers, films, and
coadsorbates on polar ZnO surfaces”, Phys. Rev. B, 69, 235420-1 – 235420-7, 2004.
[28] G. C. Abell, “Empirical chemical pseudopotential theory of molecular and metallic
bonding”, Phys. Rev. B, 31, 6184 – 6196, 1985.
[29] J. Tersoff, “New empirical approach for the structure and energy of covalent systems”,
Phys. Rev. B, 37, 6991 – 7000, 1988.
[30] D. W. Brenner, “Empirical potential for hydrocarbons for use in simulating the chemical
vapor deposition of diamond films”, Phys. Rev. B, 42, 9458 – 9471, 1990.
[31] D. G. Pettifor and I. I. Oleinik, “Analytic bond-order potentials beyond Tersoff-Brenner.
I. Theory”, Phys. Rev. B, 59, 8487 – 8499, 1999.
[32] M. W. Finnis and J. E. Sinclair, “A simple empirical n-body potential for transition
metals”, Phil. Mag. A, 50, 45 – 55, 1984.
[33] M. S. Daw and M. I. Baskes, “Embedded-atom method: Derivation and application to
impurities, surfaces, and other defects in metals”, Phys. Rev. B, 29, 6443 – 6453,
1984.
[34] W. R. P. Scott, P. H. Hünenberger, I. G. Tironi, A. E. Mark, S. R. Billeter, J. Fennen, A. E.
Torda, T. Huber, P. Krüger, and W. F. van Gunsteren, “The GROMOS biomolecular
simulation program package”, J. Phys. Chem. A, 103, 3596 – 3607, 1999.
[35] M. T. Nelson, W. Humphrey, A. Gursoy, A. Dalke, L. V. Kalé, R. D. Skeel, and K. Schulten, “NAMD: a parallel, object-oriented molecular dynamics program”, Int. J. High
Perform. Comput., 10, 251 – 268, 1996.
[36] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C.
Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman, “A second generation force
field for the simulation of proteins, nucleic acids, and organic molecules”, J. Am.
Chem. Soc., 117, 5179 – 5197, 1995.
[37] D. Van Der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, and H. J. C. Berendsen,
“GROMACS: fast, flexible, and free”, J. Comput. Chem., 26, 1701 – 1718, 2005.
[38] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson,
D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman, “A second generation
force field for the simulation of proteins, nucleic acids, and organic molecules”, J.
Am. Chem. Soc., 117, 5179 – 5197, 1995.
[39] B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M.
Karplus, “A program for macromolecular energy, minimization, and dynamics calculations”, J. Comput. Chem., 4, 187 – 217, 1983.
[40] M. S. Daw and M. I. Baskes, “Semiempirical, quantum mechanical calculation of
hydrogen embrittlement in metals”, Phys. Rev. Lett., 50, 1285 – 1288, 1983.
[41] M. I. Baskes, “Modified embedded-atom potentials for cubic materials and impurities”,
Phys. Rev. B, 46, 2727 – 2742, 1992.
[42] S. Ryu, C. R. Weinberger, M. I. Baskes, and W. Cai, “Improved modified embedded-atom method potentials for gold and silicon”, Modelling Simul. Mater. Sci. Eng., 17,
075008-1 – 075008-14, 2009.
[43] A. C. T. van Duin, S. Dasgupta, F. Lorant, and W. A. Goddard, “ReaxFF: A reactive force
field for hydrocarbons”, J. Phys. Chem. A, 105, 9396 – 9409, 2001.
[44] D. Raymand, A. C. T. van Duin, M. Baudin, and K. Hermansson, “A reactive force field
(ReaxFF) for zinc oxide”, Surface Science, 602, 1020 – 1031, 2008.
[45] A. P. Bartók, Gaussian Approximation Potential: An Interatomic Potential Derived
from First Principles Quantum Mechanics. PhD thesis, University of Cambridge,
Cambridge, 2009.
[46] A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, “Gaussian approximation potentials:
The accuracy of Quantum Mechanics, without the electrons”, Phys. Rev. Lett., 104,
136403-1 – 136403-4, 2010.
[47] T. B. Blank, S. D. Brown, A. W. Calhoun, and D. J. Doren, “Neural-network models of
potential-energy surfaces”, J. Chem. Phys., 103, 4129 – 4137, 1995.
[48] S. Lorenz, A. Gross, and M. Scheffler, “Representing high-dimensional potential-energy
surfaces for reactions at surfaces by neural networks”, Chem. Phys. Lett., 395, 210 –
215, 2004.
[49] J. Hertz, A. Krogh, and R. G. Palmer, Introduction to the Theory of Neural Computation.
Addison-Wesley, Reading, 1996.
[50] C. M. Handley and P. L. A. Popelier, “Potential energy surfaces fitted by artificial neural
networks”, J. Phys. Chem. A, 114, 3371 – 3383, 2010.
[51] J. Behler, “Neural network potential-energy surfaces in chemistry: a tool for large-scale
simulations”, Phys. Chem. Chem. Phys., 13, 17930 – 17955, 2011.
[52] L. Raff, R. Komanduri, M. Hagan and S. Bukkapatnam, Neural Networks in Chemical
Reaction Dynamics. Oxford University Press, New York, 2012.
[53] J. Behler and M. Parrinello, “Generalized neural-network representation of high-dimensional potential-energy surfaces”, Phys. Rev. Lett., 98, 146401 – 146405,
2007.
[54] J. Behler, “Neural network potential-energy surfaces for atomistic simulations”, Chemical
Modelling Applications and Theory, 7, 1 – 41, The Royal Society of Chemistry, 2010.
[55] J. Behler, R. Martoňák, D. Donadio, and M. Parrinello, “Metadynamics simulations of
the high-pressure phases of silicon employing a high-dimensional neural network
potential”, Phys. Rev. Lett., 100, 185501-1 – 185501-4, 2008.
[56] J. Behler, R. Martoňák, D. Donadio, and M. Parrinello, “Pressure-induced phase transitions in silicon studied by neural network-based metadynamics simulations”, Phys.
Status Solidi B, 245, 2618 – 2629, 2008.
[57] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. G. Ren, K. Reuter, and M. Scheffler,
“Ab initio molecular simulations with numeric atom-centered orbitals”, Comput. Phys.
Commun., 180, 2175 – 2196, 2009.
[58] P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli,
G. L. Chiarotti, M. Cococcioni, I. Dabo, A. Dal Corso, S. de Gironcoli, S. Fabris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougoussis, A. Kokalj, M. Lazzeri,
L. Martin-Samos, N. Marzari, F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello,
L. Paulatto, C. Sbraccia, S. Scandolo, G. Sclauzero, A. P. Seitsonen, A. Smogunov,
P. Umari, and R. M. Wentzcovitch, “QUANTUM ESPRESSO: a modular and open-source
software project for quantum simulations of materials”, J. Phys.: Condens. Matter,
21, 395502-1 – 395502-19, 2009.
[59] E. Schrödinger, “Quantisierung als Eigenwertproblem”, Ann. Phys. (Berlin), 384, 361 –
376, 1926.
[60] M. Born and R. Oppenheimer, “Zur Quantentheorie der Molekeln”, Ann. Phys. (Berlin),
389, 457 – 484, 1927.
[61] R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules. International Series of Monographs on Chemistry, Oxford University Press, 1989.
[62] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas”, Phys. Rev., 136, B864 –
B871, 1964.
[63] W. Kohn and L. J. Sham, “Self-consistent equations including exchange and correlation
effects”, Phys. Rev., 140, A1133 – A1138, 1965.
[64] C. C. J. Roothaan, “New developments in molecular orbital theory”, Rev. Mod. Phys., 23,
69 – 89, 1951.
[65] G. G. Hall, “The molecular orbital theory of chemical valency. VIII. A method of calculating ionization potentials”, Proceedings of the Royal Society of London. Series A.
Mathematical and Physical Sciences, 205, 541 – 552, 1951.
[66] T. Auckenthaler, V. Blum, H.-J. Bungartz, T. Huckle, R. Johanni, L. Krämer, B. Lang,
H. Lederer, and P. R. Willems, “Parallel solution of partial symmetric eigenvalue
problems from electronic structure calculations”, Parallel Computing, 37, 783 – 794,
2011.
[67] L. Pauling, “The application of the quantum mechanics to the structure of the hydrogen
molecule and hydrogen molecule-ion and to related problems.”, Chem. Rev., 5, 173 –
213, 1928.
[68] J. E. Lennard-Jones, “The electronic structure of some diatomic molecules”, Trans.
Faraday Soc., 25, 668 – 686, 1929.
[69] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren, K. Reuter, and M. Scheffler, “Ab
initio molecular simulations with numeric atom-centered orbitals”, Comput. Phys.
Commun., 180, 2175 – 2196, 2009.
[70] J. Ortega, “First-principles methods for tight-binding molecular dynamics”, Comp. Mat.
Sci., 12, 192 – 209, 1998.
[71] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to
Applications. Academic Press, 2. edition, 2001.
[72] D. A. McQuarrie, Statistical Mechanics. Harper & Row. London, 1976.
[73] M. P. Allen and D. J. Tildesley, Computer Simulations of Liquids. Clarendon Press,
Oxford, 1987.
[74] W. C. Swope, H. C. Andersen, P. H. Berens, and K. R. Wilson, “A computer-simulation
method for the calculation of equilibrium-constants for the formation of physical
clusters of molecules - application to small water clusters”, J. Chem. Phys., 76, 637 –
649, 1982.
[75] D. Marx and J. Hutter, Ab Initio Molecular Dynamics. Cambridge University Press,
Cambridge, 2009.
[76] W. Koch and M. C. Holthausen, A Chemist’s Guide to Density Functional Theory.
Wiley-VCH Verlag GmbH, Weinheim, 2000.
[77] S. Lorenz, Reactions on Surfaces with Neural Networks. PhD Thesis, Technische Universität, Berlin, 2001.
[78] J. Behler, 2012. RuNNer – A Neural Network Code for High-Dimensional Potential-Energy Surfaces, Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum.
[79] A. Bholoa, S. Kenny, and R. Smith, “A new approach to potential fitting using neural
networks”, Nuclear Instruments and Methods in Physics Research Section B: Beam
Interactions with Materials and Atoms, 255, 1 – 7, 2007.
[80] E. Sanville, A. Bholoa, R. Smith, and S. D. Kenny, “Silicon potentials investigated using
density functional theory fitted neural networks”, J. Phys.: Condens. Matter, 20,
285219-1 – 285219-10, 2008.
[81] J. Behler, “Atom-centered symmetry functions for constructing high-dimensional neural
network potentials”, J. Chem. Phys., 134, 074106-1 – 074106-13, 2011.
[82] H. Eshet, R. Z. Khaliullin, T. D. Kühne, J. Behler, and M. Parrinello, “Ab initio quality
neural-network potential for sodium”, Phys. Rev. B, 81, 184107-1 – 184107-8, 2010.
[83] R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello, “Graphite-diamond
phase coexistence study employing a neural-network mapping of the ab initio potential
energy surface”, Phys. Rev. B, 81, 100103-1 – 100103-4, 2010.
[84] R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello, “Nucleation
mechanism for the direct graphite-to-diamond phase transition”, Nature Mater., 10,
693 – 697, 2011.
[85] N. Artrith and J. Behler, “High-dimensional neural network potentials for metal surfaces:
A prototype study for copper”, Phys. Rev. B, 85, 045439-1 – 045439-13, 2012.
[86] T. B. Blank and S. D. Brown, “Adaptive, global, extended Kalman filters for training
feedforward neural networks”, J. Chemometrics, 8, 391 – 407, 1994.
[87] J. B. Witkoskie and D. J. Doren, “Neural network models of potential energy surfaces:
Prototypical examples”, J. Chem. Theory Comput., 1, 14 – 23, 2005.
[88] S. Shah, F. Palmieri, and M. Datum, “Optimal filtering algorithms for fast learning in
feedforward neural networks”, Neural Networks, 5, 779 – 787, 1992.
[89] A. Pukrittayakamee, M. Malshe, M. Hagan, L. M. Raff, R. Narulkar, S. Bukkapatnam,
and R. Komanduri, “Simultaneous fitting of a potential-energy surface and its corresponding force fields using feedforward neural networks”, J. Chem. Phys., 130,
134101-1 – 134101-10, 2009.
[90] H. M. Le and L. M. Raff, “Molecular dynamics investigation of the bimolecular reaction
BeH + H2 → BeH2 + H on an ab initio potential-energy surface obtained using
neural network methods with both potential and gradient accuracy determination”,
J. Phys. Chem. A, 114, 45 – 53, 2010.
[91] R. S. Mulliken, “Electronic population analysis on LCAO-MO molecular
wave functions. I”, J. Chem. Phys., 23, 1833 – 1840, 1955.
[92] F. L. Hirshfeld, “Bonded-atom fragments for describing molecular charge-densities”,
Theor. Chem. Acc., 44, 129 – 138, 1977.
[93] R. Bader, Atoms in Molecules: A Quantum Theory. Oxford University Press, New York,
1990.
[94] C. M. Handley and P. L. A. Popelier, “Dynamically polarizable water potential based on
multipole moments trained by machine learning”, J. Chem. Theory Comput., 5, 1474
– 1489, 2009.
[95] N. Artrith, T. Morawietz, and J. Behler, “High-dimensional neural-network potentials for
multicomponent systems: Applications to zinc oxide”, Phys. Rev. B, 83, 153101-1 –
153101-4, 2011.
[96] T. Morawietz, V. Sharma, and J. Behler, “A neural network potential-energy surface for
the water dimer based on environment-dependent atomic energies and charges”, J.
Chem. Phys., 136, 064103-1 – 064103-11, 2012.
[97] J. W. Ponder. TINKER, Version 5.1, Software Tools for Molecular Design, Biochemistry
and Molecular Biophysics, Washington University, St. Louis, USA.
[98] J. P. Perdew, K. Burke, and M. Ernzerhof, “Generalized gradient approximation made
simple”, Phys. Rev. Lett., 77, 3865 – 3868, 1996.
[99] E. van Lenthe, E. J. Baerends, and J. G. Snijders, “Relativistic total-energy using regular
approximations”, J. Chem. Phys., 101, 9783 – 9792, 1994.
[100] J. Ischtwan and M. A. Collins, “Molecular-potential energy surfaces by interpolation”,
J. Chem. Phys., 100, 8080 – 8088, 1994.
[101] R. Dawes, D. L. Thompson, A. F. Wagner, and M. Minkoff, “Interpolating moving
least-squares methods for fitting potential energy surfaces: A strategy for efficient
automatic data point placement in high dimensions”, J. Chem. Phys., 128, 084107-1
– 084107-10, 2008.
[102] L. M. Raff, M. Malshe, M. Hagan, D. I. Doughan, M. G. Rockley, and R. Komanduri,
“Ab initio potential-energy surfaces for complex, multichannel systems using modified
novelty sampling and feedforward neural networks”, J. Chem. Phys., 122, 084104-1
– 084104-16, 2005.
[103] D. R. Lide, Handbook of Chemistry and Physics. CRC Press, Boca Raton, 90th ed.,
2009.
[104] G. Simons and H. Wang, Single Crystal Elastic Constants and Calculated Aggregate
Properties. MIT Press, Cambridge, MA, 1977.
[105] C. Noguera, “Polar oxide surfaces”, J. Phys.: Condens. Matter, 12, R367 – R410, 2000.
[106] A. Wander, F. Schedin, P. Steadman, A. Norris, R. McGrath, T. S. Turner, G. Thornton,
and N. M. Harrison, “Stability of polar oxide surfaces”, Phys. Rev. Lett., 86, 3811 –
3814, 2001.
[107] V. Staemmler, Theoretical Aspects of Transition Metal Catalysis, The Cluster Approach
for the Adsorption of Small Molecules on Oxide Surfaces, 219 – 256. Springer
Berlin/Heidelberg, 2003.
[108] O. Dulub, U. Diebold, and G. Kresse, “Novel stabilization mechanism on polar surfaces:
ZnO(0001)-Zn”, Phys. Rev. Lett., 90, 016102-1 – 016102-4, 2003.
[109] G. Kresse, O. Dulub, and U. Diebold, “Competing stabilization mechanism for the polar
ZnO(0001)-Zn surface”, Phys. Rev. B, 68, 245409-1 – 245409-15, 2003.
[110] P. P. Ewald, “Die Berechnung optischer und elektrostatischer Gitterpotentiale”, Ann.
Phys., 369, 253 – 287, 1921.
[111] D. H. Nguyen, “Neural networks for self-learning control systems”, IEEE Control
Systems Magazine, 10, 18 – 23, 1990.
[112] N. Artrith, B. Hiller, and J. Behler, Phys. Status Solidi B, 2012. (invited feature article)
accepted.
[113] K. V. J. Jose, N. Artrith, and J. Behler, “Construction of high-dimensional neural network
potentials using environment-dependent atom pairs”, J. Chem. Phys., 136, 194111-1
– 194111-15, 2012.
[114] J. H. Westbrook and R. L. Fleischer, eds., First principles calculations of elastic properties of metals, Ch. 9, 195 – 210. John Wiley and Sons, London, 1993.
Acknowledgements
I would like to thank all those who have contributed with their support and motivation
to the accomplishment of my PhD studies and my PhD thesis.
First of all, I wish to thank my supervisor Dr. Jörg Behler for giving me the opportunity to join his research group and for introducing me to a very fascinating project
at the Department of Theoretical Chemistry, Ruhr University, Bochum, Germany
(TheoChem@RUB), and for his support over the past years and with my future plans.
I am grateful to TheoChem@RUB for providing the great facilities during my PhD
studies.
Special thanks go to Björn Hiller for the great discussions about the research project
and for his kind help with computer problems, to Tobias Morawietz for exchanging
experiences with the neural network stuff, and to Dr. Volker Blum for stimulating
discussions about the FHI-aims code. I would also like to thank Prof. Dr. Ulrich Köhler
and Martin Kroll for sharing the STM results of the Cu@ZnO structures, and for all
the discussions.
Furthermore, I would like to thank all of my present and former colleagues (especially the Behler group and the Marx group) for the excellent working environment at
TheoChem@RUB. I am much indebted to Dr. Holger Langer and Dr. Harald Forbert
for their continuous support whenever a technical problem occurred. Additionally, I
very much appreciate Mrs. Sylke Kohlpoth, Mrs. Doris Fischer-Niess, and Mrs. Gundula Talbot for their excellent and kind administrative support during my stay at
TheoChem@RUB.
Finally, I would like to express my thankfulness to Dr. Alexander Urban for proofreading my thesis and for his support at times when my motivation was low.
This work was financially supported by the Deutsche Forschungsgemeinschaft (DFG)
through the collaborative research center SFB 558 “Metal-substrate interactions in
heterogeneous catalysis” and an Emmy Noether program.