Applying machine learning to quantum mechanics
Gábor Csányi
Engineering Laboratory

[Title-slide figure collage. Panel titles: “Configuration exploration: Nested Sampling”; “Force-fields by Machine Learning”; “Free energy surface reconstruction: Bayesian Inference”; “Adaptive QM/MM with particle exchange”. Axis labels: relative temperature (kT/ε), heat capacity, energy, log phase space volume.]
Swift introduction to molecular modelling
The world works according to the Schrödinger equation:
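$$ i\hbar \frac{\partial \Psi}{\partial t} = H\Psi, \qquad \Psi(x_1, x_2, \ldots), \quad x_i \in \mathbb{R}^3 $$
$$ H = -\sum_i \frac{1}{2m_i}\nabla^2_{x_i} + \sum_{ij} \frac{Z_i Z_j}{|r_i - r_j|} $$
Kinetic term: “matter waves”, tendency to spread
Coulomb term: long range attraction/repulsion, macroscopically observable!
For fermions, antisymmetry:
$$ \Psi(\ldots, x_i, \ldots, x_j, \ldots) = -\Psi(\ldots, x_j, \ldots, x_i, \ldots) $$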
• Range of validity: particles are electrons and atomic nuclei, and they stay the same
• Thus far not contradicted by experiment
Newtonian molecular dynamics
At ambient conditions nuclei do not overlap
(other than for H…), so the solution is separable:
$$ \Psi = \psi_1(R_1)\,\psi_2(R_2) \ldots $$
Correspondence principle: quantum system reduces
to a classical system:
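$$ p_i = \langle\Psi|\nabla_{R_i}|\Psi\rangle, \qquad q_i = \langle\Psi|R_i|\Psi\rangle $$
$$ H = \langle\Psi|H|\Psi\rangle = \sum_i \frac{p_i^2}{2m_i} + \underbrace{\sum_{ij} \frac{Z_i Z_j}{|q_i - q_j|} + V_{\rm el}(q_1, q_2, \ldots)}_{\text{“Interatomic potential” } V(q_1, q_2, \ldots)} $$
$V_{\rm el}$: difficult electronic problem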
Typical questions
“What are the defects that control material response?”
Typical questions
“What are the typically observed conformations?”
Ingredients
• Representation of atomic neighbourhood:
smoothness, faithfulness, continuity
• Interpolation of functions:
flexible but smooth functional form, few sensible parameters
• Database of configurations:
predictive power, non-domain specific
Function fitting with basis functions
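Fit a function f(x) based on observations y ≡ {y_i} at {x_i}
$$ f(x) = \sum_{i=1}^{N} \alpha_i k(x_i, x) $$
$$ y_j = \sum_{i=1}^{N} \alpha_i \left[ k(x_i, x_j) + \sigma_\nu^2 \delta_{ij} \right] $$
$$ y = (K + \sigma_\nu^2 I)\alpha, \qquad \alpha = C^{-1} y $$
$$ f(x) = k^T C^{-1} y $$
e.g. $k(x, x') = \sigma_w^2 e^{-|x - x'|^2/2\sigma^2}$
regularised fit: arbitrary σ, σ_w, σ_ν
$$ [K]_{ij} \equiv k(x_i, x_j), \qquad C \equiv K + \sigma_\nu^2 I, \qquad [k]_i \equiv k(x_i, x) $$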
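The regularised fit on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the speaker's code: the function names, the sine test data and the hyper-parameter values are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma=0.5, sigma_w=1.0):
    """k(x, x') = sigma_w^2 exp(-|x - x'|^2 / (2 sigma^2)) for 1D inputs."""
    d = xa[:, None] - xb[None, :]
    return sigma_w**2 * np.exp(-d**2 / (2.0 * sigma**2))

def fit(x_train, y_train, sigma_nu=0.1):
    """Solve (K + sigma_nu^2 I) alpha = y for the coefficients alpha."""
    C = gaussian_kernel(x_train, x_train) + sigma_nu**2 * np.eye(len(x_train))
    return np.linalg.solve(C, y_train)

def predict(x_new, x_train, alpha):
    """f(x) = sum_i alpha_i k(x_i, x), i.e. k^T C^{-1} y."""
    return gaussian_kernel(x_new, x_train) @ alpha

# Assumed toy data: noiseless samples of sin(x)
x_train = np.linspace(0.0, 2.0 * np.pi, 20)
y_train = np.sin(x_train)
alpha = fit(x_train, y_train, sigma_nu=0.01)

x_test = np.array([1.0, 2.5, 4.0])
f_test = predict(x_test, x_train, alpha)
print(f_test, np.sin(x_test))
```

Using a linear solve instead of forming the inverse of C explicitly is a numerical convenience; the result is the same closed-form estimate.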
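Machine learning framework: Kernel regression
$$ \varepsilon(q^{(i)}) = \sum_k^N \alpha_k K(q^{(i)}, q^{(k)}) $$
• Linear regression:
$$ K_{\rm DP}(q^{(i)}, q^{(k)}) = q^{(i)} \cdot q^{(k)} $$
$$ \varepsilon(q^{(i)}) = \sum_j^N q_j^{(i)} \sum_k \alpha_k q_j^{(k)} = q^{(i)} \cdot \sum_k \alpha_k q^{(k)} $$
• Gaussian kernel
$$ K_{\rm SE}(q^{(i)}, q^{(k)}) = \exp\Big(-\sum_j \frac{(q_j^{(i)} - q_j^{(k)})^2}{2\sigma_j^2}\Big) $$
• Neural networks
$$ K_{\rm NN}(q^{(i)}, q^{(k)}) = |q^{(i)} - q^{(k)}|^2 + \text{const.} $$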
1D example
[Figure: 1D example comparing the target f(x) with GP and spline fits, plotted against r]
Gaussian basis functions are wide!
$$ f(x) = k^T C^{-1} y $$
Bayesian function fitting: Gaussian Process Regression
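Fit a function f(x) based on observations y ≡ {y_i} at {x_i}
$$ f(x) = \sum_{h=1}^{H} w_h \phi_h(x), \qquad \phi_h(x)\text{: basis functions} $$
$$ f_i = \sum_h R_{ih} w_h, \qquad R_{ih} \equiv \phi_h(x_i) $$
Gaussian prior: $P(w) = \mathrm{Normal}(0, \sigma_w^2 I)$
$$ P(f) = \mathrm{Normal}(0, \sigma_w^2 R R^T) $$
Gaussian likelihood: $y = f + \varepsilon$, $\varepsilon \sim \mathrm{Normal}(0, \sigma_\nu^2 I)$
Joint of observations also Gaussian:
$$ P(y) = \mathrm{Normal}(0, \sigma_w^2 R R^T + \sigma_\nu^2 I) $$
($\sigma_w^2 R R^T$: expected function variance; $\sigma_\nu^2 I$: variance of data)
Predict the next observation using the conditional
$$ P(y_{N+1} | y_N) = P(y_{N+1}, y_N)/P(y_N) $$
$$ f(x) = k^T C^{-1} y $$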
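Limit of ∞ basis functions
[Diagram: R, with $R_{ih} \equiv \phi_h(x_i)$, has N data points (rows) by H basis functions (columns); $R R^T$ is N × N]
Use Gaussian basis functions, take H → ∞, $R R^T$ is finite
Prior is a Gaussian Process with
$$ \mathrm{Cov}[f(x_i), f(x_j)] \propto \exp\left[-(x_i - x_j)^2/\sigma^2\right] $$
Similarly for the observations:
$$ \mathrm{Cov}[y(x_i), y(x_j)] \propto \exp\left[-(x_i - x_j)^2/\sigma^2\right] + \sigma_\nu^2 \delta_{ij} $$
$$ f(x) = k^T C^{-1} y $$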
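The limit of infinitely many basis functions can be checked numerically. The sketch below assumes Gaussian basis functions of width r placed on an evenly spaced grid of centres (the grid, the width and the test points are illustrative choices, not from the slides). Scaling R R^T by the grid spacing turns the sum over h into an integral over centres, which evaluates to a Gaussian in x_i − x_j; under this parameterisation the σ of the slide corresponds to 2r.

```python
import numpy as np

def design_matrix(x, centres, r):
    """R_ih = phi_h(x_i) with Gaussian basis functions of width r."""
    return np.exp(-(x[:, None] - centres[None, :])**2 / (2.0 * r**2))

r = 0.5
x = np.array([-1.0, 0.0, 0.3, 1.2])          # assumed data locations
d2 = (x[:, None] - x[None, :])**2
limit = np.exp(-d2 / (4.0 * r**2))           # analytic H -> infinity result

errors = {}
for H in (10, 100, 1000):
    centres = np.linspace(-10.0, 10.0, H)
    spacing = centres[1] - centres[0]
    R = design_matrix(x, centres, r)
    # Riemann sum over centres, normalised so that the diagonal tends to 1
    cov = R @ R.T * spacing / (np.sqrt(np.pi) * r)
    errors[H] = np.max(np.abs(cov - limit))
    print(H, errors[H])
```

The N × N matrix stays finite as H grows, which is why the fit can be expressed through the covariance function alone.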
Gaussian Process connection to linear fit
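$$ P(y_{N+1} | y_N) = P(y_{N+1}, y_N)/P(y_N) \propto P(y_{N+1}, y_N) $$
$$ P(y_1 \ldots y_N) \propto \mathrm{Normal}(0, C_N) \propto \exp\left(-\tfrac{1}{2}\, y_N^T C_N^{-1} y_N\right) $$
$$ P(y_{N+1}, y_N) \propto \mathrm{Normal}(0, C_{N+1}) \propto \exp\left(-\tfrac{1}{2}\, [y_N\; y_{N+1}]\, C_{N+1}^{-1}\, [y_N\; y_{N+1}]^T\right) $$
$$ C_{N+1} \equiv \begin{bmatrix} C_N & k \\ k^T & \kappa \end{bmatrix}, \qquad [k]_i \equiv C(x_i, x_{N+1}), \qquad \kappa \equiv C(x_{N+1}, x_{N+1}) $$
$$ C_{N+1}^{-1} \equiv \begin{bmatrix} M & m \\ m^T & \mu \end{bmatrix} $$
$$ \mu = (\kappa - k^T C_N^{-1} k)^{-1}, \qquad m = -\mu\, C_N^{-1} k, \qquad M = C_N^{-1} + \frac{1}{\mu}\, m m^T $$
$$ [y_N\; y_{N+1}]\, C_{N+1}^{-1}\, [y_N\; y_{N+1}]^T = y_N^T M y_N + 2\, y_N^T m\, y_{N+1} + \mu\, y_{N+1}^2 = \mu\, (y_{N+1} + y_N^T m/\mu)^2 + \ldots $$
$$ P(y_{N+1} | y_N) \propto \exp\left(-(y_{N+1} - \bar{y})^2 / 2\hat{\sigma}^2\right) $$
$$ \arg\max_{y_{N+1}} P(y_{N+1} | y_N) = \bar{y} = k^T C_N^{-1} y_N, \qquad \hat{\sigma}^2 = \kappa - k^T C_N^{-1} k $$
$$ f(x) = k^T C^{-1} y $$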
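The partitioned-inverse step of this derivation is easy to verify numerically. The sketch below builds C_{N+1} for a small assumed data set (random 1D points, a squared-exponential covariance and a noise variance of 0.01, all illustrative) and compares the blocks M, m and μ with a direct matrix inverse.

```python
import numpy as np

def kernel(a, b, sigma=1.0):
    """Squared-exponential covariance between two sets of 1D points."""
    return np.exp(-(a[:, None] - b[None, :])**2 / (2.0 * sigma**2))

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 5.0, 6))     # N = 6 assumed data locations
x_new = np.array([2.5])                   # location of observation N+1
y = np.sin(x)
noise = 1e-2                              # assumed noise variance

C_N = kernel(x, x) + noise * np.eye(len(x))
k = kernel(x, x_new)[:, 0]
kappa = kernel(x_new, x_new)[0, 0] + noise

# Assemble C_{N+1} and invert it directly
C_N1 = np.block([[C_N, k[:, None]], [k[None, :], np.array([[kappa]])]])
C_N1_inv = np.linalg.inv(C_N1)

# Blocks predicted by the partitioned-inverse formulas
CNinv_k = np.linalg.solve(C_N, k)
mu = 1.0 / (kappa - k @ CNinv_k)
m = -mu * CNinv_k
M = np.linalg.inv(C_N) + np.outer(m, m) / mu

# Predictive mean and variance of the next observation
y_bar = k @ np.linalg.solve(C_N, y)
var_hat = kappa - k @ CNinv_k
print(y_bar, var_hat)
```

Completing the square in y_{N+1} gives the mean −y_N^T m/μ and the variance 1/μ, which the test values above reproduce.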
Gaussian Process Regression summary
• Choose Gaussian covariance (prior assumption):
$$ K(x_i, x_j) = \exp\left(-(x_i - x_j)^2/2\sigma^2\right) $$
Maximum of posterior:
$$ f(x) = \arg\max_f P(f | \text{data}) = \sum_i \alpha_i K(x, x^{(i)}) $$
$$ \alpha = C^{-1} y \equiv (\sigma_w^2 K + \sigma_\nu^2 I)^{-1} y, \qquad C_{ii'} = \sigma_w^2 K(x_i, x_{i'}) + \sigma_\nu^2 \delta_{ii'} $$
• Meaningful hyper-parameters:
σ: smoothness (x-scale) of f
σ_w: y-scale of f
σ_ν: variance (noise) of input data
• This is not an optimised fit, but a closed form estimate!
Samples from the Gaussian prior
$$ P(f) \propto \mathrm{Normal}(0, C) $$
[Figure: two panels of sample functions drawn from the prior on x ∈ [−4, 4], labelled l = 1.0, σ_f = 1.0 and l = 2.0, σ_f = 1.0]
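Sample functions like those in the figure can be drawn by factorising the covariance matrix. The sketch below is one common way to do it, assuming a squared-exponential covariance with length scale `length` and amplitude `sigma_f`; the small diagonal `jitter` is a numerical safeguard added here, not part of the model on the slide.

```python
import numpy as np

def se_cov(x, length=1.0, sigma_f=1.0):
    """C_ij = sigma_f^2 exp(-(x_i - x_j)^2 / (2 length^2))."""
    d = x[:, None] - x[None, :]
    return sigma_f**2 * np.exp(-d**2 / (2.0 * length**2))

def sample_prior(x, n_samples, length, sigma_f=1.0, jitter=1e-6, seed=0):
    """Draw functions f ~ Normal(0, C) on the grid x via a Cholesky factor."""
    rng = np.random.default_rng(seed)
    C = se_cov(x, length, sigma_f) + jitter * np.eye(len(x))
    chol = np.linalg.cholesky(C)              # C = chol @ chol.T
    z = rng.standard_normal((len(x), n_samples))
    return (chol @ z).T                       # one sampled function per row

x = np.linspace(-4.0, 4.0, 81)
samples_short = sample_prior(x, 5, length=1.0)
samples_long = sample_prior(x, 5, length=2.0)
print(samples_short.shape, samples_long.shape)
```

A longer length scale gives visibly smoother samples, which is the sense in which σ (here `length`) sets the x-scale of f.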
Samples from Gaussian posterior
$$ P(y_{N+1} | y) \propto \mathrm{Normal}(k^T C^{-1} y, \ldots) $$
[Figure: sample functions drawn from the posterior on x ∈ [−4, 4]]
Mean of posterior distribution
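$$ \bar{y}(x) = k^T(x)(K + \sigma_\nu^2 I)^{-1} y $$
$$ \mathrm{cov}(y(x)) = k(x, x) - k^T(x)(K + \sigma_\nu^2 I)^{-1} k(x) $$
Kernel: $k(x_i, x_j) = e^{-|x_i - x_j|^2/2\sigma^2}$
$$ [k(x)]_i = k(x, x_i), \qquad K_{ij} = k(x_i, x_j) $$
[Figure: posterior mean on x ∈ [−4, 4]]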
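The posterior mean and covariance can be evaluated together. The following is a small sketch under assumed settings (five training points sampled from sin(x), σ = 1, σ_ν = 0.1); the names are hypothetical and chosen only for this example.

```python
import numpy as np

def k_se(a, b, sigma=1.0):
    """Kernel k(x, x') = exp(-|x - x'|^2 / (2 sigma^2)) for 1D inputs."""
    return np.exp(-(a[:, None] - b[None, :])**2 / (2.0 * sigma**2))

def posterior(x_star, x, y, sigma=1.0, sigma_nu=0.1):
    """Posterior mean and covariance at the prediction points x_star."""
    C = k_se(x, x, sigma) + sigma_nu**2 * np.eye(len(x))
    k_star = k_se(x, x_star, sigma)           # column j is k(x) for x_star[j]
    mean = k_star.T @ np.linalg.solve(C, y)
    cov = k_se(x_star, x_star, sigma) - k_star.T @ np.linalg.solve(C, k_star)
    return mean, cov

x = np.array([-3.0, -1.5, 0.0, 1.0, 2.5])     # assumed training locations
y = np.sin(x)
x_star = np.linspace(-4.0, 4.0, 9)            # includes 0.0 (data) and 4.0 (no data)
mean, cov = posterior(x_star, x, y)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
print(np.round(std, 3))
```

The predicted standard deviation is small at a training point and grows away from the data, which is the behaviour the later "self-aware potentials" slide relies on.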
Generalised input data
• Often direct measurements of y_i are not available
• Consider any linear operator L(y). Compute
$$ P(y_{N+1}(x_{N+1}) | L(y)) $$
L can be built using Σ, ∂/∂x, etc.
• Can deal with gradient data, sums, etc.
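As an illustration of a linear operator built from sums, the sketch below infers a latent function from observed pairwise sums only, in the same way that total energies constrain per-atom contributions. The summation matrix `L`, the cosine target and the noise level are assumptions made for this example.

```python
import numpy as np

def k_se(a, b, sigma=1.0):
    """Squared-exponential covariance between two sets of 1D points."""
    return np.exp(-(a[:, None] - b[None, :])**2 / (2.0 * sigma**2))

x = np.arange(6, dtype=float)               # points where the latent f lives
f_true = np.cos(x)                          # never observed directly
L = np.array([[1, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 0, 0],
              [0, 0, 0, 0, 1, 1]], dtype=float)
y = L @ f_true                              # only pairwise sums are observed

noise = 1e-8                                # assumed noise variance on the sums
K = k_se(x, x)
C = L @ K @ L.T + noise * np.eye(len(y))    # Cov[L f, L f] plus noise
K_star = k_se(x, x)                         # Cov[f(x*), f(x)], here x* = x
f_mean = K_star @ L.T @ np.linalg.solve(C, y)   # Cov[f*, L f] C^{-1} y

print(np.round(f_mean, 3))
print(np.round(L @ f_mean, 3), np.round(y, 3))
```

The individual values of f are not uniquely determined by three sums; the prior covariance decides how each observed sum is shared out, while the predicted sums reproduce the data.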
Interpolating the solution of an eigenproblem
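$$ H_{\rm el} = -\frac{1}{2}\sum_i \nabla^2_{r_i} + \sum_{ij} \frac{Z_j(-1)}{|R_j - r_i|} + \sum_{ij} \frac{1}{|r_i - r_j|} $$
$$ i\hbar \frac{\partial \Psi}{\partial t} = H\Psi $$
Lowest eigenstate: $E_0 \equiv V_{\rm el}(R_1, R_2, \ldots)$
[Diagram: $V_{\rm el}$ = many-body (cluster) expansion, drawn as a sum of cluster terms + higher multi-body terms...; annotations: “has spatial locality”, “Weak”, “Strong”]
$$ E = \sum_i \varepsilon(q_1^{(i)}, q_2^{(i)}, \ldots, q_M^{(i)}) + \text{analytic long range terms} $$
q_i are descriptors of the local atomic neighbourhood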
Quantum mechanics is many-body
[Figure: energy [eV / atom] against volume [Å³ / atom] for bcc, fcc and sc structures, with a DFT bcc reference. Panel labels: Carbon: tight binding; Tungsten: DFT; Embedded Atom Model; Tungsten, Finnis-Sinclair]
Quantum mechanics has some locality
[Schematic: force F on an atom, its neighbourhood of radius r, and the far field. Plots against r: force errors around a Si self interstitial; force errors around O in water with QM/MM]
Is there a finite range atomic energy function that gives the
correct total energy and forces?
Traditional ideas for functional forms
• Pair potentials: Lennard-Jones, RDF-derived, etc.
$$ \varepsilon_i = \frac{1}{2}\sum_j V_2(|r_{ij}|) $$
• Three-body terms: Stillinger-Weber, MEAM, etc.
• Embedded Atom (no angular dependence)
$$ \varepsilon_i = \Phi\Big(\sum_j \rho(|r_{ij}|)\Big) $$
• Bond Order Potential (BOP)
Tight-binding-derived attractive term with pair-potential repulsion
• ReaxFF: kitchen-sink + hundreds of parameters
Representation is implicit
These are NOT THE CORRECT functions.
Limited accuracy, not systematic
given by
GOAL: potentials based on quantum mechanics
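To make the simplest traditional form concrete, here is a sketch of per-atom energies for a pair potential, with each pair energy split equally between its two atoms. It uses Lennard-Jones in reduced units with a hard cutoff; the cutoff value and the dimer geometry are assumptions for illustration, and a production potential would smooth the cutoff.

```python
import numpy as np

def lennard_jones(r, eps=1.0, sigma=1.0):
    """V2(r) = 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r)**6
    return 4.0 * eps * (sr6**2 - sr6)

def atomic_energies(positions, cutoff=3.0):
    """eps_i = 1/2 sum_j V2(|r_ij|) over neighbours j within the cutoff."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    mask = (dist > 0.0) & (dist < cutoff)    # exclude self-interaction
    v = np.zeros_like(dist)
    v[mask] = lennard_jones(dist[mask])
    return 0.5 * v.sum(axis=1)

# Dimer at the Lennard-Jones minimum, r = 2^(1/6) sigma
r_min = 2.0**(1.0 / 6.0)
dimer = np.array([[0.0, 0.0, 0.0], [r_min, 0.0, 0.0]])
e_atoms = atomic_energies(dimer)
e_total = e_atoms.sum()
print(e_atoms, e_total)
```

The total energy is the sum of local atomic energies, which is the same decomposition the machine-learned potentials keep while replacing the fixed functional form.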
Construct smooth similarity kernel directly
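• Overlap integral (cutoff: compact support)
$$ S(\rho_i, \rho_{i'}) = \int \rho_i(r)\rho_{i'}(r)\,dr $$
• Integrate over all 3D rotations:
$$ k(\rho_i, \rho_{i'}) = \int \left|S(\rho_i, \hat{R}\rho_{i'})\right|^2 d\hat{R} = \int d\hat{R} \left|\int \rho_i(r)\rho_{i'}(\hat{R}r)\,dr\right|^2 $$
• After LOTS of algebra: SOAP kernel
$$ k(\rho_i, \rho_{i'}) = \sum_{n,n',l} p^{(i)}_{nn'l}\, p^{(i')}_{nn'l} $$
$$ p_{nl} = c_{nl}^\dagger c_{nl}, \qquad p_{nn'l} = c_{nl}^\dagger c_{n'l} $$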
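The rotational invariance of the power spectrum can be demonstrated with a heavily simplified stand-in for SOAP. The sketch below is not the SOAP implementation: it assumes point-like neighbours instead of smooth densities, keeps only l = 0 and l = 1 (using real spherical harmonics with normalisation constants dropped), and uses an arbitrary Gaussian radial basis. All names are hypothetical.

```python
import numpy as np

def radial_basis(r, centres, width=0.5):
    """g_n(r): Gaussian radial functions (an assumed, illustrative choice)."""
    return np.exp(-(r[None, :] - centres[:, None])**2 / (2.0 * width**2))

def power_spectrum(neigh, centres=(1.0, 2.0, 3.0)):
    """p_{nn'l} = sum_m c_{nlm} c_{n'lm}, truncated at l = 0 and l = 1."""
    r = np.linalg.norm(neigh, axis=1)
    g = radial_basis(r, np.asarray(centres))    # shape (n_radial, n_neigh)
    c0 = g.sum(axis=1)                          # l = 0: angular part is constant
    c1 = g @ (neigh / r[:, None])               # l = 1: unit-vector components
    p0 = np.outer(c0, c0)
    p1 = c1 @ c1.T
    return np.stack([p0, p1])                   # shape (2, n_radial, n_radial)

def soap_like_kernel(p_a, p_b):
    """k = sum over n, n', l of p^(a)_{nn'l} p^(b)_{nn'l}."""
    return np.sum(p_a * p_b)

rng = np.random.default_rng(1)
neigh = rng.normal(size=(8, 3))                 # neighbour positions around one atom
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # random orthogonal matrix
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0                             # make it a proper rotation
rotated = neigh @ Q.T

p_orig = power_spectrum(neigh)
p_rot = power_spectrum(rotated)
print(np.max(np.abs(p_orig - p_rot)))
```

The expansion coefficients themselves change under rotation; only the contraction over m is invariant, which is why the kernel is built from p rather than c.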
Smooth Overlap of Atomic Positions (SOAP)
[Figure: log-scale plot against number of neighbours n; labels: “Error in distinguishing”, “Average dRMSD”; curves: SOAP with l_max = 2, 4, 6 and REF]
Diamond
Brenner potential
50 random data points
1 data point
300 configurations from ab initio MD:
[Figures a), b): thermal expansion (α / K⁻¹ against temperature / K) and phonon spectrum (frequency / THz along Δ, X, Σ, Γ, Λ, L); legend: Brenner, GAP (20ps MD), DFT (QH)]
Reactive potentials
[Figure: energy / eV along a reaction coordinate from rhombohedral graphite (0) to diamond (1); curves: DFT−LDA, GAP, Brenner. Database: phase 1, phase 2, each labelled 0.1Å]
Vacancy migration
[Figure: energy / eV along the reaction coordinate; curves: DFT, GAP]
How to generate databases?
• Target applications: large systems
• Capability of full quantum mechanics (QM): small systems
QM MD on “representative” small systems:
sheared primitive cell ➝ elasticity
large unit cell ➝ phonons
surface unit cells ➝ surface energy
gamma surfaces ➝ screw dislocation
vacancy in small cell ➝ vacancy
vacancy @ gamma surface ➝ vacancy near dislocation
Iterative refinement
1. QM MD ➝ Initial database
2. Model MD
3. QM ➝ Revised database
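The refinement loop can be mimicked on a 1D toy problem. Everything below is a stand-in: `reference` plays the role of the expensive QM calculation, a grid of `candidates` plays the role of configurations visited by model MD, and picking the candidate with the largest predicted variance is one possible selection rule, assumed here rather than taken from the slides.

```python
import numpy as np

def reference(x):
    """Stand-in for an expensive QM calculation (hypothetical toy function)."""
    return np.sin(3.0 * x) + 0.5 * x

def k_se(a, b, sigma=0.3):
    """Squared-exponential covariance between two sets of 1D points."""
    return np.exp(-(a[:, None] - b[None, :])**2 / (2.0 * sigma**2))

def gp_predict(x_star, x, y, noise=1e-6):
    """GP mean and variance at x_star given the current database (x, y)."""
    C = k_se(x, x) + noise * np.eye(len(x))
    k_star = k_se(x, x_star)
    mean = k_star.T @ np.linalg.solve(C, y)
    var = 1.0 - np.sum(k_star * np.linalg.solve(C, k_star), axis=0)
    return mean, var

candidates = np.linspace(0.0, 3.0, 61)     # configurations the model visits
x_db = np.array([0.0, 0.25, 0.5])          # step 1: initial database
y_db = reference(x_db)

errors = []
for iteration in range(15):
    mean, var = gp_predict(candidates, x_db, y_db)       # step 2: run the model
    errors.append(np.max(np.abs(mean - reference(candidates))))
    pick = np.argmax(var)                                # least trusted configuration
    x_db = np.append(x_db, candidates[pick])             # step 3: revised database
    y_db = np.append(y_db, reference(candidates[pick]))

print(errors[0], errors[-1])
```

In a real workflow the error against the reference is not available everywhere, so the predicted variance is the only signal for where the database needs extending.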
What is the acceptable validation protocol?
How far can the domain of validity be extended?
Existing potentials for tungsten
[Figure: error relative to the DFT reference (scale marks 0, 15%, 50%) for GAP, BOP, MEAM and FS. Properties: elastic constants C11, C12, C44; vacancy energy; surface energy of (100), (110), (111), (112)]
Peierls barrier for screw dislocation glide
Vacancy-dislocation binding energy
(~100,000 atoms in 3D simulation box)
Self-aware potentials: predicting the error
• Bulk silicon: Diamond and Beta-tin phases
[Figure axes: error, energy]
• Predicted error correctly signals where GAP is unreliable
Outstanding problems
• Accuracy on database ➝ accuracy in properties?
• Database contents ➝ region of validity?
• Alloys - permutational complexity? Chemical variability?
• Systematic treatment of long range effects
• Electronic temperature