A tour on the computational side of Large Deviation Theory •

advertisement
A tour on the computational side of
Large Deviation Theory
Eric Vanden-Eijnden
Courant Institute
• Rare events matter - challenge for modeling and computations.
• Rare events pathways are often predictable - large deviation theory (LDT).
Numerical aspects: minimum action method & string method.
• LDT-based importance sampling strategies - UQ and reliability Q, filtering, etc.
• Beyond LDT - when entropy matters.
It is easier to go down a hill than
up but the view is much better at
the top - H. W. Beecher
Friday, September 13, 13
Unlikely (or infrequent) events matter
• Some events are rare (i.e. infrequent) but have dramatic consequences - massive
earthquakes, giant hurricanes, pandemics, etc.
• Rare events can be less dramatic but important nonetheless - for example, electronic
components or engineering devices used in automobile, aerospace and medical are
required to be extremely reliable, with very small probability of failure.
• Rare events always happen eventually given enough time and may be the most
interesting/important aspect of the system’s dynamics - e.g. conformational changes of
macromolecules, kinetic phase transitions, thermally induced magnetization reversal in
micromagnets, regime changes in climate, etc.
- Metastability.
Friday, September 13, 13
Unlikely (or infrequent) events matter
• Realistic models (SODEs, SPDEs, Markov jump processes, etc.) are often too
complex to be amenable to analytical solution.
• Direct numerical simulations of these models very challenging or even impossible.
• Requires new computational approaches based e.g. on large deviation theory (LDT).
• Requires to go beyond LDT in certain situations - entropic effects, ruggedness, etc.
Friday, September 13, 13
Large Deviation Theory
• The way rare events occur is often predictable: the probability of the rare event is
dominated by the most likely (i.e. least unlikely) scenario for it to happen - essence of
large deviation theory.
Happy families are all alike; every unhappy family is unhappy in its own way -Tolstoi.
• Calculation of the path of maximum likelihood (PML) reduces to a deterministic
optimization problem.
1.2
1
• For example, in gradient systems (i.e.
0.8
0.6
0.4
0.2
0
1.5
−0.2
1
−0.4
0.5
−1
−0.5
0
x
0.5
1
position
y
systems navigating over an energy landscape),
rare events are associated with barrier crossing
events and follow the minimum energy path (MEP)
connecting two minima of the energy potential.
0
−0.5
−1
−1.5
0
2
4
6
time
Friday, September 13, 13
8
10
4
x 10
Freidlin-Wentzell Approach to LDT
Wentzell-Freidlin theory of large deviations
(W. & F.; Da Prato & Zabczyk)
Key object: action functional
1
ST ( ) =
2
associated with S(P)DE
Z
T
1
( (t))( ˙ (t)
b( (t)))
2
dt
0
"
"
dX (t) = b(X (t))dt +
p
" (X " (t))dW (t)
Then: Probability that {X " (t)}t2[0,T ] be close to a given path { (t)}t2[0,T ] is roughly
⇢
P sup |X " (t)
(t)| <
⇣ exp( " 1 ST ( ))
.
0tT
Reduce estimation of probability to a minimization problem, e.g.
P{X " (T ) 2 A|X " (0) = x} ⇣ exp( "
where the infinum is taken over all
Looking at long times, T ⇣ exp("
Friday, September 13, 13
(·) such that
1 C)
1
inf ST ( )),
(0) = x and
(T ) 2 A.
for some C > 0, the quasi-potential is key:
The role of the Quasi-Potential (QP)
Looking at long time-intervals, T ⇣ exp("
V : H 2 7! R is the key quantity:
1 C)
for some C > 0, the quasi-potential
V (x, y) = inf inf {ST ( ) : (0) = x, (T ) = y}
T
Interpretation:
V (x, y) =
lim lim lim "
T !1 !0 "!0
1
log P{⌧y" (x)  T }
where ⌧y" (x) is the first is the first entrance time in a -neighborhood of y:
⌧y" (x) = inf{t : X " (t) 2 B (y), X " (0) = x}
Long time intervals most relevant since the e↵ects of the noise
is unavoidable on these timescales.
Friday, September 13, 13
Over every mountain there is a
path, although it may not be seen
from the valley -T. Roethke
Computational side of LDT
• Numerical tools (string method, minimum action method - MAM, etc.) have been
developed to identify the paths by which rare events are most likely to occur.
.
String method and MAM evolve curves
while controlling their parametrization until
they converge to the MLP.
Applicable to both gradient systems (where
MLPs are MEPs) and non-gradient
systems.
1.5
1.2
1
1
0.8
0.5
position
y
0.6
0.4
0.2
0
−0.5
0
−1
−0.2
−0.4
−1
−0.5
0
x
0.5
1
−1.5
0
2
4
6
time
• These tools have been applied to problems from biochemistry, material sciences,
atmosphere-ocean sciences, etc.
Friday, September 13, 13
8
10
4
x 10
Geometric rephrasing of QP definition
(M. Heymann & E. V.-E., 2008)
V (x, y) ⇣ inf inf{ST (') : '(0) = x, '(T ) = y}
T
nR
o
= 12 inf
kbka sin2 (2⌘)ds : joins x to y
where kuk2a = hu, uia with hu, via = ha 1 (x)u, vi if h·, ·i was the original inner-product,
and ⌘ is the local value of the angle between
and b.
Parametrize
= {'(s) : s 2 [0, 1]} so that:
V (x, y) = inf
=
nR
inf
1
0 k kb(')k
k'
(
a
a
0
sup
'(·)
✓(·)
'(0)=x H(',✓)=0
'(1)=y
R1
h'0 , b(')ia ) ds : '(0) = x, '(1) = y
0 , ✓ids
h'
0
o
where H(', ✓) is the Hamiltonian associated with the Lagrangian L(', '0 ) = k'0 b(')k2a :
H(', ✓) = hb('), ✓i + 12 h✓, a(')✓i
Friday, September 13, 13
Geometric rephrasing of QP definition
(M. Heymann & E. V.-E., 2008)
The associated Euler-Lagrange equations can be written as
⇢ 0
' = H✓ (', ✓) = b(') + ✓
✓ 0 = H' (', ✓) = rbT (')✓
where is determined such that 0 = H(', ✓) and we impose some (arbitray) constraint
on the paramterization of , e.g. |'0 | = cst.
Di↵erentiating the equation for ' and using the one for ✓, this system of equations
can be written as
0=
2
0 0
'” +
'
H✓' '0 + H✓✓ H'
which can be reduced to
0=
2
'”
P
H✓' '
0
H✓✓ H' ,
P = Id
where we used P '0 = 0, P '00 = '00 (since |'0 | = cst) and
✓=a
Friday, September 13, 13
1
'
0
b ,
kbka
=
k'0 ka
'0 ⌦ '0
|'0 |2
(?)
Allen-Cahn equation in 2D - MEPs
Example: Allen-Cahn in one- and two-dimensions
(Faris & Jona-Lasiinio (1d); E, Ren
& V.-E.)
Allen-Cahn energy for u : [0, 1]2 7! R:
E(u) =
Z
[0,1]2
1
2
|ru|2 +
1
4
1
(1
u2 )2 dx
with Dirichlet boundary condition: u = +1 on the right and left edges of [0, 1]2 ,
u = 1 on top and bottom ones.
Two minimizers of the Allen-Cahn energy
Simulations by W. Ren
Friday, September 13, 13
Allen-Cahn equation in 2D - MEPs
Example: Allen-Cahn in one- and two-dimensions
(Faris & Jona-Lasiinio (1d); E, Ren
& V.-E.)
Allen-Cahn energy for u : [0, 1]2 7! R:
E(u) =
Z
[0,1]2
1
2
|ru|2 +
1
4
1
(1
u2 )2 dx
with Dirichlet boundary condition: u = +1 on the right and left edges of [0, 1]2 ,
u = 1 on top and bottom ones.
Two minimizers of the Allen-Cahn energy
Simulations by W. Ren
Friday, September 13, 13
Allen-Cahn equation in 2D - MLPs at finite T
Example: Allen-Cahn in two-dimensions
(E, Ren & V.-E.)
Dynamics:
ut =
Action:
1
ST (') =
2
Unconstrained reversal
Friday, September 13, 13
2
Z
u+u
T
0
Z
⌦
|'t
3
u +
2
p
'
2" ⌘(t, x)
' + '3 |2 dxdt
Time-constrained reversal
Allen-Cahn equation in 2D - MLPs at finite T
Example: Allen-Cahn in two-dimensions
(E, Ren & V.-E.)
Dynamics:
ut =
Action:
1
ST (') =
2
Unconstrained reversal
Friday, September 13, 13
2
Z
u+u
T
0
Z
⌦
|'t
3
u +
2
p
'
2" ⌘(t, x)
' + '3 |2 dxdt
Time-constrained reversal
Thermally induced magnetization reversal
(W. E, W. Ren & E. V.-E., 2003: A. Kent, D. Stein & E. V.-E., 2009 ...)
Main building blocks in Magnetoelectronics (used e.g. as
storage devices, etc.)
Small elements at superparamagnetic limit, where thermal
effects become important and limit data retention time by
magnetization reversal.
Reversal complex due to non-uniformity in space.
a)
1
S2
S3
b)
C3,4
1
1
V5,6
V3,4
S1
1
V1,2
3
V2
1
V7,8
C
C1,2
0
1
0.026
V24
1
m2
S4
V2
V22
5,6
2
V8
V15,16
8
V1
13,14
1
V9,10
2
V6
V2
7
S
0.025
V25
C
0.024
S5
V111,12
6
7
0
m1
1
Dynamics can be reduced to a Markov jump process on
energy map, whose nodes are the energy minima and
whose edges are the minimum energy paths
Friday, September 13, 13
0.023
0.022
S
S
1
1
E[m]
7,8
(b)
0.021
(a)
0.02
0.019
0
0.2
0.4
0.6
α
0.8
1
on:
1
Z
T
Z
Non-gradient effects
2 0
⌦
S
( ' ) =
T
|'
t
2
'
'
+
example:
3 2
'
|
d
x
d
t
AC with
c sin( y ) u
p
amics:
u
t
=
2
u +
u
u
3
+
x
+
advection
2 " ⌘( t , x )
in 2D with periodic BC
MLPs for different shear strengths
Simulations by M. Heymann and M. Tao
Friday, September 13, 13
on:
1
Z
T
Z
Non-gradient effects
2 0
⌦
S
( ' ) =
T
|'
t
2
'
'
+
example:
3 2
'
|
d
x
d
t
AC with
c sin( y ) u
p
amics:
u
t
=
2
u +
u
u
3
+
x
+
advection
2 " ⌘( t , x )
in 2D with periodic BC
minimal action
0.22
0.21
S
0.2
min
0.19
MLPs for different shear strengths
Simulations by M. Heymann and M. Tao
Friday, September 13, 13
0.18
0.17
0
0.2
0.4
c
0.6
0.8
1
on:
1
Z
T
probable path
The system
ZRare event and most
Gradient system
Most probable paths of different types
Non-gradient
system
shear
2
3 Optimal
2
|'
'
'
+
'
|
d
x
d
t
t Ginzburg-Landau
Sheared
SPDE
Brief description
of the numerical method
example: AC with advection
Non-gradient effects
2 0
⌦
S
( ' ) =
T
amics:
u
t
=
2
u +
u
u
3
+
c sin( y ) u
x
+
p
2 " ⌘( t , x )
in 2D with periodic BC
Transition via a limit cycle instead of saddle
Action⇡0.4.
Dynamics near the limit cycle @ Matlab.
minimal action
0.22
0.21
S
0.2
min
0.19
MLPs for different shear strengths
Molei Tao and Eric Vanden-Eijnden
Simulations by M. Heymann and M. Tao
Friday, September 13, 13
0.18
gMAM and most probable paths
0.17
0
0.2
0.4
c
0.6
0.8
1
Unstable limit cycle along PML
14
Friday, September 13, 13
kB(u)k
kB(u)k
prime denotes di↵erentiation with respect to arclength-coordinate ⌧ . The
dition for the velocity field u is u(x, t) = 0 for t ! 1, the final condition
iliary field p is p(t = 0 ) = wx as we can see from (5.12) using integration
The norm k · k is given by the inverse of on its support (which is compact
domain), hence
⇣
⌘
(with T. Grafke, R. Grauer and T. Schäfer)
▷ Random Burgers equation
kf k = f F 1 ˆ 1 fˆ
,
(5.14)
1
Application to fluid dynamics problems
L2
Key question: statistics of high gradients (responsible
for dissipation)
ˆ
denotes the inverse Fourier transform operator and f is the Fourier trans-
0 0
0
0
u
+
uu
=
⌫u
+
⌘(x,
t),
h⌘(x,
t)⌘(x
,
t
)i
=
(x
x
)
(t
t
t
x
xx
mparison with previous work [2, 5], it is useful to compare the above equa- )
e corresponding instanton equations in the original parametrization (using
as parameter)
Hamilton’s equation associated with LD action (cf. Balkovsky et al.; Chernykh & Stepanov):
Z
ut + uux ⌫uxx =
(x x0 )p(x0 )dx0
(5.15a)
1.5
pt + upx + ⌫pxx = 0.
geometric
t =−1000
0.05
(5.15b)
t =−100
min
min
1
tmin=−10
−0.05
−20
u(x, t=0)
n problem in solving the above equations with the boundary conditions
! 1 and p(0 100, x) = wx consists in the fact that the fixpoint
of the
0.5
nly reached in the limit t ! 1 and not in finite time. When Chernykh
200
0
nov computed a numerical solution to the system above, they used
a comtwo clever tricks
300 to mitigate this difficulty: First, they used self-similar
of the heat-kernel to design a coordinate transform that leads−0.5to an expo400
aling in time, second
they used the linearization of the system around zero
−1
replace the boundary
condition
u
=
0
by
u
=
p
(where
the
constant 0
0
500
−20
−15
−10
200 function
400
600
800 following
1000
mputed from the correlation
). In the
we show
that
the
0
−15
−10
−5
100
200
300
15
400
500
200
400
600
Friday, September 13, 13
800
1000
−5
0
x
5
10
15
20
Two-dimensional barotropic QG equations
Predictions of equilibrium statistical mecha
@q
@q
@
+ r? · rq + U(t)
+
= Dq ( )q + Fq
@t
@x
@x
R
IM maximizes
µ log µ under the constr
q = the+entropy
h,
Z
Z@
dU
=
h
dx dy + DU (U) + FU ,
dt
@x
Canonical:
Q dµ = Q0 , Microcanonical: E =
Application to fluid dynamics problems
Motivation
Methodology
Stochastic
Motivation
ODE model
▷ 2D
Methodology
Stochastic
barotropic QG equation
(with W. Ren & X. Yang)
Here (t, x, y ) is the stream function, q(t, x, y ) the vort
2
Dissipation via Ekman drag: 1 =
, 2 = eh(x, y ) the bottom topography, U(t) the zonal base flow
Stochastic
model
Metastable
states of invariant measure PDE FDT
1
1
the beta-plane
dµ = Zparameter
exp( "(Pedlosky
Q) (E 79,
E0Vallis
) dUd05).
,
Dissipation by Eckman drag - fixes the noise by
Conclusions
PDE model
Stochastic
ODEConclusions
model
Inviscid equations with D = F = 0 preserve
p
@q
@q
@
?
+r
· rq + U(t)
+
=
q + 2" ⌘(t, x, y )
@t
@x
@x
Minimize the enstrophy Q subject to the constraint E = E0 :
q=
+ h,
Z
Z
p
1
dU
@
min
Q
=
U
+
q 2+dxeUdy ,(t),
=
h
dx e + 2"e⇠(t)
2
U,
dt
@x
Z
1
1
s.t. U 2 +
|r |2 dx dy = E0 .
2 equilibrium distribution
Most likely
Z 2states of
p
p
minimize enstrophy
( q + at2"E = cst:
⌘) dx + U(e
2"e ⇠)
Using Lagrange multiplier,
where
(t) =
µUc =
,
µ
Z
2
2
|
|
dx
+
e
U
(
c =
c + h).
states – in QG
(t),Most likely states = selective decay
Z
1
enstrophy varies faster1than
energy.
Energy
E = U2 +
|r |2 dx dy ,
2
Only choice leading to 2multimodal
Z distribution.
Enstrophy
Q= U+
1
2
|q|2 dx dy .
Dynamics? choose D and F so that we formally have
invariant measure (cf strategy of Landau-Lifshitz for N
equations)
50
.
40
30
E
N.B.
the stationary
solutions
original system
N.B.These
fixing are
the damping
fixes the
forcing of
by the
fluctuation-dissipation
For
E large enough,
there are more than one solution,
20
without
dissipation
and forcing.
theorem
(FDT).
out of which two are stable in absence of noise.
10
0
−15
Friday, September 13, 13
−10
−5
µ
0
5
16
Example (16-mode topography
Motivation
We take the domain size as Lx =
parameters are chosen as = 1,
coefficients are taken as = e =
Application to fluid dynamics problems
Methodology
Stochastic
ODE model
▷ 2D
Stochastic
PDE model
barotropic
QGStatistical
equation
Nonequilibrium
Mechanics of Climate Variability
(a)
Conclusions
(b)
(c)
6
2.5
4
2.5
4
2
2
2
2
2
2
1.5
0
1.5
0
1.5
0
y
4
1
−2
1
−2
1
−2
0.5
−4
0.5
−4
0.5
−4
0
0
2
4
6
0
0
−6
2
4
6
0
0
−6
2
4
x
x
x
(d)
(e)
(f )
6
6
6
−6
6
4
2.5
4
2.5
4
2
2
2
2
2
2
1.5
0
1.5
0
1.5
0
y
2.5
y
1
−2
1
−2
1
−2
0.5
−4
0.5
−4
0.5
−4
0
0
2
4
6
0
0
−6
2
4
6
0
0
−6
2
4
x
x
x
(g)
(h)
(i)
6
6
6
−6
6
4
2.5
4
2.5
4
2
2
2
2
2
2
1.5
0
1.5
0
1.5
0
y
2.5
y
y
6
2.5
y
y
6
y
1615
modes topography with periodic BC:
1
−2
1
−2
1
−2
0.5
−4
0.5
−4
0.5
−4
0
0
2
4
x
6
−6
0
0
2
4
x
6
−6
0
0
2
4
6
−6
x
Snapshots
of streamfunction
along
PML path from X 0 to X 1 ; the
Figure 8. Snapshots
of the stream
function along
the transition
states
labeled
by (a) and (i) are the initial and final states, and (f ) is the saddle point. The
Friday,
September
13, 13
17
Markov Jump Processes
• M species involved in N reactions with rates depending on system state
• Reactions are fast but lead to small changes:
(Lf )(x) =
N
X
"
1
aj (x) (f (x + "⌫j )
j=1
• Hamiltonian of LDT:
H(x, ✓) =
N
X
⇣
aj (x) eh✓,⌫j i
j=1
▷ Application
f (x))
⌘
1
to genetic toggle switch (Roma et al.)
3.5
25
3
20
2.5
2
15
1.5
Expression of gene a inhibits
expression of gene b and vice-versa
10
1
5
0.5
0
20
40
Friday, September 13, 13
60
80
100
120
140
20
40
60
80
100
120
140
160
Beyond LDT - when entropy matters
• LDT can fail if entropic effects matter
- many alternative paths for the event,
with lower probability individually, but large one globally.
• These situations require a more general approach to rare event analysis.
Friday, September 13, 13
(Weinan E & V.-E. 2006, ...
Schütte, Metzner & V.-E. 2009,
Berezhkovskii, Hummer & Szabo 2009)
Transition Path Theory
• Main idea: focus on `reactive’ trajectories
associated with a given transition (reaction)
• Generator = transition rate matrix
A
B
i, j 2 S = {1, 2, . . . , N }
Li,j
• Microscopic reversibility (detailed balance)
µi Li,j = µj Lj,i
• Probability distribution of reactive trajectories
µR
i = µi qi (1
• Probability current of reactive trajectories
R
fi,j
= µi Li,j (qj
• Committor function:
Friday, September 13, 13
qi ) +
qi )
Generator of loop erased reactive paths
LR
i,j = Li,j (qj
qi = Ei (⌧B < ⌧A )
qi ) +
The example of a maze
No energy landscape, no clear transition state - not a mountain pass problem.
Many different reactive paths from entrance to exit - the walkers wanders
in the many deadends of the maze and this affects dramatically the rate
of the `reaction’.
Yet the current of these reactive trajectories concentrates on a single path.
Committor
Effective current
1
A
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
B
Friday, September 13, 13
0
Reorganization of LJ cluster after self-assembly
with Masha Cameron
• Double funnel landscape - ground state is not accessible directly by
self-assembly and requires dynamical reorganization;
•
Can be described by a MJP using the network calculated by Wales;
icosahedron
• ~1e4 local minima of potential;
• ~1e4 saddle points between these minima;
• Network of orbits (minimum energy paths);
• Transition matrix ;
Li,j = ⌫e
Vi,j /"
• Spectral analysis difficult (large network, many small eigenvalues).
• Can be analyzed by TPT.
• LDT only applicable at extremely low temperatures. At physical
temperature, system remains strongly metastable, but pathways and
rate of transition are different than those predicted by LDT (entropic
effects).
Friday, September 13, 13
octahedron
Reorganization of LJ cluster after self-assembly
with Masha Cameron
Number of transition pathways
increase with T
They become shorter and go thru
higher barriers
Figure 3: The cartoon of the transition process at T = 0.05 from ICO (state 7) to FCC (state
1). The edges shown carry at least 10% of the total reactive flux from ICO to FCC. The blue
arrows indicate the dominant representative pathway. The red arrows indicate the highest barriers:
V(342,254) = 4.219 and V(958,1) = 4.272. The percentages next to each arrow show the percentages
of the total reactive flux carried by the edge. The values of the committor q at some states are
shown.
Friday, September 13, 13
Reorganization of LJ cluster after self-assembly
with Masha Cameron
Number of transition pathways
increase with T
They become shorter and go thru
higher barriers
Figure 3: The cartoon of the transition process at T = 0.05 from ICO (state 7) to FCC (state
1). The edges shown carry at least 10% of the total reactive flux from ICO to FCC. The blue
arrows indicate the dominant representative pathway. The red arrows indicate the highest barriers:
V(342,254) = 4.219 and V(958,1) = 4.272. The percentages next to each arrow show the percentages
of the total reactive flux carried by the edge. The values of the committor q at some states are
shown.
Friday, September 13, 13
Reorganization of LJ cluster after self-assembly
with Masha Cameron
T= 0.05
T= 0.15
Percentage of pathways
Percentage of pathways
342 - 354
342 -354
958 - 1
958 - 1
3223 - 354
958 - 607
3886 - 354
351 - 354
3184 - 2831
396 - 1502
1208 - 3299
248 - 3299
355 - 354
4950 - 2933
396 - 5162
The highest potential barrier
Friday, September 13, 13
958 - 607
3223 - 354
The highest potential barrier
Conclusions
• LDT gives rough estimate of probability and path of maximum likelihood by which rare
event occur.
• Can be integrated in importance sampling procedures to estimate the probability of the
rare events, its rate of occurrence, etc.
• Can be used in data assimilation techniques - allows to optimally incorporate the
information from noisy observations. Also offers the possibility to prevent or precipitate the
occurrence of rare events in situations where the system’s evolution can be acted upon.
• In situation where entropy matters and LDT becomes inapplicable, TPT is the right
framework.
Thanks to Weinan E, Weiqing Ren, Matthias Heymann, Masha Cameron, Jonathan Weare, ...
Thank you!
Friday, September 13, 13
A few references
• W. E, W. Ren, and and E. V.-E. ``String method for the study of rare events,'' Phys. Rev. B. 66
(2002), 052301.
• W. E, W. Ren, and E. V.-E., ``Minimum action method for the study of rare events,'' Comm.
Pure Applied Math. 52, 637-656 (2004).
• M. Heymann and E. V.-E., ``The Geometric Minimum Action Method: A least action principle on
the space of curves,'' Comm. Pure. App. Math. 61(8), 1052-1117 (2008).
• W. E and E. V.-E., ``Toward a theory of transition paths,'' J. Stat. Phys. 123, 503-523 (2006).
• W. E and E. V.-E., ``Transition-Path Theory and Path-Finding Algorithms for the Study of Rare
Events,'' Annu. Rev. Phys. Chem. 61, 391-420 (2010).
26
Friday, September 13, 13
Download