
Bayesian Inference for Inverse Problems Using Surrogate Model
Dongbin Xiu
Department of Mathematics, Purdue University
Supported by AFOSR, DOE, NNSA, NSF
Overview
•  Uncertainty quantification (UQ) and stochastic modeling
•  Forward problem: uncertainty propagation
•  Brief introduction to generalized polynomial chaos (gPC)
•  Inverse problem: Bayesian inference
•  Use generalized polynomial chaos (gPC)
•  Back to the forward problem
•  Issues and challenges
•  Key Issues:
•  Efficiency
•  Curse-of-dimensionality
Illustrative Example: Burgers Equation
•  Burgers’ equation:
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}$$
•  Steady-state solution:
$$u(x) = -A \tanh\!\left[\frac{A}{2\nu}(x - z)\right], \qquad \text{where } u(z) = 0, \quad A = -\frac{\partial u}{\partial x}$$
$$u(-1) = 1, \qquad u(1) = -1$$
[Figure: steady-state solutions for ν = 0.05 with unperturbed boundary condition u(−1) = 1 and perturbed u(−1) = 1.01, together with the mesh distribution; the transition layer location x = z shifts markedly under the small perturbation.]
Effects of Uncertainty – “Supersensitivity”
•  Burgers’ equation:
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}, \qquad x \in [-1, 1]$$
•  Boundary conditions:
$$u(-1) = 1 + \delta, \quad u(1) = -1, \qquad \delta \sim U(0, 0.1)$$
[Figure: mean, standard deviation, and upper/lower bounds of the stochastic solution.]
(Xiu & Karniadakis, Int. J. Numer. Meth. Eng., vol. 61, 2004)
(Re-)Formulation of PDE: Input Parameterization
$$\frac{\partial u}{\partial t}(t, x) = L(u) + \text{boundary/initial conditions}$$
•  Goal: To characterize the random inputs by a set of random variables
  Finite number
  Mutual independence
•  If inputs == parameters
  Identify the (smallest) independent set
  Prescribe probability distribution
•  Else if inputs == fields/processes
  Approximate the field by a function of finite number of RVs
  Well-studied for Gaussian processes
  Under-developed for non-Gaussian processes
  Examples: Karhunen-Loeve expansion, spectral decomposition, etc.
$$a(x, \omega) \approx \mu_a(x) + \sum_{i=1}^{d} a_i(x)\, Z_i(\omega)$$
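A minimal sketch of the expansion above in its best-known form, a discrete Karhunen-Loeve expansion for a Gaussian field: the exponential covariance kernel, grid, mean function, correlation length, and truncation level below are all assumptions for illustration.

```python
# A minimal sketch, assuming an exponential covariance on D = [0, 1]:
# discretize the kernel, take its top-d eigenpairs, and drive each mode
# with an independent standard normal Z_i, as in a(x,w) ~ mu_a + sum a_i Z_i.
import numpy as np

x = np.linspace(0.0, 1.0, 200)                 # spatial grid on D = [0, 1]
mu_a = np.sin(np.pi * x)                       # assumed mean function mu_a(x)
ell = 0.2                                      # assumed correlation length
C = np.exp(-np.abs(x[:, None] - x[None, :]) / ell)  # exponential covariance

# Eigendecomposition of the covariance matrix (weighted by the grid spacing
# so it approximates the continuous eigenvalue problem).
h = x[1] - x[0]
lam, phi = np.linalg.eigh(C * h)
idx = np.argsort(lam)[::-1]                    # sort eigenpairs, largest first
lam, phi = lam[idx], phi[:, idx] / np.sqrt(h)  # modes orthonormal in L2

d = 4                                          # truncation: d random variables
Z = np.random.randn(d)                         # independent Z_i ~ N(0, 1)
a = mu_a + (phi[:, :d] * np.sqrt(lam[:d])) @ Z # one realization of the field
print("captured variance fraction:", lam[:d].sum() / lam.sum())
```

The captured-variance fraction indicates how many modes d are needed; slowly decaying spectra are exactly where the dimensionality problem begins.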
The Reformulation
•  Stochastic PDE:
$$\frac{\partial u}{\partial t}(t, x, Z) = L(u) + \text{boundary/initial conditions}$$
•  Solution:
$$u(t, x, Z) : [0, T] \times D \times \mathbb{R}^{n_Z} \to \mathbb{R}$$
•  Uncertain inputs are characterized by n_Z random variables Z
•  Probability distribution of Z is prescribed (a non-trivial task):
$$F_Z(s) = \Pr(Z \le s), \qquad s \in \mathbb{R}^{n_Z}$$
Generalized Polynomial Chaos (gPC)
•  Focus on dependence on Z:
$$u(\cdot, Z) : \mathbb{R}^{n_Z} \to \mathbb{R}$$
•  Nth-order gPC expansion:
$$u_N(Z) \in P_N = \{\text{space of } n_Z\text{-variate polynomials of degree up to } N\}$$
•  Orthogonal basis: with multi-index $\mathbf{i} = (i_1, \ldots, i_{n_Z})$, $|\mathbf{i}| = i_1 + \cdots + i_{n_Z}$,
$$E\big[\Phi_{\mathbf{i}}(Z)\Phi_{\mathbf{j}}(Z)\big] = \int \Phi_{\mathbf{i}}(z)\Phi_{\mathbf{j}}(z)\rho(z)\, dz = \delta_{\mathbf{ij}}$$
$$u_N(t, x, Z) \doteq \sum_{k=0}^{N} \hat{u}_k(t, x)\,\Phi_k(Z), \qquad \dim P_N = \binom{n_Z + N}{N}$$
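The binomial dimension formula is easy to verify numerically. A small sketch, assuming the standard total-degree index set: enumerate the multi-indices, check the count, and note how quickly dim P_N grows, which is the curse of dimensionality flagged in the overview.

```python
# A small check (assumption: standard total-degree space) that the number of
# n_z-variate multi-indices with |i| = i_1 + ... + i_{n_z} <= N matches the
# binomial formula dim P_N = C(n_z + N, N).
from itertools import product
from math import comb

def total_degree_indices(nz, N):
    """Enumerate all multi-indices i in N_0^nz with |i| <= N."""
    return [i for i in product(range(N + 1), repeat=nz) if sum(i) <= N]

for nz, N in [(2, 3), (3, 4), (6, 4)]:
    count = len(total_degree_indices(nz, N))
    assert count == comb(nz + N, N)
    print(f"nz={nz}, N={N}: dim P_N = {count}")
```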
gPC: Basis
  Expectation: $E(g(Z)) = \int_{\mathbb{R}} g(z)\rho(z)\, dz$
  Orthogonality: $\int \Phi_i(z)\Phi_j(z)\rho(z)\, dz = E\big[\Phi_i(Z)\Phi_j(Z)\big] = \delta_{ij}$
  Gaussian distribution: $\int_{-\infty}^{\infty} \Phi_i(z)\Phi_j(z)\, e^{-z^2}\, dz = \delta_{ij}$ → Hermite polynomials
  Gamma distribution: $\int_{0}^{\infty} \Phi_i(z)\Phi_j(z)\, e^{-z}\, dz = \delta_{ij}$ → Laguerre polynomials
  Beta distribution: $\int_{-1}^{1} \Phi_i(z)\Phi_j(z)\, dz = \delta_{ij}$ → Legendre polynomials
(Xiu & Karniadakis, SIAM J. Sci. Comput., 2002)
gPC Basis: the Choices
  Orthogonality: $\int \Phi_i(z)\Phi_j(z)\rho(z)\, dz = E\big[\Phi_i(Z)\Phi_j(Z)\big] = \delta_{ij}$
  Example: Hermite polynomials:
$$\int_{-\infty}^{\infty} \Phi_i(z)\Phi_j(z)\, e^{-z^2}\, dz = \delta_{ij}$$
  The polynomials, for Z ~ N(0,1):
$$\Phi_0 = 1, \quad \Phi_1 = Z, \quad \Phi_2 = Z^2 - 1, \quad \Phi_3 = Z^3 - 3Z, \; \ldots$$
  Approximation of an arbitrary random variable: requires L² integrability
  Example: a uniform random variable expanded in Hermite chaos:
o  Converges
o  Non-optimal rate
o  A first-order Legendre expansion is exact
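A small numerical illustration of these three points, a sketch assuming Z ~ N(0,1) and writing the uniform target as Y = 2Φ(Z) − 1: the Hermite-chaos projection converges as N grows, but no finite Hermite order is exact, whereas a first-order Legendre expansion of a uniform variable would be.

```python
# A minimal sketch (assumptions: Z ~ N(0,1), target Y ~ U(-1,1) written as
# Y = 2*Phi(Z) - 1 with Phi the standard normal CDF): project Y onto the
# probabilists' Hermite basis He_k and watch the L2 error shrink with N.
import numpy as np
from math import erf, factorial, sqrt

q, w = np.polynomial.hermite_e.hermegauss(100)   # Gauss-Hermite(e) rule
w = w / np.sqrt(2.0 * np.pi)                     # weights now integrate N(0,1)

Phi = np.array([0.5 * (1.0 + erf(z / sqrt(2.0))) for z in q])
Y = 2.0 * Phi - 1.0                              # uniform variable as a map of Z

for N in (1, 3, 5, 9):
    # Projection coefficients y_k = E[Y He_k(Z)] / E[He_k(Z)^2], E[He_k^2] = k!
    coeffs = np.zeros(N + 1)
    for k in range(N + 1):
        e_k = np.zeros(k + 1); e_k[k] = 1.0
        Hk = np.polynomial.hermite_e.hermeval(q, e_k)
        coeffs[k] = np.sum(w * Y * Hk) / factorial(k)
    YN = np.polynomial.hermite_e.hermeval(q, coeffs)
    err = np.sqrt(np.sum(w * (Y - YN) ** 2))     # L2 error under the Gaussian
    print(f"N={N}: L2 error = {err:.4f}")
```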
Stochastic Galerkin
•  Galerkin method: Seek
$$u_N(t, x, Z) \doteq \sum_{k=0}^{N} \hat{u}_k(t, x)\,\Phi_k(Z)$$
such that
$$E\left[\frac{\partial u_N}{\partial t}(t, x, Z)\,\Phi_m(Z)\right] = E\big[L(u_N)\,\Phi_m(Z)\big], \qquad \forall\, m \le N$$
•  The result:
  Residual is orthogonal to the gPC space
  A set of deterministic equations for the coefficients
  The equations are usually coupled – requires a new solver
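To see the coupling explicitly, here is a minimal sketch on an assumed toy problem (not one from the talk): du/dt = −k(Z)u with random rate k(Z) = 2 + 0.5Z, Z ~ N(0,1). Galerkin projection turns the scalar ODE into a coupled deterministic system for the gPC coefficients, which indeed needs its own solver.

```python
# A minimal stochastic Galerkin sketch on an assumed toy problem:
# du/dt = -k(Z) u, u(0) = 1, k(Z) = 2 + 0.5 Z, Z ~ N(0,1). Projecting the
# residual onto each Hermite basis polynomial couples all gPC coefficients.
import numpy as np
from math import factorial

N = 6                                            # gPC order
q, w = np.polynomial.hermite_e.hermegauss(40)    # quadrature for expectations
w = w / np.sqrt(2.0 * np.pi)
k = 2.0 + 0.5 * q                                # k(Z) at quadrature nodes

# Basis values Phi[m, :] = He_m(q); Galerkin matrix A[m, j] couples the modes.
Phi = np.array([np.polynomial.hermite_e.hermeval(q, np.eye(N + 1)[m])
                for m in range(N + 1)])
A = np.array([[np.sum(w * k * Phi[j] * Phi[m]) / factorial(m)
               for j in range(N + 1)] for m in range(N + 1)])

u_hat = np.zeros(N + 1); u_hat[0] = 1.0          # deterministic initial condition
dt, T = 1e-3, 1.0
for _ in range(int(T / dt)):                     # explicit Euler on the system
    u_hat = u_hat - dt * (A @ u_hat)

mean = u_hat[0]                                  # E[u] is the 0th coefficient
var = sum(factorial(m) * u_hat[m] ** 2 for m in range(1, N + 1))
print(f"mean ~= {mean:.4f}, variance ~= {var:.4f}")
```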
Stochastic Collocation
•  Collocation: To satisfy the governing equations at selected nodes
  Allows one to use existing deterministic codes repetitively
Let $Z^1, Z^2, \ldots, Z^Q \in \mathbb{R}^{n_Z}$ be a set of nodes/samples; then solve
$$\frac{\partial u}{\partial t}(t, x, Z^j) = L(u), \qquad j = 1, \ldots, Q.$$
•  Sampling: (solution statistics only)
•  Random (Monte Carlo)
•  Deterministic (lattice rule, tensor grid, cubature)
•  Stochastic collocation: To construct polynomial approximations
  Node selection is critical to efficiency and accuracy
  More than sampling
Stochastic Collocation: Interpolation
•  Definition: Given a set of nodes and the solution ensemble, find u_N in a proper polynomial space such that u_N ≈ u in a proper sense.
•  Interpolation approach:
$$u^Q(Z) \doteq \sum_{j=1}^{Q} u(Z^j) L_j(Z), \qquad L_i(Z^j) = \delta_{ij}, \quad 1 \le i, j \le Q$$
•  Optimal nodal distribution in high dimensions…
Tensor grids: inefficient
Sparse grids: more efficient
(Xiu & Hesthaven, SIAM J. Sci. Comput., 2005)
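In one dimension the construction is elementary. A sketch, assuming a toy forward model u(Z) = tanh(5Z) and Chebyshev-Lobatto nodes: build the Lagrange interpolant u_Q from Q deterministic solver runs, and note that node choice drives the accuracy.

```python
# A one-dimensional sketch of collocation-by-interpolation (the forward model
# and node family are assumptions): run the deterministic solver at Q nodes,
# then build u_Q(Z) = sum_j u(Z_j) L_j(Z) with cardinal functions
# L_i(Z_j) = delta_ij. Chebyshev nodes behave far better than equispaced ones.
import numpy as np

def forward_model(z):
    return np.tanh(5.0 * z)        # stand-in for an expensive simulation

def lagrange_eval(nodes, values, z):
    """Evaluate the Lagrange interpolant through (nodes, values) at points z."""
    out = np.zeros_like(z)
    for j, (zj, uj) in enumerate(zip(nodes, values)):
        Lj = np.ones_like(z)
        for m, zm in enumerate(nodes):
            if m != j:
                Lj *= (z - zm) / (zj - zm)
        out += uj * Lj
    return out

z_fine = np.linspace(-1.0, 1.0, 1001)
for Q in (5, 9, 17):
    nodes = np.cos(np.pi * np.arange(Q) / (Q - 1))     # Chebyshev-Lobatto nodes
    u_Q = lagrange_eval(nodes, forward_model(nodes), z_fine)
    err = np.max(np.abs(u_Q - forward_model(z_fine)))
    print(f"Q={Q}: max interpolation error = {err:.3e}")
```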
Stochastic Collocation: Discrete Projection
•  Orthogonal projection:
$$u_N(t, x, Z) \doteq P_N u = \sum_{k=0}^{N} \hat{u}_k(t, x)\Phi_k(Z), \qquad \hat{u}_k = E[u(Z)\Phi_k(Z)] = \int u(z)\Phi_k(z)\rho(z)\, dz$$
•  Discrete projection:
$$w_N(t, x, Z) = \sum_{k=0}^{N} \hat{w}_k(t, x)\Phi_k(Z), \qquad \hat{w}_k = \sum_{j=1}^{Q} u(t, x, Z^j)\Phi_k(Z^j)\alpha_j \approx \int u(z)\Phi_k(z)\rho(z)\, dz$$
$$\hat{w}_k(t, x) \to \hat{u}_k(t, x), \qquad Q \to \infty$$
•  Aliasing error:
$$\varepsilon_Q \doteq \big\| u_N - w_N \big\|_{L^2_\rho}$$
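A one-dimensional sketch of the discrete projection, assuming Z ~ U(−1,1), a toy model u(Z) = e^Z, a normalized Legendre basis, and a Gauss-Legendre rule for the weights α_j: each coefficient is a weighted sum of Q solver outputs.

```python
# A sketch of discrete projection (assumed setting: Z ~ U(-1,1), toy model
# u(Z) = exp(Z), normalized Legendre basis): approximate each coefficient
# w_hat_k ~ integral of u * Phi_k * rho by Gauss-Legendre quadrature.
import numpy as np

u = np.exp                                       # stand-in forward model
N, Q = 6, 10                                     # gPC order, number of nodes
zq, wq = np.polynomial.legendre.leggauss(Q)      # nodes/weights on [-1, 1]
wq = wq / 2.0                                    # rho(z) = 1/2 on [-1, 1]

w_hat = np.zeros(N + 1)
for k in range(N + 1):
    c = np.zeros(k + 1); c[k] = 1.0
    Pk = np.polynomial.legendre.legval(zq, c) * np.sqrt(2 * k + 1)  # normalized
    w_hat[k] = np.sum(wq * u(zq) * Pk)           # discrete projection

# Evaluate the surrogate w_N(Z) anywhere at negligible cost.
z = np.linspace(-1, 1, 5)
w_N = sum(w_hat[k] * np.polynomial.legendre.legval(z, np.eye(N + 1)[k])
          * np.sqrt(2 * k + 1) for k in range(N + 1))
print("max error on test points:", np.max(np.abs(w_N - u(z))))
```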
Inverse Parameter Estimation
•  Solution of a stochastic system requires the probability distribution of the parameters Z (the prior distribution)
•  Information in the prior distribution is critical:
  Requires direct measurements of the parameters
  No/not enough direct measurements? (Use experience/intuition …)
  How can one take advantage of measurements of other variables?
Bayesian Inference for Parameter Estimation
•  Setup:
  Prior distribution of the parameters Z: π(z)
  Forward problem simulation: y = G(Z)
  Measurement/data: d
•  Error model: d = G(Z) + e
•  Goal: to estimate the distribution of Z --- the posterior distribution
•  Posterior distribution:
$$\pi(Z \mid d) \propto \pi(d \mid Z)\, \pi(Z)$$
•  Likelihood function:
$$\pi(d \mid Z) = \prod_{i=1}^{n_d} \pi_{e_i}\big(d_i - G_i(Z)\big)$$
•  Notes:
  Difficult to manipulate
  Classical sampling approaches (MCMC, etc.) can be time consuming
Surrogate-based Bayesian Estimation
•  Surrogate: $y_N = G_N(Z)$
•  Approximate Bayes rule:
$$\pi^N(Z \mid d) \propto \pi^N(d \mid Z)\, \pi(Z), \qquad \text{where } \pi^N(d \mid Z) = \prod_{i=1}^{n_d} \pi_{e_i}\big(d_i - G_{N,i}(Z)\big)$$
•  Properties:
  Allows direct sampling with arbitrarily large samples
  No additional simulations – forward problem solver only
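Because G_N is cheap to evaluate, the approximate posterior can be explored with very long chains or dense grids at negligible cost. A sketch, in which the surrogate, datum, noise level, and prior are all assumptions for illustration: a plain random-walk Metropolis sampler driven entirely by surrogate evaluations.

```python
# A sketch of direct posterior sampling with a surrogate (all names here are
# assumptions for illustration): uniform prior Z ~ U(0, 1), a cheap gPC-type
# surrogate G_N, one observation d with Gaussian noise, and a random-walk
# Metropolis sampler that only ever calls the surrogate -- no forward solves.
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.05                                      # assumed noise std dev
d = 0.6                                           # assumed measurement

def G_N(z):                                       # stand-in surrogate model
    return z + 0.3 * np.sin(2 * np.pi * z)

def log_post(z):
    if not (0.0 <= z <= 1.0):                     # uniform prior support
        return -np.inf
    return -0.5 * ((d - G_N(z)) / sigma) ** 2     # Gaussian log-likelihood

z, lp = 0.5, log_post(0.5)
samples = []
for _ in range(50_000):                           # cheap: surrogate calls only
    z_new = z + 0.1 * rng.standard_normal()       # random-walk proposal
    lp_new = log_post(z_new)
    if np.log(rng.uniform()) < lp_new - lp:       # Metropolis accept/reject
        z, lp = z_new, lp_new
    samples.append(z)
print("posterior mean ~=", np.mean(samples[5_000:]))
```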
Convergence Analysis
•  Kullback-Leibler divergence: $D(\pi_1 \,\|\, \pi_2) = \int \pi_1(z)\,\log\frac{\pi_1(z)}{\pi_2(z)}\, dz$
•  Basic assumption: observation error is i.i.d. Gaussian
Theorem. If the gPC expansion $G_N$ converges to $G$ in $L^2_{\pi_Z}$, then the posterior density $\pi^d_N$ converges to $\pi^d$ in the sense
$$D\big(\pi^d_N \,\big\|\, \pi^d\big) \to 0, \qquad N \to \infty.$$
Moreover, if
$$\big\| G_i(Z) - G_{N,i}(Z) \big\|_{L^2_{\pi_Z}} \le C N^{-\alpha}, \qquad 1 \le i \le n_d,\; \alpha > 0,\; C \text{ independent of } N,$$
then for sufficiently large $N$,
$$D\big(\pi^d_N \,\big\|\, \pi^d\big) \lesssim N^{-\alpha}.$$
•  Notes:
  Fast (exponential) convergence rate is retained
  So is a slow one
(Marzouk & Xiu, Comm. Comput. Phys., vol. 6, 2009)
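The statement can be checked numerically on a one-dimensional toy problem; the forward model, noise level, and datum below are assumptions. Build Legendre surrogates G_N of increasing order, form both posteriors on a grid, and watch the divergence decay.

```python
# A numerical check of the convergence statement on an assumed toy problem
# (Z ~ U(-1,1), smooth forward model G, Gaussian noise): the Kullback-Leibler
# divergence D(pi_N^d || pi^d) decays as the surrogate order N grows.
import numpy as np

z = np.linspace(-1.0, 1.0, 2001)
dz = z[1] - z[0]
G = np.exp(z)                                     # assumed smooth forward model
sigma, d = 0.1, 1.2                               # assumed noise level and datum

def posterior(Gvals):
    p = np.exp(-0.5 * ((d - Gvals) / sigma) ** 2) # likelihood x uniform prior
    return p / (p.sum() * dz)                     # normalize on the grid

zq, wq = np.polynomial.legendre.leggauss(40)      # quadrature for projection
pi_d = posterior(G)
for N in (1, 2, 4, 8):
    # Discrete Legendre projection of G, as on the collocation slides.
    c = np.array([(2 * k + 1) * np.sum(wq / 2.0 * np.exp(zq)
                  * np.polynomial.legendre.legval(zq, np.eye(N + 1)[k]))
                  for k in range(N + 1)])
    pi_N = posterior(np.polynomial.legendre.legval(z, c))
    kl = np.sum(pi_N * np.log(pi_N / pi_d)) * dz  # D(pi_N^d || pi^d)
    print(f"N={N}: KL divergence = {kl:.3e}")
```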
Parameter Estimation: Supersensitivity Example
•  Burgers’ equation:
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}, \qquad x \in [-1, 1]$$
•  Boundary conditions:
$$u(-1) = 1 + \delta(Z), \quad u(1) = -1, \qquad 0 < \delta \ll 1$$
•  Data: a noisy observation of the transition layer location (contrived numerically), used to infer δ
•  Prior distribution is uniform: δ ~ U(0, 0.1)
•  Measurement noise: e ~ N(0, 0.05²)
Parameter Estimation: Step Function
•  Assume the forward model is a step function
•  The posterior distribution is discontinuous
•  Gibbs oscillations appear
•  Convergence with global gPC basis functions is slow
[Figure: left, the exact forward solution G(z) and its gPC approximation G_N(z), with oscillations near the step; right, the exact posterior π^d(z) and its gPC approximation.]
Biological Application
•  Example: estimate kinetic parameters in a genetic toggle switch
–  Differential-algebraic equations model from [Gardner et al., Nature, 2000]
–  Real experimental data: steady-state expression levels of one gene (v)
[Figure: normalized GFP expression vs. log10(IPTG); model predictions compared with experimental data.]
•  6 uncertain parameters
•  Assumed to be independent and uniform in the forward problem
•  Good agreement with measurement --- but let’s now use the data again
Using the Data Again – Parameter estimation
1-parameter and 2-parameter marginal posterior densities
(dim = 6, N = 4, 6-level sparse grid forward problem solver)
An (almost) Industrial Application: Free-Cantilever Damping
Gas damping of a freely-vibrating cantilever, simulated with MEMOSA
3-parameter marginal posterior densities
Level 2 sparse grid in thickness, gas density and frequency
Summary
  Uncertainty Analysis: To provide improved prediction
•  Input characterization
•  Uncertainty propagation
•  Post processing
  Generalized polynomial chaos (gPC)
•  Multivariate approximation theory
•  A highly efficient uncertainty propagation strategy
•  Makes many UQ tasks “post processing”
  Data, any data, can help
•  UQ simulation needs to be dynamically data driven.
  Support:
•  AFOSR: FA9550-08-1-0353
•  DOE:
•  DE-FC52-08NA28617
•  DE-SC0005713
•  NSF:
•  DMS-0645035
•  IIS-0914447
•  IIS-1028291