Monte Carlo for Linear Operator Equations
Fall 2012
By Hao Ji
Review
• Last Class
– Quasi-Monte Carlo
• This Class
– Monte Carlo Linear Solver
• von Neumann and Ulam method
• Randomized stationary iterative methods
• Variations of Monte Carlo solver
– Fredholm integral equations of the second kind
– The Dirichlet Problem
– Eigenvalue Problems
• Next Class
– Monte Carlo method for Partial Differential Equations
Solving Linear Systems
• The simultaneous equations
$$x = Hx + a$$
where $H = (h_{ij}) \in \mathbb{R}^{n \times n}$ is an $n \times n$ matrix, $a \in \mathbb{R}^n$ is a given vector, and $x \in \mathbb{R}^n$ is the unknown solution vector.
• Define the norm of the matrix to be
$$\|H\| = \max_i \sum_j |h_{ij}|$$
Solving Linear Systems
• Direct methods
– Gaussian elimination
– LU decomposition
–…
• Iterative methods
– Stationary iterative methods (Jacobi method, Gauss-Seidel method, …)
– Krylov subspace methods (CG, BiCG, GMRES, …)
–…
• Stochastic linear solvers
– Monte Carlo methods
–…
Monte Carlo Linear Solver
• The Monte Carlo method proposed by von Neumann and
Ulam:
1. Define the transition probabilities and the terminating
probabilities.
2. Build an unbiased estimator of the solution.
3. Produce random walks and calculate the average value.
Monte Carlo Linear Solver
• Let $P$ be an $n \times n$ transition matrix based on the matrix $H$, such that
$$p_{ij} \ge 0, \qquad \sum_j p_{ij} \le 1,$$
$$h_{ij} \ne 0 \;\Rightarrow\; p_{ij} \ne 0,$$
with the terminating probability of state $i$ defined as
$$p_i = 1 - \sum_j p_{ij}.$$
• A special case:
$$p_{ij} = |h_{ij}|$$
Monte Carlo Linear Solver
• A terminating random walk stopping after $k$ steps is
$$\gamma = (i_0, i_1, \dots, i_k),$$
which passes through a sequence of integers (the row indices).
• The successive integers (states) are determined by the transition probabilities
$$P(i_{m+1} = j \mid i_m = i,\; k > m) = p_{ij}$$
and the termination probabilities
$$P(k = m \mid i_m = i,\; k > m - 1) = p_i$$
Monte Carlo Linear Solver
• Define
$$V_m(\gamma) = v_{i_0 i_1} v_{i_1 i_2} \cdots v_{i_{m-1} i_m} \qquad (m \le k)$$
where
$$v_{ij} = \begin{cases} h_{ij}/p_{ij} & (p_{ij} \ne 0) \\ 0 & (p_{ij} = 0) \end{cases}$$
Then
$$X(\gamma) = V_k(\gamma)\, a_{i_k} / p_{i_k}$$
is an unbiased estimator of $x_{i_0}$ in the solution $x$ if the Neumann series $I + H + H^2 + \cdots$ converges.
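A minimal sketch of this estimator in code, assuming the special case $p_{ij} = |h_{ij}|$ with $\|H\| < 1$ in the row-sum norm (the function name and defaults are illustrative):

```python
import numpy as np

def von_neumann_ulam(H, a, i0, n_walks=100_000, rng=None):
    """Estimate x[i0] for x = Hx + a by von Neumann-Ulam random walks.

    A sketch assuming p_ij = |h_ij|, with termination probability
    p_i = 1 - sum_j p_ij; it requires ||H|| < 1 in the row-sum norm.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = H.shape[0]
    P = np.abs(H)                      # transition probabilities p_ij = |h_ij|
    p_term = 1.0 - P.sum(axis=1)       # terminating probability of each state
    assert (p_term > 0).all(), "need ||H|| < 1 for this choice of P"

    total = 0.0
    for _ in range(n_walks):
        i, V = i0, 1.0
        while True:
            # pick next state j with prob p_ij, or terminate with prob p_i
            j = rng.choice(n + 1, p=np.append(P[i], p_term[i]))
            if j == n:                 # termination: score V * a_i / p_i
                total += V * a[i] / p_term[i]
                break
            V *= H[i, j] / P[i, j]     # multiply by v_ij = h_ij / p_ij
            i = j
    return total / n_walks
```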
Monte Carlo Linear Solver
• Proof:
The expectation of $X(\gamma)$ is
$$E[X(\gamma)] = \sum_\gamma P(\gamma)\, X(\gamma)$$
$$= \sum_{k=0}^{\infty} \sum_{i_1, \dots, i_k} p_{i_0 i_1} p_{i_1 i_2} \cdots p_{i_{k-1} i_k}\, p_{i_k} \cdot v_{i_0 i_1} v_{i_1 i_2} \cdots v_{i_{k-1} i_k}\, a_{i_k} / p_{i_k}$$
$$= \sum_{k=0}^{\infty} \sum_{i_1, \dots, i_k} h_{i_0 i_1} h_{i_1 i_2} \cdots h_{i_{k-1} i_k}\, a_{i_k} \qquad \Big(\text{since } v_{ij} = \frac{h_{ij}}{p_{ij}}\Big)$$
$$= a_{i_0} + (Ha)_{i_0} + (H^2 a)_{i_0} + \cdots$$
If the Neumann series $I + H + H^2 + \cdots$ converges,
$$(I + H + H^2 + \cdots)\, a = (I - H)^{-1} a = x,$$
then $E[X(\gamma)] = x_{i_0}$.
Monte Carlo Linear Solver
• Produce $N$ random walks starting from $i_0$; then
$$\frac{1}{N} \sum_{\gamma} X(\gamma) \approx E[X(\gamma)] = x_{i_0}.$$
Note that this evaluates only one component of the solution.
• The transition matrix is critical for the convergence of the Monte Carlo linear solver. In the special case $p_{ij} = |h_{ij}|$:
– $\|H\| \ge 1$: Monte Carlo breaks down.
– $\|H\| = 0.9$: Monte Carlo is less efficient than a conventional method (1% accuracy: $n \le 554$; 10% accuracy: $n \le 84$).
– $\|H\| = 0.5$: Monte Carlo is less efficient than a conventional method (1% accuracy: $n \le 151$; 10% accuracy: $n \le 20$).
Monte Carlo Linear Solver
• To approximate the sum $S = \sum_i s_i$ by sampling, define a random variable $z$ with possible values $s_i / q_i$ and probabilities
$$q_i = P\!\left(z = \frac{s_i}{q_i}\right).$$
Since
$$S = \sum_i s_i = \sum_i \frac{s_i}{q_i}\, q_i = E(z),$$
we can use $N$ random samples of $z$ to estimate the sum $S$, as in the sketch below.
• The essence of the Monte Carlo method in solving a linear system is to sample the underlying Neumann series
$$I + H + H^2 + \cdots$$
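A minimal sketch of this sum-sampling idea (the values and the weight choice below are illustrative):

```python
import numpy as np

def sample_sum(s, q, n_samples=10_000, rng=None):
    """Estimate S = sum_i s_i by drawing index i with probability q_i
    and averaging the values s_i / q_i (an importance-sampling estimator)."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(s), size=n_samples, p=q)
    return np.mean(s[idx] / q[idx])

s = np.array([1.0, 2.0, 3.0, 4.0])
q = np.abs(s) / np.abs(s).sum()     # sample large terms more often
print(sample_sum(s, q))             # ≈ 10.0
```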
Randomized Stationary Iterative Methods
• Consider $Ax = b$.
– Jacobi method: decompose $A$ into a diagonal component $D$ and the remainder $R$:
$$x^{(k+1)} = H x^{(k)} + a$$
where $H = -D^{-1} R$ and $a = D^{-1} b$.
– Gauss-Seidel method: decompose $A$ into a lower triangular component $L$ and a strictly upper triangular component $U$:
$$x^{(k+1)} = H x^{(k)} + a$$
where $H = -L^{-1} U$ and $a = L^{-1} b$.
• Stationary iterative methods can easily be randomized by using Monte Carlo to statistically sample the underlying Neumann series, as sketched below.
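A possible sketch of the Jacobi splitting feeding the random-walk solver (jacobi_split is an illustrative helper; von_neumann_ulam is the sketch from earlier):

```python
import numpy as np

def jacobi_split(A, b):
    """Rewrite Ax = b as x = Hx + a via the Jacobi splitting A = D + R.
    Assumes A has a nonzero diagonal."""
    D = np.diag(np.diag(A))
    R = A - D
    H = -np.linalg.solve(D, R)   # H = -D^{-1} R
    a = np.linalg.solve(D, b)    # a = D^{-1} b
    return H, a

# Each component of x can then be estimated independently, e.g.:
# H, a = jacobi_split(A, b)
# x0 = von_neumann_ulam(H, a, i0=0)   # needs ||H|| < 1 in the row-sum norm
```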
Variations of Monte Carlo Linear Solver
• Wasow uses another estimator,
$$X^*(\gamma) = \sum_{m=0}^{k} V_m(\gamma)\, a_{i_m},$$
which in some situations obtains smaller variance than $X(\gamma)$.
• Adjoint method: use the weights
$$w_{ij} = \begin{cases} h_{ij}/p_{ji} & (p_{ji} \ne 0) \\ 0 & (p_{ji} = 0) \end{cases}$$
to find the whole solution $x$ instead of $x_i$ only.
Variations of Monte Carlo Linear Solver
• Sequential Monte Carlo method
To accelerate the Monte Carlo method for simultaneous equations, Halton uses a rough estimate $\hat{x}$ of $x$ to transform the original linear system.
Let $y = x - \hat{x}$ and $d = a + H\hat{x} - \hat{x}$; then
$$x = Hx + a \;\Longrightarrow\; y = Hy + d.$$
Since the elements of $d$ are much smaller than those of $a$, the transformed linear system can be solved much faster than the original one.
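A sketch of one such refinement step, reusing the illustrative von_neumann_ulam solver from earlier to estimate every component of the correction $y$:

```python
import numpy as np

def sequential_step(H, a, x_hat, walks_per_component=1000):
    """One Halton-style refinement: solve y = Hy + d for the correction
    y = x - x_hat, where d = a + H @ x_hat - x_hat is the residual."""
    d = a + H @ x_hat - x_hat
    y = np.array([von_neumann_ulam(H, d, i, walks_per_component)
                  for i in range(len(a))])
    return x_hat + y   # improved estimate of x
```

Because the walk scores are proportional to the entries of $d$, a smaller residual gives a smaller variance for the same number of walks.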
Variations of Monte Carlo Linear Solver
• Dimov uses a different transition matrix,
$$p_{ij} = \frac{|h_{ij}|}{\sum_j |h_{ij}|}.$$
Since the terminating probabilities no longer exist, the random walk $\gamma$ terminates when $|W(\gamma)|$ is small enough, where
$$W(\gamma) = v_{i_0 i_1} v_{i_1 i_2} \cdots v_{i_{m-1} i_m}$$
and
$$|W(\gamma)| = \Big(\sum_j |h_{i_0 j}|\Big)\Big(\sum_j |h_{i_1 j}|\Big) \cdots \Big(\sum_j |h_{i_{m-1} j}|\Big).$$
Fredholm integral equations of the second kind
• The integral equation
$$f(x) = g(x) + \int K(x, y)\, f(y)\, dy$$
may be solved by Monte Carlo methods, since the integral can be approximated by a quadrature formula:
$$\int_a^b y(x)\, dx \approx \sum_{j=1}^{N} w_j\, y(x_j)$$
Fredholm integral equations of the second kind
• The integral equation can be transformed into
$$f(x) = g(x) + \sum_{j=1}^{N} w_j K(x, y_j)\, f(y_j)$$
and evaluated at the quadrature points:
$$f(x_i) = g(x_i) + \sum_{j=1}^{N} w_j K(x_i, y_j)\, f(y_j)$$
Let $f$ be the vector $(f(x_i))$, $g$ be the vector $(g(x_i))$, and $K$ be the matrix $(w_j K(x_i, y_j))$; the integral equation becomes
$$f = Kf + g$$
where $f$ is the unknown vector.
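A sketch of this discretization using a uniform trapezoidal rule (the kernel, right-hand side, and node count below are illustrative stand-ins):

```python
import numpy as np

def fredholm_system(kernel, g_func, a, b, N):
    """Discretize f(x) = g(x) + integral_a^b K(x,y) f(y) dy on N
    trapezoidal nodes, returning the linear system f = K f + g."""
    x = np.linspace(a, b, N)
    w = np.full(N, (b - a) / (N - 1))
    w[0] = w[-1] = 0.5 * (b - a) / (N - 1)    # trapezoidal weights
    K = kernel(x[:, None], x[None, :]) * w    # K_ij = w_j K(x_i, y_j)
    return K, g_func(x), x

# e.g. a smooth, contractive kernel (illustrative):
K, g, x = fredholm_system(lambda x, y: 0.2 * np.exp(-(x - y)**2),
                          np.sin, 0.0, 1.0, 51)
f = np.linalg.solve(np.eye(len(g)) - K, g)    # direct solve of f = Kf + g
```

The resulting $f = Kf + g$ has exactly the $x = Hx + a$ form, so the random-walk estimators above apply unchanged.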
The Dirichlet Problem
• Dirichlet's problem is to find a function $u$, continuous and differentiable over a closed domain $D$ with boundary $C$, satisfying
$$\nabla^2 u = 0 \text{ on } D, \qquad u = f \text{ on } C,$$
where $f$ is a prescribed function and
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$
is the Laplacian operator.
Replacing $\nabla^2$ by its finite-difference approximation,
$$u(x, y) = \frac{1}{4}\big[\, u(x, y+h) + u(x, y-h) + u(x+h, y) + u(x-h, y) \,\big]$$
The Dirichlet Problem
• Suppose the boundary $C$ lies on the mesh; the previous equations can be transformed into
$$u = Hu + f$$
– The order of $H$ is equal to the number of mesh points in $D$.
– $H$ has four elements equal to $\frac{1}{4}$ in each row corresponding to an interior point of $D$, all other elements being zero.
– $f$ holds the boundary values at the points of $C$, all interior elements being zero.
– A random walk starting from an interior point $P$ terminates when it hits a boundary point $Q$; then $f(Q)$ is an unbiased estimator of $u(P)$, as sketched below.
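A sketch of this walk-on-a-grid estimator on the unit square (the mesh size, boundary data, and names are illustrative):

```python
import numpy as np

def dirichlet_walk(f_boundary, n, ix, iy, n_walks=5000, rng=None):
    """Estimate u at interior mesh point (ix, iy) of an (n+1) x (n+1) grid
    on the unit square by walking to the boundary and averaging f there."""
    rng = np.random.default_rng() if rng is None else rng
    steps = np.array([(1, 0), (-1, 0), (0, 1), (0, -1)])
    total = 0.0
    for _ in range(n_walks):
        x, y = ix, iy
        while 0 < x < n and 0 < y < n:       # still at an interior point
            dx, dy = steps[rng.integers(4)]  # each neighbor has prob 1/4
            x, y = x + dx, y + dy
        total += f_boundary(x / n, y / n)    # score boundary value f(Q)
    return total / n_walks

# e.g. harmonic boundary data u = x^2 - y^2 (illustrative):
print(dirichlet_walk(lambda x, y: x*x - y*y, n=10, ix=5, iy=5))  # ≈ 0
```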
Eigenvalue Problems
• For a given $n \times n$ symmetric matrix $H$,
$$H x_i = \lambda_i x_i, \qquad x_i \ne 0,$$
assume that $\lambda_1 > \lambda_2 \ge \cdots \ge \lambda_n$, so that $\lambda_1$ is the dominant eigenvalue and $x_1$ is the corresponding eigenvector.
For any nonzero vector $u = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ with $a_1 \ne 0$, according to the power method,
$$\lim_{k \to \infty} \frac{H^k u}{\lambda_1^k} = a_1 x_1.$$
We can obtain a good approximation of the dominant eigenvector of $H$ from the above.
Eigenvalue Problems
Similar to the idea behind the Monte Carlo linear solver,
$$(H^k u)_{i_0} = \sum_{i_1, \dots, i_k} h_{i_0 i_1} h_{i_1 i_2} \cdots h_{i_{k-1} i_k}\, u_{i_k}$$
$$= \sum_{i_1, \dots, i_k} p_{i_0 i_1} p_{i_1 i_2} \cdots p_{i_{k-1} i_k}\, p_{i_k} \cdot v_{i_0 i_1} v_{i_1 i_2} \cdots v_{i_{k-1} i_k}\, u_{i_k} / p_{i_k},$$
so we can sample $H^k u$ to estimate its value, and then evaluate the dominant eigenvector $x_1$ by a proper scaling.
From the Rayleigh quotient,
$$\lambda = \frac{x^T H x}{x^T x},$$
the dominant eigenvalue $\lambda_1$ can be approximated from the estimated vector $x_1$, as sketched below.
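A sketch combining fixed-length random walks with the Rayleigh quotient (the walk length, sample counts, and starting vector are illustrative choices; walks of fixed length $k$ need no termination probability):

```python
import numpy as np

def mc_power_method(H, k=20, n_walks=5000, rng=None):
    """Estimate the dominant eigenpair of a symmetric H: length-k random
    walks estimate each component of H^k u, then the Rayleigh quotient
    gives lambda_1. Uses p_ij = |h_ij| / sum_j |h_ij| (no zero rows)."""
    rng = np.random.default_rng() if rng is None else rng
    n = H.shape[0]
    P = np.abs(H) / np.abs(H).sum(axis=1)[:, None]
    u = np.ones(n)                       # starting vector, a_1 != 0 assumed
    v = np.empty(n)
    for i0 in range(n):                  # estimate (H^k u)[i0] by sampling
        total = 0.0
        for _ in range(n_walks):
            i, V = i0, 1.0
            for _ in range(k):
                j = rng.choice(n, p=P[i])
                V *= H[i, j] / P[i, j]   # multiply by v_ij = h_ij / p_ij
                i = j
            total += V * u[i]
        v[i0] = total / n_walks
    x1 = v / np.linalg.norm(v)           # scaled dominant eigenvector
    return x1 @ H @ x1, x1               # Rayleigh quotient (x1^T x1 = 1)
```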
Summary
• This Class
– Monte Carlo Linear Solver
• von Neumann and Ulam method
• Randomized stationary iterative methods
• Variations of Monte Carlo solver
– Fredholm integral equations of the second kind
– The Dirichlet Problem
– Eigenvalue Problems
What I want you to do
• Review Slides
• Work on Assignment 4