A Roadmap of Newton-type Methods

A. Andreeva
Ordinary Newton method
• For general nonlinear problems:
F′(x^k) Δx^k = −F(x^k),   x^{k+1} = x^k + Δx^k
• For a system of n nonlinear equations, the Jacobian matrix is required
• First we compute the Newton correction Δx^k and then improve the iterate x^k to obtain x^{k+1} (sketched in code below)
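A minimal sketch of this iteration in Python (the helper names and the 2×2 test problem are illustrative, not from the source):

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, kmax=50):
    """Ordinary Newton method: solve F'(x^k) dx = -F(x^k), then x^{k+1} = x^k + dx."""
    x = np.asarray(x0, dtype=float)
    for k in range(kmax):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        dx = np.linalg.solve(J(x), -Fx)   # Newton correction dx^k
        x = x + dx                        # improved iterate x^{k+1}
    return x

# Illustrative 2x2 system: x^2 + y^2 = 4, x*y = 1
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(newton(F, J, [2.0, 0.5]))
```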
Simplified Newton method
• Keeps the initial derivative throughout the whole
iteration:
F′(x^0) Δx^k = −F(x^k),   x^{k+1} = x^k + Δx^k
• Saves computational cost per iteration (see the sketch below)
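A minimal sketch, assuming SciPy's LU routines; the single factorization of F′(x^0) is what saves the per-iteration cost:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def simplified_newton(F, J, x0, tol=1e-10, kmax=200):
    """Simplified Newton: factor F'(x^0) once, reuse it for all corrections."""
    x = np.asarray(x0, dtype=float)
    lu, piv = lu_factor(J(x))             # one LU factorization of F'(x^0)
    for k in range(kmax):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        x = x + lu_solve((lu, piv), -Fx)  # only cheap triangular solves per step
    return x
```

The convergence order drops from quadratic to linear, but each step avoids both a Jacobian evaluation and a new factorization.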
Newton-like methods
• In finite dimension, the Jacobian matrix is:
– replaced by some fixed ‘close by’ Jacobian F′(z), z ≠ x^0, or
– approximated by some matrix M(x^k)
M(x^k) δx^k = −F(x^k),   x^{k+1} = x^k + δx^k
Exact Newton methods
• When the equation
F′(x^k) Δx^k = −F(x^k)
can be solved using direct elimination methods, we
speak of exact Newton methods
• May give erroneous results when scaling issues are ignored
Local versus global Newton methods
• Local Newton methods require a sufficiently good initial guess
• Global Newton methods compensate via damping or adaptive trust-region strategies (a simple damping loop is sketched after this list)
• Exact global Newton codes:
– NLEQ-RES – residual based
– NLEQ-ERR – error oriented
– NLEQ-OPT – convex optimization
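A minimal residual-based damping sketch (a crude stand-in for the adaptive, affine-invariant strategies of codes like NLEQ-RES, not their actual algorithm):

```python
import numpy as np

def damped_newton(F, J, x0, tol=1e-10, kmax=100, lam_min=1e-8):
    """Global Newton sketch: damp the correction until the residual norm decreases."""
    x = np.asarray(x0, dtype=float)
    for k in range(kmax):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        dx = np.linalg.solve(J(x), -Fx)
        lam = 1.0                          # damping factor, start with the full step
        while lam > lam_min and np.linalg.norm(F(x + lam * dx)) >= np.linalg.norm(Fx):
            lam *= 0.5                     # halve until the residual test passes
        x = x + lam * dx
    return x
```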
Inexact Newton methods
• Inner iteration:
F′(x^k) δx^k_i = −F(x^k) + r^k_i
• Outer iteration:
x^{k+1}_i = x^k + δx^k_i,   x^{k+1} = x^{k+1}_i
• In comparison with exact Newton methods, an error arises:
δx^k − Δx^k
• GIANT – Global Inexact Affine invariant Newton Techniques
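A minimal sketch, assuming a deliberately crude inner solver (steepest descent on the normal equations, standing in for GMRES or CGNE) stopped by a forcing-term test ‖r^k_i‖ ≤ η‖F(x^k)‖; the value of η and the solver choice are illustrative assumptions:

```python
import numpy as np

def inner_solve(A, b, eta, imax=500):
    """Crude inner iteration: stop once the inner residual satisfies ||r_i|| <= eta*||b||."""
    y, r = np.zeros_like(b), b.copy()
    for i in range(imax):
        if np.linalg.norm(r) <= eta * np.linalg.norm(b):
            break
        p = A.T @ r                        # descent direction for ||b - A y||^2
        Ap = A @ p
        y = y + ((p @ p) / (Ap @ Ap)) * p  # exact line search along p
        r = b - A @ y
    return y

def inexact_newton(F, J, x0, eta=0.1, tol=1e-8, kmax=100):
    """Outer iteration: accept the approximate correction as the Newton step."""
    x = np.asarray(x0, dtype=float)
    for k in range(kmax):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        x = x + inner_solve(J(x), -Fx, eta)
    return x
```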
Preconditioning
• Direct elimination of ‘similar’ linear systems
C_L F′(x^k) C_R C_R^{−1} (δx^k_i − Δx^k_i) = C_L r^k_i
• The residual or error norms need to be replaced by their preconditioned counterparts:
r^k_i, δx^k_i − Δx^k_i   →   C_L r^k_i, C_R^{−1} (δx^k_i − Δx^k_i)
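As an illustrative sketch (the Jacobi diagonal choice for C_L and the identity C_R are assumptions, not from the source), preconditioning just transforms the system handed to the inner solver:

```python
import numpy as np

def precondition(A, b):
    """Form the preconditioned system (C_L A C_R) u = C_L b; here C_L is a
    Jacobi (diagonal) preconditioner and C_R = I, purely for illustration."""
    C_L = np.diag(1.0 / np.diag(A))       # assumes a nonzero diagonal
    C_R = np.eye(A.shape[0])
    return C_L @ A @ C_R, C_L @ b, C_R    # solve for u, then recover dx = C_R u
```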
Matrix-free Newton methods
• Numerical difference approximation (see the sketch below):
F′(x) v ≈ (F(x + δv) − F(x)) / δ
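A short sketch; scaling δ with the square root of machine precision is a common heuristic, not prescribed by the source:

```python
import numpy as np

def jacvec(F, x, v, delta=None):
    """Matrix-free Jacobian-vector product F'(x) v via a forward difference."""
    if delta is None:
        delta = np.sqrt(np.finfo(float).eps) * max(1.0, np.linalg.norm(x))
    return (F(x + delta * v) - F(x)) / delta
```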
Secant methods
• Substitute the tangent by the secant
f′(x^k + δx^k) → (f(x^k + δx^k) − f(x^k)) / δx^k = j_{k+1}
• Compute the correction:
δx^{k+1} = −f(x^{k+1}) / j_{k+1},   x^{k+1} = x^k + δx^k
• Converges locally superlinearly (see the scalar sketch below)
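A minimal scalar sketch of this iteration (the test function is illustrative):

```python
def secant(f, x0, x1, tol=1e-12, kmax=50):
    """Scalar secant method: the secant slope j_{k+1} replaces f'(x)."""
    for k in range(kmax):
        j = (f(x1) - f(x0)) / (x1 - x0)   # secant slope j_{k+1}
        x0, x1 = x1, x1 - f(x1) / j       # correction dx = -f(x)/j
        if abs(x1 - x0) < tol:
            break
    return x1

print(secant(lambda x: x**2 - 2.0, 1.0, 2.0))  # converges to sqrt(2)
```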
Quasi-Newton methods
• Extends the secant idea to systems of equations:
J δx^k = F(x^{k+1}) − F(x^k)
• Previous quasi-Newton step:
J_k δx^k = −F(x^k)
• Jacobian rank-1 update:
J_{k+1} = J_k + F(x^{k+1}) z^T / (z^T δx^k)
• Next quasi-Newton step:
J_{k+1} δx^{k+1} = −F(x^{k+1})
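A minimal sketch of the rank-1 update with the particular choice z = δx^k, which gives Broyden's ‘good' method (the slide leaves z general; this choice is an assumption for illustration):

```python
import numpy as np

def broyden(F, x0, J0, tol=1e-10, kmax=100):
    """Quasi-Newton sketch: Broyden rank-1 Jacobian updates, no refactorization logic."""
    x, Jk = np.asarray(x0, dtype=float), np.asarray(J0, dtype=float)
    for k in range(kmax):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        dx = np.linalg.solve(Jk, -Fx)             # J_k dx^k = -F(x^k)
        x = x + dx
        # rank-1 update with z = dx^k: J_{k+1} = J_k + F(x^{k+1}) z^T / (z^T dx^k)
        Jk = Jk + np.outer(F(x), dx) / (dx @ dx)
    return x
```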
Gauss-Newton methods
• Appropriate for nonlinear least squares problems (a minimal local variant is sketched below)
• Must be statistically well-posed (to be discussed
later in Sections 2 and 3)
• Two classes of Gauss-Newton methods:
– Local – a good initial guess is required
– Global – otherwise
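A minimal local Gauss-Newton sketch for min_x ‖R(x)‖² (function names are illustrative):

```python
import numpy as np

def gauss_newton(R, JR, x0, tol=1e-10, kmax=100):
    """Local Gauss-Newton: solve the linearized least-squares problem each step."""
    x = np.asarray(x0, dtype=float)
    for k in range(kmax):
        dx, *_ = np.linalg.lstsq(JR(x), -R(x), rcond=None)  # J(x^k) dx ~ -R(x^k)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```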
Quasilinearization
• Infinite dimensional Newton methods for operator
equations
• The linearized equations can be solved only
approximately
• Similar to inexact Newton methods, where the ‘truncation errors’ correspond to ‘approximation errors’
Inexact Newton multilevel methods
• Infinite dimensional linear Newton systems are
approximately solved by linear multilevel methods
• When the approximation errors are controlled within
an abstract framework of inexact Newton methods,
we speak of an adaptive Newton multilevel method
Multilevel Newton methods
• Schemes wherein a finite dimensional Newton
multigrid method is applied on each level
Nonlinear multigrid methods
• Not Newton methods
• Rather fixed-point iteration methods
• Not treated here
Adaptive inner solver for inexact Newton methods
• Idea: solve iteratively the linear systems for the
Newton corrections
• The inexact Newton system is given as:
A y_i = b − r_i,   i = 0, 1, ..., i_max
• Several termination criteria:
– Residual norm ‖r_i‖ is small enough
– Error norm ‖y − y_i‖ is small enough
– Energy norm ‖A^{1/2}(y − y_i)‖ of the error is small enough
Residual norm minimization: GMRES
• Initial approximation y_0 ≈ y, initial residual r_0 = b − A y_0
• Set: β = ‖r_0‖, v_1 = r_0 / β, V_1 = (v_1), iterate i = 1, 2, ..., i_max
• Step 1. Orthogonalization:
v̂_{i+1} = A v_i − V_i h_i, where h_i = V_i^T A v_i
• Step 2. Normalization:
v_{i+1} = v̂_{i+1} / ‖v̂_{i+1}‖_2
Residual norm minimization: GMRES
• Step 3. Update:
V_{i+1} = (V_i  v_{i+1}),   H_i = [ H_{i−1}  h_i ; 0  ‖v̂_{i+1}‖_2 ]   (for i = 1 drop the left block column)
• Step 4. Least squares problem: z_i = argmin_z ‖β e_1 − H_i z‖
• Step 5. Approximate solution: y_i = V_i z_i + y_0 (all five steps are assembled in the sketch below)
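Assembling Steps 1-5 into a runnable sketch (dense linear algebra, no restarts or preconditioning; production implementations update the least-squares solution with Givens rotations instead of re-solving it each step):

```python
import numpy as np

def gmres(A, b, y0, imax=50, tol=1e-10):
    """Minimal GMRES sketch following Steps 1-5 above."""
    r0 = b - A @ y0
    beta = np.linalg.norm(r0)
    V = (r0 / beta)[:, None]                    # V_1 = (v_1)
    H = np.zeros((imax + 1, imax))              # extended Hessenberg matrix
    for i in range(imax):
        w = A @ V[:, i]
        H[:i + 1, i] = V.T @ w                  # Step 1: h_i = V_i^T A v_i
        w = w - V @ H[:i + 1, i]                #         v-hat_{i+1} = A v_i - V_i h_i
        H[i + 1, i] = np.linalg.norm(w)         # Step 3: last entry ||v-hat_{i+1}||_2
        e1 = np.zeros(i + 2); e1[0] = beta
        z, *_ = np.linalg.lstsq(H[:i + 2, :i + 1], e1, rcond=None)  # Step 4
        if H[i + 1, i] < 1e-14 or np.linalg.norm(e1 - H[:i + 2, :i + 1] @ z) < tol:
            return y0 + V @ z                   # Step 5: y_i = V_i z_i + y_0
        V = np.hstack([V, (w / H[i + 1, i])[:, None]])  # Steps 2-3: normalize, extend V_i
    return y0 + V[:, :imax] @ z
```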
Characteristics of GMRES
• Storage: up to iteration i requires storing i+2 vectors of length n
• Computational amount: each iteration performs one matrix-vector multiplication; up to iteration i, about i²n flops
• Preconditioning: best preconditioning for C_L = I, i.e. only right preconditioning should be realized
Energy norm minimization: PCG
• For a symmetric positive definite matrix A, the energy product and energy norm are defined as:
(u, v) = ⟨u, Av⟩   and   ‖u‖_A² = (u, u)
• Idea: for a positive definite B ≈ A^{−1}, it is much easier to compute z = Bc than to solve Ay = b (see the sketch below)
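A minimal PCG sketch (B is an explicit matrix here purely for simplicity; in practice z = Bc is realized by an application routine, e.g. an incomplete factorization):

```python
import numpy as np

def pcg(A, b, B, y0, imax=200, tol=1e-10):
    """Preconditioned CG sketch: A symmetric positive definite, B ~ A^{-1}."""
    y = np.asarray(y0, dtype=float)
    r = b - A @ y
    z = B @ r                                  # preconditioning step z = B r
    p = z.copy()
    rz = r @ z
    for i in range(imax):
        Ap = A @ p
        alpha = rz / (p @ Ap)                  # step length minimizing the energy norm
        y, r = y + alpha * p, r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = B @ r
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p              # new A-conjugate search direction
    return y
```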
Error norm minimization: CGNE
• Idea: minimize the error norm ‖y − y_i‖
• Initialize: initial approximation y_0, initial residual r_0 = b − A y_0
• Set: p_0 = 0, β_0 = 0, σ_0 = ‖r_0‖²
Error norm minimization: CGNE
For i = 1, 2, ..., i_max:
p_i = A^T r_{i−1} + β_{i−1} p_{i−1}
α_i = σ_{i−1} / ‖p_i‖²,   γ_i² = α_i σ_{i−1}   (Euclidean error contribution: ‖y_i − y_{i−1}‖² = γ_i²)
y_i = y_{i−1} + α_i p_i
r_i = r_{i−1} − α_i A p_i
σ_i = ‖r_i‖²,   β_i = σ_i / σ_{i−1}
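A direct transcription of this recurrence into a runnable sketch:

```python
import numpy as np

def cgne(A, b, y0, imax=200, tol=1e-10):
    """CGNE sketch following the recurrence above: minimizes the error norm ||y - y_i||."""
    y = np.asarray(y0, dtype=float)
    r = b - A @ y
    p, beta, sigma = np.zeros_like(y), 0.0, r @ r
    for i in range(imax):
        p = A.T @ r + beta * p             # p_i = A^T r_{i-1} + beta_{i-1} p_{i-1}
        alpha = sigma / (p @ p)            # alpha_i = sigma_{i-1} / ||p_i||^2
        y = y + alpha * p                  # squared error contribution: alpha_i * sigma_{i-1}
        r = r - alpha * (A @ p)
        sigma, sigma_old = r @ r, sigma
        if np.sqrt(sigma) < tol:
            break
        beta = sigma / sigma_old           # beta_i = sigma_i / sigma_{i-1}
    return y
```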
Characteristics of CGNE
• Storage: up to iteration i requires only 3 vectors of
length n
• Computational amount: up to step i the Euclidean
inner products sum up to 5in flops
• Preconditioning: ‖C_R^{−1}(y − y_i)‖ is minimized. Therefore, only left preconditioning should be realized.