Challenges for CLAPDE from Optimization:
A Personal View

Nick Gould (RAL)

    minimize_{x ∈ ℝⁿ}  f(x)   subject to  c_E(x) = 0

CLAPDE, University of Durham, July 2008
CLAPDE Durham, 23rd July 2008 – p. 1/9
Generic quadratic optimization problem from CLAPDE

    minimize_{x ∈ ℝⁿ}  f(x)   subject to  c(x) = 0      ← discrete PDE

⇒ EQP step subproblem s_k from x_k:

    minimize_{s ∈ ℝⁿ}  ½ sᵀH_k s + sᵀg_k   subject to  J_k s + c_k = 0      ← linearized PDE

J_k: Jacobian of the constraints
H_k: symmetric but indefinite, ≈ ∇_xx ℓ(x, y), the Hessian of the Lagrangian

NB. If the PDE is nonlinear, this will influence H_k!

CLAPDE Durham, 23rd July 2008 – p. 2/9
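When the Hessian is positive definite on the constraint null space (the proviso of the next slide), the EQP step can be recovered from the first-order conditions. A minimal dense sketch in Python; all data (H, J, g, c and their sizes) are hypothetical, with H made positive definite so the minimizer exists:

```python
import numpy as np

# Tiny dense illustration of the EQP subproblem:
#   minimize 0.5 s^T H s + s^T g  subject to  J s + c = 0.
# At PDE scale H and J would be large, sparse operators.
n, m = 4, 2
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
H = A @ A.T + np.eye(n)                 # positive definite for this sketch
J = rng.standard_normal((m, n))         # constraint Jacobian (full rank here)
g = rng.standard_normal(n)
c = rng.standard_normal(m)

# First-order (KKT) conditions give the symmetric saddle-point system
#   [ H  J^T ] [ s ]     [ g ]
#   [ J   0  ] [ y ] = - [ c ]
K = np.block([[H, J.T], [J, np.zeros((m, m))]])
sol = np.linalg.solve(K, -np.concatenate([g, c]))
s, y = sol[:n], sol[n:]

assert np.allclose(J @ s + c, 0)            # feasible for the linearized PDE
assert np.allclose(H @ s + g + J.T @ y, 0)  # stationary
```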
EQP is not equivalent to a saddle-point system

    s_k = arg min_{s ∈ ℝⁿ}  ½ sᵀH_k s + sᵀg_k   subject to  J_k s + c_k = 0

⇏ saddle-point solution

    [ H_k  J_kᵀ ] [ s_k ]     [ g_k ]
    [ J_k   0   ] [ y_k ] = − [ c_k ]

unless (∼) sᵀH_k s > 0 for all nonzero s with J_k s = 0; c.f. second-order optimality conditions

Tasks
- find iterative methods which can identify this situation
- if violated, want instead a normalized s_k minimizing sᵀH_k s : J_k s = 0, e.g., an appropriate eigenvector
- if satisfied, eigenvalue bounds?
- OK for constraint-preconditioned CG, but what else?

CLAPDE Durham, 23rd July 2008 – p. 3/9
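The proviso can be checked concretely: with Z a basis for the null space of J, the condition "sᵀH s > 0 for all nonzero s with J s = 0" is positive definiteness of the reduced Hessian ZᵀHZ. A small dense sketch with hypothetical data; at PDE scale this must be detected inside the iterative method itself, which is exactly the first task above:

```python
import numpy as np
from scipy.linalg import null_space

n, m = 5, 2
rng = np.random.default_rng(1)
J = rng.standard_normal((m, n))
H = rng.standard_normal((n, n)); H = 0.5 * (H + H.T)   # symmetric, indefinite in general

Z = null_space(J)                      # J @ Z == 0, columns orthonormal
evals, evecs = np.linalg.eigh(Z.T @ H @ Z)

if evals.min() > 0:
    print("reduced Hessian positive definite: EQP step = saddle-point solution")
else:
    # an "appropriate eigenvector": unit direction of most negative
    # curvature lying in the constraint null space
    s = Z @ evecs[:, 0]
    print("negative curvature along s:", s @ H @ s)   # equals evals.min()
assert np.allclose(J @ Z, 0, atol=1e-10)
```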
Residual reduction is not enough

would like reduction in

    (∗)   ‖J_k s + c_k‖   and/or   ½ sᵀH_k s + sᵀg_k

minimum-residual-like methods (MINRES, GMRES, QMR, …) aim for reduction in

    ‖ [ H_k s + g_k + J_kᵀy ] ‖
    ‖ [ J_k s + c_k         ] ‖

- may not reduce (∗) for many iterations
- are there iterative methods which can ensure (∗) every iteration? Every pair of iterations?

CLAPDE Durham, 23rd July 2008 – p. 4/9
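This behaviour can be observed directly: run a minimum-residual method on the KKT system and record ‖J s + c‖ at every iteration. A sketch with hypothetical dense data, using scipy's MINRES as a stand-in:

```python
import numpy as np
from scipy.sparse.linalg import minres

# MINRES drives down the *combined* KKT residual; the constraint piece
# ||J s + c|| may meanwhile stagnate or grow for many iterations.
n, m = 30, 10
rng = np.random.default_rng(2)
H = rng.standard_normal((n, n)); H = 0.5 * (H + H.T)
J = rng.standard_normal((m, n))
g, c = rng.standard_normal(n), rng.standard_normal(m)

K = np.block([[H, J.T], [J, np.zeros((m, m))]])
rhs = -np.concatenate([g, c])

history = []                            # constraint residual per iteration
minres(K, rhs,
       callback=lambda xk: history.append(np.linalg.norm(J @ xk[:n] + c)))

nonmonotone = any(b > a for a, b in zip(history, history[1:]))
print("||J s + c|| ever increased:", nonmonotone)
```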
Three basic computational components

Find s_k = n_k + t_k where

    J_k n_k + c_k ≈ 0                                        (normal step)
    J_kᵀy_k + g_k ≈ 0                                        (multiplier estimate)
    (approx min)  ½ t_kᵀH_k t_k + t_kᵀg_k : J_k t_k ≈ 0      (tangential step)

- all need to be efficient and matrix-free
- may need to "regularise" (trust-region/cubic regularisation?)
- can embed within a globally convergent "funnel" framework

C.f. linesearch-based methods built on iterative saddle-point solution, which (to my knowledge) use heuristic perturbations to H_k
- how do non-trivial perturbations affect the excellent PDE-based preconditioners?

CLAPDE Durham, 23rd July 2008 – p. 5/9
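A dense sketch of the three solves, with hypothetical data: H is made positive definite so the tangential problem has a minimizer, direct least-squares stands in for the matrix-free methods required at scale, and the tangential gradient is shifted by H n so that s = n + t solves the full EQP:

```python
import numpy as np
from scipy.linalg import null_space

n, m = 6, 2
rng = np.random.default_rng(3)
A = rng.standard_normal((n, n))
H = A @ A.T + np.eye(n)                 # positive definite for this sketch
J = rng.standard_normal((m, n))
g, c = rng.standard_normal(n), rng.standard_normal(m)

# 1) normal step: minimum-norm n with J n + c ≈ 0
n_step = -np.linalg.lstsq(J, c, rcond=None)[0]

# 2) multiplier estimate: least-squares y with J^T y + g ≈ 0
y = -np.linalg.lstsq(J.T, g, rcond=None)[0]

# 3) tangential step: minimize 1/2 t^T H t + t^T (g + H n) over J t = 0,
#    solved in a null-space basis Z of J
Z = null_space(J)
t = Z @ np.linalg.solve(Z.T @ H @ Z, -Z.T @ (g + H @ n_step))

s = n_step + t
assert np.allclose(J @ s + c, 0)        # s satisfies the linearized PDE
```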
Multilevel methods

It is unclear how best to use multigrid in the PDE-optimization context
- apply linear multigrid to the EQP subproblem
- apply nonlinear multigrid/multilevel ideas
    - geometric (Toint, Gratton, Sartenaer, Mouffe, …)
    - algebraic

CLAPDE Durham, 23rd July 2008 – p. 6/9
Auxiliary constraints

If there are additional non-PDE side constraints on (e.g.) the controls:
- extra equations
- simple bounds on variables
- general inequalities
- integer restrictions

how can we impose these without destroying PDE-specific structure (e.g. preconditioners)?

CLAPDE Durham, 23rd July 2008 – p. 7/9
"Big" questions

Krylov-based methods treat

    [ H  Jᵀ ]
    [ J  0  ]

as a generic matrix/operator
- are there new methods which really exploit the zero block?
- are there new methods which really exploit the substructure?

Krylov-based methods obtain products

    [ H  Jᵀ ] [ u ]
    [ J  0  ] [ v ]

i.e., Hu, Ju and Jᵀv
- are there better methods without such strong ties, e.g., Hu, Jw and Jᵀv?

CLAPDE Durham, 23rd July 2008 – p. 8/9
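The product structure can be made explicit with a matrix-free operator: a Krylov solver only ever sees the three products Hu, Ju and Jᵀv, stitched together. A sketch with hypothetical data, using scipy's LinearOperator as an example interface:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

n, m = 8, 3
rng = np.random.default_rng(4)
H = rng.standard_normal((n, n)); H = 0.5 * (H + H.T)
J = rng.standard_normal((m, n))

def kkt_matvec(x):
    u, v = x[:n], x[n:]
    return np.concatenate([H @ u + J.T @ v,   # the Hu and J^T v products
                           J @ u])            # the Ju product; zero block adds nothing

K = LinearOperator((n + m, n + m), matvec=kkt_matvec, dtype=float)

# any Krylov method (MINRES, GMRES, ...) can now be applied to K, yet it
# still treats the coupled operator as a black box
x = rng.standard_normal(n + m)
Kdense = np.block([[H, J.T], [J, np.zeros((m, m))]])
assert np.allclose(K.matvec(x), Kdense @ x)   # operator matches the dense matrix
```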
Over to you . . .
Thanks to Alison, Andy and David!
CLAPDE Durham, 23rd July 2008 – p. 9/9