Challenges for CLAPDE from Optimization: A Personal View

Nick Gould (RAL)

\[ \min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad c_E(x) = 0 \]

CLAPDE, University of Durham, July 2008
CLAPDE Durham, 23rd July 2008 – p. 1/9

Generic quadratic optimization problem from CLAPDE

\[ \min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad c(x) = 0 \quad \longleftarrow \text{discrete PDE} \]

\(\Longrightarrow\) EQP step subproblem s_k from x_k:

\[ \min_{s \in \mathbb{R}^n} \tfrac{1}{2} s^T H_k s + s^T g_k \quad \text{subject to} \quad J_k s + c_k = 0 \quad \longleftarrow \text{linearized PDE} \]

- J_k: Jacobian of the constraints
- H_k: symmetric but indefinite, \(\approx \nabla_{xx} \ell(x, y)\), the Hessian of the Lagrangian

NB. If the PDE is nonlinear, this will influence H_k!

EQP is not equivalent to a saddle-point system

\[ s_k = \arg\min_{s \in \mathbb{R}^n} \tfrac{1}{2} s^T H_k s + s^T g_k \quad \text{subject to} \quad J_k s + c_k = 0 \]

\[ \not\Longrightarrow \quad \text{saddle-point solution} \quad \begin{pmatrix} H_k & J_k^T \\ J_k & 0 \end{pmatrix} \begin{pmatrix} s_k \\ y_k \end{pmatrix} = - \begin{pmatrix} g_k \\ c_k \end{pmatrix} \]

unless (roughly) \(s^T H_k s > 0\) for all nonzero s with \(J_k s = 0\) (cf. second-order optimality)
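When the second-order condition does hold, the saddle-point system above can simply be assembled and solved for the EQP step. A minimal NumPy sketch with small, hypothetical data (H, J, g, c are illustrative, chosen so H is positive definite on the nullspace of J):

```python
import numpy as np

# Hypothetical small data: H must be positive definite on null(J)
# for the saddle-point solve to return the EQP minimizer.
H = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric "Hessian of Lagrangian"
J = np.array([[1.0, 1.0]])               # Jacobian of the linearized constraint
g = np.array([1.0, 2.0])
c = np.array([-1.0])                     # want J s + c = 0

n, m = H.shape[0], J.shape[0]
# Assemble and solve the saddle-point (KKT) system
#   [ H  J^T ] [ s ]     [ g ]
#   [ J   0  ] [ y ] = - [ c ]
K = np.block([[H, J.T], [J, np.zeros((m, m))]])
sol = np.linalg.solve(K, -np.concatenate([g, c]))
s, y = sol[:n], sol[n:]

print(np.allclose(J @ s + c, 0))             # linearized constraint satisfied
print(np.allclose(H @ s + g + J.T @ y, 0))   # stationarity of the Lagrangian
```

Both checks print True: the solve delivers the step s_k and multiplier estimate y_k together, which is exactly what fails to be meaningful when the "unless" condition is violated.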
Tasks:
- find iterative methods which can identify this situation
- if violated, want instead a normalized s_k minimizing \(s^T H_k s\) subject to \(J_k s = 0\), e.g., an appropriate eigenvector
- if satisfied, eigenvalue bounds?
- OK for constraint-preconditioned CG, but what else?
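The condition being tested, and the "appropriate eigenvector" fallback, can be made concrete with an explicit nullspace basis (dense and small here purely for illustration; the data are hypothetical and deliberately chosen so H is indefinite on null(J)):

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical data: H is indefinite on the nullspace of J.
H = np.array([[1.0,  0.0, 0.0],
              [0.0, -2.0, 0.0],
              [0.0,  0.0, 3.0]])
J = np.array([[1.0, 0.0, 1.0]])

# The saddle-point solve returns the EQP minimizer only if s^T H s > 0
# for all nonzero s with J s = 0, i.e. the reduced Hessian Z^T H Z is
# positive definite, where the columns of Z span null(J).
Z = null_space(J)
reduced = Z.T @ H @ Z
eigvals, eigvecs = np.linalg.eigh(reduced)
print(eigvals.min() > 0)   # False here: the condition is violated

# A direction of negative curvature in null(J) -- the kind of
# "appropriate eigenvector" the slide asks an iterative method to find.
d = Z @ eigvecs[:, 0]      # eigenvector of the smallest reduced eigenvalue
print(float(d @ H @ d) < 0 and np.allclose(J @ d, 0))
```

Of course, forming Z densely defeats the matrix-free goal; the open question on the slide is precisely how to detect this situation within an iterative method.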
Residual reduction is not enough

would like reduction in

(∗)  \(\| J_k s + c_k \|\)  and/or  \(\tfrac{1}{2} s^T H_k s + s^T g_k\)

minimum-residual-like methods (MINRES, GMRES, QMR, ...) aim for reduction in

\[ \left\| \begin{pmatrix} H_k s + g_k + J_k^T y \\ J_k s + c_k \end{pmatrix} \right\| \]

- may not reduce (∗) for many iterations
- are there iterative methods which can ensure (∗) decreases every iteration? every pair of iterations?

Three basic computational components

Find s_k = n_k + t_k where

\[ J_k n_k + c_k \approx 0, \qquad J_k^T y_k + g_k \approx 0, \qquad (\text{approx min}) \ \tfrac{1}{2} t_k^T H_k t_k + t_k^T g_k \ : \ J_k t_k \approx 0 \]

- all need to be efficient and matrix-free
- may need to "regularise" (trust-region/cubic regularisation?)
- can embed within a globally convergent "funnel" framework
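The three components above can be sketched for a tiny dense instance (hypothetical data; the explicit nullspace basis for the tangential step is for illustration only, whereas the slide asks for matrix-free versions of all three):

```python
import numpy as np

# Hypothetical convex data for one step
H = np.array([[3.0, 0.5], [0.5, 2.0]])
J = np.array([[1.0, -1.0]])
g = np.array([2.0, -1.0])
c = np.array([2.0])

# (1) normal step: minimum-norm solution of J n + c = 0
n_k = -np.linalg.pinv(J) @ c
# (2) multiplier estimate: least-squares solution of J^T y + g ~= 0
y_k = np.linalg.lstsq(J.T, -g, rcond=None)[0]
# (3) tangential step: minimize 1/2 t^T H t + t^T g over J t = 0,
#     here via an explicit orthonormal basis Z for null(J)
Z = np.array([[1.0], [1.0]]) / np.sqrt(2.0)
t_k = Z @ np.linalg.solve(Z.T @ H @ Z, -Z.T @ g)

s_k = n_k + t_k
print(np.allclose(J @ s_k + c, 0))   # linearized PDE satisfied by s_k
```

Since J t_k = 0 by construction, the composite step inherits feasibility from the normal step alone, which is what lets the two pieces be computed (and regularised) separately.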
Cf. linesearch-based methods based on iterative saddle-point solution, which (to my knowledge) use heuristic perturbations to H_k
- how do non-trivial perturbations affect the excellent PDE-based preconditioners?

Multilevel methods

It is unclear how best to use multigrid in the PDE-optimization context
- apply linear multigrid to the EQP subproblem
- apply nonlinear multigrid/multilevel ideas
  - geometric (Toint, Gratton, Sartenaer, Mouffe, ...)
  - algebraic

Auxiliary constraints

If there are additional non-PDE side constraints on (e.g.) the controls:
- extra equations
- simple bounds on variables
- general inequalities
- integer restrictions

how can we impose these without destroying PDE-specific structure, e.g., preconditioners?

"Big" questions

Krylov-based methods treat

\[ \begin{pmatrix} H & J^T \\ J & 0 \end{pmatrix} \]

as a generic matrix/operator
- are there new methods which really exploit the zero block?
- are there new methods which really exploit the substructure?

Krylov-based methods obtain the products

\[ \begin{pmatrix} H & J^T \\ J & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}, \quad \text{i.e.,} \quad Hu, \ Ju \ \text{and} \ J^T v \]

- are there better methods without such strong ties, e.g., \(Hu\), \(Jw\) and \(J^T v\)?

Over to you ...

Thanks to Alison, Andy and David!