Application of Quasi-Newton Algorithms in Optimal Design Sebastian Ueckert, Joakim Nyberg, Andrew C. Hooker Pharmacometrics Research Group Department of Pharmaceutical Biosciences Uppsala University Sweden Outline 1. Optimizing Designs 2. Introduction: Quasi-Newton Methods (QNMs) 3. Performance QNMs 4. Advantages QNMs 5. Laplace Approximation for Global Optimal Design 6. Using QNMs in Laplace Approximation 2 Optimizing a Design Model Parameters α Estimation Design variables x Data j ( x) f F ( , x) e.g. D-Optimal Design j ( x ) F , x x* arg max j ( x) xX 3 Optimization • Interval methods True global optimizers Hard to implement Still under development • Stochastic methods – Simulated Annealing (SA), Ant colony optimization, Genetic Algorithm(GA) Easy to implement (SA) Marketing effective (GA) Slow No information about solution Heuristic 4 Optimization • Derivative free methods – Downhill Simplex Method No derivatives necessary Robust Slow Local 5 Gradient Based Methods 6 Gradient Based Methods 7 Gradient Based Methods 8 Gradient Based Methods 9 Gradient Based Methods 10 Gradient Based Methods 11 Gradient Based Methods Mathematically well understood Fast (if OFV calc not too expensive) Only local Complicated to implement • Steepest Descent • Conjugate Gradient 12 Newton Method Goal: x* arg max f ( x) xR Algorithm: 1. 2. Set xk=x0 Determine search direction f ( xk p) f k pT f k 12 pT 2 f k p : mk ( p ) mk ( p ) f k 2 f k p 0 p* 2 f k1f k 3. 4. Do line search along p* to find minimal xk+1 Set xk=xk+1 and go to 2 13 Newton Method Goal: x* arg max f ( x) xR Algorithm: 1. 2. Set xk=x0 Determine search direction f ( xk p) f k pT f k 12 pT 2 f k p : mk ( p ) mk ( p ) f k 2 f k p 0 p* 2 f k1f k 3. 4. Do line search along p* to find minimal xk+1 Set xk=xk+1 and go to 2 14 Quasi-Newton Methods Problem: Approach: Calculation of Hessian is computationally expensive Use approx. Hessian and build up during search Algorithm: 1. 2. Set xk=x0, Bk=I Determine search direction f ( xk p) f k pT f k 12 pT Bk f k p : mk ( p ) mk ( p) f k Bk p 0 p* Bk1f k 3. 4. Do line search along p* to find minimal xk+1 Set xk=xk+1, Bk=Bk+Uk and go to 2 15 Quasi-Newton Methods • Different methods for different updating formulas yk f k 1 f k sk xk 1 xk – Davidon–Fletcher–Powell (DFP) yk xkT Bk 1 I T T yk xk yk xkT Bk I T T yk xk yk ykT T yk xk – Broyden-Fletcher-Goldfarb-Shanno (BFGS) Bk xk Bk xk yk y Bk 1 Bk T yk xk xkT Bk xk T k T 16 Constraints • Experiments usually come with practicality constraints e.g.: – Administered dose has to be smaller than X mg – Sampling times can only be taken until 8 h after dosing Box Constraints ui xi bi BFGS-B 17 BFGS-B Algorithm: 1. 2. Set xk=x0, Bk=I Determine search direction f ( xk p) f k pT f k 12 pT Bk f k p : mk ( p ) mk ( p) f k Bk p 0 p* Bk1f k 3. 4. 5. Project search direction vector on feasible region Do line search along p* to find minimal xk+1 respecting bounds Set xk=xk+1, Bk=Bk+Uk and go to 2 18 Comparison • Test Scenario – Model: • PKPD (1 cmp oral absorption; IMAX drug effect) • All parameters (ka,CL,V,IC50, E0, IMAX) with log-normal IIV 30% CV • PK parameters fixed • Combined error – Design: • 3 groups (40,30,30 subjects) • 1 PK and 1 PD sample per subject • Approach: – Generate random initial values – Optimize with steepest descent and BFGS 19 Results 10 15.03 0 10 5 40 50 15 30 20 20 Runtime [s] Frequency[%] 25 0 0 60.84 60 30 2 4 OFV 6 8 BFGS Steepest Descent 10 x 10 Steepest Descent BFGS 20 Design Sensitivity • Approximate Hessian matrix can be used to assess sensitivity of design (at no additional computational costs) – Diagonal of the inverse of the Hessian – Use approximate efficiency Eff ( x ) j ( x) j x* j ( x* a ) j * aT j * 12 aT B *a j* aT j* 12 aT B *a Eff (a) j* 21 Design Sensitivity - Visual 1.02 1.02 1.02 1 1 1 0.98 0.98 0.98 0.96 0.96 0.96 0.94 0.94 0.94 0.92 0.92 0.92 0.9 1 1.5 2 2.5 Group 2 PD 3 0.9 6 6.5 7 7.5 Group 1 PK 8 8.5 0.9 7 7.5 8 8.5 9 9.5 Group 1 PD 22 Design Sensitivity - Numerical PK Sample PD Sample Group 1 7.12 [0.35;13.9] 8.38[5.28;11.38] Group 2 1.26 [0;3.74] 1.79[1.03;2.55] Group 3 9.22 [-1.31E;+1.31E] 0[0;0.0025] 100 100 100 95 95 95 90 90 90 85 85 85 80 0 20 40 60 Group 1 PK 80 100 80 0 20 40 60 Group 3 PK 80 100 80 0 20 40 60 80 Group 2 PD 100 23 LAPLACE APPROXIMATION 24 Global Optimal Design • Integral has to be evaluated • FIM occurs in integrand • For example ED optimal design: j ED ( x) p( ) F ( , x) d • Usually evaluated with Monte-Carlo integration Computationally intensive or imprecise 25 Laplace Approximation j ED ( x) p( ) F ( , x) d k , x e d k , x : log p( ) F ( , x) 1 2 k , x 2 m e k m ,x m arg min k , x 26 Laplace Approximation Algorithm: 1. Minimize k , x : log p( ) F ( , x) 2. Calculate the Hessian 2 k m , x 3. B Evaluate 1 2 2 k m , x e k m ,x 27 Laplace-BFGS Approximation Algorithm: 1. 2. Minimize using BFGS algorithm k , x : log p( ) F ( , x) Evaluate k m , x 1 e 2 B 28 Laplace-BFGS – Random Effects m arg min k , x Problem: Approach: For variance parameter α ≥ 0 Perform optimization on log-domain g ( ) e Algorithm: 1. Minimize using BFGS algorithm k , x : log p( g ( )) F ( g ( ), x) 2. Rescale approximate Hessian B B g T g 3. Evaluate 1 2 B e 1 k g ( m ), x 29 Comparison • Comparison of 4 algorithms: 1. 2. 3. 4. • Monte Carlo integration with random sampling (MC-RS) Monte Carlo integration with Latin hypercube sampling (MC-LHS) Laplace integral approximation (LAPLACE) Laplace integral approximation with BFGS Hessian (LAPLACE-BFGS) Testing MC methods with 50 and 500 random samples 30 Comparison • Test Scenario – Model: • 1 cmp IV bolus • CL,V with log-normal IIV • Additive error – Design: • 20 subjects • 2 samples per subject – Parameter distribution: • Log-normal an all parameters (Fixed effect Var=0.05; Random Effect Var=0.09) 31 Results - OFV Method Mean OFV1021 [95% CI] MC-RS 100,000 3.24 MC-RS 50 3.27[2.2-5.0] MC-RS 500 3.33[2.8-3.8] MC-LHS 50 3.24[2.2-4.6] MC-LHS 500 3.22[2.9-3.7] LAPLACE 2.95 LAPLACE-BFGS 3.01 Mean OFV and non-parametric confidence intervals for different integration methods from 100 evaluations 32 Results - Design MC-RS 50 MC-LHS 50 LAPLACE MC-RS 500 MC-LHS 500 LAPLACE-BFGS 33 4 Results – Runtimes 3.67 2 1 0.63 0.35 0.37 MC-LHS 50 MC-RS 50 0.46 0 Runtime [s] 3 3.53 LAPLACE-BFGS LAPLACE MC-LHS 500 MC-RS 500 34 Conclusions • Quasi-Newton methods constitute fast alternative for continuous design variable optimization • Information about design sensitivity can be obtained with no additional cost • Global Optimal Design: – Monte-Carlo methods are easy and flexible but need high number of samples to give stable results – Laplace approximation constitutes fast alternative for priors with continuous probability distribution function – Laplace integral approximation with BFGS Hessian gave same sampling times with approx. 30% shorter runtimes 35 THANK YOU! 36 References 1) C.G. Broyden, “The Convergence of a Class of Double-rank Minimization Algorithms 1. General Considerations,” IMA J Appl Math, vol. 6, Mar. 1970, pp. 76-90. 2) R. Fletcher, “A new approach to variable metric algorithms,” The Computer Journal, vol. 13, 1970, p. 317. 3) D. Goldfarb, “A family of variable-metric methods derived by variational means,” Mathematics of Computation, 1970, pp. 23–26. 4) D.F. Shanno, “Conditioning of quasi-Newton methods for function minimization,” Mathematics of Computation, 1970, pp. 647–656. 5) R.H. Byrd, P. Lu, J. Nocedal, and C. Zhu, “A limited memory algorithm for bound constrained optimization,” SIAM J. Sci. Comput., vol. 16, 1995, pp. 1190-1208. 6) M. Dodds, A. Hooker, and P. Vicini, “Robust Population Pharmacokinetic Experiment Design,” Journal of Pharmacokinetics and Pharmacodynamics, vol. 32, Feb. 2005, pp. 33-64. 37