Multidisciplinary System Design Optimization (MSDO) Gradient Calculation and Sensitivity Analysis

Lecture 9
Olivier de Weck
Karen Willcox
© Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox
Engineering Systems Division and Dept. of Aeronautics and Astronautics
Today's Topics
• Gradient calculation methods
– Analytic and Symbolic
– Finite difference
– Complex step
– Adjoint method
– Automatic differentiation
• Post-Processing Sensitivity Analysis
– effect of changing design variables
– effect of changing parameters
– effect of changing constraints
Definition of the Gradient
“How does the function J value change
locally as we change elements of the
design vector x?”
Compute partial derivatives of J with respect to $x_i$:
$\nabla J = \left[ \frac{\partial J}{\partial x_1}, \frac{\partial J}{\partial x_2}, \ldots, \frac{\partial J}{\partial x_n} \right]^T$
Gradient vector points normal to the tangent hyperplane of J(x)
[Figure: gradient vector in the design space with axes x1, x2, x3]
Geometry of Gradient vector (2D)
Example function:
$J(x_1, x_2) = x_1 + x_2 + \frac{1}{x_1 x_2}$
$\nabla J = \begin{bmatrix} \frac{\partial J}{\partial x_1} \\ \frac{\partial J}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 1 - \frac{1}{x_1^2 x_2} \\ 1 - \frac{1}{x_1 x_2^2} \end{bmatrix}$
[Figure: contour plot of J over x1 from 0 to 2 and x2 from 0 to 2, with contour levels between 3.1 and 5]
Gradient normal to contours
Geometry of Gradient vector (3D)
Example:
$J = x_1^2 + x_2^2 + x_3^2$
$\nabla J = [2x_1, 2x_2, 2x_3]^T$
At $x^o = [1, 1, 1]^T$: $\nabla J(x^o) = [2, 2, 2]^T$
Tangent plane to the surface J = 3 at $x^o$: $2x_1 + 2x_2 + 2x_3 - 6 = 0$
[Figure: level surface J = 3 with its tangent plane and gradient vector at $x^o$; axes x1, x2, x3; an arrow marks increasing values of J]
Gradient vector points to larger values of J
Other Gradient-Related Quantities
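• Jacobian: Matrix of derivatives of multiple functions w.r.t. vector of variables
$\mathbf{J} = \begin{bmatrix} J_1 & J_2 & \cdots & J_z \end{bmatrix}^T$ (z × 1)
$\nabla \mathbf{J} = \begin{bmatrix} \frac{\partial J_1}{\partial x_1} & \frac{\partial J_2}{\partial x_1} & \cdots & \frac{\partial J_z}{\partial x_1} \\ \frac{\partial J_1}{\partial x_2} & \frac{\partial J_2}{\partial x_2} & \cdots & \frac{\partial J_z}{\partial x_2} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial J_1}{\partial x_n} & \frac{\partial J_2}{\partial x_n} & \cdots & \frac{\partial J_z}{\partial x_n} \end{bmatrix}$ (n × z)
• Hessian: Matrix of second-order derivatives
$H = \nabla^2 J = \begin{bmatrix} \frac{\partial^2 J}{\partial x_1^2} & \frac{\partial^2 J}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 J}{\partial x_1 \partial x_n} \\ \frac{\partial^2 J}{\partial x_2 \partial x_1} & \frac{\partial^2 J}{\partial x_2^2} & \cdots & \frac{\partial^2 J}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 J}{\partial x_n \partial x_1} & \frac{\partial^2 J}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 J}{\partial x_n^2} \end{bmatrix}$ (n × n)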
Why Calculate Gradients
• Required by gradient-based optimization
algorithms
– Normally need gradient of objective function and
each constraint w.r.t. design variables at each
iteration
– Newton methods require Hessians as well
• Isoperformance/goal programming
• Robust design
• Post-processing sensitivity analysis
– determine if result is optimal
– sensitivity to parameters, constraint values
Analytical Sensitivities
If the objective function is known in closed form, we can often
compute the gradient vector(s) in closed form (analytically):
Example: $J(x_1, x_2) = x_1 + x_2 + \frac{1}{x_1 x_2}$
Analytical gradient:
$\nabla J = \begin{bmatrix} \frac{\partial J}{\partial x_1} \\ \frac{\partial J}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 1 - \frac{1}{x_1^2 x_2} \\ 1 - \frac{1}{x_1 x_2^2} \end{bmatrix}$
At $x_1 = x_2 = 1$: $J(1,1) = 3$ and $\nabla J(1,1) = [0, 0]^T$ (minimum)
[Figure: plot of J(x1, x2) with the minimum marked at x1 = x2 = 1]
For complex systems, analytical gradients are rarely available.
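The closed-form gradient of the example function can be checked with a few lines of Python. This is only a sketch of the slide's example; the function names `J` and `grad_J` are chosen here for illustration.

```python
def J(x1, x2):
    """Example objective from the slide: J = x1 + x2 + 1/(x1*x2)."""
    return x1 + x2 + 1.0 / (x1 * x2)


def grad_J(x1, x2):
    """Closed-form gradient [dJ/dx1, dJ/dx2] of the example objective."""
    return (1.0 - 1.0 / (x1**2 * x2), 1.0 - 1.0 / (x1 * x2**2))


value_at_min = J(1.0, 1.0)
gradient_at_min = grad_J(1.0, 1.0)
print("J(1,1) =", value_at_min)
print("grad J(1,1) =", gradient_at_min)
```

Both gradient components vanish at x1 = x2 = 1, which is the stationarity condition the slide points to.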
Symbolic Differentiation
• Use symbolic mathematics programs
• e.g. MATLAB®, Maple®, Mathematica®
» syms x1 x2              % construct symbolic objects
» J=x1+x2+1/(x1*x2);
» dJdx1=diff(J,x1)        % diff differentiates a symbolic expression
dJdx1 =1-1/x1^2/x2
» dJdx2=diff(J,x2)
dJdx2 = 1-1/x1/x2^2
Finite Differences (I)
Function of a single variable f(x)
• First-order finite difference approximation of gradient (forward difference approximation to the derivative):
$f'(x_o) = \frac{f(x_o + \Delta x) - f(x_o)}{\Delta x} + O(\Delta x)$
where $O(\Delta x)$ is the truncation error.
• Second-order finite difference approximation of gradient (central difference approximation to the derivative):
$f'(x_o) = \frac{f(x_o + \Delta x) - f(x_o - \Delta x)}{2 \Delta x} + O(\Delta x^2)$
where $O(\Delta x^2)$ is the truncation error.
[Figure: f(x) sampled at $x_o - \Delta x$, $x_o$, and $x_o + \Delta x$]
Finite Differences (II)
Approximations are derived from Taylor series expansion:
$f(x_o + \Delta x) = f(x_o) + \Delta x f'(x_o) + \frac{\Delta x^2}{2} f''(x_o) + O(\Delta x^3)$
Neglect second-order and higher-order terms; solve for the derivative:
$f'(x_o) = \frac{f(x_o + \Delta x) - f(x_o)}{\Delta x} + O(\Delta x)$   (forward difference)
Truncation error (in magnitude): $O(\Delta x) = \frac{\Delta x}{2} f''(\xi)$, with $x_o \le \xi \le x_o + \Delta x$
Finite Differences (III)
Take Taylor expansion backwards at $x_o$:
$f(x_o + \Delta x) = f(x_o) + \Delta x f'(x_o) + \frac{\Delta x^2}{2} f''(x_o) + O(\Delta x^3)$   (1)
$f(x_o - \Delta x) = f(x_o) - \Delta x f'(x_o) + \frac{\Delta x^2}{2} f''(x_o) + O(\Delta x^3)$   (2)
(1) - (2) and solve again for derivative:
$f'(x_o) = \frac{f(x_o + \Delta x) - f(x_o - \Delta x)}{2 \Delta x} + O(\Delta x^2)$   (central difference)
Truncation error (in magnitude): $O(\Delta x^2) = \frac{\Delta x^2}{6} f'''(\xi)$, with $x_o - \Delta x \le \xi \le x_o + \Delta x$
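The difference in truncation order between the forward and central formulas can be observed numerically. The sketch below uses sin(x) at x = 1 as a test function, which is an assumption made for illustration and not part of the lecture; the helper names are likewise illustrative.

```python
import math


def forward_difference(f, x0, dx):
    """First-order accurate: truncation error proportional to dx."""
    return (f(x0 + dx) - f(x0)) / dx


def central_difference(f, x0, dx):
    """Second-order accurate: truncation error proportional to dx**2."""
    return (f(x0 + dx) - f(x0 - dx)) / (2.0 * dx)


x0 = 1.0
exact = math.cos(x0)  # derivative of sin at x0

for dx in (1e-1, 1e-2, 1e-3):
    err_fwd = abs(forward_difference(math.sin, x0, dx) - exact)
    err_cen = abs(central_difference(math.sin, x0, dx) - exact)
    print(f"dx={dx:.0e}  forward error={err_fwd:.2e}  central error={err_cen:.2e}")
```

Reducing the step by a factor of 10 should cut the forward error by roughly 10 and the central error by roughly 100, as long as the step is large enough that rounding error does not dominate.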
Finite Differences (IV)
$\frac{\partial J}{\partial x_1} \approx \frac{\Delta J}{\Delta x_1} = \frac{J(x_1^1) - J(x_1^o)}{x_1^1 - x_1^o}$
[Figure: J(x) versus x1, comparing the finite difference approximation (slope of the secant through $J(x_1^o)$ and $J(x_1^1)$, with $\Delta x_1 = x_1^1 - x_1^o$) with the true, analytical sensitivity (slope of the tangent at $x_1^o$)]
Finite Differences (V)
• Second-order finite difference approximation of second derivative:
$f''(x_o) \approx \frac{f(x_o + \Delta x) - 2 f(x_o) + f(x_o - \Delta x)}{\Delta x^2}$
[Figure: f(x) sampled at $x_o - \Delta x$, $x_o$, and $x_o + \Delta x$]
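A minimal sketch of the three-point second-derivative formula follows. The test function exp(x) at x = 0, where the exact second derivative is 1, is an assumption chosen for illustration.

```python
import math


def second_derivative(f, x0, dx):
    """Three-point central approximation of f''(x0)."""
    return (f(x0 + dx) - 2.0 * f(x0) + f(x0 - dx)) / dx**2


approx = second_derivative(math.exp, 0.0, 1e-3)
print("approximate f''(0) =", approx, " exact = 1.0")
```

Because the formula divides by the square of the step, it is more sensitive to rounding error than the first-derivative formulas, so the step should not be made too small.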
Errors of Finite Differencing
Caution:
- Finite differencing always has errors
- Very dependent on perturbation size
Example: $J(x_1, x_2) = x_1 + x_2 + \frac{1}{x_1 x_2}$, with $x_1 = x_2 = 1$, $J(1,1) = 3$, $\nabla J(1,1) = [0, 0]^T$
[Figure: J(x1, x2) plotted for x1 from 0 to 2]
[Figure: log-log plot of gradient error versus perturbation step size $\Delta x_1$; rounding errors ~ $1/\Delta x$ dominate for small steps, truncation errors ~ $\Delta x$ dominate for large steps]
Choice of $\Delta x$ is critical
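The trade-off between truncation and rounding error can be reproduced with a step-size sweep. The sketch below applies a forward difference to the lecture's example function, but evaluates it at (x1, x2) = (2, 1) rather than (1, 1); that evaluation point is an assumption made here so that the exact derivative (0.75) is nonzero.

```python
def J(x1, x2):
    """Example objective: J = x1 + x2 + 1/(x1*x2)."""
    return x1 + x2 + 1.0 / (x1 * x2)


def forward_dJ_dx1(x1, x2, dx):
    """Forward-difference estimate of dJ/dx1."""
    return (J(x1 + dx, x2) - J(x1, x2)) / dx


x1, x2 = 2.0, 1.0
exact = 1.0 - 1.0 / (x1**2 * x2)  # analytical dJ/dx1 = 0.75

errors = {}
for exponent in range(1, 16):
    dx = 10.0 ** (-exponent)
    errors[exponent] = abs(forward_dJ_dx1(x1, x2, dx) - exact)
    print(f"dx=1e-{exponent:02d}  error={errors[exponent]:.3e}")
```

The error first shrinks as the step decreases (truncation regime), then grows again for very small steps (rounding regime), which is the V-shaped curve described on the slide.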
Perturbation Size Δx Choice
• Error Analysis (Gill et al. 1981)
- Forward difference: $\Delta x \approx (\varepsilon_A / |f|)^{1/2}$
- Central difference: $\Delta x \approx (\varepsilon_A / |f|)^{1/3}$
[Figure: theoretical function versus computed values; the computed values scatter about the theoretical function by ~ $\varepsilon_A$ over a step Δx]
• Machine Precision
Step size at k-th iteration: $\Delta x_k = x_k \cdot 10^{-q}$
q - # of digits of machine precision for real numbers
• Trial and Error – typical value ~ 0.1-1%
Computational Expense of FD
$F(J_i)$: Cost of a single objective function evaluation of $J_i$
$n \cdot F(J_i)$: Cost of gradient vector one-sided finite difference approximation for $J_i$ for a design vector of length n
$z \cdot n \cdot F(J_i)$: Cost of Jacobian finite difference approximation with z objective functions
Example: 6 objectives, 30 design variables, 1 sec per function evaluation
6 × 30 × 1 sec = 180 sec = 3 min of CPU time for a single Jacobian estimate - expensive!
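The cost scaling can be made concrete by counting function calls inside a one-sided finite-difference gradient. In this sketch the `quadratic` test function, the relative step of 1e-6, and the helper name `fd_gradient` are illustrative assumptions. Note that the implementation uses n + 1 evaluations (one baseline plus one per variable), which the slide rounds to n.

```python
def fd_gradient(f, x, rel_step=1e-6):
    """One-sided finite-difference gradient.

    Returns (gradient, number of calls to f).
    """
    calls = 0
    f0 = f(x)
    calls += 1
    gradient = []
    for i in range(len(x)):
        dx = rel_step * max(abs(x[i]), 1.0)
        x_perturbed = list(x)
        x_perturbed[i] += dx
        gradient.append((f(x_perturbed) - f0) / dx)
        calls += 1
    return gradient, calls


def quadratic(x):
    """Hypothetical test objective: sum of (i+1) * x_i**2."""
    return sum((i + 1) * xi**2 for i, xi in enumerate(x))


n = 30                      # design variables, as in the slide's example
z = 6                       # objectives
seconds_per_evaluation = 1.0

grad, calls = fd_gradient(quadratic, [1.0] * n)
jacobian_seconds = z * n * seconds_per_evaluation

print("function calls for one gradient:", calls)
print("estimated Jacobian cost [s]:", jacobian_seconds)
```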
Complex Step Derivative
• Similar to finite differences, but uses an imaginary step:
$f'(x_0) \approx \frac{\mathrm{Im}[f(x_0 + i \Delta x)]}{\Delta x}$
• Second order accurate
• Can use very small step sizes, e.g. $\Delta x \approx 10^{-20}$
– Doesn't suffer from subtractive rounding error, since no difference of function values is taken
• Limited application areas
– Code must be able to handle complex step values
J.R.R.A. Martins, I.M. Kroo and J.J. Alonso, An automated method for
sensitivity analysis using complex variables, AIAA Paper 2000-0689,
Jan 2000
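Python's built-in complex type is enough to try the complex step formula on the lecture's example function. As a sketch, the code below evaluates dJ/dx1 at (x1, x2) = (2, 1), a point chosen here for illustration, using a step of 1e-20.

```python
def J(x1, x2):
    """Example objective; works for real or complex arguments."""
    return x1 + x2 + 1.0 / (x1 * x2)


def complex_step_dJ_dx1(x1, x2, dx=1e-20):
    """Complex step estimate of dJ/dx1: Im[J(x1 + i*dx, x2)] / dx."""
    return J(complex(x1, dx), x2).imag / dx


approx = complex_step_dJ_dx1(2.0, 1.0)
exact = 1.0 - 1.0 / (2.0**2 * 1.0)  # analytical value 0.75
print("complex step:", approx, " analytical:", exact)
```

The same function body serves both real and complex evaluation here, which illustrates the slide's caveat: the approach only works when every operation in the analysis code accepts complex values.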
Automatic Differentiation
• Mathematical formulae are built from a finite set of basic
functions, e.g. additions, sin x, exp x, etc.
• Using chain rule, differentiate analysis code: add
statements that generate derivatives of the basic
functions
• Tracks numerical values of derivatives, does not track
symbolically as discussed before
• Outputs modified program = original + derivative
capability
• e.g., ADIFOR (FORTRAN), TAPENADE (C, FORTRAN),
TOMLAB (MATLAB), many more…
• Resources at http://www.autodiff.org/
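The idea of propagating numerical derivative values through basic operations can be sketched with operator overloading. The `Dual` class below is a minimal, hypothetical forward-mode implementation written for this example; it is not how ADIFOR or TAPENADE work internally (those tools transform source code), and it supports only addition, multiplication, and division.

```python
class Dual:
    """A value paired with its derivative with respect to one chosen input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        """Treat plain numbers as constants (derivative zero)."""
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual.lift(other)
        # Quotient rule
        return Dual(self.value / other.value,
                    (self.deriv * other.value - self.value * other.deriv)
                    / other.value**2)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def J(x1, x2):
    """Unmodified analysis code: J = x1 + x2 + 1/(x1*x2)."""
    return x1 + x2 + 1.0 / (x1 * x2)


# Seed x1 with derivative 1 to obtain dJ/dx1 at (x1, x2) = (2, 1).
result = J(Dual(2.0, 1.0), Dual(1.0, 0.0))
print("J =", result.value, " dJ/dx1 =", result.deriv)
```

The derivative comes out exact to machine precision because each operation applies its own differentiation rule to numbers, with no step size involved.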
Adjoint Methods
Consider the following problem:
Minimize $J(x, u)$
s.t. $R(x, u) = 0$
where x are the design variables and u are the state variables.
The constraints represent the state equation.
e.g. wing design: x are shape variables, u are flow variables,
R(x,u)=0 represents the Navier Stokes equations.
We need to compute the gradients of J wrt x:
$\frac{dJ}{dx} = \frac{\partial J}{\partial x} + \frac{\partial J}{\partial u} \frac{du}{dx}$
Typically the dimension of u is very high (thousands/millions).
Adjoint Methods
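$\frac{dJ}{dx} = \frac{\partial J}{\partial x} + \frac{\partial J}{\partial u} \frac{du}{dx}$
• To compute du/dx, differentiate the state equation:
$\frac{dR}{dx} = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial u} \frac{du}{dx} = 0$
$\frac{\partial R}{\partial u} \frac{du}{dx} = -\frac{\partial R}{\partial x}$
$\frac{du}{dx} = -\left( \frac{\partial R}{\partial u} \right)^{-1} \frac{\partial R}{\partial x}$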
Adjoint Methods
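• We have
$\frac{dJ}{dx} = \frac{\partial J}{\partial x} + \frac{\partial J}{\partial u} \frac{du}{dx} = \frac{\partial J}{\partial x} - \frac{\partial J}{\partial u} \left( \frac{\partial R}{\partial u} \right)^{-1} \frac{\partial R}{\partial x}$
• Now define
$\lambda = \left[ \frac{\partial J}{\partial u} \left( \frac{\partial R}{\partial u} \right)^{-1} \right]^T$, so that $\lambda^T = \frac{\partial J}{\partial u} \left( \frac{\partial R}{\partial u} \right)^{-1}$
• Then to determine the gradient:
First solve $\left( \frac{\partial R}{\partial u} \right)^T \lambda = \left( \frac{\partial J}{\partial u} \right)^T$ (adjoint equation)
Then compute $\frac{dJ}{dx} = \frac{\partial J}{\partial x} - \lambda^T \frac{\partial R}{\partial x}$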
Adjoint Methods
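• Solving the adjoint equation
$\left( \frac{\partial R}{\partial u} \right)^T \lambda = \left( \frac{\partial J}{\partial u} \right)^T$
costs about the same as solving the forward problem (function evaluation)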
• Adjoints widely used in aerodynamic shape optimization,
optimal flow control, geophysics applications, etc.
• Some automatic differentiation tools have a 'reverse mode'
for computing adjoints
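The two-step adjoint procedure can be sketched on a tiny linear problem. Everything in this example is an assumption made for illustration: the state equation R(x, u) = A u - B x = 0, the objective J = c^T u + 0.5 x^T x, and the matrices themselves. With these choices dR/du = A, dR/dx = -B, dJ/du = c^T, and the partial derivative dJ/dx = x^T. The sign convention is lambda^T = (dJ/du)(dR/du)^{-1}, so that dJ/dx = dJ/dx (partial) - lambda^T dR/dx.

```python
def solve2(M, r):
    """Solve a 2x2 linear system M y = r by Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


A = [[4.0, 1.0], [2.0, 3.0]]   # dR/du (hypothetical)
B = [[1.0, 0.0], [1.0, 2.0]]   # dR/dx = -B (hypothetical)
c_vec = [1.0, -1.0]            # dJ/du (hypothetical)


def solve_state(x):
    """Forward problem: solve A u = B x for the state u."""
    rhs = [B[0][0] * x[0] + B[0][1] * x[1], B[1][0] * x[0] + B[1][1] * x[1]]
    return solve2(A, rhs)


def objective(x):
    u = solve_state(x)
    return c_vec[0] * u[0] + c_vec[1] * u[1] + 0.5 * (x[0]**2 + x[1]**2)


def adjoint_gradient(x):
    """Total derivative dJ/dx from one adjoint solve."""
    A_transposed = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
    lam = solve2(A_transposed, c_vec)      # adjoint equation
    # dJ/dx = x - lam^T (-B) = x + B^T lam
    return [x[i] + sum(lam[k] * B[k][i] for k in range(2)) for i in range(2)]


def fd_gradient(x, h=1e-6):
    """Central finite differences, used only as a check."""
    grad = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((objective(xp) - objective(xm)) / (2.0 * h))
    return grad


x_design = [1.0, 2.0]
g_adjoint = adjoint_gradient(x_design)
g_fd = fd_gradient(x_design)
print("adjoint gradient:", g_adjoint)
print("finite difference check:", g_fd)
```

The adjoint route needs one linear solve regardless of how many design variables there are, whereas the finite-difference check needs two state solves per design variable.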
Post-Processing Sensitivity Analysis
• A sensitivity analysis is an important
component of post-processing
• Key to understanding which design variables,
constraints, and parameters are important
drivers for the optimum solution
• How sensitive is the “optimal” solution J* to
changes or perturbations of the design
variables x*?
• How sensitive is the “optimal” solution x* to
changes in the constraints g(x), h(x) and
fixed parameters p ?
Sensitivity Analysis: Aircraft
Questions for aircraft design:
How does my solution change if I
• change the cruise altitude?
• change the cruise speed?
• change the range?
• change material properties?
• relax the constraint on payload?
• ...
Sensitivity Analysis: Spacecraft
Questions for spacecraft design:
How does my solution change if I
• change the orbital altitude?
• change the transmission frequency?
• change the specific impulse of the propellant?
• change launch vehicle?
• change desired mission lifetime?
• ...
Sensitivity Analysis
“How does the optimal solution change as we change
the problem parameters?”
effect on design variables
effect on objective function
effect on constraints
Want to answer this question without having to solve the
optimization problem again.
Normalization
In order to compare sensitivities from different
design variables in terms of their relative sensitivity
it may be necessary to normalize:
"Raw" (unnormalized) sensitivity = partial derivative evaluated at point $x_{i,o}$:
$\nabla J = \left. \frac{\partial J}{\partial x_i} \right|_{x^o}$
Normalized sensitivity captures relative sensitivity:
$\overline{\nabla J} = \frac{\Delta J / J}{\Delta x_i / x_i} = \frac{x_{i,o}}{J(x^o)} \left. \frac{\partial J}{\partial x_i} \right|_{x^o}$
~ % change in objective per % change in design variable
Important for comparing effect between design variables
Example: Dairy Farm Problem
“Dairy Farm” sample problem
Design point $x^o$:
L – Length = 100 [m]
N – # of cows = 10
R – Radius = 50 [m]
[Figure: fenced pasture with length L and radius R holding N cows]
With respect to which design variable is the objective most sensitive?
Parameters:
f=100$/m
n=2000$/cow
m=2$/liter
$A = 2LR + \pi R^2$
$F = 2L + 2\pi R$
$M = 100 \sqrt{A/N}$
$C = f \cdot F + n \cdot N$
$I = N \cdot M \cdot m$
$P = I - C$
Dairy Farm Sensitivity
• Compute objective at $x^o$: $J(x^o) = 13092$
• Then compute raw sensitivities:
$\nabla J = \left[ \frac{\partial P}{\partial L}, \frac{\partial P}{\partial N}, \frac{\partial P}{\partial R} \right]^T = [36.6, 2225.4, 588.4]^T$
• Normalize:
$\overline{\nabla J} = \frac{x^o}{J(x^o)} \nabla J = \left[ \frac{100}{13092} \cdot 36.6, \frac{10}{13092} \cdot 2225.4, \frac{50}{13092} \cdot 588.4 \right]^T = [0.28, 1.7, 2.25]^T$
• Show graphically with tornado chart
[Figure: tornado chart "Dairy Farm Normalized Sensitivities", with design variables R, N, L on the vertical axis and normalized sensitivity from 0 to 2.5 on the horizontal axis]
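The dairy farm numbers can be reproduced from the model equations. The sketch below assumes the milk-yield relation M = 100·sqrt(A/N), which is the form consistent with the stated objective value of 13092, and uses central finite differences with a relative step of 1e-6 for the raw sensitivities; the step size and all identifier names are illustrative choices.

```python
import math

FENCE_COST = 100.0   # f [$/m]
COW_COST = 2000.0    # n [$/cow]
MILK_PRICE = 2.0     # m [$/liter]


def profit(L, N, R):
    """Profit P = I - C for the dairy farm sample problem."""
    area = 2.0 * L * R + math.pi * R**2
    fence = 2.0 * L + 2.0 * math.pi * R
    milk_per_cow = 100.0 * math.sqrt(area / N)
    cost = FENCE_COST * fence + COW_COST * N
    income = N * milk_per_cow * MILK_PRICE
    return income - cost


x0 = {"L": 100.0, "N": 10.0, "R": 50.0}
P0 = profit(**x0)

raw = {}
normalized = {}
for name, value in x0.items():
    dx = 1e-6 * value
    plus = dict(x0)
    plus[name] = value + dx
    minus = dict(x0)
    minus[name] = value - dx
    raw[name] = (profit(**plus) - profit(**minus)) / (2.0 * dx)
    normalized[name] = value / P0 * raw[name]

print(f"P(x0) = {P0:.1f}")
for name in x0:
    print(f"{name}: raw = {raw[name]:.1f}, normalized = {normalized[name]:.2f}")
```

Although the raw sensitivity to N is the largest, normalization shows that the objective responds most strongly, in relative terms, to the radius R.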
Realistic Example: Spacecraft
NASA Nexus Spacecraft Concept
[Figure, "x"-domain: spacecraft CAD model and finite element model with X, Y, Z axes and a scale bar in meters. Image by MIT OpenCourseWare.]
[Figure, "J"-domain: simulation of centroid jitter on the focal plane over T = 5 sec, Centroid X [μm] versus Centroid Y [μm], both from -60 to 60, with 1 pixel marked and an annotated value of 14.97 μm]
J2 = Centroid Jitter on Focal Plane [RSS LOS]
Requirement: J2,req = 5 μm
What are the design variables that are "drivers" of system performance?
Graphical Representation
Graphical representation of the Jacobian evaluated at design $x^o$, normalized for comparison:
$\overline{\nabla J} = \frac{x_o}{J_o} \begin{bmatrix} \frac{\partial J_1}{\partial R_u} & \frac{\partial J_2}{\partial R_u} \\ \vdots & \vdots \\ \frac{\partial J_1}{\partial K_{cf}} & \frac{\partial J_2}{\partial K_{cf}} \end{bmatrix}$
[Figure: two bar charts of normalized sensitivities, "J1: Norm Sensitivities: RMMS WFE" ($x_o / J_{1,o} \cdot \partial J_1 / \partial x$) and "J2: Norm Sensitivities: RSS LOS" ($x_o / J_{2,o} \cdot \partial J_2 / \partial x$), horizontal axis from -0.5 to 1.5, comparing analytical and finite difference values. Design variables are grouped into disturbance, structural, control, and optics variables: Ru, Us, Ud, fc, Qc, Tst, Srg, Sst, Tgs, m_SM, K_yPM, K_rISO, m_bus, K_zpet, t_sp, I_ss, I_propt, zeta, lambda, Ro, QE, Mgs, fca, Kc, Kcf]
J1: RMMS WFE most sensitive to:
Ru - upper wheel speed limit [RPM]
Sst - star tracker noise 1 [asec]
K_rISO - isolator joint stiffness [Nm/rad]
K_zpet - deploy petal stiffness [N/m]
J2: RSS LOS most sensitive to:
Ud - dynamic wheel imbalance [gcm2]
K_rISO - isolator joint stiffness [Nm/rad]
zeta - proportional damping ratio [-]
Mgs - guide star magnitude [mag]
Kcf - FSM controller gain [-]
Parameters
Parameters p are the fixed assumptions.
How sensitive is the optimal solution x* with respect
to fixed parameters ?
Example: "Dairy Farm" sample problem (Maximize Profit)
Optimal solution:
x* =[ R=106.1m, L=0m, N=17 cows]T
[Figure: fenced pasture with length L and radius R holding N cows]
Fixed parameters:
f=100$/m - Cost of fence
n=2000$/cow - Cost of a single cow
m=2$/liter - Market price of milk
How does x* change as parameters change?
Sensitivity Analysis
KKT conditions:
$\nabla J(x^*) + \sum_{j \in M} \lambda_j \nabla \hat{g}_j(x^*) = 0$
$\hat{g}_j(x^*) = 0, \quad j \in M$
$\lambda_j \ge 0, \quad j \in M$
where M is the set of active constraints.
For a small change in a parameter, p, we require that the KKT conditions remain valid:
$\frac{d(\text{KKT conditions})}{dp} = 0$
Rewrite first equation:
$\frac{\partial J}{\partial x_i}(x^*) + \sum_{j \in M} \lambda_j \frac{\partial \hat{g}_j}{\partial x_i}(x^*) = 0, \quad i = 1, \ldots, n$
Sensitivity Analysis
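Recall chain rule. If $Y = Y(p, x(p))$ then
$\frac{dY}{dp} = \frac{\partial Y}{\partial p} + \sum_{i=1}^{n} \frac{\partial Y}{\partial x_i} \frac{\partial x_i}{\partial p}$
Applying to first equation of KKT conditions:
$\frac{d}{dp} \left( \frac{\partial J(x,p)}{\partial x_i} + \sum_{j \in M} \lambda_j(p) \frac{\partial \hat{g}_j(x,p)}{\partial x_i} \right)$
$= \frac{\partial^2 J}{\partial x_i \partial p} + \sum_{j \in M} \lambda_j \frac{\partial^2 \hat{g}_j}{\partial x_i \partial p} + \sum_{k=1}^{n} \left( \frac{\partial^2 J}{\partial x_i \partial x_k} + \sum_{j \in M} \lambda_j \frac{\partial^2 \hat{g}_j}{\partial x_i \partial x_k} \right) \frac{\partial x_k}{\partial p} + \sum_{j \in M} \frac{\partial \lambda_j}{\partial p} \frac{\partial \hat{g}_j}{\partial x_i} = 0$
which has the form
$\sum_{k=1}^{n} A_{ik} \frac{\partial x_k}{\partial p} + \sum_{j \in M} B_{ij} \frac{\partial \lambda_j}{\partial p} + c_i = 0$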
Sensitivity Analysis
Perform same procedure on equation $\hat{g}_j(x^*, p) = 0$:
$\frac{\partial \hat{g}_j}{\partial p} + \sum_{k=1}^{n} \frac{\partial \hat{g}_j}{\partial x_k} \frac{\partial x_k}{\partial p} = 0$
$\sum_{k=1}^{n} B_{kj} \frac{\partial x_k}{\partial p} + d_j = 0$
Sensitivity Analysis
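In matrix form we can write:
$\begin{bmatrix} A & B \\ B^T & 0 \end{bmatrix} \begin{bmatrix} \delta x \\ \delta \lambda \end{bmatrix} + \begin{bmatrix} c \\ d \end{bmatrix} = 0$
where A is n × n, B is n × M, and
$A_{ik} = \frac{\partial^2 J}{\partial x_i \partial x_k} + \sum_{j \in M} \lambda_j \frac{\partial^2 \hat{g}_j}{\partial x_i \partial x_k}$
$B_{ij} = \frac{\partial \hat{g}_j}{\partial x_i}$
$c_i = \frac{\partial^2 J}{\partial x_i \partial p} + \sum_{j \in M} \lambda_j \frac{\partial^2 \hat{g}_j}{\partial x_i \partial p}$
$d_j = \frac{\partial \hat{g}_j}{\partial p}$
$\delta x = \left[ \frac{\partial x_1}{\partial p}, \frac{\partial x_2}{\partial p}, \ldots, \frac{\partial x_n}{\partial p} \right]^T, \quad \delta \lambda = \left[ \frac{\partial \lambda_1}{\partial p}, \frac{\partial \lambda_2}{\partial p}, \ldots, \frac{\partial \lambda_M}{\partial p} \right]^T$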
Sensitivity Analysis
We solve the system to find $\delta x$ and $\delta \lambda$, then the sensitivity
of the objective function with respect to p can be found:
$\frac{dJ}{dp} = \frac{\partial J}{\partial p} + \nabla J^T \delta x$
$\Delta J \approx \frac{dJ}{dp} \Delta p$
$\Delta x \approx \delta x \, \Delta p$   (first-order approximation)
To assess the effect of changing a different parameter, we
only need to calculate a new RHS in the matrix system.
Sensitivity Analysis - Constraints
• We also need to assess when an active constraint will become inactive and vice versa
• An active constraint will become inactive when its Lagrange multiplier goes to zero:
$\Delta \lambda_j = \frac{\partial \lambda_j}{\partial p} \Delta p = \delta \lambda_j \, \Delta p$
Find the $\Delta p$ that makes $\lambda_j$ zero:
$\lambda_j + \delta \lambda_j \, \Delta p = 0 \;\Rightarrow\; \Delta p = -\frac{\lambda_j}{\delta \lambda_j}, \quad j \in M$
This is the amount by which we can change p before the jth
constraint becomes inactive (to a first order approximation)
Sensitivity Analysis - Constraints
An inactive constraint will become active when $g_j(x)$ goes to zero:
$g_j(x) = g_j(x^*) + \Delta p \, \nabla g_j(x^*)^T \delta x = 0$
Find the $\Delta p$ that makes $g_j$ zero:
$\Delta p = -\frac{g_j(x^*)}{\nabla g_j(x^*)^T \delta x}$ for all j not active at x*
• This is the amount by which we can change p before the jth
constraint becomes active (to a first order approximation)
• If we want to change p by a larger amount, then the problem
must be solved again including the new constraint
• Only valid close to the optimum
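The parameter sensitivity procedure from the preceding slides can be traced on a small problem whose answer is known in closed form. The problem below is hypothetical and chosen only for illustration: minimize J = x1² + x2² with the active constraint g1 = p - x1 - x2 = 0 and the inactive constraint g2 = x1 - 3 ≤ 0, at p = 2. Its optimum is x* = (p/2, p/2) with multiplier λ = p, so the expected results are δx = (0.5, 0.5), δλ = 1, and dJ/dp = p. The small Gaussian elimination routine is a helper written for this sketch.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = (aug[r][n] - s) / aug[r][r]
    return sol


p = 2.0
x_star = [p / 2.0, p / 2.0]    # closed-form optimum of the test problem
lam = p                        # from grad J + lam * grad g1 = 0

A = [[2.0, 0.0], [0.0, 2.0]]   # Hessian of the Lagrangian (g1 is linear)
B = [[-1.0], [-1.0]]           # B_ij = d g1 / d x_i
c = [0.0, 0.0]                 # c_i: mixed derivatives in x_i and p
d = [1.0]                      # d_j = d g1 / d p

# Assemble [[A, B], [B^T, 0]] and solve for [delta_x, delta_lambda].
kkt = [A[0] + B[0], A[1] + B[1], [B[0][0], B[1][0], 0.0]]
solution = solve_linear(kkt, [-c[0], -c[1], -d[0]])
delta_x, delta_lam = solution[:2], solution[2]

# dJ/dp = dJ/dp (partial, zero here) + grad J^T delta_x
grad_J = [2.0 * x_star[0], 2.0 * x_star[1]]
dJ_dp = 0.0 + sum(g * dx for g, dx in zip(grad_J, delta_x))

# Change in p before the active constraint g1 becomes inactive.
dp_inactive = -lam / delta_lam

# Change in p before the inactive constraint g2 becomes active.
g2_value = x_star[0] - 3.0
grad_g2 = [1.0, 0.0]
dp_active = -g2_value / sum(g * dx for g, dx in zip(grad_g2, delta_x))

print("delta_x =", delta_x, " delta_lambda =", delta_lam)
print("dJ/dp =", dJ_dp)
print("dp until g1 inactive =", dp_inactive, " dp until g2 active =", dp_active)
```

For a second parameter only the right-hand side vectors `c` and `d` would change; the assembled matrix could be reused.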
Lagrange Multiplier Interpretation
• Consider the problem:
minimize J(x) s.t. h(x)=0
with optimal solution x*
• What happens if we change constraint k by a small amount?
$h_j(x^*) = 0, \; j \ne k; \qquad h_k(x^*) = \varepsilon$
• Differentiating w.r.t. $\varepsilon$:
$\nabla h_j^T \frac{dx^*}{d\varepsilon} = 0, \; j \ne k; \qquad \nabla h_k^T \frac{dx^*}{d\varepsilon} = 1$
Lagrange Multiplier Interpretation
• How does the objective function change?
$\frac{dJ}{d\varepsilon} = \nabla J^T \frac{dx^*}{d\varepsilon}$
• Using KKT conditions:
$\frac{dJ}{d\varepsilon} = -\sum_j \lambda_j \nabla h_j^T \frac{dx^*}{d\varepsilon} = -\lambda_k$
• Lagrange multiplier is negative of sensitivity of cost
function to constraint value. Also called shadow
price.
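The shadow price interpretation can be checked on a one-constraint problem. The example below is hypothetical: minimize J = x1² + x2² subject to h = x1 + x2 - 2 = ε. By symmetry its solution is x1 = x2 = (2 + ε)/2, and the condition ∇J + λ∇h = 0 gives λ = -2·x1; these closed-form expressions are worked out by hand for this sketch.

```python
def solve_perturbed(eps):
    """Closed-form optimum of: min x1^2 + x2^2  s.t.  x1 + x2 - 2 = eps."""
    x = (2.0 + eps) / 2.0
    lam = -2.0 * x                 # from grad J + lam * grad h = 0
    objective = 2.0 * x * x
    return (x, x), lam, objective


_, lam0, J0 = solve_perturbed(0.0)

# Sensitivity of the optimal objective to the constraint value,
# estimated by central finite differences on the re-solved problem.
step = 1e-6
_, _, J_plus = solve_perturbed(step)
_, _, J_minus = solve_perturbed(-step)
dJ_deps = (J_plus - J_minus) / (2.0 * step)

print("lambda =", lam0)
print("dJ*/d(eps) =", dJ_deps)
```

The finite-difference sensitivity matches -λ, which is what allows the multiplier to be read as the price of tightening or relaxing the constraint without re-solving the optimization.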
Lecture Summary
• Gradient calculation approaches
– Analytical and Symbolic
– Finite difference
– Automatic Differentiation
– Adjoint methods
• Sensitivity analysis
– Yields important information about the design space,
both as the optimization is proceeding and once the
“optimal” solution has been reached.
Reading
Papalambros – Section 8.2 Computing Derivatives
MIT OpenCourseWare
http://ocw.mit.edu
ESD.77 / 16.888 Multidisciplinary System Design Optimization
Spring 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.