Global sensitivity metrics from active subspaces

Paul Diaz∗, Paul G. Constantine
Department of Applied Mathematics and Statistics, Colorado School of Mines, Golden, CO 80401
Abstract
Predictions from science and engineering models depend on several input
parameters. Global sensitivity analysis metrics quantify the importance of
each input parameter, which can lead to insight into the model and reduced
computational cost; commonly used sensitivity metrics include Sobol’ indices
derived from a variance-based decomposition and derivative-based metrics.
Active subspaces are an emerging set of tools for identifying important directions in a model’s input parameter space; these directions can be exploited to
reduce the model’s dimension, enabling otherwise infeasible parameter studies. In this paper, we develop global sensitivity metrics called activity scores
from the estimated active subspace, which give insight into the important
model parameters. We analytically compare the activity scores to established
sensitivity metrics, and we discuss computational methods to estimate the
activity scores. We then show two numerical examples with algebraic functions from engineering models. For each case, we demonstrate that the model
admits a one-dimensional active subspace. We then compare the standard
sensitivity metrics to the activity scores, and we find that all metrics give
consistent rankings for the input parameters.
Keywords: sensitivity analysis, active subspaces, activity score
∗Corresponding author
Email addresses: pdiaz@mymail.mines.edu (Paul Diaz), paul.constantine@mines.edu (Paul G. Constantine)
URL: http://inside.mines.edu/~pdiaz/ (Paul Diaz), http://inside.mines.edu/~pconstan (Paul G. Constantine)
1. Introduction
Science and engineering models typically depend on several input parameters. When the model is complex, the scientist may not know which input
parameters drive its outputs; she may need to evaluate the model at several parameter values to identify the most important parameters. Sensitivity
analysis includes computational methods that seek to identify the most important parameters, which leads to greater insight into the model [1]. The
analysis results may also enable the scientist to ignore the less important parameters, thus reducing the cost of parameter studies—like optimization or
uncertainty quantification—that are essential for confident predictions. Sensitivity metrics are broadly classified as local, which measure the model response to small perturbations around a nominal parameter value, and global,
which assess the importance of each variable over a range of parameters; we
are primarily concerned with global sensitivity metrics. The total sensitivity
indices of Sobol’ are some of the most widely used metrics for global sensitivity analysis [2]. Other common metrics use averages of local gradients to
build global metrics [3]. The appropriate choice of metric depends on the
application. We review three common global sensitivity metrics in Section 2.
Using sensitivity metrics to ignore less important parameters is a form
of dimension reduction. This type of reduction is limited to the coordinates
induced by the model parameters. Our recent work in active subspaces generalizes the dimension reduction idea to directions in the high-dimensional
space of model inputs [4]. A model’s output might depend strongly on each
coordinate but only through a limited number of directions. Each direction
corresponds to a linear combination of the model parameters that we call an
active variable. Our work has developed methods for identifying an active
subspace [5] and exploiting it, if present, to reduce the dimension of subsequent parameter studies [6]. We have applied this approach to engineering
models in aerospace [7, 8], geosciences [9], and alternative energy [10]. We
review active subspaces in Section 3.
One difficulty in the subspace-based dimension reduction is how to draw
insight into the original model parameters once an active subspace has been
identified. For example, how might a climate scientist use the fact that
3 × (cloud parameter) + 2 × (radiation parameter) is an important linear
combination in a climate model? In this paper, we propose global sensitivity
metrics based on active subspaces, and we develop computational methods
to estimate these metrics. We quantitatively compare the proposed metrics
to existing global sensitivity metrics in Section 4, and we show two numerical examples in Section 5. We conclude in Section 6 with a summary and
directions for further research.
1.1. Setup and notation
We consider a differentiable and square-integrable function f = f(x) defined on an m-dimensional hypercube, x = [x_1, . . . , x_m]^T ∈ [−1, 1]^m. Define the
density function ρ(x) = 2^{−m} for x ∈ [−1, 1]^m and zero elsewhere. Then we can write
f ’s mean and variance as
E[f] = \int f \, \rho \, dx, \qquad Var[f] = \int (f - E[f])^2 \, \rho \, dx.   (1)
We emphasize that f is a deterministic function; the mean and variance do
not represent any probabilistic information. Denote the gradient of f by ∇f(x) ∈ R^m with partial derivatives ∂f/∂x_i, and assume the partial derivatives
are square-integrable. All vector norms are the standard Euclidean norm, and matrix norms are the matrix 2-norm (i.e., the spectral norm).
2. Global sensitivity metrics
We briefly review three global sensitivity metrics used in practice: variancebased metrics, derivative-based metrics, and least-squares-fit linear model
coefficients. In each case, we (i) define the metric, (ii) discuss a method for
high accuracy computation with numerical quadrature, (iii) detail a Monte
Carlo-based method for low accuracy estimation, and (iv) mention practical
error estimates for the Monte Carlo-based estimates.
2.1. Variance-based metrics
The total sensitivity indices were introduced by Sobol’ [2]; several papers
describe these metrics, propose extensions [11], and develop efficient computational procedures [12, 13, 14, 15]. The total sensitivity indices are derived
from a particular decomposition of the square-integrable f (x), often called
the function ANOVA decomposition or the variance-based decomposition;
we use the concise notation from Liu and Owen [12]. Let u be a subset of the integers {1, . . . , m} with cardinality |u|. Define x_u ∈ [−1, 1]^{|u|}, and let
ρ_u(x_u) = 2^{−|u|} for x_u ∈ [−1, 1]^{|u|} and zero elsewhere. The subset −u is the complement of u in {1, . . . , m}. The ANOVA decomposition can be written
f(x) = \sum_{u \subseteq \{1, \dots, m\}} f_u(x),   (2)
where fu depends on x only through xu . The function fu is defined as
f_u(x) = \int f(x) \, \rho_{-u} \, dx_{-u} \; - \; \sum_{v \subset u} f_v(x),   (3)
and f∅ = E [f ]. The functions fu are mutually orthogonal, and the variance
of f can be decomposed as
Var[f] = \sum_{u \subseteq \{1, \dots, m\}} Var[f_u].   (4)
Define the set S_i to be the set of subsets of {1, . . . , m} containing the index i. The total sensitivity index τ_i for the ith variable x_i can be written as a sum
of a subset of the variances Var[f_u] divided by the total variance,

\tau_i = \frac{ \sum_{u \in S_i} Var[f_u] }{ Var[f] }.   (5)
Other sensitivity metrics can be defined from the variances Var [fu ]; we focus
exclusively on τi from (5) to compare with the metrics derived from active
subspaces; see Section 4.
When the dimension of x is sufficiently low, f (x) is sufficiently smooth,
and f (x) can be computed quickly, one can accurately estimate τi using a
high order numerical quadrature rule. This approach is equivalent to using
the Fourier coefficients as in [14], which we use to compute reference values
for the numerical examples in Section 5.
If f (x) does not satisfy these assumptions, one can use a less accurate
Monte Carlo method as in [1, Section 4.6]. Next we review the Monte Carlo
method and its central limit theorem-based standard error. First draw 2M samples x_i independently according to the uniform density ρ(x) on [−1, 1]^m. We
arrange these samples into two M × m matrices, A and B, defined as

A = \begin{bmatrix} x_1^T \\ \vdots \\ x_M^T \end{bmatrix}, \qquad B = \begin{bmatrix} x_{M+1}^T \\ \vdots \\ x_{2M}^T \end{bmatrix}.   (6)
Define the M ×m matrix Ci to be the matrix B with the ith column replaced
by the ith column of A. Define the M-vector f_A as

f_A = \begin{bmatrix} f(x_1) \\ \vdots \\ f(x_M) \end{bmatrix}.   (7)
In words, the vector fA contains evaluations of the function f at each row
of the matrix A. Similarly, define the M -vectors fB and fCi whose elements
contain evaluations of f at the rows of their respective subscripted matrices.
These calculations use (m + 2)M evaluations of f . The total sensitivity
indices may be estimated as
\tau_i \approx \hat{\tau}_i = 1 - \frac{ \frac{1}{M} f_A^T f_{C_i} - \hat{f}_0^{\,2} }{ \frac{1}{2M} \left( \| f_A \|^2 + \| f_B \|^2 \right) - \hat{f}_0^{\,2} },   (8)

where the sample mean \hat{f}_0 is

\hat{f}_0 = \frac{ (f_A + f_B)^T e }{ 2M },   (9)
and e is an M -vector of ones. Monte Carlo estimates come equipped with
an asymptotically valid estimate of the error derived from the central limit
theorem; in statistics, these error estimates are called the standard error.
The standard error se_{τ_i} for the Monte Carlo estimate \hat{\tau}_i is

se_{\tau_i} = \left[ \frac{ \frac{1}{M} \| f_{A,C_i} \|^2 - \left( \frac{1}{M} f_A^T f_{C_i} \right)^2 }{ M } \right]^{1/2},   (10)
where fA,Ci is the element-wise product of fA and fCi . In section 5, we study
the standard error of the estimator by investigating the values of seτi for
different values of M ; see Figures 3 and 6.
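A minimal sketch of this Monte Carlo procedure follows, in Python rather than the authors' Matlab; it assumes a hypothetical vectorized model f that maps an (M, m) array of points in [−1, 1]^m to an array of M outputs, and it follows the construction in (6)–(9) as reconstructed above:

import numpy as np

def total_sensitivity_mc(f, m, M, rng=np.random.default_rng(0)):
    # Draw 2M uniform samples and split them into the matrices A and B of (6).
    X = rng.uniform(-1.0, 1.0, size=(2 * M, m))
    A, B = X[:M], X[M:]
    fA, fB = f(A), f(B)
    f0 = (fA.sum() + fB.sum()) / (2 * M)             # sample mean, eq. (9)
    var = (fA @ fA + fB @ fB) / (2 * M) - f0**2      # sample variance of f
    tau_hat = np.empty(m)
    for i in range(m):
        Ci = B.copy()
        Ci[:, i] = A[:, i]                           # B with the ith column of A
        fCi = f(Ci)
        tau_hat[i] = 1.0 - (fA @ fCi / M - f0**2) / var   # eq. (8)
    return tau_hat

The loop uses (m + 2)M model evaluations in total, matching the count quoted above.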
2.2. Derivative-based sensitivity metrics
A natural notion of sensitivity is how a model output responds to changes in
its inputs, i.e., a derivative. However, derivatives only provide information at
a single parameter value. Morris developed sensitivity metrics based on sampling and averaging coarse finite difference approximations to derivatives [16].
Kucherenko et al. [3] extended this idea to continuous derivatives and integration, where the averages are estimated with Monte Carlo methods.
We focus on one particular derivative-based sensitivity metric studied by
Sobol’ and Kucherenko [17]:
\nu_i = \int \left( \frac{\partial f}{\partial x_i} \right)^{\!2} \rho \, dx, \qquad i = 1, \dots, m.   (11)
This metric averages local sensitivity information to make it a global metric.
We discuss this metric’s connection to active subspaces in Section 4.
If f ’s derivatives are sufficiently smooth, and if the dimension of x is low,
then one can compute accurate estimates of νi using a high order numerical
quadrature rule; we use the quadrature approach for reference values in the
numerical examples in section 5.
The Monte Carlo estimate of νi using M samples is
\nu_i \approx \hat{\nu}_i = \frac{1}{M} \sum_{j=1}^{M} \left( \frac{\partial f}{\partial x_i}(x_j) \right)^{\!2},   (12)
where the xj ’s are drawn independently according to ρ(x). The standard
error se_{ν_i} is

se_{\nu_i} = \left[ \frac{1}{M(M-1)} \sum_{j=1}^{M} \left( \left( \frac{\partial f}{\partial x_i}(x_j) \right)^{\!2} - \hat{\nu}_i \right)^{\!2} \right]^{1/2}.   (13)
In section 5, we study the standard error of the estimator by investigating
seνi for different values of M ; see Figures 3 and 6.
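A corresponding sketch for (12) and (13); here grad_f is a hypothetical callable returning an (M, m) array whose rows are gradients at the sample points:

import numpy as np

def derivative_metrics_mc(grad_f, m, M, rng=np.random.default_rng(0)):
    X = rng.uniform(-1.0, 1.0, size=(M, m))
    G2 = grad_f(X) ** 2                              # squared partial derivatives
    nu_hat = G2.mean(axis=0)                         # eq. (12), one entry per input
    se = np.sqrt(((G2 - nu_hat) ** 2).sum(axis=0) / (M * (M - 1)))  # eq. (13)
    return nu_hat, se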
2.3. Linear model coefficients
A simple and easily computable sensitivity metric can be derived from the
coefficients of a least-squares fit linear approximation to f (x). Such a metric
is valid when f is smooth and monotonic along each component of x. Saltelli et al. [1, Section 1.2.5] discuss a similar metric in connection with a linear
regression model. Since f (x) is deterministic, the statistical interpretation of
the regression coefficients is not valid. Nevertheless, we can devise a Monte
Carlo-based least-squares procedure to estimate the coefficients of a linear
approximation as follows. Draw xj independently according to ρ(x) for j =
1, . . . , M , and compute fj = f (xj ). Next, use least-squares to fit the intercept
b̂0 and coefficients b̂ = [b̂1 , . . . , b̂m ]T of a linear approximation,
f_j \approx \hat{b}_0 + \hat{b}^T x_j, \qquad j = 1, \dots, M.   (14)
When the coefficients b̂i are properly scaled, they become useful sensitivity
metrics. Define β̂_i as

\hat{\beta}_i = \frac{ \hat{b}_i }{ \sqrt{3} \, \hat{\sigma}_f },   (15)

where σ̂_f is the standard deviation of the samples {f_j}. The factor 1/\sqrt{3} is the standard deviation of the uniform random variable x_i with support
[−1, 1]. If f(x) is monotonic and sufficiently smooth, then β̂_i gives a rough
indication of how f changes in response to a change in xi over the space
of inputs. In contrast to the total sensitivity index τi from (5) and the
derivative-based metric νi from (11), the β̂i ’s are signed, so they indicate if
f will increase or decrease given a positive change in xi .
We can compute reference values for β̂i by noting the relationship between the m-dimensional Legendre series truncated to degree 1 (i.e., only
a constant and m linear terms) and the least-squares linear approximation;
the Legendre series is often referred to as the generalized polynomial chaos
associated with the uniform density [14, 18]. In particular, the degree-1 Legendre approximation is the best linear approximation in the L2 norm. Its
coefficients admit a simple integral representation by virtue of the Legendre
polynomials’ orthogonality. Thus, we can estimate the coefficients with high
order numerical quadrature [19] and scale them to get the coefficients of a
linear monomial approximation as in (14).
When reference values are too expensive to compute, we can estimate
the standard error in β̂_i with a bootstrap standard error, which we denote sê_{β_i}. See [20, Section 9.5] for a detailed description of using the bootstrap
for linear regression coefficients. Note that randomness in the estimates is a result of the Monte Carlo method; there is no randomness in the evaluation
of f(x). In section 5, we study the bootstrap standard error by investigating the values of sê_{β_i} for different values of M; see Figures 3 and 6.
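The fit (14), the scaling (15), and the bootstrap standard error can be sketched as follows; f is again a hypothetical vectorized model, and the bootstrap resamples input/output pairs with replacement:

import numpy as np

def linear_coefficients(f, m, M, n_boot=100, rng=np.random.default_rng(0)):
    X = rng.uniform(-1.0, 1.0, size=(M, m))
    y = f(X)
    Z = np.column_stack([np.ones(M), X])             # intercept plus linear terms

    def scaled_coeffs(Zs, ys):
        c, *_ = np.linalg.lstsq(Zs, ys, rcond=None)  # least-squares fit, eq. (14)
        return c[1:] / (np.sqrt(3.0) * ys.std(ddof=1))  # scaling, eq. (15)

    beta_hat = scaled_coeffs(Z, y)
    reps = np.empty((n_boot, m))
    for k in range(n_boot):
        idx = rng.integers(0, M, size=M)             # bootstrap resample of rows
        reps[k] = scaled_coeffs(Z[idx], y[idx])
    return beta_hat, reps.std(axis=0, ddof=1)        # estimates and bootstrap SEs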
3. Active subspaces
Active subspaces [4] are defined by the eigenvectors of the following m × m
symmetric, positive semidefinite matrix,
C = \int \nabla f \, \nabla f^T \, \rho \, dx = W \Lambda W^T,   (16)
where W = [w1 , . . . , wm ] is the orthogonal matrix of eigenvectors, and Λ =
diag(λ1 , . . . , λm ) is the diagonal matrix of eigenvalues in decreasing order.
The eigenpairs satisfy
\lambda_i = \int (\nabla f^T w_i)^2 \, \rho \, dx.   (17)
If λi = 0, then f is constant along the direction wi , and such structure can
be exploited to reduce the model’s dimension. An example is the function
f(x_1, x_2) = exp(7x_1 + 3x_2), whose inactive subspace corresponds to the direction [−3, 7]^T. Suppose λ_n > λ_{n+1} for some n less than m. Then we partition

\Lambda = \begin{bmatrix} \Lambda_1 & \\ & \Lambda_2 \end{bmatrix}, \qquad W = \begin{bmatrix} W_1 & W_2 \end{bmatrix},   (18)
where Λ1 is a diagonal matrix of the first n eigenvalues, and W1 is the m × n
matrix containing the first n eigenvectors. If λn+1 , . . . , λm are sufficiently
small, then a reasonable low-dimensional approximation of f is
f(x) \approx g(W_1^T x),   (19)
where g : Rn → R. In [6], we propose a particular form for g based on a
conditional average, and we bound its mean-squared error in terms of the
eigenvalues λn+1 , . . . , λm .
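As a rough stand-in for such a construction, one can project inputs onto the active subspace and fit a simple response surface in the active variable. The sketch below uses a univariate polynomial fit; this is only an illustration, not the conditional-average construction analyzed in [6]:

import numpy as np

def fit_active_surrogate(X, fX, W1, degree=3):
    # X holds sampled inputs row-wise, fX the corresponding outputs, and W1
    # the first n eigenvectors; the sketch assumes n = 1 for simplicity.
    y = X @ W1
    assert y.shape[1] == 1, "sketch assumes a one-dimensional active subspace"
    coeffs = np.polyfit(y[:, 0], fX, degree)         # polynomial in the active variable
    return lambda Xnew: np.polyval(coeffs, (Xnew @ W1)[:, 0])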
To identify an active subspace, we must estimate the eigenpairs Λ, W
and look for a gap in the eigenvalues. If the dimension m is sufficiently small,
then one might use high order numerical integration rules to estimate C in
(16). In [5], we propose and analyze a Monte Carlo method for estimating
these eigenpairs. For i = 1, . . . , M , draw xi uniformly at random from the
hypercube [−1, 1]m . Compute ∇fi = ∇f (xi ). Then estimate
C \approx \hat{C} = \frac{1}{M} \sum_{i=1}^{M} \nabla f_i \, \nabla f_i^T = \hat{W} \hat{\Lambda} \hat{W}^T.   (20)
In [5], we study how large M must be so that (i) the estimated eigenvalues
Λ̂ are close to the true eigenvalues Λ and (ii) the subspace distance,
dist(ran(W_1), ran(\hat{W}_1)) = \| W_1 W_1^T - \hat{W}_1 \hat{W}_1^T \|_2,   (21)
where Ŵ1 contains the first n columns of Ŵ , is well-behaved.
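A sketch of this estimation, again assuming a hypothetical grad_f that returns gradient rows:

import numpy as np

def active_subspace_mc(grad_f, m, M, n, rng=np.random.default_rng(0)):
    X = rng.uniform(-1.0, 1.0, size=(M, m))
    G = grad_f(X)                                    # (M, m) array of gradients
    C_hat = G.T @ G / M                              # eq. (20)
    lam, W = np.linalg.eigh(C_hat)                   # ascending eigenvalues
    lam, W = lam[::-1], W[:, ::-1]                   # reorder to decreasing, as in (16)
    return lam, W, W[:, :n]

def subspace_distance(W1, W1_hat):
    # eq. (21): spectral norm of the difference of the orthogonal projectors
    return np.linalg.norm(W1 @ W1.T - W1_hat @ W1_hat.T, 2)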
4. Sensitivity metrics from active subspaces
We propose two possible metrics derived from the eigenvectors W and eigenvalues Λ from (16). The first set of metrics uses the components of the first
eigenvector (i.e., the eigenvector associated with the largest eigenvalue). The
second set of metrics uses linear combinations of the squared eigenvector
components weighted by the eigenvalues, which we call activity scores.
4.1. The first eigenvector
In [10, Section 4.4], we used the first eigenvector from the matrix Ĉ in (20) to
identify the most important parameters in a single-diode model of a photovoltaic solar cell. For this particular model, the magnitudes of the eigenvector
components gave the same ranking as the total sensitivity indices (see section 2.1). However, these two metrics indicate different characteristics of the
function. The total sensitivity indices measure the proportion of the variance
attributed to each parameter while the eigenvector identifies an important
direction in the parameter space. Like the linear model coefficients in section 2.3, the eigenvector components are not all positive, and the relative
difference in signs (since eigenvectors are only unique up to a sign) provides
insight into the relationship between inputs and outputs. For example, if two
eigenvector components are equal in magnitude but differ in sign, then we
expect f to increase when one parameter is perturbed but decrease when the
other is perturbed. This is similar to the gradient-based metrics described by
Morris [16] and Kucherenko [3]. However, the trouble of averaging a gradient
that might change signs does not affect the eigenvector. In the numerical examples in section 5, we compare the first eigenvector’s components to other
sensitivity metrics, but we do not provide a theoretical comparison. We use
the bootstrap to estimate a standard error in the eigenvector components,
though one must be careful to keep the signs consistent when computing
the bootstrap replicates. Section 7.2 of [20] shows an example of using the
bootstrap to estimate standard errors in eigenvector components. When the
context is clear, we denote the first eigenvector by w = [w_1, . . . , w_m]^T and the bootstrap standard error by sê_{w_i}.
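A sketch of that sign bookkeeping follows; the gradient samples are assumed to be stored row-wise in an array G, and each bootstrap replicate of the first eigenvector is flipped, if necessary, to align with the reference eigenvector before the standard deviation is taken:

import numpy as np

def bootstrap_eigvec_se(G, n_boot=100, rng=np.random.default_rng(0)):
    M = G.shape[0]
    _, W = np.linalg.eigh(G.T @ G / M)
    w_ref = W[:, -1]                                 # eigenvector of the largest eigenvalue
    reps = np.empty((n_boot, G.shape[1]))
    for k in range(n_boot):
        Gk = G[rng.integers(0, M, size=M)]           # resample gradient rows
        _, Wk = np.linalg.eigh(Gk.T @ Gk / M)
        wk = Wk[:, -1]
        reps[k] = wk if wk @ w_ref >= 0 else -wk     # keep the signs consistent
    return w_ref, reps.std(axis=0, ddof=1)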
4.2. Activity scores
We develop a more comprehensive and justifiable sensitivity metric from the
eigenpairs in (16). We propose the following metric for global sensitivity
analysis. Define α ∈ R^m as

\alpha = \alpha(n) = \sum_{j=1}^{n} \lambda_j \, w_j^2,   (22)
where the exponent on the vector wj means squaring each component of
wj . We call α(n) the activity scores, and we use these numbers to rank the
importance of a model’s inputs.
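Given estimated eigenpairs with eigenvalues in decreasing order, (22) is a one-line computation; the sketch below assumes lam and W come from an eigendecomposition such as (20):

import numpy as np

def activity_scores(lam, W, n):
    # alpha_i(n) = sum_{j<=n} lambda_j * W_{ij}^2, with the square taken componentwise
    return (W[:, :n] ** 2) @ lam[:n]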
The active subspace’s construction gives insight into the interpretation
of the activity scores. The eigenvector w identifies the most important direction in the parameter space in the following sense: perturbing x along w
changes f more, on average, than perturbing x orthogonal to w; see (17).
The components of w measure the relative change in each component of
x along this most important direction, so they impart significance to each
component of x. The second most important direction is the eigenvector w2 ,
and the relative importance of w2 is measured by the difference between the
eigenvalues λ_1 and λ_2. For example, if λ_1 ≫ λ_2, then f(x) has a dominant
one-dimensional active subspace, and the importance of x’s components is
captured in w’s components. Therefore, to construct the global sensitivity metric, it is reasonable to scale each eigenvector by its corresponding
eigenvalue.
Squaring each eigenvector component in (22) removes information provided by the signs. But the resulting metric is much easier to compare to
existing sensitivity metrics.
4.3. Comparison to existing metrics
We compare the activity scores to the derivative-based metrics and the total
sensitivity indices reviewed in Section 2.
Theorem 4.1. The activity scores are bounded above by the derivative-based
metrics,
\alpha(n) \le \nu.   (23)
The inequality becomes an equality when n = m.
Proof. Note that the derivative-based metrics are the diagonal elements of
the matrix C in (16). Then
\nu = diag(C) = diag(W \Lambda W^T),   (24)

so

\nu_i = \sum_{j=1}^{m} \lambda_j W_{ij}^2 = \alpha_i(m),   (25)
where Wij is the (i, j) element of W , and αi (m) is the ith component of
α(m). This proves the equality statement. To see the inequality, note that
W_{ij}^2 ≥ 0 and λ_j ≥ 0, so

\alpha_i(n) = \sum_{j=1}^{n} \lambda_j W_{ij}^2 \;\le\; \sum_{j=1}^{m} \lambda_j W_{ij}^2 = \nu_i,   (26)
as required.
Theorem 4.1 shows that the activity score can be interpreted as a truncation of the derivative-based metric ν_i from section 2.2. In many practical
situations, the activity scores and the derivative-based metrics give comparable rankings. However, it is possible to construct cases where the rankings
differ. Consider the quadratic function f(x_1, x_2) = (x_1^2 + x_2^2)/2. For this case, the matrix C from (16) is

C = \begin{bmatrix} 1/3 & 0 \\ 0 & 1/3 \end{bmatrix}.   (27)

The derivative-based metrics are the diagonal elements of C, both 1/3, implying equal importance. However, the activity scores with n = 1 are [1/3, 0],
which implies that the second variable is not important at all. This case illustrates that the activity scores are appropriate when there is a gap between
the eigenvalues of C (i.e., f admits an active subspace), and n is chosen according to the gap. In the case of (27), there is no gap between the two
eigenvalues, and therefore f does not admit an active subspace.
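A quick numerical check of this example is below; the gradient of f(x_1, x_2) = (x_1^2 + x_2^2)/2 is simply (x_1, x_2), so the Monte Carlo estimate of C approaches (1/3) I and the two eigenvalues agree:

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200000, 2))
G = X                                                # gradient rows of (x1^2 + x2^2)/2
C_hat = G.T @ G / len(G)
print(np.round(C_hat, 3))                            # approximately [[1/3, 0], [0, 1/3]]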
Sobol’ and Kucherenko connect the total sensitivity indices to the derivative-based metrics in [17]; we use [17, Theorem 2] to relate the total sensitivity
indices to the activity scores.
Theorem 4.2. The total sensitivity indices τ are bounded by
\tau \le \frac{1}{4\pi^2 V} \left( \alpha(n) + \lambda_{n+1} e \right),   (28)

where V = Var[f], λ_{n+1} is the (n+1)st eigenvalue in (16), and e is an m-vector of ones.
Proof. Theorem 2 from [17] shows
\tau_i \le \frac{1}{4\pi^2 V} \, \nu_i.   (29)
Note that an additional factor of 1/4 appears in this case due to scaling the domain and the partial derivatives to the hypercube [−1, 1]^m with a weight
function (1/2)^m. Using Theorem 4.1,
\nu_i = \alpha_i(n) + \sum_{j=n+1}^{m} \lambda_j W_{ij}^2 \;\le\; \alpha_i(n) + \lambda_{n+1} \sum_{j=n+1}^{m} W_{ij}^2 \;\le\; \alpha_i(n) + \lambda_{n+1}.   (30)
The last inequality follows since the rows of W have norm 1. Combining
(29) with (30) completes the proof.
Theorem 4.2 tells us that if λ_{n+1} is small, which often indicates an n-dimensional active subspace, and a component of α(n) is small, then the
corresponding component of the total sensitivity index will also be small.
However, Sobol’ and Kucherenko show an example of a function in [17, Section 7] where the total sensitivity indices and the derivative-based metrics
identify different sets of variables as important. The situation is similar for
the activity scores. This should not be surprising since different sensitivity
metrics measure different characteristics of the function. The derivative-based metrics measure the average response to small perturbations in the
inputs; the activity scores are comparable when the function admits an active subspace. In contrast, the total sensitivity indices measure the variance
attributable to each parameter. In many nicely behaved functions derived
from practical engineering models (such as the examples in section 5), all
metrics are consistent. However, it is possible to construct functions where
the metrics induce different rankings. The appropriate choice of metric depends on the application.
If the dimension of x is sufficiently small and the derivatives of f are
sufficiently smooth, then one may estimate the integrals defining the matrix
C in (16) using a high order numerical quadrature rule. Then the eigenvalues
are computed from the numerically integrated matrix; we use this approach
to compute reference values in section 5.
When these conditions are not satisfied, which is often the case in complex
simulation models, one may get coarser estimates of the activity scores α
from the eigenpairs Λ̂, Ŵ of the Monte Carlo estimate Ĉ in (20). Our
previous work studied the approximation properties of the Monte Carlo-based estimates of the eigenvalues and the subspaces [5]. Unfortunately, those
results do not directly translate to a priori error measures on the estimated
activity scores. We are currently working to find a lower bound on the number
M of samples in (20) such that the activity scores are accurate to within
a user-specified tolerance—similar to [5, Corollary 3.3], which bounds the
number of samples needed for accurate eigenvalue estimation. We conjecture
that this number may be smaller than the number of samples needed for
accurate Monte Carlo estimates of the derivative-based metrics ν from (11),
but such analysis is beyond the scope of the current paper.
Since the Monte Carlo-based estimates of the eigenpairs cannot be interpreted as sums of independent random variables, we cannot compute central
limit theorem-based standard errors. This limitation extends to the activity
scores derived from the eigenpairs. Therefore, we turn to the bootstrap to
estimate the standard error in the Monte Carlo-based estimates of the activity scores. The bootstrap algorithm is described in detail by Efron and
Tibshirani [20, Algorithm 6.1]. We denote the bootstrap standard error of
the activity score α_i by sê_{α_i}. In section 5, we study the bootstrap standard error’s behavior as the number M of samples increases in the Monte Carlo-based estimates.
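A sketch of that bootstrap follows, with the gradient samples stored row-wise in an array G; each replicate resamples rows, recomputes the eigenpairs, and evaluates (22):

import numpy as np

def bootstrap_activity_se(G, n, n_boot=100, rng=np.random.default_rng(0)):
    M, m = G.shape
    reps = np.empty((n_boot, m))
    for k in range(n_boot):
        Gk = G[rng.integers(0, M, size=M)]           # resample gradient rows
        lam, W = np.linalg.eigh(Gk.T @ Gk / M)
        lam, W = lam[::-1], W[:, ::-1]               # decreasing eigenvalue order
        reps[k] = (W[:, :n] ** 2) @ lam[:n]          # activity scores, eq. (22)
    return reps.std(axis=0, ddof=1)

No sign fix is needed here because the eigenvector components are squared.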
5. Numerical experiments
We perform numerical experiments on two models from Bingham’s Virtual
Library of Simulation Experiments [21]: (i) an algebraic model of a cylindrical piston (section 5.1) and (ii) an algebraic model of a transformerless push-pull circuit (section 5.2). The purpose of these experiments is to compare
the sensitivity metrics constructed from active subspaces to the conventional
sensitivity metrics. In both models we evaluate analytic gradients for computations. Both models admit a dominant one-dimensional active subspace
using the Monte Carlo method represented by (20). To analyze the error in
the Monte Carlo-based estimates of all metrics, we compute reference values
using a tensor product Gauss-Legendre quadrature rule with seven points
in each dimension, which results in 7^m function evaluations; m = 7 in the piston model, and m = 6 in the circuit model. Preliminary convergence experiments verified that seven points per dimension were sufficient for at least
9 accurate digits across metrics. For a summary of the reference values, see
Tables 2 and 4.
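For concreteness, here is a sketch of such a tensor product Gauss-Legendre rule; it assumes a scalar-valued model f of m variables and absorbs the uniform density 2^{-m} into the weights. The loop visits all 7^m nodes, so this is only practical for small m:

import numpy as np
from itertools import product

def tensor_gauss_legendre(f, m, pts=7):
    x1, w1 = np.polynomial.legendre.leggauss(pts)    # 1D nodes and weights on [-1, 1]
    w1 = w1 / 2.0                                    # absorb the 1/2 density per dimension
    total = 0.0
    for idx in product(range(pts), repeat=m):
        node = x1[list(idx)]                         # one point of the tensor grid
        total += np.prod(w1[list(idx)]) * f(node)
    return total                                     # approximates E[f] under rho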
For each value of M (the number of Monte Carlo samples), we compute
the (i) relative error in the Monte Carlo-based estimates (e.g., |ν̂i − νi |/νi ,
where νi is the quadrature-based reference value) and (ii) the a posteriori
error estimates, either the central limit theorem-based standard error or the bootstrap standard error. Since the Monte Carlo estimates are random, we repeat
the error computations 10 times for each M and take the average. The average relative error plots for the two models are in Figures 2 and 5, and the
average a posteriori error estimates are in Figures 3 and 6.
We ran the experiments in Matlab 2009b on a dual-core 2011 MacBook Air with 4GB of memory. The complete executable Matlab code
for these experiments is available at https://github.com/PaulMDiaz/as_sensitivity_analysis.
5.1. Piston model
The following algebraic model for a cylindrical piston can be found at http://www.sfu.ca/~ssurjano/piston.html, which also contains Matlab code for evaluating the model predictions for specified inputs. It appears in [22, 23]
as a test model for statistical screening. The model is defined by the following
nonlinear relationships:
t = 2\pi \sqrt{ \frac{M}{ k + S^2 \, \frac{P_0 V_0}{T_0} \, \frac{T_a}{V^2} } },   (31)

V = \frac{S}{2k} \left( \sqrt{ A^2 + 4k \, \frac{P_0 V_0}{T_0} \, T_a } \; - \; A \right),   (32)

A = P_0 S + 19.62\,M - \frac{k V_0}{S}.   (33)
The quantity of interest is t, which is the time in seconds it takes the piston to
complete one cycle. The remaining terms in the model are input parameters.
The parameters’ descriptions and bounds are in Table 1 ordered from top to
bottom. For computation, we shift and scale the 7-dimensional parameter
space to the hypercube, and we equip it with a uniform probability density
function. In the notation in previous sections, the parameters in Table 1 are
x, and the time t is f (x).
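For reference, a Python transcription of (31)–(33) follows; it mirrors the description at the SFU library page rather than reproducing the authors' Matlab code, and the arguments are the seven physical inputs in the units of Table 1:

import numpy as np

def piston_cycle_time(M, S, V0, k, P0, Ta, T0):
    A = P0 * S + 19.62 * M - k * V0 / S                                  # eq. (33)
    V = S / (2 * k) * (np.sqrt(A**2 + 4 * k * P0 * V0 * Ta / T0) - A)    # eq. (32)
    return 2 * np.pi * np.sqrt(M / (k + S**2 * P0 * V0 * Ta / (T0 * V**2)))  # eq. (31)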
We apply Algorithm 1.1 from [4, Chapter 1] to test the piston model for
an active subspace. The 7 eigenvalues from Ĉ in (20) are shown in Figure
1a on a logarithmic scale. The order-of-magnitude gap between the first
and second eigenvalues suggests a dominant and computable one-dimensional
active subspace. Figure 1b is a summary plot for the piston model, which
plots the active variable (i.e., the linear combination of the 7 parameters with
weights from the first eigenvector of Ĉ) on the horizontal axis versus the corresponding model output on the vertical axis.
Input parameters:

M ∈ [30, 60]              Piston Weight (kg)
S ∈ [0.005, 0.020]        Piston Surface Area (m^2)
V0 ∈ [0.002, 0.010]       Initial Gas Volume (m^3)
k ∈ [1000, 5000]          Spring Coefficient (N/m)
P0 ∈ [90000, 110000]      Atmospheric Pressure (N/m^2)
Ta ∈ [290, 296]           Ambient Temperature (K)
T0 ∈ [340, 360]           Filling Gas Temperature (K)

Table 1: Input parameters’ descriptions and ranges for the piston model (31). The parameters are ordered from top to bottom.
The near-one-dimensional
relationship between the active variable and the model output confirms the
one-dimensional active subspace suggested by the eigenvalue gap. See [24, 4]
for more information on summary plots. From these two figures, we choose
n = 1 for the dimension of the active subspace, so we use n = 1 when
computing the activity scores.
Input parameter     τ_i      ν_i      β_i       w_i       α_i(1)
M                   0.0509   0.0032    0.1963    0.1604   0.0018
S                   0.5994   0.0449   −0.7345   −0.7936   0.0437
V0                  0.3528   0.0265    0.5567    0.5768   0.0231
k                   0.0669   0.0040   −0.1424   −0.1035   0.0007
P0                  0.0013   0.0001   −0.0352   −0.0305   0.0001
Ta                  0.0000   0.0000    0.0018    0.0015   0.0000
T0                  0.0001   0.0000   −0.0051   −0.0042   0.0000

Table 2: Reference values for the piston model’s inputs’ sensitivity metrics computed with tensor product Gauss-Legendre quadrature on 7^7 = 823543 points. All sensitivity metrics rank the variables similarly in this model, with one exception in the ordering of M and k with τ_i.
The reference values for the sensitivity metrics, computed with high order
numerical quadrature, are shown in Table 2. We use these values to compute
the relative error in the Monte Carlo estimates for M ∈ {10, 50, 100, 500, 1000, 5000, 10000}
samples. Figure 2 shows average relative error over 10 independent trials.
Figure 3 shows the central limit theorem-based standard error or bootstrap standard error, as appropriate, for the Monte Carlo estimates.
[Figure 1: (a) eigenvalues of Ĉ on a logarithmic scale; (b) summary plot of f(x) against the active variable ŵ_1^T x.]

Figure 1: Figure 1a shows the eigenvalues of Ĉ from (20). The order-of-magnitude gap between the first and second eigenvalues suggests a one-dimensional active subspace. This is confirmed by the summary plot in Figure 1b, which shows that the piston model’s quantity of interest can be reasonably well approximated by a univariate function of one active variable.
5.2. Circuit model
The transformerless push-pull circuit model can be found at http://www.sfu.ca/~ssurjano/otlcircuit.html, along with Matlab code for evaluating the model’s predictions given input parameters. This model appears
in [22, 23] as a test function for statistical screening. The model is defined
as
V = \frac{ \left( 12 R_{b2}/(R_{b1} + R_{b2}) + 0.74 \right) \beta (R_{c2} + 9) }{ \beta (R_{c2} + 9) + R_f } + \frac{ 11.95 R_f }{ \beta (R_{c2} + 9) + R_f } + \frac{ 0.74 R_f \beta (R_{c2} + 9) }{ \left( \beta (R_{c2} + 9) + R_f \right) R_{c1} }.   (34)
The quantity of interest is V , the midpoint voltage. The remaining terms in
the model are input parameters; their descriptions and ranges are in Table
3, ordered from top to bottom.
[Figure 2: relative error versus number of Monte Carlo samples, panels (a) τ_i, (b) ν_i, (c) β_i, (d) w_i, (e) α_i.]

Figure 2: The relative error in the Monte Carlo estimates of the sensitivity metrics for the piston model (31), computed using the reference values in Table 2, for the number M of samples in {10, 50, 100, 500, 1000, 5000, 10000}. The metrics compared are the total sensitivity index (τ_i, Figure 2a), the derivative-based metric (ν_i, Figure 2b), the linear model coefficients (β_i, Figure 2c), the first eigenvector components (w_i, Figure 2d), and the activity scores (α_i, Figure 2e). For the total sensitivity indices, the errors in the Monte Carlo estimates are much larger for the smaller reference values. The same is true of the linear model coefficients. The errors in the derivative-based metrics, the first eigenvector components, and the activity scores behave similarly across the scores.
We shift and scale the parameter space to the hypercube [−1, 1]^6 and equip it with a uniform probability density. In
the notation in previous sections, the parameters in Table 3 are x, and the
midpoint voltage V is f (x).
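A Python transcription of (34) follows, again mirroring the formula rather than the authors' Matlab code; the arguments are the six inputs in the units of Table 3:

def circuit_midpoint_voltage(Rb1, Rb2, Rf, Rc1, Rc2, beta):
    Vb1 = 12.0 * Rb2 / (Rb1 + Rb2)                   # voltage divider term
    denom = beta * (Rc2 + 9.0) + Rf
    return ((Vb1 + 0.74) * beta * (Rc2 + 9.0) / denom
            + 11.95 * Rf / denom
            + 0.74 * Rf * beta * (Rc2 + 9.0) / (denom * Rc1))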
As with the piston model, we apply Algorithm 1.1 from [4, Chapter 1] to
test the circuit model for an active subspace. The 6 eigenvalues from Ĉ in
(20) are shown in Figure 4a on a logarithmic scale.
[Figure 3: standard errors versus number of Monte Carlo samples, panels (a) τ_i, (b) ν_i, (c) β_i, (d) w_i, (e) α_i.]

Figure 3: The central limit theorem-based standard errors se or the bootstrap standard errors sê for the sensitivity metrics associated with the piston model (31) as a function of the number M of Monte Carlo samples. The metrics compared are the total sensitivity index (τ_i, Figure 3a), the derivative-based metric (ν_i, Figure 3b), the linear model coefficients (β_i, Figure 3c), the first eigenvector components (w_i, Figure 3d), and the activity scores (α_i, Figure 3e). The bootstrap standard errors were computed using 100 bootstrap replicates.
As with the piston model, the order-of-magnitude gap between the first and second eigenvalues identifies
a dominant and computable one-dimensional active subspace. Figure 4b is
the summary plot. Again, the near-one-dimensional relationship between the
active variable and the model output confirms the one-dimensional active
subspace suggested by the eigenvalue gap. Therefore, we again choose n = 1
for the dimension of the active subspace and use n = 1 when computing the
activity scores.
The reference values for the sensitivity metrics, computed with high order
numerical quadrature, are shown in Table 4. We run the same Monte Carlo
error study as the piston model in section 5.1.
Input parameters:

Rb1 ∈ [50, 150]           Resistance b1 (K-Ohms)
Rb2 ∈ [25, 75]            Resistance b2 (K-Ohms)
Rf ∈ [0.5, 3]             Resistance f (K-Ohms)
Rc1 ∈ [0.25, 1.2]         Resistance c1 (K-Ohms)
Rc2 ∈ [90000, 110000]     Resistance c2 (K-Ohms)
β ∈ [50, 300]             Current Gain (Amperes)

Table 3: Input parameters’ descriptions and ranges for the circuit model (34). The parameters are ordered from top to bottom.
[Figure 4: (a) eigenvalues of Ĉ on a logarithmic scale; (b) summary plot of f(x) against the active variable ŵ_1^T x.]

Figure 4: Figure 4a shows the eigenvalues of Ĉ from (20). The order-of-magnitude gap between the first and second eigenvalues suggests a one-dimensional active subspace. This is confirmed by the summary plot in Figure 4b, which shows that the circuit model’s quantity of interest can be reasonably well approximated by a univariate function of one active variable.
Figure 5 shows average relative error over 10 independent trials. Figure 6 shows the central limit theorem-based standard error or bootstrap standard error, as appropriate, for the Monte Carlo estimates.
Input parameter     τ_i      ν_i      β_i       w_i       α_i(1)
Rb1                 0.5001   2.4555   −0.6925   −0.7407   2.3778
Rb2                 0.4117   1.7039    0.6358    0.6112   1.6190
Rf                  0.0740   0.2902    0.2662    0.2458   0.2617
Rc1                 0.0218   0.1090   −0.1341   −0.1318   0.0752
Rc2                 0.0000   0.0000   −0.0002   −0.0002   0.0000
β                   0.0000   0.0002   −0.0032   −0.0039   0.0001

Table 4: Reference values for the circuit model’s inputs’ sensitivity metrics computed with tensor product Gauss-Legendre quadrature on 7^6 = 117649 points. All sensitivity metrics rank the variables consistently in this model.
5.3. Discussion
In Tables 2 and 4, every sensitivity metric gives a similar ranking to the input parameters; the only exception is the ordering of k and M using τ_i in
the piston model. All of the sensitivity metrics identify the same sets of non-influential inputs, though it’s worth noting that each metric measures
a different characteristic of the input/output relationship. The signs of the
linear model coefficients and the components of the first eigenvector provide
information about how the input parameter affects the response, on average.
Figures 2 and 5 show us that the Monte Carlo estimates of the activity
score, first eigenvector, and derivative-based sensitivity metric have smaller
errors than estimates of the total sensitivity indices and regression coefficients. Indeed, the relative error in the total sensitivity indices and regression
coefficients corresponding to non-essential parameters—in both models—is
large; this demonstrates a known phenomenon of this metric which has been
studied in [13], where the authors propose two methods to correct for this
effect: reduction of the mean value and correlated sampling. We note this
expected behavior of the total sensitivity index rather than correcting it.
Figures 3 and 6 show the expected O(M^{-1/2}) rate for the central limit theorem-based standard errors, but surprisingly they show a slightly faster
rate of decrease for the bootstrap standard error estimates for the active subspace-based metrics. We do not know how to explain this behavior. Also,
all sample-based a posteriori error estimates underpredict the true relative error.
[Figure 5: relative error versus number of Monte Carlo samples, panels (a) τ_i, (b) ν_i, (c) β_i, (d) w_i, (e) α_i.]

Figure 5: The relative error in the Monte Carlo estimates of the sensitivity metrics for the circuit model (34), computed using the reference values in Table 4, for the number M of samples in {10, 50, 100, 500, 1000, 5000, 10000}. The metrics compared are the total sensitivity index (τ_i, Figure 5a), the derivative-based metric (ν_i, Figure 5b), the linear model coefficients (β_i, Figure 5c), the first eigenvector components (w_i, Figure 5d), and the activity scores (α_i, Figure 5e). As in the piston model, the errors in the Monte Carlo estimates of τ_i and β_i are much larger for the smaller reference values. The errors in the derivative-based metrics, the first eigenvector components, and the activity scores behave similarly across the scores.
6. Summary and conclusions
We propose two global sensitivity analysis metrics derived from the eigenvectors and eigenvalues computed while testing an engineering model for an
active subspace. The two metrics are (i) the components of the first eigenvector and (ii) the activity scores, which are the squared eigenvector components
linearly combined with the eigenvalues.
[Figure 6: standard errors versus number of Monte Carlo samples, panels (a) τ_i, (b) ν_i, (c) β_i, (d) w_i, (e) α_i.]

Figure 6: The central limit theorem-based standard errors se_i or the bootstrap standard errors sê_i for the sensitivity metrics associated with the circuit model (34) as a function of the number M of Monte Carlo samples. The metrics compared are the total sensitivity index (τ_i, Figure 6a), the derivative-based metric (ν_i, Figure 6b), the linear model coefficients (β_i, Figure 6c), the first eigenvector components (w_i, Figure 6d), and the activity scores (α_i, Figure 6e). The bootstrap standard errors were computed using 100 bootstrap replicates.
We show theoretical connections between the activity scores and the more conventional total sensitivity indices
of Sobol’ and the derivative-based metrics of Kucherenko. We discuss high
order quadrature methods and Monte Carlo methods to estimate the new
sensitivity metrics. And we demonstrate the new metrics on two algebraic
engineering models. The models are simple enough to compute reference values with high order quadrature and study the behavior of the Monte Carlo
estimates and their a posteriori error estimates, i.e., the central limit theorem-based standard errors or the bootstrap estimates of the standard errors. For
the two test models, all metrics consistently identify the same important and
unimportant variables. While it is possible to construct models where this is
not the case, since the metrics measure different characteristics of the model,
we expect that this consistency will occur across many models in engineering
practice.
Acknowledgments
This work is supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC-0011077.
References
[1] A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, S. Tarantola, Global Sensitivity Analysis: The Primer, John Wiley & Sons, New York, 2008.

[2] I. M. Sobol', Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, Mathematics and Computers in Simulation 55 (2001) 271–280. doi:10.1016/S0378-4754(00)00270-6.

[3] S. Kucherenko, M. Rodriguez-Fernandez, C. Pantelides, N. Shah, Monte Carlo evaluation of derivative-based global sensitivity measures, Reliability Engineering & System Safety 94 (7) (2009) 1135–1148. doi:10.1016/j.ress.2008.05.006.

[4] P. G. Constantine, Active Subspaces: Emerging Ideas in Dimension Reduction for Parameter Studies, SIAM, Philadelphia, 2015.

[5] P. Constantine, D. Gleich, Computing active subspaces with Monte Carlo, arXiv preprint arXiv:1408.0545v2.

[6] P. Constantine, E. Dow, Q. Wang, Active subspace methods in theory and practice: Applications to kriging surfaces, SIAM Journal on Scientific Computing 36 (4) (2014) A1500–A1524. doi:10.1137/130916138.

[7] T. W. Lukaczyk, P. Constantine, F. Palacios, J. J. Alonso, Active subspaces for shape optimization, American Institute of Aeronautics and Astronautics, 2014. doi:10.2514/6.2014-1171.

[8] P. Constantine, M. Emory, J. Larsson, G. Iaccarino, Exploiting active subspaces to quantify uncertainty in the numerical simulation of the HyShot II scramjet, Journal of Computational Physics 302 (2015) 1–20. doi:10.1016/j.jcp.2015.09.001.

[9] J. L. Jefferson, J. M. Gilbert, P. G. Constantine, R. M. Maxwell, Active subspaces for sensitivity analysis and dimension reduction of an integrated hydrologic model, Computers & Geosciences 83 (2015) 127–138. doi:10.1016/j.cageo.2015.07.001.

[10] P. G. Constantine, B. Zaharatos, M. Campanelli, Discovering an active subspace in a single-diode solar cell model, Statistical Analysis and Data Mining: The ASA Data Science Journal (2015). doi:10.1002/sam.11281.

[11] A. Owen, Variance components and generalized Sobol' indices, SIAM/ASA Journal on Uncertainty Quantification 1 (1) (2013) 19–41. doi:10.1137/120876782.

[12] R. Liu, A. B. Owen, Estimating mean dimensionality of analysis of variance decompositions, Journal of the American Statistical Association 101 (474) (2006) 712–721.

[13] I. M. Sobol', E. E. Myshetskaya, Monte Carlo estimators for small sensitivity indices, Monte Carlo Methods & Applications 13 (5/6) (2008) 455–465.

[14] B. Sudret, Global sensitivity analysis using polynomial chaos expansions, Reliability Engineering & System Safety 93 (7) (2008) 964–979. doi:10.1016/j.ress.2007.04.002.

[15] T. Crestaux, O. Le Maître, J.-M. Martinez, Polynomial chaos expansion for sensitivity analysis, Reliability Engineering & System Safety 94 (7) (2009) 1161–1172. doi:10.1016/j.ress.2008.10.008.

[16] M. D. Morris, Factorial sampling plans for preliminary computational experiments, Technometrics 33 (2) (1991) 161–174. doi:10.1080/00401706.1991.10484804.

[17] I. M. Sobol', S. Kucherenko, Derivative based global sensitivity measures and their link with global sensitivity indices, Mathematics and Computers in Simulation 79 (10) (2009) 3009–3017. doi:10.1016/j.matcom.2009.01.023.

[18] D. Xiu, G. E. Karniadakis, The Wiener–Askey polynomial chaos for stochastic differential equations, SIAM Journal on Scientific Computing 24 (2) (2002) 619–644. doi:10.1137/S1064827501387826.

[19] P. G. Constantine, M. S. Eldred, E. T. Phipps, Sparse pseudospectral approximation method, Computer Methods in Applied Mechanics and Engineering 229–232 (2012) 1–12. doi:10.1016/j.cma.2012.03.019.

[20] B. Efron, R. J. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall/CRC, Boca Raton, 1993.

[21] Virtual Library of Simulation Experiments: Test Functions and Datasets, http://www.sfu.ca/~ssurjano/index.html, accessed 10/13/2015.

[22] E. N. Ben-Ari, D. M. Steinberg, Modeling data from computer experiments: An empirical comparison of kriging with MARS and projection pursuit regression, Quality Engineering 19 (4) (2007) 327–338.

[23] H. Moon, Design and analysis of computer experiments for screening input variables, Ph.D. thesis, The Ohio State University (2010).

[24] R. D. Cook, Regression Graphics: Ideas for Studying Regressions through Graphics, John Wiley & Sons, Hoboken, 2009.