Global sensitivity metrics from active subspaces

Paul Diaz (pdiaz@mymail.mines.edu, http://inside.mines.edu/~pdiaz/), Paul G. Constantine (paul.constantine@mines.edu, http://inside.mines.edu/~pconstan)
Department of Applied Mathematics and Statistics, Colorado School of Mines, Golden, CO 80401

Abstract: Predictions from science and engineering models depend on several input parameters. Global sensitivity analysis metrics quantify the importance of each input parameter, which can lead to insight into the model and reduced computational cost; commonly used sensitivity metrics include Sobol' indices derived from a variance-based decomposition and derivative-based metrics. Active subspaces are an emerging set of tools for identifying important directions in a model's input parameter space; these directions can be exploited to reduce the model's dimension, enabling otherwise infeasible parameter studies. In this paper, we develop global sensitivity metrics called activity scores from the estimated active subspace, which give insight into the important model parameters. We analytically compare the activity scores to established sensitivity metrics, and we discuss computational methods to estimate the activity scores. We then show two numerical examples with algebraic functions from engineering models. For each case, we demonstrate that the model admits a one-dimensional active subspace. We then compare the standard sensitivity metrics to the activity scores, and we find that all metrics give consistent rankings for the input parameters.

Keywords: sensitivity analysis, active subspaces, activity score

1. Introduction

Science and engineering models typically depend on several input parameters. When the model is complex, the scientist may not know which input parameters drive its outputs; she may need to evaluate the model at several parameter values to identify the most important parameters. Sensitivity analysis includes computational methods that seek to identify the most important parameters, which leads to greater insight into the model [1]. The analysis results may also enable the scientist to ignore the less important parameters, thus reducing the cost of parameter studies, such as optimization or uncertainty quantification, that are essential for confident predictions.

Sensitivity metrics are broadly classified as local, which measure the model response to small perturbations around a nominal parameter value, and global, which assess the importance of each variable over a range of parameters; we are primarily concerned with global sensitivity metrics. The total sensitivity indices of Sobol' are some of the most widely used metrics for global sensitivity analysis [2]. Other common metrics use averages of local gradients to build global metrics [3]. The appropriate choice of metric depends on the application. We review three common global sensitivity metrics in Section 2.

Using sensitivity metrics to ignore less important parameters is a form of dimension reduction. This type of reduction is limited to the coordinates induced by the model parameters. Our recent work in active subspaces generalizes the dimension reduction idea to directions in the high-dimensional space of model inputs [4].
A model's output might depend strongly on each coordinate but only through a limited number of directions. Each direction corresponds to a linear combination of the model parameters that we call an active variable. Our work has developed methods for identifying an active subspace [5] and exploiting it, if present, to reduce the dimension of subsequent parameter studies [6]. We have applied this approach to engineering models in aerospace [7, 8], geosciences [9], and alternative energy [10]. We review active subspaces in Section 3.

One difficulty in subspace-based dimension reduction is how to draw insight into the original model parameters once an active subspace has been identified. For example, how might a climate scientist use the fact that 3 × (cloud parameter) + 2 × (radiation parameter) is an important linear combination in a climate model? In this paper, we propose global sensitivity metrics based on active subspaces, and we develop computational methods to estimate these metrics. We quantitatively compare the proposed metrics to existing global sensitivity metrics in Section 4, and we show two numerical examples in Section 5. We conclude in Section 6 with a summary and directions for further research.

1.1. Setup and notation

We consider a differentiable and square-integrable function f = f(x) defined on the m-dimensional hypercube, x = [x_1, ..., x_m]^T ∈ [−1, 1]^m. Define the density ρ(x) = 2^{−m} for x ∈ [−1, 1]^m and zero elsewhere. Then we can write f's mean and variance as

    E[f] = ∫ f ρ dx,    Var[f] = ∫ (f − E[f])² ρ dx.    (1)

We emphasize that f is a deterministic function; the mean and variance do not represent any probabilistic information. Denote the gradient of f by ∇f(x) ∈ R^m, and assume the partial derivatives ∂f/∂x_i are square-integrable. All vector norms are the standard Euclidean norm, and matrix norms are the matrix 2-norm (i.e., the spectral norm).

2. Global sensitivity metrics

We briefly review three global sensitivity metrics used in practice: variance-based metrics, derivative-based metrics, and least-squares-fit linear model coefficients. In each case, we (i) define the metric, (ii) discuss a method for high accuracy computation with numerical quadrature, (iii) detail a Monte Carlo-based method for low accuracy estimation, and (iv) mention practical error estimates for the Monte Carlo-based estimates.

2.1. Variance-based metrics

The total sensitivity indices were introduced by Sobol' [2]; several papers describe these metrics, propose extensions [11], and develop efficient computational procedures [12, 13, 14, 15]. The total sensitivity indices are derived from a particular decomposition of the square-integrable f(x), often called the functional ANOVA decomposition or the variance-based decomposition; we use the concise notation from Liu and Owen [12]. Let u be a subset of the integers {1, ..., m} with cardinality |u|. Define x_u ∈ [−1, 1]^{|u|}, and let ρ_u(x_u) = 2^{−|u|} for x_u ∈ [−1, 1]^{|u|} and zero elsewhere. The subset −u is the complement of u in {1, ..., m}. The ANOVA decomposition can be written

    f(x) = Σ_{u ⊆ {1,...,m}} f_u(x),    (2)

where f_u depends on x only through x_u. The function f_u is defined as

    f_u(x) = ∫ f(x) ρ_{−u} dx_{−u} − Σ_{v ⊂ u} f_v(x),    (3)

and f_∅ = E[f]. The functions f_u are mutually orthogonal, and the variance of f can be decomposed as

    Var[f] = Σ_{u ⊆ {1,...,m}} Var[f_u].    (4)

Define the set S_i to be the set of subsets of {1, ..., m} containing the index i. The total sensitivity index τ_i for the ith variable x_i is the sum of the variances Var[f_u] over the subsets in S_i, divided by the total variance,

    τ_i = ( Σ_{u ∈ S_i} Var[f_u] ) / Var[f].    (5)
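To make the decomposition concrete, consider a small worked example (ours, not from the original): f(x_1, x_2) = x_1 + x_1 x_2 on [−1, 1]² with uniform ρ. Then f_∅ = 0, f_{1} = x_1, f_{2} = 0, and f_{12} = x_1 x_2, so Var[f_{1}] = 1/3, Var[f_{12}] = 1/9, and Var[f] = 4/9. The total sensitivity indices are

    τ_1 = (1/3 + 1/9) / (4/9) = 1,    τ_2 = (1/9) / (4/9) = 1/4,

since every variance term involves x_1, while x_2 contributes only through the interaction term.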
Other sensitivity metrics can be defined from the variances Var[f_u]; we focus exclusively on τ_i from (5) to compare with the metrics derived from active subspaces; see Section 4.

When the dimension of x is sufficiently low, f(x) is sufficiently smooth, and f(x) can be computed quickly, one can accurately estimate τ_i using a high order numerical quadrature rule. This approach is equivalent to using the Fourier coefficients as in [14], and we use it to compute reference values for the numerical examples in Section 5. If f(x) does not satisfy these assumptions, one can use a less accurate Monte Carlo method as in [1, Section 4.6]. Next we review the Monte Carlo method and its central limit theorem-based standard error.

First, draw 2M samples x_j independently according to the uniform density ρ(x) on [−1, 1]^m. We arrange these samples into two M × m matrices, A and B, defined as

    A = [x_1^T; ...; x_M^T],    B = [x_{M+1}^T; ...; x_{2M}^T].    (6)

Define the M × m matrix C_i to be the matrix B with the ith column replaced by the ith column of A. Define the M-vector f_A as

    f_A = [f(x_1); ...; f(x_M)].    (7)

In words, the vector f_A contains evaluations of the function f at each row of the matrix A. Similarly, define the M-vectors f_B and f_{C_i} whose elements contain evaluations of f at the rows of their respective subscripted matrices. These calculations use (m + 2)M evaluations of f. The total sensitivity indices may be estimated as

    τ_i ≈ τ̂_i = 1 − ( 2 f_A^T f_{C_i} − f̂_0² ) / ( ‖f_A‖² + ‖f_B‖² − f̂_0² ),    (8)

where the sample mean f̂_0 is

    f̂_0 = (f_A + f_B)^T e / (2M),    (9)

and e is an M-vector of ones. Monte Carlo estimates come equipped with an asymptotically valid estimate of the error derived from the central limit theorem; in statistics, these error estimates are called standard errors. The standard error se_{τ_i} for the Monte Carlo estimate τ̂_i is

    se_{τ_i} = [ ( ‖f_{A,C_i}‖² − (f_A^T f_{C_i})² / M ) / M² ]^{1/2},    (10)

where f_{A,C_i} is the element-wise product of f_A and f_{C_i}. In Section 5, we study the standard error of the estimator by investigating the values of se_{τ_i} for different values of M; see Figures 3 and 6.
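The sampling scheme above is easy to prototype. The sketch below is ours, written in Python/numpy rather than the paper's Matlab, and it uses Jansen's form of the total-effect estimator in place of (8) verbatim; it exploits the same A, B, C_i structure and the same (m + 2)M evaluations. The toy function is the worked ANOVA example from earlier in this section, for which τ = [1, 1/4].

    import numpy as np

    def total_indices_jansen(f, m, M, rng=None):
        """Estimate the total sensitivity indices tau_i with the sample
        matrices A, B, and C_i of Section 2.1, using Jansen's form of the
        total-effect estimator. Costs (m + 2) * M evaluations of f."""
        rng = np.random.default_rng(rng)
        A = rng.uniform(-1.0, 1.0, (M, m))
        B = rng.uniform(-1.0, 1.0, (M, m))
        fA = np.apply_along_axis(f, 1, A)
        fB = np.apply_along_axis(f, 1, B)
        var = np.var(np.concatenate([fA, fB]))    # sample variance of f
        tau = np.empty(m)
        for i in range(m):
            Ci = B.copy()
            Ci[:, i] = A[:, i]                    # B with ith column from A
            fCi = np.apply_along_axis(f, 1, Ci)
            tau[i] = np.mean((fB - fCi) ** 2) / (2.0 * var)
        return tau

    # Toy check on f(x) = x1 + x1*x2, where tau = [1, 1/4] analytically.
    f = lambda x: x[0] + x[0] * x[1]
    print(total_indices_jansen(f, m=2, M=20000, rng=0))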
2.2. Derivative-based sensitivity metrics

A natural notion of sensitivity is how a model output responds to changes in its inputs, i.e., a derivative. However, derivatives only provide information at a single parameter value. Morris developed sensitivity metrics based on sampling and averaging coarse finite difference approximations to derivatives [16]. Kucherenko et al. [3] extended this idea to continuous derivatives and integration, where the averages are estimated with Monte Carlo methods. We focus on one particular derivative-based sensitivity metric studied by Sobol' and Kucherenko [17]:

    ν_i = ∫ (∂f/∂x_i)² ρ dx,    i = 1, ..., m.    (11)

This metric averages local sensitivity information to make it a global metric. We discuss this metric's connection to active subspaces in Section 4.

If f's derivatives are sufficiently smooth, and if the dimension of x is low, then one can compute accurate estimates of ν_i using a high order numerical quadrature rule; we use the quadrature approach for reference values in the numerical examples in Section 5. The Monte Carlo estimate of ν_i using M samples is

    ν_i ≈ ν̂_i = (1/M) Σ_{j=1}^M ( ∂f/∂x_i (x_j) )²,    (12)

where the x_j's are drawn independently according to ρ(x). The standard error se_{ν_i} is

    se_{ν_i} = (1/√M) [ (1/(M−1)) Σ_{j=1}^M ( (∂f/∂x_i (x_j))² − ν̂_i )² ]^{1/2}.    (13)

In Section 5, we study the standard error of the estimator by investigating se_{ν_i} for different values of M; see Figures 3 and 6.

2.3. Linear model coefficients

A simple and easily computable sensitivity metric can be derived from the coefficients of a least-squares-fit linear approximation to f(x). Such a metric is valid when f is smooth and monotonic along each component of x. Saltelli et al. [1, Section 1.2.5] discuss a similar metric in connection with a linear regression model. Since f(x) is deterministic, the statistical interpretation of the regression coefficients is not valid. Nevertheless, we can devise a Monte Carlo-based least-squares procedure to estimate the coefficients of a linear approximation as follows. Draw x_j independently according to ρ(x) for j = 1, ..., M, and compute f_j = f(x_j). Next, use least-squares to fit the intercept b̂_0 and coefficients b̂ = [b̂_1, ..., b̂_m]^T of a linear approximation,

    f_j ≈ b̂_0 + b̂^T x_j,    j = 1, ..., M.    (14)

When the coefficients b̂_i are properly scaled, they become useful sensitivity metrics. Define β̂_i as

    β̂_i = b̂_i / ( √3 σ̂_f ),    (15)

where σ̂_f is the standard deviation of the samples {f_j}. The factor 1/√3 is the standard deviation of the uniform random variable x_i on [−1, 1]. If f(x) is monotonic and sufficiently smooth, then β̂_i gives a rough indication of how f changes in response to a change in x_i over the space of inputs. In contrast to the total sensitivity index τ_i from (8) and the derivative-based metric ν_i from (11), the β̂_i's are signed, so they indicate if f will increase or decrease given a positive change in x_i.

We can compute reference values for β̂_i by noting the relationship between the m-dimensional Legendre series truncated to degree 1 (i.e., only a constant and m linear terms) and the least-squares linear approximation; the Legendre series is often referred to as the generalized polynomial chaos associated with the uniform density [14, 18]. In particular, the degree-1 Legendre approximation is the best linear approximation in the L² norm. Its coefficients admit a simple integral representation by virtue of the Legendre polynomials' orthogonality. Thus, we can estimate the coefficients with high order numerical quadrature [19] and scale them to get the coefficients of a linear monomial approximation as in (14). When reference values are too expensive to compute, we can estimate the standard error in β̂_i with a bootstrap standard error, which we denote ŝe_{β_i}. See [20, Section 9.5] for a detailed description of using the bootstrap for linear regression coefficients. Note that randomness in the estimates is a result of the Monte Carlo method; there is no randomness in the evaluation of f(x). In Section 5, we study the bootstrap standard error by investigating the values of ŝe_{β_i} for different values of M; see Figures 3 and 6.
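The fit in (14)-(15) takes only a few lines in practice. A minimal sketch, ours in Python/numpy, assuming f maps [−1, 1]^m to a scalar:

    import numpy as np

    def linear_coefficients(f, m, M, rng=None):
        """Fit the least-squares linear model (14) and return the scaled
        coefficients beta_hat from (15)."""
        rng = np.random.default_rng(rng)
        X = rng.uniform(-1.0, 1.0, (M, m))
        fX = np.apply_along_axis(f, 1, X)
        # Augment with a column of ones so the intercept b0 is fit jointly.
        design = np.column_stack([np.ones(M), X])
        coef, *_ = np.linalg.lstsq(design, fX, rcond=None)
        b = coef[1:]                          # drop the intercept
        sigma_f = np.std(fX, ddof=1)          # sample standard deviation of f
        return b / (np.sqrt(3.0) * sigma_f)   # eq. (15)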
Suppose λ_n > λ_{n+1} for some n less than m. Then we partition

    Λ = [Λ_1 0; 0 Λ_2],    W = [W_1 W_2],    (18)

where Λ_1 is a diagonal matrix of the first n eigenvalues, and W_1 is the m × n matrix containing the first n eigenvectors. If λ_{n+1}, ..., λ_m are sufficiently small, then a reasonable low-dimensional approximation of f is

    f(x) ≈ g(W_1^T x),    (19)

where g : R^n → R. In [6], we propose a particular form for g based on a conditional average, and we bound its mean-squared error in terms of the eigenvalues λ_{n+1}, ..., λ_m.

To identify an active subspace, we must estimate the eigenpairs Λ, W and look for a gap in the eigenvalues. If the dimension m is sufficiently small, then one might use high order numerical integration rules to estimate C in (16). In [5], we propose and analyze a Monte Carlo method for estimating these eigenpairs. For i = 1, ..., M, draw x_i uniformly at random from the hypercube [−1, 1]^m, and compute ∇f_i = ∇f(x_i). Then estimate

    C ≈ Ĉ = (1/M) Σ_{i=1}^M ∇f_i ∇f_i^T = Ŵ Λ̂ Ŵ^T.    (20)

In [5], we study how large M must be so that (i) the estimated eigenvalues Λ̂ are close to the true eigenvalues Λ and (ii) the subspace distance,

    dist(ran(W_1), ran(Ŵ_1)) = ‖W_1 W_1^T − Ŵ_1 Ŵ_1^T‖,    (21)

where Ŵ_1 contains the first n columns of Ŵ, is well-behaved.

4. Sensitivity metrics from active subspaces

We propose two possible metrics derived from the eigenvectors W and eigenvalues Λ from (16). The first set of metrics uses the components of the first eigenvector (i.e., the eigenvector associated with the largest eigenvalue). The second set of metrics uses linear combinations of the squared eigenvector components weighted by the eigenvalues, which we call activity scores.

4.1. The first eigenvector

In [10, Section 4.4], we used the first eigenvector from the matrix Ĉ in (20) to identify the most important parameters in a single-diode model of a photovoltaic solar cell. For this particular model, the magnitudes of the eigenvector components gave the same ranking as the total sensitivity indices (see Section 2.1). However, these two metrics indicate different characteristics of the function. The total sensitivity indices measure the proportion of the variance attributed to each parameter, while the eigenvector identifies an important direction in the parameter space.

Like the linear model coefficients in Section 2.3, the eigenvector components are not all positive, and the relative difference in signs (since eigenvectors are only unique up to a sign) provides insight into the relationship between inputs and outputs. For example, if two eigenvector components are equal in magnitude but differ in sign, then we expect f to increase when one parameter is perturbed but decrease when the other is perturbed. This is similar to the gradient-based metrics described by Morris [16] and Kucherenko [3]. However, the trouble of averaging a gradient that might change signs does not affect the eigenvector. In the numerical examples in Section 5, we compare the first eigenvector's components to other sensitivity metrics, but we do not provide a theoretical comparison.

We use the bootstrap to estimate a standard error in the eigenvector components, though one must be careful to keep the signs consistent when computing the bootstrap replicates. Section 7.2 of [20] shows an example of using the bootstrap to estimate standard errors in eigenvector components. When the context is clear, we denote the first eigenvector by w = [w_1, ..., w_m]^T and the bootstrap standard error by ŝe_{w_i}.
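A sketch of the Monte Carlo estimate (20) and the sign-consistent bootstrap described above; this is our Python rendering (the paper's experiments use Matlab):

    import numpy as np

    def active_subspace(grads):
        """Eigenpairs of C_hat = (1/M) sum_j grad_j grad_j^T, eq. (20).
        grads is an M x m array of gradient samples at uniform points."""
        M, m = grads.shape
        C_hat = grads.T @ grads / M
        lam, W = np.linalg.eigh(C_hat)       # eigh returns ascending order
        order = np.argsort(lam)[::-1]        # sort into decreasing order
        return lam[order], W[:, order]

    def bootstrap_se_w1(grads, n_boot=100, rng=None):
        """Bootstrap standard error of the first eigenvector's components,
        aligning signs across replicates as cautioned in Section 4.1."""
        rng = np.random.default_rng(rng)
        M = grads.shape[0]
        _, W = active_subspace(grads)
        w1 = W[:, 0]
        reps = []
        for _ in range(n_boot):
            idx = rng.integers(0, M, M)      # resample gradients with replacement
            _, Wb = active_subspace(grads[idx])
            w1b = Wb[:, 0]
            if w1b @ w1 < 0:                 # eigenvectors are unique only up to sign
                w1b = -w1b
            reps.append(w1b)
        return np.std(np.array(reps), axis=0, ddof=1)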
4.2. Activity scores

We develop a more comprehensive and justifiable sensitivity metric from the eigenpairs in (16). We propose the following metric for global sensitivity analysis. Define α ∈ R^m as

    α = α(n) = Σ_{j=1}^n λ_j w_j²,    (22)

where the exponent on the vector w_j means squaring each component of w_j. We call α(n) the activity scores, and we use these numbers to rank the importance of a model's inputs.

The active subspace's construction gives insight into the interpretation of the activity scores. The eigenvector w identifies the most important direction in the parameter space in the following sense: perturbing x along w changes f more, on average, than perturbing x orthogonal to w; see (17). The components of w measure the relative change in each component of x along this most important direction, so they impart significance to each component of x. The second most important direction is the eigenvector w_2, and the relative importance of w_2 is measured by the difference between the eigenvalues λ_1 and λ_2. For example, if λ_1 ≫ λ_2, then f(x) has a dominant one-dimensional active subspace, and the importance of x's components is captured in w's components. Therefore, to construct the global sensitivity metric, it is reasonable to scale each eigenvector by its corresponding eigenvalue. Squaring each eigenvector component in (22) discards the information provided by the signs, but the resulting metric is much easier to compare to existing sensitivity metrics.

4.3. Comparison to existing metrics

We compare the activity scores to the derivative-based metrics and the total sensitivity indices reviewed in Section 2.

Theorem 4.1. The activity scores are bounded above, componentwise, by the derivative-based metrics,

    α(n) ≤ ν.    (23)

The inequality becomes an equality when n = m.

Proof. Note that the derivative-based metrics are the diagonal elements of the matrix C in (16). Then

    ν = diag(C) = diag(W Λ W^T).    (24)

So

    ν_i = Σ_{j=1}^m λ_j W_ij² = α_i(m),    (25)

where W_ij is the (i, j) element of W, and α_i(m) is the ith component of α(m). This proves the equality statement. To see the inequality, note that W_ij² ≥ 0 and λ_j ≥ 0, so

    α_i(n) = Σ_{j=1}^n λ_j W_ij² ≤ Σ_{j=1}^m λ_j W_ij² = ν_i,    (26)

as required.

Theorem 4.1 shows that the activity score can be interpreted as a truncation of the derivative-based metric ν_i from Section 2.2. In many practical situations, the activity scores and the derivative-based metrics give comparable rankings. However, it is possible to construct cases where the rankings differ. Consider the quadratic function f(x_1, x_2) = (x_1² + x_2²)/2. For this case, the matrix C from (16) is

    C = [1/3 0; 0 1/3].    (27)

The derivative-based metrics are the diagonal elements of C, both 1/3, implying equal importance. However, the activity scores with n = 1 are [1/3, 0], which implies that the second variable is not important at all. This case illustrates that the activity scores are appropriate when there is a gap between the eigenvalues of C (i.e., f admits an active subspace), and n is chosen according to the gap. In the case of (27), there is no gap between the two eigenvalues, and therefore f does not admit an active subspace.
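This failure mode is easy to see numerically. In the sketch below (ours), the Monte Carlo estimate Ĉ from (20) for this quadratic has no eigenvalue gap, so its first eigenvector, and hence α(1), is set entirely by sampling noise and changes from seed to seed:

    import numpy as np

    # For f(x1, x2) = (x1^2 + x2^2)/2 we have grad f = (x1, x2), so
    # C = diag(1/3, 1/3): equal derivative-based metrics and no gap.
    for seed in (0, 1):
        rng = np.random.default_rng(seed)
        grads = rng.uniform(-1.0, 1.0, (50000, 2))   # grad f(x) = x
        C_hat = grads.T @ grads / len(grads)
        lam, W = np.linalg.eigh(C_hat)
        lam, W = lam[::-1], W[:, ::-1]               # decreasing eigenvalues
        print("nu   =", np.diag(C_hat))              # ~ [1/3, 1/3]
        print("a(1) =", lam[0] * W[:, 0] ** 2)       # arbitrary split of ~1/3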
Sobol' and Kucherenko connect the total sensitivity indices to the derivative-based metrics in [17]; we use [17, Theorem 2] to relate the total sensitivity indices to the activity scores.

Theorem 4.2. The total sensitivity indices τ are bounded by

    τ ≤ (1 / (4π²V)) (α(n) + λ_{n+1} e),    (28)

where V = Var[f], λ_{n+1} is the (n+1)st eigenvalue in (16), and e is an m-vector of ones.

Proof. Theorem 2 from [17] shows

    τ_i ≤ ν_i / (4π²V).    (29)

Note that an additional factor of 1/4 appears in this case due to scaling the domain and the partial derivatives to the hypercube [−1, 1]^m with weight function (1/2)^m. Using Theorem 4.1,

    ν_i = α_i(n) + Σ_{j=n+1}^m λ_j W_ij² ≤ α_i(n) + λ_{n+1} Σ_{j=n+1}^m W_ij² ≤ α_i(n) + λ_{n+1}.    (30)

The last inequality follows since the rows of W have norm 1. Combining (29) with (30) completes the proof.

Theorem 4.2 tells us that if λ_{n+1} is small (which often indicates an n-dimensional active subspace) and a component of α(n) is small, then the corresponding component of the total sensitivity index will also be small. However, Sobol' and Kucherenko show an example of a function in [17, Section 7] where the total sensitivity indices and the derivative-based metrics identify different sets of variables as important. The situation is similar for the activity scores. This should not be surprising, since different sensitivity metrics measure different characteristics of the function. The derivative-based metrics measure the average response to small perturbations in the inputs; the activity scores are comparable when the function admits an active subspace. In contrast, the total sensitivity indices measure the variance attributable to each parameter. In many nicely behaved functions derived from practical engineering models (such as the examples in Section 5), all metrics are consistent. However, it is possible to construct functions where the metrics induce different rankings. The appropriate choice of metric depends on the application.

If the dimension of x is sufficiently small and the derivatives of f are sufficiently smooth, then one may estimate the integrals defining the matrix C in (16) using a high order numerical quadrature rule. Then the eigenvalues are computed from the numerically integrated matrix; we use this approach to compute reference values in Section 5. When these conditions are not satisfied, which is often the case in complex simulation models, one may get coarser estimates of the activity scores α from the eigenpairs Λ̂, Ŵ of the Monte Carlo estimate Ĉ in (20).

Our previous work studied the approximation properties of the Monte Carlo-based estimates of the eigenvalues and the subspaces [5]. Unfortunately, those results do not directly translate to a priori error measures on the estimated activity scores. We are currently working to find a lower bound on the number M of samples in (20) such that the activity scores are accurate to within a user-specified tolerance, similar to [5, Corollary 3.3], which bounds the number of samples needed for accurate eigenvalue estimation. We conjecture that this number may be smaller than the number of samples needed for accurate Monte Carlo estimates of the derivative-based metrics ν from (11), but such analysis is beyond the scope of the current paper.

Since the Monte Carlo-based estimates of the eigenpairs cannot be interpreted as sums of independent random variables, we cannot compute central limit theorem-based standard errors. This limitation extends to the activity scores derived from the eigenpairs. Therefore, we turn to the bootstrap to estimate the standard error in the Monte Carlo-based estimates of the activity scores. The bootstrap algorithm is described in detail by Efron and Tibshirani [20, Algorithm 6.1]. We denote the bootstrap standard error of the activity score α_i by ŝe_{α_i}. In Section 5, we study the bootstrap standard error's behavior as the number M of samples increases in the Monte Carlo-based estimates.
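A sketch of this bootstrap, ours in Python; because the eigenvector components enter (22) squared, no sign alignment is needed here:

    import numpy as np

    def activity_scores(grads, n):
        """Activity scores alpha(n), eq. (22), from the eigenpairs of
        C_hat in eq. (20). grads is an M x m array of gradient samples."""
        C_hat = grads.T @ grads / grads.shape[0]
        lam, W = np.linalg.eigh(C_hat)
        lam, W = lam[::-1], W[:, ::-1]           # decreasing eigenvalues
        return (W[:, :n] ** 2) @ lam[:n]         # sum_j lam_j * W_ij^2

    def bootstrap_se_alpha(grads, n, n_boot=100, rng=None):
        """Bootstrap standard error of alpha(n): resample the gradient
        draws with replacement and recompute the scores."""
        rng = np.random.default_rng(rng)
        M = grads.shape[0]
        reps = [activity_scores(grads[rng.integers(0, M, M)], n)
                for _ in range(n_boot)]
        return np.std(np.array(reps), axis=0, ddof=1)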
5. Numerical experiments

We perform numerical experiments on two models from Bingham's Virtual Library of Simulation Experiments [21]: (i) an algebraic model of a cylindrical piston (Section 5.1) and (ii) an algebraic model of a transformerless push-pull circuit (Section 5.2). The purpose of these experiments is to compare the sensitivity metrics constructed from active subspaces to the conventional sensitivity metrics. In both models we evaluate analytic gradients for computations. Both models admit a dominant one-dimensional active subspace using the Monte Carlo method represented by (20).

To analyze the error in the Monte Carlo-based estimates of all metrics, we compute reference values using a tensor product Gauss-Legendre quadrature rule with seven points in each dimension, which results in 7^m function evaluations; m = 7 in the piston model, and m = 6 in the circuit model. Preliminary convergence experiments verified that seven points per dimension was sufficient for at least 9 accurate digits across metrics. For a summary of the reference values, see Tables 2 and 4. For each value of M (the number of Monte Carlo samples), we compute (i) the relative error in the Monte Carlo-based estimates (e.g., |ν̂_i − ν_i|/ν_i, where ν_i is the quadrature-based reference value) and (ii) the a posteriori error estimates, either the central limit theorem-based standard error or the bootstrap standard error. Since the Monte Carlo estimates are random, we repeat the error computations 10 times for each M and take the average. The average relative error plots for the two models are in Figures 2 and 5, and the average a posteriori error estimates are in Figures 3 and 6. We ran the experiments in Matlab 2009b on a dual-core 2011 MacBook Air with 4GB of memory. The complete executable Matlab code for these experiments is available at https://github.com/PaulMDiaz/as_sensitivity_analysis.

5.1. Piston model

The following algebraic model for a cylindrical piston can be found at http://www.sfu.ca/~ssurjano/piston.html, which also contains Matlab code for evaluating the model predictions for specified inputs. It appears in [22, 23] as a test model for statistical screening. The model is defined by the following nonlinear relationships:

    t = 2π sqrt( M / ( k + S² (P_0 V_0 / T_0) (T_a / V²) ) ),    (31)

    V = (S / (2k)) ( sqrt( A² + 4k (P_0 V_0 / T_0) T_a ) − A ),    (32)

    A = P_0 S + 19.62 M − k V_0 / S.    (33)

The quantity of interest is t, the time in seconds it takes the piston to complete one cycle. The remaining terms in the model are input parameters. The parameters' descriptions and bounds are in Table 1, ordered from top to bottom. For computation, we shift and scale the 7-dimensional parameter space to the hypercube, and we equip it with a uniform probability density function. In the notation of the previous sections, the parameters in Table 1 are x, and the time t is f(x).
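For reference, the following Python sketch is our transcription of (31)-(33), shifted to the hypercube with the ranges and ordering of Table 1 below; it is not the authors' Matlab code, and the finite-difference gradient is a stand-in for the analytic gradients used in the paper.

    import numpy as np

    # Input ranges from Table 1, in the order (M, S, V0, k, P0, Ta, T0).
    LO = np.array([30, 0.005, 0.002, 1000, 90000, 290, 340], dtype=float)
    HI = np.array([60, 0.020, 0.010, 5000, 110000, 296, 360], dtype=float)

    def piston(x):
        """Cycle time t from eqs. (31)-(33); x lives in [-1, 1]^7."""
        M, S, V0, k, P0, Ta, T0 = LO + (HI - LO) * (x + 1.0) / 2.0
        A = P0 * S + 19.62 * M - k * V0 / S                                 # (33)
        V = S / (2.0 * k) * (np.sqrt(A**2 + 4.0 * k * P0 * V0 * Ta / T0) - A)   # (32)
        return 2.0 * np.pi * np.sqrt(M / (k + S**2 * P0 * V0 * Ta / (T0 * V**2)))  # (31)

    def grad_fd(f, x, h=1e-6):
        """Central finite differences as a stand-in for analytic gradients."""
        E = np.eye(len(x))
        return np.array([(f(x + h * e) - f(x - h * e)) / (2.0 * h) for e in E])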
We apply Algorithm 1.1 from [4, Chapter 1] to test the piston model for an active subspace. The 7 eigenvalues from Ĉ in (20) are shown in Figure 1a on a logarithmic scale. The order-of-magnitude gap between the first and second eigenvalues suggests a dominant and computable one-dimensional active subspace. Figure 1b is a summary plot for the piston model, which plots the active variable (i.e., the linear combination of the 7 parameters with weights from the first eigenvector of Ĉ) on the horizontal axis versus the corresponding model output on the vertical axis. The near-one-dimensional relationship between the active variable and the model output confirms the one-dimensional active subspace suggested by the eigenvalue gap. See [24, 4] for more information on summary plots. From these two figures, we choose n = 1 for the dimension of the active subspace, so we use n = 1 when computing the activity scores.

Table 1: Input parameters' descriptions and ranges for the piston model (31). The parameters are ordered from top to bottom.

    M  ∈ [30, 60]           Piston weight (kg)
    S  ∈ [0.005, 0.020]     Piston surface area (m²)
    V0 ∈ [0.002, 0.010]     Initial gas volume (m³)
    k  ∈ [1000, 5000]       Spring coefficient (N/m)
    P0 ∈ [90000, 110000]    Atmospheric pressure (N/m²)
    Ta ∈ [290, 296]         Ambient temperature (K)
    T0 ∈ [340, 360]         Filling gas temperature (K)

Figure 1: (a) Eigenvalues of Ĉ; (b) summary plot. Figure 1a shows the eigenvalues of Ĉ from (20). The order-of-magnitude gap between the first and second eigenvalue suggests a one-dimensional active subspace. This is confirmed by the summary plot in Figure 1b, which shows that the piston model's quantity of interest can be reasonably well approximated by a univariate function of one active variable.

The reference values for the sensitivity metrics, computed with high order numerical quadrature, are shown in Table 2. We use these values to compute the relative error in the Monte Carlo estimates for M ∈ {10, 50, 100, 500, 1000, 5000, 10000} samples. Figure 2 shows the average relative error over 10 independent trials. Figure 3 shows the central limit theorem-based standard error or bootstrap standard error, as appropriate, for the Monte Carlo estimates.

Table 2: Reference values for the piston model's input sensitivity metrics, computed with tensor product Gauss-Legendre quadrature on 7^7 = 823543 points. All sensitivity metrics rank the variables similarly in this model, with one exception in the ordering of M and k with τ_i.

    Input parameter |   τ_i   |   ν_i   |   β_i    |   w_i    |  α_i(1)
    M               |  0.0509 |  0.0032 |  0.1963  |  0.1604  |  0.0018
    S               |  0.5994 |  0.0449 | −0.7345  | −0.7936  |  0.0437
    V0              |  0.3528 |  0.0265 |  0.5567  |  0.5768  |  0.0231
    k               |  0.0669 |  0.0040 | −0.1424  | −0.1035  |  0.0007
    P0              |  0.0013 |  0.0001 | −0.0352  | −0.0305  |  0.0001
    Ta              |  0.0000 |  0.0000 |  0.0018  |  0.0015  |  0.0000
    T0              |  0.0001 |  0.0000 | −0.0051  | −0.0042  |  0.0000

Figure 2: The relative error in the Monte Carlo estimates of the sensitivity metrics for the piston model (31), computed using the reference values in Table 2, for the number M of samples in {10, 50, 100, 500, 1000, 5000, 10000}. The metrics compared are the total sensitivity index (τ_i, Figure 2a), the derivative-based metric (ν_i, Figure 2b), the linear model coefficients (β_i, Figure 2c), the first eigenvector components (w_i, Figure 2d), and the activity scores (α_i, Figure 2e). For the total sensitivity indices, the errors in the Monte Carlo estimates are much larger for the smaller reference values. The same is true of the linear model coefficients. The errors in the derivative-based metrics, the first eigenvector components, and the activity scores behave similarly across the scores.

Figure 3: The central limit theorem-based standard errors se or the bootstrap standard errors ŝe for the sensitivity metrics associated with the piston model (31) as a function of the number M of Monte Carlo samples. The metrics compared are the total sensitivity index (τ_i, Figure 3a), the derivative-based metric (ν_i, Figure 3b), the linear model coefficients (β_i, Figure 3c), the first eigenvector components (w_i, Figure 3d), and the activity scores (α_i, Figure 3e). The bootstrap standard errors were computed using 100 bootstrap replicates.
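A summary plot like Figure 1b takes only a few lines once gradient samples are available. The sketch below is ours and reuses the piston and grad_fd helpers from the sketch earlier in this section:

    import numpy as np
    import matplotlib.pyplot as plt
    # Assumes piston and grad_fd from the sketch above are in scope.

    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, (300, 7))
    G = np.array([grad_fd(piston, x) for x in X])
    lam, W = np.linalg.eigh(G.T @ G / len(X))    # eigenpairs of C_hat, eq. (20)
    w1 = W[:, np.argmax(lam)]                    # first eigenvector
    plt.scatter(X @ w1, [piston(x) for x in X], s=8)
    plt.xlabel("active variable w1^T x")
    plt.ylabel("cycle time t")
    plt.show()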
5.2. Circuit model

The transformerless push-pull circuit model can be found at http://www.sfu.ca/~ssurjano/otlcircuit.html, along with Matlab code for evaluating the model's predictions given input parameters. This model appears in [22, 23] as a test function for statistical screening. The model is defined as

    V = ( (12 R_b2 / (R_b1 + R_b2) + 0.74) β (R_c2 + 9) ) / ( β (R_c2 + 9) + R_f )
        + 11.35 R_f / ( β (R_c2 + 9) + R_f )
        + 0.74 R_f β (R_c2 + 9) / ( ( β (R_c2 + 9) + R_f ) R_c1 ).    (34)

The quantity of interest is V, the midpoint voltage. The remaining terms in the model are input parameters; their descriptions and ranges are in Table 3, ordered from top to bottom.

Table 3: Input parameters' descriptions and ranges for the circuit model (34). The parameters are ordered from top to bottom.

    Rb1 ∈ [50, 150]      Resistance b1 (K-Ohms)
    Rb2 ∈ [25, 75]       Resistance b2 (K-Ohms)
    Rf  ∈ [0.5, 3]       Resistance f (K-Ohms)
    Rc1 ∈ [1.2, 2.5]     Resistance c1 (K-Ohms)
    Rc2 ∈ [0.25, 1.2]    Resistance c2 (K-Ohms)
    β   ∈ [50, 300]      Current gain (Amperes)

We shift and scale the parameter space to the hypercube [−1, 1]^6 and equip it with a uniform probability density. In the notation of the previous sections, the parameters in Table 3 are x, and the midpoint voltage V is f(x).

As with the piston model, we apply Algorithm 1.1 from [4, Chapter 1] to test the circuit model for an active subspace. The 6 eigenvalues from Ĉ in (20) are shown in Figure 4a on a logarithmic scale. As with the piston model, the order-of-magnitude gap between the first and second eigenvalues identifies a dominant and computable one-dimensional active subspace. Figure 4b is the summary plot. Again, the near-one-dimensional relationship between the active variable and the model output confirms the one-dimensional active subspace suggested by the eigenvalue gap. Therefore, we again choose n = 1 for the dimension of the active subspace and use n = 1 when computing the activity scores.

Figure 4: (a) Eigenvalues of Ĉ; (b) summary plot. Figure 4a shows the eigenvalues of Ĉ from (20). The order-of-magnitude gap between the first and second eigenvalue suggests a one-dimensional active subspace. This is confirmed by the summary plot in Figure 4b, which shows that the circuit model's quantity of interest can be reasonably well approximated by a univariate function of one active variable.

The reference values for the sensitivity metrics, computed with high order numerical quadrature, are shown in Table 4.
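These reference values come from the tensor product rule described at the start of Section 5. A sketch of such a quadrature estimate of C from (16) is below (ours, in Python; the integrands for τ_i and β_i are handled analogously); note the n_pts^m gradient evaluations, which is why the approach only suits low dimensions.

    import numpy as np
    from itertools import product

    def quadrature_C(grad_f, m, n_pts=7):
        """Tensor product Gauss-Legendre estimate of C in eq. (16) on
        [-1, 1]^m with the uniform weight 2^{-m}; costs n_pts^m calls."""
        pts, wts = np.polynomial.legendre.leggauss(n_pts)
        wts = wts / 2.0          # fold in the density 1/2 per dimension
        C = np.zeros((m, m))
        for idx in product(range(n_pts), repeat=m):
            x = pts[list(idx)]
            w = np.prod(wts[list(idx)])
            g = grad_f(x)
            C += w * np.outer(g, g)
        return C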
We run the same Monte Carlo error study as for the piston model in Section 5.1. Figure 5 shows the average relative error over 10 independent trials. Figure 6 shows the central limit theorem-based standard error or bootstrap standard error, as appropriate, for the Monte Carlo estimates.

Table 4: Reference values for the circuit model's input sensitivity metrics, computed with tensor product Gauss-Legendre quadrature on 7^6 = 117649 points. All sensitivity metrics rank the variables consistently in this model.

    Input parameter |   τ_i   |   ν_i   |   β_i    |   w_i    |  α_i(1)
    Rb1             |  0.5001 |  2.4555 | −0.6925  | −0.7407  |  2.3778
    Rb2             |  0.4117 |  1.7039 |  0.6358  |  0.6112  |  1.6190
    Rf              |  0.0740 |  0.2902 |  0.2662  |  0.2458  |  0.2617
    Rc1             |  0.0218 |  0.1090 | −0.1341  | −0.1318  |  0.0752
    Rc2             |  0.0000 |  0.0000 | −0.0002  | −0.0002  |  0.0000
    β               |  0.0000 |  0.0002 | −0.0032  | −0.0039  |  0.0001

5.3. Discussion

In Tables 2 and 4, every sensitivity metric gives a similar ranking to the input parameters; the only exception is the ordering of k and M using τ_i in the piston model. All of the sensitivity metrics identify the same sets of non-influential inputs, though it is worth noting that each metric measures a different characteristic of the input/output relationship. The signs of the linear model coefficients and the components of the first eigenvector provide information about how each input parameter affects the response, on average.

Figures 2 and 5 show that the Monte Carlo estimates of the activity scores, the first eigenvector components, and the derivative-based sensitivity metrics have smaller errors than estimates of the total sensitivity indices and regression coefficients. Indeed, the relative error in the total sensitivity indices and regression coefficients corresponding to non-essential parameters, in both models, is large; this demonstrates a known phenomenon of these metrics that has been studied in [13], where the authors propose two methods to correct for this effect: reduction of the mean value and correlated sampling. We note this expected behavior of the total sensitivity index rather than correcting it.

Figures 3 and 6 show the expected O(M^{−1/2}) rate for the central limit theorem-based standard errors, but surprisingly they show a slightly faster rate of decrease for the bootstrap standard error estimates for the active subspace-based metrics. We do not know how to explain this behavior. Also, all sample-based a posteriori error estimates underpredict the true relative error.
Figure 5: The relative error in the Monte Carlo estimates of the sensitivity metrics for the circuit model (34), computed using the reference values in Table 4, for the number M of samples in {10, 50, 100, 500, 1000, 5000, 10000}. The metrics compared are the total sensitivity index (τ_i, Figure 5a), the derivative-based metric (ν_i, Figure 5b), the linear model coefficients (β_i, Figure 5c), the first eigenvector components (w_i, Figure 5d), and the activity scores (α_i, Figure 5e). As in the piston model, the errors in the Monte Carlo estimates of τ_i and β_i are much larger for the smaller reference values. The errors in the derivative-based metrics, the first eigenvector components, and the activity scores behave similarly across the scores.

Figure 6: The central limit theorem-based standard errors se_i or the bootstrap standard errors ŝe_i for the sensitivity metrics associated with the circuit model (34) as a function of the number M of Monte Carlo samples. The metrics compared are the total sensitivity index (τ_i, Figure 6a), the derivative-based metric (ν_i, Figure 6b), the linear model coefficients (β_i, Figure 6c), the first eigenvector components (w_i, Figure 6d), and the activity scores (α_i, Figure 6e). The bootstrap standard errors were computed using 100 bootstrap replicates.

6. Summary and conclusions

We propose two global sensitivity analysis metrics derived from the eigenvectors and eigenvalues computed while testing an engineering model for an active subspace. The two metrics are (i) the components of the first eigenvector and (ii) the activity scores, which are the squared eigenvector components linearly combined with the eigenvalues. We show theoretical connections between the activity scores and the more conventional total sensitivity indices of Sobol' and the derivative-based metrics of Kucherenko. We discuss high order quadrature methods and Monte Carlo methods to estimate the new sensitivity metrics, and we demonstrate the new metrics on two algebraic engineering models. The models are simple enough to compute reference values with high order quadrature and study the behavior of the Monte Carlo estimates and their a posteriori error estimates, i.e., the central limit theorem-based standard errors or the bootstrap estimates of the standard errors. For the two test models, all metrics consistently identify the same important and unimportant variables.
While it is possible to construct models where this is not the case, since the metrics measure different characteristics of the model, we expect that this consistency will occur across many models in engineering practice.

Acknowledgments

This work is supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC-0011077.

References

[1] A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, S. Tarantola, Global Sensitivity Analysis: The Primer, John Wiley & Sons, New York, 2008.
[2] I. M. Sobol', Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, Mathematics and Computers in Simulation 55 (2001) 271-280. doi:10.1016/S0378-4754(00)00270-6.
[3] S. Kucherenko, M. Rodriguez-Fernandez, C. Pantelides, N. Shah, Monte Carlo evaluation of derivative-based global sensitivity measures, Reliability Engineering & System Safety 94 (7) (2009) 1135-1148. doi:10.1016/j.ress.2008.05.006.
[4] P. G. Constantine, Active Subspaces: Emerging Ideas in Dimension Reduction for Parameter Studies, SIAM, Philadelphia, 2015.
[5] P. Constantine, D. Gleich, Computing active subspaces with Monte Carlo, arXiv preprint arXiv:1408.0545v2.
[6] P. Constantine, E. Dow, Q. Wang, Active subspace methods in theory and practice: Applications to kriging surfaces, SIAM Journal on Scientific Computing 36 (4) (2014) A1500-A1524. doi:10.1137/130916138.
[7] T. W. Lukaczyk, P. Constantine, F. Palacios, J. J. Alonso, Active subspaces for shape optimization, American Institute of Aeronautics and Astronautics, 2014. doi:10.2514/6.2014-1171.
[8] P. Constantine, M. Emory, J. Larsson, G. Iaccarino, Exploiting active subspaces to quantify uncertainty in the numerical simulation of the HyShot II scramjet, Journal of Computational Physics 302 (2015) 1-20. doi:10.1016/j.jcp.2015.09.001.
[9] J. L. Jefferson, J. M. Gilbert, P. G. Constantine, R. M. Maxwell, Active subspaces for sensitivity analysis and dimension reduction of an integrated hydrologic model, Computers & Geosciences 83 (2015) 127-138. doi:10.1016/j.cageo.2015.07.001.
[10] P. G. Constantine, B. Zaharatos, M. Campanelli, Discovering an active subspace in a single-diode solar cell model, Statistical Analysis and Data Mining: The ASA Data Science Journal (2015). doi:10.1002/sam.11281.
[11] A. Owen, Variance components and generalized Sobol' indices, SIAM/ASA Journal on Uncertainty Quantification 1 (1) (2013) 19-41. doi:10.1137/120876782.
[12] R. Liu, A. B. Owen, Estimating mean dimensionality of analysis of variance decompositions, Journal of the American Statistical Association 101 (474) (2006) 712-721.
[13] I. M. Sobol', E. E. Myshetskaya, Monte Carlo estimators for small sensitivity indices, Monte Carlo Methods and Applications 13 (5-6) (2008) 455-465.
[14] B. Sudret, Global sensitivity analysis using polynomial chaos expansions, Reliability Engineering & System Safety 93 (7) (2008) 964-979. doi:10.1016/j.ress.2007.04.002.
[15] T. Crestaux, O. Le Maître, J.-M. Martinez, Polynomial chaos expansion for sensitivity analysis, Reliability Engineering & System Safety 94 (7) (2009) 1161-1172. doi:10.1016/j.ress.2008.10.008.
[16] M. D. Morris, Factorial sampling plans for preliminary computational experiments, Technometrics 33 (2) (1991) 161-174. doi:10.1080/00401706.1991.10484804.
[17] I. M. Sobol', S. Kucherenko, Derivative based global sensitivity measures and their link with global sensitivity indices, Mathematics and Computers in Simulation 79 (10) (2009) 3009-3017. doi:10.1016/j.matcom.2009.01.023.
[18] D. Xiu, G. E. Karniadakis, The Wiener-Askey polynomial chaos for stochastic differential equations, SIAM Journal on Scientific Computing 24 (2) (2002) 619-644. doi:10.1137/S1064827501387826.
[19] P. G. Constantine, M. S. Eldred, E. T. Phipps, Sparse pseudospectral approximation method, Computer Methods in Applied Mechanics and Engineering 229-232 (2012) 1-12. doi:10.1016/j.cma.2012.03.019.
[20] B. Efron, R. J. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall/CRC, Boca Raton, 1993.
[21] Virtual Library of Simulation Experiments: Test Functions and Datasets, http://www.sfu.ca/~ssurjano/index.html, accessed 10/13/2015.
[22] E. N. Ben-Ari, D. M. Steinberg, Modeling data from computer experiments: An empirical comparison of kriging with MARS and projection pursuit regression, Quality Engineering 19 (4) (2007) 327-338.
[23] H. Moon, Design and Analysis of Computer Experiments for Screening Input Variables, Ph.D. thesis, The Ohio State University, 2010.
[24] R. D. Cook, Regression Graphics: Ideas for Studying Regressions through Graphics, John Wiley & Sons, Hoboken, 2009.