Part 3 Sensitivity Analysis (Design Exploration)

optiSLang Sensitivity Analysis
• optiSLang scans the design space and quantifies the sensitivity with statistical measures
• Results of a global sensitivity study are:
  • Global sensitivities of the variables with respect to the important responses
  • Identification of reduced sets of important variables which have the most significant influence on the responses
  • Estimate of the variation of the responses
  • Estimate of the solver noise
  • Better understanding and verification of dependences between input parameter variation and design responses

Input/output of models
[Figure: Inputs, model (CAE, …, experiments), Outputs]

Statistical properties of input/output
• Mean value
• Variance
• Standard deviation
• Coefficient of variation

Methods for Sensitivity Analysis
• Local methods
  • Local derivatives
  • Standardized derivatives
• Global methods
  • Anthill plots
  • Coefficients of correlation (linear, quadratic)
  • Rank order correlation
  • Standardized regression coefficients
  • Stepwise polynomial regression
  • Reduced polynomial models: Coefficients of Importance
  • Advanced surrogate models including prediction analysis and optimal subspace detection: Coefficients of Prognosis

Scanning the design space with DOE
[Figure: Inputs, Design of experiments, Design evaluation, Outputs]
• Output variability and input sensitivities
• Input parameter significance and multivariate dependencies
• Regression analysis
• Quantification of the model predictability and noise fraction

Deterministic DOE schemes
[Figure panels: Full factorial, Central composite, D-optimal (quadratic)]
• Simple DOE schemes cannot identify multivariate dependencies
• More complex schemes are only efficient for a small number of variables
• Optimized only for polynomial regression
• Not always uniformly distributed
• Fixed number of samples, which is critical if failed designs occur

Latin Hypercube Sampling
[Figure panels: Standard Monte Carlo Simulation, Latin Hypercube Sampling]
• Improved Monte Carlo Simulation
• Cumulative distribution function is subdivided into N classes with the same probability
• Reduced number of required samples for statistical estimates
• Reduced unwanted input correlations
• Optimal samples can be added to an existing set of LHS samples (ALHS)
• LHS requires N ≥ k+1 samples; if this is not possible, ALHS can be used

DOE settings
[Figure]

Anthill plots
• Two-dimensional scatter plots of two sample vectors of any design variable or response
• Reveal both linear and nonlinear dependencies
• Strongly nonlinear dependencies, e.g. bifurcations, may become clearly visible

Covariance
• Covariance measures the dependence between two selected parameters (design variables or responses)
• Linear dependence is assumed
• The covariance value depends on the parameter dimensions (units)

Coefficient of correlation
• Defined as the standardized covariance of two variables
• The coefficient of correlation is always between -1 and 1
• Defines the degree of linear dependence

Coefficient of correlation
[Figure panels: No correlation, Weak correlation, Strong correlation]
• A positive value indicates a positive relationship between two variables X and Y, e.g. if the value of X increases, the value of Y increases as well
• A value near zero indicates a random or nonlinear relationship between the two variables

Correlation matrix
[Figure blocks: Input-Input, Input-Output, Output-Input, Output-Output]
• Symmetric matrix
• Ones on the diagonal
• A significant deviation from the target correlation of the input parameters indicates a possible error during the design procedure, or that the number of samples is too small

Example: Analytical nonlinear function
• Additive linear and nonlinear terms and one coupling term
• Contribution to the output variance (reference values): X1: 18.0%, X2: 30.6%, X3: 64.3%, X4: 0.7%, X5: 0.2%
[Figures; values shown in one plot: 0.03, 0.06, 0.19, 0.42, 0.65]
• Since optiSLang 3.2: Extended correlation matrix

Confidence interval
• Correlation coefficients are statistical estimates whose accuracy depends on the number of samples N
• Lower and upper bounds are estimated from the 95% confidence interval
• Number of required samples: 50…100 if k<20, else 100…200

Simple polynomial regression
• Approximation of one variable Y in terms of a single variable X
• Minimization of the sum of squared errors for a given set of samples
• Least squares solution

Quadratic correlation
• Quadratic regression of variable Y on variable X by a least-squares fit of the sample values
• Correlation between approximated and exact values of Y

Quadratic correlation matrix
[Figure blocks: Input-Input, Input-Output, Output-Input, Output-Output]
• Unsymmetric matrix
• Ones on the diagonal
• A value near zero means that there is a random or highly nonlinear relationship between the two variables
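
The linear and quadratic correlation measures described on the preceding slides can be illustrated with a minimal Python sketch. It is not optiSLang code: the function names, the test response y = x² plus noise, and the use of the Fisher z-transformation for the 95% bounds are assumptions made for illustration only (the slides do not say which interval estimate optiSLang uses).

```python
import numpy as np

def linear_correlation(x, y):
    """Coefficient of correlation: covariance standardized by both standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

def quadratic_correlation(x, y):
    """Correlation between y and the quadratic least-squares fit of y on x."""
    coeffs = np.polyfit(x, y, deg=2)      # quadratic regression of y on x
    y_hat = np.polyval(coeffs, x)         # approximated values of y
    return linear_correlation(y_hat, y)

def confidence_interval_95(r, n):
    """Approximate 95% bounds via the Fisher z-transformation (assumed method)."""
    z = np.arctanh(r)
    half_width = 1.96 / np.sqrt(n - 3)
    return np.tanh(z - half_width), np.tanh(z + half_width)

# Hypothetical data: a purely quadratic dependence with a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = x**2 + 0.05 * rng.normal(size=100)

r_lin = linear_correlation(x, y)
r_quad = quadratic_correlation(x, y)
lo, hi = confidence_interval_95(r_lin, len(x))

print(f"linear correlation    : {r_lin:.2f}  (95% bounds {lo:.2f} .. {hi:.2f})")
print(f"quadratic correlation : {r_quad:.2f}")
# The quadratic measure is not symmetric: regressing x on y gives a different value.
print(f"quadratic, x on y     : {quadratic_correlation(y, x):.2f}")
```

In this sketch the linear coefficient stays near zero although Y depends strongly on X, while the quadratic correlation picks up the dependence. This is the situation the slides describe as "a value near zero indicates a random or nonlinear relationship".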
Spearman’s rank correlation
• Assesses how well an arbitrary monotonic function describes the relationship between two variables, without making any assumptions about the regression function of the variables
• Data are converted to ranks before the coefficient is calculated
• Insensitive to outliers

Multiple polynomial regression
• Set of input variables
• Definition of polynomial basis
• Approximation function
• Least squares solution
• Number of samples, linear basis: N ≥ k + 1
• Quadratic basis: N ≥ (k + 1)(k + 2)/2

Coefficient of Determination (CoD)
• Fraction of explained variation of an approximated variable
• Total variation
• Explained variation
• Unexplained variation
• Quality measure for the approximation of a simple or multiple polynomial regression

Coefficient of Determination (CoD)
• Adjusted Coefficient of Determination to penalize over-fitting
• One-dimensional CoDs are equivalent to the squared linear or quadratic correlation coefficients of a certain response Yb with respect to a single variable Xa
• Linear CoDs of single variables sum up to the CoD of the multidimensional linear polynomial if the inputs are uncorrelated

Example: Analytical nonlinear function
[Figure]

Coefficient of Importance (CoI)
• Explained variation of a response Yb due to a single variable Xa, including its coupling terms
• Reduced polynomial basis
• Coefficient of Importance as the reduction of the CoD when Xa is removed from the regression model
• A sum of single CoIs larger than the full CoD indicates important coupling terms
• For linear basis and independent inputs

Example: Analytical nonlinear function
[Figure panels: all inputs, linear (N=100, p=6); all inputs, quadratic (N=100, p=21); 3 inputs, quadratic (N=100, p=10)]
• Calculated CoIs are more precise for a smaller number of regression coefficients (at least N > 2p samples are recommended)
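
The CoD and CoI definitions above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the optiSLang implementation. It assumes the usual definitions CoD = 1 - SS_E/SS_T and adjusted CoD = 1 - (N-1)/(N-p)·(1 - CoD), and the test function is hypothetical, because the analytical function used in the slides is not reproduced in the text.

```python
import numpy as np

def linear_basis(X):
    """Basis [1, x1, ..., xk] with p = k + 1 coefficients."""
    return np.column_stack([np.ones(len(X)), X])

def quadratic_basis(X):
    """Full quadratic basis with p = (k + 1)(k + 2) / 2 coefficients."""
    k = X.shape[1]
    cols = [np.ones(len(X))] + [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    return np.column_stack(cols)

def cod(X, y, basis=quadratic_basis):
    """Coefficient of Determination of a least-squares polynomial fit."""
    P = basis(X)
    beta, *_ = np.linalg.lstsq(P, y, rcond=None)
    ss_e = np.sum((y - P @ beta) ** 2)        # unexplained variation
    ss_t = np.sum((y - y.mean()) ** 2)        # total variation
    return 1.0 - ss_e / ss_t

def adjusted_cod(X, y, basis=quadratic_basis):
    """Adjusted CoD, penalizing the number of regression coefficients p."""
    n, p = basis(X).shape
    return 1.0 - (n - 1) / (n - p) * (1.0 - cod(X, y, basis))

def coi(X, y, a, basis=quadratic_basis):
    """CoI of input a: reduction of the CoD when input a is removed from the model."""
    return cod(X, y, basis) - cod(np.delete(X, a, axis=1), y, basis)

# Hypothetical test case: k = 5 inputs, N = 100 samples, inputs 3 and 4 have no influence.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(100, 5))
y = (0.5 * X[:, 0] + X[:, 1] + 0.5 * X[:, 0] * X[:, 1] + X[:, 2] ** 2
     + 0.05 * rng.normal(size=100))

print("p (quadratic, k=5):", quadratic_basis(X).shape[1])
print("CoD linear        :", round(cod(X, y, linear_basis), 3))
print("CoD quadratic     :", round(cod(X, y), 3))
print("adjusted CoD quad.:", round(adjusted_cod(X, y), 3))
for a in range(5):
    print(f"CoI of input {a}    :", round(coi(X, y, a), 3))
```

Because inputs 0 and 1 share a coupling term in this hypothetical function, that term is counted in both of their CoIs, which is why a sum of single CoIs can exceed the full CoD.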
Significance filter
• In high dimensions, the number of solver runs needed for a sensitivity analysis increases
• In practice, however, often only a small number of variables is important
• Therefore, optiSLang includes filter technology to estimate significant correlations between inputs and outputs
• The significance level for the linear and quadratic correlation coefficients is derived from the input-input correlation errors

Limitations of CoD/CoI
• CoD/CoI is based only on how well the regression model fits the sample points, not on its prediction quality
• Approximation quality is estimated too optimistically for a small number of samples
• For interpolation models with perfect fit, CoD is equal to one
• Better approximation models are required for highly nonlinear problems, but CoD/CoI works only with polynomials

Moving Least Squares approximation
• Local polynomial regression with position-dependent coefficients
• Distance-dependent weighting of support points
• Smoothing of the approximation is controlled by the influence radius D
• Choice of D directly influences the CoD

Moving Least Squares interpolation
• Regularized weighting function
• Interpolation condition is almost fulfilled
• Almost independent of the influence radius D
• Noise is fully interpolated and cannot be filtered
• CoD is equal to one

Box-Cox transformation
• Flexible transformation to convert nonlinear problems into almost linear ones
• Family of power transformations
• Geometric mean
• Determination of the optimal λ by minimization of the approximation errors
• If the optimal λ is equal to one, the transformation is not active
• Requires scaling of the response

Coefficient of Prognosis (CoP)
• Fraction of explained variation of the prediction
• optiSLang 3.1: CoP is estimated from the approximation error on an additional test data set which is not used to build the model
• Problems:
  • Test data set is not used for approximation
  • For a small number of samples, the CoP estimate may not be reliable
• optiSLang 3.2: Estimation of CoP by cross validation using a partitioning of all available samples:
  • Due to reverse checking, every sample point is used for both approximation and prediction, so the CoP estimate is more reliable
  • All samples are used for the final approximation model

Coefficient of Prognosis (CoP)
• Sample splitting with different splitting rates may lead to a large variation of the CoP estimate while the approximation function is almost unchanged
[Figure panels: CoP = 0.59, 0.46, 0.67, 0.54, 0.69, 0.76]

Coefficient of Prognosis (CoP)
• The cross-validation CoP estimate is much more reliable than the estimate obtained by sample splitting
[Figure panels: CoP = 0.59, 0.59, 0.61, 0.62, 0.59, 0.58]

Coefficient of Prognosis (CoP)
• CoP increases with increasing number of samples
• CoP works for interpolation and approximation models
• With MLS, continuous functions, including coupling terms, can be represented given a certain number of samples
• Prediction quality is better if unimportant variables are removed from the approximation model

Meta-model of Optimal Prognosis (MOP)
• Approximation of the solver output by a fast surrogate model
• Reduction of the input space to get the best compromise between available information (samples) and model representation (number of input variables)
• Advanced filter technology to obtain candidates for the optimal subspace (significance and CoI filters)
• Determination of the most appropriate approximation model (polynomials with linear or quadratic basis, MLS, …, Box-Cox)
• Assessment of approximation quality (CoP)
• MOP solves three important tasks:
  • Best variable subspace
  • Best meta-model
  • Determination of prediction quality

Meta-model of Optimal Prognosis (MOP)
• 31 possible subspaces from 5 variables for each meta-model type (155 model evaluations in total)
• 5 possible subspaces with filter
• Filter technology dramatically reduces the number of investigated subspaces to improve efficiency
• The ΔCoP measure quantifies the accepted reduction in prediction quality to obtain a smaller subspace and/or a simpler meta-model

MOP Settings
[Figure]

CoP for single variable sensitivity
• Variance-based sensitivity indices: fraction of variance explained by a single variable
• Total indices include coupling terms
• Determination of conditional variances on the optimal meta-model and scaling with the CoP
• A sum of single CoPs larger than the total CoP indicates coupling terms

Example: Analytical nonlinear function
[Figure panels: CoD (quadratic polynomial, 5 inputs); CoP (MOP: MLS with 3 inputs)]
• Prediction quality is almost perfect with MOP on 100 LHS samples
• Optimal subspace contains only X1, X2 and X3
• Highly nonlinear function of X3 and coupling term X1X2 are represented by the approximation and the sensitivity measures

Example: Analytical nonlinear function
• MOP/CoP is close to the reference values (detects the optimal subspace automatically, represents nonlinear and coupling terms)

             CoD, k=5       CoI, k=5       CoI, k=3    CoP, k=3      Reference
             (all inputs)   (all inputs)   (manual)    (automatic)
Full model   75%            75%            74%         97%           100%
X1           2%             14%            14%         18%           18%
X2           18%            30%            28%         31%           31%
X3           41%            34%            39%         62%           64%
X4           0%             0%             -           -             0.7%
X5           0%             1%             -           -             0.2%

CoI versus CoP
Application: 9 inputs, 200 samples, passive safety application
• MOP can be used for visualization
• Global and local prognosis quality can be evaluated
[Figure]

Example: Low-dim. instability problem
CoD/CoI/CoP - Get ready for productive use.
[Figure panels: optiSLang Version 2 (CoD shows no importance); optiSLang Version 3 (CoI finds the most important variable); optiSLang Version 3.1 (CoP quantifies nonlinearity); optiSLang Version 3.1: CoP 0.73, MOP: MLS approximation, sample split 70/30]

How to verify the CoP/MOP
• Compare CoI/CoD and CoP (explainability should continuously improve)
• Check plausibility using 2D/3D visualization
If plausibility is not verified:
• The sample set is too small or too clustered
• Add samples and repeat the MOP/CoP calculation

Strategy “No Run too Much”
Using advanced LHS sampling, filter technology and CoD/CoI/CoP, we can start checking after 50 runs:
⇒ can we explain the variation?
⇒ which input scatter is important?
⇒ how large is the amount of unexplainable scatter (potentially noise, extraction problems or high-dimensional nonlinearity)?
“No run too much” is a must for practical use!

Summary
• Sensitivity analysis gives advanced information about design parameters and helps to simplify the optimization problem
• One-dimensional linear/quadratic dependencies can be captured by correlation coefficients / 1D CoDs
• Linear coupling terms can be identified with multiple CoDs/CoIs but generally require a reduction of the design space
• MOP provides the optimal meta-model in the best subspace
• CoP gives a reliable quality estimate (explained variation) for polynomials and advanced meta-models
• A small CoP indicates an insufficient number of samples or unexplainable solver behavior/problems
• Single variance-based CoPs can represent highly nonlinear coupled and uncoupled dependencies
• Strategy “No run too much”: reliable sensitivity estimates with 50…100 samples if k<20, else 100…200
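
The difference between the fit-based CoD and the cross-validated CoP referred to in the summary can be illustrated with the following sketch. It is a hedged illustration, not optiSLang's algorithm. It assumes CoP = 1 - SS_E(prediction)/SS_T, uses a quadratic polynomial as a stand-in surrogate (MOP would also consider MLS and reduced subspaces), and the fold count, sample size and test function are hypothetical.

```python
import numpy as np

def quadratic_basis(X):
    """Full quadratic polynomial basis: constant, linear and second-order terms."""
    k = X.shape[1]
    cols = [np.ones(len(X))] + [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    return np.column_stack(cols)

def fit_predict(X_train, y_train, X_test):
    """Least-squares fit on the training samples, evaluated at the test samples."""
    beta, *_ = np.linalg.lstsq(quadratic_basis(X_train), y_train, rcond=None)
    return quadratic_basis(X_test) @ beta

def explained_fraction(y, y_hat):
    """1 - SS_E / SS_T for given reference values y and model values y_hat."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def cod_fit(X, y):
    """CoD: the model is evaluated at the same points it was fitted to."""
    return explained_fraction(y, fit_predict(X, y, X))

def cop_cross_validation(X, y, n_folds=5, seed=0):
    """CoP estimate: every sample is predicted by a model that did not see it."""
    idx = np.random.default_rng(seed).permutation(len(y))
    y_pred = np.empty_like(y)
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)              # all samples outside this fold
        y_pred[fold] = fit_predict(X[train], y[train], X[fold])
    return explained_fraction(y, y_pred)

# Hypothetical noisy response with k = 5 inputs and only N = 60 samples.
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(60, 5))
y = (0.5 * X[:, 0] + X[:, 1] + 0.5 * X[:, 0] * X[:, 1] + X[:, 2] ** 2
     + 0.1 * rng.normal(size=60))

cod = cod_fit(X, y)
cop = cop_cross_validation(X, y)
print(f"CoD (fit quality)          : {cod:.3f}")
print(f"CoP (5-fold cross-checked) : {cop:.3f}")
```

For a least-squares model the cross-validated value cannot exceed the fit-based CoD. This matches the statement on the limitations slide that the CoD is too optimistic for a small number of samples.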