CHEE801 Module 5: Nonlinear Regression

Notation
Model:  Y_i = f(x_i, \theta) + \varepsilon_i
– \varepsilon_i : random noise component
– x_i : explanatory variables – i-th run conditions
– \theta : p-dimensional vector of parameters
Model specification:
– f(x_i, \theta) is the model equation
– with n experimental runs, we have
    \eta(\theta) = [\, f(x_1, \theta), \; f(x_2, \theta), \; \ldots, \; f(x_n, \theta) \,]^T
– \eta(\theta) defines the expectation surface
– the nonlinear regression model is  Y = \eta(\theta) + \varepsilon

Parameter Estimation – Gauss-Newton Iteration
Least squares estimation – minimize
    S(\theta) = \| Y - \eta(\theta) \|^2 = e^T e
A numerical optimization procedure is required. One possible method:
1. Linearization about the current estimate of the parameters
2. Solution of the linear(ized) regression problem to obtain the next parameter estimate
3. Iteration until a convergence criterion is satisfied

Linearization about a nominal parameter vector
Linearize the expectation function \eta(\theta) in terms of the parameter vector \theta about a nominal vector \theta_0:
    \eta(\theta) \approx \eta(\theta_0) + V_0 (\theta - \theta_0)
V_0 is the sensitivity matrix:
– the Jacobian of the expectation function
– contains first-order sensitivity information
    V_0 = \left. \frac{\partial \eta(\theta)}{\partial \theta^T} \right|_{\theta_0}
        = \begin{bmatrix}
            \partial f(x_1,\theta)/\partial\theta_1 & \cdots & \partial f(x_1,\theta)/\partial\theta_p \\
            \vdots & & \vdots \\
            \partial f(x_n,\theta)/\partial\theta_1 & \cdots & \partial f(x_n,\theta)/\partial\theta_p
          \end{bmatrix}_{\theta_0}

Parameter Estimation – Gauss-Newton Iteration
Iterative procedure consisting of:
1. Linearization about the current estimate of the parameters:
    Y - \eta(\theta^{(i)}) \approx V^{(i)} (\theta - \theta^{(i)}) + e
2. Solution of the linearized regression problem to obtain the next parameter estimate:
    \theta^{(i+1)} = \theta^{(i)} + \left( V^{(i)T} V^{(i)} \right)^{-1} V^{(i)T} \left( y - \eta(\theta^{(i)}) \right)
3. Iteration until the parameter estimates converge

Computational Issues in Gauss-Newton Iteration
The Gauss-Newton iteration can be subject to poor numerical conditioning for some parameter values:
» Conditioning problems arise in the inversion of V^T V
» Solution – use a decomposition technique
  • QR decomposition
  • Singular Value Decomposition (SVD)
» Use a different optimization technique
» Don't try to estimate so many parameters
  • Simplify the model
  • Fix some parameters at reasonable values

Other numerical estimation methods
• Nonlinear least squares is a minimization problem
• Use any good optimization technique to find the parameter estimates that minimize the sum of squares of the residuals
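A minimal sketch of the Gauss-Newton iteration in Python/NumPy, assuming a user-supplied single-response model function `model(theta, x)` and a forward-difference Jacobian; the decay model and data at the end are hypothetical and only for illustration. A production code would add step-size control (e.g., Levenberg-Marquardt damping) on top of this basic update, and here the linearized least-squares step is solved with `lstsq` (an orthogonal decomposition) rather than by forming (V^T V)^{-1} explicitly, in line with the conditioning advice above.

```python
import numpy as np

def jacobian_fd(model, theta, x, h=1e-6):
    """Forward-difference approximation of the n x p sensitivity matrix V."""
    eta0 = model(theta, x)
    V = np.empty((eta0.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = h * max(1.0, abs(theta[j]))
        V[:, j] = (model(theta + step, x) - eta0) / step[j]
    return V

def gauss_newton(model, theta0, x, y, tol=1e-8, max_iter=50):
    """Gauss-Newton: theta_new = theta + (V'V)^-1 V' (y - eta(theta)), iterated to convergence."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        resid = y - model(theta, x)
        V = jacobian_fd(model, theta, x)
        # Solve the linearized regression problem V * delta ~ resid via lstsq
        delta, *_ = np.linalg.lstsq(V, resid, rcond=None)
        theta = theta + delta
        if np.linalg.norm(delta) <= tol * (1.0 + np.linalg.norm(theta)):
            break
    return theta, V, resid

# Hypothetical example: first-order decay y = theta1 * exp(-theta2 * x)
model = lambda theta, x: theta[0] * np.exp(-theta[1] * x)
x = np.linspace(0.0, 5.0, 20)
rng = np.random.default_rng(0)
y = model(np.array([2.0, 0.7]), x) + 0.05 * rng.standard_normal(x.size)
theta_hat, V_hat, e_hat = gauss_newton(model, [1.0, 1.0], x, y)
```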
Inference – Joint Confidence Regions
• Approximate confidence regions for parameters and predictions can be obtained by using a linearization approach
• Approximate covariance matrix for the parameter estimates:
    \Sigma_{\hat\theta} \approx \left( \hat{V}^T \hat{V} \right)^{-1} \sigma^2
  where \hat{V} is the Jacobian of \eta(\theta) evaluated at the least squares parameter estimates
• This covariance matrix approaches the true covariance matrix of the parameter estimates asymptotically, as the number of data points becomes infinite
• 100(1-\alpha)% joint confidence region for the parameters:
    ( \theta - \hat\theta )^T \, \hat{V}^T \hat{V} \, ( \theta - \hat\theta ) \le p \, s^2 \, F_{p,\, n-p,\, \alpha}
  » Compare to the linear regression case

Inference – Marginal Confidence Intervals
• Marginal confidence intervals
  » Confidence intervals on individual parameters:
      \hat\theta_i \pm t_{\nu,\,\alpha/2} \, s_{\hat\theta_i}
    where s_{\hat\theta_i} is the approximate standard error of the parameter estimate
    – square root of the i-th diagonal element of the approximate covariance matrix \left( \hat{V}^T \hat{V} \right)^{-1} s^2, with the noise variance estimated as in the linear case

Precision of the Predicted Responses – Linear Case
From the linear regression module: the predicted response from an estimated model has uncertainty, because it is a function of the parameter estimates, which themselves have uncertainty.
e.g., Solder Wave Defect Model – first response at the point (-1, -1, -1):
    \hat{y}_1 = \hat\beta_0 + \hat\beta_1 (-1) + \hat\beta_2 (-1) + \hat\beta_3 (-1)
If the parameter estimates were uncorrelated, the variance of the predicted response would be:
    Var( \hat{y}_1 ) = Var( \hat\beta_0 ) + Var( \hat\beta_1 ) + Var( \hat\beta_2 ) + Var( \hat\beta_3 )
Why?

Precision of the Predicted Responses – Linear
In general, both the variances and the covariances of the parameter estimates must be taken into account. For prediction at the k-th data point:
    Var( \hat{y}_k ) = x_k^T ( X^T X )^{-1} x_k \, \sigma^2 = x_k^T \, \Sigma_{\hat\beta} \, x_k
where x_k^T = [\, x_{k1}, \; x_{k2}, \; \ldots, \; x_{kp} \,].

Precision of the Predicted Responses – Nonlinear
Linearize the prediction equation about the least squares estimate:
    \hat{y}_k = f( x_k, \theta ) \approx f( x_k, \hat\theta ) + \left. \frac{\partial f(x_k,\theta)}{\partial \theta^T} \right|_{\hat\theta} ( \theta - \hat\theta ) = f( x_k, \hat\theta ) + \hat{v}_k^T ( \theta - \hat\theta )
For prediction at the k-th data point:
    Var( \hat{y}_k ) \approx \hat{v}_k^T \left( \hat{V}^T \hat{V} \right)^{-1} \hat{v}_k \, \sigma^2 = \hat{v}_k^T \, \Sigma_{\hat\theta} \, \hat{v}_k
where \hat{v}_k^T = [\, \hat{v}_{k1}, \; \hat{v}_{k2}, \; \ldots, \; \hat{v}_{kp} \,] is the k-th row of \hat{V}.

Estimating Precision of Predicted Responses
Use an estimate of the inherent noise variance:
    s^2_{\hat{y}_k} = x_k^T ( X^T X )^{-1} x_k \, s^2        (linear)
    s^2_{\hat{y}_k} = v_k^T ( V^T V )^{-1} v_k \, s^2        (nonlinear)
The degrees of freedom for the estimated variance of the predicted response are those of the estimate of the noise variance:
» replicates
» external estimate
» MSE

Confidence Limits for Predicted Responses
Linear and nonlinear cases: follow an approach similar to that for the parameters. The 100(1-\alpha)% confidence limits for the mean value of a predicted response are:
    \hat{y}_k \pm t_{\nu,\,\alpha/2} \, s_{\hat{y}_k}
» the degrees of freedom are those of the inherent noise variance estimate
If the prediction is for a new data value, the confidence limits are:
    \hat{y}_k \pm t_{\nu,\,\alpha/2} \, \sqrt{ s^2_{\hat{y}_k} + s_e^2 }
Why?
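A sketch of how these linearization-based inference quantities could be computed, continuing the hypothetical decay example from the Gauss-Newton sketch above (so `theta_hat`, `V_hat`, `e_hat`, `x`, and `model` are assumed to exist); the noise variance is estimated here from the mean squared error, with n - p degrees of freedom. The parameter correlation matrix computed at the end reappears later in the diagnostics discussion.

```python
import numpy as np
from scipy import stats

n, p = V_hat.shape
s2 = e_hat @ e_hat / (n - p)                       # MSE estimate of the noise variance
VtV_inv = np.linalg.inv(V_hat.T @ V_hat)
cov_theta = s2 * VtV_inv                           # approximate covariance of theta_hat
se_theta = np.sqrt(np.diag(cov_theta))

# Marginal 95% confidence intervals: theta_i +/- t_{n-p, alpha/2} * s_{theta_i}
t_crit = stats.t.ppf(0.975, n - p)
ci = np.column_stack([theta_hat - t_crit * se_theta,
                      theta_hat + t_crit * se_theta])

# Approximate correlation matrix of the parameter estimates (used in diagnostics)
corr_theta = cov_theta / np.outer(se_theta, se_theta)

# Standard error of the predicted mean response at each x_k:
#   Var(yhat_k) ~ v_k' (V'V)^-1 v_k * sigma^2, with v_k the k-th row of V
se_pred = np.sqrt(s2 * np.einsum('ij,jk,ik->i', V_hat, VtV_inv, V_hat))

# 95% limits for the mean response, and for a new observation (adds s^2)
y_pred = model(theta_hat, x)
mean_lo, mean_hi = y_pred - t_crit * se_pred, y_pred + t_crit * se_pred
new_lo = y_pred - t_crit * np.sqrt(se_pred**2 + s2)
new_hi = y_pred + t_crit * np.sqrt(se_pred**2 + s2)
```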
Properties of LS Parameter Estimates
Key point – parameter estimates are random variables:
» because stochastic variation in the data propagates through the estimation calculations
» parameter estimates have a variability pattern – probability distribution and density functions
Unbiased:  E\{ \hat\theta \} = \theta
» the "average" of repeated data collection / estimation sequences will be the true value of the parameter vector

Properties of Parameter Estimates
Linear regression case – least squares estimates are:
» Unbiased
» Consistent
» Efficient
Nonlinear regression case – least squares estimates are:
» Asymptotically unbiased – as the number of data points becomes infinite
» Consistent
» Efficient

Diagnostics for nonlinear regression
• Similar to the linear case
• Qualitative – residual plots
  – Residuals vs.
    » Factors in the model
    » Sequence (observation) number
    » Factors not in the model (covariates)
    » Predicted responses
  – Things to look for:
    » Remaining trend
    » Non-constant variance
• Qualitative – plots of observed and predicted responses
  – Predicted vs. observed – slope of 1
  – Predicted and observed – as functions of the independent variable(s)

Diagnostics for nonlinear regression
• Quantitative diagnostics – ratio tests:
  » These tests are the same as for the linear case
  » R-squared
    • coarse measure of significant trend
    • squared correlation of the observed and predicted values
  » Adjusted R-squared
    • also adjusts for the number of parameters estimated

Diagnostics for nonlinear regression
• Quantitative diagnostics – parameter confidence intervals:
  » Examine marginal intervals for the parameters
    • Based on linear approximations
    • Can also use hypothesis tests
  » Consider dropping parameters that aren't statistically significant
  » What should we do if parameters are
    • not significantly different from zero?
    • not significantly different from the initial guesses?
  » In nonlinear models, parameters are more likely to be involved in more complex expressions involving factors and other parameters
    • e.g., Arrhenius reaction rate expression
  » If possible, examine joint confidence regions

Diagnostics for nonlinear regression
• Quantitative diagnostics – parameter estimate correlation matrix:
  » Examine the correlation matrix for the parameter estimates
    • Based on the linear approximation
    • Compute the covariance matrix, then normalize using the pairs of standard deviations
  » Note significant correlations and keep these in mind when retaining/deleting parameters using marginal significance tests
  » Significant correlation between some parameter estimates may indicate over-parameterization relative to the data collected
    • Consider dropping some of the parameters whose estimates are highly correlated
• Further discussion – Chapter 3 of Bates and Watts (1988), Chapter 5 of Seber and Wild (1989)

Practical Considerations
– What kind of stopping conditions should be used to determine convergence?
– Problems with local minima?
– Reparameterization to reduce correlation between parameter estimates
• Ensuring physically realistic parameter estimates
  – Common problem – we know that some parameters should be positive, or should be bounded between reasonable values
  – Solutions
    » Constrained optimization algorithm to enforce non-negativity of parameters
    » Reparameterization tricks – estimate \phi instead of \theta:
      • \theta = \exp(\phi)  or  \theta = 10^{\phi}   –  \theta is positive
      • \theta = \dfrac{1}{1 + e^{-\phi}}   –  \theta is bounded between 0 and 1

Practical considerations
• Correlation between parameter estimates
  – Reduce by reparameterization
  – Exponential example:
      \theta_1 \exp( -\theta_2 x ) = \theta_1 \exp( -\theta_2 ( x - x_0 + x_0 ) )
                                   = \left[ \theta_1 \exp( -\theta_2 x_0 ) \right] \exp( -\theta_2 ( x - x_0 ) )
                                   = \phi_1 \exp( -\theta_2 ( x - x_0 ) )

Practical considerations
• Particular example – Arrhenius rate expression (a code sketch follows below):
      k_0 \exp\!\left( \frac{-E}{R T} \right)
        = k_0 \exp\!\left( \frac{-E}{R} \left( \frac{1}{T} - \frac{1}{T_{ref}} \right) - \frac{E}{R\,T_{ref}} \right)
        = \left[ k_0 \exp\!\left( \frac{-E}{R\,T_{ref}} \right) \right] \exp\!\left( \frac{-E}{R} \left( \frac{1}{T} - \frac{1}{T_{ref}} \right) \right)
        = k_{ref} \exp\!\left( \frac{-E}{R} \left( \frac{1}{T} - \frac{1}{T_{ref}} \right) \right)
  – Reduces the correlation between the parameter estimates and improves the conditioning of the estimation problem
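A sketch of the centred Arrhenius form written as Python functions; the reference temperature, parameter values, and function names are hypothetical. The two forms are algebraically identical, so predictions are unchanged, but estimating (k_ref, E) instead of (k_0, E) typically gives much less correlated estimates.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def rate_original(theta, T):
    """r = k0 * exp(-E / (R*T)); theta = [k0, E].
    Estimates of k0 and E tend to be strongly correlated."""
    k0, E = theta
    return k0 * np.exp(-E / (R * T))

def rate_centred(theta, T, T_ref=350.0):
    """Centred form: r = k_ref * exp(-(E/R) * (1/T - 1/T_ref)); theta = [k_ref, E].
    Same model, but the estimates of k_ref and E are far less correlated and the
    estimation problem is better conditioned when T_ref lies inside the data range."""
    k_ref, E = theta
    return k_ref * np.exp(-(E / R) * (1.0 / T - 1.0 / T_ref))

# The two parameterizations are related by k_ref = k0 * exp(-E / (R * T_ref)),
# so an estimate in one form can always be converted to the other afterwards.
```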
Practical considerations
• Scaling – of parameters and responses
• Choices
  – Scale by nominal values
    » Nominal values – design centre point, typical value over the range, average value
  – Scale by standard errors or by initial uncertainty ranges for the parameters
    » Parameters – by an estimate of the standard deviation of the parameter estimate
    » Responses – by the standard deviation of the observations – the noise standard deviation
• Scaling can improve the conditioning of the estimation problem (e.g., scale the sensitivity matrix V), and can facilitate comparison of terms on similar (dimensionless) bases

Practical considerations
• Initial parameter guesses are required
  – From prior scientific knowledge
  – From prior estimation results
  – By simplifying the model equations

Things to learn in CHEE 811
• Estimating parameters in differential equation models:
    \frac{dy}{dt} = f( y, u, t; \theta ), \qquad y(t_0) = y_0
• Estimating parameters in multi-response models
• Deriving model equations based on chemical engineering knowledge and stories about what is happening
• Solving the model equations numerically
• Deciding which parameters to estimate and which to leave at initial guesses when data are limited
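As a preview of the differential-equation case listed above, one common approach is to integrate the model numerically for each trial value of \theta and let a general-purpose least-squares routine adjust \theta. The sketch below uses SciPy for both steps, with a hypothetical first-order decay model, hypothetical data, and no input u.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def rhs(t, y, theta):
    """dy/dt = f(y, t; theta): here a simple first-order decay, dy/dt = -theta1 * y."""
    return -theta[0] * y

def residuals(theta, t_obs, y_obs, y0):
    """Integrate the model for this trial theta and return observed - predicted."""
    sol = solve_ivp(rhs, (t_obs[0], t_obs[-1]), [y0], t_eval=t_obs, args=(theta,))
    return y_obs - sol.y[0]

# Hypothetical data generated from y = 3 * exp(-0.8 t) plus noise
t_obs = np.linspace(0.0, 4.0, 15)
y_obs = 3.0 * np.exp(-0.8 * t_obs) + 0.05 * np.random.default_rng(1).standard_normal(t_obs.size)

fit = least_squares(residuals, x0=[0.5], args=(t_obs, y_obs, 3.0))
theta_hat = fit.x   # estimated rate constant
```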