KNNL_Ch06

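Multiple Regression I
KNNL – Chapter 6

Models with Multiple Predictors
• Most practical problems have more than one potential predictor variable
• Goal is to determine the effect (if any) of each predictor, controlling for the others
• Can include polynomial terms to allow for nonlinear relations
• Can include product terms to allow for interactions, when the effect of one variable depends on the level of another variable
• Can include "dummy" variables for categorical predictors

First-Order Model with 2 Numeric Predictors
\[ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i \qquad E\{\varepsilon_i\} = 0 \;\Rightarrow\; E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 \]
[Figure: Plane in 3 dimensions, E(Y) = 100 - 5X1 + 10X2; vertical axis E(Y), horizontal axis X1]

Interpretation of Regression Coefficients
• Additive: $E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ ≡ Mean of Y at $X_1, X_2$
• $\beta_0$ ≡ Intercept, mean of Y when $X_1 = X_2 = 0$
• $\beta_1$ ≡ Slope with respect to $X_1$ (effect of increasing $X_1$ by 1 unit, while holding $X_2$ constant)
• $\beta_2$ ≡ Slope with respect to $X_2$ (effect of increasing $X_2$ by 1 unit, while holding $X_1$ constant)
• These can also be obtained by taking the partial derivatives of $E\{Y\}$ with respect to $X_1$ and $X_2$, respectively
• Interaction Model: $E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$
• When $X_2 = 0$: Effect of increasing $X_1$ by 1: $\beta_1(1) + \beta_3(1)(0) = \beta_1$
• When $X_2 = 1$: Effect of increasing $X_1$ by 1: $\beta_1(1) + \beta_3(1)(1) = \beta_1 + \beta_3$
• The effect of increasing $X_1$ depends on the level of $X_2$, and vice versa

General Linear Regression Model
\[ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \]
\[ \Rightarrow\; Y_i = \beta_0 + \sum_{k=1}^{p-1} \beta_k X_{ik} + \varepsilon_i \;\Rightarrow\; Y_i = \sum_{k=0}^{p-1} \beta_k X_{ik} + \varepsilon_i \quad \text{where } X_{i0} \equiv 1 \]
\[ E\{\varepsilon_i\} = 0 \;\Rightarrow\; E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_{p-1} X_{p-1} \quad \text{(Hyperplane in } p \text{ dimensions)} \]
\[ p - 1 = 1 \;\Rightarrow\; \text{Simple linear regression} \]
Normality, independence, and constant variance for errors:
\[ \varepsilon_i \sim NID(0, \sigma^2) \;\Rightarrow\; Y_i \sim N\left(\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1},\; \sigma^2\right) \qquad \sigma\{Y_i, Y_j\} = 0 \;\; \forall\, i \ne j \]

Special Types of Variables/Models - I
• p-1 distinct numeric predictors (attributes)
   Y = Sales, X1 = Advertising, X2 = Price
• Categorical Predictors: Indicator (dummy) variables, representing m-1 levels of an m-level categorical variable
   Y = Salary, X1 = Experience, X2 = 1 if College Grad, 0 if Not
• Polynomial Terms: Allow for bends in the regression
   Y = MPG, X1 = Speed, X2 = Speed^2
• Transformed Variables: Transform the Y variable to achieve linearity
   Y' = ln(Y), Y' = 1/Y

Special Types of Variables/Models - II
• Interaction Effects: Effect of one predictor depends on the levels of other predictors
   Y = Salary, X1 = Experience, X2 = 1 if Coll Grad, 0 if Not, X3 = X1X2
\[ E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 \]
\[ \text{Non-College Grads } (X_2 = 0): \quad E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 (0) + \beta_3 X_1 (0) = \beta_0 + \beta_1 X_1 \]
\[ \text{College Grads } (X_2 = 1): \quad E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 (1) + \beta_3 X_1 (1) = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_1 \]
• Response Surface Models
\[ E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_2 + \beta_4 X_2^2 + \beta_5 X_1 X_2 \]
• Note: Although the Response Surface Model has polynomial terms, it is linear with respect to the regression parameters

Matrix Form of Regression Model
\[ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \qquad i = 1, \ldots, n \]
Matrix Form:
\[ \underset{n \times 1}{\mathbf{Y}} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} \quad
\underset{n \times p}{\mathbf{X}} = \begin{bmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1,p-1} \\ 1 & X_{21} & X_{22} & \cdots & X_{2,p-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1} \end{bmatrix} \quad
\underset{p \times 1}{\boldsymbol{\beta}} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix} \quad
\underset{n \times 1}{\boldsymbol{\varepsilon}} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix} \]
\[ \underset{n \times 1}{\mathbf{Y}} = \underset{n \times p}{\mathbf{X}}\, \underset{p \times 1}{\boldsymbol{\beta}} + \underset{n \times 1}{\boldsymbol{\varepsilon}} \]
\[ \underset{n \times 1}{E\{\boldsymbol{\varepsilon}\}} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \mathbf{0} \;\Rightarrow\; \underset{n \times 1}{E\{\mathbf{Y}\}} = E\{\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}\} = \mathbf{X}\boldsymbol{\beta} \]
\[ \underset{n \times n}{\boldsymbol{\sigma}^2\{\boldsymbol{\varepsilon}\}} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 \mathbf{I} \;\Rightarrow\; \underset{n \times n}{\boldsymbol{\sigma}^2\{\mathbf{Y}\}} = \sigma^2 \mathbf{I} \]

Least Squares Estimation of Regression Coefficients
Goal: Minimize
\[ Q = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2 = \sum_{i=1}^{n} \varepsilon_i^2 \]
Obtain estimates $b_0, b_1, \ldots, b_{p-1}$ of $\beta_0, \beta_1, \ldots, \beta_{p-1}$ that minimize Q:
\[ \underset{p \times 1}{\mathbf{b}} = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{p-1} \end{bmatrix} \qquad \text{Normal Equations: } \mathbf{X'X}\mathbf{b} = \mathbf{X'Y} \;\Rightarrow\; \mathbf{b} = (\mathbf{X'X})^{-1}\mathbf{X'Y} \]
Maximum Likelihood also leads to the same estimator b:
\[ L(\boldsymbol{\beta}, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2 \right] \]
since maximizing L involves minimizing $\sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2$.

Fitted Values and Residuals
\[ \text{Fitted Values: } \underset{n \times 1}{\hat{\mathbf{Y}}} = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} = \mathbf{X}\mathbf{b} = \mathbf{X}(\mathbf{X'X})^{-1}\mathbf{X'Y} = \mathbf{H}\mathbf{Y} \qquad \mathbf{H} = \mathbf{X}(\mathbf{X'X})^{-1}\mathbf{X'} \qquad \mathbf{H} = \mathbf{H'} = \mathbf{HH} \]
\[ \text{Residuals: } \underset{n \times 1}{\mathbf{e}} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - \mathbf{X}\mathbf{b} = \mathbf{Y} - \mathbf{X}(\mathbf{X'X})^{-1}\mathbf{X'Y} = (\mathbf{I} - \mathbf{H})\mathbf{Y} \]
\[ (\mathbf{I} - \mathbf{H}) = (\mathbf{I} - \mathbf{H})' = (\mathbf{I} - \mathbf{H})(\mathbf{I} - \mathbf{H}) \]
\[ \boldsymbol{\sigma}^2\{\hat{\mathbf{Y}}\} = \boldsymbol{\sigma}^2\{\mathbf{HY}\} = \mathbf{H}\boldsymbol{\sigma}^2\{\mathbf{Y}\}\mathbf{H'} = \sigma^2 \mathbf{H} \qquad \mathbf{s}^2\{\hat{\mathbf{Y}}\} = MSE \cdot \mathbf{H} \]
\[ \boldsymbol{\sigma}^2\{\mathbf{e}\} = \boldsymbol{\sigma}^2\{(\mathbf{I} - \mathbf{H})\mathbf{Y}\} = (\mathbf{I} - \mathbf{H})\boldsymbol{\sigma}^2\{\mathbf{Y}\}(\mathbf{I} - \mathbf{H})' = \sigma^2 (\mathbf{I} - \mathbf{H}) \qquad \mathbf{s}^2\{\mathbf{e}\} = MSE \cdot (\mathbf{I} - \mathbf{H}) \]

Analysis of Variance – Sums of Squares
\[ \underset{n \times 1}{\mathbf{Y}} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} \qquad \underset{n \times 1}{\hat{\mathbf{Y}}} = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} = \mathbf{X}\mathbf{b} = \mathbf{H}\mathbf{Y} \qquad \underset{n \times 1}{\bar{\mathbf{Y}}} = \begin{bmatrix} \bar{Y} \\ \bar{Y} \\ \vdots \\ \bar{Y} \end{bmatrix} = \frac{1}{n} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \frac{1}{n}\mathbf{J}\mathbf{Y} \]
\[ SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = (\mathbf{Y} - \bar{\mathbf{Y}})'(\mathbf{Y} - \bar{\mathbf{Y}}) = \mathbf{Y'}\left[ \mathbf{I} - \frac{1}{n}\mathbf{J} \right]\mathbf{Y} \]
\[ SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = (\mathbf{Y} - \hat{\mathbf{Y}})'(\mathbf{Y} - \hat{\mathbf{Y}}) = \mathbf{Y'}(\mathbf{I} - \mathbf{H})\mathbf{Y} = \mathbf{Y'Y} - \mathbf{b'X'Y} \]
\[ SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = (\hat{\mathbf{Y}} - \bar{\mathbf{Y}})'(\hat{\mathbf{Y}} - \bar{\mathbf{Y}}) = \mathbf{Y'}\left[ \mathbf{H} - \frac{1}{n}\mathbf{J} \right]\mathbf{Y} = \mathbf{b'X'Y} - \mathbf{Y'}\left( \frac{1}{n} \right)\mathbf{J}\mathbf{Y} \]
\[ MSE = \frac{SSE}{n - p} \qquad E\{MSE\} = \sigma^2 \qquad MSR = \frac{SSR}{p - 1} \]
\[ E\{MSR\} = \sigma^2 + \frac{1}{p - 1}\left[ \sum_{k=1}^{p-1} \beta_k^2\, SS_{kk} + \sum_{k=1}^{p-1} \sum_{k' \ne k} \beta_k \beta_{k'}\, SS_{kk'} \right] \qquad SS_{kk'} = \sum_{i=1}^{n} (X_{ik} - \bar{X}_k)(X_{ik'} - \bar{X}_{k'}) \]
\[ E\{MSR\} \ge E\{MSE\}, \qquad E\{MSR\} = E\{MSE\} \text{ when } \beta_1 = \cdots = \beta_{p-1} = 0 \]

ANOVA Table, F-test, and R2
Analysis of Variance (ANOVA) Table
Source       df     Sum of Squares                    Mean Square
--------------------------------------------------------------------
Regression   p-1    SSR = b'X'Y - Y'(1/n)JY           MSR = SSR/(p-1)
Error        n-p    SSE = Y'Y - b'X'Y                 MSE = SSE/(n-p)
Total        n-1    SSTO = Y'Y - Y'(1/n)JY

\[ \text{Test of } H_0: \beta_1 = \cdots = \beta_{p-1} = 0 \;(\Rightarrow E\{Y\} = \beta_0) \qquad H_A: \text{Not all } \beta_k = 0 \]
\[ \text{Test Statistic: } F^* = \frac{MSR}{MSE} = \frac{SSR/(p-1)}{SSE/(n-p)} \]
\[ \text{Rejection Region: } F^* \ge F(1 - \alpha;\, p - 1,\, n - p) \qquad P\text{-value} = \Pr\{ F(p - 1,\, n - p) \ge F^* \} \]
\[ \text{Coefficient of Multiple Determination: } R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO} \]
\[ \text{Adjusted-}R^2 = 1 - \frac{SSE/(n - p)}{SSTO/(n - 1)} = 1 - \left( \frac{n - 1}{n - p} \right) \frac{SSE}{SSTO} \]
Adjusted-R2 places a "penalty" on models with extra predictors.
\[ \text{Correlation: } R = \sqrt{R^2} \]

Inferences Regarding Regression Parameters
\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad E\{\boldsymbol{\varepsilon}\} = \mathbf{0} \quad \boldsymbol{\sigma}^2\{\boldsymbol{\varepsilon}\} = \sigma^2 \mathbf{I} \;\Rightarrow\; E\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta} \quad \boldsymbol{\sigma}^2\{\mathbf{Y}\} = \sigma^2 \mathbf{I} \]
\[ E\{\mathbf{b}\} = E\{(\mathbf{X'X})^{-1}\mathbf{X'Y}\} = (\mathbf{X'X})^{-1}\mathbf{X'}E\{\mathbf{Y}\} = (\mathbf{X'X})^{-1}\mathbf{X'X}\boldsymbol{\beta} = \boldsymbol{\beta} \]
\[ \boldsymbol{\sigma}^2\{\mathbf{b}\} = \begin{bmatrix} \sigma^2\{b_0\} & \sigma\{b_0, b_1\} & \cdots & \sigma\{b_0, b_{p-1}\} \\ \sigma\{b_1, b_0\} & \sigma^2\{b_1\} & \cdots & \sigma\{b_1, b_{p-1}\} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma\{b_{p-1}, b_0\} & \sigma\{b_{p-1}, b_1\} & \cdots & \sigma^2\{b_{p-1}\} \end{bmatrix} \]
\[ \boldsymbol{\sigma}^2\{\mathbf{b}\} = \boldsymbol{\sigma}^2\{(\mathbf{X'X})^{-1}\mathbf{X'Y}\} = (\mathbf{X'X})^{-1}\mathbf{X'}\,\boldsymbol{\sigma}^2\{\mathbf{Y}\}\,\left[ (\mathbf{X'X})^{-1}\mathbf{X'} \right]' = \sigma^2 (\mathbf{X'X})^{-1}\mathbf{X'X}(\mathbf{X'X})^{-1} = \sigma^2 (\mathbf{X'X})^{-1} \]
\[ \mathbf{s}^2\{\mathbf{b}\} = MSE\,(\mathbf{X'X})^{-1} = \begin{bmatrix} s^2\{b_0\} & s\{b_0, b_1\} & \cdots & s\{b_0, b_{p-1}\} \\ s\{b_1, b_0\} & s^2\{b_1\} & \cdots & s\{b_1, b_{p-1}\} \\ \vdots & \vdots & \ddots & \vdots \\ s\{b_{p-1}, b_0\} & s\{b_{p-1}, b_1\} & \cdots & s^2\{b_{p-1}\} \end{bmatrix} \]
\[ \frac{b_k - \beta_k}{s\{b_k\}} \sim t_{n-p} \]
\[ (1 - \alpha)100\% \text{ CI for } \beta_k: \quad b_k \pm t\left( 1 - \frac{\alpha}{2};\, n - p \right) s\{b_k\} \]
\[ \text{Test of } H_0: \beta_k = 0 \quad H_A: \beta_k \ne 0 \qquad \text{Test Statistic: } t^* = \frac{b_k}{s\{b_k\}} \]
\[ \text{Rejection Region: } |t^*| \ge t\left( 1 - \frac{\alpha}{2};\, n - p \right) \qquad P\text{-value} = 2\Pr\{ t(n - p) \ge |t^*| \} \]
\[ \text{Simultaneous } (1 - \alpha)100\% \text{ CIs for } g \le p \;\; \beta\text{'s}: \quad b_k \pm t\left( 1 - \frac{\alpha}{2g};\, n - p \right) s\{b_k\} \]

Estimating Mean Response at Specific X-levels
Given set of levels of $X_1, \ldots, X_{p-1}$: $X_{h1}, \ldots, X_{h,p-1}$
\[ \underset{p \times 1}{\mathbf{X}_h} = \begin{bmatrix} 1 \\ X_{h1} \\ \vdots \\ X_{h,p-1} \end{bmatrix} \qquad E\{Y_h\} = \mathbf{X}_h'\boldsymbol{\beta} \qquad \hat{Y}_h = \mathbf{X}_h'\mathbf{b} \]
\[ \sigma^2\{\hat{Y}_h\} = \mathbf{X}_h'\,\boldsymbol{\sigma}^2\{\mathbf{b}\}\,\mathbf{X}_h = \sigma^2\, \mathbf{X}_h'(\mathbf{X'X})^{-1}\mathbf{X}_h \qquad s^2\{\hat{Y}_h\} = MSE\, \mathbf{X}_h'(\mathbf{X'X})^{-1}\mathbf{X}_h \]
\[ (1 - \alpha)100\% \text{ CI for } E\{Y_h\}: \quad \hat{Y}_h \pm t\left( 1 - \frac{\alpha}{2};\, n - p \right) s\{\hat{Y}_h\} \]
\[ (1 - \alpha)100\% \text{ Confidence Region for Regression Surface:} \quad \hat{Y}_h \pm W s\{\hat{Y}_h\} \qquad W^2 = pF(1 - \alpha;\, p,\, n - p) \]
\[ (1 - \alpha)100\% \text{ CIs for several } (g) \; E\{Y_h\}: \quad \hat{Y}_h \pm B s\{\hat{Y}_h\} \qquad B = t\left( 1 - \frac{\alpha}{2g};\, n - p \right) \]

Predicting New Response(s) at Specific X-levels
Given set of levels of $X_1, \ldots, X_{p-1}$: $X_{h1}, \ldots, X_{h,p-1}$
\[ \underset{p \times 1}{\mathbf{X}_h} = \begin{bmatrix} 1 \\ X_{h1} \\ \vdots \\ X_{h,p-1} \end{bmatrix} \qquad E\{Y_h\} = \mathbf{X}_h'\boldsymbol{\beta} \qquad \hat{Y}_h = \mathbf{X}_h'\mathbf{b} \]
\[ s^2\{\text{pred}\} = MSE\left[ 1 + \mathbf{X}_h'(\mathbf{X'X})^{-1}\mathbf{X}_h \right] \]
\[ \text{Mean of } m \text{ observations (at same X-levels):} \quad s^2\{\text{predmean}\} = MSE\left[ \frac{1}{m} + \mathbf{X}_h'(\mathbf{X'X})^{-1}\mathbf{X}_h \right] \]
\[ (1 - \alpha)100\% \text{ prediction interval for } Y_{h(new)}: \quad \hat{Y}_h \pm t\left( 1 - \frac{\alpha}{2};\, n - p \right) s\{\text{pred}\} \]
\[ \text{Scheffe: } (1 - \alpha)100\% \text{ prediction intervals for several } (g) \; Y_{h(new)}: \quad \hat{Y}_h \pm S s\{\text{pred}\} \qquad S^2 = gF(1 - \alpha;\, g,\, n - p) \]
\[ \text{Bonferroni: } (1 - \alpha)100\% \text{ prediction intervals for several } (g) \; Y_{h(new)}: \quad \hat{Y}_h \pm B s\{\text{pred}\} \qquad B = t\left( 1 - \frac{\alpha}{2g};\, n - p \right) \]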
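Numerical sketch of the matrix results
The matrix formulas in this chapter (normal equations, hat matrix, ANOVA sums of squares, s2{b}, and the variances used for mean-response and prediction intervals) can be checked numerically. The following is a minimal sketch, not code from KNNL: it assumes NumPy is available, and the data values, the variable names, and the X_h levels are all invented for illustration. It stops at the variance estimates; the t and F critical values needed for the intervals and tests would still have to come from tables or statistical software.

```python
import numpy as np

# Hypothetical data (invented for illustration): n = 8 cases, p - 1 = 2 predictors.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
Y = np.array([118.0, 92.0, 131.0, 89.0, 128.0, 165.0, 84.0, 121.0])

n = len(Y)
X = np.column_stack([np.ones(n), X1, X2])  # n x p design matrix; first column is X_i0 = 1
p = X.shape[1]

# Least squares: b = (X'X)^{-1} X'Y (solution of the normal equations)
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y

# Hat matrix, fitted values, residuals
H = X @ XtX_inv @ X.T
Y_hat = H @ Y
e = Y - Y_hat

# ANOVA sums of squares as quadratic forms in Y
I = np.eye(n)
J = np.ones((n, n))                        # n x n matrix of 1s
SSTO = Y @ (I - J / n) @ Y
SSE = Y @ (I - H) @ Y
SSR = Y @ (H - J / n) @ Y
MSE = SSE / (n - p)
MSR = SSR / (p - 1)
F_star = MSR / MSE

# Coefficient of multiple determination and its adjusted version
R2 = 1 - SSE / SSTO
R2_adj = 1 - (n - 1) / (n - p) * (SSE / SSTO)

# Estimated variance-covariance matrix of b and t* for each coefficient
s2_b = MSE * XtX_inv
s_b = np.sqrt(np.diag(s2_b))
t_star = b / s_b

# Mean response and prediction variance at a hypothetical X_h = (1, X_h1, X_h2)'
X_h = np.array([1.0, 4.5, 3.0])
Y_h_hat = X_h @ b
s2_Yh = MSE * (X_h @ XtX_inv @ X_h)        # s^2{Y_h-hat}
s2_pred = MSE * (1 + X_h @ XtX_inv @ X_h)  # s^2{pred} for one new observation

print("b =", b)
print("SSR, SSE, SSTO =", SSR, SSE, SSTO)
print("F* =", F_star, " R2 =", R2, " adj R2 =", R2_adj)
print("t* =", t_star)
print("Y_h_hat =", Y_h_hat, " s{Y_h_hat} =", np.sqrt(s2_Yh), " s{pred} =", np.sqrt(s2_pred))
```

Forming the explicit inverse of X'X keeps the code close to the slide formulas; for larger or ill-conditioned problems a solver such as np.linalg.lstsq is usually the safer choice.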
