Correlation, Simple and Multiple Regression

Slide 1: Multiple Regression Analysis: Wrap-Up
More Extensions of MRA

Slide 2: Contents for Today

• Probing interactions in MRA
• Reliability and interactions
• Enter Method vs. Hierarchical
• Patterns of interactions/effects coefficients
• Polynomial Regression
• Interpretation issues in MRA
• Model Comparisons

Slide 3: Recall our continuous variable interaction

• Job satisfaction as a function of Hygiene and Caring Atmosphere
• Steeper slope for the regression of job satisfaction on hygiene when people perceived an (otherwise) caring atmosphere.

Slide 4: Simple slopes of satisfaction on hygiene for three levels of caring atmosphere

[Figure: "Interaction of Care x Basics on Satisfaction". Vertical axis: Satisfaction Level (0 to 8). Horizontal axis: Low/High Basics (Low X, High X). Lines: Low value of care, Mean of care, High value of care. The chart's data table is labeled Low Care, Mean of Care, High Care and Low Basic, High Basic; values as extracted: 3.464, 5.036, 4.192, 6.258, 4.575, 6.902.]

Slide 5: Developing equations to graph

Using Cohen et al.'s notation:

1) Choose a high, medium, and low value for Z and solve the following:

$$Y' = (A + B_2 Z) + (B_1 + B_3 Z) X$$

Example: a low value of Z [caring atmosphere] might be -1.57 (after centering):

$$Y' = [5.23 + 0.3721(-1.57)] + [0.5165 + 0.0472(-1.57)] X$$

$$Y' = 4.646 + 0.442 X$$

2) Next, solve for two reasonable (but extreme) values of X. Example: doing so for -2 and +2 for basics gives us 3.76 and 5.53.

3) Repeat for the medium and high values of Z.

Slide 6: More on centering…

First, some terms (two continuous variables with an interaction):

$$Y' = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ$$

Assignment of X and Z is arbitrary. What do $\beta_1$ and $\beta_2$ represent if $\beta_3 = 0$? What if $\beta_3 \neq 0$?

Full regression equation for our example (centered):

$$Y' = 5.225 + .516 X + .372 Z + .047 XZ$$

where X = Hygiene (cpbasic) and Z = caring atmosphere (cpcare).

Slide 7: Even more on centering

• We know that centering helps us with multicollinearity issues.
• Let's examine some other properties, first turning to p. 259 of the reading…
• Note the regression equation above graph A. Then above graph B.

Slide 8: Why does this (eq. slide 6) make sense? Or does it?

Our regression equation:

$$Y' = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ$$

Rearranging some terms:

$$Y' = \beta_1 X + \beta_3 XZ + \beta_0 + \beta_2 Z$$

Then factor out X:

$$Y' = (\beta_1 + \beta_3 Z) X + (\beta_0 + \beta_2 Z)$$

The right-hand term in parentheses is the intercept. The left-hand term in parentheses is the slope. We then solve this equation at different values of Z.

Slide 9: Since the regression is "symmetric"…

Our regression equation:

$$Y' = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ$$

We can rearrange the terms differently:

$$Y' = \beta_2 Z + \beta_3 XZ + \beta_0 + \beta_1 X$$

Then factor out Z:

$$Y' = (\beta_2 + \beta_3 X) Z + (\beta_0 + \beta_1 X)$$

The right-hand term in parentheses is the intercept. The left-hand term in parentheses is the slope. We then solve this equation at different values of X.

Slide 10: Are the simple slopes different from 0?

This may be a reasonable question. If so, start from the simple slope equation:

$$Y' = (\beta_1 + \beta_3 Z) X + (\beta_0 + \beta_2 Z)$$

and solve for a chosen Z value (e.g., one standard deviation below the mean: -1.57):

$$Y' = [5.23 + 0.3721(-1.57)] + [0.5165 + 0.0472(-1.57)] X$$

$$Y' = 4.646 + 0.442 X$$

The simple slope is 0.442. Next we need its standard error, which is given by:

$$SE_{B \text{ at } Z} = \sqrt{SE^2_{B_{11}} + 2Z \, cov_{B_{13}} + Z^2 \, SE^2_{B_{33}}}$$

Here $SE^2_{B_{11}}$ and $SE^2_{B_{33}}$ are the sampling variances of $B_1$ and $B_3$, and $cov_{B_{13}}$ is their covariance.

Slide 11: How to solve…

Under "Statistics", request the covariance matrix for the regression coefficients. For our example:

Coefficient Covariances

                      b1            b2            b3
                      cpbasic       cpcare        cbasxcar
    b1  cpbasic       0.00187830   -0.00053855    0.00015193
    b2  cpcare       -0.00053855    0.00047930    0.00011203
    b3  cbasxcar      0.00015193    0.00011203    0.00031912

Note: I reordered these because SPSS didn't provide them in order; I also added the labels b1, b2, etc.
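The standard error formula and the coefficient covariances above are enough to test a simple slope at any value of Z. The following is a minimal sketch in Python, which is not the tool used in these slides; the function name `simple_slope` and the constant names are hypothetical, and the coefficient values are the rounded ones reported in the deck, so results can differ from SPSS output in the fourth decimal.

```python
import math

# Estimates from the centered regression reported in the slides.
B1 = 0.5165   # coefficient for cpbasic (X)
B3 = 0.0472   # coefficient for cbasxcar (XZ)

# Entries of the coefficient covariance matrix shown above.
VAR_B1 = 0.00187830      # sampling variance of B1
COV_B1_B3 = 0.00015193   # covariance of B1 and B3
VAR_B3 = 0.00031912      # sampling variance of B3


def simple_slope(z):
    """Return (slope, standard error, t) for the regression of Y on X at Z = z."""
    slope = B1 + B3 * z
    se = math.sqrt(VAR_B1 + 2 * z * COV_B1_B3 + z ** 2 * VAR_B3)
    return slope, se, slope / se


# Hypothetical labels; the Z values are -1 SD, the mean, and +1 SD of cpcare.
for label, z in [("low", -1.5735), ("mean", 0.0), ("high", 1.5735)]:
    slope, se, t = simple_slope(z)
    print(f"{label:>4}: slope = {slope:.4f}, SE = {se:.4f}, t = {t:.2f}")
```

Each t would be referred to a t distribution with 1038 degrees of freedom (1042 cases minus 4 estimated coefficients).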
$$SE_{B \text{ at } Z} = \sqrt{0.0018783 + 2(-1.5735)(0.00015193) + (-1.5735)^2 (0.00031912)}$$

$$SE_{B \text{ at } Z} = \sqrt{0.002190286} = 0.0468$$

$$t_{1038} = \frac{0.442}{0.0468} = 9.444$$

Slide 12: Simple Slope Table

Simple slopes, intercepts, test statistics and confidence intervals for the simple regression equations at each value of Z (z_cpcare):

    Level of Z   Value of Z   Slope    Intercept   SE of slope   t         p        95% CI Low   95% CI High
    Low          -1.5735      0.4423   4.6396      0.0468        9.4502    0.0000   0.3504       0.5341
    Medium        0.0000      0.5165   5.2251      0.0433        11.9172   0.0000   0.4314       0.6015
    High          1.5735      0.5907   5.8106      0.0561        10.5303   0.0000   0.4806       0.7008

Compare to the original regression table:

                   B        Std. Error   Beta     t          Sig.     95% CI Lower   95% CI Upper
    (Constant)     5.2251   0.0286                182.6190   0.0000   5.1690         5.2813
    x_cpbasic      0.5165   0.0433       0.3413   11.9172    0.0000   0.4314         0.6015
    z_cpcare       0.3721   0.0219       0.4981   16.9960    0.0000   0.3291         0.4151
    xz_cbasxcar    0.0472   0.0179       0.0650   2.6399     0.0084   0.0121         0.0822

Slide 13: A visual representation of regression plane: The centering thing again…

[Figure: image not recovered.]

Slide 14: Using SPSS to get simple slopes

• When an interaction is present, $b_X$ = ? (It is the slope of X at Z = 0.)
• Knowing this, we can "trick" SPSS into computing simple slope test statistics for us.

Uncentered Descriptive Statistics

               Mean       Std. Deviation   N
    satis      5.26       1.175            1042
    pbasic     4.131144   .7766683         1042
    pcare      5.621881   1.5735326        1042
    basxcar    24.0355    9.20924          1042

Centered Descriptive Statistics

               Mean       Std. Deviation   N
    satis      5.26       1.175            1042
    cpbasic    -.0007     .77667           1042
    cpcare     .0032      1.57353          1042
    cbasxcar   .8107      1.61918          1042

When the predictors are centered, the coefficient for X is the "middle" simple slope (the slope at the mean of Z). So…

Slide 15: Using SPSS to get simple slopes (cont'd)

• If we shift Z up by 1 standard deviation, so that Z = 0 falls 1 standard deviation below the mean…
• We get the simple slope for X at 1 standard deviation below the mean.

    * Center hygiene; shift caring atmosphere so that cpcare = 0 is one SD below its mean.
    compute cpbasic=pbasic-4.131875.
    compute cpcare = pcare-5.61866+1.573532622.
    compute cbasxcar=cpbasic*cpcare.
    execute.
    TITLE 'regression w/interaction centered'.
    REGRESSION
      /DESCRIPTIVES MEAN STDDEV CORR SIG N
      /MISSING LISTWISE
      /STATISTICS COEFF OUTS CI BCOV R ANOVA COLLIN TOL ZPP
      /CRITERIA=PIN(.05) POUT(.10)
      /NOORIGIN
      /DEPENDENT satis
      /METHOD=ENTER cpbasic cpcare cbasxcar
      /SCATTERPLOT=(*ZRESID ,*ZPRED )
      /SAVE PRED .

This code gets us…

Slide 16: This…

Dependent Variable: satis

                  B        Std. Error   Beta     t         Sig.     Lower Bound   Upper Bound
    (Constant)    4.6396   0.0479                96.7988   0.0000   4.5456        4.7337
    cpbasic       0.4423   0.0468       0.2922   9.4502    0.0000   0.3504        0.5341
    cpcare        0.3721   0.0219       0.4981   16.9960   0.0000   0.3291        0.4151
    cbasxcar      0.0472   0.0179       0.0610   2.6399    0.0084   0.0121        0.0822

And, setting the zero-point to one standard deviation above the mean:

    compute cpcare = pcare-5.61866-1.573532622.

Gives us…

                  B        Std. Error   Beta     t          Sig.     Lower Bound   Upper Bound
    (Constant)    5.8106   0.0414                140.3721   0.0000   5.7294        5.8918
    cpbasic       0.5907   0.0561       0.3903   10.5303    0.0000   0.4806        0.7008
    cpcare        0.3721   0.0219       0.4981   16.9960    0.0000   0.3291        0.4151
    cbasxcar      0.0472   0.0179       0.0976   2.6399     0.0084   0.0121        0.0822

Slide 17: Choice of levels of Z for simple slopes

• +/- 1 standard deviation
• Range of values
• Meaningful cutoffs

[Figure: "Interaction of Care x Basics on Satisfaction". Vertical axis: Satisfaction Level (0 to 8). Horizontal axis: Low/High Basics (Low X, High X). Lines: Low value of care, Mean of care, High value of care.]

Slide 18: Wrapping up CV Interactions

• Interaction term (highest order) is invariant
  • Assumes all lower-level terms are included
• Upper limits on correlations governed by $r_{xx}$
• Crossing point of regression lines
  • For Hygiene: -10.9 (centered)
  • For Caring atmosphere: -7.9
• If your work involves complicated interaction hypotheses, examine Aiken & West (1991).
• Section 7.7 not covered, but good discussion
• Cannot interpret β weights using the method discussed here

Slide 19: Polynomial Regression (10,000 Ft)

[Figure: four panels titled "Polynomial 2nd Power", "Polynomial 3rd Power", "Polynomial 4th Power" (labeled X+X^2+X^3+X^4) and "Polynomial to the 5th Power". Each panel plots a series with a fitted polynomial trend line over X from -3 to 3; legend entries include Y' and Y.]

Slide 20: Predicting job satisfaction from IQ

Model Summary

    Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
    1       .365   .134       .111                2.10543

    Predictors: (Constant), IQ. Dependent Variable: JobSat.

ANOVA

                  Sum of Squares   df   Mean Square   F       Sig.
    Regression    25.952           1    25.952        5.855   .020
    Residual      168.448          38   4.433
    Total         194.400          39

    Predictors: (Constant), IQ. Dependent Variable: JobSat.

Slide 21: Continued

Dependent Variable: JobSat

                  B         Std. Error   Beta     t         Sig.     Lower Bound   Upper Bound
    (Constant)    -0.8432   3.1763                -0.2655   0.7921   -7.2734       5.5869
    IQ            0.0720    0.0297       0.3654   2.4196    0.0204   0.0118        0.1322

It's all good; let's inspect our standardized predicted by residual graph.

Slide 22: Ooops!

[Figure: image not recovered.]

Slide 23: Next step…

• Center X
• Square X
• Add X² to the prediction equation

    * Center IQ at 106.20, then square the centered score.
    compute c_IQ=IQ-106.20.
    compute IQsq=c_IQ**2.
    execute.
    * The model enters IQ as listed on the slide; IQsq is built from the centered score.
    REGRESSION
      /DESCRIPTIVES MEAN STDDEV CORR SIG N
      /MISSING LISTWISE
      /STATISTICS COEFF OUTS CI R ANOVA COLLIN TOL ZPP
      /CRITERIA=PIN(.05) POUT(.10)
      /NOORIGIN
      /DEPENDENT JobSat
      /METHOD=ENTER IQ IQsq
      /SCATTERPLOT=(*ZRESID ,*ZPRED ) .

Slide 24: Results

Model Summary

    Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
    1       .944   .891       .886                .75534

    Predictors: (Constant), IQsq, IQ. Dependent Variable: JobSat.

ANOVA

                  Sum of Squares   df   Mean Square   F         Sig.
    Regression    173.290          2    86.645        151.867   .000
    Residual      21.110           37   .571
    Total         194.400          39

    Predictors: (Constant), IQsq, IQ. Dependent Variable: JobSat.

Slide 25: And both predictors are significant

                  B         Std. Error   Beta      t          Sig.     Lower Bound   Upper Bound
    (Constant)    -6.5099   1.1928                 -5.4575    0.0000   -8.9269       -4.0930
    IQ            0.1455    0.0116       0.7385    12.5294    0.0000   0.1219        0.1690
    IQsq          -0.0171   0.0011       -0.9472   -16.0700   0.0000   -0.0192       -0.0149

Slide 26: Interpretation Issues & Model Comparison

• Linearity vs. Nonlinearity
  • Nonlinear effects well-established?
  • Replicability of nonlinearity
  • Degree of nonlinearity
• Interpretation Issues
  • Regression coefficients are context specific
  • Assumption that we are testing "the" model
  • β-weights vs. b-weights
  • Replication
  • Strength of relationship
• Model Comparison
  • May sometimes wish to determine whether one model is a significantly better predictor than another (where different variables are used)
  • E.g., which of two sets of predictors best predicts relapse?

Slide 27: Strength of relationship: My test is sooo valid!

[Figure: "Example of 'Artificial' Correlation". Scatterplot of Outcome (0 to 10) against Predictor (0 to 10) with a linear trend line, R² = 0.4499.]
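Returning to the IQ and job satisfaction example, the two ANOVA tables reported for JobSat give a quick check on how much the squared term adds. The sketch below is a Python illustration, not part of the original SPSS workflow; the helper name `f_change` is hypothetical. It applies the usual F test for added predictors, which assumes the smaller model is nested in the larger one, so it does not address the comparison of models built from different variables raised in the model comparison bullets.

```python
# Sums of squares copied from the two ANOVA tables for JobSat.
SS_TOTAL = 194.400
SS_REG_LINEAR = 25.952       # predictors: IQ
SS_REG_QUADRATIC = 173.290   # predictors: IQ, IQsq
SS_RES_QUADRATIC = 21.110    # residual SS of the larger model
DF_RES_QUADRATIC = 37


def f_change(ss_reg_small, ss_reg_large, ss_res_large, df_added, df_res_large):
    """F statistic for the predictors added to a nested (smaller) model."""
    ms_added = (ss_reg_large - ss_reg_small) / df_added
    ms_residual = ss_res_large / df_res_large
    return ms_added / ms_residual


r2_linear = SS_REG_LINEAR / SS_TOTAL
r2_quadratic = SS_REG_QUADRATIC / SS_TOTAL
f_iqsq = f_change(SS_REG_LINEAR, SS_REG_QUADRATIC,
                  SS_RES_QUADRATIC, 1, DF_RES_QUADRATIC)

print(f"R^2 linear = {r2_linear:.3f}, R^2 quadratic = {r2_quadratic:.3f}")
print(f"F change for IQsq (1, {DF_RES_QUADRATIC} df) = {f_iqsq:.2f}")
```

With a single added predictor, this F should equal the square of that predictor's t statistic, and it does here up to rounding: the t for IQsq is -16.07, and its square is about 258.2.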
