Stat 301 – Lecture 12 Multiple Regression A single numerical response variable, Y. Multiple numerical explanatory variables, X1, X2,…, Xk 1 Multiple Regression Y Y | x1 , x2 ,..., xk Y 0 1 x1 2 x2 ... k xk 2 Example Y, Response – Effectiveness score based on experienced teachers’ evaluations. Explanatory – Test 1, Test 2, Test 3, Test 4. 3 Stat 301 – Lecture 12 Student Eval Teacher 1 2 3 4 5 6 Test1 489 423 507 467 340 524 23 Test2 81 68 80 107 43 129 Test3 151 156 165 149 134 163 434 76 141 Test4 46 46 76 56 49 72 54 44 45 55 43 49 50 58 4 JMP Analyze – Fit Model Pick Role Variables Y – EVAL Construct Model Effects Add – Test1, Test2, Test3, Test4 5 JMP Analyze – Fit Model Personality – Standard Least Squares Emphasis – Minimal Report 6 Stat 301 – Lecture 12 Response EVAL Summary of Fit RSquare RSquare Adj Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.802861 0.759052 37.53627 444.4783 23 Analysis of Variance Source Model Error C. Total DF 4 18 22 Sum of Squares Mean Square 103286.25 25821.6 25361.49 1409.0 128647.74 F Ratio 18.3265 Prob > F <.0001* Parameter Estimates Term Intercept Test1 Test2 Test3 Test4 Estimate Std Error t Ratio Prob>|t| -193.4994 125.3074 -1.54 0.1399 1.1158539 0.319746 3.49 0.0026* 2.243267 0.628449 3.57 0.0022* -1.367001 0.563965 -2.42 0.0261* 6.0482387 1.202281 5.03 <.0001* 7 Prediction Equation Predicted Evaluation = –193.50 + 1.116*Test1 + 2.243*Test2 – 1.367*Test3 + 6.048*Test4 8 Conditions The random error term, , is Independent Identically distributed Normally distributed with standard deviation, . 9 Stat 301 – Lecture 12 Estimate of Error Variance, 2 MSError SSError df Error y yˆ 2 MSError MSError n k 1 25361.49 1409.0 18 10 Estimate of Error Std Dev, Root Mean Square Error RMSE MSError RMSE 1409.0 37.54 11 Multiple R2 R2 SSModel SS 1 Error SSTotal SSTotal R2 103286.25 0.802861 128647.74 12 Stat 301 – Lecture 12 Interpretation 80.3% of the variation in the evaluation scores can be explained by the model, i.e. the relationship with the explanatory variables. 13 Caution Including additional explanatory variables in a model can only increase the value of R2, even if those explanatory variables have nothing to do with the response variable. 14 Adjusted R2 adjR 2 1 adjR 2 1 MSError MSTotal 25361.49 18 0.75905 128647.74 22 15 Stat 301 – Lecture 12 Test of Model Utility Is there any explanatory variable in the model that is helping to explain significant amounts of variation in the response? 16 Step 1: Hypotheses H 0 : 1 2 ... k 0 H A : at least one parameter is not zero 17 Step 2: Test Statistic F MSModel MSError 25821.6 18.3265 1409.0 P value 0.0001 F 18 Stat 301 – Lecture 12 Step 3: Decision Reject the null hypothesis because the P-value is so small. 19 Step 4: Conclusion At least one of the Test 1, Test 2, Test 3 or Test 4 is providing statistically significant information about the evaluation score. The model is useful. Maybe not the best, but useful. 20 Alternative Form R k 2 F 1 R2 1 n k 0.802861 4 18.3265 F 0.197139 18 P value 0.0001 21