Computer lab 6: Multiple linear regression – model selection and validation

In practice, we often have many possible explanatory variables for a multiple linear regression model and want to find a “good” model. By “good” we mean a model that has few variables and is acceptable compared with all other candidate models.

Learning objectives

After reading the recommended text and completing the computer lab, the student shall be able to:
- Understand and make use of different criteria for model selection
- Use different automatic search procedures for model selection
- Validate selected models

Recommended reading

Chapter 9 in Kutner et al.

Assignment 1: Selection of regression models

Study again the SENIC data in Appendix C.1. Carry out exercise 9.25 a, b. Note that only cases 57-113 should be used in this analysis!
- Which model should be selected by using the criterion?
- Which model should be selected by using the $C_p$ criterion?
- Which model should be selected with backward elimination?

Assignment 2: Validation

Now we want to validate the models above.

First, carry out internal validations of the three models. Run each model and study the PRESS criterion. Which model is the “best”? Is the “best” model a good one?

Next, carry out an external validation of the “best” model by using cases 1-56. First, fit a regression model to cases 1-56 (the validation set) with the same explanatory variables as in the “best” model found in the model-building set. Compare the model fitted to the validation set with the model fitted to the model-building set by investigating:
- The estimated regression coefficients and their standard errors
- MSE
- $R^2$

Then calculate the mean squared prediction error, MSPR. This is done by predicting each observation in the validation set with the regression function estimated for the “best” model in the model-building set. These predicted values $\hat{Y}_i$, together with the observed values $Y_i$ in the validation set, give

$\mathrm{MSPR} = \frac{1}{n^*} \sum_{i=1}^{n^*} (Y_i - \hat{Y}_i)^2$

where $n^*$ is the number of cases in the validation set (see book, page 370). Compare MSPR with MSE in the model-building set. Conclusion?
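
The internal and external validation steps in Assignment 2 can be sketched in code. The example below is only an illustration under stated assumptions: it uses Python with NumPy rather than the software used in the lab, the data are synthetic stand-ins (not the SENIC data), and the function names `fit_ols`, `predict`, `mse`, `press` and `mspr` are hypothetical helpers introduced here. PRESS is computed from the deleted residuals $e_i / (1 - h_{ii})$, where $h_{ii}$ is the $i$-th diagonal element of the hat matrix, and MSPR is the average squared prediction error over the validation cases.

```python
import numpy as np


def design(X):
    """Add an intercept column to a 2-D predictor matrix."""
    return np.column_stack([np.ones(X.shape[0]), X])


def fit_ols(X, y):
    """Least-squares fit; returns coefficients with the intercept first."""
    beta, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return beta


def predict(beta, X):
    """Predicted responses for the rows of X."""
    return design(X) @ beta


def mse(beta, X, y):
    """MSE = SSE / (n - p), where p counts the intercept too."""
    resid = y - predict(beta, X)
    return float(resid @ resid) / (len(y) - len(beta))


def press(X, y):
    """PRESS = sum of squared deleted residuals e_i / (1 - h_ii)."""
    Xd = design(X)
    H = Xd @ np.linalg.pinv(Xd)          # hat matrix
    resid = y - H @ y                    # ordinary residuals
    deleted = resid / (1.0 - np.diag(H))
    return float(deleted @ deleted)


def mspr(beta, X_val, y_val):
    """MSPR = mean squared prediction error on the validation set."""
    err = y_val - predict(beta, X_val)
    return float(err @ err) / len(y_val)


# Synthetic stand-in data: 113 cases, 3 predictors (NOT the SENIC data).
rng = np.random.default_rng(0)
X = rng.normal(size=(113, 3))
y = 1.0 + X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=113)

# Cases 57-113 form the model-building set, cases 1-56 the validation set.
X_build, y_build = X[56:], y[56:]
X_val, y_val = X[:56], y[:56]

beta = fit_ols(X_build, y_build)
print("MSE (model-building set):", mse(beta, X_build, y_build))
print("PRESS (model-building set):", press(X_build, y_build))
print("MSPR (validation set):", mspr(beta, X_val, y_val))
```

A PRESS value close to SSE, and an MSPR close to the MSE of the model-building set, suggest that the MSE is a reasonable indication of the model's predictive ability. An MSPR much larger than that MSE suggests the opposite.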
To hand in

Answers to assignments 1-2. The lab report should be handed in no later than 5 days after the scheduled computer lab. Use Lisam (lisam.liu.se) for handing in the assignments.