Computer lab 2: Simple linear regression – model validation and matrix representation

A simple linear regression model is composed of a linear function of an explanatory variable x and a random error term ε. The construction of confidence and prediction intervals rests on the assumption that all error terms are statistically independent and normally distributed with mean zero and common standard deviation σ for some σ > 0. Furthermore, it is assumed that the error terms are independent of the x-variables. The soundness of these assumptions can be examined by investigating the model residuals, i.e. the differences between observed and fitted response values. Matrix representations of regression models have the advantage that they enable statistical inference for models involving two or more explanatory variables.

Learning objectives
After reading the recommended text and completing the computer lab, the student shall be able to:
• investigate and formally test whether or not a simple linear regression model can be regarded as a correct model of a given data set;
• write a simple linear regression model in matrix form and employ matrix operations to estimate the model parameters.

Recommended reading
Chapters 3–5 in Kutner et al.

Assignment 1: Model validation using residual plots
Consider the data set in exercise 1.20 in the textbook and carry out the following:
Use Minitab 15 (Stat → Regression → Regression) to investigate the relationship between the total number of minutes spent by the service person and the number of copiers serviced. Make all the different residual plots that are offered. Also plot the residuals against the number of copiers serviced.
Which conclusions can be drawn from the five residual plots? Is there any evidence of:
i. outliers
ii. trends in observation order
iii. non-constant variance
iv. a relationship between the error terms and the levels of the explanatory variable
How does the plot of residuals against fitted values differ from the plot of residuals against the levels of the explanatory variable? (An illustrative sketch of these residual plots is given below, after the Assignment 2 description.)

Assignment 2: Matrix representation of regression models
Let us first examine the matrix operations offered in Minitab 15 (Calc → Matrices). Small matrices can easily be entered from the keyboard. Start by clicking Editor → Enable Commands. Then click Read, enter the number of rows and columns and the name of the matrix, and click OK. (This is equivalent to typing commands such as read 4 3 m1 in the session window.) Finally, enter the matrix elements row by row. Check that your matrix has been entered correctly by typing the command prin m1 in the session window or by using the menu commands Manip → Display Data.
In the following, we use the cited matrix operations to compute parameter estimates in a simple linear regression model (a sketch of the same computations is given below, after items a–f).
a) Consider the data set from exercise 1.21 in the textbook and create a 10×2 matrix X that has ones in the first column and the levels of the explanatory variable in the second column.
b) Transpose the X-matrix. Compute and print the matrix (vector) X'Y, where Y is the 10×1 matrix (vector) of response values.
c) Compute and print the matrix (X'X)⁻¹.
d) Estimate the intercept and slope parameters of the regression model by computing b = (X'X)⁻¹X'Y.
e) Estimate the variance of the error terms by computing MSE = e'e/(n − 2), where e is the 10×1 matrix (vector) of residuals and n = 10.
f) Finally, estimate the covariance matrix of b by computing MSE·(X'X)⁻¹. Try to explain why the estimates of the intercept and the slope are correlated.
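Illustrative sketch for Assignment 1. The residual plots of Assignment 1 are produced in Minitab via Stat → Regression → Regression. Purely as an illustration of the same diagnostics, the Python code below fits a simple linear regression and draws residuals against fitted values, against the explanatory variable, against observation order, and a normal probability plot. The x and y values are placeholders, not the data of exercise 1.20; replace them with the textbook observations.

# Illustrative sketch of the Assignment 1 residual plots (placeholder data).
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

x = np.array([2, 4, 3, 2, 5, 1, 6, 4, 3, 5], dtype=float)            # copiers serviced (placeholder)
y = np.array([30, 62, 46, 33, 78, 17, 94, 60, 48, 80], dtype=float)  # minutes (placeholder)

# Fit the simple linear regression by least squares.
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
resid = y - fitted

fig, ax = plt.subplots(2, 2, figsize=(9, 7))
ax[0, 0].scatter(fitted, resid); ax[0, 0].set_title("Residuals vs fitted values")
ax[0, 1].scatter(x, resid);      ax[0, 1].set_title("Residuals vs explanatory variable")
ax[1, 0].plot(resid, marker="o"); ax[1, 0].set_title("Residuals vs observation order")
stats.probplot(resid, plot=ax[1, 1]); ax[1, 1].set_title("Normal probability plot")
plt.tight_layout()
plt.show()

Comparing the first two panels also answers the last question of Assignment 1: in simple linear regression the fitted values are a linear transformation of x, so the two plots show the same pattern up to a rescaling (and possible reversal) of the horizontal axis.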
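Illustrative sketch for Assignment 2. The matrix computations in parts a)–f) are carried out with the Minitab matrix commands described above. The Python/NumPy sketch below mirrors each step on placeholder data (not the exercise 1.21 observations): building X, forming X'Y and (X'X)⁻¹, and computing b, MSE, and the estimated covariance matrix of b.

# Illustrative sketch of the Assignment 2 matrix computations (placeholder data).
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)                   # explanatory variable (placeholder)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9])       # response (placeholder)
n = len(y)

# a) Design matrix X: a column of ones followed by the x-levels (10x2).
X = np.column_stack([np.ones(n), x])
Y = y.reshape(-1, 1)                 # 10x1 response vector

# b) X'Y
XtY = X.T @ Y

# c) (X'X)^-1
XtX_inv = np.linalg.inv(X.T @ X)

# d) Parameter estimates b = (X'X)^-1 X'Y
b = XtX_inv @ XtY

# e) Residuals and MSE = e'e / (n - 2)
e = Y - X @ b
MSE = (e.T @ e).item() / (n - 2)

# f) Estimated covariance matrix of b: MSE * (X'X)^-1
cov_b = MSE * XtX_inv

print("b =", b.ravel())
print("MSE =", MSE)
print("cov(b) =\n", cov_b)

As a hint for part f): for simple linear regression the off-diagonal element of MSE·(X'X)⁻¹ equals −MSE·x̄/Σ(xᵢ − x̄)², so the intercept and slope estimates are negatively correlated whenever the mean of the x-values is positive.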
To hand in
Answers to Assignment 1 and to parts c), d), e) and f) of Assignment 2. The lab report should be handed in no later than 5 days after the scheduled computer lab. Use Lisam (lisam.liu.se) to hand in the assignments.