732G18/732G21/732A22 Linear statistical models
Department of Computer and Information Science

Computer lab 1: Simple linear regression – standard analyses

Simple linear regression models are used to examine the relationship between a response variable y and an explanatory variable x, when we have made n independent observations (x_i, y_i), i = 1, ..., n, of the two variables.

Learning objectives

After reading the recommended text and completing the computer lab, the student shall be able to:
- Formulate a simple linear regression model and explain how the model parameters can be interpreted;
- Use a data set to make inferences about the model parameters;
- Compute a confidence interval for the expected response at a given level of the explanatory variable;
- Compute a prediction interval for a new observation of the response at a given level of the explanatory variable;
- Use the regression tools in Excel and Minitab.

Recommended reading

Chapter 1 through Section 2.10 in Kutner et al.: Applied Linear Statistical Models

Exercise 1

Consider the data set in exercise 1.20 in the textbook and carry out the following:
a) Use Excel (Insert → Chart → XY (Scatter)) and Minitab 15 (Graph → Scatterplot → Simple) to make a scatter plot of the observed service times versus the number of copying machines for the 45 customers that have been visited.
b) Formulate a linear regression model for the observed data. How many parameters (unknown constants) are there in the model?
c) Fit a straight line to the observed pairs of data using Excel (Chart → Add a trend line) and Minitab 15 (Stat → Regression → Fitted line plot). How can you interpret the estimated slope of the fitted regression line? Does the intercept of the regression line have a meaningful interpretation?
d) Use the regression tools in Excel (Tools → Data analysis → Regression) and Minitab 15 (Stat → Regression → Regression) to estimate the parameters of the fitted regression model. How large are the estimated standard deviation and variance of the error terms in the regression model?
e) Use Excel to compute a 90% confidence interval for the slope of the regression line. Explain how this confidence interval can be interpreted.
f) Use Minitab to compute a 90% confidence interval for the slope of the regression line. Check in the textbook how the information in the Minitab output can be combined with a suitable t-value to compute the desired confidence interval.
g) Compute a 90% confidence interval for the expected service time for customers with six copying machines.
h) A customer with six copying machines calls for service. Compute a 90% prediction interval for the time that service will take.
i) Explain the difference between the computed confidence and prediction intervals.
j) Consider the ANOVA table in the regression output in Minitab. Which parameter is estimated by MSE? How is MSE related to s?
k) Which hypothesis is tested in the F-test? How is the outcome of this test related to the confidence interval computed in exercise 1 e)?

To hand in

1. Answers to the questions in exercise 1 that are highlighted (yellow colour).
2. Solutions to the following exercises in the textbook: 1.19; 2.4; 2.13 a, b, c; 2.23 a, b, c.

Hand in before the lesson on Thursday 4 September.
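
For checking the Excel and Minitab output in exercise 1 by hand, the standard formulas for the slope interval, the interval for the expected response, the prediction interval and the F statistic can be sketched in Python as below. This is only an illustrative sketch, not part of the lab: the function names are made up for this example, the five data points are toy values (not the data from exercise 1.20), and the t-value is assumed to be looked up in a t-table with n - 2 degrees of freedom, as the exercise suggests.

```python
import math
from dataclasses import dataclass


@dataclass
class SimpleFit:
    """Summary quantities of a least-squares straight-line fit."""
    n: int        # number of observations
    x_bar: float  # mean of x
    sxx: float    # sum of squared deviations of x from its mean
    b0: float     # estimated intercept
    b1: float     # estimated slope
    mse: float    # SSE / (n - 2), estimate of the error variance


def fit_simple_regression(x, y):
    """Least-squares estimates for the model y = b0 + b1 * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return SimpleFit(n, x_bar, sxx, b0, b1, sse / (n - 2))


def slope_interval(fit, t_value):
    """Confidence interval for the slope: b1 +/- t * s{b1}."""
    se_b1 = math.sqrt(fit.mse / fit.sxx)
    return fit.b1 - t_value * se_b1, fit.b1 + t_value * se_b1


def mean_response_interval(fit, x_h, t_value):
    """Confidence interval for the expected response at x = x_h."""
    y_hat = fit.b0 + fit.b1 * x_h
    se = math.sqrt(fit.mse * (1 / fit.n + (x_h - fit.x_bar) ** 2 / fit.sxx))
    return y_hat - t_value * se, y_hat + t_value * se


def prediction_interval(fit, x_h, t_value):
    """Prediction interval for one new observation at x = x_h."""
    y_hat = fit.b0 + fit.b1 * x_h
    # The extra "1 +" accounts for the variance of the new error term.
    se = math.sqrt(fit.mse * (1 + 1 / fit.n + (x_h - fit.x_bar) ** 2 / fit.sxx))
    return y_hat - t_value * se, y_hat + t_value * se


def f_statistic(fit):
    """F = MSR / MSE, where SSR = b1^2 * Sxx has one degree of freedom."""
    return fit.b1 ** 2 * fit.sxx / fit.mse


# Toy data for illustration only (NOT the textbook data set).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = fit_simple_regression(x, y)

# 90% two-sided intervals use t(0.95; n - 2); for n = 5 a t-table gives 2.353.
T_VALUE = 2.353
print("s =", math.sqrt(fit.mse))
print("slope CI:", slope_interval(fit, T_VALUE))
print("mean response CI at x = 3:", mean_response_interval(fit, 3, T_VALUE))
print("prediction interval at x = 3:", prediction_interval(fit, 3, T_VALUE))
print("F =", f_statistic(fit))
```

In this sketch s is the square root of MSE, the prediction interval is always wider than the confidence interval at the same x, and the F statistic equals the square of the t statistic b1 / s{b1} for the slope.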