Linear Regression
The Science of Predicting Outcomes

Least-Squares Regression
Least-squares regression (LSR) is a method for finding a line that summarizes the relationship between two variables.
A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.
We often use a regression line to predict the value of y for a given value of x.

LSRL: Least-Squares Regression Line
$\hat{y} = a + bx$, where a is the y-intercept and b is the slope.

Example #1 - Finding the LSRL
• Consider the following data.
• With this data, find the LSRL.
• Start by entering this data into list 1 and list 2.

Shoe size (men's U.S.):  7    10   12   8    9.5  10.5  11   12.5  13.5  10
Height (in):             64   69   71   68   71   70    72   74    77    68

We need a graphing calculator to solve the first case for today.

You should then see the results of the regression:
a = 53.24
b = 1.65
r-squared = 0.8422
r = 0.9177 (this is the correlation coefficient for the scatterplot)

So the LSRL is Height = 53.24 + 1.65(Shoe size).

Example #2 - Interpreting the LSRL
Interpreting the intercept: when your shoe size is 0, you should be about 53.24 inches tall. (Of course, this does not make much sense in the context of the problem.)
Interpreting the slope: for each increase of 1 in shoe size, we would expect the height to increase by 1.65 inches.

Example #3 - Using the LSRL
Making predictions: how tall might you expect someone to be who has a shoe size of 12.5?
Just plug in 12.5 for the shoe size in the LSRL from Example #1:
Height = 53.24 + 1.65(12.5) = 73.865 inches
(This is a prediction and is therefore not exact.)

Practice
A. Find the strength of the correlation between the two variables.
B. Write the linear model for this data set.
C. What would your BAC level be if you drank 6 bottles of beer?

Student   Number of Beers   Blood Alcohol Level
1         5                 0.1
2         2                 0.03
3         9                 0.19
4         8                 0.12
5         3                 0.04
6         7                 0.095
7         3                 0.07
8         5                 0.06
9         3                 0.02
10        5                 0.05
11        4                 0.07
12        6                 0.1
13        5                 0.085
14        7                 0.09
15        1                 0.01
16        4                 0.05

Coefficients a and b
The slope is: $b = r \cdot \frac{s_y}{s_x}$
The intercept is: $a = \bar{y} - b\bar{x}$
Here $s_y$ and $s_x$ are the sample standard deviations of y and x (kind of like rise over run), and $\bar{y}$ and $\bar{x}$ are the means of y and x, respectively.
The equation of the least-squares regression line is written as: $\hat{y} = a + bx$

The table above describes a study that recorded data on the number of beers consumed and the blood alcohol content (BAC) for 16 students. Here is some partial computer output from Minitab relating to these data:
[Minitab output; callouts mark the y-intercept and the slope]
(a) Use the computer output to write the equation of the least-squares line.
(b) Interpret the slope and y-intercept of the equation in this setting.
(c) What blood alcohol level would your equation predict for a student who consumed 6 beers?

Answers
(a) If y = blood alcohol content (BAC) and x = number of beers, then BAC = -0.01270 + 0.017964(number of beers).
(b) Slope: for every extra beer consumed, the BAC will increase by an average of 0.017964. Intercept: if no beers are consumed, the BAC will be, on average, -0.01270 (obviously meaningless).
(c) Predicted BAC = -0.01270 + 0.017964(6) = 0.0951

Here is computer-generated output for a set of bivariate data. Write a linear model that corresponds to this data set.
$\hat{y} = -0.124 + 0.0179x$

Class Activity: Arm Span vs. Height
"On predicting height given arm span"
Students will measure their height and arm span. Then they will write the LSRL from the data they collected and use it to predict a person's height from their arm span.
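For anyone without a graphing calculator or Minitab, the two fits in these slides can be checked with a short Python sketch. It assumes the standard least-squares formulas in terms of the quantities the slides name, b = r(s_y / s_x) and a = y-bar - b(x-bar). The function name least_squares_line and the variable names are illustrative choices, not part of the original slides.

```python
from statistics import mean, stdev


def least_squares_line(x, y):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n - 1 divisor)
    # Correlation r: sum of products of standardized values, divided by n - 1.
    r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
            for xi, yi in zip(x, y)) / (n - 1)
    b = r * s_y / s_x      # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b, r


# Example #1 data: shoe size (men's U.S.) and height (in).
shoe_size = [7, 10, 12, 8, 9.5, 10.5, 11, 12.5, 13.5, 10]
height = [64, 69, 71, 68, 71, 70, 72, 74, 77, 68]
a, b, r = least_squares_line(shoe_size, height)
predicted_height = a + b * 12.5  # Example #3: shoe size 12.5
print(f"shoe/height: a = {a:.2f}, b = {b:.2f}, r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"predicted height for shoe size 12.5: {predicted_height:.2f} in")

# Practice data: one (beers, BAC) pair per student, students 1 through 16.
beers = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.1, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.1, 0.085, 0.09, 0.01, 0.05]
a_bac, b_bac, r_bac = least_squares_line(beers, bac)
predicted_bac = a_bac + b_bac * 6  # question (c): 6 beers
print(f"beers/BAC: a = {a_bac:.5f}, b = {b_bac:.6f}, r = {r_bac:.4f}")
print(f"predicted BAC after 6 beers: {predicted_bac:.4f}")
```

The coefficients should agree with the calculator and Minitab values quoted in the slides to the precision shown there. The predicted height can differ slightly from 73.865 in the third decimal place because the sketch uses unrounded a and b.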