Regression

• To be able to calculate the regression line of y on x
• To be able to interpret the equation of the regression line

By the end of the lesson you should be able to answer this regression exam question

The meaning of the regression line

[Figure: Carbon percentages against melting point. Scatter plot with carbon percentage on the x axis and melting point on the y axis, with the fitted line y = -4.0476x + 36.214 and the residuals drawn as red lines.]

Rather than drawing a line of best fit by eye, you can calculate and accurately plot a line that minimises the distances between the plotted data points and the line.

The vertical distances between the points and the line of best fit are called residuals (the red lines). The regression line minimises the sum of the squares of these residuals, which is why it is sometimes called the least squares regression line.

$$y = a + bx$$

a is the value of the melting point when carbon is 0%.
b is the rate at which the melting point changes as the carbon percentage increases. Here b is negative, so the melting point reduces as the carbon percentage increases.
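The least squares idea can be written in symbols. This is a sketch only: the label $e_i$ for the residual of the i-th data point $(x_i, y_i)$ is an assumption introduced here, not notation from the slides.

$$e_i = y_i - (a + b x_i)$$

$$\text{choose } a \text{ and } b \text{ to minimise } \sum_{i=1}^{n} e_i^2$$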

The meaning of the regression line

The y axis shows the dependent (or response) variable, here the melting point. Its values are determined by the x values.

The x axis shows the independent (or explanatory) variable, here the carbon percentage. It is set independently of the other variable.

Important formulae

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$$

$$b = \frac{S_{xy}}{S_{xx}}$$

$$a = \bar{y} - b\bar{x}$$

Regression line equation:

$$y = a + bx$$
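As a minimal sketch, the formulae can be turned into a short Python function that works from the summary statistics. The function name `regression_from_sums` and the three check points are made up for illustration. The points are chosen to lie exactly on y = 1 + 2x so the result is easy to verify by hand.

```python
def regression_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (a, b) for the line y = a + bx from summary statistics."""
    s_xx = sum_x2 - sum_x ** 2 / n      # Sxx = Σx² - (Σx)²/n
    s_xy = sum_xy - sum_x * sum_y / n   # Sxy = Σxy - ΣxΣy/n
    b = s_xy / s_xx                     # gradient
    x_bar = sum_x / n                   # mean of x
    y_bar = sum_y / n                   # mean of y
    a = y_bar - b * x_bar               # intercept
    return a, b


# Made-up check data: (1, 3), (2, 5), (3, 7) lie exactly on y = 1 + 2x.
xs = [1, 2, 3]
ys = [3, 5, 7]
a, b = regression_from_sums(
    n=len(xs),
    sum_x=sum(xs),
    sum_y=sum(ys),
    sum_x2=sum(x * x for x in xs),
    sum_xy=sum(x * y for x, y in zip(xs, ys)),
)
print(f"y = {a:.1f} + {b:.1f}x")
```

Working from the sums mirrors exam questions, where the summary statistics are usually given rather than the raw data.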

Example

The results from an experiment in which different masses were placed on a spring, and the resulting length of the spring measured, are shown below.

Mass, x (kg)      20    40     60     80     100
Length, y (cm)    48    55.1   56.3   61.2   68

$\sum x = 300$, $\sum x^2 = 22000$, $\bar{x} = 60$, $\sum xy = 18238$, $\sum y^2 = 16879.14$, $\sum y = 288.6$, $\bar{y} = 57.72$

a) Calculate $S_{xx}$ and $S_{xy}$
b) Calculate the regression line of y on x
c) Calculate the length of the spring when a mass of 50kg is added
d) Calculate the length of the spring when a mass of 140kg is added. Give a reason why this may or may not be a reliable answer.

a) Calculate $S_{xx}$ and $S_{xy}$

$$S_{xx} = 22000 - \frac{300^2}{5} = 4000$$

$$S_{xy} = 18238 - \frac{300 \times 288.6}{5} = 922$$

b) Calculate the regression line of y on x

$$b = \frac{S_{xy}}{S_{xx}} = \frac{922}{4000} = 0.2305$$

$$a = \bar{y} - b\bar{x} = 57.72 - 0.2305 \times 60 = 43.89$$

$$y = 43.89 + 0.2305x$$
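The working for parts a) and b) can be checked from the raw data with a few lines of Python. This is a sketch rather than part of the original solution, and the variable names are chosen here for illustration.

```python
xs = [20, 40, 60, 80, 100]        # mass, x (kg)
ys = [48, 55.1, 56.3, 61.2, 68]   # length, y (cm)
n = len(xs)

# Sxx and Sxy from the sums, as in part a)
s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

# Gradient and intercept, as in part b)
b = s_xy / s_xx
a = sum(ys) / n - b * sum(xs) / n

print(f"Sxx = {s_xx:.0f}, Sxy = {s_xy:.0f}")
print(f"y = {a:.2f} + {b:.4f}x")
```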

c) Calculate the length of the spring when a mass of 50kg is added

$$y = 43.89 + 0.2305 \times 50 = 55.415 \text{ cm}$$

d) Calculate the length of the spring when a mass of 140kg is added. Give a reason why this is or is not a reliable answer

$$y = 43.89 + 0.2305 \times 140 = 76.16 \text{ cm}$$

This may not be a reliable answer because it has been calculated using extrapolation: 140kg is outside the range of the data (20kg to 100kg) that was used to calculate the regression line.
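A small sketch of the prediction step, using the fitted line y = 43.89 + 0.2305x and the data range of 20kg to 100kg. The function name `predict_length` and its flag for extrapolation are assumptions made for this illustration.

```python
def predict_length(mass_kg, a=43.89, b=0.2305, x_min=20, x_max=100):
    """Predict spring length (cm) and flag whether the mass is outside the data range."""
    length_cm = a + b * mass_kg
    extrapolated = not (x_min <= mass_kg <= x_max)
    return length_cm, extrapolated


for mass in (50, 140):
    length, extrapolated = predict_length(mass)
    label = "extrapolation" if extrapolated else "interpolation"
    print(f"{mass} kg: {length:.3f} cm ({label})")
```

The flag only reports whether the mass lies inside the observed range. It says nothing about how well the line fits within that range.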