“Teach A Level Maths” Statistics 1
Linear Scaling of Regression Data
© Christine Crisp

Statistics 1: AQA, EDEXCEL, OCR

"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages"

We may want to change the units that were used when the data were collected. For example, we may want kilometres instead of miles or kilograms instead of pounds. Sometimes we may simply want to reduce the size of the numbers in the data items. In both cases we talk about scaling or coding the data.
When dealing with regression lines, we can convert a regression line to different units without converting the original data.

A researcher has the following data giving the average daily maximum temperature for each month in 1980 and the corresponding figures for the sales of milk in a supermarket.

Month                           J   F   M   A   M   J   J   A   S   O   N   D
Temp (°F), F                   40  38  46  55  60  70  65  69  64  58  47  49
Sales (thousands of pints), y   8   9   8   5   6   3   4   2   3   7   8  10

The y on F regression line is
y = 17·84 − 0·2134F
The correlation coefficient is −0·89.

Recent data are measured in degrees Celsius and thousands of litres, and the researcher wants to compare the two sets of results by converting the older ones.

The conversion from degrees Fahrenheit to degrees Celsius is
C = (5/9)(F − 32)
To convert from pints to litres we must divide by 1·76 so, if Y is the new variable,
Y = y/1·76
As the conversions are both linear, instead of converting all the data we can simply substitute into the regression line.
We first need to rearrange both conversion equations.
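Because both conversions are linear, the substitution can also be written once for a general line. The sketch below is illustrative only: the function name `rescale_line` and its parameters are hypothetical and not part of the original slides. It assumes the old variables are expressed in terms of the new ones as F = pC + q and y = kY, so that y = a + bF becomes Y = (a + bq)/k + (bp/k)C.

```python
def rescale_line(a, b, p, q, k):
    """Rewrite y = a + b*F in new units, assuming F = p*C + q and y = k*Y.

    Returns (intercept, slope) of the converted line Y = intercept + slope*C.
    """
    intercept = (a + b * q) / k  # constant terms collected, then divided by k
    slope = b * p / k            # coefficient of C, then divided by k
    return intercept, slope


# Milk example: F = (9/5)C + 32 and y = 1.76Y
A, B = rescale_line(a=17.84, b=-0.2134, p=9 / 5, q=32, k=1.76)
print(f"Y = {A:.2f} + ({B:.4f})C")
```

The same function covers pure changes of scale (q = 0) and pure shifts (p = 1, k = 1), which is why one substitution rule is enough for all the examples in this topic.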
Rearranging:
Y = y/1·76  ⇒  y = 1·76Y
C = (5/9)(F − 32)  ⇒  (9/5)C = F − 32  ⇒  F = (9/5)C + 32

Substituting in y = 17·84 − 0·2134F:
1·76Y = 17·84 − 0·2134((9/5)C + 32)
Simplifying:
1·76Y = 17·84 − 0·3841C − 6·83
1·76Y = 11·01 − 0·3841C
Y = 6·26 − 0·2183C

As we have only a small amount of data, we can check the effect of the scaling by converting the data and drawing both regression lines.

[Graphs showing milk sales against temperature, with the regression lines y = 17·84 − 0·2134F and Y = 6·26 − 0·2183C]

Product Moment Correlation Coefficient (p.m.c.c.)
The correlation coefficient measures how closely the points are scattered about a straight line, so it is not altered by linear scaling (provided, as here, both scale factors are positive).
(Although the scales are different on the diagrams, we can see that the scatter of the points is unchanged.)

e.g.2 A set of data connects two variables, p and t. However, in order to calculate a regression line, the data have been coded using the formulae
x = 60t and y = p − 1000
If the regression line of y on x is y = 3·29 + 4·15x, find the equation of the regression line for p on t.
Solution: Substitute for y and x:
y = 3·29 + 4·15x
p − 1000 = 3·29 + 4·15(60t)
p = 1003·29 + 249t

SUMMARY
To convert a scaled or coded regression equation, substitute for the variables using the conversion formulae.
The product moment correlation coefficient (p.m.c.c.) is not changed by linear scaling.

Exercise 1.
Data for 2 variables, v and z, have been scaled using the formulae
x = v/100, y = z − 1000
If the equation of the resulting y on x regression line is y = 0·44 + 2·16x, find the equation of the regression line of z on v.
Solution:
y = 0·44 + 2·16x becomes
z − 1000 = 0·44 + 2·16(v/100)
z = 1000·44 + 0·0216v

Exercise 2.
A company rep. records the distance travelled, m miles, and time taken, t minutes, for 5 journeys. The data were converted to kilometres and hours and are summarised below.
The formulae for the conversions are
x = 8m/5 and y = t/60
Σx = 1422, Σy = 32·4, Σx² = 462464, Σxy = 10326, n = 5
(a) Find the equation of the y on x regression line, giving the values of the constants correct to 2 d.p.
(b) Use your answer to (a) to find the equation of the regression line of t on m.
(c) What effect would the conversion have on the product moment correlation coefficient?

Solution:
Σx = 1422, Σy = 32·4, Σx² = 462464, Σxy = 10326, n = 5
(a) The y on x regression line:
b = Sxy/Sxx, where
Sxy = Σxy − (Σx)(Σy)/n = 1111·44
Sxx = Σx² − (Σx)²/n = 58047·2
so b = 1111·44/58047·2 = 0·0191
a = ȳ − b x̄ = 1·0345
y = a + bx  ⇒  y = 1·03 + 0·02x
(b) The equation of the regression line of t on m:
x = 8m/5 and y = t/60, so y = 1·03 + 0·02x becomes
t/60 = 1·03 + 0·02(8m/5)
t = 61·8 + 1·92m
(c) The conversion has no effect on the p.m.c.c.
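Exercise 2 can be checked numerically from the summary statistics alone. The sketch below is a rough check and not part of the original slides: the helper name `line_from_sums` is hypothetical, and the conversions are taken to be x = 8m/5 and y = t/60.

```python
def line_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Least-squares y-on-x line y = a + b*x from summary statistics."""
    s_xy = sum_xy - sum_x * sum_y / n   # Sxy
    s_xx = sum_x2 - sum_x ** 2 / n      # Sxx
    b = s_xy / s_xx                     # gradient
    a = sum_y / n - b * sum_x / n       # intercept: y-bar minus b times x-bar
    return a, b


a, b = line_from_sums(n=5, sum_x=1422, sum_y=32.4, sum_x2=462464, sum_xy=10326)

# Round to 2 d.p. as part (a) asks, then undo the coding:
# t/60 = a + b*(8m/5)  =>  t = 60a + (60 * 8/5 * b)m
a2, b2 = round(a, 2), round(b, 2)
t_intercept = 60 * a2
t_slope = 60 * 8 / 5 * b2
print(f"y = {a2} + {b2}x")
print(f"t = {t_intercept:.1f} + {t_slope:.2f}m")
```

Rounding before converting mirrors part (b), which asks for the answer to (a) to be used; converting the unrounded constants would give slightly different values.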