LP 130123 Y12 39 Linear Scaling of Regression Data

“Teach A Level Maths”
Statistics 1
Linear Scaling of
Regression Data
© Christine Crisp
AQA
EDEXCEL
OCR
"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with
permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages"
Linear Scaling of Regression Data
We may want to change the units that have been used
when collecting data.
For example, we may want kilometres instead of miles or
kilograms instead of pounds.
Sometimes we may simply want to reduce the size of the
numbers in data items.
In both these cases we talk about scaling or coding the
data.
When dealing with regression lines, we can alter a
regression line to different units without converting the
original data.
A researcher has the following data giving the average daily
maximum temperature for each month in 1980 and the
corresponding figures for the sales of milk in a supermarket.
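| Month | J | F | M | A | M | J | J | A | S | O | N | D |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Temp (°F), F | 40 | 38 | 46 | 55 | 60 | 70 | 65 | 69 | 64 | 58 | 47 | 49 |
| Sales (thousands of pints), y | 8 | 9 | 8 | 5 | 6 | 3 | 4 | 2 | 3 | 7 | 8 | 10 |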
The y on F regression line is $y = 17.84 - 0.2134F$
The correlation coefficient is $-0.89$.
Recent data are measured in degrees Celsius and
thousands of litres and the researcher wants to compare
the sets of results by converting the older ones.
The conversion from degrees Fahrenheit to degrees Celsius is
$$C = \frac{5}{9}(F - 32)$$
To convert from pints to litres, we must divide by 1.76 so, if Y is the new variable,
$$Y = \frac{y}{1.76}$$
As the conversions are both linear, instead of converting
all the data we can simply substitute into the regression
line.
We first need to rearrange both conversion equations.
$$Y = \frac{y}{1.76} \;\Rightarrow\; y = 1.76Y$$
$$C = \frac{5}{9}(F - 32) \;\Rightarrow\; \frac{9}{5}C = F - 32 \;\Rightarrow\; F = \frac{9}{5}C + 32$$
Substituting in $y = 17.84 - 0.2134F$:
$$1.76Y = 17.84 - 0.2134\left(\frac{9}{5}C + 32\right)$$
Simplifying:
$$1.76Y = 17.84 - 0.3841C - 6.83$$
$$1.76Y = 11.01 - 0.3841C$$
$$Y = 6.26 - 0.2183C$$
As we have only a small amount of data, we can check
the effect of the scaling by converting the data and
drawing both regression lines.
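The check described above can also be done numerically rather than graphically. The following sketch compares the two routes: substituting the conversions into the fitted line, and converting every data point before refitting. The helper `fit_line` and all variable names are illustrative assumptions.

```python
temp_f = [40, 38, 46, 55, 60, 70, 65, 69, 64, 58, 47, 49]
sales_pints = [8, 9, 8, 5, 6, 3, 4, 2, 3, 7, 8, 10]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b x."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b = sxy / sxx
    a = sum(ys) / n - b * sum(xs) / n
    return a, b


# Route 1: substitute y = 1.76Y and F = (9/5)C + 32 into y = a + bF,
# giving 1.76Y = (a + 32b) + (9/5)bC, then divide through by 1.76.
a, b = fit_line(temp_f, sales_pints)
a_sub = (a + 32 * b) / 1.76
b_sub = (9 / 5) * b / 1.76

# Route 2: convert every data point, then fit a new line.
temp_c = [5 / 9 * (f - 32) for f in temp_f]
sales_litres = [y / 1.76 for y in sales_pints]
a_fit, b_fit = fit_line(temp_c, sales_litres)

print(f"substituted: Y = {a_sub:.2f} + ({b_sub:.4f})C")
print(f"refitted:    Y = {a_fit:.2f} + ({b_fit:.4f})C")
```

Both routes should agree up to floating-point rounding, which is why substitution is enough when the conversions are linear.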
[Figure: graphs showing milk sales against temperature, one with the regression line $y = 17.84 - 0.2134F$ and one with $Y = 6.26 - 0.2183C$.]
Product Moment Correlation Coefficient (p.m.c.c.)
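The correlation coefficient measures how closely the points lie to a
straight line, not the units they are measured in, so it is not altered
by linear scaling with positive scale factors such as those used here.
( Although the scales are different on the diagrams, we
can see that the scatter of the points is unchanged. )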
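The claim about the p.m.c.c. can be tested on the milk data by computing the coefficient before and after conversion. This is a minimal sketch; the function name `pmcc` and the variable names are assumptions made for illustration.

```python
temp_f = [40, 38, 46, 55, 60, 70, 65, 69, 64, 58, 47, 49]
sales_pints = [8, 9, 8, 5, 6, 3, 4, 2, 3, 7, 8, 10]


def pmcc(xs, ys):
    """Product moment correlation coefficient, S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return sxy / (sxx * syy) ** 0.5


# Apply the two linear conversions from the slides to every data point.
temp_c = [5 / 9 * (f - 32) for f in temp_f]
sales_litres = [y / 1.76 for y in sales_pints]

r_original = pmcc(temp_f, sales_pints)
r_converted = pmcc(temp_c, sales_litres)
print(round(r_original, 4), round(r_converted, 4))
```

The two values should agree up to floating-point rounding, because the scale factors cancel between the numerator and the denominator.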
e.g. 2 A set of data connects two variables, p and t.
However, in order to calculate a regression line, the data
have been coded using the formulae
$$x = 60t \quad\text{and}\quad y = p - 1000$$
If the regression line of y on x is $y = 3.29 + 4.15x$,
find the equation of the regression line for p on t.
Solution:
Substitute for y and x:
$$y = 3.29 + 4.15x \;\Rightarrow\; p - 1000 = 3.29 + 4.15(60t)$$
$$\Rightarrow\; p = 1003.29 + 249t$$
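The substitution in e.g. 2 follows a pattern that holds for any linear coding. As a sketch, suppose the coding has the form x = x_scale·t + x_shift and y = y_scale·p + y_shift; the function `decode_line` and its parameter names are hypothetical, introduced only to illustrate the algebra.

```python
def decode_line(a, b, x_scale, x_shift, y_scale, y_shift):
    """Undo a linear coding of a regression line.

    Given y = a + b x, with x = x_scale * t + x_shift and
    y = y_scale * p + y_shift, return (intercept, slope) of the
    line p = intercept + slope * t obtained by substitution.
    """
    intercept = (a + b * x_shift - y_shift) / y_scale
    slope = b * x_scale / y_scale
    return intercept, slope


# e.g. 2: x = 60t and y = p - 1000, with y = 3.29 + 4.15x.
intercept, slope = decode_line(3.29, 4.15,
                               x_scale=60, x_shift=0,
                               y_scale=1, y_shift=-1000)
print(f"p = {intercept:.2f} + {slope:.0f}t")
```

Writing the coding in this form makes it clear that only the scale factors affect the gradient, while the shifts affect only the intercept.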
SUMMARY
To convert a scaled or coded regression equation,
substitute for the variables using the conversion
formulae.
The product moment correlation coefficient (p.m.c.c.)
is not changed by linear scaling.
Exercise
1. Data for two variables, v and z, have been scaled
using the formulae
$$x = \frac{v}{100}, \qquad y = z - 1000$$
If the equation of the resulting y on x regression line
is
$$y = 0.44 + 2.16x$$
find the equation of the regression line of z on v.
Solution:
$y = 0.44 + 2.16x$ becomes
$$z - 1000 = 0.44 + 2.16\left(\frac{v}{100}\right)$$
$$\Rightarrow\; z = 1000.44 + 0.0216v$$
Exercise
2. A company rep. records the distance travelled, m
miles, and time taken, t minutes, for 5 journeys. The
data were converted to kilometres and hours and are
summarised below. The formulae for the conversions are
$$x = \frac{8m}{5} \quad\text{and}\quad y = \frac{t}{60}$$
$$\sum x = 1422, \quad \sum y = 32.4, \quad \sum x^2 = 462464, \quad \sum xy = 10326, \quad n = 5$$
(a) Find the equation of the y on x regression line, giving
the values of the constants correct to 2 d.p.
(b) Use your answer to (a) to find the equation of the
regression line of t on m.
(c) What effect would the conversion have on the product
moment correlation coefficient?
Solution:
$$\sum x = 1422, \quad \sum y = 32.4, \quad \sum x^2 = 462464, \quad \sum xy = 10326$$
(a) The y on x regression line:
$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 1111.44$$
$$S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n} = 58047.2$$
$$b = \frac{S_{xy}}{S_{xx}} = 0.0191$$
$$a = \bar{y} - b\bar{x} = 1.0345$$
$$y = a + bx \;\Rightarrow\; y = 1.03 + 0.02x$$
(b) The equation of the regression line of t on m.
$x = \frac{8m}{5}$, $y = \frac{t}{60}$, so $y = 1.03 + 0.02x$ becomes
$$\frac{t}{60} = 1.03 + 0.02\left(\frac{8m}{5}\right)$$
$$\Rightarrow\; t = 61.8 + 1.92m$$
(c) The conversion has no effect on the p.m.c.c.
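The arithmetic in Exercise 2 can be reproduced from the summary statistics alone. The sketch below follows the worked solution, including its choice to round the constants to 2 d.p. before substituting in part (b); the variable names are illustrative assumptions.

```python
# Summary statistics given in Exercise 2 (x in kilometres, y in hours).
n = 5
sum_x, sum_y = 1422, 32.4
sum_x2, sum_xy = 462464, 10326

# (a) y on x regression line.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# (b) Substitute x = 8m/5 and y = t/60 into y = a + bx, using the
# constants rounded to 2 d.p. as the worked solution does:
# t/60 = a + b(8m/5), so t = 60a + (60 * 8/5) b m.
a_2dp, b_2dp = round(a, 2), round(b, 2)
t_intercept = 60 * a_2dp
t_slope = 60 * b_2dp * 8 / 5

print(f"y = {a_2dp} + {b_2dp}x")
print(f"t = {t_intercept:.1f} + {t_slope:.2f}m")
```

Because part (b) starts from the rounded constants 1.03 and 0.02, the result is sensitive to that rounding: carrying the unrounded values of a and b through the same substitution would give roughly t = 62.07 + 1.84m.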