251progscor 3/25/06
Finding a Sample correlation.
Doing Old Computational Problem 1b on Minitab.
Downing and Clark (formerly pg. 348 now posted at end of 251hwkadd) Old Computational Problem
1: This is obviously sample data, so we compute only a sample covariance and correlation.
b) Compute Cov(x, y) and Corr(x, y).
x 34 26 9 30 47 10 34 34 45 10 47 32 47 8 45
y 6 57 89 60 95 42 31 28 90 25 45 23 52 95 48
The program Samcov will compute the sample covariance and correlation for a set of x and y points. The
setup is as below. x is in C40 and y is in C42; C41 is left empty for the program to fill.
Column Number    C40    C41    C42
Column Label     x             y
Row 1            34            6
Row 2            26            57
Row 3            9             89
Row 4            30            60
Row 5            47            95
Row 6            10            42
Row 7            34            31
Row 8            34            28
Row 9            45            90
Row 10           10            25
Row 11           47            45
Row 12           32            23
Row 13           47            52
Row 14           8             95
Row 15           45            48
The space allocation is as follows for computation of the covariance and correlation.
Location    Contents
C40         $x$ (column)
C41         $x^2$ (column)
C42         $y$ (column)
C43         $y^2$ (column)
C44         $xy$ (column)
K40         $\sum x$
K41         $\sum x^2$
K42         $\sum y$
K43         $\sum y^2$
K44         $\sum xy$
K45         $n$
K46         $\bar{x}$
K47         $\bar{y}$
K48         $s_x^2$
K49         $s_y^2$
K50         $s_{xy}$
K51         $s_x$
K52         $s_y$
K53         $r_{xy}$
K54         $r_{xy}^2$
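The allocation above can be mirrored outside Minitab as a rough cross-check. The following Python sketch is not part of the Samcov exec; the function name `sample_cov_corr` is made up for the example, and the dictionary keys simply borrow the constant names that the exec assigns. It assumes the computational formulas $s_{xy} = (\sum xy - n\bar{x}\bar{y})/(n-1)$ and $r_{xy} = s_{xy}/(s_x s_y)$ that the exec works through.

```python
import math


def sample_cov_corr(x, y):
    """Sample covariance and correlation via the computational formulas.

    Hypothetical helper for illustration; keys follow the exec's K40-K54 names.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    sumx, sumy = sum(x), sum(y)
    sumx2 = sum(v * v for v in x)             # sum of the x^2 column
    sumy2 = sum(v * v for v in y)             # sum of the y^2 column
    sumxy = sum(a * b for a, b in zip(x, y))  # sum of the xy column
    xbar, ybar = sumx / n, sumy / n
    svarx = (sumx2 - n * xbar * xbar) / (n - 1)
    svary = (sumy2 - n * ybar * ybar) / (n - 1)
    scovxy = (sumxy - n * xbar * ybar) / (n - 1)
    sx, sy = math.sqrt(svarx), math.sqrt(svary)
    rxy = scovxy / (sx * sy)
    return {
        "sumx": sumx, "sumx2": sumx2, "sumy": sumy, "sumy2": sumy2,
        "sumxy": sumxy, "n": n, "xbar": xbar, "ybar": ybar,
        "svarx": svarx, "svary": svary, "scovxy": scovxy,
        "sx": sx, "sy": sy, "rxy": rxy, "rxy2": rxy * rxy,
    }


# Data from Old Computational Problem 1.
x = [34, 26, 9, 30, 47, 10, 34, 34, 45, 10, 47, 32, 47, 8, 45]
y = [6, 57, 89, 60, 95, 42, 31, 28, 90, 25, 45, 23, 52, 95, 48]

result = sample_cov_corr(x, y)
for name, value in result.items():
    print(f"{name:7s} {value:.6g}")
```

Run as a script, this prints one line per constant, which can be compared with the named constants that Samcov displays at the end of its run.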
The space allocation is as follows for computation of the variances.
Location    Contents
C1          $x$ (column; input is set up by Samcov)
C2          $x^2$ (column)
K1          $\sum x$
K2          $\sum x^2$
K3          $n$
K4          $\bar{x}$
K5          $s^2$
K6          $s$
K7          $n-1$
K8          $s_{\bar{x}} = \sqrt{s^2/n}$
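To make the variance allocation concrete, here is a minimal Python sketch of the same steps. It is an illustration only, not the var973 exec itself; the function name `variance_constants` is hypothetical, and the keys reuse the labels the exec gives to K1 through K8.

```python
import math


def variance_constants(c1):
    """Compute the quantities stored in K1-K8 for the data in c1.

    Hypothetical helper; keys follow the names assigned by the exec.
    """
    if len(c1) < 2:
        raise ValueError("need at least two observations")
    total = sum(c1)                 # K1: sum of x
    sumsq = sum(v * v for v in c1)  # K2: sum of x^2 (column C2 summed)
    count = len(c1)                 # K3: n
    smean = total / count           # K4: sample mean
    df = count - 1                  # K7: degrees of freedom, n - 1
    svar = (sumsq - count * smean * smean) / df  # K5: sample variance
    sdev = math.sqrt(svar)          # K6: sample standard deviation
    sterr = math.sqrt(svar / count)  # K8: standard error of the mean
    return {
        "sum": total, "sumsq": sumsq, "count": count, "smean": smean,
        "svar": svar, "sdev": sdev, "DF": df, "sterr": sterr,
    }


# x data from Old Computational Problem 1.
x = [34, 26, 9, 30, 47, 10, 34, 34, 45, 10, 47, 32, 47, 8, 45]
stats_x = variance_constants(x)
for name, value in stats_x.items():
    print(f"{name:6s} {value:.6g}")
```

The sketch divides by $n-1$ in a single step, whereas the exec stores the intermediate sum of squares in K5 first and divides afterwards; the final value of K5 should be the same either way.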
Set up a storage area for the program modules, called ‘exec’s by the current version of Minitab.
Load the following modules into the storage area that you have set up for the execs. Each of these is a
separate document in ‘txt’ format. Note that # introduces a comment. The first line gives the name of the
file.
#251samcov.txt Computes sample variances
# and covariances using 'var973'
# Input is x column in C40
# y column in C42
name c40 'x'
name c41 'xsq'
name c42 'y'
name c43 'ysq'
name c44 'xy'
name k40 'sumx'
name k41 'sumx2'
name k42 'sumy'
name k43 'sumy2'
name k44 'sumxy'
name k45 'n'
name k46 'xbar'
name k47 'ybar'
name k48 'svarx'
name k49 'svary'
name k50 'scovxy'
name k51 'sx'
name k52 'sy'
name k53 'rxy'
name k54 'rxy2'
let c1=c40
execute 'var973.txt'
let C41=c2
let k40=k1
let k41=k2
let k46=k4
let k48=k5
let k51=k6
let c1=c42
execute 'var973.txt'
let C43=c2
let k42=k1
let k43=k2
let k45=k3
let k47=k4
let k49=k5
let k52=k6
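let c44=c40*c42 #xy column
let k44=sum(c44) #sum of xy
let k50=k45*k46*k47 #n times xbar times ybar
let k50=k44-k50 #sum of xy minus n*xbar*ybar
let k50=k50/k7 #divide by DF = n-1 (k7 is set by var973)
let k53=k51*k52 #sx times sy
let k53=k50/k53 #rxy = scovxy/(sx*sy)
let k54=k53*k53 #rxy squared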
Print c40-c44
Print k40-k54
end
#var973.txt
#Computes the sample variance of the data in C1.
let k1=sum(c1)
let c2=c1 * c1
let k2=sum(c2)
let k3=count(c1)
let k4=k1/k3 #mean
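let k5=k3*k4*k4 #n times the squared mean
let k5=k2-k5 #sum of squares minus n times the squared mean
print k5 #not yet the variance; k5 is divided by DF below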
name k1 'sum'
name k2 'sumsq'
name k3 'count'
name k4 'smean'
name k5 'svar'
name k6 'sdev'
name k7 'DF'
name k8 'sterr'
let k7 = k3 - 1
let k5 = k5/k7
let k6 = sqrt(k5)
let k8 = k5/k3
let k8 = sqrt(k8)
describe c1
print c1-c2
print k1-k8
end
To use these execs, click on the command area of the screen, open the ‘editor’ pull-down menu and select
‘enable commands.’ Then open the ‘file’ pull-down menu, select ‘other files,’ then ‘run an exec’. Leave the
number of times to execute at one and click on ‘select file.’ Locate samcov – you may have to type it in –
and hit ‘open.’ This will start the execs running. My results follow with comments. The information given at
the beginning of the document was entered by hand before the execs were run. Explanations are indented.
————— 3/25/2006 1:32:15 AM ————————————————————
Welcome to Minitab, press F1 for help.
Executing from file: C:\Documents and Settings\rbove\My
Documents\Minitab\251samcov.txt
Executing from file: var973.txt
Data Display
K5
3105.73
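Var973 is called twice by the main program to compute the sample variances. The display above
is K5 at the point where it is first printed, which is the numerator of the variance of x:
$\sum x^2 - n\bar{x}^2 = 17090 - 15(30.5333)^2 = 3105.73$. It becomes the variance of x only
after it is divided by $n-1 = 14$ later in the exec.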
Descriptive Statistics: C1
Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
C1        15   0  30.53     3.85  14.89     8.00  10.00   34.00  45.00    47.00
Var973 uses the command ‘describe C1’ to get statistics on x. The display above is the value of n,
the number of missing values (N*), the mean, $s_{\bar{x}} = s_x/\sqrt{n}$ or the standard error of
the mean, the standard deviation, and the 5-number summary.
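The 5-number summary in the ‘describe’ display can be reproduced with a short Python sketch. It assumes that the quartiles sit at positions $(n+1)/4$ and $3(n+1)/4$ of the sorted data, with linear interpolation when a position is fractional. That is the rule `statistics.quantiles` applies with `method="exclusive"`; treating it as Minitab's rule is an assumption, not something this handout states. The helper name `five_number_summary` is made up for the example.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Hypothetical helper; quartiles use the (n+1)p position rule,
    which is assumed here to match the 'describe' output.
    """
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return (min(data), q1, median, q3, max(data))


# x data from Old Computational Problem 1.
x = [34, 26, 9, 30, 47, 10, 34, 34, 45, 10, 47, 32, 47, 8, 45]
summary_x = five_number_summary(x)
print(summary_x)
```

With n = 15 the quartile positions are 4, 8 and 12, so no interpolation is needed for this data set.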
Data Display
Row    C1     C2
  1    34   1156
  2    26    676
  3     9     81
  4    30    900
  5    47   2209
  6    10    100
  7    34   1156
  8    34   1156
  9    45   2025
 10    10    100
 11    47   2209
 12    32   1024
 13    47   2209
 14     8     64
 15    45   2025
The display above consists of the $x$ and $x^2$ columns used in the variance computation.
Data Display
sum      458.000
sumsq    17090.0
count    15.0000
smean    30.5333
svar     221.838
sdev     14.8942
DF       14.0000
sterr    3.84567
The display above consists of $\sum x$, $\sum x^2$, $n$, $\bar{x}$, $s_x^2$, $s_x$, $n-1$ and
$s_{\bar{x}} = s_x/\sqrt{n}$.
Executing from file: var973.txt
Data Display
svar
11465.6
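The display above is the numerator of the variance of y, $\sum y^2 - n\bar{y}^2$. It becomes the
variance of y once it is divided by $n-1$.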
Descriptive Statistics: C1
Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
C1        15   0  52.40     7.39  28.62     6.00  28.00   48.00  89.00    95.00
Var973 uses the command ‘describe C1’ to get statistics on y. The display above is the value of n,
the number of missing values (N*), the mean, $s_{\bar{y}} = s_y/\sqrt{n}$ or the standard error of
the mean, the standard deviation, and the 5-number summary.
Data Display
Row    C1     C2
  1     6     36
  2    57   3249
  3    89   7921
  4    60   3600
  5    95   9025
  6    42   1764
  7    31    961
  8    28    784
  9    90   8100
 10    25    625
 11    45   2025
 12    23    529
 13    52   2704
 14    95   9025
 15    48   2304
The display above consists of the $y$ and $y^2$ columns used in the variance computation.
Data Display
sum      786.000
sumsq    52652.0
count    15.0000
smean    52.4000
svar     818.971
sdev     28.6177
DF       14.0000
sterr    7.38905
The display above consists of $\sum y$, $\sum y^2$, $n$, $\bar{y}$, $s_y^2$, $s_y$, $n-1$ and
$s_{\bar{y}} = s_y/\sqrt{n}$.
Data Display
Row     x    xsq     y    ysq     xy
  1    34   1156     6     36    204
  2    26    676    57   3249   1482
  3     9     81    89   7921    801
  4    30    900    60   3600   1800
  5    47   2209    95   9025   4465
  6    10    100    42   1764    420
  7    34   1156    31    961   1054
  8    34   1156    28    784    952
  9    45   2025    90   8100   4050
 10    10    100    25    625    250
 11    47   2209    45   2025   2115
 12    32   1024    23    529    736
 13    47   2209    52   2704   2444
 14     8     64    95   9025    760
 15    45   2025    48   2304   2160
The display above consists of the $x$, $x^2$, $y$, $y^2$ and $xy$ columns.
Data Display
sumx      458.000
sumx2     17090.0
sumy      786.000
sumy2     52652.0
sumxy     23693.0
n         15.0000
xbar      30.5333
ybar      52.4000
svarx     221.838
svary     818.971
scovxy    -21.8714
sx        14.8942
sy        28.6177
rxy       -0.0513127
rxy2      0.00263299
The final display above consists of $\sum x$, $\sum x^2$, $\sum y$, $\sum y^2$, $\sum xy$, $n$, $\bar{x}$, $\bar{y}$, $s_x^2$,
$s_y^2$, $s_{xy}$, $s_x$, $s_y$, $r_{xy}$ and $r_{xy}^2$. We now have all the information we need to fake the following
computation.
obs       x      y    $x^2$    $y^2$    $xy$
1        34      6    1156       36      204
2        26     57     676     3249     1482
3         9     89      81     7921      801
4        30     60     900     3600     1800
5        47     95    2209     9025     4465
6        10     42     100     1764      420
7        34     31    1156      961     1054
8        34     28    1156      784      952
9        45     90    2025     8100     4050
10       10     25     100      625      250
11       47     45    2209     2025     2115
12       32     23    1024      529      736
13       47     52    2209     2704     2444
14        8     95      64     9025      760
15       45     48    2025     2304     2160
Total   458    786   17090    52652    23693
$\sum x = 458$, $\sum x^2 = 17090$, $\sum y = 786$, $\sum y^2 = 52652$ and $\sum xy = 23693$.
$$\bar{x} = \frac{\sum x}{n} = \frac{458}{15} = 30.5333$$
$$\bar{y} = \frac{\sum y}{n} = \frac{786}{15} = 52.400$$
$$s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{17090 - 15(30.5333)^2}{14} = \frac{3105.73}{14} = 221.838$$
$$s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n-1} = \frac{52652 - 15(52.400)^2}{14} = \frac{11465.6}{14} = 818.9714$$
$$s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{23693 - 15(30.5333)(52.400)}{14} = \frac{-306.2}{14} = -21.8714$$
and
$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-21.8714}{\sqrt{221.838}\sqrt{818.9714}} = -0.05131 .$$
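The constants scovxy, rxy and rxy2 in the final Minitab display can also be checked against the definitional formulas, which work from deviations about the means rather than from raw sums. The Python sketch below is an independent check under that assumption and is not part of the execs; the variable names are chosen for the example only.

```python
import math

# Data from Old Computational Problem 1.
x = [34, 26, 9, 30, 47, 10, 34, 34, 45, 10, 47, 32, 47, 8, 45]
y = [6, 57, 89, 60, 95, 42, 31, 28, 90, 25, 45, 23, 52, 95, 48]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Sums of squared deviations and of cross-products of deviations.
ss_x = sum((a - xbar) ** 2 for a in x)
ss_y = sum((b - ybar) ** 2 for b in y)
sp_xy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))

s_xy = sp_xy / (n - 1)                # sample covariance
r_xy = sp_xy / math.sqrt(ss_x * ss_y)  # sample correlation

print(f"s_xy  = {s_xy:.4f}")
print(f"r_xy  = {r_xy:.7f}")
print(f"r_xy2 = {r_xy ** 2:.8f}")
```

Because the $n-1$ divisors cancel in the ratio, the correlation can be computed directly from the deviation sums without forming the variances first.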