12/1/98 252z9861

2. The data below consists of 1000 monthly salaries.
a. Use a chi-square test to see if they have a normal distribution with a mean of 1200 and a standard
deviation of 200. Note: You found the probabilities for the first three intervals on page 1. You can find
the others by symmetry. (6)
b. Would a Lilliefors or a Kolmogorov-Smirnov test be a correct, more powerful method to test this
hypothesis? Use the one of these that is appropriate to do the test in a). (6)

Interval          Probability   Expected   Observed
Less than $800       .0228         22.8        26
$800-$1000           .1359        135.9       146
$1000-$1200          .3413        341.3       361
$1200-$1400          .3413        341.3       311
$1400-$1600          .1359        135.9       143
$1600 or above       .0228         22.8        13
Sum                 1.0000       1000.0      1000
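The interval probabilities in the table can be checked numerically. The sketch below is only an illustration: it assumes the N(1200, 200) distribution named in the problem, builds the normal cumulative distribution from `math.erf`, and uses the hypothetical names `normal_cdf`, `cuts`, `probs` and `expected`, none of which come from the exam.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mean, sd = 1200.0, 200.0
cuts = [800.0, 1000.0, 1200.0, 1400.0, 1600.0]  # interval boundaries in dollars

# Cumulative probabilities at each boundary, padded with 0 and 1
# so that the two open-ended tail intervals are included.
cdf = [0.0] + [normal_cdf(c, mean, sd) for c in cuts] + [1.0]

# Probability of each interval is the difference of adjacent cumulative values.
probs = [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]

# Expected counts for a sample of 1000 salaries.
expected = [1000 * p for p in probs]

for p, e in zip(probs, expected):
    print(f"{p:.4f}  {e:.1f}")
```

The last three probabilities mirror the first three, which is the symmetry the note in part a) refers to.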
Solution: a) The \(\chi^2\) test is done two ways.

   O        E      E - O   (E - O)^2   (E - O)^2/E     O^2/E
   26     22.8     -3.2      10.240      0.44912       29.649
  146    135.9    -10.1     102.010      0.75063      156.851
  361    341.3    -19.7     388.090      1.13709      381.837
  311    341.3     30.3     918.090      2.68998      283.390
  143    135.9     -7.1      50.410      0.37094      150.471
   13     22.8      9.8      96.040      4.21228        7.412
 1000   1000.0      0.0                  9.61005     1009.610

From the last column \(\chi^2 = 1009.61 - 1000 = 9.61\), which is the same as the result in the fifth column. \(H_0\) is
\(N(1200, 200)\) and, because there are 6 rows and no parameters are estimated from the data, there are 5
degrees of freedom. Since \(\chi^{2(5)}_{.05} = 11.0705\) is larger than our computed \(\chi^2\), we do not reject \(H_0\).
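As a check on the arithmetic, the two ways of computing the statistic can be sketched in a few lines. This is a minimal illustration, not part of the original solution: it takes the tabled probabilities and observed counts as given, and the critical value 11.0705 is entered by hand from the chi-square table rather than computed. The variable names are hypothetical.

```python
# Tabled interval probabilities under N(1200, 200) and observed counts.
probs = [0.0228, 0.1359, 0.3413, 0.3413, 0.1359, 0.0228]
observed = [26, 146, 361, 311, 143, 13]

n = sum(observed)                  # 1000 salaries
expected = [n * p for p in probs]  # expected counts E

# Method 1: sum of (E - O)^2 / E.
chi2_direct = sum((e - o) ** 2 / e for o, e in zip(observed, expected))

# Method 2: shortcut, sum of O^2 / E minus n.
chi2_shortcut = sum(o ** 2 / e for o, e in zip(observed, expected)) - n

df = len(observed) - 1             # no parameters estimated from the data
critical = 11.0705                 # chi-square .05 point for 5 df, from the table

print(round(chi2_direct, 4), round(chi2_shortcut, 4), df)
print("reject H0" if chi2_direct > critical else "do not reject H0")
```

The two methods agree because the expected counts sum to the same total as the observed counts.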
b) You must use a Kolmogorov-Smirnov test because the mean and variance are known.
Lilliefors is only appropriate if they are unknown and must be estimated from the data.
The table for the test appears below. The probabilities and O are repeated in the first and third columns. A
cumulative distribution is computed for E in the second column and a cumulative distribution for O is
computed in the fifth column. Finally \(D = |F_e - F_o|\) is computed in the last column.

Probability     Fe        O      O/n       Fo        D
   .0228      .0228      26     .0260     .0260     .0032
   .1359      .1587     146     .1460     .1720     .0133
   .3413      .5000     361     .3610     .5330     .0330
   .3413      .8413     311     .3110     .8440     .0027
   .1359      .9772     143     .1430     .9870     .0098
   .0228     1.0000      13     .0130    1.0000     .0000
  1.0000               1000    1.0000

\(H_0\) is again \(N(1200, 200)\). The maximum difference is .0330 and we compare it to the critical value from
the K-S table. For a significance level of 5% this is \(\frac{1.36}{\sqrt{n}} = \frac{1.36}{\sqrt{1000}} = .0430\). Since the maximum difference
is smaller than the critical value we do not reject \(H_0\).
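The Kolmogorov-Smirnov comparison on the grouped salary data can be sketched as follows. This is an illustrative check only: it assumes the tabled interval probabilities, uses the large-sample 5% critical value 1.36 divided by the square root of n quoted in the solution, and all variable names are hypothetical.

```python
import math
from itertools import accumulate

# Tabled interval probabilities under H0 and observed counts per interval.
probs = [0.0228, 0.1359, 0.3413, 0.3413, 0.1359, 0.0228]
observed = [26, 146, 361, 311, 143, 13]
n = sum(observed)

# Cumulative expected distribution Fe and cumulative observed distribution Fo.
fe = list(accumulate(probs))
fo = [c / n for c in accumulate(observed)]

# D is the absolute gap between the two cumulative distributions at each boundary.
d = [abs(e - o) for e, o in zip(fe, fo)]
d_max = max(d)

# Large-sample 5% critical value from the K-S table.
critical = 1.36 / math.sqrt(n)

print(round(d_max, 4), round(critical, 4))
print("reject H0" if d_max > critical else "do not reject H0")
```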
3. The data below represents samples of consumer ratings of three different displays.
a. Assuming that each column represents a random sample from a normal distribution and that variances of
the parent populations are similar, compare the means of the three populations. (9) Column sums are now
given:
\(\sum x_1 = 225\), \(\sum x_1^2 = 10475\), \(\sum x_2 = 175\), \(\sum x_2^2 = 6375\), \(\sum x_3 = 180\), \(\sum x_3^2 = 6850\).
b. Do a confidence interval for the difference between mean ratings of the first and third displays,
assuming that the interval is one of three possible contrasts. (3)
c. Assume that each row represents the opinions of a single individual and find if there are significant
differences between display means and means for individuals. (8)
Solution: New material is added in boldface.

Individual   Display 1   Display 2   Display 3     Sum     n_i      Mean        SS       Mean^2
    1            50          45          45        140      3     46.6667     6550    2177.7778
    2            45          30          35        110      3     36.6667     4150    1344.4444
    3            30          25          20         75      3     25.0000     1925     625.0000
    4            45          35          40        120      3     40.0000     4850    1600.0000
    5            55          40          40        135      3     45.0000     6225    2025.0000
Sum             225         175         180        580     15    (38.6667)   23700    7772.2222
n_j               5           5           5         15
Mean             45          35          36     (38.6667)
SS            10475        6375        6850      23700
Mean^2         2025        1225        1296       4546

SS is the sum of squared ratings in a row or column. The overall mean \(\bar{x} = 38.6667\) is shown in parentheses.

a) One-way ANOVA
\(SST = \sum x_{ijk}^2 - n\bar{x}^2 = 23700 - 15(38.6667)^2 = 1273.2947\)
\(SSB = \sum n_j \bar{x}_{.j}^2 - n\bar{x}^2 = 5(4546) - 15(38.6667)^2 = 303.2947\)
\(SSW = SST - SSB = 970\)

Source        SS         DF      MS         F
Between     303.2947      2    151.647    1.876
Within      970          12     80.833
Total      1273.2947     14

Since \(F^{(2,12)}_{.05} = 3.89\) is larger than the computed F of 1.876, we do not reject \(H_0\) which is 'no
difference between display means.'
b) From the outline, the Scheffe interval is
\[\mu_1 - \mu_3 = (\bar{x}_1 - \bar{x}_3) \pm \sqrt{(m-1)F^{(m-1,\,n-m)}_{\alpha}}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_3}}\]
\[= (45 - 36) \pm \sqrt{2(3.89)(80.8333)\left(\frac{1}{5} + \frac{1}{5}\right)} = 9 \pm 15.86,\]
where \(s^2 = MSW = 80.8333\) and \(m = 3\) is the number of displays. Because the interval includes zero, the difference
between the first and third displays is not significant.
c) Two-way ANOVA
\(SST\) = Same as 1-way ANOVA \(= 1273.2947\)
\(SSR = C\sum \bar{x}_{i.}^2 - n\bar{x}^2 = 3(7772.2222) - 15(38.6667)^2 = 889.9613\)
\(SSC = R\sum \bar{x}_{.j}^2 - n\bar{x}^2 = 5(4546) - 15(38.6667)^2 = 303.2947\)
\(SSW = SST - SSR - SSC = 80.0387\)

Source        SS         DF      MS          F
Rows        889.9613      4    222.4903    22.24
Columns     303.2947      2    151.6474    15.16
Within       80.0387      8     10.0048
Total      1273.2947     14

Since \(F^{(4,8)}_{.05} = 3.84\) is smaller than 22.24, we reject \(H_{01}\) which is 'no
difference between means for individuals.' Since \(F^{(2,8)}_{.05} = 4.46\) is smaller than 15.16, we reject
\(H_{02}\) which is 'no difference between display means.'
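The sums of squares for both ANOVA layouts can be recomputed directly from the ratings. The sketch below is an illustration under stated assumptions: it uses exact arithmetic, so its results differ in the second decimal place from hand calculations that round the overall mean to 38.6667, and the tabled value `f_crit = 3.89` for F at the 5% level with 2 and 12 degrees of freedom is entered by hand. All names are hypothetical.

```python
import math

# Rows are individuals, columns are Displays 1, 2 and 3.
data = [
    [50, 45, 45],
    [45, 30, 35],
    [30, 25, 20],
    [45, 35, 40],
    [55, 40, 40],
]
R, C = len(data), len(data[0])
n = R * C

grand_mean = sum(map(sum, data)) / n
col_means = [sum(row[j] for row in data) / R for j in range(C)]
row_means = [sum(row) / C for row in data]

# Sums of squares by the computational formulas.
correction = n * grand_mean ** 2
sst = sum(x * x for row in data for x in row) - correction
ssc = R * sum(m * m for m in col_means) - correction  # between displays
ssr = C * sum(m * m for m in row_means) - correction  # between individuals

# One-way ANOVA: columns are the only factor.
ssw1 = sst - ssc
msw1 = ssw1 / (n - C)
f_oneway = (ssc / (C - 1)) / msw1

# Scheffe half-width for Display 1 versus Display 3.
f_crit = 3.89  # assumption: tabled F(.05; 2, 12), typed in by hand
half_width = math.sqrt((C - 1) * f_crit * msw1 * (1 / R + 1 / R))
diff = col_means[0] - col_means[2]

# Two-way ANOVA: rows and columns, one observation per cell.
ssw2 = sst - ssr - ssc
msw2 = ssw2 / ((R - 1) * (C - 1))
f_rows = (ssr / (R - 1)) / msw2
f_cols = (ssc / (C - 1)) / msw2

print(round(f_oneway, 3), round(f_rows, 2), round(f_cols, 2))
print(diff, "+/-", round(half_width, 2))
```

Removing the variation between individuals shrinks the within mean square from about 80.8 to about 10, which is why the display effect is significant in the two-way layout but not in the one-way layout.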
4. Data is repeated from the previous problem.
a. Drop the assumption of normality and compare the columns assuming that they represent random
samples. (5)
b. Do the same assuming that each row represents the opinion of one consumer. (5)

Display 1   Display 2   Display 3      r1      r2      r3
    50          45          45          3      1.5     1.5
    45          30          35          3      1       2
    30          25          20          3      2       1
    45          35          40          3      1       2
    55          40          40          3      1.5     1.5
Sum                                    15      7.0     8.0
Solution: a) The data is arranged in order at left and ranked at right below. As usual ties are treated by
giving the same rank to equal numbers. For example, the 45s are initially ranked as numbers 10, 11, 12, and
13, but these are replaced by their average, which is 11.5.

 x1    x2    x3        r1      r2      r3
             20                         1
       25                      2
 30    30             3.5     3.5
       35    35               5.5     5.5
       40    40               8       8
             40                       8
 45    45    45      11.5    11.5    11.5
 45                  11.5
 50                  14
 55                  15
                     ____    ____    ____
                     55.5    30.5    34.0

Our check on the ranking is that 55.5 + 30.5 + 34 = 120, which is \(\frac{15(16)}{2}\), the sum of the first 15 numbers.
We now calculate
\[H = \left[\frac{12}{n(n+1)}\sum_i \frac{SR_i^2}{n_i}\right] - 3(n+1) = \left[\frac{12}{15(16)}\left(\frac{55.5^2}{5} + \frac{30.5^2}{5} + \frac{34.0^2}{5}\right)\right] - 3(16)\]
= 0.05(1033.3) - 48 = 3.665. We then use the Kruskal-Wallis table for 5, 5, 5 to find that the p-value for
3.665 is above .102. Since this is above 5%, we do not reject \(H_0\) which is 'no difference between
distribution (or medians) of \(x_1\), \(x_2\) and \(x_3\).'
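The pooled ranking and the H statistic can be reproduced with a short sketch. This is only an illustration: `average_ranks` is a hypothetical helper that assigns tied values the average of the ranks they occupy, and H is computed without a tie correction, as in the hand calculation.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# One list per display (column of the data table).
samples = [
    [50, 45, 30, 45, 55],
    [45, 30, 25, 35, 40],
    [45, 35, 20, 40, 40],
]

pooled = [x for s in samples for x in s]
ranks = average_ranks(pooled)
n = len(pooled)

# Split the pooled ranks back into their samples and sum them.
rank_sums = []
start = 0
for s in samples:
    rank_sums.append(sum(ranks[start:start + len(s)]))
    start += len(s)

# Kruskal-Wallis H without tie correction.
h = 12.0 / (n * (n + 1)) * sum(sr ** 2 / len(s) for sr, s in zip(rank_sums, samples)) - 3 * (n + 1)

print(rank_sums, round(h, 3))
```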
b) Ranks of the numbers within rows are given in boldface to the right of the original table. We take the
column rank sums and check our ranking by noting that the sum of the rank sums should be
\(\frac{rc(c+1)}{2} = \frac{5(3)(4)}{2} = 30\), which is the sum of 15, 7 and 8. We now calculate
\[\chi^2_F = \left[\frac{12}{rc(c+1)}\sum_i SR_i^2\right] - 3r(c+1) = \left[\frac{12}{5(3)(4)}\left(15^2 + 7^2 + 8^2\right)\right] - 3(5)(4) = \frac{1}{5}(338) - 60 = 7.6.\]
From the Friedman table for N = 5 and k = 3, the p-value for 7.6 is .024. Since this is less than 5%, we reject \(H_0\) which is 'no
difference between distribution (or medians) of \(x_1\), \(x_2\) and \(x_3\).'
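The within-row ranking and the Friedman statistic can be sketched the same way. Again this is a minimal illustration with hypothetical names. Ties within a row receive average ranks, and no tie correction is applied, matching the hand calculation.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Rows are consumers, columns are Displays 1, 2 and 3.
data = [
    [50, 45, 45],
    [45, 30, 35],
    [30, 25, 20],
    [45, 35, 40],
    [55, 40, 40],
]
r, c = len(data), len(data[0])

# Rank each consumer's three ratings separately, then sum ranks by display.
row_ranks = [average_ranks(row) for row in data]
rank_sums = [sum(rr[j] for rr in row_ranks) for j in range(c)]

# The rank sums must add up to r*c*(c+1)/2.
check_total = r * c * (c + 1) / 2

# Friedman chi-square statistic.
chi2_f = 12.0 / (r * c * (c + 1)) * sum(s * s for s in rank_sums) - 3 * r * (c + 1)

print(rank_sums, check_total, round(chi2_f, 2))
```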
5. Union member confidence in big business and job satisfaction are reported below.
a. Test the hypothesis that job satisfaction and confidence are independent. (7)
b. Test the hypothesis that the proportion that is very confident is the same for the very satisfied and the
moderately satisfied. (4)

O                     Very Satisfied   Moderately Satisfied   Dissatisfied    Sum
Very Confident              26                  15                  3          44
Somewhat confident          95                  73                 21         189
Not Confident               34                  28                 19          81
Sum                        155                 116                 43         314

E                       pr        V.S.       M.S.       D.        Sum
Very Confident        .1401      21.72      16.25      6.02      43.99
Somewhat confident    .6019      93.29      69.82     25.88     188.99
Not Confident         .2580      39.99      29.93     11.09      81.01
Sum                  1.0000     155.00     116.00     42.99     313.99
Solution: a) The construction of the Expected table is shown in boldface above. pr is the proportion in
each row. For example .1401 is 44 divided by 314. To get the upper left hand value in E, multiply 155 by
.1401. The \(\chi^2\) test is done two ways below. The value of \(\chi^2\) is either 10.2145 or 324.2245 - 314 =
10.2245. Since the problem has \((r-1)(c-1) = 2(2) = 4\) degrees of freedom and \(\chi^{2(4)}_{.05} = 9.4877\) is smaller
than either value, we reject \(H_0\) which is 'independence.'

   O        E       E - O    (E - O)^2   (E - O)^2/E     O^2/E
   26     21.72     -4.28     18.3184      0.84339      31.1234
   95     93.29     -1.71      2.9241      0.03134      96.7413
   34     39.99      5.99     35.8801      0.89723      28.9072
   15     16.25      1.25      1.5625      0.09615      13.8462
   73     69.82     -3.18     10.1124      0.14484      76.3248
   28     29.93      1.93      3.7249      0.12445      26.1945
    3      6.02      3.02      9.1204      1.51502       1.4950
   21     25.88      4.88     23.8144      0.92019      17.0402
   19     11.09     -7.91     62.5681      5.64185      32.5518
  314    313.99      0.00                 10.2145      324.2245
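The expected table and both versions of the statistic can be rebuilt from the observed counts alone. The sketch below is illustrative: it keeps the expected counts unrounded, so the statistic comes out near 10.21 instead of splitting into 10.2145 and 10.2245, and the critical value 9.4877 is typed in from the chi-square table. Variable names are hypothetical.

```python
# Rows: Very / Somewhat / Not confident.
# Columns: Very satisfied / Moderately satisfied / Dissatisfied.
observed = [
    [26, 15, 3],
    [95, 73, 21],
    [34, 28, 19],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under independence, E = (row total)(column total) / n.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

pairs = [(o, e)
         for obs_row, exp_row in zip(observed, expected)
         for o, e in zip(obs_row, exp_row)]

chi2_direct = sum((e - o) ** 2 / e for o, e in pairs)  # sum of (E - O)^2 / E
chi2_shortcut = sum(o * o / e for o, e in pairs) - n   # sum of O^2 / E, minus n

df = (len(row_totals) - 1) * (len(col_totals) - 1)
critical = 9.4877  # chi-square .05 point for 4 df, from the table

print(round(chi2_direct, 4), round(chi2_shortcut, 4), df)
print("reject H0" if chi2_direct > critical else "do not reject H0")
```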
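b) From page 10 of the Syllabus supplement:
Interval for: Difference between proportions, \(\Delta p = p_1 - p_2\).
Confidence Interval: \(\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}\), where \(s_{\Delta p} = \sqrt{\frac{\bar{p}_1\bar{q}_1}{n_1} + \frac{\bar{p}_2\bar{q}_2}{n_2}}\) and \(q = 1 - p\).
Hypotheses: \(H_0: \Delta p = \Delta p_0\), \(H_1: \Delta p \ne \Delta p_0\), where \(\Delta p_0 = p_{01} - p_{02}\) or \(\Delta p_0 = 0\).
Test Ratio: \(z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}}\). If \(\Delta p_0 = 0\), \(\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\) with \(\bar{p}_0 = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}\).
If \(\Delta p_0 \ne 0\), \(\sigma_{\Delta p} = \sqrt{\frac{p_{01}q_{01}}{n_1} + \frac{p_{02}q_{02}}{n_2}}\). Or use \(s_{\Delta p}\).
Critical Value: \(\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\,\sigma_{\Delta p}\).

\(H_0: \Delta p = 0\), \(H_1: \Delta p \ne 0\), or \(H_0: p_1 = p_2\), \(H_1: p_1 \ne p_2\). \(\alpha = .10\), \(z_{\alpha/2} = 1.645\).
\(\bar{p}_1 = \frac{x_1}{n_1} = \frac{26}{155} = .1677\), \(\bar{p}_2 = \frac{x_2}{n_2} = \frac{15}{116} = .1293\), \(\Delta\bar{p} = \bar{p}_1 - \bar{p}_2 = .1677 - .1293 = .0384\)
\(\bar{p}_0 = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2} = \frac{26 + 15}{155 + 116} = \frac{41}{271} = .1513\)
\(\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{.1513(.8487)\left(\frac{1}{155} + \frac{1}{116}\right)} = \sqrt{.001935} = .04399\)
If we use a test ratio, \(z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{.0384 - 0}{.04399} = 0.873\). It is inside the interval \(\pm z_{.05} = \pm 1.645\), and
also inside \(\pm z_{.025} = \pm 1.96\), so do not reject \(H_0\).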
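The test ratio for the difference between two proportions can be computed straight from the counts. This sketch is an illustration only: it uses the pooled proportion (26 + 15)/(155 + 116) with unrounded intermediate values, and the critical values 1.645 and 1.96 are typed in from the normal table. Variable names are hypothetical.

```python
import math

# Very confident counts among the very satisfied and the moderately satisfied.
x1, n1 = 26, 155
x2, n2 = 15, 116

p1 = x1 / n1
p2 = x2 / n2
diff = p1 - p2

# Pooled proportion under H0: p1 = p2.
p0 = (x1 + x2) / (n1 + n2)
q0 = 1.0 - p0

# Standard error of the difference when the hypothesized difference is zero.
sigma = math.sqrt(p0 * q0 * (1.0 / n1 + 1.0 / n2))

z = (diff - 0.0) / sigma

print(round(p1, 4), round(p2, 4), round(p0, 4), round(sigma, 5), round(z, 3))
for crit in (1.645, 1.96):  # two-sided critical values for alpha = .10 and .05
    print(crit, "reject H0" if abs(z) > crit else "do not reject H0")
```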
