4/17/02 252y0131 ECO252 QBA2 Name

THIRD HOUR EXAM
April 17, 2001
Name
KEY
Hour of Class Registered (Circle)
MWF TR 10 12 12:30 2:00
I. (10+ points) Do all the following:
1. Hand in your computer printouts for problems 2 and 3. (5 points; 3-point penalty for not handing in.)
Remember that the ANOVA printout must be completed, using a 5% significance level, for full credit. I
should be able to tell what is tested and what the conclusions are.
2. Do not do the following unless you handed in at least two outputs.
On the next few pages there are problems very much like the ones you did.
a. A random survey of CEOs (Don Black) asks the question "Do you agree that an increase of
market share is a reason to consider a merger?" Responses (Agree?) were between 5 and 1, with 5
indicating strong agreement. Responses were classified in 2 ways, with 'years' dividing respondents
according to the number of years they had been with the company and 'size' dividing firms according to the
companies' sales in millions of dollars. The output follows.
Tabulated Statistics
ROWS: Size     COLUMNS: Years

           1      2      3    ALL
  1        4      4      4     12
  2        4      4      4     12
  3        4      4      4     12
  4        4      4      4     12
  ALL     16     16     16     48

CELL CONTENTS -- COUNT
Tabulated Statistics
ROWS: Size     COLUMNS: Years
(each cell lists its four observations)

Size     Years 1        Years 2        Years 3
  1      2, 3, 2, 2     2, 1, 2, 3     2, 1, 1, 2
  2      2, 1, 2, 3     2, 3, 2, 3     2, 3, 1, 2
  3      3, 4, 4, 5     3, 2, 4, 4     3, 2, 3, 3
  4      3, 4, 4, 3     3, 3, 3, 4     2, 3, 2, 3
4/24/01 252y0131
CELL CONTENTS -Agree?:DATA
Tabulated Statistics
ROWS: Size     COLUMNS: Years

              1         2         3       ALL
  1      2.2500    2.0000    1.5000    1.9167
  2      2.0000    2.5000    2.0000    2.1667
  3      4.0000    3.2500    2.7500    3.3333
  4      3.5000    3.2500    2.5000    3.0833
  ALL    2.9375    2.7500    2.1875    2.6250

CELL CONTENTS -- Agree?:MEAN
MTB > Twoway c1 c2 c3;
SUBC> Means c2 c3.

Two-way Analysis of Variance

Analysis of Variance for Agree?
Source         DF        SS        MS
Size            3    17.083     5.694
Years           2     4.875     2.437
Interaction     6     2.292     0.382
Error          36    17.000     0.472
Total          47    41.250

Size     Mean
   1     1.92
   2     2.17
   3     3.33
   4     3.08
[Plot: Individual 95% CI for each Size mean; axis ticks at 1.80, 2.40, 3.00, 3.60]

Years    Mean
   1     2.94
   2     2.75
   3     2.19
[Plot: Individual 95% CI for each Years mean; axis ticks at 2.10, 2.45, 2.80, 3.15]
(i) Complete the ANOVA table that compares the effect of size and years on the response of the CEOs. (2)
Solution: Most of this table is copied from above. Each F is obtained by dividing the MS on its line by the error (within) mean
square. The F.05 column contains values found in the F table. If the F we computed exceeds the table F, we
put an 's' (significant) in the F column; otherwise we put an 'ns' (not significant).
Source         DF        SS        MS        F            F.05
Size            3    17.083     5.694    12.064 s     F(3,36) = 2.87
Years           2     4.875     2.437     5.163 s     F(2,36) = 3.26
Interaction     6     2.292     0.382     0.809 ns    F(6,36) = 2.36
Error          36    17.000     0.472
Total          47    41.250
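The F column can be checked with a few lines of Python. This is only a sketch: the mean squares are copied from the printout, the 5% critical values are the table values quoted above, and the function name `anova_f_column` is made up for illustration.

```python
# Mean squares copied from the Minitab printout.
mean_squares = {"Size": 5.694, "Years": 2.437, "Interaction": 0.382}
ms_error = 0.472

# 5% critical values as read from an F table (36 error degrees of freedom).
f_table_05 = {"Size": 2.87, "Years": 3.26, "Interaction": 2.36}


def anova_f_column(mean_squares, ms_error, critical):
    """Return {source: (F ratio, 's' or 'ns')} for each effect."""
    result = {}
    for source, ms in mean_squares.items():
        f_ratio = ms / ms_error
        verdict = "s" if f_ratio > critical[source] else "ns"
        result[source] = (round(f_ratio, 3), verdict)
    return result


f_column = anova_f_column(mean_squares, ms_error, f_table_05)
for source, (f_ratio, verdict) in f_column.items():
    print(f"{source:<12}{f_ratio:>8.3f} {verdict}")
```

Each effect is divided by the same error mean square because this is a fixed-effects layout with replication; only the numerator degrees of freedom change the table value.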
(ii) Is there a statistically significant difference between the means for responses of CEOs with different
years of service? Show what numbers brought you to your conclusion. (2)
Solution: Yes. The 'Years' line is marked significant as explained in (i). This means that we reject the null
hypothesis of no significant difference for the means of the 3 'years' categories.
b. Ken Black says that the Delta Wire Corporation believes that the more positive outlook created
by employee education sessions results in greater interest in the job, and thus fewer sick days per worker. A
random sample of workers is taken, and a regression is run with 'sick' (number of sick days per worker) as
the dependent variable and 'educ' (number of hours of employee education) as the independent variable.
Part of the output appears below.
Regression Analysis

The regression equation is
Sick = 7.92 - 0.0864 Educ

Predictor       Coef       Stdev    t-ratio        p
Constant      7.9197      0.7837      10.11    0.000
Educ        -0.08637     0.01574      -5.49    0.000

s = 2.368     R-sq = 62.6%     R-sq(adj) = 60.5%

Analysis of Variance
SOURCE        DF        SS        MS        F        p
Regression     1    168.80    168.80    30.10    0.000
Error         18    100.95      5.61
Total         19    269.75

Unusual Observations
Obs.   Educ    Sick      Fit   Stdev.Fit   Residual   St.Resid
  4     120   1.000   -2.444       1.414      3.444     1.81 X

[Plot: C4 (vertical axis, ticks at 0, 5, 10) against Educ (horizontal axis, ticks at 0, 50, 100)]
(i) What equation does it give to relate education hours to number of sick days? How many sick days does it
predict that someone with 10 hours of education will take? (2)


Solution: The equation can be written $\hat{Y} = 7.92 - 0.0864x$ or $\hat{Y} = 7.9197 - 0.08637x$. Whichever version
we substitute 10 into, we get $\hat{Y} = 7.9197 - 0.08637(10) = 7.056$.
(ii) What test or tests would lead you to believe that the company is correct? Cite evidence using a 5%
significance level. (2)
Solution: The company has asserted that more education will cut down sick days.
They are right if 'educ' has a significant negative coefficient. The coefficient (-0.0864) is negative and the p-value for the t
test is reported as 0.000, so we would reject the null hypothesis of insignificance at any conventional level. Alternatively,
compare the t of -5.49 with the 5% value of t with 18 (from the 'Error' line in the ANOVA) degrees of
freedom.
II. Do at least 4 of the following 5 Problems (at least 10 each) (or do sections adding to at least 40 points. Anything extra you do helps, and grades wrap around). Show your work! State $H_0$ and $H_1$ where
applicable. Never say 'yes' or 'no' without a statistical test.
1.
a. In the ANOVA problem beginning on page 1, there is a plot for an individual 95% confidence
interval for the mean for size 3.
(i) Figure out, using your formulas and the data on the pages, what this interval actually
is. (2)
(ii) Is there a significant difference between the means for level 1 and level 2 of 'years'?
To answer this question, do 95% confidence intervals for the difference between these two means
that are (α) valid when used alone (1), and (β) valid when used with other possible differences
between means (2), and state a conclusion (1).
b. In the regression problem on page 3.
(i) Add a regression line to the graph (1)
(ii) Do a 99% confidence interval for the slope of the equation. (2)
(iii) Find $s_e^2$ from the printout and (if the sample mean for x is 36.70 and the sample
standard deviation for x is 34.51) do a confidence interval for the mean number of sick
days taken by someone with 10 hours of education (4).
Solution: a) (i) From the COUNT table in the printout, we find that there are $R = 4$ rows,
$C = 3$ columns and $P = 4$ measurements per cell. We can also see that $MSW = 0.472$ and has 36 degrees of
freedom associated with it, so $t_{.025}^{(36)} = 2.028$. There are $PC = 12$ numbers in a row.
A source of a confidence interval could be Exercise 14.42. For row means, $\mu_1 = \bar{x}_{1.} \pm t_{\alpha/(2m)}^{(RC(P-1))} \sqrt{\frac{MSW}{PC}}$, with $m = 1$ for a single interval.
The table of means says that the mean of row 3 is 3.3333, so $\mu_3 = 3.3333 \pm 2.028 \sqrt{\frac{0.472}{12}} = 3.333 \pm 0.402$.
(ii) The printout says that the mean for column 1 is 2.9375, the mean for column 2 is 2.7500, that there are
$PR = 16$ numbers in a column and that the F for columns has 2 and 36 degrees of freedom.
The outline says:
i. A Single Confidence Interval
If we desire a single interval we use the formula for a Bonferroni Confidence Interval below with
$m = 1$.
ii. Scheffé Confidence Interval
For column means, use $\mu_1 - \mu_2 = (\bar{x}_{.1} - \bar{x}_{.2}) \pm \sqrt{(C-1)\,F_{\alpha}^{(C-1,\,RC(P-1))}}\,\sqrt{\frac{2MSW}{PR}}$.
iii. Bonferroni Confidence Interval
For column means, use $\mu_1 - \mu_2 = (\bar{x}_{.1} - \bar{x}_{.2}) \pm t_{\alpha/(2m)}^{(RC(P-1))}\,\sqrt{\frac{2MSW}{PR}}$.
(α) A 95% confidence interval for the difference between these two means that is valid when used alone is
$\mu_1 - \mu_2 = (\bar{x}_{.1} - \bar{x}_{.2}) \pm t_{\alpha/2}^{(RC(P-1))}\,\sqrt{\frac{2MSW}{PR}} = (2.9375 - 2.7500) \pm 2.028\sqrt{\frac{2(0.472)}{16}} = 0.1875 \pm 2.028(0.243) = 0.19 \pm 0.49$
(β) A 95% confidence interval for the difference between these two means that is valid when used with
other possible differences between means is
$\mu_1 - \mu_2 = (\bar{x}_{.1} - \bar{x}_{.2}) \pm \sqrt{(C-1)\,F_{\alpha}^{(C-1,\,RC(P-1))}}\,\sqrt{\frac{2MSW}{PR}} = 0.1875 \pm \sqrt{2F_{.05}^{(2,36)}}\,\sqrt{\frac{2(0.472)}{16}} = 0.1875 \pm \sqrt{2(3.26)}\,\sqrt{\frac{2(0.472)}{16}} = 0.19 \pm 0.62$
Conclusion: Since both these intervals include zero, the difference between these two means is not
significant.
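As a check on the two intervals, the sketch below recomputes both margins. The t and F values are the table values quoted in the solution (entered by hand, not computed), and the helper `contains_zero` is a hypothetical name used only here.

```python
import math

# Values read from the printout and the tables quoted in the solution.
msw = 0.472            # error mean square, 36 degrees of freedom
n_per_column = 16      # PR: observations behind each 'years' mean
mean_1, mean_2 = 2.9375, 2.7500
n_columns = 3          # C
t_025_36 = 2.028       # t table value (assumed as quoted)
f_05_2_36 = 3.26       # F table value (assumed as quoted)

diff = mean_1 - mean_2
std_err = math.sqrt(2 * msw / n_per_column)

# Single interval (Bonferroni formula with m = 1).
single_margin = t_025_36 * std_err
# Scheffe interval, valid jointly with other column contrasts.
scheffe_margin = math.sqrt((n_columns - 1) * f_05_2_36) * std_err


def contains_zero(center, margin):
    """True when the interval center +/- margin covers zero."""
    return center - margin <= 0 <= center + margin


print(f"single:  {diff:.2f} +/- {single_margin:.2f}")
print(f"Scheffe: {diff:.2f} +/- {scheffe_margin:.2f}")
```

The Scheffé margin is wider because it protects every possible contrast among the three column means at once.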
b) In the regression problem on page 3.
(i) Just connect the x's.
(ii) From the printout, there are 18 degrees of freedom, the coefficient of 'educ' is -0.08637 and the
standard deviation of that coefficient is 0.01574. $t_{.005}^{(18)} = 2.878$, so
$\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1} = -0.0864 \pm 2.878(0.01574) = -0.086 \pm 0.045$.
(iii) From the printout $s_e = 2.368$. Either square this, or copy the Mean Square Error, which is 5.61.
Because the total degrees of freedom are 19, $n = 20$. $\bar{x} = 37.70$ and since $s_x = 36.67$, $s_x^2 = 1344.6889$,
$SS_x = (n-1)s_x^2 = 19(1344.6889) = 25549.089$. Since $SS_x$ appears in so many formulas, there are many other
ways to get it. If $X_0 = 10$, $\hat{Y}_0 = 7.9197 - 0.08637(10) = 7.056$.
The confidence interval is $\mu_{Y_0} = \hat{Y}_0 \pm t\, s_{\hat{Y}}$, where
$s_{\hat{Y}}^2 = s_e^2\left(\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{SS_x}\right) = 2.368^2\left(\frac{1}{20} + \frac{(10 - 37.70)^2}{25549.089}\right) = 0.44877$.
So, with $t_{.025}^{(18)} = 2.101$, $\mu_{Y_0} = \hat{Y}_0 \pm t\, s_{\hat{Y}} = 7.056 \pm 2.101\sqrt{0.44877} = 7.06 \pm 1.41$.
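The slope interval and the interval for the mean response can be recomputed as follows. This is a sketch under the solution's own inputs (mean 37.70 and standard deviation 36.67 for x); the two t values are table values typed in by hand.

```python
import math

n = 20
b0, b1 = 7.9197, -0.08637    # coefficients from the printout
s_b1 = 0.01574               # standard deviation of the slope
s_e = 2.368                  # standard error of the regression
x_bar, s_x = 37.70, 36.67    # values used in the solution
t_005_18 = 2.878             # t table, 18 df, 0.5% in each tail
t_025_18 = 2.101             # t table, 18 df, 2.5% in each tail

# 99% confidence interval for the slope: b1 +/- t * s_b1
slope_margin = t_005_18 * s_b1

# 95% confidence interval for the mean of Y at x0 = 10
ss_x = (n - 1) * s_x ** 2
x0 = 10
y_hat = b0 + b1 * x0
var_mean = s_e ** 2 * (1 / n + (x0 - x_bar) ** 2 / ss_x)
mean_margin = t_025_18 * math.sqrt(var_mean)

print(f"slope: {b1:.4f} +/- {slope_margin:.4f}")
print(f"mean response at x0 = {x0}: {y_hat:.3f} +/- {mean_margin:.2f}")
```

Note that the margin uses the square root of the variance term, which is an easy step to drop when working by hand.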


2. According to Ken Black, an agricultural researcher planted parts of six blocks of land with peanuts, using
each of three different methods. Part of the results are given below. Data is yield per acre in thousands.
Assume that the parent distribution is Normal and compare the mean yields for the three methods, noting the
fact that the data is cross-classified. Use $\alpha = .01$. (14) Note: If you wish to ignore the fact that the data is
blocked, indicate this now and compare the column means assuming that the data is three independent
random samples from a Normal distribution. (10) ($\alpha = .01$)
BLOCK            Method 1   Method 2   Method 3     Sum    Sum of Squares
1                  1.31       1.08       0.85       3.24      3.6050
2                  1.27       1.10       1.02       3.39      3.8633
3                  1.28       1.05       0.78       3.11      3.3493
4                  1.22       1.02       0.87       3.11      3.2857
5                  1.19       0.99       0.80       2.98      3.0362
6                  1.30       0.95       0.96       3.21      ?
Sum                7.57       6.19       ?
Sum of squares     9.5619     6.4019     ?
Solution: As we said many times -- If the parent distribution is Normal, use ANOVA; if it's not
Normal, use Friedman or Kruskal-Wallis. If the samples are independent random samples, use 1-way
ANOVA or Kruskal-Wallis. If they are cross-classified, use Friedman or 2-way ANOVA.
a) 2-way ANOVA (Blocked by block). An 's' indicates that the null hypothesis is rejected.
Row quantities: $x_{i.}$ is the row sum, $n_i$ the row count, $\bar{x}_{i.}$ the row mean, SS the row sum of squares $\sum_j x_{ij}^2$.

BLOCK   Method 1   Method 2   Method 3     Sum     n_i     Mean       SS      Mean squared
1         1.31       1.08       0.85       3.24     3     1.0800    3.6050      1.1664
2         1.27       1.10       1.02       3.39     3     1.1300    3.8633      1.2769
3         1.28       1.05       0.78       3.11     3     1.0367    3.3493      1.0747
4         1.22       1.02       0.87       3.11     3     1.0367    3.2857      1.0747
5         1.19       0.99       0.80       2.98     3     0.9933    3.0362      0.9866
6         1.30       0.95       0.96       3.21     3     1.0700    3.5141      1.1449
Sum       7.57     + 6.19     + 5.28    = 19.04    18     1.0578   20.6536      6.7242

Column quantities:
n_j            6       +  6       +  6       = 18 = $n$
Mean         1.2617     1.0317     0.8800      1.0578 = $\bar{x}$
SS           9.5619   + 6.4019   + 4.6898    = 20.6536 = $\sum\sum x_{ij}^2$
Mean squared 1.5919   + 1.0644   + 0.7744    = 3.4307 = $\sum \bar{x}_{.j}^2$

Note that $\bar{x}$ is not a sum, but is $\frac{\sum x}{n} = \frac{19.04}{18} = 1.0578$.

$SST = \sum\sum x_{ij}^2 - n\bar{x}^2 = 20.6536 - 18(1.0578)^2 = 20.6536 - 20.1409 = 0.5127$.
$SSC = \sum n_j \bar{x}_{.j}^2 - n\bar{x}^2 = 6(3.4307) - 18(1.0578)^2 = 20.5842 - 20.1409 = 0.4433$. This is SSB in a one-way ANOVA.
$SSR = \sum n_i \bar{x}_{i.}^2 - n\bar{x}^2 = 3(6.7242) - 18(1.0578)^2 = 20.1726 - 20.1409 = 0.0317$
($SSW = SST - SSC - SSR = 0.0377$)
Source               SS       DF     MS         F           F.01               H0
Rows (Blocks)      0.0317      5   0.00634     1.682 ns    F(5,10) = 5.64     Row means equal
Columns (Methods)  0.4433      2   0.22165    58.79 s      F(2,10) = 7.56     Column means equal
Within (Error)     0.0377     10   0.00377
Total              0.5127     17
So the yields (column means) are significantly different.
b) One-way ANOVA (not blocked, by statement) ($SSW = SST - SSB = .0694$)
Source               SS       DF     MS          F          F.01               H0
Columns (Methods)  0.4433      2   0.22165     47.904 s    F(2,15) = 6.36     Column means equal
Within (Error)     0.0694     15   0.004627
Total              0.5127     17
Once again, the yields (column means) are significantly different.
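Both ANOVA tables can be rebuilt from the raw yields. The sketch below uses the computational form (sums squared over counts) and unrounded arithmetic, so its sums of squares differ from the hand computation in the third or fourth decimal, because the hand computation rounds the means first. The variable names are illustrative only.

```python
# Yields: one row per block, one column per method.
yields = [
    [1.31, 1.08, 0.85],
    [1.27, 1.10, 1.02],
    [1.28, 1.05, 0.78],
    [1.22, 1.02, 0.87],
    [1.19, 0.99, 0.80],
    [1.30, 0.95, 0.96],
]
r = len(yields)        # blocks (rows)
c = len(yields[0])     # methods (columns)
n = r * c

grand_sum = sum(sum(row) for row in yields)
correction = grand_sum ** 2 / n          # n times the grand mean squared

sst = sum(x * x for row in yields for x in row) - correction
ssc = sum(sum(row[j] for row in yields) ** 2 for j in range(c)) / r - correction
ssr = sum(sum(row) ** 2 for row in yields) / c - correction

# Randomized block (two-way without replication): error = SST - SSC - SSR
ssw_blocked = sst - ssc - ssr
f_blocked = (ssc / (c - 1)) / (ssw_blocked / ((r - 1) * (c - 1)))

# One-way: the block sum of squares is left in the error term
ssw_oneway = sst - ssc
f_oneway = (ssc / (c - 1)) / (ssw_oneway / (n - c))

print(f"SST={sst:.4f} SSC={ssc:.4f} SSR={ssr:.4f}")
print(f"F (blocked)={f_blocked:.2f}  F (one-way)={f_oneway:.2f}")
```

Either way the method F far exceeds its 1% table value (7.56 with blocking, 6.36 without), so the conclusion does not depend on the rounding.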
3.
a. Data from problem 2 is repeated below. Assume that the distribution is not Normal, but
that it is blocked (cross-classified), and again compare the distributions represented by the
columns. (5) ($\alpha = .01$)
BLOCK            Method 1   Method 2   Method 3     Sum    Sum of Squares
1                  1.31       1.08       0.85       3.24      3.6050
2                  1.27       1.10       1.02       3.39      3.8633
3                  1.28       1.05       0.78       3.11      3.3493
4                  1.22       1.02       0.87       3.11      3.2857
5                  1.19       0.99       0.80       2.98      3.0362
6                  1.30       0.95       0.96       3.21      ?
Sum                7.57       6.19       ?
Sum of squares     9.5619     6.4019     ?
b. A researcher looks at the effect of industry (Factor A) and size (measured by $millions in sales)
(Factor B) on Research and Development expenditures as a percentage of sales. A sample is taken of 4 firms in
each industry-size category. The sample is repeated over three different years (Factor C). If the researcher
looks at 3 size categories within 5 industries over the three years, generate an ANOVA table showing all
possible interactions, using the following data. SSA = 24.021, SSB = 4.829, SSC = 4.029, SSAB = 9.059,
SSAC = 14.976, SSBC = 2.528, SSABC = 9.615, SST = 150.672. Using a 5% significance level, explain
whether the size of the firm and the industry seem to make a difference in R&D expenditures. Which of the
other differences and interactions are significant? (7)
Solution: a) Friedman Test. $H_0$: Columns from same distribution. Rank within rows.
BLOCK   Method 1   r1    Method 2   r2    Method 3   r3
1         1.31      3      1.08      2      0.85      1
2         1.27      3      1.10      2      1.02      1
3         1.28      3      1.05      2      0.78      1
4         1.22      3      1.02      2      0.87      1
5         1.19      3      0.99      2      0.80      1
6         1.30      3      0.95      1      0.96      2
Sum                18                11                7
There are $r = 6$ rows and $c = 3$ columns. Check: the rank sums must add to $r\,\frac{c(c+1)}{2} = 6\,\frac{3(4)}{2} = 36$. Since
18 + 11 + 7 = 36, we are all right. The Friedman Statistic is
$\chi_F^2 = \frac{12}{rc(c+1)}\sum SR_i^2 - 3r(c+1) = \frac{12}{6(3)(4)}\left(18^2 + 11^2 + 7^2\right) - 3(6)(4) = \frac{1}{6}(494) - 72 = 10.333$.
According to the Friedman Table ($k = 3$, $n = 6$), 10.333 has a p-value of .002. Since $\alpha = .01$, and the p-value for our null hypothesis is below
this significance level, we reject $H_0$.
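The ranking and the Friedman statistic can be reproduced with a short sketch. It assumes no ties within a block (true for this data); `rank_row` is a hypothetical helper name, and the p-value still has to come from the Friedman table.

```python
yields = [
    [1.31, 1.08, 0.85],
    [1.27, 1.10, 1.02],
    [1.28, 1.05, 0.78],
    [1.22, 1.02, 0.87],
    [1.19, 0.99, 0.80],
    [1.30, 0.95, 0.96],
]


def rank_row(values):
    """Rank the values within one block, 1 = smallest (assumes no ties)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


r = len(yields)       # blocks
c = len(yields[0])    # treatments

ranks = [rank_row(row) for row in yields]
rank_sums = [sum(row[j] for row in ranks) for j in range(c)]

# Check: the rank sums must add to r * c * (c + 1) / 2.
assert sum(rank_sums) == r * c * (c + 1) // 2

friedman = 12 / (r * c * (c + 1)) * sum(s * s for s in rank_sums) - 3 * r * (c + 1)
print(rank_sums, round(friedman, 3))
```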

b) There are 5 industries, 3 sizes and 3 years, or $5 \times 3 \times 3 = 45$ groups with 4 observations in each group, so
$n = 45 \times 4 = 180$. An 's' means 'significant difference' ($H_0$ rejected); 'ns' means 'no significant difference'
($H_0$ not rejected). It seems that both industry (Factor A) and size (measured by $millions in sales) (Factor B)
make a difference in Research and Development expenditures, since their Fs are significant. Year (Factor C)
and Interaction AC also have an effect.
Source               SS        DF     MS         F           F.05
Factor A           24.021       4   6.00525    9.933 s     F(4,135) = 2.44
Factor B            4.829       2   2.41450    3.994 s     F(2,135) = 3.07
Factor C            4.029       2   2.01450    3.332 s     F(2,135) = 3.07
Interaction AB      9.059       8   1.13238    1.873 ns    F(8,135) = 2.01
Interaction AC     14.976       8   1.87200    3.096 s     F(8,135) = 2.01
Interaction BC      2.528       4   0.63200    1.045 ns    F(4,135) = 2.44
Interaction ABC     9.615      16   0.60094    0.994 ns    F(16,135) = 1.73
Error (Within)     81.615     135   0.60456
Total             150.672     179
Note: A = Industry, B = Size, C = Year. The table value of F with 125 denominator degrees of freedom
is used in place of the value with 135 because examination of the
table shows very little variation at this level.
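The degrees of freedom, error line and F ratios in this table follow mechanically from the given sums of squares. A minimal sketch (the dictionary layout and the name `effect_df` are illustrative choices, not part of the exam):

```python
levels = {"A": 5, "B": 3, "C": 3}   # industries, sizes, years
per_cell = 4                        # firms in each A x B x C cell

ss = {
    "A": 24.021, "B": 4.829, "C": 4.029,
    "AB": 9.059, "AC": 14.976, "BC": 2.528, "ABC": 9.615,
}
ss_total = 150.672


def effect_df(effect):
    """Degrees of freedom: product of (levels - 1) over the factors involved."""
    df = 1
    for factor in effect:
        df *= levels[factor] - 1
    return df


n = per_cell * levels["A"] * levels["B"] * levels["C"]
df_total = n - 1
df_error = df_total - sum(effect_df(e) for e in ss)
ss_error = ss_total - sum(ss.values())
ms_error = ss_error / df_error

f_ratios = {e: (ss[e] / effect_df(e)) / ms_error for e in ss}
for effect, f in f_ratios.items():
    print(f"{effect:<4} df={effect_df(effect):>2}  F={f:.3f}")
```

Each F is then compared with the 5% table value for its own numerator degrees of freedom, as in the F.05 column.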
4. (Levine et. al. p 839) The following data are charges in dollars per minute and billions of minutes of calls
made to 9 countries from the US in 1996.
Country      minutes   charge
Canada         3.05     0.34
Britain        1.02     0.73
Germany        0.66     0.88
Japan          0.57     1.00
Dom.Rep.       0.41     0.84
France         0.36     0.81
India          0.29     1.38
Brazil         0.28     0.96
Taiwan         0.27     0.97
For your convenience the following values are given:
$\sum x = 7.91$, $\sum x^2 = 7.5515$, $\sum y = 6.91$, $\sum y^2 = 11.6365$ and $n = 9$.
a. Compute the regression equation $\hat{Y} = b_0 + b_1 x$ to predict billions of minutes of calls. (6)
b. On the basis of your regression, how many billions of minutes of calls do you expect when the charge is
$.90? (1)
c. Compute $R^2$. (4)
d. Compute $s_e$. (3)
e. Compute $s_{b_1}$ and do a significance test on $b_1$. (4)
f. Do a prediction interval for billions of minutes of calls when the charge is $.90. (3)
g. Using your SST etc., put together the ANOVA table (6)
Solution:
We compute $\sum xy = 4.4993$. (See the appendix at the end of this problem.)
Spare Parts Computation:
$\bar{x} = \frac{\sum x}{n} = \frac{7.91}{9} = 0.878889$
$\bar{y} = \frac{\sum y}{n} = \frac{6.91}{9} = 0.767778$
$SS_x = \sum x^2 - n\bar{x}^2 = 7.5515 - 9(0.878889)^2 = 0.599489$
$S_{xy} = \sum xy - n\bar{x}\bar{y} = 4.4993 - 9(0.878889)(0.767778) = -1.57382$
$SS_y = \sum y^2 - n\bar{y}^2 = 11.6365 - 9(0.767778)^2 = 6.33115 = SST$
a) $b_1 = \frac{S_{xy}}{SS_x} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{-1.57382}{0.599489} = -2.6252$
$b_0 = \bar{y} - b_1\bar{x} = 0.767778 - (-2.6252)(0.878889) = 3.07504$
b) $\hat{Y} = b_0 + b_1 x$ becomes $\hat{Y} = 3.075 - 2.6252x$, and $\hat{Y} = 3.075 - 2.6252(0.90) = 0.7125$ is the number of
billions of minutes that we forecast.
c) $SSR = b_1 S_{xy} = b_1\left(\sum xy - n\bar{x}\bar{y}\right) = (-2.6252)(-1.57382) = 4.13159$, so $R^2 = \frac{SSR}{SST} = \frac{4.13159}{6.33115} = 0.6526$, or
$R^2 = \frac{S_{xy}^2}{SS_x\, SS_y} = \frac{\left(\sum xy - n\bar{x}\bar{y}\right)^2}{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)} = \frac{(-1.57382)^2}{(0.599489)(6.33115)} = .6526$
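Parts a) to c) can be verified from the five given sums alone. The sketch below carries full precision, so the last digit may differ slightly from the hand computation; `predict` is a name introduced only for this illustration.

```python
n = 9
sum_x, sum_x2 = 7.91, 7.5515      # charge
sum_y, sum_y2 = 6.91, 11.6365     # billions of minutes
sum_xy = 4.4993

x_bar = sum_x / n
y_bar = sum_y / n

# "Spare parts"
ss_x = sum_x2 - n * x_bar ** 2
ss_y = sum_y2 - n * y_bar ** 2          # this is SST
s_xy = sum_xy - n * x_bar * y_bar

b1 = s_xy / ss_x
b0 = y_bar - b1 * x_bar


def predict(charge):
    """Fitted billions of minutes for a given charge per minute."""
    return b0 + b1 * charge


ssr = b1 * s_xy
r_squared = ssr / ss_y

print(f"b0={b0:.4f} b1={b1:.4f}")
print(f"forecast at 0.90: {predict(0.90):.4f}")
print(f"R^2={r_squared:.4f}")
```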
d) $SSE = SST - SSR = 6.33115 - 4.13159 = 2.19956$
$s_e^2 = \frac{SSE}{n-2} = \frac{2.19956}{7} = 0.31422$, or
$s_e^2 = \frac{SS_y - b_1 S_{xy}}{n-2} = \frac{\sum y^2 - n\bar{y}^2 - b_1\left(\sum xy - n\bar{x}\bar{y}\right)}{n-2} = \frac{6.33115 - (-2.6252)(-1.57382)}{7} = 0.31422$, or
$s_e^2 = \frac{(1 - R^2)\,SST}{n-2} = \frac{(1 - R^2)\left(\sum y^2 - n\bar{y}^2\right)}{n-2}$ ($0 \le R^2 \le 1$ always!)
$s_e = \sqrt{0.31422} = 0.56056$
($s_e^2 = \frac{\sum y^2 - n\bar{y}^2 - b_1^2\left(\sum x^2 - n\bar{x}^2\right)}{n-2}$ is always positive!)
e) $H_0: \beta_1 = 0$, $H_1: \beta_1 \ne 0$
$s_{b_1}^2 = \frac{s_e^2}{SS_x} = \frac{s_e^2}{\sum x^2 - n\bar{x}^2} = \frac{0.31422}{0.599487} = 0.524148$, so $s_{b_1} = \sqrt{0.524148} = 0.72398$.
$t = \frac{b_1 - \beta_{10}}{s_{b_1}} = \frac{b_1 - 0}{s_{b_1}} = \frac{-2.6252}{0.72398} = -3.626$. Assume that $\alpha = .05$ and make a diagram. Show an almost
Normal curve and that the 'reject' region is below $-t_{\alpha/2}^{(n-2)} = -t_{.025}^{(7)} = -2.365$ or above $t_{\alpha/2}^{(n-2)} = t_{.025}^{(7)} = 2.365$.
Since -3.626 is in the lower 'reject' region, reject $H_0$ and conclude that $\beta_1$ is significant.
f) We found in b) that if $x_0 = 0.90$, $\hat{Y}_0 = 0.7125$.
$s_{Y_0}^2 = s_e^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - n\bar{x}^2} + 1\right) = s_e^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_x} + 1\right) = 0.31422\left(\frac{1}{9} + \frac{(0.90 - 0.8789)^2}{0.599487} + 1\right) = 0.3494$
$s_{Y_0} = \sqrt{0.3494} = 0.5911$.
So, with $t_{\alpha/2}^{(n-2)} = t_{.025}^{(7)} = 2.365$, $Y_0 = \hat{Y}_0 \pm t_{\alpha/2}\, s_{Y_0} = 0.7125 \pm 2.365(0.5911) = 0.71 \pm 1.40$.
g) From the previous page or above, $SSR = 4.13159$, $SST = 6.33115$ and $SSE = 2.19956$. $H_0$ is that there
is no relation between Y and X.
Source             SS        DF     MS          F          F.05
Regression       4.13159      1   4.13159     13.148 s    F(1,7) = 5.59
Error (Within)   2.19956      7   0.314227
Total            6.33115      8
Since the table F is less than the computed F, reject $H_0$.
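Parts d) to g) can be checked together. This sketch starts from the spare parts computed earlier and uses the t table value 2.365 entered by hand; because it does not round $b_1$ along the way, SSR and SSE differ from the hand figures in the fourth decimal.

```python
import math

n = 9
x_bar = 0.878889
ss_x, ss_y, s_xy = 0.599489, 6.33115, -1.57382   # spare parts from above
t_025_7 = 2.365                                  # t table, 7 df

b1 = s_xy / ss_x
ssr = b1 * s_xy
sse = ss_y - ssr
mse = sse / (n - 2)              # s_e squared

s_e = math.sqrt(mse)
s_b1 = math.sqrt(mse / ss_x)
t_ratio = b1 / s_b1              # test of H0: beta_1 = 0
f_ratio = ssr / mse              # ANOVA F with 1 and n - 2 df

# Prediction interval for an individual Y at x0 = 0.90
x0, y_hat0 = 0.90, 0.7125
s_pred = math.sqrt(mse * (1 / n + (x0 - x_bar) ** 2 / ss_x + 1))
pred_margin = t_025_7 * s_pred

print(f"s_e={s_e:.4f} s_b1={s_b1:.4f} t={t_ratio:.3f} F={f_ratio:.3f}")
print(f"prediction interval: {y_hat0:.2f} +/- {pred_margin:.2f}")
```

In simple regression the ANOVA F equals the square of the slope's t ratio, which gives a quick consistency check between parts e) and g).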
Appendix: Computation of column sums.
Row     min     charg      C3        C4        C5
 i       y        x       x^2        xy       y^2
 1     3.05     0.34    0.1156    1.0370    9.3025
 2     1.02     0.73    0.5329    0.7446    1.0404
 3     0.66     0.88    0.7744    0.5808    0.4356
 4     0.57     1.00    1.0000    0.5700    0.3249
 5     0.41     0.84    0.7056    0.3444    0.1681
 6     0.36     0.81    0.6561    0.2916    0.1296
 7     0.29     1.38    1.9044    0.4002    0.0841
 8     0.28     0.96    0.9216    0.2688    0.0784
 9     0.27     0.97    0.9409    0.2619    0.0729
Sum    6.91     7.91    7.5515    4.4993   11.6365
5. The failure times in thousands of hours are given for a random sample of 7 components.
0.5   8.5   7.8   8.3   4.7   3.5   4.4
Minitab says that the sample mean is 5.39 and the sample standard deviation is 2.97
Use methods appropriate to testing goodness of fit.
a. Test the hypothesis that these numbers came from a normal distribution. Use a 5% significance level. (5)
b. Test the hypothesis that the above data came from a normal distribution with a mean of 4.5 and a
standard deviation of 2 (5)
c. A television set distributorship believes that television ownership in the local area is distributed
according to a Poisson distribution with a mean of 4. A sample of 100 is taken. Is this true? Use a 5%
significance level. (5)
Number of TV sets:       0    1    2    3    4    5    6 or more
Number of Households:    2   30   30   18   10    2    8
Solution: a) $H_0: N(?, ?)$, $H_1$: Not Normal
Because the mean and standard deviation are unknown, this is a Lilliefors problem.
From the data we found that $\bar{x} = 5.39$ and $s = 2.97$. $t = \frac{x - \bar{x}}{s}$. $F(t)$ actually is computed from the Normal
table. For example $F(-1.65) = P(t \le -1.65) = P(z \le 0) - P(-1.65 \le z \le 0) = .5 - .4505 = .0495$.
x        0.5      3.5      4.4      4.7      7.8      8.3      8.5
t      -1.65    -0.64    -0.33    -0.23     0.81     0.98     1.05
F(t)   .0495    .2611    .3707    .4090    .7910    .8365    .8531
O         1        1        1        1        1        1        1       ($\sum O = n = 7$)
O/n    0.1429   0.1429   0.1429   0.1429   0.1429   0.1429   0.1429
Fo     0.1429   0.2857   0.4286   0.5714   0.7143   0.8571   1.0000
D      .0934    .0246    .0579    .1624    .0767    .0206    .1469
Max D = .1624. Since the critical value for $\alpha = .05$ is .300, do not reject $H_0$.
b) $H_0: N(4.5, 2)$, $H_1$: Not $N(4.5, 2)$
Because the population mean and standard deviation are known, this is a Kolmogorov-Smirnov problem.
$z = \frac{x - \mu}{\sigma}$.
x        0.5      3.5      4.4      4.7      7.8      8.3      8.5
z      -2.00    -0.50    -0.05     0.10     1.65     1.90     2.00
F(z)   .0228    .3085    .4801    .5398    .9505    .9713    .9772
O         1        1        1        1        1        1        1       ($\sum O = n = 7$)
O/n    0.1429   0.1429   0.1429   0.1429   0.1429   0.1429   0.1429
Fo     0.1429   0.2857   0.4286   0.5714   0.7143   0.8571   1.0000
D      .1201    .0228    .0515    .0316    .2362    .1142    .0228
Max D = .2362. Since the critical value for $\alpha = .05$ is .47, do not reject $H_0$.
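The D columns in a) and b) can be recomputed as follows. This is a sketch that follows the tables above: it compares the Normal cumulative probability with the observed cumulative proportion at each sorted data point only (a full Kolmogorov-Smirnov statistic would also check the step just below each point). The Normal CDF comes from `math.erf` rather than a table, so values differ slightly from the two-decimal table lookups; the critical values are the ones quoted in the solution.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_d(sample, mean, sd):
    """Largest |F(x) - Fo(x)| over the sorted data points."""
    data = sorted(sample)
    n = len(data)
    gaps = []
    for i, x in enumerate(data, start=1):
        expected_cdf = normal_cdf((x - mean) / sd)
        observed_cdf = i / n
        gaps.append(abs(expected_cdf - observed_cdf))
    return max(gaps)


failure_times = [0.5, 8.5, 7.8, 8.3, 4.7, 3.5, 4.4]

# a) Lilliefors: mean and standard deviation estimated from the sample
d_lilliefors = max_d(failure_times, 5.39, 2.97)
reject_lilliefors = d_lilliefors > 0.300

# b) Kolmogorov-Smirnov: mean and standard deviation specified by H0
d_ks = max_d(failure_times, 4.5, 2.0)
reject_ks = d_ks > 0.47

print(f"Lilliefors max D = {d_lilliefors:.4f}, reject: {reject_lilliefors}")
print(f"K-S max D        = {d_ks:.4f}, reject: {reject_ks}")
```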
c) $H_0$: Poisson(4), $H_1$: Not Poisson(4). This can be done as a chi-square or Kolmogorov-Smirnov problem.
$f_e$ and $F_e$ come from the Poisson table.
x        O      O/n     Fo       f_e         E         F_e        D
0        2      .02     .02     .01832      1.832     .01832    .00168
1       30      .30     .32     .07326      7.326     .09158    .22842
2       30      .30     .62     .14652     14.652     .23810    .38190
3       18      .18     .80     .19537     19.537     .43347    .36653
4       10      .10     .90     .19537     19.537     .62884    .27116
5        2      .02     .92     .15629     15.629     .78513    .13487
6+       8      .08    1.00     .21487     21.487    1.0000     0
Sum    100     1.00            1.00000    100.000
For the Kolmogorov-Smirnov Method the 5% critical value is $\frac{1.36}{\sqrt{n}} = \frac{1.36}{\sqrt{100}} = 0.136$. This is less than the
maximum value of D, which is .38190, so reject $H_0$.
For the Chi-Squared Method, we have had to merge the first two cells, because the first E was below 5.
O        E          O^2/E
32       9.158    111.8148
30      14.652     61.4251
18      19.537     16.5839
10      19.537      5.1185
 2      15.629      0.2559
 8      21.487      2.9785
100    100.000    198.1767
We thus have 6 - 1 = 5 degrees of freedom. The value of Chi-squared that we compute is 198.1767 - 100 =
98.1767. From the Chi-squared table $\chi_{.05}^{2(5)} = 11.0705$. This is less than our computed $\chi^2$, so reject $H_0$.
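Both versions of the Poisson test can be recomputed from the observed counts. In this sketch the Poisson probabilities are computed directly instead of being read from a table, so the results match the hand figures only to about three decimals; the two critical values are those quoted in the solution, and merging the first two cells is hard-coded to mirror the solution rather than derived from a general rule.

```python
import math

observed = [2, 30, 30, 18, 10, 2, 8]   # households with 0, 1, ..., 5, 6+ sets
n = sum(observed)
mean = 4

# Poisson(4) probabilities for 0..5, with the remainder assigned to "6 or more"
probs = [math.exp(-mean) * mean ** k / math.factorial(k) for k in range(6)]
probs.append(1 - sum(probs))
expected = [n * p for p in probs]

# Kolmogorov-Smirnov: largest gap between the cumulative distributions
cum_obs = cum_exp = 0.0
max_d = 0.0
for o, p in zip(observed, probs):
    cum_obs += o / n
    cum_exp += p
    max_d = max(max_d, abs(cum_obs - cum_exp))
ks_critical = 1.36 / math.sqrt(n)

# Chi-squared: merge the first two cells because the first E is below 5
obs_merged = [observed[0] + observed[1]] + observed[2:]
exp_merged = [expected[0] + expected[1]] + expected[2:]
chi_sq = sum(o * o / e for o, e in zip(obs_merged, exp_merged)) - n
chi_critical = 11.0705                 # chi-squared table, 5 df, 5%

print(f"max D = {max_d:.4f} vs {ks_critical:.3f}")
print(f"chi-squared = {chi_sq:.3f} vs {chi_critical}")
```

The shortcut $\sum O^2/E - n$ equals the usual $\sum (O-E)^2/E$ whenever the observed and expected counts have the same total.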