THE UNIVERSITY OF NEW SOUTH WALES DEPARTMENT OF STATISTICS

Exercises for MATH3811, Statistical Inference
Contingency Tables
1. A sample of 46 students randomly selected from private high schools and a sample of 82 students randomly selected from public high schools were given standardized achievement tests with the following
results:
                 Score 0-275   Score 276-350   Score 351-425   Score 426-500
Private School             6              14              17               9
Public School             30              32              17               3
Test the null hypothesis that the distribution of test scores is the same for private and public high
school students at α = 0.05 against a two-sided alternative.
2. The following (3 × 3) contingency table is of 500 psychiatric patients classified by their degree of
depression and suicidal tendencies.
Observed Frequencies    Not Depressed   Moderately Depressed   Severely Depressed
Attempted Suicide                  26                     39                   39
Contemplated Suicide               20                     27                   27
Neither                           195                     93                   34
a) Assuming independence between the two factors, calculate the expected frequencies.
b) Carry out a test for independence.
3. In an experiment to study the dependence of hypertension on smoking habits, the following data
were taken on 180 individuals:
                  Non Smokers   Moderate Smokers   Heavy Smokers
Hypertension               21                 36              30
No Hypertension            48                 26              19
Test the hypothesis that the presence or absence of hypertension is independent of smoking habit.
4. The following table was obtained as a result of Random Breath Testing during the Easter Holidays
in NSW:
                 Tested and Charged   Tested and Not Charged
Area: Sydney                     99                    11669
Area: Country                   115                    15299
Test whether there is any association between area of testing and outcome of random breath test
during such a holiday period.
5. The following 2 × 3 × 4 contingency table presents data from the study of coronary heart disease.
The first factor concerns the presence or absence of coronary heart disease, the second factor is
serum cholesterol at three levels, and the third variable is systolic blood pressure with four categories.
Coronary   Serum Cholesterol   Pressure   Pressure   Pressure   Pressure
Disease    Level (mg/100cc)    <127       127-146    147-166    >166
Present    < 200                      2          3          3          4
           200-259                   11         13          6          9
           ≥ 260                      7         12         11         11
Absent     < 200                    117        121         47         22
           200-259                  204        307        111         63
           ≥ 260                     67         99         46         33
Construct appropriate two-way contingency tables and test the following three hypotheses:
a) H01: No association between coronary disease and systolic blood pressure.
b) H02: No association between serum cholesterol and systolic blood pressure.
c) H03: No association between coronary disease and serum cholesterol.
6. Two batches of 15 experimental animals each, with one batch inoculated and the other not inoculated,
were exposed to infection under comparable conditions. The results of the experiment are given in
the following (2 × 2) table:
Frequencies       Died   Survived
Inoculated           5         10
Not Inoculated       9          6
On the hypothesis of independence, calculate the probability of a result at least as extreme as that
actually observed, using:
a) The Chi-squared test without continuity correction.
b) The Chi-squared test with Yates’s continuity correction.
c) Is Fisher’s exact test suitable to be applied here? Give reasons for your answer.
d) Do you accept the hypothesis of independence?
7. Fourteen newly hired business majors, 10 males and 4 females, all equally qualified, are being assigned
by the bank president to their new jobs. Ten of the new jobs are tellers, and four are account
representatives. The null hypothesis is that males and females have equal chances at getting the
more desirable account representative job. The one-sided alternative of interest is that females are
more likely than males to get the account representative jobs. Only one female is assigned a teller
position. Using Fisher’s exact test, can you reject the null hypothesis in favour of the one-sided alternative?
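The tests of independence and homogeneity in Exercises 1 to 5 all rest on the same computation: under the null hypothesis the expected count in each cell is (row total × column total) / n, and the statistic Q is the sum of (observed − expected)² / expected over all cells, with (r − 1)(c − 1) degrees of freedom. The following is a minimal sketch of that calculation in plain Python, applied to the Exercise 3 data. The function name `pearson_chi2` is a hypothetical helper chosen for illustration and is not part of the course material.

```python
def pearson_chi2(table):
    """Return (Q, df, expected) for an r x c table of observed counts.

    `table` is a list of rows; the name and interface are illustrative only.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Pearson statistic: sum over all cells of (O - E)^2 / E
    q = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return q, df, expected


# Exercise 3: rows are hypertension / no hypertension,
# columns are non, moderate and heavy smokers.
smoking = [[21, 36, 30],
           [48, 26, 19]]

q, df, expected = pearson_chi2(smoking)
print(f"Q = {q:.3f}, df = {df}")
```

The printed Q can then be compared with the tabulated chi-squared critical value for the given degrees of freedom (5.99 for 2 degrees of freedom at α = 0.05, as quoted in the Answers).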
Answers
1. df = 3, Q = 17.3 > 7.815, reject H0.
2. df = 4, Q = 71.47 > 9.49, reject H0.
3. df = 2, Q = 14.464 > 5.99, reject H0.
4. df = 1, Q = 0.774 < 3.84, accept H0.
5. a) Q = 64.6641, df = 3, reject H0; b) Q = 19.48, df = 6, reject H0; c) Q = 31.3893, df = 2, reject H0.
6. a) Q = 2.1429, df = 1, p-value = 0.1432; b) Q = 1.2054, df = 1, p-value = 0.2723; c) Not suitable.
7. p-value = 0.041, hence the hypothesis is to be rejected at α = 0.05.
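The figures quoted for Exercises 6 and 7 can be checked with a short sketch in plain Python. It assumes the usual shortcut formula for a (2 × 2) table, Q = n(|ad − bc| − c₀)² / (r1 · r2 · c1 · c2) with c₀ = n/2 for Yates's correction and c₀ = 0 otherwise, and it computes the one-sided Fisher p-value as an upper tail of the hypergeometric distribution. The function names `chi2_2x2` and `fisher_upper_tail` are hypothetical helpers introduced here for illustration.

```python
from fractions import Fraction
from math import comb, erfc, sqrt


def chi2_2x2(a, b, c, d, yates=False):
    """Chi-squared statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates's correction reduces |ad - bc| by n/2 (never below zero)
        diff = max(diff - n / 2, 0)
    q = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi-squared > q) = erfc(sqrt(q / 2))
    return q, erfc(sqrt(q / 2))


def fisher_upper_tail(k, group, draws, total):
    """P(X >= k) for X hypergeometric: `draws` items taken from `total`,
    of which `group` belong to the category being counted."""
    numerator = sum(
        comb(group, x) * comb(total - group, draws - x)
        for x in range(k, min(group, draws) + 1)
    )
    return Fraction(numerator, comb(total, draws))


# Exercise 6: inoculated (5 died, 10 survived), not inoculated (9 died, 6 survived)
q_plain, p_plain = chi2_2x2(5, 10, 9, 6)
q_yates, p_yates = chi2_2x2(5, 10, 9, 6, yates=True)
print(f"uncorrected: Q = {q_plain:.4f}, p = {p_plain:.4f}")
print(f"Yates:       Q = {q_yates:.4f}, p = {p_yates:.4f}")

# Exercise 7: 4 females among 14 hires, 4 account representative jobs,
# 3 of which went to females (only one female is a teller).
p_fisher = fisher_upper_tail(3, group=4, draws=4, total=14)
print(f"Fisher one-sided p = {p_fisher} = {float(p_fisher):.3f}")
```

Using exact fractions for the Fisher tail shows the p-value as 41/1001, which rounds to the 0.041 given in answer 7.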