STAT 557
Fall 1998
Midterm Exam
NAME _____________
INSTRUCTIONS:
Show your work and write your answers in the space provided on this
exam. Use the back of the page or attach additional sheets of paper if more
space is needed, but clearly indicate where this is done. You may use a
calculator, pencils, erasers, and the formula sheet attached to this exam. No
other materials are allowed.
1.
The following table displays information on the number of accidents during a one year
period experienced by a sample of 400 workers from the automobile industry. This table
also presents maximum likelihood estimates of expected counts for a model that assumes that
the number of accidents for different workers are i.i.d. negative binomial random variables.
Number of Accidents in One Year     0       1      2      3      4     5 or more
Observed Number of Workers          282     74     26     8      4     6
Estimate of Expected Counts         282.2   72.5   26.6   10.7   4.5   3.5
(a)
Write out a formula for the log-likelihood that was maximized to get the estimates of
the expected counts shown in the table.
(b)
Using the observed counts and the estimates of the expected counts from the table, the
value of the Pearson statistic for testing the fit of the negative binomial model against
the general alternative is X^2 = 2.60. What are the degrees of freedom for this test?
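As a quick arithmetic check on part (b), the Pearson statistic can be recomputed from the observed and expected counts in the table. This is only a sketch: it uses the rounded expected counts printed in the exam, so a small discrepancy from the reported value is to be expected. The variable names are illustrative.

```python
# Observed counts and rounded ML expected counts from the table
# (categories: 0, 1, 2, 3, 4, "5 or more" accidents).
observed = [282, 74, 26, 8, 4, 6]
expected = [282.2, 72.5, 26.6, 10.7, 4.5, 3.5]

# Pearson statistic: sum over categories of (O - E)^2 / E.
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"Pearson X^2 from rounded expected counts: {x2:.2f}")
```

With the rounded expected counts this comes out near 2.57, close to the reported 2.60. The gap presumably reflects rounding in the printed table.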
(c)
The maximum likelihood estimates for the parameters in the negative binomial model
are π̂ = 0.53 and β̂ = 0.54, and the estimate of the large sample covariance matrix
for (π̂, β̂) is
\[
\begin{pmatrix} s^2_{\hat\pi} & s_{\hat\pi,\hat\beta} \\ s_{\hat\pi,\hat\beta} & s^2_{\hat\beta} \end{pmatrix}
=
\begin{pmatrix} .0035 & .0062 \\ .0062 & .0342 \end{pmatrix}.
\]
Show how this information can be used to construct an approximate 95% confidence
interval for
\[ \pi^{\beta} = \exp(\beta \log(\pi)), \]
the proportion of the population of workers that will have no accidents in a one year
period.
(Display required formulas. You do not have to finish numerical computations.)
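One standard way to approach part (c) is the delta method. The sketch below assumes that approach (the exam does not say which method is intended) and applies it to g(π, β) = π^β with the reported estimates and covariance matrix. The names `p0_hat`, `grad`, and `se_p0` are illustrative.

```python
import math

# Reported ML estimates and estimated covariance matrix for (pi_hat, beta_hat).
pi_hat, beta_hat = 0.53, 0.54
cov = [[0.0035, 0.0062],
       [0.0062, 0.0342]]

# Point estimate of the proportion with no accidents: pi^beta.
p0_hat = math.exp(beta_hat * math.log(pi_hat))

# Gradient of g(pi, beta) = pi^beta evaluated at the estimates.
grad = [
    beta_hat * p0_hat / pi_hat,   # dg/dpi   = beta * pi^(beta - 1)
    p0_hat * math.log(pi_hat),    # dg/dbeta = pi^beta * log(pi)
]

# Delta-method variance: grad' * cov * grad.
var_p0 = sum(grad[r] * cov[r][c] * grad[c] for r in range(2) for c in range(2))
se_p0 = math.sqrt(var_p0)

# Approximate 95% interval using the normal quantile 1.96.
z = 1.96
ci = (p0_hat - z * se_p0, p0_hat + z * se_p0)

print(f"estimate = {p0_hat:.4f}, se = {se_p0:.4f}")
print(f"approximate 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

An alternative under the same assumptions is to build the interval for β log(π) first and then exponentiate the endpoints, which keeps the interval inside (0, 1) when the estimate is near a boundary.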
2.
One way to measure the effectiveness of an allergy treatment is to monitor use of
"over-the-counter" remedies, such as decongestants and nasal sprays, that can be purchased without a
doctor's prescription. Each of 400 subjects enrolled in a study of a particular allergy
treatment was monitored for use of over-the-counter remedies during a four week period
before the treatment was first administered. Each subject was classified into one of the
following three categories for level of use of over-the-counter remedies:
1. no use of over-the-counter remedies
2. moderate use
3. heavy use
Then, each subject was given the treatment. During the treatment period, the subjects could
use any over-the-counter remedies that they wanted to use. Each subject was monitored
during the last 4 weeks of the treatment period and classified into one of the three categories
for use of "over-the-counter" remedies listed above.
Show how to test the null hypothesis that use of over-the-counter remedies was not affected
by the new allergy treatment. Give a formula for your test statistic and report the associated
degrees of freedom for your test.
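Because each subject is classified twice, one common approach (an assumption here, not necessarily the intended answer) is a test of marginal homogeneity on the 3×3 before-by-after table, for example the Stuart-Maxwell statistic with 2 degrees of freedom. The sketch below implements that statistic for a 3×3 table. The counts are hypothetical, since the exam provides no data, and the function name is illustrative.

```python
def stuart_maxwell_3x3(table):
    """Stuart-Maxwell statistic for marginal homogeneity in a 3x3 table.

    table[i][j] = number of subjects in category i before treatment
    and category j during treatment. Returns (statistic, df).
    """
    row = [sum(table[i]) for i in range(3)]
    col = [sum(table[i][j] for i in range(3)) for j in range(3)]

    # Differences in marginal totals for the first two categories
    # (the third is determined by the other two).
    d1 = row[0] - col[0]
    d2 = row[1] - col[1]

    # Estimated covariance matrix of (d1, d2) under marginal homogeneity.
    v11 = row[0] + col[0] - 2 * table[0][0]
    v22 = row[1] + col[1] - 2 * table[1][1]
    v12 = -(table[0][1] + table[1][0])

    # Quadratic form d' V^{-1} d using the explicit 2x2 inverse.
    det = v11 * v22 - v12 ** 2
    stat = (v22 * d1 ** 2 - 2 * v12 * d1 * d2 + v11 * d2 ** 2) / det
    return stat, 2


# Hypothetical counts for 400 subjects (rows: before, columns: during).
hypothetical = [[60, 30, 10],
                [50, 100, 40],
                [20, 60, 30]]

stat, df = stuart_maxwell_3x3(hypothetical)
print(f"Stuart-Maxwell X^2 = {stat:.2f} on {df} df")
```

The statistic would be compared with a chi-square distribution on 2 degrees of freedom. A perfectly symmetric table gives a statistic of zero.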
3.
In a study of smoking habits of college students, observations from 2317 students were
cross-classified into a 2×2×3×3 contingency table with respect to the following variables.
Each student in the study had at most one older sibling.
Variable   Description                       Levels
A          Smoking habit of the respondent   i=1 for smoker
                                             i=2 for non-smoker
B          Sex of respondent                 j=1 for female
                                             j=2 for male
C          Older sibling as a role model     k=1 older sibling smokes
                                             k=2 older sibling does not smoke
                                             k=3 no older sibling
D          Parents as role models            l=1 both parents smoke
                                             l=2 one parent smokes
                                             l=3 neither parent smokes
(a)
Write down the largest log-linear model that satisfies the following statement:
Given the parents' smoking status (D), the smoking status of the respondent (A) is
conditionally independent of both sex of the respondent (B) and the status of a
possible older sibling (C).
(b)
The following log-linear model was fit to the data:
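\[
\log m_{ijkl} = \lambda + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{C}_{k} + \lambda^{D}_{l}
  + \lambda^{AB}_{ij} + \lambda^{AC}_{ik} + \lambda^{AD}_{il} + \lambda^{BD}_{jl} + \lambda^{CD}_{kl}
  + \lambda^{ABD}_{ijl}
\]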
The value of the Pearson statistic for testing the fit of this model against the general
alternative is X^2 = 22.74. What are the degrees of freedom associated with this test?
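For a hierarchical log-linear model, the residual degrees of freedom are usually the number of cells minus the number of free parameters. That count assumes no empty cells and no estimates on the boundary. The sketch below does the bookkeeping for the model in part (b), with each term contributing the product of (levels − 1) over its variables. The `levels` and `terms` names are illustrative.

```python
from math import prod

# Number of levels for each variable in the 2x2x3x3 table.
levels = {"A": 2, "B": 2, "C": 3, "D": 3}

# Terms of the fitted model (the intercept is counted separately).
terms = ["A", "B", "C", "D", "AB", "AC", "AD", "BD", "CD", "ABD"]


def free_parameters(term):
    """Free parameters for one term under sum-to-zero constraints."""
    return prod(levels[v] - 1 for v in term)


n_cells = prod(levels.values())
n_params = 1 + sum(free_parameters(t) for t in terms)
df = n_cells - n_params

print(f"cells = {n_cells}, free parameters = {n_params}, df = {df}")
```

The parameter count can be cross-checked against the table of estimates in part (d), which lists one row per free parameter other than the intercept.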
(c)
How should data be collected from students if one wishes to accurately use a chi-square distribution for the X^2 test in part (b)?
(d)
Assume that the model shown in part (b) is the correct model. Describe what this
model implies about associations of the other three variables with the respondent's
smoking status. Maximum likelihood estimates of model parameters are shown
below. (These estimates satisfy the restrictions that the sum across the levels of any
single variable is zero.) Use these estimates to interpret the size and direction of
significant associations identified by this model. More space for your answer is
available on the next page.
Effect   Subscripts          Estimate    Standard Error   ChiSquare   Prob
A        i=1                 −0.2292     0.0394           33.91       0.0000
B        j=1                 −0.0197     0.0369           0.28        0.5938
A*B      i=1, j=1            −0.1463     0.0369           15.72       0.0001
D        l=1                 0.7216      0.0522           190.83      0.0000
         l=2                 0.9156      0.0504           329.60      0.0000
A*D      i=1, l=1            0.3522      0.0419           70.50       0.0000
         i=1, l=2            −0.1925     0.0431           19.91       0.0000
B*D      j=1, l=1            0.0469      0.0411           1.30        0.2543
         j=1, l=2            −0.00004    0.0415           0.00        0.9993
A*B*D    i=1, j=1, l=1       0.0988      0.0411           5.77        0.0163
         i=1, j=1, l=2       0.0179      0.0415           0.19        0.6655
C        k=1                 0.5934      0.0530           125.16      0.0000
         k=2                 0.1080      0.0580           3.47        0.0625
A*C      i=1, k=1            0.2949      0.0313           88.59       0.0000
         i=1, k=2            0.0247      0.0335           0.54        0.4605
C*D      k=1, l=1            0.2000      0.0598           11.18       0.0008
         k=1, l=2            −0.5167     0.0584           78.27       0.0000
         k=2, l=1            0.1037      0.0651           2.54        0.1112
         k=2, l=2            −0.1864     0.0627           8.83        0.0030
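The ChiSquare and Prob columns appear to be Wald statistics, (estimate / standard error)^2, referred to a chi-square distribution with 1 degree of freedom. That reading is an assumption, but it is consistent with the printed values up to rounding. The sketch below recomputes two rows. The helper names are illustrative.

```python
import math


def wald_chi_square(estimate, std_error):
    """Wald statistic for a single parameter: (estimate / SE)^2."""
    return (estimate / std_error) ** 2


def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square(1) variable at x."""
    return math.erfc(math.sqrt(x / 2.0))


# (estimate, standard error) pairs copied from the table.
rows = {
    "A*B (i=1, j=1)": (-0.1463, 0.0369),
    "A*B*D (i=1, j=1, l=1)": (0.0988, 0.0411),
}

results = {}
for label, (est, se) in rows.items():
    w = wald_chi_square(est, se)
    results[label] = (w, chi2_1df_pvalue(w))
    print(f"{label}: chi-square = {w:.2f}, p = {results[label][1]:.4f}")
```

Since the printed estimates and standard errors are rounded, the recomputed values can differ from the table in the last digit.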
EXAM SCORE