STAT 557 EXAM I NAME ________________

FALL 2000

Instructions: You may use a calculator and the formula sheets you brought to this exam. No
other notes or books are allowed. Write your answers in the spaces provided
below. If you need more space, use the back of the page or attach additional
sheets of paper, but clearly indicate where this is done. You need not complete
numerical computations; you will receive complete credit by showing that you
know how to solve the problem. Be sure to define any notation you use that is not
defined in the statement of a problem.

1. Medical researchers want to investigate the claim that long term consumption of a low dose
of aspirin (in this case 250 mg per day for at least five years) reduces the risk of
experiencing a heart attack in middle aged males (men between 40 and 55 years old).
(a) Describe how a prospective study could be done to investigate this claim.
(b) Describe how a retrospective study could be done to investigate this claim.
(c) Suppose the study you described in Part (b) produced the following table of counts.

Long term consumption            Experienced a heart attack
of 250 mg of aspirin per day         Yes         No
        Yes                           15         27
        No                           185        173

Compute an approximate 95% confidence interval for the odds ratio corresponding to
the odds that long term aspirin users experience a heart attack divided by the odds that
non-aspirin users experience a heart attack.
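
One way the computation in Part (c) could be set up is sketched below. It assumes the usual large-sample interval built on the log odds ratio, with standard error equal to the square root of the sum of reciprocal cell counts. The variable names are illustrative, and the cross-product ratio does not depend on which variable indexes the rows.

```python
import math

# Cell counts in the order they appear in the 2x2 table of Part (c).
n11, n12 = 15, 27
n21, n22 = 185, 173

# Cross-product (odds) ratio and large-sample standard error of its log.
odds_ratio = (n11 * n22) / (n12 * n21)
se_log_or = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

# Approximate 95% limits: build the interval on the log scale, then exponentiate.
z = 1.96
lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"odds ratio = {odds_ratio:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the estimate is roughly 0.52 with limits near 0.27 and 1.01.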
(d) Explain why the odds ratio in Part (c) can be used as an approximate measure of relative
risk of heart attack. Be sure to describe the situations in which this approximation
would be most accurate.
2. Each member of a simple random sample of 400 female high school students in Iowa was
asked the question:
“Should teenagers be allowed to purchase birth control pills without the consent of
their parents?”
Each respondent was classified into one of three categories:
(1) Yes
(2) No
(3) Unsure/No opinion
A response to the same question was obtained from the mother of each student in the sample.
Show how you would use the data from this survey to test the null hypothesis that the
distribution of opinions across the three response categories is the same for female high
school students and mothers of female high school students. Give a formula for your test
statistic, degrees of freedom, and the critical value for a .05 Type I error level test.
3. In a study of two surgical procedures (procedure A and procedure B) for correcting a heart
defect, patients with the heart defect will be recruited at 18 different hospitals. The number
of patients recruited at the various hospitals will vary from as few as 6 to as many as 16 at a
single hospital. Within each hospital, half of the patients will be randomly assigned to
procedure A and the other half will be treated with procedure B. It is well known that
success rates for these types of surgical procedures vary from hospital to hospital, depending
on the training and policies of the staff at different hospitals and features of the populations
of patients served by different hospitals. Nevertheless, the researchers hope to find that one
procedure is consistently better than the other. Describe how you would analyze the data
from this experiment. In particular, identify the null hypotheses to be tested and the
alternative hypothesis. Report a formula for a test statistic and describe how it would be used
to establish a conclusion.
4. A simple random sample of 50 teaching assistants at a large public university were cross
classified into a 3x3 contingency table with respect to their perception of how well students
were prepared to take the course they were teaching and their level of job satisfaction.

                              Job Satisfaction
Student Preparation      Low    Moderate    High
Poor                      8        6          3
Moderately Good           1       16          4
Very Good                 0        7          5

(a) Estimate the gamma measure of association.
(b) Estimate λ_{R|C}.
(c) The estimate of kappa is 0.34. Is it better to use gamma or kappa to describe the
relationship between level of teacher job satisfaction and level of student preparation?
Explain.
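
The estimates in Parts (a) and (b) could be computed along the following lines. This is a sketch, assuming rows are student preparation (Poor, Moderately Good, Very Good) and columns are job satisfaction (Low, Moderate, High), with gamma taken as (C - D)/(C + D) for concordant and discordant pairs and λ_{R|C} as the proportional reduction in error when the row category is predicted from the column category. The function name is illustrative.

```python
# 3x3 table: rows = student preparation, columns = job satisfaction.
table = [
    [8, 6, 3],
    [1, 16, 4],
    [0, 7, 5],
]

def concordant_discordant(t):
    """Count concordant (C) and discordant (D) pairs in an ordered two-way table."""
    rows, cols = len(t), len(t[0])
    c = d = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only rows below row i
                for l in range(cols):
                    if l > j:                 # below and to the right: concordant
                        c += t[i][j] * t[k][l]
                    elif l < j:               # below and to the left: discordant
                        d += t[i][j] * t[k][l]
    return c, d

C, D = concordant_discordant(table)
gamma_hat = (C - D) / (C + D)

# Lambda for predicting the row category given the column category.
n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_maxima = [max(col) for col in zip(*table)]
lambda_r_given_c = (sum(col_maxima) - max(row_totals)) / (n - max(row_totals))

print(f"C = {C}, D = {D}, gamma = {gamma_hat:.3f}, lambda(R|C) = {lambda_r_given_c:.3f}")
```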
5. Using wild ducks that were captured and fitted with radio transmitters in the previous
summer, researchers were able to locate 100 nesting pairs of ducks in the subsequent spring.
The number of eggs hatched in each of the 100 nests was recorded, and the counts are shown
below.

Number of eggs that hatched    0    1    2    3    4    5    6    7    8    9   10
Number of nests               26    8   12   11   18   14    8    2    0    0    1

This table indicates, for example, that 10 eggs hatched in one nest and no eggs hatched in
26 nests. If we use Y1, Y2, …, Y100 to represent the number of eggs that hatched in the 100
nests, then this table indicates that 26 of the Yi values are zero, 8 of the Yi values are
one, etc.
The relatively large number of zero counts requires a probability model that can give more
probability to a zero outcome than a Poisson distribution. Consider the model where Y1, Y2,
…, Y100 are independent random variables with
\[
\Pr\{Y_i = 0\} = \theta + (1 - \theta)\, e^{-\lambda}
\]
\[
\Pr\{Y_i = k\} = (1 - \theta)\, \frac{\lambda^{k}}{k!}\, e^{-\lambda}, \qquad k = 1, 2, \ldots
\]
(a) Write out the log-likelihood function for this model. Define any notation you use that
has not been defined in the statement of this problem.
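
As a sketch of how the log-likelihood in Part (a) might be evaluated numerically, the function below sums the log of the model probabilities over the observed counts, where n_k (an introduced notation) denotes the number of nests with k hatched eggs. The function name `zip_loglik` and the dictionary layout are illustrative choices, not part of the problem statement.

```python
import math

# n_k: number of nests with k hatched eggs, from the table of counts.
counts = {0: 26, 1: 8, 2: 12, 3: 11, 4: 18, 5: 14, 6: 8, 7: 2, 8: 0, 9: 0, 10: 1}

def zip_loglik(theta, lam, counts):
    """Log-likelihood of the zero-inflated Poisson model for grouped counts."""
    ll = 0.0
    for k, n_k in counts.items():
        if k == 0:
            # Pr{Y = 0} = theta + (1 - theta) * exp(-lambda)
            ll += n_k * math.log(theta + (1 - theta) * math.exp(-lam))
        else:
            # Pr{Y = k} = (1 - theta) * lambda^k * exp(-lambda) / k!
            ll += n_k * (math.log(1 - theta) + k * math.log(lam)
                         - lam - math.lgamma(k + 1))
    return ll

print(zip_loglik(0.25, 3.5, counts))
```

The function can be passed to any general-purpose optimizer (with the sign flipped for a minimizer) to locate the maximum likelihood estimates.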
(b) The log-likelihood from Part (a) was maximized to obtain maximum likelihood
estimates
\[
\hat{\theta} = 0.241 \quad \text{and} \quad \hat{\lambda} = 3.675 .
\]
The covariance matrix for $(\hat{\theta}, \hat{\lambda})'$ was reported as
\[
V = \begin{pmatrix} .002178 & .000835 \\ .000835 & .039643 \end{pmatrix}
\]
Assuming that the proposed probability model is correct, use this information to construct
an approximate 95% confidence interval for (1 − θ) λ , the mean number of eggs that
hatch per nest.
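
One possible approach to Part (b) is the delta method, sketched below. It assumes the function g(θ, λ) = (1 − θ)λ with gradient (−λ, 1 − θ) evaluated at the estimates, and a normal-theory interval using 1.96. Variable names are illustrative.

```python
import math

# Reported estimates and covariance matrix for (theta_hat, lambda_hat).
theta_hat, lam_hat = 0.241, 3.675
V = [[0.002178, 0.000835],
     [0.000835, 0.039643]]

# Estimated mean number of hatched eggs per nest: g(theta, lambda) = (1 - theta) * lambda.
mean_hat = (1 - theta_hat) * lam_hat

# Gradient of g with respect to (theta, lambda), evaluated at the estimates.
grad = [-lam_hat, 1 - theta_hat]

# Delta-method variance: grad' V grad.
var_mean = sum(grad[i] * V[i][j] * grad[j] for i in range(2) for j in range(2))
se_mean = math.sqrt(var_mean)

# Approximate 95% confidence interval.
z = 1.96
lower, upper = mean_hat - z * se_mean, mean_hat + z * se_mean

print(f"estimate = {mean_hat:.3f}, se = {se_mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```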
(c) Suppose the researchers want to do a larger study to estimate (1 − θ) λ more precisely.
They want the standard error of their estimator to be smaller than 0.10. Using the
information from the current study, what is your recommendation for the number of
nests that they should monitor in the new study? Show how you arrived at your answer.
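
A rough sketch of one way to reason about Part (c): assuming the variance of the estimator is inversely proportional to the number of nests, and that the new study has parameter values similar to the current estimates, the delta-method variance from the current 100 nests can be rescaled to find the smallest n giving a standard error below 0.10. These scaling assumptions and the variable names are not part of the problem statement.

```python
import math

n_current = 100          # nests in the current study
target_se = 0.10         # desired upper bound on the standard error

# Delta-method variance of (1 - theta_hat) * lambda_hat from the current study.
theta_hat, lam_hat = 0.241, 3.675
V = [[0.002178, 0.000835],
     [0.000835, 0.039643]]
grad = [-lam_hat, 1 - theta_hat]
var_current = sum(grad[i] * V[i][j] * grad[j] for i in range(2) for j in range(2))

# If variance scales as 1/n, then var_new = var_current * n_current / n_new.
# Solve var_new < target_se**2 for the smallest whole number of nests.
n_required = math.ceil(n_current * var_current / target_se ** 2)

print(f"monitor at least {n_required} nests")
```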
(d) To determine if the observed data are inconsistent with the proposed probability model,
the following table of expected counts was constructed.

Number of hatched eggs        0      1      2      3      4      5      6    7 or more
Observed number of nests     26      8     12     11     18     14      8        3
Expected number of nests  26.00   7.08  13.00  15.92  14.63   6.59   5.59     6.04

The value of the Pearson statistic is X² = 5.32. What are the degrees of freedom for
this test?