Biostatistics - Winona State University

Biostatistics (STAT 405) - Midterm Exam ( points)
NAME:
1. ICU Admission Study
The ICU data set consists of a sample of 200 subjects who were part of a much larger study on survival of
patients following admission to an adult intensive care unit (ICU). The major goal of this study was to
examine and identify the risk factors associated with ICU mortality. These data represent a small subset of
the variables collected by the researchers.
Source: Data were collected at Baystate Medical Center in Springfield, MA.
NAME     Description                                           Codes/Values/Units
--------------------------------------------------------------------------------
Outcome  Vital Status                                          1 = Died, 0 = Survived
Age      Age                                                   years
Cancer   Cancer Part of Present Problem                        Yes, No
SYS      Systolic Blood Pressure at ICU Admission              mmHg
ER       Was patient first brought to the emergency room (ER)  Yes, No
Below is a cross-tabulation of outcome and whether or not the patient was admitted to the
ICU from the emergency room.

ER              Died   Survived   Row Totals
Yes               38        109          147
No                 2         51           53
Column Totals     40        160      n = 200

a) Compute the risk difference, relative risk, and odds ratio for death associated with being
admitted to the adult ICU from the emergency room. (6 pts.)
b) Construct a CI for the RR associated with being admitted to the ICU from the emergency
room and interpret the CI in practical terms. (4 pts.)
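For parts (a) and (b), all three measures come directly from the 2×2 table. The sketch below is one way to compute them in Python. The function name `two_by_two_measures` is hypothetical, the 95% level is an assumption (the question does not state one), and the interval uses the common large-sample log-scale method, which is only one of several possible choices.

```python
import math

def two_by_two_measures(a, b, c, d, z=1.96):
    """Risk difference, relative risk, odds ratio, and a log-scale CI for the RR.

    a, b = died, survived among patients with ER = Yes
    c, d = died, survived among patients with ER = No
    z    = normal quantile; 1.96 assumes a 95% interval (an assumption here)
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rd = risk_exposed - risk_unexposed
    rr = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    # Large-sample standard error of ln(RR)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ci = (math.exp(math.log(rr) - z * se_log_rr),
          math.exp(math.log(rr) + z * se_log_rr))
    return rd, rr, odds_ratio, ci

# Counts from the ER-by-outcome cross-tabulation
rd, rr, odds_ratio, rr_ci = two_by_two_measures(38, 109, 2, 51)
print(f"RD = {rd:.3f}, RR = {rr:.2f}, OR = {odds_ratio:.2f}")
print(f"Approximate 95% CI for RR: ({rr_ci[0]:.2f}, {rr_ci[1]:.2f})")
```

Because only 2 deaths occurred in the non-ER group, the interval is wide and the large-sample approximation should be read with caution.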
The results of fitting a logistic model for survival status in R using all available
covariates are given below.
glm(formula = Outcome ~ Age + Cancer + Sys + ER, family = "binomial")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.2327  -0.6744  -0.4100  -0.1657   2.9168

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.455427   1.405254  -3.171 0.001522 **
Age          0.036205   0.011222   3.226 0.001254 **
CancerYes    1.705672   0.821645   2.076 0.037901 *
Sys         -0.013801   0.006061  -2.277 0.022795 *
ERYes        2.923411   0.887747   3.293 0.000991 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 200.16  on 199  degrees of freedom
Residual deviance: 163.55  on 195  degrees of freedom
AIC: 173.55
c) Holding the other covariates constant, calculate the odds ratio associated with a 10
year increase in age. Interpret. (3 pts.)
d) Holding the other covariates constant, calculate the odds ratio associated with having
cancer as part of their problem and find a CI for the population OR. (4 pts.)
e) Holding the other covariates constant, calculate the odds ratio associated with being
admitted to the ICU from the emergency room. Contrast this estimate with the OR from part
(a). Why do they differ? (3 pts.)
f) Elmer Fudd is 72 years old, cancer free, and was brought to the emergency room after
collapsing from chasing a wascally rabbit. His systolic blood pressure upon arrival was
90 mmHg. What is the estimated probability that poor Elmer Fudd dies? (3 pts.)
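Parts (c) through (f) all follow from the fitted coefficients: an odds ratio is the exponential of a coefficient (times the size of the change), and a predicted probability is the inverse logit of the linear predictor. The Python sketch below illustrates the arithmetic. It assumes Outcome is coded so that the model describes the probability of death (1 = Died), and it assumes a 95% Wald interval for part (d) since no level is stated; the variable names are illustrative only.

```python
import math

# Estimates and one standard error copied from the R output
coef = {"(Intercept)": -4.455427, "Age": 0.036205, "CancerYes": 1.705672,
        "Sys": -0.013801, "ERYes": 2.923411}
se_cancer = 0.821645

# (c) odds ratio for a 10-year increase in age: exp(10 * beta_Age)
or_age10 = math.exp(10 * coef["Age"])

# (d) odds ratio for cancer with a Wald interval (95% level is an assumption)
z = 1.96
or_cancer = math.exp(coef["CancerYes"])
or_cancer_ci = (math.exp(coef["CancerYes"] - z * se_cancer),
                math.exp(coef["CancerYes"] + z * se_cancer))

# (e) adjusted odds ratio for admission from the ER
or_er = math.exp(coef["ERYes"])

# (f) linear predictor for age 72, no cancer, Sys = 90, ER = Yes
logit = (coef["(Intercept)"] + coef["Age"] * 72 + coef["CancerYes"] * 0
         + coef["Sys"] * 90 + coef["ERYes"] * 1)
p_death = 1 / (1 + math.exp(-logit))  # inverse logit

print(f"OR per 10 years of age: {or_age10:.2f}")
print(f"OR cancer: {or_cancer:.2f}, CI ({or_cancer_ci[0]:.2f}, {or_cancer_ci[1]:.2f})")
print(f"Adjusted OR for ER: {or_er:.2f}")
print(f"Estimated probability of death: {p_death:.3f}")
```

The adjusted ER odds ratio here conditions on age, cancer, and systolic blood pressure, so it need not match the unadjusted value computed from the cross-tabulation.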
2.) Sixty-five pairs of pregnant women at a high risk of pregnancy-induced hypertension
participated in a randomized controlled trial. Each pair was matched according to age
and parity (# of previous pregnancies). One woman in each pair was randomized to
receive aspirin and the other a placebo. For each pair hypertension status was recorded
and the results are shown below.
                                     Placebo Treated
                              Hypertension   No Hypertension
Aspirin    Hypertension             4               11
Treated    No Hypertension         30               20
Is there evidence to suggest that aspirin was effective at reducing the risk of
hypertension? Use the appropriate test to answer this question. The table below should
prove useful. State the test you are using and summarize your findings. (6 pts.)
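Because the women are matched in pairs, only the discordant pairs carry information about the treatment effect. Assuming McNemar's test is the intended analysis, the Python sketch below computes the continuity-corrected chi-square statistic from the two discordant counts (11 and 30). The function name `mcnemar_chi2` is hypothetical, and the p-value uses the identity P(χ² > x) = erfc(√(x/2)), which holds for 1 degree of freedom.

```python
import math

def mcnemar_chi2(n_01, n_10, continuity=True):
    """McNemar chi-square statistic (1 df) and p-value from discordant pair counts."""
    diff = abs(n_01 - n_10)
    if continuity:
        diff = max(diff - 1, 0)  # continuity correction
    stat = diff ** 2 / (n_01 + n_10)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Discordant pairs: hypertension under aspirin only (11), under placebo only (30)
stat, p_value = mcnemar_chi2(11, 30)
print(f"McNemar chi-square = {stat:.2f}, p-value = {p_value:.4f}")
```

The p-value from this sketch is two-sided; a one-sided conclusion about aspirin reducing risk would also need the direction of the discordant counts.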
3.) Duration of IUD Use and Infertility
A study was performed relating the duration of IUD (intrauterine device) use to
infertility. A group of 89 infertile IUD users and a group of 640 control (fertile) IUD users
were identified. The women were subdivided by the duration of IUD use and the data are
summarized in the table below.
                         Duration of IUD use in months
Status          < 3        [3,18)     [18,36]    > 36       Row Totals
                months     months     months     months
Case            10         23         20         36         89
                (7.69)     (      )   (22.95)    (31.13)
Control         53         200        168        219        640
                (55.31)    (195.78)   (165.05)   (      )
Column Totals   63         223        188        255        n = 729
a) Explain what is meant by a case-control study. What is the main reason for doing a
case-control study? What limitations do these studies have? (3 pts.)
b) What type of statistical test could we use to determine whether the duration of IUD
use was the same for both cases and controls? (1 pt.)
c) Fill in the missing expected frequencies in the contingency table above. (2 pts.)
d) The result of the test is shown below. Summarize the findings from the test in
terms of the research question. (3 pts.)
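Each expected frequency in part (c) is (row total × column total) / n. The Python sketch below recomputes the totals from the observed cell counts (which sum to 89 cases, 640 controls, and 729 women overall) and then fills in the whole table. It also forms the Pearson chi-square statistic, assuming that is the test intended in part (b); the variable names are illustrative.

```python
# Observed counts by duration: <3, [3,18), [18,36], >36 months
observed = [
    [10, 23, 20, 36],     # cases
    [53, 200, 168, 219],  # controls
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic and its degrees of freedom
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)

for label, exp_row in zip(["Case", "Control"], expected):
    print(label, [round(e, 2) for e in exp_row])
print(f"chi-square = {chi2:.2f} on {df} df")
```

The printed expected counts can be checked against the values already given in parentheses in the table.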
4) The ε4 allele of the gene encoding apolipoprotein E (APOE) is strongly associated
with Alzheimer’s disease, but its value in making the diagnosis remains uncertain. A
study was conducted among 2188 patients who were evaluated at autopsy for
Alzheimer’s disease by previously established pathological criteria. Patients were also
evaluated clinically (based on symptoms while they were alive) for the presence of
Alzheimer’s disease. Suppose the pathological diagnosis is considered the “gold
standard” for Alzheimer’s disease (D+). If both the ε4 allele and clinical diagnosis are
present it is considered a positive screening test result (T+).
                                Pathological     No Pathological
                                Diagnosis for    Diagnosis for AD
                                AD (D+)          (D-)               Row Totals
ε4 allele and clinical
diagnosis both present (T+)     1076             66                 1142
ε4 allele or clinical
diagnosis absent (T-)           694              352                1046
Column Totals                   1770             418                2188
a) What is the specificity of the test? False-positive probability? (2 pts.)
b) What is the sensitivity of the test? False-negative probability? (2 pts.)
c) 3% of the population between 65 and 74 years of age has Alzheimer’s disease. Given
that 74 year old Willem de Kooning has the ε4 allele and has been clinically diagnosed
as possibly having Alzheimer’s, what is the probability he actually has Alzheimer’s? (4 pts.)
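Sensitivity and specificity are column proportions of the screening table, and part (c) applies Bayes' rule with the stated 3% prevalence to obtain the probability of disease given a positive test. A minimal Python sketch of these calculations follows; the variable names are illustrative, and the sketch assumes the study's sensitivity and specificity carry over to the 65 to 74 age group.

```python
# Counts from the screening table (APOE e4 allele plus clinical diagnosis = T+)
tp, fp = 1076, 66    # T+ row: D+, D-
fn, tn = 694, 352    # T- row: D+, D-

sensitivity = tp / (tp + fn)          # P(T+ | D+)
specificity = tn / (tn + fp)          # P(T- | D-)
false_negative = 1 - sensitivity      # P(T- | D+)
false_positive = 1 - specificity      # P(T+ | D-)

# Bayes' rule with the stated prevalence for ages 65 to 74
prevalence = 0.03
ppv = (sensitivity * prevalence) / (
    sensitivity * prevalence + false_positive * (1 - prevalence))

print(f"Sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
print(f"False negative = {false_negative:.3f}, false positive = {false_positive:.3f}")
print(f"P(D+ | T+) at 3% prevalence = {ppv:.3f}")
```

The predictive value is computed from the population prevalence rather than from the table, because the autopsy sample contains a far higher share of Alzheimer's cases than the general population.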
Multiple Choice Questions: Circle the correct answer. (2 pts each)
5. Suppose a study is performed concerning infant blood pressure and birth weight.
All infants born in a specific hospital are ascertained within the first week of life
while in the hospital and have their blood pressure measured in the newborn
nursery. The researchers then divide the infants into two groups: a high-blood
pressure group and a normal blood pressure group. They then compare birth
weights between these two groups. What type of study design was used?
a. Case-control study
b. Cross-sectional study
c. Randomized clinical trial
d. Cohort study
6. A study looked at the effects of oral contraceptive use on heart disease in women 40
to 44 years of age. It found that among 5000 current oral contraceptive users at
baseline, 13 women developed a myocardial infarction over a 3-year period, whereas
among 10,000 non-oral contraceptive users, 7 developed a myocardial infarction over a
3-year period. What type of study design was used?
a. Case-control study
b. Cross-sectional study
c. Randomized clinical trial
d. Cohort study
7. Consider the previous problem investigating the effects of oral contraceptive use on
heart disease in women 40 to 44 years of age. Which of the following statistical tests
is most appropriate for these data?
a. binomial exact test
b. chi-square test
c. McNemar’s test
d. independent samples t-test
8. Once again, consider the study from problem 6. The relative risk of having a
myocardial infarction for oral contraceptive users compared to non-users was
investigated, and a 95% confidence interval for this relative risk is given by (1.48,
9.30). Which of the following statements is most correct?
a. This interval provides evidence that for women age 40-44, oral contraceptive
users are more likely to have a myocardial infarction than non-users because
the confidence interval does not include zero.
b. This interval provides evidence that for women age 40-44, oral contraceptive
users are more likely to have a myocardial infarction than non-users because
the confidence interval does not include one.
c. This interval does not provide evidence that for women age 40-44, oral
contraceptive users are more likely to have a myocardial infarction than non-users
because the confidence interval does not include zero.
d. This interval does not provide evidence that for women age 40-44, oral
contraceptive users are more likely to have a myocardial infarction than non-users
because the confidence interval does not include one.
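The interval quoted in problem 8 can be checked against the counts in problem 6. The Python sketch below assumes the interval was built with the usual large-sample method on the log scale (the exam does not say how it was obtained); under that assumption the result should agree with (1.48, 9.30) to two decimals.

```python
import math

mi_users, n_users = 13, 5000         # MIs among current OC users
mi_nonusers, n_nonusers = 7, 10000   # MIs among non-users

rr = (mi_users / n_users) / (mi_nonusers / n_nonusers)

# Large-sample standard error of ln(RR)
se_log_rr = math.sqrt(1 / mi_users - 1 / n_users
                      + 1 / mi_nonusers - 1 / n_nonusers)
lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

A relative risk of 1 corresponds to no association, which is why the interval is compared with one rather than zero.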
9. A twin design is used to study age-related macular degeneration (AMD), a common
eye disease of the elderly that results in substantial losses in vision. The study
involves 66 sets of identical twins, where one twin has AMD and the other twin does
not. The twins are given a dietary questionnaire to report their usual diet, and
researchers record whether each individual in the study takes multivitamin
supplements. The data are shown in the following table.
                                       Normal Twin
                               Multivitamin    No multivitamin
AMD twin                       supplement      supplement
Multivitamin supplement        3               10
No multivitamin supplement     8               45
Which of the following statistical tests is most appropriate for these data?
a. Fisher’s exact test
b. chi-square test
c. McNemar’s test
d. independent samples t-test
10. Which of the following statements is most correct?
a. It is appropriate to calculate relative risk for case-control studies.
b. The odds ratio defines how much more likely a certain outcome is to occur
when a given risk factor is present.
c. The odds ratio can be used to estimate the relative risk when the disease
outcome under study is rare in the general population.
d. The odds ratio should not be used when analyzing results of case-control
studies.
Short Answer Questions:
11. A study is conducted to investigate the relationship between lung-cancer incidence
and heavy drinking (defined as ≥2 drinks per day). The data are summarized in the
following table.
                      Drinking Status
Lung Cancer    Heavy Drinker    Non-Heavy Drinker
Yes            33               27
No             1667             2273
a. Compute the relative risk of lung cancer for heavy drinkers compared to
non-heavy drinkers. (3 pts)
b. Compute the odds ratio for having lung cancer for heavy drinkers compared to
non-heavy drinkers. (3 pts)
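Computing both measures from the same table shows how close they are when the outcome is uncommon in both groups. The Python sketch below uses the counts from the lung-cancer table; the variable names are illustrative.

```python
heavy_cases, heavy_noncases = 33, 1667
nonheavy_cases, nonheavy_noncases = 27, 2273

# Relative risk: ratio of the two incidence proportions
risk_heavy = heavy_cases / (heavy_cases + heavy_noncases)
risk_nonheavy = nonheavy_cases / (nonheavy_cases + nonheavy_noncases)
rr = risk_heavy / risk_nonheavy

# Odds ratio: cross-product ratio of the 2x2 table
odds_ratio = (heavy_cases * nonheavy_noncases) / (heavy_noncases * nonheavy_cases)

print(f"RR = {rr:.3f}, OR = {odds_ratio:.3f}")
```

With lung cancer occurring in roughly 1% to 2% of each group, the odds of disease are nearly equal to the risk, so the two ratios nearly coincide.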
12. Improving control of blood-glucose levels is an important motivation for the use of
insulin pumps by diabetic patients. However, certain side effects have been reported
with pump therapy. The results of a study investigating the occurrence of diabetic
ketoacidosis (DKA) in patients before and after the start of pump therapy are shown
below. The research question is as follows: Is the rate of DKA different before and
after the start of pump therapy?
                             Before pump therapy
After pump therapy         No DKA          DKA
No DKA                     128             7
DKA                        19              7
a. The binomial distribution can be used to find the p-value for this test. What
values should be used for n and p? (2 pts)
n = _____________ , p = _____________
b. We can also use SAS to carry out this analysis, as shown below. Use the output
to write a conclusion addressing the research question. (3 pts)
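For paired before-and-after data, the binomial calculation mentioned in part (a) uses only the patients whose DKA status changed. Assuming the intended analysis is the exact form of McNemar's test, the Python sketch below counts the discordant patients (7 and 19 in the table) and computes a two-sided p-value under a success probability of 0.5. The function name `exact_mcnemar_p` is hypothetical.

```python
from math import comb

def exact_mcnemar_p(n_a, n_b):
    """Two-sided exact binomial p-value from the two discordant counts.

    Under the null hypothesis each discordant patient is equally likely
    to fall in either discordant cell, so the count is Binomial(n, 0.5).
    """
    n = n_a + n_b
    k = min(n_a, n_b)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n, min(1.0, 2 * lower_tail)

n, p_value = exact_mcnemar_p(7, 19)
print(f"n = {n} discordant patients, two-sided exact p-value = {p_value:.4f}")
```

Doubling the smaller tail is one common convention for a two-sided exact p-value; software may report a slightly different value if it uses another convention.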
13. Two groups of 50-59 year-old men with initially normal cholesterol levels are
identified. Group A consists of 25 men whose cholesterol level rose by 50 mg/dL
over a 5-year period, and Group B consists of 25 men whose cholesterol level
dropped by 50 mg/dL over a 5-year period. The groups were followed for mortality
over another 5 years, and the results are given below.
a. Researchers were interested in whether men whose cholesterol level rose
(Group A) had a different subsequent mortality than those whose cholesterol
level dropped (Group B). Which statistical test is most appropriate for
investigating this research question? (2 pts)
b. Use the SAS output to find the p-value for testing the research question given
in part a, and write a conclusion in the context of the problem. (4 pts)
c. The power for the above test is calculated to be 0.271. Using everyday
language, explain exactly what this means.