Biostatistics (STAT 405) - Midterm Exam ( points)
NAME:

1. ICU Admission Study

The ICU data set consists of a sample of 200 subjects who were part of a much larger study on survival of patients following admission to an adult intensive care unit (ICU). The major goal of this study was to examine and identify the risk factors associated with ICU mortality. These data represent a small subset of the variables collected by the researchers.

Source: Data were collected at Baystate Medical Center in Springfield, MA.

NAME      Description                                             Codes/Values/Units
--------------------------------------------------------------------------------
Outcome   Vital Status                                            1 = Died, 0 = Survived
Age       Age                                                     years
Cancer    Cancer Part of Present Problem                          Yes, No
SYS       Systolic Blood Pressure at ICU Admission                mmHg
ER        Was patient first brought to the emergency room (ER)    Yes, No

Below is a cross-tabulation of outcome and whether or not the patient was admitted to the ICU from the emergency room.

ER               Died   Survived   Row Totals
Yes                38        109          147
No                  2         51           53
Column Totals      40        160      n = 200

a) Compute the risk difference, relative risk, and odds ratio for death associated with being admitted to the adult ICU from the emergency room. (6 pts.)

b) Construct a CI for the RR associated with being admitted to the ICU from the emergency room and interpret the CI in practical terms. (4 pts.)

The results of fitting a logistic model for survival status in R using all available covariates are given below.

    glm(formula = Outcome ~ Age + Cancer + Sys + ER, family = "binomial")

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -1.2327  -0.6744  -0.4100  -0.1657   2.9168

    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept) -4.455427   1.405254  -3.171 0.001522 **
    Age          0.036205   0.011222   3.226 0.001254 **
    CancerYes    1.705672   0.821645   2.076 0.037901 *
    Sys         -0.013801   0.006061  -2.277 0.022795 *
    ERYes        2.923411   0.887747   3.293 0.000991 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 200.16  on 199  degrees of freedom
    Residual deviance: 163.55  on 195  degrees of freedom
    AIC: 173.55

c) Holding the other covariates constant, calculate the odds ratio associated with a 10-year increase in age. Interpret. (3 pts.)

d) Holding the other covariates constant, calculate the odds ratio associated with having cancer as part of the present problem and find a CI for the population OR. (4 pts.)

e) Holding the other covariates constant, calculate the odds ratio associated with being admitted to the ICU from the emergency room. Contrast this estimate with the OR from part (a). Why do they differ? (3 pts.)

f) Elmer Fudd is 72 years old, cancer free, and was brought to the emergency room after collapsing from chasing a wascally rabbit. His systolic blood pressure upon arrival was 90 mmHg. What is the estimated probability that poor Elmer Fudd dies? (3 pts.)

2.) Sixty-five pairs of pregnant women at a high risk of pregnancy-induced hypertension participated in a randomized controlled trial. Each pair was matched according to age and parity (number of previous pregnancies). One woman in each pair was randomized to receive aspirin and the other a placebo. For each pair hypertension status was recorded and the results are shown below.

                             Placebo Treated
Aspirin Treated        Hypertension   No Hypertension
Hypertension                      4                11
No Hypertension                  30                20

Is there evidence to suggest that aspirin was effective at reducing the risk of hypertension? Use the appropriate test to answer this question. The table below should prove useful. State the test you are using and summarize your findings. (6 pts.)

3.) Duration of IUD Use and Infertility

A study was performed relating the duration of IUD (intrauterine device) use to infertility. A group of 89 infertile IUD users and a group of 640 control (fertile) IUD users were identified. The women were subdivided by the duration of IUD use and the data are summarized in the table below.
                               Duration of IUD use in months
Status          <3 months     [3,18) months   [18,36] months   >36 months     Row Totals
Case            10 (7.69)     23 (      )     20 (22.95)       36 (31.13)     89
Control         53 (55.31)    200 (195.78)    168 (165.05)     219 (      )   640
Column Totals   63            223             188              255            n = 729

(Expected frequencies are shown in parentheses.)

a) Explain what is meant by a case-control study. What is the main reason for doing a case-control study? What limitations do these studies have? (3 pts.)

b) What type of statistical test could we use to determine whether the duration of IUD use was the same for both cases and controls? (1 pt.)

c) Fill in the missing expected frequencies in the contingency table above. (2 pts.)

d) The result of the test is shown below. Summarize the findings from the test in terms of the research question. (3 pts.)

4) The ε4 allele of the gene encoding apolipoprotein E (APOE) is strongly associated with Alzheimer's disease, but its value in making the diagnosis remains uncertain. A study was conducted among 2188 patients who were evaluated at autopsy for Alzheimer's disease by previously established pathological criteria. Patients were also evaluated clinically (based on symptoms while they were alive) for the presence of Alzheimer's disease. Suppose the pathological diagnosis is considered the "gold standard" for Alzheimer's disease (D+). If both the ε4 allele and clinical diagnosis are present it is considered a positive screening test result (T+).

                                                      Pathological Diagnosis   No Pathological Diagnosis
                                                      for AD (D+)              for AD (D-)                 Row Totals
ε4 allele and clinical diagnosis both present (T+)    1076                     66                          1142
ε4 allele or clinical diagnosis absent (T-)           694                      352                         1046
Column Totals                                         1770                     418                         2188

a) What is the specificity of the test? False-positive probability? (2 pts.)

b) What is the sensitivity of the test? False-negative probability? (2 pts.)

c) 3% of the population between 65 and 74 years of age has Alzheimer's disease.
Given that 74-year-old Willem de Kooning has the ε4 allele and has been clinically diagnosed as possibly having Alzheimer's, what is the probability he actually has Alzheimer's? (4 pts.)

Multiple Choice Questions: Circle the correct answer. (2 pts each)

5. Suppose a study is performed concerning infant blood pressure and birth weight. All infants born in a specific hospital are ascertained within the first week of life while in the hospital and have their blood pressure measured in the newborn nursery. The researchers then divide the infants into two groups: a high blood pressure group and a normal blood pressure group. They then compare birth weights between these two groups. What type of study design was used?

a. Case-control study
b. Cross-sectional study
c. Randomized clinical trial
d. Cohort study

6. A study looked at the effects of oral contraceptive use on heart disease in women 40 to 44 years of age. It found that among 5000 current oral contraceptive users at baseline, 13 women developed a myocardial infarction over a 3-year period, whereas among 10,000 non-oral contraceptive users, 7 developed a myocardial infarction over a 3-year period. What type of study design was used?

a. Case-control study
b. Cross-sectional study
c. Randomized clinical trial
d. Cohort study

7. Consider the previous problem investigating the effects of oral contraceptive use on heart disease in women 40 to 44 years of age. Which of the following statistical tests is most appropriate for these data?

a. binomial exact test
b. chi-square test
c. McNemar's test
d. independent samples t-test

8. Once again, consider the study from problem 6. The relative risk of having a myocardial infarction for oral contraceptive users compared to non-users was investigated, and a 95% confidence interval for this relative risk is given by (1.48, 9.30). Which of the following statements is most correct?

a. This interval provides evidence that for women age 40-44, oral contraceptive users are more likely to have a myocardial infarction than non-users because the confidence interval does not include zero.
b. This interval provides evidence that for women age 40-44, oral contraceptive users are more likely to have a myocardial infarction than non-users because the confidence interval does not include one.
c. This interval does not provide evidence that for women age 40-44, oral contraceptive users are more likely to have a myocardial infarction than non-users because the confidence interval does not include zero.
d. This interval does not provide evidence that for women age 40-44, oral contraceptive users are more likely to have a myocardial infarction than non-users because the confidence interval does not include one.

9. A twin design is used to study age-related macular degeneration (AMD), a common eye disease of the elderly that results in substantial losses in vision. The study involves 66 sets of identical twins, where one twin has AMD and the other twin does not. The twins are given a dietary questionnaire to report their usual diet, and researchers record whether each individual in the study takes multivitamin supplements. The data are shown in the following table.

                                            Normal Twin
AMD Twin                      Multivitamin supplement   No multivitamin supplement
Multivitamin supplement                             3                           10
No multivitamin supplement                          8                           45

Which of the following statistical tests is most appropriate for these data?

a. Fisher's exact test
b. chi-square test
c. McNemar's test
d. independent samples t-test

10. Which of the following statements is most correct?

a. It is appropriate to calculate relative risk for case-control studies.
b. The odds ratio defines how much more likely a certain outcome is to occur when a given risk factor is present.
c. The odds ratio can be used to estimate the relative risk when the disease outcome under study is rare in the general population.
d. The odds ratio should not be used when analyzing results of case-control studies.

Short Answer Questions:

11. A study is conducted to investigate the relationship between lung-cancer incidence and heavy drinking (defined as ≥2 drinks per day). The data are summarized in the following table.

                         Drinking Status
Lung Cancer    Heavy Drinker   Non-Heavy Drinker
Yes                       33                  27
No                      1667                2273

a. Compute the relative risk of lung cancer for heavy drinkers compared to non-heavy drinkers. (3 pts)

b. Compute the odds ratio for having lung cancer for heavy drinkers compared to non-heavy drinkers. (3 pts)

12. Improving control of blood-glucose levels is an important motivation for the use of insulin pumps by diabetic patients. However, certain side effects have been reported with pump therapy. The results of a study investigating the occurrence of diabetic ketoacidosis (DKA) in patients before and after the start of pump therapy are shown below. The research question is as follows: Is the rate of DKA different before and after the start of pump therapy?

                         After pump therapy
Before pump therapy      No DKA      DKA
No DKA                      128        7
DKA                          19        7

a. The binomial distribution can be used to find the p-value for this test. What values should be used for n and p? (2 pts)

n = _____________ , p = _____________

b. We can also use SAS to carry out this analysis, as shown below. Use the output to write a conclusion addressing the research question. (3 pts)

13. Two groups of 50- to 59-year-old men with initially normal cholesterol levels are identified. Group A consists of 25 men whose cholesterol level rose by 50 mg/dL over a 5-year period, and Group B consists of 25 men whose cholesterol level dropped by 50 mg/dL over a 5-year period. The groups were followed for mortality over another 5 years, and the results are given below.

a. Researchers were interested in whether men whose cholesterol level rose (Group A) had a different subsequent mortality than those whose cholesterol level dropped (Group B).
Which statistical test is most appropriate for investigating this research question? (2 pts)

b. Use the SAS output to find the p-value for testing the research question given in part a, and write a conclusion in the context of the problem. (4 pts)

c. The power for the above test is calculated to be .271. Using everyday language, explain exactly what this means.
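
As background for the notion of power in problem 13c, the sketch below shows one way the power of a comparison between two groups of 25 could be computed by exact enumeration. It rests on assumptions that are not part of the exam: the test is taken, purely for illustration, to be a two-sided Fisher's exact test at the 0.05 level, and the mortality probabilities passed to `exact_power` are hypothetical placeholders, since the study results and SAS output are not reproduced here. The function names are illustrative as well.

```python
from math import comb


def fisher_two_sided_p(x1, n1, x2, n2):
    """Two-sided Fisher exact p-value for x1/n1 deaths vs. x2/n2 deaths."""
    events = x1 + x2
    denom = comb(n1 + n2, events)

    def prob(k):
        # Hypergeometric probability of k deaths in group 1 given the margins.
        return comb(n1, k) * comb(n2, events - k) / denom

    observed = prob(x1)
    lo = max(0, events - n2)
    hi = min(n1, events)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


def binom_pmf(k, n, p):
    """Binomial probability of k deaths among n subjects."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_power(n1, p1, n2, p2, alpha=0.05):
    """Probability that the test rejects when the true death risks are p1, p2."""
    power = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if fisher_two_sided_p(x1, n1, x2, n2) <= alpha:
                power += binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)
    return power


if __name__ == "__main__":
    # Hypothetical true mortality risks, chosen only to illustrate the idea.
    print(round(exact_power(25, 0.40, 25, 0.20), 3))
```

The returned value is the long-run fraction of studies of this size that would declare a significant difference if the assumed risks were the true ones. Changing the hypothetical risks or the group sizes shows how quickly power falls for small samples and modest differences.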