STAT 601 – Assignment #4 (Due Sunday, March 6th) (80 pts.)

Review the following materials:

1) Narrated PowerPoint Lectures
   Comparing Two Populations with a Numeric Response
      a) independent samples
      b) dependent samples
   Comparing Two Populations with a Nominal/Categorical Response
      a) independent samples (z-test, Fisher's Exact Test, CIs for RR & OR)
      b) dependent samples (FYI ONLY – NOT ON ASSIGNMENT)
2) Non-narrated PowerPoint Lectures (same as above)
3) Lecture Handouts 9 – Statistical Inference, sections 9.5 – 9.8 only.
4) JMP tutorials below by the Assignment 4 link:
   JMP Demo - Two Sample Pooled t-Test
   JMP Demo - Two Sample Non-pooled t-Test
   JMP Demo - Paired t-Test

RESEARCH ARTICLE REVIEW PROBLEMS

1) Use Table VI in the paper entitled “Perceived Coping MI” to answer the following questions.
   a) Were the groups in this study independent or dependent? Provide a rationale for your answer.
   b) Examine the t-ratios (i.e., t-statistics) in Table VI. Which t-ratio indicates the largest difference between the males and females post MI in this study? Is this ratio significant? Provide a rationale for your answer.
   c) What is a Type I error? Is there a risk of Type I error in this study? Provide a rationale for your answer.
   d) The authors reported multiple df (degrees of freedom) values in Table VI. Why were different df values reported for this study?
   e) What does the t-value for the Physical Component Score tell you about men and women post MI? If this result was consistent with previous research, how might you use this knowledge in your practice?

2) Using the article labeled “Transfer Anxiety MI”, answer the following questions.
   a) The baseline anxiety and information scores were not significantly different between the experimental and comparison groups. What does this mean? Does this strengthen or weaken the results of the study? Provide a rationale for your answer.
   b) The results indicated a statistically significant difference between the anxiety scores for the two groups of patients (t = 3.875, p < .0001). What do these results mean?
   c) The results indicated a statistically significant difference between the anxiety scores for the two groups of patients (t = 3.875, p < .0001). Are these results also significant at the .01 level? Provide a rationale for your answer.
   d) How might you use these study findings in practice?

3) Using the article titled “Effect of Health Promo CV Risk”, answer the following questions.
   a) What are the two groups whose results are reflected by the t-ratios in Tables 2 & 3?
   b) Which t-ratio in Table 2 represents the greatest relative or standardized difference between the pretest and 3 months outcomes? Is this t-ratio statistically significant? Provide a rationale for your answer.
   c) Which t-ratio in Table 3 represents the smallest relative difference between the pretest and 3 months? Is this t-ratio statistically significant? What does this result mean?
   d) Compare the 3 months and 6 months t-ratios for the variable Exercise from Table 3. What is your conclusion about the long-term effect of the health-promotion intervention on Exercise in this study?
   f) Why are the largest t-ratios more likely to be statistically significant?
   g) Did the health-promotion program have a statistical effect on Systolic blood pressure (BP) in this study? Provide a rationale for your answer.

ADDITIONAL PROBLEMS

Problems 1 – 4 deal with comparing two population means using independent and dependent samples.

1 - Preeclampsia and Gestational Age
(This example is worked through in the narrated PowerPoint; see if you can reproduce the results.)

The goal of a study conducted by Baker et al. was to determine whether medical deformation alters in vitro effects of plasma from patients with preeclampsia on endothelial cell function to produce a paradigm similar to the in vivo disease state.
Subjects were 24 nulliparous pregnant women before delivery, of whom 12 had preeclampsia and 12 were normal pregnant patients. The patients were independently sampled from these populations and were not matched according to any criteria. Among the data collected were the gestational ages (in weeks) at delivery.

Research Question: Is there evidence to suggest that the mean gestational age at delivery for mothers with preeclampsia is lower than that for mothers with a normal pregnancy?

Use JMP to analyze these data. You can enter the data in JMP yourself. You will need two columns, one to denote the group and the other to contain the response, in this case gestational age at birth. Be sure to check assumptions and perform your analysis accordingly.

Data:
   Preeclampsia: 38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32
   Normal: 40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40

   a) Perform a hypothesis test to answer the question of interest and summarize your findings. (4 pts.)
   b) Find and report the 95% CI for the difference in the population means from the JMP output. Discuss this interval in practical terms. (2 pts.)

2 - DHEAS Levels in Asthmatics
Data File: Asthma.JMP

In a study to explore the possibility of hormonal alteration in asthma, Weinstein et al. collected data on 22 postmenopausal women with asthma and 22 age-matched postmenopausal women without asthma. Perform the appropriate analysis of these data to answer the following research question: Is there evidence to suggest that postmenopausal women with asthma have significantly higher levels of dehydroepiandrosterone sulfate (DHEAS)?

   a) Do you think that age-matching to create pairing is valid? Why or why not? (2 pts.)
   b) Use an appropriate test and supporting CI to answer the research question. Summarize your findings. (5 pts.)
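For Problem 1 (Preeclampsia and Gestational Age), the JMP output can be cross-checked outside JMP. The following is a minimal Python sketch, not part of the assignment materials. It assumes SciPy is available and that the non-pooled (Welch) test is the one chosen after checking the equal-variance assumption. The variable names are illustrative only, and the data values are copied from the listing above.

```python
from math import sqrt
from statistics import mean, variance

from scipy import stats

# Gestational ages (weeks) copied from the Problem 1 data listing.
preeclampsia = [38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32]
normal = [40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40]

# Informal check of the equal-variance assumption (Levene's test).
levene_stat, levene_p = stats.levene(preeclampsia, normal)

# Non-pooled (Welch) t-test with a lower-tailed alternative:
# H1: mean(preeclampsia) < mean(normal).
t_stat, p_one_sided = stats.ttest_ind(
    preeclampsia, normal, equal_var=False, alternative="less"
)

# 95% CI for mu_preeclampsia - mu_normal, using the
# Welch-Satterthwaite approximation for the degrees of freedom.
v1 = variance(preeclampsia) / len(preeclampsia)
v2 = variance(normal) / len(normal)
se = sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (
    v1 ** 2 / (len(preeclampsia) - 1) + v2 ** 2 / (len(normal) - 1)
)
diff = mean(preeclampsia) - mean(normal)
t_crit = stats.t.ppf(0.975, df)
ci = (diff - t_crit * se, diff + t_crit * se)

print(f"Levene p-value: {levene_p:.4f}")
print(f"t = {t_stat:.3f}, df = {df:.2f}, one-sided p = {p_one_sided:.5f}")
print(f"95% CI for the difference in means: ({ci[0]:.2f}, {ci[1]:.2f})")
```

If the pooled test is judged appropriate instead, `equal_var=True` with `df = n1 + n2 - 2` would be the corresponding change. Small differences from JMP in the last decimal place are possible.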
3 – Comparisons of the Mean Infant Birth Weight for Different Populations of Mothers
Data File: NCBirth.JMP

In this problem you will use comparative methods to compare the actual mean birth weights of different populations of mothers. The results of your comparisons will be contained in the table below. For each situation be sure to check assumptions and briefly summarize your findings in that regard. Use appropriate statistical methods to make comparisons of mean birth weight across the two populations defined by the variables below:

   Sex of child (1 = male, 2 = female)
   Marital status (1 = married, 2 = not married)
   White? (Non-white vs. White)
   Hispanic? (Hisp vs. non-Hisp)
   Smokers vs. non-smokers

   a) Use both hypothesis tests and confidence intervals to compare the mean birth weights of the infants born to the two populations defined by the factors above. To organize your results, enter them into the table below. In the p-value and CI columns, enter the p-value from the appropriate test for comparing the two population means for each factor and the confidence interval for the difference in those population means. Each factor therefore has only one p-value and one confidence interval. Report the sample size, sample mean, and sample standard deviation (SD) for each level of the factor. (12 pts.) Sex of child is done for you as an example to follow.

   Factor          Level         Sample Size (n)  Sample Mean  SD       p-value  CI for Difference in Population Means
   Sex of Child    Male          418              3340.8 g     651.3 g  .0543    Male - Female: (-1.6 g, 175.6 g)
                   Female        382              3253.8 g     622.9 g
   Marital Status  Married
                   Not Married
   White?          Non-white
                   White
   Hispanic?       Hispanic
                   Non-Hispanic
   Smoking Status  Smoker
                   Non-smoker

   b) Briefly comment on the assumptions required for the analyses you conducted in completing the table. Are the assumptions satisfied for each factor? (2 pts.)
   c) Summarize your findings from part (a) in a clearly written paragraph, citing p-values and confidence intervals as needed. (5 pts.)
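The worked Sex of Child entry in Problem 3 can be reproduced approximately from its summary statistics alone, which is a useful check on how the p-value and CI columns relate to n, mean, and SD. The sketch below is an illustration, not the course's method. It assumes SciPy is available and assumes the pooled (equal-variance) two-sample t procedure. The inputs are the rounded values from the example entry, so results may differ slightly from JMP's output on the raw data.

```python
from math import sqrt

from scipy import stats

# Summary statistics from the worked "Sex of Child" example (grams).
n_male, mean_male, sd_male = 418, 3340.8, 651.3
n_female, mean_female, sd_female = 382, 3253.8, 622.9

# Pooled two-sample t-test computed directly from summary statistics.
t_stat, p_value = stats.ttest_ind_from_stats(
    mean_male, sd_male, n_male,
    mean_female, sd_female, n_female,
    equal_var=True,
)

# 95% CI for mu_male - mu_female using the pooled standard deviation.
df = n_male + n_female - 2
sp = sqrt(((n_male - 1) * sd_male ** 2 + (n_female - 1) * sd_female ** 2) / df)
se = sp * sqrt(1 / n_male + 1 / n_female)
diff = mean_male - mean_female
t_crit = stats.t.ppf(0.975, df)
ci = (diff - t_crit * se, diff + t_crit * se)

print(f"t = {t_stat:.3f}, df = {df}, two-sided p = {p_value:.4f}")
print(f"95% CI (Male - Female): ({ci[0]:.1f} g, {ci[1]:.1f} g)")
```

The interval contains 0 and the two-sided p-value is just above .05, so the test and the CI tell the same story. Each remaining factor can be checked against the same relationship.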
4 - Middle Ear Effusion in Breast-Fed and Bottle-Fed Infants

A common symptom of otitis media in young children is the prolonged presence of fluid in the middle ear, known as middle-ear effusion. The presence of fluid may result in temporary hearing loss and interfere with normal learning skills in the first two years of life. One hypothesis is that babies who are breast-fed for at least 1 month build up some immunity against the effects of the disease and have less prolonged effusion than do bottle-fed babies. A small study of 24 pairs of babies is set up, where the babies are matched on a one-to-one basis according to age, sex, socioeconomic status, and type of medications taken. One member of the matched pair is a breast-fed baby, and the other member is a bottle-fed baby. The outcome variable is the duration of middle-ear effusion after the first episode of otitis media. The results are shown below.

   Pair Number   Duration of effusion   Duration of effusion   Difference
                 in breast-fed baby     in bottle-fed baby
   1             20                     18
   2             11                     35
   3             3                      7
   4             24                     182
   5             7                      7
   6             28                     33
   7             58                     223
   8             39                     57
   9             17                     76
   10            17                     186
   11            12                     29
   12            52                     39
   13            14                     15
   14            12                     21
   15            30                     28
   16            7                      8
   17            15                     27
   18            65                     77
   19            10                     12
   20            7                      8
   21            19                     16
   22            34                     28
   23            25                     20

Do these data provide evidence that breast-fed babies have shorter durations of effusion when compared to bottle-fed babies that are the same age, sex, socioeconomic status, and on the same medications? Enter these data into JMP and conduct the appropriate analysis. (6 pts.)

Problems 5 & 6 deal with comparing two proportions (p1 vs. p2) and inference for odds ratios (OR) and for relative risks (RR).

5 - Prostate-Specific Antigen (PSA) Levels and Cancer Diagnosis

Babaian et al.
("The Role of Prostate-Specific Antigen as Part of the Diagnostic Triad and as a Guide When to Perform a Biopsy", Cancer, 68 (1991)) state that prostate-specific antigen (PSA), found in the ductal epithelial cells of the prostate, is specific for prostatic tissue and is detectable in serum from men with normal prostates and men with either benign or malignant diseases of this gland. They determined the PSA values in a sample of 124 men who underwent a prostate biopsy. Sixty-seven of the men had elevated PSA values (> 4 ng/ml). Of these, 46 were diagnosed as having cancer. Ten of the 57 men with PSA values < 4 ng/ml had cancer. On the basis of these data, may we conclude that, in general, men with elevated PSA values are more likely to have prostate cancer? Let α = .01. Use inference based upon the standard normal distribution and Fisher's Exact Test to test the hypothesis of interest.

Data File: PSA-Cancer.JMP

   a) Standard normal test and CI for the difference in the population proportions. (4 pts.)
   b) Fisher's Exact Test (2 pts.)
   c) Summarize your findings. (2 pts.)

6 – HIV Status & IV Drug Use History of Women in NY Prison System

In a study of HIV infection among women entering the New York State prison system, 475 inmates were cross-classified with respect to HIV seropositivity and their histories of intravenous drug use. The variables you will be working with are coded as follows:

   • IV Drug Use – indicator of previous intravenous drug use (Yes or No)
   • HIV Status – results of HIV seropositivity test (positive or negative)

The study results are contained in the data file Prison HIV-Drug Use.JMP.

Research Question: Is there evidence that intravenous drug use is associated with HIV seropositivity?

   a) Among women who have used drugs intravenously, what proportion are HIV-positive? Among women who have not used drugs intravenously, what proportion are HIV-positive? (2 pts.)
   b) Use Fisher’s Exact Test to determine if being HIV-positive is positively associated with a previous history of intravenous drug use for this population of women. State your conclusion along with a supporting p-value. (2 pts.)
   c) Find a 95% CI for the risk difference and interpret. This difference is also referred to as the attributable risk (AR) = p_exposed - p_unexposed. (3 pts.)
   d) Use your answers to calculate the relative risk (RR) for being HIV-positive associated with intravenous drug use for this population of women. Also find a 95% CI for the RR. Interpret. (4 pts.)
   e) Compute the odds ratio (OR) for being HIV-positive associated with intravenous drug use for this population of women. Also find a 95% CI for the OR. Interpret. (4 pts.)
   f) Number Needed to Harm (NNH) – Go to the following website, which is the first hit when you Google search "Number Needed to Harm": http://en.wikipedia.org/wiki/Number_needed_to_harm
      Read through the Wikipedia entry and then find the Number Needed to Harm for this study. (2 pts.)
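The quantities requested in Problems 5 and 6 (difference in proportions, z-test, Fisher's Exact Test, RR, OR, NNH) all come from the four cells of a 2x2 table. The sketch below shows one way to compute them in Python as a cross-check on JMP. It assumes SciPy is available. The function name `two_by_two_summary` and its dictionary keys are hypothetical, and the intervals are the standard large-sample (Wald and log-scale) ones, which may not match the method JMP uses. It is run on the Problem 5 counts stated in the text (46 of 67 with elevated PSA and 10 of 57 without had cancer), with a 99% level assumed to pair with α = .01. The Problem 6 counts must be taken from the data file.

```python
from math import exp, log, sqrt

from scipy import stats


def two_by_two_summary(a, b, c, d, conf=0.95):
    """Summarize a 2x2 table (hypothetical helper, large-sample formulas).

    a, b = exposed group with / without the outcome
    c, d = unexposed group with / without the outcome
    """
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    z_crit = stats.norm.ppf(1 - (1 - conf) / 2)

    # z-test of H0: p1 = p2 with the pooled proportion; upper-tailed p-value.
    p_pool = (a + c) / (n1 + n2)
    z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    p_z = stats.norm.sf(z)

    # Wald interval for the risk difference (attributable risk).
    rd = p1 - p2
    se_rd = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    rd_ci = (rd - z_crit * se_rd, rd + z_crit * se_rd)

    # RR and OR intervals are built on the log scale, then back-transformed.
    rr = p1 / p2
    se_log_rr = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    rr_ci = (exp(log(rr) - z_crit * se_log_rr), exp(log(rr) + z_crit * se_log_rr))

    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (
        exp(log(odds_ratio) - z_crit * se_log_or),
        exp(log(odds_ratio) + z_crit * se_log_or),
    )

    # Fisher's Exact Test, one-sided (exposed group has higher odds).
    _, p_fisher = stats.fisher_exact([[a, b], [c, d]], alternative="greater")

    return {
        "p1": p1, "p2": p2, "z": z, "p_z": p_z, "p_fisher": p_fisher,
        "rd": rd, "rd_ci": rd_ci, "rr": rr, "rr_ci": rr_ci,
        "or": odds_ratio, "or_ci": or_ci,
        "nnh": 1 / rd,  # number needed to harm = 1 / attributable risk
    }


# Problem 5 counts from the text: elevated PSA 46 cancer / 21 no cancer,
# non-elevated PSA 10 cancer / 47 no cancer.
psa = two_by_two_summary(46, 21, 10, 47, conf=0.99)
for name, value in psa.items():
    print(name, value)
```

For Problem 6, the same call would take the four cell counts from the Prison HIV-Drug Use.JMP cross-tabulation, with IV drug users as the exposed group and the default `conf=0.95`.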