Biost 536, Fall 2014 Homework #1 September 26, 2014, Page 1 of 4

Biost 536: Categorical Data Analysis in Epidemiology
Emerson, Fall 2014
Homework #1
September 26, 2014
Student # 1704

1. Provide suitable descriptive statistics for this dataset as might be presented in Table 1 of a manuscript appearing in the medical literature.

Methods: I created the following table to provide descriptive statistics for patients categorized by treatment with either daunorubicin or idarubicin. Relevant variables include demographic factors, classification of disease into subtypes, and measurements of disease severity and patient condition.

Results: There were 65 patients treated with daunorubicin and 65 patients treated with idarubicin. Measurements of FAB were missing for 12 patients, and an additional patient was omitted from analysis due to an improbable FAB value of 0. One patient was missing data on baseline white blood cells, one patient was missing data on baseline platelets, and one patient was missing data on baseline hemoglobin. Patients on daunorubicin had trends toward higher baseline white blood cells and platelets.

Descriptive Statistics

Continuous variables are shown as Mean (SD; Min, Mdn, Max; n). Categorical variables are shown as n (%).

                                Daunorubicin                         Idarubicin                           All patients
Age (years)                     39.8 (13.4; 19, 40, 60; n=65)        38.0 (12.5; 17, 36, 61; n=65)        38.9 (12.9; 17, 37, 61; n=130)
Sex
  Female                        30 (46%)                             35 (54%)                             65 (50%)
  Male                          35 (54%)                             30 (46%)                             65 (50%)
FAB classification              3.3 (1.4; 1, 3, 6; n=56)             2.9 (1.5; 1, 3, 6; n=61)             3.1 (1.5; 1, 3, 6; n=117)
  1                             6 (11%)                              13 (21%)                             19 (16%)
  2                             15 (27%)                             15 (25%)                             30 (25%)
  3                             9 (16%)                              11 (18%)                             20 (17%)
  4                             12 (21%)                             8 (13%)                              20 (17%)
  5                             13 (23%)                             12 (20%)                             25 (21%)
  6                             1 (2%)                               2 (3%)                               3 (3%)
  Missing                       8*                                   4                                    12*
Karnofsky score                 79.5 (12.6; 40, 80, 100; n=65)       79.5 (11.6; 30, 80, 100; n=65)       79.5 (12.1; 30, 80, 100; n=130)
Baseline white blood cells (A)  43.3 (55.0; 0.7, 16.7, 215; n=64)    29.0 (36.3; 0.4, 11.8, 154.1; n=65)  36.1 (46.9; 0.4, 13.8, 215; n=129)
Baseline platelets (B)          93.6 (92.4; 11, 62, 457; n=64)       66.6 (57.8; 11, 50, 370; n=65)       80.0 (77.8; 11, 57, 457; n=129)
Baseline hemoglobin (C)         9.6 (1.5; 6.4, 9.5, 13.9; n=64)      9.2 (1.8; 2.8, 9.2, 13.7; n=65)      9.4 (1.7; 2.8, 9.3, 13.9; n=129)

* In addition to the 12 missing values for FAB, one implausible value of 0 for FAB in the daunorubicin group was omitted.
(A) One person in the daunorubicin group was missing data on baseline white blood cells.
(B) One person in the daunorubicin group was missing data on baseline platelets.
(C) One person in the daunorubicin group was missing data on baseline hemoglobin.

2. Perform an analysis to assess whether subjects taking idarubicin have better primary clinical outcomes than patients on daunorubicin.

Methods: I used an unadjusted logistic regression model to assess the association of complete remission and treatment. Both treatment (idarubicin vs. daunorubicin) and complete remission (yes vs. no) were binary variables.
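
With a single binary predictor, the odds ratio from an unadjusted logistic regression equals the cross-product ratio of the 2x2 table, so the model described in the Methods above can be checked by hand. The following is a minimal sketch of that check, not the statistical package actually used. It assumes a Wald confidence interval and Wald test (the write-up does not say which were used), and it takes the remission counts by summing over sex in the descriptive table of problem 9 (idarubicin 21 + 30 = 51 of 65, daunorubicin 17 + 21 = 38 of 65). The function name `odds_ratio_wald` is hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z_crit=1.96):
    """Odds ratio, Wald 95% CI, and two-sided Wald p-value for a 2x2 table.

    a, b: remission / no remission in the comparison group (idarubicin)
    c, d: remission / no remission in the reference group (daunorubicin)
    Assumption: Wald inference on the log odds ratio scale.
    """
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio from the four cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z_crit * se)
    upper = math.exp(log_or + z_crit * se)
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return math.exp(log_or), lower, upper, p_value


# Counts assumed from the problem 9 descriptive table, summed over sex:
# idarubicin 51 remissions / 14 non-remissions, daunorubicin 38 / 27.
or_all, lo_all, hi_all, p_all = odds_ratio_wald(51, 14, 38, 27)
print(f"OR = {or_all:.2f}, 95% CI {lo_all:.2f} to {hi_all:.2f}, p = {p_all:.3f}")
```

Under these assumptions the output should agree, to two decimal places, with the odds ratio and confidence interval reported in the Results for this problem. The same function can be applied to the sex-specific counts to obtain stratum-specific odds ratios.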

Results: There was evidence that patients taking idarubicin had better primary clinical outcomes than patients taking daunorubicin (two-sided p=0.016). The odds of complete remission were 2.59 times as high among those taking idarubicin as among those taking daunorubicin. Based on a 95% confidence interval, I found that the precision of the study was such that the result would not be unusual if the odds ratio of complete remission in the idarubicin group as compared to the daunorubicin group ranged between 1.20 and 5.59.

3. Is the analysis of treatment effect confounded by sex? Provide your reasoning.

Methods: I assessed whether confounding was possible in this study design. For a factor to be a confounder, the causal model must include an association between the potential confounding factor (PCF) and the exposure (treatment), as well as between the PCF and the outcome (complete remission), with the additional condition that the PCF cannot be in the causal pathway.

Results: This is a randomized clinical trial, and so there should not be an association between sex and the treatment, which is necessary for confounding to be present. Table 1 shows relatively balanced sex distribution between the daunorubicin and idarubicin groups, so there is no evidence that randomization failed. Therefore, the analysis of treatment effect is not confounded by sex.

4. Perform an analysis to assess any treatment benefit of idarubicin over daunorubicin adjusted for sex.

Methods: I used a logistic regression model to assess the association of complete remission and treatment adjusted for sex. Treatment (idarubicin vs. daunorubicin), complete remission (yes vs. no), and sex (male vs. female) were binary variables.

Results: There was evidence that patients taking idarubicin have better primary clinical outcomes than patients taking daunorubicin, after adjustment for sex (two-sided p=0.016). After adjustment for sex, the odds of complete remission were 2.59 times as high among those taking idarubicin as among those taking daunorubicin.
Based on a 95% confidence interval, I found that the precision of the study was such that the result would not be unusual if the odds ratio of complete remission in the idarubicin group as compared to the daunorubicin group, adjusted for sex, ranged between 1.20 and 5.60.

5. Perform an analysis to assess whether males taking idarubicin have more frequent complete remission than males taking daunorubicin.

Methods: I used a logistic regression model to assess the association of complete remission and treatment in males. Treatment (idarubicin vs. daunorubicin) and complete remission (yes vs. no) were binary variables.

Results: There was not strong evidence that male patients taking idarubicin had better primary clinical outcomes than male patients taking daunorubicin (two-sided p=0.084). The odds of complete remission were 2.47 times as high among males taking idarubicin as among males taking daunorubicin; however, based on a 95% confidence interval, I found that the precision of the study was such that the result would not be unusual if the odds ratio of complete remission in the idarubicin group as compared to the daunorubicin group ranged between 0.89 and 6.88. Since I was limiting analysis to males, the sample size was smaller, which made the confidence interval wider.

6. Repeat problem 5 for females.

Methods: I used a logistic regression model to assess the association of complete remission and treatment in females. Treatment (idarubicin vs. daunorubicin) and complete remission (yes vs. no) were binary variables.

Results: There was not strong evidence that female patients taking idarubicin had better primary clinical outcomes than female patients taking daunorubicin (two-sided p=0.084).
The odds of complete remission were 2.57 times as high among females taking idarubicin as among females taking daunorubicin; however, based on a 95% confidence interval, I found that the precision of the study was such that the result would not be unusual if the odds ratio of complete remission in the idarubicin group as compared to the daunorubicin group ranged between 0.75 and 8.77. Since I was limiting analysis to females, the sample size was smaller, which made the confidence interval wider.

7. Perform an analysis to assess whether any treatment benefit of idarubicin over daunorubicin differs by sex.

Methods: I used a logistic regression model with an interaction term for sex and treatment to assess whether the treatment benefit of idarubicin over daunorubicin differed by sex. I used the p-value for the interaction term to test the null hypothesis of no interaction. Treatment (idarubicin vs. daunorubicin), complete remission (yes vs. no), and sex (male vs. female) were binary variables.

Results: There was no evidence that the treatment benefit of idarubicin over daunorubicin differed by sex (p for interaction=0.961). Since I was considering the treatment benefit separately for males and females, the sample sizes were smaller. This lowered the precision, which could have been the reason for not seeing an interaction, or there truly may have been no interaction effect.

8. Use the analysis you performed in problem 7 to answer the question of whether idarubicin use is associated with more frequent induction of remission.

Methods: In order to test the null hypothesis of no effect of treatment on complete remission, I performed an ftest, using a logistic regression model with an interaction term for sex and treatment. Treatment (idarubicin vs. daunorubicin), complete remission (yes vs. no), and sex (male vs. female) were binary variables.

Results: There was evidence that patients taking idarubicin had better primary clinical outcomes than patients taking daunorubicin after accounting for potential effect modification by sex (p=0.016).

9. For each of the analysis models used in problems 4 and 7, provide estimates of the probability of inducing a complete remission by all combinations of treatment group and sex. How do these estimates compare to descriptive statistics for those groups?

Methods: I used the adjusted logistic regression model (from question 4) and the logistic regression model with an interaction term (from question 7) to estimate the probability of inducing a complete remission for all combinations of sex and treatment group. I obtained the coefficients for these models using a statistical analysis package and then substituted them into the appropriate model equation for each treatment and sex group. Because logistic regression models the log odds, I then converted the resulting odds to probabilities.

Results: The estimates for the adjusted model were the same as those for the descriptive statistics for females in both treatment groups and males who took idarubicin (at least with the two significant digits I presented). Males using daunorubicin were very similar, with a 48% probability of complete remission in the adjusted model and a 49% probability of complete remission in the descriptive statistics. The estimates from the interaction model were the same as the descriptive statistics for all combinations of sex and treatment.

Descriptive Statistics for Probability of Complete Remission
            Daunorubicin       Idarubicin
            Count (percent)    Count (percent)
Males       17 (49%)           21 (70%)
Females     21 (70%)           30 (86%)

Probability of Complete Remission, Adjustment Model (Question 4)
            Daunorubicin       Idarubicin
Males       48%                70%
Females     70%                86%

Probability of Complete Remission, Interaction Model (Question 7)
            Daunorubicin       Idarubicin
Males       49%                70%
Females     70%                86%

10.
Which of the above analyses should be used to decide whether idarubicin should be approved for the indication of AML? What problems exist with the use of the other analyses you performed?

The appropriate model is the crude model without a term for adjustment or interaction, used in problem 2. As stated in problem 3, this is a randomized clinical trial, and so there should not be an association between sex and the treatment, which is necessary for confounding to be present. Table 1 shows relatively balanced sex distribution between the daunorubicin and idarubicin groups, so there is no evidence that randomization failed. Therefore, a sex-adjusted model is not appropriate.

In order for an interaction term to be warranted, there would need to be an a priori hypothesis that the effect of treatment on complete remission is biologically different between men and women. I do not believe this is likely, and so a model with an interaction term is not appropriate.
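
The odds-to-probability conversion described in problem 9 can be sketched as follows. This is a minimal illustration, not the package output used for the homework. Because the interaction model has four parameters for four sex-by-treatment cells, its coefficients can be written directly from the observed cell log odds, which is why its fitted probabilities match the descriptive statistics. The variable coding (idarubicin = 1, female = 1, with males on daunorubicin as the reference group), the coefficient names, and the function `predicted_probability` are assumptions made for this example. The counts come from the problem 9 descriptive table and the group sizes from Table 1.

```python
import math


def logit(p):
    """Log odds corresponding to probability p."""
    return math.log(p / (1 - p))


# Observed remission proportions (count / group size) for each cell
p_male_dau = 17 / 35
p_male_ida = 21 / 30
p_female_dau = 21 / 30
p_female_ida = 30 / 35

# Coefficients of the interaction model, derived from the cell log odds.
# Assumed coding: idarubicin = 1, female = 1; reference is male, daunorubicin.
b0 = logit(p_male_dau)
b_trt = logit(p_male_ida) - b0
b_sex = logit(p_female_dau) - b0
b_int = logit(p_female_ida) - b0 - b_trt - b_sex


def predicted_probability(idarubicin, female):
    """Plug the indicators into the model, then convert odds to probability."""
    linear_predictor = (
        b0 + b_trt * idarubicin + b_sex * female + b_int * idarubicin * female
    )
    odds = math.exp(linear_predictor)
    return odds / (1 + odds)


for female, sex_label in ((0, "Males"), (1, "Females")):
    for idarubicin, trt_label in ((0, "daunorubicin"), (1, "idarubicin")):
        prob = predicted_probability(idarubicin, female)
        print(f"{sex_label}, {trt_label}: {100 * prob:.0f}%")
```

The adjusted model from problem 4 uses the same conversion but omits the interaction term, so its three coefficients cannot reproduce all four cell proportions exactly. That is consistent with the one-point difference for males on daunorubicin in the tables for problem 9.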