Biost 536, Fall 2014 Homework #4 November 4, 2014, Page 1 of 4

Biost 536: Categorical Data Analysis in Epidemiology
Emerson, Fall 2014
Homework #4
November 4, 2014

We are interested in associations between prevalence of infarct-like lesions on MRI and various predictors.

1. Fit a logistic regression model investigating prevalence of infarcts as a function of age (modeled continuously) and coronary heart disease (modeled as dummy variables). Provide a scientific interpretation of each of the regression coefficients, including a description of the intercept in the model. (You do not need to describe the methods, or provide CI or p values.)

logit(Pr[infarct]) = β0 + βage(age) + βangina(angina) + βMI(MI), where angina and MI are dummy variables for coronary heart disease (CHD) status (0 = no CHD, 1 = angina, 2 = MI), was the logistic regression model used to compare the prevalence of infarct-like lesions on MRI by age and CHD status. In this model the estimated odds of an infarct-like lesion on MRI among newborns (age = 0) without CHD were 0.0083. Among individuals with the same CHD status, the odds ratio (OR) for an infarct-like lesion on MRI was 1.05 when comparing two groups who differ in age by one year (older relative to younger). After adjusting for age, the OR for an infarct-like lesion on MRI was 1.37 when comparing individuals with angina to those without a diagnosis of CHD and 1.82 when comparing individuals with prior myocardial infarction (MI) to those without a diagnosis of CHD.

2. Fit a logistic regression model investigating prevalence of infarcts as a function of age (modeled continuously), coronary heart disease (modeled as dummy variables), and their multiplicative interaction. Provide a scientific interpretation of each of the regression coefficients, including a description of the intercept in the model. (You do not need to describe the methods, or provide CI or p values.)
logit(Pr[infarct]) = β0 + βage(age) + βangina(angina) + βMI(MI) + βage×angina(age × angina) + βage×MI(age × MI) was the logistic regression model used to compare the prevalence of infarct-like lesions on MRI by age and coronary heart disease (CHD) status. In this model the estimated odds of an infarct-like lesion on MRI among newborns (age = 0) without CHD were 0.0073. Among individuals without CHD, the OR for an infarct-like lesion on MRI was 1.05 when comparing two groups who differ in age by one year. Among newborns, the OR for an infarct-like lesion on MRI was 1.23 when comparing individuals with angina to those without a diagnosis of CHD and 7.00 when comparing individuals with prior myocardial infarction (MI) to those without a diagnosis of CHD.

The interaction terms are ratios of odds ratios. The OR for an infarct-like lesion on MRI comparing individuals who differ in age by one year is multiplied by a factor of 1.00 when CHD status changes from no CHD (CHD = 0) to history of angina (CHD = 1), and by a factor of 0.98 when CHD status changes from no CHD (CHD = 0) to prior MI (CHD = 2). Equivalently, the OR comparing individuals with angina to those without a diagnosis of CHD is multiplied by a factor of 1.00 for each one-year difference in age, and the OR comparing individuals with prior MI to those without a diagnosis of CHD is multiplied by a factor of 0.98 for each one-year difference in age.

3. Fit a logistic regression model that investigates the linearity of the association between the log odds of presence of infarcts and age, after adjustment for coronary heart disease. (Here you do need to describe your methods and results as they relate to the specific question.)
Methods: To determine the association between age and presence of infarct-like lesions on MRI after adjusting for CHD status, the binary indicator of infarct-like lesions on MRI was analyzed using logistic regression with age categorized into 5 categories (65-69, 70-74, 75-79, 80-84, 85+) and CHD status as 3 categories (0 = no CHD, 1 = prior angina, 2 = prior MI). The null hypothesis that all regression coefficients for the age categories equal zero was tested simultaneously using a multiple partial Wald test, and the Wald statistic from this test was used to calculate the p value. As a predefined secondary analysis, a test of nonlinearity was performed if the test of association was found to be statistically significant. For this second test, a logistic regression model was fit that included the dummy variables for age, CHD status as categories, and a linear continuous term for age. If the coefficients for one or more dummy variables in this augmented model were significantly different from zero in a multiple partial Wald test, that would be evidence suggestive of a nonlinear association between infarct-like lesions on MRI and age. The overall test of association acted as a gatekeeper in this testing strategy, and as a result the experiment-wise type I error of the test of nonlinearity was preserved.

Results: The prevalence of infarct-like lesions on MRI in the cohort overall was 31% (1066/3448). Logistic regression analysis comparing the odds of infarct-like lesions on MRI across age groups using a dummy variable model found a statistically significant association between age and infarct-like lesions on MRI (p value <0.0001) after adjusting for CHD status. Since a statistically significant association was found, the second predefined test of nonlinearity in this association was performed.
In that analysis the regression coefficients for the dummy variables were not found to be jointly statistically significant (p value 0.89), so we cannot reject the null hypothesis that the association between age and infarct-like lesions on MRI is linear after adjusting for CHD status.

4. Fit a logistic regression model that investigates whether there is a U-shaped association between the log odds of presence of infarcts and ldl, after adjustment for age. (Here you do need to describe your methods and results as they relate to the specific question.)

Methods: To determine the association between LDL cholesterol and presence of infarct-like lesions on MRI after adjusting for age, the binary indicator of infarct-like lesions on MRI was analyzed using logistic regression on a linear splines model with LDL modeled as categories (0-24, 25-49, 50-174, 175-199, 200+) and age modeled as a continuous linear variable. Ratios of the odds of having an infarct-like lesion on MRI defined by LDL category were evaluated by testing that all regression coefficients for the LDL categories were simultaneously equal to zero. The likelihood ratio test of association was used to determine a two-sided p value based on the standard error of the regression parameters assuming an approximate normal distribution. As a pre-planned second step, a test for a U-shaped association between infarct-like lesions and LDL was performed if there was a significant association between infarct-like lesions on MRI and the linear splines model of LDL. The second test was done to determine whether the association between MRI findings and LDL was nonlinear. A multiple partial Wald test was performed on the linear splines regression coefficients; a significant inequality among at least two of the coefficients would be interpreted as evidence that the association between MRI findings and LDL, after adjusting for age, was nonlinear.
The fitted results from this regression model would then be graphed against LDL to determine the shape of the association. The overall test of association acted as a gatekeeper in this testing strategy, and as a result the experiment-wise type I error of the test of nonlinearity was preserved.

Results: The prevalence of infarct-like lesions on MRI in the cohort overall was 31% (1066/3448). The prevalence of infarct-like lesions on MRI by LDL category was 0.32, 0.34, and 0.32 for LDL categories 0-74, 75-149, and ≥ 150, respectively. Within each LDL category, the ORs for having an infarct-like lesion on MRI when comparing groups who differ in LDL by 1 unit are presented in the table below. Logistic regression analysis comparing the odds of infarct-like lesions on MRI across LDL groups using a linear splines model found a statistically significant association between LDL and infarct-like lesions on MRI (p value <0.0001) after adjusting for age. Since a statistically significant association was found, the second predefined test of nonlinearity in this association was performed. In that analysis the regression coefficients for the linear splines variables were found to be jointly statistically significant (p value 0.04), suggesting that the association between LDL and infarct-like lesions on MRI is nonlinear after adjusting for age. A graph of the fitted values for the primary linear splines model is below.

Table 1: Odds ratio of having infarct-like lesions on MRI when comparing individuals in the same LDL category who differ in LDL by 1 unit, after adjusting for age.

LDL category    OR (95% CI)
0-74            1.005 (0.988-1.022)
75-149          0.999 (0.996-1.003)
≥ 150           1.008 (1.002-1.014)
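
The per-unit odds ratios in Table 1 can be combined to compare any two LDL values at the same age. The Python sketch below is illustrative only and is not the analysis code used for this homework. It assumes that the spline knots sit at the category boundaries in Table 1 (75 and 150), and that each reported OR is the slope within its own segment, as the table caption suggests. The function names are hypothetical.

```python
import math

# Assumptions (not stated explicitly in the write-up): knots at the Table 1
# category boundaries, and one per-unit OR (slope) for each segment.
KNOTS = [75.0, 150.0]
PER_UNIT_OR = [1.005, 0.999, 1.008]  # segments 0-74, 75-149, >= 150


def segment_lengths(ldl, knots=KNOTS):
    """Split an LDL value into the amount that falls in each spline segment."""
    edges = [0.0] + list(knots) + [math.inf]
    return [max(0.0, min(ldl, hi) - lo) for lo, hi in zip(edges[:-1], edges[1:])]


def log_odds_vs_zero(ldl, per_unit_or=PER_UNIT_OR, knots=KNOTS):
    """Age-adjusted log odds at `ldl` relative to LDL = 0 (piecewise linear)."""
    lengths = segment_lengths(ldl, knots)
    return sum(math.log(o) * length for o, length in zip(per_unit_or, lengths))


def odds_ratio(ldl_high, ldl_low):
    """Odds ratio comparing two LDL values among individuals of the same age."""
    return math.exp(log_odds_vs_zero(ldl_high) - log_odds_vs_zero(ldl_low))


if __name__ == "__main__":
    # Comparison across the whole middle segment: equals 0.999 ** 75.
    print(round(odds_ratio(150, 75), 3))
    # Comparison spanning all three segments.
    print(round(odds_ratio(200, 50), 3))
```

Under these assumptions, a comparison that stays within one segment is the per-unit OR raised to the power of the LDL difference (for example, LDL 150 versus 75 gives 0.999^75, roughly 0.93), and a comparison that crosses a knot multiplies the contributions of the segments it spans.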