Random Number: 2470
Biost 536 Homework 2
Due 10-12-14

1. Methods: To determine whether censoring is present in our data, I restricted the data to those who did not die and found the minimum observation time. Those who died during the study left the study because they reached the outcome of interest, so they are not considered censored. If the minimum observation time for those who did not die is greater than or equal to 5 years, then we know that no one left the study during the 5-year period for reasons other than the outcome of death.

Results: The minimum observation time for participants who did not die during the study was 1827 days, which is equivalent to 5.00 years. This provides evidence that there was no censoring present in our data during the 5-year period and that methods such as logistic regression can be used to answer the question of interest.

2. a. Methods: Those with prevalent ASCVD at baseline and those without prevalent ASCVD at baseline were compared using the difference in the probability of death. The crude probability of death within 5 years was estimated for each group using the sample proportion, and the difference in those sample proportions was used to estimate the effect of ASCVD on mortality. The age- and sex-adjusted estimate of the difference in mortality was calculated using a stratified analysis. Age was categorized into 5 age groups (65-69, 70-74, 75-79, 80-84, and 85+ years), and sex is a binary variable. A 95% confidence interval for the difference in mortality probabilities was calculated using a Wald-type confidence interval based on the approximate normal distribution for the maximum likelihood estimates for a binomial distribution. A p-value testing the null hypothesis of no difference in mortality probabilities was computed using the chi-square test.
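
The crude comparison described in these methods can be sketched in a few lines. This is only an illustrative sketch, not the software used for the homework: the function name `risk_difference_wald` and its parameter names are hypothetical, and the counts are the crude death counts reported in the results (68 of 217 with ASCVD, 53 of 518 without).

```python
import math

def risk_difference_wald(events_exposed, n_exposed,
                         events_unexposed, n_unexposed, z=1.96):
    """Crude risk difference with a Wald-type confidence interval.

    Hypothetical helper for illustration; z=1.96 gives an approximate
    95% interval under the normal approximation.
    """
    p1 = events_exposed / n_exposed        # sample proportion, exposed group
    p0 = events_unexposed / n_unexposed    # sample proportion, unexposed group
    diff = p1 - p0
    # Standard error of a difference of two independent binomial proportions
    se = math.sqrt(p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_unexposed)
    return diff, diff - z * se, diff + z * se

# Crude counts taken from the results: deaths within 5 years by ASCVD status
rd, lower, upper = risk_difference_wald(68, 217, 53, 518)
print(f"Risk difference: {rd:.1%} (95% CI: {lower:.1%}, {upper:.1%})")
```

Under these assumptions the sketch gives a crude difference of about 21.1% with an interval of roughly 14.4% to 27.8%, which agrees with the crude estimate reported in the results. The adjusted estimates require the stratum-level data and are not reproduced here.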

Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline. Among adults 65 and older, ASCVD is associated with a crude absolute increase of 21.1% in the probability of mortality (95% CI: 14.4%, 27.8% absolute increase). Among adults of the same age and sex, ASCVD is associated with an absolute increase of 19.3% in the probability of mortality (95% CI: 12.5%, 26.0%). Based on a two-sided p-value <0.001, we reject the null hypothesis that the comparison groups are equal with respect to 5-year mortality.

b. Methods: Those with prevalent ASCVD at baseline and those without prevalent ASCVD at baseline were compared using the difference in the probability of death. The crude probability of death within 5 years was estimated for each group using the sample proportion. Adjusted point and interval estimates for the difference in 5-year mortality were based on a linear regression, with robust standard errors, of the binary indicator of death within 5 years on a binary indicator of prevalent ASCVD at baseline, a binary indicator of sex, and age as a continuous variable. A 95% confidence interval for the age- and sex-adjusted difference in 5-year mortality was calculated using a Wald-type confidence interval, and the p-value was calculated using the Wald test. Both the confidence interval and the p-value were computed assuming the approximate normal distribution for the regression parameter estimates.

Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline. Based on a linear regression model adjusting for age and sex, ASCVD is estimated to be associated with an absolute increase of 18.9% in the probability of mortality (95% CI: 12.2%, 25.7%).
Based on a two-sided p-value <0.001, we reject the null hypothesis that the comparison groups are equal with respect to 5-year mortality.

c. Since the stratified analysis divides the data into strata, it always treats the data as though there are interactions. The stratified analysis also requires that the variables be categorical, which means that age cannot be treated as a continuous variable. Since I was able to treat age as a continuous variable in the linear regression, I gained more precision and accuracy with my risk difference estimate. The confidence intervals are slightly wider in the stratified analysis than in the regression analysis.

3. a. Methods: Those with prevalent ASCVD at baseline and those without prevalent ASCVD at baseline were compared using the odds ratio, which compares the odds of death within 5 years. The crude probability of death within 5 years was estimated for each group using the sample proportion, and the ratio of the odds computed from those sample proportions was used to estimate the effect of ASCVD on mortality. The age- and sex-adjusted estimate of the odds ratio for mortality within 5 years was calculated using a stratified analysis. Age was categorized into 5 age groups (65-69, 70-74, 75-79, 80-84, and 85+ years), and sex is a binary variable. A 95% confidence interval for the odds ratio was calculated using a Wald-type confidence interval based on the approximate normal distribution for the maximum likelihood estimates for a binomial distribution. A p-value testing the null hypothesis of no difference in mortality probabilities was computed using the chi-square test.

Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Based on a stratified analysis adjusting for age and sex, the odds of death within 5 years for those with ASCVD are 3.50 times the odds for those without ASCVD (95% CI: 2.28, 5.36). Based on a two-sided p-value <0.001, we reject the null hypothesis that the comparison groups are equal with respect to 5-year mortality.

b. Methods: Those with prevalent ASCVD at baseline and those without prevalent ASCVD at baseline were compared using the odds ratio, which compares the odds of death within 5 years. The crude probability of death within 5 years was estimated for each group using the sample proportion. Point and interval estimates for the odds ratio were based on a logistic regression of the binary indicator of death within 5 years on a binary indicator of prevalent ASCVD at baseline, a binary indicator of sex, and age as a continuous variable. A 95% confidence interval for the age- and sex-adjusted odds ratio for 5-year mortality was calculated using a Wald-type confidence interval, and the p-value was calculated using the Wald test. Both the confidence interval and the p-value were computed assuming the approximate normal distribution for the regression parameter estimates.

Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline. Based on a logistic regression model adjusting for age and sex, ASCVD is estimated to be associated with 3.57-fold higher odds of 5-year mortality (95% CI: 2.36-fold, 5.38-fold higher odds). Based on a two-sided p-value <0.001, we reject the null hypothesis that the comparison groups are equal with respect to 5-year mortality.

c. Since the stratified analysis divides the data into strata, it always treats the data as though there are interactions.
The stratified analysis also requires that the variables be categorical, which means that age cannot be treated as a continuous variable. Since I was able to treat age as a continuous variable in the logistic regression, I gained more precision and accuracy with my odds ratio estimate. The confidence intervals are wider in the stratified analysis than in the regression analysis.

4. a. Methods: Those with prevalent ASCVD at baseline and those without prevalent ASCVD at baseline were compared using the risk ratio, which compares the risk of death within 5 years. The crude probability of death within 5 years was estimated for each group using the sample proportion, and the ratio of those sample proportions was used to estimate the effect of ASCVD on mortality. The age- and sex-adjusted estimate of the risk ratio for mortality within 5 years was calculated using a stratified analysis. Age was categorized into 5 age groups (65-69, 70-74, 75-79, 80-84, and 85+ years), and sex is a binary variable. A 95% confidence interval for the risk ratio was calculated using a Wald-type confidence interval based on the approximate normal distribution for the maximum likelihood estimates for a binomial distribution. A p-value testing the null hypothesis of no difference in mortality probabilities was computed using the chi-square test.

Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline. Based on a stratified analysis adjusting for age and sex, those with ASCVD are 2.63 times as likely as those without ASCVD to die within 5 years (95% CI: 1.92, 3.62). Based on a two-sided p-value <0.001, we reject the null hypothesis that the comparison groups are equal with respect to 5-year mortality.

b.
Methods: Those with prevalent ASCVD at baseline and those without prevalent ASCVD at baseline were compared using the risk ratio, which compares the risk of death within 5 years. The crude probability of death within 5 years was estimated for each group using the sample proportion. Point and interval estimates for the risk ratio were based on a Poisson regression of the binary indicator of death within 5 years on a binary indicator of prevalent ASCVD at baseline, a binary indicator of sex, and age as a continuous variable. A 95% confidence interval for the age- and sex-adjusted risk ratio for 5-year mortality was calculated using a Wald-type confidence interval, and the p-value was calculated using the Wald test. Both the confidence interval and the p-value were computed assuming the approximate normal distribution for the regression parameter estimates.

Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline. Based on a Poisson regression model adjusting for age and sex, ASCVD is estimated to be associated with 2.72-fold higher risk of 5-year mortality (95% CI: 1.89-fold, 3.91-fold higher risk). Based on a two-sided p-value <0.001, we reject the null hypothesis that the comparison groups are equal with respect to 5-year mortality.

c. Since the stratified analysis divides the data into strata, it always treats the data as though there are interactions. The stratified analysis also requires that the variables be categorical, which means that age cannot be treated as a continuous variable. Treating age as a continuous variable in the Poisson regression let me adjust for age more finely in my risk ratio estimate. Even so, the confidence interval is slightly narrower in the stratified analysis (1.92, 3.62) than in the regression analysis (1.89, 3.91).

5.
A similarity of the 3 approaches is that all 3 rejected the null hypothesis that the comparison groups were equal with respect to 5-year mortality, each with a two-sided p-value <0.001. Another similarity is that all 3 approaches found ASCVD to be associated with an elevated risk of 5-year mortality.

A difference between the approaches is that the risk difference gives the absolute difference, which is more useful when considering public health impact, and its magnitude is the same regardless of which group is subtracted from the other. A disadvantage of the risk difference is that it may be more prone to effect modification. The relative risk is useful for highlighting the magnitude of an association on a multiplicative scale. A drawback is that its magnitude changes depending on whether we discuss the probability of the event or the probability of being event-free, and it is more prone to effect modification with common diseases. The odds ratio is advantageous in that its magnitude is the same whether we discuss the probability of the event or of being event-free, and it is less prone to effect modification. However, the odds ratio does not take the baseline risk in the relevant stratum into account.

In general, I prefer the relative risk since it is easily interpretable, quantifies the risk rather than the odds of an event occurring, and takes incidence into account. Relative risks are especially useful for rare outcomes. If the outcome were common, I might prefer a risk difference instead. Since death within 5 years is not a rare outcome, the odds ratio did not approximate the relative risk very well in this particular study.

6. a. Methods: US-born and foreign-born whites living in the US were compared using incidence ratios of colorectal cancer.
Crude colorectal cancer incidence rates were estimated for each group using the number of colorectal cancer cases and the amount of person-time at risk, and the ratio of those incidence rates was used to estimate the association between birthplace and colorectal cancer incidence. The age-, sex-, and SEER location-adjusted estimates of colorectal cancer incidence were calculated using directly standardized rates, standardized to the US population. Age was categorized into 18 age groups (5-year intervals), sex is a binary variable, and SEER location was a numeric variable identifying the 9 sites. A 95% confidence interval for the incidence ratio was calculated using a Wald-type confidence interval based on the approximate normal distribution for the maximum likelihood estimates for a binomial distribution. A p-value testing the null hypothesis of no difference in colorectal cancer incidence was computed using the chi-square test.

Results: Based on a stratified analysis adjusting for age, sex, and SEER site, the foreign-born are 1.02 times as likely as the US-born to develop colorectal cancer (95% CI: 0.98, 1.05). Our data do not provide strong evidence of an association between birthplace and the development of colorectal cancer, and we do not reject the null hypothesis that the comparison groups are equal with respect to colorectal cancer incidence.

b. Methods: US-born and foreign-born whites living in the US were compared using incidence ratios of colorectal cancer. Crude colorectal cancer incidence rates were estimated for each group using the number of colorectal cancer cases and the amount of person-time at risk, and the ratio of those incidence rates was used to estimate the association between birthplace and colorectal cancer incidence.
Point and interval estimates for the incidence ratio were based on a Poisson regression of the colorectal cancer incidence rate on a binary indicator of birthplace (US vs. foreign), a categorical indicator of age (5-year interval means), a binary indicator of sex, and a categorical indicator of SEER location. A 95% confidence interval for the age-, sex-, and SEER location-adjusted incidence ratio of colorectal cancer was calculated using a Wald-type confidence interval, and the p-value was calculated using the Wald test. Both the confidence interval and the p-value were computed assuming the approximate normal distribution for the regression parameter estimates.

Results: Based on a Poisson regression model adjusting for age, sex, and SEER location, the foreign-born are 0.95 times as likely as the US-born to develop colorectal cancer (95% CI: 0.86, 1.06). Our data do not provide strong evidence of an association between birthplace and the development of colorectal cancer (p=0.359), and we do not reject the null hypothesis that the comparison groups are equal with respect to colorectal cancer incidence.

c. The difference between the statistical models is that the directly standardized rate approach stratified my data and weighted it so that the foreign-born group would have the same distribution of age, sex, and SEER location as the US-born group, which adjusts for potential confounding by those variables. It computed a weighted average rate for the US-born and a weighted average rate for the foreign-born, and then took the ratio of the two average rates. The Poisson regression model did not stratify my data and instead averaged the differences in the log rates and exponentiated the result to obtain a geometric mean of the stratum-specific incidence ratios. Since we are averaging over effect modification, the different methods of analysis give slightly different incidence ratio estimates.
Since these two analytic approaches used different weights when adjusting for age, sex, and SEER location, the estimates are slightly different.
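
The contrast between the two kinds of averaging can be illustrated with a small sketch. Everything here is an assumption made for illustration: the three strata, their counts, and the standard weights are hypothetical and are not the SEER data, and the helper names are invented. The second function uses an inverse-variance weighted average of the log rate ratios as a stand-in for the geometric-mean behavior of the regression; it is not a Poisson regression fit.

```python
import math

# Hypothetical strata, NOT the SEER data. Each row is
# (standard-population weight, US-born cases, US-born person-years,
#  foreign-born cases, foreign-born person-years).
strata = [
    (0.50, 20, 100_000, 6, 20_000),
    (0.30, 90, 60_000, 40, 30_000),
    (0.20, 200, 40_000, 150, 25_000),
]

def directly_standardized_rate(weights, cases, person_years):
    """Weighted arithmetic mean of stratum-specific rates (standard weights)."""
    total = sum(weights)
    return sum(w * c / t for w, c, t in zip(weights, cases, person_years)) / total

def pooled_rate_ratio(rows):
    """Inverse-variance weighted geometric mean of stratum rate ratios.

    An approximation chosen for illustration only; a Poisson regression
    would weight the strata through its likelihood instead.
    """
    num = den = 0.0
    for _, us_cases, us_time, fb_cases, fb_time in rows:
        log_ratio = math.log((fb_cases / fb_time) / (us_cases / us_time))
        inv_var = 1.0 / (1.0 / us_cases + 1.0 / fb_cases)  # 1 / var(log ratio)
        num += inv_var * log_ratio
        den += inv_var
    return math.exp(num / den)

weights = [row[0] for row in strata]
us_rate = directly_standardized_rate(
    weights, [row[1] for row in strata], [row[2] for row in strata])
fb_rate = directly_standardized_rate(
    weights, [row[3] for row in strata], [row[4] for row in strata])

standardized_ratio = fb_rate / us_rate    # ratio of two weighted average rates
pooled_ratio = pooled_rate_ratio(strata)  # weighted average of log ratios

print(f"Ratio of directly standardized rates: {standardized_ratio:.3f}")
print(f"Pooled (geometric mean) rate ratio:   {pooled_ratio:.3f}")
```

The first quantity is a ratio of averages and the second is an average of ratios on the log scale, so the two agree only when the stratum-specific ratios are all equal. With these hypothetical numbers they differ slightly.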