Random Number: 2470
Biost 536
Homework 2
Due 10-12-14
1.
Methods: To determine if censoring is present in our data, I restricted the data to
those who did not die to see what the minimum observation time was. Those that
died during the study left the study due to the achievement of the outcome of
interest, and they would not be considered censored. If the minimum observation
time for those who did not die is greater than or equal to 5 years, then we will know
that no one left the study during the 5-year period for reasons other than the
outcome of death.
Results: The minimum observation time for participants who did not die during the
study was 1827 days which is equivalent to 5.00 years. This provides evidence that
there was no censoring present in our data and that such methods as logistic
regression can be used to answer the question of interest.
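As an illustration, the censoring check described above can be sketched in Python. The records below are hypothetical, not the study data, and the function names are invented for this example. The sketch only assumes that each participant has an observation time in days and a death indicator.

```python
# Hypothetical records: (observation time in days, died during follow-up?)
records = [
    (1827, False),
    (412, True),
    (2105, False),
    (1630, True),
    (1902, False),
]

FIVE_YEARS_DAYS = 5 * 365.25  # 1826.25 days


def min_followup_among_survivors(rows):
    """Shortest observation time among participants who did not die."""
    return min(days for days, died in rows if not died)


def censored_before_five_years(rows):
    """True if any participant who did not die was observed for less than 5 years."""
    return min_followup_among_survivors(rows) < FIVE_YEARS_DAYS


print(min_followup_among_survivors(records))
print(censored_before_five_years(records))
```

With these made-up records the shortest follow-up among survivors is 1827 days, so the check reports no censoring before 5 years.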
2. a.
Methods: The comparison groups of those with baseline prevalence of ASCVD and
those without baseline prevalence of ASCVD were compared using the difference in
the probability of death. Crude estimates for the probability of death within 5 years
were estimated for each group using the sample proportion, and the difference in
those sample proportions was used to estimate the effect of ASCVD on mortality.
The age and sex-adjusted estimates of the risk of mortality were calculated using a
stratified analysis. Age was categorized into 5 age groups (65-69, 70-74, 75-79,
80-84, and 85+ years), and sex is a binary variable. A 95% confidence interval for the
difference in mortality probabilities was calculated using a Wald type confidence
interval based on the approximate normal distribution for the maximum likelihood
estimates for a binomial distribution. A p-value testing the null hypothesis of no
difference in mortality probabilities was computed using the chi square test.
Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without
ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Among adults 65 and older, ASCVD is associated with a crude absolute increase of
21.1% in the probability of mortality (95% CI: 14.4%, 27.8% absolute increase).
Among those adults of the same age and sex, ASCVD is associated with an absolute
increase of 19.3% in the probability of mortality (95% CI: 12.5%, 26.0%). Based on
a two-sided p value <0.001, we reject the null hypothesis that the comparison
groups are equal with respect to 5-year mortality.
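The crude risk difference and its Wald interval can be reproduced from the counts reported above. This is a minimal sketch that assumes the unpooled binomial variance and a normal critical value of 1.96. The function name is made up for this example, and the sketch does not reproduce the age and sex adjustment.

```python
import math


def risk_difference_wald(d1, n1, d0, n0, z=1.96):
    """Difference in sample proportions with a Wald confidence interval."""
    p1, p0 = d1 / n1, d0 / n0
    rd = p1 - p0
    # Unpooled standard error of the difference between two proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return rd, rd - z * se, rd + z * se


# 68 deaths among 217 with ASCVD, 53 deaths among 518 without ASCVD
rd, lo, hi = risk_difference_wald(68, 217, 53, 518)
print(f"{rd:.3f} ({lo:.3f}, {hi:.3f})")  # 0.211 (0.144, 0.278)
```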
b.
Methods: The comparison groups of those with baseline prevalence of ASCVD and
those without baseline prevalence of ASCVD were compared using the difference in
the probability of death. Crude estimates for the probability of death within 5 years
were estimated for each group using the sample proportion. Adjusted point and
interval estimates for the difference in 5-year mortality were based on a linear
regression of the binary indicator of death within 5 years on a binary indicator of
prevalent ASCVD at baseline, a binary indicator of sex, and age as a continuous
variable, using robust standard errors. A 95% confidence interval
for the age and sex adjusted difference in 5-year mortality was calculated using
Wald type confidence intervals and the p-value was calculated using the Wald test.
Both the confidence interval and p value were computed assuming the approximate
normal distribution for the regression parameter estimates.
Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without
ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Based on a linear regression model adjusting for age and sex, ASCVD is estimated to
be associated with an absolute increase of 18.9% in the probability of mortality
(95% CI: 12.2%, 25.7%). Based on a two-sided p value of <0.001, we reject the null
hypothesis that the comparison groups are equal with respect to 5-year mortality.
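To show why linear regression with robust standard errors estimates a risk difference, the sketch below fits the unadjusted model only (death on ASCVD, with no age or sex terms) to individual-level data rebuilt from the reported counts. It assumes the HC0 form of the robust variance. Some software applies a small-sample correction (for example HC1), which would make the standard error slightly larger. It does not reproduce the adjusted estimate of 18.9%, which needs the age and sex data.

```python
import math


def simple_ols_robust(x, y):
    """Slope of y on x by least squares, with an HC0 robust standard error."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Sandwich (HC0) variance of the slope in simple regression
    var_slope = sum(((xi - xbar) * ei) ** 2 for xi, ei in zip(x, resid)) / sxx ** 2
    return slope, math.sqrt(var_slope)


# Rebuild one row per participant from the counts in the Results
x = [1] * 217 + [0] * 518                                   # ASCVD indicator
y = [1] * 68 + [0] * (217 - 68) + [1] * 53 + [0] * (518 - 53)  # death indicator

slope, se = simple_ols_robust(x, y)
print(f"slope = {slope:.4f}, robust SE = {se:.4f}")
```

With a single binary predictor, the slope equals the difference in sample proportions, and the HC0 standard error equals the unpooled Wald standard error for that difference.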
c. Because the stratified analysis divides the data into strata, it always treats the
data as though there are interactions. The stratified analysis also requires that the
variables be categorical, which meant that age could not be treated as a continuous
variable. Since I was able to treat age as a continuous variable in the linear
regression, I gained more precision and accuracy with my risk difference estimate.
The confidence intervals are similar, and at most slightly wider in the stratified
analysis (12.5% to 26.0%) than in the regression analysis (12.2% to 25.7%).
3. a.
Methods: The comparison groups of those with baseline prevalence of ASCVD and
those without baseline prevalence of ASCVD were compared using the odds ratio, which
compares the odds of death within 5 years. Crude estimates for the
probability of death within 5 years were estimated for each group using the sample
proportion, and the ratio of the odds based on those sample proportions was used to
estimate the effect of ASCVD on mortality. The age and sex-adjusted estimates of the
odds of mortality in 5 years were calculated using a stratified analysis. Age was
categorized into 5 age groups (65-69, 70-74, 75-79, 80-84, and 85+ years), and sex is a
binary variable. A 95% confidence interval for the odds ratio was
calculated using a Wald type confidence interval based on the approximate normal
distribution for the maximum likelihood estimates for a binomial distribution. A
p-value testing the null hypothesis of no difference in mortality probabilities was
computed using the chi square test.
Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without
ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Based on a stratified analysis adjusting for age and sex, the odds of mortality within
5 years for those with ASCVD are 3.50 times the odds for those without ASCVD
(95% CI: 2.28, 5.36). Based on a two-sided p value <0.001, we reject the null
hypothesis that the comparison groups are equal with respect to 5-year mortality.
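One common way to pool stratum-specific odds ratios in a stratified analysis is the Mantel-Haenszel estimator. The write-up does not say which pooling method was used, so treating it as Mantel-Haenszel is an assumption. The two strata below are hypothetical counts chosen for illustration, not the study data.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio from 2x2 tables given as (a, b, c, d):
    a = exposed deaths, b = exposed survivors,
    c = unexposed deaths, d = unexposed survivors."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def crude_or(strata):
    """Odds ratio from the single table obtained by collapsing all strata."""
    a, b, c, d = (sum(table[i] for table in strata) for i in range(4))
    return (a * d) / (b * c)


# Hypothetical strata; each has a stratum-specific odds ratio of 4.0
hypothetical_strata = [(10, 20, 5, 40), (30, 15, 20, 40)]

print(round(mantel_haenszel_or(hypothetical_strata), 3))
print(round(crude_or(hypothetical_strata), 3))
```

In this made-up example the pooled estimate recovers the common stratum-specific odds ratio of 4.0, while the collapsed table gives about 3.66, which shows how ignoring the stratification variable can distort the estimate.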
b.
Methods: The comparison groups of those with baseline prevalence of ASCVD and
those without baseline prevalence of ASCVD were compared using the odds ratio, which
compares the odds of death within 5 years. Crude estimates for the
probability of death within 5 years were estimated for each group using the sample
proportion, and point and interval estimates for the odds ratio were based on a
logistic regression of the binary indicator of death within 5 years on a binary
indicator of prevalent ASCVD at baseline, a binary indicator of
sex, and age as a continuous variable. A 95% confidence interval for the age and
sex adjusted odds ratio for 5-year mortality was calculated using Wald type
confidence intervals and the p-value was calculated using the Wald test. Both the
confidence interval and p value were computed assuming the approximate normal
distribution for the regression parameter estimates.
Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without
ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Based on a logistic regression model adjusting for age and sex, ASCVD is estimated
to be associated with 3.57-fold higher odds of 5-year mortality (95% CI: 2.36-fold,
5.38-fold higher odds). Based on a two-sided p value <0.001, we reject the null
hypothesis that the comparison groups are equal with respect to 5-year mortality.
c. Because the stratified analysis divides the data into strata, it always treats the
data as though there are interactions. The stratified analysis also requires that the
variables be categorical, which meant that age could not be treated as a continuous
variable. Since I was able to treat age as a continuous variable in the logistic
regression, I gained more precision and accuracy with my odds ratio estimate.
The confidence intervals are wider in the stratified analysis (2.28 to 5.36) than in
the regression analysis (2.36 to 5.38).
4. a.
Methods: The comparison groups of those with baseline prevalence of ASCVD and
those without baseline prevalence of ASCVD were compared using the risk ratio, which
compares the risk of death within 5 years. Crude estimates for the
probability of death within 5 years were estimated for each group using the sample
proportion, and the ratio of those sample proportions was used to estimate the
effect of ASCVD on mortality. The age and sex-adjusted estimates of the risk of
mortality in 5 years were calculated using a stratified analysis. Age was categorized
into 5 age groups (65-69, 70-74, 75-79, 80-84, and 85+ years), and sex is a binary
variable. A 95% confidence interval for the risk ratio was
calculated using a Wald type confidence interval based on the approximate normal
distribution for the maximum likelihood estimates for a binomial distribution. A
p-value testing the null hypothesis of no difference in mortality probabilities was
computed using the chi square test.
Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without
ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Based on a stratified analysis adjusting for age and sex, those with ASCVD are 2.63
times as likely as those without ASCVD to die in 5 years (95% CI: 1.92, 3.62). Based
on a two-sided p value <0.001, we reject the null hypothesis that the comparison
groups are equal with respect to 5-year mortality.
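For comparison with the adjusted estimate, the crude risk ratio can be computed from the reported counts. The crude value is not reported in this write-up. The sketch assumes a Wald interval built on the log scale, which is a common choice for ratio measures, and the function name is made up for this example.

```python
import math


def risk_ratio_wald(d1, n1, d0, n0, z=1.96):
    """Ratio of sample proportions with a Wald interval on the log scale."""
    rr = (d1 / n1) / (d0 / n0)
    # Large-sample standard error of log(RR)
    se_log = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# 68 deaths among 217 with ASCVD, 53 deaths among 518 without ASCVD
rr, lower, upper = risk_ratio_wald(68, 217, 53, 518)
print(f"crude RR = {rr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

The crude ratio of about 3.06 is somewhat larger than the age and sex adjusted estimate of 2.63 reported above.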
b.
Methods: The comparison groups of those with baseline prevalence of ASCVD and
those without baseline prevalence of ASCVD were compared using a risk ratio, which
compares the risk of death within 5 years. Crude estimates for the probability of death
within 5 years were estimated for each group using the sample proportion, and
point and interval estimates for the risk ratio were based on a Poisson regression
of the binary indicator of death within 5 years on a binary
indicator of prevalent ASCVD at baseline, a binary indicator of sex, and
age as a continuous variable. A 95% confidence interval for the age and sex adjusted
risk ratio for 5-year mortality was calculated using Wald type confidence intervals
and the p-value was calculated using the Wald test. Both the confidence interval and
p value were computed assuming the approximate normal distribution for the
regression parameter estimates.
Results: Death occurred within 5 years in 53 of the 518 participants (10.2%) without
ASCVD at baseline and in 68 of the 217 participants (31.3%) with ASCVD at baseline.
Based on a Poisson regression model adjusting for age and sex, ASCVD is estimated
to be associated with 2.72-fold higher risk of 5-year mortality (95% CI: 1.89-fold,
3.91-fold higher risk). Based on a two-sided p value <0.001, we reject the null
hypothesis that the comparison groups are equal with respect to 5-year mortality.
c. Because the stratified analysis divides the data into strata, it always treats the
data as though there are interactions. The stratified analysis also requires that the
variables be categorical, which meant that age could not be treated as a continuous
variable. Treating age as a continuous variable in the Poisson regression avoids the
loss of information from grouping ages, but it did not improve precision here:
the confidence intervals are slightly narrower in the stratified analysis (1.92 to 3.62)
than in the regression analysis (1.89 to 3.91).
5. A similarity of the 3 approaches is that all 3 results rejected the null hypothesis
that the comparison groups were equal with respect to 5-year mortality with a
two-sided p-value <0.001. Another similarity is that all 3 approaches found ASCVD to be
associated with an elevated risk of 5-year mortality. A difference between the
approaches is that the risk difference gives the absolute difference, which is more
useful when considering public health impact, and its magnitude is the same
regardless of which group is subtracted from the other (only the sign changes). A
disadvantage of the risk difference is that it may be more prone to effect modification.
The relative risk can be useful to highlight the magnitude of an association on a
multiplicative scale. A problem is that its magnitude changes depending on whether we
discuss the probability of the event or of being event-free. It is also more prone to
effect modification with common diseases. The odds ratio is advantageous in that its
magnitude is the same (up to taking the reciprocal) whether we discuss the
probability of the event or of being event-free, and it is less prone to effect
modification. However, the odds ratio does not take the baseline risk in the relevant
stratum into account.
In general, I prefer using the relative risk since it is easily interpretable and
quantifies the risk, as opposed to the odds, of an event occurring, and it is based
directly on the incidence in each group. Relative risks are especially useful for rare
outcomes. If the outcome were common, I might prefer a risk difference instead. Since
death within 5 years is not a rare outcome, the odds ratio did not approximate the
relative risk very well in this particular study (adjusted odds ratios of about 3.5
versus adjusted risk ratios of about 2.7).
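The behavior of the three measures when switching between the event (death) and being event-free (survival) can be checked with the crude proportions reported in this homework. This is only a sketch on the unadjusted counts, and the helper names are made up for this example.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


p1 = 68 / 217   # 5-year mortality with ASCVD at baseline
p0 = 53 / 518   # 5-year mortality without ASCVD at baseline

# Risk difference: switching to survival only flips the sign
rd_death = p1 - p0
rd_survival = (1 - p1) - (1 - p0)

# Risk ratio: the survival ratio is not the reciprocal of the death ratio
rr_death = p1 / p0
rr_survival = (1 - p1) / (1 - p0)

# Odds ratio: the survival ratio is exactly the reciprocal of the death ratio
or_death = odds(p1) / odds(p0)
or_survival = odds(1 - p1) / odds(1 - p0)

print(f"RD: {rd_death:.3f} vs {rd_survival:.3f}")
print(f"RR: {rr_death:.2f} vs {rr_survival:.3f}")
print(f"OR: {or_death:.2f} vs {or_survival:.3f}")
```

On these counts the crude odds ratio (about 4.0) is noticeably larger than the crude risk ratio (about 3.1), consistent with the outcome not being rare.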
6. a.
Methods: The comparison groups of US-born and foreign-born whites living in the
US were compared using incidence rate ratios of colorectal cancer. Crude estimates for
the colorectal cancer incidence rates were estimated for each group using the
number of colorectal cancer cases and amount of person-time at risk, and the ratio
of those incidence rates was used to estimate the risk of colorectal cancer as a
function of birthplace. The age, sex, and SEER location-adjusted estimates of the
colorectal cancer incidence were calculated using directly standardized rates,
standardized to the US population. Age was categorized into 18 age groups (5-year
intervals), sex is a binary variable, and SEER location was a numeric code identifying
the 9 sites. A 95% confidence interval for the incidence rate ratio was calculated
using a Wald type confidence interval based on the approximate normal distribution
for the maximum likelihood estimates for a binomial distribution. A p-value testing
the null hypothesis of no difference in colorectal cancer incidence rates was computed
using the chi square test.
Results: Based on a stratified analysis adjusting for age, sex, and SEER site,
foreign-born are 1.02 times as likely as US-born to develop colorectal cancer (95% CI: 0.98,
1.05). Our data does not provide strong evidence of an association between birth
place and the development of colorectal cancer, and we do not reject the null
hypothesis that the comparison groups are equal with respect to colorectal cancer
incidence.
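Direct standardization, as used above, can be sketched as a weighted average of stratum-specific rates, with weights taken from a standard population. All numbers below are hypothetical (three strata instead of the full age by sex by SEER site cross-classification), and the function name is made up for this example.

```python
def directly_standardized_rate(cases, person_years, std_weights):
    """Weighted average of stratum-specific rates using standard-population weights."""
    total_weight = sum(std_weights)
    weighted = sum(w * c / py for w, c, py in zip(std_weights, cases, person_years))
    return weighted / total_weight


# Hypothetical standard-population weights for three strata
std_weights = [0.6, 0.3, 0.1]

# Hypothetical cases and person-years; stratum-specific rates are identical
# in the two groups, but the foreign-born person-time is concentrated in the
# highest-rate stratum
us_cases, us_py = [20, 150, 400], [200000, 150000, 100000]
fb_cases, fb_py = [5, 40, 300], [50000, 40000, 75000]

crude_ratio = (sum(fb_cases) / sum(fb_py)) / (sum(us_cases) / sum(us_py))
standardized_ratio = (
    directly_standardized_rate(fb_cases, fb_py, std_weights)
    / directly_standardized_rate(us_cases, us_py, std_weights)
)

print(round(crude_ratio, 2))
print(round(standardized_ratio, 2))
```

In this made-up example the crude ratio is about 1.65 even though every stratum has the same rate in both groups. Standardizing both groups to the same weights removes that distortion and gives a ratio of 1.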
b.
Methods: The comparison groups of US-born and foreign-born whites living in the
US were compared using incidence rate ratios of colorectal cancer. Crude estimates for
the colorectal cancer incidence rates were estimated for each group using the
number of colorectal cancer cases and amount of person-time at risk, and the ratio
of those incidence rates was used to estimate the risk of colorectal cancer as a
function of birthplace. Point and interval estimates for the incidence rate ratio were
based on a Poisson regression of the colorectal cancer
incidence rate on a binary indicator of birthplace (U.S. vs.
foreign), a categorical indicator of age (5-year interval means), a binary indicator of
sex, and a categorical indicator for SEER location. A 95% confidence interval for the
age, sex, and SEER location adjusted incidence rate ratio of colorectal cancer was
calculated using Wald type confidence intervals and the p-value was calculated
using the Wald test. Both the confidence interval and p value were computed
assuming the approximate normal distribution for the regression parameter
estimates.
Results: Based on a Poisson regression model adjusting for age, sex and SEER
location, foreign-born are 0.95 times as likely as US-born to develop colorectal
cancer (95% CI: 0.86, 1.06). Our data does not provide strong evidence of an
association between birth place and the development of colorectal cancer
(p=0.359), and we do not reject the null hypothesis that the comparison groups are
equal with respect to colorectal cancer incidence.
c. The difference between the statistical models is that the directly standardized
rate approach stratified my data and weighted the strata so that the foreign-born group
would have the same distribution of age, sex, and SEER location as the US-born
group, which adjusts for potential confounding by those variables. It computed a
weighted average rate for the US-born, a weighted average rate for the foreign-born,
and took the ratio of the two average rates. The Poisson regression model did not
stratify my data; instead it averaged the differences in the log rates and
exponentiated the result, which gives a geometric mean of the stratum-specific rate
ratios. Since we are averaging over effect modification, and the two approaches weight
the age, sex, and SEER strata differently, the rate ratio estimates from the two
methods of analysis are slightly different (1.02 versus 0.95).
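The contrast between a ratio of weighted average rates and a geometric mean of stratum-specific ratios can be illustrated with hypothetical numbers. The rates and weights below are invented. Using the same weights for the geometric mean is a simplification, since a regression model weights strata by the information they carry rather than by standard-population weights.

```python
import math


def ratio_of_weighted_means(rates_a, rates_b, weights):
    """Ratio of two weighted average rates (the standardization-style summary)."""
    num = sum(w * r for w, r in zip(weights, rates_a))
    den = sum(w * r for w, r in zip(weights, rates_b))
    return num / den


def weighted_geometric_mean_ratio(rates_a, rates_b, weights):
    """Exponentiated weighted average of stratum-specific log rate ratios."""
    total = sum(weights)
    log_mean = sum(
        w * math.log(a / b) for w, a, b in zip(weights, rates_a, rates_b)
    ) / total
    return math.exp(log_mean)


weights = [0.5, 0.3, 0.2]                 # hypothetical stratum weights
us_born = [10.0, 50.0, 200.0]             # hypothetical rates per 100,000 person-years
foreign_constant = [12.0, 60.0, 240.0]    # ratio is 1.2 in every stratum
foreign_varying = [15.0, 50.0, 180.0]     # ratios 1.5, 1.0, 0.9 (effect modification)

same_a = ratio_of_weighted_means(foreign_constant, us_born, weights)
same_b = weighted_geometric_mean_ratio(foreign_constant, us_born, weights)
diff_a = ratio_of_weighted_means(foreign_varying, us_born, weights)
diff_b = weighted_geometric_mean_ratio(foreign_varying, us_born, weights)

print(round(same_a, 3), round(same_b, 3))
print(round(diff_a, 3), round(diff_b, 3))
```

When the stratum-specific ratio is constant, both summaries agree. When it varies across strata, the two summaries average it differently and can land on opposite sides of 1, as they do in this made-up example.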