Penalized Methods as Universal Tools for Data Analysis

Lab 2. Data augmentation – Offset method on summarized data

Consider the association between chocolate consumption and risk of stroke in a prospective cohort of middle-aged and elderly men (Larsson et al. Neurology, 2012). The multivariable relative risk of stroke comparing the highest quartile of chocolate consumption (median 62.9 g/week) with the lowest quartile (median 0 g/week) was 0.83 (95% CI 0.70–0.99). Given that the means must be higher than the medians, the RRs at issue are for roughly 70 g (2.5 oz) per week.

The two-way table corresponding to the multivariable-adjusted RR and 95% CI is the following.

    . iri 236 279 46068 45163

                     |   Exposed   Unexposed  |      Total
    -----------------+------------------------+-----------
               Cases |       236         279  |        515
         Person-time |     46068       45163  |      91231
    -----------------+------------------------+-----------
                     |                        |
      Incidence rate |  .0051229    .0061776  |    .005645

a) Poisson regression on summarized data

Create a dataset from the two-way table above and fit a Poisson regression model to estimate the exposure-disease relative risk.

    clear
    input x case py
    1 236 46068
    0 279 45163
    end

    . glm case x , lnoffset(py) fam(poisson)

    ------------------------------------------------------------------------------
                 |                 OIM
            case |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               x |  -.1872204   .0884393    -2.12   0.034    -.3605583   -.0138824
           _cons |  -5.086822   .0598684   -84.97   0.000    -5.204162   -4.969482
          ln(py) |          1  (exposure)
    ------------------------------------------------------------------------------

    . lincom x, eform

    ------------------------------------------------------------------------------
            case |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |    .829261   .0733393    -2.12   0.034     .6972869    .9862135
    ------------------------------------------------------------------------------

The glm command also has an eform option that reports exponentiated coefficients directly.

Greenland S., Orsini N., IMM, KI, Sept 16-17, 2013

b) Poisson regression on augmented data with a null-centred prior

Enter the data corresponding to a prior with 0.95 probability on RR between 0.8 and 1.25. The prior is added as one extra record with x = 1/S, case = A = S^2/prior_v, constant = 0 and offset H = ln(A) - prior_logrr/S.

    scalar prior_logrr = 0
    scalar prior_v = ((log(1.25)-log(.8))/(2*invnormal(.975)))^2

    gen constant = 1
    gen H = log(py)
    set obs `=_N+1'

    scalar S = 30
    scalar A = scalar(S)^2/prior_v
    scalar H = log(scalar(A)) - (prior_logrr/scalar(S))

    replace case = scalar(A) in l
    replace x = 1/scalar(S) in l
    replace H = scalar(H) in l
    replace constant = 0 in l

    . clist

                x       case      py   constant          H
      1.        1        236   46068          1   10.73787
      2.        0        279   45163          1   10.71803
      3. .0333333   69433.65       .          0   11.14813

Fit a Poisson regression model on augmented data. Is the posterior relative risk similar to information-weighted averaging?

    . glm case x constant , fam(poisson) offset(H) nocons eform

    ------------------------------------------------------------------------------
                 |                 OIM
            case |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               x |   .8896238   .0620893    -1.68   0.094     .7758874    1.020033
        constant |   .0059782   .0003298   -92.79   0.000     .0053655    .0066609
               H |          1  (offset)
    ------------------------------------------------------------------------------

The results are negligibly different from those of the inverse-variance weighted approach.

    . postrri , prior(.8 1.25) data(0.83 0.70 0.99) format(%9.0g)

    Posterior median for RR = .8902788
    95% posterior limits for RR ( .7763922, 1.020871)

c) Poisson regression on augmented data with a non-null-centred prior

The pooled relative risk of stroke for approximately 70 g per week of chocolate consumption was 0.707 (95% CI 0.565–0.885). Use the results of this meta-analysis to inform the prior and augment the observed data.

    clear
    input x case py
    1 236 46068
    0 279 45163
    end

Fit a Poisson regression model on augmented data.

    . ppoisson case x , exposure(py) prior(x .565 .885) irr

    Penalized poisson regression                      No. of obs      =          2
    Prior _b[x]: Normal(-0.347, 0.013)
    ------------------------------------------------------------------------------
            case |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               x |   .7811926   .0547403    -3.52   0.000      .680945    .8961984
           _cons |   .0063462   .0003422   -93.84   0.000     .0057097    .0070536
    ------------------------------------------------------------------------------

The results are negligibly different from averaging the ln(RR) estimates weighted by their inverse variances (information, precision), which is a fixed-effects meta-analysis that includes the prior data as a study.

    . postrri , prior(.565 .885) data(0.83 0.70 0.99)

    Posterior median for RR = .7817668
    95% posterior limits for RR ( .6815676, .8966966)