Appendix 1: Derivation of expression for mixture prior in terms of counts instead of rates of
outcome
We can express the mixture prior in terms of rates rather than counts of outcomes, at the expense of
complicating the calculations. Recall that λo is the average number of outcomes based on rate Ro in the
cohort, λg is the average number of outcomes based on rate Rg in the cohort (the rate estimated for the
reference general population), and λw is the average number of outcomes based on rate Rw in the cohort
(the rate estimated for the reference working population). We assume, for simplicity, that the two
reference rates are known with equal precision that involves ignorable error, i.e., the rates can be treated as
constants. The expected count of outcomes in an ideal (latent) reference population is λe, which is based on
the unobserved reference rate Re. In our approach, we are interested in how to combine Rw and Rg to obtain
Re. The unobserved reference rate Re is governed by the weight parameter ω because we propose to view
it as a combination of the rates in the general and working populations, Rg and Rw. We can think of the
distribution of the rate of events in every j-th age-sex-period stratum, given the person-years PYj observed
in that stratum of the cohort of interest, as a weighted average of rates based on the general and working
populations:
[Rej|PYj] ~ ω[Rgj|PYj] + (1-ω)[Rwj|PYj], where the weight ω is the same for each stratum.
It is essential to keep in mind that person-years are derived from the cohort of interest only, such that
person-years observed for the general population used to estimate Rg are not used in these calculations.
This realization allows the following simplification. Across all age-sex-period strata, the expected number
of cases will be
λe = ∑_{j=1}^{J} Rej × PYj = ∑_{j=1}^{J} (ωRgj + (1 − ω)Rwj) PYj = ∑_{j=1}^{J} (ωλgj + (1 − ω)λwj).
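Because ω is the same in every stratum, it factors out of the sum over strata, and the expected count collapses to a weighted average of the two cohort-level expected counts. The step is written out below; the stratum-level identities λgj = Rgj × PYj and λwj = Rwj × PYj, and the totals λg = ∑ λgj and λw = ∑ λwj, are taken to be implied by the definitions above rather than stated explicitly in the text.

```latex
\lambda_e
  = \sum_{j=1}^{J}\bigl(\omega\,\lambda_{gj} + (1-\omega)\,\lambda_{wj}\bigr)
  = \omega \sum_{j=1}^{J}\lambda_{gj} + (1-\omega)\sum_{j=1}^{J}\lambda_{wj}
  = \omega\,\lambda_g + (1-\omega)\,\lambda_w ,
\qquad
\lambda_{gj} = R_{gj}\,PY_j ,\quad \lambda_{wj} = R_{wj}\,PY_j .
```

This is the same form as the lambdae line of the WinBUGS model in Appendix 3.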
Appendix 2: Heuristic illustrating Bayesian calculation of SMR or SIR with two reference
populations
Heuristically, sampling from the posterior distribution of interest proceeds in the following manner:
1. Sample candidate values of λg and λw from the Gamma-distributed priors; retain the values most
compatible with the Poisson distribution of expected counts derived from the PY-structure (g, w).
2. Sample candidate values of ω from the Beta-distributed prior.
3. Calculate λe from the values in steps 1 and 2 using equation (1) and sample the value of E from
the resulting distribution.
4. Sample candidate values of λo from the Gamma-distributed prior; retain the values most compatible
with the Poisson distribution of observed counts (O).
5. Calculate the SMR from the values in steps 3 and 4 using equation (1).
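The steps above can be sketched outside WinBUGS. The Python below is only an illustration and is not the authors' MCMC run. It assumes the priors and data quoted in Appendix 3 and relies on Gamma-Poisson conjugacy: a Gamma(0.01, 0.01) prior combined with one Poisson count y gives a Gamma(0.01 + y, 0.01 + 1) posterior, so the "retain compatible values" steps become direct draws. Like the WinBUGS model, it puts no data on ω, so ω is drawn from its Beta prior, and it omits the separate draw of E in step 3. The names `sample_smr` and `summarize` are hypothetical.

```python
import random
import statistics

def sample_smr(o, g, w, a, b, n_draws=20000, seed=1):
    """Draw SMR = lambda_o / lambda_e values under the mixture reference."""
    rng = random.Random(seed)
    prior_shape, prior_rate = 0.01, 0.01  # dgamma(0.01, 0.01): shape, rate
    scale = 1.0 / (prior_rate + 1.0)      # gammavariate takes a scale, not a rate
    draws = []
    for _ in range(n_draws):
        # Steps 1 and 4: conjugate Gamma posteriors given the Poisson counts
        lam_g = rng.gammavariate(prior_shape + g, scale)
        lam_w = rng.gammavariate(prior_shape + w, scale)
        lam_o = rng.gammavariate(prior_shape + o, scale)
        # Step 2: weight on the general-population reference
        omega = rng.betavariate(a, b)
        # Step 3: mixture of expected counts
        lam_e = omega * lam_g + (1.0 - omega) * lam_w
        # Step 5: SMR as a ratio of means
        draws.append(lam_o / lam_e)
    return draws

def summarize(draws):
    """Return (median, 2.5th percentile, 97.5th percentile) of the draws."""
    s = sorted(draws)
    n = len(s)
    return statistics.median(s), s[int(0.025 * n)], s[int(0.975 * n)]

if __name__ == "__main__":
    # Counts and Beta parameters as listed in the Appendix 3 data blocks
    priors = {
        "indifferent": (3.2618, 3.2618),
        "weak selection": (3.4834, 1.1307),
        "strong selection": (1.5502, 4.1177),
    }
    for label, (a, b) in priors.items():
        med, lo, hi = summarize(sample_smr(287, 422.06, 295.88, a, b))
        print(f"{label}: median SMR {med:.2f} (95% interval {lo:.2f} to {hi:.2f})")
```

Since ω lies between 0 and 1, one would expect the bulk of the draws to sit between the two single-reference SMRs (0.68 and 0.97) used to back-calculate g and w.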
Appendix 3: Code implementing our method in WinBUGS and reproducing results in
Table 1
WinBUGS model that documents the proposed method and illustrates calculations for the heart disease example
model{
#flat gamma priors on observed and expected counts -- conjugate of Poisson
lambdao~dgamma(0.01, 0.01)
lambdag~dgamma(0.01, 0.01)
lambdaw~dgamma(0.01, 0.01)
#sampling weight that determines likelihood of a reference rate being free of selection (HWHE) bias
omega~dbeta(a, b)
#reconcile priors with data under distributional assumptions
o~dpois(lambdao) #observed counts have Poisson distribution
g~dpois(lambdag) #expected counts based on general population rates have Poisson distribution
w~dpois(lambdaw) #expected counts based on working population rates have Poisson distribution
#mean expected counts that account for uncertainty as to whether general or working population rates
#are free of healthy worker effect = selection bias
lambdae<-lambdag*omega+lambdaw*(1-omega)
#true SMR is the ratio of expected values of observed counts and expected counts under the null
SMR<- lambdao/lambdae
}
##############################################################################
#data from Am J Epidemiol. 2012 Nov 15;176(10):909-17. doi: 10.1093/aje/kws171.
#Table 1 for IHD
#alter priors on omega so that they all have approximately the same variance (i.e., not favoring any
#selection scenario)
#o=287, w=287/0.97=295.88, g=287/0.68=422.06
#data with prior that asserts that working and general populations are equally valid references on average
list(o=287, w=295.88, g=422.06, a=3.2618, b=3.2618)
#a=b=3.2618 means that we are 95% sure that omega is less than 0.8 with mode at 0.5
#indifferent: var(omega)=0.033228733
#data with prior that asserts that the general population is the more valid reference on average
list(o=287, w=295.88, g=422.06, a=3.4834, b=1.1307)
#means that we are 95% sure that omega is greater than 0.4 with mode at 0.95
#weak selection: var(omega)=0.032952913
#data with prior that asserts that the working population is the more valid reference on average
list(o=287, w=295.88, g=422.06, a=1.5502, b=4.1177)
#means that we are 95% sure that omega is less than 0.6 with mode at 0.15
#strong selection: var(omega)=0.029799455
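The modes and variances quoted in the comments above can be checked from the standard closed-form expressions for a Beta(a, b) distribution with a, b > 1: mode = (a - 1)/(a + b - 2) and variance = ab/((a + b)^2 (a + b + 1)). The short Python sketch below does only that. It does not verify the 95% statements, and the function names are hypothetical.

```python
def beta_mode(a, b):
    """Mode of Beta(a, b); valid when a > 1 and b > 1."""
    return (a - 1.0) / (a + b - 2.0)

def beta_variance(a, b):
    """Variance of Beta(a, b)."""
    return a * b / ((a + b) ** 2 * (a + b + 1.0))

# (a, b) pairs copied from the three data blocks above
priors = {
    "indifferent": (3.2618, 3.2618),
    "weak selection": (3.4834, 1.1307),
    "strong selection": (1.5502, 4.1177),
}

for label, (a, b) in priors.items():
    print(f"{label}: mode={beta_mode(a, b):.3f}, var={beta_variance(a, b):.6f}")
```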
Appendix 4: Procedure for elicitation of prior knowledge about the strength of healthy worker
hire effect
1. To elicit the mode of [ω], we ask:
“Based on experience, what is the best guess about the proportion of people recruited into <JOB> who
have the same risk of acquiring <DISEASE> as the general population?”
Sample answer: “The best guess is that on average, 5 to 33% of people recruited into <JOB>
would have the same risk of <DISEASE> as the general population.”
Interpretation of the answer: This implies that the best guess is that the appropriate
reference group has ω centred on (0.05+0.33)/2 = 0.19, which we take as the mode of [ω], i.e., a
weight of about 0.19 is given, “on average”, to λg and the remaining 0.81 to λw.
2. To elicit the 95th percentile of [ω], we ask:
“Based on experience, what is the highest guess (i.e., 95% certain) about the proportion of people
recruited into <JOB> who have the same risk of acquiring <DISEASE> as the general population? This
must be higher than your previous guess about the best value.”
Sample answer: “It is 95% certain that no more than (30, 36, 40, 70)% of people recruited into
<JOB> have the same <DISEASE> risk as the general population”.
Interpretation of the answer: The four guesses average to 44%. This implies that experts
are 95% sure that, on average, no more than 44% of the new hires will have the same risk of
<DISEASE> as the general population; this gives a 95th percentile of ω of 0.44.
We chose the mode and the 95th percentile because we imagined that these are familiar thresholds for
epidemiologists; however, any two summaries of the distribution (e.g., two percentiles) that appear to be
workable in a given context would do, as long as the content area experts are comfortable providing them.
The mode and the 95th percentile of the Beta distribution from the specific hypothetical example allow us
to determine its parameters (e.g., using Beta Buster software
http://www.epi.ucdavis.edu/diagnostictests/betabuster.html). In this artificial example (which we
do not use in the main paper) we have a Beta distribution with mode 0.19 and 95th percentile of 0.44, i.e.,
[ω]~Beta(α = 3.1903, β = 10.3376).
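For readers without access to Beta Buster, the same kind of calculation can be sketched in a few lines of Python. This is not the Beta Buster algorithm, only one way to solve the same two conditions. It assumes α, β > 1, so that fixing the mode m gives β = (α - 1)(1 - m)/m + 1, and it then bisects on α until the Beta CDF at the elicited percentile reaches 0.95. The CDF is approximated by Simpson's rule. The function names, the search bounds and the number of integration intervals are all assumptions made for illustration.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for a > 1 and b > 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)

def beta_cdf(x, a, b, n=2000):
    """P(omega <= x) by composite Simpson's rule on [0, x]; n must be even."""
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * beta_pdf(i * h, a, b)
    return total * h / 3.0

def beta_from_mode_and_percentile(mode, q, p=0.95, lo=1.01, hi=50.0):
    """Find (alpha, beta) with the given mode and with P(omega <= q) = p."""
    def b_of(a):
        # beta implied by alpha once the mode is fixed
        return (a - 1.0) * (1.0 - mode) / mode + 1.0
    # With the mode fixed, a larger alpha concentrates mass near the mode,
    # so the CDF at q > mode increases with alpha and bisection applies.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if beta_cdf(q, mid, b_of(mid)) < p:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return a, b_of(a)

alpha, beta = beta_from_mode_and_percentile(mode=0.19, q=0.44)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

The printed values can be compared with the parameters quoted above; small differences would be expected if Beta Buster uses different numerical tolerances.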