schleier - Environmental Statistics Group

Analysis of “Benchmark dose estimation incorporating multiple data sources”
Jerome J. Schleier III
Background
Modeling and decision theory are increasingly used for model and uncertainty
analysis in risk management (Ascough et al. 2008). Information is often generated and gathered
from multiple laboratories that derive toxic endpoints such as the benchmark dose (BMD). Decision
makers and risk analysts are confronted with the issue of combining data from different
laboratories to set an overall threshold for chemicals. However, it has been shown that there is
a large amount of inter-laboratory variability, so accounting for that variability in the
model can allow for better estimation of the dose-response curve (Bailer et al. 2000; Bailer and
Oris 1993; Wheeler and Bailer 2009).
The analysis by Wheeler and Bailer (2009) is based on the findings of Bailer et al. (2000)
that toxicity testing of chemicals shows significant lab-to-lab heterogeneity with respect to the
determination of the toxic threshold for sodium chloride. Therefore, Wheeler and Bailer (2009)
propose the use of a hierarchical model that takes into account the underlying heterogeneity of
each lab and incorporates it into the dose-response equation for sodium chloride.
Methods
Wheeler and Bailer (2009) modeled the BMD by using a two-stage hierarchical model to
incorporate toxicology data from multiple laboratories. They use reproductive inhibition (RI),
which they define as the concentration that decreases the expected brood size below that at the
control concentration of zero by some proportion p. They set the RI25 and RI50 as their BMDs,
which they estimate for the population and for each laboratory (Bailer et al. 2000; Bailer and
Oris 1993). The data they use are from the U.S. Environmental Protection Agency Region IX
reference toxicity database on the effect of sodium chloride on the reproduction of
Ceriodaphnia dubia (Wheeler and Bailer 2009).
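As a rough illustration of how an RI value follows from a fitted curve, the sketch below assumes a log-linear quadratic mean, mu(c) = exp(beta0 + beta1*c + beta2*c^2), and solves mu(c) = (1 - p) * mu(0) for the smallest positive concentration. The function name and the coefficient values are hypothetical and are not taken from Wheeler and Bailer (2009). The intercept beta0 cancels from the ratio mu(c) / mu(0), so it does not appear.

```python
import math


def reproductive_inhibition(beta1, beta2, p):
    """Smallest positive c with exp(beta1*c + beta2*c**2) == 1 - p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    target = math.log(1.0 - p)  # log of the retained fraction, always negative
    if beta2 == 0.0:
        # Purely log-linear curve: a decline requires a negative slope.
        if beta1 >= 0.0:
            raise ValueError("the expected response never declines")
        return target / beta1
    # Solve beta2*c**2 + beta1*c - target = 0 with the quadratic formula.
    discriminant = beta1 ** 2 + 4.0 * beta2 * target
    if discriminant < 0.0:
        raise ValueError("the expected response never falls by proportion p")
    root = math.sqrt(discriminant)
    candidates = [(-beta1 + root) / (2.0 * beta2), (-beta1 - root) / (2.0 * beta2)]
    positive = [c for c in candidates if c > 0.0]
    if not positive:
        raise ValueError("no positive concentration reaches the target")
    return min(positive)


# Hypothetical coefficients (concentration in arbitrary units), for illustration only.
ri25 = reproductive_inhibition(beta1=0.2, beta2=-0.5, p=0.25)
ri50 = reproductive_inhibition(beta1=0.2, beta2=-0.5, p=0.50)
print(f"RI25 = {ri25:.3f}, RI50 = {ri50:.3f}")
```

Applying the same function to each posterior draw of the coefficients would give a posterior distribution for the RI rather than a single point estimate.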
To estimate the population response they use a Poisson model with mean $\mu_{jc}$ for lab j at
concentration c and a logarithmic link function that relates the mean response to a quadratic
polynomial in concentration. The fixed-effects form was determined by Bailer et al. (2000), and
Wheeler and Bailer (2009) added the random effects of laboratory heterogeneity, as in the
following equation:
\[ \log(\mu_{jc}) = (\beta_0 + b_{0j}) + (\beta_1 + b_{1j})\,c + (\beta_2 + b_{2j})\,c^2 \]
where $b_{kj}$ is the lab random effect for the jth lab. The prior distributions were assumed to
be flat for all parameters estimated. The lab-specific coefficients were modeled as
$\beta_j \sim N_3(\beta, D)$, where
\[ D = \begin{bmatrix} \sigma_{00} & 0 & 0 \\ 0 & \sigma_{11} & 0 \\ 0 & 0 & \sigma_{22} \end{bmatrix} \]
Equivalently, $\beta_{kj} \sim N(\beta_k, \sigma_{kk})$ for k = 0, 1, 2. The uncertainty of the
parameters was reflected by $\beta_k \sim N(0, 10^6)$ and
$\sigma_{kk} \sim \mathrm{InvGamma}(0.001, 0.001)$. They ran three chains with a burn-in of
10,000 samples, and once the model converged 20,000 additional samples were taken from the
posterior distribution of the model parameters. A difficulty in their notation is that they
leave one symbol undefined. This is troubling because the probability density function needs
notation that makes clear which prior distribution belongs to which parameter; I am assuming
that the undefined symbol equals p. If it does not equal p, then when the distributions are
multiplied together the product would contain p, the undefined symbol, and x, and it would be
unclear how the undefined symbol relates to p (Lynch 2007).
For comparison, Wheeler and Bailer (2009) perform the same analysis with the data pooled
across laboratories, using the following equation,
\[ \log(\mu_c) = \beta_0 + \beta_1 c + \beta_2 c^2 \]
All of the parameters in the model were given the same prior, $\beta_k \sim N(0, \sigma^2 = 10^6)$.
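The relationship between the hierarchical and pooled mean structures described above can be sketched in a few lines. This is only a forward simulation of expected brood sizes, not the authors' MCMC fit: the function names, coefficient values, variances, and concentrations are all hypothetical, and the variance terms are assumed to be variances (not standard deviations) of independent normal lab effects.

```python
import math
import random


def lab_mean_curves(beta, variances, concentrations, n_labs, seed=0):
    """Draw lab-specific coefficients and return each lab's expected brood sizes."""
    rng = random.Random(seed)
    curves = []
    for _ in range(n_labs):
        # Second stage: each lab coefficient is normal around the population value.
        lab_beta = [rng.gauss(b, math.sqrt(v)) for b, v in zip(beta, variances)]
        # First stage mean on the log scale: quadratic in concentration.
        curves.append([
            math.exp(lab_beta[0] + lab_beta[1] * c + lab_beta[2] * c * c)
            for c in concentrations
        ])
    return curves


def pooled_mean_curve(beta, concentrations):
    """Expected brood sizes when every lab shares the same coefficients."""
    return [math.exp(beta[0] + beta[1] * c + beta[2] * c * c) for c in concentrations]


# Hypothetical values chosen only to make the sketch runnable.
beta = [3.0, 0.2, -0.5]
variances = [0.05, 0.01, 0.01]
concentrations = [0.0, 0.5, 1.0, 1.5, 2.0]

hierarchical = lab_mean_curves(beta, variances, concentrations, n_labs=4)
pooled = pooled_mean_curve(beta, concentrations)
# With all lab variances set to zero, every lab curve collapses to the pooled curve.
no_heterogeneity = lab_mean_curves(beta, [0.0, 0.0, 0.0], concentrations, n_labs=4)
```

Seen this way, the pooled analysis is the special case of the hierarchical model in which the lab-to-lab variances are zero.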
Discussion
Wheeler and Bailer’s (2009) model can be thought of as representing the underlying
dose-response relationship, modeled using quadratic Poisson regression, with the underlying
laboratory variability added in. Wheeler and Bailer’s (2009) interpretation of the “true”
estimate is really the estimated average response for all of the laboratories based on the
quadratic Poisson regression. Therefore their model can be simplified and thought of as a
collection of dose-response quadratic Poisson regressions that estimates the posterior
distributions for each lab (Table 2; Figure 1). Their model can be simplified to the following,
\[ \log(\mu_{jc}) = b_{0j} + b_{1j}\,c + b_{2j}\,c^2 \]
where the b’s are estimated for each lab, and the average of the b coefficients is the average
laboratory response. This is clearly shown in Tables 1 and 2, in which the estimated average and
95% credible intervals for the RI25 and RI50 across all of the labs are the same as those from
their hierarchical model above.
Wheeler and Bailer (2009) assume that the model of Bailer et al. (2000) is the
appropriate dose-response model for sodium chloride in C. dubia. This may not be a correct
assumption, because Bailer et al. (2000) used model selection techniques based on visual analysis
of fit to select their model and not a biological basis for their selection (discussed below).
Wheeler and Bailer (2009) used the deviance information criterion (DIC) to compare
the pooled-data model and the hierarchical model that incorporates laboratory variability (Table
1). Their argument is that the model incorporating laboratory variability provides a better fit
based on the DIC. This argument is a frequentist-type analysis similar to that used for ANOVA
comparisons of models. Based on Figure 1 it is clear that there is a large amount of variability
in the estimated responses for the laboratories. As we discussed in class, the selection of a
model should have a theoretical basis where two competing theories are both probable. I do not
agree with their interpretation because if they know from previous studies that there is large
laboratory variability (their analysis and Figure 1 also clearly demonstrate it), then the model
that best describes the data should incorporate it into the analysis. In addition, an analysis
that takes this variability into account is going to provide a better fit to the data because a
larger number of coefficients are being estimated. Generally, incorporating more parameters into
a model will provide a better fit if those parameters are linked to the data and provide
information that the other parameters do not.
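For readers unfamiliar with how the DIC weighs fit against model size, the standard definition is DIC = Dbar + pD, where Dbar is the posterior mean of the deviance (-2 times the log-likelihood) and pD = Dbar - D(posterior mean) is the effective number of parameters. The sketch below computes both for a deliberately simple case, a single Poisson mean with a conjugate gamma prior, which is a stand-in and not the authors' hierarchical model. The counts, the prior values, and the function names are made up for illustration.

```python
import math
import random


def poisson_deviance(counts, mu):
    """Return -2 * log-likelihood of the counts under a common Poisson mean mu."""
    log_lik = sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)
    return -2.0 * log_lik


def dic(counts, posterior_draws):
    """Return (DIC, pD) computed from posterior draws of the Poisson mean."""
    deviances = [poisson_deviance(counts, mu) for mu in posterior_draws]
    mean_deviance = sum(deviances) / len(deviances)
    posterior_mean = sum(posterior_draws) / len(posterior_draws)
    # Effective number of parameters: mean deviance minus deviance at the mean.
    p_d = mean_deviance - poisson_deviance(counts, posterior_mean)
    return mean_deviance + p_d, p_d


# Made-up brood counts; Gamma(shape, rate) prior on the Poisson mean (assumed values).
counts = [18, 22, 19, 25, 21, 17, 23, 20, 24, 19]
prior_shape, prior_rate = 0.001, 0.001

# Conjugate update: posterior is Gamma(shape + sum(y), rate + n).
rng = random.Random(1)
shape = prior_shape + sum(counts)
scale = 1.0 / (prior_rate + len(counts))  # random.gammavariate takes a scale
draws = [rng.gammavariate(shape, scale) for _ in range(5000)]

dic_value, p_d = dic(counts, draws)
print(f"DIC = {dic_value:.2f}, effective parameters pD = {p_d:.2f}")
```

Because pD grows with the number of effectively free parameters, a model with lab-specific coefficients has to improve the deviance by more than its added complexity before its DIC falls below that of the pooled model.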
Wheeler and Bailer (2009) also argue that analyzing the data with all of the
laboratories pooled could lead to toxicity measurements that underestimate the “true”
toxicity of the chemical. Yet their analysis does not lend credence to this strong statement,
because the estimated RI25 is lower for the model that pooled laboratory data than it is for the
hierarchical model that takes laboratory variability into account. In chemical risk assessment,
the lowest toxicity value will generally be used unless there is strong evidence that a higher
value is the best estimate.
I do not agree with their use of “true” because the response of a population to any
chemical is a distribution, with certain individuals being more or less susceptible depending
on their ability to metabolize the chemical and on their physical health. Therefore all
toxicological endpoints should be treated as distributions. However, for deterministic risk
assessments of chemicals, especially pesticides, point estimates are used that are based on the
average response of the cohort. Wheeler and Bailer (2009) clearly demonstrate in their analysis
that there is no “true” value for a given point estimate, because the variability is so large
even when standardized protocols are used.
Wheeler and Bailer (2009) state “… the curve defined by the posterior mean values of
($\beta_0$, $\beta_1$, $\beta_2$), suggests an increased response associated with sodium
chloride at low concentrations (i.e., the expectation of the linear term is positive but the 95%
credible intervals contain 0)…”. They do not elaborate on the implications of the estimated 95%
credible interval for $\beta_1$ encompassing 0. This could have implications for their model,
because if it is not a quadratic response then their interpretation of an increased response at
lower sodium chloride concentrations would be incorrect. The authors could have addressed this
question in biological terms by examining the rearing conditions of C. dubia. Winner (1989)
showed that a sodium concentration of 26.3 mg/L produced a higher reproduction rate than a
sodium concentration of 4.6 mg/L, which demonstrates that increases in sodium content can have a
positive impact on reproduction in C. dubia. Wheeler and Bailer (2009) also present their
analysis as if it were surprising that results vary widely between laboratories. However, this
has been known, and the differences are associated with the age, sex, environmental rearing
conditions, and susceptibility of the organisms (Schleier III and Peterson
in press).
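The role of the linear coefficient in the low-concentration increase can be made explicit with a short derivation. It assumes the log-quadratic mean $\log \mu(c) = \beta_0 + \beta_1 c + \beta_2 c^2$ with $\beta_2 < 0$; the symbol $c^*$ for the concentration of maximum expected response is introduced here for illustration only.

\[
\frac{d}{dc}\log \mu(c) = \beta_1 + 2\beta_2 c,
\qquad
\left.\frac{d}{dc}\log \mu(c)\right|_{c=0} = \beta_1
\]

\[
\beta_1 > 0,\ \beta_2 < 0
\;\Rightarrow\;
c^* = -\frac{\beta_1}{2\beta_2} > 0,
\qquad
\frac{\mu(c^*)}{\mu(0)} = \exp\!\left(-\frac{\beta_1^2}{4\beta_2}\right) > 1
\]

The expected response therefore rises above the control level only when $\beta_1 > 0$. If $\beta_1 \le 0$, the curve declines from the control concentration onward, which is why a credible interval for $\beta_1$ that contains 0 leaves the low-concentration increase unresolved.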
I think that Wheeler and Bailer’s (2009) analysis is important because it can be used to
estimate endpoints when multiple laboratories contribute data, and it can also be used to show
which laboratories have credible intervals that do not overlap with the others. In fact, their
analysis shows that the 95% credible intervals of laboratory CAMEC1 for both the RI25 and RI50
are below those of all the other laboratories (Table 2). This demonstrates the power of a
Bayesian hierarchical model, because distributions can be obtained for the overall response and
for each laboratory and can be used to evaluate both fixed and random effects based on the
estimates. It is also important because it shows that the laboratory may have quality control
issues or a population that may have a higher susceptibility than any other.
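The screening step described above, finding a laboratory whose credible interval lies entirely below every other laboratory's interval, can be sketched as follows. The lab names and interval endpoints are hypothetical placeholders and are not the values from Table 2 of Wheeler and Bailer (2009).

```python
def labs_below_all_others(intervals):
    """Return labs whose interval upper bound is below every other lab's lower bound."""
    flagged = []
    for lab, (_, upper) in intervals.items():
        other_lowers = [low for name, (low, _) in intervals.items() if name != lab]
        if other_lowers and all(upper < low for low in other_lowers):
            flagged.append(lab)
    return flagged


# Hypothetical 95% credible intervals for one endpoint (lower, upper).
example_intervals = {
    "lab_A": (0.40, 0.70),
    "lab_B": (0.85, 1.30),
    "lab_C": (0.90, 1.45),
    "lab_D": (0.80, 1.20),
}

print(labs_below_all_others(example_intervals))
```

A flagged laboratory is a candidate for follow-up rather than proof of a problem, since a non-overlapping interval is consistent with either a quality control issue or a more susceptible test population.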
There is a distinct need in chemical risk assessment for incorporating data from different
sources, especially when there is known underlying variability in the data (Assmuth and Hilden
2008; Ellison 1996). Wheeler and Bailer’s (2009) analysis should be used by decision makers
when data are available from multiple sources, to obtain better estimates of toxicological
endpoints. Bayesian analysis techniques have been underutilized in environmental and public
health, risk assessment, ecology, and the environmental sciences (Clark 2005). Their method
derives toxicological endpoints from multiple sources of data and provides a framework that
assessors and managers can use to obtain information on the underlying distributions and
uncertainty of toxicological analyses (Assmuth and Hilden 2008; Linkov et al.
2009).
Table 1: Comparison of the posterior estimates of $\beta_k$ and of the 25% and 50% reproductive
inhibition concentrations (RI25 and RI50) for the pooled-data model and the hierarchical model
that takes laboratory variability into account.
Table 2: Posterior mean and standard deviation estimates of the lab- and population-average
RI25 and RI50 for the hierarchical model that takes laboratory variability into account.
Figure 1: The posterior means of the expected number of offspring given the lab-source
variability. The solid line represents the population average and the dotted lines represent the
individual laboratories.
Literature Cited
Ascough, J.C., H.R. Maier, J.K. Ravalico, and M.W. Strudley. 2008. Future research challenges
for incorporation of uncertainty in environmental and ecological decision-making.
Ecological Modelling 219: 383-399.
Assmuth, T., and M. Hilden. 2008. The significance of information frameworks in integrated risk
assessment and management. Environmental Science and Policy 11: 71-86.
Bailer, A.J., M.R. Hughes, D.L. Denton, and J.T. Oris. 2000. An empirical comparison of
effective concentration estimators for evaluating aquatic toxicity test responses.
Environmental Toxicology and Chemistry 19: 141-150.
Bailer, A.J., and J.T. Oris. 1993. Modeling reproductive toxicity in Ceriodaphnia tests.
Environmental Toxicology and Chemistry 12: 787-791.
Clark, J.S. 2005. Why environmental scientists are becoming Bayesians. Ecology Letters 8: 2-14.
Ellison, A.M. 1996. An introduction to Bayesian inference for ecological research and
environmental decision-making. Ecological Applications 6: 1036-1046.
Linkov, I., D. Loney, S. Cormier, F.K. Satterstrom, and T. Bridges. 2009. Weight-of-evidence
evaluation in environmental assessment: Review of qualitative and quantitative
approaches. Science of the Total Environment 407: 5199-5205.
Lynch, S.M. 2007. Bayesian statistics and estimation for social scientists. Springer
Science+Business Media, LLC, New York, NY, USA.
Schleier III, J.J., and R.K.D. Peterson. in press. Pyrethrins and pyrethroid insecticides. In: O.
Lopez and J. G. Fernández-Bolaños (eds.) Green Trends in Insect Control. Royal Society
of Chemistry, London.
Wheeler, M.W., and A.J. Bailer. 2009. Benchmark dose estimation incorporating multiple data
sources. Risk Analysis 29: 249-256.
Winner, R.W. 1989. Multigeneration life-span tests of the nutritional adequacy of several diets
and culture waters for Ceriodaphnia dubia. Environmental Toxicology and Chemistry 8:
513-520.