Digitized by the Internet Archive in 2011 with funding from MIT Libraries
http://www.archive.org/details/bayesempiricalba277dumo
working paper
department
of economics
Bayes and Empirical Bayes Methods
for Combining Cancer Experiments
in Man and Other Species
William H. DuMouchel* and Jeffrey E. Harris**
Number 277
February 1981
massachusetts
institute of
technology
50 memorial drive
Cambridge, mass. 02139
*Department of Mathematics, Massachusetts Institute of Technology. Research supported by National Science Foundation Grant No. MCS-80-05483.
**Department of Economics, Massachusetts Institute of Technology. Research supported by Public Health Service Research Grant No. DA-02620 and Research Career Development Award No. DA-00072.
This paper is also issued as Technical Report No. 24, Department of Mathematics, Massachusetts Institute of Technology.
BAYES AND EMPIRICAL BAYES METHODS FOR COMBINING
CANCER EXPERIMENTS IN MAN AND OTHER SPECIES
by
William H. DuMouchel* and Jeffrey E. Harris**
Massachusetts Institute of Technology
ABSTRACT
This paper offers a method for combining the results of diverse experiments when there is uncertainty about the relevance of some experiments to others. Within a Bayesian framework motivated by Lindley and Smith (1972), the method is used to assess human cancer risks from heterogeneous toxicological and epidemiological data. A distinction is drawn between the sampling error of each experiment and an error of relevance among experiments. The latter error reflects the uncertainty of interspecies extrapolations. It is shown how the experimental data, along with prior information on the credibility of such extrapolations, permit estimation of the human carcinogenic effects of various environmental emissions. A cross-validation method is proposed for selecting the most relevant subset among an array of experiments by eliminating those species or environmental agents which contribute most to the extrapolative error. Finally, other types of prior information on the relationships between experiments are incorporated into the analysis.
Key Words: CARCINOGENESIS; INTERSPECIES EXTRAPOLATION; MUTAGENESIS; POLYAROMATIC HYDROCARBONS; LUNG CANCER
1. INTRODUCTION

This paper offers a statistical method for combining the results of diverse experiments when there is uncertainty about the relevance of some experimental results to others. Our analysis is motivated by the increasing prominence of public policy problems in which decision makers call on multiple disciplines for advice. We seek an "enlargement of statistical techniques to encompass research programs rather than one at a time studies..." (Schneiderman, 1966).

We apply our method to the specific problem of assessing the human cancer risk from an environmental agent when epidemiological data are imprecise or absent, but when precise toxicological studies are available. This problem is more complicated than that posed by Cochran (1980), in which experiments of very similar design differed primarily in their date and location. In our problem, some experiments may be performed in vivo, while others are conducted in cell culture or in subcellular systems. Frequently, the various experiments involve different compounds or mixtures of compounds. Interspecies comparisons are invariably required.

Ideally, the exact relations among these experiments should be determined from fundamental advances in understanding the etiology of cancer. Our more modest goal here is to provide a statistical framework that permits scientists to combine the experimental results with their own prior judgments to reach quantitative conclusions. This objective is similar in spirit to those of Freireich et al. (1966), who compared the toxicity of anti-cancer agents in mouse, rat, hamster, dog, monkey, and man; Meselson and Russell (1977), who compared the mutagenic and carcinogenic potency of 14 compounds; and Crouch and Wilson (1979, 1980), who examined the relative potencies of several chemical carcinogens in various pairs of species, most extensively in rats and mice.
The main idea behind our approach is to characterize precisely the different sources of variation among experiments. In our method, the results of each experiment are summarized by a single number, such as the slope of a dose-response relation. Each slope has an approximate standard error. These errors of measurement are assumed to be independent. The actual slopes, we hypothesize, lie near the response surface of an underlying regression model. Since some environmental agents may have distinctive effects in some species, this regression model linking the experiments necessarily entails some error. The critical factor is the scientist's a priori information on the exchangeability of these errors of interspecies extrapolation.

These ideas are formalized within a Bayesian framework similar to that of Lindley and Smith (1972). We assign prior distributions for the "hyperparameters" of the underlying regression model. Given these prior distributions and the experimental data, we compute the posterior distributions of the dose-response slopes. Empirical Bayes versions of our procedures are also presented.
In the next section, we pose a problem in the assessment of human lung cancer risk from a number of environmental emissions that contain polyaromatic hydrocarbons. Section 3 formally develops our approach. Sections 4 and 5 then apply our method to the data. In Section 6, we offer a simple cross-validation procedure to assist in deciding which experiments are worth including. In Section 7, we discuss the case where a scientist has prior information on the relationships between experiments. The final section critically reviews our approach and suggests further lines of investigation.

Our calculations in this paper are illustrative. We do not propose here to draw conclusions about the public health significance of various ambient concentrations of pollutants. This would require a more thorough discussion than space permits. Nor do we attach special significance to the dose-response models of carcinogenesis from which our data were derived. Harris (1981) has discussed the limitations of the use of such dose-response estimates in predicting excess cancer incidence from ambient population exposures.
2. THE PROBLEM

Table 1 displays the results of two types of carcinogenesis studies of two related environmental emissions, arranged in a 2x2 table. For each of the four experiments, three numbers are given: the observed slope of the dose-response relation; its coefficient of variation (i.e., the ratio of the standard error of the observed slope to its mean); and the natural logarithm of the observed slope. The first row of experiments represents the results of epidemiological studies of occupational exposures to coke oven emissions (Lloyd, 1971; Mazumdar et al., 1975; U.S. Environmental Protection Agency, 1979) and to roofing tar emissions (Hammond et al., 1976). The second row represents the results of skin tumor initiation experiments on dichloromethane extracts of these emissions. The latter experiments were performed under identical conditions in the same laboratory, as part of the U.S. Environmental Protection Agency diesel emission research program (Nesnow et al., 1979; Huisingh et al., 1979). The slopes and their standard errors were estimated by maximum likelihood methods, as described in Harris (1981).
Our goal is to improve the precision of the estimated dose-response slopes for the human studies. The main question is how to use all of the data in Table 1 to achieve this objective.

One difficulty is immediately apparent. The dose-response slopes in man and mice are measured in different units. We might attempt to convert all the experiments into common units, e.g., the incremental lifetime incidence of tumor per mg/kg body weight per day, or the age-specific probability of tumor per cumulative lifetime dose per unit body surface area. The choice of conversion factor, however, is hardly clear.

TABLE 1. 2x2 Experimental Data Matrix

                                            Roofing Tar    Coke Oven
                                            Emissions      Emissions
Lung Cancer (Man)*          (slope)            1.64           4.40
                            (coef. var.)       1.41           0.34
                            (log slope)        0.49           1.48
Skin Tumor Initiation       (slope)            0.54           2.10
(Sencar Mice)**             (coef. var.)       0.04           0.04
                            (log slope)       -0.63           0.74

*Increment in relative risk per 10 μg/m³ extractable organics × years.
**Papillomas/mouse per mg extract at 27 weeks.
One way to circumvent this problem is to consider the relative potencies of the two environmental emissions in each species. Since the dose-response slopes in each row in Table 1 are measured in the same units, the ratios of the slopes are comparable unitless quantities. In fact, a natural hypothesis for combining these data is that the relative potency of the two emissions is preserved across the two biological systems.
The extent to which these data adhere to such an hypothesis can be ascertained in Figure 1, which depicts the means and standard errors of the dose-response slopes on a logarithmic scale. (The error bars correspond to the standard errors of the log slopes, which have been approximated by the coefficients of variation in Table 1.) On a log scale, the difference between coke oven slope and roofing tar slope in man is close to the corresponding difference in mice. To show this, we have also drawn the (weighted) least squares parallel lines on Figure 1, and the fit is good.
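The parallel-lines fit described above can be sketched numerically. The following is a minimal illustration, not the authors' computation: it assumes a reference-cell coding of the additive model (intercept, a mice indicator, a coke oven indicator) and uses the log slopes and coefficients of variation from Table 1 as the data and their approximate standard errors.

```python
import numpy as np

# Log slopes and approximate standard errors (coefficients of variation), Table 1.
# Order: (man, roofing tar), (man, coke oven), (mice, roofing tar), (mice, coke oven).
y = np.array([0.49, 1.48, -0.63, 0.74])
c = np.array([1.41, 0.34, 0.04, 0.04])

# Assumed coding of the parallel-lines model: intercept, mice, coke oven.
X = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])

def weighted_least_squares(X, y, c):
    """Weighted least squares with weights 1/c_i**2; returns coefficients and fitted values."""
    w = 1.0 / c**2
    XtW = X.T * w                      # X' W, with W diagonal
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta, X @ beta

beta, fitted = weighted_least_squares(X, y, c)
# Weighted residual sum of squares, on one degree of freedom for a 2x2 table.
chi_square = float(np.sum(((y - fitted) / c) ** 2))
print("coefficients:", beta)
print("fitted log slopes:", fitted)
print("weighted residual sum of squares:", chi_square)
```

Under this coding the fitted coke oven minus roofing tar difference is forced to be the same in both species, and the weighted residual sum of squares is roughly 0.07 on one degree of freedom, which is one way of quantifying "the fit is good."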
This result could be purely fortuitous. The standard errors of the epidemiological data, especially for roofing tar, are relatively large. But there is a deeper objection. The hypothesis that the relative carcinogenic potency of these two emissions is preserved across species ignores possible interspecies or interorgan differences in the distribution of particulates, the extractability of particulate-bound polyaromatic hydrocarbons, their clearance, metabolism, and genetic and other repair mechanisms. To claim that the totality of data in Table 1 provides more information about the human lung cancer risks from, say, roofing tar exposure than the roofing tar epidemiological study alone is to maintain some degree of prior belief that these interspecies differences are not too large. The uncertainty inherent in such interspecies extrapolations clearly differs from the conventional sampling error of each experiment. If we are to use all of the data in Table 1 to estimate any one slope, then we must devise some measure of the extent of this extrapolative uncertainty.

[FIGURE 1. 2x2 Experimental Data Matrix. Panels: lung cancer; skin tumor initiation. Error bars: ±1 s.e.]
Finally, there is the objection that the hypothesis of preserved relative potencies will not withstand other empirical comparisons. It is possible that such an hypothesis applies accurately only to the comparisons in Table 1, but not to other bioassays or to other environmental emissions. However, if we had no prior belief that the hypothesis should hold any more exactly for roofing tar and coke oven emissions than, say, for automotive particulate emissions or cigarette smoke, then any empirical comparisons that contradict the hypothesis would raise our uncertainty in the current extrapolation.
3. STATISTICAL MODEL

3.1 Notation and Assumptions.

Let $y_{k\ell}$ be the logarithm of the estimated dose-response slope in an experiment on environmental agent $\ell$ in species $k$. In the problem above, $k=1,2$ correspond to epidemiological studies in man and skin tumor initiation experiments in mice, respectively, while $\ell=1,2$ correspond to roofing tar emissions and coke oven emissions, respectively. The variables $y_{k\ell}$ are presumed to be approximately normally distributed with mean $\theta_{k\ell}$ and known standard error $c_{k\ell}$. The assumptions of normality and known standard errors are not unreasonable, since each $y_{k\ell}$ was a maximum likelihood estimate based on a relatively large experiment. The quantities $\theta_{k\ell}$ are the true log dose-response slopes, the primary parameters of interest.
We assume that each $\theta_{k\ell}$ has a symmetric prior distribution with mean value of the form

(3.1)    $E[\theta_{k\ell} \mid \mu, \alpha_k, \gamma_\ell] = \mu + \alpha_k + \gamma_\ell$,

where the hyperparameters $\{\mu, \alpha_k, \gamma_\ell\}$ represent the overall mean effect, species-specific effects, and emission-specific effects, respectively. In our Bayesian framework, these hyperparameters in turn have prior distributions.

Equation (3.1) embodies the hypothesis that the relative potency of two emissions is preserved across species. Moreover, the relative potencies are, on average, a priori just as likely to be larger for one species than for the other. The various $\theta_{k\ell}$ are measured in different units, since they are logs of dose-response slopes for quite different experiments. The additive model (3.1) is meaningful, however, so long as $(\theta_{11}-\theta_{12}) - (\theta_{21}-\theta_{22})$ is a dimensionless quantity, a condition satisfied in our problem above. In that case, the units of measurement for species $k$ can be chosen so that the quantities $\mu$, $\alpha_k$, and $\gamma_\ell$ are similarly dimensionless. Each $\delta_{k\ell} = \theta_{k\ell} - \mu - \alpha_k - \gamma_\ell$ is a species-emission interaction effect, measuring the amount (on a log scale) by which the experiment in species $k$ on emission $\ell$ deviates from the constant relative potency hypothesis.

Conditional on the value of another hyperparameter $\sigma$, we further assume that the $\delta_{k\ell}$ are independently distributed a priori as $N(0,\sigma^2)$. Under this critical assumption, the interaction effects $\delta_{k\ell}$ are a priori exchangeable (de Finetti, 1964). That is, we have no prior information that a deviation of a given magnitude from the constant relative potency model is more likely in one experiment than in any other.
We take care here to elucidate the precise meaning of this form of prior information. We recognize that mechanisms of carcinogenesis may vary considerably among agents or species. Quite different metabolic pathways may be involved. The number of stages in expression of tumor may differ. Any variation that is distinctive to a particular agent in a particular species could result in a marked deviation from our additive model. The exchangeability hypothesis does not exclude the possibility of such deviations. It merely states that we cannot identify a priori which entry in our two-way table of experiments is likely to have the largest deviation.
The hyperparameter $\sigma$ measures our belief in the degree of accuracy of the underlying relative potency model. A value of $\sigma=0.05$, for example, implies that within one normal standard deviation, i.e., with probability 0.68, the additive model is accurate to within an absolute error of 0.05; or equivalently, each dose-response slope conforms to the underlying relative potency model to a multiplicative factor of $\exp(0.05) \approx 1.05$. A prior belief that $\sigma$ is of this magnitude thus implies a relatively high degree of confidence in the underlying model. On the other hand, a value of $\sigma=5$ implies that with probability 0.68, each dose-response slope conforms to the underlying model to a multiplicative factor of $\exp(5) \approx 150$. A belief that $\sigma$ is of this magnitude implies much less faith that the experiments can be profitably combined.
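The translation from $\sigma$ on the log scale to a multiplicative factor on the slope scale is a one-line calculation. The sketch below simply evaluates $\exp(\sigma)$ for the two values discussed in the text; the function name is illustrative only.

```python
import math

def multiplicative_factor(sigma):
    """One-standard-deviation error factor on the slope scale for a log-scale sigma."""
    return math.exp(sigma)

for sigma in (0.05, 5.0):
    print(f"sigma = {sigma}: slope accurate to within a factor of "
          f"{multiplicative_factor(sigma):.2f} with probability 0.68")
```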
We now generalize beyond the 2x2 case considered above, letting $\{y_{k\ell} \pm c_{k\ell};\ k=1,\ldots,K;\ \ell=1,\ldots,L\}$ be a set of experimental observations on the log dose-response slopes for $K$ species and $L$ environmental agents. We also admit the possibility that some $y_{k\ell}$ are missing from a set of contemplated experiments. Except where otherwise noted below, we assume that the set of available experiments is connected, in the sense that any available experiment $(k,\ell)$ can be reached from any other available experiment $(k',\ell')$ by a series of moves from one available experiment to another in which each move is along a single row or column.
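The connectedness condition can be checked mechanically. The sketch below is an illustration under one reading of the definition: two available experiments are treated as adjacent when they share a species (row) or an agent (column), and a breadth-first search tests whether every available experiment can be reached from an arbitrary starting one. The function name and the representation of experiments as `(k, l)` tuples are assumptions made for the example.

```python
from collections import deque

def is_connected(available):
    """Return True if every available (species, agent) cell can be reached from
    any other by moves between available cells sharing a row or a column."""
    cells = set(available)
    if not cells:
        return True
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        k, l = queue.popleft()
        for cell in cells:
            # A move is allowed along a single row (same k) or column (same l).
            if cell not in seen and (cell[0] == k or cell[1] == l):
                seen.add(cell)
                queue.append(cell)
    return seen == cells

print(is_connected([(1, 1), (1, 2), (2, 1), (2, 2)]))  # full 2x2 table
print(is_connected([(1, 1), (2, 2)]))                  # two cells sharing nothing
```

In the second call the two experiments share neither a species nor an agent, so no comparison of relative potencies links them.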
Conditional on $\sigma$, the observed $y_{k\ell}$ are generated by the linear model

(3.2)    $y_{k\ell} = \mu + \alpha_k + \gamma_\ell + \delta_{k\ell} + \varepsilon_{k\ell}$,

where the three sets of variables $\{\mu,\alpha_k,\gamma_\ell\}$, $\{\delta_{k\ell}\}$, and $\{\varepsilon_{k\ell}\}$ are a priori independent, with the $\delta_{k\ell}$ i.i.d. $N(0,\sigma^2)$ and the $\varepsilon_{k\ell}$ independent $N(0,c_{k\ell}^2)$.

Following the usual general linear model formulation, we further replace the expressions $\mu + \alpha_k + \gamma_\ell$ in (3.2) by $X\beta$, where $\beta$ is a column vector of hyperparameters and $X$ is an appropriately chosen design matrix. Of the $K+L+1$ hyperparameters $\{\mu,\alpha_k,\gamma_\ell\}$, at most $K+L-1$ are independently estimable in the classical sense. So long as we use an informative prior distribution on all $K+L+1$ hyperparameters, no restrictions on these hyperparameters are necessary in our Bayesian framework. In other cases, however, particularly when a diffuse prior distribution is employed, we shall assume that $\beta$ corresponds only to the independently estimable components of $\{\mu,\alpha_k,\gamma_\ell\}$ and that $X$ is the corresponding full rank design matrix. Finally, we assume that $\beta$ is a priori multivariate normal with mean vector $b$ and covariance matrix $V$, and that $\sigma$ has a prior distribution with density $\pi(\sigma)$.

Now let $i = 1,\ldots,n$ index the experiments, replacing the paired indices $(k,\ell)$. Let $m$ be the rank of $X$, that is, the number of independently estimable elements of $\beta$. Let $Y$, $\theta$, $\delta$, $\varepsilon$ be $n \times 1$ column vectors replacing $\{y_{k\ell}\}$, $\{\theta_{k\ell}\}$, $\{\delta_{k\ell}\}$, $\{\varepsilon_{k\ell}\}$, respectively. Let $C = \mathrm{diag}(c_1^2,\ldots,c_n^2)$, and let $I$ be the $n \times n$ identity matrix. Our model can be formulated generally as $Y = X\beta + \delta + \varepsilon$, where $\theta = X\beta + \delta$, and

(3.3a)    $\sigma \sim \pi(\sigma)$,
(3.3b)    $\beta \sim N(b, V)$,
(3.3c)    $(\theta \mid \beta, \sigma) \sim N(X\beta, \sigma^2 I)$,
(3.3d)    $(Y \mid \theta) \sim N(\theta, C)$.

This model possesses a hierarchical structure similar to that formulated by Lindley and Smith (1972). The experimental data $Y$ and $C$, as well as $b$, $V$, and the distribution $\pi$, are assumed to be known. The choice of $b$ and $V$ is left unspecified until Section 4. The choice of $\pi$ is more complicated, and will be considered in detail below. We note here that there is no compelling reason or advantage to choose an inverse chi-squared prior distribution for $\sigma^2$, as proposed by Smith (1973a). (Readers who are less interested in the mathematical details of estimation may wish to skim the remainder of Section 3 and resume in earnest at Section 4.)
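To make the notation of (3.2) and (3.3) concrete, the sketch below builds one possible full rank design matrix $X$ with $K+L-1$ columns for a list of available experiments. The reference-cell parametrization used here (intercept, indicators for species $2,\ldots,K$, indicators for agents $2,\ldots,L$) is an assumption made for illustration; the paper requires only that $X$ be an appropriately chosen full rank design matrix.

```python
import numpy as np

def design_matrix(experiments, K, L):
    """Full rank design for mu + alpha_k + gamma_l under a reference-cell coding.

    experiments: list of (k, l) pairs with 1-based species and agent indices.
    Column 0 is the intercept, columns 1..K-1 are species 2..K,
    and columns K..K+L-2 are agents 2..L.
    """
    X = np.zeros((len(experiments), K + L - 1))
    for i, (k, l) in enumerate(experiments):
        X[i, 0] = 1.0
        if k > 1:
            X[i, k - 1] = 1.0
        if l > 1:
            X[i, K + l - 2] = 1.0
    return X

# The 2x2 layout of Table 1: n = 4 experiments, m = K + L - 1 = 3 estimable parameters.
experiments = [(1, 1), (1, 2), (2, 1), (2, 2)]
X = design_matrix(experiments, K=2, L=2)
print(X)
print("rank m =", np.linalg.matrix_rank(X))
```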
3.2 Bayes Estimates. Informative Prior on $\beta$.

Let us suppose that a scientist has prior information on $\beta$, which he expresses through his choice of $b$ and $V$. Such choices could be made directly, as we shall illustrate in Section 7, or by indirect elicitation, as in the method of Kadane et al. (1980).

Estimation of the parameters now proceeds by straightforward Bayesian methodology. First, we compute the marginal distribution of the data given the hyperparameter $\sigma$. From (3.3),

(3.4)    $(Y \mid \sigma) \sim N(Xb,\ C + \sigma^2 I + XVX')$.

In our Bayesian framework, (3.4) can be regarded as the likelihood of $\sigma$ for given a priori values of $b$ and $V$. The posterior distribution of $\sigma$ is therefore

(3.5)    $\pi(\sigma \mid Y) \propto \pi(\sigma)\, |C + \sigma^2 I + XVX'|^{-1/2} \exp\{-\tfrac{1}{2}(Y - Xb)'[C + \sigma^2 I + XVX']^{-1}(Y - Xb)\}$,

where $|A|$ is the determinant of $A$. For future reference, we define the posterior expectation of $\sigma^2$ as

(3.6)    $\sigma^{*2} = \int_0^\infty \sigma^2\, \pi(\sigma \mid Y)\, d\sigma$,

which can be interpreted as the approximate risk in estimating a particular $\theta_i$ by the corresponding element of $X\beta$ under squared error loss.

Now consider the posterior distribution of $\beta$. Denoting the posterior density by $f(\beta \mid Y)$, we have from (3.3)

(3.7)    $f(\beta \mid Y) = \int_0^\infty f(\beta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$,

where $f(\beta \mid Y, \sigma)$ is the multivariate normal density $N(\tilde\beta, \tilde V)$, and

$\tilde\beta = \tilde V [X'(C + \sigma^2 I)^{-1} Y + V^{-1} b]$,
$\tilde V = [X'(C + \sigma^2 I)^{-1} X + V^{-1}]^{-1}$.

The posterior distribution of $\beta$ is therefore a mixture of multivariate normal distributions with mixing probabilities given by (3.5), the posterior density of $\sigma$. Equations (3.7) are derived by the familiar rule for computing posterior distributions for the normal data, normal prior, known variance case. (See, e.g., Raiffa and Schlaifer (1968).) With $\sigma$ known, the least squares estimator $\hat\beta = (X'WX)^{-1}X'WY$, where $W = (C + \sigma^2 I)^{-1}$, has mean $\beta$ and precision matrix $X'WX$ (where, for the sake of this intuitive argument, $X$ is necessarily of full rank). Moreover, the prior distribution of $\beta$ has mean $b$ and precision $V^{-1}$. The rule for computing the posterior distribution is to weight the least squares estimate and the prior mean by their precisions, with the precision of the result equal to the sum of these precisions.

In the analysis below, we shall be interested in the fitted values $X\beta$ derived from the underlying constant relative potency hypothesis. The posterior density of $X\beta$ is the corresponding mixture of multivariate normal densities $N(X\tilde\beta, X\tilde V X')$, where the mixing probabilities are still $\pi(\sigma \mid Y)$, and $\tilde\beta$ and $\tilde V$ are defined in (3.7).

Consider, finally, the estimation of $\theta$. If $g(\theta \mid Y)$ is the posterior density of $\theta$, we have

(3.8)    $g(\theta \mid Y) = \int_0^\infty g(\theta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$,

where $g(\theta \mid Y, \sigma)$ is the multivariate normal density $N(\tilde\theta, \tilde C)$, and

(3.9)    $\tilde\theta = \tilde C [C^{-1} Y + (XVX' + \sigma^2 I)^{-1} Xb]$,
         $\tilde C = [C^{-1} + (XVX' + \sigma^2 I)^{-1}]^{-1}$.

The posterior distribution of $\theta$ is similarly a mixture of normal distributions. The means and covariances of these normal distributions, given by (3.9), are derived in a manner analogous to (3.7), where, by (3.3), $\theta \mid \sigma \sim N(Xb, XVX' + \sigma^2 I)$ and $Y \mid \theta, \sigma \sim N(\theta, C)$. Each $\tilde\theta$ is a weighted average of the original data $Y$ and the corresponding prior prediction from the underlying constant relative potency model, $Xb$, where the weights are the corresponding precisions.
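For a fixed value of $\sigma$, the conditional posterior moments in (3.7) and (3.9) are direct matrix computations. The following sketch evaluates them with NumPy for the Table 1 log slopes. The prior mean `b`, prior covariance `V`, and the value of `sigma` below are arbitrary placeholders chosen only to exercise the formulas; they are not values used in the paper, and the reference-cell design matrix is likewise an assumption.

```python
import numpy as np

def conditional_posterior(Y, c, X, b, V, sigma):
    """Posterior moments of beta (3.7) and theta (3.9), conditional on sigma."""
    n = len(Y)
    C = np.diag(c**2)
    C_inv = np.linalg.inv(C)
    W = np.linalg.inv(C + sigma**2 * np.eye(n))
    V_inv = np.linalg.inv(V)
    # (3.7): combine least squares information and prior by their precisions.
    V_post = np.linalg.inv(X.T @ W @ X + V_inv)
    beta_post = V_post @ (X.T @ W @ Y + V_inv @ b)
    # (3.9): prior precision of theta given sigma is (X V X' + sigma^2 I)^{-1}.
    P = np.linalg.inv(X @ V @ X.T + sigma**2 * np.eye(n))
    C_post = np.linalg.inv(C_inv + P)
    theta_post = C_post @ (C_inv @ Y + P @ (X @ b))
    return beta_post, V_post, theta_post, C_post

# Table 1 log slopes and approximate standard errors.
Y = np.array([0.49, 1.48, -0.63, 0.74])
c = np.array([1.41, 0.34, 0.04, 0.04])
X = np.array([[1.0, 0, 0], [1.0, 0, 1], [1.0, 1, 0], [1.0, 1, 1]])
b = np.zeros(3)            # placeholder prior mean (assumption)
V = 4.0 * np.eye(3)        # placeholder prior covariance (assumption)
sigma = 0.5                # one illustrative value of sigma

beta_post, V_post, theta_post, C_post = conditional_posterior(Y, c, X, b, V, sigma)
print("posterior mean of theta given sigma:", theta_post)
print("posterior s.d. of theta given sigma:", np.sqrt(np.diag(C_post)))
```

The unconditional posterior in (3.8) would then be obtained by averaging such conditional results over $\pi(\sigma \mid Y)$.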
3.3 Bayes Estimates. Diffuse Prior on $\beta$.

Calculation of the posterior distributions (3.5), (3.6), (3.8) requires us to specify the mean $b$ and the covariance matrix $V$ of the prior distribution of the hyperparameters $\beta$. In many situations, however, prior information about $\beta$ will be extremely vague. That is, the covariance matrix $V$ will be large. To investigate such cases in detail, we first need the following lemma.

Lemma: Let $U$ be an $n \times m$ matrix of rank $m < n$, let $I$ be the $n \times n$ identity matrix, and let $t$ be a scalar. Then as $t \to \infty$,

(3.10)    $(I + tUU')^{-1} = I - U(U'U)^{-1}U' + t^{-1}(UU')^{+} + O(t^{-2})$,
(3.11)    $|I + tUU'| = t^{m}\, |U'U|\, [1 + t^{-1}\,\mathrm{tr}(UU')^{+} + O(t^{-2})]$,

where $A^{+}$ is the Moore-Penrose pseudo-inverse of $A$.

Proof: The $n \times n$ matrix $UU'$, which has rank $m$, can be represented as

$UU' = \sum_{j=1}^{m} \lambda_j u_j u_j'$,

where $\{\lambda_j\}$ are the strictly positive characteristic roots of $UU'$ and $\{u_j\}$ are the corresponding characteristic vectors. The $n \times n$ identity matrix can be represented as

$I = \sum_{j=1}^{m} u_j u_j' + \sum_{j=m+1}^{n} v_j v_j'$,

where the unit vectors $\{v_j\}$ are all orthogonal to the characteristic vectors $\{u_j\}$. Combining these two expressions, we have

$I + tUU' = \sum_{j=1}^{m} (1 + t\lambda_j) u_j u_j' + \sum_{j=m+1}^{n} v_j v_j'$.

Now

$(I + tUU')^{-1} = \sum_{j=1}^{m} (1 + t\lambda_j)^{-1} u_j u_j' + \sum_{j=m+1}^{n} v_j v_j'$.

As $t \to \infty$,

$\sum_{j=1}^{m} (1 + t\lambda_j)^{-1} u_j u_j' = t^{-1} \sum_{j=1}^{m} \lambda_j^{-1} u_j u_j' + O(t^{-2})$.

Equation (3.10) now follows from our recognition that $\sum_{j=m+1}^{n} v_j v_j'$ is the orthogonal projection operator which maps $R^n$ onto the subspace of $R^n$ orthogonal to the columns of $U$ (namely $I - U(U'U)^{-1}U'$), while $\sum_{j=1}^{m} \lambda_j^{-1} u_j u_j'$ defines the pseudo-inverse. Similarly,

$|I + tUU'| = \prod_{j=1}^{m} (1 + t\lambda_j) = t^{m} \prod_{j=1}^{m} \lambda_j \left(1 + t^{-1} \sum_{j=1}^{m} \lambda_j^{-1} + O(t^{-2})\right)$.

Equation (3.11) follows from our recognition that $|U'U| = \prod_{j} \lambda_j$ and $\mathrm{tr}(UU')^{+} = \sum_{j} \lambda_j^{-1}$.
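The two expansions in the lemma can be checked numerically. The sketch below uses a small, arbitrarily chosen matrix $U$ (an assumption for illustration) and compares the exact inverse and determinant of $I + tUU'$ with the right-hand sides of (3.10) and (3.11) truncated after the $t^{-1}$ term. The pseudo-inverse is computed from the identity $(UU')^{+} = U(U'U)^{-2}U'$, which holds when $U$ has full column rank.

```python
import numpy as np

# Arbitrary 5 x 2 matrix of rank 2 (illustrative choice); here U'U = diag(7, 3).
U = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [2.0, 0.0]])
n, m = U.shape
UtU_inv = np.linalg.inv(U.T @ U)
projector = np.eye(n) - U @ UtU_inv @ U.T      # I - U (U'U)^{-1} U'
pseudo_inverse = U @ UtU_inv @ UtU_inv @ U.T   # (UU')^+ for full column rank U

def inverse_error(t):
    """Largest absolute entry of the difference between exact and truncated (3.10)."""
    exact = np.linalg.inv(np.eye(n) + t * (U @ U.T))
    approx = projector + pseudo_inverse / t
    return float(np.max(np.abs(exact - approx)))

def determinant_ratio(t):
    """Exact determinant divided by the truncated expansion (3.11)."""
    exact = np.linalg.det(np.eye(n) + t * (U @ U.T))
    approx = t**m * np.linalg.det(U.T @ U) * (1.0 + np.trace(pseudo_inverse) / t)
    return float(exact / approx)

for t in (1e1, 1e2, 1e3):
    print(t, inverse_error(t), determinant_ratio(t) - 1.0)
```

Both discrepancies should shrink roughly like $t^{-2}$ as $t$ grows, in line with the stated remainder terms.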
We now have the following result.

Proposition: If the nonsingular covariance matrix $V$ is replaced by $tV$, where $t$ is a scalar, then as $t \to \infty$:

(a) The posterior density of $\sigma$ approaches

(3.12)    $\pi(\sigma \mid Y) \propto \pi(\sigma)\, |W|^{1/2}\, |X'WX|^{-1/2} \exp\{-\tfrac{1}{2} Y'SY\}$,

where $W = (C + \sigma^2 I)^{-1}$ and $S = W - WX(X'WX)^{-1}X'W$.

(b) The posterior density of $\beta$ approaches $f(\beta \mid Y) = \int f(\beta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$, where $f(\beta \mid Y, \sigma)$ is multivariate normal $N(\tilde\beta, \tilde V)$ and

(3.13)    $\tilde\beta = \tilde V X'(C + \sigma^2 I)^{-1} Y$,
          $\tilde V = [X'(C + \sigma^2 I)^{-1} X]^{-1}$.

(c) The posterior density of $\theta$ approaches $g(\theta \mid Y) = \int g(\theta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$, where $g(\theta \mid Y, \sigma)$ is multivariate normal $N(\tilde\theta, \tilde C)$ and

(3.14)    $\tilde\theta = (I + CR)^{-1} Y$,
          $\tilde C = (C^{-1} + R)^{-1}$,
          $R = \sigma^{-2} [I - X(X'X)^{-1}X']$.

Proof: (a) Note that the quadratic form $Y'SY$ in (3.12) is the sum of squared residuals in the weighted least squares regression of $Y$ on the columns of $X$, where the weights are the diagonal elements of $W$. That the quadratic form $(Y - Xb)'[C + \sigma^2 I + XVX']^{-1}(Y - Xb)$ in (3.5) reduces to $Y'SY$ in (3.12) is a result of expansion formula (3.10), where we set $U = W^{1/2} X V^{1/2}$, so that $[C + \sigma^2 I + XVX']^{-1}$ reduces to $W^{1/2}(I - U(U'U)^{-1}U')W^{1/2} = S$. That $|C + \sigma^2 I + XVX'|^{-1/2}$ in (3.5) becomes proportional to $|W|^{1/2} |X'WX|^{-1/2}$ in (3.12) is a result of the determinant expansion formula (3.11) under the same definition of $U$. (As noted in Section 3.1, we assume here a parametrization in which $X$ is of full rank.)

(b) Expressions (3.13) follow from our setting $V^{-1} = 0$ in expressions (3.7).

(c) That expression (3.9) reduces to (3.14) is a result of the expansion formula (3.10), where we set $U = XV^{1/2}$ in order to evaluate the terms $(XVX' + \sigma^2 I)^{-1}$ in (3.9). We note also that equation (3.14) is a special case of equation (A1) of Smith (1973a).
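The diffuse-prior formulas (3.12) and (3.14) are straightforward to evaluate. The sketch below is an illustrative implementation, not the authors' code: it computes the logarithm of the $\sigma$-dependent factor in (3.12) (omitting $\pi(\sigma)$), normalizes it over a grid that is equally spaced in $\log\sigma$, and evaluates the conditional moments (3.14). The grid size of 21 points and the reference-cell design matrix are assumptions.

```python
import numpy as np

def log_posterior_kernel(sigma, Y, c, X):
    """log of |W|^{1/2} |X'WX|^{-1/2} exp(-Y'SY / 2) in (3.12), without pi(sigma)."""
    w = 1.0 / (c**2 + sigma**2)                 # diagonal of W = (C + sigma^2 I)^{-1}
    XtWX = (X.T * w) @ X
    WY = w * Y
    beta_hat = np.linalg.solve(XtWX, X.T @ WY)  # weighted least squares estimate
    quad = Y @ WY - (X.T @ WY) @ beta_hat       # Y'SY, the weighted residual sum of squares
    _, logdet = np.linalg.slogdet(XtWX)
    return 0.5 * np.sum(np.log(w)) - 0.5 * logdet - 0.5 * quad

def theta_given_sigma(sigma, Y, c, X):
    """Conditional posterior mean and covariance of theta from (3.14)."""
    n = len(Y)
    C = np.diag(c**2)
    R = (np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)) / sigma**2
    theta = np.linalg.solve(np.eye(n) + C @ R, Y)
    cov = np.linalg.inv(np.linalg.inv(C) + R)
    return theta, cov

# Table 1 log slopes and approximate standard errors.
Y = np.array([0.49, 1.48, -0.63, 0.74])
c = np.array([1.41, 0.34, 0.04, 0.04])
X = np.array([[1.0, 0, 0], [1.0, 0, 1], [1.0, 1, 0], [1.0, 1, 1]])

# With a prior uniform in log(sigma), equally spaced log-grid points carry equal
# prior mass, so the normalized kernel gives the discrete posterior weights.
grid = np.exp(np.linspace(np.log(0.05), np.log(5.0), 21))
log_k = np.array([log_posterior_kernel(s, Y, c, X) for s in grid])
weights = np.exp(log_k - log_k.max())
weights /= weights.sum()

for s, wgt in zip(grid[::5], weights[::5]):
    theta, cov = theta_given_sigma(s, Y, c, X)
    print(f"sigma={s:.3f} weight={wgt:.3f} theta={np.round(theta, 3)}")
```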
3.4 Empirical Bayes Estimates.

In the analysis below, we shall also consider empirical Bayes approaches to estimating $\theta$. In these alternative methods, we retain the prior distribution $\theta \mid \beta, \sigma \sim N(X\beta, \sigma^2 I)$ as specified in (3.3c), but use the data $Y$ itself to construct the prior distributions on $\beta$ and $\sigma$. Several options are available. (As Dempster (1980) notes, "there is no such thing as the empirical Bayes estimator.")

First, we could estimate both $\beta$ and $\sigma$ from the data $Y$, by maximum likelihood or other methods, and then assume that the entire prior density for $\beta$ is concentrated at the estimate $\hat\beta$ and the entire prior density of $\sigma$ is concentrated at the estimate $\hat\sigma$. For a given $\sigma$, the maximum likelihood estimate of $\beta$ is the least squares estimate

$\hat\beta(\sigma) = (X'(C + \sigma^2 I)^{-1}X)^{-1} X'(C + \sigma^2 I)^{-1} Y$.

As a function of $\sigma$, the concentrated likelihood, evaluated at $\beta = \hat\beta(\sigma)$, is proportional to

$L(\sigma) = |W|^{1/2} \exp\{-\tfrac{1}{2} Y'(W - WX(X'WX)^{-1}X'W)Y\}$,

where, again, $W = (C + \sigma^2 I)^{-1}$. Let $\hat\sigma_{MLE}$ be the value of $\sigma$ maximizing this likelihood function. If we denote the resulting estimate $\hat\beta(\hat\sigma_{MLE})$ as $\hat\beta_{MLE}$, then the empirical Bayes posterior distribution of $\theta$ is $N(\hat\theta_{MLE}, \hat C_{MLE})$, where

(3.15)    $\hat\theta_{MLE} = \hat C_{MLE} [C^{-1} Y + \hat\sigma_{MLE}^{-2} X \hat\beta_{MLE}]$,
          $\hat C_{MLE} = [C^{-1} + \hat\sigma_{MLE}^{-2} I]^{-1}$.

This estimate treats $\beta$ as if it were fixed and known a priori, even though only an estimate $\hat\beta$ from the data was used.

Alternatively, we could assume a diffuse prior on $\beta$ and estimate only $\sigma$ from the data $Y$. In this case, the appropriate likelihood function for $\sigma$ is equation (3.12) with the prior density $\pi(\sigma)$ omitted, that is,

$L^*(\sigma) = |W|^{1/2}\, |X'WX|^{-1/2} \exp\{-\tfrac{1}{2} Y'SY\}$.

Let $\hat\sigma_{EB}$ be the value of $\sigma$ that maximizes $L^*(\sigma)$. The corresponding empirical Bayes posterior distribution treats $\sigma = \hat\sigma_{EB}$ as if it were known with certainty, and so, by equation (3.14), is $N(\hat\theta_{EB}, \hat C_{EB})$, where

(3.16)    $\hat\theta_{EB} = [I + \hat\sigma_{EB}^{-2}\, C (I - X(X'X)^{-1}X')]^{-1} Y$,

and where $\hat C_{EB}$ is the covariance matrix $\tilde C$ of (3.14) evaluated at $\sigma = \hat\sigma_{EB}$.

It is interesting to note that in the case $\sigma$ is known, Smith (1973b) shows that $\hat\theta_{MLE} = \hat\theta_{EB}$. It is clear from comparison of (3.15) with (3.16) that if $\hat\sigma_{MLE} = \hat\sigma_{EB}$, then $\hat C_{MLE} \le \hat C_{EB}$ (in the sense that $\hat C_{EB} - \hat C_{MLE}$ is non-negative definite). Moreover, $\hat\sigma_{MLE} \le \hat\sigma_{EB}$, since they maximize $L(\sigma)$ and $L^*(\sigma)$ respectively, and since the ratio

$L(\sigma)/L^*(\sigma) = |X'(C + \sigma^2 I)^{-1}X|^{1/2}$

can be shown to be a decreasing function of $\sigma$. Since $L^*$ is the product of $L$ and an increasing function of $\sigma$, its maximum will occur later than that of $L$. The assumption that $\beta = \hat\beta_{MLE}$ with certainty leads to a smaller posterior variance for $\theta$ than the diffuse prior for $\beta$. Therefore, to the extent that $\beta$ is a priori uncertain, the variance $\hat C_{MLE}$ is inappropriately small. In the results below, we shall therefore report the Empirical Bayes estimate (3.16).

Finally, if we wish to avoid the computational burden in determining $\hat\sigma_{MLE}$ or $\hat\sigma_{EB}$, we could begin with the least squares estimate $\hat\beta(0) = (X'C^{-1}X)^{-1}X'C^{-1}Y$. The residual sum of squares for this estimate,

$RSS = \sum_{i=1}^{n} (y_i - x_i \hat\beta(0))^2 / c_i^2$,

has expectation

$E[RSS] = n - m + \sigma^2 \left[\sum_{i=1}^{n} c_i^{-2} - \mathrm{tr}\, C^{-1}X(X'C^{-1}X)^{-1}X'C^{-1}\right]$,

which suggests the estimate

(3.17)    $\hat\sigma_{RSS}^2 = [RSS - (n - m)] \Big/ \left[\sum_{i=1}^{n} c_i^{-2} - \mathrm{tr}\, C^{-1}X(X'C^{-1}X)^{-1}X'C^{-1}\right]$,

where we take $\hat\sigma_{RSS} = 0$ if $RSS < n - m$. (The exact value of $E[RSS]$ was derived for us by H. Chernoff, whose proof is omitted.) In the results below, we shall also report the empirical Bayes estimate of $\theta$ when $\hat\sigma_{RSS}$ is substituted for $\hat\sigma_{EB}$ in (3.16).
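The moment estimate (3.17) requires only one weighted least squares fit. The sketch below evaluates it for the Table 1 log slopes; the function name and the reference-cell design matrix are assumptions made for illustration, and the rank $m$ is taken to be the number of columns of a full rank $X$.

```python
import numpy as np

def sigma_from_rss(Y, c, X):
    """Moment estimate of sigma from the residual sum of squares, as in (3.17)."""
    n, m = X.shape                      # assumes X has full column rank m
    c_inv2 = 1.0 / c**2                 # diagonal of C^{-1}
    XtCinv = X.T * c_inv2               # X' C^{-1}
    A = np.linalg.inv(XtCinv @ X)       # (X' C^{-1} X)^{-1}
    beta0 = A @ (XtCinv @ Y)            # least squares estimate at sigma = 0
    rss = float(np.sum((Y - X @ beta0) ** 2 * c_inv2))
    # tr[C^{-1} X (X' C^{-1} X)^{-1} X' C^{-1}]
    trace_term = float(np.trace(XtCinv.T @ A @ XtCinv))
    denominator = float(np.sum(c_inv2)) - trace_term
    sigma2 = max(rss - (n - m), 0.0) / denominator   # truncated at zero
    return rss, float(np.sqrt(sigma2))

Y = np.array([0.49, 1.48, -0.63, 0.74])
c = np.array([1.41, 0.34, 0.04, 0.04])
X = np.array([[1.0, 0, 0], [1.0, 0, 1], [1.0, 1, 0], [1.0, 1, 1]])

rss, sigma_rss = sigma_from_rss(Y, c, X)
print("RSS =", rss, " n - m =", X.shape[0] - X.shape[1], " sigma estimate =", sigma_rss)
```

For these data the residual sum of squares falls below $n - m = 1$, so the truncation rule sets the estimate to zero.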
4. THE 2x2 CASE

In the next three sections of this paper, we shall assume a diffuse prior on the vector of hyperparameters $\beta$. To be sure, a scientist may have prior information on the composition of each emission, the carcinogenic activities of its constituents, their bioavailability, their possible synergistic interactions, etc. Similarly, a scientist may have prior information on the sensitivity of the mouse skin tumor initiation model in comparison to the human respiratory tract. Our impression, however, is that this type of information is not yet sufficiently refined to offer much help in specifying a precise prior on $\beta$. We recognize that the use of improper priors may involve certain marginalization paradoxes. Discussion of these potential difficulties is deferred to Section 7. At this point, we note that the analysis of the next three sections was repeated under the assumption of a proper prior for $\beta$ with $V = 10^4 I$. The results of all reported quantities were unchanged up to the number of decimals presented.

Devising a prior distribution for the critical hyperparameter $\sigma$ is another matter. Perfect extrapolation from mouse to man or from one environmental agent to another is clearly quite unlikely. To claim that the various experiments in Table 1 are totally irrelevant to each other is likewise too strong. The answer lies somewhere in between.

It seems reasonable to suspect that within a range of one normal standard deviation, i.e., with probability 0.68, the underlying constant relative potency model could be accurate within a multiplicative factor of $\exp(5) \approx 150$, or even $\exp(0.5) \approx 1.6$. To suspect that, with probability 0.68, the model could be accurate within a factor of $\exp(0.05) \approx 1.05$ is more controversial. One of us, in fact, found the possibility of significant interspecies differences in particulate distribution, extraction, clearance, metabolism, etc. so compelling that an a priori error factor of 1.05 seemed unimaginable. On the other hand, we both felt that it would be inappropriate to attach a uniform distribution to $\sigma$, since there is uncertainty even in the order of magnitude of the error. To articulate our differences, and agreements, we formulated two prior distributions on $\sigma$.

Prior A: $\log \sigma$ uniformly distributed on the interval $0.05 \le \sigma \le 5$; and
Prior B: $\log \sigma$ uniformly distributed on the interval $0.5 \le \sigma \le 5$.

These distributions somewhat artificially attach zero probability mass outside the specified intervals [0.05, 5] and [0.5, 5]. As we shall see shortly, however, this restriction does not significantly affect our main conclusions. Giri (1970) has employed a uniform distribution on $\log \sigma$ for $0 < \sigma < \infty$ in his Bayesian model for two-way ANOVA. However, we prefer the use of a proper prior distribution because it compels us to face the task of assessing our beliefs. Priors A and B retain the feature that, within the relevant intervals, the posterior density of $\log \sigma$ will be proportional to the likelihood function.

In order to simplify the computations, we shall evaluate these prior distributions, and therefore the posterior distribution $\pi(\sigma \mid Y)$, only at discrete points in the relevant intervals, equally spaced on the log scale. This means that the posterior distributions of $X\beta$ and $\theta$ will be finite mixtures of normal distributions.
-20-
DuMouchel-Harris
Feb 81
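The discretization just described can be sketched in a few lines. This is only an illustration under stated assumptions: the grid sizes (21 and 11 points) are arbitrary choices, and `toy_log_likelihood` is a hypothetical stand-in for the log of the likelihood in (3.12), which is not reproduced here.

```python
import math

def log_uniform_grid(lo, hi, n):
    """Return n values from lo to hi, equally spaced on the log scale."""
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    return [math.exp(math.log(lo) + j * step) for j in range(n)]

def posterior_weights(grid, log_likelihood):
    """Posterior mass on each grid point when the prior is uniform in log(sigma).

    The grid is equally spaced in log(sigma), so every point carries the same
    prior mass and the posterior mass is proportional to the likelihood.
    """
    logs = [log_likelihood(s) for s in grid]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Grid sizes are illustrative assumptions, not taken from the paper.
grid_a = log_uniform_grid(0.05, 5.0, 21)  # Prior A
grid_b = log_uniform_grid(0.5, 5.0, 11)   # Prior B

def toy_log_likelihood(sigma):
    """Hypothetical stand-in for the log of the likelihood in (3.12)."""
    return -0.5 * sigma ** 2

weights_a = posterior_weights(grid_a, toy_log_likelihood)
weights_b = posterior_weights(grid_b, toy_log_likelihood)
```

With a flat likelihood the function returns equal weights, which is the discrete analogue of the statement that the posterior density of log σ is proportional to the likelihood within each interval.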
Figure 2 displays the posterior densities π(σ|Y) for our two distinct priors.
For the 2x2 data matrix in Table 1, the posterior distribution of σ is clearly
sensitive to the form of the prior distribution that is assumed. In the
interval 0.05 ≤ σ ≤ 0.5, in particular, the likelihood function calculated
from (3.12) is relatively flat. Yet the maximum likelihood estimate of σ is
zero.

This finding is reflected in Table 2, which shows selected statistics of the
posterior distributions of Xβ and θ for the epidemiological studies, based
upon the two prior distributions of σ. Also shown are the results for the
empirical Bayes estimate corresponding to (3.16). (Both σ̂_EB and σ̂_RSS
were zero in this case.)
Although the posterior distribution of θ|Y is a mixture of multivariate
normals (recall (3.8)), the resulting marginal distributions did not in fact
deviate substantially from normality. If θ* = E[θ|Y] and if c_i* is the
standard deviation of θ_i|Y, then the tail probabilities

    Pr{θ_i ≤ θ_i* − 2.326 c_i* | Y},    Pr{θ_i ≥ θ_i* + 2.326 c_i* | Y}

do not deviate substantially from the value of 0.01 predicted for the normal
density. Hence, the mean and standard deviation adequately characterize the
marginals of the posterior distribution of θ.

Because the original coke oven data were relatively precise, the means and
standard deviations of the posterior distributions of the coke oven log
slope do not differ much from the original values of y and c. For the roofing
tar log slope, however, the precision of the posterior distribution depends
critically on the estimate used.
Since prior A admits the possibility of lower values of σ, the corresponding
posterior distribution has a smaller standard deviation. For the empirical
Bayes estimate, which in this case assumes that the underlying constant
relative potency model holds exactly, the only sources of variance in the
posterior distribution of θ are the original sampling errors.

[Figure 2. Posterior densities of σ, 2x2 experimental data matrix. Prior A:
log σ uniform on 0.05 ≤ σ ≤ 5.0. Prior B: log σ uniform on 0.5 ≤ σ ≤ 5.0.]

TABLE 2.
Bayes and Empirical Bayes Estimates of Log Slopes
For Lung Cancer Risk in Man
2x2 Data Matrix

Environmental                        Stand.   Poster.    Lower   Upper
Emission                     Mean    Dev.     Mean Xβ    Tail    Tail

Roofing Tar
  Original Data              0.495   1.415
  θ|Y (Prior B)              0.365   1.152    0.304      0.011   0.013
  θ|Y (Prior A)              0.229   0.788    0.205      0.015   0.025
  θ|Y (Empirical Bayes)      0.135   0.337    0.135

Coke Oven
  Original Data              1.482   0.341
  θ|Y (Prior B)              1.489   0.338    1.550      0.010   0.010
  θ|Y (Prior A)              1.497   0.334    1.522      0.010   0.010
  θ|Y (Empirical Bayes)      1.502   0.331    1.502

θ|Y (Empirical Bayes) assumes π(σ) concentrated at σ̂_EB = 0, and diffuse
prior on β.
Lower Tail = Pr{θ_i ≤ θ_i* − 2.326 c_i* | Y},
Upper Tail = Pr{θ_i ≥ θ_i* + 2.326 c_i* | Y},
where θ_i* and c_i* are the posterior mean and standard deviation of θ_i|Y.
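The lower and upper tail probabilities defined for Table 2 can be computed directly for any finite mixture of normals. The sketch below assumes the mixture weights, component means and component standard deviations are already available; the two-component example at the end is hypothetical and is not taken from the paper's data.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def mixture_summary(weights, means, sds):
    """Mean, standard deviation and 2.326-sd tail probabilities of a normal mixture."""
    parts = list(zip(weights, means, sds))
    mean = sum(w * m for w, m, s in parts)
    second_moment = sum(w * (s * s + m * m) for w, m, s in parts)
    sd = math.sqrt(second_moment - mean * mean)
    lo_cut = mean - 2.326 * sd
    hi_cut = mean + 2.326 * sd
    lower = sum(w * normal_cdf(lo_cut, m, s) for w, m, s in parts)
    upper = sum(w * (1.0 - normal_cdf(hi_cut, m, s)) for w, m, s in parts)
    return mean, sd, lower, upper

# A single normal component reproduces the reference value of about 0.01.
_, _, lower_one, upper_one = mixture_summary([1.0], [0.0], [1.0])

# Hypothetical scale mixture: same mean, very different spreads.
_, _, lower_mix, upper_mix = mixture_summary([0.5, 0.5], [0.0, 0.0], [0.5, 1.5])
```

In the hypothetical scale mixture both tails come out near 0.02, about twice the normal reference value, which shows how far from normality a mixture has to be before the tails move noticeably.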
The posterior mean values of θ are very close to the corresponding posterior
mean values of Xβ. That is, the posterior expectations of the model residuals
δ = θ − Xβ are small. The posterior variances of these residuals, however,
are not so small. Although the variance of each posterior residual δ_i|Y
depends in part on the precision of the original data, that component of the
variance due purely to the underlying model is

    σ*² = E[σ²|Y],                                                  (4.1)

which for prior A in this case is 1.088. In effect, if we were to use these
data to predict δ for another experiment yet to be performed, the standard
deviation of δ, under Prior A, would be 1.04. Under Prior B, the standard
deviation of δ would be 1.76, despite the fact that the empirical Bayes
estimate of σ is σ̂ = 0. (It is straightforward to show that δ|Y is
likewise a mixture of normals, each of which has covariance matrix of the form
σ²I + D, where D vanishes as σ approaches 0.)
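When the posterior of σ is carried on a discrete grid, the quantity in (4.1) is a weighted sum. A minimal sketch follows; the three-point posterior is hypothetical and serves only to exercise the function, while the last line checks the arithmetic quoted in the text (a variance of 1.088 corresponds to a standard deviation of about 1.04).

```python
import math

def sigma_star(grid, weights):
    """Return sigma* = sqrt(E[sigma^2 | Y]) for a discrete posterior on sigma."""
    return math.sqrt(sum(w * s ** 2 for s, w in zip(grid, weights)))

# Hypothetical three-point posterior, for illustration only.
example_grid = [0.5, 1.0, 2.0]
example_weights = [0.5, 0.3, 0.2]
example_value = sigma_star(example_grid, example_weights)

# Arithmetic check of the figures quoted for Prior A in the 2x2 case.
prior_a_sd = math.sqrt(1.088)
```

A posterior concentrated entirely at σ = 0 gives σ* = 0, which is the empirical Bayes situation in the 2x2 case.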
Little credence, we conclude, can be attached to the apparently close fit
of the data in Table 1 to the underlying constant relative potency model.
Any scientist who objects that the data are just "too good to be true" makes
a legitimate claim based on his prior belief that such extrapolative models
are unlikely to be so accurate. The extent to which the totality of data in
Table 1 refines the precision of the estimated human lung cancer risk is, in
effect, a matter of prior opinion.

The problem with the 2x2 case, it appears, is that we don't have enough
precise experiments. The sampling errors for the skin tumor initiation data
in mice are so small that the model is fitted, in effect, to the mouse data.
The predicted relative potencies for the human lung cancer risks merely adjust
to the more precise non-human results. If we are to learn any more about the
extent to which these experiments can be combined, then we need additional
precise experiments. We now proceed in this direction.

5. THE 3X3 CASE

Table 3 is an augmented version of Table 1. In addition to human lung cancer
epidemiological studies and skin tumor initiation experiments in mice, we
have included experiments on the enhancement of oncogenic viral
transformation in Syrian hamster embryo (SHE) cells (Casto et al., 1979). In
addition to studies on roofing tar and coke oven emissions, we have included
experiments on the dichloromethane extracts of particulate emissions from one
light duty diesel engine.
Except for the epidemiological studies, experiments appearing in the same
row were, as above, performed under identical conditions in the same
laboratory. The new slopes and standard errors were, as above, estimated by
maximum likelihood methods, as described in Harris (1981). No epidemiological
study of the human lung cancer risks from exposure to light duty diesel engine
exhaust was available. Although the corresponding cell is left empty, we note
that the set of available experiments is connected, as defined in Section 3.1
above.
The results in Table 3 clearly reveal inconsistencies in the constant
relative potency hypothesis. In the skin tumor initiation experiments, for
example, roofing tar emission extracts were less potent than coke oven
emission extracts. In the viral transformation studies, roofing tar emission
extracts were more potent than coke oven emission extracts.

TABLE 3.
3x3 Experimental Data Matrix

                                       Roofing      Coke         Diesel
                                       Tar          Oven         Engine
                                       Emissions    Emissions    Emissions

Lung Cancer (Man)        slope          1.64         4.40
                         coef. var.     1.41         0.34
                         log slope      0.49         1.48

Skin Tumor Initiation    slope          0.54         2.10         0.53
(Sencar Mice)            coef. var.     0.04         0.04         0.04
                         log slope     -0.63         0.74        -0.64

Enhancement of Viral     slope          2.07         0.86         0.65
Transformation           coef. var.     0.18         0.10         0.15
(SHE Cells)*             log slope      0.73        -0.15        -0.44

*Transformations/2x10 cells per μg/ml extract.
Units for other rows as in Table 1.
There are no data for lung cancer risk of diesel engine emissions in man.
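The "log slope" entries of Table 3 can be checked against the "slope" entries, since each should be the natural logarithm of the corresponding slope up to the rounding of the printed values. The sketch below lists the (slope, log slope) pairs as printed, without emission labels; the 0.02 tolerance is an assumption chosen to absorb two-decimal rounding. Treating the coefficient of variation as the approximate standard deviation of the log slope is the usual delta-method approximation, and it agrees with the "Original Data" standard deviations in Table 2.

```python
import math

# (slope, printed log slope) pairs as they appear in Table 3.
table3_pairs = [
    (1.64, 0.49), (4.40, 1.48),                  # lung cancer (man)
    (0.54, -0.63), (2.10, 0.74), (0.53, -0.64),  # skin tumor initiation (mice)
    (2.07, 0.73), (0.86, -0.15), (0.65, -0.44),  # viral transformation (SHE cells)
]

# Largest gap between ln(slope) and the printed log slope.
max_gap = max(abs(math.log(slope) - printed) for slope, printed in table3_pairs)

def log_slope_sd(coef_var):
    """Delta-method approximation: sd(log b) is roughly se(b) / b."""
    return coef_var

consistent = max_gap < 0.02  # tolerance is an assumption for rounding error
```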
Figure 3 shows the posterior densities π(σ|Y) corresponding to the 3x3
experimental data matrix. The results for both prior distributions A and B,
described in Section 4 above, are shown. We continue to assume a diffuse
prior on β. In contrast to the 2x2 case, the posterior distribution of σ is
considerably less sensitive to the prior that is assumed. The likelihood
function is now more concentrated around σ̂_EB = 0.726. In the range
σ < 0.2, the posterior density of σ is virtually zero.

These findings are reflected in Figure 4. Like Figure 1, this figure shows
the means and standard errors of the original data on a logarithmic scale.
Superimposed on these data are the posterior mean values of Xβ, derived from
Prior A. Within each species, consecutive pairs of these posterior mean data
points have been connected by dashed lines. Since the data points for each
emission are equally spaced along the horizontal, and since the three
logarithmic vertical axes are drawn to the same scale, the underlying
constant relative potency hypothesis requires that the dashed lines
connecting each pair of estimates Xβ* be parallel. As Figure 4 shows, the
underlying model predictions in effect strike a balance between the
contradictory elements in the original data.
Table 4 shows selected statistics of the posterior distributions of Xβ and θ
for the human lung cancer slopes, based on Priors A and B. Also shown are the
empirical Bayes estimates corresponding to (3.16), where empirical Bayes
estimate 1 uses σ̂_EB and empirical Bayes estimate 2 substitutes σ̂_RSS
for σ̂_EB.

[Figure 3. Posterior densities of σ, 3x3 experimental data matrix. Prior A:
log σ uniform on 0.05 ≤ σ ≤ 5.0. Prior B: log σ uniform on 0.5 ≤ σ ≤ 5.0.]

[Figure 4. 3x3 experimental data matrix: original data and posterior means of
Xβ under Prior A, by species and emission.]

TABLE 4.
Bayes and Empirical Bayes Estimates of Log Slopes
For Lung Cancer Risk in Man
3x3 Data Matrix

Environmental                        Stand.   Poster.    Lower   Upper
Emission                     Mean    Dev.     Mean Xβ    Tail    Tail

Roofing Tar
  Original Data              0.495   1.415
  θ|Y (Prior B)              0.818   1.058    0.945      0.014   0.010
  θ|Y (Prior A)              0.832   1.036    0.952      0.015   0.010
  θ|Y (E. Bayes 1)           0.884   0.960    0.987
  θ|Y (E. Bayes 2)           0.861   0.995    0.972

Coke Oven
  Original Data              1.482   0.341
  θ|Y (Prior B)              1.463   0.337    1.336      0.010   0.010
  θ|Y (Prior A)              1.462   0.336    1.341      0.010   0.010
  θ|Y (E. Bayes 1)           1.459   0.336    1.356
  θ|Y (E. Bayes 2)           1.460   0.336    1.348

Diesel Engine
  θ|Y (Prior B)              0.434   1.875    0.434      0.017   0.015
  θ|Y (Prior A)              0.442   1.818    0.442      0.017   0.016
  θ|Y (E. Bayes 1)           0.466   1.217    0.466
  θ|Y (E. Bayes 2)           0.454   1.300    0.455

θ|Y (Empirical Bayes 1) assumes prior π(σ) concentrated at σ̂_EB = 0.726,
and diffuse prior on β.
θ|Y (Empirical Bayes 2) assumes prior π(σ) concentrated at σ̂_RSS = 0.782,
and diffuse prior on β.
In comparison to the results for the 2x2 case (Table 2), the standard
deviations of the posterior distributions for the roofing tar log slope were
considerably less sensitive to the estimation method used. For both roofing
tar and coke oven emissions, the posterior mean values of θ now deviate from
the corresponding posterior mean values of Xβ. In the diesel engine case,
however, the contrast between Bayes and empirical Bayes estimates is more
striking. Because there were no original epidemiological data in this case,
the posterior precision of θ depends solely on our assumptions about the
hyperparameter σ. Whereas σ̂_EB = 0.726 and σ̂_RSS = 0.782 for these
data, the Bayes estimates are σ* = 1.150, based on Prior A, and σ* = 1.189,
based on Prior B.

The scientist who voices skepticism at the close fit of the data in Figure 1
has, it appears, been vindicated. If we take σ to be its maximum likelihood
estimate σ̂ = 0.726, then extrapolations between species or environmental
agents, we conclude, will be accurate only to a multiplicative factor of 2
with 68 percent probability and only to a multiplicative factor of 4 with 95
percent probability. If we take σ to be the Bayes estimate σ* = 1.150 (based
on Prior A), then such extrapolations, we conclude, can be accurate only to a
multiplicative factor of exp(1.15) ≈ 3 with 68 percent probability and only
to a multiplicative factor of exp(2x1.15) ≈ 10 with 95 percent probability.
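The multiplicative factors quoted above follow from exponentiating one and two standard deviations on the log scale. The short sketch below reproduces that arithmetic for the two values of σ quoted in the text; the function name is ours, and the one- and two-standard-deviation ranges are the usual normal approximations for 68 and 95 percent.

```python
import math

def error_factors(sigma):
    """Multiplicative factors for 1 sd (about 68%) and 2 sd (about 95%) on the log scale."""
    return math.exp(sigma), math.exp(2.0 * sigma)

eb_68, eb_95 = error_factors(0.726)        # empirical Bayes estimate of sigma
bayes_68, bayes_95 = error_factors(1.150)  # Bayes estimate sigma* under Prior A

for label, f68, f95 in [("sigma = 0.726", eb_68, eb_95),
                        ("sigma* = 1.150", bayes_68, bayes_95)]:
    print(f"{label}: factor {f68:.2f} (68%), factor {f95:.2f} (95%)")
```

Rounded to whole numbers, the factors are 2 and 4 for σ = 0.726 and 3 and 10 for σ* = 1.150, in agreement with the text.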
We are now in a position to contrast our statistical approach with others
in the literature. Our equation (3.2) partitions the sources of variation
among experiments into several components. We thus follow Cochran's (1980)
suggestion that "the summary of a series of experiments calls mainly for
experience in the analysis of variance." In our decomposition of these
sources of variation, however, we do not assume that the true values of the
log slopes exactly obey an additive model. We assume only that the true
values lie within ±σ of such a model with probability 0.68. Further, when
confronted with the problem of estimating the slope for a particular
experiment, we have no belief that the slope in question is at all unusual
in its deviation from the underlying equal relative potency model. For us, the
set of all such deviations is exchangeable.
The distinction between the Bayes and the empirical Bayes approaches depends
on our willingness either to assign a prior distribution to σ (then
integrating with respect to σ) or to use a point estimate for σ as if it were
known. We prefer the full Bayesian procedure, especially when there are
relatively few experiments, since the uncertainty in σ is real and should
contribute to our uncertainty about θ. This point is illustrated by the Bayes
and empirical Bayes standard deviations of θ for diesel engine emissions in
Table 4. On the other hand, when many experiments are combined, we expect that
the choice of prior distribution and the choice between Bayes and empirical
Bayes estimates will be less important. (See, e.g., Tiao and Zellner (1964).)

One procedure suggested by Lindley and Smith (1972) and Smith (1973a), which
they describe as "modal Bayesian," amounts to the use of σ̂_MLE in our
formula (3.16) for θ̂. We would describe this approach as yet another version
of empirical Bayes. Smith (1973a) shows that if σ is known, then the Bayesian
confidence intervals for θ will be shorter than the classical confidence
intervals. Our allowing for uncertainty in σ will tend to lengthen the
confidence intervals, but they will still be shorter than the classical
intervals for θ_i based solely on the sampling errors c_i from each
experiment.
The fully Bayesian analysis of Smith (1973a) differs from ours in several
respects. Since he concentrates on the two-way table with no replications,
the components of error that we call σ and c are combined in that paper as if
c were equal to zero. Moreover, we explicitly calculate the posterior
distribution of σ in order to: (a) show the range of uncertainty remaining
for σ; (b) calculate σ̂ for use in our empirical Bayes procedure; and (c)
determine the value σ* = E[σ²|Y]^(1/2), which can be interpreted as a
Bayesian risk of interspecies extrapolation if loss is proportional to
(θ − Xβ)². The statistic σ* will also play a critical role in the diagnostic
procedure of the next section.
But there are still two serious problems with our analysis of the data of
Table 3. First, we note that the more precise data come from non-human
experiments. One may legitimately protest that we have merely learned how
accurately we can extrapolate from mouse skin to hamster embryo cells. At the
very least, some test of the assumption of exchangeable extrapolation errors
seems appropriate. Ideally, we should include the results of more precise
human experiments in our analysis.

Second, we have so far said nothing about the choice of experiments to be
included in the analysis. Harris (1981) selected these laboratory bioassays
because they were considered to be valuable quantitative measures of
carcinogenicity, and because tests of several related emissions were performed
in the same laboratory. The U.S. Environmental Protection Agency had initially
chosen these specific emissions as part of its diesel emission research
program (Huisingh et al., 1979). Although we have presented only a few
experiments for expository purposes, it is hardly clear what would happen if
we were to include many more experiments. What is more, there is no obvious
means of deciding which experiments are most appropriate to include.
6. SELECTING AND REJECTING EXPERIMENTS

6.1 The 5x9 Case
Table 5 further augments the experimental data in Table 3. In addition to
lung cancer epidemiological studies in man, skin tumor initiation
experiments in mice, and viral transformation studies in SHE cells, we have
included mutagenesis experiments in L5178Y mouse lymphoma cells performed
under two types of conditions (Mitchell et al., 1979). In the row denoted
Mutagenesis-MA, no metabolic activator was added. In the row denoted
Mutagenesis+MA, metabolic activator was included in the experimental
preparation. Thus, both direct and indirect mutagenicity were measured.

In addition to the three emissions given in Table 3, we have included three
other diesel engine emission samples; a sample of particulate emissions from
a gasoline-powered automobile engine; the polyaromatic hydrocarbon
benzo(a)pyrene; and cigarette smoke condensate from the Kentucky 1A1
experimental cigarette, which was designed to be typical of cigarettes smoked
during the 1950s. The diesel engine extract appearing in Table 3 has been
relabeled Diesel I, while the remaining diesel emission samples have been
numbered from II to IV. Diesel emissions II and III were, like Diesel I,
obtained from light duty diesel engines. Diesel emission IV was obtained from
a heavy duty diesel engine. The conditions of collection of these samples are
described in Huisingh et al. (1979).
With the exception of the results for cigarette smoke condensate, all
dose-response slopes and the corresponding standard errors are taken from
Harris (1981). Although Harris (1981) did not report the dose-response
slopes for cigarette smoke condensate, experiments on this agent were reported
in the source studies (Casto et al., 1979; Mitchell et al., 1979; Nesnow et
al., 1979). We were therefore able to estimate these slopes by the same
maximum likelihood methods used by Harris (1981).

[TABLE 5. 5x9 Experimental Data Matrix.]
For the human lung cancer results, we applied Harris's estimation procedure
to the U.S. veterans study data for men aged 35 to 84 observed during 1954 to
1962 (Kahn, 1966, Appendix Tables A to D). If h(t,d) is the incidence of lung
cancer among men of age t with accumulated dose d, we obtained a maximum
likelihood estimate of the parameter β for the relative risk model
h(t,d) = h(t,0)(1 + βd). (The estimation algorithm is described in DuMouchel
(1981).) With accumulated dose measured in cigarettes per day x years, our
maximum likelihood estimate was 1.085 x 10"^ units of incremental relative
risk per cigarettes/day x years (standard error, 0.103 x 10 ). This estimate
was then converted into units of incremental relative risk per 10~'*' μg/m³
cigarette smoke condensate x years under the assumption that the typical
cigarette smoked by a subject delivered 38 ± 2 mg cigarette smoke condensate,
and that the total daily delivery of condensate was diluted in a total daily
ventilation of 11 ± 2 m³. (We used the methods described by Harris to
incorporate the uncertainty about these dosage conversion units into the
slope and coefficient of variation reported in Table 5.)
Except for the last two columns, the additional experiments were performed,
as above, on the dichloromethane extracts of the various emissions. For the
benzo(a)pyrene results, we note, this agent was applied in concentrated form
as a positive control in some experiments. For the last column, whole smoke
condensate was used. The resulting dose-response units in Table 5 are still
compatible with the constant relative potency model. For any two pairs (k,ℓ)
and (k',ℓ'), the quantity

    θ_kℓ − θ_k'ℓ − θ_kℓ' + θ_k'ℓ'

remains dimensionless.
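The reason the contrast is dimensionless is that a change of units for a species (row) or for an emission (column) adds a constant to every log slope in that row or column, and such constants cancel in the double difference. The sketch below checks this numerically, using the log slopes for man and mice on roofing tar and coke oven emissions from Table 3; the unit shifts are arbitrary, hypothetical numbers.

```python
def interaction_contrast(theta, k, l, k2, l2):
    """theta[k][l] - theta[k2][l] - theta[k][l2] + theta[k2][l2]."""
    return theta[k][l] - theta[k2][l] - theta[k][l2] + theta[k2][l2]

# Log slopes from Table 3: rows = (man, mice), columns = (roofing tar, coke oven).
theta = [[0.49, 1.48],
         [-0.63, 0.74]]

# Hypothetical changes of units: one additive constant per row and per column
# on the log scale (for example, switching response or dose units).
row_shift = [2.3, -1.1]
col_shift = [0.7, 4.6]
rescaled = [[theta[k][l] + row_shift[k] + col_shift[l] for l in range(2)]
            for k in range(2)]

original_contrast = interaction_contrast(theta, 0, 0, 1, 1)
rescaled_contrast = interaction_contrast(rescaled, 0, 0, 1, 1)
```

Both contrasts equal 0.38 up to floating point error, whatever shifts are chosen.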
Table 6 reports the Bayes estimates of the log slopes for human lung cancer
risk. The first two columns (denoted 2x2 and 3x3) summarize the
corresponding results of the previous two sections. The third column provides
the results for the 5x9 experimental data matrix in Table 5. The remaining
columns will be described momentarily. The right-most column shows the
original data. Only the results for roofing tar, coke oven, and diesel engine
I emissions are given. The original slope for the human lung cancer risk from
cigarette smoke was so precise that its posterior density did not change
substantially. Hence, it is not reported.

The empirical Bayes estimate of σ for the 5x9 data matrix was 1.041. Because
the posterior density π(σ|Y) was highly concentrated around σ̂_EB, the
corresponding posterior value of σ* was nearly equal to σ̂_EB. In comparison
to the 3x3 case, the standard deviation of the posterior density of the
roofing tar log slope has increased slightly. By contrast, the corresponding
standard deviation for diesel engine I has declined. These results reflect a
balance between two sources of uncertainty about θ. On the one hand, the
larger σ implies a large uncertainty in the deviation δ. On the other hand,
the larger number of experiments permits us to estimate Xβ more precisely.

Figure 5 depicts the deviations of these data from the underlying constant
relative potency model. For each of the five species, the Figure shows the
posterior mean values of the residuals δ = θ − Xβ for each emission. When
there are no data y for a particular species-emission pair, the posterior
mean of δ is necessarily zero. Such cases are therefore omitted from the
Figure.
The posterior mean residuals for cigarette smoke, Figure 5 shows, are in four
of five cases relatively large in absolute value. Cigarette smoke is
apparently a weaker human lung carcinogen, a weaker mouse skin tumor
initiator, a more potent transforming agent, and a more potent direct mutagen
than would be predicted in this case from the constant relative potency model.
Further, the range of the mean residuals is largest for the mutagenesis
experiments in the absence of metabolic activator. Tests for indirect
mutagenicity are apparently least compatible with the underlying model. By
contrast, the residuals for mutagenesis with activator are more concentrated
around the origin. A similar finding applies to the viral transformation
results when cigarette smoke is eliminated.

TABLE 6.
Bayes Estimates of Log Slopes For Lung Cancer Risk in Man
Based on Alternative Data Matrices

                   2x2     3x3     5x9     4x9     4x8     3x8     3x7     Data

σ̂_EB              0.0     0.726   1.041   0.872   0.674   0.389   0.316
σ*                 1.043   1.150   1.080   0.933   0.730   0.480   0.395

Roofing     θ*     0.229   0.832   0.123   0.306   0.959   1.526   0.497   0.495
Tar         c*     0.788   1.036   1.108   0.957   0.915   0.742   1.414   1.415

Coke        θ*     1.497   1.462   1.375   1.366   1.455   1.422           1.482
Oven        c*     0.334   0.336   0.335   0.334   0.335   0.334           0.341

Diesel      θ*             0.442  -0.458  -0.706   0.207   0.330  -0.836
Engine I    c*             1.818   1.451   1.304   1.160   0.867   1.582

Prior A for π(σ) and diffuse prior for β assumed for all calculations.
σ* = E[σ²|Y]^(1/2).

[Figure 5. Posterior mean residuals from the 5x9 experimental data matrix, by
species.]
6.2 A Diagnostic Procedure.

We seek a method to determine which subset of experiments is most relevant
for predicting lung cancer risks in man. Two basic characteristics, we
suggest, are critical to such a procedure.

First, the method ought to be sensitive to the underlying tradeoff between
predictive bias and predictive precision. Suppose that we are interested in a
particular θ_i for which there is little or no data (i.e., c_i is large). The
inclusion of irrelevant experiments in the data matrix could result in a
biased estimate of θ_i, the size of this bias being in the order of ±σ.
However, if we eliminated all but the most relevant experiments, the remaining
experiments could contribute little if anything to the accuracy of our
estimate of θ_i, as measured by c_i*. This difficulty applies especially to
the case where we have no original data y_i on θ_i (e.g., the human lung
cancer risks for diesel engine emissions in Table 5). If we eliminated every
conceivably irrelevant experiment, then we would end up with exactly what we
had at the start: no information on θ_i at all.
Second, we should not eliminate experiments individually. One could exclude
those species-emission pairs with large posterior mean values of δ. Since
these interactions are what determines the relatedness of the various species
and emissions, such a procedure would defeat the purpose of our analysis. It
would be more appropriate to assess whether a specific species or a specific
environmental agent is more or less relevant to the others.

We therefore adopt a cross-validation procedure based on the elimination of
entire rows or columns from the matrix of experiments. Let Y_{k−} be the
vector of all log slopes formed by exclusion of species k, and denote the
posterior density π(σ|Y_{k−}). For each species k, we evaluate
σ*_{k−} = E[σ²|Y_{k−}]^(1/2). We shall say that species k is "less relevant"
if σ*_{k−} < σ*, where σ* = E[σ²|Y]^(1/2) as in (4.1). Analogous definitions
apply to Y_{−ℓ} and σ*_{−ℓ} for each environmental agent ℓ. The species or
agent for which σ*_{k−} or σ*_{−ℓ} is lowest will be termed the "least
relevant". The least relevant species or emission is the one whose elimination
most improves the relevance of the remaining experiments to each other. In
anticipation of Section 7, we note that the terms less relevant and least
relevant are a posteriori concepts.
Given an initial set of experiments Y, a prior density π(σ), and a particular
θ_i of interest, we consider the following data analytic procedure.

(i) Calculate σ*_{k−} and σ*_{−ℓ} for each k and ℓ, and determine the least
relevant species or emission.

(ii) Calculate the posterior distribution of θ_i before and after the least
relevant species or emission is removed.

(iii) Eliminate the least relevant species or emission and repeat steps (i)
and (ii) on the reduced set of experiments so long as: (a) there exists a
less relevant experiment; (b) the least relevant species or emission does not
correspond to θ_i; and (c) the elimination of the least relevant species or
emission reduces c_i*, the posterior standard deviation of θ_i.

(iv) If conditions (a), (b), and (c) are not satisfied, the procedure
terminates. The remaining experiments are considered the most relevant for
predicting θ_i.
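Steps (i) through (iv) amount to a backward elimination loop over the rows and columns of the data matrix. The sketch below shows only that control flow, and rests on several assumptions. The callables `sigma_star` and `post_sd` are hypothetical placeholders for the posterior computations of Sections 3 and 4, which are not reproduced here. The guard against deleting the last remaining row or column is ours. The toy functions and noise values at the end are invented purely to exercise the loop.

```python
def select_relevant(rows, cols, target_row, target_col, sigma_star, post_sd):
    """Backward elimination of whole rows (species) or columns (emissions).

    sigma_star(rows, cols) and post_sd(rows, cols) are user-supplied callables
    returning sigma* and the posterior sd c* of the target for a sub-matrix.
    """
    rows, cols = list(rows), list(cols)
    removed = []
    while True:
        current = sigma_star(rows, cols)
        # Step (i): sigma* after deleting each row or column in turn.
        candidates = []
        if len(rows) > 1:
            for r in rows:
                kept = [x for x in rows if x != r]
                candidates.append((sigma_star(kept, cols), "row", r))
        if len(cols) > 1:
            for c in cols:
                kept = [x for x in cols if x != c]
                candidates.append((sigma_star(rows, kept), "col", c))
        if not candidates:
            break
        value, kind, label = min(candidates, key=lambda t: t[0])
        if value >= current:                                   # condition (a)
            break
        if label == (target_row if kind == "row" else target_col):  # condition (b)
            break
        new_rows = [x for x in rows if not (kind == "row" and x == label)]
        new_cols = [x for x in cols if not (kind == "col" and x == label)]
        # Step (ii) and condition (c): c* must fall when the unit is removed.
        if post_sd(new_rows, new_cols) >= post_sd(rows, cols):
            break
        rows, cols = new_rows, new_cols
        removed.append((kind, label))
    return rows, cols, removed

# Toy stand-ins (entirely hypothetical) so that the loop can be run.
NOISE = {"man": 0.1, "mouse skin": 0.3, "SHE cells": 0.1, "mutagenesis": 0.6,
         "roofing tar": 0.1, "coke oven": 0.2, "cigarette smoke": 0.5}

def toy_sigma_star(rows, cols):
    return sum(NOISE[x] for x in rows + cols)

def toy_post_sd(rows, cols):
    return toy_sigma_star(rows, cols) + 1.0 / (len(rows) * len(cols))

kept_rows, kept_cols, removed = select_relevant(
    ["man", "mouse skin", "SHE cells", "mutagenesis"],
    ["roofing tar", "coke oven", "cigarette smoke"],
    "man", "roofing tar", toy_sigma_star, toy_post_sd)
```

With these toy inputs the loop removes three units and then stops at condition (c), because deleting the next candidate would raise the toy posterior standard deviation.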
We applied this procedure to the 5x9 data matrix in Table 5. Prior A on σ was
assumed. We focused on predicting the human lung cancer slopes for roofing
tar emissions and diesel I emissions. Steps (i), (ii) and (iii) were repeated
four times. Figure 6 depicts the distribution of σ*_{k−} and σ*_{−ℓ} for each
iteration, where the successive iterations are displayed from left to right.
To assist interpretation, a few species and agents are specifically
identified.

For the original 5x9 data matrix, σ* = 1.08. Mutagenesis without metabolic
activation was found to be least relevant. Removal of this row resulted in a
4x9 matrix with a new σ* = 0.933. Repeating this procedure, we found cigarette
smoke to be least relevant. Removal of this column resulted in a 4x8 matrix
with a new σ* = 0.730. Again repeating this procedure, we found skin tumor
initiation in mice to be least relevant. Removal of this row resulted in a 3x8
matrix with a new σ* = 0.480. In the final iteration, coke oven emissions were
found to be least relevant, with σ*_{−ℓ} = 0.395. Elimination of coke oven
emissions from the 3x8 matrix violated condition (c) in step (iii) above.
Hence, the procedure was terminated and the 3x8 array was deemed most
relevant.
The results of this procedure are summarized in those columns of Table 6
labelled 5x9, 4x9, 4x8, 3x8, and 3x7. As we successively eliminate
mutagenesis without activator, cigarette smoke, and skin tumor initiation, the
posterior standard deviations of roofing tar and diesel I emissions decline.
Elimination of coke oven emissions, least relevant in the 3x8 array, resulted
in a marked increase in the posterior standard deviations of the roofing tar
and diesel emissions parameters. In the case of roofing tar, the estimates θ*
and c* were almost identical to the original values of y and c.

[Figure 6. Distribution of σ*_{k−} and σ*_{−ℓ} for each iteration of the
diagnostic procedure. The arrows point to the value of σ* when all data are
included.]
The resulting tradeoff between predictive bias (σ*) and predictive
efficiency (c*) is graphically depicted in Figure 7. By successive
elimination of least relevant experiments, we are able to reduce σ* to 0.48.
Any further reduction in σ* is at the cost of a marked loss of precision.
Unless we are willing to specify a particular loss function, however, we
cannot conclude that the predictions resulting from the 3x8 matrix are
unequivocally most preferred. For many public health and environmental policy
applications, a reduction in uncertainty about human risks is desired. To
seek to eliminate less relevant experiments, so long as c*, our measure of
predictive efficiency, is reduced, appears to be appropriate for such
situations. (Our procedure depends somewhat upon the choice of prior
distribution π(σ), but when many experiments are involved, the extent of this
dependence should be minimal.)
The human lung cancer experiments, we note from Figure 6, are less relevant
(i.e., σ*_{human−} < σ*) so long as cigarette smoke is included in the data
matrix. This conclusion does not apply to the 4x8 or 3x8 arrays, with
cigarette smoke removed. Cigarette smoke contains numerous carcinogenic and
mutagenic compounds other than polyaromatic hydrocarbons, e.g., nitrosamines
and various heterocyclics. The apparent deviations of the cigarette smoke
data from the constant relative potency model may reflect these differences in
chemical composition. Peto (1977) has similarly remarked that the mutagenic
potency of cigarette smoke appears to be greater than would be expected from
its systemic carcinogenicity. Whatever its interpretation, our procedure
leads us to eliminate the most precise human experiment. As a consequence,
our estimate of σ is constructed primarily from the non-human data. We see no
completely satisfactory response to this limitation other than to suggest,
where possible, the inclusion of other precise human data.

[Figure 7. Tradeoff between σ* and c* for each iteration of the diagnostic
procedure.]
we
Nevertheless,
find
the
results
diagnostic
our
of
procedure
Assays for indirect mutagenicity and tumor initiation have been
intriguiyjg
excluded as less relevant.
The retaining laboratory bioassays are designed to
an agent's interference with gene replication and cell differentiation.
guage
3x8
For the polyaromatic-containing emissions ronaining in the
these
table,
biological processes could be critical to human lung carcinogenesis.
We recognize that the above cross-validation procedure is purely data analytic. Adapting the methods in Efron and Morris (1973) to the current problem, we could replace our assumption that $V(\delta_i) = \sigma^2$ (for all $i$) with a more general specification. That is, we might assume $V(\delta_i) = \tau^2$ for some subset of experiments and then employ a joint prior distribution for $(\sigma, \tau)$ to derive the posterior distribution. Unless the "suspicious subset" can be identified a priori, our present method appears to be much simpler in practice.
7. PERFECTLY REPLICATED, IMPERFECTLY REPLICATED, AND STRONGLY RELATED EXPERIMENTS.

7.1 Definitions.

In some situations, we may have additional prior information on the relationships between experiments. In Table 5, for example, it is not unreasonable to posit that diesel engine emissions I through IV ought to be more related to each other than to the remaining environmental agents. An analogous assumption might apply if experimental data were available in two closely related species, or two strains of the same species, or even males and females of the same species.

We now investigate how one might take advantage of this form of prior information. For this purpose, we need to characterize rather precisely how experiments can be related a priori in terms of our statistical model. A different classification of degrees of relatedness is given by Smith (1973c).
Consider the results $y_i \pm c_i$ of a particular experiment. Its mean $\theta_i = x_i\beta + \delta_i$ can be separated into two components, $x_i\beta$ and $\delta_i$. We shall call two experiments $i$ and $i'$ "perfectly replicated" if $x_i\beta = x_{i'}\beta$ and $\delta_i = \delta_{i'}$ a priori. Under this definition, the only source of variation in $(y_i - y_{i'})$ is the sampling error associated with each experimental observation.

We shall call two experiments "imperfectly replicated" if $x_i\beta = x_{i'}\beta$ a priori, but, conditional on $\sigma$, $\delta_i$ and $\delta_{i'}$ are a priori independent $N(0,\sigma^2)$. If the hyperparameters $x_i\beta$ and $x_{i'}\beta$ are highly correlated a priori, we shall say that the corresponding pairs of experiments are "strongly related." The term "highly correlated" is temporarily left vague. Finally, all experiments that are neither perfectly replicated, imperfectly replicated, nor strongly related will be defined as "weakly related."
The case of perfectly replicated experiments presents no special problems for the present paper. If $y_i \pm c_i$ and $y_{i'} \pm c_{i'}$ are the sufficient statistics for the two experiments, we merely replace them by the single statistic $y_{ii'} \pm c_{ii'}$, where

$$y_{ii'} = \frac{c_i^{-2} y_i + c_{i'}^{-2} y_{i'}}{c_i^{-2} + c_{i'}^{-2}}, \qquad c_{ii'} = \left(c_i^{-2} + c_{i'}^{-2}\right)^{-1/2}.$$

We might apply this procedure if two different experiments were performed independently in the same species with the same agent but, say, in different laboratories. This case will not be considered further.
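The pooling rule for perfectly replicated experiments can be sketched in a few lines. This is only an illustration of inverse-variance weighting as described above; the function name `pool_replicates` and the numerical inputs are hypothetical and do not come from the paper.

```python
import math

def pool_replicates(y1, c1, y2, c2):
    """Combine two perfectly replicated results y1 +/- c1 and y2 +/- c2.

    Each result is weighted by its precision (inverse squared standard error).
    """
    w1 = c1 ** -2
    w2 = c2 ** -2
    y = (w1 * y1 + w2 * y2) / (w1 + w2)   # precision-weighted mean
    c = 1.0 / math.sqrt(w1 + w2)          # standard error of the pooled value
    return y, c

# Hypothetical log slopes: 1.0 +/- 0.5 and 2.0 +/- 1.0
y_pooled, c_pooled = pool_replicates(1.0, 0.5, 2.0, 1.0)
```

The more precise experiment dominates the pooled value, and the pooled standard error is smaller than either input.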
When two experiments are imperfectly replicated, we admit the possibility of independent deviations from the underlying regression model, as well as independent sampling errors. When the hyperparameter $\beta$ has prior distribution $N(b,V)$, imperfect replication implies that (conditional on $\sigma$) the slopes $\theta_i$ and $\theta_{i'}$ have identical prior means $xb$, identical variances $xVx' + \sigma^2$, and a priori correlation coefficient

$$\frac{xVx'}{xVx' + \sigma^2} \qquad (7.1)$$

where $x = x_i = x_{i'}$ are row vectors.

When a diffuse prior is assumed for $\beta$, we must apply the definition of imperfect replication with care. If $V$ is replaced by $tV$ and $t \to \infty$, the correlation coefficient (7.1) approaches unity. Since $\theta_i$ and $\theta_{i'}$ are normally distributed with identical means and variances, a correlation coefficient of 1 would imply that $\theta_i = \theta_{i'}$ a priori. But this would imply that $\theta_i = \theta_{i'}$ a posteriori, even if $(y_i - y_{i'})/(c_i^2 + c_{i'}^2)^{1/2}$ is large. The assumption of a diffuse prior on $\beta$ apparently reduces the notion of imperfect replication to that of perfect replication, even though $\delta_i$ and $\delta_{i'}$ are assumed to differ on the order of $\sigma$. This difficulty appears to be related to the class of marginalization paradoxes discussed by Dawid, Stone, and Zidek (1973), which are known to occur sometimes when improper prior distributions are employed.
We note from formula (3.14) in Section 3.3, however, that when $\beta$ takes on a diffuse prior,

$$E[\theta \mid Y, \sigma] = (I + CR)^{-1} y, \qquad \text{where } R = \sigma^{-2}\left(I - X(X'X)^{-1}X'\right).$$

It is straightforward to show that for this formula, $E[\theta_i - \theta_{i'} \mid Y, \sigma]$ is zero only when $y_i = y_{i'}$, even if $x_i = x_{i'}$. The paradox that $\theta_i$ and $\theta_{i'}$ are perfectly correlated a priori is avoided so long as we use (3.14) to evaluate the posterior means and variances of $\theta$.
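A small numerical check may make the last point concrete. The sketch below evaluates $(I + CR)^{-1}y$ for the simplest possible design, one in which every experiment has the same row $x$ (a single intercept column), so that $X(X'X)^{-1}X'$ is the matrix with all entries $1/n$. That design, the helper names `solve` and `posterior_mean`, and the data are assumptions chosen for illustration; they are not taken from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def posterior_mean(y, c, sigma):
    """E[theta | y, sigma] = (I + C R)^{-1} y with C = diag(c_i^2) and
    R = sigma^{-2} (I - J/n), i.e. an intercept-only design (assumed)."""
    n = len(y)
    A = []
    for i in range(n):
        row = []
        for j in range(n):
            identity = 1.0 if i == j else 0.0
            r_ij = (identity - 1.0 / n) / sigma ** 2
            row.append(identity + c[i] ** 2 * r_ij)
        A.append(row)
    return solve(A, y)

# Three experiments with identical x, unit sampling errors, sigma = 1
theta_hat = posterior_mean([1.0, 3.0, 5.0], [1.0, 1.0, 1.0], sigma=1.0)
```

With these inputs each observation is shrunk halfway toward the common mean of 3, so the posterior means remain distinct whenever the observed values differ, even though all rows of the design are identical.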
The case where two experiments are strongly related is even more general. The correlation between $x_i\beta$ and $x_{i'}\beta$ is a priori

$$r = \frac{x_i V x_{i'}'}{\left[(x_i V x_i')(x_{i'} V x_{i'}')\right]^{1/2}}, \qquad (7.2)$$

while (conditional on $\sigma$) the correlation between $\theta_i$ and $\theta_{i'}$ is a priori

$$\frac{x_i V x_{i'}'}{\left[(x_i V x_i' + \sigma^2)(x_{i'} V x_{i'}' + \sigma^2)\right]^{1/2}}, \qquad (7.3)$$

which reduces to (7.1) when $x_i = x_{i'}$ (i.e., imperfect replication). Note that the correlation in (7.3) is always less than $r$ in absolute value.
We do not wish to draw a sharp boundary between the terms strongly related and weakly related. If the correlation $r$ in (7.2) exceeds 0.9 in absolute value, we would certainly call the corresponding experiments strongly related. If $|r|$ falls below 0.7, i.e., $r^2 < 0.5$ (a comparison of the "within $x$" variance with the "between $x$" variance), then we would use the term weakly related. When $x_i \neq x_{i'}$ a priori and a diffuse prior is assumed for $\beta$, as in the assumptions used in Sections 4, 5, and 6, we regard the corresponding experiments as weakly related.
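The correlations (7.2) and (7.3) are easy to evaluate once a prior covariance matrix is written down. The sketch below uses the layout of the 3x8 example treated in Section 7.2: twelve hyperparameters (one overall level, three species effects, eight agent effects), prior variance 100 on each, and five diesel columns whose effects share a common component. The names `v_common` and `v_dev` for the variances of the common diesel component and of the individual deviations, and the index layout, are labels assumed here for illustration only.

```python
import math

def prior_cov(v_common, v_dev, base=100.0):
    """12x12 prior covariance for (level, 3 species effects, 8 agent effects).

    Agent effects 3..7 (indices 6..10) are the diesel columns; they share a
    common component with variance v_common plus deviations with variance v_dev.
    """
    V = [[0.0] * 12 for _ in range(12)]
    for j in range(12):
        V[j][j] = base
    diesel = range(6, 11)
    for j in diesel:
        for k in diesel:
            V[j][k] = v_common + (v_dev if j == k else 0.0)
    return V

def design_row(species, agent):
    """Indicator row x for species 1..3 and agent 1..8."""
    x = [0.0] * 12
    x[0] = 1.0            # overall level
    x[species] = 1.0      # species effect
    x[3 + agent] = 1.0    # agent effect
    return x

def quad(x, V, z):
    """Bilinear form x V z'."""
    return sum(x[i] * V[i][j] * z[j] for i in range(12) for j in range(12))

def prior_corr(V, x, z, sigma=0.0):
    """Correlation (7.2) when sigma = 0, correlation (7.3) when sigma > 0."""
    num = quad(x, V, z)
    den = math.sqrt((quad(x, V, x) + sigma ** 2) * (quad(z, V, z) + sigma ** 2))
    return num / den

# Two diesel experiments (agents 3 and 4) in the same species
x, z = design_row(1, 3), design_row(1, 4)
r_weak = prior_corr(prior_cov(0.0, 100.0), x, z)
r_imperfect = prior_corr(prior_cov(100.0, 0.0), x, z)
r_strong = prior_corr(prior_cov(99.0, 1.0), x, z)
r_strong_theta = prior_corr(prior_cov(99.0, 1.0), x, z, sigma=0.5)
```

Under these assumptions the three settings give correlations of about 0.67, 1, and 0.997, and adding $\sigma > 0$ as in (7.3) pulls the correlation below the value from (7.2).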
7.2 Informative Priors on $\beta$ For the 3x8 Case.

Figure 8 diagrams a 3x8 array of experiments similar to that derived from the diagnostic procedure in Section 6. As a result of our elimination of other data, the remaining single experiment on benzo(a)pyrene could not affect the posterior distributions of the other slopes. Hence, it was removed. In its place, however, we have added the results of an epidemiological study of men exposed in their occupations to a fifth type of diesel engine, a heavy duty diesel different from the other diesels. This additional slope was taken from Harris's (1981) analysis of lung cancer incidence among London Transport Authority diesel bus workers. (The slope estimate was originally reported in units of incremental relative risk per μg/m³ particulates x years. It was converted to units of incremental relative risk per 10 μg/m³ extractable organics x years under the assumption that the dichloromethane extractable fraction constitutes 18 percent of particulates by weight.) No skin tumor incidence bioassay studies of the latter type of diesel engine emission were available.
We wish to introduce our prior knowledge that the diesel engine experiments in columns I through V are more related to each other than to the remaining experiments. Just what degree of interrelatedness should be assumed is not clear. It is implausible that the diesel experiments in each row should be perfectly replicated. To assume that they are imperfectly replicated requires, in effect, that the diesel emissions are virtually identical in composition, but that different diesel engine samples, through variations in design or operating conditions, may result in idiosyncratic effects in some species. If we were interested primarily in the possible risks of exposure to light duty diesel emissions, the assumption that all light duty and heavy duty diesel experiments were imperfect replicates could be too restrictive. The assumption of strongly related experiments may therefore be more useful.

[Figure 8. 3x8 data matrix. X = experimental data available.]
Of course we could fall back on the assumptions of unequal $x$'s and a diffuse prior on $\beta$. In that case, however, there is no point in including the observed slope for diesel V. Its value of $y$ would always be perfectly fit to the underlying regression model by the additional column effect that must be estimated, and it would not contribute toward estimation of the other $\theta$'s. Only a proper prior on $\beta$ would reflect our belief that the data on the human occupational exposure to diesel engine V are relevant to the estimation of lung cancer from exposure to the other diesel engines.
We now examine in detail the choice of a proper prior for $\beta$. Return to equation (3.2) in Section 3.1, that is,

$$\theta_{kl} = \mu + \alpha_k + \gamma_l + \delta_{kl},$$

where $k = 1,2,3$ correspond to the three species and $l = 1,\ldots,8$ correspond to the eight environmental agents in Figure 8. The hyperparameters $\{\gamma_3,\ldots,\gamma_7\}$ refer specifically to the various diesel effects. A natural model for the relationship between these hyperparameters is

$$\gamma_l = \gamma_D + \eta_l, \qquad l = 3,\ldots,7, \qquad (7.4)$$

where $\gamma_D$ is a component common to all diesel engines and $\{\eta_3,\ldots,\eta_7\}$ represent deviations of each diesel from the common component. Now denote the prior variance of $\gamma_D$ by $v_D$ and the prior variance of $\eta_l$ by $v_\eta$. Each $\eta_l$ has prior mean zero and is independent of the other $\eta$'s and of $\gamma_D$. If $v_D = 0$ a priori, then the hyperparameters $\{\gamma_l\}$ are uncorrelated. The quantities $\{x_i\beta\}$, i.e., the quantities $\{\mu + \alpha_k + \gamma_l\}$, will be correlated only through the common hyperparameters $\{\mu, \alpha_k\}$. Provided that $v_\eta$ is not small compared to the prior variances of $\{\mu + \alpha_k\}$, the diesel columns are weakly related. On the other hand, if $v_\eta = 0$ a priori, then the hyperparameters $\{\gamma_l\}$ are perfectly correlated, that is, the diesel columns are imperfect replicates. Finally, if $v_D$ is large and $v_\eta > 0$ is small, then the diesel columns are strongly related.
Since there are $K+L+1 = 3+8+1 = 12$ components in the hyperparameter set $\{\mu, \alpha_k, \gamma_l\}$, we must specify a 12x1 vector $b$ of prior means as well as a 12x12 prior covariance matrix $V$. This covariance matrix contains a 5x5 submatrix of covariances among the diesel column effects $\{\gamma_3,\ldots,\gamma_7\}$. We now make the following numerical assumptions.

(i) For every component of the prior mean of $\{\mu, \alpha_k, \gamma_l\}$, we choose $b = 0$.

(ii) For all $(j,j')$ that are not part of the 5x5 submatrix of diesel column effects, we choose $V_{jj'} = 100\,I_{jj'}$.

(iii) To exemplify weakly related experiments, we choose $v_D = 0$ and $v_\eta = 100$. In this case, $V = 100I$, where $I$ is the 12x12 identity matrix, and for any pair of diesel experiments in the same species, the correlation $r$, defined in (7.2), is 0.67.

(iv) To exemplify imperfectly replicated experiments, we replace (iii) with the assumption that $v_D = 100$ and $v_\eta = 0$. In this case, the number of column effects $L$ is reduced to 4, and the hyperparameter set $\{\mu, \alpha_k, \gamma_l\}$ could be reduced to only 8 components. The computations are equivalent to a 3x4 analysis, where the design matrix $X$ has several identical rows for each diesel engine, and $V = 100I$, where $I$ is the 8x8 identity matrix.
(v) To exemplify strongly related experiments, we replace (ii) and (iii) with the assumption that $v_D = 99$ and $v_\eta = 1$. For any pair of diesel experiments in the same species, the correlation $r$, defined in (7.2), is 0.997. The prior covariance matrix $V$ takes the form

$$V = \begin{pmatrix}
\begin{matrix} 100 & & \\ & \ddots & \\ & & 100 \end{matrix} & & \\
& \begin{matrix} 100 & 99 & \cdots & 99 \\ 99 & 100 & \cdots & 99 \\ \vdots & & \ddots & \vdots \\ 99 & 99 & \cdots & 100 \end{matrix} & \\
& & 100
\end{pmatrix}$$

where the upper left submatrix is the covariance of $\{\mu, \alpha_k, \gamma_1, \gamma_2\}$, the center submatrix is the covariance of $\{\gamma_3,\ldots,\gamma_7\}$, and the lower right element is the variance of $\gamma_8$.

We chose a value of 100 for the prior variance of the hyperparameters $\{\mu, \alpha_k, \gamma_l\}$ to reflect our near complete state of ignorance about these effects. (It is possible, as Smith (1973a) shows in a somewhat simpler situation, to let these variances go to infinity and still retain our concept of strongly related experiments for fixed $v_\eta$. Our choice of a large finite value for $v_D$ is more convenient here.) Under this prior assumption, every log slope $\theta_i$ has variance $300 + \sigma^2$, and therefore we are uncertain about the magnitude of each slope to a multiplicative factor of about $\exp(20) = 5 \times 10^8$. Since these variances are so large, the somewhat arbitrary choice of $b = 0$ does not materially affect the calculations, since all values of $y_i$ are within $\pm 5$ of $Xb = 0$.
Our choice of $v_\eta = 1$ in (v) reflects our sense of the likely magnitude of the difference between any two diesel log slopes in the same species (e.g., $\theta_{kl} - \theta_{kl'}$). Since the a priori standard deviation of this difference is $[2(v_\eta + \sigma^2)]^{1/2}$ conditionally on $\sigma$, the ratio of the two slopes is $\exp(z[2(v_\eta + \sigma^2)]^{1/2})$, where $z$ is $N(0,1)$. Even if $\sigma = 0$, our choice of $v_\eta = 1$ means the ratio is $\exp(z\sqrt{2})$; since $\exp(z\sqrt{2}) = 10$ for $z = 1.6$, our prior choice of $v_\eta$ allows for more than a 10 percent chance that one of the two diesels is 10 times as potent as the other. No greater uncertainty seems justified. At the other extreme, to assume that $v_\eta \ll \sigma^2$ would be nearly equivalent to imperfect replication. (Note that it is essential to choose [illegible in source] for this to be true.)
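The arithmetic behind this choice can be checked directly. The sketch below takes the deviation variance equal to 1 and $\sigma = 0$, as in the text; the variable names are illustrative only.

```python
import math

v_dev = 1.0    # prior variance of each diesel deviation (the value chosen in (v))
sigma = 0.0    # most favourable case: no extrapolation error

# Standard deviation of the difference between two diesel log slopes
sd_diff = math.sqrt(2.0 * (v_dev + sigma ** 2))

z = 1.6
ratio = math.exp(z * sd_diff)                # potency ratio at z standard units
two_sided = math.erfc(z / math.sqrt(2.0))    # P(|Z| > z) for standard normal Z
```

The ratio comes out a little under 10 and the two-sided tail probability a little under 0.11, in line with the statement that there is more than a 10 percent prior chance of roughly a tenfold difference in potency between two diesels.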
Table 7 shows the resulting estimates of $\sigma$ and of the $\theta$'s for roofing tar, coke oven, diesel I and diesel V emissions. Since the assumption that the diesel columns are weakly related approximates complete ignorance about $\beta$, the results in that column are quite close to those obtained in Table 6 (see the column labelled 3x8) in the case of the diffuse prior for $\beta$. The log slope for the diesel V epidemiological study contributes to the estimation of $\sigma$ and to the remaining $\theta$'s only through the vague prior on $\beta$. Its own value of $y$ is nearly perfectly fit to the underlying model.

The assumption of imperfect replication, however, results in a dramatic increase in the estimate of $\sigma$. Any variation in the diesel $\theta$'s that is in fact due to the $\eta$'s is forced to be fitted to the corresponding $\delta$'s. Hence, the posterior variance of $\delta$ increases. Although this assumption reduces the posterior standard deviation for Diesel V, the precision of the other estimates deteriorates. The prior assumption of imperfect replication, we conclude, is inappropriate in the present illustration.

The assumption of strongly related experiments does not have this limitation.
TABLE 7
Bayes Estimates of Log Slopes For Lung Cancer Risk in Man
Alternative Assumptions on the Relation Between Diesel Columns
3x8 Data Matrix

Column headings (with the pair of prior variances for the diesel column effects): Weakly Related (0,100); Imperfect Replicates (100,0); Strongly Related (99,1)
0.388
1.100
0.416
0.478
1.221
0.515
1.533
0.739
1.393
1.124
1.724
0.743
0.495
1.415
1.424
0.333
1.504
0.336
1.495
0.330
1.482
0.341
0.339
0.862
-0.401
1.594
0.442
0.868
1.877
1.497
0.443
1.165
0.241
1.029
Original
Data
Roofing Tar
6*
Coke Oven
e*
Diesel Engine
I
e*
c*
Diesel Engine V
e*
c*
1.921
1.512
Our conservative use of prior information about the relatedness of the diesel experiments does not increase $\sigma^*$ much beyond 0.5. The posterior standard deviations of the roofing tar, coke oven and diesel I emissions are increased only slightly in comparison to the first column of the table. But the precision of the estimate for diesel V is improved. The incorporation of epidemiological data on occupational exposures to diesel engine V has led us to revise upward our estimate of the slope for diesel I. Moreover, the results of the viral transformation and mutagenesis experiments on diesels I through IV have led us to revise downward our estimate of the potency of diesel V. Rather than being more potent than roofing tar or coke oven emissions, as originally suggested by the data, diesel V is likely to be 3 or 4 times less potent, although not conclusively so.
8. DISCUSSION AND CONCLUSIONS

We have constructed a general framework for combining results of diverse experiments when there is uncertainty about the relevance of some experiments to others. Within this framework, we have attacked the specific problem of assessing human cancer risks from heterogeneous toxicological and epidemiological data.

We distinguish between the conventional sampling error inherent in each experiment and a novel error of imperfect relevance among experiments. The latter type of error formalizes our notion of the credibility of interspecies and interagent extrapolations. We show how the available experimental data, in combination with the scientist's prior information on the credibility of such extrapolations, can be used to estimate the effects of various environmental agents in man and other species.

For a relatively simple example involving two species and two environmental agents, we show how the scientist's prior information can override the data in predicting human cancer risks. When we add more experiments on a third agent and in a third species, the data are predominant. At the same time, a scientist's vague prior notion of the magnitude of these extrapolative errors is made more precise.
We then propose a data analytic method for selecting the most relevant subset among a multitude of experiments. The main idea behind this method is to determine which species or which environmental agent contributes most to our estimate of extrapolative error. Such species or agents are successively eliminated from the data base so long as the precision of a particular estimate, say, a particular human cancer risk, is improved.

We apply this diagnostic method to a relatively large 5x9 array, containing 36 observed dose-response slopes. This example demonstrates the tradeoff between prediction bias due to potentially irrelevant experiments and prediction efficiency resulting from combining diverse experiments. The analysis tentatively suggests that for a particular class of environmental emissions containing polyaromatic hydrocarbons, the results of mammalian cell transformation experiments and mammalian mutagenesis experiments with metabolic activator are more relevant to human lung cancer risks than mammalian skin tumor initiation experiments or tests of indirect mutagenicity. Although this finding may not ultimately withstand scrutiny, it was derived, we stress, from our adoption of an attitude of exploratory analysis.

Finally, we demonstrate how prior information on the relationship between experiments can be incorporated into the analysis. This situation is likely to occur when experiments have been performed in the same species or in different strains of the same species, or when tests have been performed on multiple samples of the same environmental mixture.
The main limitation of the present analysis, we feel, is our inability to verify that the assumption of exchangeable errors of extrapolation applies to humans. The reader could legitimately object that we have merely assessed how well one can extrapolate from mouse skin to hamster embryo cells to mouse lymphoma cells. To confirm that the error of interspecies extrapolation is exchangeable among species, we need precise human carcinogenesis data.
Our analysis in Section 6 revealed that cigarette smoke is a more potent direct mutagen and a more potent transforming agent in cell culture than would be expected from its observed carcinogenic potency in man. When we excluded experiments involving cigarette smoke, the estimate of $\sigma$ for the remaining data declined. We do not regard this finding as strong evidence against the exchangeability of extrapolation errors across all species. However, the conclusion that the remaining data fit the underlying model more exactly, we acknowledge, is based primarily on the more precise non-human experimental data.
We should note, however, that the assumption of spherical errors (equation (3.3c)) is only a special case. The covariance matrix $\sigma^2 I$ for the extrapolative errors $\delta$ could be replaced by a more general matrix. For example, the deviations $\delta$ corresponding to experiments in a particular species or agent could have a variance that is different a priori from $\sigma^2$. We have not explored these possibilities in this initial paper because the naive exchangeability assumption seems to be a reasonable starting point.
Moreover, we do not regard the basic normal data, normal prior structure of our model as particularly objectionable. Deviations from the underlying constant relative potency model may arise from biological processes that are non-gaussian. Since the normal distribution has smaller tails than other likely candidates, any outliers from the underlying normal model will have a stronger contribution to the overall estimate of the hyperparameter $\sigma$. Our use of the normal model is thus more conservative in this respect. In any case, it gives analytical formulas that permit others to reproduce our results.
Nor do we take the underlying constant relative potency model to be an important limitation. Our framework could easily accommodate a constant additive potency model or, for that matter, any regression model of the form $\theta = X\beta + \delta$, where $X$ is a known design matrix. The appeal of the constant relative potency concept is its avoidance of potentially complex or implausible conversions of dosage units between species.
Nor do we attach any special limitation to our apparent reliance on the slope of a linear dose-response relationship. Although we recognize that there is considerable support for such a dose-response model, it should be clear that alternative methods of summarizing the results of an experiment are possible. For example, the TD50 (dose at which 50 percent of subjects develop clinical toxicity) or the MTD (maximum tolerated dose) could be used for each human experiment, as in Freireich et al. (1966). For the non-human species, at least, the LD50 could be used, as in Meselson and Russell (1977). In fact, our model could be generalized to the multivariate case where each experiment is summarized by a vector of numbers. In this way, one could incorporate the effect of such additional factors as the effects of duration and fraction of exposure, or possible synergistic effects with other tumor initiators or promotors.
The model described in this paper appears to provide the theoretical underpinning for other previous attempts to combine carcinogenesis experiments. Meselson and Russell's (1977) comparison of mutagenic potency in Salmonella with carcinogenic potency in rodents constitutes a special 2xl case in our framework. In fact, the present methodology can be used to resolve the criticism that the favorable results of these authors were merely fortuitous.

Similarly, our approach resolves the difficulties encountered by Crouch and Wilson (1979, 1980) in having to perform separate comparisons of carcinogenic potency in different pairs of species. It also satisfies these authors' desire for a systematic approach to the identification of potential exceptions to the underlying extrapolative model. Moreover, the use of informative priors on the hyperparameters of the model, as illustrated in Section 7, permits us to include multiple experiments on the same agent in the same species. We therefore avoid the problem, encountered by these authors, of deciding which of several experiments to incorporate in the analysis. By the use of informative priors, we could also incorporate information about the faulty design or execution of an experiment. Furthermore, in a multivariate generalization of our model, we could incorporate the incidences of tumors of different sites. This would avoid the additional difficulty, encountered by these authors, of deciding which of several endpoints to choose.
Finally, this paper considers only the estimation of carcinogenic potency. We do not discuss the use of these estimates, in combination with data on the extent of exposure to an environmental agent, to predict possible excess cancer incidence. Such an application entails additional but important uncertainties that are beyond the scope of this study.
9. ACKNOWLEDGEMENTS

Prof. DuMouchel's research was supported by National Science Foundation Grant No. MCS-80-05483. Prof. Harris's research was supported by Public Health Service Research Grant No. DA-02620 and by Research Career Development Award No. DA-00072. A preliminary version of this paper was presented at the 21st meeting of the NBER-NSF Seminar on Bayesian Inference in Econometrics, Chicago, October 31, 1980. We would like to thank H. Chernoff, R. Olshen, J. Koziol, D. Krantz, and R. Solow for reading and commenting on an earlier draft of this paper. The opinions and conclusions of this paper are the authors' sole responsibility.
REFERENCES

Casto, B.C., Hatch, G.C., Hsung, S.L., Huisingh, J.L., Nesnow, S., and Waters, M.S. (1979), "Mutagenic and carcinogenic potency of extracts of diesel and related environmental emissions: in vitro mutagenesis and oncogenic transformation," Paper presented at the EPA International Symposium on the Health Effects of Diesel Engine Emissions. December 3-5, 1979. Cincinnati.

Cochran, W.G. (1980), "Summarizing the results of a series of experiments," Proceedings of the 25th Conference on Design of Experiments in Army Research, Development and Testing, ARO Report 80-2, Washington, D.C., U.S. Department of Defense, pp. 21-34.

Crouch, E. and Wilson, R. (1979), "Interspecies comparison of carcinogenic potency," Journal of Toxicology and Environmental Health, 5, 1095-1118.

Crouch, E. and Wilson, R. (1980), "The regulation of carcinogens," Energy and Environmental Policy Center, Harvard University, Cambridge.

Dawid, A.P., Stone, M., and Zidek, J.V. (1973), "Marginalization paradoxes in Bayesian and structural inference (with Discussion)," Journal of the Royal Statistical Society, Ser. B, 35, 189-233.

De Finetti, B. (1964), "Foresight: its logical laws, its subjective sources," In Studies in Subjective Probability, eds. H.E. Kyburg, Jr and H.E. Smokler, New York: Wiley & Sons.

Dempster, A.P. (1980), "Comment on 'Using empirical Bayes techniques in the law school validity studies.' by D.B. Rubin," Journal of the American Statistical Association, 75, 817.

DuMouchel, W.H. (1981), "Documentation for CATTiATA," Massachusetts Institute of Technology, Statistics Center, Technical Report No. , Cambridge.

Efron, B. and Morris, C. (1973), "Combining possibly related estimation problems (with Discussion)," Journal of the Royal Statistical Society, Ser. B, 35, 379-421.

Freireich, E.J., Gehan, E.A., Rall, D.P., Schmidt, L.H., and Skipper, H.E. (1966), "Quantitative comparison of toxicity of anti-cancer agents in mouse, rat, hamster, dog, monkey, and man," Cancer Chemotherapy Reports, 50, 219-245.

Giri, N. (1980), "Bayesian estimation of means of a two-way classification with random effect model," Statistische Hefte, 21, 254-260.

Hammond, E.C., Selikoff, I.J., Lawther, P.L., Seidman, H. (1976), "Inhalation of benzpyrene and cancer in man," Annals of the New York Academy of Sciences, 271, 116-124.

Harris, J.E. (1981), "Potential risk of lung cancer from diesel engine emissions," Washington: National Academy of Sciences.

Huisingh, J.L., Bradow, R.L., Jungers, R.H., Harris, B.D., Zweidinger, R.B., Cushing, K.M., Gill, B.E., and Albert, R.E. (1979), "Mutagenic and carcinogenic potency of extracts of diesel and related environmental emissions: study design, sample generation, collection, and preparation," Paper presented at the EPA International Symposium on the Health Effects of Diesel Engine Emissions. December 3-5, 1979, Cincinnati.

Kadane, J.B., Dickey, J.M., Winkler, R.L., et al. (1980), "Interactive elicitation of opinion for a normal linear model," Journal of the American Statistical Association, 75, 845-854.

Kahn, H.A. (1966), "The Dorn study of smoking and mortality among U.S. veterans; report on eight and one-half years of observation," In Epidemiological Approaches to the Study of Cancer and Other Chronic Disease, ed. W. Haenszel, National Cancer Institute Monograph, 19, 1-125.

Lindley, D.V. and Smith, A.F.M. (1972), "Bayes estimates for the linear model (with Discussion)," Journal of the Royal Statistical Society, Ser. B, 34, 1-41.

Lloyd, W.J. (1971), "Long-term mortality study of steelworkers. V. Respiratory cancer in coke plant workers," Journal of Occupational Medicine, 13, 53-68.

Mazumdar, S., Redmond, C., Sollecito, W., Sussman, N. (1975), "An epidemiological study of exposure to coal tar pitch volatiles among coke oven workers," Journal of the Air Pollution Control Association, 25, 382-389.

Meselson, M. and Russell, K. (1977), "Comparisons of carcinogenic and mutagenic potency," In Origins of Human Cancer, eds. H.H. Hiatt, J.D. Watson, and J.A. Winsten, Cold Spring Harbor: Cold Spring Harbor Laboratory.

Mitchell, A.D., Evans, E.L., Jotz, M.M., Riccio, E.S., Mortelmans, K.E., and Simmon, V.F. (1979), "Mutagenic and carcinogenic potency of extracts of diesel and related environmental emissions: in vitro mutagenesis and DNA damage," Paper presented at the EPA International Symposium on the Health Effects of Diesel Engine Emissions. December 3-5, 1979. Cincinnati.

Nesnow, S., Triplett, L.L., and Slaga, T.J. (1980), "Tumorigenesis of diesel exhaust, gasoline exhaust, and related emission extracts on Sencar mouse skin," Paper presented at the EPA Symposium on Application of Short-Term Bioassays in the Analysis of Complex Environmental Mixtures. March 4-7, 1980. Williamsburg.

Peto, R. (1977), "Epidemiology, multistage models, and short-term mutagenicity tests," In Origins of Human Cancer, eds. H.H. Hiatt, J. Watson, J. Winsten, Cold Spring Harbor: Cold Spring Harbor Laboratory.

Raiffa, H. and Schlaifer, R. (1968), Applied Statistical Decision Theory. Student Edition. Cambridge: The M.I.T. Press.

Schneiderman, M.A. (1967), "Mouse to man: statistical problems in bringing a drug to clinical trial," Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, IV, 855-866.

Smith, A.F.M. (1973a), "Bayes estimates in one-way and two-way models," Biometrika, 60, 319-329.

Smith, A.F.M. (1973b), "A general Bayesian linear model," Journal of the Royal Statistical Society, Ser. B, 35, 67-75.

Smith, A.F.M. (1973c), "Discussion of 'Combining possibly related estimation problems.' by B. Efron and C. Morris," Journal of the Royal Statistical Society, Ser. B, 35, 410-412.

Tiao, G.C. and Zellner, A. (1964), "Bayes' theorem and the use of prior knowledge in regression analysis," Biometrika, 51, 119-130.

U.S. Environmental Protection Agency (1979), "The Carcinogen Assessment Group's final risk assessment on coke oven exposure," Washington. 38 pages.