Document 11070350

Pooled Testing for HIV Prevalence Estimation:
Exploiting the Dilution Effect
Stefanos A. Zenios
Lawrence M. Wein
#3810-95-MSA
March 1995
POOLED TESTING FOR HIV PREVALENCE
ESTIMATION: EXPLOITING THE DILUTION
EFFECT
Stefanos A. Zenios
Operations Research Center, M.I.T
and
Lawrence M. Wein
Sloan School of Management, M.I.T.
Abstract
We study pooled (or group) testing as a method for estimating the prevalence of HIV; rather than testing each sample individually, this method combines various samples into a pool and then tests the pool. Existing pooled testing procedures estimate the prevalence using dichotomous test outcomes. However, HIV test outcomes are inherently continuous, and their dichotomization may eliminate useful information. To overcome this problem, we develop a parametric procedure that utilizes the continuous outcomes. This procedure employs a hierarchical pooling model and estimates the prevalence using the likelihood equation. The likelihood equation is solved using an iterative algorithm, and a simulation study shows that our procedure yields very accurate estimates for a fraction of the cost of existing procedures.
1 Introduction

Estimation of the prevalence of HIV is important for evaluating the current status of the AIDS epidemic and planning effective intervention programs. However, precise estimates are very difficult to obtain because HIV infection is not associated with any clinical symptoms. Because the interval between the infection time and the onset of the clinical AIDS symptoms is very long, the total number of AIDS cases does not provide information on the recent incidence of HIV. Consequently, population surveys are the primary mode of HIV prevalence estimation. This method screens an unbiased population sample using an antibody test, such as ELISA, and estimates the prevalence from the number of positive tests. However, even this method has its limitations. Because participants in a survey must give their prior consent, self-selection bias can occur that results in an underestimation of the prevalence (Gill, Adler and Day 1988). Also, the number of tests is frequently limited by financial constraints that adversely affect the accuracy of the estimates.
One possible way to improve the accuracy of the estimates and reduce the self-selection bias is pooled testing. The rationale behind pooled testing is simple and intuitive: suppose that a pool is formed from the blood of ten (for example) individuals and is tested with a single antibody test. If the prevalence is sufficiently small, then there is a high probability that all ten samples will be negative. Moreover, even if the pool is positive, it will rarely contain more than one HIV infected individual. Under these conditions, a single pooled test provides nearly the same information as ten individual tests, and the pooling method achieves almost the same accuracy as individual testing but for significantly lower cost. Furthermore, the pooling procedure preserves the anonymity of the participating individuals, and therefore can reduce the self-selection bias (Gastwirth and Hammick 1989).

In his seminal paper, Dorfman (1943) showed how pooled testing, which is called group testing in the statistical literature and composite testing in the environmental statistics literature, can be employed to efficiently detect the defective members of a population. Group testing was researched aggressively in the subsequent years and a large literature on the topic now exists (see Sobel and Groll 1958 and Johnson, Kotz and Wu 1991 and references therein). Group testing has also been used to estimate the proportion of the defective members of a population (e.g., Sobel and Elashoff 1975, Chen and Swallow 1989 and Lovison, Gore and Patil 1994). However, these studies assume that tests have no misclassification errors. Motivated by the HIV prevalence estimation problem, Tu, Litvak and Pagano (1994) and Gastwirth and Hammick provide new insights into pooled testing by developing pooling procedures for imperfect tests.
Nevertheless, existing studies suffer from two shortcomings. First, they assume that pooling does not affect the sensitivity and specificity of the test. However, if the group size is large, some positive sera may be excessively diluted by negative sera and become undetectable in the pool. Failure to explicitly capture this dilution effect may result in prevalence underestimation. Second, they assume that the test results are binary (i.e., infected or non-infected), even though the outcomes of ELISA, known as the optical density readings, are inherently continuous. To implement their pooling procedures, they adopt the dichotomous classification of the test outcomes dictated by the test kit manufacturers; this classification, in addition to eliminating useful information, is specifically designed for individual testing.
The purpose of this paper is to propose a parametric procedure that overcomes these two shortcomings. Our procedure employs the hierarchical pooling model of Wein and Zenios (1995), which is described in Section 2, and explicitly captures both the dilution effect and the continuous nature of the test outcomes. This model is used in Section 3 to develop an iterative estimation algorithm that utilizes the continuous optical density readings. To implement the algorithm, it is necessary to calculate a quantity that cannot be expressed in closed form. We overcome this problem by developing an asymptotic expansion that is described in Section 4. An upper bound on the asymptotic variance is derived in Section 5, which is used in Section 6 to determine the sample size and the pool size that effectively balance the tradeoff between the testing cost and the accuracy of the estimate. A Monte Carlo simulation study is carried out in Section 7 to assess the performance of our estimation procedure. The results indicate that the parametric procedure can achieve more accurate estimates and attain significant cost savings relative to existing pooling procedures. Concluding remarks appear in Section 8.

2 Hierarchical Pooling Model

The ELISA detects HIV antibodies in the blood of infected individuals and then translates the antibody concentration into a continuous quantity, known as the optical density (OD) reading.
To capture the continuity of the OD readings, we describe this mechanism as a two-level hierarchical statistical model. The probability density of the antibody concentration is specified at the first level, and the conditional density of the OD reading given the antibody concentration is determined at the second level. The dilution effect is captured by assuming that the antibody concentration in a pool equals the average of the individual concentrations. Readers are referred to our companion paper (Wein and Zenios) for a detailed description of the hierarchical model. In that paper, the hierarchical model was validated on dilution series data and pooled testing data using a generalized linear model. In this section, we briefly summarize our hierarchical model and show how to use it to derive the probability density of pooled OD readings.

Let $p$ denote the unknown HIV prevalence and consider a pool consisting of $m$ individual blood samples. The pool's OD reading and antibody concentration are defined by $X^{(m)}$ and $Y^{(m)}$, respectively. These random variables have probability densities $f_X^{(m)}(x; p, \phi, \gamma)$ and $\pi_Y^{(m)}(y; p)$, where $\phi$ and $\gamma$ are nuisance parameters that appear in the second level of the hierarchical model. Without loss of generality, we assume that the OD reading is normalized so as to fall between 0 and 1.

The model's first level specifies the density $\pi_Y^{(m)}(y; p)$. Let $\pi_+(y)$ and $\pi_-(y)$ be the probability density for the antibody concentration of infected and non-infected individuals, respectively, and define $\pi^{*(k,m)} = \pi_+^{*k} * \pi_-^{*(m-k)}$, where $*$ is the convolution operator. Let $\pi_Y^{(k,m)}(y)$ be the conditional density of $Y^{(m)}$ given that the pool consists of $k$ infected and $m-k$ non-infected individuals. Then

$$\pi_Y^{(k,m)}(y) = m\,\pi^{*(k,m)}(my) \qquad (1)$$

and the first level of the model states that

$$\pi_Y^{(m)}(y; p) = \sum_{k=0}^{m} \binom{m}{k} p^k (1-p)^{m-k}\, \pi_Y^{(k,m)}(y). \qquad (2)$$

The model's second level specifies the conditional density $f(x; \phi, \gamma \mid y)$ of the OD reading given the antibody concentration:

[equation (3) is illegible in the source]

where the nuisance parameters $\phi$ and $\gamma$ depend upon the test kit employed. By equations (2) and (3), the density of $X^{(m)}$ is

$$f_X^{(m)}(x; p, \phi, \gamma) = \sum_{k=0}^{m} \binom{m}{k} p^k (1-p)^{m-k}\, f_X^{(k,m)}(x; \phi, \gamma), \qquad (4)$$

where

$$f_X^{(k,m)}(x; \phi, \gamma) = \int \pi_Y^{(k,m)}(y)\, f(x; \phi, \gamma \mid y)\, dy. \qquad (5)$$

To complete the description of the model, it remains to determine the parameters $\phi$ and $\gamma$, and the densities $\pi_-$ and $\pi_+$. Because our goal is to estimate $p$, we assume that these quantities can be obtained from an off-line analysis of historical data, and that the only unknown parameter is $p$. The off-line estimation of these quantities is discussed in Subsection 7.2.

It is worth noting that our model not only captures the dilution effect and the continuity of the OD readings, but is also consistent with empirical evidence that identifies two different sources of variability in ELISA: within sample (due to measurement errors) and between sample. The first level of the model captures the between sample variability. The second level of the model, in particular equation (3), allows repeated measurements from the same sample to be different, and hence captures the within sample variability. To improve readability, in the remainder of this paper we suppress the notational dependence on the parameters $\phi$ and $\gamma$; for example, we use $f(x \mid y)$ instead of $f(x; \phi, \gamma \mid y)$.

3 Iterative Algorithm

In this section, we develop an algorithm to estimate the prevalence of HIV. This is done by deriving the likelihood equation from the probability density function (4), and then iteratively solving this equation.
Suppose that we collect blood samples from $nm$ individuals, pool them together to form $n$ independent pools of size $m$, and then test the pools. The raw data are $x = (x_1, \ldots, x_n)$, where $x_i$ denotes the OD reading of the $i$th pool. Equation (4) implies that the overall likelihood function is

$$L(x; p) = \prod_{i=1}^{n} \left( \sum_{k=0}^{m} \binom{m}{k} p^k (1-p)^{m-k}\, f_X^{(k,m)}(x_i) \right). \qquad (6)$$

To obtain the maximum likelihood estimate (m.l.e.) $\hat{p}$ of $p$, we solve the likelihood equation

$$\frac{\partial \log L(x; p)}{\partial p} = 0. \qquad (7)$$

It is convenient to define $\tau_k(x; p)$, the conditional probability that a pool contains $k$ infected individuals given that its OD is $x$,

$$\tau_k(x; p) = \frac{\binom{m}{k} p^k (1-p)^{m-k}\, f_X^{(k,m)}(x)}{\sum_{j=0}^{m} \binom{m}{j} p^j (1-p)^{m-j}\, f_X^{(j,m)}(x)}. \qquad (8)$$

After some tedious but straightforward algebra, equation (7) reduces to

$$\frac{1}{p} \sum_{i=1}^{n} \sum_{k=0}^{m} k\, \tau_k(x_i; p) - \frac{1}{1-p} \sum_{i=1}^{n} \sum_{k=0}^{m} (m-k)\, \tau_k(x_i; p) = 0. \qquad (9)$$

Because $\sum_{k=0}^{m} \tau_k(x; p) = 1$, expression (9) simplifies to the self-consistency equation

$$nmp = \sum_{i=1}^{n} \sum_{k=0}^{m} k\, \tau_k(x_i; p). \qquad (10)$$

Intuitively, the left side of (10) is the expected number of infected individuals and the right side is the conditional expected number of infected individuals, given the data.

Motivated by (10), we propose the following iterative algorithm, where $\epsilon$ denotes the desired tolerance level:

Step 1. Start the procedure with an initial estimate $p^{(0)}$.

Step 2. Obtain improved estimates by taking

$$p^{(j+1)} = \frac{1}{nm} \sum_{i=1}^{n} \sum_{k=0}^{m} k\, \tau_k(x_i; p^{(j)}). \qquad (11)$$

Step 3. Terminate if $|p^{(j+1)} - p^{(j)}| < \epsilon$; otherwise, return to Step 2.

To prove that the algorithm converges, we show that it is an Expectation-Maximization (EM) algorithm (see Dempster, Laird and Rubin 1977). The EM algorithm is applicable whenever the raw data can be viewed as incomplete observations from a "complete" data set, and the maximum likelihood estimates for the complete data set are easily obtainable. Here, the complete data set consists of both the observable OD readings, and the unobservable number of infected individuals in the pooled samples. The details are described in Appendix A.
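The three steps of the iterative algorithm can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `em_prevalence`, the matrix `comp_dens` (whose entry `[i, k]` is assumed to hold the component density of pool `i` given `k` infected members) and the Gaussian toy densities used to exercise it are all assumptions introduced here. In practice the component densities would come from the hierarchical model.

```python
import math
import numpy as np

def em_prevalence(comp_dens, m, p0=0.5, tol=1e-8, max_iter=10_000):
    """Iterate the self-consistency update; comp_dens[i, k] = density of x_i given k infected."""
    n = comp_dens.shape[0]
    k = np.arange(m + 1)
    binom = np.array([math.comb(m, j) for j in k], dtype=float)
    p = p0
    for _ in range(max_iter):
        # Numerators of the conditional probabilities tau_k(x_i; p), one row per pool.
        weights = binom * p**k * (1.0 - p) ** (m - k) * comp_dens
        tau = weights / weights.sum(axis=1, keepdims=True)
        # Conditional expected number of infected individuals, divided by n*m.
        p_new = float((tau @ k).sum() / (n * m))
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p

# Toy data (hypothetical): pools of size 5 and Gaussian component densities centred at k/m.
rng = np.random.default_rng(0)
m, n, p_true, sd = 5, 2000, 0.10, 0.05
k_true = rng.binomial(m, p_true, size=n)
x = rng.normal(k_true / m, sd)
means = np.arange(m + 1) / m
comp_dens = np.exp(-0.5 * ((x[:, None] - means) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
p_hat = em_prevalence(comp_dens, m)
```

Because the toy components are well separated, `p_hat` should land close to the fraction of infected individuals actually simulated; with overlapping components the estimate relies more heavily on the mixture weights.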
4 Analytical Approximation
The algorithm of Section 3 requires the computation of the conditional probabilities $\tau_k(x; p)$ for $k = 0, \ldots, m$. Since we have been unable to obtain these probabilities in closed form, Monte Carlo simulation is employed in Subsection 7.3 to estimate these probabilities. An analytical approximation to these probabilities is derived in this section; this approximation, which assumes large pool sizes, is much less computationally intensive than the simulation procedure, and is used in Section 6 to derive effective sampling designs. The large pool size approximation is derived in Subsection 4.1, and is employed in Subsections 4.2 and 4.3, respectively, to calculate asymptotic expansions for the probability densities defined in equations (5) and (4). A discussion of the appropriateness of the large group size approximation is deferred to Subsection 4.4.
4.1 A Large Pool Size Approximation

Let $Y^{(k,m)}$ denote the conditional antibody concentration given that the pool consists of $k$ infected and $m-k$ non-infected individuals; in this subsection, we prove that $Y^{(k,m)}$ converges in distribution to a normal random variable as $m \to \infty$. Let $\mu_+$ and $\mu_-$ denote, respectively, the mean antibody concentrations of infected and non-infected individuals, and let $\sigma_+$ and $\sigma_-$ be the respective standard deviations. In addition, define $\mu(p) = p\mu_+ + (1-p)\mu_-$.

Proposition 1 Let $k, m \to \infty$, such that $k/m \to r$. Then,

$$\sqrt{m}\,\bigl(Y^{(k,m)} - \mu(r)\bigr) \to N\bigl(0,\; r\sigma_+^2 + (1-r)\sigma_-^2\bigr). \qquad (12)$$

Proof: Let $Y_i^+$ and $Y_i^-$, $i = 1, 2, \ldots$, denote infinite sequences of independent, identically distributed (iid) random variables with respective densities $\pi_+(y)$ and $\pi_-(y)$. The definition of $Y^{(k,m)}$ implies

$$Y^{(k,m)} = \frac{Y_1^+ + \cdots + Y_k^+ + Y_1^- + \cdots + Y_{m-k}^-}{m}. \qquad (13)$$

If we define
and
As $m$ increases, the value of the integrand in (18) depends more and more completely on the values of $y$ in the neighborhood of $\mu(r)$. Hence, we replace $f(x \mid y)$ by its Taylor expansion around $\mu(r)$,

$$f(x \mid y) = f(x \mid \mu(r)) + (y - \mu(r))\,\frac{\partial f}{\partial y}(x \mid \mu(r)) + \frac{1}{2}(y - \mu(r))^2\,\frac{\partial^2 f}{\partial y^2}(x \mid \mu(r)) + \frac{1}{6}(y - \mu(r))^3\,\frac{\partial^3 f}{\partial y^3}(x \mid \mu(r)) + O(|y - \mu(r)|^4). \qquad (19)$$

If we substitute (19) into (18), then (16) implies that the second central moment of $Y^{(k,m)}$ about $\mu(r)$ equals $(r\sigma_+^2 + (1-r)\sigma_-^2)/m$ to leading order, while the remaining moment terms contribute only to the $O(m^{-2})$ remainder. Thus, we have

$$f_X^{(k,m)}(x) = f(x \mid \mu(r)) + \frac{r\sigma_+^2 + (1-r)\sigma_-^2}{2m}\,\frac{\partial^2 f}{\partial y^2}(x \mid \mu(r)) + O(m^{-2}). \qquad (20)$$

The calculation of $\partial^2 f / \partial y^2$ is required to complete the derivation. This calculation was performed using the symbolic differentiation of a computer algebra system (Maple V) and the expression is given in Appendix B.

4.3 Approximating $f_X^{(m)}(x; p)$

The methodology of Subsection 4.2 can also be used to derive an asymptotic expansion for $f_X^{(m)}(x; p)$, which is defined in (4). Let $\sigma(p) = \sqrt{p\sigma_+^2 + (1-p)\sigma_-^2 + p(1-p)(\mu_+ - \mu_-)^2}$. The Central Limit Theorem implies that

$$\sqrt{m}\,\bigl(Y^{(m)} - \mu(p)\bigr) \to N\bigl(0, \sigma^2(p)\bigr) \quad \text{as } m \to \infty. \qquad (21)$$

Repeating the intermediate steps of Subsection 4.2 yields

$$f_X^{(m)}(x; p) = f(x \mid \mu(p)) + \frac{\sigma^2(p)}{2m}\,\frac{\partial^2 f}{\partial y^2}(x \mid \mu(p)) + O(m^{-2}). \qquad (22)$$

This expression is used in Section 5 to derive an upper bound for the asymptotic variance of the m.l.e. as the number of pools $n \to \infty$.
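The variance $\sigma^2(p)$ that appears in the Central Limit Theorem statement of this subsection can be checked with the law of total variance applied to a single individual. The short derivation below is a sketch; the infection indicator $I$ and the symbol $Y_1$ for one individual's antibody concentration are notation introduced here for illustration, and it assumes that individuals are infected independently with probability $p$.

```latex
\begin{align*}
\operatorname{Var}(Y_1)
  &= \mathrm{E}\bigl[\operatorname{Var}(Y_1 \mid I)\bigr]
   + \operatorname{Var}\bigl(\mathrm{E}[Y_1 \mid I]\bigr) \\
  &= p\,\sigma_+^2 + (1-p)\,\sigma_-^2
   + p(1-p)\,(\mu_+ - \mu_-)^2 \\
  &= \sigma^2(p), \\[4pt]
\operatorname{Var}\bigl(Y^{(m)}\bigr)
  &= \operatorname{Var}\Bigl(\tfrac{1}{m}\textstyle\sum_{i=1}^{m} Y_i\Bigr)
   = \frac{\sigma^2(p)}{m}.
\end{align*}
```

The first two terms are the within-group variances weighted by the group probabilities, and the last term is the variance of a two-point random variable that takes the value $\mu_+$ with probability $p$ and $\mu_-$ otherwise.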
[Figure 1 box. Objective: to estimate the prevalence $p$. Raw data: $\{x_1, \ldots, x_n\}$, the OD readings from pools of size $m$. The box also lists the known parameters, the approximations of this section, and the three steps of the algorithm: Step 1, start the procedure with an initial estimate $p^{(0)}$; Step 2, update the estimate; Step 3, terminate if $|p^{(j+1)} - p^{(j)}| < \epsilon$, otherwise return to Step 2.]

Figure 1: The estimation algorithm for the proposed estimate.

4.4 Discussion of the Approximation

The asymptotic representation for $f_X^{(k,m)}(x)$ can be used to approximate $\tau_k(x; p)$ and implement the estimation algorithm (see Figure 1). However, it is expected that the algorithm will converge to a fixed point that will be different from the m.l.e. To distinguish this fixed point from the m.l.e., we call it the proposed estimate and denote it by $\tilde{p}$. Using (20), we can show that the difference between the proposed estimate and the m.l.e. is of $O(m^{-2})$; however, the proof is omitted here and can be found in Zenios (1995).

Regarding the appropriateness of the large group size approximation, we note that the empirical results in Wein and Zenios display a large separation between OD readings for HIV positive and HIV negative individuals. Consequently, our parametric procedure can adopt very large pool sizes without significant loss of information, thereby partially justifying our large group size approximation. However, the accuracy of the asymptotic expansion (20) depends on the rate at which the infected and non-infected components of the pooled concentration converge to normal random variables in (15). Following the conventional rule-of-thumb (the sample mean of $n$ iid random variables is approximately normal for $n > 15$), we conclude that if the pool size $m$ and the number of infected individuals in the pool, $k$, satisfy $15 < k < m - 15$, then the normal approximations should be sufficiently accurate for practical purposes. The condition $k < m - 15$ is apt to hold for large pool sizes. Although $k > 15$ may be practical for other statistical applications, we would expect $k$ to be much smaller than 15 for HIV prevalence studies. Approximation (15) would hold for small $k$ if $\pi_+(y)$ was approximately normal; however, this does not appear to be the case (see Figure 7 of Wein and Zenios). In conclusion, we do not have a persuasive justification for (15); nevertheless, the large group size approximation developed in this section performs well in the computational study described in Section 7.

5 An Upper Bound on the Asymptotic Variance

This section contains a derivation, which is based on the information inequality, of an upper bound on the variance of the proposed estimate. Let $\mu^{(m)}(p) = E(X^{(m)} \mid p)$, and let $I(\mu^{(m)}(p))$ and $I(p)$ denote, respectively, the Fisher's information about $\mu^{(m)}(p)$ and $p$ contained in a single observation from $X^{(m)}$, which is the OD reading of a pool of size $m$. Because $X^{(m)}$ is an unbiased estimate of $\mu^{(m)}(p)$, the Cramer-Rao lower bound implies that

$$\mathrm{Var}(X^{(m)} \mid p) \ge \frac{1}{I(\mu^{(m)}(p))}. \qquad (23)$$

Since $\mathrm{Var}(\hat{p})$, which is the asymptotic variance of $\hat{p}$ as $n \to \infty$, equals $1/(nI(p))$, and $I(p) = I(\mu^{(m)}(p))\,\bigl(d\mu^{(m)}(p)/dp\bigr)^2$, it follows that

$$\mathrm{Var}(\hat{p}) \le \frac{\mathrm{Var}(X^{(m)} \mid p)}{n\,\bigl(d\mu^{(m)}(p)/dp\bigr)^2}. \qquad (24)$$
Let us first calculate $\mu^{(m)}(p)$. From (22), we have

$$\mu^{(m)}(p) = \int x\, f(x \mid \mu(p))\, dx + \frac{\sigma^2(p)}{2m} \int x\, \frac{\partial^2 f}{\partial y^2}(x \mid \mu(p))\, dx + O(m^{-2}). \qquad (25)$$

Equation (3) implies

$$\int x\, f(x \mid y)\, dx = \frac{y^{\gamma}}{1 + y^{\gamma}}, \qquad (26)$$

and differentiating twice under the integral sign yields

$$\int x\, \frac{\partial^2 f}{\partial y^2}(x \mid y)\, dx = \frac{\gamma\, y^{\gamma-2}\,\bigl[(\gamma - 1) - (\gamma + 1)\, y^{\gamma}\bigr]}{(1 + y^{\gamma})^3}. \qquad (27)$$

Thus, substituting (26) and (27) into (25) gives

$$\mu^{(m)}(p) = \frac{(\mu(p))^{\gamma}}{1 + (\mu(p))^{\gamma}} + \frac{\sigma^2(p)}{2m} \cdot \frac{\gamma\, (\mu(p))^{\gamma-2}\,\bigl[(\gamma - 1) - (\gamma + 1)\,(\mu(p))^{\gamma}\bigr]}{\bigl(1 + (\mu(p))^{\gamma}\bigr)^3} + O(m^{-2}). \qquad (28)$$

Differentiating (28) yields

[equation (29), together with the definitions of the constants A, B and C that appear in it, is illegible in the source]

Carrying out a similar analysis for $\mathrm{Var}(X^{(m)} \mid p)$ leads to

[equation (30) is illegible in the source]

For brevity of notation, let $h$ be the first term in the right side of equation (29) and $l$ be the coefficient of $1/m$ in (29). Similarly, let $v$ be the first term in the right side of (30), and $w$ be the coefficient of $1/m$ in (30). Substituting (29) and (30) into (24) and omitting terms of $O(m^{-2})$, we have the upper bound

$$\mathrm{Var}(\hat{p}) \le \frac{v + \frac{w}{m}}{n\,\bigl(h + \frac{l}{m}\bigr)^2} + O(m^{-2}). \qquad (31)$$

We conclude this section with a brief discussion of confidence intervals. The Wald's confidence interval for $p$ is

$$\hat{p} \pm \frac{z_{\alpha/2}}{\sqrt{n I(\hat{p})}}, \qquad (32)$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal density. Because a closed form expression for $I(p)$ is not available, we could attempt to use the observed information $-\partial^2 \log L(x; p)/\partial p^2$ instead. Although the details are omitted, the resulting confidence intervals were tested in the computational study in Section 7, and were not sufficiently reliable for practical use (e.g., the actual covering probability for the 95% confidence interval was roughly 85%). Alternatively, we could construct confidence intervals that use the upper bound (31). However, this gives a covering probability that is roughly 99%.
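As a small illustration of the Wald interval discussed above, the sketch below computes the interval from a point estimate and a per-pool information value. The function name `wald_interval`, the clipping of the endpoints to [0, 1] and the numbers in the example are assumptions made here for illustration; the text itself notes that the information is not available in closed form, so in practice the value passed in would be an approximation.

```python
import math
from statistics import NormalDist

def wald_interval(p_hat, info_per_pool, n_pools, level=0.95):
    """Return (lower, upper) for p_hat +/- z / sqrt(n * I), clipped to [0, 1]."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)  # upper alpha/2 quantile
    half_width = z / math.sqrt(n_pools * info_per_pool)
    # Clipping is a convenience added in this sketch, not part of the formula.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Hypothetical inputs: estimate 0.02, information 400 per pool, 100 pools.
lower, upper = wald_interval(0.02, 400.0, 100)
```

With these hypothetical inputs the half-width is about 0.0098, so the interval is roughly (0.0102, 0.0298).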
6 Designing Efficient Procedures
To conduct a population survey that adopts the algorithm of Figure 1, we must first specify the number of pools $n$ and the group size $m$. An efficient design must balance the testing cost with the accuracy of the resulting estimate. The design problem that arises in practice can be posed as one of the following two constrained optimization problems: either choose $n$ and $m$ to achieve a prespecified variance at minimum cost, or choose $n$ and $m$ to minimize the variance of the estimate subject to a prespecified budget constraint. These two problems are closely related (more about that later), and for concreteness we will address the former problem.

Let $\Delta$ denote the prespecified variance threshold. Since the true variance of the estimate is not available in closed form, we employ the upper bound of the asymptotic variance derived in (31) as a surrogate. Our resulting design will be conservative, in that the actual variance will be less than or equal to $\Delta$. Of course, the upper bound on the variance is a function of the prevalence, which must be estimated at the design stage.

We assume that the cost of testing a pool of size $m$ is (see Behets et al. 1988 for a detailed discussion)

$$C(m) = \begin{cases} f & \text{if } m = 1, \\ em + f & \text{if } m > 1, \end{cases} \qquad (33)$$

where $e$ and $f$ are positive constants. The design problem is to choose positive integers $n$ and $m$ to

$$\text{minimize } nC(m) \qquad (34)$$

$$\text{subject to } n \ge \frac{v + \frac{w}{m}}{\Delta\,\bigl(h + \frac{l}{m}\bigr)^2}. \qquad (35)$$

It is possible to solve this integer programming problem using exhaustive search. However, this is cumbersome and time consuming. Instead, we obtain a closed form solution to the non-integer relaxation of this problem.

Proposition 2 If $(wh^2 - 2hlv)(fh^2 - 2ehl) > 0$, then the solution to the non-integer relaxation of (34)-(35) is

[equation (36), which gives the pool size $m^*$, is illegible in the source]

$$n^* = \frac{v + \frac{w}{m^*}}{\Delta\,\bigl(h + \frac{l}{m^*}\bigr)^2}. \qquad (37)$$

The proof involves a standard Lagrangian argument and elementary calculus, and is omitted.

Similarly, we can solve the second optimization problem, which is to choose $m$ and $n$ to minimize the upper bound of the asymptotic variance subject to $nC(m) \le B$, where $B$ is the prespecified available budget. The solution to the non-integer relaxation of this problem is given by (36) and

$$n^* = \frac{B}{em^* + f}. \qquad (38)$$

Notice that the pool size in (36) is independent of the specified variance threshold and the imposed budget constraint. It can be shown that this pool size also minimizes the cost per unit information (i.e., cost times variance). Although the design in Proposition 2 employs an upper bound on the asymptotic variance and is clearly suboptimal, this result should provide a useful back-of-the-envelope calculation.
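The exhaustive search mentioned in this section is easy to sketch for moderate pool sizes. The code below is an illustration under stated assumptions, not the authors' procedure: it assumes the variance constraint takes the form n >= (v + w/m) / (Delta (h + l/m)^2), that an individual test (m = 1) costs f while a pool of size m > 1 costs e*m + f, and it uses made-up values for h, l, v and w, which in the paper depend on the prevalence and the model parameters.

```python
import math

def pool_cost(m, e, f):
    """Assumed cost of one test: f for an individual test, e*m + f for a pool."""
    return f if m == 1 else e * m + f

def required_pools(m, delta, h, l, v, w):
    """Smallest integer n meeting the assumed variance constraint for pool size m."""
    bound = (v + w / m) / (h + l / m) ** 2
    return max(1, math.ceil(bound / delta))

def exhaustive_design(delta, h, l, v, w, e, f, max_m=80):
    """Try every pool size up to max_m and return (n, m, total_cost) with least cost."""
    best = None
    for m in range(1, max_m + 1):
        n = required_pools(m, delta, h, l, v, w)
        cost = n * pool_cost(m, e, f)
        if best is None or cost < best[2]:
            best = (n, m, cost)
    return best

# Hypothetical coefficients, chosen only to exercise the search.
params = dict(delta=1e-3, h=1.0, l=2.0, v=0.5, w=30.0)
n_best, m_best, cost_best = exhaustive_design(e=0.04, f=1.35, **params)
```

The search is linear in the largest pool size considered, so for an upper limit such as 80 it is inexpensive; the closed-form relaxation remains useful as a quick check on the result.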
7 Simulation Study
The results from a Monte Carlo simulation study are reported in this section. The purpose of this study is to address the following questions:

1. How accurate are the approximations of Section 4?

2. What is the value of employing the continuous OD readings rather than the traditional dichotomous test results?

3. Is it necessary to explicitly incorporate the dilution effect into our analysis?

4. What are the expected benefits from pooled testing?

The first question is addressed by comparing the proposed parametric estimator (p.p.e.) to the exact m.l.e. To answer question 2, we develop a class of binary estimators, and compare the best binary estimator to the exact m.l.e. To address question 3, we compare the p.p.e. to two other binary estimators, one that explicitly captures the dilution effect and one that ignores it. We also consider individual testing in order to provide an answer to question 4.

To obtain meaningful results from the simulation study, we employ real data. The data are described in Subsection 7.1 and are used in Subsection 7.2 to estimate the parameters of the simulation model. The various estimators are defined in Subsection 7.3, the results of the simulation study are reported in Subsection 7.4 and a hypothetical application is described in Subsection 7.5.
7.1 Data

We employ two data sets that were collected and tested by the National Reference Laboratory of Australia (NRL). The NRL1 data set consists of OD readings for 4000 HIV negative and 3000 HIV positive sera. The sera were tested using a commercially available test kit, and the cutoff for classifying the test outcomes as positive or negative was 0.05. The NRL2 data set consists of the OD readings from 10 infected sera, diluted sequentially in a fixed negative serum to produce a series of 10 two-fold dilutions, with ratios 1:16, 1:32, ..., 1:8192. All samples were tested in duplicate using 10 different test kits. Finally, we note that no individuals are common to the two data sets, and the test kit in NRL1 is not one of the 10 kits used in NRL2. The individual samples in NRL1 were diluted according to the test kit manufacturer's instructions before being tested.
7.2 The Simulation Model

The simulation model randomly generates antibody levels and OD readings for both individual and pooled samples via the hierarchical model defined in Section 2. In this subsection, we specify the model parameters $\phi$ and $\gamma$, and the densities $\pi_+(y)$ and $\pi_-(y)$.

The parameters $\phi$ and $\gamma$ are solely dependent on the test kit employed, and can be estimated from dilution series data by fitting the data to a generalized linear model; Wein and Zenios carry out this procedure for the 10 different test kits in NRL2. The ten estimated values for $\gamma$ (one for each test kit) ranged from 0.09 to 1.11, and averaged 0.54. Evidently, the value for $\gamma$ is highly dependent on the test kit employed. Unfortunately, we do not have dilution series data for the test kit employed in the NRL1 data set, which is the data set used to estimate $\pi_+(y)$ and $\pi_-(y)$. For simplicity, we use the value $\gamma = 1.0$ in the simulation model. Sensitivity analysis in Wein and Zenios shows that $\gamma = 0.54$ and $\gamma = 1.0$ yield qualitatively similar results.

We employ the parameter $\phi$ and the densities $\pi_+(y)$ and $\pi_-(y)$ that are used in Wein and Zenios, and readers are referred to that paper for details; here we provide only a brief summary. Rather than estimate $\phi$ in a manner similar to the estimation of $\gamma$, Wein and Zenios obtain the estimate $\phi = 0.0088$ as a by-product of estimating $\pi_-(y)$ from the NRL1 data. The densities $\pi_+(y)$ and $\pi_-(y)$ are difficult to obtain because the antibody concentration is not an observable quantity. Moreover, these densities are population dependent and may vary if the epidemic is nonstationary. However, if the epidemic is progressing slowly, then these densities can safely be assumed to be constant, and hence do not have to be re-estimated every time a population survey is conducted.

Using a combination of exploratory data analysis and the EM algorithm to estimate $\pi_-(y)$ from the 4000 OD readings for HIV negative individuals in NRL1, Wein and Zenios conclude that the variability of OD readings for HIV negative individuals is due nearly entirely to within sample variability and contains almost no between sample variability. Consequently, for $\pi_-(y)$ we use the degenerate density with mean $\mu_- = 0.0086$ and $\sigma_- = 0.0$.

Exploratory data analysis in Wein and Zenios suggests that the within sample variability is substantially smaller than the between sample variability for HIV positive individuals, and they make the simplifying assumption that the within sample variability is zero for infected individuals. Thus, $X \mid Y = E(X \mid Y)$ for infected individuals, and combining this with equation (3) we obtain

$$Y = \left( \frac{X \mid Y}{1 - X \mid Y} \right)^{1/\gamma}. \qquad (39)$$

Since our estimate for $\gamma$ is one, if we let $x_1, \ldots, x_{3000}$ denote the observed OD readings for the 3000 infected individuals in the NRL1 data set, then the density $\pi_+(y)$ is specified by the empirical density of $\left( \frac{x_1}{1 - x_1}, \ldots, \frac{x_{3000}}{1 - x_{3000}} \right)$. This density is displayed in Figure 2. For ease of reference, the parameters of the simulation model are provided in Table 1.

[Figure 2 plot; horizontal axis: antibody concentration.]

Figure 2: The density $\pi_+(y)$ used in the simulation model.
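The transformation between OD readings and antibody concentrations used in this subsection, together with the averaging assumption behind the dilution effect, can be sketched as follows. The function names and the example readings are hypothetical, and the sketch assumes that the mean OD reading at concentration y is y^gamma / (1 + y^gamma), which is the relation implied by inverting the transformation above.

```python
import numpy as np

def od_to_concentration(x, gamma=1.0):
    """Map a normalized OD reading in (0, 1) to an antibody concentration."""
    x = np.asarray(x, dtype=float)
    return (x / (1.0 - x)) ** (1.0 / gamma)

def concentration_to_mean_od(y, gamma=1.0):
    """Assumed mean OD reading at antibody concentration y."""
    y = np.asarray(y, dtype=float)
    return y**gamma / (1.0 + y**gamma)

# Hypothetical pool: one positive serum with OD 0.9 and nine negative sera,
# each at the negative mean concentration 0.0086 reported for the simulation model.
y_positive = od_to_concentration(0.9)           # concentration of the positive serum
pool_concentration = (y_positive + 9 * 0.0086) / 10
pool_mean_od = concentration_to_mean_od(pool_concentration)
```

In this made-up example the pooled mean OD falls well below the individual reading of 0.9, which is the dilution effect that the hierarchical model is designed to capture.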
7.3 Estimation Procedures
In this subsection, we describe the procedures that are considered in this simulation study: the proposed parametric estimator, a numerical procedure for the exact m.l.e., three distinct binary estimators that employ binary test results rather than continuous OD readings, and individual testing.

$\mu_- = 0.0086$    $\mu_+ = 3.79$
$\sigma_- = 0.00$    $\sigma_+ = 4.00$
$\phi = 0.0088$    $\gamma = 1.00$

Table 1: The parameters of the simulation model.
Parametric Estimators. The proposed parametric estimator employs the algorithm in Figure 1 with the estimates in Table 1. The p.p.e. employs the pool size given by Proposition 2. The cost parameters are obtained from the field study of Behets et al., and are $e = 0.04$ and $f = 1.35$; details may be found in Wein and Zenios. Figure 3 displays the derived pool size $m^*$ for $0 < p < 1$. Notice that $m^* > 80$ for $p < 0.22$. The hierarchical pooling model was validated on pools of size at most 80 in Wein and Zenios and, in fact, 80 appears to be the largest pool size reported in the literature (Cahoon-Young 1992). Consequently, we do not consider pool sizes larger than 80.

To derive the exact m.l.e., we use the algorithm in Figure 1 with $f_X^{(k,m)}(x)$ replaced by a lookup table representation of the density. The table entries are obtained from a simulated data set as follows. For each value of $k$ and $m$, a sample of 3000 pools that each consist of $k$ infected and $m-k$ non-infected individuals is generated from the antibody densities $\pi_+(y)$ and $\pi_-(y)$. The OD readings are generated using the density in (3), and the table representation is obtained from the empirical density of the simulated OD readings. When embedded in the estimation algorithm in Figure 1, these representations produce an estimate that is approximately equal to the m.l.e. We call this estimator the theoretical parametric estimator (t.p.e.), and it provides the benchmark to which the performance of the p.p.e. is to be compared. The t.p.e. also uses the pool sizes shown in Figure 3.

Binary Estimators: To assess the value of employing continuous OD readings rather than traditional binary test results, we also consider the following class of procedures: test $n$ pools of size $m$, classify each pool outcome as negative or positive using the OD cutoff $u$, and estimate the prevalence from the total number of positive pools. Let $Se(m, u)$ and $Sp(m, u)$ denote the sensitivity and specificity for this class of pooling procedures, and $k$ be the total number of positive pools. The prevalence estimate is

$$\hat{p} = 1 - \left( \frac{Se(m, u) - k/n}{Se(m, u) + Sp(m, u) - 1} \right)^{1/m}, \qquad (40)$$

and the asymptotic variance of $\hat{p}$ is

$$\mathrm{Var}(\hat{p}) = \frac{(1 - p)^{2 - 2m}\, f(m, u, p)\,\bigl(1 - f(m, u, p)\bigr)}{n m^2\, \bigl(Se(m, u) + Sp(m, u) - 1\bigr)^2}, \qquad (41)$$

where $f(m, u, p) = [1 - (1 - p)^m]\,Se(m, u) + (1 - p)^m\,(1 - Sp(m, u))$. Also, the cost per unit information is

[equation (42) is illegible in the source]

We consider three different binary estimators that are based on (40). The first estimator, called the simple binary estimator (s.b.e.), assumes that $Se(m, u)$ and $Sp(m, u)$ are not a function of $m$ (i.e., there is no dilution effect), and that the cutoff $u$ is determined by the test manufacturer. Consequently, for all $m$, $Se(m, u)$ and $Sp(m, u)$ represent the sensitivity and specificity under individual testing, as reported by the test manufacturer. The pool size $m$ is chosen to minimize the cost per unit information (42); recall that the pool size derived in Section 6 also minimized this quantity. The simple binary estimator was proposed by Tu et al., who were the first to derive equations (40)-(41). To implement this procedure, we set the cutoff $u$ equal to 0.05, which is the cutoff for the NRL1 data, and use Monte Carlo simulation (with $m = 1$) to derive very precise estimates of the sensitivity and specificity under individual testing; these two estimates are substituted into equations (42) and (40) to perform the computations described above.
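The binary prevalence estimate and its asymptotic variance can be sketched in Python as below. This is an illustration under assumptions rather than the code used in the study: it takes the probability of a positive pool to be [1 - (1-p)^m] Se + (1-p)^m (1 - Sp), inverts that relation for p, and applies the delta method for the variance. The function names, the clamping of the intermediate ratio to [0, 1] and the sensitivity and specificity values in the example are all hypothetical.

```python
def positive_pool_prob(p, m, se, sp):
    """Probability that a pool of size m is classified positive."""
    q = (1.0 - p) ** m  # probability that the pool contains no infected individual
    return (1.0 - q) * se + q * (1.0 - sp)

def binary_estimate(k, n, m, se, sp):
    """Prevalence estimate from k positive pools out of n pools of size m."""
    ratio = (se - k / n) / (se + sp - 1.0)
    ratio = min(1.0, max(0.0, ratio))  # clamp added in this sketch to keep the estimate in [0, 1]
    return 1.0 - ratio ** (1.0 / m)

def binary_asymptotic_variance(p, n, m, se, sp):
    """Delta-method variance of the binary estimate at true prevalence p."""
    prob = positive_pool_prob(p, m, se, sp)
    return (1.0 - p) ** (2 - 2 * m) * prob * (1.0 - prob) / (n * m**2 * (se + sp - 1.0) ** 2)

# Hypothetical example: 1000 pools of size 10, sensitivity 0.98, specificity 0.99.
p_true, n_pools, m_size, se, sp = 0.02, 1000, 10, 0.98, 0.99
expected_positive = n_pools * positive_pool_prob(p_true, m_size, se, sp)
p_recovered = binary_estimate(expected_positive, n_pools, m_size, se, sp)
var_hat = binary_asymptotic_variance(p_true, n_pools, m_size, se, sp)
```

Feeding the expected number of positive pools back into the estimator recovers the prevalence used to generate it, which is a quick consistency check on the inversion.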
Figure 3: The derived pool size $m^*$ (optimal group size plotted against prevalence).

The other two binary
estimators explicitly incorporate the dilution effect, and hence are expected to outperform the simple binary estimator. Both estimators use the same OD cutoff $u$ and pool size $m$; the estimators' differentiating characteristics will be defined after we show how these two parameters are calculated. As with the s.b.e., $u$ and $m$ are chosen to minimize (42). However, the quantities $Se(m,u)$ and $Sp(m,u)$ in (42) are difficult to estimate for a testing procedure that deviates from the manufacturer's instructions; therefore, we develop approximate closed form expressions for these quantities in order to find the cost-minimizing values of $u$ and $m$. These expressions are obtained from the simplified pooling model of Wein and Zenios. This model assumes that the OD reading $X$ and the antibody concentration $Y$ are related via $\ln(\ldots) = \gamma \ln(y) + \epsilon$, where $\epsilon$ is a Gaussian random variable with mean zero and variance $\sigma^2$. Proposition 1 and the simplified pooling model lead to an asymptotic approximation for the conditional probability density of $\ln(\ldots)$, where $X^{(k,m)}$ is the OD reading for a pool of $m$ given that it contains $k$ infected individuals. This approximation can be used to derive the expressions:
Se(m,u)
u-7ln(^//+ +
=
l-(l-p)'
,2
1
(l-^K)
kal + {m-k)cri
(43)
Sp(m,
^.)
=$
'u
—
7ln(/x_)
^:^^:^
(44)
.
'a2
In (43)-(44), we set $\gamma$ equal to 1, and use the approach in Subsection 7.1 of Wein and Zenios to estimate $\sigma^2$ and modify $\mu_+$ and $\sigma_+$ so that a conservative estimate for the left tail of $\pi_+(y)$ is obtained. The resulting estimates are $\sigma = 0.42$, $\mu_+ = 2.732$ and $\sigma_+ = 1.3032$. The approximations (43) and (44) are substituted into equations (41) and (42) in steps 2 and 3, respectively, of the following algorithm:
Step 1. Set $m = 1$.
Step 2. Use steepest descent to obtain a cutoff $u(m)$ that achieves a local minimum of $\mathrm{Var}(\hat{p})$.
Step 3. If $F(m, u(m)) > F(m-1, u(m-1))$ or $m > 80$, stop; otherwise, set $m \leftarrow m + 1$ and return to Step 2.

Upon termination, this algorithm gives a locally optimal procedure that satisfies the constraint $m \le 80$.
Table 2 displays the resulting pool sizes and cutoffs for the seven prevalence
values that are considered in the simulation study of the next subsection.
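The three-step search can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation behind Table 2: the sensitivity and specificity functions `se` and `sp` are a hypothetical dilution model (a Gaussian reading whose positive mean drops by $\ln m$), the objective `objective` is an assumed stand-in for $F(m,u)$ (cost of one pool times its variance, with made-up cost parameters), and a coarse grid search replaces steepest descent in Step 2.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical dilution model (not the approximations (43)-(44)).
SIGMA, MU_POS = 0.4, 6.0


def sp(m, u):
    return normal_cdf(u / SIGMA)


def se(m, u):
    return 1.0 - normal_cdf((u - (MU_POS - math.log(m))) / SIGMA)


def variance_per_pool(p, m, u):
    """n * Var(p_hat): asymptotic variance contributed by a single pool."""
    r = se(m, u) + sp(m, u) - 1.0
    if r <= 0.0:
        return math.inf
    q = (1.0 - p) ** m
    f = (1.0 - q) * se(m, u) + q * (1.0 - sp(m, u))
    return (1.0 - p) ** (2 - 2 * m) * f * (1.0 - f) / (m ** 2 * r ** 2)


def objective(p, m, u, cost_pool=1.0, cost_sample=0.05):
    """Assumed stand-in for F(m, u): cost of one pool times its variance."""
    return (cost_pool + cost_sample * m) * variance_per_pool(p, m, u)


def best_cutoff(p, m):
    """Step 2, with a grid search standing in for steepest descent."""
    grid = [0.05 * i for i in range(1, 81)]
    u = min(grid, key=lambda c: variance_per_pool(p, m, c))
    return u, objective(p, m, u)


def search_pool_size(p, m_max=80):
    m = 1                                   # Step 1
    u, f_val = best_cutoff(p, m)            # Step 2 for the initial pool size
    while m < m_max:
        u_next, f_next = best_cutoff(p, m + 1)
        if f_next > f_val:                  # Step 3: stop once the objective increases
            break
        m, u, f_val = m + 1, u_next, f_next
    return m, u, f_val


print(search_pool_size(0.05))
```

Because the loop stops at the first increase of the objective, the returned pool size is only locally optimal, which mirrors the caveat stated for the algorithm above.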
Notice that the binary estimate $\hat{p}$ in (40) is a function of $Se(m,u)$ and $Sp(m,u)$, where $m$ and $u$ are derived from the three-step algorithm described above. Our two binary estimators differ by how they approximate $Se(m,u)$ and $Sp(m,u)$ in (40). The proposed binary
parametric estimator to the theoretical binary estimator, and finally we compare the p.p.e., the p.b.e., the s.b.e. and the individual estimator.

Experiment 1: A sample of 80,000 antibody concentrations is generated from the probability mixture $p\pi_+(y) + (1-p)\pi_-(y)$. The simulated blood samples are divided into $n = 1000$ pools of size $m = 80$, and the OD readings are generated using the conditional density (3). The p.p.e. and the theoretical parametric estimator are obtained under seven different scenarios of varying prevalence, and the experiment is repeated 400 times for every scenario. The simulation model was implemented in C, on a Sun Sparc Station 20.

For the p.p.e. $\hat{p}$, the estimator mean $E(\hat{p} \mid p)$ and the estimator variance $\mathrm{Var}(\hat{p} \mid p)$ are obtained from the simulation experiment. We also compute the relative bias

$$\mathrm{rbias}(\hat{p}) = \frac{E(\hat{p} \mid p) - p}{p}, \qquad (45)$$

the relative variance error

$$\mathrm{revar}(\hat{p}) = \frac{[\ldots] - \mathrm{Var}(\hat{p} \mid p)}{[\ldots]}, \qquad (46)$$

and the mean squared error (mse), $E[(\hat{p} - p)^2 \mid p]$. The analogous quantities are also determined for the theoretical parametric estimate.

The results are given in Table 3. Not surprisingly, the theoretical parametric estimator is unbiased. In contrast, the p.p.e. is biased, and tends to understate the true prevalence. The relative bias decreases as the prevalence increases. Also, the relative variance error is positive, which confirms that (31) provides an upper bound on $\mathrm{Var}(\hat{p} \mid p)$.

Experiment 2: Once again, 80,000 (different) simulated blood samples are generated for the seven scenarios in Table 2. The blood samples are divided into pools of size 80 to obtain the theoretical parametric estimator. In addition, the same blood samples are divided into pool sizes given in Table 2 to derive the theoretical binary estimator.
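The summary statistics used in these experiments (the relative bias and the mean squared error over repeated simulation runs) can be computed along the following lines. This is only a sketch: the helper names are hypothetical, and the toy replications use error-free individual testing rather than the pooled OD simulation, which was implemented in C.

```python
import random
import statistics


def relative_bias(estimates, p):
    """(mean of the estimates - p) / p, the relative bias of the estimator."""
    return (statistics.fmean(estimates) - p) / p


def mean_squared_error(estimates, p):
    """Average of (p_hat - p)^2 over the replications."""
    return statistics.fmean([(e - p) ** 2 for e in estimates])


def simulate_individual_estimates(p, n_samples, replications, seed=0):
    """Toy replications: fraction infected among n_samples error-free tests."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n_samples)) / n_samples
            for _ in range(replications)]


# 400 replications per scenario, as in the experiments; other values are illustrative.
estimates = simulate_individual_estimates(p=0.05, n_samples=1000, replications=400)
print(relative_bias(estimates, 0.05), mean_squared_error(estimates, 0.05))
```

For an unbiased estimator the mean squared error coincides with the variance, so in this toy setting it should be close to $p(1-p)/1000$.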
pool contains more than one infected individual; pools that contain more than one infected individual hide information on the true prevalence. In contrast, the theoretical parametric estimator has the flexibility to employ substantially larger pool sizes because it can still infer the total number of infected individuals in each pool from the OD reading. Therefore, the parametric estimator can adopt a substantially larger pool size and achieve the same variance as the binary procedure. This situation changes when the prevalence is low because then the optimal group size for the binary estimator is close to 80, and the group size for the parametric procedure is exactly 80; recall that we have introduced the restriction $m \le 80$. Hence, the upper bound restricts the flexibility of the parametric procedure and forces it to behave like the binary procedure.
This experiment confirms our intuition that the continuous OD readings can be used to produce very precise estimates. Unfortunately, the estimators in this experiment are hard to employ in practice. In the next experiment we use more practical estimators that contain several approximations. Although these estimators are expected to be biased, we will see that the bias is offset by their high efficiency, and the procedures that utilize the hierarchical pooling model produce very precise estimates.
Experiment 3: Once again, 80,000 (different) simulated blood samples are generated for the seven scenarios in Table 2. The blood samples are divided into pools of size 80 to obtain the proposed parametric estimator. The same blood samples are also divided into the pool sizes given in Table 2 to derive the proposed binary estimator. In addition, these blood samples are divided into the pool sizes dictated by the simple binary procedure in order to derive the s.b.e. Finally, the samples are also tested individually to calculate the individual estimator.

Table 5 gives the mean squared error (mse) for the four estimators. We observe that the mse for the simple and individual estimators are from 4 to 40 times larger than the mse
estimator and the proposed parametric estimator. A budget constraint is imposed such that, if individual testing is adopted, at most 1000 samples can be tested. The objective is to employ the procedure that will give the smallest mse.

Individual estimator: Here $n = 1000$ and $m = 1$. From the results of experiment 3 (not shown), the variance is expected to be $6.2 \times 10^{-5}$. Because the individual estimator is unbiased, this coincides with the mse.
Proposed binary estimator: From Table
n
constraint
1.00 X
=
^^^I'^^^^o)
10-^ and the mse
is
~
^65.
From Table
-
(0.04636
0.05)2
^
Proposed parametric estimator: Here
"
=
i.35+aM(80)
5.72 x
^
220.
From Table
10-^ and the mse
is
;^
-
O.OS)^
+
m =
20,
the variance
qO x 10"^
=
and to
is
satisfy the
(/lOO^f§P)^
2.30 x
budget
»^ =
10"^
m = 80, and to satisfy the budget constraint
the variance
6,
(0.04832
6,
2,
is
expected to be
5.72 x 10"^
=
{VTOO^^^f 5«^ =
8.54 x
10"^
Hence, the proposed parametric estimator has the smallest mse and is the preferred procedure.
8 Concluding Remarks
We have developed a parametric procedure to estimate the prevalence of HIV from pooled samples. Our approach is novel in that it captures the dilution effect and estimates the prevalence directly from the continuous OD readings. This procedure was developed specifically for the HIV estimation problem but is applicable whenever liquids or gases are pooled together and tested for estimation purposes; many applications can be found in industrial or environmental quality control.
The procedure was tested on simulated data that were generated from actual blood samples, and the results indicate that it is more accurate and roughly an order-of-magnitude more efficient than existing binary pooling procedures. As expected, the benefits from this procedure are larger when the prevalence is high. We also derived a new binary procedure that explicitly accounts for the dilution effect. When the prevalence is above 5%, our proposed parametric estimator was several times more efficient than the proposed binary estimator. However, the approximations embedded in our procedures appear to be more accurate in the binary setting than in the continuous OD setting, and when the prevalence is below 1% the proposed binary procedure performs better than the proposed parametric procedure.
In conclusion, the numerical results provide strong evidence supporting the adoption of our parametric pooled testing procedures for population surveys aimed at estimating HIV prevalence. The next step is to determine whether the conclusions of this paper are confirmed when the procedure is tested on real data.
Acknowledgement
This research is supported by National Science Foundation grant DDM-9057297 and American Foundation for AIDS Research (AmFAR) grant 02100-15-RG. The authors are grateful to Elizabeth Deix for providing the data, and John Karon and Glenn Satten for helpful discussions.
Appendix
A Convergence Proof
In this appendix, we show that the estimation algorithm in Section 3 is an Expectation-Maximization (EM) algorithm. Let us define the unobserved vector $z = (z_1, \ldots, z_n)$, where $z_i$ is the number of infected individuals in pool $i$, and view the raw data $x$ as the incomplete observations from the complete data set $(x, z)$. Hence, the complete log-likelihood is given by
(47)
1=1 fc=0
The expectation step of the EM algorithm calculates $E[l_c(x,z;p) \mid x, p^{(s)}]$. Because

$$E[\mathbf{1}\{z_i = k\} \mid x; p^{(s)}] = \tau_k(x_i; p^{(s)}), \qquad (48)$$

it follows from equation (47) that
m
n
E
[/e(x, z;p)|x,p(^)]
=
^^
rfc(x.;p(^))
r
log
p'(i-pr-N+iog(/r^(a:.))
t=l *:=0
(49)
The maximization step updates $p^{(s)}$ using

$$p^{(s+1)} = \arg\max_p E[l_c(x,z;p) \mid x, p^{(s)}]. \qquad (50)$$

The function $E[l_c(x,z;p) \mid x, p^{(s)}]$ is concave in $p$, and hence $p^{(s+1)}$ is the unique solution to the first-order optimality condition

$$\frac{dE[l_c(x,z;p) \mid x, p^{(s)}]}{dp} = 0, \qquad (51)$$

which, after some algebra, gives

$$p^{(s+1)} = \frac{1}{nm} \sum_{i=1}^{n} \sum_{k=0}^{m} k\, \tau_k(x_i; p^{(s)}). \qquad (52)$$

The equivalence of (52) and (11) establishes that the iterative procedure is the EM algorithm. To prove that the algorithm converges to a fixed point, we need the following fact.

Theorem 2 (Dempster et al.): If the incomplete log-likelihood is bounded from above and

$$E[l_c(x,z;p^{(s+1)}) \mid x, p^{(s)}] - E[l_c(x,z;p^{(s)}) \mid x, p^{(s)}] \ge \lambda\, (p^{(s+1)} - p^{(s)})^2 \qquad (53)$$

for some scalar $\lambda > 0$ and all $p^{(s)}$, then the sequence $p^{(s)}$ converges to some $p^*$ in $[0, 1]$.
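The fixed-point iteration in (52) can be sketched as below. The posterior weights $\tau_k$ are computed by Bayes' rule from a binomial prior on the number of infected samples in a pool and a pool-level reading density. That density, `toy_density`, is a purely hypothetical Gaussian centred at $k$, used only so that the example runs; it is not the OD model of this paper, and the function names and tolerances are likewise assumptions.

```python
import math
import random


def posterior_weights(x, p, m, density):
    """tau_k(x; p): posterior probability of k infected samples in a pool of size m."""
    w = [math.comb(m, k) * p ** k * (1.0 - p) ** (m - k) * density(x, k)
         for k in range(m + 1)]
    total = sum(w)
    return [v / total for v in w]


def em_prevalence(xs, m, density, p0=0.1, tol=1e-10, max_iter=500):
    """Iterate p <- (1 / (n m)) * sum_i sum_k k * tau_k(x_i; p) until it stabilizes."""
    p = p0
    for _ in range(max_iter):
        expected_infected = sum(
            sum(k * t for k, t in enumerate(posterior_weights(x, p, m, density)))
            for x in xs)
        p_new = expected_infected / (len(xs) * m)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p


def toy_density(x, k, sd=0.3):
    """Hypothetical reading density: Gaussian centred at the infected count k."""
    return math.exp(-0.5 * ((x - k) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


# Toy data: 2000 pools of size 5 at prevalence 0.2, readings = count + Gaussian noise.
rng = random.Random(0)
m, true_p = 5, 0.2
counts = [sum(rng.random() < true_p for _ in range(m)) for _ in range(2000)]
readings = [rng.gauss(z, 0.3) for z in counts]
print(em_prevalence(readings, m, toy_density))
```

When the readings carry no information about $k$ (a constant density), the weights reduce to the binomial probabilities and the update returns the starting value unchanged, which is a quick way to check the implementation.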
To show that $L(x;p)$ is bounded from above in our problem, we define

$$M = \sup_y \{\max(\pi_+(y), \pi_-(y))\}.$$
Then $\pi^{(k,m)}(y) \le M^m$ and from (1), $\pi^{(m)}(y) \le mM^m$. Thus, it follows from (6) that

$$L(x;p) \le [mM^m]^n. \qquad (54)$$

To establish (53), we use (49) to obtain
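$$E[l_c(x,z;p^{(s+1)}) \mid x, p^{(s)}] - E[l_c(x,z;p^{(s)}) \mid x, p^{(s)}] = \sum_{i=1}^{n} \sum_{k=0}^{m} \tau_k(x_i; p^{(s)}) \left(k \log p^{(s+1)} + (m-k) \log(1 - p^{(s+1)})\right) - \sum_{i=1}^{n} \sum_{k=0}^{m} \tau_k(x_i; p^{(s)}) \left(k \log p^{(s)} + (m-k) \log(1 - p^{(s)})\right). \qquad (55)$$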
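Since $\sum_{i=1}^{n} \sum_{k=0}^{m} k\, \tau_k(x_i; p^{(s)}) = nm\, p^{(s+1)}$, the right side of (55) reduces to

$$nm \left[ p^{(s+1)} \log\left(\frac{p^{(s+1)}}{p^{(s)}}\right) + (1 - p^{(s+1)}) \log\left(\frac{1 - p^{(s+1)}}{1 - p^{(s)}}\right) \right]. \qquad (56)$$

To complete the convergence proof, we need the following lemma.

Lemma 1 For all $(x, y) \in [0, 1]^2$,

$$x \log\left(\frac{x}{y}\right) + (1 - x) \log\left(\frac{1 - x}{1 - y}\right) \ge (x - y)^2. \qquad (57)$$

Proof: Fix $y \in (0, 1)$ and define $h(x) = x \log\left(\frac{x}{y}\right) + (1 - x) \log\left(\frac{1 - x}{1 - y}\right) - (x - y)^2$. Setting $h'(x) = 0$ gives

$$\log\left(\frac{x}{y}\right) - \log\left(\frac{1 - x}{1 - y}\right) - 2(x - y) = 0, \qquad (58)$$

which is satisfied by $x = y$. Also, $h''(x) = \frac{1}{x(1-x)} - 2 > 0$. Therefore, $h(x)$ is convex and is minimized when $x = y$. Because $h(y) = 0$, it follows that

$$x \log\left(\frac{x}{y}\right) + (1 - x) \log\left(\frac{1 - x}{1 - y}\right) \ge (x - y)^2 \quad \text{for all } x \in (0, 1). \qquad (59)$$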
Because y
is
arbitrary, relation (59) holds for every
to the closed interval
Lemma
E
1
and
[0, 1]
j/
in (0, 1).
The
(56) imply that
[/,(x,z;p(^+i))|x,p(^']
-E
[/c(x,z;p('))|x,p(^^]
Theorem
2 hold
>
nm{p^'^
- p'^'^^^f
for all
and the algorithm converges to a
Partial Derivative ^[x\y)
Define the constants
A =
47(x-l)2,
B = [47(x-l-0) +
C =
(40
D =
3;(47x
E =
47x^
F =
-87x2
(l-7)0](x-l),
-87x)(x-l) + 02(^-2),
-
+
40(1
(87
+
27)),
-40)x + 02(^^2).
Then
dy^
result extends
by continuity.
Therefore, the conditions of
B
x and
40^y''(l +7/T)
yT
\
34
y-^T
/
p*
.
(60)
fixed point
References
Barndorff-Nielsen, O.E. and Cox, D.R. (1989), Asymptotic Techniques for Use in Statistics, Chapman and Hall: London.

Becker, N.G., Watson, L.F. and Carlin, J.B. (1991), "A method of non-parametric back-projection and its application to AIDS data", Statistics in Medicine 10, 1527-1542.

Behets, F., Bertozzi, S., Kasali, M., Kashamuka, M., Atikala, L., Brown, C., Ryder, R.W. and Quinn, C. (1990), "Successful use of pooled sera to determine HIV-1 seroprevalence in Zaire with development of cost-efficiency models", AIDS 4, 737-741.

Billingsley, P. (1987), Probability and Measure, Wiley: New York.

Cahoon-Young, B. (1992), "Optimal pool size for determination of HIV prevalence in low risk populations", presented at the HIV/AIDS Surveillance Workshop, South San Francisco, CA.

Chen, C.L. and Swallow, W.H. (1990), "Using group testing to estimate a proportion, and to test the binomial model", Biometrics 46, 1035-1046.

Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977), "Maximum likelihood from incomplete data via the EM algorithm" (with Discussion), Journal of the Royal Statistical Society, Series B 39, 1-22.

Dorfman, R. (1943), "The detection of defective members of large populations", Annals of Mathematical Statistics 14, 436-441.

Gastwirth, J.L. and Hammick, P.A. (1989), "Estimation of the prevalence of a rare disease preserving the anonymity of the subjects by group testing: Application to estimating the prevalence of AIDS antibodies in blood donors", Journal of Statistical Planning and Inference 22, 15-27.

Gill, O.N., Adler, M.W. and Day, N.E. (1989), "Monitoring the prevalence of HIV", British Medical Journal 299, 1295-1299.

Hastie, T.J. and Pregibon, D. (1992), "Generalized Linear Models", in Statistical Models in S, eds. J.M. Chambers and T.J. Hastie, Wadsworth & Brooks/Cole: California, pp. 195-247.

Johnson, N.L., Kotz, S. and Wu, X. (1991), Inspection Errors for Attributes in Quality Control, Chapman and Hall: London.

Lovison, G., Gore, S.D. and Patil, G.P. (1994), "Design and analysis of composite sampling procedures: A review", in Handbook of Statistics, Vol. 12, eds. G.P. Patil and C.R. Rao, Elsevier Science: Amsterdam, pp. 103-166.

Sobel, M. and Elashoff, R.M. (1975), "Group testing with a new goal, estimation", Biometrika 62, 181-193.

Tu, X.M., Litvak, E. and Pagano, M. (1994), "Screening tests: Can we get more by doing less?", Statistics in Medicine 13, 1905-1919.

Wein, L.M. and Zenios, S.A. (1995), "Pooled testing for HIV screening: Capturing the dilution effect", Working Paper, Sloan School of Management, MIT, Cambridge, MA.

Zenios, S.A., "Health Care Applications of Stochastic Optimal Control", unpublished Ph.D. dissertation, Operations Research Center, MIT, Cambridge, MA, in preparation.