Sequential probability ratio tests based on grouped observations

Karl-Heinz Eger
Chemnitz University of Technology
Chemnitz, Germany
eger@mathematik.tu-chemnitz.de
Evgeny Borisovich Tsoy
Novosibirsk State Technical University
Novosibirsk, Russia
ebcoi@nstu.ru
Abstract - This paper deals with sequential likelihood ratio tests based on grouped observations. It is demonstrated that the method of conjugated parameter pairs known from the non-grouped case can be extended to the grouped case, yielding Wald-like approximations for the OC- and ASN-function. For near hypotheses, so-called F-optimal groupings are recommended. As an example, an SPRT based on grouped observations for the parameter of an exponentially distributed random variable is considered.¹
Index Terms - Hypotheses testing, probability ratio test, classified observations, grouped observations, sequential test, sequential analysis.

1 Introduction

Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of i.i.d. random variables with a density function $f_\theta(x)$, $\theta \in \Theta$, with respect to some measure $\mu$. Our aim is to discriminate between two simple hypotheses

$$H_0: \theta = \theta_0 \quad \text{and} \quad H_1: \theta = \theta_1, \quad \theta_0 \neq \theta_1, \tag{1}$$

by means of a sequential probability ratio test (SPRT).

In this context we suppose that the random variables $\{X_i\}_{i=1}^{\infty}$ can be observed only in a restricted manner as follows. Let $G$ be a partition of the domain $\mathcal{X}$ of the random variables $\{X_i\}_{i=1}^{\infty}$ into disjoint subsets $\mathcal{X}_1, \dots, \mathcal{X}_m$, $m \ge 2$, such that on each observation stage $i$, $i = 1, 2, \dots$, instead of $X_i$ only a random variable $X_i^G$ can be observed, defined by

$$X_i^G = k \iff X_i \in \mathcal{X}_k, \quad k = 1, \dots, m,$$

with

$$p^G_\theta(k) = P_\theta(X_i^G = k) = P_\theta(X_i \in \mathcal{X}_k) > 0, \quad \theta \in \Theta.$$

That means, instead of a specific measured value we observe only the corresponding group number, and we have a so-called grouped or classified observation scheme. The partition $G$ of $\mathcal{X}$ is called a grouping.

In the following we consider a Wald SPRT for the hypotheses (1) based on observations of the random variables $\{X_i^G\}_{i=1}^{\infty}$. This test is defined as follows (see, e.g., [5], [3] or [4]). Let

$$L^G_{n,\theta_0,\theta_1} = \prod_{i=1}^{n} \frac{p^G_{\theta_1}(X_i^G)}{p^G_{\theta_0}(X_i^G)}$$

be for $n = 1, 2, \dots$ the likelihood ratio of the sample $\vec{X}^G_n = (X_1^G, \dots, X_n^G)$. Then, for given stopping bounds $0 < B < 1 < A < \infty$, the sample size $N_G$ and the terminal decision rule $\delta_G$ of a Wald SPRT for the hypotheses (1) are defined as follows:

$$N_G = \inf\{n \ge 1 : L^G_{n,\theta_0,\theta_1} \notin (B, A)\}$$

and

$$\delta_G = \mathbf{1}\{L^G_{N_G,\theta_0,\theta_1} \le B\}.$$

That means, we continue the observations for $n = 1, 2, \dots$ as long as the critical inequality $B < L^G_{n,\theta_0,\theta_1} < A$ holds. If on observation stage $n$ for the first time $L^G_{n,\theta_0,\theta_1} \notin (B, A)$, and if then $L^G_{n,\theta_0,\theta_1} \le B$ or $L^G_{n,\theta_0,\theta_1} \ge A$ holds, we accept the hypothesis $H_0$ or $H_1$, respectively. We denote this SPRT by $S_G(B, A)$.

The most important characteristics for evaluating the statistical properties of our test are the operating characteristic function (OC-function) $Q_G(\theta) = E_\theta \delta_G$, $\theta \in \Theta$, and the average sample number function (ASN-function) $E_\theta N_G$, $\theta \in \Theta$.

If $P_\theta(L^G_{1,\theta_0,\theta_1} = 1) < 1$ then we have $P_\theta(N_G < \infty) = 1$ and $E_\theta N_G < \infty$. Moreover, the Wald-Wolfowitz theorem holds. That means, the test $S_G(B, A)$ minimises the average sample number function for $\theta = \theta_0$ and $\theta = \theta_1$ among all tests whose error probabilities are not greater than the error probabilities of Wald's SPRT at $\theta = \theta_0$ and $\theta = \theta_1$.

The general problem of Wald's SPRT consists in the computation of its characteristics, e.g., the OC-function or the ASN-function. This especially holds for the grouped case considered here. We will demonstrate that the so-called method of conjugated parameter pairs known from the non-grouped case (see [3]) can be extended to the grouped case, yielding Wald-like approximations for the OC- and ASN-function. Moreover we will discuss some possibilities for the determination of optimal groupings. In this context the Fisher information and the so-called F-optimal groupings will play an important part. As an example we consider an SPRT based on grouped observations for the parameter of an exponential distribution and present corresponding F-optimal groupings.

¹ Proceedings of 'The Second International Forum on Strategic Technology - IFOST 2007', 3-5 October 2007, Ulaanbaatar, Mongolia, pp. 284-287.
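The stopping rule of the test $S_G(B, A)$ described in the Introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grouped_sprt`, the representation of the group probabilities as lists `p0` and `p1`, and the two-group example values are hypothetical and not taken from the paper. The likelihood ratio is accumulated on the log scale, which is equivalent to checking $B < L^G_{n,\theta_0,\theta_1} < A$.

```python
import math

def grouped_sprt(observations, p0, p1, B, A):
    """Run the SPRT S_G(B, A) on a stream of group numbers k in {1, ..., m}.

    p0[k-1] and p1[k-1] are the group probabilities under H0 and H1.
    Returns (n, delta): the stopping stage n and delta = 1 if H0 is
    accepted, delta = 0 if H1 is accepted. If the stream ends before a
    decision is reached, (n, None) is returned.
    """
    if not (0.0 < B < 1.0 < A):
        raise ValueError("stopping bounds must satisfy 0 < B < 1 < A")
    # Log-likelihood-ratio increment contributed by each group.
    z = [math.log(q1 / q0) for q0, q1 in zip(p0, p1)]
    log_b, log_a = math.log(B), math.log(A)
    log_lr = 0.0
    n = 0
    for n, k in enumerate(observations, start=1):
        log_lr += z[k - 1]
        if log_lr <= log_b:   # L <= B: accept H0
            return n, 1
        if log_lr >= log_a:   # L >= A: accept H1
            return n, 0
    return n, None

# Hypothetical two-group example (m = 2), chosen only for illustration.
p0 = [0.5, 0.5]
p1 = [0.25, 0.75]
print(grouped_sprt([1] * 10, p0, p1, B=0.1, A=10.0))  # only group 1 observed
print(grouped_sprt([2] * 10, p0, p1, B=0.1, A=10.0))  # only group 2 observed
```

Working with the sum of log ratios instead of the product avoids numerical underflow for long observation sequences; the decision itself is unchanged.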
2 The Wald approximations

The OC- and ASN-function of the test $S_G(B, A)$ can be computed approximately in the sense of the so-called Wald approximations by means of conjugated parameter pairs as follows [3].

Definition. Two parameter pairs $(\theta', \theta'')$ and $(\theta_0, \theta_1) \in \Theta \times \Theta$ are said to be conjugated if a real number $h$, $h \neq 0$, exists such that

$$L^G_{n,\theta',\theta''} = \left(L^G_{n,\theta_0,\theta_1}\right)^h, \quad n = 1, 2, \dots,$$

holds. We write: $(\theta', \theta'') \overset{h}{\sim} (\theta_0, \theta_1)$.

If $(\theta', \theta'') \overset{h}{\sim} (\theta_0, \theta_1)$, the OC-function $Q_G(\theta)$ and the power function $M_G(\theta) = E_\theta(1 - \delta_G)$, $\theta \in \Theta$, of the test $S_G(B, A)$ satisfy the relations

$$\frac{Q_G(\theta'')}{Q_G(\theta')} = E_{\theta'}\left(\left(L^G_{N_G,\theta_0,\theta_1}\right)^h \mid H_0 \text{ is acc.}\right) \le B^h \tag{2}$$

and

$$\frac{M_G(\theta'')}{M_G(\theta')} = E_{\theta'}\left(\left(L^G_{N_G,\theta_0,\theta_1}\right)^h \mid H_1 \text{ is acc.}\right) \ge A^h, \tag{3}$$

where in case of

$$P_{\theta'}\left(L^G_{N_G,\theta_0,\theta_1} = B \mid H_0 \text{ is accepted}\right) = P_{\theta'}\left(L^G_{N_G,\theta_0,\theta_1} = A \mid H_1 \text{ is accepted}\right) = 1 \tag{4}$$

the equals signs hold. We remark that in case of $P_\theta(N_G < \infty) = 1$ (closed test) moreover $M_G(\theta) = 1 - Q_G(\theta)$ holds. A sufficient condition for closedness is, for instance, $P_\theta(L^G_{1,\theta_0,\theta_1} = 1) < 1$.

2.1 The OC-function

For a closed test $S_G(B, A)$ we get under the condition (4) and $(\theta', \theta'') \overset{h}{\sim} (\theta_0, \theta_1)$ by (2) and (3) for the OC-function

$$Q_G(\theta') = Q^*_G(\theta') = \frac{A^h - 1}{A^h - B^h}$$

and

$$Q_G(\theta'') = Q^*_G(\theta'') = B^h Q^*_G(\theta').$$

If condition (4) holds approximately, that means the excess over the stopping bounds is negligible, then we have $Q_G(\theta') \approx Q^*_G(\theta')$ and $Q_G(\theta'') \approx Q^*_G(\theta'') = B^h Q^*_G(\theta')$. These are the famous Wald approximations for the OC-function. If for a given $\theta'$ an $h \neq 0$ and a $\theta'' \neq \theta'$ do not exist such that $(\theta', \theta'') \overset{h}{\sim} (\theta_0, \theta_1)$, e.g., in case of $E_{\theta'} Z^G_{1,\theta_0,\theta_1} = E_{\theta'} \ln L^G_{1,\theta_0,\theta_1} = 0$, we can extend the Wald approximation for the OC-function by

$$Q_G(\theta') \approx Q^*_G(\theta') = \frac{\ln A}{\ln A - \ln B} \quad \text{for } E_{\theta'} Z^G_{1,\theta_0,\theta_1} = 0.$$

2.2 The stopping bounds

Under the condition (4) we obtain a test $S_G(B, A)$ at size $(\alpha, \beta)$, $0 < \alpha, \beta < 1$, $\alpha + \beta < 1$, that means $Q_G(\theta_0) = 1 - \alpha$ and $Q_G(\theta_1) = \beta$, if the stopping bounds $B$ and $A$ satisfy the condition

$$B = B^* = \frac{\beta}{1 - \alpha} \quad \text{and} \quad A = A^* = \frac{1 - \beta}{\alpha}. \tag{5}$$

The values $B^*$ and $A^*$ are the so-called Wald approximations for the stopping bounds.

A sufficient condition for an admissible test for the hypotheses (1) at size $(\alpha, \beta)$ is $B = \beta$ and $A = 1/\alpha$. Then we have $Q_G(\theta_0) \ge 1 - \alpha$ and $Q_G(\theta_1) \le \beta$.

2.3 The ASN-function

By means of the moment equation $E_\theta Z^G_{N_G,\theta_0,\theta_1} = E_\theta N_G \cdot E_\theta Z^G_{1,\theta_0,\theta_1}$, which holds for our tests if, e.g., $P_\theta(L^G_{1,\theta_0,\theta_1} = 1) < 1$, we get in case of $E_\theta Z^G_{1,\theta_0,\theta_1} \neq 0$ for the average sample number

$$E_\theta N_G = \frac{E_\theta\left(Z^G_{N_G,\theta_0,\theta_1} \mid H_0 \text{ is acc.}\right) Q_G(\theta) + E_\theta\left(Z^G_{N_G,\theta_0,\theta_1} \mid H_1 \text{ is acc.}\right) (1 - Q_G(\theta))}{E_\theta Z^G_{1,\theta_0,\theta_1}}.$$

If we again assume that condition (4) holds approximately, we obtain the so-called Wald approximation $E^*_\theta N_G$ for the average sample number $E_\theta N_G$:

$$E_\theta N_G \approx E^*_\theta N_G = \frac{\ln B \cdot Q^*_G(\theta) + \ln A \cdot (1 - Q^*_G(\theta))}{E_\theta Z^G_{1,\theta_0,\theta_1}}. \tag{6}$$

In case of $E_\theta Z^G_{1,\theta_0,\theta_1} = 0$ we get by means of the moment equation $E_\theta (Z^G_{N_G,\theta_0,\theta_1})^2 = E_\theta N_G \cdot E_\theta (Z^G_{1,\theta_0,\theta_1})^2$ analogously the approximation

$$E_\theta N_G \approx E^*_\theta N_G = \frac{-\ln B \, \ln A}{E_\theta (Z^G_{1,\theta_0,\theta_1})^2}.$$
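The Wald approximations of this section (stopping bounds (5), OC-function, ASN-function (6)) are closed-form expressions once the exponent $h$ is known. The following Python sketch evaluates them. It assumes that $h$ is obtained as the non-trivial root of $E_{\theta'} \exp(h Z^G_{1,\theta_0,\theta_1}) = 1$, the criterion discussed in Section 2.4, and finds that root by simple bisection. All function names and the two-group probabilities in the example are hypothetical and serve only as an illustration.

```python
import math

def wald_bounds(alpha, beta):
    """Wald approximations (5): B* = beta/(1-alpha), A* = (1-beta)/alpha."""
    return beta / (1.0 - alpha), (1.0 - beta) / alpha

def solve_h(p, p0, p1, iters=200):
    """Non-trivial root h of sum_k p[k] * (p1[k]/p0[k])**h = 1 (assumes sum(p) = 1).

    p holds the group probabilities under theta', p0 and p1 those under
    theta_0 and theta_1. Returns 0.0 if the mean increment is (numerically) zero.
    """
    z = [math.log(b / a) for a, b in zip(p0, p1)]
    ez = sum(pk * zk for pk, zk in zip(p, z))
    if abs(ez) < 1e-9:
        return 0.0
    sign = 1.0 if ez < 0.0 else -1.0  # root is positive iff E Z < 0

    def phi_minus_one(t):
        # expm1 keeps precision when the root is close to zero
        return sum(pk * math.expm1(sign * t * zk) for pk, zk in zip(p, z))

    hi = 1.0
    while phi_minus_one(hi) < 0.0:    # expand until the bracket contains the root
        hi *= 2.0
    lo = hi
    while phi_minus_one(lo) >= 0.0:   # shrink until phi(lo) < 1
        lo *= 0.5
        if lo < 1e-15:
            return 0.0                # indistinguishable from the trivial root
    for _ in range(iters):            # plain bisection
        mid = 0.5 * (lo + hi)
        if phi_minus_one(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)

def oc_wald(h, B, A):
    """Wald approximation Q* of the OC-function for exponent h."""
    if abs(h) < 1e-8:                 # limiting case E Z = 0
        return math.log(A) / (math.log(A) - math.log(B))
    return (A ** h - 1.0) / (A ** h - B ** h)

def asn_wald(q, B, A, ez, ez2):
    """Wald approximation (6) of the ASN; uses E Z^2 if E Z is zero."""
    if abs(ez) < 1e-9:
        return -math.log(B) * math.log(A) / ez2
    return (math.log(B) * q + math.log(A) * (1.0 - q)) / ez

# Hypothetical two-group example, evaluated at theta' = theta_0.
p0, p1 = [0.5, 0.5], [0.25, 0.75]
B, A = wald_bounds(0.05, 0.05)
z = [math.log(b / a) for a, b in zip(p0, p1)]
ez = sum(pk * zk for pk, zk in zip(p0, z))
ez2 = sum(pk * zk * zk for pk, zk in zip(p0, z))
h = solve_h(p0, p0, p1)
q = oc_wald(h, B, A)
print(B, A, h, q, asn_wald(q, B, A, ez, ez2))
```

At $\theta' = \theta_0$ the root is $h = 1$ and at $\theta' = \theta_1$ it is $h = -1$, so the sketch reproduces $Q^*_G(\theta_0) = 1 - \alpha$ and $Q^*_G(\theta_1) = \beta$ for the bounds (5), which is a convenient sanity check.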
2.4 Conjugated parameter pairs

According to our definition of conjugated parameter pairs we have in the i.i.d. case the following criterion. It holds $(\theta', \theta'') \overset{h}{\sim} (\theta_0, \theta_1)$ if for a given parameter value $\theta'$ a real number $h \neq 0$ and a parameter value $\theta'' \neq \theta'$ exist such that

$$\frac{p^G_{\theta''}(x)}{p^G_{\theta'}(x)} = \left(\frac{p^G_{\theta_1}(x)}{p^G_{\theta_0}(x)}\right)^h \tag{7}$$

holds for $x \in \{1, \dots, m\}$. Hence, a necessary existence condition for conjugated parameter pairs is that the function $q^G_{\theta'}(x)$ defined by

$$q^G_{\theta'}(x) = \left(\frac{p^G_{\theta_1}(x)}{p^G_{\theta_0}(x)}\right)^h p^G_{\theta'}(x), \quad x \in \{1, \dots, m\},$$

is a probability mass function. Because of $q^G_{\theta'}(x) \ge 0$ for $x \in \{1, \dots, m\}$ we can compute a value $h$, $-\infty < h < \infty$, such that

$$\varphi_{\theta'}(h) = \sum_{x=1}^{m} q^G_{\theta'}(x) = \sum_{x=1}^{m} \left(\frac{p^G_{\theta_1}(x)}{p^G_{\theta_0}(x)}\right)^h p^G_{\theta'}(x) = E_{\theta'} e^{h Z^G_{1,\theta_0,\theta_1}} = 1$$

holds. The function $\varphi_{\theta'}(h)$ is, as a function of $h$, $-\infty < h < \infty$, the moment-generating function of the random variable $Z^G_{1,\theta_0,\theta_1} = \ln L^G_{1,\theta_0,\theta_1}$. It holds $\varphi_{\theta'}(0) = 1$, $\lim_{h \to \pm\infty} \varphi_{\theta'}(h) = \infty$, $\varphi'_{\theta'}(0) = E_{\theta'} Z^G_{1,\theta_0,\theta_1}$ as well as $\varphi''_{\theta'}(h) = E_{\theta'} (Z^G_{1,\theta_0,\theta_1})^2 e^{h Z^G_{1,\theta_0,\theta_1}} > 0$. This means that $\varphi_{\theta'}(h)$ is a convex function in $h$. Hence, in case of $E_{\theta'} Z^G_{1,\theta_0,\theta_1} < 0$ and $E_{\theta'} Z^G_{1,\theta_0,\theta_1} > 0$ the equation $\varphi_{\theta'}(h) = 1$ always has, besides the trivial solution $h = 0$, a unique solution $h > 0$ and $h < 0$, respectively.

The case $m = 2$: In this case our test becomes an SPRT for discriminating between two probabilities. Then, in case of $E_{\theta'} Z^G_{1,\theta_0,\theta_1} \neq 0$, besides the solution $h \neq 0$ of $\varphi_{\theta'}(h) = 1$ there always exists a parameter value $\theta'' \neq \theta'$ such that condition (7) holds for $x \in \{1, 2\}$. This implies $(\theta', \theta'') \overset{h}{\sim} (\theta_0, \theta_1)$ and we obtain the usual Wald approximations for the OC- and ASN-function.

The case $m > 2$: Examples show that, as a rule, for a given solution $h \neq 0$ of $\varphi_{\theta'}(h) = 1$ there does not exist a parameter value $\theta''$ such that $p^G_{\theta''}(x) = q^G_{\theta'}(x)$, $x \in \{1, \dots, m\}$. However, we can always find a value $\theta''$ such that this relation holds approximately. Hence we have then

$$\frac{p^G_{\theta''}(x)}{p^G_{\theta'}(x)} \approx \left(\frac{p^G_{\theta_1}(x)}{p^G_{\theta_0}(x)}\right)^h, \quad x \in \{1, \dots, m\},$$

and in the sense of this approximation we can compute corresponding modified Wald approximations for the OC- and ASN-function if we now use the left-hand approximation for the OC-function in Section 2.1:

$$Q_G(\theta') \approx Q^*_G(\theta') = \frac{A^h - 1}{A^h - B^h}.$$

An explicit determination of the parameter value $\theta''$ is not necessary then.

3 Optimal groupings

The Wald approximations for the ASN-function provide hints as to how a grouping influences the average sample size. While the numerator in (6) is independent of the grouping $G$, the denominator depends on $G$ via the expectation value $E_\theta Z^G_{1,\theta_0,\theta_1}$. This dependence can be used for optimising the average sample number by means of an appropriate grouping for a given parameter value $\theta$.

In particular, we get for our test correspondingly small average sample sizes for $\theta_0$, $\theta_1$ or $\theta^*$ in the sense of the Wald approximations if a grouping $G$ is chosen which maximises $|E_{\theta_0} Z^G_{1,\theta_0,\theta_1}|$, $E_{\theta_1} Z^G_{1,\theta_0,\theta_1}$ or, in case of $E_{\theta^*} Z^G_{1,\theta_0,\theta_1} = 0$, the expectation value $E_{\theta^*} (Z^G_{1,\theta_0,\theta_1})^2$ with respect to $G$, respectively.

An interesting case is that of near hypotheses: If $\Delta\theta = |\theta_1 - \theta_0|$ is small, then it can be shown that

$$E_{\theta_0} Z^G_{1,\theta_0,\theta_1} \approx -\tfrac{1}{2} I^G_F(\theta_0) (\Delta\theta)^2, \quad E_{\theta^*} (Z^G_{1,\theta_0,\theta_1})^2 \approx I^G_F(\theta^*) (\Delta\theta)^2 \quad \text{and} \quad E_{\theta_1} Z^G_{1,\theta_0,\theta_1} \approx \tfrac{1}{2} I^G_F(\theta_1) (\Delta\theta)^2$$

holds for $\Delta\theta \to 0$, where

$$I^G_F(\theta) = E_\theta \left(\frac{\partial \ln p^G_\theta(X_1^G)}{\partial \theta}\right)^2$$

denotes the Fisher information of a single observation of the random variable $X_1^G$ depending on the parameter value $\theta$. This underlines the importance of the Fisher information with respect to optimal groupings in this context.

Definition. Let $G = \{\mathcal{X}_1, \dots, \mathcal{X}_m\}$ be an interval grouping such that $\mathcal{X}_i = [x^*_{i-1}, x^*_i)$ for $i = 1, \dots, m$ and $\inf_{x \in \mathcal{X}} x = x^*_0 < x^*_1 < \cdots < x^*_m = \sup_{x \in \mathcal{X}} x$ holds. An interval grouping $G_0$ is said to be F-optimal for $\theta$ if

$$I^{G_0}_F(\theta) = \max_{x^*_1, \dots, x^*_{m-1}} I^G_F(\theta)$$

holds.

The F-efficiency of such a grouping can be measured by the ratio $F_{\mathrm{eff}}(G, \theta) = I^G_F(\theta) / I_F(\theta)$, where

$$I_F(\theta) = E_\theta \left(\frac{\partial \ln f_\theta(X_1)}{\partial \theta}\right)^2$$

denotes the Fisher information of a non-grouped observation of the random variable $X_1$. It holds $0 \le F_{\mathrm{eff}}(G, \theta) \le 1$. Numerical studies show that F-optimal groupings are quite robust with respect to their efficiencies against modifications of the group bounds. For instance, a simplification of the group bounds by moderate rounding does not lead to a significant loss of F-efficiency or discrimination information.

Table 1 F-optimal group bounds of interval groupings $G_m$ for $\theta = 1$, $m = 2, \dots, 9$, and F-efficiencies $F_{\mathrm{eff}}(G_m, \theta)$ [%], exponential distribution.

m    x*_1     x*_2     x*_3     x*_4     x*_5     x*_6     x*_7     x*_8     F_eff [%]
2    1.5936   -        -        -        -        -        -        -        64.76
3    1.0176   2.6112   -        -        -        -        -        -        82.03
4    0.7540   1.7716   3.3652   -        -        -        -        -        89.10
5    0.6004   1.3545   2.3720   3.9657   -        -        -        -        92.69
6    0.4993   1.0997   1.8538   2.8714   4.4650   -        -        -        94.76
7    0.4276   0.9269   1.5273   2.2813   3.2989   4.8925   -        -        96.06
8    0.3739   0.8015   1.3008   1.9012   2.6553   3.6729   5.2665   -        96.93
9    0.3323   0.7062   1.1338   1.6331   2.2336   2.9876   4.0052   5.5988   97.54
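The F-efficiencies in Table 1 can be checked numerically. The sketch below assumes the exponential density $f_\theta(x) = \theta e^{-\theta x}$, $x \ge 0$, so that $p^G_\theta(k) = e^{-\theta x^*_{k-1}} - e^{-\theta x^*_k}$ and $I_F(\theta) = 1/\theta^2$, and evaluates $I^G_F(\theta) = \sum_k (\partial p^G_\theta(k)/\partial\theta)^2 / p^G_\theta(k)$ directly. The function names are hypothetical, and the ternary search for $m = 2$ is merely one simple way to locate the maximiser, not necessarily the method used to compute the table.

```python
import math

def grouped_fisher_information(theta, bounds):
    """Fisher information of one grouped Exp(theta) observation.

    bounds are the inner group bounds x*_1 < ... < x*_{m-1}; the groups
    are [0, x*_1), [x*_1, x*_2), ..., [x*_{m-1}, infinity).
    """
    edges = [0.0] + list(bounds) + [math.inf]

    def surv(x):    # P_theta(X >= x)
        return 0.0 if math.isinf(x) else math.exp(-theta * x)

    def dsurv(x):   # derivative of P_theta(X >= x) with respect to theta
        return 0.0 if math.isinf(x) else -x * math.exp(-theta * x)

    info = 0.0
    for a, b in zip(edges, edges[1:]):
        p = surv(a) - surv(b)       # group probability
        dp = dsurv(a) - dsurv(b)    # its derivative with respect to theta
        info += dp * dp / p
    return info

def f_efficiency(theta, bounds):
    """F_eff(G, theta) = I_F^G(theta) / I_F(theta) with I_F(theta) = 1/theta^2."""
    return grouped_fisher_information(theta, bounds) * theta ** 2

def best_single_bound(theta, lo=0.1, hi=5.0, iters=200):
    """Ternary search for the F-optimal bound when m = 2.

    Assumes the efficiency is unimodal on [lo, hi], which holds for
    theta = 1 where it equals c**2 / (exp(c) - 1).
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f_efficiency(theta, [m1]) < f_efficiency(theta, [m2]):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# The results should be close to the Table 1 entries for m = 2 and m = 3.
print(100.0 * f_efficiency(1.0, [1.5936]))
print(100.0 * f_efficiency(1.0, [1.0176, 2.6112]))
print(best_single_bound(1.0))
```

For $m = 2$ and $\theta = 1$ the efficiency reduces to $c^2/(e^c - 1)$ with $c = x^*_1$, whose maximiser satisfies $2(1 - e^{-c}) = c$; this agrees with the first row of Table 1.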
4 Example

Let $\{X_i\}_{i=1}^{\infty}$ be independent exponentially distributed random variables with density function

$$f_\theta(x) = \begin{cases} \theta e^{-\theta x} & \text{for } x \ge 0, \\ 0 & \text{for } x < 0, \end{cases} \qquad 0 < \theta < \infty.$$

Our aim is to discriminate between the hypotheses $H_0: \theta = \theta_0$ and $H_1: \theta = \theta_1$, $\theta_0 < \theta_1$, where $0 < \theta_0 < 1 < \theta_1 < \infty$ and

$$E_\theta Z_{1,\theta_0,\theta_1} = \ln(\theta_1/\theta_0) - (\theta_1 - \theta_0)/\theta = 0$$

holds for $\theta = 1$. This side condition is no restriction, since the parameter $\theta$ is a scale parameter here and other simple hypotheses can be reduced to this one by an appropriate transformation of the random variables $\{X_i\}_{i=1}^{\infty}$.

Since $E_\theta Z_{1,\theta_0,\theta_1} < 0$ and $E_\theta Z_{1,\theta_0,\theta_1} > 0$ for $\theta < 1$ and $\theta > 1$, respectively, the test should prefer the hypothesis $H_0$ in the first case and the hypothesis $H_1$ in the second one. That means, the test should be most selective for parameter values in the neighbourhood of $\theta = 1$. This can be achieved by means of F-optimal groupings for $\theta = 1$.

Table 1 presents the corresponding F-optimal group bounds $x^*_1, \dots, x^*_{m-1}$ of interval groupings $G_m$ for $\theta = 1$ and $m = 2, \dots, 9$ as well as the achieved relative efficiencies (in percent) $F_{\mathrm{eff}}(G_m, \theta) = I^{G_m}_F(\theta) / I_F(\theta)$, $I_F(\theta) = E_\theta (\partial \ln f_\theta(X_1)/\partial\theta)^2 = 1/\theta^2$, for $\theta = 1$.

We now consider the special hypotheses $H_0: \theta_0 = 0.85$ and $H_1: \theta_1 = 1.166687$. Then we have $E_\theta Z_{1,\theta_0,\theta_1} = 0$ for $\theta = 1$. Let $\alpha = 0.05$ and $\beta = 0.05$ be the given risks of an error of the first and second kind, respectively. Then we get by (5) the following Wald approximations for the stopping bounds $B$ and $A$: $B^* = 0.052632$ and $A^* = 19$.

Figure 1 shows the Wald approximations of the OC-functions obtained by the method of conjugated parameter pairs of Section 2.4 for the tests $S_{G_m}(B^*, A^*)$, $m = 2, \dots, 9$, based on the F-optimal interval groupings $G_m$ of Table 1. We see that grouping has only a slight influence on the Wald approximation of the OC-function. That means, the OC-function of Wald's SPRT remains almost unaltered if we switch over from non-grouped to grouped observations.

Figure 1 Wald approximations $Q^*_{G_m}(\theta)$ of the OC-functions of tests $S_{G_m}(B^*, A^*)$ for $m = 2, \dots, 9$ and $Q^*(\theta)$ of the non-grouped test $S(B^*, A^*)$. [Plot not reproduced; horizontal axis $\theta$ from 0.7 to 1.3, vertical axis from 0.0 to 1.0.]

Figure 2 presents the corresponding Wald approximations of the ASN-functions of the tests $S_{G_m}(B^*, A^*)$, $m = 2, \dots, 9$, as well as the Wald approximation of the ASN-function $E^*_\theta N$ for the non-grouped test $S(B^*, A^*)$ (bold line). Here we can see how grouped observations increase the Wald approximations of the ASN-functions of our tests depending on the F-efficiency of a grouping or the number of groups, respectively.

Figure 2 Wald approximations $E^*_\theta N_{G_m}$ of the ASN-functions of tests $S_{G_m}(B^*, A^*)$, $m = 2, \dots, 9$, and $E^*_\theta N$ of the non-grouped test $S(B^*, A^*)$ (bold). [Plot not reproduced; horizontal axis $\theta$ from 0.7 to 1.3, vertical axis from 0 to 150.]

5 Conclusions

The method of conjugated parameter pairs is an effective method for obtaining Wald-like approximations for the OC- and ASN-function of sequential likelihood ratio tests based on grouped observations. Because of their high efficiencies, F-optimal groupings are recommended.

References

[1] Denisov, V.I., Eger, K.-H., Lemesko, B.Yu., Tsoi, E.B. (2004). Design of experiments and statistical analysis for grouped observations. NSTU Publishing House, Novosibirsk.
[2] Eger, K.-H. (2003). Likelihood ratio tests for grouped observations. Chemnitz University of Technology, Faculty of Mathematics, Preprint 2003-10.
[3] Eger, K.-H. (1985). Sequential tests. Teubner, Leipzig.
[4] Ghosh, B.K., Sen, P.K. (editors) (1991). Handbook of Sequential Analysis. Marcel Dekker, Inc., New York.
[5] Wald, A. (1947). Sequential Analysis. Wiley, New York.