Numerical comparison of approximations of
the distributions of statistics for multinomial
homogeneity test
Nobuhiro Taneichi$^1$ and Yuri Sekiya$^2$
$^1$ Obihiro University of Agriculture and Veterinary Medicine, Inada-chō, Obihiro 080-8555, Japan, nobutane@obihiro.ac.jp
$^2$ Hokkaido University of Education, Siroyama, Kushiro 085-8580, Japan, sekiya@kus.hokkyodai.ac.jp
Summary. Statistics $R^a$ based on power divergence can be used for testing the homogeneity of a product multinomial model. All $R^a$ have the same chi-square limiting distribution under the null hypothesis of homogeneity. $R^0$ is the log likelihood ratio statistic and $R^1$ is Pearson's $X^2$ statistic. In this paper, we focus on two improved approximations of the distribution of $R^a$. One is an approximation based on a multivariate Edgeworth expansion assuming a continuous distribution. The other is a moment-corrected type of chi-square approximation. We investigate the performance of the improved approximations numerically and find that both perform much better than the usual chi-square approximation for the statistics $R^a$ with $a \le 0$, which include the log likelihood ratio statistic.
Key words: product multinomial model, test of homogeneity, null distribution,
chi-square distribution, Edgeworth expansion, moment correction
1 Introduction
In an $r \times s$ contingency table, let $X_{ij}$ $(i = 1, \ldots, s,\ j = 1, \ldots, r)$ follow a product multinomial model; that is, $(X_{1j}, \ldots, X_{sj})'$ $(j = 1, \ldots, r)$ are distributed independently according to $\mathrm{Mult}_s(n_j; p_{1j}, \ldots, p_{sj})$ $(j = 1, \ldots, r)$, where $\sum_{i=1}^{s} X_{ij} = n_j$ $(j = 1, \ldots, r)$, $0 < p_{ij} < 1$ $(i = 1, \ldots, s,\ j = 1, \ldots, r)$, and $\sum_{i=1}^{s} p_{ij} = 1$ $(j = 1, \ldots, r)$.
The null hypothesis for the homogeneity of each population is
$$H_0 : p_{i1} = p_{i2} = \cdots = p_{ir} \equiv q_i \quad (i = 1, \ldots, s). \tag{1}$$
For testing the hypothesis $H_0$, we consider the statistics $R^a$ based on power divergence (Read and Cressie [RC88], pp. 23–24). We denote the MLE of $p_{ij}$ and the MLE of $q_i$ under $H_0$ by $\hat{p}_{ij}$ and $\hat{q}_i$, respectively, i.e., $\hat{p}_{ij} = X_{ij}/n_j$ $(i = 1, \ldots, s,\ j = 1, \ldots, r)$ and $\hat{q}_i = X_{i\cdot}/n$ $(i = 1, \ldots, s)$, where $X_{i\cdot} = \sum_{j=1}^{r} X_{ij}$ $(i = 1, \ldots, s)$ and $n = \sum_{j=1}^{r} n_j$.
Then the statistics based on power divergence are
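$$R^a = 2 \sum_{i=1}^{s} \sum_{j=1}^{r} n_j I^a(\hat{p}_{ij}, \hat{q}_i),$$
where
$$I^a(e, f) = \begin{cases} \dfrac{1}{a(a+1)}\, e \left\{ \left( \dfrac{e}{f} \right)^{a} - 1 \right\} & (a \neq 0, -1) \\[2mm] e \log \dfrac{e}{f} & (a = 0) \\[2mm] f \log \dfrac{f}{e} & (a = -1). \end{cases} \tag{2}$$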
It is immediately shown that $R^0$ is the log likelihood ratio statistic and that $R^1$ is Pearson's $X^2$ statistic. $R^{2/3}$ corresponds to the statistic recommended by Cressie and Read [CR84] and Read and Cressie [RC88] for the multinomial goodness-of-fit test.
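As a concrete illustration of definition (2), the following minimal Python sketch evaluates $R^a$ for a table stored as $r$ lists of $s$ counts. The function names are hypothetical, the counts are those of case [1] in the numerical section, and the sketch assumes every cell count is positive so that no logarithm or power of zero occurs.

```python
import math

def divergence_term(e, f, a):
    """I^a(e, f) from equation (2); this sketch assumes e > 0 and f > 0."""
    if a == 0:
        return e * math.log(e / f)
    if a == -1:
        return f * math.log(f / e)
    return e * ((e / f) ** a - 1.0) / (a * (a + 1.0))

def homogeneity_statistic(table, a):
    """R^a for a table given as r populations, each a list of s positive counts."""
    s = len(table[0])
    sizes = [sum(sample) for sample in table]          # n_j
    n = sum(sizes)
    q_hat = [sum(sample[i] for sample in table) / n for i in range(s)]
    total = 0.0
    for sample, size in zip(table, sizes):
        for i in range(s):
            # p_hat_ij = x_ij / n_j is compared with the pooled estimate q_hat_i
            total += size * divergence_term(sample[i] / size, q_hat[i], a)
    return 2.0 * total

case_1 = [[8, 7, 5, 4], [4, 5, 7, 8]]
pearson = homogeneity_statistic(case_1, 1)   # Pearson's X^2
log_lr = homogeneity_statistic(case_1, 0)    # log likelihood ratio statistic
```

For these counts all expected cell counts equal 6, so Pearson's $X^2$ can be checked by hand as $20/6 = 10/3$.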
If we assume
$$n_j / n \to \nu_j \ (0 < \nu_j < 1) \ \text{for each } j \ (j = 1, \ldots, r), \ \text{as } n \to \infty, \tag{3}$$
it is known that $R^a$ has the $\chi^2_{(r-1)(s-1)}$ limiting null distribution for any $a$ under $H_0$. Using this limiting result, we usually approximate
$$\Pr\{R^a \le b \mid H_0\} \approx A_0(b), \tag{4}$$
where
$$A_0(b) = \Pr\{\chi^2_{(r-1)(s-1)} \le b\}$$
and $\chi^2_\nu$ denotes a chi-square random variable with $\nu$ degrees of freedom. In this paper, we consider two approximations for $\Pr\{R^a \le b \mid H_0\}$ which improve (4) and investigate the performance of the approximations numerically.
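The usual approximation (4) only needs the chi-square distribution function. A minimal sketch using the Python standard library is given here; `chi2_cdf` and `chi2_critical_value` are hypothetical helper names, the series for the incomplete gamma function is a standard textbook device rather than anything specified in the paper, and the bisection bracket `upper=200.0` is an arbitrary assumption.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the power series of the regularized incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    k, z = df / 2.0, x / 2.0
    term = 1.0 / k
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (k + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(k * math.log(z) - z - math.lgamma(k)))

def chi2_critical_value(alpha, df, upper=200.0):
    """Solve chi2_cdf(c, df) = 1 - alpha by bisection on [0, upper]."""
    low, high = 0.0, upper
    for _ in range(100):
        mid = 0.5 * (low + high)
        if chi2_cdf(mid, df) < 1.0 - alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

r, s = 2, 4
df = (r - 1) * (s - 1)
a0_value = chi2_cdf(10.0 / 3.0, df)      # A_0(b) at b = 10/3
c0 = chi2_critical_value(0.05, df)       # upper 5% point under A_0
```

With $r = 2$ and $s = 4$ the degrees of freedom are 3, so `c0` should agree with the tabulated 5% point of $\chi^2_3$, about 7.815.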
2 Approximations
In this section, we consider two kinds of approximations. The first approximation is obtained by a multivariate Edgeworth expansion assuming a continuous distribution. That is, we consider the characteristic function of $R^a$ assuming a continuous distribution and expand it. Then, by inverting the expanded expression, we obtain the following approximation:
$$\Pr\{R^a \le b \mid H_0\} \approx A_1(b), \tag{5}$$
where
$$A_1(b) = \Pr\{\chi^2_{(r-1)(s-1)} \le b\} + \frac{1}{24n} \sum_{j=0}^{3} w_j^{(a)} \Pr\{\chi^2_{(r-1)(s-1)+2j} \le b\},$$
$$\begin{aligned}
w_0^{(a)} &= -2PQ_1 + 2Q_1, \\
w_1^{(a)} &= P\{-(a^2 - 2)Q_1 + 3a^2 Q_2 + 9aQ_3 - 4aQ_4\} - 3r^2 a^2 Q_2 \\
&\quad - 6ra\{Q_1 - aQ_3 + aQ_4\} - 2Q_1 - 3a(a-3)Q_3 + 4a(a-2)Q_4, \\
w_2^{(a)} &= a[P\{-6aQ_2 + 3(a-3)Q_3 - 4(a-1)Q_4\} + 6r^2 aQ_2 \\
&\quad + 6r\{Q_1 - aQ_3 + 2aQ_4\} + 3(a-3)Q_3 - 8(a-1)Q_4], \\
w_3^{(a)} &= a^2\{P(3Q_2 + 2Q_4) - 3r^2 Q_2 - 6rQ_4 + 4Q_4\},
\end{aligned}$$
$P = \sum_{j=1}^{r} \nu_j^{-1}$, $Q_1 = Q_{01} - 1$, $Q_2 = Q_{01} - s^2$, $Q_3 = Q_{01} - 2s + 1$, $Q_4 = Q_{01} - 3s + 2$, and $Q_{01} = \sum_{i=1}^{s} q_i^{-1}$.
We call this approximation the $A_1$ approximation.
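The weights $w_j^{(a)}$ are simple polynomials in $a$, $r$, $s$, $P$ and $Q_{01}$, so $A_1(b)$ can be evaluated directly. The sketch below transcribes the expressions of this section; the function names are hypothetical, `chi2_cdf` is an assumed series-based helper rather than part of the paper, and the example values of `nu`, `q` and `n` correspond to two equal samples of 24 with four equiprobable categories.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the power series of the regularized incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    k, z = df / 2.0, x / 2.0
    term = 1.0 / k
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (k + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(k * math.log(z) - z - math.lgamma(k)))

def edgeworth_weights(a, nu, q):
    """Weights w_0, ..., w_3 of the A_1 approximation for given nu_j and q_i."""
    r, s = len(nu), len(q)
    P = sum(1.0 / v for v in nu)
    Q01 = sum(1.0 / x for x in q)
    Q1, Q2 = Q01 - 1.0, Q01 - s ** 2
    Q3, Q4 = Q01 - 2.0 * s + 1.0, Q01 - 3.0 * s + 2.0
    w0 = -2.0 * P * Q1 + 2.0 * Q1
    w1 = (P * (-(a ** 2 - 2.0) * Q1 + 3.0 * a ** 2 * Q2 + 9.0 * a * Q3 - 4.0 * a * Q4)
          - 3.0 * r ** 2 * a ** 2 * Q2
          - 6.0 * r * a * (Q1 - a * Q3 + a * Q4)
          - 2.0 * Q1 - 3.0 * a * (a - 3.0) * Q3 + 4.0 * a * (a - 2.0) * Q4)
    w2 = a * (P * (-6.0 * a * Q2 + 3.0 * (a - 3.0) * Q3 - 4.0 * (a - 1.0) * Q4)
              + 6.0 * r ** 2 * a * Q2
              + 6.0 * r * (Q1 - a * Q3 + 2.0 * a * Q4)
              + 3.0 * (a - 3.0) * Q3 - 8.0 * (a - 1.0) * Q4)
    w3 = a ** 2 * (P * (3.0 * Q2 + 2.0 * Q4) - 3.0 * r ** 2 * Q2 - 6.0 * r * Q4 + 4.0 * Q4)
    return [w0, w1, w2, w3]

def a1_cdf(b, a, nu, q, n):
    """A_1(b): chi-square CDF plus the 1/(24n) Edgeworth correction."""
    df = (len(nu) - 1) * (len(q) - 1)
    w = edgeworth_weights(a, nu, q)
    correction = sum(w[j] * chi2_cdf(b, df + 2 * j) for j in range(4))
    return chi2_cdf(b, df) + correction / (24.0 * n)

# Example: A_1 for the log likelihood ratio statistic (a = 0) at b = 7.815.
example = a1_cdf(7.815, 0, [0.5, 0.5], [0.25] * 4, 48)
```

A useful sanity check on any transcription of the weights is that they sum to zero, since $A_1(b)$ must tend to 1 as $b \to \infty$.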
The second approximation is a moment-corrected type of $\chi^2$ approximation. Although the moments of $R^a$ are infinite, the probability that $R^a$ is infinite goes to zero quickly as $n_j$ $(j = 1, \ldots, r)$ become large. We therefore consider a moment-corrected type of approximation as follows. We expand $R^a$ under $H_0$ and assumption (3) as
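$$R^a = W^a + o_p(n^{-1}),$$
where
$$W^a = \sum_{j=1}^{r} U_j' \Omega^{-1} U_j - \sum_{j=1}^{r} \sum_{l=1}^{r} (\nu_j \nu_l)^{1/2}\, U_j' \Omega^{-1} U_l + \frac{1}{\sqrt{n}}\, g_1^a + \frac{1}{n}\, g_2^a,$$
$$\begin{aligned}
g_1^a &= \frac{a-1}{3} \sum_{j=1}^{r} \sum_{i=1}^{s} \nu_j^{-1/2} \frac{U_{ij}^3}{q_i^2} - a \sum_{j=1}^{r} \sum_{l=1}^{r} \sum_{i=1}^{s} \nu_l^{1/2} \frac{U_{ij}^2 U_{il}}{q_i^2} \\
&\quad + \frac{2a+1}{3} \sum_{j=1}^{r} \sum_{l=1}^{r} \sum_{m=1}^{r} \sum_{i=1}^{s} (\nu_j \nu_l \nu_m)^{1/2} \frac{U_{ij} U_{il} U_{im}}{q_i^2},
\end{aligned}$$
$$\begin{aligned}
g_2^a &= \frac{(a-1)(a-2)}{12} \sum_{j=1}^{r} \sum_{i=1}^{s} \nu_j^{-1} \frac{U_{ij}^4}{q_i^3} - \frac{a(a-1)}{3} \sum_{j=1}^{r} \sum_{l=1}^{r} \sum_{i=1}^{s} \left( \frac{\nu_l}{\nu_j} \right)^{1/2} \frac{U_{ij}^3 U_{il}}{q_i^3} \\
&\quad + \frac{a(a+1)}{2} \sum_{j=1}^{r} \sum_{l=1}^{r} \sum_{m=1}^{r} \sum_{i=1}^{s} (\nu_l \nu_m)^{1/2} \frac{U_{ij}^2 U_{il} U_{im}}{q_i^3} \\
&\quad - \frac{(a+2)(3a+1)}{12} \sum_{j=1}^{r} \sum_{l=1}^{r} \sum_{m=1}^{r} \sum_{t=1}^{r} \sum_{i=1}^{s} (\nu_j \nu_l \nu_m \nu_t)^{1/2} \frac{U_{ij} U_{il} U_{im} U_{it}}{q_i^3},
\end{aligned}$$
$U_j = (U_{1j}, \ldots, U_{s-1,j})'$ $(j = 1, \ldots, r)$, $U_{ij} = (X_{ij} - n_j q_i)/\sqrt{n_j}$ $(i = 1, \ldots, s,\ j = 1, \ldots, r)$, $\Omega = \mathrm{diag}(q_1, \ldots, q_{s-1}) - \tilde{q}\tilde{q}'$ and $\tilde{q} = (q_1, \ldots, q_{s-1})'$.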
Then the expanded expression $W^a$ has finite moments. The mean and variance of $W^a$ under $H_0$ are approximated as
$$E(W^a) = (r-1)(s-1) + \frac{m_a}{n} + o(n^{-1})$$
and
$$V(W^a) = 2(r-1)(s-1) + \frac{v_a}{n} + o(n^{-1}),$$
where
$$\begin{aligned}
m_a &= -\frac{1}{2} a(a+1) r + \frac{1}{12}(3a+1)(a+2) + \frac{1}{12}(3a+2)(a-1)P \\
&\quad + as\left\{ ar - \frac{1}{2}(a+1) - \frac{1}{2}(a-1)P \right\} \\
&\quad + (a-1)Q_{01}\left\{ -\frac{1}{2} ar + \frac{1}{12}(3a+2) + \frac{1}{12}(3a-2)P \right\}
\end{aligned}$$
and
$$\begin{aligned}
v_a &= -4a(2a+1)r + \frac{2}{3}(7a^2 + 7a + 1) + \frac{2}{3}(5a^2 - a - 1)P \\
&\quad + as\{14ar - 4(2a+1) - 2(3a-2)P\} + a^2 s^2 (r^2 - P) \\
&\quad + Q_{01}\left\{ -a^2 r^2 - 2a(3a-2)r + \frac{2}{3}(5a^2 - a - 1) + \frac{1}{3}(11a^2 - 10a + 2)P \right\}.
\end{aligned}$$
If we put
$$\gamma_a = (r-1)(s-1)\left( 1 - \sqrt{\delta_a} \right) + \frac{m_a}{n} \quad \text{and} \quad \delta_a = 1 + \frac{v_a}{2n(r-1)(s-1)}, \tag{6}$$
then the mean and variance of $(W^a - \gamma_a)/\sqrt{\delta_a}$ match the mean and variance of $\chi^2_{(r-1)(s-1)}$, namely $(r-1)(s-1)$ and $2(r-1)(s-1)$, respectively, up to $o(n^{-1})$. Therefore, we approximate the distribution of $(R^a - \gamma_a)/\sqrt{\delta_a}$ by a $\chi^2_{(r-1)(s-1)}$ distribution, that is,
$$\Pr\{R^a \le b \mid H_0\} \approx A_2(b), \tag{7}$$
where
$$A_2(b) = \Pr\left\{ \chi^2_{(r-1)(s-1)} \le \frac{b - \gamma_a}{\sqrt{\delta_a}} \right\}.$$
We call this approximation the $A_2$ approximation.
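Because $A_2(b)$ is the ordinary chi-square distribution function evaluated at a shifted and rescaled point, it is enough to compute $m_a$, $v_a$, $\gamma_a$ and $\delta_a$. The sketch below transcribes the expressions of this section; `moment_terms` and `a2_argument` are hypothetical names, and raising an error for non-positive $\delta_a$ is a design choice of the sketch, not a prescription of the paper.

```python
import math

def moment_terms(a, nu, q):
    """Return (m_a, v_a), with P = sum(1 / nu_j) and Q01 = sum(1 / q_i)."""
    r, s = len(nu), len(q)
    P = sum(1.0 / v for v in nu)
    Q01 = sum(1.0 / x for x in q)
    m = (-0.5 * a * (a + 1) * r
         + (3 * a + 1) * (a + 2) / 12.0
         + (3 * a + 2) * (a - 1) * P / 12.0
         + a * s * (a * r - 0.5 * (a + 1) - 0.5 * (a - 1) * P)
         + (a - 1) * Q01 * (-0.5 * a * r + (3 * a + 2) / 12.0 + (3 * a - 2) * P / 12.0))
    v = (-4 * a * (2 * a + 1) * r
         + 2.0 * (7 * a ** 2 + 7 * a + 1) / 3.0
         + 2.0 * (5 * a ** 2 - a - 1) * P / 3.0
         + a * s * (14 * a * r - 4 * (2 * a + 1) - 2 * (3 * a - 2) * P)
         + a ** 2 * s ** 2 * (r ** 2 - P)
         + Q01 * (-a ** 2 * r ** 2 - 2 * a * (3 * a - 2) * r
                  + 2.0 * (5 * a ** 2 - a - 1) / 3.0
                  + (11 * a ** 2 - 10 * a + 2) * P / 3.0))
    return m, v

def a2_argument(b, a, nu, q, n):
    """Return (b - gamma_a) / sqrt(delta_a), the point at which the chi-square CDF is evaluated."""
    df = (len(nu) - 1) * (len(q) - 1)
    m, v = moment_terms(a, nu, q)
    delta = 1.0 + v / (2.0 * n * df)
    if delta <= 0.0:
        raise ValueError("delta_a is not positive, so A_2 cannot be computed")
    gamma = df * (1.0 - math.sqrt(delta)) + m / n
    return (b - gamma) / math.sqrt(delta)

# Example: two equal samples of 24, four equiprobable categories, a = 0.
m0, v0 = moment_terms(0, [0.5, 0.5], [0.25] * 4)
point = a2_argument(7.815, 0, [0.5, 0.5], [0.25] * 4, 48)
```

For $a = 0$ the expressions reduce to $m_0 = (P-1)(Q_{01}-1)/6$ and $v_0 = 2(P-1)(Q_{01}-1)/3$, and for $a = 1$ one obtains $m_1 = (r-1)(s-1)$, which gives quick hand checks of the transcription.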
3 Numerical investigation
In this section, we numerically investigate the performance of the approximation based on the multivariate Edgeworth expansion assuming a continuous distribution given by (5) ($A_1$ approximation), the moment-corrected type of $\chi^2$ approximation given by (7) ($A_2$ approximation), and the $\chi^2$ approximation given by (4) ($A_0$ approximation). We evaluate the performance of the approximations by the following Monte Carlo procedure. For a given observed value $x_{ij}$ of $X_{ij}$ $(i = 1, \ldots, s,\ j = 1, \ldots, r)$, we estimate $q_i$ $(i = 1, \ldots, s)$ defined in (1) as $\hat{q}_i = x_{i\cdot}/n$ $(i = 1, \ldots, s)$, where $n = \sum_{i=1}^{s} \sum_{j=1}^{r} x_{ij}$. On the basis of the $A_0$, $A_1$, and $A_2$ approximations, we consider the distribution of the statistic $R^a$ under $H_0$ in the case in which $q_i$ is estimated as $\hat{q}_i$. Let $c_0(\alpha)$, $c_1(\alpha)$, and $c_2(\alpha)$ be the approximate critical points of the distribution of the statistic $R^a$ for significance level $\alpha$ based on the $A_0$, $A_1$, and $A_2$ approximations, respectively, that is,
$$A_j(c_j(\alpha)) = 1 - \alpha, \quad j = 0, 1, 2.$$
We generate $r$ mutually independent $s$-variate multinomial random vectors under $H_0$, with $q_i$ set to $\hat{q}_i$, $N_1$ times and construct $N_1$ $r \times s$ contingency tables from them. We denote the tables by $x(i)$ $(i = 1, \ldots, N_1)$. Let $R^a(x(i))$ $(i = 1, \ldots, N_1)$ be the value of the statistic $R^a$ at $x(i)$, and let $N_2^{(j)}$ $(j = 0, 1, 2)$ be the number of indices $i$ that satisfy the condition $R^a(x(i)) \ge c_j(\alpha)$. Then the performance of the $A_0$, $A_1$, and $A_2$ approximations can be evaluated on the basis of the index
$$I_j^{(\alpha)} = N_2^{(j)}/N_1 - \alpha, \quad j = 0, 1, 2,$$
for each statistic $R^a$. For models (I) $r = 2, s = 4$, (II) $r = 2, s = 5$, (III) $r = 3, s = 4$, (IV) $r = 3, s = 5$, (V) $r = 4, s = 4$, and sample sizes
$$N_e \equiv x_{\cdot 1} = \cdots = x_{\cdot r} = ls \quad (l = 6, 8, 10),$$
the observed values that we consider for the numerical investigations are listed as follows.
[1] (8,7,5,4), (4,5,7,8) for model (I) and Ne = 24.
[2] (10,9,7,6), (6,7,9,10) for model (I) and Ne = 32.
[3] (12,11,9,8), (8,9,11,12) for model (I) and Ne = 40.
[4] (8,7,5,4), (6,6,6,6) for model (I) and Ne = 24.
[5] (10,9,7,6), (8,8,8,8) for model (I) and Ne = 32.
[6] (12,11,9,8), (10,10,10,10) for model (I) and Ne = 40.
[7] (8,7,6,5,4), (4,5,6,7,8) for model (II) and Ne = 30.
[8] (10,9,8,7,6), (6,7,8,9,10) for model (II) and Ne = 40.
[9] (12,11,10,9,8), (8,9,10,11,12) for model (II) and Ne = 50.
[10] (8,7,6,5,4), (6,6,6,6,6) for model (II) and Ne = 30.
[11] (10,9,8,7,6), (8,8,8,8,8) for model (II) and Ne = 40.
[12] (12,11,10,9,8), (10,10,10,10,10) for model (II) and Ne = 50.
[13] (6,6,6,6), (8,7,5,4), (4,5,7,8) for model (III) and Ne = 24.
[14] (8,8,8,8), (10,9,7,6), (6,7,9,10) for model (III) and Ne = 32.
[15] (10,10,10,10), (12,11,9,8), (8,9,11,12) for model (III) and Ne = 40.
[16] (6,6,6,6,6), (8,7,6,5,4), (4,5,6,7,8) for model (IV) and Ne = 30.
[17] (8,8,8,8,8), (10,9,8,7,6), (6,7,8,9,10) for model (IV) and Ne = 40.
[18] (10,10,10,10,10), (12,11,10,9,8), (8,9,10,11,12) for model (IV) and Ne = 50.
[19] (4,5,7,8), (5,7,8,4), (7,8,4,5), (8,4,5,7) for model (V) and Ne = 24.
[20] (6,7,9,10), (7,9,10,6), (9,10,6,7), (10,6,7,9) for model (V) and Ne = 32.
[21] (8,9,11,12), (9,11,12,8), (11,12,8,9), (12,8,9,11) for model (V) and Ne = 40.
In the list, [1], [2], [3], [7], [8], [9], [13], [14], [15], [16], [17], [18], [19], [20] and [21] are cases in which $\hat{q}_1 = \cdots = \hat{q}_s = s^{-1}$. [4], [5], [6], [10], [11] and [12] are cases in which the $\hat{q}_i$ $(i = 1, \ldots, s)$ are not all equal. Values of $I_j^{(\alpha)} \times 10^4$, $j = 0, 1, 2$, for the statistics $R^a$ ($a = -1, -0.5, 0, 2/3, 1, 1.5, 2$) and significance level $\alpha = 0.05$ for each model are shown in Tables 1–5.
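The Monte Carlo procedure can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names are hypothetical, the number of tables (2000) and the seed are arbitrary because the value of $N_1$ is not given here, the critical value 7.815 is the tabulated 5% point of $\chi^2_3$ (the $A_0$ approximation for model (I)), and zero cells are handled by the conventions written in the comments.

```python
import math
import random

def divergence_term(e, f, a):
    """I^a(e, f) from (2); assumed conventions: 0 log 0 = 0, inf where the term diverges."""
    if e == 0.0:
        if f == 0.0 or a > -1:
            return 0.0
        return math.inf
    if a == 0:
        return e * math.log(e / f)
    if a == -1:
        return f * math.log(f / e)
    return e * ((e / f) ** a - 1.0) / (a * (a + 1.0))

def homogeneity_statistic(table, a):
    """R^a for a table given as r populations, each a list of s counts."""
    s = len(table[0])
    sizes = [sum(sample) for sample in table]
    n = sum(sizes)
    q_hat = [sum(sample[i] for sample in table) / n for i in range(s)]
    total = 0.0
    for sample, size in zip(table, sizes):
        for i in range(s):
            total += size * divergence_term(sample[i] / size, q_hat[i], a)
    return 2.0 * total

def simulate_index(observed, a, critical_value, alpha, n_tables, seed=0):
    """Estimate I^(alpha) = N_2 / N_1 - alpha for one critical value."""
    rng = random.Random(seed)
    s = len(observed[0])
    sizes = [sum(sample) for sample in observed]
    n = sum(sizes)
    q_hat = [sum(sample[i] for sample in observed) / n for i in range(s)]
    exceed = 0
    for _ in range(n_tables):
        table = []
        for size in sizes:
            # one multinomial sample of size n_j with cell probabilities q_hat
            counts = [0] * s
            for category in rng.choices(range(s), weights=q_hat, k=size):
                counts[category] += 1
            table.append(counts)
        if homogeneity_statistic(table, a) >= critical_value:
            exceed += 1
    return exceed / n_tables - alpha

# Case [1], Pearson's statistic (a = 1), A_0 critical value for 3 degrees of freedom.
index_a0 = simulate_index([[8, 7, 5, 4], [4, 5, 7, 8]], a=1,
                          critical_value=7.815, alpha=0.05, n_tables=2000)
```

The same function could be called with critical values derived from the $A_1$ or $A_2$ approximations to obtain the other two indices; with only 2000 tables the estimate is much noisier than the tabulated values.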
From Tables 1–5, we find the following results. When $a = -1$ and $a = -0.5$, the $A_1$ and $A_2$ approximations always perform much better than the usual $\chi^2$ approximation ($A_0$ approximation). Therefore, the $A_1$ and $A_2$ approximations are very effective for these statistics. When $a = 0$ (the log likelihood ratio statistic), the $A_1$ and $A_2$ approximations give little improvement in the case of a small sample size. However, they perform much better than the $A_0$ approximation in the case of a moderate sample size. When $a = 2/3$, $a = 1$ (Pearson's $X^2$ statistic), $a = 1.5$, and $a = 2$, the $A_1$ and $A_2$ approximations do not perform much better than the $A_0$ approximation.
4 Concluding remarks
In the test of homogeneity for multinomial populations, the approximation based on the multivariate Edgeworth expansion assuming a continuous distribution ($A_1$ approximation) and the moment-corrected type of $\chi^2$ approximation ($A_2$ approximation) are very effective for the statistics $R^a$ with $a \le 0$, which include the log likelihood ratio statistic. The performance of the $A_1$ approximation is very similar to that of the $A_2$ approximation. However, when $\delta_a < 0$, where $\delta_a$ is defined in (6), the $A_2$ approximation cannot be calculated. Therefore, we recommend the $A_1$ approximation for the statistics $R^a$ with $a \le 0$. For the statistics $R^a$ with $a > 0$, we cannot recommend either the $A_1$ or the $A_2$ approximation.
Table 1. Values of $I_j^{(0.05)} \times 10^4$ $(j = 0, 1, 2)$ for model (I)

case  a:       −1   −0.5      0    2/3      1    1.5      2
[1]   A0      302    159     56    −38    −57    −71    −57
[1]   A1       50      3    −37    −58    −65    −65    −65
[1]   A2       50     −2    −40    −57    −62    −63    −62
[2]   A0      255    161     77     16     −7    −17     −7
[2]   A1       52     22      7     −9    −12    −14    −12
[2]   A2       52     19      6     −8    −10    −10    −10
[3]   A0      220    141     78     30     17     10     17
[3]   A1       53     35     25     15     13     12     13
[3]   A2       53     34     24     17     15     15     15
[4]   A0      317    160     52    −42    −61    −74    −61
[4]   A1       45     −3    −41    −60    −67    −68    −67
[4]   A2       45     −6    −45    −60    −64    −64    −64
[5]   A0      253    157     77     12     −9    −20     −9
[5]   A1       53     23      5     −9    −13    −15    −13
[5]   A2       53     21      4     −8    −11    −12    −11
[6]   A0      192    114     56     10     −4    −12     −4
[6]   A1       31     14      4     −6     −7     −8     −7
[6]   A2       31     13      3     −4     −5     −6     −5
References
[RC88] Read, T.R.C., Cressie, N.: Goodness-of-fit statistics for discrete multivariate data. Springer, New York (1988)
[CR84] Cressie, N., Read, T.R.C.: Multinomial goodness-of-fit tests. J. Roy. Statist. Soc. B 46, 440–464 (1984)
Table 2. Values of $I_j^{(0.05)} \times 10^4$ $(j = 0, 1, 2)$ for model (II)

case  a:       −1   −0.5      0    2/3      1    1.5      2
[7]   A0      372    192     56    −44    −71    −87    −71
[7]   A1       62     −1    −38    −60    −67    −69    −67
[7]   A2       60     −3    −41    −61    −67    −68    −67
[8]   A0      316    188     95     17     −4    −16     −4
[8]   A1       85     45     20      3      1     −1      1
[8]   A2       86     44     18      2      1     −0      1
[9]   A0      227    133     66      7     −8    −18     −8
[9]   A1       44     25      7     −4     −5     −6     −5
[9]   A2       45     24      6     −4     −5     −6     −5
[10]  A0      357    173     38    −55    −79    −92    −79
[10]  A1       44    −17    −50    −70    −74    −76    −74
[10]  A2       42    −19    −52    −70    −74    −75    −74
[11]  A0      304    174     77     −2    −24    −35    −24
[11]  A1       66     25     −0    −15    −20    −20    −20
[11]  A2       67     23     −2    −16    −20    −19    −20
[12]  A0      240    144     77     15     −0     −9     −0
[12]  A1       55     33     16      3      2      2      2
[12]  A2       56     33     14      3      2      2      2
Table 3. Values of $I_j^{(0.05)} \times 10^4$ $(j = 0, 1, 2)$ for model (III)

case  a:       −1   −0.5      0    2/3      1    1.5      2
[13]  A0      475    226     55    −51    −67    −53     −5
[13]  A1       91     −1    −44    −64    −60    −50    −43
[13]  A2       83     −7    −48    −66    −64    −56    −51
[14]  A0      382    211     88     −6    −21    −15     27
[14]  A1      102     44      8    −15    −15    −12     −3
[14]  A2      100     41      5    −17    −19    −16     −9
[15]  A0      308    173     80     10     −2      1     30
[15]  A1       66     34     13      4      2      3      5
[15]  A2       67     34     11      2     −1     −2      2

Table 4. Values of $I_j^{(0.05)} \times 10^4$ $(j = 0, 1, 2)$ for model (IV)

case  a:       −1   −0.5      0    2/3      1    1.5      2
[16]  A0      559    260     64    −52    −70    −48     10
[16]  A1      120      6    −47    −61    −57    −44    −36
[16]  A2      103     −5    −52    −66    −63    −51    −41
[17]  A0      469    252    103      6     −9     −1     43
[17]  A1      133     56     15      0     −1      1      6
[17]  A2      126     52     10     −4     −5     −3      2
[18]  A0      336    183     74     −3    −17    −11     23
[18]  A1       79     29      6     −8    −10     −9    −10
[18]  A2       76     28      4    −11    −13    −13    −14
Table 5. Values of $I_j^{(0.05)} \times 10^4$ $(j = 0, 1, 2)$ for model (V)

case  a:       −1   −0.5      0    2/3      1    1.5      2
[19]  A0      609    270     67    −53    −66    −41     34
[19]  A1      130      5    −52    −60    −55    −44    −33
[19]  A2      104     −4    −58    −65    −61    −51    −36
[20]  A0      486    258     97     −2    −14     −2     52
[20]  A1      136     50      7     −7     −5     −6     −2
[20]  A2      124     45      3    −11    −10     −9     −5
[21]  A0      374    204     86     11      1     10     55
[21]  A1       94     42     19      6      9      8     11
[21]  A2       89     40     16      3      5      4      9