Continuous bootstrapping

Naoto Niki¹ and Yoko Ono²
¹ Tokyo University of Science, niki@ms.kagu.tus.ac.jp
² Niigata University of International and Information Studies, onoyk@nuis.ac.jp
Summary. The Bayesian bootstrap method proposed by Rubin uses a Dirichlet distribution as its prior distribution, whereas Efron's bootstrap method has a multinomial prior from the Bayesian point of view. Motivated by the difference between these priors, we propose another type of prior distribution and a continuous version of bootstrapping.
Key words: Bayesian Bootstrapping, Prior distribution
1 Introduction
1.1 Bootstrapping as parameter sampling
Let x = (x1, …, xn) be a random vector of observations drawn independently from an unknown population F, and suppose that a population parameter θ = T(F) is estimated by θx = T(Fx), where Fx is the empirical distribution based on x. In the "nonparametric bootstrapping" due to Efron [Efr79], a large number of samples of size n are drawn from Fx for numerical evaluation of properties of the distribution of θx or, more essentially, for simulating the distribution of the "random distribution" Fx.
Hereafter, for the sake of simplicity, it is assumed that i ≠ j ⇒ xi ≠ xj. Let b = (b1, …, bn) be a random vector drawn from Fx in place of F. Then the empirical distribution Fb is a multinomial distribution Mul(n; p1, …, pn−1) on x whose parameter vector p = (p1, …, pn), with pn = 1 − p1 − ⋯ − pn−1, is distributed as np ∼ Mul(n; 1/n, …, 1/n). The common marginal distribution of npi (i = 1, …, n) is the binomial distribution Bin(n, 1/n).
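The multinomial view of Efron's resampling can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names `multinomial_weights` and `bootstrap_replicates`, the choice of the sample mean as T, the seed, and the data are all assumptions made for the example.

```python
import random
from collections import Counter


def multinomial_weights(n, rng):
    """Return p = (p1, ..., pn) with n*p ~ Mul(n; 1/n, ..., 1/n).

    Drawing n indices uniformly with replacement and counting them is
    equivalent to drawing a resample b from the empirical distribution Fx.
    """
    counts = Counter(rng.randrange(n) for _ in range(n))
    return [counts[i] / n for i in range(n)]


def bootstrap_replicates(x, replicates, seed=0):
    """Replicates of the weighted mean sum(p_i * x_i); T = mean is an assumption."""
    rng = random.Random(seed)
    n = len(x)
    reps = []
    for _ in range(replicates):
        p = multinomial_weights(n, rng)
        reps.append(sum(p_i * x_i for p_i, x_i in zip(p, x)))
    return reps


x = [2.1, 3.4, 1.9, 5.0, 4.2]   # hypothetical observations
reps = bootstrap_replicates(x, 2000)
print(min(reps), max(reps))
```

Every weight produced this way is a multiple of 1/n, so the replicates can take only finitely many values.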
1.2 Continuous analogue to bootstrapping
It is well known that, if n is large enough, the distribution of θb = T(Fb) furnishes estimates of the low-order moments of the sampling distribution of θx that are as good as statisticians need in practical applications. For smaller n, however, the discreteness of the values of p severely degrades the accuracy of the estimates. See, e.g., Bickel and Freedman [BF81] for more details.
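The purpose of this article is to propose a vector r = (r1, …, rn) of continuous random variables approximately distributed as Mul(n; 1/n, …, 1/n), in the sense that
\[
r_1 \ge 0, \; \dots, \; r_n \ge 0, \qquad r_1 + \cdots + r_n = 1;
\]
\[
\Pr\left\{ k - \tfrac{1}{2} \le r_i < k + \tfrac{1}{2} \right\}
\approx {}_n C_k \left(\frac{1}{n}\right)^{k} \left(1 - \frac{1}{n}\right)^{n-k},
\qquad
\Pr\left\{ r_i < \tfrac{1}{2} \right\} \approx \left(1 - \frac{1}{n}\right)^{n},
\qquad i, k \in \{1, \dots, n\}.
\]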
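The target probabilities on the right-hand sides above are ordinary Bin(n, 1/n) probabilities and are easy to tabulate. The following is only a sketch; the helper name `binomial_target` is an assumption of this example and does not come from the paper.

```python
import math


def binomial_target(n, k):
    """Pr{Bin(n, 1/n) = k} = C(n, k) (1/n)^k (1 - 1/n)^(n - k)."""
    return math.comb(n, k) * (1.0 / n) ** k * (1.0 - 1.0 / n) ** (n - k)


# Tabulate the first few target probabilities for several sample sizes.
for n in (10, 20, 50, 100):
    row = [round(binomial_target(n, k), 3) for k in range(4)]
    print(n, row)
```

The case k = 0 reduces to (1 − 1/n)^n, which tends to 1/e as n grows.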
1.3 Bayesian bootstrapping
One possible solution may be the use of a Dirichlet variate
d = (d1, …, dn−1) ∼ Dir(n; 1, …, 1)
employed in the "Bayesian bootstrapping" due to Rubin [Rub81]. However, this distribution has clearly heavier tails than desired, besides the fact that a time-consuming sorting operation is involved in the bootstrapping.
For example, for large n, the right tail of the common marginal distribution Beta(1, n − 1) of di decreases geometrically with the fixed ratio 1/e, whereas the tail of Bin(n, 1/n) decays factorially, as demonstrated in Table 1, where Beta∗(1, ∞) is the limiting distribution of n di as n tends to infinity.
Table 1. Difference in Pr{X ∈ [k − 0.5, k + 0.5)}.

Distribution of X    k = 0          k = 1            k = 2             k = 3             k = 4             k = 5
X ∼ Poisson(1)       1/e            1/e              1/(2! e)          1/(3! e)          1/(4! e)          1/(5! e)
X ∼ Beta∗(1, ∞)      (√e − 1)/√e    (e − 1)/(e√e)    (e − 1)/(e²√e)    (e − 1)/(e³√e)    (e − 1)/(e⁴√e)    (e − 1)/(e⁵√e)
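The entries of Table 1 can be checked numerically. The sketch below uses the fact that the limit of n di for di ∼ Beta(1, n − 1) is the standard exponential distribution; the function names are assumptions of this example.

```python
import math


def poisson1_prob(k):
    """Pr{X = k} for X ~ Poisson(1).

    X is integer valued, so this equals Pr{X in [k - 0.5, k + 0.5)}.
    """
    return math.exp(-1.0) / math.factorial(k)


def exp1_interval_prob(k):
    """Pr{X in [k - 0.5, k + 0.5)} for X ~ Exp(1), the limit of n * d_i."""
    lower = max(k - 0.5, 0.0)          # the support starts at 0
    return math.exp(-lower) - math.exp(-(k + 0.5))


for k in range(6):
    print(k, round(poisson1_prob(k), 4), round(exp1_interval_prob(k), 4))
```

For k ≥ 1 the exponential entries shrink by the constant factor 1/e from one column to the next, while the Poisson entries shrink by the factor 1/(k + 1), which is the contrast between geometric and factorial decay described in the text.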
2 Continuous Bootstrapping
Let u = (u1, …, un) be a vector of i.i.d. random variables uniformly distributed on [0, 1]. For some constant α > 1, we consider the simple rational transformation
gα(x) = x/(α − x),
and write gα(u) = (gα(u1), …, gα(un)). Then the common marginal distribution
of xi = gα(ui) (i = 1, …, n) has pdf, cdf, mean and variance as given below:
\[
f_\alpha(x) =
\begin{cases}
\dfrac{\alpha}{(1+x)^2}, & 0 \le x \le \dfrac{1}{\alpha-1},\\[1ex]
0, & x > \dfrac{1}{\alpha-1},
\end{cases}
\qquad
F_\alpha(x) =
\begin{cases}
\dfrac{\alpha x}{1+x}, & 0 \le x \le \dfrac{1}{\alpha-1},\\[1ex]
1, & x > \dfrac{1}{\alpha-1},
\end{cases}
\]
\[
\mu_\alpha = \alpha \log\frac{\alpha}{\alpha-1} - 1,
\qquad
\sigma_\alpha^2 = \frac{\alpha}{\alpha-1} - \alpha^2 \left(\log\frac{\alpha}{\alpha-1}\right)^{2},
\]
respectively. Discussion in this article is focused mainly upon the distribution of
\[
r = g_\alpha(u) \Big/ \sum_{i=1}^{n} g_\alpha(u_i)
\]
and its use in place of the bootstrapping and the Bayesian bootstrapping.

Table 2. Marginal probabilities for the intervals [k − 0.5, k + 0.5).

Distribution       n      k=0     k=1     k=2     k=3     k=4     k=5     k≥6
Bin(n, 1/n)        10     0.349   0.387   0.194   0.057   0.011   0.001   0.000
                   20     0.358   0.377   0.189   0.060   0.013   0.002   0.000
                   50     0.364   0.372   0.186   0.061   0.015   0.003   0.000
                   100    0.366   0.370   0.185   0.061   0.015   0.003   0.001
                   ∞      0.368   0.368   0.184   0.061   0.015   0.003   0.001
CB(n)              10     0.351   0.395   0.199   0.049   0.006   0.001   0.000
                   20     0.362   0.381   0.195   0.057   0.005   0.000   0.000
                   50     0.367   0.374   0.190   0.067   0.002   0.000   0.000
                   100    0.369   0.371   0.187   0.072   0.001   0.000   0.000
                   500    0.370   0.370   0.186   0.074   0.000   0.000   0.000
Beta∗(1, n − 1)    10     0.370   0.399   0.157   0.054   0.016   0.004   0.001
                   20     0.382   0.391   0.148   0.053   0.018   0.006   0.002
                   50     0.389   0.386   0.144   0.052   0.019   0.007   0.003
                   100    0.391   0.385   0.142   0.052   0.019   0.007   0.004
                   ∞      0.393   0.383   0.141   0.052   0.019   0.007   0.004
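A minimal sketch of how the normalized vector r might be generated and used as a set of resampling weights is given below. It is an illustration only: the function names, the value α = 1.5, the data, and the choice of the weighted mean as the statistic are assumptions of this example, not part of the paper.

```python
import random


def g(x, alpha):
    """Rational transformation g_alpha(x) = x / (alpha - x), with alpha > 1."""
    return x / (alpha - x)


def cdf(x, alpha):
    """Cdf of g_alpha(U) for U ~ Uniform[0, 1]."""
    if x < 0.0:
        return 0.0
    if x > 1.0 / (alpha - 1.0):
        return 1.0
    return alpha * x / (1.0 + x)


def continuous_weights(n, alpha, rng):
    """Return r = g_alpha(u) / sum_i g_alpha(u_i); entries are >= 0 and sum to 1."""
    values = [g(rng.random(), alpha) for _ in range(n)]
    total = sum(values)   # positive unless every uniform draw is exactly 0
    return [v / total for v in values]


rng = random.Random(0)
alpha = 1.5               # any constant > 1; chosen only for illustration
r = continuous_weights(10, alpha, rng)
x = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]   # hypothetical data
replicate = sum(r_i * x_i for r_i, x_i in zip(r, x))      # weighted-mean replicate
print(replicate)
```

Unlike the multinomial weights, these weights vary continuously, and generating them needs only n uniform draws and one normalization, with no sorting step.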
2.1 Numerical Comparisons
The constant α employed here was determined through numerical experiments and, for mnemonic's sake, set to
\[
\alpha = \frac{4}{e} \approx 1.4715.
\]
Setting α = 4/e yields
\[
\mu = \frac{4}{e} \log\frac{4}{4-e} - 1 \approx 0.6747,
\qquad
\sigma^2 = \frac{4}{4-e} - \frac{16}{e^2} \left(\log\frac{4}{4-e}\right)^{2} \approx 0.3161.
\]
For α = 4/e, the densities fα(x) and fα∗10(x) of gα(u1) and gα(u1) + ⋯ + gα(u10), respectively, and the normal approximation to the latter (thin line) are shown in Fig. 1. Fig. 2 shows the densities fα∗2(x) to fα∗6(x). Table 2 gives numerical comparisons of the three bootstrapping methods concerned.
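The moments quoted for α = 4/e, and the kind of marginal probabilities reported for CB(n), can be reproduced approximately with a short simulation. The sketch below assumes that the CB(n) rows of Table 2 refer to the marginal distribution of n ri with α = 4/e; that reading, the function names, the number of draws, and the seed are assumptions of this example.

```python
import math
import random

ALPHA = 4.0 / math.e


def mean_g(alpha):
    """mu_alpha = alpha * log(alpha / (alpha - 1)) - 1."""
    return alpha * math.log(alpha / (alpha - 1.0)) - 1.0


def var_g(alpha):
    """sigma_alpha^2 = alpha / (alpha - 1) - (alpha * log(alpha / (alpha - 1)))^2."""
    return alpha / (alpha - 1.0) - (alpha * math.log(alpha / (alpha - 1.0))) ** 2


def cb_marginal(n, k_max, draws, seed=0):
    """Monte Carlo estimate of Pr{n * r_1 in [k - 0.5, k + 0.5)} for k = 0..k_max."""
    rng = random.Random(seed)
    hits = [0] * (k_max + 1)
    for _ in range(draws):
        values = [u / (ALPHA - u) for u in (rng.random() for _ in range(n))]
        # floor(y + 0.5) is the integer k with y in [k - 0.5, k + 0.5)
        k = math.floor(n * values[0] / sum(values) + 0.5)
        if k <= k_max:
            hits[k] += 1
    return [h / draws for h in hits]


print(round(mean_g(ALPHA), 4), round(var_g(ALPHA), 4))
print([round(p, 3) for p in cb_marginal(10, 3, 20000)])
```

The simulated frequencies carry Monte Carlo error of a few thousandths at this number of draws, so they should be compared with the tabulated values only to about two decimal places.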
Fig. 1. Densities fα(x) and fα∗10(x)
Fig. 2. Densities fα∗2(x) to fα∗6(x) (panels: fα∗2(x), fα∗3(x), fα∗4(x), fα∗6(x))
References
[BF81] Bickel, P.J. and Freedman, D.A.: Some asymptotic theory for the bootstrap. Ann. Statist., 9, 1196–1217 (1981)
[Efr79] Efron, B.: Bootstrap methods: another look at the jackknife. Ann. Statist., 7, 1–26 (1979)
[Rub81] Rubin, D.B.: The Bayesian bootstrap. Ann. Statist., 9, 130–134 (1981)