Numerische Mathematik
Numer. Math. 55, 137-157 (1989)
© Springer-Verlag 1989
A Monte Carlo Method for High Dimensional Integration
Yosihiko Ogata
The Institute of Statistical Mathematics, Minami-Azabu 4-6-7, Minato-ku, Tokyo 106, Japan
Summary. A new method for the numerical integration of very high dimensional functions is introduced and implemented, based on the Metropolis Monte Carlo algorithm. The logarithm of the high dimensional integral is reduced to a one-dimensional integration of a certain statistical function with respect to a scale parameter over the unit interval. The improvement in accuracy is found to be substantial compared with the conventional crude Monte Carlo integration. Several numerical demonstrations are made, and the variability of the estimates is shown.

Subject Classifications: AMS(MOS): 65D30; CR: G1.4.
1. Introduction
Suppose that we wish to estimate the integral

$$Z_K = \int_a^b \cdots \int_a^b f(x_1, x_2, \ldots, x_K)\, dx_1\, dx_2 \cdots dx_K, \tag{1}$$

for some constants $a$ and $b$. We shall denote the vector $(x_1, x_2, \ldots, x_K)$ by $x$. Numerical methods for the evaluation of $Z_K$ involve the calculation of $f(x)$ at a number $N$ of points $x_i$. The crude Monte Carlo method gives as an estimate for $Z_K$ the sum

$$\frac{1}{N} \sum_{i=1}^{N} f(x_i), \tag{2}$$
where the points $x_i$ are chosen at random in the range of integration. The error of such an estimate has standard deviation of order $O(N^{-1/2})$. Although there are some sophisticated modifications and improvements of the method (see Hammersley and Handscomb, 1964, for example), this method based on (2) is not practical for large $K$, the multiplicity of the integral: the integrated values are usually very small or very large, while the range of the function in (2) can be too large to get rid of the biased error caused by the skewness of the function; see Example 1 in Subsect. 4.1.
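To fix ideas, the following is a minimal sketch of the crude estimate (2) for the product-exponential integrand used later in Example 1. The shift by $K/2$ inside the exponential is the device against overflow mentioned in Subsect. 4.1; the program name and the use of the intrinsic random_number generator are illustrative assumptions, and this is not the program used for Table 2.

! Crude Monte Carlo estimate (2) on [0,1]^K, sketched for the
! Example 1 integrand f(x) = exp(x_1 + ... + x_K).
program crude_mc
  implicit none
  integer, parameter :: K = 1000
  integer, parameter :: N = 1000000
  double precision :: x(K), s
  integer :: i
  s = 0.d0
  do i = 1, N
     call random_number(x)            ! x_i uniform on [0,1]^K
     s = s + exp(sum(x) - K/2.d0)     ! f(x_i) scaled by exp(-K/2) to avoid overflow
  end do
  ! log of the crude estimate (2); the volume of [0,1]^K is 1
  print *, 'log Z estimate =', K/2.d0 + log(s/N)
end program crude_mc

Running such a program typically reproduces the underestimation seen in Table 2, because the average is dominated by rare, extremely large values of the integrand.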
In this paper we are interested in estimating $\log Z_K$ directly rather than $Z_K$ itself. The principal motivation of the present estimation is to provide a solution for the objective Bayesian method described in the companion paper (Ogata 1988). The idea for evaluating $\log Z_K$ is generalized and developed from the papers by Ogata and Tanemura (1981, 1984a, b, 1988), which discuss the maximum likelihood method for estimating the pairwise interaction potential of a Gibbsian distribution for spatial point patterns.
2. The Method
2.1. The Derivation
Suppose that $f(x) = f(x_1, x_2, \ldots, x_K)$ is a function defined, and to be integrated, on a $K$-dimensional cube $[a,b]^K$ which includes the origin $0 = (0, 0, \ldots, 0)$. Assume also that this function is bounded from below, so that without loss of generality we can hereafter assume positivity of the function. Consider a scaling parameter $\sigma$ such that $0 \le \sigma \le 1$, and let the vector $\sigma x$ denote $(\sigma x_1, \sigma x_2, \ldots, \sigma x_K)$. Define a family of probability densities $\{g_\sigma(x)\}$ on the cube $[a,b]^K$, characterized by the scale parameter $\sigma$, in such a way that

$$g_\sigma(x) = \frac{f(\sigma x)}{Z_K(\sigma)}, \tag{3}$$

where $Z_K(\sigma)$ is the normalizing factor

$$Z_K(\sigma) = \int_a^b \int_a^b \cdots \int_a^b f(\sigma x)\, dx. \tag{4}$$

Suppose that a random vector $\xi = (\xi_1, \xi_2, \ldots, \xi_K)$ is simulated from the density $g_\sigma(x)$. Then, from (3), we have

$$\log g_\sigma(\xi) = \log f(\sigma \xi) - \log Z_K(\sigma). \tag{5}$$
Under very broad regularity conditions, such as continuous differentiability of the function $f(x)$, we can expect

$$E_\sigma\!\left[\frac{\partial}{\partial\sigma} \log g_\sigma(\xi)\right] = 0, \tag{6}$$

where the expectation $E_\sigma[\cdot]$ is taken with respect to the distribution $g_\sigma(x)$. See Appendix 1 for the explicit conditions and the proof of Eq. (6). This equation implies the following equality:

$$E_\sigma\!\left[\frac{\partial}{\partial\sigma} \log f(\sigma\xi)\right] = \frac{\partial}{\partial\sigma} \log Z_K(\sigma). \tag{7}$$
For convenience in the later description, let us denote both sides of the equality in (7) by $\psi(\sigma)$. Replacing the expectation on the left-hand side of the above equality by the time average, we get a consistent and unbiased estimate of $\psi(\sigma)$,

$$\hat\psi(\sigma) = \frac{1}{T} \sum_{t=1}^{T} \frac{\partial}{\partial\sigma} \log f(\sigma x(t)), \tag{8}$$

where $\{x(t) = (x_1(t), x_2(t), \ldots, x_K(t));\ t = 1, 2, \ldots, T\}$ is a vector series of samples following the distribution $g_\sigma(x)$. The practical simulation methods include the so-called Metropolis procedure. For that procedure we define, in the present case, the potential function by

$$U_\sigma(x) = -\log f(\sigma x). \tag{9}$$

See Sect. 3 for a brief review of the Metropolis procedure and some notes on its application.
On the other hand, from Eq. (7), the eventual estimate of $\log Z_K = \log Z_K(1)$ is written as

$$\log Z_K(1) = \log Z_K(0+) + \int_0^1 \psi(\sigma)\, d\sigma, \tag{10}$$

where

$$Z_K(0+) = \lim_{\sigma \to 0} \int_a^b \cdots \int_a^b f(\sigma x)\, dx = (b-a)^K f(0), \tag{11}$$

and $0 = (0, 0, \ldots, 0)$.
2.2. Numerical Approximation and Error Estimate
Before carrying out the integration in (10), the estimates (8) of $\psi(\sigma_j) = (\partial/\partial\sigma) \log Z_K(\sigma_j)$ should be made for some $\sigma_j$ such that $0 \le \sigma_j \le 1$, preferably together with their estimated errors. In Ogata and Tanemura (1981, 1984a, b, 1988), polynomials or spline functions are fitted to this sort of experimental data to get a smooth and well fitted function, which is then used as the integrand in (10).

An alternative but simple method to evaluate the integral in (10) is the approximation by the trapezoidal rule, for example

$$\log \hat Z_K(1) = \log Z_K(0+) + \sum_{j=1}^{J-1} \frac{\sigma_{j+1} - \sigma_j}{2} \bigl(\hat\psi(\sigma_j) + \hat\psi(\sigma_{j+1})\bigr). \tag{12}$$

If the estimated errors of the $\hat\psi(\sigma_j)$ for the respective $\sigma_j$ are not highly inhomogeneous, the above sum, for example with equispaced nodes $\{\sigma_j\}_{j=1,2,\ldots,J}$ for a suitable $J$, is expected to provide an accurate estimate of the required integral in (10). This is because the estimate $\hat\psi(\sigma_j)$ for each $\sigma_j$ is consistent and unbiased, and further the random variables $\hat\psi(\sigma_j)$ and $\hat\psi(\sigma_k)$ are mutually independent for $k \ne j$, provided a suitable generation of the random numbers is carried out. Thus, if the standard error of each $\hat\psi(\sigma_j)$ is $s_j$, then the error variance of $\log Z_K(1)$ estimated by using (10) and (12) is given by

$$\sum_{j=1}^{J} w_j^2\, s_j^2, \qquad w_1 = \frac{\sigma_2 - \sigma_1}{2},\quad w_J = \frac{\sigma_J - \sigma_{J-1}}{2},\quad w_j = \frac{\sigma_{j+1} - \sigma_{j-1}}{2}\ (1 < j < J). \tag{13}$$
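As a sketch of how the node estimates are combined, the following routine evaluates (12) and the error variance (13) from given values $\hat\psi(\sigma_j)$ and standard errors $s_j$; it assumes, as argued above, that the node estimates are mutually independent, and the routine name and argument list are of course only illustrative.

! Combine node estimates psihat(j) at nodes sg(j) in [0,1] by the
! trapezoidal rule (12) and propagate their standard errors se(j)
! into the error variance (13).  A sketch; assumes J >= 2 and that
! the psihat(j) are mutually independent.
subroutine combine_nodes(J, logz0, sg, psihat, se, logz1, sdlogz1)
  implicit none
  integer, intent(in)           :: J            ! number of nodes
  double precision, intent(in)  :: logz0        ! log Z_K(0+)
  double precision, intent(in)  :: sg(J), psihat(J), se(J)
  double precision, intent(out) :: logz1, sdlogz1
  double precision :: w, var
  integer :: jj
  logz1 = logz0
  var   = 0.d0
  do jj = 1, J
     ! trapezoidal weight w_j of node jj, as in (13)
     if (jj == 1) then
        w = (sg(2) - sg(1)) / 2.d0
     else if (jj == J) then
        w = (sg(J) - sg(J-1)) / 2.d0
     else
        w = (sg(jj+1) - sg(jj-1)) / 2.d0
     end if
     logz1 = logz1 + w * psihat(jj)
     var   = var   + (w * se(jj))**2
  end do
  sdlogz1 = sqrt(var)
end subroutine combine_nodes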
Estimation of $s_j$ can be carried out in the following way. Let us set $\eta_\sigma(t) = (\partial/\partial\sigma) \log f(\sigma x(t))$. Then, from (8), we have approximately

$$\operatorname{Var}(\hat\psi(\sigma)) \approx \frac{\operatorname{Var}(\eta_\sigma)}{T} \left\{ 1 + 2 \sum_{\tau=1}^{T_0} \left(1 - \frac{\tau}{T}\right) \rho_\sigma(\tau) \right\}, \tag{14}$$

where $\rho_\sigma(\tau)$ is the auto-correlation of the series $\{\eta_\sigma(t)\}$ at lag $\tau$, and $\rho_\sigma(\tau) = 0$ is assumed for all $\tau > T_0$ with some $T_0$. Here $\operatorname{Var}(\eta_\sigma)$ is estimated by

$$\frac{1}{T} \sum_{t=1}^{T} \bigl(\eta_\sigma(t) - \hat\psi(\sigma)\bigr)^2 = \frac{1}{T} \sum_{t=1}^{T} \eta_\sigma(t)^2 - \hat\psi(\sigma)^2, \tag{15}$$

and $\rho_\sigma(\tau)$ is estimated by the sample auto-correlation

$$\hat\rho_\sigma(\tau) = \frac{\displaystyle \frac{1}{T-\tau} \sum_{t=1}^{T-\tau} \bigl(\eta_\sigma(t) - \hat\psi(\sigma)\bigr)\bigl(\eta_\sigma(t+\tau) - \hat\psi(\sigma)\bigr)}{\displaystyle \frac{1}{T} \sum_{t=1}^{T} \bigl(\eta_\sigma(t) - \hat\psi(\sigma)\bigr)^2}. \tag{16}$$

Further, the lag $T_0$ in (14) is determined by looking at the plotted graph of $\hat\rho_\sigma(\tau)$ in (16) against the time lag $\tau$. These evaluation methods will be used for the numerical examples in the later sections.
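In code, the estimate (8) and its standard error via (14)-(16) can be computed along the following lines. The cutoff $T_0$ is passed in by the caller, whereas in the text it is chosen by inspecting the plotted correlogram; the routine is a sketch and not the program actually used for the experiments.

! Time average psihat = (1/T) sum eta(t) and its standard error from
! (14)-(16), where eta(t) = d/dsigma log f(sigma*x(t)) along the
! Metropolis series.  Assumes 0 <= T0 < T.
subroutine psi_stderr(T, T0, eta, psihat, se)
  implicit none
  integer, intent(in)           :: T, T0
  double precision, intent(in)  :: eta(T)
  double precision, intent(out) :: psihat, se
  double precision :: vareta, rho, varpsi
  integer :: tau, t1
  psihat = sum(eta) / T
  vareta = sum((eta - psihat)**2) / T                 ! estimate (15)
  varpsi = vareta / T
  do tau = 1, T0
     rho = 0.d0
     do t1 = 1, T - tau                               ! numerator of (16)
        rho = rho + (eta(t1) - psihat) * (eta(t1+tau) - psihat)
     end do
     rho = (rho / (T - tau)) / vareta                 ! sample autocorrelation
     varpsi = varpsi + 2.d0 * (vareta / T) * (1.d0 - dble(tau)/T) * rho
  end do
  se = sqrt(varpsi)
end subroutine psi_stderr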
3. Simulation of the Gibbsian Random Field
For a practical method to obtain samples from the distribution in (3), let us review a simulation method which uses a particular type of random walk known as a Markov chain. The simulation was originally devised by Metropolis et al. (1953) and developed by Wood (1968) and others for the study of atomic systems.

Consider a continuous Gibbs random field on a state space $V_K$ (the cube $[a,b]^K$ in the present case for (1)) whose probability density distribution $g(x)$ has the form

$$g(x) = \frac{1}{Z} \exp\{-U(x)\}, \tag{17}$$

where $U(x)$ is the total potential of $x$, and $Z$ is the normalizing constant.
The most commonly used algorithm is described in the following manner. Assume that, at time $t$, the state of the $K$-particle system is $x(t) = \{x_k(t)\ \text{in}\ V_K;\ k = 1, \ldots, K\}$. A trial state $x'(t+1) = \{x'_k(t+1)\}$ for the next time $t+1$ is then chosen in such a way that the coordinate $x'_r(t+1)$ of a randomly chosen particle $r$ lies in the interval $[x_r(t) - \delta,\ x_r(t) + \delta]$ (that is to say, $x'_r(t+1) = x_r(t) + \delta(1 - 2\theta)$ for a uniform random number $\theta$), while all other $K-1$ particles keep the same positions as in state $x(t)$; here $\delta > 0$ is a parameter to be discussed below. The corresponding total potential energy $U(x'(t+1))$ in (17) is then calculated and compared with $U(x(t))$ as follows.

1. If $U(x'(t+1)) < U(x(t))$, then without further ado the next state $x(t+1)$ of the realization is taken as the trial state $x'(t+1)$.

2. If $U(x'(t+1)) \ge U(x(t))$, then we obtain a uniform random number $\eta$, and (i) if $\eta \le \exp\{U(x(t)) - U(x'(t+1))\}$, the state $x(t+1)$ is taken to be the trial state $x'(t+1)$; (ii) otherwise, the state $x(t+1)$ is taken to be the previous state $x(t)$.
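A single update of the above type may be sketched as follows. The routine assumes a user-supplied total-potential function and simply rejects trial values falling outside $[a, b]$, which is only one of several possible boundary conventions (Example 1 below uses the whole interval as the trial range, and Example 2 identifies the endpoints as a torus); in practice the potential difference is computed incrementally rather than from scratch, as in the program of Appendix 2.

! One Metropolis update of a randomly chosen coordinate of x for the
! Gibbs density (17).  A sketch: u_total is a user-supplied function,
! and moves leaving [a,b] are rejected.
subroutine metropolis_step(K, x, a, b, delta, u_total, u, nacc, nrej)
  implicit none
  integer, intent(in)             :: K
  double precision, intent(inout) :: x(K), u       ! state and its potential
  double precision, intent(in)    :: a, b, delta
  integer, intent(inout)          :: nacc, nrej
  interface
     double precision function u_total(n, x)
       integer, intent(in) :: n
       double precision, intent(in) :: x(n)
     end function u_total
  end interface
  double precision :: theta, xold, utrial, eta
  integer :: r
  call random_number(theta)
  r = 1 + int(K * theta)                 ! randomly chosen coordinate
  if (r > K) r = K
  call random_number(theta)
  xold = x(r)
  x(r) = xold + delta * (1.d0 - 2.d0*theta)   ! trial move within [-delta, delta]
  if (x(r) < a .or. x(r) > b) then
     x(r) = xold                          ! outside the cube: reject
     nrej = nrej + 1
     return
  end if
  utrial = u_total(K, x)
  if (utrial <= u) then
     u = utrial                           ! downhill: accept
     nacc = nacc + 1
  else
     call random_number(eta)
     if (eta <= exp(u - utrial)) then     ! uphill: accept with prob exp{U - U'}
        u = utrial
        nacc = nacc + 1
     else
        x(r) = xold                       ! otherwise keep the previous state
        nrej = nrej + 1
     end if
  end if
end subroutine metropolis_step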
It should be noticed that the normalizing factor $Z$ in (17) has not been used in the simulation. In essence, the Monte Carlo procedure here is nothing but a selection of the transition probabilities

$$q(x, dy) = \operatorname{Prob}\{x(t+1) \in dy \mid x(t) = x\} \tag{18}$$

of a Markov chain $x(t)$ which satisfy $\int p(dx)\, q(x, dy) = p(dy)$ for all states $y$ of the $K$-particle system in $V_K$ (for the stationary probability $p(dx) = g(x)\,dx$), and further satisfy the condition that the $n$-step transition probability $q^{(n)}(x, dy)$ converges to the given equilibrium probability $p(dy)$ in (17). Thus, of course, there are many possible algorithms satisfying these conditions other than the above (Wood 1968). For example, Ripley (1979) applies a spatio-temporal birth and death process and uses the rejection method in generating a birth point, but the range of the trial point is the whole state space.
The parameter $\delta$, the maximum single-step displacement allowed in passing from one state to the next, ought in principle to be adjusted for an optimum rate of convergence of the Markov chain. Wood (1968), according to his experiments, suggests that a reasonable choice of the adjusting parameter $\delta$ is a value leading to the rejection of the trial configuration on about half of the time steps. This is a trade-off, especially in the case of a highly varying Gibbs distribution, between effective transition of the state and avoidance of unnecessary repetition of the same state. This suggestion can be confirmed by comparing the rate at which the auto-correlation (16) of the series of $\psi$-values or potential values converges to zero as the time lag $\tau$ increases. In addition to the selection of $\delta$, in order to attain the equilibrium state in fewer time steps in the Monte Carlo simulation, the initial configuration should be suitably chosen. We adopt the final equilibrium state $x(T)$ used for the estimate $\hat\psi(\sigma_j)$ in (12) as the initial state of the next step $\sigma_{j+1}$. To examine whether the states have reached equilibrium or not, we can simply look at the plot of the time series of potentials or $\psi$-values. Also it is suggested that the same results can be obtained by taking $r = 1, 2, \ldots, K$ sequentially instead of the random choice of $r$. Further practical discussion will be given in Sect. 5.
4. Numerical Experiments
4.1. Example 1
As an example the integral

$$Z = \int_0^1 \cdots \int_0^1 e^{x_1} e^{x_2} \cdots e^{x_K}\, dx_1\, dx_2 \cdots dx_K, \tag{19}$$

with $K = 1000$ was calculated. The expected values are therefore

$$Z_K(\sigma) = (e^\sigma - 1)^K / \sigma^K \quad\text{and}\quad \log Z_K(\sigma) = K \log\{(e^\sigma - 1)/\sigma\},$$

in particular $Z = (e-1)^K = 1.24279 \times 10^{235}$, $Z_K(0+) = 1$ and $\log Z_K(1) = 541.3248$. Further $\psi(\sigma) = \partial \log Z_K(\sigma)/\partial\sigma$, in particular $\psi(0+) = K/2 = 500$ and $\psi(1) = K/(e-1) = 581.9767$.
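For completeness, the explicit form of $\psi(\sigma)$ underlying Fig. 1a follows directly from $Z_K(\sigma)$:

$$\psi(\sigma) = \frac{\partial}{\partial\sigma}\, K \log\frac{e^{\sigma}-1}{\sigma} = K\left(\frac{e^{\sigma}}{e^{\sigma}-1} - \frac{1}{\sigma}\right), \qquad \lim_{\sigma\to 0+}\psi(\sigma) = \frac{K}{2}, \qquad \psi(1) = \frac{K}{e-1}.$$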
The potential function for simulating the samples is

$$U_\sigma(x) = -\log f(\sigma x) = -\sigma \sum_{k=1}^{K} x_k. \tag{20}$$

Further,

$$\frac{\partial}{\partial\sigma} \log f(\sigma x) = \eta_\sigma(x) = \sum_{k=1}^{K} x_k. \tag{21}$$
We now suppose that the values of $\log Z_K(\sigma)$ and $\psi(\sigma)$ are unknown but, of course, that the value of $\log Z_K(0+) = 0$, the form of the potential $U_\sigma(x)$ and its derivative $\partial U_\sigma(x)/\partial\sigma$ for each $\sigma$ are known. The equidistant nodal points $\{\sigma_j;\ j = 1, 2, \ldots, J\}$ are taken from the unit interval $[0, 1]$.

Since the statistic in (21), and the potential function (20), are very smooth and rather flat with respect to $\sigma$, the trial interval $[x_r(t) - \delta,\ x_r(t) + \delta]$ (see Sect. 3) is taken to be the whole space $[0, 1]$ in the present case. For each $\sigma_j$ the experiments of the first $10 \times K$ steps, which may not be in equilibrium, are thrown away.
After some computer simulations with $T = K \times 10^4$ steps in (8) for $K = 1000$, we obtained the respective estimates $\hat\psi(\sigma_j)$ at $\sigma_j = (j-1)/10$, $j = 1, 2, \ldots, 11$, together with their standard errors $s_j$. See Fig. 1a for the shape of the true $\psi$ function; it looks linearly increasing, but not exactly so. The variability of the estimated $\psi$-functions over 20 experiments is shown in Fig. 1b, and the estimate of $\operatorname{Var}\{\eta_\sigma(t)\}^{1/2}$ in (15) is plotted in Fig. 1c. Each of the graphs seems homogeneous throughout $[0, 1]$, but notice that the scales of the y-axes are quite different. It is seen that the variability of the errors of the $\psi$-function is very small compared with that of the $\psi$-values themselves, and further that it is quite symmetric with respect to the origin. These are the reasons for the success in obtaining an accurate estimate in spite of the very rough partition $\{\sigma_j\}_{j=1,2,\ldots,11}$ of $[0, 1]$. The estimated $\log Z$ and their error estimates (13) using (14) are listed in Table 1. It seems that the difference between the estimate and the true value of the integral is distributed according to the normal distribution with the given error estimate.
Fig. 1. a Plot of the theoretical $\psi(\sigma)$ versus $\sigma$ of Example 1. b Variability of $\hat\psi_i(\sigma_j) - \psi(\sigma_j)$ versus $\sigma_j$ for 20 experiments $i$ of Example 1. Sign "+" is the estimate for each $\sigma_j$ and each experiment. c Estimates of $\operatorname{Var}\{\eta_\sigma\}^{1/2}$ in (15) versus $\sigma_j$ for the respective 20 experiments in Example 1. Sign "+" is the estimate for each $\sigma_j$ and each experiment
Further, in Table 2, estimates by the crude Monte Carlo method in (2) and their logarithms are provided for comparison. Here, to avoid overflow, $f(x) = e^{1000/2} \exp\{\sum_k (x_k - 1/2)\}$ is used as the function in (2).
Table 1. Present Monte Carlo estimates for Example 1 with log Z_1000 = 541.3248. The step size is T = 10^3 x 10^4, the number of nodal points is J = 11, and uniform random numbers 2 x J times as many as the step size T are used. Standard errors (13) using (14) are attached for the respective estimated integral values

No.  log Z(1)    s.e.      No.  log Z(1)    s.e.
 1   541.3636    0.02329   11   541.3411    0.02329
 2   541.3393    0.02323   12   541.3232    0.02332
 3   541.3299    0.02318   13   541.3272    0.02324
 4   541.3783    0.02331   14   541.3508    0.02329
 5   541.3408    0.02325   15   541.3467    0.02331
 6   541.3430    0.02322   16   541.3698    0.02327
 7   541.3027    0.02328   17   541.4006    0.02330
 8   541.3445    0.02328   18   541.3463    0.02323
 9   541.3728    0.02329   19   541.2827    0.02323
10   541.3756    0.02326   20   541.3457    0.02338
Table 2. Crude Monte Carlo estimates for Example 1 with Z_1000 = 0.124279 x 10^236, or log Z_1000 = 541.3248. The number of steps in (2) is N = 10^6. Random numbers K = 1000 times as many as N = 10^6 are used

No.  Z_1000                log Z      No.  Z_1000                log Z
 1   0.3384991 x 10^232    532.5976    6   0.1432387 x 10^231    529.9539
 2   0.8431270 x 10^230    529.4239    7   0.5987711 x 10^230    529.0817
 3   0.3299905 x 10^231    530.7885    8   0.1544526 x 10^231    530.0293
 4   0.4164859 x 10^231    531.0213    9   0.1184123 x 10^231    529.7636
 5   0.9635451 x 10^230    529.5574   10   0.6898787 x 10^230    529.2233
Nevertheless, considerable underestimates took place, owing to the very large variation and skewness of the function. In particular, the value of the time average (2) for the present example is almost dominated by the extremes of the function $f(x)$, so that the sample size needed to approach the right order of the true value seems to be prohibitively huge. The random numbers used for the crude Monte Carlo method are more than 5 times as many as those used for the method suggested in this paper. From these, the significant improvement in accuracy by the present method is seen clearly. From our experience, the accuracy does not seem to depend on the dimension of the integral, although the c.p.u. time grows in proportion to the dimension.
4.2. Example 2

The next example shows how a high dimensional integration over an infinite domain can be reasonably calculated. Consider the integral

$$Z = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x)\, dx \tag{22}$$

for $x = (x_1, \ldots, x_K)$, where $f(x) = \exp\{-\tfrac{1}{2} x B x'\}$ and $B$ is the $K \times K$ Toeplitz matrix

$$B = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \ddots & \vdots \\ 0 & -1 & 2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{pmatrix}.$$

Therefore $Z = (2\pi)^{K/2} \det(B)^{-1/2} = (2\pi)^{K/2} (K+1)^{-1/2}$. One possible way to calculate (22) by our method would be to reset the integral by a transformation of variables from the infinite range into some finite range, but here we directly approximate the integral by taking the range $[a, b]$ in (1) sufficiently large. Let us take $a = -20.0$, $b = 20.0$ and $K = 1000$; then we can expect in the present case that the truncated integral (1) approximates (22) well. The analytically expected values used to check the accuracy are

$$\log Z_K(0+) = K \log(b-a) = 1000 \log 40.0,$$
$$\log Z_K(1) = \log Z = (K/2) \log 2\pi - (1/2) \log(K+1) = 915.484,$$

and $\psi(1) = -K = -1000$.
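As a check on these values, note that the determinant of the tridiagonal Toeplitz matrix $B$ follows from the standard three-term recursion:

$$D_n = \det B_n = 2 D_{n-1} - D_{n-2}, \quad D_1 = 2,\ D_2 = 3 \ \Longrightarrow\ D_K = K+1,$$
$$\log Z = \frac{K}{2}\log 2\pi - \frac12 \log\det B = 500 \log 2\pi - \tfrac12 \log 1001 \approx 918.939 - 3.454 = 915.484.$$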
The potential function for simulating the samples is

$$U_\sigma(x) = -\log f(\sigma x) = \frac{\sigma^2}{2}\, x B x', \tag{23}$$

and then

$$\frac{\partial}{\partial\sigma} \log f(\sigma x) = \eta_\sigma(x) = -\sigma\, x B x'. \tag{24}$$
Following the suggestion of Wood described in the last paragraph of Sect. 3, I made some test experiments to choose the maximum single-step displacement parameter $\delta = 2.0$ for the random walk of the trial points on the state space $[-20, 20]$, whose extremes are identified so that it is regarded as a torus. Looking at the time series of potentials or $\psi$-values for each $\sigma_j$, $j = 1, 2, \ldots, J$, the experiments of the first $K \times 10^3$ steps, which may not be in equilibrium, were thrown away.

Since the variation of the function $\psi(\sigma)$ with respect to $\sigma$ seems large compared with that of the previous example, I took $J = 81$ in order to get a sufficiently accurate estimate of $\log Z_K(1)$ by the trapezoidal rule (12): see Fig. 2a for the shape of the $\psi$ function obtained by averaging the estimates $\hat\psi(\sigma_j)$ of 10 experiments for each $\sigma_j$, Fig. 2b for the variability of $\hat\psi$ obtained from the 10 experiments, and Fig. 2c for the standard errors of the time series $\eta_\sigma(t) = (\partial/\partial\sigma) \log f(\sigma x(t))$ over $t$ for the respective $\sigma = \sigma_j$. It should be noted here that $\operatorname{Var}(\eta_\sigma(t))^{1/2}$ is very similar to the absolute value of $\psi(\sigma)$, although its scale is much smaller. The magnitude of the variability in Fig. 2b is also related to the other two figures, but not quite in the beginning part. This is due to the different mixing rates, or convergence rates of the correlation $\hat\rho_\sigma(\tau)$ in (16), among the different $\sigma = \sigma_j$ in my Metropolis procedure: see Eq. (14) for the reason. Note also that the variability in Fig. 2b seems symmetric, which is good news for the estimate $\hat\psi(\sigma)$ from the viewpoint of unbiased estimation theory.
Fig. 2. a Plot of the sample average $\bar\psi(\sigma_j) = \sum_{i=1}^{10} \hat\psi_i(\sigma_j)/10$ versus $\sigma_j$ for 10 experiments of Example 2. b Variability of $\hat\psi(\sigma_j) - \bar\psi(\sigma_j)$ for 10 experiments of Example 2. Sign "+" is the estimate for each $\sigma_j$ and each experiment. c Estimates of $\operatorname{Var}\{\eta_\sigma\}^{1/2}$ in (15) versus $\sigma_j$ for the respective 10 experiments in Example 2. Sign "+" is the estimate for each $\sigma_j$ and each experiment
Table 3. Estimates of the logarithm of the integral (22) of Example 2 and their standard errors, with T = 1000 x 5000 and J = 81. The true value is log Z = 915.484

No.  log Z(1)   s.e.    No.  log Z(1)   s.e.
 1   915.862    0.517    6   914.684    0.524
 2   915.470    0.525    7   918.168    0.521
 3   916.208    0.520    8   916.457    0.524
 4   916.592    0.521    9   916.547    0.523
 5   917.620    0.519   10   916.699    0.520
Table 3 lists the estimated values of $\log Z_K(1)$ thus obtained and also their estimated errors (13) using (14). From this it is seen that the performance is reasonably good, although we see a slight bias from the true value which, we suspect, may be due to an insufficient number of partitions, especially in the beginning part of the interval.
4.3. Example 3
For the last example, consider a random field $x = \{x_{ij}\}$ on the $32 \times 32$ lattice $(i, j)$, where each $x_{ij}$ takes values in $S$, a one dimensional torus identified with $[-5, 5]$, and $x_{ij}$ is conditionally dependent on the values $x_{km}$, $(k, m) \in N_{ij}$, within the second nearest neighbourhood $N_{ij} = \{(i, j \pm 1), (i \pm 1, j), (i \pm 1, j \mp 1)\}$ on the lattice. The potential function here is arbitrarily defined in such a way that

$$U_\sigma(x; \tau) = -\sigma \sum_{i,j=2}^{31} \exp\Bigl\{ -\tau \Bigl\| x_{ij} - \sum_{(k,m)\in N_{ij}} x_{km} \Bigr\|^{2} \Bigr\} \tag{25}$$
with an adjusting parameter $\tau$, where the metric $\|\cdot\|$ is the natural distance on the torus $S$. The values on the boundary of the lattice are fixed in such a way that

$$x_{ij} = \begin{cases} 1 & \text{for } i = 1 \text{ and } 1 \le j \le 16, \text{ or } 1 \le i \le 16 \text{ and } j = 1, \\ 2 & \text{for } i = 1 \text{ and } 17 \le j \le 32, \text{ or } 1 \le i \le 16 \text{ and } j = 32, \\ -1 & \text{for } i = 32 \text{ and } 1 \le j \le 16, \text{ or } 17 \le i \le 32 \text{ and } j = 1, \\ 0 & \text{for } i = 32 \text{ and } 17 \le j \le 32, \text{ or } 17 \le i \le 32 \text{ and } j = 32. \end{cases} \tag{26}$$
Then we calculate the normalizing constant of the Gibbs distribution,

$$Z = \int_{-5}^{5} \int_{-5}^{5} \cdots \int_{-5}^{5} \exp\{-U_1(x; \tau)\}\, dx. \tag{27}$$

We do not know the analytic value of the integral, but let us see the estimated value of $\log Z$ for each $\tau = 0.1, 0.2, \ldots, 0.8$. Thus, for each $\tau$, the $\psi$-value is calculated at 101 equidistant points $\{\sigma_k\}$ between 0 and 1, and at a further 21 equidistant points between 0 and 0.05 (thus 116 distinct points altogether in $[0, 1]$).
Fig. 3. a Plots of $\hat\psi_\tau(\sigma)$ versus $\sigma_k$, $\tau = 0.1, 0.2, \ldots, 0.8$, of Example 3. The horizontal dotted lines are the level of zero for the respective $\tau$. b Estimates of $\operatorname{Var}\{\eta_\sigma\}^{1/2}$ versus $\sigma_k$ for the respective $\tau$
The estimated functions $\hat\psi_\tau(\sigma_k)$ for $\tau = 0.1, 0.2, \ldots, 0.8$ are plotted in Fig. 3a. As in the previous example, the shape of the standard error of the time series $\eta_\sigma(t)$ in Fig. 3b for each $\tau$ is very similar to that of the corresponding $|\hat\psi_\tau(\sigma)|$. It is seen from the definition that $\psi(0) = 0$ always holds for any $\tau$.
Fig. 3b
We suspect that the sharp trough next to $\sigma = 0$ in Fig. 3a might be related to the sharp increase rate of the potential function in (25) around $\sigma = 0$, especially when $\tau$ is small. This is the reason why we have taken the further nodal points in the interval $[0, 0.05]$, in order to maintain a reasonable accuracy of the numerical integration. Integrating these functions by the trapezoidal rule, we obtain Table 4.
Table 4. Summary of the estimation of the logarithm of the integral (27) of Example 3

tau     log Z(0+)      int_0^1 psi(s) ds    log Z(1)
0.10    1024000.00     -44895.08            979104.92
0.20     128000.00      -7893.36            120106.64
0.30      37925.93      -4870.52             33055.41
0.40      16000.00      -3858.16             12141.84
0.50       8192.00      -3376.75              4815.25
0.60       4740.74      -2756.10              1984.64
0.70       2985.42      -1927.37              1058.06
0.80       2000.00      -1324.99               675.01
5. Discussions
In principle, the essential assumption on the class of integrand functions for the applicability of the present method is the continuous differentiability of the function, needed to ensure the relations (6) and (11): see Appendix 1. However, from the preceding three examples, we have learnt that the smoother the $\psi$-function is with respect to $\sigma$, the fewer nodal points $\{\sigma_k\}$ are needed to maintain a reasonably accurate estimate of the integral. Indeed, the error variance (13) shows the accuracy of the estimate $\log \hat Z_K(1)$. Suppose the value $\hat\psi(\sigma_j)$ is large; then $s_j^2 = \operatorname{Var}\{\hat\psi(\sigma_j)\}$ is also expected to be proportionally large under a moderate sample size $T$ in (8): see Figs. 2a and c and also Figs. 3a and b, for instance. In such a case the interval lengths $\sigma_j - \sigma_{j-1}$ between the nodal points in such particular parts should be small enough to maintain the required accuracy as well as to avoid bias. Further, it is fortunate for the estimate $\hat\psi(\sigma)$ in (8), from the viewpoint of unbiased estimation theory, that the variability of the estimate (8) seems symmetric: see Figs. 1b and 2b.
The Metropolis simulation method involves some delicate parameters, namely the maximum single-step displacement $\delta$ and the number of steps needed to reach equilibrium. We usually begin with a large $\delta$ and decrease it, so that the ratio of rejections of the trial configurations decreases. We continue this until Wood's suggestion is met, namely that the trials are rejected on roughly half of the steps: that is, ID3 ≈ ID1 + ID2 very roughly holds in the output of the program in Appendix 2. To check that the pattern has reached equilibrium by the end of the burn-in steps (that is, IS in the program), we examine graphs of the moving average of the time series of the total potential values or of the $\psi$-values. If the time series appears stationary, we may judge that the patterns have reached equilibrium. To reach equilibrium as early as possible, the step size $\delta$ had better be as large as possible as long as the above rejection ratio is satisfied. This also makes the mixing rate of the Markov process in (18) faster. The mixing rate is partly measured by the auto-correlations of the time series of potentials or $\psi$-values: thus $T_0$ in (14) relates to the mixing rate.
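The tuning loop just described may be sketched as follows, where run_block stands for any routine that performs a fixed number of Metropolis updates with the current $\delta$ and returns the number of rejected trials; the adjustment factors and tolerance are arbitrary choices of this sketch, not values recommended in the text.

! Adjust delta until roughly half of the trial moves are rejected
! (Wood's suggestion).  run_block is an assumed user routine.
subroutine tune_delta(delta, nstep, run_block)
  implicit none
  double precision, intent(inout) :: delta
  integer, intent(in) :: nstep
  interface
     subroutine run_block(delta, nstep, nrej)
       double precision, intent(in) :: delta
       integer, intent(in)  :: nstep
       integer, intent(out) :: nrej
     end subroutine run_block
  end interface
  double precision :: ratio
  integer :: iter, nrej
  do iter = 1, 20
     call run_block(delta, nstep, nrej)
     ratio = dble(nrej) / nstep
     if (abs(ratio - 0.5d0) < 0.05d0) return   ! close enough to one half
     if (ratio > 0.5d0) then
        delta = delta * 0.8d0                  ! too many rejections: smaller steps
     else
        delta = delta * 1.25d0                 ! too few rejections: larger steps
     end if
  end do
end subroutine tune_delta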
Acknowledgements. I have benefited greatly from useful discussions with Masaharu Tanemura, who has been working with me on the estimation of the interaction potential in spatial point patterns. Koichi Katsura generously helped me in preparing the Figures and Tables of the paper. The comments of the anonymous referee were helpful for the revision of the present paper.
Appendix 1. Assumptions for Eq. (6)

We assume that

1. for almost every $x \in [a,b]^K$, $f(x) > 0$ holds;
2. for almost every $x \in [a,b]^K$, $\partial f(\sigma x)/\partial\sigma$ exists for all $\sigma \in [0, 1]$; and
3. for every $\sigma \in [0, 1]$, $|\partial f(\sigma x)/\partial\sigma| \le A(x)$ holds, where $A(x)$ is integrable over $[a,b]^K$.

Since, from (3) and (4),

$$\int_{[a,b]^K} g_\sigma(x)\, dx = 1, \tag{28}$$

and because of the above assumptions 2 and 3, using Fubini's theorem for the interchange of integration and differentiation, we have

$$\int_{[a,b]^K} \frac{\partial g_\sigma(x)}{\partial\sigma}\, dx = \frac{\partial}{\partial\sigma} \int_{[a,b]^K} g_\sigma(x)\, dx = 0 \tag{29}$$

for every $\sigma$ in $[0, 1]$, and so, adding the above assumption 1,

$$E_\sigma\!\left[\frac{\partial}{\partial\sigma} \log g_\sigma(\xi)\right] = \int_{[a,b]^K} \frac{1}{g_\sigma(x)}\, \frac{\partial g_\sigma(x)}{\partial\sigma}\, g_\sigma(x)\, dx = 0, \tag{30}$$

which is nothing but Eq. (6).

Now, since both $\sigma$ and $x$ are defined on the compact sets $[0, 1]$ and $[a,b]^K$ respectively, the above assumptions 2 and 3 are implied, for example, by the continuous differentiability of the function $f(x)$ on $[a,b]^K$. This, of course, implies $f(0) < \infty$ and $f'(0) < \infty$. The former is necessary for the relation (11).
Appendix 2. Fortran Program

The Fortran program given here is for Example 1. However, it is just a test program for the readers. For example, the dimension here is K = 100 instead of K = 1000; the calculations for the latter case were carried out for the present paper and are given in Table 1. Also, a very simple subroutine for the generation of uniform random numbers is given here. However, for avoiding biases of the present statistics $\hat\psi(\sigma_j)$ with such a very large sample size of Monte Carlo experiments, it is crucial to choose a well tested random number generator. Throughout this work, I have used the physical random number generator (200 kilobytes per second) installed in the Institute of Statistical Mathematics. The program here can easily be rewritten for a computer having a parallel processor, which will improve the c.p.u. time significantly.
      PROGRAM MAIN
C
C     Monte Carlo integration: estimation of log(Z_NN).
C     The exponential function exp{X1+...+XNN} is integrated as an example.
C
C     Inputs:
C       IX,IY,IZ ... seeds for the random number generator.
C       NN ......... dimension (multiplicity) of the integration.
C       II ......... number of sweeps, giving the sample size T = NN*II.
C       IS ......... number of initial sweeps discarded before equilibrium.
C       IM ......... step interval for displaying psi-value estimates.
C       JDT ........ number of equidistant subintervals of [0,1] for the
C                    trapezoidal rule integrating the psi-function.
C     Outputs:
C       AVF ........ time average of the psi-function at each SG_J,
C                    where J = 1,2,...,JDT+1.
C       ID1,ID2,ID3  counts of downhill acceptances, uphill acceptances
C                    and rejections of the trial values.
C       SS ......... partial sums of the trapezoidal rule up to each SG_J.
C
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION X(2000)
      DATA IX,IY,IZ/1574,226,47/
      DATA ID1,ID2,ID3/0,0,0/
      DATA SS/0.0D0/
      NN=100
      II=200
      IS=100
      IM=100
      JDT=20
C     initial configuration: uniform on [0,1]**NN
      DO 100 N=1,NN
  100 X(N)=RANDOM(IX,IY,IZ)
      DO 50 J=1,JDT+1
      DT=1.D0/JDT
      SG=(J-1)*DT
      FN=PSI(NN,1,X(1),SG,X,FN)
      K=0
      POT=1.D60
      SF=0.0D0
      SF2=0.0D0
      WRITE(6,3)
    3 FORMAT(/1H ,6X,'II',4X,'Psi-values',4X,'Differences',
     &       5X,'s.e.(Psi)',5X,'ID1',7X,'ID2',7X,'ID3')
      DO 10 I=1,II+IS
      IT=I-IS
      DO 20 N=1,NN
      XTEST=RANDOM(IX,IY,IZ)
      TESTPT=POTEN(NN,N,XTEST,SG,X,POT)
      IF(TESTPT.GT.POT) GO TO 30
C     downhill move: accept without further test
      ID1=ID1+1
      POT=TESTPT
      IF(I.LT.IS) GO TO 20
      FN=PSI(NN,N,XTEST,SG,X,FN)
      X(N)=XTEST
      K=K+1
      SF=SF+FN
      SF2=SF2+FN**2
      GO TO 20
   30 CONTINUE
C     uphill move: accept with probability exp{POT-TESTPT}
      PROBAB=DEXP(POT-TESTPT)
      RN=RANDOM(IX,IY,IZ)
      IF(RN.GT.PROBAB) GO TO 40
      ID2=ID2+1
      POT=TESTPT
      IF(I.LT.IS) GO TO 20
      FN=PSI(NN,N,XTEST,SG,X,FN)
      X(N)=XTEST
      K=K+1
      SF=SF+FN
      SF2=SF2+FN**2
      GO TO 20
   40 CONTINUE
C     rejection: keep the previous state and count it again
      ID3=ID3+1
      IF(I.LT.IS) GO TO 20
      K=K+1
      SF=SF+FN
      SF2=SF2+FN**2
   20 CONTINUE
      IF(MOD(IT,IM).NE.0) GO TO 10
      IF(K.EQ.0) GO TO 10
      AVF=SF/K
      AVF2=SF2/K
      SDK=DSQRT(AVF2-AVF**2)
      CALL DIFFER(NN,J,JDT+1,SG,DT,AVF,SS,D1,D2)
      WRITE(6,1) IT,AVF,D1,SDK,ID1,ID2,ID3
    1 FORMAT(1H ,I7,1X,D13.6,1X,2D12.4,2X,3I10)
      ID1=0
      ID2=0
      ID3=0
   10 CONTINUE
C     trapezoidal rule: half weights at the end points SG=0 and SG=1
      IF(J.EQ.1) SS=SS+AVF*DT/2
      IF(J.GT.1 .AND. J.LT.JDT+1) SS=SS+AVF*DT
      IF(J.EQ.JDT+1) SS=SS+AVF*DT/2
      CALL DIFFER(NN,J,JDT+1,SG,DT,AVF,SS,D1,D2)
      WRITE(6,2) J,SG,SS,D2
    2 FORMAT(1H ,'J=',I3,' SG=',F5.3,' SS=',F15.7,' D2=',D13.4)
   50 CONTINUE
      STOP
      END

      DOUBLE PRECISION FUNCTION RANDOM(IX,IY,IZ)
C     Wichmann and Hill: Appl. Statist. (JRSSC) 31, 188-190 (1982).
      IMPLICIT REAL*8 (A-H,O-Z)
      IX=171*MOD(IX,177)-2*(IX/177)
      IY=172*MOD(IY,176)-35*(IY/176)
      IZ=170*MOD(IZ,178)-63*(IZ/178)
      IF (IX.LT.0) IX=IX+30269
      IF (IY.LT.0) IY=IY+30307
      IF (IZ.LT.0) IZ=IZ+30323
      RANDOM=MOD(FLOAT(IX)/30269.0+FLOAT(IY)/30307.0+
     &           FLOAT(IZ)/30323.0,1.0)
      RETURN
      END

      DOUBLE PRECISION FUNCTION POTEN(NN,N,XTEST,SG,X,POT)
C     potential U_SG of the configuration with XTEST at site N:
C     full recomputation at N=1, incremental update otherwise.
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION X(2000)
      IF(N.NE.1) GO TO 20
      POTEN=-SG*XTEST
      DO 10 K=2,NN
   10 POTEN=POTEN-SG*X(K)
      GO TO 30
   20 CONTINUE
      POTEN=POT+SG*X(N)-SG*XTEST
   30 CONTINUE
      RETURN
      END

      DOUBLE PRECISION FUNCTION PSI(NN,N,XTEST,SG,X,FN)
C     psi statistic (21) of the configuration with XTEST at site N:
C     full recomputation at N=1, incremental update otherwise.
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION X(2000)
      IF(N.NE.1) GO TO 20
      PSI=XTEST
      DO 10 K=2,NN
   10 PSI=PSI+X(K)
      GO TO 30
   20 CONTINUE
      PSI=FN-X(N)+XTEST
   30 CONTINUE
      RETURN
      END

      SUBROUTINE DIFFER(NN,J,JDT1,SG,DT,F1,F2,D1,D2)
C     differences from the theoretical values of Example 1:
C     D1 = F1 - psi(SG),  D2 = F2 - log Z_NN(SG).
      IMPLICIT REAL*8 (A-H,O-Z)
      IF(J.EQ.1) FT1=NN/2.D0
      IF(J.NE.1) FT1=NN*DEXP(SG)/(DEXP(SG)-1.D0)-NN/SG
      D1=F1-FT1
      IF(J.EQ.1) FT2=0.D0
      IF(J.NE.1) FT2=NN*DLOG((DEXP(SG)-1.D0)/SG)
      D2=F2-FT2
      RETURN
      END
(Sample output of the test program. For each node SG_J = 0.000, 0.050, ..., 1.000 the program prints, at IT = 0, 100 and 200, the current psi-value estimate, its difference from the theoretical value, the sample standard deviation of the psi statistic, and the counts ID1, ID2, ID3 of downhill acceptances, uphill acceptances and rejections; after each node it prints the partial trapezoidal sum SS and its difference D2 from the theoretical log Z_100(SG_J). The listed c.p.u. time of the run was 623.938 s and the real time 655.335 s.)
References
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087-1092 (1953)
Ogata, Y.: A Monte Carlo method for the objective Bayesian procedure. Research Memorandum No. 347, The Institute of Statistical Mathematics, Tokyo 1988
Ogata, Y., Tanemura, M.: Approximation of likelihood function in estimating the interaction potentials from spatial point patterns. Research Memorandum No. 216, The Institute of Statistical Mathematics, Tokyo 1981
Ogata, Y., Tanemura, M.: Likelihood analysis of spatial point patterns. Research Memorandum No. 241, The Institute of Statistical Mathematics, Tokyo 1984a
Ogata, Y., Tanemura, M.: Likelihood analysis of spatial point patterns. J. R. Statist. Soc. B 46, 496-518 (1984b)
Ogata, Y., Tanemura, M.: Likelihood estimation of soft-core interaction potentials for Gibbsian point patterns. Ann. Inst. Statist. Math. (1988)
Hammersley, J.M., Handscomb, D.C.: Monte Carlo Methods. London: Methuen (1964)
Ripley, B.D.: Simulating spatial patterns: dependent samples from a multivariate density. Appl. Statist. 28, 109-112 (1979)
Wood, W.W.: Monte Carlo studies of simple liquid models. In: Temperley, H.N.V., Rowlinson, J.S., Rushbrooke, G.S. (eds.) Physics of Simple Liquids, Chap. 5, pp. 115-230. Amsterdam: North-Holland (1968)
Received March 28, 1988/October 7, 1988