Chapter 4 Bayesian Computation (Monte Carlo Methods)
Let $y = (y_1, y_2, \ldots, y_n)$ be the observed data and let
$$r = E[g(\theta) \mid y] = \int g(\theta)\, f(\theta \mid y)\, d\theta$$
be of interest. Sometimes it might be very difficult to find the explicit form of the above integral. In such cases, the Monte Carlo method might be an alternative choice.
(I) Direct sampling
Direct sampling generates $\theta_1, \theta_2, \ldots, \theta_N$ from $f(\theta \mid y)$ and then uses
$$\hat r = \frac{\sum_{i=1}^{N} g(\theta_i)}{N}$$
to estimate $r$. Note that $N$ is usually large. The variance of $\hat r$ is
$$\operatorname{Var}(\hat r) = \frac{\operatorname{Var}(g(\theta))}{N}.$$
Thus, the standard error of $\hat r$ is
$$s.e.(\hat r) = \sqrt{\frac{\sum_{i=1}^{N} \left( g(\theta_i) - \hat r \right)^2}{N(N-1)}}.$$
An approximate 95% confidence interval for $r$ is $\hat r \pm 2\, s.e.(\hat r)$.
Further, the estimate of $p = P(a \le g(\theta) \le b \mid y)$ is
$$\hat p = \frac{\text{number of } g(\theta_i) \in [a, b]}{N}.$$
The standard error of $\hat p$ is
$$s.e.(\hat p) = \sqrt{\frac{\hat p\,(1 - \hat p)}{N}}.$$
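To make the direct-sampling recipe concrete, here is a minimal Python sketch. The posterior $N(1.0, 0.5^2)$, the choice $g(\theta) = \theta^2$, and the interval $[a, b] = [0.5, 2.0]$ are all assumptions for illustration, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (an assumption): posterior f(theta | y) taken to be
# N(1.0, 0.5^2), e.g. from a conjugate normal model, with g(theta) = theta^2.
N = 100_000
theta = rng.normal(loc=1.0, scale=0.5, size=N)  # theta_1, ..., theta_N ~ f(theta | y)
g = theta**2

r_hat = g.mean()                                      # (1/N) * sum g(theta_i)
se = np.sqrt(np.sum((g - r_hat)**2) / (N * (N - 1)))  # s.e.(r_hat)
ci = (r_hat - 2 * se, r_hat + 2 * se)                 # approximate 95% CI

# Estimate p = P(a <= g(theta) <= b | y) and its standard error.
a, b = 0.5, 2.0
p_hat = np.mean((g >= a) & (g <= b))
se_p = np.sqrt(p_hat * (1 - p_hat) / N)

print(r_hat, ci)     # true value of r here is 1.0^2 + 0.5^2 = 1.25
print(p_hat, se_p)
```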
(II) Indirect sampling
As $\theta_1, \theta_2, \ldots, \theta_N$ cannot be generated from the posterior directly, the following sampling methods can be used:
(a) importance sampling
(b) rejection sampling
(c) the weighted bootstrap
(a) Importance sampling
Since
$$r = E[g(\theta) \mid y] = \int g(\theta)\, f(\theta \mid y)\, d\theta
= \frac{\int g(\theta)\, \pi(\theta) f(y \mid \theta)\, d\theta}{\int \pi(\theta) f(y \mid \theta)\, d\theta}
= \frac{\int g(\theta)\, w(\theta)\, h(\theta)\, d\theta}{\int w(\theta)\, h(\theta)\, d\theta}
\approx \frac{\frac{1}{N}\sum_{i=1}^{N} g(\theta_i)\, w(\theta_i)}{\frac{1}{N}\sum_{i=1}^{N} w(\theta_i)}
= \frac{\sum_{i=1}^{N} g(\theta_i)\, w(\theta_i)}{\sum_{i=1}^{N} w(\theta_i)},$$
where
$$w(\theta) = \frac{\pi(\theta)\, f(y \mid \theta)}{h(\theta)},$$
$\theta_1, \theta_2, \ldots, \theta_N$ are generated from $h(\theta)$, and $h(\theta)$ is a density from which data can be generated easily, generally chosen to approximate the posterior density, i.e.,
$$h(\theta) \approx c\, \pi(\theta)\, f(y \mid \theta), \quad c \in \mathbb{R}.$$
$h(\theta)$ is called the importance function. The variance of the ratio estimator $\hat r = \bar x / \bar y$ can be estimated by
$$\widehat{\operatorname{Var}}\!\left(\frac{\bar x}{\bar y}\right) = \frac{1}{N}\left( \frac{s_X^2}{\bar y^{\,2}} + \frac{\bar x^{\,2}\, s_Y^2}{\bar y^{\,4}} - \frac{2\, \bar x\, s_{XY}}{\bar y^{\,3}} \right),$$
where
$$\bar x = \frac{\sum_{i=1}^{N} x_i}{N}, \qquad \bar y = \frac{\sum_{i=1}^{N} y_i}{N},$$
$$s_X^2 = \frac{\sum_{i=1}^{N} (x_i - \bar x)^2}{N - 1}, \qquad s_Y^2 = \frac{\sum_{i=1}^{N} (y_i - \bar y)^2}{N - 1}, \qquad s_{XY} = \frac{\sum_{i=1}^{N} (x_i - \bar x)(y_i - \bar y)}{N - 1}.$$
The accuracy of importance sampling can then be estimated by plugging in $x_i = g(\theta_i)\, w(\theta_i)$ and $y_i = w(\theta_i)$, as in the sketch below.
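A minimal Python sketch of importance sampling under an assumed conjugate-normal model (prior $\pi(\theta) = N(0, 1)$, likelihood $y_i \mid \theta \sim N(\theta, 1)$, $g(\theta) = \theta$); the Student-$t$ importance function and the data vector `y` are likewise illustrative choices, not from the notes.

```python
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(0)

# Assumed model: prior N(0, 1), y_i | theta ~ N(theta, 1), g(theta) = theta.
# Importance function h: a heavier-tailed t_5 density centred at mean(y).
y = np.array([0.8, 1.1, 0.4, 1.6])

def log_target(theta):
    # log[pi(theta) f(y | theta)], needed only up to an additive constant
    return -0.5 * theta**2 - 0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)

N = 50_000
theta = y.mean() + rng.standard_t(df=5, size=N)        # theta_i ~ h
log_w = log_target(theta) - student_t.logpdf(theta - y.mean(), df=5)
w = np.exp(log_w - log_w.max())   # rescaling is harmless: r_hat is a ratio

xi, yi = theta * w, w             # x_i = g(theta_i) w(theta_i), y_i = w(theta_i)
r_hat = xi.sum() / yi.sum()       # sum g(theta_i) w(theta_i) / sum w(theta_i)

# Delta-method variance estimate from the formula above
xbar, ybar = xi.mean(), yi.mean()
sx2, sy2 = xi.var(ddof=1), yi.var(ddof=1)
sxy = np.cov(xi, yi)[0, 1]
var_hat = (sx2 / ybar**2 + xbar**2 * sy2 / ybar**4 - 2 * xbar * sxy / ybar**3) / N

print(r_hat, np.sqrt(var_hat))    # posterior mean here is sum(y)/(n+1) = 0.78
```

The Student-$t$ proposal is a common design choice: its tails are heavier than the posterior's, which keeps the weights $w(\theta_i)$ from degenerating.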
(b) Rejection sampling
Let $h(\theta)$ be a density from which data can be generated easily, generally chosen to approximate the posterior density. In addition, suppose there is a finite known constant $M > 0$ such that
$$\pi(\theta)\, f(y \mid \theta) \le M h(\theta)$$
for every $\theta$. The steps for rejection sampling are:
1. Generate $\theta_j$ from $h(\theta)$.
2. Generate $u_j$, independent of $\theta_j$, from $U \sim \mathrm{Uniform}(0, 1)$.
3. If
$$u_j \le \frac{\pi(\theta_j)\, f(y \mid \theta_j)}{M h(\theta_j)},$$
accept $\theta_j$; otherwise reject $\theta_j$.
4. Repeat steps 1~3 until the desired sample (the accepted $\theta_j$'s) $\theta_1, \theta_2, \ldots, \theta_N$ is obtained. Note that $\theta_1, \theta_2, \ldots, \theta_N$ will be data generated from the posterior density. Then,
$$\hat r = \frac{\sum_{i=1}^{N} g(\theta_i)}{N}.$$
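A Python sketch of the four steps above, under the same assumed conjugate-normal model as before; the grid-based bound for $M$ (with a 10% safety margin) and the batch size are illustrative choices, not part of the notes.

```python
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(0)

# Assumed model as in the importance-sampling sketch above.
y = np.array([0.8, 1.1, 0.4, 1.6])

def target(theta):
    # pi(theta) f(y | theta), unnormalised
    return np.exp(-0.5 * theta**2 - 0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1))

def h(theta):
    return student_t.pdf(theta - y.mean(), df=5)

grid = np.linspace(-10.0, 10.0, 10_001)
M = 1.1 * np.max(target(grid) / h(grid))   # pi(theta) f(y|theta) <= M h(theta)

accepted = []
while len(accepted) < 10_000:
    theta = y.mean() + rng.standard_t(df=5, size=1_000)  # step 1: theta_j ~ h
    u = rng.uniform(size=1_000)                          # step 2: u_j ~ U(0, 1)
    keep = u <= target(theta) / (M * h(theta))           # step 3: accept/reject
    accepted.extend(theta[keep])                         # step 4: repeat
samples = np.array(accepted[:10_000])

r_hat = samples.mean()   # average of g(theta_i) = theta_i over accepted draws
```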
Note:
$$P(\theta \le \theta_j) = P\!\left(\theta \le \theta_j \,\middle|\, U \le \frac{\pi(\theta) f(y \mid \theta)}{M h(\theta)}\right)
= \frac{\displaystyle\int_{-\infty}^{\theta_j} \int_0^{\pi(\theta) f(y \mid \theta)/(M h(\theta))} h(\theta)\, du\, d\theta}{\displaystyle\int_{-\infty}^{\infty} \int_0^{\pi(\theta) f(y \mid \theta)/(M h(\theta))} h(\theta)\, du\, d\theta}
= \frac{\displaystyle\int_{-\infty}^{\theta_j} \pi(\theta)\, f(y \mid \theta)\, d\theta}{\displaystyle\int_{-\infty}^{\infty} \pi(\theta)\, f(y \mid \theta)\, d\theta},$$
since the inner integral equals $\frac{\pi(\theta) f(y \mid \theta)}{M h(\theta)} \cdot h(\theta)$ and the factor $1/M$ cancels. Differentiation with respect to $\theta_j$ yields
$$\frac{\pi(\theta_j)\, f(y \mid \theta_j)}{\int \pi(\theta)\, f(y \mid \theta)\, d\theta},$$
the posterior density function evaluated at $\theta_j$.
(c) Weighted bootstrap
It is very similar to the importance sampling method. The steps are as follows:
1. Generate $\tilde\theta_1, \tilde\theta_2, \ldots, \tilde\theta_N$ from $h(\theta)$.
2. Draw $\theta_i$ from the discrete distribution over $\{\tilde\theta_1, \tilde\theta_2, \ldots, \tilde\theta_N\}$ which puts mass
$$q_i = \frac{w_i}{\sum_{k=1}^{N} w_k}, \qquad w_i = \frac{\pi(\tilde\theta_i)\, f(y \mid \tilde\theta_i)}{h(\tilde\theta_i)},$$
at $\tilde\theta_i$. Then,
$$\hat r = \frac{\sum_{i=1}^{N} g(\theta_i)}{N}.$$
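A Python sketch of the weighted bootstrap (sampling/importance resampling) under the same assumed model as the previous sketches; `theta_tilde` denotes the draws $\tilde\theta_i$ from $h$.

```python
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(0)

# Assumed conjugate-normal setup, as in the earlier sketches.
y = np.array([0.8, 1.1, 0.4, 1.6])

def log_target(theta):
    # log[pi(theta) f(y | theta)]
    return -0.5 * theta**2 - 0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)

N = 50_000
theta_tilde = y.mean() + rng.standard_t(df=5, size=N)   # step 1: draws from h
log_w = log_target(theta_tilde) - student_t.logpdf(theta_tilde - y.mean(), df=5)
q = np.exp(log_w - log_w.max())
q /= q.sum()                                            # q_i = w_i / sum_k w_k

# step 2: resample with masses q_i; the resampled values are approximate
# posterior draws, so r is estimated by a plain average
theta = rng.choice(theta_tilde, size=N, replace=True, p=q)
r_hat = theta.mean()                                    # here g(theta) = theta
```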
(III) Markov chain Monte Carlo method
There are several Markov chain Monte Carlo methods. One of the most commonly used is the Metropolis-Hastings algorithm. Let $\theta$ be generated from $h(\theta) \propto \pi(\theta)\, f(y \mid \theta)$, which is needed only up to a proportionality constant. Given an auxiliary function $q(\theta, \theta')$ such that $q(\theta, \cdot)$ is a probability density function and $q(\theta, \theta') = q(\theta', \theta)$, the Metropolis algorithm is as follows:
1. Draw $\theta^*$ from the p.d.f. $q(\theta_j, \cdot)$, where $\theta_j$ is the current state of the Markov chain.
2. Compute the odds ratio
$$\alpha = \frac{h(\theta^*)}{h(\theta_j)}.$$
3. If $\alpha \ge 1$, then $\theta_{j+1} = \theta^*$. If $\alpha < 1$, then
$$\theta_{j+1} = \begin{cases} \theta^* & \text{with probability } \alpha, \\ \theta_j & \text{with probability } 1 - \alpha. \end{cases}$$
4. Repeat steps 1~3 until the desired sample $\theta_1, \theta_2, \ldots, \theta_N$ is obtained. Note that, after convergence, $\theta_1, \theta_2, \ldots, \theta_N$ will be data generated from the posterior density. Then, discarding the first $M$ draws as burn-in,
$$\hat r = \frac{\sum_{i=M+1}^{N} g(\theta_i)}{N - M}.$$
Note:
For the Metropolis algorithm, under mild conditions, $\theta_j$ converges in distribution to the posterior as $j \to \infty$.
Note:
q ,  is called the candidate or proposal density. The most
commonly used q ,  is the multivariate normal distribution.
Note:
Hastings (1970) redefined
$$\alpha = \frac{h(\theta^*)\, q(\theta^*, \theta_j)}{h(\theta_j)\, q(\theta_j, \theta^*)},$$
where $q(\theta, \theta')$ is not necessarily symmetric.