Chapter 4  Bayesian Computation (Monte Carlo Methods)

Let $y = (y_1, y_2, \ldots, y_n)$ be the observed data and let

$$r = E[g(\theta) \mid y] = \int g(\theta)\, f(\theta \mid y)\, d\theta$$

be of interest. Sometimes it is very difficult to find an explicit form for the above integral. In such cases, a Monte Carlo method can be an alternative.

(I) Direct sampling

Direct sampling generates $\theta_1, \theta_2, \ldots, \theta_N$ from $f(\theta \mid y)$ and then uses

$$\hat r = \frac{1}{N} \sum_{i=1}^{N} g(\theta_i)$$

to estimate $r$. Note that $N$ is usually large. The variance of $\hat r$ is $\mathrm{Var}(\hat r) = \mathrm{Var}[g(\theta)]/N$. Thus, the standard error of $\hat r$ is

$$\mathrm{s.e.}(\hat r) = \sqrt{\frac{\sum_{i=1}^{N} \big( g(\theta_i) - \hat r \big)^2}{N(N-1)}}.$$

An approximate 95% confidence interval for $r$ is $\hat r \pm 2\,\mathrm{s.e.}(\hat r)$.

Further, the estimate of $p = P\big(a \le g(\theta) \le b \mid y\big)$ is

$$\hat p = \frac{\#\{\, g(\theta_i) \in [a, b] \,\}}{N},$$

and the standard error of $\hat p$ is

$$\mathrm{s.e.}(\hat p) = \sqrt{\frac{\hat p (1 - \hat p)}{N}}.$$

(II) Indirect sampling

When $\theta_1, \theta_2, \ldots, \theta_N$ cannot be generated from the posterior directly, the following sampling methods can be used:
(a) importance sampling
(b) rejection sampling
(c) the weighted bootstrap

(a) Importance sampling

Since

$$r = E[g(\theta) \mid y] = \int g(\theta)\, f(\theta \mid y)\, d\theta
= \frac{\int g(\theta)\, f(y \mid \theta)\, \pi(\theta)\, d\theta}{\int f(y \mid \theta)\, \pi(\theta)\, d\theta}
= \frac{\int g(\theta)\, w(\theta)\, h(\theta)\, d\theta}{\int w(\theta)\, h(\theta)\, d\theta}
\approx \frac{\frac{1}{N}\sum_{i=1}^{N} g(\theta_i)\, w(\theta_i)}{\frac{1}{N}\sum_{i=1}^{N} w(\theta_i)}
= \frac{\sum_{i=1}^{N} g(\theta_i)\, w(\theta_i)}{\sum_{i=1}^{N} w(\theta_i)},$$

where

$$w(\theta) = \frac{f(y \mid \theta)\, \pi(\theta)}{h(\theta)}$$

and $h(\theta)$ is a density from which data can be generated easily, generally chosen to approximate the posterior density, i.e., $h(\theta) \approx c\, f(y \mid \theta)\, \pi(\theta)$ for some $c \in \mathbb{R}$. Here $h(\theta)$ is called the importance function.

Writing $\hat r = \bar X / \bar Y$ with $x_i = g(\theta_i) w(\theta_i)$ and $y_i = w(\theta_i)$, $\mathrm{Var}(\bar X / \bar Y)$ can be estimated (by the delta method) by

$$\widehat{\mathrm{Var}}(\hat r) = \frac{1}{N}\left[ \frac{s_X^2}{\bar y^{\,2}} - \frac{2\, \bar x\, s_{XY}}{\bar y^{\,3}} + \frac{\bar x^2\, s_Y^2}{\bar y^{\,4}} \right],$$

where

$$\bar x = \frac{1}{N}\sum_{i=1}^{N} x_i, \quad \bar y = \frac{1}{N}\sum_{i=1}^{N} y_i, \quad s_X^2 = \frac{\sum_{i=1}^{N} (x_i - \bar x)^2}{N-1}, \quad s_Y^2 = \frac{\sum_{i=1}^{N} (y_i - \bar y)^2}{N-1}, \quad s_{XY} = \frac{\sum_{i=1}^{N} (x_i - \bar x)(y_i - \bar y)}{N-1}.$$

The accuracy of importance sampling can thus be estimated by plugging in $x_i = g(\theta_i) w(\theta_i)$ and $y_i = w(\theta_i)$.

(b) Rejection sampling

Let $h(\theta)$ be a density from which data can be generated easily, generally chosen to approximate the posterior density. In addition, suppose there is a finite known constant $M > 0$ such that

$$f(y \mid \theta)\, \pi(\theta) \le M h(\theta)$$

for every $\theta$. The steps of rejection sampling are:
1. Generate $\theta_j$ from $h(\theta)$.
2. Generate $U_j$, independent of $\theta_j$, from $\mathrm{Uniform}(0, 1)$.
3. If
$$U_j \le \frac{f(y \mid \theta_j)\, \pi(\theta_j)}{M h(\theta_j)},$$
accept $\theta_j$; otherwise reject $\theta_j$.
4. Repeat steps 1~3 until the desired sample of accepted values $\theta_1, \theta_2, \ldots, \theta_N$ is obtained.

Note that $\theta_1, \theta_2, \ldots, \theta_N$ will then be data generated from the posterior density. Then

$$\hat r = \frac{1}{N}\sum_{i=1}^{N} g(\theta_i).$$

Note:

$$P(\theta_j \le \theta) = P\!\left(\theta_j \le \theta \;\Big|\; U \le \frac{f(y \mid \theta_j)\, \pi(\theta_j)}{M h(\theta_j)}\right)
= \frac{\int_{-\infty}^{\theta} \int_{0}^{f(y \mid t)\pi(t)/(M h(t))} h(t)\, du\, dt}{\int_{-\infty}^{\infty} \int_{0}^{f(y \mid t)\pi(t)/(M h(t))} h(t)\, du\, dt}
= \frac{\int_{-\infty}^{\theta} f(y \mid t)\, \pi(t)\, dt}{\int_{-\infty}^{\infty} f(y \mid t)\, \pi(t)\, dt}.$$

Differentiation with respect to $\theta$ yields

$$\frac{f(y \mid \theta)\, \pi(\theta)}{\int f(y \mid t)\, \pi(t)\, dt},$$

the posterior density function evaluated at $\theta$.

(c) Weighted bootstrap

It is very similar to the importance sampling method. The steps are as follows:
1. Generate $\theta_1, \theta_2, \ldots, \theta_N$ from $h(\theta)$.
2. Draw $\theta_i^*$ from the discrete distribution over $\{\theta_1, \theta_2, \ldots, \theta_N\}$ which puts mass

$$q_i = \frac{w_i}{\sum_{j=1}^{N} w_j}, \qquad w_i = \frac{f(y \mid \theta_i)\, \pi(\theta_i)}{h(\theta_i)},$$

at $\theta_i$. Then

$$\hat r = \frac{1}{N}\sum_{i=1}^{N} g(\theta_i^*).$$

A sketch of methods (a) and (c) on a toy model is given below.
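To make (a) and (c) concrete, here is a minimal Python sketch under an assumed toy model that is not part of these notes: $y_i \sim N(\theta, 1)$ with a Cauchy$(0,1)$ prior on $\theta$, so the posterior has no closed form. The importance function $h$ is taken to be the normal approximation $N(\bar y, 1/n)$; the model, the choice of $h$, and the sample size $N$ are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model (for illustration only): y_i ~ N(theta, 1),
# prior theta ~ Cauchy(0, 1), so the posterior is not closed-form.
y = rng.normal(1.5, 1.0, size=20)
n, ybar = len(y), y.mean()

def log_lik_prior(theta):
    # log of f(y|theta) * pi(theta), up to an additive constant
    return -0.5 * n * (theta - ybar) ** 2 - np.log1p(theta ** 2)

# (a) Importance sampling with h = N(ybar, 1/n), chosen to
# approximate the posterior as the notes suggest.
N = 10_000
theta = rng.normal(ybar, 1 / np.sqrt(n), size=N)
log_h = -0.5 * n * (theta - ybar) ** 2          # log h, up to a constant

# Unnormalized weights w_i = f(y|theta_i) pi(theta_i) / h(theta_i);
# additive constants cancel in the ratio estimator.
log_w = log_lik_prior(theta) - log_h
w = np.exp(log_w - log_w.max())                  # stabilize before exp

g = theta                                        # estimate r = E[theta | y]
x = g * w
r_hat = x.sum() / w.sum()

# Delta-method standard error with x_i = g(theta_i) w_i, y_i = w_i
xbar, wbar = x.mean(), w.mean()
sx2, sw2 = x.var(ddof=1), w.var(ddof=1)
sxw = np.cov(x, w)[0, 1]                         # (N-1)-normalized by default
var_hat = (sx2 / wbar**2 - 2 * xbar * sxw / wbar**3
           + xbar**2 * sw2 / wbar**4) / N
print(f"importance sampling: r_hat = {r_hat:.4f}, s.e. = {np.sqrt(var_hat):.4f}")

# (c) Weighted bootstrap: resample {theta_i} with mass q_i = w_i / sum(w)
q = w / w.sum()
theta_star = rng.choice(theta, size=N, replace=True, p=q)
print(f"weighted bootstrap:  r_hat = {theta_star.mean():.4f}")
```

Computing the weights on the log scale before exponentiating avoids numerical overflow; since both $f(y \mid \theta)\pi(\theta)$ and $h(\theta)$ enter only through the ratio estimator, their normalizing constants cancel, exactly as the up-to-proportionality argument above requires.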
(III) Markov chain Monte Carlo methods

There are several Markov chain Monte Carlo methods. One of the most commonly used is the Metropolis-Hastings algorithm. Let $\theta$ be generated from $h(\theta) \propto f(y \mid \theta)\, \pi(\theta)$, which is needed only up to a proportionality constant. Given an auxiliary function $q(\theta, \phi)$ such that $q(\theta, \cdot)$ is a probability density function and $q(\theta, \phi) = q(\phi, \theta)$, the Metropolis algorithm is as follows:
1. Draw $\phi$ from the p.d.f. $q(\theta_j, \cdot)$, where $\theta_j$ is the current state of the Markov chain.
2. Compute the odds ratio
$$\alpha = \frac{h(\phi)}{h(\theta_j)}.$$
3. If $\alpha \ge 1$, then $\theta_{j+1} = \phi$. If $\alpha < 1$, then
$$\theta_{j+1} = \begin{cases} \phi & \text{with probability } \alpha, \\ \theta_j & \text{with probability } 1 - \alpha. \end{cases}$$
4. Repeat steps 1~3 until the desired sample $\theta_1, \theta_2, \ldots, \theta_N$ is obtained.

Note that, after discarding a burn-in of $M$ iterations, $\theta_{M+1}, \ldots, \theta_N$ may be treated as data generated from the posterior density. Then

$$\hat r = \frac{\sum_{i=M+1}^{N} g(\theta_i)}{N - M}.$$

Note: For the Metropolis algorithm, under mild conditions, $\theta_j$ converges in distribution to the posterior distribution as $j \to \infty$.

Note: $q(\theta, \phi)$ is called the candidate or proposal density. The most commonly used $q(\theta, \phi)$ is the multivariate normal distribution.

Note: Hastings (1970) redefined the odds ratio as

$$\alpha = \frac{h(\phi)\, q(\phi, \theta)}{h(\theta)\, q(\theta, \phi)},$$

where $q(\theta, \phi)$ is not necessarily symmetric.

A sketch of the Metropolis step on the same toy model is given below.
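Here is a minimal Python sketch of the Metropolis step on the same assumed toy model used above. The random-walk proposal $N(\theta_j, \tau^2)$ is symmetric, so the Hastings correction cancels and the odds ratio reduces to $h(\phi)/h(\theta_j)$; the step size $\tau$, chain length $N$, and burn-in $M$ are illustrative tuning choices, not prescribed by the notes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same assumed toy model: y_i ~ N(theta, 1), Cauchy(0, 1) prior;
# h(theta) = f(y|theta) pi(theta) is known only up to a constant.
y = rng.normal(1.5, 1.0, size=20)
n, ybar = len(y), y.mean()

def log_h(theta):
    return -0.5 * n * (theta - ybar) ** 2 - np.log1p(theta ** 2)

N, M = 20_000, 2_000      # chain length and burn-in (illustrative)
tau = 0.5                 # proposal standard deviation (tuning choice)
chain = np.empty(N)
theta = ybar              # initial state
accepts = 0

for j in range(N):
    # Step 1: draw phi from the symmetric proposal q(theta_j, .) = N(theta_j, tau^2)
    phi = rng.normal(theta, tau)
    # Step 2: odds ratio alpha = h(phi) / h(theta_j), on the log scale
    log_alpha = log_h(phi) - log_h(theta)
    # Step 3: accept phi with probability min(alpha, 1)
    if np.log(rng.uniform()) < log_alpha:
        theta = phi
        accepts += 1
    chain[j] = theta

# Discard the burn-in and estimate r = E[theta | y]
r_hat = chain[M:].mean()
print(f"Metropolis: r_hat = {r_hat:.4f}, acceptance rate = {accepts / N:.2%}")
```

Testing $\log U < \log \alpha$ with $U \sim \mathrm{Uniform}(0,1)$ is equivalent to steps 2~3: when $\alpha \ge 1$ the proposal is accepted automatically (since $U \le 1$), and otherwise it is accepted with probability $\alpha$. Working on the log scale keeps $h$ up to proportionality and avoids underflow.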