Markov chain and MH

Markov Chains 1 Markov Chains (1) A Markov chain is a mathematical model for stochastic systems whose states, discrete or continuous, are governed by transition probability.  Suppose the random variable X 0 , X1 , take state space (Ω) that is a countable set of value. A Markov chain is a process that corresponds to the network.  X0 X1 ... X t 1 Xt X t 1 ... 2 Markov Chains (2)  The current state in Markov chain only depends on the most recent previous states. P  X t 1  j | X t  i, X t 1  it 1 , , X 0  i0   P  X t 1  j | X t  i   Transition probability where i0 , , i, j    http://en.wikipedia.org/wiki/Markov_chain http://civs.stat.ucla.edu/MCMC/MCMC_tuto rial/Lect1_MCMC_Intro.pdf 3 An Example of Markov Chains    1, 2,3, 4,5  X   X 0 , X1, , Xt ,  where X 0 is initial state and so on. P is transition matrix. 1 1 0.4 2  0.5 P  3 0.0  4 0.0 5 0.0 2 3 4 5 0.6 0.0 0.0 0.0  0.0 0.5 0.0 0.0  0.3 0.0 0.7 0.0   0.0 0.1 0.3 0.6  0.3 0.0 0.5 0.2  4 Definition (1)  Define the probability of going from state i to state j in n time steps as ( n) pij  P  X t n  j | X t  i   A state j is accessible from state i if there ( n) p are n time steps such that ij  0 , where n  0,1,  A state i is said to communicate with state j (denote: i  j ), if it is true that both i is accessible from j and that j is accessible from i. 5 Definition (2) A state i has period d  i  if any return to state i must occur in multiples of d  i  time steps.  Formally, the period of a state is defined as    d  i   gcd n : Pii( n )  0  If d i   1, then the state is said to be aperiodic; otherwise (d i   1), the state is said to be periodic with period d  i  . 6 Definition (3) A set of states C is a communicating class if every pair of states in C communicates with each other.  Every state in a communicating class must have the same period  Example:  7 Definition (4) A finite Markov chain is said to be irreducible if its state space (Ω) is a communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state.  Example:  8 Definition (5)  A finite state irreducible Markov chain is said to be ergodic if its states are aperiodic  Example: 9 Definition (6) A state i is said to be transient if, given that we start in state i, there is a non-zero probability that we will never return back to i.  Formally, let the random variable Ti be the next return time to state i (the “hitting time”): Ti  min n : X n  i | X 0  i  Then, state i is transient iff there exists a finite Ti such that: P Ti    1  10 Definition (7) A state i is said to be recurrent or persistent iff there exists a finite Ti such that: P Ti    1.  The mean recurrent time i  E Ti .  State i is positive recurrent if i is finite; otherwise, state i is null recurrent.  A state i is said to be ergodic if it is aperiodic and positive recurrent. If all states in a Markov chain are ergodic, then the chain is said to be ergodic.  11 Stationary Distributions  Theorem: If a Markov Chain is irreducible and aperiodic, then (n) ij P   1 j as n  , i, j  Theorem: If a Markov chain is irreducible P Xn  j and aperiodic, then !  j  lim n  and  j   i Pij ,  j  1, j  i i where  is stationary distribution. 12 Definition (8)  A Markov chain is said to be reversible, if there is a stationary distribution  such that  i Pij   j Pji i, j   Theorem: if a Markov chain is reversible, then  j    i Pij i 13 An Example of Stationary Distributions  A Markov chain: 0.4 0.3 0.3 0.7 1  2 0.3 0.3 0.7 3  0.7 0.3 0.0  P   0.3 0.4 0.3   0.0 0.3 0.7  1 The stationary distribution is    3 0.7 0.3 0.0  1 1 1   1 1 1    3 3 3   0.3 0.4 0.3    3 3 3   0.0 0.3 0.7  1 3 1 3  14 Properties of Stationary Distributions  Regardless of the starting point, the process of irreducible and aperiodic Markov chains will converge to a stationary distribution.  The rate of converge depends on properties of the transition probability. 15 Monte Carlo Markov Chains 16 Monte Carlo Markov Chains MCMC method are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution.  The state of the chain after a large number of steps is then used as a sample from the desired distribution.  http://en.wikipedia.org/wiki/MCMC  17 Metropolis-Hastings Algorithm 18 Metropolis-Hastings Algorithm (1) The Metropolis-Hastings algorithm can draw samples from any probability distribution , requiring only that a function proportional to the density can be Π  x calculated at .  Process in three x steps:      Set up a Markov chain; Run the chain until stationary; Estimate with Monte Carlo methods. http://en.wikipedia.org/wiki/MetropolisHastings_algorithm 19 Metropolis-Hastings Algorithm (2) Let   1, ,  n  be a probability density (or mass) function (pdf or pmf).  f  is any function and we want to estimate  n I  E  f    f  i   i i 1  Construct P   Pij  the transition matrix of an irreducible Markov chain with states 1, 2, , n, where Pij  Pr  Xt 1  j | X t  i , X t  1,2, , n and Π is its unique stationary distribution. 20 Metropolis-Hastings Algorithm (3)  Run this Markov chain for times t  1, and calculate the Monte Carlo sum ,N N 1 Iˆ   f  Xt N t 1 then Iˆ  I as N   Sheldon M. Ross(1997). Proposition 4.3. Introduction to Probability Model. 7th ed.  http://nlp.stanford.edu/local/talks/mcmc_2 004_07_01.ppt  21 Metropolis-Hastings Algorithm (4) In order to perform this method for a given distribution Π , we must construct a Markov chain transition matrix P with Π as its stationary distribution, i.e. ΠP = Π.  Consider the matrix P was made to satisfy the reversibility condition that for all i and j Πi Pij = Π j Pij .  The property ensures that   i Pij   j for all j  i and hence Π is a stationary distribution for P 22 Metropolis-Hastings Algorithm (5) Let a proposal Q  Qij  be irreducible where Qij  Pr  Xt 1  j | X t  i  , and range of Q is equal to range of Π .  But Π is not have to a stationary distribution of Q.  Process: Tweak Qij to yield Π .  States from Qij not π Tweak States from Pij π 23 Metropolis-Hastings Algorithm (6)  We assume that Pij has the form Pij  Qij    i, j   i  j  , Pii  1   Pij , i j where   i, j  is called accepted probability, i.e. given X t  i ,  X t 1  j with probability   i, j  take   X t 1  i with probability 1-   i, j  24 Metropolis-Hastings Algorithm (7)  For i  j,  i Pij   j Pji   iQij  (i, j)   jQji  ( j, i) * WLOG for some (i, j ) ,  iQij   j Qji.  In order to achieve equality (*), one can introduce a probability   i, j   1 on the lefthand side and set   j, i   1 on the righthand side.  25 Metropolis-Hastings Algorithm (8)  Then  i Qij    i, j    j Q ji    j , i    j Q ji  j Q ji    i, j    i Qij  These arguments imply that the accepted probability  (i, j ) must be   jQ ji    i, j   min 1,   iQ  ij   26 Metropolis-Hastings Algorithm (9)  M-H Algorithm: Step 1: Choose an irreducible Markov chain transition matrix Q with transition probability Qij . Step 2: Let t  0 and initialize X 0 from states in Q. Step 3 (Proposal Step): Given X t  i , sample Y  j form QiY . 27 Metropolis-Hastings Algorithm (10)  M-H Algorithm (cont.): Step 4 (Acceptance Step): Generate a random number U from Unif  0,1 If U   i, j  , set X t 1  Y  j else X t 1  X t  i Step 5: t  t  1 , repeat Step 3~5 until convergence. 28

Markov chain and MH

Related documents

Products

Support

Markov chain and MH

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib