Machine Learning
CUNY Graduate Center
• Sampling
– Technique to approximate the expected value of a distribution
• Gibbs Sampling
– Sampling of latent variables in a graphical model
• Want to know the expected value of a distribution.
– e.g., E[p(t | x)] in a classification problem
• We can calculate p(x), but integration is difficult.
• Given a graphical model describing the relationship between variables, we'd like to estimate E[p(x)] where x is only partially observed.
• We have a representation of p(x) and f(x), but integration is intractable
• E[f] is difficult to compute as an integral, but easy to approximate as a sum.
• Randomly select points from the distribution p(x), evaluate f at them, and use the sample average as the estimate of E[f] (see the sketch below).
• It turns out that if correctly sampled, only 10-20 points can be sufficient to estimate the mean and variance of a distribution.
– Samples must be independently drawn
– Expectation may be dominated by regions of high probability, or high function values
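A minimal Python sketch of this Monte Carlo estimate under illustrative assumptions (the target p and the function f below are not from the slides): p(x) is a standard normal and f(x) = x^2, so the true E[f] is 1.0.

import numpy as np

# Assumed example: p(x) is a standard normal and f(x) = x**2,
# so the true expectation E[f] equals the variance, 1.0.
rng = np.random.default_rng(0)

def f(x):
    return x ** 2

samples = rng.normal(loc=0.0, scale=1.0, size=20)  # independent draws from p(x)
estimate = f(samples).mean()                        # Monte Carlo estimate of E[f]
print(estimate)                                     # a rough estimate of 1.0 from only 20 samples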
• Sampling techniques to solve difficult integration problems.
• What is the area of a circle with radius 1?
– What if you don’t know trigonometry?
• How can we approximate the area of a circle if we have no trigonometry?
• Take a random x and a random y between -1 and 1.
– Sample x and y uniformly from [-1, 1].
• Determine if x^2 + y^2 ≤ 1 (the point falls inside the unit circle).
• Repeat many times.
• Count the number of times that the inequality is true.
• That count divided by the number of samples approximates the ratio of the circle's area to the square's area, so multiplying it by the area of the square (4) estimates the circle's area, π (see the sketch below).
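A short Python sketch of this estimate (the sample count is an arbitrary choice): the fraction of points that land inside the unit circle, times the area of the enclosing square, approximates π.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000                         # number of (x, y) samples
x = rng.uniform(-1.0, 1.0, size=n)  # sample x uniformly in [-1, 1]
y = rng.uniform(-1.0, 1.0, size=n)  # sample y uniformly in [-1, 1]
inside = x ** 2 + y ** 2 <= 1.0     # does the point fall inside the unit circle?
area = inside.mean() * 4.0          # fraction inside times the square's area
print(area)                         # approximately pi = 3.14159...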
• The distribution p(x) is easy to evaluate
– As in a graphical model representation
• But difficult to integrate.
• Identify a simpler distribution, kq(x), which bounds p(x), and draw a sample x_0 from it.
– q(x) is called the proposal distribution.
• Generate another sample u from a uniform distribution between 0 and kq(x_0).
– If u ≤ p(x_0), accept the sample.
• E.g., use it in the calculation of an expectation of f.
– Otherwise reject the sample.
• E.g., omit it from the calculation of an expectation of f (see the sketch below).
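A minimal Python sketch of rejection sampling under illustrative assumptions: the unnormalized target p_tilde below is invented for the example, the proposal q is a zero-mean Gaussian, and k is chosen so that k·q(x) bounds p_tilde(x).

import numpy as np

rng = np.random.default_rng(0)

def p_tilde(x):
    # Assumed unnormalized target density: easy to evaluate, hard to integrate.
    return np.exp(-x ** 2 / 2) * (1 + np.sin(3 * x) ** 2)

def q_pdf(x, scale=1.5):
    # Proposal density q(x): a zero-mean Gaussian with standard deviation 1.5.
    return np.exp(-x ** 2 / (2 * scale ** 2)) / (scale * np.sqrt(2 * np.pi))

k = 8.0  # constant chosen so that k * q(x) >= p_tilde(x) everywhere

accepted = []
for _ in range(10_000):
    x0 = rng.normal(0.0, 1.5)            # sample x_0 from the proposal q
    u = rng.uniform(0.0, k * q_pdf(x0))  # sample u uniformly in [0, k q(x_0)]
    if u <= p_tilde(x0):                 # accept if u falls under p_tilde(x_0)
        accepted.append(x0)

samples = np.array(accepted)             # approximately distributed as p(x)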
• One problem with rejection sampling is that you lose information when throwing out samples.
• If we are only looking for the expected value of f(x), we can incorporate unlikely samples of x in the calculation.
• Again use a proposal distribution q to approximate the expected value.
– Weight each sample drawn from q by the ratio p(x)/q(x), i.e., how likely it would also have been under p (see the sketch below).
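A small Python sketch of importance sampling, reusing the illustrative p_tilde and q from the rejection-sampling example above: every proposal sample is kept, and its contribution to E[f] is weighted by p(x)/q(x) (self-normalized here, so an unnormalized p is fine).

import numpy as np

rng = np.random.default_rng(0)

def p_tilde(x):
    # Assumed unnormalized target density.
    return np.exp(-x ** 2 / 2) * (1 + np.sin(3 * x) ** 2)

def q_pdf(x, scale=1.5):
    # Proposal density q(x): a zero-mean Gaussian with standard deviation 1.5.
    return np.exp(-x ** 2 / (2 * scale ** 2)) / (scale * np.sqrt(2 * np.pi))

def f(x):
    return x ** 2  # assumed function whose expectation E[f] we want

x = rng.normal(0.0, 1.5, size=10_000)    # samples drawn from the proposal q
w = p_tilde(x) / q_pdf(x)                # importance weights p(x) / q(x)
estimate = np.sum(w * f(x)) / np.sum(w)  # self-normalized importance estimate
print(estimate)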
• Markov Chain:
– p(x_1 | x_2, x_3, x_4, x_5, …) = p(x_1 | x_2)
• For MCMC sampling, start in a state z^(0).
• At each step, draw a candidate sample z* based on the previous state z^(m).
• Accept this step with some probability based on a proposal distribution.
– If the step is accepted: z^(m+1) = z*
– Else: z^(m+1) = z^(m)
• Or only accept if the sample is consistent with an observed value
• Goal: p(z^(m)) = p*(z) as m → ∞
– MCMCs that have this property are ergodic.
– Implies that the sampled distribution converges to the true distribution
• Need to define a transition function to move from one state to the next.
– How do we draw a sample at state m+1 given state m?
– Often, z^(m+1) is drawn from a Gaussian with mean z^(m) and a constant variance.
• Goal: p(z^(m)) = p*(z) as m → ∞
– MCMCs that have this property are ergodic.
• Transition functions that satisfy detailed balance guarantee ergodic MCMC processes.
– Such chains are also called reversible.
• Assume the current state is z^(m).
• Draw a sample z* from q(z | z^(m)).
• Accept z* with probability A(z*, z^(m)) = min(1, p(z*) q(z^(m) | z*) / (p(z^(m)) q(z* | z^(m)))); otherwise keep z^(m).
• Often use a normal distribution for q.
– The proposal variance trades off convergence speed against acceptance rate (see the sketch below).
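A compact Python sketch of this recipe as a Metropolis random walk, reusing the illustrative unnormalized target p_tilde from above; because the Gaussian proposal is symmetric, the q terms cancel in the acceptance ratio.

import numpy as np

rng = np.random.default_rng(0)

def p_tilde(z):
    # Assumed unnormalized target density p*(z).
    return np.exp(-z ** 2 / 2) * (1 + np.sin(3 * z) ** 2)

def metropolis(n_steps=10_000, step_size=1.0, z0=0.0):
    z = z0
    chain = []
    for _ in range(n_steps):
        z_star = rng.normal(z, step_size)           # draw z* from q(z | z^(m))
        a = min(1.0, p_tilde(z_star) / p_tilde(z))  # acceptance probability
        if rng.uniform() < a:
            z = z_star                              # accepted: z^(m+1) = z*
        # otherwise z is unchanged: z^(m+1) = z^(m)
        chain.append(z)
    return np.array(chain)

samples = metropolis()
# A smaller step_size raises the acceptance rate but explores the space slowly;
# a larger step_size explores faster but rejects more proposals.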
• We’ve been treating z as a vector to be sampled as a whole
• However, in high dimensions, the accept probability becomes vanishingly small.
• Gibbs sampling allows us to sample one variable at a time, based on the other variables in z.
• Assume a distribution over 3 variables.
• Generate a new sample for each variable conditioned on all of the other variables.
• The appeal of Gibbs sampling in a graphical model is that the conditional distribution of a variable depends only on its Markov blanket (its parents, children, and co-parents), not on every other variable.
• Gibbs sampling fixes n-1 variables and generates a sample for the nth.
• If each variable's conditional distribution is easy to sample from, we can simply sample from the conditionals given by the graphical model, starting from some initial state (see the sketch below).
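A minimal Gibbs sampling sketch in Python for an assumed toy model, a bivariate standard Gaussian with correlation rho, where each full conditional is itself a Gaussian and therefore easy to sample:

import numpy as np

rng = np.random.default_rng(0)
rho = 0.8          # assumed correlation of the toy bivariate Gaussian
n_steps = 5_000

x1, x2 = 0.0, 0.0  # arbitrary initial state
samples = []
for _ in range(n_steps):
    # Sample x1 conditioned on the current value of x2: p(x1 | x2).
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho ** 2))
    # Sample x2 conditioned on the new value of x1: p(x2 | x1).
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho ** 2))
    samples.append((x1, x2))

samples = np.array(samples)  # approximately drawn from the joint p(x1, x2)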
• Perceptrons
• Neural Networks