Monte Carlo simulation - University of South Carolina

advertisement

Random Numbers and

Simulation

Generating truly random numbers is not possible

• Programs have been developed to generate pseudo-random numbers

• Values are generated from deterministic algorithms

© Fall 2011 John Grego and the University of South Carolina

1

Random Numbers

Pseudo-random deviates can pass any statistical test for randomness

They appear to be independent and identically distributed

Random number generators for common distributions are available in R

Special techniques (STAT 740) may be needed as well

2

Monte Carlo Simulation

Some common uses of simulation

• Modeling stochastic behavior

• Calculating definite integrals

• Approximating the sampling distribution of a statistics (e.g., maximum of a random sample)

3



Modeling Stochastic Behavior

Buffon’s needle

Random Walk

Observe X

1

, X p=P(X

S

1

,S i

=1)=P(X

2 i

, …, where

2

,…, where

=-1)=.5 and study

S i i

X j  1 j

4

Modeling Stochastic Behavior

This is also called

X i

Gambler’s ruin ; each represents a $1 bet with a return of

$2 for a win and $0 for a loss.

5

Gambler’s Ruin

The properties of a of an unfair game ( fair p≠.5

) game ( p=.5

) are a lot more interesting than the properties

Some properties of this process are easy to anticipate ( E(S) )

6

Gambler’s Ruin

Some properties are difficult to anticipate, and can be aided by simulation.

• Expected number of returns to 0

• Expected length of a winning streak

• Probability of going broke given an initial bank

7

Calculating Definite Integrals

In statistics, we often have to calculate difficult definite integrals (posterior distributions, expected values)

I 

 a b h ( x ) dx

(here, x could be multidimensional)



8



Calculating Definite Integrals

Example 1

I

1

0

1

4

1  x 2 dx

Example 2

 I

2

0

1 

0

1

(4  x

1

2  2 x

2

2 ) dx

2 dx

1

9



Hit-or-Miss Monte Carlo

Example 1 h ( x ) 

4

1  x 2



0

1

4

1  x 2 dx  4(arctan(1)  arctan(0))  4  /4  



Determine c such that c≥h(x) entire region of interest (here, across c=4 )

10

Hit-or-Miss Monte Carlo

Generate

X i

’s from from n random uniform (X

U[a,b] (here, U[0,1] ) and Y

U[0,c] (here, U[0,4] ) i

,Y i

) pairs, i

’s

Count the number of times (call this that Y i is less than h(X i

) m )

Then I

1

≈c(b-a)m/n

• I.e., (height)(width)(proportion under curve)

11

Classical Monte Carlo

Integration

I 

 a b h ( x ) dx

Take n random uniform values, U over [a,b] and estimate I using

1

,…,U n



I  b  a n n

 i  1

 

This method seems straightforward, but is actually more efficient than Hit-or-Miss

12

Expected Value of a Function of a Random Variable

Suppose X is a random variable with density f . Find E[h(x)] for some function h , e.g.,

2

E

E

  sin

  

13



Expected Value of a Function of a Random Variable

 

 h x dx

For n the distribution of

X random values X

1

, X

2

, …, X n from

X (i.e., with density f ),



E

 h

1 n i n 

1 h

  i

14

Examples

Example 3: If a

X is a random variable with

N(10,1) distribution, find E(X 2 )

Example 4: If a

Y is a random variable with

Beta(5,1) distribution, E(-lnY)

There are more advanced methods of integration using simulation (Importance

Sampling)

15

Integration

 integrate() performs numerical integration for functions of a variable ( single not using simulation techniques) adapt() in the adapt package performs multivariate numerical integration

16



Approximating the Sampling

Distribution of a Statistic

To perform inference (CI’s, hypothesis tests) based on sampling statistics, we need to know the sampling distribution of the statistics, at least up to an approximation

Example: X

1

, X

2

, …, X n

~ iid N( m , s 2 ).

T 

X  m s n

has a t ( n  1) distribution

17

Approximating the Sampling

Distribution of a Statistic

What if the data’s distribution is not known?

• Large sample: Central Limit Theorem

• Small sample: Normal theory or nonparametric procedures based on permutation distributions

18

Approximating the Sampling

Distribution of a Statistic

If the population distribution is known, we can approximate the sampling distribution with simulation.

• Repeatedly ( samples of size distribution m times) generate random n from the population

• Calculate a statistic (say, S ) each time

• The empirical (observed) distribution of Svalues approximates the true distribution of S

19



Example

X

1

, X

2

, X

3

, X

4

~Expon(1)

What is the sampling distribution of:

X (the mean) max( X )  min( X )

(the midrange)

2

20

Download