Chapter 3
Elementary Simulation

3.1 Basic Problem
Simulation allows us to get approximate answers to all kinds of probability problems that we could not solve analytically.
The basic problem is: we have

X, Y, . . . , Z

k independent random variables. Assume we know how to simulate each of these variables. Let

g(x, y, . . . , z)

be some quite complicated function g of k variables, and let

V = g(X, Y, . . . , Z)

be the random variable of interest. We might be interested in some aspects of the density of V, e.g.

P(13 < V < 17) = ?
E[V] = ?
Var[V] = ?
Unless g is simple, k is small, and we are very lucky, we may not be able to solve these problems analytically.
Using simulation, we can do the following.
Steps of simulation:
1. Simulate some large number (say n) of values for each of the k variables X, Y, . . . , Z. We then have a set of n k-tuples of the form (Xi, Yi, . . . , Zi) for i = 1, . . . , n.
2. Plug each (Xi, Yi, . . . , Zi) into the function g and compute Vi = g(Xi, Yi, . . . , Zi) for i = 1, . . . , n.
3. Then approximate

(a) P(a ≤ V ≤ b) by

    #{Vi : Vi ∈ [a, b]} / n

(b) E[h(V)] by

    (1/n) Σ_{i=1}^n h(Vi),   i.e. E[V] by V̄ = (1/n) Σ_{i=1}^n Vi

(c) Var[V] by

    (1/n) Σ_{i=1}^n (Vi − V̄)²
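As a concrete illustration, here is a minimal R sketch of the three steps for a made-up function g(x, y, z) = x·y + z with X, Y, Z standard uniform (both g and the input distributions are assumptions chosen purely for illustration):

n <- 1000
X <- runif(n); Y <- runif(n); Z <- runif(n)   # step 1: simulate the k = 3 inputs
V <- X * Y + Z                                # step 2: Vi = g(Xi, Yi, Zi)
sum(V >= 0.5 & V <= 1)/n                      # step 3(a): estimate P(0.5 <= V <= 1)
mean(V)                                       # step 3(b): estimate E[V]
mean((V - mean(V))^2)                         # step 3(c): estimate Var[V]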
We want to be able to perform an experiment with a given set of probabilities. The starting point is:

3.2 Random Number Generators
Random number generators (rngs) produce a stream of numbers that look like realizations of independent standard uniform variables

U1, U2, U3, . . .

Usually these numbers are not completely random, but pseudo-random. This way we ensure repeatability of an experiment.
(Note: even the trick of linking the system's rand() function to the internal clock gives you only pseudo-random numbers, since the same time will give you exactly the same stream of random numbers.)
There are hundreds of methods that have been proposed for doing this; some (most?) are pretty bad. A good method, and in fact the current standard in most operating systems, is the linear congruential method.

Definition 3.2.1 (Linear Congruential Sequence)
For integers a, c, and m, a sequence of "random numbers" xi is defined by:

xi ≡ (a·xi−1 + c) mod m   for i = 1, 2, . . .
Note: this sequence still depends on the choice of x0, the so-called seed of the sequence. Choosing different seeds yields different sequences.
That way we get a sequence with elements in {0, 1, . . . , m − 1}. We define ui := xi/m.
The choice of the parameters a, c, and m is crucial!
Obviously, we want to get as many different numbers as possible; therefore m needs to be as large as possible and preferably prime (that way we get rid of small cycles).
Example 3.2.2 rng examples
The status quo in industry is the so-called minimal standard generator. It fulfills the common requirements of an rng and at the same time is very fast. Its parameters are:

c = 0,   a = 16807,   m = 2^31 − 1
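A minimal R sketch of a linear congruential generator with these parameters (the function name lcg is my own choice; a real application would use a vetted implementation):

# x_i = (a*x_{i-1} + c) mod m, scaled to [0, 1).
# With a = 16807 and m = 2^31 - 1, a*x stays well below 2^53, so
# double-precision arithmetic in R is exact here.
lcg <- function(n, seed, a = 16807, c = 0, m = 2^31 - 1) {
  x <- numeric(n)
  for (i in 1:n) {
    seed <- (a * seed + c) %% m
    x[i] <- seed
  }
  x / m   # the u_i
}
lcg(5, seed = 1)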
An example of a terrible random number generator is RANDU, with

c = 0,   a = 65539,   m = 2^31

It was widely used before people discovered how bad it actually is: knowing two successive random numbers gives you the possibility to predict the next number pretty well. That is not how rngs are supposed to work.
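In fact, for RANDU the prediction is exact: since 65539² ≡ 6·65539 − 9 (mod 2^31), every value satisfies xi+2 = 6·xi+1 − 9·xi (mod 2^31). A short R check (the seed is arbitrary):

# RANDU: x_i = 65539 * x_{i-1} mod 2^31 (values fit exactly in doubles)
randu <- function(n, seed) {
  x <- numeric(n)
  for (i in 1:n) { seed <- (65539 * seed) %% 2^31; x[i] <- seed }
  x
}
x <- randu(1000, seed = 1)
# the third number is always a fixed combination of the previous two:
all((9 * x[1:998] - 6 * x[2:999] + x[3:1000]) %% 2^31 == 0)   # TRUE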
For more information about random number generators and different techniques for producing and checking them, see
http://crypto.mat.sbg.ac.at/results/karl/server/
State of the art at the moment is the Marsaglia-Multicarry rng (a multiply-with-carry generator):

#define znew ((z=36969*(z&65535)+(z>>16))<<16)
#define wnew ((w=18000*(w&65535)+(w>>16))&65535)
#define IUNI (znew+wnew)
#define UNI (znew+wnew)*2.328306e-10
static unsigned long z=362436069, w=521288629;
void setseed(unsigned long i1,unsigned long i2){z=i1; w=i2;}
/*
Whenever you need random integers or random reals in your
C program, just insert those six lines at (near?) the beginning
of the program. In every expression where you want a random real
in [0,1) use UNI, or use IUNI for a random 32-bit integer.
No need to mess with ranf() or ranf(lastI), etc, with their
requisite overheads. Choices for replacing the two multipliers
36969 and 18000 are given below. Thus you can tailor your own
in-line multiply-with-carry random number generator.
This section is expressed as a C comment, in case you want to
keep it filed with your essential six lines:
*/
/* Use of IUNI in an expression will produce a 32-bit unsigned
random integer, while UNI will produce a random real in [0,1).
The static variables z and w can be reassigned to i1 and i2
by setseed(i1,i2);
You may replace the two constants 36969 and 18000 by any
pair of distinct constants from this list:
18000 18030 18273 18513 18879 19074 19098 19164 19215 19584
19599 19950 20088 20508 20544 20664 20814 20970 21153 21243
21423 21723 21954 22125 22188 22293 22860 22938 22965 22974
23109 23124 23163 23208 23508 23520 23553 23658 23865 24114
24219 24660 24699 24864 24948 25023 25308 25443 26004 26088
26154 26550 26679 26838 27183 27258 27753 27795 27810 27834
27960 28320 28380 28689 28710 28794 28854 28959 28980 29013
29379 29889 30135 30345 30459 30714 30903 30963 31059 31083
(or any other 16-bit constants k for which both k*2^16-1
and k*2^15-1 are prime)*/
Armed with a uniform rng, all kinds of other distributions can be generated.

3.2.1 A General Method for Discrete Data
Consider a discrete pmf with:

x:     x1 < x2 < . . . < xn
p(x):  p(x1)  p(x2)  . . .  p(xn)

The distribution function F then is:

F(t) = Σ_{i: xi ≤ t} p(xi)
Suppose we have a sequence of independent standard uniform random variables U1, U2, . . . with realizations u1, u2, . . . (realizations are real values in (0, 1)). Then we define the ith element in our new sequence to be xj if

Σ_{k=1}^{j−1} p(xk) ≤ ui ≤ Σ_{k=1}^{j} p(xk),

i.e. F(xj−1) ≤ ui ≤ F(xj). Then X has probability mass function p.
This is less complicated than it looks; have a look at Figure 3.1. Getting the right x-value for a specific u is done by drawing a horizontal line from the y-axis to the graph of F and following the graph down to the x-axis. This is how we invert a function graphically.
[Figure: step function F over x1, x2, . . . , xn, with levels u1 and u2 marked on the vertical axis and traced down to x-values.]
Figure 3.1: Getting the value corresponding to ui is done by drawing a straight line to the right until we hit the graph of F, and following the graph down to xj.
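In R, this inversion takes only a couple of lines. A sketch (the function name rdiscrete and its arguments xs and ps are made up for illustration):

# Inverse-cdf sampling from a pmf given by values xs and probabilities ps.
rdiscrete <- function(n, xs, ps) {
  u <- runif(n)                  # standard uniform realizations
  Fx <- cumsum(ps)               # F(x1), ..., F(xn)
  xs[findInterval(u, Fx) + 1]    # pick the smallest j with u <= F(xj)
}
rdiscrete(10, xs = c(0, 1, 5), ps = c(0.5, 0.3, 0.2))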
Example 3.2.2 Simulate the roll of a fair die
Let X be the number of spots on the upturned face. The probability mass function of X is p(i) = 1/6 for all i = 1, . . . , 6; the distribution function is FX(t) = ⌊t⌋/6 for all t ∈ (0, 6).
We therefore get X from a standard uniform variable U by

X = 1 if 0 ≤ U ≤ 1/6
    2 if 1/6 < U ≤ 2/6
    3 if 2/6 < U ≤ 3/6
    4 if 3/6 < U ≤ 4/6
    5 if 4/6 < U ≤ 5/6
    6 if 5/6 < U ≤ 6/6

A faster definition than the one above is X = ⌈6 · U⌉.
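A quick R check of the faster definition (n is an arbitrary illustrative choice):

n <- 10000
X <- ceiling(6 * runif(n))   # X = ceiling(6U), a fair die roll
table(X) / n                 # relative frequencies; each should be near 1/6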
3.2.2 A General Method for Continuous Densities

Consider a continuous density f with distribution function F. We know (cf. Figure 3.2) that F : (x0, ∞) → (0, 1) (x0 could be −∞) has an inverse function:

F⁻¹ : (0, 1) → (x0, ∞)
[Figure: graph of a continuous distribution function FX against x, rising from x0.]
Figure 3.2: Starting at some value x0, any continuous distribution function has an inverse. In this example, x0 = 1.
General Method:
For a given standard uniform variable U ∼ U(0, 1) we define

X := FX⁻¹(U)

Then X has distribution FX.
Why? For a proof of the above statement, we must compute the distribution function X has. Remember, the distribution function of X at value x is the probability that X is x or less:

P(X ≤ x) = P(FX(X) ≤ FX(x))   (trick: apply the increasing function FX to both sides)
         = P(U ≤ FX(x))        (by definition of X, FX(X) = U)
         = FX(x)               (U is a standard uniform variable, so P(U ≤ t) = t)

Therefore, X has exactly the distribution we wanted it to have.
Example 3.2.3 Simulate from Expλ
Suppose we want to simulate a random variable X that has an exponential distribution with rate λ. How do we do this based on a standard uniform variable U ∼ U(0, 1)?
The distribution function for Expλ is

Expλ(x) = 0 for x ≤ 0,   1 − e^(−λx) for x ≥ 0
So Expλ : (0, ∞) → (0, 1) has an inverse. Let u ∈ (0, 1):

u = 1 − e^(−λx)
⇔ 1 − u = e^(−λx)
⇔ ln(1 − u) = −λx
⇔ x = −(1/λ) ln(1 − u) =: FX⁻¹(u)

Then X := −(1/λ) ln(1 − U) has an exponential distribution with rate λ.
In fact, since 1 − U is uniform if U is uniform, we could also have used X := −(1/λ) ln U.
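As an R sketch (λ = 2 and n = 10000 are arbitrary choices):

lambda <- 2
u <- runif(10000)
x <- -log(1 - u) / lambda         # inverse-cdf method; -log(u)/lambda works too
c(mean(x), 1 / lambda)            # sample mean should be close to E[X] = 1/lambda
mean(rexp(10000, rate = lambda))  # compare with R's built-in generator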
For specific densities there are many special tricks for simulating observations.
For all of the next sections, let us assume that we have a sequence of independent standard uniform variables U1, U2, U3, . . .
3.2.2.1 Simulating Binomial & Geometric Distributions

Let p be the probability of success for a single Bernoulli trial. Define:

Xi = 1 if ui < p,   0 if ui ≥ p

Then

X := Σ_{i=1}^n Xi ∼ Bn,p

and

W := # of Xi until the first one equals 1,   W ∼ Geometricp
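A sketch in R (p and n are arbitrary illustrative values):

p <- 0.3; n <- 10
Xi <- as.integer(runif(n) < p)               # n Bernoulli trials
X <- sum(Xi)                                 # X ~ Binomial(n, p)
W <- match(1, as.integer(runif(1000) < p))   # trials until first success: Geometric(p)
# (1000 uniforms is assumed plenty here: P(no success at all) = 0.7^1000)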
3.2.2.2 Simulating a Poisson Distribution

With a given U, we know that X = −(1/λ) ln U has an exponential distribution with rate λ.
Define

Y := largest index j such that Σ_{i=1}^{j} Xi ≤ 1 and Σ_{i=1}^{j+1} Xi > 1

Then Y ∼ Poλ.
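As an R sketch, counting how many exponential values fit into the unit interval (λ = 5 and the function name rpois1 are arbitrary choices):

rpois1 <- function(lambda) {
  total <- 0; j <- 0
  repeat {
    total <- total - log(runif(1)) / lambda   # add the next Exp(lambda) value
    if (total > 1) return(j)                  # j arrivals fit into [0, 1]
    j <- j + 1
  }
}
mean(replicate(10000, rpois1(5)))   # should be close to lambda = 5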
3.2.2.3 Simulating a Normal Distribution

To simulate a normal distribution, we need two sequences of standard uniform variables. Let U1 and U2 be two independent standard uniform variables. Define

Z1 := [−2 ln U1]^(1/2) cos(2πU2)
Z2 := [−2 ln U1]^(1/2) sin(2πU2)

Then both Z1 and Z2 have a standard normal distribution and are independent,

Z1, Z2 ∼ N(0, 1)

and

X := μ + σZi ∼ N(μ, σ²)
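This is the Box-Muller transform; a short R sketch (the sample size is arbitrary):

u1 <- runif(5000); u2 <- runif(5000)
z1 <- sqrt(-2 * log(u1)) * cos(2 * pi * u2)
z2 <- sqrt(-2 * log(u1)) * sin(2 * pi * u2)
c(mean(z1), sd(z1))   # should be close to 0 and 1
x <- 100 + 2 * z1     # X ~ N(100, 2^2), as used in Example 3.3.1 below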
3.3 Examples
Example 3.3.1 Simple electric circuit
Consider an electric circuit with three resistors: R1 in series with the parallel pair R2 and R3. Simple physics predicts that R, the overall resistance, is:

R = R1 + (1/R2 + 1/R3)⁻¹ = R1 + (R2 · R3)/(R2 + R3)
Assume the resistors are independent and have a normal distribution with mean 100 Ω and a standard deviation of 2 Ω.
What should we expect for R, the overall resistance?
The following lines are R output from a simulation of 1000 values of R:
# Example: Simple Electric Circuit
#
# Goal: Simulate 1000 random numbers for each of the resistances R1, R2 and R3.
# Compute R, the overall resistance, from those values and get approximations for
# expected value, variance and probabilities:
#
# rnorm (n, mean=0, sd = 1) generates n normal random numbers with the specified
# mean and standard deviation
#
R1 <- rnorm (1000, mean=100, sd = 2)
R2 <- rnorm (1000, mean=100, sd = 2)
R3 <- rnorm (1000, mean=100, sd = 2)
#
# compute R:
R <- R1 + R2*R3/(R2 + R3)
#
# now get the estimates:
mean(R)
[1] 149.9741
sd(R)
[1] 2.134474
#
# ... the probability that R is less than 146 is given by the number of values
# that are less than 146 divided by 1000:
sum(R < 146)/1000
[1] 0.04
Example 3.3.2 At MacMall
Assume you have a summer job at MacMall, and your responsibility is the Blueberry IMacs in stock.
At the start of the day, you have 20 Blueberry IMacs in stock. We know:

X = # of IMacs ordered per day is Poisson with mean 30
Y = # of IMacs received from Apple per day is Poisson with mean 15
Question: what is the probability that at the end of the day you have inventory left in stock?
Let I be the number of Blueberry IMacs in stock at the end of the day:

I = 20 − X + Y

We are asked for the probability that I ≥ 1.
Again, we use R for simulating I:
# Example: MacMall
#
# Goal: generate 1000 Poisson values with lambda = 30
#
# Remember: 1 Poisson value needs several exponential values
# step 1: produce exponential values
u1 <- runif(33000)
e1 <- -1/30*log(u1)
sum(e1)
[1] 1099.096
#
# sum of the exponential values is > 1000, therefore we have enough values
# to produce 1000 Poisson values
#
# step 2:
# add the exponential values (cumsum is cumulative sum)
E1 <- cumsum(e1)
E1[25:35]
[1] 0.7834028 0.7926534 0.7929962 0.7959631 0.8060001 0.8572329 0.8670336
[8] 0.8947401 1.0182220 1.0831698 1.1001983
E1 <- floor(E1)
E1[25:35]
[1] 0 0 0 0 0 0 0 0 1 1 1
#
# Each time we step over the next integer, we get another Poisson value
# by counting how many exponential values we needed to get there.
#
# step 3:
# The ’table’ command counts, how many values of each integer we have
X <- table(E1)
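# (note: table() would skip an integer that never occurs, but with lambda = 30
#  an empty unit interval has probability exp(-30), which is negligible here)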
X[1:10]
0 1 2 3 4 5 6 7 8 9
32 26 31 32 17 27 31 33 32 31
#
# we have 1099 values, we only need 1000
X <- X[1:1000]
#
# check, whether X is a Poisson variable (then, e.g. mean and variance
# must be equal to lambda, which is 30 in our example)
#
mean(X)
[1] 30.013
var(X)
[1] 29.84067
#
# generate another 1000 Poisson values, this time lambda is 15
Y <- rpois(1000,15)
# looks a lot easier!
#
# now compute the variable of interest: I is the number of Blueberry IMacs
# we have in store at the end of the day
I <- 20 - X + Y
#
# and, finally,
# the result we were looking for;
# the (empirical) probability that at the end of the day there are still
# computers in the store:
sum(I > 0)/1000
[1] 0.753
Using simulation gives us the answer that, with an estimated probability of 0.753, there will be Blueberry IMacs in stock at the end of the day.
Why does simulation work? On what properties do we rely when simulating?

P(V ∈ [a, b])  approximated by  p̂ = #{Vi : Vi ∈ [a, b]} / n
E[h(V)]        approximated by  h̄ := (1/n) Σ_{i=1}^n h(Vi)
Var[V]         approximated by  (1/n) Σ_{i=1}^n (Vi − V̄)²
Suppose
V1 = g(X1, Y1, . . . , Z1), V2 = g(X2, Y2, . . . , Z2), . . . , Vn = g(Xn, Yn, . . . , Zn) are i.i.d.
Then #{Vi : Vi ∈ [a, b]} ∼ Bn,p with p = P(V ∈ [a, b]) and n = # of trials.
So we can compute the expected value and variance of p̂. Since n · p̂ = #{Vi : Vi ∈ [a, b]} is binomial, we get:

E[p̂] = (1/n) E[n · p̂] = (1/n) · np = p

Var[p̂] = (1/n²) Var[n · p̂] = (1/n²) · np(1 − p) = p(1 − p)/n ≤ 1/(4n) → 0 for n → ∞

I.e. we have the picture that for large values of n, p̂ has a density centered at the "true" value of P(V ∈ [a, b]) with small spread; i.e. for large n, p̂ is close to p with high probability.
Similarly, for Vi i.i.d., the h(Vi) are also i.i.d. Then

E[h̄] = (1/n) Σ_{i=1}^n E[h(Vi)] = E[h(V)]

and

Var[h̄] = (1/n²) Σ_{i=1}^n Var[h(Vi)] = Var[h(V)]/n → 0 for n → ∞.

Once again we have the same picture for h̄: for large n, the density of h̄ is centered at E[h(V)] and has small spread.
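To see this shrinkage numerically, here is a small R sketch (using V ∼ N(150, 2²) as a stand-in for the circuit example; all numbers are illustrative):

phat <- function(n) {
  V <- rnorm(n, mean = 150, sd = 2)
  sum(V >= 146 & V <= 154) / n        # p-hat for P(146 <= V <= 154)
}
sd(replicate(1000, phat(100)))        # spread of p-hat with n = 100
sd(replicate(1000, phat(10000)))      # about 10 times smaller: Var = p(1-p)/n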