Generating Random Variates

Introduction
• There are various procedures for producing observations (variates) from a desired input distribution (exponential, gamma, etc.)
• The algorithm usually depends on the desired distribution
• But all algorithms have the same general form
• Note the critical importance of a good random-number generator
• There may be several different algorithms for the desired distribution

Desired Properties
• Exact: X should have exactly (not approximately) the desired distribution
  – Example of an approximate algorithm:
    • Treat Z = U1 + U2 + ... + U12 – 6 as N(0, 1)
    • Mean and variance are correct; it relies on the CLT for approximate normality
    • Range is clearly incorrect: Z can only take values in [–6, 6]
• Efficient:
  – Low storage
  – Fast
  – Efficient regardless of parameter values (robust)
• Simple:
  – Easy to understand and implement (often a tradeoff against efficiency)
  – Requires only U(0, 1) input
  – If possible, one U yields one X (for synchronization in variance reduction)

Generating Random Variates
• There are five general approaches to generating a univariate random variable from a distribution:
  – Inverse transform method
  – Composition method
  – Convolution method
  – Acceptance-rejection method
  – Special properties

Inverse Transform Method
• The simplest (in principle) and “best” method in some ways

Continuous Case
• Suppose X is continuous with cumulative distribution function (CDF) F(x) = P(X ≤ x) for all real numbers x, and that F is strictly increasing over all x
Algorithm:
1. Generate U ~ U(0, 1) using a random-number generator
2. Find X such that F(X) = U and return this value of X
• Step 2 involves solving the equation F(X) = U for X; the solution is written $X = F^{-1}(U)$, so we must invert the CDF F
• Inverting F might be easy (exponential) or difficult (normal), in which case numerical methods might be necessary

Mathematical proof: Assume that F is strictly increasing for all x.
For a fixed value $x_0$,

$$P(X \le x_0) = P(F^{-1}(U) \le x_0) = P(F(F^{-1}(U)) \le F(x_0)) = P(U \le F(x_0)) = F(x_0)$$

Graphical proof: $X_1 \le x_0$ if and only if $U_1 \le F(x_0)$, so $P(X_1 \le x_0) = P(U_1 \le F(x_0)) = F(x_0)$

Examples
• Weibull(α, β) distribution with parameters α > 0 and β > 0
Density function is

$$f(x) = \begin{cases} \alpha \beta^{-\alpha} x^{\alpha - 1} e^{-(x/\beta)^{\alpha}} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$

CDF is

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \begin{cases} 1 - e^{-(x/\beta)^{\alpha}} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$

Solving U = F(X) for X, we get

$$U = 1 - e^{-(X/\beta)^{\alpha}} \quad\Longrightarrow\quad X = \beta\,[-\ln(1 - U)]^{1/\alpha}$$

• Since 1 – U ~ U(0, 1) as well, we can replace 1 – U by U to get

$$X = \beta\,[-\ln U]^{1/\alpha}$$

Intuition Behind the Inverse Transform Method
• Weibull (α = 1.5, β = 6)
[Figures not reproduced: The Algorithm in Action]

Inverse Transform Method: Discrete Case
• Suppose X is discrete with cumulative distribution function F(x) = P(X ≤ x) for all real numbers x and probability mass function p(xi) = P(X = xi), where x1, x2, ... are the possible values X can take
Algorithm:
1. Generate random number U ~ U(0,1)
2. Find the smallest positive integer i such that U ≤ F(xi)
3. Return X = xi
• Unlike the continuous case, the discrete inverse transform method can always be used for any discrete distribution
• Graphical proof: X = xi exactly when $F(x_{i-1}) < U \le F(x_i)$, an interval of length p(xi), so P(X = xi) = p(xi) in every case

Example
• Discrete uniform distribution on 1, 2, ..., 100
• xi = i, and p(xi) = p(i) = P(X = i) = 0.01 for i = 1, 2, ..., 100
• “Literal” inverse transform search
1. Generate U ~ U(0,1)
2. If U ≤ 0.01 return X = 1 and stop; else go on
3. If U ≤ 0.02 return X = 2 and stop; else go on
4. If U ≤ 0.03 return X = 3 and stop; else go on
...
100. If U ≤ 0.99 return X = 99 and stop; else go on
101. Return X = 100
• Equivalently (on a U-for-U basis)
1. Generate U ~ U(0,1)
2.
Return $X = \lfloor 100U \rfloor + 1$

Generalized Inverse Transform Method

$$X = \min\{x : F(x) \ge U\}$$

• This is valid for any CDF F(x)
  – Continuous, possibly with flat spots (i.e., not strictly increasing)
  – Discrete
  – Mixed continuous-discrete

Problems with the inverse transform method
• Inverting the CDF may be difficult (numerical methods)
• It may not be the fastest or simplest approach for a given distribution

Other Implementations of Inverse Transform Approach
• Sometimes the inverse function is not analytically available but can be computed numerically.
• In this case, we can still apply the inverse transform approach by using the numerical inverse.
• Consider the normal distribution: its CDF cannot be written in closed form, so the inverse function does not exist in closed form either.
• However, most software enables numerical calculation of the inverse function.
• Example: NORMINV(p,m,s) in Excel returns $F_X^{-1}(p)$, where X is Normal with mean m and standard deviation s, and 0 < p < 1. Therefore, we can generate realizations of X by NORMINV(RAND(),m,s), which is equivalent to $F_X^{-1}(U)$ where U is uniform (0,1).
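The inverse transform algorithms above (the continuous Weibull case, the discrete search, the discrete uniform shortcut, and the numerical inverse for the normal) can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the function names are made up for this example, and the standard library's statistics.NormalDist.inv_cdf is used as a stand-in for Excel's NORMINV.

```python
import math
import random
from bisect import bisect_left
from itertools import accumulate
from statistics import NormalDist


def weibull_from_uniform(u, alpha, beta):
    """Continuous case: solve F(X) = U for the Weibull(alpha, beta) CDF.

    Uses X = beta * (-ln(1 - U))**(1/alpha). The 1 - U form is kept here
    because random.random() can return 0.0, and log(0) is undefined.
    """
    return beta * (-math.log(1.0 - u)) ** (1.0 / alpha)


def discrete_inverse_transform(u, values, probs):
    """Discrete case: return x_i for the smallest i with U <= F(x_i)."""
    cdf = list(accumulate(probs))      # F(x_1), F(x_2), ...
    i = bisect_left(cdf, u)            # first index with cdf[i] >= u
    # Clamp in case round-off leaves the last CDF entry slightly below 1.
    return values[min(i, len(values) - 1)]


def discrete_uniform_1_to_100(u):
    """Shortcut for the discrete uniform on 1..100: X = floor(100 U) + 1."""
    return math.floor(100.0 * u) + 1


def normal_by_numerical_inverse(u, m, s):
    """Numerical inverse of the Normal(m, s) CDF; requires 0 < u < 1."""
    return NormalDist(mu=m, sigma=s).inv_cdf(u)


if __name__ == "__main__":
    rng = random.Random(12345)  # arbitrary seed, chosen for repeatability
    print(weibull_from_uniform(rng.random(), alpha=1.5, beta=6.0))
    print(discrete_inverse_transform(rng.random(), [1, 2, 3], [0.2, 0.5, 0.3]))
    print(discrete_uniform_1_to_100(rng.random()))
    print(normal_by_numerical_inverse(0.5 * (1.0 + rng.random()), 10.0, 2.0))
```

Each function takes the uniform U as an argument rather than drawing it internally, which keeps the one-U-per-X correspondence explicit and makes the functions easy to check by hand.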
Acceptance-Rejection Method
• This is used when inverse transform is not directly applicable or is inefficient (like for gamma and beta distributions)
• It has continuous and discrete versions; we’ll just do the continuous case, discrete case is similar
• Goal is to generate X with some density function f
• Specify a function t(x) that majorizes f(x), i.e., t(x) ≥ f(x) for all x
• Then t(x) ≥ 0 for all x, but $\int_{-\infty}^{\infty} t(x)\,dx \ge \int_{-\infty}^{\infty} f(x)\,dx = 1$, so t(x) is not a density
• Set $c = \int_{-\infty}^{\infty} t(x)\,dx \ge 1$
• Define a new density r(x) = t(x)/c for all x
Algorithm:
1. Generate Y having density r
2. Generate U ~ U(0,1) (independent of Y in Step 1)
3. If U ≤ f(Y)/t(Y), return X = Y and stop; else, go back to step 1 and try again
(Repeat 1-2-3 until acceptance finally occurs in Step 3)
• Since t majorizes f, f(Y)/t(Y) ≤ 1, so “U ≤ f(Y)/t(Y)” may or may not occur
• We must be able to generate Y with density r; hopefully this is easy with our choice of t
• On each pass, P(acceptance) = 1/c, so we want small c = area under t(x), to “fit” down on top of f closely (we want t and thus r to resemble f closely)
• Tradeoff between ease of generation from r, and closeness of fit to f

Proof: The key point is that we get an X only conditional on acceptance in step 3

$$P(X \le x) = P(Y \le x \mid \text{acceptance}) = \frac{P(\text{acceptance},\, Y \le x)}{P(\text{acceptance})}$$

$$P(\text{acceptance} \mid Y = y) = P\!\left(U \le \frac{f(y)}{t(y)}\right) = \frac{f(y)}{t(y)}$$

$$P(\text{acceptance}) = \int_{-\infty}^{\infty} P\!\left(U \le \frac{f(y)}{t(y)}\right) r(y)\,dy = \int_{-\infty}^{\infty} \frac{f(y)}{t(y)} \cdot \frac{t(y)}{c}\,dy = \frac{1}{c}$$

$$P(\text{acceptance},\, Y \le x) = \int_{-\infty}^{\infty} P(\text{acceptance},\, Y \le x \mid Y = y)\, r(y)\,dy = \int_{-\infty}^{x} P(\text{acceptance} \mid Y = y)\, r(y)\,dy = \frac{1}{c} \int_{-\infty}^{x} \frac{f(y)}{t(y)}\, t(y)\,dy = \frac{F(x)}{c}$$

$$P(X \le x) = \frac{P(\text{acceptance},\, Y \le x)}{P(\text{acceptance})} = \frac{F(x)/c}{1/c} = F(x)$$

Example of Acceptance-Rejection Method
• Beta(4,3) distribution, density is $f(x) = 60x^3(1-x)^2$ for 0 ≤ x ≤ 1
• The maximum occurs at x = 0.6, f(0.6) = 2.0736 (exactly), so let t(x) = 2.0736 for 0 ≤ x ≤ 1
• Thus, c = 2.0736, and r is the U(0,1) density function
Algorithm:
1. Generate Y ~ U(0,1)
2.
Generate U ~ U(0,1) independent of Y
3. If $U \le 60Y^3(1-Y)^2 / 2.0736$, return X = Y; else go back to step 1 and try again
• P(acceptance) in step 3 is 1/2.0736 ≈ 0.48

Intuition
• A different way to look at it
• Accept Y if U ≤ f(Y)/t(Y), so plot the pairs (Y, U t(Y)) and accept the Y’s for which the pair is under the f curve

A Closer Fit
• Higher acceptance probability on a given pass
• Harder to generate Y with density shaped like t; can perhaps use the composition method

Example
• $f_X(x) = \tfrac{3}{4}(1 - x^2)$ if –1 < x < 1 and $f_X(x) = 0$ otherwise
• Majorizing function?
[Figure not reproduced: f_X(x) plotted for –1 < x < 1, with peak value 3/4]

Composition Method
• Suppose that the inverse transform method is difficult or too slow
• Suppose we can find other CDFs F1, F2, ... (finite or infinite list) and weights p1, p2, ... (pj ≥ 0 and p1 + p2 + ... = 1) such that for all x
  F(x) = p1F1(x) + p2F2(x) + ...
• Equivalently, we can decompose the density or mass function f(x) into a convex combination of other density or mass functions
  f(x) = p1f1(x) + p2f2(x) + ...
• Algorithm:
1. Generate a positive random integer J such that P(J = j) = pj
2. Return X with CDF FJ (given J = j, X is generated independent of J)
• The trick is to find Fj’s from which generation is easy and fast
• Sometimes we can use geometry of distribution to suggest a decomposition

Example of Composition Method
• Symmetric triangular distribution on [–1, +1]

$$\text{Density: } f(x) = \begin{cases} x + 1 & \text{if } -1 \le x \le 0 \\ -x + 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

$$\text{CDF: } F(x) = \begin{cases} 0 & \text{if } x < -1 \\ x^2/2 + x + 1/2 & \text{if } -1 \le x < 0 \\ -x^2/2 + x + 1/2 & \text{if } 0 \le x \le 1 \\ 1 & \text{if } x > 1 \end{cases}$$

• Inverse transform method

$$U = F(X) = \begin{cases} X^2/2 + X + 1/2 & \text{if } U < 1/2 \\ -X^2/2 + X + 1/2 & \text{if } U \ge 1/2 \end{cases} \quad\Longrightarrow\quad X = \begin{cases} \sqrt{2U} - 1 & \text{if } U < 1/2 \\ 1 - \sqrt{2(1 - U)} & \text{if } U \ge 1/2 \end{cases}$$

• Composition of the density

$$f(x) = (x + 1)\, I_{[-1,0]}(x) + (-x + 1)\, I_{[0,1]}(x) = \underbrace{0.5}_{p_1}\, \underbrace{2(x + 1)\, I_{[-1,0]}(x)}_{f_1(x)} + \underbrace{0.5}_{p_2}\, \underbrace{2(-x + 1)\, I_{[0,1]}(x)}_{f_2(x)} = p_1 f_1(x) + p_2 f_2(x)$$

• Composition algorithm
1. Generate U1, U2 ~ U(0,1) independently
2.
If U1 < 1/2, return $X = \sqrt{U_2} - 1$; otherwise return $X = 1 - \sqrt{1 - U_2}$
• Comparison of algorithms (expectations)

Method             U’s  Compares  Adds  Multiplies  Sq. Roots
Inverse transform  1    1         1.5   1           1
Composition        2    1         1.5   0           1

• So composition needs one more U but one fewer multiplication; it is faster if the RNG is fast

Convolution
• Suppose X has the same distribution as Y1 + Y2 + ... + Ym, where the Yj’s are IID and m is fixed and finite
• Then, X ~ Y1 + Y2 + ... + Ym has the m-fold convolution of the distribution of Yj
• Contrast with the composition method
  – Composition method expresses the distribution function (or density or mass) as a (weighted) sum of other distribution functions (or densities or masses)
  – Convolution method expresses the random variable itself as the sum of other random variables
• Algorithm (obvious)
1. Generate Y1, Y2, ..., Ym independently from their distribution
2. Return X = Y1 + Y2 + ... + Ym

Examples of Convolution Method
• Symmetric triangular distribution on [–1, +1] (again)

$$\text{Density: } f(x) = \begin{cases} x + 1 & \text{if } -1 \le x \le 0 \\ -x + 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

• By simple conditional probability, if U1, U2 ~ IID U(0,1), then U1 + U2 ~ symmetric triangular distribution on [0, 2], so just shift it left by 1:

$$X = U_1 + U_2 - 1 = (U_1 - 0.5) + (U_2 - 0.5) = Y_1 + Y_2$$

• 2 U’s, 2 adds; no compares, multiplies, or square roots, so it clearly beats both the inverse transform and composition methods
• X ~ Erlang-m with mean b > 0
  – Then, X = Y1 + Y2 + ... + Ym where the Yj’s ~ IID exponential with mean b/m

Special Properties
• These are simply “tricks” that rely completely on a given distribution’s form
• They often involve combining several “component” variates algebraically (like convolution)
• Must be able to figure out distributions of functions of random variables
• There is no coherent general form, only examples
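The acceptance-rejection, composition, and convolution examples worked through earlier can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the function names and the seed are invented for this example, the Beta(4,3) generator uses the constant majorizer t(x) = 2.0736 from the slides, and the exponential variates in the Erlang example are produced by the inverse transform method.

```python
import math
import random


def beta43_density(x):
    """Beta(4, 3) density f(x) = 60 x^3 (1 - x)^2 on [0, 1]."""
    return 60.0 * x**3 * (1.0 - x) ** 2


def beta43_accept_reject(rng):
    """Acceptance-rejection with t(x) = 2.0736 on [0, 1], so r is U(0, 1).

    Returns (x, passes), where passes counts trips through steps 1-3.
    """
    t = 2.0736  # f(0.6), the maximum of the density
    passes = 0
    while True:
        passes += 1
        y = rng.random()                 # step 1: Y with density r
        u = rng.random()                 # step 2: U independent of Y
        if u <= beta43_density(y) / t:   # step 3: accept, else try again
            return y, passes


def triangular_inverse(u):
    """Symmetric triangular on [-1, 1] by inverse transform (one U)."""
    if u < 0.5:
        return math.sqrt(2.0 * u) - 1.0
    return 1.0 - math.sqrt(2.0 * (1.0 - u))


def triangular_composition(u1, u2):
    """Same distribution by composition: u1 picks the half, u2 generates."""
    if u1 < 0.5:
        return math.sqrt(u2) - 1.0       # from f1(x) = 2(x + 1) on [-1, 0]
    return 1.0 - math.sqrt(1.0 - u2)     # from f2(x) = 2(1 - x) on [0, 1]


def triangular_convolution(u1, u2):
    """Same distribution by convolution: shift U1 + U2 left by 1."""
    return u1 + u2 - 1.0


def erlang_convolution(m, b, rng):
    """Erlang-m with mean b as a sum of m IID exponentials of mean b/m."""
    return sum(-(b / m) * math.log(1.0 - rng.random()) for _ in range(m))


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, chosen for repeatability
    n = 100_000
    draws = [beta43_accept_reject(rng) for _ in range(n)]
    print("Beta(4,3) sample mean:", sum(x for x, _ in draws) / n)
    print("mean passes per variate:", sum(p for _, p in draws) / n)
    print("Erlang-3 sample mean:",
          sum(erlang_convolution(3, 6.0, rng) for _ in range(n)) / n)
```

With a large sample, the Beta(4,3) sample mean should settle near 4/7 and the mean number of passes near c = 2.0736, which is one quick way to sanity-check the claim that P(acceptance) = 1/c.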