SUMMARY OF DISTRIBUTION FACTS

This document contains a number of distributional facts about the common random variables. These facts are somewhat difficult to organize, but the following index should be helpful:

Distribution                 See items
Beta                         14, 15, 18, 21
Binomial                     12, 13, 18
Cauchy                       16
Chi-squared                  4, 5, 6, 7, 9, 19, 20
Exponential                  2, 4, 9, 17
F                            20, 21
Gamma                        1, 2, 4, 5, 15, 17
Hypergeometric               22, 23
Lognormal                    25
Negative hypergeometric      23
Negative binomial            24
Noncentral t                 19
Noncentral F                 20
Noncentral chi-squared       8, 20
Normal                       7, 8, 11, 16, 19, 25
Poisson                      10, 13, 17
t                            19
Uniform                      3, 4, 18

This document is not intended to be comprehensive. Revision date 2011JAN25.

(1) X ~ Gamma(r, λ) means that the density is

    f(x) = \frac{λ^r x^{r-1} e^{-λx}}{Γ(r)} I(x > 0)

with E X = r/λ and Var X = r/λ².

In many cases X will have units, say seconds. Then the parameter λ will have units of 1/sec. The parameter r has no units; it's a "pure number." In the numerator of f(x), observe that λ^r x^{r-1} = λ(λx)^{r-1} will have the units of 1/sec. The resulting units for f(x) will then be 1/sec. It is always the case that a random variable and its density have reciprocal units.

The parameters are sometimes described as shape and scale. The shape parameter is r and the scale parameter is 1/λ. In these terms, E X = Shape × Scale and Var X = Shape × Scale². See the comment below about replacing λ by 1/θ.

The function Γ(r) is \int_0^∞ u^{r-1} e^{-u} du, which is the gamma function.

The moment generating function is defined as E(e^{tX}). This can only make sense if t has the reciprocal units of X. Thus t has units 1/sec.

The moment generating function is M(t) = (1 - t/λ)^{-r}, restricted to t < λ. Observe that t/λ has no units and also that M(t) has no units. This leads to

    E X^k = \frac{Γ(k + r)}{λ^k Γ(r)}

The CDF cannot be written in simple closed form, but some help is available. Start from

    F(x) = \int_0^x \frac{λ^r u^{r-1} e^{-λu}}{Γ(r)} du = \frac{1}{Γ(r)} \int_0^{λx} v^{r-1} e^{-v} dv

using the substitution v = λu. The function IG_r(y) = \int_0^y v^{r-1} e^{-v} dv is the incomplete gamma function. Thus

    F(x) = \frac{1}{Γ(r)} IG_r(λx)

There are competing notations, so be careful.

The cumulative distribution function F does not have units. This is the case for every random variable, as F is defined as a probability. Specifically F(x) = P[ X ≤ x ].

If X₁ ~ Gamma(r₁, λ) and X₂ ~ Gamma(r₂, λ), and if X₁ and X₂ are independent, then X₁ + X₂ ~ Gamma(r₁ + r₂, λ). This property generalizes to more than two X's.

If X ~ Gamma(r, λ) and if Y = cX (with c > 0), then Y ~ Gamma(r, λ/c). The multiplier c can be thought of as a pure number, but it is often used to change units, as perhaps c = (1 minute)/(60 seconds).

In some notational schemes the parameter λ is replaced with 1/θ. Observe that θ has the same units as X. With this change, we have

    f(x) = \frac{x^{r-1} e^{-x/θ}}{Γ(r) θ^r} I(x > 0)

    E X = rθ and Var X = rθ²

    M(t) = (1 - θt)^{-r}, restricted to t < 1/θ

    E X^k = \frac{θ^k Γ(k + r)}{Γ(r)}

(2) If X ~ Gamma(r = 1, λ), then its density is f(x) = λe^{-λx} I(x > 0), which is the exponential with mean 1/λ. E(X) = 1/λ and Var(X) = 1/λ². The CDF is F(x) = 1 - e^{-λx}, for x > 0.

In many cases X will have units, say seconds. Then the parameter λ will have units of 1/sec.

The moments and moment-generating function are obtained as special cases of (1), using r = 1.

If X₁, X₂, …, Xₙ are independent, each exponential with mean 1/λ, then X₁ + X₂ + … + Xₙ ~ Gamma(n, λ).

In some notational schemes the parameter λ is replaced with 1/θ. With this change, we have

    f(x) = \frac{1}{θ} e^{-x/θ} I(x > 0)

    E X = θ and Var X = θ²

    M(t) = (1 - θt)^{-1}, restricted to t < 1/θ

    E X^k = k! θ^k
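The additivity in points (1) and (2) is easy to confirm by simulation. Here is a minimal sketch, assuming Python with numpy; the parameter values and seed are illustrative only:

```python
import numpy as np

# Sum n i.i.d. exponentials with mean 1/lam; by points (1)-(2) the sum is
# Gamma(n, lam), so its mean and variance should be n/lam and n/lam**2.
rng = np.random.default_rng(seed=0)
lam, n, reps = 2.0, 5, 200_000

x = rng.exponential(scale=1 / lam, size=(reps, n))  # exponentials, mean 1/lam
s = x.sum(axis=1)                                   # each row: one Gamma(n, lam) draw

print(s.mean(), n / lam)     # both ~ 2.5
print(s.var(), n / lam**2)   # both ~ 1.25
```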
(3) If the random variable U is uniform on the interval [a, b], then its density is

    f(u) = \frac{1}{b - a} I(a ≤ u ≤ b)

The CDF is given by

    F(u) = 0                   if u < a
    F(u) = (u - a)/(b - a)     if a ≤ u ≤ b
    F(u) = 1                   if u > b

The expected value is E(U) = (a + b)/2 and the variance is Var(U) = (b - a)²/12.

One will occasionally see this defined over the open interval (a, b), replacing I(a ≤ u ≤ b) with I(a < u < b). The motive for doing this is almost certainly the creation of a mathematical counterexample.

In the most common application, a = 0 and b = 1, and we write U ~ unif(0, 1). In this case, the mean is 1/2 and the variance is 1/12.

If it happens that U has units, then a and b must be in the same units.

If the continuous random variable X has cumulative distribution function F, then the random variable U = F(X) has the distribution unif(0, 1). This forms the basis of the computer simulation method for generating X. Create the unif(0, 1) random variable U and then let X = F⁻¹(U). This is only helpful if F⁻¹ is easy to compute.

If a statistical hypothesis test of H₀ versus H₁ is based on a continuous test statistic, then the p-value is distributed as unif(0, 1) when H₀ is true.

(4) If U ~ unif(0, 1), then X = -\frac{ln U}{λ} ~ Gamma(1, λ). That is, -(ln U)/λ follows the exponential distribution.

If U ~ unif(0, 1), then X = 2[-ln U] ~ χ²₂, the chi-squared distribution with two degrees of freedom. In this spirit, if U₁, U₂, …, U_k are independent, each unif(0, 1), then

    2[-ln U₁] + 2[-ln U₂] + … + 2[-ln U_k] = 2[-ln(U₁U₂…U_k)] ~ χ²_{2k}

(5) If X ~ Gamma(r = k/2, λ = 1/2), and if k is an integer, then X has the chi-squared distribution with k degrees of freedom. The density is

    f(x) = \frac{x^{k/2 - 1} e^{-x/2}}{2^{k/2} Γ(k/2)} I(x > 0)

We write X ~ χ²_k. We have E χ²_k = k and Var χ²_k = 2k.

The distribution described as σ²χ²_k is Gamma(r = k/2, λ = 1/(2σ²)).

(6) If W₁, W₂, …, W_k are independent chi-squared random variables with n₁, n₂, …, n_k degrees of freedom, then W₁ + W₂ + … + W_k is chi-squared with n₁ + n₂ + … + n_k degrees of freedom.

(7) If Z ~ N(0, 1), then Z² ~ χ²₁. If Z₁, Z₂, …, Zₙ are independent, each N(0, 1), then Z₁² + Z₂² + … + Zₙ² ~ χ²ₙ. If the individual distributions are N(0, σ²), then the distributional property is Z₁² + Z₂² + … + Zₙ² ~ σ²χ²ₙ.

(8) If Z₁, Z₂, …, Zₙ are independent, with Zᵢ ~ N(μᵢ, σ²), then

    W = \frac{Z₁² + Z₂² + … + Zₙ²}{σ²}

has the distribution called noncentral chi-squared with n degrees of freedom and noncentrality parameter

    δ² = \frac{μ₁² + μ₂² + … + μₙ²}{σ²}

We would write W ~ χ²ₙ(δ²). The mean is E W = n + δ², and the variance is Var W = 2n + 4δ².

(9) The chi-squared distribution on two degrees of freedom has density

    f(x) = \frac{1}{2} e^{-x/2} I(x > 0)

which is exponential with mean 2.

(10) The Poisson law is discrete, with support over the set of non-negative integers {0, 1, 2, ...}.

    f(x) = P[ X = x ] = \frac{e^{-λ} λ^x}{x!}   for x = 0, 1, 2, 3, …

We write X ~ Poisson(λ). This is a discrete probability law, so that there are no units to X or to λ.

The MGF is M(t) = exp{ λ(e^t - 1) } = e^{λ(e^t - 1)}. It happens that E X = λ and Var X = λ.

If X ~ Poisson(λ) and Y ~ Poisson(μ), and if X and Y are independent, then X + Y ~ Poisson(λ + μ). This property generalizes to more than two summands.
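The additivity fact in point (10) can be verified numerically: convolving two Poisson probability vectors should reproduce the Poisson law with the summed mean. A small sketch, assuming Python with numpy; the truncation point kmax is an illustrative choice:

```python
import numpy as np
from math import exp, factorial

# Poisson pmf on {0, 1, ..., kmax}.
def poisson_pmf(lam, kmax):
    return np.array([exp(-lam) * lam**k / factorial(k) for k in range(kmax + 1)])

lam, mu, kmax = 1.5, 2.5, 40

# Convolving the Poisson(lam) and Poisson(mu) pmfs gives the pmf of X + Y,
# which by point (10) should match the Poisson(lam + mu) pmf.
conv = np.convolve(poisson_pmf(lam, kmax), poisson_pmf(mu, kmax))[: kmax + 1]
direct = poisson_pmf(lam + mu, kmax)
print(np.max(np.abs(conv - direct)))   # ~ 1e-16 (rounding error only)
```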
(11) The normal random variable has density

    f(x) = \frac{1}{σ\sqrt{2π}} e^{-\frac{1}{2}\left(\frac{x - μ}{σ}\right)^2}

The parameters μ and σ have the same units as the random variable X. We write X ~ N(μ, σ²).

If X ~ N(μ, σ²) and Y ~ N(ν, τ²), and if X and Y are independent, then X + Y ~ N(μ + ν, σ² + τ²). This generalizes to more than two summands.

If X ~ N(μ, σ²), then E(X) = μ and Var(X) = σ². Also,

    E[ (X - μ)^k ]  = 0 for odd positive integer k
    E[ (X - μ)²  ]  = σ²
    E[ (X - μ)⁴  ]  = 3σ⁴
    E[ (X - μ)⁶  ]  = 15σ⁶
    E[ (X - μ)⁸  ]  = 105σ⁸
    E[ (X - μ)¹⁰ ]  = 945σ¹⁰

If X ~ N(μ, σ²), then its moment generating function is M(t) = exp{ μt + ½t²σ² } = e^{μt + t²σ²/2}.

The case (μ = 0, σ = 1) is the standard normal. The density is usually written as φ (rather than f), and the random variable is usually named Z (rather than X). Thus

    φ(z) = \frac{1}{\sqrt{2π}} e^{-z²/2}

The cumulative distribution function is usually written as Φ (rather than F), and

    Φ(z) = \int_{-∞}^{z} \frac{1}{\sqrt{2π}} e^{-u²/2} du

The integral cannot be evaluated in closed form. Values of Φ are given in tables of the normal distribution, but several different layouts are used for these tables.

Point (4) noted that if U ~ unif(0, 1), then X = 2[-ln U] ~ χ²₂. Point (7) gave the relationship between the normal and chi-squared distributions. Together these can be used as a clever basis for computer generation of independent standard normal random variables, in pairs. Specifically, let U and V be independent unif(0, 1) random variables. Then define

    X₁ = \sqrt{-2 ln U} \, cos(2πV)
    X₂ = \sqrt{-2 ln U} \, sin(2πV)

The random variables X₁ and X₂ will be independent standard normal. While computer routines easily generate uniform random numbers, there is still computational labor in calculating logarithms, sines, and cosines. (A simulation sketch of this construction appears after point (16) below.)

(12) The binomial random variable is discrete with support on the set {0, 1, 2, ..., n}.

    f(x) = P[ X = x ] = \binom{n}{x} p^x (1 - p)^{n - x}   for x = 0, 1, 2, …, n

Write X ~ bin(n, p). E(X) = np and Var(X) = np(1 - p). The moment-generating function is

    M(t) = \left( (1 - p) + pe^t \right)^n

If X₁ ~ bin(n₁, p) and X₂ ~ bin(n₂, p) and if X₁ and X₂ are independent, then X₁ + X₂ ~ bin(n₁ + n₂, p).

(13) If Y ~ Poisson(λ) and if X | Y = y ~ bin(y, p), then X ~ Poisson(λp).

(14) The beta(a, b) density is

    f(x) = \frac{Γ(a + b)}{Γ(a) Γ(b)} x^{a-1} (1 - x)^{b-1} I(0 < x < 1)

This is sometimes defined in terms of the beta function. This function is

    B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1} dx

The random variable has no units, and the parameters a and b also have no units. It can be shown that B(a, b) = \frac{Γ(a) Γ(b)}{Γ(a + b)}. The density could then be written as

    f(x) = \frac{1}{B(a, b)} x^{a-1} (1 - x)^{b-1} I(0 < x < 1)

For the case in which a and b are integers, this can be written as

    f(x) = \frac{(a + b - 1)!}{(a - 1)! (b - 1)!} x^{a-1} (1 - x)^{b-1} I(0 < x < 1)

This probability law has

    E X^k = \frac{Γ(a + b) Γ(a + k)}{Γ(a) Γ(a + b + k)}

For k = 1, this leads to

    E X = \frac{Γ(a + b) Γ(a + 1)}{Γ(a) Γ(a + b + 1)} = \frac{Γ(a + b) \, aΓ(a)}{Γ(a) \, (a + b)Γ(a + b)} = \frac{a}{a + b}

For k = 2, this is

    E X² = \frac{Γ(a + b) Γ(a + 2)}{Γ(a) Γ(a + b + 2)} = \frac{Γ(a + b) \, (a + 1) a Γ(a)}{Γ(a) \, (a + b + 1)(a + b) Γ(a + b)} = \frac{a(a + 1)}{(a + b)(a + b + 1)}

It follows that

    Var X = \frac{a(a + 1)}{(a + b)(a + b + 1)} - \left( \frac{a}{a + b} \right)^2 = \frac{ab}{(a + b)^2 (a + b + 1)}

(15) If X ~ Gamma(r, λ), if Y ~ Gamma(s, λ), and if X and Y are independent, then \frac{X}{X + Y} ~ beta(r, s). Moreover, this ratio is independent of X + Y.

The random variables X and Y must be in the same units, and λ will then be in the reciprocal units. There are no units for r, s, or \frac{X}{X + Y}.

(16) If X and Y are independent normal random variables, each N(0, 1), then Z = \frac{X}{Y} is distributed as Cauchy with density

    f(z) = \frac{1}{π} \frac{1}{1 + z²}

The random variables X and Y must be in the same units. The random variable Z will have no units. The identical conclusion applies when the distributions are N(0, σ²), but not when they are N(μ, σ²) with μ ≠ 0.
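Both the Box-Muller construction in point (11) and the normal-ratio fact in point (16) can be checked with a few lines of simulation. A sketch, assuming Python with numpy; the sample size and seed are illustrative:

```python
import numpy as np

# Box-Muller (point 11): two unif(0,1) draws give two independent N(0,1)'s.
rng = np.random.default_rng(seed=0)
n = 200_000
u, v = rng.uniform(size=n), rng.uniform(size=n)
x1 = np.sqrt(-2 * np.log(u)) * np.cos(2 * np.pi * v)
x2 = np.sqrt(-2 * np.log(u)) * np.sin(2 * np.pi * v)
print(x1.mean(), x1.var())          # ~ 0 and ~ 1
print(np.corrcoef(x1, x2)[0, 1])    # ~ 0 (the pair is in fact independent)

# Ratio of independent N(0,1)'s (point 16) is Cauchy, whose quartiles are -1, +1.
z = x1 / x2
print(np.percentile(z, [25, 75]))   # ~ [-1, +1]
```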
(17) If X ~ Gamma(k, λ), and if k is an integer, then

    P[ X > x ] = \int_x^∞ \frac{λ^k t^{k-1} e^{-λt}}{Γ(k)} dt = \sum_{i=0}^{k-1} \frac{e^{-λx} (λx)^i}{i!}

This result links the CDF of the gamma distribution to the CDF of the Poisson distribution. The right side is the probability that a Poisson random variable with mean λx takes a value less than k. (A numerical check appears after point (21) below.)

This result links a number of facts about Poisson processes:

The inter-event times in a Poisson process with rate λ have an exponential distribution with mean 1/λ; this distribution is Gamma(1, λ).

The sum of k independent versions of Gamma(1, λ) is Gamma(k, λ), as noted in point (2).

If X above represents the total of k inter-event times, then the description [ X > x ] says that in the time interval (0, x) there are fewer than k events.

In a Poisson process with rate λ, the number of events in the time interval (0, x) follows the Poisson distribution with mean λx.

(18) If X ~ beta(k, n + 1 - k), and if k and n are integers, then

    P[ X > p ] = \int_p^1 \frac{n!}{(k - 1)! (n - k)!} z^{k-1} (1 - z)^{n-k} dz = \sum_{j=0}^{k-1} \binom{n}{j} p^j (1 - p)^{n-j}

The right side is the probability that a binomial(n, p) random variable takes a value less than k. This links the CDF of the beta distribution to the CDF of the binomial.

This result is an interesting assembly of some other facts:

Consider a sample U₁, U₂, …, Uₙ from the uniform (0, 1) distribution. If the n values are sorted as U₍₁₎ ≤ U₍₂₎ ≤ … ≤ U₍ₙ₎, this process forms the order statistics. The distribution of U₍ₖ₎ can be shown to be beta(k, n + 1 - k). The density of the beta(k, n + 1 - k) probability law, using z as the carrier, is

    \frac{n!}{(k - 1)! (n - k)!} z^{k-1} (1 - z)^{n-k} I(0 < z < 1)

Consider a sample U₁, U₂, …, Uₙ from the uniform (0, 1) distribution. For any value p in (0, 1), the probability that U₁ is less than p is p. Thus, the total number of these U's which are less than p must follow the binomial(n, p) distribution; we can represent this as Y = \sum_{i=1}^{n} I(Uᵢ ≤ p) ~ binomial(n, p).

Consider a sample U₁, U₂, …, Uₙ from the uniform (0, 1) distribution. It happens that U₍ₖ₎ > p if and only if the number of U's which are less than p is 0, 1, 2, …, k - 1.

(19) If Z ~ N(0, 1) and U ~ χ²_k, and if Z and U are independent, then the distribution of

    \frac{Z}{\sqrt{U/k}}

is t with k degrees of freedom. The same distribution results if Z ~ N(0, σ²) and U ~ σ²χ²_k.

If Z ~ N(μ, σ²) and U ~ σ²χ²_k, and if Z and U are independent, then the distribution of Z / \sqrt{U/k} is called the noncentral t with k degrees of freedom and with noncentrality parameter μ/σ.

In this construction, the random variable Z and the parameters μ and σ will have the same units. The random variable U will be in the squared units.

(20) If U ~ χ²_m and V ~ χ²_n, and if U and V are independent, then the distribution of

    \frac{U/m}{V/n}

is F_{m,n}, the F distribution with (m, n) degrees of freedom. The same distribution results if U ~ σ²χ²_m and V ~ σ²χ²_n.

It should be noted that the reciprocal \frac{V/n}{U/m} has the F distribution with (n, m) degrees of freedom. The degrees of freedom numbers are reversed.

One can also have a noncentral chi-squared in the numerator. If the assumptions are changed to U ~ σ²χ²_m(δ²) and V ~ σ²χ²_n, with U and V independent, then the distribution of \frac{U/m}{V/n} is called the noncentral F with (m, n) degrees of freedom and noncentrality parameter δ².

(21) If F ~ F_{m,n}, then the random variable

    \frac{1}{1 + \frac{m}{n} F}

has the beta(n/2, m/2) distribution.
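Points (17) and (21) both lend themselves to quick numerical checks. A sketch, assuming Python with scipy; note that scipy's gamma distribution is parameterized with scale = 1/λ, and the specific values of k, λ, x, m, n are illustrative:

```python
import numpy as np
from scipy import stats

# Point (17): for X ~ Gamma(k, lam) with integer k, P[X > x] equals the
# probability that a Poisson(lam * x) variable takes a value less than k.
k, lam, x = 4, 2.0, 1.7
print(stats.gamma.sf(x, a=k, scale=1 / lam))   # P[X > x]
print(stats.poisson.cdf(k - 1, mu=lam * x))    # P[Poisson(lam*x) <= k - 1]

# Point (21): if F ~ F(m, n), then 1/(1 + (m/n)F) ~ beta(n/2, m/2), so the
# sample quartiles of the transformed draws should match the beta quartiles.
m, n = 5, 8
f = stats.f.rvs(m, n, size=200_000, random_state=np.random.default_rng(0))
w = 1.0 / (1.0 + (m / n) * f)
print(np.percentile(w, [25, 50, 75]))                    # sample quartiles
print(stats.beta.ppf([0.25, 0.50, 0.75], n / 2, m / 2))  # exact beta quartiles
```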
(22) Consider a population consisting of N objects of which M are special in some sense and N - M are non-special. If a sample of size n is taken without replacement, and if X denotes the number that are special in the sample, then

    P[ X = x ] = \frac{\binom{M}{x} \binom{N - M}{n - x}}{\binom{N}{n}} = \frac{\binom{n}{x} \binom{N - n}{M - x}}{\binom{N}{M}}

The possible values of X are the integers from max(0, n - (N - M)) to min(n, M). The mean is

    E X = n \frac{M}{N}

and the variance is

    Var X = n \frac{M}{N} \left( 1 - \frac{M}{N} \right) \frac{N - n}{N - 1}

The random variable X is hypergeometric, and we write X ~ HG(N; n, M). This notation reflects the fact that n and M are exchangeable in P[ X = x ].

(23) Consider a population consisting of N objects of which M are special in some sense and N - M are non-special. Consider sampling without replacement until exactly r special objects are obtained. Let X denote the number of non-special objects which precede the rth special object. Then

    P[ X = x ] = \frac{\binom{M}{r - 1} \binom{N - M}{x}}{\binom{N}{x + r - 1}} \cdot \frac{M - (r - 1)}{N - (x + r - 1)}

The event [ X = x ] is the situation in which x + r selections are required, and in which the final selection is special. Thus, the first x + r - 1 selections yield r - 1 which are special, and this reflects the hypergeometric factor in P[ X = x ]. The possible values of X are the integers from 0 to N - M. The probability law for X is described as negative hypergeometric. We have

    E X = r \frac{N - M}{M + 1}   and   Var X = r \frac{(N + 1)(N - M)(M + 1 - r)}{(M + 1)^2 (M + 2)}

(24) In the case of repeated sampling from an infinite population in which the probability of success is p, suppose that you sample until you achieve exactly r successes. Let X be the number of failures preceding the rth success. Then

    P[ X = x ] = \binom{x + r - 1}{x} p^r (1 - p)^x = \binom{x + r - 1}{r - 1} p^r (1 - p)^x

for x = 0, 1, 2, 3, ... This is described as the negative binomial random variable, and we can write X ~ NegBin(r, p). It is important to define things very carefully, as there are times in which one is keeping track of Y = the total number of trials; of course, Y = X + r.

Here E X = r \frac{1 - p}{p} and Var X = r \frac{1 - p}{p^2}.

This random variable reproduces in that if X₁ ~ NegBin(r₁, p) and X₂ ~ NegBin(r₂, p), and if X₁ and X₂ are independent, then X₁ + X₂ ~ NegBin(r₁ + r₂, p).

(25) If X ~ N(μ, σ²), then the random variable Y = e^X is called lognormal. (Somehow the name exponormal seems more appropriate.) Observe that Y > 0. The density of Y is

    f(y) = \frac{1}{y σ \sqrt{2π}} e^{-\frac{(\log y - μ)^2}{2σ^2}}

If X has units, say dollars, then Y has units of e^{dollars}. This can create confusion.

The CDF is easily derived:

    F(y) = P[ Y ≤ y ] = P[ e^X ≤ y ] = P[ X ≤ \log y ] = Φ\left( \frac{\log y - μ}{σ} \right)

where Φ is the cumulative distribution function of the standard normal.

The moments of Y are obtained from the moment generating function of X, which is M_X(t) = E e^{tX} = e^{μt + \frac{1}{2} t² σ²}. Then E Y^k = E (e^X)^k = E e^{kX} = M_X(k). In particular

    E Y  = M_X(1) = e^{μ + \frac{1}{2} σ²}

    E Y² = M_X(2) = e^{2μ + 2σ²}

    Var Y = e^{2μ + 2σ²} - \left( e^{μ + \frac{1}{2} σ²} \right)^2 = e^{2μ + 2σ²} - e^{2μ + σ²} = e^{2μ + σ²} \left( e^{σ²} - 1 \right)

If X has units of dollars, then E Y² and Var(Y) will have units of (e^{dollars})² = e^{2 dollars}. One can debate whether e^{2 dollars} and e^{dollars} are in fact the same kind of units.
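The lognormal moment formulas in point (25) can be confirmed by simulation. A sketch, assuming Python with numpy; the values of μ and σ are illustrative:

```python
import numpy as np

# Exponentiate normal draws and compare the sample mean and variance of
# Y = e^X with the closed-form moments from point (25).
rng = np.random.default_rng(seed=0)
mu, sigma, reps = 0.3, 0.5, 1_000_000

y = np.exp(rng.normal(mu, sigma, size=reps))
print(y.mean(), np.exp(mu + sigma**2 / 2))                       # E Y
print(y.var(), np.exp(2*mu + sigma**2) * (np.exp(sigma**2) - 1)) # Var Y
```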