SUMMARY OF DISTRIBUTION FACTS

This document contains a number of distributional facts about the common random
variables. These facts are somewhat difficult to organize, but the following index should
be helpful:
Distribution              See items
Beta                      14, 15, 18, 21
Binomial                  12, 13, 18
Cauchy                    16
Chi-squared               4, 5, 6, 7, 9, 19, 20
Exponential               2, 4, 9, 17
F                         20, 21
Gamma                     1, 2, 4, 5, 15, 17
Hypergeometric            22, 23
Lognormal                 25
Negative hypergeometric   23
Negative binomial         24
Noncentral t              19
Noncentral F              20
Noncentral chi-squared    8, 20
Normal                    7, 8, 11, 16, 19, 25
Poisson                   10, 13, 17
t                         19
Uniform                   3, 4, 18
This document is not intended to be comprehensive.
Revision date 2011JAN25

Page 1
gs2011

(1)  X ~ Gamma(r, λ) means that the density is

    f(x) = [ λ^r x^(r-1) e^(-λx) / Γ(r) ] I(x > 0)

E X = r/λ and Var X = r/λ².

In many cases X will have units, say seconds. Then the parameter λ will have units of 1/sec. The parameter r has no units; it's a "pure number." In the numerator of f(x), observe that λ^r x^(r-1) = λ (λx)^(r-1) will have the units of λ. The resulting units for f(x) will then be 1/sec. It is always the case that a random variable and its density have reciprocal units.

The parameters are sometimes described as shape and scale. The shape parameter is r and the scale parameter is 1/λ. In these terms, E X = Shape × Scale and Var X = Shape × Scale². See the comment below about replacing λ by 1/β.

The function Γ(r) is ∫_0^∞ u^(r-1) e^(-u) du, which is the gamma function.

The moment generating function is defined as E(e^(tX)). This can only make sense if t has the reciprocal units of X. Thus t has units 1/sec.

The moment generating function is

    M(t) = ( 1 - t/λ )^(-r),  restricted to t < λ.

Observe that t/λ has no units and also that M(t) has no units.

This leads to E X^k = Γ(k + r) / ( λ^k Γ(r) ).

The CDF cannot be written in simple closed form, but some help is available. Start from

    F(x) = ∫_0^x [ λ^r u^(r-1) e^(-λu) / Γ(r) ] du
         = [ substituting v = λu, dv = λ du ]
         = [ 1/Γ(r) ] ∫_0^(λx) v^(r-1) e^(-v) dv

The function IG_r(y) = ∫_0^y v^(r-1) e^(-v) dv is the incomplete gamma function. Thus F(x) = IG_r(λx) / Γ(r). There are competing notations, so be careful.

The cumulative distribution function F does not have units. This is the case for every random variable, as F is defined as a probability. Specifically F(x) = P[ X ≤ x ].

If X1 ~ Gamma(r1, λ) and X2 ~ Gamma(r2, λ), and if X1 and X2 are independent, then X1 + X2 ~ Gamma(r1 + r2, λ). This property generalizes to more than two X's.

If X ~ Gamma(r, λ) and if Y = cX (with c > 0), then Y ~ Gamma(r, λ/c).

The multiplier c can be thought of as a pure number, but it is often used to change units, as perhaps c = (1 minute)/(60 seconds).

In some notational schemes the parameter λ is replaced with 1/β. Observe that β has the same units as X. With this change, we have

    f(x) = [ x^(r-1) e^(-x/β) / ( Γ(r) β^r ) ] I(x > 0)

    E X = rβ  and  Var X = rβ²

    M(t) = ( 1 - tβ )^(-r),  restricted to t < 1/β

    E X^k = Γ(k + r) β^k / Γ(r)
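The mean and variance formulas above can be checked by simulation. This is a minimal sketch using only Python's standard library; note that random.gammavariate uses the shape/scale parameterization, so the scale argument is 1/λ. The helper name gamma_moments is illustrative, not from the source.

```python
import random

# Monte Carlo check of item (1): if X ~ Gamma(r, lam) with rate lam,
# then E X = r/lam and Var X = r/lam^2.
# random.gammavariate(shape, scale) uses scale = 1/lam.
def gamma_moments(r, lam, n=200_000, seed=1):
    rng = random.Random(seed)
    xs = [rng.gammavariate(r, 1.0 / lam) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var

mean, var = gamma_moments(r=3.0, lam=2.0)
print(mean)  # should be near r/lam = 1.5
print(var)   # should be near r/lam^2 = 0.75
```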

(2)  If X ~ Gamma(r = 1, λ), then its density is

    f(x) = λ e^(-λx) I(x > 0)

which is the exponential with mean 1/λ. E(X) = 1/λ and Var(X) = 1/λ².

The CDF is F(x) = 1 - e^(-λx), for x > 0.

In many cases X will have units, say seconds. Then the parameter λ will have units of 1/sec.

The moments and moment-generating function are obtained as special cases of (1), using r = 1.

If X1, X2, …, Xn are independent, each exponential with mean 1/λ, then X1 + X2 + … + Xn ~ Gamma(n, λ).

In some notational schemes the parameter λ is replaced with 1/β. With this change, we have

    f(x) = (1/β) e^(-x/β) I(x > 0)

    E X = β  and  Var X = β²

    M(t) = ( 1 - tβ )^(-1),  restricted to t < 1/β

    E X^k = k! β^k
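The closed-form CDF F(x) = 1 - e^(-λx) can be compared against an empirical CDF. A minimal stdlib sketch (random.expovariate takes the rate λ directly):

```python
import math
import random

# Check the exponential CDF from item (2): F(x) = 1 - exp(-lam*x).
# random.expovariate(lam) draws from the exponential with rate lam.
rng = random.Random(2)
lam, x, n = 0.5, 3.0, 200_000
draws = [rng.expovariate(lam) for _ in range(n)]
empirical = sum(d <= x for d in draws) / n
theoretical = 1.0 - math.exp(-lam * x)
print(empirical, theoretical)  # both near 0.777
```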


(3)  If the random variable U is uniform on the interval [a, b], then its density is

    f(u) = [ 1/(b - a) ] I(a ≤ u ≤ b)

The CDF is given by

    F(u) = 0                  if u < a
         = (u - a)/(b - a)    if a ≤ u ≤ b
         = 1                  if u > b

The expected value is E(U) = (a + b)/2 and the variance is Var(U) = (b - a)²/12.

One will occasionally see this defined over the open interval (a, b), replacing I(a ≤ u ≤ b) with I(a < u < b). The motive for doing this is almost certainly the creation of a mathematical counterexample.

In the most common application, a = 0 and b = 1, and we write U ~ unif(0, 1). In this case, the mean is 1/2 and the variance is 1/12.

If it happens that U has units, then a and b must be in the same units.

If the continuous random variable X has cumulative distribution function F, then the random variable U = F(X) has the distribution unif(0, 1). This forms the basis of the computer simulation method for generating X. Create the unif(0, 1) random variable U and then let X = F⁻¹(U). This is only helpful if F⁻¹ is easy to compute.

If a statistical hypothesis test of H0 versus H1 is based on a continuous test statistic, then the p-value is distributed as unif(0, 1) when H0 is true.
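The inverse-CDF simulation method described above can be sketched for the exponential case, where F⁻¹(u) = -ln(1 - u)/λ is easy to compute (stdlib only; the choice λ = 2 is arbitrary):

```python
import math
import random

# Inverse-CDF sampling (item 3): if U ~ unif(0,1), then X = F^{-1}(U)
# has CDF F.  For the exponential with rate lam,
# F(x) = 1 - exp(-lam*x), so F^{-1}(u) = -ln(1 - u)/lam.
rng = random.Random(3)
lam, n = 2.0, 200_000
xs = [-math.log(1.0 - rng.random()) / lam for _ in range(n)]
mean = sum(xs) / n
print(mean)  # should be near 1/lam = 0.5
```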


(4)  If U ~ unif(0, 1), then X = (-ln U)/λ ~ Gamma(1, λ). That is, (-ln U)/λ follows the exponential distribution.

If U ~ unif(0, 1), then X = 2(-ln U) ~ χ²_2, the chi-squared distribution with two degrees of freedom. In this spirit, if U1, U2, …, Uk are independent, each unif(0, 1), then

    2(-ln U1) + 2(-ln U2) + … + 2(-ln Uk) = 2[ -ln(U1 U2 … Uk) ] ~ χ²_2k
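The product-of-uniforms fact can be checked against the chi-squared moments E χ²_2k = 2k and Var χ²_2k = 4k from item (5). A minimal stdlib sketch with k = 3:

```python
import math
import random

# Item (4): 2*(-ln(U1*...*Uk)) ~ chi-squared with 2k degrees of freedom,
# so its mean should be 2k and its variance 4k.
rng = random.Random(4)
k, n = 3, 200_000
draws = []
for _ in range(n):
    prod = 1.0
    for _ in range(k):
        prod *= 1.0 - rng.random()   # uniform in (0, 1], avoids log(0)
    draws.append(-2.0 * math.log(prod))
mean = sum(draws) / n
var = sum((d - mean) ** 2 for d in draws) / (n - 1)
print(mean, var)  # near 2k = 6 and 4k = 12
```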
(5)  If X ~ Gamma(r = k/2, λ = 1/2), and if k is an integer, then X has the chi-squared distribution with k degrees of freedom. The density is

    f(x) = [ x^(k/2 - 1) e^(-x/2) / ( 2^(k/2) Γ(k/2) ) ] I(x > 0)

We write X ~ χ²_k. We have E χ²_k = k and Var χ²_k = 2k.

The distribution described as σ²χ²_k is Gamma(r = k/2, λ = 1/(2σ²)).
(6)
If W1, W2, …, Wk are independent chi-squared random variables with n1, n2, …, nk
degrees of freedom, then W1 + W2 + … + Wk is chi-squared with n1 + n2 + … + nk degrees
of freedom.
(7)  If Z ~ N(0, 1), then Z² ~ χ²_1.

If Z1, Z2, …, Zn are independent, each N(0, 1), then Z1² + Z2² + … + Zn² ~ χ²_n. If the individual distributions are N(0, σ²), then the distributional property is Z1² + Z2² + … + Zn² ~ σ²χ²_n.


(8)  If Z1, Z2, …, Zn are independent, with Zi ~ N(μi, σ²), then W = ( Z1² + Z2² + … + Zn² )/σ² has the distribution called noncentral chi-squared with n degrees of freedom and noncentrality parameter δ² = ( μ1² + μ2² + … + μn² )/σ². We would write W ~ χ²_n(δ²). The mean is E W = n + δ², and the variance is Var W = 2n + 4δ².
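The noncentral chi-squared mean and variance can be checked by simulation. A minimal stdlib sketch (the μi and σ values are arbitrary illustrations):

```python
import random

# Item (8): with Zi ~ N(mu_i, sigma^2) independent, W = sum(Zi^2)/sigma^2
# is noncentral chi-squared(n) with delta^2 = sum(mu_i^2)/sigma^2;
# E W = n + delta^2 and Var W = 2n + 4*delta^2.
rng = random.Random(8)
mus, sigma, n_rep = [1.0, 2.0, 0.5], 1.5, 200_000
n = len(mus)
delta2 = sum(m * m for m in mus) / sigma ** 2
ws = []
for _ in range(n_rep):
    ws.append(sum(rng.gauss(m, sigma) ** 2 for m in mus) / sigma ** 2)
mean = sum(ws) / n_rep
var = sum((w - mean) ** 2 for w in ws) / (n_rep - 1)
print(mean, n + delta2)         # sample mean vs. n + delta^2
print(var, 2 * n + 4 * delta2)  # sample variance vs. 2n + 4*delta^2
```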
(9)  The chi-squared distribution on two degrees of freedom has density

    f(x) = (1/2) e^(-x/2) I(x > 0)

which is exponential with mean 2.
(10)  The Poisson law is discrete, with support over the set of non-negative integers {0, 1, 2, ...}.

    f(x) = P[ X = x ] = e^(-λ) λ^x / x!   for x = 0, 1, 2, 3, …

We write X ~ Poisson(λ).

This is a discrete probability law, so that there are no units to X or to λ.

The MGF is M(t) = exp{ λ(e^t - 1) }. It happens that E X = λ and Var X = λ.

If X ~ Poisson(λ) and Y ~ Poisson(ν), and if X and Y are independent, then X + Y ~ Poisson(λ + ν). This property generalizes to more than two summands.


(11)  The normal random variable has density

    f(x) = [ 1/( σ √(2π) ) ] e^( -(1/2) ((x - μ)/σ)² )

The parameters μ and σ have the same units as the random variable X.

We write X ~ N(μ, σ²). If X ~ N(μ, σ²) and Y ~ N(ν, τ²), and if X and Y are independent, then X + Y ~ N(μ + ν, σ² + τ²). This generalizes to more than two summands.

If X ~ N(μ, σ²), then E(X) = μ and Var(X) = σ². Also,

    E[ (X - μ)^k ] = 0 for odd positive integer k
    E[ (X - μ)² ]  = σ²
    E[ (X - μ)⁴ ]  = 3σ⁴
    E[ (X - μ)⁶ ]  = 15σ⁶
    E[ (X - μ)⁸ ]  = 105σ⁸
    E[ (X - μ)¹⁰ ] = 945σ¹⁰

If X ~ N(μ, σ²), then its moment generating function is M(t) = exp{ tμ + (1/2)t²σ² }.

The case (μ = 0, σ = 1) is the standard normal. The density is usually written as φ (rather than f), and the random variable is usually named Z (rather than X). Thus

    φ(z) = [ 1/√(2π) ] e^(-z²/2)

The cumulative distribution function is usually written as Φ (rather than F), and

    Φ(z) = ∫_(-∞)^z [ 1/√(2π) ] e^(-u²/2) du

The integral cannot be evaluated in closed form. Values of Φ are given in tables of the normal distribution, but several different layouts are used for these tables.

Point (4) noted that if U ~ unif(0, 1), then X = 2(-ln U) ~ χ²_2. Point (7) gave the relationship between the normal and chi-squared distribution. Together these can be used as a clever basis for computer generation of independent standard normal random variables, in pairs. Specifically, let U and V be independent unif(0, 1) random variables. Then define

    X1 = √( 2(-ln U) ) cos(2πV)
    X2 = √( 2(-ln U) ) sin(2πV)

The random variables X1 and X2 will be independent standard normal. While computer routines easily generate uniform random numbers, there is still computational labor in calculating logarithms, sines, and cosines.
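The pair construction above (the Box-Muller transform) can be sketched directly with the standard library, checking that each coordinate has mean near 0 and variance near 1, and that the pair is uncorrelated:

```python
import math
import random

# Box-Muller (item 11): two independent unif(0,1) variables U, V give
# two independent standard normals X1, X2.
def box_muller(rng):
    u = 1.0 - rng.random()          # in (0, 1], avoids log(0)
    v = rng.random()
    r = math.sqrt(2.0 * -math.log(u))
    return r * math.cos(2.0 * math.pi * v), r * math.sin(2.0 * math.pi * v)

rng = random.Random(11)
pairs = [box_muller(rng) for _ in range(100_000)]
x1s = [p[0] for p in pairs]
m1 = sum(x1s) / len(x1s)
v1 = sum((x - m1) ** 2 for x in x1s) / (len(x1s) - 1)
cross = sum(a * b for a, b in pairs) / len(pairs)   # E[X1*X2], should be 0
print(m1, v1, cross)  # near 0, 1, 0
```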


(12)  The binomial random variable is discrete with support on the set {0, 1, 2, ..., n}.

    f(x) = P[ X = x ] = C(n, x) p^x (1 - p)^(n-x)   for x = 0, 1, 2, …, n

where C(n, x) is the binomial coefficient. Write X ~ bin(n, p).

E(X) = np and Var(X) = np(1 - p). The moment-generating function is

    M(t) = ( (1 - p) + p e^t )^n

If X1 ~ bin(n1, p) and X2 ~ bin(n2, p) and if X1 and X2 are independent, then X1 + X2 ~ bin(n1 + n2, p).
(13)  If Y ~ Poisson(λ) and if X | Y = y ~ bin(y, p), then X ~ Poisson(pλ).
(14)  The beta(a, b) density is

    f(x) = [ Γ(a + b) / ( Γ(a) Γ(b) ) ] x^(a-1) (1 - x)^(b-1) I(0 < x < 1)

This is sometimes defined in terms of the beta function. This function is

    B(a, b) = ∫_0^1 x^(a-1) (1 - x)^(b-1) dx

The random variable has no units, and the parameters a and b also have no units.

It can be shown that B(a, b) = Γ(a) Γ(b) / Γ(a + b). The density could then be written as

    f(x) = [ 1/B(a, b) ] x^(a-1) (1 - x)^(b-1) I(0 < x < 1)

For the case in which a and b are integers, this can be written as

    f(x) = [ (a + b - 1)! / ( (a - 1)! (b - 1)! ) ] x^(a-1) (1 - x)^(b-1) I(0 < x < 1)

This probability law has

    E X^k = Γ(a + b) Γ(a + k) / ( Γ(a) Γ(a + b + k) )

For k = 1, this leads to

    E X = Γ(a + b) Γ(a + 1) / ( Γ(a) Γ(a + b + 1) ) = a/(a + b)

For k = 2, this is

    E X² = Γ(a + b) Γ(a + 2) / ( Γ(a) Γ(a + b + 2) ) = a(a + 1) / ( (a + b)(a + b + 1) )

It follows that

    Var X = a(a + 1) / ( (a + b)(a + b + 1) ) - ( a/(a + b) )² = ab / ( (a + b)² (a + b + 1) )
(15) If X ~ Gamma(r, ), if Y ~ Gamma(s, ), and if X and Y are independent, then
X
~ beta(r, s). Moreover, this ratio is independent of X + Y.
X Y
The random variables X and Y must be in the same units, and  will then
X
be in the reciprocal units. There are no units for r, s, or
.
X Y
(16)  If X and Y are independent normal random variables, each N(0, 1), then Z = X/Y is distributed as Cauchy with density

    f(z) = (1/π) · 1/(1 + z²)

The random variables X and Y must be in the same units. The random variable Z will have no units.

The identical conclusion applies when the distributions are N(0, σ²), but not when they are N(μ, σ²) with μ ≠ 0.


(17)  If X ~ Gamma(k, λ), and if k is an integer, then

    P[ X > x ] = ∫_x^∞ [ λ^k t^(k-1) e^(-λt) / Γ(k) ] dt = Σ_(i=0)^(k-1) e^(-λx) (λx)^i / i!

This result links the CDF of the gamma distribution to the CDF of the Poisson distribution.

The right side is the probability that a Poisson random variable with mean λx takes a value less than k. This result links a number of facts about Poisson processes:

The inter-event times in a Poisson process with rate λ have an exponential distribution with mean 1/λ; this distribution is Gamma(1, λ).

The sum of k independent versions of Gamma(1, λ) is Gamma(k, λ), as noted in point (2).

If X above represents the total of k inter-event times, then the description [ X > x ] says that in time interval (0, x) there are fewer than k events.

In a Poisson process with rate λ, the number of events in the time interval (0, x) follows the Poisson distribution with mean λx.
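The gamma-Poisson identity above can be verified numerically with the standard library alone, integrating the gamma density by Simpson's rule (the truncation point x + 60/λ is an assumption that makes the remaining tail negligible):

```python
import math

# Item (17): for integer k, P[X > x] with X ~ Gamma(k, lam) equals the
# Poisson(lam*x) probability of a value less than k.  Left side via
# Simpson's rule on the gamma density over [x, x + 60/lam].
def gamma_tail(k, lam, x, steps=20_000):
    a, b = x, x + 60.0 / lam
    h = (b - a) / steps
    def f(t):
        return lam ** k * t ** (k - 1) * math.exp(-lam * t) / math.factorial(k - 1)
    total = f(a) + f(b)
    for i in range(1, steps):
        total += f(a + i * h) * (4 if i % 2 else 2)
    return total * h / 3.0

k, lam, x = 4, 0.8, 5.0
poisson_side = sum(math.exp(-lam * x) * (lam * x) ** i / math.factorial(i)
                   for i in range(k))
print(gamma_tail(k, lam, x), poisson_side)  # agree to many decimals
```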
(18)  If X ~ beta(k, n + 1 - k), and if k and n are integers, then

    P[ X > p ] = ∫_p^1 [ n! / ( (k - 1)! (n - k)! ) ] z^(k-1) (1 - z)^(n-k) dz = Σ_(j=0)^(k-1) C(n, j) p^j (1 - p)^(n-j)

The right side is the probability that a binomial(n, p) random variable takes a value less than k. This links the CDF of the beta distribution to the CDF of the binomial. This result is an interesting assembly of some other facts:

Consider a sample U1, U2, …, Un from the uniform (0, 1) distribution. If the n values are sorted as U(1) ≤ U(2) ≤ … ≤ U(n), this process forms the order statistics. The distribution of U(k) can be shown to be beta(k, n + 1 - k).

The density of the beta(k, n + 1 - k) probability law, using z as the carrier, is

    [ n! / ( (k - 1)! (n - k)! ) ] z^(k-1) (1 - z)^(n-k) I(0 ≤ z ≤ 1)

Consider a sample U1, U2, …, Un from the uniform (0, 1) distribution. For any value p in (0, 1), the probability that U1 is less than p is p. Thus, the total number of these U's which are less than p must follow the binomial(n, p) distribution; we can represent this as Y = Σ_(i=1)^n I(Ui < p) ~ binomial(n, p).

Consider a sample U1, U2, …, Un from the uniform (0, 1) distribution. It happens that U(k) > p if and only if the number of U's which is less than p is 0, 1, 2, …, k - 1.
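The order-statistic argument above can be checked by simulation: estimate P[ U(k) > p ] directly from sorted uniform samples and compare with the binomial sum. A minimal stdlib sketch with n = 10, k = 3, p = 0.25:

```python
import math
import random

# Item (18): P[U_(k) > p] for the k-th order statistic of n uniforms
# should equal the binomial(n, p) probability of a value below k.
rng = random.Random(18)
n, k, p, reps = 10, 3, 0.25, 200_000
hits = 0
for _ in range(reps):
    u = sorted(rng.random() for _ in range(n))
    if u[k - 1] > p:
        hits += 1
binom_sum = sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
                for j in range(k))
print(hits / reps, binom_sum)  # both near 0.526
```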
(19)  If Z ~ N(0, 1) and U ~ χ²_k, and if Z and U are independent, then the distribution of

    Z / √(U/k)

is t with k degrees of freedom. The same distribution results if Z ~ N(0, σ²) and U ~ σ²χ²_k.

If Z ~ N(μ, σ²) and U ~ σ²χ²_k, and if Z and U are independent, then the distribution of

    Z / √(U/k)

is called the noncentral t with k degrees of freedom and with noncentrality parameter μ/σ.

In this construction, the random variable Z and the parameters μ and σ will have the same units. The random variable U will be in the squared units.
(20)  If U ~ χ²_m and V ~ χ²_n, and if U and V are independent, then the distribution of

    (U/m) / (V/n)

is F_m,n , the F distribution with (m, n) degrees of freedom. The same distribution results if U ~ σ²χ²_m and V ~ σ²χ²_n. It should be noted that the reciprocal

    (V/n) / (U/m)

has the F distribution with (n, m) degrees of freedom. The degrees of freedom numbers are reversed.

One can also have a noncentral chi-squared in the numerator. If the assumptions are changed to U ~ σ²χ²_m(δ²) and V ~ σ²χ²_n with U and V independent, then the distribution of

    (U/m) / (V/n)

is called the noncentral F with (m, n) degrees of freedom and noncentrality parameter δ².
(21)  If F ~ F_m,n , then the random variable

    1 / ( 1 + (m/n) F )

has the beta(n/2, m/2) distribution.

(22)  Consider a population consisting of N objects of which M are special in some sense and N - M are non-special. If a sample of size n is taken without replacement, and if X denotes the number that are special in the sample, then

    P[ X = x ] = C(M, x) C(N - M, n - x) / C(N, n) = C(n, x) C(N - n, M - x) / C(N, M)

The possible values of X are the integers from max(0, n - (N - M)) to min(n, M). The mean is

    E X = n M/N

and the variance is

    Var X = n (M/N) (1 - M/N) (N - n)/(N - 1)

The random variable X is hypergeometric, and we write X ~ HG(N; n, M). This notation reflects the fact that n and M are exchangeable in P[ X = x ].
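The exchangeability of n and M, and the mean formula, can be verified exactly with rational arithmetic from the standard library (the values N = 20, M = 7, n = 5 are arbitrary):

```python
from fractions import Fraction
from math import comb

# Item (22): exact check that the two forms of the hypergeometric pmf
# agree, that the pmf sums to 1, and that E X = n*M/N, for N=20, M=7, n=5.
N, M, n = 20, 7, 5
total = Fraction(0)
mean = Fraction(0)
for x in range(max(0, n - (N - M)), min(n, M) + 1):
    form1 = Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))
    form2 = Fraction(comb(n, x) * comb(N - n, M - x), comb(N, M))
    assert form1 == form2          # n and M are exchangeable
    total += form1
    mean += form1 * x
print(total)                        # exactly 1
print(mean, Fraction(n * M, N))     # both 7/4
```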
(23)  Consider a population consisting of N objects of which M are special in some sense and N - M are non-special. Consider sampling without replacement until exactly r special objects are obtained. Let X denote the number of non-special objects which precede the r-th special object. Then

    P[ X = x ] = [ C(M, r - 1) C(N - M, x) / C(N, x + r - 1) ] × ( M - (r - 1) ) / ( N - (x + r - 1) )

The event [ X = x ] is the situation in which x + r selections are required, and in which the final selection is special. Thus, the first x + r - 1 selections yield r - 1 which are special, and this reflects the hypergeometric factor in P[ X = x ]. The possible values of X are the integers from 0 to N - M. The probability law for X is described as negative hypergeometric. We have

    E X = r ( (N + 1)/(M + 1) - 1 )

and

    Var X = r (N + 1)(N - M)(M + 1 - r) / ( (M + 1)² (M + 2) )


(24)  In the case of repeated sampling from an infinite population in which the probability of success is p, suppose that you sample until you achieve exactly r successes. Let X be the number of failures preceding the r-th success. Then

    P[ X = x ] = C(x + r - 1, x) p^r (1 - p)^x = C(x + r - 1, r - 1) p^r (1 - p)^x   for x = 0, 1, 2, 3, ...

This is described as the negative binomial random variable, and we can write X ~ NegBin(r, p). It is important to define things very carefully, as there are times in which one is keeping track of Y = the total number of trials; of course, Y = X + r.

Here

    E X = r (1 - p)/p   and   Var X = r (1 - p)/p²

This random variable reproduces in that if X1 ~ NegBin(r1, p) and X2 ~ NegBin(r2, p), and if X1 and X2 are independent, then X1 + X2 ~ NegBin(r1 + r2, p).


(25) If X ~ N(, 2), then the random variable Y = eX is called lognormal. (Somehow
the name exponormal seems more appropriate.) Observe that Y > 0.
The density of Y is
f (y) =
alog y   f
2
1
y 2 
e

2
2
If X has units, say dollars, then Y has units of edollars . This can create
confusion.
The CDF is easily derived:
F(y) = P[ Y  y ] = P[ eX  y ] = P[ X  log y ] = 
Flog y   I
H K
where  is the cumulative distribution function of the standard normal.
The moments of Y are obtained from the moment generating function of X, which is
k
t  1 t 2  2
MX(t) = E etX = e 2 . Then E Yk = E e X = E ekX = MX (k). In particular
ch
  12  2
E Y = MX (1) = e
2
E Y2 = MX (2) = e2  2 
2
2
e j
Var Y = e2 2   e
  12 2
2
2
2
e j
2
= e 2   2   e2    = e 2    e   1
If X has units of dollars, then E Y2 and Var(Y) will have units of
 edollars 2 = e2 dollars . One can debate whether e2 dollars and e dollars are in
fact the same kind of units.
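The lognormal moment formulas can be checked by simulation. A minimal stdlib sketch (random.lognormvariate takes μ and σ of the underlying normal; μ = 0, σ = 0.5 are arbitrary):

```python
import math
import random

# Item (25): if X ~ N(mu, sigma^2) and Y = e^X, then
# E Y = exp(mu + sigma^2/2) and Var Y = exp(2mu + sigma^2)(exp(sigma^2) - 1).
rng = random.Random(25)
mu, sigma, n = 0.0, 0.5, 200_000
ys = [rng.lognormvariate(mu, sigma) for _ in range(n)]
mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / (n - 1)
print(mean, math.exp(mu + sigma ** 2 / 2))
print(var, math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1))
```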
