
ST2131 Cheatsheet

ST2131 Probability
Tay Yong Qiang
April 8, 2017
Properties of Probability Mass Function
1. p_X(x_i) ≥ 0, for i = 1, 2, · · · ;
2. p_X(x) = 0, for other values of x;
3. Σ_{i=1}^∞ p_X(x_i) = 1.
Combinatorial Analysis
Binomial Theorem
(x + y)^n = Σ_{k=0}^n C(n, k) x^k y^(n−k), where C(n, k) = n!/(k!(n − k)!)
No. of Integer Solutions satisfying x_1 + x_2 + · · · + x_r = n
Positive integers: C(n − 1, r − 1) positive integer-valued vectors.
Nonnegative integers: C(n + r − 1, r − 1) nonnegative integer-valued vectors.
Cumulative Distribution Function
F_X(x) = P(X ≤ x), for x ∈ R
Remark:
If X has a probability mass function given by
p(1) = 1/4, p(2) = 1/2, p(3) = 1/8, p(4) = 1/8
then its c.d.f. is
F(a) = 0 for a < 1; 1/4 for 1 ≤ a < 2; 3/4 for 2 ≤ a < 3; 7/8 for 3 ≤ a < 4; 1 for 4 ≤ a.
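A minimal Python sketch (not part of the original notes) that builds this c.d.f. by summing the pmf over all support points ≤ a:

```python
# Build F(a) = P(X <= a) from the pmf in the Remark above.
pmf = {1: 1/4, 2: 1/2, 3: 1/8, 4: 1/8}   # p(1), p(2), p(3), p(4)

def cdf(a, pmf=pmf):
    """Return F(a) for a discrete pmf given as a dict {value: probability}."""
    return sum(p for x, p in pmf.items() if x <= a)

# F jumps at each support point and is flat in between:
for a in [0.5, 1, 1.5, 2, 2.9, 3, 3.5, 4, 10]:
    print(a, cdf(a))   # e.g. cdf(2.9) == 0.75, cdf(10) == 1.0
```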
Expectation
E(X) = Σ_x x p_X(x)
Multinomial Expansion
(x_1 + · · · + x_r)^n = Σ_{n_1+···+n_r=n} C(n; n_1, · · · , n_r) x_1^(n_1) x_2^(n_2) · · · x_r^(n_r),
where C(n; n_1, · · · , n_r) = n!/(n_1! n_2! · · · n_r!) is the multinomial coefficient.
Axioms of Probability
Inclusion-Exclusion Principle
P(A_1 ∪ · · · ∪ A_n) = Σ_{i=1}^n P(A_i) − Σ_{1≤i_1<i_2≤n} P(A_{i_1} A_{i_2}) + · · ·
+ (−1)^(r+1) Σ_{1≤i_1<···<i_r≤n} P(A_{i_1} · · · A_{i_r}) + · · · + (−1)^(n+1) P(A_1 · · · A_n)
E[g(X)]
E[g(X)] = Σ_i g(x_i) p_X(x_i)
Property of Expected Values
E[aX + b] = aE(X) + b
Tail Sum Formula for Expectation
For a nonnegative integer-valued random variable X,
E(X) = Σ_{k=1}^∞ P(X ≥ k) = Σ_{k=0}^∞ P(X > k)
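A quick numerical check of the tail-sum formula (illustrative Python, using X ∼ Geom(0.3) and truncating the infinite sums):

```python
# Check E(X) = sum_{k>=1} P(X >= k) for X ~ Geom(p), truncated at a large cutoff.
p = 0.3
cutoff = 200

def pmf(k):                      # P(X = k) = p * q^(k-1), k >= 1
    return p * (1 - p) ** (k - 1)

def tail(k):                     # P(X >= k) = q^(k-1) for a geometric variable
    return (1 - p) ** (k - 1)

mean_direct = sum(k * pmf(k) for k in range(1, cutoff))
mean_tail   = sum(tail(k) for k in range(1, cutoff))
print(mean_direct, mean_tail, 1 / p)   # all three agree (≈ 3.333)
```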
Variance
If X is a random variable with mean µ, then
var(X) = E[(X − µ)^2] or var(X) = E(X^2) − [E(X)]^2
Note: E(X^2) ≥ [E(X)]^2 ≥ 0
Useful Formulae
Σ_{k=0}^∞ λ^k/k! = e^λ
Σ_{i=1}^∞ i r^(i−1) = 1/(1 − r)^2, |r| < 1
(1 − λ/n)^n ≈ e^(−λ) for large n
General Multiplication Rule
P(A_1 A_2 · · · A_n) = P(A_1)P(A_2|A_1)P(A_3|A_1 A_2) · · · P(A_n|A_1 · · · A_(n−1))
Conditional Probability
P(B|A) = P(AB)/P(A), P(A) > 0
Bayes' Formulae
Let A_1, · · · , A_n be a partition of the sample space. Then
1. P(B) = P(B|A_1)P(A_1) + · · · + P(B|A_n)P(A_n)
2. P(A_i|B) = P(B|A_i)P(A_i) / [P(B|A_1)P(A_1) + · · · + P(B|A_n)P(A_n)]
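A hedged Python sketch of Bayes' formulae; the priors and likelihoods below are made-up illustrative numbers:

```python
# Priors P(A_i) over a partition and likelihoods P(B|A_i) (illustrative values).
priors      = [0.5, 0.3, 0.2]        # P(A_1), P(A_2), P(A_3)
likelihoods = [0.9, 0.5, 0.1]        # P(B|A_1), P(B|A_2), P(B|A_3)

# Total probability: P(B) = sum_i P(B|A_i) P(A_i)
p_b = sum(l * p for l, p in zip(likelihoods, priors))

# Posteriors: P(A_i|B) = P(B|A_i) P(A_i) / P(B)
posteriors = [l * p / p_b for l, p in zip(likelihoods, priors)]
print(p_b, posteriors, sum(posteriors))   # posteriors sum to 1
```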
P (·|A) is a Probability
Let A be an event with P (A) > 0, then
1. 0 ≤ P (B|A) ≤ 1
2. P (S|A) = 1
3. Let B_1, B_2, · · · be mutually exclusive events, then
P(∪_{k=1}^∞ B_k | A) = Σ_{k=1}^∞ P(B_k|A)
Independent Events
Two events A and B are said to be independent if
P(AB) = P(A)P(B), equivalently P(A|B) = P(A).
Events A_1, A_2, · · · , A_n are said to be independent if for every subcollection of events
A_{i_1}, A_{i_2}, · · · , A_{i_r}, we have
P(A_{i_1} A_{i_2} · · · A_{i_r}) = P(A_{i_1})P(A_{i_2}) · · · P(A_{i_r})
Properties of Independent Events
If A and B are independent, then so are
A and B^c, A^c and B, A^c and B^c.
If A, B and C are independent, then A is independent of any event formed from B and C,
e.g. A is independent of B ∪ C and of B ∩ C.
Discrete Random Variables
p_X is defined as
p_X(x) = P(X = x) if x = x_1, x_2, · · · , and p_X(x) = 0 otherwise.
Standard Deviation
σ_X = √(var(X))
Properties of Standard Deviation and Variance
var(aX + b) = a^2 var(X)
SD(aX + b) = |a| SD(X)
Discrete Random Variables
Bernoulli Random Variable, Be(p)
The experiment is only performed once; define
X = 1 if it is a success, X = 0 if it is a failure.
P (X = 1) = p, P (X = 0) = 1 − p
E(X) = p, var(X) = p(1 − p)
Binomial Random Variable, Bin(n, p)
Let X = number of successes in n Bernoulli(p)
trials.
For 0 ≤ k ≤ n,
P(X = k) = C(n, k) p^k q^(n−k)
P(X ≤ i) = Σ_{k=0}^i C(n, k) p^k (1 − p)^(n−k), i = 0, 1, · · · , n
E(X) = np, var(X) = np(1 − p)
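A small Python sketch of the Bin(n, p) pmf and cdf using the standard library (math.comb needs Python 3.8+); n = 10, p = 0.4 are illustrative values:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(i, n, p):
    """P(X <= i) = sum_{k=0}^{i} P(X = k)."""
    return sum(binom_pmf(k, n, p) for k in range(i + 1))

n, p = 10, 0.4
print(sum(binom_pmf(k, n, p) for k in range(n + 1)))             # 1.0 (pmf sums to 1)
print(n * p, sum(k * binom_pmf(k, n, p) for k in range(n + 1)))  # E(X) = np
```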
Geometric Random Variable, Geom(p)
Let X = no. of trials required to obtain the first
success. Note: X takes values 1, 2, · · ·
P(X = k) = p q^(k−1), k ≥ 1
E(X) = 1/p, var(X) = (1 − p)/p^2
Negative Binomial Random Variable, NB(r, p)
Let X = no. of trials required to obtain r successes.
Note: X takes values r, r + 1, · · ·
P(X = k) = C(k − 1, r − 1) p^r q^(k−r), k ≥ r
E(X) = r/p, var(X) = r(1 − p)/p^2
Note: Geom(p) = NB(1, p)
Normal Distribution
X ∼ N(µ, σ^2)
f_X(x) = (1/(σ√(2π))) e^(−(x−µ)^2/(2σ^2)), −∞ < x < ∞
F_X(x) = ∫_{−∞}^x (1/(σ√(2π))) e^(−(t−µ)^2/(2σ^2)) dt, −∞ < x < ∞
E(X) = µ, Var(X) = σ^2
Properties of Distribution Function
1. F_X is a nondecreasing function, i.e., if a < b, then F_X(a) ≤ F_X(b).
2. lim_{b→∞} F_X(b) = 1, lim_{b→−∞} F_X(b) = 0.
3. F_X is right continuous. That is, for any b ∈ R, lim_{x→b+} F_X(x) = F_X(b).
Computing Poisson Distribution Function
P(X = i + 1) = (λ/(i + 1)) P(X = i)
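A Python sketch that uses this recursion, starting from P(X = 0) = e^(−λ), to evaluate the Poisson cdf; λ = 2 is an illustrative choice:

```python
from math import exp

def poisson_cdf(i, lam):
    """P(X <= i) for X ~ Poisson(lam), via P(X = k+1) = lam/(k+1) * P(X = k)."""
    term = exp(-lam)          # P(X = 0)
    total = term
    for k in range(i):        # build P(X = 1), ..., P(X = i) iteratively
        term *= lam / (k + 1)
        total += term
    return total

print(poisson_cdf(3, 2.0))    # P(X <= 3) for λ = 2, ≈ 0.8571
```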
Hypergeometric Random Variable
A set of N balls, m are red and N − m are blue.
We choose n of these balls, without replacement, and define X to be the no. of red balls in
our sample.
P(X = x) = C(m, x) C(N − m, n − x) / C(N, n), x = 0, 1, · · · , n
A Hypergeometric Random Variable for some values n, N, m is denoted by H(n, N, m). Also,
E(X) = nm/N
var(X) = (nm/N) [ (n − 1)(m − 1)/(N − 1) + 1 − nm/N ]
Continuous Random Variable
Exponential Random Variable
X ∼ Exp(λ), λ > 0
f_X(x) = λe^(−λx) if x ≥ 0; 0 if x < 0
F_X(x) = 1 − e^(−λx) if x ≥ 0; 0 if x < 0
E(X) = 1/λ, Var(X) = 1/λ^2
Note: The exponential variable is memoryless: P(X > s + t | X > s) = P(X > t) for s, t ≥ 0.
Uniform Distribution
If X is uniformly distributed over (a, b) then
X ∼ U (a, b)
 1

, a<x<b
fX (x) =
b−a

0,
otherwise


0,
x<a


x − a
, a≤x<b
FX (x) =

b−a


1,
b≤x
E(X) =
a+b
2
, var(X) =
(b − a)2
12
Marginal Distribution function of X
F_X(x) = lim_{y→∞} F_{X,Y}(x, y)
Useful Formulae
P(X > a, Y > b) = 1 − F_X(a) − F_Y(b) + F_{X,Y}(a, b)
P(a_1 < X ≤ a_2, b_1 < Y ≤ b_2) = F_{X,Y}(a_2, b_2) − F_{X,Y}(a_1, b_2) − F_{X,Y}(a_2, b_1) + F_{X,Y}(a_1, b_1)
Standard Normal Distribution
Z ∼ N (0, 1)
f_Z(x) = φ(x) = (1/√(2π)) e^(−x^2/2)
F_Z(x) = Φ(x) = (1/√(2π)) ∫_{−∞}^x e^(−t^2/2) dt
E(Z) = 0, Var(Z) = 1
Joint Discrete Random Variable
Marginal pmf of X
p_X(x) = P(X = x) = Σ_y p_{X,Y}(x, y)
Gamma Distribution
X ∼ Γ(α, λ), λ, α > 0
f_X(x) = λe^(−λx) (λx)^(α−1)/Γ(α) for x ≥ 0; 0 for x < 0, where
Γ(α) = ∫_0^∞ e^(−y) y^(α−1) dy
E(X) = α/λ, var(X) = α/λ^2
* Γ(1, λ) = Exp(λ), Γ(n) = (n − 1)!, Γ(1/2) = √π
Cauchy Distribution
X has a Cauchy distribution with parameter θ if X ∼ Cauchy(θ), −∞ < θ < ∞
f_X(x) = 1/(π[1 + (x − θ)^2]), −∞ < x < ∞
Approximation
Let X ∼ Bin(n, p), where n is large (say n ≥ 30).
Binomial to Poisson
Suitable if n is large and p is small (p < 0.1) such that λ = np is moderate.
X ≈ Poisson(np)
Binomial to Normal
Suitable if np(1 − p) ≥ 10.
Bin(n, p) ≈ N(np, npq), i.e. (X − np)/√(npq) ≈ Z.
Remember to apply continuity correction (cc).
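An illustrative Python sketch of the normal approximation with continuity correction, computing Φ via math.erf and comparing against the exact binomial sum (n = 100, p = 0.5, k = 55 are made-up values):

```python
from math import comb, erf, sqrt

def phi(z):                       # standard normal cdf Φ(z)
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p, k = 100, 0.5, 55
mu, sd = n * p, sqrt(n * p * (1 - p))

approx = phi((k + 0.5 - mu) / sd)                        # cc: use k + 0.5
exact  = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))
print(approx, exact)              # ≈ 0.864 for both
```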
Expected Value of Sums of Random Variables
E[X] = Σ_{s∈S} X(s) p(s)
For random variables X_1, X_2, · · · , X_n,
E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i]
Beta Distribution
X ∼ Beta(a, b)
f_X(x) = x^(a−1) (1 − x)^(b−1)/B(a, b) for 0 < x < 1; 0 otherwise
B(a, b) = ∫_0^1 x^(a−1) (1 − x)^(b−1) dx = Γ(a)Γ(b)/Γ(a + b)
E(X) = a/(a + b), Var(X) = ab/[(a + b)^2 (a + b + 1)]
Jointly Continuous Random Variable
Joint pdf of X and Y
P((X, Y) ∈ C) = ∫∫_{(x,y)∈C} f_{X,Y}(x, y) dxdy
Marginal pdf of X
f_X(x) = ∫_{−∞}^∞ f_{X,Y}(x, y) dy
Joint Distribution function of X and Y
F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y), ∀x, y ∈ R
Poisson Random Variable
If X ∼ Poisson(λ), X = 0, 1, 2, · · · , then
P(X = k) = e^(−λ) λ^k/k!, k ≥ 0
E(X) = λ, var(X) = λ
Note: Σ_{k=0}^∞ P(X = k) = e^(−λ) Σ_{k=0}^∞ λ^k/k! = e^(−λ) e^λ = 1
Distribution of g(X)
Let X be a random variable with p.d.f. f_X and let g be a strictly monotonic function. Then Y = g(X) has p.d.f.
f_Y(y) = f_X(g^(−1)(y)) |d/dy g^(−1)(y)| if y = g(x) for some x; 0 otherwise.
Jointly Distributed Random Variable
Useful Formulae
P(a_1 < X ≤ a_2, b_1 < Y ≤ b_2) = ∫_{a_1}^{a_2} ∫_{b_1}^{b_2} f_{X,Y}(x, y) dydx
F_{X,Y}(a, b) = P(X ≤ a, Y ≤ b) = ∫_{−∞}^a ∫_{−∞}^b f_{X,Y}(x, y) dydx
f_{X,Y}(x, y) = ∂^2/∂x∂y F_{X,Y}(x, y)
Independent Random Variables
Jointly Continuous Random Variable
The following are equivalent:
1. X and Y are independent
2. fX,Y (x, y) = fX (x)fY (y), ∀x, y ∈ R
3. FX,Y (x, y) = FX (x)FY (y), ∀x, y ∈ R
4. ∃g, h : R → R, such that for all x, y ∈ R, we
have
fX,Y (x, y) = g(x)h(y)
Note: Neither g(x) nor h(y) has to be a pdf.
Sum of Independent Random Variables
If X and Y are continuous and independent, then
F_{X+Y}(x) = ∫_{−∞}^∞ F_X(x − t) f_Y(t) dt
f_{X+Y}(x) = ∫_{−∞}^∞ f_X(x − t) f_Y(t) dt
Sum of 2 Uniform Random Variables
If X, Y ∼ U(0, 1) are independent, then
f_{X+Y}(x) = x for 0 < x ≤ 1; 2 − x for 1 < x < 2; 0 otherwise.
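A rough numerical check (illustrative Python, crude Riemann sum) of the convolution formula for two independent U(0, 1) variables, which should reproduce the triangular density above:

```python
def f_unif(t):                     # U(0,1) density
    return 1.0 if 0 < t < 1 else 0.0

def f_sum(x, steps=10000):         # approximate ∫ f_X(x - t) f_Y(t) dt over t in (0, 1)
    dt = 1.0 / steps
    return sum(f_unif(x - (i + 0.5) * dt) * f_unif((i + 0.5) * dt)
               for i in range(steps)) * dt

for x in [0.25, 0.5, 1.0, 1.5, 1.75]:
    print(x, f_sum(x))             # ≈ x for x ≤ 1, ≈ 2 - x for 1 < x < 2
```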
Sum of 2 Gamma Random Variables
If X ∼ Γ(α, λ) and Y ∼ Γ(β, λ) are independent,
then
X + Y ∼ Γ(α + β, λ)
Sum of n Exponential Random Variables
If X_1, X_2, · · · , X_n ∼ Exp(λ) are independent, then since Γ(1, λ) = Exp(λ), we have
X_1 + X_2 + · · · + X_n ∼ Γ(n, λ).
Sum of 2 Normal Random Variables
If X ∼ N(µ_1, σ_1^2) and Y ∼ N(µ_2, σ_2^2) are independent,
X + Y ∼ N(µ_1 + µ_2, σ_1^2 + σ_2^2)
Sum of Independent Discrete Random Variables
Sum of 2 Poisson Random Variables
If X ∼ Poisson(λ) and Y ∼ Poisson(µ) are independent, then
X + Y ∼ Poisson(λ + µ).
Sum of 2 Binomial Random Variables
If X ∼ Bin(n, p) and Y ∼ Bin(m, p) are independent,
X + Y ∼ Bin(n + m, p)
Sum of 2 Geometric Random Variables
If X ∼ Geom(p) and Y ∼ Geom(p) are independent,
X + Y ∼ NB(2, p)
Note: Geom(p) = NB(1, p)
Conditional Distribution: Continuous Case
Conditional pdf of X given Y = y
f_{X|Y}(x|y) = f_{X,Y}(x, y)/f_Y(y), f_Y(y) > 0
Conditional cdf of X given Y = y
F_{X|Y}(x|y) = P(X ≤ x|Y = y) = ∫_{−∞}^x f_{X|Y}(t|y) dt
Variance of a Sum
var(Σ_{k=1}^n X_k) = Σ_{k=1}^n var(X_k) + 2 Σ_{1≤i<j≤n} cov(X_i, X_j)
Variance of a Sum under Independence
Let X1 , X2 , · · · , Xn be independent random variables, then
var(Σ_{k=1}^n X_k) = Σ_{k=1}^n var(X_k)
Sample Variance
Let X_1, · · · , X_n be independent and identically distributed, E(X_i) = µ, var(X_i) = σ^2. Let X̄ = Σ_{i=1}^n X_i/n be the sample mean. Then
var(X̄) = σ^2/n
S^2 = Σ_{i=1}^n (X_i − X̄)^2/(n − 1) is called the sample variance, and E[S^2] = σ^2.
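An illustrative Python simulation (U(0, 1) samples, so σ^2 = 1/12) checking var(X̄) ≈ σ^2/n and E[S^2] ≈ σ^2; n = 20 and 20000 repetitions are arbitrary choices:

```python
import random

n, reps = 20, 20000
sigma2 = 1 / 12                        # variance of U(0, 1)

xbars, s2s = [], []
for _ in range(reps):
    xs = [random.random() for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # sample variance
    xbars.append(xbar)
    s2s.append(s2)

mean_xbar = sum(xbars) / reps
var_xbar = sum((m - mean_xbar) ** 2 for m in xbars) / reps
print(var_xbar, sigma2 / n)            # both ≈ 0.00417
print(sum(s2s) / reps, sigma2)         # both ≈ 0.0833
```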
Properties of Expectation
If a ≤ X ≤ b, then a ≤ E(X) ≤ b.
Proposition 7.1
If X and Y are jointly discrete with joint pmf pX,Y , then
E[g(X, Y)] = Σ_x Σ_y g(x, y) p_{X,Y}(x, y)
If X and Y are jointly continuous with joint pdf f_{X,Y}, then
E[g(X, Y)] = ∫_{−∞}^∞ ∫_{−∞}^∞ g(x, y) f_{X,Y}(x, y) dxdy
Corollaries of Proposition 7.1
1. If g(x, y) ≥ 0 then E[g(X, Y )] ≥ 0.
2. E[g(X, Y ) + h(X, Y )] = E[g(X, Y )] + E[h(X, Y )].
3. E[g(X) + h(Y )] = E[g(X)] + E[h(Y )].
4. Monotone Property:
If X ≤ Y , then E(X) ≤ E(Y ).
5. E(X + Y ) = E(X) + E(Y )
6. E(a_1 X_1 + · · · + a_n X_n) = a_1 E(X_1) + · · · + a_n E(X_n)
Uniqueness Property
If X and Y have m.g.f. MX and MY and
MX (t) = MY (t), ∀t ∈ (−h, h), then X and Y have the same
distribution.
MGF of Common Random Variables
Binomial Random Variable
X ∼ Bin(n, p), M_X(t) = (1 − p + pe^t)^n
Geometric Random Variable
X ∼ Geom(p), M_X(t) = pe^t/(1 − (1 − p)e^t)
Correlation Coefficient
The correlation coefficient of random variables X and Y is denoted by ρ(X, Y), where
ρ(X, Y) = cov(X, Y)/√(var(X)var(Y)) and −1 ≤ ρ(X, Y) ≤ 1
Conditional Expectation
E[X|Y = y] = Σ_x x p_{X|Y}(x|y) = h(y) (discrete case)
E[X|Y = y] = ∫_{−∞}^∞ x f_{X|Y}(x|y) dx = h(y) (continuous case)
Expectation by Conditioning
E[X] = E[E(X|Y)] = Σ_y E(X|Y = y)P(Y = y) if Y is discrete
E[X] = E[E(X|Y)] = ∫_{−∞}^∞ E(X|Y = y)f_Y(y) dy if Y is continuous
Probabilities by Conditioning
P(A) = E(I_A) = E[E(I_A|Y)]
= Σ_y P(A|Y = y)P(Y = y) if Y is discrete
= ∫_{−∞}^∞ P(A|Y = y)f_Y(y) dy if Y is continuous
Conditional Variance
var(X|Y ) = E[(X − E[X|Y ])2 |Y ]
var(X) = E[var(X|Y )] + var(E[X|Y ])
Moment Generating Functions
M_X(t) = E[e^(tX)] = Σ_x e^(tx) p_X(x) (discrete case)
M_X(t) = E[e^(tX)] = ∫_{−∞}^∞ e^(tx) f_X(x) dx (continuous case)
Covariance
cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
If cov(X, Y ) = 0 then X and Y are uncorrelated.
Alternative Formulae
cov(X, Y) = E(XY) − E(X)E(Y) = E[X(Y − µ_Y)] = E[Y(X − µ_X)]
Multiplicative Property
If X and Y are independent, then
M_{X+Y}(t) = M_X(t)M_Y(t)
Using MGF to find moments
M_X^(n)(0) = E(X^n), i.e. the n-th derivative of M_X at t = 0 gives the n-th moment.
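A sketch of using the MGF to find moments, assuming sympy is available; it differentiates the Exp(λ) MGF at t = 0 to recover E(X) = 1/λ and E(X^2) = 2/λ^2:

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = lam / (lam - t)                   # MGF of Exp(λ), valid for t < λ

EX  = sp.diff(M, t, 1).subs(t, 0)     # first derivative at 0
EX2 = sp.diff(M, t, 2).subs(t, 0)     # second derivative at 0
print(sp.simplify(EX))                # 1/lam
print(sp.simplify(EX2))               # 2/lam**2
print(sp.simplify(EX2 - EX**2))       # variance: 1/lam**2
```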
Things to Note!
1. Remember to apply continuity correction whenever we approximate a discrete distribution by a continuous one; this applies even when using the Central Limit Theorem!
2. When dealing with ∫∫_{(x,y)∈A} f_{X,Y}(x, y) dA, you can try to think of it as the volume bounded above by f_{X,Y} and below by the region A.
Poisson Random Variable
X ∼ Poisson(λ), M_X(t) = exp(λ(e^t − 1))
Uniform Random Variable
X ∼ U(α, β), M_X(t) = (e^(βt) − e^(αt))/((β − α)t)
Exponential Random Variable
X ∼ Exp(λ), M_X(t) = λ/(λ − t), t < λ
Normal Random Variable
X ∼ N(µ, σ^2), M_X(t) = exp(µt + σ^2 t^2/2)
Properties of Standard Normal
P (Z ≥ 0) = P (Z ≤ 0) = 0.5
−Z ∼ N (0, 1)
P (Z ≤ x) = 1 − P (Z > x)
P (Z ≤ −x) = P (Z ≥ x)
If Y ∼ N(µ, σ^2), then X = (Y − µ)/σ ∼ N(0, 1)
If X ∼ N(0, 1), then Y = aX + b ∼ N(b, a^2), a, b ∈ R
Bernoulli Random Variable
X ∼ Be(p), M_X(t) = 1 − p + pe^t
Properties of Covariance
1. var(X) = cov(X, X)
2. cov(X, Y) = cov(Y, X)
3. cov(Σ_{i=1}^n a_i X_i, Σ_{j=1}^m b_j Y_j) = Σ_{i=1}^n Σ_{j=1}^m a_i b_j cov(X_i, Y_j)
Covariance of Independent Variables
If X and Y are independent then cov(X, Y) = 0.
Joint Dist of Functions of Random Variables
Suppose X and Y have joint pdf f_{X,Y} and suppose U = g(X, Y), V = h(X, Y), then
f_{U,V}(u, v) = f_{X,Y}(x, y) |J(x, y)|^(−1), where
J(x, y) = det [ ∂g/∂x  ∂g/∂y ; ∂h/∂x  ∂h/∂y ]
Step 0: Determine the support of U and V
Step 1: Find the Jacobian determinant J(x, y) of g and h
Step 2: Find x and y in terms of u and v, i.e. find the 'inverses'
Step 3: Substitute x, y and J(x, y) in f_{U,V}(u, v); J(x, y) is now in x and y, next convert it to u and v with the inverses found.
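A sketch of these steps (assuming sympy is available) for the illustrative transformation U = X + Y, V = X − Y with X, Y taken as iid N(0, 1); none of these choices come from the notes:

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v', real=True)
g, h = x + y, x - y                                      # U = g(X, Y), V = h(X, Y)

# Step 0: here U and V range over all of R^2.
# Step 1: Jacobian determinant of g and h, in terms of x and y.
J = sp.Matrix([[sp.diff(g, x), sp.diff(g, y)],
               [sp.diff(h, x), sp.diff(h, y)]]).det()    # = -2

# Step 2: find the inverses x(u, v) and y(u, v).
inv = sp.solve([sp.Eq(u, g), sp.Eq(v, h)], [x, y])       # {x: u/2 + v/2, y: u/2 - v/2}

# Step 3: substitute into f_{U,V}(u, v) = f_{X,Y}(x, y) |J(x, y)|^(-1).
f_xy = sp.exp(-(x**2 + y**2) / 2) / (2 * sp.pi)          # joint pdf of two iid N(0, 1)
f_uv = (f_xy / sp.Abs(J)).subs(inv)
print(sp.simplify(f_uv))   # exp(-(u**2 + v**2)/4)/(4*pi): U and V are iid N(0, 2)
```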
Expectation Independent Variables
If X and Y are independent, then for any g, h : R → R,
E[g(X)h(Y )] = E[g(X)]E[h(Y )]
Joint Moment Generating Functions
Let X_1, · · · , X_n be n random variables, then
M(t_1, · · · , t_n) = E[e^(t_1 X_1 + ··· + t_n X_n)]
Independent MGFs
If X_1, · · · , X_n are independent variables then
M(t_1, · · · , t_n) = M_{X_1}(t_1) M_{X_2}(t_2) · · · M_{X_n}(t_n)
Limit Theorems
Markov’s Inequality
Let X be a nonnegative random variable, then
P(X ≥ a) ≤ E(X)/a, a > 0
Chebyshev’s Inequality
Let X be a random variable with E(X) = µ, var(X) = σ^2. Then for a > 0,
P(|X − µ| ≥ a) ≤ σ^2/a^2
Weak Law of Large Numbers
Let X_1, · · · , X_n be independent and identically distributed variables with E(X_i) = µ. Then for every ε > 0,
P(|(X_1 + · · · + X_n)/n − µ| ≥ ε) → 0 as n → ∞
Strong Law of Large Numbers
Let X1 , X2 , · · · be independent and identically distributed variables with E(Xi ) = µ, then
(X_1 + X_2 + · · · + X_n)/n → µ as n → ∞ with probability 1, i.e.
P( lim_{n→∞} (X_1 + X_2 + · · · + X_n)/n = µ ) = 1
Central Limit Theorem
Let X_1, X_2, · · · be independent and identically distributed variables with mean µ and variance σ^2, then for large n,
(X_1 + · · · + X_n − nµ)/(σ√n) ≈ Z
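An illustrative Python simulation of the CLT with iid U(0, 1) variables (µ = 1/2, σ^2 = 1/12), comparing the empirical frequency of {standardised sum ≤ 1} with Φ(1):

```python
import random
from math import erf, sqrt

n, reps = 50, 20000
mu, sigma = 0.5, sqrt(1 / 12)

count = 0
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    z = (s - n * mu) / (sigma * sqrt(n))   # standardised sum
    if z <= 1:
        count += 1

print(count / reps)                         # ≈ Φ(1)
print(0.5 * (1 + erf(1 / sqrt(2))))         # Φ(1) ≈ 0.8413
```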