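Probability Cheat Sheet

Distributions

Uniform Distribution
notation: $U[a, b]$
cdf: $\frac{x - a}{b - a}$ for $x \in [a, b]$
pdf: $\frac{1}{b - a}$ for $x \in [a, b]$
expectation: $\frac{1}{2}(a + b)$
variance: $\frac{1}{12}(b - a)^2$
mgf: $\frac{e^{tb} - e^{ta}}{t(b - a)}$
story: all intervals of the same length on the distribution's support are equally probable.

Gamma Distribution
notation: $\mathrm{Gamma}(k, \theta)$
pdf: $\frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^k} I_{x > 0}$, where $\Gamma(k) = \int_0^\infty x^{k-1} e^{-x}\,dx$
expectation: $k\theta$
variance: $k\theta^2$
mgf: $(1 - \theta t)^{-k}$ for $t < \frac{1}{\theta}$
ind. sum: $\sum_{i=1}^n X_i \sim \mathrm{Gamma}\left(\sum_{i=1}^n k_i, \theta\right)$
story: the sum of $k$ independent exponentially distributed random variables, each of which has a mean of $\theta$ (which is equivalent to a rate parameter of $\theta^{-1}$).

Geometric Distribution
notation: $G(p)$
cdf: $1 - (1 - p)^k$ for $k \in \{1, 2, \dots\}$
pmf: $(1 - p)^{k-1} p$ for $k \in \{1, 2, \dots\}$
expectation: $\frac{1}{p}$
variance: $\frac{1 - p}{p^2}$
mgf: $\frac{p e^t}{1 - (1 - p) e^t}$
story: the number $X$ of Bernoulli trials needed to get one success. Memoryless.

Poisson Distribution
notation: $\mathrm{Poisson}(\lambda)$
cdf: $e^{-\lambda} \sum_{i=0}^k \frac{\lambda^i}{i!}$
pmf: $\frac{\lambda^k}{k!} e^{-\lambda}$ for $k \in \{0, 1, 2, \dots\}$
expectation: $\lambda$
variance: $\lambda$
mgf: $\exp\left(\lambda\left(e^t - 1\right)\right)$
ind. sum: $\sum_{i=1}^n X_i \sim \mathrm{Poisson}\left(\sum_{i=1}^n \lambda_i\right)$
story: the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event.

Normal Distribution
notation: $N(\mu, \sigma^2)$
pdf: $\frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x - \mu)^2 / (2\sigma^2)}$
expectation: $\mu$
variance: $\sigma^2$
mgf: $\exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)$
ind. sum: $\sum_{i=1}^n X_i \sim N\left(\sum_{i=1}^n \mu_i, \sum_{i=1}^n \sigma_i^2\right)$
story: describes data that cluster around the mean.

Standard Normal Distribution
notation: $N(0, 1)$
cdf: $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2}\,dt$
pdf: $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
expectation: $0$
variance: $1$
mgf: $\exp\left(\frac{t^2}{2}\right)$
story: normal distribution with $\mu = 0$ and $\sigma = 1$.

Exponential Distribution
notation: $\exp(\lambda)$
cdf: $1 - e^{-\lambda x}$ for $x \ge 0$
pdf: $\lambda e^{-\lambda x}$ for $x \ge 0$
expectation: $\frac{1}{\lambda}$
variance: $\frac{1}{\lambda^2}$
mgf: $\frac{\lambda}{\lambda - t}$ for $t < \lambda$
ind. sum: $\sum_{i=1}^k X_i \sim \mathrm{Gamma}\left(k, \frac{1}{\lambda}\right)$
minimum: $\min_i X_i \sim \exp\left(\sum_{i=1}^k \lambda_i\right)$
story: the amount of time until some specific event occurs, starting from now, being memoryless.

Binomial Distribution
notation: $\mathrm{Bin}(n, p)$
cdf: $\sum_{i=0}^k \binom{n}{i} p^i (1 - p)^{n-i}$
pmf: $\binom{n}{i} p^i (1 - p)^{n-i}$
expectation: $np$
variance: $np(1 - p)$
mgf: $\left(1 - p + p e^t\right)^n$
story: the discrete probability distribution of the number of successes in a sequence of $n$ independent yes/no experiments, each of which yields success with probability $p$.

Basics

Cumulative Distribution Function
$F_X(x) = P(X \le x)$

Probability Density Function
$F_X(x) = \int_{-\infty}^x f_X(t)\,dt$
$\int_{-\infty}^\infty f_X(t)\,dt = 1$
$f_X(x) = \frac{d}{dx} F_X(x)$

Quantile Function
The function $X^* : [0, 1] \to \mathbb{R}$ for which for any $p \in [0, 1]$, $F_X\left(X^*(p)^-\right) \le p \le F_X\left(X^*(p)\right)$
$F_{X^*} = F_X$
$E(X^*) = E(X)$

Expectation
$E(X) = \int_0^1 X^*(p)\,dp$
$E(X) = \int_0^\infty (1 - F_X(t))\,dt - \int_{-\infty}^0 F_X(t)\,dt$
$E(X) = \int_{-\infty}^\infty x f_X(x)\,dx$
$E(g(X)) = \int_{-\infty}^\infty g(x) f_X(x)\,dx$
$E(aX + b) = aE(X) + b$

Variance
$\mathrm{Var}(X) = E\left(X^2\right) - (E(X))^2$
$\mathrm{Var}(X) = E\left((X - E(X))^2\right)$
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

Standard Deviation
$\sigma(X) = \sqrt{\mathrm{Var}(X)}$

Covariance
$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$
$\mathrm{Cov}(X, Y) = E\left((X - E(X))(Y - E(Y))\right)$
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$

Correlation Coefficient
$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$

Moment Generating Function
$M_X(t) = E\left(e^{tX}\right)$
$E(X^n) = M_X^{(n)}(0)$
$M_{aX+b}(t) = e^{tb} M_{aX}(t)$

Joint Distribution
$P_{X,Y}(B) = P((X, Y) \in B)$
$F_{X,Y}(x, y) = P(X \le x, Y \le y)$

Joint Density
$P_{X,Y}(B) = \iint_B f_{X,Y}(s, t)\,ds\,dt$
$F_{X,Y}(x, y) = \int_{-\infty}^x \int_{-\infty}^y f_{X,Y}(s, t)\,dt\,ds$
$\int_{-\infty}^\infty \int_{-\infty}^\infty f_{X,Y}(s, t)\,ds\,dt = 1$

Marginal Distributions
$P_X(B) = P_{X,Y}(B \times \mathbb{R})$
$P_Y(B) = P_{X,Y}(\mathbb{R} \times B)$
$F_X(a) = \int_{-\infty}^a \int_{-\infty}^\infty f_{X,Y}(s, t)\,dt\,ds$
$F_Y(b) = \int_{-\infty}^b \int_{-\infty}^\infty f_{X,Y}(s, t)\,ds\,dt$

Marginal Densities
$f_X(s) = \int_{-\infty}^\infty f_{X,Y}(s, t)\,dt$
$f_Y(t) = \int_{-\infty}^\infty f_{X,Y}(s, t)\,ds$

Joint Expectation
$E(\varphi(X, Y)) = \iint_{\mathbb{R}^2} \varphi(x, y) f_{X,Y}(x, y)\,dx\,dy$

Independent r.v.
$P(X \le x, Y \le y) = P(X \le x) P(Y \le y)$
$F_{X,Y}(x, y) = F_X(x) F_Y(y)$
$f_{X,Y}(s, t) = f_X(s) f_Y(t)$
$E(XY) = E(X) E(Y)$
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
Independent events: $P(A \cap B) = P(A) P(B)$

Conditional Probability
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
bayes: $P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$

Conditional Density
$f_{X \mid Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$
$f_{X \mid Y=n}(x) = \frac{f_X(x) P(Y = n \mid X = x)}{P(Y = n)}$
$F_{X \mid Y=y}(x) = \int_{-\infty}^x f_{X \mid Y=y}(t)\,dt$

Conditional Expectation
$E(X \mid Y = y) = \int_{-\infty}^\infty x f_{X \mid Y=y}(x)\,dx$
$E(E(X \mid Y)) = E(X)$
$P(Y = n) = E(I_{Y=n}) = E(E(I_{Y=n} \mid X))$

Sequences and Limits
$\limsup A_n = \{A_n \text{ i.o.}\} = \bigcap_{m=1}^\infty \bigcup_{n=m}^\infty A_n$
$\liminf A_n = \{A_n \text{ eventually}\} = \bigcup_{m=1}^\infty \bigcap_{n=m}^\infty A_n$
$\liminf A_n \subseteq \limsup A_n$
$(\limsup A_n)^c = \liminf A_n^c$
$(\liminf A_n)^c = \limsup A_n^c$
$P(\limsup A_n) = \lim_{m \to \infty} P\left(\bigcup_{n=m}^\infty A_n\right)$
$P(\liminf A_n) = \lim_{m \to \infty} P\left(\bigcap_{n=m}^\infty A_n\right)$

Borel-Cantelli Lemma
$\sum_{n=1}^\infty P(A_n) < \infty \Rightarrow P(\limsup A_n) = 0$
And if $A_n$ are independent:
$\sum_{n=1}^\infty P(A_n) = \infty \Rightarrow P(\limsup A_n) = 1$

Convergence

Convergence in Probability
notation: $X_n \xrightarrow{p} X$
meaning: $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$ for every $\varepsilon > 0$

Convergence in Distribution
notation: $X_n \xrightarrow{D} X$
meaning: $\lim_{n \to \infty} F_n(x) = F(x)$ at every continuity point $x$ of $F$

Almost Sure Convergence
notation: $X_n \xrightarrow{a.s.} X$
meaning: $P\left(\lim_{n \to \infty} X_n = X\right) = 1$

Criteria for a.s. Convergence
• $\forall \varepsilon\ \exists N:\ P(\forall n > N:\ |X_n - X| < \varepsilon) > 1 - \varepsilon$
• $\forall \varepsilon:\ P(\limsup \{|X_n - X| > \varepsilon\}) = 0$
• $\forall \varepsilon:\ \sum_{n=1}^\infty P(|X_n - X| > \varepsilon) < \infty$ (by B.C.)

Convergence in $L_p$
notation: $X_n \xrightarrow{L_p} X$
meaning: $\lim_{n \to \infty} E(|X_n - X|^p) = 0$

Relationships
$\xrightarrow{L_q} \Rightarrow \xrightarrow{L_p}$ for $q > p \ge 1$
$\xrightarrow{L_p} \Rightarrow \xrightarrow{p}$
$\xrightarrow{a.s.} \Rightarrow \xrightarrow{p}$
$\xrightarrow{p} \Rightarrow \xrightarrow{D}$
If $X_n \xrightarrow{D} c$ then $X_n \xrightarrow{p} c$
If $X_n \xrightarrow{p} X$ then there exists a subsequence $n_k$ s.t. $X_{n_k} \xrightarrow{a.s.} X$

Laws of Large Numbers
If $X_i$ are i.i.d. r.v. and $\overline{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$,
weak law: $\overline{X}_n \xrightarrow{p} E(X_1)$
strong law: $\overline{X}_n \xrightarrow{a.s.} E(X_1)$

Central Limit Theorem
For i.i.d. $X_i$ with mean $\mu$ and variance $\sigma^2$, and $S_n = \sum_{i=1}^n X_i$:
$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{D} N(0, 1)$
If $t_n \to t$, then
$P\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le t_n\right) \to \Phi(t)$

Inequalities

Markov's inequality
$P(|X| \ge t) \le \frac{E(|X|)}{t}$

Chebyshev's inequality
$P(|X - E(X)| \ge \varepsilon) \le \frac{\mathrm{Var}(X)}{\varepsilon^2}$

Chernoff's inequality
Let $X \sim \mathrm{Bin}(n, p)$; then:
$P(X - E(X) > t\sigma(X)) < e^{-t^2/2}$
Simpler result; for every $X$ and $t > 0$:
$P(X \ge a) \le M_X(t) e^{-ta}$

Jensen's inequality
for $\varphi$ a convex function, $\varphi(E(X)) \le E(\varphi(X))$

Miscellaneous
$E(Y) < \infty \iff \sum_{n=0}^\infty P(Y > n) < \infty$ ($Y \ge 0$)
$E(X) = \sum_{n=0}^\infty P(X > n)$ ($X \in \mathbb{N}$)
$X \sim U(0, 1) \iff -\ln X \sim \exp(1)$

Convolution
For ind. $X, Y$ and $Z = X + Y$:
$f_Z(z) = \int_{-\infty}^\infty f_X(s) f_Y(z - s)\,ds$

Kolmogorov's 0-1 Law
If $A$ is in the tail $\sigma$-algebra $\mathcal{F}^t$, then $P(A) = 0$ or $P(A) = 1$

Ugly Stuff
cdf of Gamma distribution (integer $k$):
$\int_0^t \frac{x^{k-1} e^{-x/\theta}}{\theta^k (k - 1)!}\,dx$

This cheatsheet was made by Peleg Michaeli in January 2010, using LaTeX.
version: 1.01
comments: peleg.michaeli@math.tau.ac.il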
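
As a quick sanity check on a few of the discrete-distribution entries, the sketch below computes the Binomial and Poisson pmf exactly and compares the resulting mean and variance with the closed forms listed in the sheet ($np$, $np(1-p)$, and $\lambda$). It also checks Markov's inequality on a Binomial tail. The helper names (`binomial_pmf`, `poisson_pmf`, `moments_from_pmf`, `markov_holds`) and the parameter values are illustrative assumptions, not part of the original sheet, and the Poisson sums are truncated at 100 terms, which is only adequate for small $\lambda$.

```python
import math


def binomial_pmf(n, p, i):
    """P(X = i) for X ~ Bin(n, p), using the pmf from the sheet."""
    return math.comb(n, i) * p**i * (1 - p) ** (n - i)


def binomial_cdf(n, p, k):
    """P(X <= k) for X ~ Bin(n, p): the pmf summed over i = 0..k."""
    return sum(binomial_pmf(n, p, i) for i in range(k + 1))


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k / math.factorial(k) * math.exp(-lam)


def moments_from_pmf(pmf, support):
    """Return (E(X), Var(X)) via Var(X) = E(X^2) - (E(X))^2."""
    mean = sum(k * pmf(k) for k in support)
    second_moment = sum(k * k * pmf(k) for k in support)
    return mean, second_moment - mean**2


def markov_holds(n, p, t):
    """Check P(X >= t) <= E(X) / t for X ~ Bin(n, p) and t > 0."""
    tail = sum(binomial_pmf(n, p, i) for i in range(t, n + 1))
    return tail <= n * p / t


# Illustrative parameters (assumed values, chosen only for the check).
n, p = 10, 0.3
bin_mean, bin_var = moments_from_pmf(lambda i: binomial_pmf(n, p, i), range(n + 1))

lam = 2.5
# Truncated support: an assumption that is only reasonable for small lam.
poi_mean, poi_var = moments_from_pmf(lambda k: poisson_pmf(lam, k), range(100))

print("Binomial mean/variance:", bin_mean, bin_var)
print("Poisson mean/variance: ", poi_mean, poi_var)
print("Markov holds for t = 5:", markov_holds(n, p, 5))
```

The same pattern (sum the pmf against $k$ and $k^2$) can be used to spot-check the Geometric entries, provided the support is truncated far enough for the tail to be negligible.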