Week 1: Review of Probability

Key Concepts:
Sample space and events
Rules of probability
Conditional probability and independence
Computing probabilities

Probability: Definition and Properties
(i) 0 ≤ P(A) ≤ 1
(ii) P(S) = 1
(iii) If A1, A2, ··· are mutually exclusive, i.e. Ai ∩ Aj = ∅ for i ≠ j, then
P(∪i Ai) = Σi P(Ai)
(iv) It is not hard to see that P(∅) = 0
(v) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
More generally (inclusion–exclusion), if A1, A2, ···, An are n events, then
P(∪_{1}^{n} Ai) = Σi P(Ai) − Σ_{i<j} P(Ai ∩ Aj) + Σ_{i<j<k} P(Ai ∩ Aj ∩ Ak) − ··· + (−1)^{n−1} P(∩_{1}^{n} Ai)

Conditional Probability & Independence:
The conditional probability of A given B (defined when P(B) > 0) is
P(A|B) = P(A ∩ B) / P(B)
Events A and B are independent if P(A ∩ B) = P(A)P(B)

Discrete Probability
A list of values x1, x2, ···, xn with associated probabilities P(x1), P(x2), ···, P(xn), where
(i) P(xi) ≥ 0
(ii) Σi P(xi) = 1
Examples: Bernoulli(p), Binomial(n, p), hypergeometric, and Poisson.

Continuous Probability
A continuous probability distribution is completely defined via the probability density function f. The p.d.f. satisfies
(i) f ≥ 0
(ii) ∫_{−∞}^{∞} f(x) dx = 1
(iii) Probabilities are calculated by
P(a, b) = ∫_a^b f(x) dx
(iv) Note: for a continuous distribution, P(X = x) = 0 for all x, so P(a, b) = P[a, b]
Examples: normal, uniform, exponential, and Pareto.

Continuous Probability on R²
Given a p.d.f. f(x1, x2), for any a < b and c < d,
P(a < X1 < b, c < X2 < d) = ∫_c^d ∫_a^b f(x1, x2) dx1 dx2
More generally, for densities on Rⁿ and any A ⊂ Rⁿ,
P(A) = ∫_A f(x1, x2, ···, xn) dx1 dx2 ··· dxn

Random Variables
A real-valued random variable X is a function from a probability space (Ω, P) to R:
X : Ω → R
The distribution of X is the probability on R given by
P(X ∈ A) = P(X⁻¹(A)) = P({ω : X(ω) ∈ A})
Typically, all that matters is the distribution of X; the underlying sample space is not very relevant.
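To close this review with a concrete check of the addition rule (v) and the conditional probability formula above, here is a minimal simulation sketch (not part of the original slides; the events A and B on a fair die are illustrative choices, and numpy is assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=1_000_000)  # fair six-sided die

A = rolls % 2 == 0          # event A: roll is even, {2, 4, 6}
B = rolls <= 3              # event B: roll is at most 3, {1, 2, 3}

p_A, p_B = A.mean(), B.mean()
p_AB = (A & B).mean()       # P(A ∩ B)
p_AorB = (A | B).mean()     # P(A ∪ B)

# Addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
print(p_AorB, p_A + p_B - p_AB)   # both ≈ 5/6

# Conditional probability: P(A|B) = P(A ∩ B) / P(B)
print(p_AB / p_B)                 # ≈ 1/3 (only "2" is both even and ≤ 3)
```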
Cumulative Distribution Function
The function F : R → [0, 1] defined by F(t) = P(X ≤ t) is called the (Cumulative) Distribution Function of X.
F is increasing in t, right continuous, with lim_{t→−∞} F(t) = 0 and lim_{t→∞} F(t) = 1
If P(X = x) > 0, then F has a jump at x with jump size equal to p(x)
If the distribution is continuous, then F is continuous
If f is the pdf, then F(t) = ∫_{−∞}^t f(x) dx and f(x) = F′(x)

Moment Generating Function
The function MX(t) := E(e^{tX}) is called the moment generating function (MGF) of X.
The MGF, if it exists in an interval containing 0, uniquely determines the distribution.
E(X) = dMX(t)/dt evaluated at t = 0
The MGF may not always exist. E(e^{itX}) is called the characteristic function; it always exists and has nice properties.
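As a quick symbolic check of E(X) = M′X(0) (an illustration, not from the slides; sympy is assumed), take the known MGF of the Exponential(1) distribution, M(t) = 1/(1 − t) for t < 1:

```python
import sympy as sp

t = sp.symbols('t')
M = 1 / (1 - t)                        # MGF of Exponential(1), valid for t < 1

mean = sp.diff(M, t).subs(t, 0)        # E(X)  = M'(0)
second = sp.diff(M, t, 2).subs(t, 0)   # E(X²) = M''(0)

print(mean)              # 1
print(second - mean**2)  # Var(X) = E(X²) − E(X)² = 1
```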
Joint Distribution: Discrete
If (X, Y) are two discrete random variables, their joint probabilities are described by the joint probability distribution
P(xi, yj) = P(X = xi, Y = yj)
for all xi, yj, with P(xi, yj) ≥ 0 and Σ_{i,j} P(xi, yj) = 1
PX(xi) = Σ_j P(xi, yj) is called the marginal distribution of X

Joint Distribution: Continuous
The joint distribution of (X, Y) is given by the joint pdf f(x, y):
P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy
The marginal density is given by
fX(x) = ∫_{−∞}^{∞} f(x, y) dy
The conditional density of Y given X = x is
f(y|x) = f(x, y) / fX(x)

Independence
X and Y are independent if and only if f(x, y) = fX(x) fY(y) for all x, y
If X and Y are independent, then ρ(X, Y) = 0; the converse is not true
If X and Y are independent, then V(X + Y) = V(X) + V(Y)
If X1 and X2 are independent, then M_{X1+X2}(t) = M_{X1}(t) × M_{X2}(t)
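A small numeric sketch of these definitions (illustrative, not from the slides; the joint pmf values are made up, and numpy is assumed): store a discrete joint pmf as a matrix, read off the marginals by summing rows and columns, and test independence by comparing the joint pmf with the outer product of its marginals.

```python
import numpy as np

# Joint pmf P(X = xi, Y = yj) as a matrix: rows index x, columns index y.
# This example joint equals the product of its marginals, so X ⊥ Y.
joint = np.array([[0.10, 0.15, 0.25],
                  [0.10, 0.15, 0.25]])

assert np.isclose(joint.sum(), 1.0)   # probabilities sum to 1

pX = joint.sum(axis=1)                # marginal of X: sum over y
pY = joint.sum(axis=0)                # marginal of Y: sum over x
print(pX)                             # [0.5 0.5]
print(pY)                             # [0.2 0.3 0.5]

# Independence check: P(xi, yj) = PX(xi) PY(yj) for all i, j
print(np.allclose(joint, np.outer(pX, pY)))   # True

# Conditional pmf of Y given X = x0: f(y|x) = f(x, y) / fX(x)
print(joint[0] / pX[0])               # [0.2 0.3 0.5]
```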
Inequalities
Markov's inequality: if X ≥ 0, then
P(X ≥ a) ≤ E(X)/a
Chebyshev's inequality:
P(|X − E(X)| > a) ≤ V(X)/a²
Hoeffding's inequality: if X1, X2, ···, Xn are i.i.d. Ber(p), then
P(|X̄n − p| ≥ a) ≤ 2e^{−2na²}

Week 1: Limit Theorems

Key Concepts:
Notions of convergence
Law of Large Numbers (LLN)
Central Limit Theorem (CLT)

Notions of Convergence
Let (Ω, P) be a probability space. For each ω ∈ Ω, we have a sequence of random variables X1(ω), X2(ω), ···
We know what it means to say that a sequence of numbers an converges to a. But X1(ω), X2(ω), ··· are functions, so a natural definition would be: Xn → X if Xn(ω) → X(ω) for all ω.
Note that the underlying probability plays no role here. We need a notion of convergence that uses P.

Convergence in Probability
We say that Xn → X in probability if, for every positive ε,
P(|Xn − X| > ε) → 0 as n → ∞,
or equivalently,
P(|Xn − X| ≤ ε) → 1 as n → ∞.
This means that by taking n sufficiently large, one can achieve arbitrarily high probability that Xn is arbitrarily close to X. (The inequalities above quantify how fast this happens for sample means; see the sketch below.)
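For intuition, here is a short sketch (illustrative, not from the slides; the values n = 100, p = 0.5, a = 0.1 are arbitrary, and numpy is assumed) comparing the Chebyshev and Hoeffding bounds with the simulated probability P(|X̄n − p| ≥ a) for Bernoulli sample means:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, a = 100, 0.5, 0.1
reps = 100_000

# Simulate the sample mean of n i.i.d. Ber(p) variables, many times over.
xbar = rng.binomial(n, p, size=reps) / n
empirical = np.mean(np.abs(xbar - p) >= a)

chebyshev = p * (1 - p) / (n * a**2)   # V(X̄n)/a² with V(X̄n) = p(1−p)/n
hoeffding = 2 * np.exp(-2 * n * a**2)

print(f"empirical  {empirical:.4f}")   # ≈ 0.057
print(f"chebyshev  {chebyshev:.4f}")   # 0.2500
print(f"hoeffding  {hoeffding:.4f}")   # ≈ 0.2707; tighter than Chebyshev as n grows
```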
Convergence in Distribution:
Xn → X in distribution if
FXn(t) → FX(t) for all t such that FX is continuous at t
The restriction to points of continuity makes this definition look artificial (we will discuss this in a moment).
Note: if the limit X has a continuous distribution, then FX is continuous and we have FXn(t) → FX(t) for all t.
Now, suppose Yn is normal with mean 0 and standard deviation σn, where σn → 0. Then
P(−a < Yn < a) = P(−a/σn < Z < a/σn) → 1 (why? since a/σn → ∞)
so we would like to say that the distribution of Yn converges to the probability concentrated at 0.
But Fn(0) = 0.5 for all n, while F(0) = 1, so Fn(0) does not converge to F(0).
F is not continuous at 0; at all other t, Fn(t) → F(t).
In the topics we cover, F will typically be a normal distribution, so we do not have to worry about points of discontinuity: there are none.
A useful tool for showing convergence in distribution is the following: if MXn(t) → MX(t), then Xn converges to X in distribution.

Example: Poisson Approximation to Binomial
If npn → λ, then Bin(n, pn) converges in distribution to Poisson(λ).
The MGF of Bin(n, pn) is
Mn(t) = (1 − pn + pn e^t)^n
Write pn = λ/n, so that
Mn(t) = (1 − (λ/n)(1 − e^t))^n
Since (1 − x/n)^n → e^{−x},
Mn(t) → exp(λ(e^t − 1)),
which is the MGF of Poisson(λ).
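To see the Poisson approximation numerically, here is a short sketch (illustrative, not from the slides; λ = 2 is an arbitrary choice, and scipy is assumed) comparing the Bin(n, λ/n) pmf with the Poisson(λ) pmf:

```python
from scipy.stats import binom, poisson

lam = 2.0
for n in (10, 100, 1000):
    p = lam / n                  # pn = λ/n, so n·pn = λ
    # Maximum pointwise gap between the Bin(n, pn) and Poisson(λ) pmfs
    gap = max(abs(binom.pmf(k, n, p) - poisson.pmf(k, lam))
              for k in range(11))
    print(n, round(gap, 5))      # the gap shrinks as n grows
```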
Week 1: Limit Theorems

Law of Large Numbers:

Theorem (WLLN)
Let X1, X2, ···, Xn, ··· be a sequence of independent random variables with E(Xi) = µ and Var(Xi) = σ². Let X̄n = n⁻¹ Σ_{i=1}^n Xi. Then, for any ε > 0,
P(|X̄n − µ| > ε) → 0 as n → ∞

Proof.
We first find E(X̄n) and Var(X̄n):
E(X̄n) = (1/n) Σ_{i=1}^n E(Xi) = µ
Since the Xi's are independent,
Var(X̄n) = (1/n²) Σ_{i=1}^n Var(Xi) = σ²/n
The desired result now follows immediately from Chebyshev's inequality:
P(|X̄n − µ| > ε) ≤ Var(X̄n)/ε² = σ²/(nε²) → 0 as n → ∞

Problems
1. Let X1, X2, ··· be a sequence of independent random variables with E(Xi) = µ and Var(Xi) = σi². Show that if n⁻² Σ_{i=1}^n σi² → 0, then X̄ → µ in probability.
2. Let Xi be as in Problem 1 but with E(Xi) = µi and n⁻¹ Σ_{i=1}^n µi → µ. Show that X̄ → µ in probability.

Solution (Problem 1)
X1, X2, ··· is a sequence of independent random variables with E(Xi) = µ and Var(Xi) = σi². If n⁻² Σ σi² → 0, show that the WLLN holds.
Proof.
E(X̄) = µ, and since the Xi's are independent,
Var(X̄) = Var((1/n) Σ_{i=1}^n Xi) = (1/n²) Σ_{i=1}^n σi²
By Chebyshev's inequality,
P(|X̄ − µ| > ε) ≤ (1/ε²) (1/n²) Σ_{i=1}^n σi²
and the last term goes to zero by assumption.

Example (Monte Carlo):
Suppose we want to evaluate ∫_0^1 f(x) dx, which is difficult to evaluate analytically.
Note: ∫_0^1 f(x) dx = E(f) with respect to the uniform distribution.
Simulate x1, x2, ···, xn (large n) from the U(0, 1) distribution. By the WLLN,
(1/n) Σ_{i=1}^n f(xi) ≈ ∫_0^1 f(x) dx
This is called Monte Carlo integration. (A sketch follows the homework problems below.)

Homework 1, Problems 19 & 20
Find a Monte Carlo approximation to ∫_0^1 cos(2πx) dx.
Find an estimate of the standard deviation of the approximation.
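A minimal Monte Carlo integration sketch (illustrative, not from the slides; it is shown on ∫_0^1 e^{−x²} dx rather than the homework integrand, and numpy is assumed). The standard deviation of the estimate is the sample s.d. of the f(xi) divided by √n, which is the estimate Problem 20 asks for:

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_integral(f, n):
    """Monte Carlo estimate of ∫_0^1 f(x) dx and its standard error."""
    x = rng.uniform(0.0, 1.0, size=n)    # x1, ..., xn ~ U(0, 1)
    fx = f(x)
    est = fx.mean()                      # (1/n) Σ f(xi) ≈ ∫ f, by the WLLN
    se = fx.std(ddof=1) / np.sqrt(n)     # s.d. of the estimate, via the CLT
    return est, se

est, se = mc_integral(lambda x: np.exp(-x**2), n=1_000_000)
print(f"{est:.5f} ± {se:.5f}")           # true value ≈ 0.74682
```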
Week 1: Limit Theorems

Central Limit Theorem (CLT):

Theorem (CLT)
Let X1, X2, ··· be a sequence of independent random variables with common distribution F, with E(Xi) = 0 and Var(Xi) = σ² (that is, X1, X2, ··· i.i.d. ∼ F). Assume that F has MGF M(t) defined in an interval around 0. Let Sn = Σ_{1}^{n} Xi. Then for all −∞ < x < ∞,
P(Sn/(σ√n) ≤ x) → Φ(x) as n → ∞
where Φ(x) = P(Z ≤ x) is the CDF of the standard normal.

Note: sd(Sn) = √n σ, so Sn/(σ√n) has mean 0 and s.d. 1.
Dividing the numerator and denominator of Sn/(σ√n) by n, we get
P(√n X̄n / σ ≤ x) → Φ(x) as n → ∞
If E(Xi) = µ, we can apply the CLT to Xi − µ (which has expected value 0), and so in this case
P(√n (X̄n − µ) / σ ≤ x) → Φ(x) as n → ∞

Extensions of the CLT
The central limit theorem can be proved in greater generality. Typically we use the CLT to get an approximation of P(√n (X̄n − µ)/σ ≤ x). How good is the approximation? If F is symmetric and has tails that die rapidly, the approximation is good. When F is highly skewed or has tails that go to 0 very slowly, we need a large n to get a good approximation.

Proof.
M(t) = E(e^{tX}), so MSn(t) = E(e^{tSn}) = [M(t)]^n. Let Zn = Sn/(σ√n). Then
MZn(t) = E(e^{tSn/(σ√n)}) = MSn(t/(σ√n)) = [M(t/(σ√n))]^n
We want to show that, as n → ∞, this goes to e^{t²/2}. We will make use of the following result:
(1 + (b + an)/n)^n → e^b as an → 0
So we also need to express MZn(t) in this form (how?).
Note M(0) = 1; since E(X) = 0, M′(0) = 0; and since E(X²) = σ², we have M″(0) = σ². By Taylor expansion (to third order),
M(s) = M(0) + s M′(0) + (s²/2) M″(0) + (s³/6) M‴(0)
     = 1 + (s²/2) σ² + (s³/6) M‴(0)
With s = t/(σ√n),
M(t/(σ√n)) = 1 + t²/(2n) + εn, where εn = (t³/(6σ³ n^{3/2})) M‴(0)
This can be written as 1 + (t²/2 + an)/n with an = n εn = (t³/(6σ³ √n)) M‴(0) → 0. We then have
[M(t/(σ√n))]^n = (1 + (t²/2 + an)/n)^n → e^{t²/2}
as required.

Problems

17. Suppose that a measurement has mean µ and variance σ² = 25. Let X̄ be the average of n such independent measurements. How large should n be so that P{|X̄ − µ| < 1} = .95?
P{|X̄ − µ| < 1} = P{ |√n (X̄ − µ)/5| < √n/5 } ≈ P{ |Z| < √n/5 } = .95
But we also know that P{|Z| < 1.96} = .95, so
√n/5 = 1.96, i.e. n = (1.96)² × 5² = 96.04; take n = 97.

Problem (total weight). Let Xi be the weight of the i-th package, with E(Xi) = 15 and σ = 10, and let T = Σ_{1}^{100} Xi be the total weight. Then E(T) = 1500 and sd(T) = 10 × 10, so
P(T > 1700) = P( (T − 1500)/(10 × 10) > (1700 − 1500)/(10 × 10) ) ≈ P(Z > 2) ≈ .0228

Problem (maximum of uniforms). Let X1, X2, ···, Xn ∼ U(0, 1) and M = max(X1, X2, ···, Xn). Then
P{1 − M < t} = P{M > 1 − t} = 1 − (1 − t)^n
P{1 − M < t/n} = P{n(1 − M) < t} = 1 − (1 − t/n)^n
As n → ∞, 1 − (1 − t/n)^n → 1 − e^{−t}, so n(1 − M) → exp(1) in distribution. (A simulation check follows.)
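A quick simulation sketch of this last result (illustrative, not from the slides; the values n = 200 and 50,000 repetitions are arbitrary, and numpy is assumed): for large n, n(1 − M) should behave like an Exponential(1) variable.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 200, 50_000

# M = max of n i.i.d. U(0,1) draws, repeated many times
u = rng.uniform(size=(reps, n))
scaled = n * (1 - u.max(axis=1))      # n(1 − M)

# Compare the empirical P{n(1 − M) < t} with the Exp(1) CDF 1 − e^{−t}
for t in (0.5, 1.0, 2.0):
    print(t, round((scaled < t).mean(), 4), round(1 - np.exp(-t), 4))
```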