Fundamental Tools - Probability Theory IV
MSc Financial Mathematics
The University of Warwick
October 1, 2015

Outline: Model-independent inequalities · Law of large numbers · Central limit theorem

Model-independent inequalities

The standard route of stochastic modelling is to first postulate probability distributions for some real-world phenomena, and then calculate probabilities or expectations under the stated assumptions.

In reality, we seldom have a good view of the "correct" distributions driving the random quantities of interest.

If we are given only very limited information about a random variable (e.g. its mean or variance alone), can we still say something about the probabilities of certain events?

Markov's inequality

Even if only the mean of X is known, surprisingly something can still be said about P(X ≥ a).

Theorem (Markov's inequality). If X is a non-negative random variable with finite expectation, then for any a > 0 we have

    P(X ≥ a) ≤ E(X)/a.

Of course this inequality is only useful for large a, where P(X ≥ a) is the probability of a tail event. Otherwise, if E(X)/a > 1, the inequality merely says that the probability is bounded above by a number larger than 1, which is always true anyway.

Chebyshev's inequality

Theorem (Chebyshev's inequality). If X is a random variable with mean µ and finite variance σ², then for any k > 0 we have

    P(|X − µ| ≥ k) ≤ σ²/k².

This inequality is an immediate consequence of Markov's inequality with X replaced by (X − µ)² and a replaced by k².
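Both bounds can be checked empirically. A minimal simulation sketch, assuming numpy; the Exponential(1) test distribution and the constants a, k are illustrative choices, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential(1): non-negative, mean 1, variance 1 (illustrative choice)
x = rng.exponential(scale=1.0, size=1_000_000)

a, k = 3.0, 2.0
markov_empirical = (x >= a).mean()              # P(X >= a)
markov_bound = x.mean() / a                     # E(X)/a
cheb_empirical = (np.abs(x - 1.0) >= k).mean()  # P(|X - mu| >= k), mu = 1
cheb_bound = 1.0 / k**2                         # sigma^2 / k^2, sigma^2 = 1

print(markov_empirical, markov_bound)  # ~0.0498 vs ~0.333
print(cheb_empirical, cheb_bound)      # ~0.0498 vs 0.25
```

As the slides note, the bounds are valid but often far from tight: here both true probabilities equal e^(−3) ≈ 0.05, well below the bounds.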
Chebyshev's inequality is a statement about the probability of a "two-sided deviation". What can we say about P(X − µ ≥ k), in the spirit of Markov's inequality?

Observe that P(X − µ ≥ k) ≤ P(|X − µ| ≥ k) ≤ σ²/k², which gives an upper bound on this "one-sided deviation" probability. A better upper bound can in fact be derived; see the problem sheet.

Example

Suppose it is known that the number of items produced in a factory during a week is a random variable with mean 50.

1. What can be said about the probability that this week's production will be no less than 75?
2. If the variance of a week's production is known to equal 25, what can be said about the probability that this week's production will be strictly between 40 and 60?

Sketch of solution: Let X be the number of items produced in a week.

1. By Markov's inequality, P(X ≥ 75) ≤ E(X)/75 = 50/75 = 2/3.
2. By Chebyshev's inequality, P(|X − 50| ≥ 10) ≤ σ²/10² = 25/100 = 1/4. Thus P(40 < X < 60) = 1 − P(|X − 50| ≥ 10) ≥ 1 − 1/4 = 3/4.

Law of large numbers

If we are told that a random variable X has an expected value of µ, how should we interpret this?

The standard way is to adopt a frequency interpretation: if we draw a very large sample of random numbers X1, X2, ..., Xn, all with the same distribution as X, we expect the sample mean Sn = (X1 + X2 + ··· + Xn)/n to be very close to µ.

This turns out to be a correct mathematical result: the law of large numbers says that Sn converges to µ as n → ∞.

There is a caveat (which you don't need to worry about now): Sn is a random variable depending on the samples drawn. What does "Sn converges to µ" actually mean?
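The frequency interpretation can be illustrated numerically. A sketch assuming numpy; the fair-die distribution and sample sizes are illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fair die rolls: an illustrative distribution with known mean mu = 3.5
mu = 3.5
rolls = rng.integers(1, 7, size=100_000)

# Running sample means S_n = (X1 + ... + Xn)/n for n = 1, ..., 100000
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)

print(running_mean[9], running_mean[999], running_mean[-1])  # drifts towards 3.5
```

Each run produces a different random path of sample means, but they all settle near µ = 3.5 — which is precisely the convergence statement the law of large numbers makes precise.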
Weak/strong law of large numbers

Theorem (Law of large numbers). Let X1, X2, ... be a sequence of i.i.d. random variables with finite common mean µ, and let Sn = (X1 + X2 + ··· + Xn)/n be the sample mean.

Weak law of large numbers: for any ε > 0, we have

    lim_{n→∞} P(|Sn − µ| > ε) = 0.

Strong law of large numbers:

    P(lim_{n→∞} Sn = µ) = 1.

The difference between the two versions is subtle:

Weak law: the probability that Sn deviates from µ by more than ε gets smaller and smaller as n increases.
Strong law: Sn converges to µ with probability one.

Central limit theorem (CLT)

Theorem (Central limit theorem). Let X1, X2, ... be a sequence of i.i.d. random variables with common mean µ and finite variance σ², and let Sn = (X1 + X2 + ··· + Xn)/n be the sample mean. Then (Sn − µ)/(σ/√n) converges to a standard normal random variable in distribution. Equivalently,

    lim_{n→∞} P((Sn − µ)/(σ/√n) ≤ x) = (1/√(2π)) ∫_{−∞}^{x} e^{−u²/2} du.

The law of large numbers says Sn is close to µ when n is large. The CLT further describes the random fluctuation of Sn around µ: Sn approximately has a N(µ, σ²/n) distribution, no matter what distribution the Xi have!

Application of CLT: statistical inference

There is a population whose variance σ² we know but whose mean µ we do not. We would like to find a plausible range for µ.

We draw n samples X1, X2, ..., Xn from the population and compute the sample mean x̄. By the CLT, (x̄ − µ)/(σ/√n) is approximately N(0, 1).

We can invoke standard facts about Z ∼ N(0, 1), e.g. P(−1.96 < Z < 1.96) = 0.95.
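The 0.95 coverage figure for standardized sample means can be checked by simulation. A minimal sketch assuming numpy; the choice of U[0,1] samples and the sizes n and reps are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(2)

# Xi ~ U[0,1], so mu = 1/2 and sigma^2 = 1/12 (illustrative choice)
n, reps = 50, 100_000
mu, sigma = 0.5, (1 / 12) ** 0.5

# reps independent sample means, each from n i.i.d. draws
sample_means = rng.random((reps, n)).mean(axis=1)
z = (sample_means - mu) / (sigma / n**0.5)  # standardized as in the CLT

coverage = np.mean(np.abs(z) < 1.96)
print(coverage)  # close to 0.95, as the CLT predicts
```

Replacing the uniform draws with any other distribution having finite variance gives essentially the same coverage — the "no matter what distribution Xi has" point above.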
This gives P(−1.96 < (x̄ − µ)/(σ/√n) < 1.96) = 0.95, and in turn

    P(x̄ − 1.96 σ/√n < µ < x̄ + 1.96 σ/√n) = 0.95.

Hence there is a 95% chance that the interval x̄ ± 1.96 σ/√n contains the true (but unknown) mean µ. This is called the 95% confidence interval for µ.

Application of CLT: generation of N(0, 1) random numbers

You will see in the MSc course that generating N(0, 1) random variables is crucial in computational finance. Although an N(0, 1) random number generator comes with most numerical libraries, the theory behind the implementation is not entirely straightforward.

A quick and dirty method is to simulate 12 independent U[0, 1] random variables and consider Z = Σ_{i=1}^{12} Ui − 6. Here µ = E(Ui) = 1/2 and σ² = var(Ui) = 1/12, so

    ((1/12) Σ_{i=1}^{12} Ui − µ)/(σ/√12) = (Σ_{i=1}^{12} Ui − 12µ)/(√12 σ) = Z,

since √12 σ = 1 and 12µ = 6. Hence Z approximately has a N(0, 1) distribution if we regard n = 12 as "large".

Application of CLT: normal approximation

Suppose X ∼ Bin(100, 0.6) is a binomial random variable and we are required to calculate P(X ≤ 55). Formally we need to find Σ_{k=0}^{55} C(100, k) (0.6)^k (0.4)^{100−k}, but there is no quick formula for evaluating this sum. Instead we can use an approximation.

Recall that a binomial random variable X ∼ Bin(n, p) can be viewed as a sum of i.i.d. Bernoulli random variables with success probability p: X = Σ_{i=1}^{n} Hi, where Hi ∼ Ber(p) for all i.

For a Bernoulli random variable, µ = E(Hi) = p and σ² = var(Hi) = p(1 − p). When n is large, the CLT asserts that

    ((1/n) Σ_{i=1}^{n} Hi − µ)/(σ/√n) = (X − nµ)/(√n σ) = (X − np)/√(np(1 − p))

approximately has a N(0, 1) distribution. Hence X approximately has a N(np, np(1 − p)) distribution.
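The quick-and-dirty sum-of-12-uniforms generator described above can be sketched as follows (numpy assumed; the function name and sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def crude_normal(size, rng):
    """Approximate N(0,1) via Z = U1 + ... + U12 - 6, Ui i.i.d. U[0,1]."""
    return rng.random((size, 12)).sum(axis=1) - 6.0

z = crude_normal(500_000, rng)
print(z.mean(), z.var())          # roughly 0 and 1
print(np.mean(np.abs(z) < 1.96))  # roughly 0.95
```

The mean and variance of Z are exactly 0 and 1 by construction; it is only the shape of the distribution (especially the tails, since |Z| ≤ 6 always) that is approximate.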
Normal approximation: continuity correction

A binomial distribution is discrete, but a normal distribution is continuous. Apply a ±0.5 adjustment when using the normal approximation for better accuracy. In other words, write P(X = k) as P(k − 0.5 < X < k + 0.5) when moving from the exact discrete distribution to the continuous normal approximation.

Example

Given X ∼ Bin(100, 0.6), approximate the following probabilities by the CLT and express the answers in terms of Φ(·), the cdf of a standard normal random variable.

1. P(X ≤ 55);
2. P(55 ≤ X < 60);
3. P(X = 70).

Sketch of solution: X can be approximated as N(np, np(1 − p)), i.e. N(60, 24). Then

1. P(X ≤ 55) ≈ P(X < 55.5) = P(Z < (55.5 − 60)/√24) = Φ(−0.9186).
2. P(55 ≤ X < 60) ≈ P(54.5 < X < 59.5) = P((54.5 − 60)/√24 < Z < (59.5 − 60)/√24) = Φ(−0.1021) − Φ(−1.1227).
3. P(X = 70) ≈ P(69.5 < X < 70.5) = P((69.5 − 60)/√24 < Z < (70.5 − 60)/√24) = Φ(2.1433) − Φ(1.9392).

Normal approximation to Poisson distribution

A Poisson distribution can be considered as a limiting case of Bin(n, p) when we set p = λ/n and let n → ∞. Thus the normal approximation generally works for X ∼ Poi(λ) as well, approximating it by N(λ, λ).

Example: If X ∼ Poi(7), approximate P(X ≥ 9).

Sketch of solution: We adopt the normal approximation X ∼ N(7, 7) and apply the continuity correction again:

    P(X ≥ 9) ≈ P(X > 8.5) = P(Z > (8.5 − 7)/√7) = P(Z > 0.5669) = 1 − Φ(0.5669).
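The continuity-corrected approximation for P(X ≤ 55) can be compared against the exact binomial sum. A sketch using only the Python standard library (the helper norm_cdf is ours, built from the error function, not from the slides):

```python
import math

n, p = 100, 0.6
mu, sd = n * p, math.sqrt(n * p * (1 - p))  # N(60, 24), sd = sqrt(24)

def norm_cdf(x):
    """Standard normal cdf: Phi(x) = (1 + erf(x/sqrt(2)))/2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Exact: P(X <= 55) = sum_{k=0}^{55} C(100,k) 0.6^k 0.4^(100-k)
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(56))

# Continuity-corrected CLT approximation: Phi((55.5 - 60)/sqrt(24))
approx = norm_cdf((55.5 - mu) / sd)

print(exact, approx)  # both roughly 0.18
```

Dropping the +0.5 correction (i.e. using Φ((55 − 60)/√24) instead) gives a visibly worse approximation, which is the point of the continuity-correction slide.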