Statistical Inference and Regression Analysis: Stat-GB.3302.30, Stat-UB.0015.01
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

Part 2 – A Expectations of Random Variables

2-A Expectations of Random Variables
2-B Covariance and Correlation
2-C Limit Results for Sums

Part 2 – Expectations of Random Variables

[Slide banner graphic: Minitab sample plots of Listing vs IncomePC and a pie chart of Percent vs Type.]

Expected Value of a Random Variable

Weighted average of the values taken by the variable.

Discrete:
\[ E[X] = \sum_{\text{all values taken by } X} x \, \mathrm{Prob}(X = x) \]

Continuous:
\[ E[X] = \int_{-\infty}^{\infty} x f(x)\,dx \]
(Density equals zero outside the range of x.)

Discrete Uniform

X = 1, 2, …, J
Prob(X = x) = 1/J
E[X] = 1/J + 2/J + … + J/J = [J(J+1)/2] × (1/J) = (J+1)/2
Expected toss of a die = 3.5 (J = 6)
Expected sum of two dice = 7. Proof?

Poisson ($\lambda$)

\[
\begin{aligned}
E[X] &= \sum_{x=0}^{\infty} x \, \frac{e^{-\lambda}\lambda^{x}}{x!} \\
     &= \sum_{x=1}^{\infty} x \, \frac{e^{-\lambda}\lambda^{x}}{x!} && \text{(drop zero term)} \\
     &= \lambda \sum_{x=1}^{\infty} \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!} && \text{(factor out } \lambda \text{ and use } x/x! = 1/(x-1)!\text{)} \\
     &= \lambda \sum_{z=0}^{\infty} \frac{e^{-\lambda}\lambda^{z}}{z!} && \text{(let } z = x-1;\ z \text{ goes from 0 to } \infty\text{)} \\
     &= \lambda && \text{(probabilities sum to 1)}
\end{aligned}
\]

Poisson (5)

[Figure: Poisson (5)]
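The die and Poisson results above can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names (`discrete_mean`, `poisson_pmf`) and the truncation point `x_max` are illustrative assumptions. It enumerates the two-dice sum directly and approximates the infinite Poisson sum by a long finite sum.

```python
from fractions import Fraction
from math import exp, factorial


def discrete_mean(pmf):
    """E[X] = sum of x * Prob(X = x) over a dict {value: probability}."""
    return sum(x * p for x, p in pmf.items())


# Discrete uniform on 1..J with J = 6 (one fair die).
J = 6
die = {x: Fraction(1, J) for x in range(1, J + 1)}
mean_die = discrete_mean(die)

# Sum of two independent dice: accumulate the probability of each total
# over all 36 equally likely pairs.
two_dice = {}
for a in die:
    for b in die:
        two_dice[a + b] = two_dice.get(a + b, 0) + die[a] * die[b]
mean_two_dice = discrete_mean(two_dice)


def poisson_pmf(x, lam):
    """Prob(X = x) = exp(-lam) * lam**x / x! for a Poisson(lam) variable."""
    return exp(-lam) * lam ** x / factorial(x)


# Poisson(5): the infinite sum is truncated at x_max (an assumption of this
# sketch; the omitted tail is negligible for lam = 5).
lam = 5.0
x_max = 100
poisson_mean = sum(x * poisson_pmf(x, lam) for x in range(x_max + 1))

print("E[one die]   =", mean_die)
print("E[two dice]  =", mean_two_dice)
print("E[Poisson 5] =", poisson_mean)
```

Exact fractions are used for the dice so that the results come out as 7/2 and 7 without rounding. The Poisson mean is a floating-point approximation of $\lambda$.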

The St. Petersburg Paradox

Coin toss game. If the first heads comes up on the nth toss, you win $\$2^n$.
Entry fee to play a game is $\$C$.
\[ \text{Expected value of the game} = E[\text{Win}] = -C + \left(\tfrac{1}{2}\right)2^{1} + \left(\tfrac{1}{2}\right)^{2}2^{2} + \dots + \left(\tfrac{1}{2}\right)^{k}2^{k} + \dots \]
Game has infinite value. No one would pay very much to play. Why not?
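To see why the St. Petersburg game has infinite value, it may help to truncate it. The sketch below assumes a hypothetical variant that stops after `max_tosses` tosses and pays nothing if no heads has appeared by then. The function name and that stopping rule are assumptions for illustration only.

```python
from fractions import Fraction


def truncated_expected_win(max_tosses):
    """Expected payout (before the entry fee) of a game capped at max_tosses.

    First heads on toss n has probability (1/2)**n and pays 2**n,
    so each term of the sum contributes exactly 1.
    """
    return sum(Fraction(1, 2) ** n * 2 ** n for n in range(1, max_tosses + 1))


for k in (1, 10, 20, 40):
    print(k, truncated_expected_win(k))
```

Each extra allowed toss adds exactly 1 to the expected payout, so the untruncated sum grows without bound whatever the entry fee C is.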

Continuous Random Variable

\[ E[X] = \int_{-\infty}^{\infty} x f(x)\,dx, \qquad \text{'support of } x\text{'} = \{x : f(x) > 0\} \]

Continuous uniform: $f(x) = \dfrac{1}{b-a}\, I(x \in [a, b])$

\[ E[x] = \int_{a}^{b} x \, \frac{1}{b-a}\,dx = \frac{1}{b-a} \left. \frac{x^{2}}{2} \right|_{a}^{b} = \frac{b^{2} - a^{2}}{2(b-a)} = \frac{(b+a)(b-a)}{2(b-a)} = \frac{b+a}{2} \quad \text{(the midpoint)} \]

Gamma Random Variable

\[ f(x) = \frac{\lambda^{P} e^{-\lambda x} x^{P-1}}{\Gamma(P)}, \quad x \ge 0,\ \lambda > 0,\ P > 0 \]

Gamma function: $\Gamma(P) = \int_{0}^{\infty} e^{-t} t^{P-1}\,dt$.

Results:
$\Gamma(P) = (P-1)\Gamma(P-1)$, which equals $(P-1)!$ for positive integer $P$ (Show by integration by parts)
$\Gamma(1/2) = \sqrt{\pi}$ (We will prove this later)

Implication: We know $\int_{0}^{\infty} f(x)\,dx = 1$. So,
\[ \int_{0}^{\infty} e^{-\lambda x} x^{P-1}\,dx = \frac{\Gamma(P)}{\lambda^{P}}. \]

Expected value is
\[ E[X] = \int_{0}^{\infty} x \, \frac{\lambda^{P} e^{-\lambda x} x^{P-1}}{\Gamma(P)}\,dx = \frac{\lambda^{P}}{\Gamma(P)} \int_{0}^{\infty} e^{-\lambda x} x^{(P+1)-1}\,dx = \frac{\lambda^{P}}{\Gamma(P)} \cdot \frac{\Gamma(P+1)}{\lambda^{P+1}} = \frac{P\,\Gamma(P)}{\lambda\,\Gamma(P)} = \frac{P}{\lambda} \]

Gamma Function: $\Gamma(1/2) = \sqrt{\pi}$

\[ \Gamma(1/2) = \int_{0}^{\infty} t^{\frac{1}{2}-1} e^{-t}\,dt = \int_{0}^{\infty} \frac{e^{-t}}{\sqrt{t}}\,dt \]

Change of variable from $t$ to $z = \sqrt{t}$, so $t = z^{2}$ and $dt = 2z\,dz$:
\[ \Gamma(1/2) = \int_{0}^{\infty} \frac{e^{-z^{2}}}{z}\, 2z\,dz = 2 \int_{0}^{\infty} e^{-z^{2}}\,dz \]

Change of variable from $z$ to $x = \sqrt{2}\,z$, so $z = x/\sqrt{2}$ and $dz = (1/\sqrt{2})\,dx$:
\[ \Gamma(1/2) = 2 \int_{0}^{\infty} e^{-x^{2}/2}\, \frac{1}{\sqrt{2}}\,dx = \frac{2\sqrt{2\pi}}{\sqrt{2}} \int_{0}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}\,dx \]

The remaining integral is half the area under the standard normal density:
\[ \int_{0}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}\,dx = \frac{1}{2} \quad \text{so} \quad \Gamma(1/2) = \frac{2\sqrt{2\pi}}{\sqrt{2}} \cdot \frac{1}{2} = \sqrt{\pi} \]

Expected Value of a Linear Translation

Z = aX + b
E[Z] = aE[X] + b
Proof is trivial using the definition of the expected value and the fact that the density integrates to 1 to have E[b] = b.

Normal($\mu$, $\sigma$) Variable

From the definition of the random variable, $\mu$ is the mean.
Proof in Rice (119) uses the linear translation.
If X ~ N[0,1], $\sigma X + \mu$ ~ N($\mu$, $\sigma$)

Cauchy Random Variables

\[ f(x) = \frac{1}{\pi} \cdot \frac{1}{1 + x^{2}} \]
Mean does not exist. No higher moments exist.
If X ~ N[0,1] and Y ~ N[0,1] are independent, then X/Y has the Cauchy distribution.
Many applications obtain estimates of interesting quantities as ratios of estimators that are normally distributed.
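A mean exists only if the integral of $|x| f(x)$ is finite. For the Cauchy density, integrating over $[-A, A]$ gives $\ln(1 + A^{2})/\pi$, which keeps growing as $A$ increases. The sketch below checks this with a simple midpoint rule. The function names and step count are assumptions of this illustration, not part of the slides.

```python
from math import log, pi


def cauchy_pdf(x):
    """Standard Cauchy density f(x) = (1/pi) * 1/(1 + x^2)."""
    return 1.0 / (pi * (1.0 + x * x))


def truncated_abs_moment(A, steps=100_000):
    """Midpoint-rule approximation of the integral of |x| f(x) over [-A, A]."""
    h = A / steps
    # By symmetry, integrate x f(x) over [0, A] and double the result.
    half = sum(((i + 0.5) * h) * cauchy_pdf((i + 0.5) * h) for i in range(steps)) * h
    return 2.0 * half


for A in (10.0, 1_000.0):
    closed_form = log(1.0 + A * A) / pi
    print(A, truncated_abs_moment(A), closed_form)
```

Because the truncated integral grows like $2\ln(A)/\pi$, it has no finite limit, which is the sense in which the mean does not exist.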

Cauchy Random Sample

[Figure: histogram of a Cauchy random sample. Vertical axis: Frequency (0 to 1016). Horizontal axis: Z (about -562.137 to 498.357).]

Expected Value of a Function of X

Y = g(X)
One to one case: E[Y] = expected value of Y(X); find the distribution of the new variable. $E[g(X)] = \sum_{x} g(x) f(x)$ will equal E[Y].
Many to one case: similar argument. Proceed without the transformation of the random variable.
E[g(X)] is generally not equal to g(E[X]) if g(X) is not linear.

Linear Translation

Z = aX + b
E[Z] = E[aX + b]
E[Z] = aE[X] + b
Proof is trivial using the definition of the expected value and the fact that the density integrates to 1 to have E[b] = b.
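A small numeric illustration of the two statements above: for a nonlinear g, E[g(X)] and g(E[X]) differ, while for a linear translation they coincide. This is a sketch using a fair die as the example distribution. The helper `expect` and the constants a = 2, b = 3 are assumptions chosen for illustration.

```python
from fractions import Fraction

# Fair die: discrete uniform on 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def expect(g, pmf):
    """E[g(X)] = sum over x of g(x) * Prob(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())


def square(x):
    return x ** 2


mean_x = expect(lambda x: x, pmf)

# Nonlinear g: E[g(X)] and g(E[X]) differ.
e_of_g = expect(square, pmf)   # E[X^2]
g_of_e = square(mean_x)        # (E[X])^2

# Linear g(x) = a*x + b: the two agree.
a, b = 2, 3


def linear(x):
    return a * x + b


e_linear = expect(linear, pmf)
linear_of_e = linear(mean_x)

print("E[X^2] =", e_of_g, " (E[X])^2 =", g_of_e)
print("E[aX+b] =", e_linear, " aE[X]+b =", linear_of_e)
```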

Powers of x - Moments

Moment = $E[X^{k}]$ for positive integer k
Raw moment: $E[X^{k}]$
Central moment: $E[(X - E[X])^{k}]$
Standard notation:
$E[X^{k}] = \mu'_{k}$
$E[(X - E[X])^{k}] = \mu_{k}$
Mean = $\mu'_{1} = \mu$
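The raw and central moment definitions translate directly into code. The sketch below uses a hypothetical three-point distribution, chosen only for illustration and not taken from the slides, and exact fractions so that the values can be checked by hand.

```python
from fractions import Fraction

# Hypothetical pmf used only for illustration: Prob(0)=1/2, Prob(1)=1/4, Prob(2)=1/4.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}


def raw_moment(pmf, k):
    """Raw moment E[X^k]."""
    return sum(x ** k * p for x, p in pmf.items())


def central_moment(pmf, k):
    """Central moment E[(X - E[X])^k]."""
    mu = raw_moment(pmf, 1)
    return sum((x - mu) ** k * p for x, p in pmf.items())


for k in (1, 2, 3):
    print(k, raw_moment(pmf, k), central_moment(pmf, k))
```

The first central moment is always zero, and the second central moment is the variance.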

Variance as a g(X)

Variance = $E[(X - E[X])^{2}]$
Standard deviation = square root of variance is usually more interesting

Discrete:
\[ \mathrm{Var}[X] = \sum_{x} (x - \mu)^{2}\, \mathrm{Prob}(X = x) \]

Continuous:
\[ \mathrm{Var}[X] = \int_{-\infty}^{\infty} (x - \mu)^{2} f(x)\,dx \]

Variance of a Translation: Y = a + bX

Var[a] = 0
Var[bX] = $b^{2}$ Var[X]
Standard deviation of Y = |b| S.D.(X)

Shortcut

Var[X] = $E[X^{2}] - \{E[X]\}^{2}$

Uniform (0,1):
\[ E[X] = \frac{1}{2} \]
\[ E[X^{2}] = \int_{0}^{1} x^{2} \cdot 1\,dx = \left. \frac{x^{3}}{3} \right|_{0}^{1} = \frac{1}{3} \]
\[ \mathrm{Var}[X] = \frac{1}{3} - \left(\frac{1}{2}\right)^{2} = \frac{1}{12} \]
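The shortcut and the translation rule can be verified together for the Uniform(0,1) case. This sketch uses the fact that the integral of $x^{k}$ over [0, 1] is $1/(k+1)$. The constants a = 5 and b = -3 are hypothetical values chosen for illustration.

```python
from fractions import Fraction


def uniform01_raw_moment(k):
    """E[X^k] for X ~ Uniform(0,1): integral of x^k over [0,1] = 1/(k+1)."""
    return Fraction(1, k + 1)


# Shortcut: Var[X] = E[X^2] - (E[X])^2.
e_x = uniform01_raw_moment(1)
e_x2 = uniform01_raw_moment(2)
var_x = e_x2 - e_x ** 2

# Translation Y = a + bX with hypothetical constants.
a, b = Fraction(5), Fraction(-3)
e_y = a + b * e_x
# E[Y^2] = E[(a + bX)^2] = a^2 + 2ab E[X] + b^2 E[X^2].
e_y2 = a ** 2 + 2 * a * b * e_x + b ** 2 * e_x2
var_y = e_y2 - e_y ** 2

print("Var[X] =", var_x)
print("Var[a + bX] =", var_y, " b^2 Var[X] =", b ** 2 * var_x)
```

The additive constant a drops out of the variance entirely, and only $b^{2}$ scales it.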

Bernoulli

Prob(X = 1) = $\theta$; Prob(X = 0) = $1 - \theta$
E[X] = $0(1 - \theta) + 1\theta = \theta$
E[$X^{2}$] = $0^{2}(1 - \theta) + 1^{2}\theta = \theta$
Var[X] = $\theta - \theta^{2} = \theta(1 - \theta)$

Poisson: Factorial Moment

\[ E[X^{2} - X] = E[X(X-1)] = \sum_{x=0}^{\infty} x(x-1)\, \frac{e^{-\lambda}\lambda^{x}}{x!} = \sum_{x=2}^{\infty} x(x-1)\, \frac{e^{-\lambda}\lambda^{x}}{x!} = \sum_{x=2}^{\infty} \frac{e^{-\lambda}\lambda^{x}}{(x-2)!} \]

Now let z = x - 2; the sum is
\[ \sum_{z=0}^{\infty} \frac{e^{-\lambda}\lambda^{z+2}}{z!} = \lambda^{2} \sum_{z=0}^{\infty} \frac{e^{-\lambda}\lambda^{z}}{z!} = \lambda^{2} \]

\[ E[X^{2}] - E[X] = \lambda^{2} \quad \text{so} \quad E[X^{2}] = \lambda^{2} + \lambda \]
\[ \mathrm{Var}[X] = E[X^{2}] - \{E[X]\}^{2} = \lambda^{2} + \lambda - \lambda^{2} = \lambda \]

Variance of Poisson variable equals the mean.
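The factorial moment argument can be checked numerically for a specific $\lambda$. In this sketch the value $\lambda = 5$, the truncation point of 200 terms, and the helper name are assumptions. The probabilities are built recursively from Prob(X = x) = Prob(X = x-1) · λ/x to avoid large factorials.

```python
from math import exp


def poisson_pmf_table(lam, x_max):
    """Prob(X = x) for x = 0..x_max, using p(x) = p(x-1) * lam / x."""
    probs = [exp(-lam)]
    for x in range(1, x_max + 1):
        probs.append(probs[-1] * lam / x)
    return probs


lam = 5.0
probs = poisson_pmf_table(lam, 200)

mean = sum(x * p for x, p in enumerate(probs))
factorial_moment = sum(x * (x - 1) * p for x, p in enumerate(probs))  # E[X(X-1)]
second_moment = factorial_moment + mean                               # E[X^2]
variance = second_moment - mean ** 2

print("E[X(X-1)] =", factorial_moment, " lambda^2 =", lam ** 2)
print("Var[X]    =", variance, " lambda   =", lam)
```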
Normal Moments

x = Normal[$\mu$, $\sigma$] = $\mu + \sigma\,$N[0,1]
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right], \quad -\infty < x < +\infty
\]
Mean = $\mu$
Standard deviation = $\sigma$

Gamma Random Variable

\[
f(x) = \frac{\lambda^{P} e^{-\lambda x} x^{P-1}}{\Gamma(P)}, \quad x \ge 0,\ \lambda > 0,\ P > 0
\]
Expected value is
\[
\int_0^\infty x\,\frac{\lambda^{P} e^{-\lambda x} x^{P-1}}{\Gamma(P)}\,dx
= \frac{\lambda^{P}}{\Gamma(P)}\int_0^\infty e^{-\lambda x} x^{P}\,dx
= \frac{\lambda^{P}}{\Gamma(P)}\cdot\frac{\Gamma(P+1)}{\lambda^{P+1}}
= \frac{P}{\lambda}
\]
Expected square is
\[
\int_0^\infty x^{2}\,\frac{\lambda^{P} e^{-\lambda x} x^{P-1}}{\Gamma(P)}\,dx
= \frac{\lambda^{P}}{\Gamma(P)}\int_0^\infty e^{-\lambda x} x^{P+1}\,dx
= \frac{\lambda^{P}}{\Gamma(P)}\cdot\frac{\Gamma(P+2)}{\lambda^{P+2}}
= \frac{(P+1)\Gamma(P+1)}{\lambda^{2}\Gamma(P)}
= \frac{(P+1)P\,\Gamma(P)}{\lambda^{2}\Gamma(P)}
= \frac{P(P+1)}{\lambda^{2}}.
\]
\[
\text{Variance} = \frac{P(P+1)}{\lambda^{2}} - \frac{P^{2}}{\lambda^{2}} = \frac{P}{\lambda^{2}}
\]

Chi Squared [1]

Chi squared [1] = Gamma(½, ½): P = ½, $\lambda$ = ½
Mean = $P/\lambda$ = (½)/(½) = 1
Variance = $P/\lambda^{2}$ = (½)/[(½)²] = 2

Higher Moments

Skewness: $\mu_3$. 0 for all symmetric distributions (not just the normal)
Standardized measure $\mu_3/\sigma^3$
Kurtosis: $\mu_4$. Standardized $\mu_4/\sigma^4$. Compare to normal, 3
Degree of excess = $\mu_4/\sigma^4 - 3$.
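As a hedged illustration of the Gamma, chi-squared [1], and higher-moment slides, the sketch below computes the mean, variance, standardized skewness, and excess kurtosis. It assumes the general raw-moment formula E[X^r] = Γ(P+r)/(Γ(P)λ^r), which comes from the same integral used above for r = 1 and r = 2; the function names are illustrative.

```python
import math

def gamma_raw_moment(r, P, lam):
    """E[X^r] for the Gamma(P, lam) density: Gamma(P + r) / (Gamma(P) * lam**r)."""
    return math.gamma(P + r) / (math.gamma(P) * lam ** r)

def standardized_moments(P, lam):
    """Return (mean, variance, mu3/sigma^3, mu4/sigma^4 - 3)."""
    m1, m2, m3, m4 = (gamma_raw_moment(r, P, lam) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    # Central moments written in terms of raw moments.
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    skewness = mu3 / var ** 1.5
    excess_kurtosis = mu4 / var ** 2 - 3
    return m1, var, skewness, excess_kurtosis

# Chi squared [1] = Gamma(1/2, 1/2)
chi2_mean, chi2_var, chi2_skew, chi2_excess = standardized_moments(0.5, 0.5)
print(chi2_mean, chi2_var, chi2_skew, chi2_excess)
```

For chi-squared [1] this reproduces mean 1 and variance 2. The positive skewness and large excess kurtosis show how far the distribution is from the normal benchmark.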
Symmetric and Skewed Distributions

[Figure]

Kurtosis: t[5] vs. Normal

Kurtosis of normal(0,1) = 3, Excess = 0
Excess Kurtosis of t[k] = 6/(k-4); for t[5] = 6/(5-4) = 6.

Approximations for g(X)

g(X) = continuous function
g($\mu$) exists
Continuous first derivative not equal to zero at $\mu$
Taylor series approximation around $\mu$:
\[
g(X) = g(\mu) + g'(\mu)(X-\mu) + \tfrac{1}{2} g''(\mu)(X-\mu)^{2} \quad (+\ \text{higher order terms})
\]

Approximation to the Mean

\[
g(X) \approx g(\mu) + g'(\mu)(X-\mu) + \tfrac{1}{2} g''(\mu)(X-\mu)^{2}
\]
\[
E[g(X)] \approx E[\text{approximation}]
= g(\mu) + 0 + \tfrac{1}{2} g''(\mu)\,E[(X-\mu)^{2}]
= g(\mu) + \tfrac{1}{2} g''(\mu)\,\sigma^{2}
\]

Example: X ~ N[$\mu$, $\sigma^2$], g(X) = exp(X).

True mean = $\exp(\mu + \sigma^{2}/2)$.
Approximation: = $\exp(\mu) + \tfrac{1}{2}\exp(\mu)\,\sigma^{2}$
Example: $\mu$ = 0, $\sigma$ = 1,
True mean = exp(.5) = 1.6487
Approximation = exp(0) + .5*exp(0)*1 = 1.5000

Delta method: Var[g(X)]

Use linear approximation
\[
g(X) \approx g(\mu) + g'(\mu)(X-\mu)
\]
\[
\mathrm{Var}[g(X)] \approx \mathrm{Var}[\text{approximation}] = [g'(\mu)]^{2}\sigma^{2}
\]
Example: $\mathrm{Var}[X^{2}] \approx (2\mu)^{2}\sigma^{2}$
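To see how the two approximations behave on the Var[X²] example, here is a minimal sketch. The helper names and the values of `mu` and `sigma` are illustrative. The "exact" variance, 4μ²σ² + 2σ⁴, additionally assumes that X is normally distributed, which the slide does not state.

```python
def taylor_mean(g, g2, mu, sigma):
    """Second-order approximation: E[g(X)] ~ g(mu) + 0.5 * g''(mu) * sigma^2."""
    return g(mu) + 0.5 * g2(mu) * sigma ** 2

def delta_variance(g1, mu, sigma):
    """Delta method: Var[g(X)] ~ [g'(mu)]^2 * sigma^2."""
    return g1(mu) ** 2 * sigma ** 2

mu, sigma = 5.0, 0.5                          # illustrative values

# g(X) = X^2, so g'(x) = 2x and g''(x) = 2.
approx_mean = taylor_mean(lambda x: x ** 2, lambda x: 2.0, mu, sigma)
approx_var = delta_variance(lambda x: 2.0 * x, mu, sigma)

exact_mean = mu ** 2 + sigma ** 2             # E[X^2] for any X with this mean and variance
exact_var = 4 * mu ** 2 * sigma ** 2 + 2 * sigma ** 4   # assumes X is normal

print(approx_mean, exact_mean)
print(approx_var, exact_var)
```

Because g is quadratic, the second-order mean approximation is exact here. The delta-method variance omits the 2σ⁴ term, so it is close only when σ is small relative to μ.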
Delta Method – x ~ N[$\mu$, $\sigma^2$]

y = g(x) = exp(x) ~ lognormal

Exact:
E[y] = $\exp(\mu + \tfrac{1}{2}\sigma^{2})$
Var[y] = $\exp(2\mu + \sigma^{2})[\exp(\sigma^{2}) - 1]$

Approximate:
E*[y] = $\exp(\mu) + \tfrac{1}{2}\exp(\mu)\,\sigma^{2}$
V*[y] = $[\exp(\mu)]^{2}\sigma^{2}$

N[0,1], exact mean and variance are exp(.5) = 1.648 and exp(1)(exp(1)-1) = 4.671. Approximations are 1.5 and 1 (!)

Moment Generating Function

Let g(X) = exp(tX)
M(t) = E[exp(tX)] = the moment generating function for random variable X.
\[
M(t) = \sum_{x} e^{tx} p(x) \quad \text{or} \quad \int_{x} e^{tx} f(x)\,dx
\]
If M(t) exists in a neighborhood of zero then M(t) <==> p(x) or f(x)
One to one correspondence between probability distribution and moment generating function.
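The two slides above are linked: for x ~ N[0,1] and y = exp(x), E[y] = M(1) and E[y²] = M(2), where M is the moment generating function of x. The sketch below evaluates M(t) straight from its integral definition with a trapezoid rule. The integration range and step count are illustrative assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mgf_numeric(t, mu=0.0, sigma=1.0, half_width=12.0, steps=24000):
    """Trapezoid-rule approximation of M(t) = integral of exp(t*x) f(x) dx."""
    lo = mu - half_width * sigma
    hi = mu + half_width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # half weight at the endpoints
        total += weight * math.exp(t * x) * normal_pdf(x, mu, sigma)
    return total * h

exact_mean = mgf_numeric(1.0)                      # E[exp(x)] = M(1)
exact_var = mgf_numeric(2.0) - exact_mean ** 2     # E[exp(2x)] - (E[exp(x)])^2

approx_mean = math.exp(0.0) + 0.5 * math.exp(0.0) * 1.0   # Taylor approximation
approx_var = math.exp(0.0) ** 2 * 1.0                     # delta method

print(exact_mean, approx_mean)
print(exact_var, approx_var)
```

The numerical values agree with the exact lognormal mean and variance quoted above, and they make the size of the approximation error explicit.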
MGF Bernoulli

P(x) = $(1-\theta)$ for x = 0 and $\theta$ for x = 1
E[exp(tX)] = $(1-\theta)\exp(0t) + \theta\exp(1t) = (1-\theta) + \theta\exp(t)$.

MGF Poisson

\[
M(t) = \sum_{x=0}^{\infty} e^{tx}\,\frac{e^{-\lambda}\lambda^{x}}{x!}
= e^{-\lambda}\sum_{x=0}^{\infty} \frac{(\lambda e^{t})^{x}}{x!}
\]
Result: $\sum_{x=0}^{\infty} \dfrac{a^{x}}{x!} = e^{a}$
\[
M(t) = e^{-\lambda} e^{\lambda e^{t}} = \exp[\lambda(e^{t} - 1)]
\]

MGF Gamma

\[
E[\exp(tx)] = \int_0^\infty e^{tx}\,\frac{\lambda^{P} x^{P-1} e^{-\lambda x}}{\Gamma(P)}\,dx
= \frac{\lambda^{P}}{\Gamma(P)}\int_0^\infty x^{P-1} e^{-(\lambda - t)x}\,dx
= \frac{\lambda^{P}}{\Gamma(P)}\cdot\frac{\Gamma(P)}{(\lambda - t)^{P}}
= \left(\frac{\lambda}{\lambda - t}\right)^{P}
\]
(The integral converges for $t < \lambda$.)

MGF Normal

$M_X(t)$ for X ~ N[0,1] is $\exp(\tfrac{1}{2}t^{2})$
$M_Y(t)$ for Y = $\sigma X + \mu$ is $\exp(\mu t)\,M_X(\sigma t) = \exp[\mu t + \tfrac{1}{2}\sigma^{2}t^{2}]$
This is the moment generating function for N[$\mu$, $\sigma^2$]

Generating the Moments

The rth derivative of M(t) evaluated at t = 0 gives the rth raw moment, $\mu_r'$:
\[
M^{(r)}(0) = \left.\frac{d^{r}M(t)}{dt^{r}}\right|_{t=0} = \mu_r'
\]
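The moment-generating property can be illustrated with central finite differences applied to the normal MGF. This is a numerical sketch only: the step size `h` and the parameter values are illustrative assumptions, and finite differences are a stand-in for the analytic derivatives.

```python
import math

def normal_mgf(t, mu, sigma):
    """M(t) = exp(mu*t + 0.5*sigma^2*t^2) for N[mu, sigma^2]."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

def first_derivative_at_zero(M, h=1e-3):
    """Central difference approximation to M'(0)."""
    return (M(h) - M(-h)) / (2.0 * h)

def second_derivative_at_zero(M, h=1e-3):
    """Central difference approximation to M''(0)."""
    return (M(h) - 2.0 * M(0.0) + M(-h)) / h ** 2

mu, sigma = 1.5, 2.0                                   # illustrative values
M = lambda t: normal_mgf(t, mu, sigma)

raw1 = first_derivative_at_zero(M)                     # should be close to mu
raw2 = second_derivative_at_zero(M)                    # should be close to mu^2 + sigma^2
variance = raw2 - raw1 ** 2                            # should be close to sigma^2
print(raw1, raw2, variance)
```

The first two derivatives at zero recover E[X] and E[X²], and their combination recovers the variance.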
Poisson MGF

$M(t) = \exp(\lambda(\exp(t) - 1))$; $M(0) = 1$
$M'(t) = \lambda M(t)\exp(t)$; $M'(0) = \lambda$
$\mu_1' = M'(0) = \lambda$
$\mu_2' = E[X^{2}] = M''(0) = \lambda[M'(0)\exp(0) + \exp(0)M(0)] = \lambda^{2} + \lambda$
Variance = $\mu_2' - \lambda^{2} = \lambda$
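A short check of these Poisson results: the sketch compares the closed-form MGF with a direct evaluation of E[exp(tX)] as a truncated sum, then applies the derivative formulas at t = 0. The values of `lam` and `t` and the truncation point are illustrative assumptions.

```python
import math

def poisson_mgf(t, lam):
    """Closed form M(t) = exp(lam * (exp(t) - 1))."""
    return math.exp(lam * (math.exp(t) - 1.0))

def poisson_mgf_series(t, lam, x_max=200):
    """Direct sum of exp(t*x) * exp(-lam) * lam^x / x! for x = 0..x_max."""
    term = math.exp(-lam)                      # x = 0 term
    total = 0.0
    for x in range(x_max + 1):
        total += term
        term *= lam * math.exp(t) / (x + 1)    # ratio of consecutive terms
    return total

lam, t = 2.0, 0.5                              # illustrative values
closed_form = poisson_mgf(t, lam)
series = poisson_mgf_series(t, lam)

# Derivatives at zero, using M'(t) = lam*exp(t)*M(t) and
# M''(t) = lam*[M'(t)*exp(t) + exp(t)*M(t)].
M0 = poisson_mgf(0.0, lam)
M1 = lam * math.exp(0.0) * M0
M2 = lam * (M1 * math.exp(0.0) + math.exp(0.0) * M0)
variance = M2 - M1 ** 2
print(closed_form, series, M1, M2, variance)
```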
Useful Properties

If the MGF of X is $M_X(t)$ and y = a + bX, then $M_Y(t)$ for y is $\exp(at)\,M_X(bt)$
For independent X and Y, $M_{X+Y}(t) = M_X(t)\,M_Y(t)$
The sequence of moments does not uniquely define the distribution
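Both MGF properties can be verified on the fair die used earlier in these notes. In this sketch the helper names and the constants a = 2, b = 3, t = 0.3 are illustrative assumptions; distributions are stored as dictionaries mapping values to probabilities.

```python
import math

die = {x: 1.0 / 6.0 for x in range(1, 7)}      # discrete uniform, J = 6

def mgf(pmf, t):
    """M(t) = sum over x of exp(t*x) * p(x)."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

def sum_of_independent(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent (discrete convolution)."""
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

def affine(pmf, a, b):
    """pmf of a + b*X."""
    return {a + b * x: p for x, p in pmf.items()}

t = 0.3
two_dice = sum_of_independent(die, die)
lhs_sum, rhs_sum = mgf(two_dice, t), mgf(die, t) ** 2
lhs_affine = mgf(affine(die, 2, 3), t)
rhs_affine = math.exp(2 * t) * mgf(die, 3 * t)
print(lhs_sum, rhs_sum, lhs_affine, rhs_affine)
```

Each pair of printed values should agree up to floating-point rounding.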
Side Results

MGF $M_X(t)$ = E[exp(tx)] does not always exist.
Characteristic function E[exp(itx)] always exists. Used to prove central limit theorems
Cumulant generating function log $M_X(t)$ is sometimes useful. Cumulants are functions of moments. First cumulant is the mean, second is the variance.
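As a hedged illustration of the cumulant statement, the sketch below differentiates K(t) = log M(t) numerically at t = 0 for the Gamma MGF derived earlier, $(\lambda/(\lambda-t))^P$. The parameter values and the finite-difference step are illustrative assumptions.

```python
import math

def gamma_mgf(t, P, lam):
    """M(t) = (lam / (lam - t))^P, valid for t < lam."""
    if t >= lam:
        raise ValueError("Gamma MGF requires t < lam")
    return (lam / (lam - t)) ** P

def cumulants_1_2(M, h=1e-3):
    """First two cumulants from central differences of K(t) = log M(t) at t = 0."""
    K = lambda t: math.log(M(t))
    k1 = (K(h) - K(-h)) / (2.0 * h)
    k2 = (K(h) - 2.0 * K(0.0) + K(-h)) / h ** 2
    return k1, k2

P, lam = 3.0, 2.0                               # illustrative values
k1, k2 = cumulants_1_2(lambda t: gamma_mgf(t, P, lam))
print(k1, P / lam)        # first cumulant vs. mean P/lam
print(k2, P / lam ** 2)   # second cumulant vs. variance P/lam^2
```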
Part 2 – B Covariance and Correlation

Covariance

Random variables X,Y with joint discrete distribution p(X,Y) or continuous density f(x,y).
Covariance = E({X – E[X]}{Y – E[Y]}) = E[XY] – E[X] E[Y].
(Note, Covariance of X,X = Var[X].)
Connection to joint distribution and covariation

Correlation and Covariance

\[
\text{Correlation Coefficient} = \rho = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}
\]
By Cauchy-Schwarz inequality, $-1 \le \rho \le 1$.
$\rho = +1$ if and only if Y = a + bX with b > 0
$\rho = -1$ if and only if Y = a + bX with b < 0

Correlated Populations

[Figure]
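The covariance and correlation definitions above can be worked through on a small joint distribution. Both probability tables in this sketch are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def moments(joint):
    """Return (E[X], E[Y], Var[X], Var[Y], Cov[X,Y]) for a joint pmf {(x, y): prob}."""
    ex = sum(p * x for (x, y), p in joint.items())
    ey = sum(p * y for (x, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    var_x = sum(p * (x - ex) ** 2 for (x, y), p in joint.items())
    var_y = sum(p * (y - ey) ** 2 for (x, y), p in joint.items())
    cov = exy - ex * ey                      # E[XY] - E[X]E[Y]
    return ex, ey, var_x, var_y, cov

def correlation(joint):
    _, _, var_x, var_y, cov = moments(joint)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical joint distribution of two 0/1 variables.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

# Hypothetical exact linear relation Y = 2 - 3X (b < 0).
linear_down = {(x, 2 - 3 * x): p for x, p in [(0, 0.2), (1, 0.5), (2, 0.3)]}

print(moments(joint)[4], correlation(joint))
print(correlation(linear_down))
```

In the first table the covariance is E[XY] – E[X]E[Y] = 0.4 – (0.5)(0.6) = 0.1. In the second, the exact linear relation with negative slope drives the correlation to the lower bound.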
Correlated Variables

X1 and X2 are independent with means 0 and standard deviations 1.
Y = aX1 + bX2. Choose a and b such that X1 and Y have means 0, standard deviation 1 and correlation $\rho$.
Var[Y] = $a^{2} + b^{2}$ = 1
Cov[X1,Y] = a = $\rho$.
b = $\sqrt{1 - \rho^{2}}$
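The construction above can be tried in a small simulation. The slide requires only that X1 and X2 be independent with mean 0 and standard deviation 1; drawing them from the standard normal is an assumption made here for convenience, as are the sample size, seed, and target correlation.

```python
import math
import random

def simulate_correlated(rho, n, seed=12345):
    """Draw n pairs (X1, Y) with Y = rho*X1 + sqrt(1 - rho^2)*X2."""
    rng = random.Random(seed)
    a, b = rho, math.sqrt(1.0 - rho ** 2)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]   # assumed normal draws
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [a * u + b * v for u, v in zip(x1, x2)]
    return x1, y

def sample_stats(x, y):
    """Return (sd of x, sd of y, sample correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x) / (n - 1)
    syy = sum((v - my) ** 2 for v in y) / (n - 1)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y)) / (n - 1)
    return math.sqrt(sxx), math.sqrt(syy), sxy / math.sqrt(sxx * syy)

x1, y = simulate_correlated(rho=0.6, n=50000)
sd_x1, sd_y, r = sample_stats(x1, y)
print(sd_x1, sd_y, r)
```

With a sample this large, the sample standard deviations should be near 1 and the sample correlation near the target value of 0.6.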
Conditional Distributions

\(f(y|x) = f(y,x) / f(x)\)
Conditional distribution of y given a realization of x
Conditional mean = mean of the conditional random variable = regression function
Conditional variance = variance of the conditional random variable = scedastic function

Litigation Risk Analysis

Form probability tree for decisions and outcomes
Determine conditional expected payoffs (gains or losses)
Choose strategy to optimize expected value of payoff function (minimize loss or maximize (net) gain).

Litigation Risk Analysis: Using Probabilities to Determine a Strategy

P(Upper path) = P(Causation|Liability,Document) P(Liability|Document) P(Document)
= P(Causation,Liability|Document) P(Document)
= P(Causation,Liability,Document)
= .7(.6)(.4) = .168.
(Similarly for lower path, probability = .5(.3)(.6) = .09.)
Two paths to a favorable outcome. Probability = (upper) .7(.6)(.4) + (lower) .5(.3)(.6) = .168 + .09 = .258.
How can I use this to decide whether to litigate or not?
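One way to turn the path probabilities into a decision is to compare the expected payoff with the cost of litigating. The sketch below is an illustration under stated assumptions, not the method of the slides: it assumes the cost is paid whatever the outcome, that an unfavorable outcome pays nothing, and that the decision maker is risk neutral. The function names are hypothetical; the $1,000,000 cost and $3,000,000 payoff are the figures posed with this example.

```python
def path_probability(*conditional_probs):
    """Multiply the conditional probabilities along one path of the tree."""
    p = 1.0
    for q in conditional_probs:
        p *= q
    return p


def expected_net_payoff(p_favorable, payoff, cost):
    """Expected net gain from litigating.

    Assumptions (not stated on the slide): the cost is paid whatever the
    outcome, and an unfavorable outcome pays nothing.
    """
    return p_favorable * payoff - cost


upper = path_probability(0.7, 0.6, 0.4)   # upper path, .168
lower = path_probability(0.5, 0.3, 0.6)   # lower path, .09
p_favorable = upper + lower               # two paths to a favorable outcome

net = expected_net_payoff(p_favorable, payoff=3_000_000, cost=1_000_000)
print(round(p_favorable, 3), round(net))  # 0.258 -226000
```

Under these assumptions the expected net payoff is negative, because .258 is below the break-even probability cost/payoff = 1/3. A different cost structure or attitude to risk could change the conclusion.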
Suppose the cost to litigate = $1,000,000 and a favorable outcome pays $3,000,000. What should you do?

Joint Normal Random Variables

\[ f(x,y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2 \right] \right\} \]
\(\rho\) = Correlation of X and Y.
\(\rho\sigma_x\sigma_y\) = Covariance of X and Y

Conditional Normal

\[ f(y|x) = \frac{1}{\sigma_y\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2} \left[ \frac{y - [\mu_y + (\rho\sigma_y/\sigma_x)(x-\mu_x)]}{\sigma_y\sqrt{1-\rho^2}} \right]^2 \right\} = \frac{1}{\sigma_{y|x}\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left[ \frac{y-\mu_{y|x}}{\sigma_{y|x}} \right]^2 \right\} \]
Conditional Mean Function = \(E[y|x] = \mu_y + (\rho\sigma_y/\sigma_x)(x - \mu_x) = \alpha + \beta x\), with \(\beta = \rho\sigma_y/\sigma_x\) and \(\alpha = \mu_y - \beta\mu_x\)
Conditional Variance Function = \(\mathrm{Var}[y|x] = \sigma_y^2(1-\rho^2)\) (not a function of x)
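The conditional mean and variance of the bivariate normal can be evaluated directly from the five parameters. The following is a small sketch; the function names `conditional_normal` and `conditional_density` and the parameter values in the usage lines are illustrative, not taken from the slides.

```python
import math


def conditional_normal(mu_x, mu_y, sigma_x, sigma_y, rho, x):
    """Return (E[y|x], Var[y|x]) for a bivariate normal pair (x, y)."""
    beta = rho * sigma_y / sigma_x                 # slope of the regression function
    cond_mean = mu_y + beta * (x - mu_x)
    cond_var = sigma_y ** 2 * (1.0 - rho ** 2)     # does not depend on x
    return cond_mean, cond_var


def conditional_density(y, mu_x, mu_y, sigma_x, sigma_y, rho, x):
    """Evaluate f(y|x): a normal density with the conditional mean and variance."""
    m, v = conditional_normal(mu_x, mu_y, sigma_x, sigma_y, rho, x)
    return math.exp(-0.5 * (y - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)


mean, var = conditional_normal(mu_x=0.0, mu_y=0.0, sigma_x=1.0, sigma_y=2.0, rho=0.5, x=1.0)
print(mean, var)  # 1.0 3.0
```

Changing `x` moves the conditional mean along the regression line but leaves the conditional variance unchanged, which is the point of the "(not a function of x)" remark.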
Y and Y|X

[Figure: axes labeled Y and X]
Application: Conditional Expected Profits and Risk

You must decide how many copies of your self-published novel to print. Based on market research, you believe the following distribution describes X, your likely sales (demand).

x     P(X=x)
25    .10
40    .30
55    .45
70    .15

(Note: Sales are in thousands. Convert your final result to dollars after all computations are done by multiplying your final results by $1,000.)

Printing costs are $1.25 per book. (It's a small book.) The selling price will be $3.25. Any unsold books that you print must be discarded (at a loss of $1.25/copy). You must decide how many copies of the book to print: 25, 40, 55 or 70. (You are committed to one of these four; 0 is not an option.)

A. What is the expected number of copies demanded?
B. What is the standard deviation of the number of copies demanded?
C. Which of the four print runs shown maximizes your expected profit? Compute all four.
D. Which of the four print runs is least risky, i.e., minimizes the standard deviation of the profit (given the number printed)? Compute all four.
E. Based on C. and D., which of the four print runs seems best for you?
X = Sales (Demand)

x         P(X=x)
25,000    .10
40,000    .30
55,000    .45
70,000    .15

A.
Expected Value = \(\sum_{\text{all values of } x} x\, P(X=x)\)
= .1(25,000) + .3(40,000) + .45(55,000) + .15(70,000) = 49,750

B. Standard Deviation
Get the Variance First
\(\sigma^2 = \sum_{\text{all values of } x} (x - E[x])^2\, P(X=x)\)
= \(.1(25,000 - 49,750)^2 + .3(40,000 - 49,750)^2 + .45(55,000 - 49,750)^2 + .15(70,000 - 49,750)^2\)
= 163,687,500
Standard Deviation = square root of variance.
= \(\sqrt{163,687,500}\) = 12,794.041

There is a shortcut:
\(\sigma^2 = \sum_{\text{all values of } x} x^2\, P(X=x) - \mu^2 = \sum_{\text{all values of } x} (x - E[x])^2\, P(X=x)\)
= \(.1(25,000^2) + .3(40,000^2) + .45(55,000^2) + .15(70,000^2) - 49,750^2\) = 163,687,500

x         P(X=x)
25,000    .10
40,000    .30
55,000    .45
70,000    .15

Revenue per book = $3.25
Cost per book = $1.25
Profit per book sold = $2.00/book

Expected Profit | Print Run = 25,000 is $2 × 25,000 = $50,000
(Demand is guaranteed to be at least 25,000)
Expected Profit | Print Run = 40,000 is $2 × .9 × 40,000 + .1 × ($2 × 25,000 - $1.25 × 15,000) = $75,125
(If print 40,000, .9 chance sell all and .1 chance sell only 25,000)
Expected Profit | Print Run = 55,000 is $2 × .6 × 55,000 + .1 × ($2 × 25,000 - $1.25 × 30,000) + .3 × ($2 × 40,000 - $1.25 × 15,000) = $85,625
Expected Profit | Print Run = 70,000 is $2 × .15 × 70,000 + .1 × ($2 × 25,000 - $1.25 × 45,000) + .3 × ($2 × 40,000 - $1.25 × 30,000) + .45 × ($2 × 55,000 - $1.25 × 15,000) = $74,187.50

[Figure: Expected Profit Given Print Run]
Variances

Print Run = 25,000. Variance = 0. Std. Dev. = 0. Demand will be at least 25,000.

Print Run = 40,000. Variance =
.1*[(2*25,000 - 1.25*15,000) - 75,125]^2 (if demand is only 25,000)
+ .9*[(2*40,000) - 75,125]^2 (if demand is at least 40,000)
Standard Deviation = square root = $14,625

Print Run = 55,000. Variance =
.1*[(2*25,000 - 1.25*30,000) - 85,625]^2 (if demand is only 25,000)
+ .3*[(2*40,000 - 1.25*15,000) - 85,625]^2 (if demand is 40,000)
+ .6*[(2*55,000) - 85,625]^2 (if demand is at least 55,000)
Standard Deviation = square root = $32,702.49

Print Run = 70,000.
Variance =
.1*[(2*25,000 - 1.25*45,000) - 74,187.5]^2 (if demand is only 25,000)
+ .3*[(2*40,000 - 1.25*30,000) - 74,187.5]^2 (if demand is 40,000)
+ .45*[(2*55,000 - 1.25*15,000) - 74,187.5]^2 (if demand is 55,000)
+ .15*[(2*70,000) - 74,187.5]^2 (if demand is 70,000)
Standard Deviation = square root = $41,580.64
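The demand moments and the expected profit and standard deviation for each print run can be reproduced with a few lines of code. This is a sketch of the arithmetic in the example, working in copies and dollars rather than thousands. The helper names `mean_and_sd` and `profit_distribution` are illustrative; profit is written as revenue on copies sold minus printing cost on all copies printed, which is the same as $2 per copy sold less $1.25 per unsold copy.

```python
import math

# Demand distribution from the example: {copies demanded: probability}.
DEMAND = {25_000: 0.10, 40_000: 0.30, 55_000: 0.45, 70_000: 0.15}
PRICE = 3.25   # selling price per copy
COST = 1.25    # printing cost per copy, paid on every copy printed


def mean_and_sd(outcomes):
    """Mean and standard deviation of a discrete distribution {value: prob}."""
    mean = sum(v * p for v, p in outcomes.items())
    var = sum((v - mean) ** 2 * p for v, p in outcomes.items())
    return mean, math.sqrt(var)


def profit_distribution(print_run):
    """Distribution of profit given the print run; sales = min(demand, print_run)."""
    dist = {}
    for demand, p in DEMAND.items():
        profit = PRICE * min(demand, print_run) - COST * print_run
        dist[profit] = dist.get(profit, 0.0) + p   # pool equal profit levels
    return dist


demand_mean, demand_sd = mean_and_sd(DEMAND)
print(f"demand: mean={demand_mean:,.0f} sd={demand_sd:,.2f}")
for run in DEMAND:
    m, s = mean_and_sd(profit_distribution(run))
    print(f"run={run:,}: expected profit={m:,.2f} sd={s:,.2f}")
```

The loop covers parts C and D together: each print run gets a conditional mean and a conditional standard deviation of profit, so the trade-off asked about in part E can be read off one table.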
[Figures: panels labeled Run=70,000, Run=55,000, Run=40,000, Run=25,000]
Useful Theorems - 1

\(E[Y] = E_X[E_Y[Y|X]]\)
Expectation over X of \(E_Y[Y|X]\)
Law of Iterated Expectations

Example: Hierarchical Model

\[ p(y|x) = \frac{e^{-\lambda x}(\lambda x)^y}{y!}, \quad y = 0, 1, \ldots, \quad \lambda > 0 \]
\[ f(x) = \theta e^{-\theta x}, \quad x \ge 0, \quad \theta > 0 \]
\[ f(y,x) = p(y|x) f(x) = \frac{\theta e^{-(\lambda+\theta)x}(\lambda x)^y}{y!}, \quad y = 0, 1, \ldots, \quad x \ge 0, \quad \lambda, \theta > 0 \]
\[ E[y] = \sum_{y=0}^{\infty} y\, p(y) = \sum_{y=0}^{\infty} y \int_0^{\infty} \frac{\theta e^{-(\lambda+\theta)x}(\lambda x)^y}{y!}\, dx \]
But, Y|X is Poisson with parameter \(\lambda x\), so \(E[y|x] = \lambda x\) and
\(E[Y] = E[E[Y|X]] = E[\lambda x] = \lambda E[X] = \lambda/\theta\).
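The shortcut through iterated expectations can be compared with the direct sum over y. The sketch below assumes the reading of the hierarchical model in which Y|X = x is Poisson with mean lam*x and X is exponential with rate theta; the symbol names `lam` and `theta` and the numerical values are assumptions for illustration. It uses the gamma integral to integrate x out, which gives p(y) = (theta/(lam+theta)) * (lam/(lam+theta))^y.

```python
def marginal_pmf(y, lam, theta):
    """p(y) after integrating x out of the joint density.

    Uses the gamma integral: the integral of x^y * exp(-c*x) over [0, inf)
    equals y!/c^(y+1), with c = lam + theta.
    """
    return (theta / (lam + theta)) * (lam / (lam + theta)) ** y


def mean_by_direct_sum(lam, theta, terms=2000):
    """E[y] as the sum over y of y*p(y), truncated after `terms` terms."""
    return sum(y * marginal_pmf(y, lam, theta) for y in range(terms))


def mean_by_iterated_expectations(lam, theta):
    """E[y] = E[E[y|x]] = E[lam*x] = lam*E[x] = lam/theta."""
    return lam / theta


lam, theta = 2.0, 0.5   # illustrative values, not from the source
print(mean_by_direct_sum(lam, theta), mean_by_iterated_expectations(lam, theta))
```

Both routes should agree up to truncation and rounding error; the iterated-expectations route needs no integration at all, which is why the theorem is useful here.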

Useful Theorems - 2
Decomposition of variance
Var[Y] = Var[E[Y|X]] + E[Var[Y|X]]

Bivariate Normal
\[ E[Y|x] = (\mu_Y - \beta\mu_X) + \beta x, \qquad \mathrm{Var}[Y|x] = \sigma_Y^2(1-\rho^2) \]
\[ E[\mathrm{Var}[Y|x]] = E[\sigma_Y^2(1-\rho^2)] = \sigma_Y^2(1-\rho^2) \]
\[ \mathrm{Var}[E[Y|x]] = \mathrm{Var}[\beta x] = \beta^2\sigma_X^2 = \frac{\sigma_{XY}^2}{\sigma_X^2} = \frac{\sigma_{XY}^2}{\sigma_X^2\sigma_Y^2}\,\sigma_Y^2 = \rho^2\sigma_Y^2 \]
(Here β = σ_XY/σ_X²; the next-to-last step multiplies and divides by σ_Y².)
\[ \mathrm{Var}[Y] = \sigma_Y^2(1-\rho^2) + \rho^2\sigma_Y^2 = \sigma_Y^2 \]

Useful Theorems - 3
Cov(X,Y) = Cov(X, E[Y|X])
In the hierarchical model, E[Y|x] = βx, so Cov(X,Y) = Cov(X, βX) = βVar[X].
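
The variance decomposition and the covariance result above are identities, so they can be checked exactly on any small joint distribution. The sketch below does this with exact fractions. The joint pmf is a hypothetical example made up for illustration; it does not come from the slides.

```python
from fractions import Fraction as F

# Hypothetical joint pmf p(x, y); the numbers are illustrative only.
pmf = {
    (0, 0): F(1, 8), (0, 1): F(1, 8),
    (1, 0): F(1, 8), (1, 2): F(3, 8),
    (2, 1): F(1, 8), (2, 4): F(1, 8),
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

def cond_moments(x0):
    """Return (E[Y|X=x0], Var[Y|X=x0])."""
    px = sum(p for (x, _), p in pmf.items() if x == x0)
    m = sum(p * y for (x, y), p in pmf.items() if x == x0) / px
    v = sum(p * (y - m) ** 2 for (x, y), p in pmf.items() if x == x0) / px
    return m, v

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_y = expect(lambda x, y: (y - mean_y) ** 2)

# Var[E[Y|X]] (centered at E[Y], since E[E[Y|X]] = E[Y]) and E[Var[Y|X]].
var_of_cond_mean = expect(lambda x, y: (cond_moments(x)[0] - mean_y) ** 2)
mean_of_cond_var = expect(lambda x, y: cond_moments(x)[1])

# Cov(X, Y) and Cov(X, E[Y|X]).
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))
cov_x_cond_mean = expect(lambda x, y: (x - mean_x) * (cond_moments(x)[0] - mean_y))

print(var_y == var_of_cond_mean + mean_of_cond_var)
print(cov_xy == cov_x_cond_mean)
```

Because the arithmetic is exact, both comparisons hold with equality rather than only up to rounding error.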

Mean Squared Error
Error of c as a predictor of Y is (Y - c).
Expected squared error is
\[ E_Y[(Y-c)^2] = E_Y[\{(Y-\mu) + (\mu-c)\}^2] = E_Y[(Y-\mu)^2] + E_Y[(\mu-c)^2] + 2E_Y[(Y-\mu)(\mu-c)] \]
(μ - c) is not a random variable and E[Y - μ] = 0, so the third term is zero, leaving
\[ E_Y[(Y-c)^2] = \mathrm{Var}[Y] + (\mu-c)^2 \]
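
As a quick numerical illustration of the result above, the sketch below uses the toss of a fair die (mean 3.5) and checks that the expected squared error of a constant predictor c equals Var[Y] + (μ - c)². The choice of the die and of the candidate values of c is an assumption made for the example.

```python
from fractions import Fraction as F

faces = range(1, 7)   # outcomes of a fair die
prob = F(1, 6)        # each face is equally likely

mu = sum(prob * y for y in faces)                # E[Y]
var = sum(prob * (y - mu) ** 2 for y in faces)   # Var[Y]

def mse(c):
    """Expected squared error E[(Y - c)^2] of the constant predictor c."""
    return sum(prob * (y - c) ** 2 for y in faces)

# The decomposition holds for every c; the MSE is smallest at c = mu.
for c in (F(2), mu, F(5)):
    assert mse(c) == var + (mu - c) ** 2
    print(c, mse(c))
```

The smallest expected squared error occurs at c = μ, where the second term vanishes and only Var[Y] remains.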

Minimum MSE Predictor
Error of h(X) as a predictor of Y = (Y - h(X))
Expected squared error for a given X is E_Y[(Y - h(X))² | X]
Expected squared error over X is E_X E_Y[(Y - h(X))² | X]
Add and subtract E[Y|X] = μ_{Y|X}:
\[ E_X E_Y[\{(Y-\mu_{Y|X}) + (\mu_{Y|X}-h(X))\}^2 \mid X] = E_X E_Y[(Y-\mu_{Y|X})^2 \mid X] + E_X E_Y[(\mu_{Y|X}-h(X))^2 \mid X] + 2E_X E_Y[(Y-\mu_{Y|X})(\mu_{Y|X}-h(X)) \mid X] \]
The first term is E_X[Var[Y|X]].
The second is E_X[(μ_{Y|X} - h(X))²], which equals 0 if h(X) = μ_{Y|X}.
E_Y[Y - μ_{Y|X} | X] = 0, so the third term is zero.
This implies that the minimum mean squared error predictor is the conditional mean function.
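
The following sketch compares the conditional mean with two other predictors in a case where everything can be computed exactly. It assumes, purely for illustration, that X is the toss of one fair die and Y is the sum of that die and a second independent die, so E[Y|X] = X + 3.5. The rival predictors are arbitrary choices.

```python
from fractions import Fraction as F
from itertools import product

faces = range(1, 7)
p = F(1, 36)  # probability of each (x, d) pair for two independent fair dice

def mse(h):
    """E[(Y - h(X))^2] where Y = X + D and D is the second die."""
    return sum(p * ((x + d) - h(x)) ** 2 for x, d in product(faces, faces))

predictors = {
    "conditional mean x + 7/2": lambda x: x + F(7, 2),
    "unconditional mean 7": lambda x: F(7),
    "doubling rule 2x": lambda x: 2 * x,
}

for name, h in predictors.items():
    print(name, mse(h))
```

The conditional mean attains E[Var[Y|X]] = 35/12, while both rivals have expected squared error 35/6.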

Variance of the Sum of X and Y
Var[X+Y] = E_Y E_X[{(X+Y) - (μX + μY)}²]
= E_Y E_X[{(X - μX) + (Y - μY)}²]
= E_Y E_X[(X - μX)²] + E_Y E_X[(Y - μY)²] + 2 E_Y E_X[(X - μX)(Y - μY)]
= E_X[(X - μX)²] + E_Y[(Y - μY)²] + 2 E_Y E_X[(X - μX)(Y - μY)]
= Var[X] + Var[Y] + 2Cov(X,Y)

Variance of Weighted Sum
Var[aX+bY] = Var[aX] + Var[bY] + 2Cov(aX,bY)
= a²Var[X] + b²Var[Y] + 2ab Cov(X,Y).
Also, Cov(X,Y) is the numerator in ρxy, so Cov(X,Y) = ρxy σx σy.
\[ \sigma_{ax+by}^2 = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\rho_{xy}\sigma_x\sigma_y \]

Application - Portfolio
You have $1000 to allocate between assets A and B.
The yearly returns on the two assets are random variables rA and rB.
The means of the two returns are E[rA] = μA and E[rB] = μB.
The standard deviations (risks) of the returns are σA and σB.
The correlation of the two returns is ρAB.
You will allocate a proportion w of your $1000 to A and (1-w) to B.

Risk and Return
Your expected return on each dollar is
E[wrA + (1-w)rB] = wμA + (1-w)μB
The variance of your return on each dollar is
Var[wrA + (1-w)rB] = w²σA² + (1-w)²σB² + 2w(1-w)ρABσAσB
The standard deviation is the square root.

Risk and Return: Example
Suppose you know μA, μB, ρAB, σA, and σB (You have watched these stocks for many years.)
The mean and standard deviation are then just functions of w.
I will then compute the mean and standard deviation for different values of w.
For example, μA = .04, μB = .07, σA = .02, σB = .06, ρAB = -.4
E[return] = w(.04) + (1-w)(.07) = .07 - .03w
SD[return] = sqrt[w²(.02²) + (1-w)²(.06²) + 2w(1-w)(-.4)(.02)(.06)]
= sqrt[.0004w² + .0036(1-w)² - .00096w(1-w)]
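
The computation described in the example can be sketched in a few lines. The code below uses the parameter values given above and evaluates the mean and standard deviation of the return on a grid of values of w. The function names and the grid spacing of 0.1 are choices made for this illustration.

```python
import math

# Parameter values from the example.
MU_A, MU_B = 0.04, 0.07
SD_A, SD_B = 0.02, 0.06
RHO_AB = -0.4

def portfolio_mean(w):
    """Expected return per dollar with proportion w in A and (1 - w) in B."""
    return w * MU_A + (1 - w) * MU_B

def portfolio_sd(w):
    """Standard deviation of the return per dollar."""
    variance = (w ** 2 * SD_A ** 2
                + (1 - w) ** 2 * SD_B ** 2
                + 2 * w * (1 - w) * RHO_AB * SD_A * SD_B)
    return math.sqrt(variance)

for i in range(11):
    w = i / 10
    print(f"w = {w:.1f}   mean = {portfolio_mean(w):.4f}   sd = {portfolio_sd(w):.4f}")
```

At the endpoints the portfolio is a single asset, so the standard deviation is σB at w = 0 and σA at w = 1. In between, the negative correlation pulls the risk below a simple weighted average of the two.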

Mean and Variance of a Sum
Random Variables: x1, x2, ..., xn
Means: μ1, μ2, ..., μn
Variances and Covariances: σij, i=1,...,n and j=1,...,n
Sum: x1 + x2 + ... + xn
Mean: μ1 + μ2 + ... + μn
Variance:
\[ E[\{(x_1-\mu_1) + \cdots + (x_n-\mu_n)\}^2] = E\left[\sum_{i=1}^n \sum_{j=1}^n (x_i-\mu_i)(x_j-\mu_j)\right] = \sum_{i=1}^n \sum_{j=1}^n E[(x_i-\mu_i)(x_j-\mu_j)] = \sum_{i=1}^n \sum_{j=1}^n \mathrm{Cov}(x_i,x_j) = \sum_{i=1}^n \sum_{j=1}^n \sigma_{ij} \]

Extension: Weighted Sum
Random Variables: x1, x2, ..., xn
Means: μ1, μ2, ..., μn
Variances and Covariances: σij, i=1,...,n and j=1,...,n
Weighted Sum: w1x1 + w2x2 + ... + wnxn
Mean:
\[ \sum_{i=1}^n w_i\mu_i \]
Variance:
\[ E[\{w_1(x_1-\mu_1) + \cdots + w_n(x_n-\mu_n)\}^2] = E\left[\sum_{i=1}^n w_i(x_i-\mu_i) \sum_{j=1}^n w_j(x_j-\mu_j)\right] = \sum_{i=1}^n \sum_{j=1}^n E[w_i w_j (x_i-\mu_i)(x_j-\mu_j)] = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \mathrm{Cov}(x_i,x_j) = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \sigma_{ij} \]

More General Portfolio Problem
Assets A1, A2, ..., An.
Expected returns μ1, ..., μn. Each is random with variance σii.
Covariance of return on asset i and asset j is σij, with σii = σi².
Portfolio is a set of weights w1, w2, w3, ..., wn that sum to 1.
Expected return is
\[ M = \sum_{i=1}^n w_i\mu_i \]
Variance of return is
\[ V = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \sigma_{ij} \]

Optimal Portfolio?
Minimize the risk while obtaining a specific expected return:
\[ \text{Minimize } V = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \sigma_{ij} \quad \text{subject to } M^* = \sum_{i=1}^n w_i\mu_i \text{ and } \sum_{i=1}^n w_i = 1 \]
Choose the set (vector) of weights to minimize V subject to achieving a specified expected return. (Mathematical programming problem.)
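
The sketch below evaluates M and V for a given weight vector, which is the building block any solver for this problem would need. It does not solve the optimization itself. The three-asset means, covariance matrix, and weights are hypothetical numbers chosen only to make the example run.

```python
def expected_return(weights, means):
    """M = sum_i w_i * mu_i."""
    return sum(w * m for w, m in zip(weights, means))

def return_variance(weights, cov):
    """V = sum_i sum_j w_i * w_j * sigma_ij, with cov[i][j] = sigma_ij."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

# Hypothetical inputs for three assets (illustrative values only).
means = [0.04, 0.07, 0.05]
cov = [
    [0.0004, -0.00048, 0.0001],
    [-0.00048, 0.0036, 0.0002],
    [0.0001, 0.0002, 0.0009],
]
weights = [0.5, 0.3, 0.2]

assert abs(sum(weights) - 1.0) < 1e-12  # the weights must sum to 1

M = expected_return(weights, means)
V = return_variance(weights, cov)
print(f"M = {M:.4f}, V = {V:.6f}, SD = {V ** 0.5:.4f}")
```

With only two assets the double sum reduces to the two-asset variance formula, which gives a convenient check on an implementation.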
Alternatively, maximize expected return subject to a specified risk:
\[ \text{Maximize } M = \sum_{i=1}^n w_i\mu_i \quad \text{subject to } V = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \sigma_{ij} \text{ and } \sum_{i=1}^n w_i = 1 \]

Sums of Independent Variables
Suppose P is sales of a store. The accounting period starts with total sales = 0.
On any given day, sales are random, normally distributed with mean μ and standard deviation σ. For example, mean $100,000 with standard deviation $10,000.
Sales on any given day, day t, are denoted Δt: Δ1 = sales on day 1, Δ2 = sales on day 2, and so on.
Total sales after T days will be Δ1 + Δ2 + … + ΔT.
Therefore, each Δt is the change in the total that occurs on day t.

Behavior of the Total
Let PT = Δ1 + Δ2 + … + ΔT be the total of the changes (variables) from times (observations) 1 to T. The sequence is
P1 = Δ1
P2 = Δ1 + Δ2
P3 = Δ1 + Δ2 + Δ3
And so on…
PT = Δ1 + Δ2 + Δ3 + … + ΔT

Summing
If the individual Δs are independent and each normally distributed with mean μ and standard deviation σ, then
P1 = Δ1 ~ Normal[μ, σ]
P2 = Δ1 + Δ2 ~ Normal[2μ, σ√2]
P3 = Δ1 + Δ2 + Δ3 ~ Normal[3μ, σ√3]
And so on… so that PT ~ N[Tμ, σ√T]
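
To see how the mean and standard deviation of the total grow with T, the sketch below applies the result PT ~ N[Tμ, σ√T] to the store example with mean $100,000 and standard deviation $10,000 per day. The function name and the particular horizons are choices made for this illustration, and the calculation assumes the daily sales are independent.

```python
import math

def total_distribution(mu, sigma, days):
    """Mean and standard deviation of P_T, the sum of `days` independent daily sales."""
    return days * mu, sigma * math.sqrt(days)

for days in (1, 4, 25, 100):
    mean, sd = total_distribution(100_000, 10_000, days)
    print(f"T = {days:3d}   mean = {mean:>12,.0f}   sd = {sd:>10,.0f}   sd/mean = {sd / mean:.4f}")
```

The mean grows in proportion to T while the standard deviation grows only with √T, so the standard deviation relative to the mean shrinks as the horizon lengthens.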

This Defines a Random Walk
The sequence is
P1 = Δ1
P2 = P1 + Δ2
P3 = P2 + Δ3
And so on…
PT = PT-1 + ΔT
It follows that
P1 = Δ1
P2 = Δ1 + Δ2
P3 = Δ1 + Δ2 + Δ3
And so on…
PT = Δ1 + Δ2 + Δ3 + … + ΔT

A Model for Stock Prices
Preliminary: Consider a sequence of T random outcomes, independent from one to the next, Δ1, Δ2, …, ΔT. (Δ is a standard symbol for “change” which will be appropriate for what we are doing here. And, we’ll use “t” instead of “i” to signify something to do with “time.”)
Δt comes from a normal distribution with mean μ and standard deviation σ.

A Model for Stock Prices
Random Walk Model: Today’s price = yesterday’s price + a change that is independent of all previous information.
Start at some known P0 so P1 = P0 + Δ1 and so on.
Assume μ = 0 (no systematic drift in the stock price).

Random Walk Simulations
Pt = Pt-1 + Δt, t = 1,2,…,100
Example: P0 = 10, Δt Normal with μ = 0, σ = 0.02
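
A simulation of this kind can be sketched with the Python standard library. The code below generates one path with P0 = 10, μ = 0, and σ = 0.02 over 100 steps. The function name, the seed argument, and the seed value are assumptions made for the illustration, so the path it produces is not the one plotted in the slides.

```python
import random

def simulate_random_walk(p0=10.0, mu=0.0, sigma=0.02, steps=100, seed=None):
    """Return [P_0, P_1, ..., P_steps] with P_t = P_{t-1} + Delta_t, Delta_t ~ Normal(mu, sigma)."""
    rng = random.Random(seed)  # a private generator so runs are reproducible
    path = [p0]
    for _ in range(steps):
        path.append(path[-1] + rng.gauss(mu, sigma))
    return path

path = simulate_random_walk(seed=12345)
print(len(path), path[0], round(path[-1], 4))
print("lowest:", round(min(path), 4), "highest:", round(max(path), 4))
```

Changing the seed gives a different path each time, which is a useful reminder that an apparent trend in any one path can arise with no drift at all.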
Random Walk?
Dow Jones, March 27 to May 26, 2011.
Uncertainty
Expected price: $E[P_T] = P_0 + T\mu$. We have used $\mu = 0$ (no systematic upward or downward drift).
Standard deviation $= \sigma\sqrt{T}$ reflects uncertainty or "risk."
Looking forward from "now" (time $t = 0$), the uncertainty increases the farther out we look into the future.

Expected Range
$[P_0 + \mu t] \pm 2\sigma\sqrt{t}$
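A rough numerical check of the expected range, assuming the example values from the random walk slide ($P_0 = 10$, $\mu = 0$, $\sigma = 0.02$, $t = 100$). The helper names `price_band` and `simulate_terminal_prices`, the number of paths, and the seed are illustrative assumptions. Because the terminal price is a sum of normal increments, about 95% of simulated terminal prices should fall inside the two-standard-deviation band.

```python
import math
import random


def price_band(p0, mu, sigma, t, width=2.0):
    """Return (low, high) = (P0 + mu*t) -/+ width * sigma * sqrt(t)."""
    center = p0 + mu * t
    half = width * sigma * math.sqrt(t)
    return center - half, center + half


def simulate_terminal_prices(p0, mu, sigma, t, n_paths, seed=0):
    """Simulate n_paths random walks of t steps and return each final price P_t."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        p = p0
        for _ in range(t):
            p += rng.gauss(mu, sigma)
        finals.append(p)
    return finals


low, high = price_band(10.0, 0.0, 0.02, 100)   # 10 -/+ 2 * 0.02 * 10
finals = simulate_terminal_prices(10.0, 0.0, 0.02, 100, n_paths=2000)
coverage = sum(low <= p <= high for p in finals) / len(finals)
print(round(low, 4), round(high, 4), coverage)
```

Rerunning with a larger `t` widens the band in proportion to $\sqrt{t}$, which is the sense in which uncertainty grows with the forecast horizon.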
Part 2-C – Sums of Random Variables

Sequences of Independent Random Variables
$x_1, x_2, \ldots, x_n$ = a set of n random variables
Same (marginal) probability distribution, f(x)
Finite identical mean $\mu$ and variance $\sigma^2$
Statistically independent
IID = independent identically distributed
This is a "random sample" from the population f(x).

The Sample Mean
Sample mean:
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n \]
\[ E[\bar{X}] = E\left[\tfrac{1}{n}X_1\right] + E\left[\tfrac{1}{n}X_2\right] + \cdots + E\left[\tfrac{1}{n}X_n\right] = \tfrac{1}{n}E[X_1] + \cdots + \tfrac{1}{n}E[X_n] = \tfrac{1}{n}\mu + \cdots + \tfrac{1}{n}\mu = \mu \]
\[ Var[\bar{X}] = \sum_{i=1}^{n} Var\left[\tfrac{1}{n}X_i\right] = \sum_{i=1}^{n} \frac{1}{n^2}Var[X_i] = \sum_{i=1}^{n} \frac{1}{n^2}\sigma^2 = \frac{\sigma^2}{n} \]
Distribution of $\bar{X}$? Remains to be seen.
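The two results $E[\bar{X}] = \mu$ and $Var[\bar{X}] = \sigma^2/n$ can be checked by simulation. The sketch below assumes a fair six-sided die as the population (so $\mu = 3.5$ and $\sigma^2 = 35/12$); the sample size, number of replications, and seed are arbitrary illustrative choices.

```python
import random


def die_sample_means(n, reps, seed=0):
    """Return `reps` sample means, each computed from n independent fair-die rolls."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) for _ in range(n)) / n for _ in range(reps)]


n, reps = 10, 5000
means = die_sample_means(n, reps)

mean_of_means = sum(means) / reps
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / reps

mu = 3.5                  # population mean of one roll
sigma2 = 35.0 / 12.0      # population variance of one roll
print(mean_of_means, mu)
print(var_of_means, sigma2 / n)
```

The simulated variance should be close to $\sigma^2/n \approx 0.2917$, and it shrinks in proportion to $1/n$ if `n` is increased.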
Convergence of a Random Variable to a Constant
A constant, c, is a random variable, C, with variance zero. The constant is always equal to c: Prob(C = c) = 1.
The sample mean, $\bar{X}$, has expected value $\mu$. The variance of $\bar{X}$ is $\sigma^2/n$. $\bar{X}$ is not a constant when n is finite.
As $n \to \infty$, $\bar{X}$ becomes a constant by this definition.

Convergence in Mean Square
If $E[X_n] = \mu$ and $Var[X_n] \to 0$ as $n \to \infty$, then $X_n$ converges in mean square to $\mu$.
Slightly Broader Extension
If $E[X_n] \to \mu$ and $Var[X_n] \to 0$ as $n \to \infty$, then $X_n$ converges in mean square to $\mu$. The reason is that $E[(X_n - \mu)^2] = Var[X_n] + (E[X_n] - \mu)^2$, and both terms go to zero.
$X_n + (1/n)$ converges in mean square to $\mu$.

[Figure: three histograms with fitted normal curves, each based on N = 1000 sample means. Histogram of Xbar_4: Mean 10.03, StDev 1.763. Histogram of Xbar_9: Mean 10.04, StDev 1.164. Histogram of Xbar_15: Mean 10.06, StDev 0.9054.]
Convergence in Mean Square: The top figure is a histogram for 1,000 means of samples of 4; the center is for samples of 9; the lowest one is for samples of 15. The vertical bars go through 7, 10 and 13 on all three figures.

Convergence of Means
If $x_1, \ldots, x_n$ is a random sample from a population with finite constant mean $\mu$ and finite constant variance $\sigma^2$, then $\bar{X}$ converges in mean square to $\mu$.
Applies to functions of X. E.g., if $Y_i = \exp(X_i)$ and $E[\exp(X)]$ and $Var[\exp(X)]$ are finite constants, then $\bar{Y}$ converges in mean square to $E[\exp(X)]$.
If $X_i \sim N[\mu, \sigma^2]$ and $Y_i = \exp(X_i)$, then $\bar{Y} \to \exp(\mu + \tfrac{1}{2}\sigma^2)$.
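The limit $\exp(\mu + \tfrac{1}{2}\sigma^2)$ is the mean of a lognormal variable. The slides state it without proof; one standard way to obtain it, sketched here as a supplement rather than as the author's derivation, is to complete the square in the exponent of the normal density.

```latex
\begin{align*}
E[e^{X}] &= \int_{-\infty}^{\infty} e^{x}\,
            \frac{1}{\sigma\sqrt{2\pi}}
            \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx \\[4pt]
x - \frac{(x-\mu)^2}{2\sigma^2}
         &= \mu + \tfrac{1}{2}\sigma^2
            - \frac{\bigl(x-(\mu+\sigma^2)\bigr)^2}{2\sigma^2} \\[4pt]
E[e^{X}] &= e^{\mu+\sigma^2/2}
            \int_{-\infty}^{\infty}
            \frac{1}{\sigma\sqrt{2\pi}}
            \exp\!\left(-\frac{\bigl(x-(\mu+\sigma^2)\bigr)^2}{2\sigma^2}\right) dx
          = e^{\mu+\sigma^2/2}
\end{align*}
```

The remaining integral equals one because its integrand is the $N[\mu + \sigma^2, \sigma^2]$ density.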
Probability Limits: Plim $x_n$
Let $\theta$ be a constant, $\varepsilon$ be any positive value, and n index the sequence.
If $\lim_{n \to \infty} \text{Prob}[|x_n - \theta| > \varepsilon] = 0$, then $x_n$ converges in probability to $\theta$.
In words, the probability that the difference between $x_n$ and $\theta$ is larger than $\varepsilon$, for any $\varepsilon > 0$, goes to zero. $x_n$ becomes arbitrarily close to $\theta$.
If $\lim_{n \to \infty} \text{Prob}[|x_n - \theta| > \varepsilon] = 0$, then plim $x_n = \theta$.
Mean square convergence is sufficient (not necessary) for convergence in probability.
Mean square convergence is sufficient for most work in applied statistics.
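The definition can be made concrete by estimating $\text{Prob}[|\bar{x}_n - \theta| > \varepsilon]$ for increasing n. The sketch below assumes a U[0,1] population (so $\theta = 0.5$ and $\sigma^2 = 1/12$) and $\varepsilon = 0.05$; these choices, the function name, and the seed are illustrative assumptions. It also prints the Chebyshev bound $Var[\bar{x}_n]/\varepsilon^2$, which is the usual reason mean square convergence implies convergence in probability.

```python
import random


def exceed_probability(n, eps, reps, rng):
    """Estimate Prob[|xbar_n - 0.5| > eps] for the mean of n U[0,1] draws."""
    count = 0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - 0.5) > eps:
            count += 1
    return count / reps


rng = random.Random(42)
eps = 0.05
results = {}
for n in (10, 100, 1000):
    estimate = exceed_probability(n, eps, reps=1000, rng=rng)
    # Chebyshev: Prob[|xbar - theta| > eps] <= Var[xbar] / eps^2 = (1/12) / (n * eps^2)
    chebyshev = min(1.0, (1.0 / 12.0) / (n * eps ** 2))
    results[n] = (estimate, chebyshev)
    print(n, estimate, round(chebyshev, 4))
```

The estimated probabilities fall toward zero as n grows, and each stays below its (often very loose) Chebyshev bound.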
Probability Limits and Expectations
What is the difference between $E[x_n]$ and plim $x_n$?
Consider:
X = n with prob(X = n) = 1/n
X = 1 with prob(X = 1) = 1 – 1/n
$E[X] = 2 - 1/n \to 2$
Plim(X) = 1
A notation: plim $x_n = \theta \Leftrightarrow x_n \xrightarrow{P} \theta$

The Slutsky Theorem
Assumptions: $x_n$ is a random variable such that plim $x_n = \theta$. For now, we assume $\theta$ is a constant.
g(.) is a continuous function with continuous derivatives. g(.) is not a function of n.
Conclusion: Then plim$[g(x_n)] = g[\text{plim}(x_n)]$, assuming $g[\text{plim}(x_n)]$ exists.
Works for probability limits. Does not work for expectations.
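The two-point example above can be worked exactly, and it also shows why the Slutsky result does not carry over to expectations. The sketch below uses exact rational arithmetic; the choice $g(x) = x^2$ and the function name `two_point_summary` are illustrative assumptions, not taken from the slides. Here plim $g(X_n) = g(1) = 1$, while $E[g(X_n)] = n + 1 - 1/n$ grows without bound.

```python
from fractions import Fraction


def two_point_summary(n):
    """Exact moments for X_n = n with prob 1/n and X_n = 1 with prob 1 - 1/n.

    Returns (E[X_n], E[X_n^2], Prob[X_n = n]). The last value equals
    Prob[|X_n - 1| > eps] for any eps with 0 < eps < n - 1.
    """
    p_big = Fraction(1, n)
    mean = n * p_big + 1 * (1 - p_big)               # 2 - 1/n
    second_moment = n ** 2 * p_big + 1 * (1 - p_big)  # n + 1 - 1/n
    return mean, second_moment, p_big


for n in (10, 100, 1000):
    mean, second_moment, p_far = two_point_summary(n)
    print(n, float(mean), float(second_moment), float(p_far))
```

As n increases, the probability of being away from 1 vanishes, the mean settles at 2 rather than 1, and the second moment diverges.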
$E[x_n] = \theta$; plim$(x_n) = \theta$; $E[1/x_n] = ?$; plim$(1/x_n) = 1/\theta$

Multivariate Slutsky Theorem
Plim $x_n = a$, Plim $y_n = b$
$g(x_n, y_n)$ is continuous, has continuous first derivatives and exists at (a, b).
Plim $g(x_n, y_n) = g(a, b)$
Generalizes to K functions of M random variables.

Monte Carlo Integration
Since $\frac{1}{n}\sum_{i=1}^{n} g(x_i) \xrightarrow{p} E[g(x)]$, a random sample can be used to approximate the expected value. Two cases:
(1) The population is known: Randomly draw R observations from the known population, $x_1, \ldots, x_R$. Then $\frac{1}{R}\sum_{r=1}^{R} g(x_r) \xrightarrow{p} E[g(x)]$. This is equivalent to estimating $\int g(x) f(x)\,dx$.

Monte Carlo Integration
(2) The population is unknown or the limits of integration are finite. Compute $\int_a^b g(x)\,dx$.
Strategy: Let z = (x-a)/(b-a), so x = a + (b-a)z, dx = (b-a)dz, and z ranges from 0 to 1.
\[ \int_a^b g(x)\,dx = \int_0^1 g[a+(b-a)z]\,(b-a)\,dz = (b-a)\int_0^1 g[a+(b-a)z]\,dz \]
Draw a sample from z ~ U[0,1]. Average R draws on g[a + (b-a)z], then multiply by (b-a).
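The change-of-variable strategy for a finite interval can be sketched as a small function. The name `mc_integrate`, the test integrand $x^2$ on [0, 2], and the seed are assumptions chosen for illustration; the exact value of that integral is 8/3, which gives something to compare against.

```python
import random


def mc_integrate(g, a, b, draws=10_000, seed=0):
    """Approximate the integral of g over [a, b] by averaging g(a + (b - a) z), z ~ U[0,1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        z = rng.random()                # one uniform draw on [0, 1)
        total += g(a + (b - a) * z)     # evaluate g at the rescaled point
    return (b - a) * total / draws      # average, then multiply by (b - a)


estimate = mc_integrate(lambda x: x ** 2, 0.0, 2.0)
exact = 8.0 / 3.0
print(estimate, exact)
```

The error shrinks at the rate $1/\sqrt{R}$, so quadrupling the number of draws roughly halves the typical error.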
Application
Normal probability from -1 to +1.5 is 0.3413 + 0.4332 = 0.7745.
[a = -1, b = +1.5, g(z) = $\phi$(z).]
Compute 10,000 random draws on x from U[0,1].
Compute z = a + (b-a)x = -1 + 2.5*x
Average the 10,000 draws on $\phi$(z), then multiply the average by (b-a) = (1.5 – (-1)) = 2.5.
Gives 0.773641 in my experiment.
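The steps of this application can be reproduced roughly as follows. The seed is an arbitrary assumption, so the estimate will not match the 0.773641 reported on the slide; the exact probability is computed from the error function for comparison.

```python
import math
import random


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


a, b, R = -1.0, 1.5, 10_000
rng = random.Random(2024)  # hypothetical seed, for repeatability only

# Draw x ~ U[0,1], map to z = a + (b - a) x, and average phi(z).
average = sum(std_normal_pdf(a + (b - a) * rng.random()) for _ in range(R)) / R
estimate = (b - a) * average
exact = std_normal_cdf(b) - std_normal_cdf(a)
print(estimate, exact)
```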
Application
For Normal(2, $1.5^2$), E[exp(x)] = exp(2 + ½ · $1.5^2$) = 22.76.
Draw 10,000 random U(0,1) draws. Transform to x ~ N(0,1), then z = 2 + 1.5*x.
Compute q = exp(z) and average the 10,000 draws on q.
My result was 22.87944.
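A sketch of this experiment is below. Note one substitution: instead of drawing U(0,1) values and transforming them to N(0,1), it draws standard normal values directly with `random.gauss`. The seed is an arbitrary assumption, so the result will differ from the 22.87944 on the slide.

```python
import math
import random

mu, sigma, R = 2.0, 1.5, 10_000
rng = random.Random(7)  # hypothetical seed, for repeatability only

# x ~ N(0,1) drawn directly (stand-in for the uniform-then-transform step),
# then z = mu + sigma * x and q = exp(z).
q_draws = [math.exp(mu + sigma * rng.gauss(0.0, 1.0)) for _ in range(R)]
estimate = sum(q_draws) / R
exact = math.exp(mu + 0.5 * sigma ** 2)
print(estimate, exact)
```

Because exp(z) has a long right tail, the estimate is noticeably noisier than the normal probability example at the same number of draws.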
Limit Results
Mean converges in probability to $\mu$. Variance goes to zero.
If n is finite, what can be said about its behavior?
Objective: Characterize the distribution of the mean when n is large but finite.
Strategy: Find a limit result, then use it to approximate for finite n.
A Finite Sample Distribution
[Figure: frequency histogram of XB8; the horizontal axis runs from about 0.208 to 0.765.]
Means of 1000 samples of 8 observations from U[0,1].
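An experiment like the one behind this histogram can be sketched as follows. The seed and the bin width of 0.1 are assumptions, so the counts will not match the original figure. In theory the means have expected value 0.5 and standard deviation $\sqrt{1/96} \approx 0.102$.

```python
import random
import statistics

rng = random.Random(8)  # hypothetical seed, for repeatability only

# 1000 sample means, each from 8 independent U[0,1] draws.
means = [sum(rng.random() for _ in range(8)) / 8 for _ in range(1000)]

# Crude text histogram with ten bins of width 0.1 on [0, 1].
counts = [0] * 10
for m in means:
    counts[min(int(m * 10), 9)] += 1

for k, c in enumerate(counts):
    print(f"{k / 10:.1f}-{(k + 1) / 10:.1f} {'#' * (c // 5)}")

print(statistics.mean(means), statistics.stdev(means))
```

Even with only 8 observations per sample, the printed histogram is already roughly bell shaped around 0.5.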
Central Limit Theorems
Set of independent random variables, $X_1, \ldots, X_n$. Same distribution, f(X), not necessarily normal.
$\bar{X}_n \xrightarrow{P} \mu$, $Var[\bar{X}_n] = \sigma^2/n \to 0$. Large sample behavior is obvious.
Stabilize $\bar{X}_n$. Let $Z_n = \dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$.
$Z_n$ has $E[Z_n] = 0$ and $Var[Z_n] = 1$ for every n, even if n is huge (infinite).
What can be said about the probability distribution of $Z_n$?
Limiting Distributions
$X_n$ has probability density $f_{X_n}(X_n)$ and cdf $F_{X_n}(X_n)$.
If $F_{X_n}(X_n) \to F(X)$ as $n \to \infty$, then F(X) is the limiting distribution. (At points where F(X) is continuous in X.)
$F_{X_n}(X) \to F(X)$ implies that $X_n$ converges in distribution to X.
d Written Xn X 600000 500000 400000 Mushroom 16.2% Plain 32.5% Scatterplot of Listing vs IncomePC Normal - 95% CI 90 500000 400000 200000 100000 15000 60 50 40 17500 20000 22500 25000 IncomePC 27500 30000 32500 6 5 200000 2 1 100000 15000 200000 400000 600000 Listing 800000 1000000 369687 156865 51 80 8 4 0 Mean StDev N 10 500000 300000 10 Normal 100 12 700000 400000 30 Marginal Plot of Listing vs IncomePC Empirical CDF of Listing 14 800000 600000 70 20 300000 200000 369687 156865 51 0.994 0.012 80 600000 Histogram of Listing 900000 Mean StDev N AD P-Value 95 700000 300000 100000 Probability Plot of Listing 99 17500 20000 22500 25000 IncomePC 27500 30000 32500 0 1000000 60 800000 40 Listing 800000 800000 Percent 900000 Frequency Sausage 5.8% Scatterplot of Listing vs IncomePC 900000 700000 Listing Pepper and Onion 7.3% Boxplot of Listing C ategory Pepperoni Plain Mushroom Sausage Pepper and Onion Mushroom and Onion Garlic Meatball Listing Pepperoni 21.8% Listing Meatball Garlic 5.0% 2.3% Percent Pie Chart of Percent vs Type Mushroom and Onion 9.2% 20 600000 400000 0 0 200000 300000 400000 500000 600000 Listing 700000 800000 900000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 20 30 40 50 60 70 80 90 Listing 200000 15000 20000 25000 IncomePC 30000 Part 2 – Expectations of Random Variables 111/124 Lindeberg – Levy Central Limit Theorem Set of independent random variables, X1 ,..., X n . Same distribution, f(X), not necessarily normal. X n 1 n X i 1 i n 2 X n , Var[X n ]= 0. Large sample behavior is obvious. n X d Stabilize X n . Let Zn = . Z n N [0,1]. 
/ n 600000 500000 400000 Mushroom 16.2% Plain 32.5% Scatterplot of Listing vs IncomePC Normal - 95% CI 90 500000 400000 200000 100000 15000 60 50 40 17500 20000 22500 25000 IncomePC 27500 30000 32500 6 5 200000 2 1 100000 15000 200000 400000 600000 Listing 800000 1000000 369687 156865 51 80 8 4 0 Mean StDev N 10 500000 300000 10 Normal 100 12 700000 400000 30 Marginal Plot of Listing vs IncomePC Empirical CDF of Listing 14 800000 600000 70 20 300000 200000 369687 156865 51 0.994 0.012 80 600000 Histogram of Listing 900000 Mean StDev N AD P-Value 95 700000 300000 100000 Probability Plot of Listing 99 17500 20000 22500 25000 IncomePC 27500 30000 32500 0 1000000 60 800000 40 Listing 800000 800000 Percent 900000 Frequency Sausage 5.8% Scatterplot of Listing vs IncomePC 900000 700000 Listing Pepper and Onion 7.3% Boxplot of Listing C ategory Pepperoni Plain Mushroom Sausage Pepper and Onion Mushroom and Onion Garlic Meatball Listing Pepperoni 21.8% Listing Meatball Garlic 5.0% 2.3% Percent Pie Chart of Percent vs Type Mushroom and Onion 9.2% 20 600000 400000 0 0 200000 300000 400000 500000 600000 Listing 700000 800000 900000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 20 30 40 50 60 70 80 90 Listing 200000 15000 20000 25000 IncomePC 30000 Part 2 – Expectations of Random Variables 112/124 Other Central Limit Theorems Lindeberg Levy for i.i.d. Lindeberg Feller – heteroscedastic. Variances may differ Lyupanov: distributions may differ Extensions - time series with some covariance across observations. 
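The Lindeberg-Levy result can be checked numerically. The sketch below is an illustration only and is not taken from the slides: it assumes exponential draws with mean 1 (so $\mu = 1$ and $\sigma = 1$), and the function name `simulate_zn`, the sample size of 100, and the 5,000 replications are hypothetical choices. It compares the simulated Prob[$Z_n \le 1$] with the standard normal value.

```python
import math
import random


def simulate_zn(n, reps, seed=0):
    """Return `reps` draws of Z_n = (xbar - mu) / (sigma / sqrt(n)).

    Assumption for illustration: the data are exponential with rate 1,
    a skewed, non-normal distribution with mu = 1 and sigma = 1.
    """
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    draws = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        draws.append((xbar - mu) / (sigma / math.sqrt(n)))
    return draws


def std_normal_cdf(z):
    """Standard normal cdf computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z_draws = simulate_zn(n=100, reps=5000)
share_below_1 = sum(z <= 1.0 for z in z_draws) / len(z_draws)
print(f"simulated Prob[Z_n <= 1] = {share_below_1:.4f}")
print(f"normal    Prob[Z   <= 1] = {std_normal_cdf(1.0):.4f}")
```

With a sample size this large the two probabilities should be close even though the underlying data are far from normal; rerunning with a small `n` shows how the approximation degrades.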
Rough Approximations
$Z_n \xrightarrow{d} N[0,1]$. Requires infinite $n$.
Suppose $n$ is fairly large. Assume this applies approximately, and manipulate it.
$Z_n = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \to N[0,1]$. Assume it holds approximately for finite $n$.
$(\sigma/\sqrt{n})\,Z_n = \bar{X} - \mu$ is approximately $N[0, \sigma^2/n]$.
$(\sigma/\sqrt{n})\,Z_n + \mu = \bar{X}$ is approximately $N[\mu, \sigma^2/n]$.
$n\left[(\sigma/\sqrt{n})\,Z_n + \mu\right] = \sum_{i=1}^{n} X_i$ is approximately $N[n\mu, n\sigma^2]$.

Normal Approximation to Binomial
Binomial $(n,p)$ equals the sum of $n$ independent Bernoulli variables with parameter $p$.
Each Bernoulli $X$ has $\mu = p$ and $\sigma^2 = p(1-p)$.
Sum of $n$ variables is approximately normal with mean $np$ and variance $np(1-p)$.
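The Bernoulli moments quoted above follow directly from the definition of expected value. A short derivation, writing $X$ for a single Bernoulli draw with Prob($X = 1$) $= p$ and Prob($X = 0$) $= 1 - p$ (the intermediate steps are supplied here and are not on the slide):

```latex
\begin{aligned}
E[X]   &= 0\cdot(1-p) + 1\cdot p = p, \\
E[X^2] &= 0^2\cdot(1-p) + 1^2\cdot p = p, \\
\mathrm{Var}[X] &= E[X^2] - (E[X])^2 = p - p^2 = p(1-p), \\
E\Big[\sum_{i=1}^{n} X_i\Big] &= np, \qquad
\mathrm{Var}\Big[\sum_{i=1}^{n} X_i\Big] = np(1-p) \quad \text{(independent draws)}.
\end{aligned}
```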
Approximation to binomial with n = 48, p = .25
$\mathrm{Prob}[8 \le x \le 15] = P(X=8) + P(X=9) + \cdots + P(X=15)$
$= \sum_{x=8}^{15} \binom{48}{x} (.25)^x (1-.25)^{48-x}$
$= 0.815678$
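The exact binomial sum can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the helper name `binomial_pmf` is a hypothetical choice, not something defined in the slides.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Prob(X = x) for a binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


n, p = 48, 0.25
# Sum the probabilities for x = 8, 9, ..., 15 (range excludes the endpoint 16).
exact = sum(binomial_pmf(x, n, p) for x in range(8, 16))
print(f"Prob[8 <= X <= 15] = {exact:.6f}")
```

The printed value should agree with the 0.815678 reported above, up to rounding in the last digit.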
De Moivre's Normal Approximation
The binomial density function has n = 48, θ = .25 (θ is the success probability, written p earlier), so μ = nθ = 12 and σ = √(nθ(1-θ)) = 3.
The normal density plotted has mean 12 and standard deviation 3.
Using De Moivre's Approximation
x      P(X = x)
8      0.057905
9      0.085785
10     0.111520
11     0.128417
12     0.131984
13     0.121832
14     0.101526
15     0.076709
Total  0.815678
The binomial has n = 48, θ = .25, so μ = 12 and σ = 3. The normal distribution plotted has mean 12 and standard deviation 3.
$P[8 \le x \le 15] \approx P[(8-12)/3 < z < (15-12)/3]$
$= P[-1.33 < z < 1]$
$= P[z < 1] - P[z < -1.33]$
$= 0.8413450 - 0.0917591 = 0.7495859$
8.1% error. What happened?

Continuity Correction
When using a continuous distribution (normal) to approximate a discrete probability (binomial), subtract .5 from the lowest value in the range and add .5 to the highest value in the range. (The correction becomes less important as n increases.)

Correcting De Moivre's Approximation
Exact binomial probability (table above): 0.815678.
The binomial has n = 48, θ = .25, so μ = 12 and σ = 3. The normal distribution plotted has mean 12 and standard deviation 3.
$P[7.5 < x < 15.5] = P[(7.5-12)/3 < z < (15.5-12)/3]$
$= P[-1.5 < z < 1.167]$
$= P[z < 1.167] - P[z < -1.5]$
$= 0.878327 - 0.0668072 = 0.8115198$
0.5% error.

A Useful Convergence Result
If $X_n$ converges in distribution to $X$ and if $g(\cdot)$ is a continuous function, then $Y_n = g(X_n)$ converges in distribution to $Y = g(X)$.
Example: If $X_n \xrightarrow{d} N[0,1]$, then $X_n^2$ converges to the random variable that is the square of a standard normal, which is Gamma$[\tfrac{1}{2}, \tfrac{1}{2}]$ (chi-squared with one degree of freedom).
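Returning to the binomial example with n = 48 and success probability .25, the normal approximation with and without the continuity correction can be compared in code. This is a sketch under stated assumptions: the normal cdf is computed from `math.erf`, the names `std_normal_cdf`, `uncorrected`, and `corrected` are hypothetical, and the exact value 0.815678 is copied from the slides rather than recomputed.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n, p = 48, 0.25
mu = n * p                             # 12
sigma = math.sqrt(n * p * (1.0 - p))   # 3
lo, hi = 8, 15                         # event: 8 <= X <= 15
exact = 0.815678                       # exact binomial probability from the slides

# Plain normal approximation: treat the endpoints as they are.
uncorrected = std_normal_cdf((hi - mu) / sigma) - std_normal_cdf((lo - mu) / sigma)

# Continuity correction: widen the interval by 0.5 at each end.
corrected = (std_normal_cdf((hi + 0.5 - mu) / sigma)
             - std_normal_cdf((lo - 0.5 - mu) / sigma))

for label, approx in (("uncorrected", uncorrected), ("corrected", corrected)):
    rel_error = abs(approx - exact) / exact
    print(f"{label:>11}: {approx:.4f}  (relative error {100 * rel_error:.1f}%)")
```

The code keeps the z values unrounded (for example -4/3 rather than -1.33), so the uncorrected figure differs slightly from the hand calculation. The qualitative conclusion is the same: the uncorrected interval leaves out half of the probability bars at x = 8 and x = 15, and the correction restores most of the missing mass.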
Combine Slutsky with the Central Limit Theorem
$Z_n = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N[0,1]$
General Result: if $X_n(\theta) \xrightarrow{d} F(X|\theta)$ and if $y_n \xrightarrow{p} \theta$, then $X_n(y_n) \xrightarrow{d} F(X|\theta)$.
Suppose $s_n = \sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2} \xrightarrow{p} \sigma$. Then,
$\hat{Z}_n = \dfrac{\bar{X} - \mu}{s_n/\sqrt{n}} \xrightarrow{d} N[0,1]$.
Replacing $\sigma$ with a random variable that converges to $\sigma$ preserves the limiting result.
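The effect of replacing $\sigma$ with $s_n$ can be illustrated by simulation. The sketch below is not from the slides: it assumes exponential data with mean 1, and the function name `simulate_studentized`, the sample size of 200, and the 2,000 replications are hypothetical choices. It reports how often the studentized ratio falls inside (-1.96, 1.96), which would be about 95% if the ratio were exactly standard normal.

```python
import math
import random


def simulate_studentized(n, reps, seed=0):
    """Return `reps` draws of (xbar - mu) / (s_n / sqrt(n)).

    Assumption for illustration: exponential(1) data, so mu = 1.
    s_n is the sample standard deviation with divisor n - 1.
    """
    rng = random.Random(seed)
    mu = 1.0
    ratios = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar = sum(sample) / n
        s_n = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        ratios.append((xbar - mu) / (s_n / math.sqrt(n)))
    return ratios


ratios = simulate_studentized(n=200, reps=2000)
coverage = sum(abs(z) < 1.96 for z in ratios) / len(ratios)
print(f"share of ratios inside (-1.96, 1.96): {coverage:.3f}")
```

The share should be near 0.95 for a sample of this size, even though the true $\sigma$ is never used.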
Asymptotic Distributions
An asymptotic distribution is an approximation to a true finite n distribution based on a result found for the limiting distribution (with infinite n).
Asymptotic Distribution
Infinite n result: $Z_n = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \to N[0,1]$ as $n \to \infty$.
Assume it is approximately true for finite n: $Z_n = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N[0,1]$ for finite n.
Manipulate as before: $\bar{X} \sim N[\mu, \sigma^2/n]$, approximately.
This is the "asymptotic" distribution of $\bar{X}$.
This is how we use the central limit theorem to understand the sampling distribution of $\bar{X}$.
The Chebychev Inequality
For any random variable with finite mean $\mu$ and variance $\sigma^2$,
$\mathrm{Prob}[\,|X - \mu|/\sigma > k\,] \le 1/k^2$.
The probability that X is farther than k standard deviations from the mean is less than or equal to $1/k^2$.
Useful for proofs, not for practical computations.
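A small numerical comparison suggests why the bound is rarely used for practical computation. The sketch below assumes, purely for illustration, that X is normally distributed, and compares the Chebychev bound $1/k^2$ with the actual two-sided normal tail probability; the function names are hypothetical.

```python
import math


def chebychev_bound(k):
    """Upper bound on Prob[|X - mu| / sigma > k] for any finite-variance X."""
    return 1.0 / k**2


def normal_two_sided_tail(k):
    """Prob[|Z| > k] for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))


for k in (1.5, 2.0, 3.0):
    print(f"k = {k}: bound = {chebychev_bound(k):.4f}, "
          f"normal tail = {normal_two_sided_tail(k):.4f}")
```

For a normal variable the bound holds but is very loose: at k = 2 it allows up to 0.25, while the actual tail probability is below 0.05. The bound's value lies in holding for every distribution with finite variance.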