Chapter 3
Selected Basic Concepts in Statistics

Expected Value, Variance, Standard Deviation
Numerical summaries of selected statistics
Sampling distributions

Expected Value

Let y be a random variable with values \(y_1, y_2, \ldots, y_N\) that have respective probabilities \(p(y_1), p(y_2), \ldots, p(y_N)\). The expected value of y, denoted \(E(y)\) or \(\mu\), is

\[ E(y) = \mu = \sum_{i=1}^{N} y_i \, p(y_i) \]

Weighted average
Not the value of y you "expect"; a long-run average

E(y) Example 1

Toss a fair die once. Let y be the number of dots on the upper face.

y       1     2     3     4     5     6
p(y)    1/6   1/6   1/6   1/6   1/6   1/6

\[ E(y) = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = \tfrac{21}{6} = 3.5 \]

E(y) Example 2: Green Mountain Lottery

Choose 3 digits between 0 and 9. Repeats allowed, order of digits counts. If your 3-digit number is selected, you win $500. Let y be your winnings (assume ticket cost $0).

y       $0      $500
p(y)    0.999   0.001

\[ E(y) = \$0\,(0.999) + \$500\,(0.001) = \$0.50 \]

US Roulette Wheel and Table

The roulette wheel has alternating black and red slots numbered 1 through 36. There are also 2 green slots numbered 0 and 00. A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is, if you bet $1 on the winning number, you receive $36, so your winnings are $35.

American Roulette 0 - 00 (The European version has only one 0.)

US Roulette Wheel: Expected Value of a $1 bet on a single number

Let y be your winnings resulting from a $1 bet on a single number; y has 2 possible values.

y       -1      35
p(y)    37/38   1/38

\[ E(y) = -1\left(\tfrac{37}{38}\right) + 35\left(\tfrac{1}{38}\right) = -\tfrac{2}{38} \approx -0.05 \]

So on average the house wins about 5 cents on every such bet. A "fair" game would have E(y) = 0.
The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in …

Variance and Standard Deviation

Let y be a random variable with values \(y_1, y_2, \ldots, y_N\) that have respective probabilities \(p(y_1), p(y_2), \ldots, p(y_N)\). The variance of y, denoted \(V(y)\) or \(\sigma^2\), is

\[ V(y) = E(y - \mu)^2 = \sum_{i=1}^{N} (y_i - \mu)^2 \, p(y_i) = \sigma^2 \]

The standard deviation of y, denoted \(SD(y)\) or \(\sigma\), is the square root of the variance.
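The definitions of expected value and variance above can be checked with a short Python sketch. This is only an illustration: the helper names `expected_value` and `variance` and the dictionary representation of a distribution are assumptions made here, not part of the original slides. Exact fractions are used so the results match the hand calculations.

```python
from fractions import Fraction
from math import sqrt


def expected_value(dist):
    """E(y): sum of y * p(y) over a dict mapping value -> probability."""
    return sum(y * p for y, p in dist.items())


def variance(dist):
    """V(y): sum of (y - mu)^2 * p(y), with mu = E(y)."""
    mu = expected_value(dist)
    return sum((y - mu) ** 2 * p for y, p in dist.items())


# Distributions taken from the examples in this chapter.
die = {y: Fraction(1, 6) for y in range(1, 7)}
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}
roulette = {-1: Fraction(37, 38), 35: Fraction(1, 38)}

print(expected_value(die))              # 7/2, i.e. 3.5
print(expected_value(lottery))          # 1/2, i.e. $0.50
print(float(expected_value(roulette)))  # roughly -0.0526

# Standard deviation is the square root of the variance.
sd_die = sqrt(variance(die))
print(variance(die), round(sd_die, 3))  # 35/12 and 1.708
```

The roulette result, -1/19 of a dollar, is the exact value behind the "about 5 cents per bet" house edge.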
\[ SD(y) = \sigma = \sqrt{\sigma^2} \]

Both measure spread around the middle, where the middle is measured by \(\mu\).

Variance Example

Toss a fair die once. Let y be the number of dots on the upper face.

y       1     2     3     4     5     6
p(y)    1/6   1/6   1/6   1/6   1/6   1/6

Recall \(\mu = 3.5\).

\[ V(y) = (1-3.5)^2\tfrac{1}{6} + (2-3.5)^2\tfrac{1}{6} + (3-3.5)^2\tfrac{1}{6} + (4-3.5)^2\tfrac{1}{6} + (5-3.5)^2\tfrac{1}{6} + (6-3.5)^2\tfrac{1}{6} \]
\[ = (-2.5)^2\tfrac{1}{6} + (-1.5)^2\tfrac{1}{6} + (-0.5)^2\tfrac{1}{6} + (0.5)^2\tfrac{1}{6} + (1.5)^2\tfrac{1}{6} + (2.5)^2\tfrac{1}{6} = \tfrac{17.5}{6} = 2.917 \]
\[ SD(y) = \sqrt{2.917} = 1.708 \]

V(y) Example 2: Green Mountain Lottery

y       $0      $500
p(y)    0.999   0.001

Recall \(\mu = 0.50\).

\[ V(y) = (0 - 0.50)^2 (0.999) + (500 - 0.50)^2 (0.001) = (-0.50)^2 (0.999) + (499.5)^2 (0.001) = 249.75 \]
\[ SD(y) = \sqrt{249.75} = 15.8 \]

Estimators for \(\mu\), \(\sigma^2\), \(\sigma\)

Let \(y_1, y_2, \ldots, y_n\) denote sample observations.

Sample mean:
\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \]

Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1} \]

Sample standard deviation:
\[ s = \sqrt{s^2} \]

\(s^2\) is the "average" squared deviation from the middle.

Automate these calculations
Examples

Linear Transformations of Random Variables and Sample Statistics

Random variable y with E(y) and V(y).
Linear transformation y* = a + by: what are E(y*) and V(y*) in terms of the original E(y) and V(y)?

Data \(y_1, y_2, \ldots, y_n\) with mean \(\bar{y}\) and standard deviation s.
Linear transformation y* = a + by gives new data \(y_1^*, y_2^*, \ldots, y_n^*\): what are \(\bar{y}^*\) and \(s^*\) in terms of \(\bar{y}\) and s?

Linear Transformations

Rules for E(y*), V(y*) and SD(y*):
\[ E(y^*) = E(a + by) = a + bE(y) \]
\[ V(y^*) = V(a + by) = b^2 V(y) \]
\[ SD(y^*) = SD(a + by) = |b| \, SD(y) \]

Rules for \(\bar{y}^*\), \(s^{*2}\), and \(s^*\):
\[ \bar{y}^* = a + b\bar{y} \]
\[ s^{*2} = b^2 s^2 \]
\[ s^* = |b| \, s \]

Expected Value and Standard Deviation of Linear Transformation a + by

Let y = number of repairs a new computer needs each year. Suppose E(y) = 0.20 and SD(y) = 0.55. The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair. What are the mean and standard deviation of the yearly cost of the service contract?
Cost = $100 + $25y

\[ E(\text{cost}) = E(\$100 + \$25y) = \$100 + \$25\,E(y) = \$100 + \$25(0.20) = \$100 + \$5 = \$105 \]
\[ SD(\text{cost}) = SD(\$100 + \$25y) = SD(\$25y) = \$25\,SD(y) = \$25(0.55) = \$13.75 \]

Addition and Subtraction Rules for Random Variables

E(X+Y) = E(X) + E(Y); E(X−Y) = E(X) − E(Y)

When X and Y are independent random variables:
1. Var(X+Y) = Var(X) + Var(Y)
2. \(SD(X+Y) = \sqrt{Var(X) + Var(Y)}\)
   SDs do not add: SD(X+Y) ≠ SD(X) + SD(Y)
3. Var(X−Y) = Var(X) + Var(Y)
4. \(SD(X-Y) = \sqrt{Var(X) + Var(Y)}\)
   SDs do not subtract: SD(X−Y) ≠ SD(X) − SD(Y), and SD(X−Y) ≠ SD(X) + SD(Y)

Example: rv's NOT independent

X = number of hours a randomly selected student from our class slept between noon yesterday and noon today.
Y = number of hours the same randomly selected student from our class was awake between noon yesterday and noon today.
Y = 24 − X.

What are the expected value and variance of the total hours that a student is asleep and awake between noon yesterday and noon today?

Total hours that a student is asleep and awake between noon yesterday and noon today = X+Y

E(X+Y) = E(X + 24 − X) = E(24) = 24
Var(X+Y) = Var(X + 24 − X) = Var(24) = 0

We don't add Var(X) and Var(Y) since X and Y are not independent.

Pythagorean Theorem of Statistics for Independent X and Y

[Figure: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X+Y); the squares on the three sides have areas a² = Var(X), b² = Var(Y), and c² = Var(X+Y).]

a² + b² = c²    Var(X) + Var(Y) = Var(X+Y)
a + b ≠ c       SD(X) + SD(Y) ≠ SD(X+Y)

Pythagorean Theorem of Statistics for Independent X and Y

[Figure: the same triangle with SD(X) = 3, SD(Y) = 4, SD(X+Y) = 5, so Var(X) = 9, Var(Y) = 16, Var(X+Y) = 25.]

3² + 4² = 5²    Var(X) + Var(Y) = Var(X+Y): 9 + 16 = 25
3 + 4 ≠ 5       SD(X) + SD(Y) ≠ SD(X+Y)

Example: meal plans

Regular plan: X = daily amount spent
E(X) = $13.50, SD(X) = $7

Expected value and standard deviation of the total spent in 2 consecutive days? (Assume the two days are independent.)

E(X1+X2) = E(X1) + E(X2) = $13.50 + $13.50 = $27
SD(X1+X2) ≠ SD(X1) + SD(X2) = $7 + $7 = $14

\[ SD(X_1 + X_2) = \sqrt{Var(X_1 + X_2)} = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{(\$7)^2 + (\$7)^2} = \sqrt{\$^2 49 + \$^2 49} = \sqrt{\$^2 98} = \$9.90 \]

Example: meal plans (cont.)
Jumbo plan for football players: Y = daily amount spent
E(Y) = $24.75, SD(Y) = $9.50

The amount by which a football player's spending exceeds regular student spending is Y−X.

E(Y−X) = E(Y) − E(X) = $24.75 − $13.50 = $11.25
SD(Y−X) ≠ SD(Y) − SD(X) = $9.50 − $7 = $2.50

\[ SD(Y - X) = \sqrt{Var(Y - X)} = \sqrt{Var(Y) + Var(X)} = \sqrt{(\$9.50)^2 + (\$7)^2} = \sqrt{\$^2 90.25 + \$^2 49} = \sqrt{\$^2 139.25} = \$11.80 \]

For random variables, X+X≠2X

Let X be the annual payout on a life insurance policy. From mortality tables E(X) = $200 and SD(X) = $3,867.

1) If the payout amounts are doubled, what are the new expected value and standard deviation?

Double payout is 2X.
E(2X) = 2E(X) = 2*$200 = $400
SD(2X) = 2SD(X) = 2*$3,867 = $7,734

2) Suppose insurance policies are sold to 2 people. The annual payouts are X1 and X2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout?

E(X1 + X2) = E(X1) + E(X2) = $200 + $200 = $400

\[ SD(X_1 + X_2) = \sqrt{Var(X_1 + X_2)} = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{(3867)^2 + (3867)^2} = \sqrt{14{,}953{,}689 + 14{,}953{,}689} = \sqrt{29{,}907{,}378} = \$5{,}468.76 \]

The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.

Estimator of population mean

Let \(y_1, y_2, \ldots, y_n\) denote sample observations.

Sample mean:
\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \]

\(\bar{y}\) will vary from sample to sample. What are the characteristics of this sample-to-sample behavior?

Numerical Summary of Sampling Distribution of \(\bar{y}\)

Consider the sample observations \(y_1, y_2, \ldots, y_n\) as independent observations of the population variable y with \(E(y) = \mu\) and \(SD(y) = \sigma\).

\[ E(\bar{y}) = E\!\left(\frac{\sum_{i=1}^{n} y_i}{n}\right) = \frac{1}{n}\sum_{i=1}^{n} E(y_i) = \frac{n\mu}{n} = \mu \]

Unbiased: a statistic is unbiased if it has expected value equal to the population parameter.

Numerical Summary of Sampling Distribution of \(\bar{y}\)

Consider the sample observations \(y_1, y_2, \ldots, y_n\) as independent observations of the population variable y with \(E(y) = \mu\) and \(SD(y) = \sigma\).
\[ V(\bar{y}) = V\!\left(\frac{\sum_{i=1}^{n} y_i}{n}\right) = \frac{1}{n^2}\, V\!\left(\sum_{i=1}^{n} y_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(y_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \]

(The third equality uses the independence of the observations.)

\[ SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} \]

Standard Error

Standard error: the square root of the estimated variance of a statistic. It is an important building block for statistical inference.

\[ V(\bar{y}) = \frac{\sigma^2}{n} \qquad \hat{V}(\bar{y}) = \frac{s^2}{n} \]

Standard error of the sample mean:
\[ SE(\bar{y}) = \frac{s}{\sqrt{n}} \]

Recall \(SD(\bar{y}) = \dfrac{\sigma}{\sqrt{n}}\).

Shape?

We have numerical summaries of the sampling distribution of \(\bar{y}\). What about the shape of the sampling distribution of \(\bar{y}\)?

THE CENTRAL LIMIT THEOREM
The World is Normal Theorem

The Central Limit Theorem (for the sample mean \(\bar{y}\))

If a random sample of n observations is selected from a population (any population with finite mean and variance), then when n is sufficiently large, the sampling distribution of \(\bar{y}\) will be approximately normal. (The larger the sample size, the better the normal approximation to the sampling distribution of \(\bar{y}\).)

The Importance of the Central Limit Theorem

When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is

\[ N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]

Shape of population is irrelevant.

Estimating the population total

Let \(y_1, y_2, \ldots, y_n\) denote sample observations.
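Sample mean:
\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \]

\[ \hat{\tau} = N\bar{y} \quad (N \text{ is the population size}) \]

Estimating the population total
Expected value

\[ E(\hat{\tau}) = E(N\bar{y}) = N E(\bar{y}) = N\mu = \tau \]

Estimating the population total
Variance, standard deviation, standard error

\[ V(\hat{\tau}) = V(N\bar{y}) = N^2 V(\bar{y}) = N^2 \frac{\sigma^2}{n} \]
\[ SD(\hat{\tau}) = \sqrt{V(N\bar{y})} = \sqrt{N^2 V(\bar{y})} = \sqrt{N^2 \frac{\sigma^2}{n}} = N\frac{\sigma}{\sqrt{n}} \]
\[ SE(\hat{\tau}) = N\frac{s}{\sqrt{n}} \]

Finite population case
Example: sampling w/ replacement to estimate \(\tau\)

Population: {1, 2, 3, 4}, N = 4, \(\tau = 10\)
Sample n = 2 with varying probabilities: \(\delta_1 = .1,\ \delta_2 = .1,\ \delta_3 = .4,\ \delta_4 = .4\)

Estimate \(\tau\) with
\[ \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\delta_i} = \frac{1}{2}\sum_{i=1}^{2} \frac{y_i}{\delta_i} \]

For the sample {1, 2}:
\[ \hat{\tau} = \frac{1}{2}\left(\frac{1}{.1} + \frac{2}{.1}\right) = \frac{1}{2}(10 + 20) = 15 \]

Finite population case
Example: sampling w/ replacement to estimate \(\tau\)

Sample    Prob. of sample    \(\hat{\tau}\)    \(\hat{V}(\hat{\tau})\)
{1, 2}    .02                15       25.0
{1, 3}    .08                35/4     1.5625
{1, 4}    .08                10       0
{2, 3}    .08                55/4     39.0625
{2, 4}    .08                15       25.0
{3, 4}    .32                35/4     1.5625
{1, 1}    .01                10       0
{2, 2}    .01                20       0
{3, 3}    .16                15/2     0
{4, 4}    .16                10       0

Finite population case
Example: sampling w/ replacement to estimate \(\tau\)

From the table:
\[ E(\hat{\tau}) = 15(.02) + \tfrac{35}{4}(.08) + \cdots + 10(.16) = 10 = \tau \]
\[ V(\hat{\tau}) = (15 - 10)^2(.02) + \left(\tfrac{35}{4} - 10\right)^2(.08) + \cdots + (10 - 10)^2(.16) = 6.250 \]

Finite population case
Example: sampling w/ replacement to estimate \(\tau\)

\[ \hat{V}(\hat{\tau}) = \frac{1}{n}\,\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{y_i}{\delta_i} - \hat{\tau}\right)^2 \]

From the table:
\[ E(\hat{V}(\hat{\tau})) = 25(.02) + 1.5625(.08) + 39.0625(.08) + 25(.08) + 1.5625(.32) = 6.25 = V(\hat{\tau}) \]

Finite population case
Example: sampling w/ replacement to estimate \(\tau\)

Example Summary
\[ \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\delta_i}; \qquad E(\hat{\tau}) = \tau \]
\[ \hat{V}(\hat{\tau}) = \frac{1}{n}\,\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{y_i}{\delta_i} - \hat{\tau}\right)^2; \qquad E(\hat{V}(\hat{\tau})) = V(\hat{\tau}) \]

Finite population case
Sampling w/ replacement to estimate pop. total

In general
\[ \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\delta_i} \]
is unbiased for \(\tau\) for any choice of \(\delta_i\);
\[ \hat{V}(\hat{\tau}) = \frac{1}{n}\,\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{y_i}{\delta_i} - \hat{\tau}\right)^2; \qquad E(\hat{V}(\hat{\tau})) = V(\hat{\tau}) \]

Want to choose \(\delta_i\) so that \(V(\hat{\tau})\) is as small as possible.

Finite population case
Sampling w/ replacement to estimate pop. total

A specific choice for the \(\delta_i\)'s:
Suppose we know the values \(y_i,\ i = 1, \ldots, N\) (so the population total \(\tau\) is known); choose
\[ \delta_i = \frac{y_i}{\tau} \quad (\text{assume all } y_i > 0) \]
Then
\[ \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\delta_i} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{y_i/\tau} = \frac{1}{n}\sum_{i=1}^{n} \tau = \frac{n\tau}{n} = \tau \]
so every \(\hat{\tau}\) estimates \(\tau\) exactly.

Finite population case
Sampling w/ replacement to estimate pop.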
total

In reality, we do not know the value of \(y_i\) for every item in the population.
BUT we can choose \(\delta_i\) proportional to a known measurement highly correlated with \(y_i\).

Finite population case
Sampling w/ replacement to estimate pop. total

Example: want to estimate the total number of job openings in a city by sampling industrial firms.
• Many small firms employ few workers;
• A few large firms employ many workers;
• Large firms influence the number of job openings;
• Large firms should have a greater chance of being in the sample to improve the estimate of total openings.

Firms can be sampled with probabilities proportional to the firm's total work force, which should be correlated with the firm's job openings.

Finite population case
Sampling without replacement to estimate pop. total

Thus far we have assumed a population that does not change when the first item is selected; that is, we sampled with replacement. When sampling without replacement this is not true.

Example: population {1, 2, 3, 4}; n = 2; suppose all items are equally likely.
• Prob. of selecting 3 on the first draw is 1/4.
• Prob. of selecting 3 on the second draw depends on the first draw (probability is 0 or 1/3).

Finite population case
Sampling without replacement to estimate pop. total

Let \(\pi_i = P(y_i \text{ is selected in the sample})\).
The \(\delta_i\)'s change with the draw.
Replace \(\delta_i\) with the average probability \(\pi_i / n\) that \(y_i\) is selected across the n draws:

\[ \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\delta_i} \quad\longrightarrow\quad \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\pi_i / n} = \sum_{i=1}^{n} \frac{y_i}{\pi_i} = \sum_{i=1}^{n} w_i y_i, \quad \text{where } w_i = \frac{1}{\pi_i} \]

Worksheet

End of Chapter 3
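As a closing check on the with-replacement example from this chapter (population {1, 2, 3, 4}, n = 2, selection probabilities .1, .1, .4, .4), the following Python sketch enumerates every ordered sample and recomputes the expected value and variance of the estimated total, along with the expected value of the variance estimator. The names `delta`, `tau_hat`, `var_hat`, and `sample_prob` are chosen for this illustration only; they do not come from the slides.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
# Per-draw selection probabilities (assumed name: delta), as in the example.
delta = {1: Fraction(1, 10), 2: Fraction(1, 10),
         3: Fraction(4, 10), 4: Fraction(4, 10)}
n = 2


def tau_hat(sample):
    """Estimated total: (1/n) * sum of y_i / delta_i over the sample."""
    return sum(Fraction(y) / delta[y] for y in sample) / len(sample)


def var_hat(sample):
    """Estimated variance: sum of (y_i/delta_i - tau_hat)^2 / (n(n-1))."""
    m = len(sample)
    t = tau_hat(sample)
    return sum((Fraction(y) / delta[y] - t) ** 2 for y in sample) / (m * (m - 1))


def sample_prob(sample):
    """Probability of an ordered sample when drawing with replacement."""
    p = Fraction(1)
    for y in sample:
        p *= delta[y]
    return p


# All ordered samples; (1, 2) and (2, 1) together give the .02 in the table.
samples = list(product(population, repeat=n))

e_tau = sum(sample_prob(s) * tau_hat(s) for s in samples)
v_tau = sum(sample_prob(s) * (tau_hat(s) - e_tau) ** 2 for s in samples)
e_var_hat = sum(sample_prob(s) * var_hat(s) for s in samples)

print(e_tau, v_tau, e_var_hat)  # 10 25/4 25/4
```

Enumerating ordered pairs rather than the unordered samples listed in the table keeps the probability calculation to a simple product. The totals agree with the table: the estimator averages to the true total 10, and both its variance and the expected value of the variance estimator equal 6.25.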