Expectations

Definition
Let X and Y be jointly distributed rv's with pmf p(x, y) or pdf f(x, y) according to whether the variables are discrete or continuous. Then the expected value of a function h(X, Y), denoted by E[h(X, Y)] or µ_{h(X,Y)}, is given by

$$
E[h(X,Y)] =
\begin{cases}
\sum_x \sum_y h(x,y)\,p(x,y) & \text{if $X$ and $Y$ are discrete} \\[4pt]
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x,y)\,f(x,y)\,dx\,dy & \text{if $X$ and $Y$ are continuous}
\end{cases}
$$

Covariance

Definition
The covariance between two rv's X and Y is

$$
\operatorname{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)] =
\begin{cases}
\sum_x \sum_y (x-\mu_X)(y-\mu_Y)\,p(x,y) & \text{$X$, $Y$ discrete} \\[4pt]
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x-\mu_X)(y-\mu_Y)\,f(x,y)\,dx\,dy & \text{$X$, $Y$ continuous}
\end{cases}
$$

Proposition
Cov(X, Y) = E(XY) − µ_X · µ_Y

Definition
The correlation coefficient of X and Y, denoted by Corr(X, Y), ρ_{X,Y}, or just ρ, is defined by

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \cdot \sigma_Y}$$

Proposition
1. Corr(aX + b, cY + d) = Corr(X, Y) if a · c > 0.
2. −1 ≤ Corr(X, Y) ≤ 1.
3. ρ = 1 or −1 iff Y = aX + b for some a and b with a ≠ 0.
4. If X and Y are independent, then ρ = 0. However, ρ = 0 does not imply that X and Y are independent.

Statistics and Their Distributions

Definition
A statistic is any quantity whose value can be calculated from sample data.

Definition
The random variables X1, X2, ..., Xn are said to form a (simple) random sample of size n if
1. the Xi's are independent random variables, and
2. every Xi has the same probability distribution.

In words, X1, X2, ..., Xn form a random sample if the Xi's are independent and identically distributed (iid).

Deriving Sampling Distributions

Example
A certain system consists of two identical components. The lifetime of each component is assumed to have an exponential distribution with parameter λ. The system works if at least one component works properly, and the two components are assumed to work independently. Let X1 and X2 be the lifetimes of the two components, respectively. What can we say about the lifetime of the system T0 = X1 + X2?

Distribution for Sample Mean

Proposition
Let X1, X2, ..., Xn be a random sample from a distribution with mean value µ and standard deviation σ. Then
1. E(X̄) = µ_X̄ = µ
2. V(X̄) = σ²_X̄ = σ²/n and σ_X̄ = σ/√n

In words, the expected value of the sample mean equals the population mean; this is called the unbiasedness property. And the variance of the sample mean equals 1/n times the population variance.

Example (Problem 38 revisited)
There are two traffic lights on my way to work.
Let X1 be the number of lights at which I must stop, and suppose that the distribution of X1 is as follows:

x1     | 0   1   2
p(x1)  | .2  .5  .3

so that µ = 1.1 and σ² = .49. Let X2 be the number of lights at which I must stop on the way home; X2 is independent of X1. Assume that X2 has the same distribution as X1, so that X1, X2 is a random sample of size n = 2. Let X̄ = (X1 + X2)/2 denote the average number of stops.

a. Calculate µ_X̄.
b. Calculate σ²_X̄.

Proposition
Let X1, X2, ..., Xn be a random sample from a distribution with mean value µ and standard deviation σ. Define T0 = X1 + X2 + · · · + Xn. Then

E(T0) = nµ,  V(T0) = nσ²,  and  σ_{T0} = √n σ.

Proposition
Let X1, X2, ..., Xn be a random sample from a normal distribution with mean value µ and standard deviation σ. Then for any n, X̄ is normally distributed (with mean value µ and standard deviation σ/√n), as is T0 (with mean value nµ and standard deviation √n σ).
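The sample-mean proposition can be checked exactly for the traffic-light example by enumerating the joint distribution of (X1, X2). The sketch below uses only the pmf values given in the example; the variable names are my own:

```python
from itertools import product

# pmf of X1 (and of X2), from the example
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mu = sum(x * p for x, p in pmf.items())               # population mean
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # population variance

# Exact sampling distribution of Xbar = (X1 + X2)/2 for independent X1, X2
pmf_xbar = {}
for (x1, p1), (x2, p2) in product(pmf.items(), pmf.items()):
    xbar = (x1 + x2) / 2
    pmf_xbar[xbar] = pmf_xbar.get(xbar, 0.0) + p1 * p2

mu_xbar = sum(x * p for x, p in pmf_xbar.items())
var_xbar = sum((x - mu_xbar) ** 2 * p for x, p in pmf_xbar.items())

print(round(mu, 4), round(var, 4))            # 1.1 0.49
print(round(mu_xbar, 4), round(var_xbar, 4))  # 1.1 0.245  (= 0.49 / 2)
```

As the proposition predicts, µ_X̄ equals the population mean and σ²_X̄ equals σ²/n with n = 2, which also answers parts (a) and (b) of the example.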
Example (Problem 54)
Suppose the sediment density (g/cm) of a randomly selected specimen from a certain region is normally distributed with mean 2.65 and standard deviation .85 (suggested in "Modeling Sediment and Water Column Interactions for Hydrophobic Pollutants", Water Research, 1984: 1169-1174).

a. If a random sample of 25 specimens is selected, what is the probability that the sample average sediment density is at most 3.00?
b. How large a sample size would be required to ensure that the probability in part (a) is at least .99?

The Central Limit Theorem (CLT)
Let X1, X2, ..., Xn be a random sample from a distribution with mean value µ and standard deviation σ. Then if n is sufficiently large, X̄ has approximately a normal distribution with mean value µ and standard deviation σ/√n, and T0 also has approximately a normal distribution with mean value nµ and standard deviation √n σ. The larger the value of n, the better the approximation.

Remarks:
1. As long as n is sufficiently large, the CLT is applicable whether the Xi's are discrete or continuous random variables.
2. How large should n be for the CLT to be applicable? Generally, if n > 30, the CLT can be used.

Example (Problem 49)
There are 40 students in an elementary statistics class.
On the basis of years of experience, the instructor knows that the time needed to grade a randomly chosen first examination paper is a random variable with an expected value of 6 min and a standard deviation of 6 min.

a. If grading times are independent and the instructor begins grading at 6:50pm and grades continuously, what is the (approximate) probability that he is through grading before the 11:00pm TV news begins?
b. If the sports report begins at 11:10pm, what is the probability that he misses part of the report if he waits until grading is done before turning on the TV?

The original version of the CLT

The Central Limit Theorem (CLT)
Let X1, X2, ... be a sequence of iid random variables from a distribution with mean value µ and standard deviation σ. Define the random variables

$$Y_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma}, \qquad n = 1, 2, \ldots$$

Then as n → ∞, the distribution of Yn converges to the standard normal distribution.

Corollary
Let X1, X2, ..., Xn be a random sample from a distribution for which only positive values are possible [P(Xi > 0) = 1]. Then if n is sufficiently large, the product Y = X1 X2 · · · Xn has approximately a lognormal distribution (since ln Y = ln X1 + · · · + ln Xn is approximately normal by the CLT).

Distribution for Linear Combinations

Proposition
Let X1, X2, ..., Xn have mean values µ1, µ2, ..., µn, respectively, and variances σ1², σ2², ..., σn², respectively.
1. Whether or not the Xi's are independent,
E(a1X1 + a2X2 + · · · + anXn) = a1E(X1) + a2E(X2) + · · · + anE(Xn) = a1µ1 + a2µ2 + · · · + anµn.
2. If X1, X2, ..., Xn are independent,
V(a1X1 + a2X2 + · · · + anXn) = a1²V(X1) + a2²V(X2) + · · · + an²V(Xn) = a1²σ1² + a2²σ2² + · · · + an²σn².
3. More generally, for any X1, X2, ..., Xn,

$$V(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j \operatorname{Cov}(X_i, X_j).$$

We call a1X1 + a2X2 + · · · + anXn a linear combination of the Xi's.
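The double-sum formula in part 3 can be verified numerically for a small joint distribution. The sketch below uses a made-up toy joint pmf and arbitrary coefficients (assumptions for illustration, not from the text) and computes V(a1X1 + a2X2) both directly and via the formula:

```python
# Toy joint pmf for (X1, X2) -- made-up illustrative values, not from the text
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
a1, a2 = 2.0, -3.0  # arbitrary coefficients

def E(g):
    """Expectation of g(X1, X2) under the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())

mu1 = E(lambda x1, x2: x1)
mu2 = E(lambda x1, x2: x2)
cov11 = E(lambda x1, x2: (x1 - mu1) ** 2)          # Cov(X1, X1) = V(X1)
cov22 = E(lambda x1, x2: (x2 - mu2) ** 2)          # Cov(X2, X2) = V(X2)
cov12 = E(lambda x1, x2: (x1 - mu1) * (x2 - mu2))  # Cov(X1, X2)

# V(a1*X1 + a2*X2) two ways: directly, and via the double-sum formula
mu_y = a1 * mu1 + a2 * mu2
var_direct = E(lambda x1, x2: (a1 * x1 + a2 * x2 - mu_y) ** 2)
var_formula = a1**2 * cov11 + 2 * a1 * a2 * cov12 + a2**2 * cov22

print(round(var_direct, 6), round(var_formula, 6))  # both 1.96
```

Note that the cross term 2·a1·a2·Cov(X1, X2) is what the independence formula in part 2 drops; here Cov(X1, X2) = 0.1 ≠ 0, so the two formulas would disagree without it.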