More Than One Random Variable – Independence

In most experiments, we have more than one measured variable. Hence we need to examine probabilities associated with events that specify conditions on two or more random variables.

Defn: Let X and Y be two continuous r.v.'s. If for any open rectangle in $\mathbb{R}^2$ bounded by $a < x < b$ and $c < y < d$, we have
$$P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\, dy\, dx,$$
then the function $f(x, y)$ is called the joint probability density function (joint p.d.f.) for X and Y.

Defn: Let X and Y be two discrete r.v.'s. If for any pair of possible values x of X and y of Y, we have $f(x, y) = P(X = x,\ Y = y)$, then $f(x, y)$ is called the joint probability mass function (joint p.m.f.) for X and Y.

The above definitions may be immediately extended to any finite number of (continuous or discrete) r.v.'s.

Defn: We say that the r.v.'s $X_1, X_2, \dots, X_n$ are independent if for any events $E_1, E_2, \dots, E_n$, we have
$$P(X_1 \in E_1,\ X_2 \in E_2,\ \dots,\ X_n \in E_n) = P(X_1 \in E_1)\, P(X_2 \in E_2) \cdots P(X_n \in E_n).$$

Example: p. 108, Exercise 3-130.

Reliability of Multi-Component Systems

As an example of the uses of sets of independent random variables, consider a system (mechanical or electronic) consisting of several components which operate independently of each other. There are two basic types of such systems, series and parallel.

Defn: A system consisting of several components is called a series system if all components must be functioning for the system to function.

Defn: A system consisting of several components is called a parallel system if the system will function so long as any single component is functioning.

Defn: The reliability of a system at a time t is the probability that the system will operate according to specifications over the interval (0, t).

Note: For a series system, the system reliability is the product of the component reliabilities. For a parallel system, the system reliability is 1 minus the product of the component failure probabilities.
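The two rules in the Note translate directly into code. A minimal sketch, assuming independently operating components; the component reliabilities below are illustrative numbers, not from the text:

```python
import math

def series_reliability(rel):
    # All components must function, so multiply the component reliabilities.
    return math.prod(rel)

def parallel_reliability(rel):
    # The system fails only if every component fails:
    # 1 minus the product of the component failure probabilities.
    return 1 - math.prod(1 - r for r in rel)

# Hypothetical component reliabilities at some fixed time t.
rel = [0.90, 0.95, 0.99]
print(series_reliability(rel))    # 0.84645 -- below the weakest component
print(parallel_reliability(rel))  # 0.99995 -- above the strongest component
```

Note that the series value is smaller than any single component's reliability, while the parallel value is larger, anticipating the general remark made for the exponential example below.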
Some systems may be hybrids of series and parallel systems – a system may, for example, be a series system with components that are parallel systems of subcomponents.

Example: Consider a series system of n independently operating, identical electronic components. Assume that the lifetime of each component has an exponential distribution with a mean of 500 hours. Let $X_1, X_2, \dots, X_n$ be the component lifetimes. For component i, the probability that the component will still operate up to time x is
$$R_i = P(X_i > x) = \int_x^{\infty} \frac{1}{500} \exp\!\left(-\frac{u}{500}\right) du = \exp\!\left(-\frac{x}{500}\right).$$
The system reliability will be
$$R = \prod_{i=1}^{n} R_i = \exp\!\left(-\frac{nx}{500}\right).$$

Example: Instead of the series system, assume that the same components are connected in a parallel system. The system reliability will be
$$R = 1 - \prod_{i=1}^{n} (1 - R_i) = 1 - \left[1 - \exp\!\left(-\frac{x}{500}\right)\right]^n.$$

In general, a parallel system will have reliability greater than the reliability of any of its components, while a series system will have reliability less than that of any of its components.

In the example above, assume that there are 5 components, and that we want to find the reliability at 300 hours. For the series system, we have
$$R = \prod_{i=1}^{5} R_i = \exp\!\left(-\frac{(5)(300)}{500}\right) \approx 0.0498,$$
while for the parallel system, we have
$$R = 1 - \prod_{i=1}^{5} (1 - R_i) = 1 - \left[1 - \exp\!\left(-\frac{300}{500}\right)\right]^5 = 1 - \left(1 - e^{-0.6}\right)^5 \approx 0.9813.$$

Linear Functions of Random Variables

Let X be a r.v. with mean $\mu$ and standard deviation $\sigma$, and let c be a constant. Then we may define the r.v. Y by Y = X + c. Then the mean of the distribution of Y is $\mu + c$, and the standard deviation of Y is $\sigma$.

Let Y = cX. Then the mean of Y is $c\mu$ and the standard deviation of Y is $|c|\sigma$.

Let $X_1, X_2, \dots, X_n$ be (continuous or discrete) r.v.'s with means $\mu_1, \mu_2, \dots, \mu_n$, respectively, and variances $\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$, respectively. Let $c_1, c_2, \dots, c_n$ be constants. We define a r.v. Y to be the linear combination
$$Y = c_1 X_1 + c_2 X_2 + \dots + c_n X_n = \sum_{i=1}^{n} c_i X_i.$$
Then
$$\mu_Y = E(Y) = E\!\left(\sum_{i=1}^{n} c_i X_i\right) = \sum_{i=1}^{n} c_i E(X_i) = \sum_{i=1}^{n} c_i \mu_i.$$
This equation says that expectation is a linear operator.
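The 5-component, 300-hour calculation above can be reproduced in a few lines; a minimal sketch using the exponential-lifetime model with mean 500 hours from the example:

```python
import math

MEAN_LIFE = 500.0  # exponential mean lifetime in hours (from the example)

def component_reliability(t):
    # R_i = P(X_i > t) = exp(-t/500) for an exponential lifetime.
    return math.exp(-t / MEAN_LIFE)

def series(t, n):
    # Product of n identical component reliabilities: exp(-n*t/500).
    return component_reliability(t) ** n

def parallel(t, n):
    # 1 minus the product of the n failure probabilities.
    return 1 - (1 - component_reliability(t)) ** n

print(round(series(300, 5), 4))    # 0.0498
print(round(parallel(300, 5), 4))  # 0.9813
```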
Furthermore, if the r.v.'s are independent, then
$$\sigma_Y^2 = \sum_{i=1}^{n} c_i^2 \sigma_i^2.$$
In addition, if the X's have normal distributions, then Y also has a normal distribution. I.e., if $X_i \sim \text{Normal}(\mu_i, \sigma_i^2)$ for i = 1, 2, …, n, and if the X's are independent, then
$$Y = \sum_{i=1}^{n} c_i X_i \sim \text{Normal}\!\left(\sum_{i=1}^{n} c_i \mu_i,\ \sum_{i=1}^{n} c_i^2 \sigma_i^2\right).$$

Example: p. 116, Exercise 3-145.

What if the R.V.'s are not independent?

Let $X_1$ be a (continuous or discrete) r.v. with mean $\mu_1$ and variance $\sigma_1^2$. Let $X_2$ be a (continuous or discrete) r.v. with mean $\mu_2$ and variance $\sigma_2^2$. Let $Y = X_1 + X_2$. What are the mean and variance of the distribution of Y?

Since expectation is a linear operator, we have
$$\mu_Y = E(X_1 + X_2) = E(X_1) + E(X_2) = \mu_1 + \mu_2.$$

What about the variance? Since expectation is a linear operator, we have
$$\sigma_Y^2 = E\!\left[(Y - \mu_Y)^2\right] = E(Y^2) - \mu_Y^2 = E\!\left[(X_1 + X_2)^2\right] - (\mu_1 + \mu_2)^2$$
$$= E\!\left(X_1^2 + 2 X_1 X_2 + X_2^2\right) - \left(\mu_1^2 + 2\mu_1\mu_2 + \mu_2^2\right)$$
$$= \left[E(X_1^2) - \mu_1^2\right] + \left[E(X_2^2) - \mu_2^2\right] + 2\left[E(X_1 X_2) - \mu_1 \mu_2\right]$$
$$= \sigma_1^2 + \sigma_2^2 + 2\left[E(X_1 X_2) - \mu_1 \mu_2\right].$$

Defn: The covariance of two r.v.'s $X_1$ and $X_2$, with means $\mu_1$ and $\mu_2$, respectively, is
$$\text{Cov}(X_1, X_2) = E\!\left[(X_1 - \mu_1)(X_2 - \mu_2)\right] = E(X_1 X_2) - \mu_1 \mu_2.$$
With this notation, we have $\sigma_Y^2 = \sigma_1^2 + \sigma_2^2 + 2\,\text{Cov}(X_1, X_2)$.

Note that from the definition:
1) if larger values of $X_1$ tend to be associated with larger values of $X_2$, and smaller values of $X_1$ tend to be associated with smaller values of $X_2$, then the covariance is positive;
2) if larger values of $X_1$ tend to be associated with smaller values of $X_2$, and smaller values of $X_1$ tend to be associated with larger values of $X_2$, then the covariance is negative;
3) if the two r.v.'s are independent, then $\text{Cov}(X_1, X_2) = 0$.

Hence, the covariance describes how the two variables relate to each other. We want, however, a standardized measure of relationship, which does not depend on the scale of measurement of either variable.

Defn: The correlation of two r.v.'s $X_1$ and $X_2$, with means $\mu_1$ and $\mu_2$, respectively, and standard deviations $\sigma_1$ and $\sigma_2$, respectively, is
$$\rho_{12} = \frac{\text{Cov}(X_1, X_2)}{\sigma_1 \sigma_2} = \frac{E(X_1 X_2) - \mu_1 \mu_2}{\sigma_1 \sigma_2}.$$

Properties of the correlation coefficient:
1) For any two r.v.'s $X_1$ and $X_2$, $-1 \le \rho_{12} \le 1$.
2) If there is no linear relationship between the variables, then $\rho_{12} = 0$.
3) If there is a perfect positive linear relationship between $X_1$ and $X_2$, then $\rho_{12} = 1$, and $X_2$ is a linear function of $X_1$.
4) If there is a perfect negative linear relationship between $X_1$ and $X_2$, then $\rho_{12} = -1$, and $X_2$ is a linear function of $X_1$.

Given the above definitions, we may make the following statements about linear combinations of r.v.'s: Let $X_1, X_2, \dots, X_n$ be (continuous or discrete) r.v.'s, with means $\mu_1, \mu_2, \dots, \mu_n$, respectively, and variances $\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$, respectively. Let $c_1, c_2, \dots, c_n$ be constants. We define a r.v. Y to be the linear combination
$$Y = c_1 X_1 + c_2 X_2 + \dots + c_n X_n = \sum_{i=1}^{n} c_i X_i.$$
Then
1) $\mu_Y = E(Y) = E\!\left(\sum_{i=1}^{n} c_i X_i\right) = \sum_{i=1}^{n} c_i E(X_i) = \sum_{i=1}^{n} c_i \mu_i$, and
2) $\sigma_Y^2 = \sum_{i=1}^{n} c_i^2 \sigma_i^2 + 2 \sum_{i<j} c_i c_j \,\text{Cov}(X_i, X_j) = \sum_{i=1}^{n} c_i^2 \sigma_i^2 + 2 \sum_{i<j} c_i c_j \sigma_i \sigma_j \rho_{ij}$.

Note that if the r.v.'s are all independent, then the correlations are 0, and the two equations above reduce to the two equations from page 61 above.

Random Samples, Statistics, and the Central Limit Theorem

Defn: A set of (continuous or discrete) random variables $X_1, X_2, \dots, X_n$ is called a random sample if the r.v.'s have the same distribution and are independent. We say that $X_1, X_2, \dots, X_n$ are independent and identically distributed (i.i.d.).

Defn: A statistic is a random variable which is a function of a random sample. The probability distribution associated with a statistic is called its sampling distribution.

For example, let $X_1, X_2, \dots, X_n$ be a random sample from a distribution having mean $\mu$ and standard deviation $\sigma$. The statistic
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
is called the sample mean. Since the X's are random variables, $\bar{X}$ is also a random variable, with a sampling distribution. From the equations on page 61 above, we have the following:
$$E(\bar{X}) = E\!\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} \mu = \mu.$$
Since the members of the sample are i.i.d.,
$$\sigma_{\bar{X}}^2 = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}.$$

If the random sample was selected from a normal distribution (we write $X_1, X_2, \dots, X_n \sim \text{Normal}(\mu, \sigma^2)$), then we can also say that $\bar{X} \sim \text{Normal}(\mu, \sigma^2/n)$.

Example: p. 121, Example 3-161.

Some other examples of statistics are:
1) the sample variance, $S^2 = \dfrac{1}{n-1} \displaystyle\sum_{i=1}^{n} (X_i - \bar{X})^2$,
2) the sample median, $\tilde{X}$,
3) the kth order statistic, $X_{(k)}$.

Theorem: (Central Limit Theorem) If $X_1, X_2, \dots, X_n$ are a random sample from any distribution with mean $\mu$ and standard deviation $\sigma < +\infty$, then the limiting distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
as $n \to +\infty$ is standard normal.

Note: Nothing was said about the distribution from which the sample was selected except that it has finite standard deviation. The sample could be selected from a normal distribution, or from an exponential distribution, or from a Weibull distribution, or from a Bernoulli distribution, or from a Poisson distribution, or from any other distribution with finite standard deviation. See, e.g., the illustration on p. 120.

Note: For what n will the normal approximation be good? For most purposes, if $n \ge 30$, we will say that the approximation given by the Central Limit Theorem (CLT) works well.

Example: p. 122, Exercise 3-163.

Example: The fracture strength of tempered glass averages 14 (measured in thousands of p.s.i.) and has a standard deviation of 2. What is the probability that the average fracture strength of 100 randomly selected pieces of tempered glass will exceed 14,500 p.s.i.?

Example: Shear strength measurements for spot welds have been found to have a standard deviation of 10 p.s.i. If 100 test welds are to be measured, what is the approximate probability that the sample mean will be within 1 p.s.i. of the true population mean?

Normal Approximation to Binomial Distribution

Assume that $Y \sim \text{Binomial}(n, p)$. We may also consider Y to be a sum of n i.i.d. Bernoulli(p) random variables, $X_1, X_2, \dots, X_n$. If n is large ($n \ge 30$), then the CLT implies that
$$\frac{\bar{X} - p}{\sqrt{p(1-p)/n}} = \frac{Y - np}{\sqrt{np(1-p)}}$$
has an approximate standard normal distribution.
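Looking back at the linear-combination results, the mean and variance formulas can be sketched directly. The coefficients, standard deviations, and correlations below are illustrative numbers, not from the text:

```python
def linear_combo_mean(c, mu):
    # mu_Y = sum_i c_i * mu_i
    return sum(ci * mi for ci, mi in zip(c, mu))

def linear_combo_variance(c, sigma, rho):
    # sigma_Y^2 = sum_i c_i^2 sigma_i^2
    #           + 2 * sum_{i<j} c_i c_j sigma_i sigma_j rho_ij
    n = len(c)
    var = sum(c[i] ** 2 * sigma[i] ** 2 for i in range(n))
    var += 2 * sum(c[i] * c[j] * sigma[i] * sigma[j] * rho[i][j]
                   for i in range(n) for j in range(i + 1, n))
    return var

c = [1.0, -2.0, 0.5]
mu = [10.0, 4.0, 6.0]
sigma = [3.0, 1.0, 2.0]
rho = [[1.0, 0.4, 0.0],     # rho[i][j] = correlation of X_i and X_j
       [0.4, 1.0, -0.3],
       [0.0, -0.3, 1.0]]
print(linear_combo_mean(c, mu))               # 5.0
print(linear_combo_variance(c, sigma, rho))   # approx. 10.4
```

Replacing rho by the identity matrix (all correlations 0) gives $\sum c_i^2 \sigma_i^2 = 14$, the independent-case formula, as the Note above states.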
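The two CLT examples above (glass strength and spot welds) reduce to standard normal probabilities; a sketch using only the standard library, with $\Phi$ built from `math.erf`:

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Glass: mu = 14, sigma = 2 (thousands of p.s.i.), n = 100.
mu, sigma, n = 14.0, 2.0, 100
z = (14.5 - mu) / (sigma / math.sqrt(n))   # z = 2.5
p_glass = 1 - phi(z)
print(round(p_glass, 4))   # 0.0062

# Welds: sigma = 10 p.s.i., n = 100, so sigma/sqrt(n) = 1 p.s.i.
# P(|Xbar - mu| <= 1) = P(-1 <= Z <= 1).
p_weld = phi(1.0) - phi(-1.0)
print(round(p_weld, 4))    # 0.6827
```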
Example: Assume that the date is October 1, 2008. We want to predict the outcome of the Presidential election. We select a simple random sample of 1068 voters from the population of all U.S. voters, and ask each voter in the sample, "Do you intend to vote for Sen. Obama for President?" Let X = the number of voters in the sample who plan to vote for Sen. Obama. The actual level of support for the Senator in the voting population was 0.53 (from the outcome of the election). What is the probability that a majority of the voters in the sample will say that they are supporters of Sen. Obama? We will calculate this probability exactly, using the binomial distribution, and then calculate an approximate probability, using the normal approximation to the binomial distribution.

Example: It is known that under normal operating conditions, a machine tool produces 1% defective parts. We want to decide whether the rate of defects has increased. We select a simple random sample of 36 parts produced by this machine, and let X = the number of defective parts in the sample. Again, we will calculate the exact probability, and the approximate probability.
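For the election example, the exact binomial probability and the normal approximation can be compared directly. A sketch, assuming "majority" means $X \ge 535$ out of 1068 and applying a continuity correction in the approximation; the pmf is evaluated in log space to avoid overflow in the binomial coefficient:

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_pmf(n, k, p):
    # log C(n, k) computed with lgamma so huge coefficients never overflow.
    log_coef = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coef + k * math.log(p) + (n - k) * math.log(1 - p))

n, p = 1068, 0.53

# Exact: P(X >= 535) under Binomial(1068, 0.53).
exact = sum(binom_pmf(n, k, p) for k in range(535, n + 1))

# Normal approximation with continuity correction:
# P(X >= 535) is approximated by P(Z >= (534.5 - np) / sqrt(np(1-p))).
z = (534.5 - n * p) / math.sqrt(n * p * (1 - p))
approx = 1 - phi(z)

print(round(exact, 4), round(approx, 4))
```

Both numbers come out close to each other, which is the point of the comparison: with n this large, the normal curve tracks the binomial closely. (For the machine-tool example, n = 36 with p = 0.01 is a much less favorable case for the approximation.)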