Sections 4.1, 4.2, 4.3

Important Definitions in the Text:
The definition of joint probability mass function (joint p.m.f.): Definition 4.1-1
The definitions of marginal probability mass function (marginal p.m.f.) and the independence of random variables: Definition 4.1-2

If the joint p.m.f. of (X, Y) is \(f(x,y)\), and \(S\) is the corresponding outcome space, then the mathematical expectation, or expected value, of \(u(X,Y)\) is
\[ E[u(X,Y)] = \sum_{(x,y) \in S} u(x,y)\, f(x,y). \]
If the marginal p.m.f. of X is \(f_1(x)\), and \(S_1\) is the corresponding outcome space, then \(E[v(X)]\) can be calculated from either
\[ \sum_{x \in S_1} v(x)\, f_1(x) \quad \text{or} \quad \sum_{(x,y) \in S} v(x)\, f(x,y). \]
An analogous statement can be made about \(E[v(Y)]\).

1. Twelve bags each contain two pieces of candy, one red and one green. In two of the bags each piece of candy weighs 1 gram; in three of the bags the red candy weighs 2 grams and the green candy weighs 1 gram; in three of the bags the red candy weighs 1 gram and the green candy weighs 2 grams; in the remaining four bags each piece of candy weighs 2 grams. One bag is selected at random and the following random variables are defined:
X = weight of the red candy,
Y = weight of the green candy.

The space of (X, Y) is {(1,1), (1,2), (2,1), (2,2)}, with joint probabilities

         x = 1    x = 2
y = 2    1/4      1/3
y = 1    1/6      1/4

The joint p.m.f. of (X, Y) is
\[ f(x,y) = \begin{cases} 1/6 & \text{if } (x,y) = (1,1) \\ 1/4 & \text{if } (x,y) = (1,2),\ (2,1) \\ 1/3 & \text{if } (x,y) = (2,2) \end{cases} \]
The marginal p.m.f. of X is \(f_1(x) = 5/12\) if \(x = 1\), and \(7/12\) if \(x = 2\).
The marginal p.m.f. of Y is \(f_2(y) = 5/12\) if \(y = 1\), and \(7/12\) if \(y = 2\).

A formula for the joint p.m.f. of (X, Y) is
\[ f(x,y) = \frac{x+y}{12} \quad \text{if } (x,y) = (1,1),\ (1,2),\ (2,1),\ (2,2). \]
A formula for the marginal p.m.f. of X is
\[ f_1(x) = f(x,1) + f(x,2) = \frac{x+1}{12} + \frac{x+2}{12} = \frac{2x+3}{12} \quad \text{if } x = 1, 2. \]
A formula for the marginal p.m.f. of Y is
\[ f_2(y) = f(1,y) + f(2,y) = \frac{1+y}{12} + \frac{2+y}{12} = \frac{2y+3}{12} \quad \text{if } y = 1, 2. \]

1. - continued
\(E(X) = (1)(5/12) + (2)(7/12) = 19/12\)
\(E(X^2) = (1)^2(5/12) + (2)^2(7/12) = 11/4\)
\(\mathrm{Var}(X) = 11/4 - (19/12)^2 = 35/144\)
\(E(Y) = (1)(5/12) + (2)(7/12) = 19/12\)
\(E(Y^2) = (1)^2(5/12) + (2)^2(7/12) = 11/4\)
\(\mathrm{Var}(Y) = 11/4 - (19/12)^2 = 35/144\)

Since \(f(x,y) \ne f_1(x) f_2(y)\), the random variables X and Y are not independent.

Using the joint p.m.f., \(E(X + Y) = (1+1)(1/6) + (1+2)(1/4) + (2+1)(1/4) + (2+2)(1/3) = 19/6\).
Alternatively, \(E(X + Y) = E(X) + E(Y) = 19/12 + 19/12 = 19/6\).
Using the joint p.m.f., \(E(X - Y) = (1-1)(1/6) + (1-2)(1/4) + (2-1)(1/4) + (2-2)(1/3) = 0\).
Alternatively, \(E(X - Y) = E(X) - E(Y) = 19/12 - 19/12 = 0\).

E(X + Y) can be interpreted as the mean of the total weight of candy in the bag.
E(X - Y) can be interpreted as the mean of how much more the red candy in the bag weighs than the green candy.

\(E(XY) = (1)(1)(1/6) + (1)(2)(1/4) + (2)(1)(1/4) + (2)(2)(1/3) = 5/2\)
Cov(X, Y) = ______

1. - continued
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
The conditional p.m.f. of Y | X = 1 is ______
The conditional p.m.f. of Y | X = 2 is ______
For x = 1, 2, a formula for the conditional p.m.f. of Y | X = x is ______

1. - continued
The conditional p.m.f. of X | Y = 1 is ______
The conditional p.m.f. of X | Y = 2 is ______
For y = 1, 2, a formula for the conditional p.m.f. of X | Y = y is ______
\(E(Y \mid X = 1)\) = ______   \(E(Y^2 \mid X = 1)\) = ______   \(\mathrm{Var}(Y \mid X = 1)\) = ______
\(E(Y \mid X = 2)\) = ______   \(E(Y^2 \mid X = 2)\) = ______   \(\mathrm{Var}(Y \mid X = 2)\) = ______
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx?

1.
- continued
\(E(X \mid Y = 1)\) = ______   \(E(X^2 \mid Y = 1)\) = ______   \(\mathrm{Var}(X \mid Y = 1)\) = ______
\(E(X \mid Y = 2)\) = ______   \(E(X^2 \mid Y = 2)\) = ______   \(\mathrm{Var}(X \mid Y = 2)\) = ______
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy?

2. An urn contains six chips: one $1 chip, two $2 chips, and three $3 chips. Two chips are selected at random and without replacement. The following random variables are defined:
X = dollar value of the first chip selected,
Y = dollar value of the second chip selected.

The space of (X, Y) is {(1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}, with joint probabilities

         x = 1    x = 2    x = 3
y = 3    1/10     1/5      1/5
y = 2    1/15     1/15     1/5
y = 1             1/15     1/10

The joint p.m.f. of (X, Y) is
\[ f(x,y) = \begin{cases} 1/5 & \text{if } (x,y) = (2,3),\ (3,2),\ (3,3) \\ 1/10 & \text{if } (x,y) = (1,3),\ (3,1) \\ 1/15 & \text{if } (x,y) = (1,2),\ (2,1),\ (2,2) \end{cases} \]

2. - continued
The marginal p.m.f. of X is \(f_1(x) = 1/6\) if \(x = 1\), \(1/3\) if \(x = 2\), \(1/2\) if \(x = 3\).
The marginal p.m.f. of Y is \(f_2(y) = 1/6\) if \(y = 1\), \(1/3\) if \(y = 2\), \(1/2\) if \(y = 3\).
A formula for the joint p.m.f. of (X, Y) is f(x, y) = (There seems to be no easy formula.)
A formula for the marginal p.m.f. of X is \(f_1(x) = x/6\) if \(x = 1, 2, 3\).
A formula for the marginal p.m.f. of Y is \(f_2(y) = y/6\) if \(y = 1, 2, 3\).

\(E(X) = 7/3\)   \(E(X^2) = 6\)   \(\mathrm{Var}(X) = 6 - (7/3)^2 = 5/9\)
\(E(Y) = 7/3\)   \(E(Y^2) = 6\)   \(\mathrm{Var}(Y) = 6 - (7/3)^2 = 5/9\)

Since \(f(x,y) \ne f_1(x) f_2(y)\), the random variables X and Y are not independent.

\(P(X + Y < 4) = P[(X,Y) = (1,2)] + P[(X,Y) = (2,1)] = 1/15 + 1/15 = 2/15\)

Using the joint p.m.f.,
\(E(XY) = (1)(2)(2/30) + (1)(3)(3/30) + (2)(1)(2/30) + (2)(2)(2/30) + (2)(3)(6/30) + (3)(1)(3/30) + (3)(2)(6/30) + (3)(3)(6/30) = 16/3\)

2. - continued
Cov(X, Y) = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
The conditional p.m.f. of Y | X = 1 is ______
The conditional p.m.f. of Y | X = 2 is ______
The conditional p.m.f. of Y | X = 3 is ______
For x = 1, 2, 3, a formula for the conditional p.m.f. of Y | X = x is ______

2. - continued
The conditional p.m.f.
of X | Y = 1 is ______; of X | Y = 2 is ______; of X | Y = 3 is ______
For y = 1, 2, 3, a formula for the conditional p.m.f. of X | Y = y is ______

\(E(Y \mid X = 1)\) = ______          \(E(X \mid Y = 1)\) = ______
\(E(Y^2 \mid X = 1)\) = ______        \(E(X^2 \mid Y = 1)\) = ______
\(\mathrm{Var}(Y \mid X = 1)\) = ______      \(\mathrm{Var}(X \mid Y = 1)\) = ______
\(E(Y \mid X = 2)\) = ______          \(E(X \mid Y = 2)\) = ______
\(E(Y^2 \mid X = 2)\) = ______        \(E(X^2 \mid Y = 2)\) = ______
\(\mathrm{Var}(Y \mid X = 2)\) = ______      \(\mathrm{Var}(X \mid Y = 2)\) = ______
\(E(Y \mid X = 3)\) = ______          \(E(X \mid Y = 3)\) = ______
\(E(Y^2 \mid X = 3)\) = ______        \(E(X^2 \mid Y = 3)\) = ______
\(\mathrm{Var}(Y \mid X = 3)\) = ______      \(\mathrm{Var}(X \mid Y = 3)\) = ______

2. - continued
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx?
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy?

3. An urn contains six chips: one $1 chip, two $2 chips, and three $3 chips. Two chips are selected at random and with replacement. The following random variables are defined:
X = dollar value of the first chip selected,
Y = dollar value of the second chip selected.

The space of (X, Y) is {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}, with joint probabilities

         x = 1    x = 2    x = 3
y = 3    1/12     1/6      1/4
y = 2    1/18     1/9      1/6
y = 1    1/36     1/18     1/12

The joint p.m.f. of (X, Y) is
\[ f(x,y) = \frac{xy}{36} \quad \text{if } x = 1, 2, 3 \text{ and } y = 1, 2, 3. \]

3. - continued
The marginal p.m.f. of X is \(f_1(x) = 1/6\) if \(x = 1\), \(1/3\) if \(x = 2\), \(1/2\) if \(x = 3\).
The marginal p.m.f. of Y is \(f_2(y) = 1/6\) if \(y = 1\), \(1/3\) if \(y = 2\), \(1/2\) if \(y = 3\).
A formula for the joint p.m.f. of (X, Y) is f(x, y) = (The formula was found previously.)
A formula for the marginal p.m.f. of X is \(f_1(x) = x/6\) if \(x = 1, 2, 3\).
A formula for the marginal p.m.f. of Y is \(f_2(y) = y/6\) if \(y = 1, 2, 3\).

\(E(X) = 7/3\)   \(E(X^2) = 6\)   \(\mathrm{Var}(X) = 6 - (7/3)^2 = 5/9\)
\(E(Y) = 7/3\)   \(E(Y^2) = 6\)   \(\mathrm{Var}(Y) = 6 - (7/3)^2 = 5/9\)

Since \(f(x,y) = f_1(x) f_2(y)\), the random variables X and Y are independent.

\(P(X + Y < 4) = P[(X,Y) = (1,1)] + P[(X,Y) = (1,2)] + P[(X,Y) = (2,1)] = 1/36 + 1/18 + 1/18 = 5/36\)

3.
- continued
\[ E(XY) = \sum_{x=1}^{3}\sum_{y=1}^{3} (xy)\,\frac{xy}{36} = \sum_{x=1}^{3}\sum_{y=1}^{3} (xy)\,\frac{x}{6}\,\frac{y}{6} = \left[\sum_{x=1}^{3} x \cdot \frac{x}{6}\right]\left[\sum_{y=1}^{3} y \cdot \frac{y}{6}\right] = E(X)\,E(Y) = \frac{7}{3}\cdot\frac{7}{3} = \frac{49}{9} \]
Cov(X, Y) = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
For x = 1, 2, 3, the conditional p.m.f. of Y | X = x is ______
\(E(Y \mid X = x)\) = ______   \(\mathrm{Var}(Y \mid X = x)\) = ______
For y = 1, 2, 3, the conditional p.m.f. of X | Y = y is ______
\(E(X \mid Y = y)\) = ______   \(\mathrm{Var}(X \mid Y = y)\) = ______
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx?
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy?

For continuous type random variables (X, Y), the definitions of joint probability density function (joint p.d.f.), independence of X and Y, and mathematical expectation are each analogous to those for discrete type random variables, with summation signs replaced by integral signs.

The covariance between random variables X and Y is ______
The correlation between random variables X and Y is ______

Consider the equation of a line y = a + bx which comes "closest" to predicting the values of the random variable Y from the random variable X in the sense that \(E\{[Y - (a + bX)]^2\}\) is minimized.

We let \(k(a,b) = E\{[Y - (a + bX)]^2\}\) = ______
To minimize k(a, b), we set the partial derivatives with respect to a and b equal to zero. (Note: This is textbook exercise 4.2-5.)
\(\partial k / \partial a\) = ______
\(\partial k / \partial b\) = ______
(Multiply the first equation by \(\mu_X\), subtract the resulting equation from the second equation, and solve for b. Then substitute in place of b in the first equation to solve for a.)
b = ______
a = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting Y from X can be written ______
The least squares line for predicting X from Y can be written ______

The conditional p.m.f./p.d.f. of Y given X = x is defined to be ______
The conditional p.m.f./p.d.f.
of X given Y = y is defined to be ______
The conditional mean of Y given X = x is defined to be ______
The conditional variance of Y given X = x is defined to be ______
The conditional mean of X given Y = y and the conditional variance of X given Y = y are each defined similarly.
For continuous type random variables (X, Y), the definitions of conditional mean and variance are each analogous to those for discrete type random variables, with summation signs replaced by integral signs.

Suppose X and Y are two discrete type random variables, and E(Y | X = x) = a + bx.
Then, for each possible value of x, ______
Multiplying each side by \(f_1(x)\), ______
Summing each side over all x, ______
Now, multiplying each side of ______ by \(x f_1(x)\), ______
Summing each side over all x, ______
The two equations ______ and ______ are essentially the same as those in the derivation of the least squares line for predicting Y from X. This derivation is analogous for continuous type random variables, with summation signs replaced by integral signs.

Consequently, if E(Y | X = x) = a + bx (i.e., if E(Y | X = x) is a linear function of x), then a and b must be respectively the intercept and slope in the least squares line for predicting Y from X.
Similarly, if E(X | Y = y) = c + dy (i.e., if E(X | Y = y) is a linear function of y), then c and d must be respectively the intercept and slope in the least squares line for predicting X from Y.

Suppose a set contains \(N = N_1 + N_2 + N_3\) items, where \(N_1\) items are of one type, \(N_2\) items are of a second type, and \(N_3\) items are of a third type; n items are selected from the N items at random and without replacement. If the random variable \(X_1\) is defined to be the number of the n selected items that are of the first type, the random variable \(X_2\) is defined to be the number of the n selected items that are of the second type, and the random variable \(X_3\) is defined to be the number of the n selected items that are of the third type, then the joint distribution of \((X_1, X_2, X_3)\) is called a trivariate hypergeometric distribution.
Since \(X_3 = n - X_1 - X_2\), \(X_3\) is totally determined by \(X_1\) and \(X_2\).
The joint p.m.f. of \((X_1, X_2)\) is ______
Each \(X_i\) has a ______ distribution.
If the number of types of items is any integer k > 1 with \((X_1, X_2, \dots, X_k)\) defined in the natural way, then the joint p.m.f. is called a multivariate hypergeometric distribution.

Suppose each in a sequence of independent trials must result in one of outcome 1, outcome 2, or outcome 3. The probability of outcome 1 on each trial is \(p_1\), the probability of outcome 2 on each trial is \(p_2\), and the probability of outcome 3 on each trial is \(p_3 = 1 - p_1 - p_2\). If the random variable \(X_1\) is defined to be the number of the n trials resulting in outcome 1, the random variable \(X_2\) is defined to be the number of the n trials resulting in outcome 2, and the random variable \(X_3\) is defined to be the number of the n trials resulting in outcome 3, then the joint distribution of \((X_1, X_2, X_3)\) is called a trinomial distribution.
Since \(X_3 = n - X_1 - X_2\), \(X_3\) is totally determined by \(X_1\) and \(X_2\).
The joint p.m.f. of \((X_1, X_2)\) is ______
Each \(X_i\) has a ______ distribution.
If the number of outcomes is any integer k > 1 with \((X_1, X_2, \dots, X_k)\) defined in the natural way, then the joint p.m.f. is called a multinomial distribution.

4. An urn contains 15 red chips, 10 blue chips, and 5 white chips. Eight chips are selected at random and without replacement. The following random variables are defined:
\(X_1\) = number of red chips selected,
\(X_2\) = number of blue chips selected,
\(X_3\) = number of white chips selected.

(a) Find the joint p.m.f. of \((X_1, X_2, X_3)\).
\((X_1, X_2, X_3)\) has a trivariate hypergeometric distribution, and \(X_3 = 8 - X_1 - X_2\) is totally determined by \(X_1\) and \(X_2\). The joint p.m.f. of \((X_1, X_2)\) is
\[ f(x_1, x_2) = \frac{\binom{15}{x_1}\binom{10}{x_2}\binom{5}{8 - x_1 - x_2}}{\binom{30}{8}} \quad \text{if } x_1 = 0, 1, \dots, 8;\ x_2 = 0, 1, \dots, 8;\ 3 \le x_1 + x_2 \le 8. \]

(b) Find the marginal p.m.f. for each of \(X_1\), \(X_2\), and \(X_3\).
Each of \(X_1\), \(X_2\), and \(X_3\) has a hypergeometric distribution.
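The claim in part (b) can be checked numerically. The following is a minimal sketch, not part of the original notes: it builds the joint p.m.f. from part (a) with exact fractions, sums out \(x_2\), and compares the result with a hypergeometric p.m.f. for the red chips. The function names `joint_pmf` and `hypergeometric_pmf` are illustrative choices.

```python
from fractions import Fraction
from math import comb

# Urn from Exercise 4: 15 red, 10 blue, 5 white; 8 chips drawn without replacement.
N_RED, N_BLUE, N_WHITE, N_DRAWN = 15, 10, 5, 8
TOTAL = N_RED + N_BLUE + N_WHITE


def joint_pmf(x1, x2):
    """Joint p.m.f. of (X1, X2); returns 0 outside the support."""
    x3 = N_DRAWN - x1 - x2
    if min(x1, x2, x3) < 0 or x1 > N_RED or x2 > N_BLUE or x3 > N_WHITE:
        return Fraction(0)
    ways = comb(N_RED, x1) * comb(N_BLUE, x2) * comb(N_WHITE, x3)
    return Fraction(ways, comb(TOTAL, N_DRAWN))


def hypergeometric_pmf(k, successes):
    """P(k chips of one color) when `successes` of the TOTAL chips have that color."""
    ways = comb(successes, k) * comb(TOTAL - successes, N_DRAWN - k)
    return Fraction(ways, comb(TOTAL, N_DRAWN))


# The joint probabilities should sum to 1 over the whole grid.
total_probability = sum(joint_pmf(a, b) for a in range(9) for b in range(9))

# Marginal of X1: sum the joint p.m.f. over x2, then compare with the hypergeometric form.
marginal_x1 = [sum(joint_pmf(a, b) for b in range(9)) for a in range(9)]
marginals_match = all(
    marginal_x1[a] == hypergeometric_pmf(a, N_RED) for a in range(9)
)

print(total_probability, marginals_match)
```

Using `Fraction` keeps every probability exact, so the comparison is an equality check rather than a floating-point tolerance check.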
\[ f_1(x_1) = \frac{\binom{15}{x_1}\binom{15}{8 - x_1}}{\binom{30}{8}} \quad \text{if } x_1 = 0, 1, \dots, 8 \]
\[ f_2(x_2) = \frac{\binom{10}{x_2}\binom{20}{8 - x_2}}{\binom{30}{8}} \quad \text{if } x_2 = 0, 1, \dots, 8 \]

4. - continued
\[ f_3(x_3) = \frac{\binom{5}{x_3}\binom{25}{8 - x_3}}{\binom{30}{8}} \quad \text{if } x_3 = 0, 1, \dots, 5 \]

(c) Are \(X_1\), \(X_2\), and \(X_3\) independent? Why or why not?
\(X_1\), \(X_2\), \(X_3\) cannot possibly be independent, because any one of these random variables is totally determined by the other two.

(d) Find the probability that at least two of the selected chips are blue or at least two chips are white.
\[ P(\{X_2 \ge 2\} \cup \{X_3 \ge 2\}) = 1 - P(\{X_2 \le 1\} \cap \{X_3 \le 1\}) \]
\[ = 1 - [P(X_2 = 0, X_3 = 0) + P(X_2 = 1, X_3 = 0) + P(X_2 = 0, X_3 = 1) + P(X_2 = 1, X_3 = 1)] \]
\[ = 1 - \left[ \frac{\binom{15}{8}}{\binom{30}{8}} + \frac{\binom{15}{7}\binom{10}{1}}{\binom{30}{8}} + \frac{\binom{15}{7}\binom{5}{1}}{\binom{30}{8}} + \frac{\binom{15}{6}\binom{10}{1}\binom{5}{1}}{\binom{30}{8}} \right] \]

4. - continued
(e) Find the conditional p.m.f. of \(X_1 \mid x_2\).
\(X_1 \mid x_2\) can be treated as "the number of red chips selected when ______".
For \(x_2\) = ______, \(X_1 \mid x_2\) has a ______ distribution with p.m.f. ______

(f) \(E(X_1 \mid x_2)\) can be written as a linear function of \(x_2\), since \(E(X_1 \mid x_2)\) = ______
Therefore, the least squares line for predicting \(X_1\) from \(X_2\) must be ______

(g) \(E(X_2 \mid x_1)\) can be written as a linear function of \(x_1\), since \(E(X_2 \mid x_1)\) = ______
Therefore, the least squares line for predicting \(X_2\) from \(X_1\) must be ______

(h) Find the covariance and correlation between \(X_1\) and \(X_2\) by making use of the following facts (instead of using direct formulas):
The slope in the least squares line for predicting \(X_1\) from \(X_2\) is ______
The slope in the least squares line for predicting \(X_2\) from \(X_1\) is ______
The product of the slope in the least squares line for predicting \(X_1\) from \(X_2\) and the slope in the least squares line for predicting \(X_2\) from \(X_1\) is equal to ______.

Suppose a set contains \(N = N_1 + N_2 + N_3\) items, where \(N_1\) items are of one type, \(N_2\) items are of a second type, and \(N_3\) items are of a third type; n items are selected from the N items at random and without replacement.
If the random variable \(X_1\) is defined to be the number of the n selected items that are of the first type, the random variable \(X_2\) is defined to be the number of the n selected items that are of the second type, and the random variable \(X_3\) is defined to be the number of the n selected items that are of the third type, then the joint distribution of \((X_1, X_2, X_3)\) is called a trivariate hypergeometric distribution.
Since \(X_3 = n - X_1 - X_2\), \(X_3\) is totally determined by \(X_1\) and \(X_2\).
The joint p.m.f. of \((X_1, X_2)\) is
\[ \frac{\binom{N_1}{x_1}\binom{N_2}{x_2}\binom{N - N_1 - N_2}{n - x_1 - x_2}}{\binom{N}{n}} \quad \text{if } x_1 \text{ and } x_2 \text{ are "appropriate" integers.} \]
Each \(X_i\) has a hypergeometric distribution.
If the number of types of items is any integer k > 1 with \((X_1, X_2, \dots, X_k)\) defined in the natural way, then the joint p.m.f. is called a multivariate hypergeometric distribution.

5. An urn contains 15 red chips, 10 blue chips, and 5 white chips. Eight chips are selected at random and with replacement. The following random variables are defined:
\(X_1\) = number of red chips selected,
\(X_2\) = number of blue chips selected,
\(X_3\) = number of white chips selected.

(a) Find the joint p.m.f. of \((X_1, X_2, X_3)\).
\((X_1, X_2, X_3)\) has a trinomial distribution, and \(X_3 = 8 - X_1 - X_2\) is totally determined by \(X_1\) and \(X_2\). The joint p.m.f. of \((X_1, X_2)\) is
\[ f(x_1, x_2) = \frac{8!}{x_1!\, x_2!\, (8 - x_1 - x_2)!} \left(\frac{1}{2}\right)^{x_1} \left(\frac{1}{3}\right)^{x_2} \left(\frac{1}{6}\right)^{8 - x_1 - x_2} \quad \text{if } x_1 = 0, 1, \dots, 8;\ x_2 = 0, 1, \dots, 8;\ x_1 + x_2 \le 8. \]

(b) Find the marginal p.m.f. for each of \(X_1\), \(X_2\), and \(X_3\).
Each of \(X_1\), \(X_2\), and \(X_3\) has a binomial distribution.
\[ f_1(x_1) = \frac{8!}{x_1!\,(8 - x_1)!} \left(\frac{1}{2}\right)^{8} \quad \text{if } x_1 = 0, 1, \dots, 8 \]
\[ f_2(x_2) = \frac{8!}{x_2!\,(8 - x_2)!} \left(\frac{1}{3}\right)^{x_2} \left(\frac{2}{3}\right)^{8 - x_2} \quad \text{if } x_2 = 0, 1, \dots, 8 \]

5. - continued
\[ f_3(x_3) = \frac{8!}{x_3!\,(8 - x_3)!} \left(\frac{1}{6}\right)^{x_3} \left(\frac{5}{6}\right)^{8 - x_3} \quad \text{if } x_3 = 0, 1, \dots, 8 \]

(c) Are \(X_1\), \(X_2\), and \(X_3\) independent? Why or why not?
\(X_1\), \(X_2\), \(X_3\) cannot possibly be independent, because any one of these random variables is totally determined by the other two.

(d) Find the probability that at least two of the selected chips are blue or at least two chips are white.
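Part (d) can be checked numerically in two ways that should agree: by summing the trinomial p.m.f. over the event itself, and by the complement rule over the four cases with at most one blue and at most one white chip. The sketch below is an illustration under the exercise's stated probabilities (1/2, 1/3, 1/6); the name `trinomial_pmf` is a hypothetical helper, not something defined in the notes.

```python
from fractions import Fraction
from math import factorial

# Exercise 5: 8 draws with replacement; P(red) = 1/2, P(blue) = 1/3, P(white) = 1/6.
N_TRIALS = 8
P_RED, P_BLUE, P_WHITE = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)


def trinomial_pmf(x1, x2):
    """Joint p.m.f. of (X1, X2); X3 = N_TRIALS - x1 - x2 is implied."""
    x3 = N_TRIALS - x1 - x2
    if min(x1, x2, x3) < 0:
        return Fraction(0)
    coefficient = factorial(N_TRIALS) // (
        factorial(x1) * factorial(x2) * factorial(x3)
    )
    return coefficient * P_RED**x1 * P_BLUE**x2 * P_WHITE**x3


# Direct route: add up every outcome with at least two blue or at least two white.
prob_direct = sum(
    trinomial_pmf(x1, x2)
    for x1 in range(N_TRIALS + 1)
    for x2 in range(N_TRIALS + 1 - x1)
    if x2 >= 2 or (N_TRIALS - x1 - x2) >= 2
)

# Complement route: subtract the four cases with X2 <= 1 and X3 <= 1.
prob_complement = 1 - sum(
    trinomial_pmf(N_TRIALS - x2 - x3, x2) for x2 in (0, 1) for x3 in (0, 1)
)

print(prob_direct, prob_complement)
```

The complement route needs only four terms, which is why the worked solution takes it; the direct sum is there purely as a cross-check.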
\[ P(\{X_2 \ge 2\} \cup \{X_3 \ge 2\}) = 1 - P(\{X_2 \le 1\} \cap \{X_3 \le 1\}) \]
\[ = 1 - [P(X_2 = 0, X_3 = 0) + P(X_2 = 1, X_3 = 0) + P(X_2 = 0, X_3 = 1) + P(X_2 = 1, X_3 = 1)] \]
\[ = 1 - \left[ \left(\frac{1}{2}\right)^{8} + \frac{8!}{7!\,1!}\left(\frac{1}{2}\right)^{7}\left(\frac{1}{3}\right) + \frac{8!}{7!\,1!}\left(\frac{1}{2}\right)^{7}\left(\frac{1}{6}\right) + \frac{8!}{6!\,1!\,1!}\left(\frac{1}{2}\right)^{6}\left(\frac{1}{3}\right)\left(\frac{1}{6}\right) \right] \]

5. - continued
(e) Find the conditional p.m.f. of \(X_1 \mid x_2\).
\(X_1 \mid x_2\) can be treated as "the number of red chips selected when ______".
For \(x_2\) = ______, \(X_1 \mid x_2\) has a ______ distribution with p.m.f. ______

(f) \(E(X_1 \mid x_2)\) can be written as a linear function of \(x_2\), since \(E(X_1 \mid x_2)\) = ______
Therefore, the least squares line for predicting \(X_1\) from \(X_2\) must be ______

(g) \(E(X_2 \mid x_1)\) can be written as a linear function of \(x_1\), since \(E(X_2 \mid x_1)\) = ______
Therefore, the least squares line for predicting \(X_2\) from \(X_1\) must be ______

(h) Find the covariance and correlation between \(X_1\) and \(X_2\) by making use of the following facts (instead of using direct formulas):
The slope in the least squares line for predicting \(X_1\) from \(X_2\) is ______
The slope in the least squares line for predicting \(X_2\) from \(X_1\) is ______
The product of the slope in the least squares line for predicting \(X_1\) from \(X_2\) and the slope in the least squares line for predicting \(X_2\) from \(X_1\) is equal to ______.

Suppose each in a sequence of independent trials must result in one of outcome 1, outcome 2, or outcome 3. The probability of outcome 1 on each trial is \(p_1\), the probability of outcome 2 on each trial is \(p_2\), and the probability of outcome 3 on each trial is \(p_3 = 1 - p_1 - p_2\). If the random variable \(X_1\) is defined to be the number of the n trials resulting in outcome 1, the random variable \(X_2\) is defined to be the number of the n trials resulting in outcome 2, and the random variable \(X_3\) is defined to be the number of the n trials resulting in outcome 3, then the joint distribution of \((X_1, X_2, X_3)\) is called a trinomial distribution.
Since \(X_3 = n - X_1 - X_2\), \(X_3\) is totally determined by \(X_1\) and \(X_2\).
The joint p.m.f. of \((X_1, X_2)\) is
\[ \frac{n!}{x_1!\, x_2!\, (n - x_1 - x_2)!}\; p_1^{x_1}\, p_2^{x_2}\, (1 - p_1 - p_2)^{n - x_1 - x_2} \quad \text{if } x_1 \text{ and } x_2 \text{ are non-negative integers such that } x_1 + x_2 \le n. \]
Each \(X_i\) has a \(b(n, p_i)\) distribution.
If the number of outcomes is any integer k > 1 with \((X_1, X_2, \dots, X_k)\) defined in the natural way, then the joint p.m.f. is called a multinomial distribution.

6. One chip is selected from each of two urns, one containing three chips labeled distinctively with the integers 1 through 3 and the other containing two chips labeled distinctively with the integers 1 and 2. The following random variables are defined:
X = largest integer among the labels on the selected chips,
Y = smallest integer among the labels on the selected chips.

The space of (X, Y) is {(1,1), (2,1), (3,1), (2,2), (3,2)}. (Note: We immediately see that X and Y cannot be independent, since the joint space is not "rectangular".) The joint probabilities are

         x = 1    x = 2    x = 3
y = 2             1/6      1/6
y = 1    1/6      1/3      1/6

The joint p.m.f. of (X, Y) is
\[ f(x,y) = \begin{cases} 1/6 & \text{if } (x,y) = (1,1),\ (3,1),\ (2,2),\ (3,2) \\ 1/3 & \text{if } (x,y) = (2,1) \end{cases} \]
The marginal p.m.f. of X is \(f_1(x) = 1/6\) if \(x = 1\), and \(1/x\) if \(x = 2, 3\).
The marginal p.m.f. of Y is \(f_2(y) = (3 - y)/3\) if \(y = 1, 2\).

\(E(X) = 13/6\)   \(E(X^2) = 31/6\)   \(\mathrm{Var}(X) = 31/6 - (13/6)^2 = 17/36\)
\(E(Y) = 4/3\)   \(E(Y^2) = 2\)   \(\mathrm{Var}(Y) = 2 - (4/3)^2 = 2/9\)

6. - continued
Since \(f(x,y) \ne f_1(x) f_2(y)\), the random variables X and Y are not independent (as we previously noted).

Using the joint p.m.f., \(E(XY) = (1)(1)(1/6) + (3)(1)(1/6) + (2)(2)(1/6) + (3)(2)(1/6) + (2)(1)(1/3) = 3\)
Cov(X, Y) = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
The conditional p.m.f. of Y | X = 1 is ______
The conditional p.m.f. of Y | X = 2 is ______
The conditional p.m.f. of Y | X = 3 is ______

6. - continued
The conditional p.m.f. of X | Y = 1 is ______
The conditional p.m.f. of X | Y = 2 is ______
\(E(Y \mid X = 1)\) = ______   \(E(Y^2 \mid X = 1)\) = ______   \(\mathrm{Var}(Y \mid X = 1)\) = ______
\(E(Y \mid X = 2)\) = ______   \(E(Y^2 \mid X = 2)\) = ______   \(\mathrm{Var}(Y \mid X = 2)\) = ______
\(E(Y \mid X = 3)\) = ______   \(E(Y^2 \mid X = 3)\) = ______   \(\mathrm{Var}(Y \mid X = 3)\) = ______

6.
- continued
\(E(X \mid Y = 1)\) = ______   \(E(X^2 \mid Y = 1)\) = ______   \(\mathrm{Var}(X \mid Y = 1)\) = ______
\(E(X \mid Y = 2)\) = ______   \(E(X^2 \mid Y = 2)\) = ______   \(\mathrm{Var}(X \mid Y = 2)\) = ______
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx?
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy?

9. Random variables X and Y have joint p.d.f. \(f(x,y) = 5xy^2/2\) if \(0 < x/2 < y < 1\).

The space of (X, Y) is displayed graphically as follows:
[Figure: the triangular region \(0 < x/2 < y < 1\), bounded below by the line y = x/2, with labeled points (0,0) and (2,1).]
(Note: We immediately see that X and Y cannot be independent, since the joint space is not "rectangular".)

Event A = {(x, y) | 1/2 < x < 1, 1/2 < y < 3/2} is displayed graphically as follows:
[Figure: the rectangle with corners (1/2, 1/2), (1, 1/2), (1, 3/2), (1/2, 3/2) drawn over the space of (X, Y).]

\[ P(A) = \iint_A f(x,y)\,dx\,dy = \int_{1/2}^{1}\int_{1/2}^{1} \frac{5xy^2}{2}\,dx\,dy = \int_{1/2}^{1} \left[\frac{5x^2y^2}{4}\right]_{x=1/2}^{1} dy = \int_{1/2}^{1} \frac{15y^2}{16}\,dy = \left[\frac{5y^3}{16}\right]_{y=1/2}^{1} = \frac{35}{128} \]

9. - continued
The marginal p.d.f. of X is
\[ f_1(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_{x/2}^{1} \frac{5xy^2}{2}\,dy = \left[\frac{5xy^3}{6}\right]_{y=x/2}^{1} = \frac{40x - 5x^4}{48} \quad \text{if } 0 < x < 2 \]
\[ E(X) = \int_{0}^{2} x\,\frac{40x - 5x^4}{48}\,dx = \int_{0}^{2} \frac{40x^2 - 5x^5}{48}\,dx = \left[\frac{5x^3}{18} - \frac{5x^6}{288}\right]_{x=0}^{2} = \frac{10}{9} \]
\[ E(X^2) = \int_{0}^{2} x^2\,\frac{40x - 5x^4}{48}\,dx = \int_{0}^{2} \frac{40x^3 - 5x^6}{48}\,dx = \left[\frac{5x^4}{24} - \frac{5x^7}{336}\right]_{x=0}^{2} = \frac{10}{7} \]
\[ \mathrm{Var}(X) = \frac{110}{567} \]

9. - continued
The marginal p.d.f.
of Y is
\[ f_2(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{0}^{2y} \frac{5xy^2}{2}\,dx = \left[\frac{5x^2y^2}{4}\right]_{x=0}^{2y} = 5y^4 \quad \text{if } 0 < y < 1 \]
\(E(Y) = 5/6\)   \(E(Y^2) = 5/7\)   \(\mathrm{Var}(Y) = 5/252\)

9. - continued
Since \(f(x,y) \ne f_1(x) f_2(y)\), the random variables X and Y are not independent (as we previously noted).
\[ E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f(x,y)\,dx\,dy = \int_{0}^{1}\int_{0}^{2y} xy\,\frac{5xy^2}{2}\,dx\,dy = \int_{0}^{1}\int_{0}^{2y} \frac{5x^2y^3}{2}\,dx\,dy \]
\[ = \int_{0}^{1} \left[\frac{5x^3y^3}{6}\right]_{x=0}^{2y} dy = \int_{0}^{1} \frac{20y^6}{3}\,dy = \left[\frac{20y^7}{21}\right]_{y=0}^{1} = \frac{20}{21} \]
Cov(X, Y) = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
For 0 < x < 2, the conditional p.d.f. of Y | X = x is ______

9. - continued
For 0 < x < 2, \(E(Y \mid X = x)\) = ______
For 0 < x < 2, \(E(Y^2 \mid X = x)\) = ______
For 0 < x < 2, \(\mathrm{Var}(Y \mid X = x)\) = ______
For 0 < y < 1, the conditional p.d.f. of X | Y = y is ______
For 0 < y < 1, \(E(X \mid Y = y)\) = ______
For 0 < y < 1, \(E(X^2 \mid Y = y)\) = ______

9. - continued
For 0 < y < 1, \(\mathrm{Var}(X \mid Y = y)\) = ______
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx?
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy?

7. Random variables X and Y have joint p.d.f. \(f(x,y) = (x + y)/8\) if \(0 < x < 2\), \(0 < y < 2\).

The space of (X, Y) is displayed graphically as follows:
[Figure: the square with corners (0,0), (2,0), (2,2), (0,2).]

Events A = {(x, y) | 1/2 < x < 1, 1/2 < y < 3/2} and B = {(x, y) | x > y} are displayed graphically as follows:
[Figure: A is the rectangle with corners (1/2, 1/2), (1, 1/2), (1, 3/2), (1/2, 3/2) inside the square.]
[Figure: B is the part of the square with x > y.]

7.
- continued
\[ P(A) = P(1/2 < X < 1,\ 1/2 < Y < 3/2) = \iint_A f(x,y)\,dx\,dy = \int_{1/2}^{3/2}\int_{1/2}^{1} \frac{x+y}{8}\,dx\,dy = \int_{1/2}^{3/2} \left[\frac{x^2 + 2xy}{16}\right]_{x=1/2}^{1} dy \]
\[ = \int_{1/2}^{3/2} \left(\frac{1 + 2y}{16} - \frac{1/4 + y}{16}\right) dy = \int_{1/2}^{3/2} \frac{3 + 4y}{64}\,dy = \left[\frac{3y + 2y^2}{64}\right]_{y=1/2}^{3/2} = \frac{7}{64} \]
\[ P(B) = P(X > Y) = \iint_{x > y} f(x,y)\,dx\,dy = \int_{0}^{2}\int_{y}^{2} \frac{x+y}{8}\,dx\,dy \quad \text{or} \quad \int_{0}^{2}\int_{0}^{x} \frac{x+y}{8}\,dy\,dx \]
\[ = \int_{0}^{2} \left[\frac{2xy + y^2}{16}\right]_{y=0}^{x} dx = \int_{0}^{2} \frac{3x^2}{16}\,dx = \left[\frac{x^3}{16}\right]_{x=0}^{2} = \frac{1}{2} \]
(This agrees with the symmetry of f(x, y) in x and y, which forces P(X > Y) = P(Y > X).)

7. - continued
The marginal p.d.f. of X is
\[ f_1(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_{0}^{2} \frac{x+y}{8}\,dy = \left[\frac{2xy + y^2}{16}\right]_{y=0}^{2} = \frac{x+1}{4} \quad \text{if } 0 < x < 2 \]
\(E(X) = 7/6\)   \(E(X^2) = 5/3\)   \(\mathrm{Var}(X) = 11/36\)
The marginal p.d.f. of Y is
\[ f_2(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{0}^{2} \frac{x+y}{8}\,dx = \frac{y+1}{4} \quad \text{if } 0 < y < 2 \]
\(E(Y) = 7/6\)   \(E(Y^2) = 5/3\)   \(\mathrm{Var}(Y) = 11/36\)

Since \(f(x,y) \ne f_1(x) f_2(y)\), the random variables X and Y are not independent.

7. - continued
\[ E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f(x,y)\,dx\,dy = \int_{0}^{2}\int_{0}^{2} \frac{x^2y + xy^2}{8}\,dx\,dy = \int_{0}^{2} \left[\frac{2x^3y + 3x^2y^2}{48}\right]_{x=0}^{2} dy \]
\[ = \int_{0}^{2} \frac{4y + 3y^2}{12}\,dy = \left[\frac{2y^2 + y^3}{12}\right]_{y=0}^{2} = \frac{4}{3} \]
Cov(X, Y) = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
For 0 < x < 2, the conditional p.d.f. of Y | X = x is ______

7. - continued
For 0 < x < 2, \(E(Y \mid X = x)\) = ______
For 0 < x < 2, \(E(Y^2 \mid X = x)\) = ______
For 0 < x < 2, \(\mathrm{Var}(Y \mid X = x)\) = ______
For 0 < y < 2, the conditional p.d.f. of X | Y = y is ______
For 0 < y < 2, \(E(X \mid Y = y)\) = ______
For 0 < y < 2, \(E(X^2 \mid Y = y)\) = ______

7. - continued
For 0 < y < 2, \(\mathrm{Var}(X \mid Y = y)\) = ______
Is the conditional mean of Y given X = x a linear function of the given value, that is, can we write E(Y | X = x) = a + bx?
Is the conditional mean of X given Y = y a linear function of the given value, that is, can we write E(X | Y = y) = c + dy?

8. Random variables X and Y have joint p.d.f. \(f(x,y) = (y - 1)/(2x^2)\) if \(1 < x\), \(1 < y < 3\).

The space of (X, Y) is displayed graphically as follows:
[Figure: the semi-infinite strip x > 1, 1 < y < 3, with labeled points (1,1) and (1,3).]

8.
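- continued
Event A = {(x, y) | 1 < x < 3, 1 < y < (x+1)/2} is displayed graphically as follows:
[Figure: the part of the strip below the line y = (x + 1)/2 and to the left of the line x = 3, with labeled points (1,1), (1,3), and (3,2).]
\[ P(A) = \iint_A f(x,y)\,dx\,dy = \int_{1}^{3}\int_{1}^{(x+1)/2} \frac{y-1}{2x^2}\,dy\,dx = \int_{1}^{3} \left[\frac{y^2 - 2y}{4x^2}\right]_{y=1}^{(x+1)/2} dx \]
\[ = \int_{1}^{3} \left(\frac{(x+1)^2 - 4(x+1)}{16x^2} + \frac{1}{4x^2}\right) dx = \int_{1}^{3} \frac{x^2 - 2x + 1}{16x^2}\,dx = \int_{1}^{3} \left(\frac{1}{16} - \frac{1}{8x} + \frac{1}{16x^2}\right) dx \]
\[ = \left[\frac{x}{16} - \frac{\ln x}{8} - \frac{1}{16x}\right]_{x=1}^{3} = \left(\frac{3}{16} - \frac{\ln 3}{8} - \frac{1}{48}\right) - 0 = \frac{1}{6} - \frac{\ln 3}{8} \]

8. - continued
Event B = {(x, y) | x > y} is displayed graphically as follows:
[Figure: the part of the strip to the right of the line y = x, with labeled points (1,1), (1,3), and (3,3).]
Note that describing B as \(\{1 < x < 3,\ 1 < y < x\} \cup \{3 < x < \infty,\ 1 < y < 3\}\) makes the integration more work than describing B as \(\{1 < y < 3,\ y < x < \infty\}\).
\[ P(B) = \iint_{x > y} f(x,y)\,dx\,dy = \int_{1}^{3}\int_{y}^{\infty} \frac{y-1}{2x^2}\,dx\,dy = \int_{1}^{3} \left[\frac{1-y}{2x}\right]_{x=y}^{\infty} dy = \int_{1}^{3} \frac{y-1}{2y}\,dy = \left[\frac{y}{2} - \frac{\ln y}{2}\right]_{y=1}^{3} = 1 - \frac{\ln 3}{2} \]

8. - continued
The marginal p.d.f. of X is
\[ f_1(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_{1}^{3} \frac{y-1}{2x^2}\,dy = \left[\frac{y^2 - 2y}{4x^2}\right]_{y=1}^{3} = \frac{1}{x^2} \quad \text{if } 1 < x \]
\[ E(X) = \int_{1}^{\infty} x\,\frac{1}{x^2}\,dx = \int_{1}^{\infty} \frac{1}{x}\,dx = \Big[\ln x\Big]_{x=1}^{\infty} = \infty \]
\[ E(X^2) = \int_{1}^{\infty} x^2\,\frac{1}{x^2}\,dx = \int_{1}^{\infty} 1\,dx = \infty \]
Var(X) = ______

8. - continued
The marginal p.d.f. of Y is
\[ f_2(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{1}^{\infty} \frac{y-1}{2x^2}\,dx = \left[\frac{1-y}{2x}\right]_{x=1}^{\infty} = \frac{y-1}{2} \quad \text{if } 1 < y < 3 \]
\[ E(Y) = \int_{1}^{3} y\,\frac{y-1}{2}\,dy = \int_{1}^{3} \frac{y^2 - y}{2}\,dy = \left[\frac{y^3}{6} - \frac{y^2}{4}\right]_{y=1}^{3} = \frac{7}{3} \]
\[ E(Y^2) = \int_{1}^{3} y^2\,\frac{y-1}{2}\,dy = \int_{1}^{3} \frac{y^3 - y^2}{2}\,dy = \left[\frac{y^4}{8} - \frac{y^3}{6}\right]_{y=1}^{3} = \frac{17}{3} \]
\[ \mathrm{Var}(Y) = \frac{17}{3} - \left(\frac{7}{3}\right)^2 = \frac{2}{9} \]

8. - continued
Since \(f(x,y) = f_1(x) f_2(y)\), the random variables X and Y are independent.
\[ E(XY) = \int_{1}^{3}\int_{1}^{\infty} xy\,\frac{y-1}{2x^2}\,dx\,dy = \left[\int_{1}^{3} \frac{y^2 - y}{2}\,dy\right]\left[\int_{1}^{\infty} \frac{1}{x}\,dx\right] = \frac{7}{3}\int_{1}^{\infty} \frac{1}{x}\,dx = \infty \]
Cov(X, Y) = ______
The least squares line for predicting Y from X is ______
The least squares line for predicting X from Y is ______
For 1 < x, the conditional p.d.f. of Y | X = x is ______
For 1 < y < 3, the conditional p.d.f. of X | Y = y is ______
For 1 < x, \(E(Y \mid X = x)\) = ______
For 1 < x, \(\mathrm{Var}(Y \mid X = x)\) = ______
For 1 < y < 3, \(E(X \mid Y = y)\) = ______
For 1 < y < 3, \(\mathrm{Var}(X \mid Y = y)\) = ______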
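The covariance, least-squares, and conditional-mean entries in the discrete exercises can be checked mechanically once the joint p.m.f. is tabulated. The sketch below is an illustration, not the notes' own method: it assumes the standard results Cov(X, Y) = E(XY) - E(X)E(Y), slope = Cov(X, Y)/Var(X), and intercept = E(Y) - slope · E(X), and applies them to the candy example of Exercise 1. The function name `summarize` and the dictionary layout are hypothetical choices.

```python
from fractions import Fraction


def summarize(joint):
    """Moments, least squares line for predicting Y from X, and E(Y | X = x).

    `joint` maps each support point (x, y) to its probability.
    """
    mean_x = sum(x * p for (x, y), p in joint.items())
    mean_y = sum(y * p for (x, y), p in joint.items())
    var_x = sum(x * x * p for (x, y), p in joint.items()) - mean_x**2
    var_y = sum(y * y * p for (x, y), p in joint.items()) - mean_y**2
    cov = sum(x * y * p for (x, y), p in joint.items()) - mean_x * mean_y

    # Least squares line y = intercept + slope * x (standard formulas, assumed here).
    slope = cov / var_x
    intercept = mean_y - slope * mean_x

    # Conditional mean of Y for each x: sum of y * f(x, y) divided by f1(x).
    cond_mean_y = {}
    for x0 in sorted({x for x, _ in joint}):
        f1 = sum(p for (x, y), p in joint.items() if x == x0)
        cond_mean_y[x0] = sum(y * p for (x, y), p in joint.items() if x == x0) / f1

    return {
        "cov": cov,
        "rho_squared": cov**2 / (var_x * var_y),
        "slope": slope,
        "intercept": intercept,
        "cond_mean_y": cond_mean_y,
    }


# Exercise 1 (candy weights): f(x, y) = (x + y) / 12 on {1, 2} x {1, 2}.
candy = {(x, y): Fraction(x + y, 12) for x in (1, 2) for y in (1, 2)}
result = summarize(candy)

for name, value in result.items():
    print(name, value)
```

Because X takes only two values in Exercise 1, the conditional means necessarily lie on a line, so they coincide with the least squares line; passing the joint p.m.f. of Exercise 2 or Exercise 6 instead lets the same comparison answer the "is it linear?" questions there.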