Lecturer: Nguyen Duong Nguyen, Mathematics Department, Faculty of Basic Science, FTU

Chapter 4. Joint distributions (Two-dimensional Random Vectors)

4.1 Definition
Let X and Y be random variables. The pair (X, Y) is called a (two-dimensional) random vector. If X and Y are discrete random variables, (X, Y) is called a discrete random vector. If X and Y are continuous random variables, (X, Y) is called a continuous random vector.

4.2 The probability distribution of a (two-dimensional) random vector

4.2.1 Probability distribution table of a discrete (two-dimensional) random vector
Let (X, Y) be a discrete two-dimensional random vector. Suppose that the set of values that X can take is {x_1, x_2, ..., x_n} and the set of values that Y can take is {y_1, y_2, ..., y_m}. Denote
P(x_i, y_j) = P(X = x_i, Y = y_j), i = 1, ..., n; j = 1, ..., m.
The probability distribution table of the discrete two-dimensional random vector (X, Y) is:

Y \ X   x_1           x_2           ...   x_i           ...   x_n
y_1     P(x_1, y_1)   P(x_2, y_1)   ...   P(x_i, y_1)   ...   P(x_n, y_1)
y_2     P(x_1, y_2)   P(x_2, y_2)   ...   P(x_i, y_2)   ...   P(x_n, y_2)
...     ...           ...           ...   ...           ...   ...
y_j     P(x_1, y_j)   P(x_2, y_j)   ...   P(x_i, y_j)   ...   P(x_n, y_j)
...     ...           ...           ...   ...           ...   ...
y_m     P(x_1, y_m)   P(x_2, y_m)   ...   P(x_i, y_m)   ...   P(x_n, y_m)

Note.
0 ≤ P(x_i, y_j) ≤ 1, i = 1, ..., n; j = 1, ..., m;
\sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) = 1.

Then, the marginal probability distribution table of the component X is:

X   x_1      x_2      ...   x_i      ...   x_n
P   P(x_1)   P(x_2)   ...   P(x_i)   ...   P(x_n)

where P(x_i) = \sum_{j=1}^{m} P(x_i, y_j), i = 1, ..., n.
Note. \sum_{i=1}^{n} P(x_i) = 1.

The marginal probability distribution table of the component Y is:

Y   y_1      y_2      ...   y_j      ...   y_m
P   P(y_1)   P(y_2)   ...   P(y_j)   ...   P(y_m)

where P(y_j) = \sum_{i=1}^{n} P(x_i, y_j), j = 1, ..., m.
Note. \sum_{j=1}^{m} P(y_j) = 1.

Example. Let the probability distribution table of a two-dimensional random vector (X, Y) be as follows:

Y \ X   100    150    200
0       0.1    0.05   0.05
1       0.05   0.2    0.15
2       0      0.1    0.3

Find the marginal distribution of each component.
Solution. We have
P(X = 100) = 0.1 + 0.05 + 0 = 0.15
P(X = 150) = 0.05 + 0.2 + 0.1 = 0.35
P(X = 200) = 0.05 + 0.15 + 0.3 = 0.5
So the probability distribution table of X is:

X   100    150    200
P   0.15   0.35   0.5

P(Y = 0) = 0.1 + 0.05 + 0.05 = 0.2
P(Y = 1) = 0.05 + 0.2 + 0.15 = 0.4
P(Y = 2) = 0 + 0.1 + 0.3 = 0.4
So the probability distribution table of Y is:

Y   0     1     2
P   0.2   0.4   0.4

4.2.2 The joint distribution function
1) Definition. The joint distribution function (joint cdf) of (X, Y), denoted F(x, y), is defined as
F(x, y) = P(X < x, Y < y), for all x, y ∈ R.
2) Properties
Property 1. 0 ≤ F(x, y) ≤ 1.
Property 2. F(x, y) is non-decreasing in each variable and in both variables, which means
F(x_1, y) ≤ F(x_2, y) when x_1 < x_2,
F(x, y_1) ≤ F(x, y_2) when y_1 < y_2,
F(x_1, y_1) ≤ F(x_2, y_2) when x_1 < x_2, y_1 < y_2.
Property 3. If at least one variable approaches −∞, F(x, y) approaches zero, which means
F(−∞, y) = lim_{x → −∞} F(x, y) = 0,
F(x, −∞) = lim_{y → −∞} F(x, y) = 0,
F(−∞, −∞) = lim_{x → −∞, y → −∞} F(x, y) = 0.
Property 4. If one variable approaches +∞, F(x, y) approaches the distribution function of the other variable, which means
F(+∞, y) = lim_{x → +∞} F(x, y) = F_Y(y)
and
F(x, +∞) = lim_{y → +∞} F(x, y) = F_X(x).
Property 5. If both variables approach +∞, F(x, y) approaches 1, which means
F(+∞, +∞) = lim_{x → +∞, y → +∞} F(x, y) = 1.
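Before moving on, here is a small numerical illustration in Python of the marginal formulas and of Properties 4 and 5, using the joint table from the example above. This is only a sketch; the helper F below tabulates the lecture's convention F(x, y) = P(X < x, Y < y) by direct summation.

```python
xs = [100, 150, 200]            # values of X
ys = [0, 1, 2]                  # values of Y
# p[i][j] = P(X = xs[i], Y = ys[j]), copied from the example table
p = [[0.10, 0.05, 0.00],        # X = 100
     [0.05, 0.20, 0.10],        # X = 150
     [0.05, 0.15, 0.30]]        # X = 200

# Marginals: P(X = x_i) = sum over j, P(Y = y_j) = sum over i
pX = [sum(row) for row in p]
pY = [sum(p[i][j] for i in range(len(xs))) for j in range(len(ys))]
print(dict(zip(xs, pX)))        # {100: 0.15, 150: 0.35, 200: 0.5}
print(dict(zip(ys, pY)))        # {0: 0.2, 1: 0.4, 2: 0.4}

def F(x, y):
    """Joint cdf with the convention F(x, y) = P(X < x, Y < y)."""
    return sum(p[i][j] for i in range(len(xs)) for j in range(len(ys))
               if xs[i] < x and ys[j] < y)

# Property 4: letting one argument grow without bound gives a marginal cdf.
assert abs(F(151, 1e9) - (pX[0] + pX[1])) < 1e-9   # F_X(151) = P(X < 151) = 0.5
# Property 5: F(+inf, +inf) = 1.
assert abs(F(1e9, 1e9) - 1.0) < 1e-9
```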
Note. From the above properties, we can deduce:
- The probability that the random vector (X, Y) takes a value in the rectangle (x_1 < X < x_2, y_1 < Y < y_2) is
P(x_1 < X < x_2, y_1 < Y < y_2) = F(x_2, y_2) − F(x_1, y_2) − F(x_2, y_1) + F(x_1, y_1).
- The probability that the random vector (X, Y) takes a value in the region (x_1 < X < x_2, Y < y) is
P(x_1 < X < x_2, Y < y) = F(x_2, y) − F(x_1, y).
- The probability that the random vector (X, Y) takes a value in the region (X < x, y_1 < Y < y_2) is
P(X < x, y_1 < Y < y_2) = F(x, y_2) − F(x, y_1).

Example. Let the joint distribution function of a two-dimensional random vector (X, Y) be
F(x, y) = sin x · sin y if (x, y) ∈ D = [0, π/2] × [0, π/2],
F(x, y) = 0 if (x, y) ∉ D.
Find the probability that the random vector (X, Y) takes a value in the rectangle bounded by the lines x = 0, x = π/4, y = π/6 and y = π/3.
Solution. We have
P(0 < X < π/4, π/6 < Y < π/3) = F(π/4, π/3) − F(0, π/3) − F(π/4, π/6) + F(0, π/6)
= sin(π/4)sin(π/3) − sin 0 · sin(π/3) − sin(π/4)sin(π/6) + sin 0 · sin(π/6)
= (√2/2)(√3/2) − (√2/2)(1/2) = (√6 − √2)/4.

4.2.3 The joint probability density function (joint pdf) of a continuous (two-dimensional) random vector
1) Definition. The joint probability density function of a continuous two-dimensional random vector (X, Y), denoted f(x, y), is the second-order mixed partial derivative of the joint distribution function:
f(x, y) = F''_{xy}(x, y), for all x, y ∈ R.

Example. Find the joint probability density function of a two-dimensional random vector (X, Y) if its joint distribution function is F(x, y) = sin x · sin y (0 ≤ x ≤ π/2, 0 ≤ y ≤ π/2).
Solution. We have F'_x = cos x · sin y.
So the joint probability density function of the continuous two-dimensional random vector (X, Y) is
f(x, y) = F''_{xy} = cos x · cos y, (0 ≤ x ≤ π/2, 0 ≤ y ≤ π/2).

2) Properties
Property 1. f(x, y) ≥ 0.
Property 2. \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y) dx dy = 1.

Example. The random vector (X, Y) has joint pdf
f(x, y) = C(2x + y) if 2 < x < 6, 0 < y < 5; f(x, y) = 0 with other values of x, y.
Find the constant C.
Solution. We have
1 = \int \int f(x, y) dx dy = C \int_2^6 [\int_0^5 (2x + y) dy] dx = C \int_2^6 (10x + 25/2) dx = 210C.
Thus C = 1/210.

Property 3. F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v) du dv, where F(x, y) is the joint distribution function of (X, Y).

Example. Find the joint distribution function of the random vector (X, Y) with joint probability density function
f(x, y) = 1 / (π²(1 + x²)(1 + y²)).
Solution. We have
F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} du dv / (π²(1 + u²)(1 + v²)) = (1/π) \int_{-\infty}^{x} du/(1 + u²) · (1/π) \int_{-\infty}^{y} dv/(1 + v²)
= ((1/π) arctan x + 1/2) · ((1/π) arctan y + 1/2).

Property 4. Let f_1, f_2 be the marginal probability density functions of the components X and Y respectively. Then
f_1(x) = \int_{-\infty}^{+\infty} f(x, y) dy;   f_2(y) = \int_{-\infty}^{+\infty} f(x, y) dx.

Example. The joint probability density function of a two-dimensional random vector (X, Y) has the form
f(x, y) = (3√3/π) e^{−(4x² − 6xy + 9y²)}.
Find the marginal probability density functions of the components.
Solution. Writing 4x² − 6xy + 9y² = (3y − x)² + 3x², we have
f_1(x) = \int_{-\infty}^{+\infty} f(x, y) dy = (3√3/π) e^{−3x²} \int_{-\infty}^{+\infty} e^{−(3y − x)²} dy = (3√3/π) e^{−3x²} · (√π/3) = √(3/π) e^{−3x²}.
(Here we apply the integral \int_{-\infty}^{+\infty} e^{−u²} du = √π.)
Writing 4x² − 6xy + 9y² = (2x − 3y/2)² + 27y²/4, we have
f_2(y) = \int_{-\infty}^{+\infty} f(x, y) dx = (3√3/π) e^{−27y²/4} \int_{-\infty}^{+\infty} e^{−(2x − 3y/2)²} dx = (3√3/π) e^{−27y²/4} · (√π/2) = (3√3/(2√π)) e^{−27y²/4}.
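A quick numerical cross-check of the example above can be done by integration; the sketch below assumes NumPy and SciPy are available and verifies the normalization (Property 2) and the marginal f_1 at one point. The integration box [-5, 5] is large enough here because the density is negligible outside it.

```python
import numpy as np
from scipy.integrate import quad, dblquad

# f(x, y) = (3*sqrt(3)/pi) * exp(-(4x^2 - 6xy + 9y^2))
f = lambda x, y: 3 * np.sqrt(3) / np.pi * np.exp(-(4*x**2 - 6*x*y + 9*y**2))

# Property 2: the density integrates to 1 over the whole plane.
total, _ = dblquad(lambda y, x: f(x, y), -5, 5, -5, 5)
print(total)                              # ~1.0

# Property 4: f1(x) = integral of f(x, y) dy should equal sqrt(3/pi)*exp(-3x^2).
x0 = 0.7
f1_num, _ = quad(lambda y: f(x0, y), -5, 5)
f1_exact = np.sqrt(3/np.pi) * np.exp(-3*x0**2)
print(f1_num, f1_exact)                   # the two values agree (~0.2247)
```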
Property 5. P[(X, Y) ∈ D] = \iint_D f(x, y) dx dy.

Example 1. Find the probability that (X, Y) takes a value in the rectangle with vertices K(1; 1), L(√3; 1), N(√3; 0), M(1; 0) if the random vector (X, Y) has the joint probability density function
f(x, y) = 1 / (π²(1 + x²)(1 + y²)).
Solution. We have
P[(X, Y) ∈ D] = \iint_D dx dy / (π²(1 + x²)(1 + y²)) = (1/π²) \int_1^{√3} dx/(1 + x²) · \int_0^1 dy/(1 + y²) = (1/π²)(π/3 − π/4)(π/4) = 1/48.

Example 2. Let the joint probability density function of a two-dimensional random vector (X, Y) be
f(x, y) = (1/210)(2x + y) if 2 < x < 6, 0 < y < 5; f(x, y) = 0 with other values of x, y.
Find the probability P(3 < X < 4, Y > 2).
Solution. Since f(x, y) vanishes for y ≥ 5, we have
P(3 < X < 4, Y > 2) = \int_3^4 \int_2^5 (1/210)(2x + y) dy dx = (1/210) \int_3^4 (6x + 21/2) dx = (1/210)(21 + 21/2) = 0.15.

Example 3. Let the joint probability density function of a two-dimensional random vector (X, Y) be
f(x, y) = 8xy if 0 < x < 1, 0 < y < x; f(x, y) = 0 with other values of x, y.
Find the probability P(0 < X < 1/2, 0 < Y < X).
Solution. We have
P(0 < X < 1/2, 0 < Y < X) = \int_0^{1/2} \int_0^x 8xy dy dx = \int_0^{1/2} 4x³ dx = (1/2)⁴ = 0.0625.

4.3 Conditional distribution and Independence

4.3.1 Conditional Probability Distribution Table
Consider a discrete two-dimensional random vector (X, Y), where the set of values that X can take is {x_1, x_2, ..., x_n} and the set of values that Y can take is {y_1, y_2, ..., y_m}.
The conditional probability distribution table of the component X given Y = y_j has the form

X/Y=y_j   x_1          x_2          ...   x_i          ...   x_n
P         P(x_1/y_j)   P(x_2/y_j)   ...   P(x_i/y_j)   ...   P(x_n/y_j)

where P(x_i/y_j) = P(X = x_i / Y = y_j) (i = 1, ..., n; j = 1, ..., m) is the conditional probability that the component X takes the value x_i given that the component Y takes the value y_j, calculated by the formula
P(x_i/y_j) = P(x_i, y_j) / P(Y = y_j), i = 1, ..., n.
Note. \sum_{i=1}^{n} P(x_i/y_j) = 1.

Similarly, the conditional probability distribution table of the component Y given X = x_i has the form

Y/X=x_i   y_1          y_2          ...   y_j          ...   y_m
P         P(y_1/x_i)   P(y_2/x_i)   ...   P(y_j/x_i)   ...   P(y_m/x_i)

where the conditional probabilities P(y_j/x_i) are calculated using the formula
P(y_j/x_i) = P(x_i, y_j) / P(X = x_i), j = 1, ..., m.
Note. \sum_{j=1}^{m} P(y_j/x_i) = 1.

Example. Let the probability distribution table of the two-dimensional random vector (X, Y), where X = "Revenue" and Y = "Advertising cost", be as follows (unit: million VND):

Y \ X   100    150    200
0       0.1    0.05   0.05
1       0.05   0.2    0.15
2       0      0.1    0.3

Find the probability distribution of the revenue without advertising and when the advertising cost is 2 million VND.
Solution.
+) We have P(Y = 0) = 0.1 + 0.05 + 0.05 = 0.2
P(X = 100/Y = 0) = P(X = 100, Y = 0)/P(Y = 0) = 0.1/0.2 = 0.5
P(X = 150/Y = 0) = P(X = 150, Y = 0)/P(Y = 0) = 0.05/0.2 = 0.25
P(X = 200/Y = 0) = P(X = 200, Y = 0)/P(Y = 0) = 0.05/0.2 = 0.25
So the probability distribution of the revenue without advertising is

X/Y=0   100   150    200
P       0.5   0.25   0.25

+) We have P(Y = 2) = 0 + 0.1 + 0.3 = 0.4
P(X = 100/Y = 2) = 0/0.4 = 0
P(X = 150/Y = 2) = 0.1/0.4 = 0.25
P(X = 200/Y = 2) = 0.3/0.4 = 0.75
So the probability distribution of the revenue when the advertising cost is 2 million VND is

X/Y=2   100   150    200
P       0     0.25   0.75

4.3.2 The conditional probability density function (the conditional pdf)
Assume (X, Y) is a continuous two-dimensional random vector with joint pdf f(x, y). The conditional probability density function of the component X given Y = y, denoted f(x/y), is defined as
f(x/y) = f(x, y)/f_2(y) = f(x, y) / \int_{-\infty}^{+\infty} f(x, y) dx.
Similarly, the conditional probability density function of the component Y given X = x, denoted f(y/x), is defined as
f(y/x) = f(x, y)/f_1(x) = f(x, y) / \int_{-\infty}^{+\infty} f(x, y) dy.
Note:
f(x/y) ≥ 0 and \int_{-\infty}^{+\infty} f(x/y) dx = 1;
f(y/x) ≥ 0 and \int_{-\infty}^{+\infty} f(y/x) dy = 1.
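The Note above can be checked numerically for any concrete density. The sketch below (assuming SciPy) uses the density f(x, y) = (2x + y)/210 on 2 < x < 6, 0 < y < 5 from the earlier examples and confirms that the conditional pdf f(x/y) integrates to 1 in x.

```python
from scipy.integrate import quad

# Joint density from the earlier examples, zero outside the rectangle.
f = lambda x, y: (2*x + y) / 210 if (2 < x < 6 and 0 < y < 5) else 0.0

def f_x_given_y(x, y):
    """f(x/y) = f(x, y) / f2(y), with f2(y) obtained by integrating out x."""
    f2, _ = quad(lambda u: f(u, y), 2, 6)
    return f(x, y) / f2

y0 = 3.0
total, _ = quad(lambda x: f_x_given_y(x, y0), 2, 6)
print(total)    # ~1.0, as required by the Note
```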
Example. The joint probability density function of a two-dimensional random vector (X, Y) has the form
f(x, y) = (3√3/π) e^{−(4x² − 6xy + 9y²)}
(the density from the example of Property 4 above). Find the conditional probability density functions of the components.
Solution. From the example of Property 4, the marginal pdf of X is
f_1(x) = √(3/π) e^{−3x²}
and the marginal pdf of Y is
f_2(y) = (3√3/(2√π)) e^{−27y²/4}.
We have the conditional probability density function of X given Y = y:
f(x/y) = f(x, y)/f_2(y) = (2/√π) e^{−(2x − 3y/2)²},
and the conditional probability density function of Y given X = x:
f(y/x) = f(x, y)/f_1(x) = (3/√π) e^{−(3y − x)²}.

● On the basis of conditional probability distributions, we have the following formulas:
+) If (X, Y) is a discrete two-dimensional random vector, where the set of values that X can take is {x_1, x_2, ..., x_n} and the set of values that Y can take is {y_1, y_2, ..., y_m}, we have
P(x_i, y_j) = P(X = x_i)·P(y_j/x_i) = P(Y = y_j)·P(x_i/y_j), for i = 1, ..., n; j = 1, ..., m.
In particular: X and Y are independent if and only if
P(x_i, y_j) = P(X = x_i)·P(Y = y_j), for all i = 1, ..., n; j = 1, ..., m.
+) If (X, Y) is a continuous two-dimensional random vector, we have
f(x, y) = f_1(x)·f(y/x) = f_2(y)·f(x/y).
In particular: X and Y are independent if and only if f(x, y) = f_1(x)·f_2(y), for all x, y ∈ R.

4.4 Characteristic parameters of a two-dimensional random vector

4.4.1 The expected value and the variance of the component random variables
1) If (X, Y) is a discrete two-dimensional random vector, where the set of values that X can take is {x_1, x_2, ..., x_n} and the set of values that Y can take is {y_1, y_2, ..., y_m}, we have
E(X) = \sum_{i=1}^{n} x_i P(x_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} x_i P(x_i, y_j),   V(X) = \sum_{i=1}^{n} x_i² P(x_i) − [E(X)]²,
E(Y) = \sum_{j=1}^{m} y_j P(y_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} y_j P(x_i, y_j),   V(Y) = \sum_{j=1}^{m} y_j² P(y_j) − [E(Y)]².

Example. Let (X, Y) be a two-dimensional random vector with the following probability distribution table:

Y \ X   100    150    200
0       0.1    0.05   0.05
1       0.05   0.2    0.15
2       0      0.1    0.3

Find E(X), V(X), E(Y), V(Y).
Solution.
E(X) = 100·P(X = 100) + 150·P(X = 150) + 200·P(X = 200) = 100(0.15) + 150(0.35) + 200(0.5) = 167.5
V(X) = (100²·P(X = 100) + 150²·P(X = 150) + 200²·P(X = 200)) − [E(X)]²
= (100²(0.15) + 150²(0.35) + 200²(0.5)) − (167.5)² = 29375 − 28056.25 = 1318.75
E(Y) = 0·P(Y = 0) + 1·P(Y = 1) + 2·P(Y = 2) = 0(0.2) + 1(0.4) + 2(0.4) = 1.2
V(Y) = (0²(0.2) + 1²(0.4) + 2²(0.4)) − [E(Y)]² = 2 − 1.44 = 0.56

2) If (X, Y) is a continuous two-dimensional random vector, we have
E(X) = \int_{-\infty}^{+\infty} x f_1(x) dx = \iint x f(x, y) dx dy,
V(X) = \int_{-\infty}^{+\infty} x² f_1(x) dx − [E(X)]² = \iint x² f(x, y) dx dy − [E(X)]²,
E(Y) = \int_{-\infty}^{+\infty} y f_2(y) dy = \iint y f(x, y) dx dy,
V(Y) = \int_{-\infty}^{+\infty} y² f_2(y) dy − [E(Y)]² = \iint y² f(x, y) dx dy − [E(Y)]².

Example. Let the joint probability density function of a two-dimensional random vector (X, Y) be
f(x, y) = (1/210)(2x + y) if 2 < x < 6, 0 < y < 5; f(x, y) = 0 with other values of x, y.
Find E(X), V(X), E(Y), V(Y).
Solution. We have
E(X) = \iint x f(x, y) dx dy = (1/210) \int_2^6 \int_0^5 x(2x + y) dy dx = (1/84) \int_2^6 x(4x + 5) dx = 2144/504 = 268/63,
V(X) = \iint x² f(x, y) dx dy − [E(X)]² = (1/84) \int_2^6 x²(4x + 5) dx − (268/63)² = 1220/63 − (268/63)² = 5036/3969,
E(Y) = \iint y f(x, y) dx dy = (1/210) \int_0^5 \int_2^6 y(2x + y) dx dy = (1/105) \int_0^5 y(2y + 16) dy = 170/63,
V(Y) = \iint y² f(x, y) dx dy − [E(Y)]² = (1/105) \int_0^5 y²(2y + 16) dy − (170/63)² = 1175/126 − (170/63)² = 16225/7938.
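As a cross-check of the last example, the same moments can be obtained by numerical double integration. The sketch below assumes SciPy; the exact fractions from the solution are printed alongside for comparison.

```python
from scipy.integrate import dblquad

f = lambda x, y: (2*x + y) / 210     # density on 2 < x < 6, 0 < y < 5

def moment(g):
    # dblquad integrates func(y, x) with x in [2, 6] and y in [0, 5]
    val, _ = dblquad(lambda y, x: g(x, y) * f(x, y), 2, 6, 0, 5)
    return val

EX = moment(lambda x, y: x)
EY = moment(lambda x, y: y)
VX = moment(lambda x, y: x**2) - EX**2
VY = moment(lambda x, y: y**2) - EY**2
print(EX, 268/63)          # ~4.2540 in both cases
print(VX, 5036/3969)       # ~1.2688
print(EY, 170/63)          # ~2.6984
print(VY, 16225/7938)      # ~2.0440
```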
4.4.2 Covariance and correlation coefficient
Covariance and correlation coefficient are numbers that characterize the degree of dependence between random variables.

1) Covariance
a) Definition. The covariance of the random variables X and Y, denoted Cov(X, Y), is defined as
Cov(X, Y) = E{[X − E(X)][Y − E(Y)]} = E(XY) − E(X)E(Y).
In particular:
i) If (X, Y) is a discrete random vector then
Cov(X, Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} x_i y_j P(x_i, y_j) − E(X)E(Y).
ii) If (X, Y) is a continuous random vector then
Cov(X, Y) = \iint x y f(x, y) dx dy − E(X)E(Y).

b) Properties
Property 1. If X and Y are independent, Cov(X, Y) = 0. The converse is not true: if Cov(X, Y) = 0, X and Y may be independent or dependent.

Example 1. Let X and Y be two random variables with the following probability distribution table:

X \ Y   -1     0      1
-1      4/15   1/15   4/15
0       1/15   2/15   1/15
1       0      2/15   0

Here Cov(X, Y) = 0 but X and Y are not independent, because
P(X = −1, Y = −1) = 4/15 ≠ P(X = −1)·P(Y = −1) = (9/15)(5/15) = 1/5.

Property 2. Let X and Y be two random variables (not necessarily independent) and a, b real numbers. Then
V(aX + bY) = a²V(X) + b²V(Y) + 2ab·Cov(X, Y).
In particular:
V(X + Y) = V(X) + V(Y) + 2Cov(X, Y),
V(X − Y) = V(X) + V(Y) − 2Cov(X, Y).

Example 2. There are two types of stocks A and B sold in the stock market, and their interest rates are two random variables X and Y respectively (unit: %). Suppose (X, Y) has the following probability distribution table:

X \ Y   -2     0      5      10
0       0      0.05   0.05   0.1
4       0.05   0.1    0.25   0.15
6       0.1    0.05   0.1    0

a) If the goal is to achieve the maximum expected return, in what ratio should you invest in the two stocks?
b) In order to minimize the interest rate risk, in what ratio should you invest in the two stocks?
Solution. Adding the marginal probabilities to the table:

X \ Y    -2     0      5      10     P(x_i)
0        0      0.05   0.05   0.1    0.2
4        0.05   0.1    0.25   0.15   0.55
6        0.1    0.05   0.1    0      0.25
P(y_j)   0.15   0.2    0.4    0.25

E(X) = 0(0.2) + 4(0.55) + 6(0.25) = 3.7%;
V(X) = (0²(0.2) + 4²(0.55) + 6²(0.25)) − 3.7² = 4.11
E(Y) = 4.2%; V(Y) = 17.96

a) If we denote by α the proportion invested in stock A, then the proportion invested in stock B is (1 − α). So we have to find α such that E(αX + (1 − α)Y) is maximum.
We have E(αX + (1 − α)Y) = αE(X) + (1 − α)E(Y) = α(3.7) + (1 − α)(4.2) = 4.2 − (0.5)α
⇒ E(αX + (1 − α)Y) is maximum when α = 0. That is, to achieve the maximum expected interest rate, we should invest entirely in stock B.

b) Determine α such that V(αX + (1 − α)Y) is minimum.
We have P(X = 0, Y = −2) = 0 ≠ P(X = 0)·P(Y = −2) = (0.2)(0.15) = 0.03 ⇒ X and Y are two dependent variables. Therefore
V(αX + (1 − α)Y) = α²V(X) + (1 − α)²V(Y) + 2α(1 − α)Cov(X, Y).
Cov(X, Y) = \sum_i \sum_j x_i y_j P(x_i, y_j) − E(X)E(Y) = 12.4 − (3.7)(4.2) = −3.14
V(αX + (1 − α)Y) = (4.11)α² + (17.96)(1 − α)² + 2α(1 − α)(−3.14) = (28.35)α² − (42.2)α + 17.96 = f(α)
f'(α) = (56.7)α − 42.2 = 0 ⇒ α = 0.7443, and f''(α) = 56.7 > 0.
So V(αX + (1 − α)Y) is minimum when α = 0.7443.
In conclusion: investing in stocks A and B in the ratio 74.43% and 25.57% gives the lowest level of risk.
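The whole of Example 2 can be reproduced in a few lines of Python. This is a sketch only, with the table values copied from the example; the closed form for the variance-minimising α is obtained by setting f'(α) = 0, exactly as in the solution above.

```python
xs = [0, 4, 6]                   # interest rate of stock A (%)
ys = [-2, 0, 5, 10]              # interest rate of stock B (%)
p = [[0.00, 0.05, 0.05, 0.10],   # rows: X = 0, 4, 6; columns: Y = -2, 0, 5, 10
     [0.05, 0.10, 0.25, 0.15],
     [0.10, 0.05, 0.10, 0.00]]

EX  = sum(x * pij    for x, row in zip(xs, p) for pij in row)
EX2 = sum(x**2 * pij for x, row in zip(xs, p) for pij in row)
EY  = sum(y * p[i][j]    for i in range(3) for j, y in enumerate(ys))
EY2 = sum(y**2 * p[i][j] for i in range(3) for j, y in enumerate(ys))
EXY = sum(x * y * p[i][j] for i, x in enumerate(xs) for j, y in enumerate(ys))
VX, VY = EX2 - EX**2, EY2 - EY**2
cov = EXY - EX * EY
print(EX, VX, EY, VY, cov)       # 3.7  4.11  4.2  17.96  -3.14

# V(aX + (1-a)Y) = a^2 V(X) + (1-a)^2 V(Y) + 2a(1-a) Cov(X, Y) is a parabola in a;
# its minimum is at a = (V(Y) - Cov) / (V(X) + V(Y) - 2 Cov).
alpha = (VY - cov) / (VX + VY - 2 * cov)
print(alpha)                     # ~0.7443
```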
From the definition above, we can see that the covariance has a unit of measure equal to the product of the units of X and Y, so its value changes with the units in which the random variables are measured. For that reason the covariance by itself does not accurately reflect the degree of dependence between X and Y. To overcome this limitation, a parameter called the correlation coefficient is introduced.

2) Correlation coefficient
a) Definition. The correlation coefficient of two random variables X and Y, denoted ρ_XY, is the ratio of the covariance to the product of the standard deviations of those random variables:
ρ_XY = Cov(X, Y)/(σ_X·σ_Y) = Cov(X, Y)/√(V(X)V(Y)).

b) Properties
Property 1. ρ_XY = ρ_YX.
Property 2. −1 ≤ ρ_XY ≤ 1.
Property 3. If X and Y are independent, then ρ_XY = 0. The converse of Property 3 is not true; the correlation coefficient can be 0 even if the random variables are dependent.
Property 4. ρ_XY = 1 if and only if Y = aX + b, where a > 0.
Property 5. ρ_XY = −1 if and only if Y = aX + b, where a < 0.

c) Meaning. The correlation coefficient ρ_XY is used to measure the degree of linear dependence between two random variables X and Y. The closer |ρ_XY| is to 1, the stronger the linear dependence between X and Y; the closer |ρ_XY| is to 0, the weaker the linear dependence. In particular, if ρ_XY = 0, the two variables X and Y have no linear relationship.

Definition. Two random variables X and Y are said to be correlated if ρ_XY ≠ 0. Otherwise, if ρ_XY = 0, we say that X and Y are uncorrelated.

Note.
i) If two random variables are correlated, they are also dependent. The converse is not true: two dependent random variables may be correlated or uncorrelated.
Example 3. The two random variables X and Y in Example 1 are dependent but uncorrelated.
ii) If two random variables X and Y are independent, they are uncorrelated.

Example 4. Let the probability distribution table of a two-dimensional random vector (X, Y), where X = "Revenue" and Y = "Advertising cost", be as follows (unit: million VND):

Y \ X   100    150    200
0       0.1    0.05   0.05
1       0.05   0.2    0.15
2       0      0.1    0.3

Are revenue and advertising cost correlated?
Solution.
Cov(X, Y) = \sum_i \sum_j x_i y_j P(x_i, y_j) − E(X)E(Y)
= ((100)(0)(0.1) + (100)(1)(0.05) + (100)(2)(0) + (150)(0)(0.05) + (150)(1)(0.2) + (150)(2)(0.1) + (200)(0)(0.05) + (200)(1)(0.15) + (200)(2)(0.3)) − (167.5)(1.2)
= 215 − 201 = 14
ρ_XY = Cov(X, Y)/√(V(X)V(Y)) = 14/√((1318.75)(0.56)) ≈ 0.5152 ≠ 0.
So revenue and advertising cost are correlated.

Example 5. Let the joint probability density function of a two-dimensional random vector (X, Y) be
f(x, y) = (1/210)(2x + y) if 2 < x < 6, 0 < y < 5; f(x, y) = 0 with other values of x, y.
Are X and Y correlated?
Solution. We have
Cov(X, Y) = \iint x y f(x, y) dx dy − E(X)E(Y) = (1/210) \int_2^6 \int_0^5 xy(2x + y) dy dx − E(X)E(Y)
= 80/7 − (268/63)(170/63) = −200/3969
ρ_XY = Cov(X, Y)/√(V(X)V(Y)) = (−200/3969)/√((5036/3969)(16225/7938)) ≈ −0.0313 ≠ 0.
So X and Y are correlated.
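The small (negative) covariance and correlation in Example 5 are easy to confirm numerically. The sketch below assumes SciPy and reuses the double-integration pattern from the expectation example earlier.

```python
from scipy.integrate import dblquad

f = lambda x, y: (2*x + y) / 210     # density on 2 < x < 6, 0 < y < 5

def E(g):
    val, _ = dblquad(lambda y, x: g(x, y) * f(x, y), 2, 6, 0, 5)
    return val

EX, EY, EXY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x*y)
VX = E(lambda x, y: x**2) - EX**2
VY = E(lambda x, y: y**2) - EY**2
cov = EXY - EX * EY
rho = cov / (VX * VY) ** 0.5
print(cov, -200/3969)     # ~ -0.0504 in both cases
print(rho)                # ~ -0.0313
```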
4.5 Conditional Expectation
1) If (X, Y) is a discrete random vector, where the set of values that X can take is {x_1, x_2, ..., x_n} and the set of values that Y can take is {y_1, y_2, ..., y_m}, the conditional expectation of the random variable Y given X = x_i is defined as
E(Y/X = x_i) = \sum_{j=1}^{m} y_j P(y_j/x_i).
Similarly, the conditional expectation of X given Y = y_j is defined as
E(X/Y = y_j) = \sum_{i=1}^{n} x_i P(x_i/y_j).
2) If (X, Y) is a continuous random vector, the conditional expectation of the random variable Y given X = x is defined as
E(Y/X = x) = \int_{-\infty}^{+\infty} y f(y/x) dy,
where f(y/x) is the conditional probability density function of Y given X = x.
Similarly, the conditional expectation of X given Y = y is defined as
E(X/Y = y) = \int_{-\infty}^{+\infty} x f(x/y) dx,
where f(x/y) is the conditional probability density function of X given Y = y.

Example 1. Let the probability distribution table of the two-dimensional random vector (X, Y), where X = "Revenue" and Y = "Advertising cost", be as follows (unit: million VND):

Y \ X   100    150    200
0       0.1    0.05   0.05
1       0.05   0.2    0.15
2       0      0.1    0.3

Determine the average revenue for each level of advertising cost.
Solution. The average revenue for each level of advertising cost is the conditional expectation of X given Y.
The probability distribution of the revenue without advertising is

X/Y=0   100   150    200
P       0.5   0.25   0.25

⇒ E(X/Y = 0) = (100)(0.5) + (150)(0.25) + (200)(0.25) = 137.5
The probability distribution of the revenue when the advertising cost is 1 million VND is

X/Y=1   100     150   200
P       0.125   0.5   0.375

(here P(X = 100/Y = 1) = 0.05/0.4 = 0.125, P(X = 150/Y = 1) = 0.2/0.4 = 0.5, P(X = 200/Y = 1) = 0.15/0.4 = 0.375)
⇒ E(X/Y = 1) = (100)(0.125) + (150)(0.5) + (200)(0.375) = 162.5
The probability distribution of the revenue when the advertising cost is 2 million VND is

X/Y=2   100   150    200
P       0     0.25   0.75

⇒ E(X/Y = 2) = (100)(0) + (150)(0.25) + (200)(0.75) = 187.5

Example 2. The random vector (X, Y) has the joint probability density function
f(x, y) = 8xy if 0 < x < 1, 0 < y < x; f(x, y) = 0 with other values of x, y.
Find the conditional expectations E(X/Y = y) and E(Y/X = x).
Solution. The probability density function of X is
f_1(x) = \int_{-\infty}^{+\infty} f(x, y) dy = \int_0^x 8xy dy = 4x³ if 0 < x < 1, and f_1(x) = 0 if x ∉ (0; 1).
The probability density function of Y is
f_2(y) = \int_{-\infty}^{+\infty} f(x, y) dx = \int_y^1 8xy dx = 4y(1 − y²) if 0 < y < 1, and f_2(y) = 0 if y ∉ (0; 1).
The conditional probability density function of X given Y = y is
f(x/y) = f(x, y)/f_2(y) = 2x/(1 − y²) if y < x < 1; f(x/y) = 0 with other values of x.
The conditional probability density function of Y given X = x is
f(y/x) = f(x, y)/f_1(x) = 2y/x² if 0 < y < x; f(y/x) = 0 with other values of y.
So the conditional expectation of X given Y = y is
E(X/Y = y) = \int x f(x/y) dx = \int_y^1 x · (2x/(1 − y²)) dx = 2(1 − y³)/(3(1 − y²)) = 2(1 + y + y²)/(3(1 + y)),
and the conditional expectation of Y given X = x is
E(Y/X = x) = \int y f(y/x) dy = \int_0^x y · (2y/x²) dy = 2x/3.
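The closed-form conditional expectations of Example 2 can be checked at a couple of points by one-dimensional numerical integration of the conditional densities. A sketch assuming SciPy, with the test points x0 and y0 chosen arbitrarily:

```python
from scipy.integrate import quad

def f_y_given_x(y, x):
    return 2*y / x**2 if 0 < y < x else 0.0          # f(y/x) from Example 2

def f_x_given_y(x, y):
    return 2*x / (1 - y**2) if y < x < 1 else 0.0    # f(x/y) from Example 2

x0, y0 = 0.6, 0.3
E_Y_given_x, _ = quad(lambda y: y * f_y_given_x(y, x0), 0, x0)
E_X_given_y, _ = quad(lambda x: x * f_x_given_y(x, y0), y0, 1)
print(E_Y_given_x, 2*x0/3)                            # ~0.4 in both cases
print(E_X_given_y, 2*(1 + y0 + y0**2) / (3*(1 + y0))) # ~0.7128 in both cases
```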