6  RANDOM VECTORS

Basic Concepts

Often, a single random variable cannot adequately provide all of the information needed about the outcome of an experiment. For example, tomorrow's weather is really best described by an array of random variables that includes wind speed, wind direction, atmospheric pressure, relative humidity and temperature. It would be neither easy nor desirable to attempt to combine all of this information into a single measurement.

We would like to extend the notion of a random variable to deal with an experiment that results in several observations each time the experiment is run. For example, let T be a random variable representing tomorrow's maximum temperature and let R be a random variable representing tomorrow's total rainfall. It would be reasonable to ask for the probability that tomorrow's temperature is greater than 70° and tomorrow's total rainfall is less than 0.1 inch. In other words, we wish to determine the probability of the event A = {T > 70, R < 0.1}. Another question that we might like to have answered is, "What is the probability that the temperature will be greater than 70° regardless of the rainfall?" To answer this question, we would need to compute the probability of the event B = {T > 70}. In this chapter, we will build on our probability model and extend our definition of a random variable to permit such calculations.

Definition

The first thing we must do is to precisely define what we mean by "an array of random variables."

[Figure 6.1: A mapping using (X1, X2)]

Definition 6.1. Let Ω be a sample space. An n-dimensional random variable, or random vector, (X1(·), X2(·), ..., Xn(·)), is a vector of functions that assigns to each point ω ∈ Ω a point (X1(ω), X2(ω), ..., Xn(ω)) in n-dimensional Euclidean space.

Example: Consider an experiment where a die is rolled twice. Let X1 denote the number of the first roll, and X2 the number of the second roll. Then (X1, X2) is a two-dimensional random vector. One possible sample point in Ω, a 4 on the first roll followed by a 1 on the second, is mapped into the point (4, 1) as shown in Figure 6.1.

Joint Distributions

Now that we know the definition of a random vector, we can begin to use it to assign probabilities to events. For any random vector, we can define a joint cumulative distribution function for all of the components as follows:

Definition 6.2. Let (X1, X2, ..., Xn) be a random vector. The joint cumulative distribution function for this random vector is given by

    F_{X1,X2,...,Xn}(x1, x2, ..., xn) ≡ P(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn).

[Figure 6.2: The region representing the event P(X ≤ x, Y ≤ y)]

In the two-dimensional case, the joint cumulative distribution function for the random vector (X, Y) evaluated at the point (x, y), namely F_{X,Y}(x, y), is the probability that the experiment results in a two-dimensional value within the shaded region shown in Figure 6.2.

Every joint cumulative distribution function must possess the following properties:

1. lim_{all xi → −∞} F_{X1,X2,...,Xn}(x1, x2, ..., xn) = 0

2. lim_{all xi → +∞} F_{X1,X2,...,Xn}(x1, x2, ..., xn) = 1

3. As xi varies, with all other xj's (j ≠ i) held fixed, F_{X1,X2,...,Xn}(x1, x2, ..., xn) is a nondecreasing function of xi.
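For the die-rolling vector just introduced, the joint cumulative distribution function can be evaluated directly by enumerating the 36 equally likely outcomes. The following short Python sketch (ours, not part of the text; the helper name joint_cdf is our own) tabulates F_{X1,X2} at a few points and illustrates properties 1 through 3 numerically.

```python
from itertools import product

# All 36 equally likely outcomes of (X1, X2) when a fair die is rolled twice.
outcomes = list(product(range(1, 7), repeat=2))

def joint_cdf(x1, x2):
    """F_{X1,X2}(x1, x2) = P(X1 <= x1, X2 <= x2), found by counting outcomes."""
    favorable = sum(1 for (a, b) in outcomes if a <= x1 and b <= x2)
    return favorable / 36

print(joint_cdf(3.0, 4.5))                  # 12/36 = 1/3 (this value reappears below)
print(joint_cdf(0, 6))                      # 0 -- property 1: F -> 0 as an argument -> -infinity
print(joint_cdf(100, 100))                  # 1 -- property 2: F -> 1 as all arguments -> +infinity
print(joint_cdf(2, 6) <= joint_cdf(3, 6))   # True -- property 3: nondecreasing in each argument
```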
As in the case of one-dimensional random variables, we shall identify two major classifications of vector-valued random variables: discrete and continuous. Although the two types share many common properties, we shall discuss each separately.

Discrete Distributions

A random vector that can assume at most a countable collection of discrete values is said to be discrete. As an example, consider once again the example on page 154 where a die is rolled twice. The possible values for either X1 or X2 are in the set {1, 2, 3, 4, 5, 6}. Hence, the random vector (X1, X2) can only take on one of the 36 values shown in Figure 6.3.

[Figure 6.3: The possible outcomes from rolling a die twice]

If the die is fair, then each of the points can be considered to have a probability mass of 1/36. This prompts us to define a joint probability mass function for this type of random vector, as follows:

Definition 6.3. Let (X1, X2, ..., Xn) be a discrete random vector. Then

    p_{X1,X2,...,Xn}(x1, x2, ..., xn) ≡ P(X1 = x1, X2 = x2, ..., Xn = xn)

is the joint probability mass function for the random vector (X1, X2, ..., Xn).

Referring again to the example on page 154, we find that the joint probability mass function for (X1, X2) is given by

    p_{X1,X2}(x1, x2) = 1/36    for x1 = 1, 2, ..., 6 and x2 = 1, 2, ..., 6.

Note that for any probability mass function,

    F_{X1,X2,...,Xn}(b1, b2, ..., bn) = Σ_{x1 ≤ b1} Σ_{x2 ≤ b2} ··· Σ_{xn ≤ bn} p_{X1,X2,...,Xn}(x1, x2, ..., xn).

Therefore, if we wished to evaluate F_{X1,X2}(3.0, 4.5) we would sum all of the probability mass in the shaded region shown in Figure 6.3, and obtain F_{X1,X2}(3.0, 4.5) = 12/36 = 1/3. This is the probability that the first roll is less than or equal to 3 and the second roll is less than or equal to 4.5.

Every joint probability mass function must have the following properties:

1. p_{X1,X2,...,Xn}(x1, x2, ..., xn) ≥ 0 for any (x1, x2, ..., xn).

2. Σ_{all x1} ··· Σ_{all xn} p_{X1,X2,...,Xn}(x1, x2, ..., xn) = 1.

3. P(E) = Σ_{(x1,...,xn) ∈ E} p_{X1,X2,...,Xn}(x1, x2, ..., xn) for any event E.

You should compare these properties with those of probability mass functions for single-valued discrete random variables given in Chapter 5.

Continuous Distributions

Extending our notion of probability density functions to continuous random vectors is a bit tricky, and the mathematical details of the problem are beyond the scope of an introductory course. In essence, it is not possible to define the joint density as a derivative of the cumulative distribution function as we did in the one-dimensional case. Let R^n denote n-dimensional Euclidean space. We sidestep the problem by defining the density of an n-dimensional random vector to be a function that, when integrated over the set {(x1, ..., xn) ∈ R^n : x1 ≤ b1, x2 ≤ b2, ..., xn ≤ bn}, yields the value of the cumulative distribution function evaluated at (b1, b2, ..., bn). More formally, we have the following:

Definition 6.4. Let (X1, ..., Xn) be a continuous random vector with joint cumulative distribution function F_{X1,...,Xn}(x1, ..., xn). The function f_{X1,...,Xn}(x1, ..., xn) that satisfies the equation

    F_{X1,...,Xn}(b1, ..., bn) = ∫_{−∞}^{b1} ··· ∫_{−∞}^{bn} f_{X1,...,Xn}(t1, ..., tn) dt1 ··· dtn

for all (b1, ..., bn) is called the joint probability density function for the random vector (X1, ..., Xn).
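Definition 6.4 can be exercised numerically: pick a density, integrate it over the appropriate region, and compare against a known closed form. The sketch below is our own illustration, not from the text, and uses a density of our own choosing (a product of unit exponentials) so that the answer can be checked by hand.

```python
import numpy as np
from scipy import integrate

# Illustrative joint density (our own choice, not from the text):
# f(x, y) = exp(-x - y) for x >= 0 and y >= 0, and 0 otherwise.
def f(x, y):
    return np.exp(-x - y) if (x >= 0 and y >= 0) else 0.0

def joint_cdf(b1, b2):
    """F(b1, b2) obtained by integrating the density, as in Definition 6.4.
    The density vanishes for negative arguments, so the integrals start at 0."""
    # dblquad's integrand takes the inner variable (y here) as its first argument.
    val, _ = integrate.dblquad(lambda y, x: f(x, y), 0.0, max(b1, 0.0), 0.0, max(b2, 0.0))
    return val

b1, b2 = 1.0, 2.0
print(joint_cdf(b1, b2))                      # ~0.5466 by numerical integration
print((1 - np.exp(-b1)) * (1 - np.exp(-b2)))  # closed form for this density: same value
```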
Now, instead of obtaining a derived relationship between the density and the cumulative distribution function by using integrals as anti-derivatives, we have enforced such a relationship by the above definition.

Every joint probability density function must have the following properties:

1. f_{X1,X2,...,Xn}(x1, x2, ..., xn) ≥ 0 for any (x1, x2, ..., xn).

2. ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} f_{X1,...,Xn}(x1, ..., xn) dx1 ··· dxn = 1.

3. P(E) = ∫···∫_E f_{X1,...,Xn}(x1, ..., xn) dx1 ··· dxn for any event E.

You should compare these properties with those of probability density functions for single-valued continuous random variables given in Chapter 5.

[Figure 6.4: Computing P(a1 < X1 ≤ b1, a2 < X2 ≤ b2)]

In the one-dimensional case, we had the handy formula P(a < X ≤ b) = F_X(b) − F_X(a). This worked for any type of probability distribution. The situation in the multidimensional case is a little more complicated, with a comparable formula given by

    P(a1 < X1 ≤ b1, a2 < X2 ≤ b2) = F_{X1,X2}(b1, b2) − F_{X1,X2}(a1, b2) − F_{X1,X2}(b1, a2) + F_{X1,X2}(a1, a2).

You should be able to verify this formula for yourself by accounting for all of the probability masses in the regions shown in Figure 6.4.

Example: Let (X, Y) be a two-dimensional random variable with the following joint probability density function (see Figure 6.5):

    f_{X,Y}(x, y) = 2 − y    if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2
                  = 0        otherwise

Note that ∫_1^2 ∫_0^2 (2 − y) dx dy = 1.

[Figure 6.5: A two-dimensional probability density function]

Suppose we would like to compute P(X ≤ 1.0, Y ≤ 1.5). To do this, we calculate the volume under the surface f_{X,Y}(x, y) over the region {(x, y) : x ≤ 1, y ≤ 1.5}. This region of integration is shown shaded (in green) in Figure 6.5. Performing the integration, we get

    P(X ≤ 1.0, Y ≤ 1.5) = ∫_{−∞}^{1.5} ∫_{−∞}^{1.0} f_{X,Y}(x, y) dx dy = ∫_{1.0}^{1.5} ∫_{0}^{1.0} (2 − y) dx dy = 3/8.

Marginal Distributions

Given the probability distribution for a vector-valued random variable (X1, ..., Xn), we might ask the question, "Can we determine the distribution of X1, disregarding the other components?" The answer is yes, and the solution requires the careful use of English rather than mathematics. For example, in the two-dimensional case, we may be given a random vector (X, Y) with joint cumulative distribution function F_{X,Y}(x, y). Suppose we would like to find the cumulative distribution function for X alone, i.e., F_X(x). We know that

    F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)

and we are asking for

    (1)    F_X(x) = P(X ≤ x).

But in terms of both X and Y, expression (1) can be read: "the probability that X takes on a value less than or equal to x and Y takes on any value." Therefore, it would make sense to say

    F_X(x) = P(X ≤ x) = P(X ≤ x, Y ≤ ∞) = lim_{y→∞} F_{X,Y}(x, y).

Using this idea, we shall define what we will call the marginal cumulative distribution function:

Definition 6.5. Let (X1, ..., Xn) be a random vector with joint cumulative distribution function F_{X1,...,Xn}(x1, ..., xn). The marginal cumulative distribution function for X1 is given by

    F_{X1}(x1) = lim_{x2→∞} lim_{x3→∞} ··· lim_{xn→∞} F_{X1,X2,...,Xn}(x1, x2, ..., xn).

Notice that we can renumber the components of the random vector and call any one of them X1. So we can use the above definition to find the marginal cumulative distribution function for any of the Xi's.
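The two-dimensional rectangle formula above can be checked numerically for the density f(x, y) = 2 − y of the last example. The sketch below (ours, not from the text) computes P(a1 < X ≤ b1, a2 < Y ≤ b2) both directly and via the four joint-CDF terms; the two numbers agree.

```python
import numpy as np
from scipy import integrate

# Joint density from the example above: f(x, y) = 2 - y on 0 <= x <= 2, 1 <= y <= 2.
def f(x, y):
    return 2 - y if (0 <= x <= 2 and 1 <= y <= 2) else 0.0

def F(x, y):
    """Joint CDF obtained by integrating the density (Definition 6.4)."""
    val, _ = integrate.dblquad(lambda t, s: f(s, t), 0, max(x, 0), 1, max(y, 1))
    return val

a1, b1, a2, b2 = 0.5, 1.5, 1.2, 1.8

# Left-hand side: P(a1 < X <= b1, a2 < Y <= b2) by direct integration over the rectangle.
direct, _ = integrate.dblquad(lambda t, s: f(s, t), a1, b1, a2, b2)

# Right-hand side: the inclusion-exclusion formula written in terms of the joint CDF.
via_cdf = F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)

print(direct, via_cdf)   # both ~0.30
```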
Although Definition 6.5 is a nice definition, it is more useful to examine marginal probability mass functions and marginal probability density functions. For example, suppose we have a discrete random vector (X, Y) with joint probability mass function p_{X,Y}(x, y). To find p_X(x), we ask, "What is the probability that X = x, regardless of the value that Y takes on?" This can be written as

    p_X(x) = P(X = x) = P(X = x, Y = any value) = Σ_{all y} p_{X,Y}(x, y).

Example: In the die example (on page 154),

    p_{X1,X2}(x1, x2) = 1/36    for x1 = 1, 2, ..., 6 and x2 = 1, 2, ..., 6.

To find p_{X1}(2), for example, we compute

    p_{X1}(2) = P(X1 = 2) = Σ_{k=1}^{6} p_{X1,X2}(2, k) = 1/6.

This is hardly a surprising result, but it brings some comfort to know we can get it from all of the mathematical machinery we've developed thus far.

Example: Let X be the total number of items produced in a day's work at a factory, and let Y be the number of defective items produced. Suppose that the probability mass function for (X, Y) is given by Table 6.1.

Table 6.1: Joint pmf for daily production

    p_{X,Y}(x,y)   Y=1    Y=2    Y=3    Y=4    Y=5
    X=1            .05    0      0      0      0
    X=2            .15    .10    0      0      0
    X=3            .05    .05    .10    0      0
    X=4            .05    .025   .025   0      0
    X=5            .10    .10    .10    .10    0

Using this joint distribution, we can see that the probability of producing 2 items with exactly 1 of those items being defective is p_{X,Y}(2, 1) = 0.15. To find the marginal probability mass function for the total daily production, X, we sum the probabilities over all possible values of Y for each fixed x:

    p_X(1) = p_{X,Y}(1, 1) = 0.05
    p_X(2) = p_{X,Y}(2, 1) + p_{X,Y}(2, 2) = 0.15 + 0.10 = 0.25
    p_X(3) = p_{X,Y}(3, 1) + p_{X,Y}(3, 2) + p_{X,Y}(3, 3) = 0.05 + 0.05 + 0.10 = 0.20
    etc.

But notice that in these computations, we are simply adding the entries in all columns for each row of Table 6.1. Doing this for Y as well as X, we obtain Table 6.2.

Table 6.2: Marginal pmf's for daily production

    p_{X,Y}(x,y)   Y=1    Y=2    Y=3    Y=4    Y=5  |  p_X(x)
    X=1            .05    0      0      0      0    |  .05
    X=2            .15    .10    0      0      0    |  .25
    X=3            .05    .05    .10    0      0    |  .20
    X=4            .05    .025   .025   0      0    |  .10
    X=5            .10    .10    .10    .10    0    |  .40
    p_Y(y)         .40    .275   .225   .10    0    |

So, for example, P(Y = 2) = p_Y(2) = 0.275. We simply look for the result in the margin for the entry Y = 2. (Would you believe that this is why they are called marginal distributions?)

The procedure is similar for obtaining marginal probability density functions. Recall that a density, f_X(x), itself is not a probability measure, but f_X(x) dx is. So with a little loose-speaking integration notation we should be able to compute

    f_X(x) dx = P(x ≤ X < x + dx) = P(x ≤ X < x + dx, Y = any value) = [∫_{−∞}^{+∞} f_{X,Y}(x, y) dy] dx

where y is the variable of integration in the above integral. Looking at this relationship as

    f_X(x) dx = [∫_{−∞}^{+∞} f_{X,Y}(x, y) dy] dx,

it would seem reasonable to define

    f_X(x) ≡ ∫_{−∞}^{+∞} f_{X,Y}(x, y) dy

and we therefore offer the following:

Definition 6.6. Let (X1, ..., Xn) be a continuous random vector with joint probability density function f_{X1,...,Xn}. The marginal probability density function for the random variable X1 is given by

    f_{X1}(x1) = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} f_{X1,X2,...,Xn}(x1, x2, ..., xn) dx2 ··· dxn.

Notice that in both the discrete and continuous cases, we sum (or integrate) over all possible values of the unwanted random variable components.
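The row and column sums that produce Table 6.2 are easy to check by machine. The sketch below (ours, not from the text) stores Table 6.1 as a Python dictionary and sums out the unwanted component to recover both marginals.

```python
# Table 6.1 as a dictionary {(x, y): probability}; zero entries are omitted.
pmf = {
    (1, 1): 0.05,
    (2, 1): 0.15, (2, 2): 0.10,
    (3, 1): 0.05, (3, 2): 0.05, (3, 3): 0.10,
    (4, 1): 0.05, (4, 2): 0.025, (4, 3): 0.025,
    (5, 1): 0.10, (5, 2): 0.10, (5, 3): 0.10, (5, 4): 0.10,
}

# Marginals: sum the joint pmf over the unwanted component.
p_X = {x: sum(p for (a, b), p in pmf.items() if a == x) for x in range(1, 6)}
p_Y = {y: sum(p for (a, b), p in pmf.items() if b == y) for y in range(1, 6)}

print(p_X)  # matches the pX(x) margin of Table 6.2 (up to floating-point rounding)
print(p_Y)  # matches the pY(y) margin of Table 6.2
```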
The trick in such problems is to ensure that your limits of integration are correct. Drawing a picture of the region where there is positive probability mass (the support of the distribution) often helps. For the example below, the picture of the support is shown in Figure 6.6.

[Figure 6.6: The support for the random vector]

If the dotted line in Figure 6.6 indicates a particular value for x (call it a), then by integrating over all values of y we are actually determining how much probability mass has been placed along the line x = a. The integration process assigns all of that mass to x = a in one dimension. Repeating this process for each x yields the desired probability density function.

Example: Let (X, Y) be a two-dimensional continuous random variable with joint probability density function

    f_{X,Y}(x, y) = x + y    if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
                  = 0        otherwise

Find the marginal probability density function for X.

Solution: Let's consider the case where we fix x so that 0 ≤ x ≤ 1. We compute

    f_X(x) = ∫_{−∞}^{+∞} f_{X,Y}(x, y) dy = ∫_0^1 (x + y) dy = [xy + y²/2] from y=0 to y=1 = x + 1/2.

If x is outside the interval [0, 1], we have f_X(x) = 0. So, summarizing these computations, we find that

    f_X(x) = x + 1/2    for 0 ≤ x ≤ 1
           = 0          otherwise

We will leave it to you to check that f_X(x) is, in fact, a probability density function by making sure that f_X(x) ≥ 0 for all x and that ∫_{−∞}^{+∞} f_X(x) dx = 1.

Example: Let (X, Y) be the two-dimensional random variable with the following joint probability density function (see Figure 6.5 on page 160):

    f_{X,Y}(x, y) = 2 − y    if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2
                  = 0        otherwise

Find the marginal probability density function for X and the marginal probability density function for Y.

Solution: Let's first find the marginal probability density function for X. Consider the case where we fix x so that 0 ≤ x ≤ 2. We compute

    f_X(x) = ∫_{−∞}^{+∞} f_{X,Y}(x, y) dy = ∫_1^2 (2 − y) dy = [2y − y²/2] from y=1 to y=2 = 1/2.

If x is outside the interval [0, 2], we have f_X(x) = 0. Therefore,

    f_X(x) = 1/2    for 0 ≤ x ≤ 2
           = 0      otherwise

To find the marginal probability density function for Y, consider the case where we fix y so that 1 ≤ y ≤ 2. We compute

    f_Y(y) = ∫_{−∞}^{+∞} f_{X,Y}(x, y) dx = ∫_0^2 (2 − y) dx = (2 − y)·x from x=0 to x=2 = 2(2 − y).

If y is outside the interval [1, 2], we have f_Y(y) = 0. Therefore,

    f_Y(y) = 2(2 − y)    for 1 ≤ y ≤ 2
           = 0           otherwise

You should also double check that f_X and f_Y are both probability density functions.

Functions of Random Vectors

The technique for computing functions of one-dimensional random variables carries over to the multi-dimensional case. Most of these problems just require a little careful reasoning.

Example: Let (X1, X2) be the die-tossing random vector of the example on page 154. Find the probability mass function for the random variable Z = X1 + X2, the sum of the two rolls.

Solution: We already know that p_{X1,X2}(x1, x2) = 1/36 for x1 = 1, ..., 6 and x2 = 1, ..., 6. We ask the question, "What are the possible values that Z can take on?" The answer: "The integers 2, 3, 4, ..., 12." For example, Z equals 4 precisely when any one of the mutually exclusive events {X1 = 1, X2 = 3}, {X1 = 2, X2 = 2}, or {X1 = 3, X2 = 1} occurs. So,

    p_Z(4) = p_{X1,X2}(1, 3) + p_{X1,X2}(2, 2) + p_{X1,X2}(3, 1) = 3/36.

Continuing in this manner, you should be able to verify that

    p_Z(2) = 1/36;   p_Z(3) = 2/36;   p_Z(4) = 3/36;   p_Z(5) = 4/36;
    p_Z(6) = 5/36;   p_Z(7) = 6/36;   p_Z(8) = 5/36;   p_Z(9) = 4/36;
    p_Z(10) = 3/36;  p_Z(11) = 2/36;  p_Z(12) = 1/36.
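The table of values of p_Z can also be generated by brute-force enumeration of the 36 outcomes. Here is a short sketch (ours, not from the text); note that Python's Fraction prints each probability in lowest terms, so 2/36 appears as 1/18.

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Enumerate the 36 equally likely outcomes and tally the sum Z = X1 + X2.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
p_Z = {z: Fraction(n, 36) for z, n in sorted(counts.items())}

for z, p in p_Z.items():
    print(z, p)   # 2 1/36, 3 1/18, ..., 7 1/6, ..., 12 1/36
```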
Example: Let (X, Y) be the two-dimensional random variable given in the example on page 164. Find the cumulative distribution function for the random variable Z = X + Y.

Solution: The support for the random variable Z is the interval [0, 2], so

    F_Z(z) = 0    if z < 0
           = ?    if 0 ≤ z ≤ 2
           = 1    if z > 2

For the case 0 ≤ z ≤ 2, we wish to evaluate F_Z(z) = P(Z ≤ z) = P(X + Y ≤ z). In other words, we are computing the probability mass assigned to the shaded set in Figure 6.7 as z varies from 0 to 2.

[Figure 6.7: The event {Z ≤ z}]

In establishing limits of integration, we notice that there are two cases to worry about, as shown in Figure 6.8:

Case I (z ≤ 1):

    F_Z(z) = ∫_0^z ∫_0^{z−y} (x + y) dx dy = z³/3.

Case II (z > 1):

    F_Z(z) = ∫_0^{z−1} ∫_0^1 (x + y) dx dy + ∫_{z−1}^1 ∫_0^{z−y} (x + y) dx dy = z² − z³/3 − 1/3.

[Figure 6.8: Two cases to worry about for the example]

These two cases can be summarized as

    F_Z(z) = 0                  if z < 0
           = z³/3               if 0 ≤ z ≤ 1
           = z² − z³/3 − 1/3    if 1 < z ≤ 2
           = 1                  if z > 2

Notice that what we thought at first to be one case (0 ≤ z ≤ 2) had to be divided into two cases (0 ≤ z ≤ 1 and 1 < z ≤ 2).

Independence of Random Variables

Definition 6.7. A sequence of n random variables X1, X2, ..., Xn is independent if and only if

    F_{X1,X2,...,Xn}(b1, b2, ..., bn) = F_{X1}(b1) F_{X2}(b2) ··· F_{Xn}(bn)

for all values b1, b2, ..., bn.

Definition 6.8. A sequence of n random variables X1, X2, ..., Xn is a random sample if and only if

1. X1, X2, ..., Xn are independent, and

2. F_{Xi}(x) = F(x) for all x and for all i (i.e., each Xi has the same marginal distribution, F(x)).

We say that a random sample is a vector of independent and identically distributed (i.i.d.) random variables.

Recall: An event A is independent of an event B if and only if P(A ∩ B) = P(A)P(B).

Theorem 6.1. If X and Y are independent random variables, then any event A involving X alone is independent of any event B involving Y alone.

Testing for independence

Case I: Discrete. A discrete random variable X is independent of a discrete random variable Y if and only if

    p_{X,Y}(x, y) = [p_X(x)][p_Y(y)]    for all x and y.

Case II: Continuous. A continuous random variable X is independent of a continuous random variable Y if and only if

    f_{X,Y}(x, y) = [f_X(x)][f_Y(y)]    for all x and y.

Example: A company produces two types of compressors, grade A and grade B. Let X denote the number of grade A compressors produced on a given day. Let Y denote the number of grade B compressors produced on the same day. Suppose that the joint probability mass function p_{X,Y}(x, y) = P(X = x, Y = y) is given by the following table:

    p_{X,Y}(x,y)   y=0    y=1
    x=0            0.1    0.3
    x=1            0.2    0.1
    x=2            0.2    0.1

The random variables X and Y are not independent. Note that

    p_{X,Y}(0, 0) = 0.1 ≠ p_X(0) p_Y(0) = (0.4)(0.5) = 0.2.
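The discrete independence test is easy to automate: compute both marginals and compare p_{X,Y}(x, y) with p_X(x) p_Y(y) at every point of the support. A sketch (ours, not from the text) applied to the compressor table:

```python
import itertools

# Compressor joint pmf from the table above: keys are (x, y).
pmf = {(0, 0): 0.1, (0, 1): 0.3,
       (1, 0): 0.2, (1, 1): 0.1,
       (2, 0): 0.2, (2, 1): 0.1}

xs = sorted({x for x, _ in pmf})
ys = sorted({y for _, y in pmf})
p_X = {x: sum(pmf[(x, y)] for y in ys) for x in xs}
p_Y = {y: sum(pmf[(x, y)] for x in xs) for y in ys}

# X and Y are independent iff p(x, y) == p_X(x) * p_Y(y) for every pair.
independent = all(abs(pmf[(x, y)] - p_X[x] * p_Y[y]) < 1e-12
                  for x, y in itertools.product(xs, ys))
print(p_X[0], p_Y[0], pmf[(0, 0)])   # ~0.4, ~0.5, 0.1  -- and 0.1 != 0.4 * 0.5
print(independent)                   # False
```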
Example: Suppose an electronic circuit contains two transistors. Let X be the time to failure of transistor 1 and let Y be the time to failure of transistor 2, with

    f_{X,Y}(x, y) = 4e^{−2(x+y)}    for x ≥ 0 and y ≥ 0
                  = 0               otherwise

The marginal densities are

    f_X(x) = 2e^{−2x} for x ≥ 0, and 0 otherwise;
    f_Y(y) = 2e^{−2y} for y ≥ 0, and 0 otherwise.

We must check whether f_{X,Y}(x, y) = f_X(x) f_Y(y) for all values of (x, y).

    For x ≥ 0 and y ≥ 0:   f_{X,Y}(x, y) = 4e^{−2(x+y)} = f_X(x) f_Y(y) = 2e^{−2x} · 2e^{−2y}
    For x ≥ 0 and y < 0:   f_{X,Y}(x, y) = 0 = f_X(x) f_Y(y) = 2e^{−2x} · (0)
    For x < 0 and y ≥ 0:   f_{X,Y}(x, y) = 0 = f_X(x) f_Y(y) = (0) · 2e^{−2y}
    For x < 0 and y < 0:   f_{X,Y}(x, y) = 0 = f_X(x) f_Y(y) = (0) · (0)

So the random variables X and Y are independent.

Expectation and random vectors

Suppose we are given a random vector (X, Y) and a function g(x, y). Can we find E(g(X, Y))?

Theorem 6.2.

    E(g(X, Y)) = Σ_{all x} Σ_{all y} g(x, y) p_{X,Y}(x, y)                 if (X, Y) is discrete,

    E(g(X, Y)) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dy dx        if (X, Y) is continuous.

Example: Suppose X and Y have joint probability density function

    f_{X,Y}(x, y) = x + y    for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
                  = 0        otherwise

Let Z = XY. To find E(Z) = E(XY), use Theorem 6.2 to get

    E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_{X,Y}(x, y) dx dy
          = ∫_0^1 ∫_0^1 xy(x + y) dx dy
          = ∫_0^1 ∫_0^1 (x²y + xy²) dx dy
          = ∫_0^1 [x³y/3 + x²y²/2] (from x=0 to x=1) dy
          = ∫_0^1 (y/3 + y²/2) dy
          = [y²/6 + y³/6] (from y=0 to y=1) = 1/3.

We will prove the following results for the case when (X, Y) is a continuous random vector. The proofs for the discrete case are similar, using summations rather than integrals and probability mass functions rather than probability density functions.

Theorem 6.3. E(X + Y) = E(X) + E(Y).

Proof. Using Theorem 6.2:

    E(X + Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x + y) f_{X,Y}(x, y) dx dy
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f_{X,Y}(x, y) dx dy + ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f_{X,Y}(x, y) dx dy
             = ∫_{−∞}^{∞} x [∫_{−∞}^{∞} f_{X,Y}(x, y) dy] dx + ∫_{−∞}^{∞} y [∫_{−∞}^{∞} f_{X,Y}(x, y) dx] dy
             = ∫_{−∞}^{∞} x f_X(x) dx + ∫_{−∞}^{∞} y f_Y(y) dy
             = E(X) + E(Y).

Theorem 6.4. If X and Y are independent, then E[h(X)g(Y)] = E[h(X)] E[g(Y)].

Proof. Using Theorem 6.2:

    E[h(X)g(Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x)g(y) f_{X,Y}(x, y) dx dy

and, since X and Y are independent,

                = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x)g(y) f_X(x) f_Y(y) dx dy
                = ∫_{−∞}^{∞} g(y) f_Y(y) [∫_{−∞}^{∞} h(x) f_X(x) dx] dy
                = ∫_{−∞}^{∞} g(y) f_Y(y) [E(h(X))] dy.

Since E(h(X)) is a constant,

                = E(h(X)) ∫_{−∞}^{∞} g(y) f_Y(y) dy = E(h(X)) E(g(Y)).

Corollary 6.5. If X and Y are independent, then E(XY) = E(X)E(Y).

Proof. Using Theorem 6.4, set h(x) = x and g(y) = y to get

    E[h(X)g(Y)] = E(h(X)) E(g(Y)),  so  E[XY] = E(X)E(Y).

Definition 6.9. The covariance of the random variables X and Y is

    Cov(X, Y) ≡ E[(X − E(X))(Y − E(Y))].

Note that Cov(X, X) = Var(X).

Theorem 6.6. For any random variables X and Y,

    Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).

Proof. Remember that the variance of a random variable W is defined as Var(W) = E[(W − E(W))²]. Now let W = X + Y:

    Var(X + Y) = E[(X + Y − E(X + Y))²]
               = E[(X + Y − E(X) − E(Y))²]
               = E[({X − E(X)} + {Y − E(Y)})²].

Now let a = {X − E(X)}, let b = {Y − E(Y)}, and expand (a + b)² to get

    Var(X + Y) = E[{X − E(X)}² + {Y − E(Y)}² + 2{X − E(X)}{Y − E(Y)}]
               = E[{X − E(X)}²] + E[{Y − E(Y)}²] + 2E[{X − E(X)}{Y − E(Y)}]
               = Var(X) + Var(Y) + 2 Cov(X, Y).

Theorem 6.7. Cov(X, Y) = E(XY) − E(X)E(Y).

Proof. Using Definition 6.9, we get

    Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
              = E[XY − X E(Y) − E(X) Y + E(X)E(Y)]
              = E[XY] + E[−X E(Y)] + E[−E(X) Y] + E[E(X)E(Y)].

Since E[X] and E[Y] are constants,

              = E[XY] − E(Y)E[X] − E(X)E[Y] + E(X)E(Y)
              = E[XY] − E(X)E(Y).
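For the density f(x, y) = x + y used in the example above, Theorem 6.7 gives a quick covariance computation: the marginal f_X(x) = x + 1/2 found earlier yields E(X) = ∫_0^1 x(x + 1/2) dx = 7/12 (and E(Y) = 7/12 by symmetry), so Cov(X, Y) = E(XY) − E(X)E(Y) = 1/3 − (7/12)² = −1/144. A numerical sketch (ours, not from the text) confirming this:

```python
from scipy import integrate

# Joint density from the example above: f(x, y) = x + y on the unit square.
def f(x, y):
    return x + y

def expect(g):
    """E[g(X, Y)] = double integral of g(x, y) f(x, y) over the unit square."""
    # dblquad integrates its first argument (here y) on the inside.
    val, _ = integrate.dblquad(lambda y, x: g(x, y) * f(x, y), 0, 1, 0, 1)
    return val

EX  = expect(lambda x, y: x)        # 7/12
EY  = expect(lambda x, y: y)        # 7/12
EXY = expect(lambda x, y: x * y)    # 1/3, matching the worked example
print(EXY - EX * EY)                # Cov(X, Y) = 1/3 - (7/12)**2 = -1/144 ~ -0.00694
```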
Corollary 6.8. If X and Y are independent, then Cov(X, Y) = 0.

Proof. If X and Y are independent then, from Corollary 6.5, E(XY) = E(X)E(Y). We can then use Theorem 6.7:

    Cov(X, Y) = E(XY) − E(X)E(Y) = 0.

Corollary 6.9. If X and Y are independent, then Var(X + Y) = Var(X) + Var(Y).

Proof. If X and Y are independent then, from Theorem 6.6 and Corollary 6.8,

    Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) = Var(X) + Var(Y) + 2(0) = Var(X) + Var(Y).

Definition 6.10. The covariance of two random variables X and Y is given by

    Cov(X, Y) ≡ E[(X − E(X))(Y − E(Y))].

Definition 6.11. The correlation coefficient for two random variables X and Y is given by

    ρ(X, Y) ≡ Cov(X, Y) / √(Var(X) Var(Y)).

Theorems 6.10, 6.11 and 6.12 are stated without proof. Proofs of these results may be found in the book by Meyer (Meyer, P., Introductory Probability Theory and Statistical Applications, Addison-Wesley, Reading, MA, 1965).

Theorem 6.10. For any random variables X and Y, |ρ(X, Y)| ≤ 1.

Theorem 6.11. Suppose that |ρ(X, Y)| = 1. Then (with probability one), Y = aX + b for some constants a and b. In other words: if the correlation coefficient ρ is ±1, then Y is a linear function of X (with probability one).

The converse of this theorem is also true:

Theorem 6.12. Suppose that X and Y are two random variables such that Y = aX + b, where a and b are constants. Then |ρ(X, Y)| = 1. If a > 0, then ρ(X, Y) = +1. If a < 0, then ρ(X, Y) = −1.

Random vectors and conditional probability

Example: Consider the compressor problem again.

    p_{X,Y}(x,y)   y=0    y=1
    x=0            0.1    0.3
    x=1            0.2    0.1
    x=2            0.2    0.1

Given that no grade B compressors were produced on a given day, what is the probability that 2 grade A compressors were produced?

Solution:

    P(X = 2 | Y = 0) = P(X = 2, Y = 0) / P(Y = 0) = 0.2/0.5 = 2/5.

Given that 2 compressors were produced on a given day, what is the probability that one of them is a grade B compressor?

Solution:

    P(Y = 1 | X + Y = 2) = P(Y = 1, X + Y = 2) / P(X + Y = 2) = P(X = 1, Y = 1) / P(X + Y = 2) = 0.1/0.3 = 1/3.

Example: And, again, consider the two transistors, with

    f_{X,Y}(x, y) = 4e^{−2(x+y)}    for x ≥ 0 and y ≥ 0
                  = 0               otherwise

Given that the total life time for the two transistors is less than two hours, what is the probability that the first transistor lasted more than one hour?

Solution:

    P(X > 1 | X + Y ≤ 2) = P(X > 1, X + Y ≤ 2) / P(X + Y ≤ 2).

We then compute

    P(X > 1, X + Y ≤ 2) = ∫_1^2 ∫_0^{2−x} 4e^{−2(x+y)} dy dx = e^{−2} − 3e^{−4}

and

    P(X + Y ≤ 2) = ∫_0^2 ∫_0^{2−x} 4e^{−2(x+y)} dy dx = 1 − 5e^{−4}

to get

    P(X > 1 | X + Y ≤ 2) = (e^{−2} − 3e^{−4}) / (1 − 5e^{−4}) ≈ 0.0885.

Conditional distributions

Case I: Discrete. Let X and Y be random variables with joint probability mass function p_{X,Y}(x, y) and let p_Y(y) be the marginal probability mass function for Y. We define the conditional probability mass function of X given Y as

    p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y)    whenever p_Y(y) > 0.

Case II: Continuous. Let X and Y be random variables with joint probability density function f_{X,Y}(x, y) and let f_Y(y) be the marginal probability density function for Y. We define the conditional probability density function of X given Y as

    f_{X|Y}(x | y) = f_{X,Y}(x, y) / f_Y(y)    whenever f_Y(y) > 0.

Law of total probability

Case I: Discrete.

    p_X(x) = Σ_y p_{X|Y}(x | y) p_Y(y)

Case II: Continuous.

    f_X(x) = ∫_{−∞}^{+∞} f_{X|Y}(x | y) f_Y(y) dy
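The transistor conditional probability above is easy to check numerically, either by Monte Carlo simulation or by integrating the joint density over the two regions. A sketch (ours, not from the text) using numerical integration:

```python
import numpy as np
from scipy import integrate

# Joint density of the two transistor lifetimes: f(x, y) = 4 exp(-2(x+y)) for x, y >= 0.
f = lambda y, x: 4 * np.exp(-2 * (x + y))   # dblquad passes the inner variable (y) first

# Numerator: P(X > 1, X + Y <= 2); x runs over (1, 2], y over [0, 2 - x].
num, _ = integrate.dblquad(f, 1, 2, 0, lambda x: 2 - x)

# Denominator: P(X + Y <= 2); x runs over [0, 2], y over [0, 2 - x].
den, _ = integrate.dblquad(f, 0, 2, 0, lambda x: 2 - x)

print(num, np.exp(-2) - 3 * np.exp(-4))     # both ~0.08039
print(den, 1 - 5 * np.exp(-4))              # both ~0.90842
print(num / den)                            # ~0.0885, matching the text
```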
Self-Test Exercises for Chapter 6

For each of the following multiple-choice questions, choose the best response among those provided. Answers can be found in Appendix B.

S6.1 Let X1, X2, X3, X4 be independent and identically distributed random variables, each with P(Xi = 1) = 1/2 and P(Xi = 0) = 1/2. If P(X1 + X2 + X3 + X4 = 3) = r, then the value of r is
(A) 1/16  (B) 1/4  (C) 1/2  (D) 1  (E) none of the above.

S6.2 Let (X, Y) be a discrete random vector with joint probability mass function given by p_{X,Y}(0, 0) = 1/4, p_{X,Y}(0, 1) = 1/4, p_{X,Y}(1, 1) = 1/2. Then P(Y = 1) equals
(A) 1/3  (B) 1/4  (C) 1/2  (D) 3/4  (E) none of the above.

S6.3 Let (X, Y) be a random vector with joint probability mass function given by p_{X,Y}(1, −1) = 1/4, p_{X,Y}(1, 1) = 1/2, p_{X,Y}(0, 0) = 1/4. Let p_X(x) denote the marginal probability mass function for X. The value of p_X(1) is
(A) 0  (B) 1/4  (C) 1/2  (D) 1  (E) none of the above.

S6.4 Let (X, Y) be a random vector with joint probability mass function given by p_{X,Y}(1, −1) = 1/4, p_{X,Y}(1, 1) = 1/2, p_{X,Y}(0, 0) = 1/4. The value of P(XY > 0) is
(A) 0  (B) 1/4  (C) 1/2  (D) 1  (E) none of the above.

S6.5 Let (X, Y) be a continuous random vector with joint probability density function f_{X,Y}(x, y) = 1/2 if −1 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise. Then P(X > 0) equals
(A) 0  (B) 1/4  (C) 1/2  (D) 1  (E) none of the above.

S6.6 Suppose (X, Y) is a continuous random vector with joint probability density function f_{X,Y}(x, y) = e^{−y} if 0 ≤ x ≤ 1 and y ≥ 0, and 0 otherwise. Then P(X > 1/2) equals
(A) 0  (B) 1/2  (C) e^{−1/2}  (D) (1/2)e^{−y}  (E) none of the above.

S6.7 Let (X, Y) be a discrete random vector with joint probability mass function given by p_{X,Y}(0, 0) = 1/3, p_{X,Y}(0, 1) = 1/3, p_{X,Y}(1, 0) = 1/3. Then P(X + Y = 1) equals
(A) 0  (B) 1/3  (C) 2/3  (D) 1  (E) none of the above.

S6.8 Let (X, Y) be a discrete random vector with joint probability mass function given by p_{X,Y}(0, 0) = 1/3, p_{X,Y}(0, 1) = 1/3, p_{X,Y}(1, 0) = 1/3. Let W = max{X, Y}. Then P(W = 1) equals
(A) 0  (B) 1/3  (C) 2/3  (D) 1  (E) none of the above.

S6.9 Suppose (X, Y) is a continuous random vector with joint probability density function f_{X,Y}(x, y) = y if 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1, and 0 otherwise. Then E(X²) equals
(A) 0  (B) 1  (C) 4/3  (D) 8/3  (E) none of the above.

S6.10 Suppose (X, Y) is a continuous random vector with joint probability density function f_{X,Y}(x, y) = y if 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1, and 0 otherwise. Then P(X < 1, Y < 0.5) equals
(A) 0  (B) 1/8  (C) 1/4  (D) 1/2  (E) none of the above.

S6.11 The weights of individual oranges are independent random variables, each having an expected value of 6 ounces and standard deviation of 2 ounces. Let Y denote the total net weight (in ounces) of a basket of n oranges. The variance of Y is equal to
(A) 4√n  (B) 4n  (C) 4n²  (D) 4  (E) none of the above.

S6.12 Let X1, X2, X3 be independent and identically distributed random variables, each with P(Xi = 1) = p and P(Xi = 0) = 1 − p. If P(X1 + X2 + X3 = 3) = r, then the value of p is
(A) 1/3  (B) 1/2  (C) ∛r  (D) r³  (E) none of the above.

S6.13 If X and Y are random variables with Var(X) = Var(Y), then Var(X + Y) must equal
(A) 2Var(X)  (B) √2 (Var(X))  (C) Var(2X)  (D) 4Var(X)  (E) none of the above.

S6.14 Let (X, Y) be a continuous random vector with joint probability density function f_{X,Y}(x, y) = 1/π if x² + y² ≤ 1, and 0 otherwise. Then P(X > 0) equals
(A) 1/(4π)  (B) 1/(2π)  (C) 1/2  (D) 1  (E) none of the above.

S6.15 Let X be a random variable.
Then E[(X + 1)²] − (E[X + 1])² equals
(A) Var(X)  (B) 2E[(X + 1)²]  (C) 0  (D) 1  (E) none of the above.

S6.16 Let X, Y and Z be independent random variables with E(X) = 1, E(Y) = 0, and E(Z) = 1. Let W = X(Y + Z). Then E(W) equals
(A) 0  (B) E(X²)  (C) 1  (D) 2  (E) none of the above.

S6.17 Toss a fair die twice. Let the random variable X represent the outcome of the first toss and the random variable Y represent the outcome of the second toss. What is the probability that X is odd and Y is even?
(A) 1/4  (B) 1/3  (C) 1/2  (D) 1  (E) none of the above.

S6.18 Assume that (X, Y) is a random vector with joint probability density function f_{X,Y}(x, y) = k for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 and y ≤ x, and 0 otherwise, where k is some positive constant. Let W = X − Y. To find the cumulative distribution function for W we can compute

    F_W(w) = ∫_A^B ∫_C^D k dx dy    for 0 ≤ w ≤ 1.

The limit of integration that should appear in position D is
(A) min{1, x − w}  (B) min{1, y − w}  (C) min{1, x − y}  (D) min{1, y + w}  (E) none of the above.

S6.19 Assume that (X, Y) is a random vector with joint probability mass function given by p_{X,Y}(−1, 0) = 1/4, p_{X,Y}(0, 0) = 1/2, p_{X,Y}(0, 1) = 1/4. Define the random variable W = XY. The value of P(W = 0) is
(A) 0  (B) 1/4  (C) 1/2  (D) 3/4  (E) none of the above.

S6.20 Let X be a random variable with E(X) = 0 and Var(X) = 1. Let Y be a random variable with E(Y) = 0 and Var(Y) = 4. Then E(X² + Y²) equals
(A) 0  (B) 1  (C) 3  (D) 5  (E) none of the above.

S6.21 Suppose (X, Y) is a continuous random vector with joint probability density function f_{X,Y}(x, y) = 4xy if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise. Then E[1/(XY)] equals
(A) +∞  (B) 0  (C) 1/2  (D) 4  (E) none of the above.

S6.22 The life lengths of two transistors in an electronic circuit form a random vector (X, Y), where X is the life length of transistor 1 and Y is the life length of transistor 2. The joint probability density function of (X, Y) is given by f_{X,Y}(x, y) = 2e^{−(x+2y)} if x ≥ 0 and y ≥ 0, and 0 otherwise. Then P(X + Y ≤ 1) equals
(A) ∫_0^1 ∫_0^1 2e^{−(x+2y)} dx dy
(B) ∫_0^1 ∫_0^{1−y} 2e^{−(x+2y)} dx dy
(C) ∫_0^1 ∫_y^1 2e^{−(x+2y)} dx dy
(D) ∫_0^1 ∫_{1−y}^1 2e^{−(x+2y)} dx dy
(E) none of the above.

Questions for Chapter 6

6.1 A factory can produce two types of gizmos, Type 1 and Type 2. Let X be a random variable denoting the number of Type 1 gizmos produced on a given day, and let Y be the number of Type 2 gizmos produced on the same day. The joint probability mass function for X and Y is given by

    p_{X,Y}(1, 0) = 0.10;  p_{X,Y}(1, 1) = 0.20;  p_{X,Y}(1, 2) = 0.10;  p_{X,Y}(1, 3) = 0.10;
    p_{X,Y}(2, 1) = 0.20;  p_{X,Y}(2, 2) = 0.20;  p_{X,Y}(3, 1) = 0.05;  p_{X,Y}(3, 2) = 0.05.

(a) Compute P(X ≤ 2, Y = 2), P(X ≤ 2, Y ≠ 2) and P(Y > 0).
(b) Find the marginal probability mass functions for X and for Y.
(c) Find the distribution for the random variable Z = X + Y, which is the total daily production of gizmos.

6.2 A sample space Ω is a set consisting of four points, {ω1, ω2, ω3, ω4}. A probability measure P(·) assigns probabilities as follows:

    P({ω1}) = 1/2,  P({ω2}) = 1/8,  P({ω3}) = 1/4,  P({ω4}) = 1/8.

Random variables X, Y and Z are defined as

    X(ω1) = 0,  Y(ω1) = 0,  Z(ω1) = 1,
    X(ω2) = 0,  Y(ω2) = 1,  Z(ω2) = 2,
    X(ω3) = 1,  Y(ω3) = 1,  Z(ω3) = 3,
    X(ω4) = 1,  Y(ω4) = 0,  Z(ω4) = 4.

(a) Find the joint probability mass function for (X, Y, Z).
(b) Find the joint probability mass function for (X, Y).
(c) Find the probability mass function for X.
(d) Find the probability mass function for the random variable W = X + Y + Z.
(e) Find the probability mass function for the random variable T = XYZ.
(f) Find the joint probability mass function for the random variable (W, T).
(g) Find the probability mass function for the random variable V = max{X, Y}.

6.3 Let (X, Y) have a uniform distribution over the rectangle with corners (0, 0), (0, 1), (1, 1), (1, 0). In other words, the probability density function for (X, Y) is given by f_{X,Y}(x, y) = 1 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise. Find the cumulative distribution function for the random variable Z = X + Y.

6.4 Two light bulbs are burned starting at time 0. The first one to fail burns out at time X and the second one at time Y. Obviously X ≤ Y. The joint probability density function for (X, Y) is given by f_{X,Y}(x, y) = 2e^{−(x+y)} if 0 < x < y < ∞, and 0 otherwise.
(a) Sketch the region in (x, y)-space for which the above probability density function assigns positive mass.
(b) Find the marginal probability density functions for X and for Y.
(c) Let Z denote the excess life of the second bulb, i.e., let Z = Y − X. Find the probability density function for Z.
(d) Compute P(Z > 1) and P(Y > 1).

6.5 Two random variables X and Y have a joint probability density function given by f_{X,Y}(x, y) = 2 if x > 0, y > 0 and x + y < 1, and 0 otherwise.
(a) Find the probability that both random variables will take on a value less than 1/2.
(b) Find the probability that X will take on a value less than 1/4 and Y will take on a value greater than 1/2.
(c) Find the probability that the sum of the values taken on by the two random variables will exceed 2/3.
(d) Find the marginal probability density functions for X and for Y.
(e) Find the marginal probability density function for Z = X + Y.

6.6 Suppose (X, Y, Z) is a random vector with joint probability density function f_{X,Y,Z}(x, y, z) = α(xyz²) if 0 < x < 1, 0 < y < 2 and 0 < z < 2, and 0 otherwise.
(a) Find the value of the constant α.
(b) Find the probability that X will take on a value less than 1/2 and Y and Z will both take on values less than 1.
(c) Find the probability density function for the random vector (X, Y).

6.7 Suppose that the random vector (X, Y) has joint probability density function f_{X,Y}(x, y) = kx(x − y) if 0 ≤ x ≤ 2 and |y| ≤ x, and 0 otherwise.
(a) Sketch the support of the distribution for (X, Y) in the xy plane.
(b) Evaluate the constant k.
(c) Find the marginal probability density function for Y.
(d) Find the marginal probability density function for X.
(e) Find the probability density function for Z = X + Y.

6.8 The length, X, and the width, Y, of salt crystals form a random variable (X, Y) with joint probability density function f_{X,Y}(x, y) = x if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2, and 0 otherwise.
(a) Find the marginal probability density functions for X and for Y.

6.9 The joint probability density function of the price, P, of a commodity (in dollars) and total sales, S, (in 10,000 units) is given by f_{S,P}(s, p) = 5pe^{−ps} if 0.20 < p < 0.40 and s > 0, and 0 otherwise.
(a) Find the marginal probability density functions for P and for S.
(b) Find the conditional probability density function for S given that P takes on the value p.
(c) Find the probability that sales will exceed 20,000 units given P = 0.25.
6.10 If X is the proportion of persons who will respond to one kind of mail-order solicitation, Y is the proportion of persons who will respond to a second type of mail-order solicitation, and the joint probability density function of X and Y is given by f_{X,Y}(x, y) = (2/5)(x + 4y) if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise:
(a) Find the marginal probability density functions for X and for Y.
(b) Find the conditional probability density function for X given that Y takes on the value y.
(c) Find the conditional probability density function for Y given that X takes on the value x.
(d) Find the probability that there will be at least a 20% response to the second type of mail-order solicitation.
(e) Find the probability that there will be at least a 20% response to the second type of mail-order solicitation given that there has only been a 10% response to the first kind of mail-order solicitation.

6.11 An electronic component has two fuses. If an overload occurs, the time when fuse 1 blows is a random variable, X, and the time when fuse 2 blows is a random variable, Y. The joint probability density function of the random vector (X, Y) is given by f_{X,Y}(x, y) = e^{−y} if 0 ≤ x ≤ 1 and y ≥ 0, and 0 otherwise.
(a) Compute P(X + Y ≥ 1).
(b) Are X and Y independent? Justify your answer.

6.12 The coordinates of a laser on a circular target are given by the random vector (X, Y) with the following probability density function: f_{X,Y}(x, y) = 1/π if x² + y² ≤ 1, and 0 otherwise. Hence, (X, Y) has a uniform distribution on a disk of radius one centered at (0, 0).
(a) Compute P(X² + Y² ≤ 0.25).
(b) Find the marginal distributions for X and Y.
(c) Compute P(X ≥ 0, Y > 0).
(d) Compute P(X ≥ 0 | Y > 0).
(e) Find the conditional distribution for X given Y = y. (Note: Be careful of the case |y| = 1.)

6.13 Let X and Y be discrete random variables with the support of X denoted by Θ and the support of Y denoted by Φ. Let p_X be the marginal probability mass function for X, let p_{X|Y} be the conditional probability mass function of X given Y, and let p_{Y|X} be the conditional probability mass function of Y given X. Show that

    p_{X|Y}(x|y) = p_{Y|X}(y|x) p_X(x) / Σ_{x∈Θ} p_{Y|X}(y|x) p_X(x)

for any x ∈ Θ and y ∈ Φ.

6.14 The volumetric fractions of each of two compounds in a mixture are random variables X and Y, respectively. The random vector (X, Y) has joint probability density function f_{X,Y}(x, y) = 2 if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 − x, and 0 otherwise.
(a) Determine the marginal probability density functions for X and for Y.
(b) Are X and Y independent random variables? Justify your answer.
(c) Compute P(X < 0.5 | Y = 0.5).
(d) Compute P(X < 0.5 | Y < 0.5).

6.15 Suppose (X, Y) is a continuous random vector with joint probability density function f_{X,Y}(x, y) = 2 if x > 0, y > 0 and x + y < 1, and 0 otherwise. Find E(X + Y).

6.16 Suppose (X, Y, Z) is a random vector with joint probability density function f_{X,Y,Z}(x, y, z) = (3/8)(xyz²) if 0 < x < 1, 0 < y < 2 and 0 < z < 2, and 0 otherwise. Find E(X + Y + Z).

6.17 Suppose the joint probability density function of the price, P, of a commodity (in dollars) and total sales, S, (in 10,000 units) is given by f_{S,P}(s, p) = 5pe^{−ps} if 0.20 < p < 0.40 and s > 0, and 0 otherwise. Find E(S | P = 0.25).

6.18 Let X be the proportion of persons who will respond to one kind of mail-order solicitation, and let Y be the proportion of persons who will respond to a second type of mail-order solicitation.
Suppose the joint probability density function of X and Y is given by f_{X,Y}(x, y) = (2/5)(x + 4y) if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise. Find E(X | Y = y) for each 0 < y < 1.

6.19 Suppose that (X, Y) is a random vector with the following joint probability density function: f_{X,Y}(x, y) = 2x if 0 < x < 1 and 0 < y < 1, and 0 otherwise. Find the marginal probability density function for Y.

6.20 Suppose that (X, Y) is a random vector with the following joint probability density function: f_{X,Y}(x, y) = 2xe^{−y} if 0 ≤ x ≤ 1 and y ≥ 0, and 0 otherwise.
(a) Find P(X > 0.5).
(b) Find the marginal probability density function for Y.
(c) Find E(X).
(d) Are X and Y independent random variables? Justify your answer.

6.21 The Department of Metaphysics of Podunk University has 5 students, with 2 juniors and 3 seniors. Plans are being made for a party. Let X denote the number of juniors who will attend the party, and let Y denote the number of seniors who will attend. After an extensive survey was conducted, it has been determined that the random vector (X, Y) has the following joint probability mass function:

    p_{X,Y}(x,y)     Y=0    Y=1    Y=2    Y=3
    X=0 (juniors)    .01    .25    .03    .01
    X=1              .04    .05    .15    .05
    X=2              .04    .12    .20    .05

(a) What is the probability that 3 or more students (juniors and/or seniors) will show up at the party?
(b) What is the probability that more seniors than juniors will show up at the party?
(c) If juniors are charged $5 for attending the party and seniors are charged $10, what is the expected amount of money that will be collected from all students?
(d) The two juniors arrive early at the party. What is the probability that no seniors will show up?

6.22 Let (X, Y) be a continuous random vector with joint probability density function given by f_{X,Y}(x, y) = 2 if x ≥ 0, y ≥ 0 and x + y ≤ 1, and 0 otherwise.
(a) Compute P(X < Y).
(b) Find the marginal probability density function for X and the marginal probability density function for Y.
(c) Find the conditional probability density function for Y given that X = 1/2.
(d) Are X and Y independent random variables? Justify your answer.

6.23 Let (X, Y) be a continuous random vector with joint probability density function given by f_{X,Y}(x, y) = 4xy for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise.
(a) Find the marginal probability density function for X and the marginal probability density function for Y.
(b) Are X and Y independent random variables? Justify your answer.
(c) Find P(X + Y ≤ 1).
(d) Find P(X > 0.5 | Y ≤ 0.5).

6.24 Let (X, Y) be a continuous random vector with joint probability density function given by f_{X,Y}(x, y) = 6x²y for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise.
(a) Find the marginal probability density function for X and the marginal probability density function for Y.
(b) Are X and Y independent random variables? Justify your answer.
(c) Find P(X + Y ≤ 1).
(d) Find E(X + Y).

6.25 A box contains four balls numbered 1, 2, 3 and 4. A game consists of drawing one of the balls at random from the box. It is not replaced. A second ball is then drawn at random from the box. Let X be the number on the first ball drawn, and let Y be the number on the second ball.
(a) Find the joint probability mass function for the random vector (X, Y).
(b) Compute P(Y ≥ 3 | X = 1).

6.26 Suppose that (X, Y) is a random vector with the following joint probability density function: f_{X,Y}(x, y) = 2xe^{−y} if 0 < x < 1 and y > 0, and 0 otherwise.
(a) Compute P(X > 0.5, Y < 1.0).
(b) Compute P(X > 0.5 | Y < 1.0).
(c) Find the marginal probability density function for X.
(d) Find E(X + Y).

6.27 Past studies have shown that juniors taking a particular probability course receive grades according to a normal distribution with mean 65 and variance 36. Seniors taking the same course receive grades normally distributed with a mean of 70 and a variance of 36. A probability class is composed of 75 juniors and 25 seniors.
(a) What is the probability that a student chosen at random from the class will receive a grade in the 70's (i.e., between 70 and 80)?
(b) If a student is chosen at random from the class, what is the student's expected grade?
(c) A student is chosen at random from the class and you are told that the student has a grade in the 70's. What is the probability that the student is a junior?

6.28 (†) Suppose X and Y are positive, independent continuous random variables with probability density functions f_X(x) and f_Y(y), respectively. Let Z = X/Y.
(a) Express the probability density function for Z in terms of an integral that involves only f_X and f_Y.
(b) Now suppose that X and Y can be positive and/or negative. Express the probability density function for Z in terms of an integral that involves only f_X and f_Y. Compare your answer to the case where X and Y are both positive.

ANSWERS TO SELF-TEST EXERCISES

Chapter 6

S6.1 B    S6.2 D    S6.3 E    S6.4 C    S6.5 C    S6.6 B
S6.7 C    S6.8 C    S6.9 C    S6.10 B   S6.11 B   S6.12 C
S6.13 E   S6.14 C   S6.15 A   S6.16 C   S6.17 A   S6.18 D
S6.19 E   S6.20 D   S6.21 D   S6.22 B

Remarks

S6.19 The correct answer is 1.