Jointly Distributed Random Variables
Liang Zhang (UofU), Applied Statistics I, July 3, 2008

Consider tossing a fair die twice. The 36 possible outcomes are

  (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
  (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
   ...   ...   ...   ...   ...   ...
  (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

and the probability of each outcome is 1/36.

If we define two random variables by X = the outcome of the first toss and Y = the outcome of the second toss, then the outcome of this experiment (two tosses) can be described by the random pair (X, Y), and the probability of any possible value (x, y) of that pair is 1/36.

Definition
Let X and Y be two discrete random variables defined on the sample space S of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by
\[ p(x, y) = P(X = x \text{ and } Y = y). \]
(It must be the case that $p(x, y) \ge 0$ and $\sum_x \sum_y p(x, y) = 1$.)

For any event A consisting of pairs (x, y), the probability $P[(X, Y) \in A]$ is obtained by summing the joint pmf over the pairs in A:
\[ P[(X, Y) \in A] = \sum_{(x, y) \in A} p(x, y). \]

Example (Problem 75)
A restaurant serves three fixed-price dinners costing $12, $15, and $20. For a randomly selected couple dining at this restaurant, let X = the cost of the man's dinner and Y = the cost of the woman's dinner. The joint pmf of X and Y is assumed to be

  p(x, y)   y = 12   y = 15   y = 20
  x = 12      .05      .05      .10
  x = 15      .05      .10      .35
  x = 20        0      .20      .10

a. What is the probability that they both have the $12 dinner?
b. What is the probability that they have dinners of the same price?
c. What is the probability that the man's dinner costs $12?

Definition
Let X and Y be two discrete random variables defined on the sample space S of an experiment with joint probability mass function p(x, y). Then the pmf of each variable alone is called its marginal probability mass function, denoted by $p_X(x)$ and $p_Y(y)$, respectively. Furthermore,
\[ p_X(x) = \sum_y p(x, y) \quad \text{and} \quad p_Y(y) = \sum_x p(x, y). \]

Example (Problem 75, continued)
The marginal probability mass functions for the previous example are obtained by summing the joint pmf across each row and down each column:

  p(x, y)   y = 12   y = 15   y = 20   p(x, ·)
  x = 12      .05      .05      .10      .20
  x = 15      .05      .10      .35      .50
  x = 20        0      .20      .10      .30
  p(·, y)     .10      .35      .55
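As a concrete check of parts a–c and of the marginal computation, here is a minimal Python sketch (an illustration added here, not part of the original slides; NumPy is an assumed dependency):

```python
import numpy as np

# Joint pmf from Problem 75: rows index x (man's dinner: 12, 15, 20),
# columns index y (woman's dinner: 12, 15, 20).
p = np.array([[0.05, 0.05, 0.10],
              [0.05, 0.10, 0.35],
              [0.00, 0.20, 0.10]])

assert np.isclose(p.sum(), 1.0)   # a valid joint pmf must sum to 1

p_X = p.sum(axis=1)   # marginal of X: sum over y (across rows)  -> [.20, .50, .30]
p_Y = p.sum(axis=0)   # marginal of Y: sum over x (down columns) -> [.10, .35, .55]

print("a. P(X = 12, Y = 12) =", p[0, 0])      # .05
print("b. P(X = Y)          =", np.trace(p))  # .05 + .10 + .10 = .25
print("c. P(X = 12)         =", p_X[0])       # .20
```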
Definition
Let X and Y be continuous random variables. A joint probability density function f(x, y) for these two variables is a function satisfying $f(x, y) \ge 0$ and $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$. For any two-dimensional set A,
\[ P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy. \]
In particular, if A is the two-dimensional rectangle $\{(x, y) : a \le x \le b,\ c \le y \le d\}$, then
\[ P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx. \]

Definition
Let X and Y be continuous random variables with joint pdf f(x, y). Then the marginal probability density functions of X and Y, denoted by $f_X(x)$ and $f_Y(y)$, respectively, are given by
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty, \]
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty. \]

Example (variant of Problem 12)
Two components of a minicomputer have the following joint pdf for their useful lifetimes X and Y:
\[ f(x, y) = \begin{cases} x e^{-(x+y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
a. What is the probability that the lifetimes of both components exceed 3?
b. What are the marginal pdf's of X and Y?
c. What is the probability that the lifetime X of the first component exceeds 3?
d. What is the probability that the lifetime of at least one component exceeds 3?

Example (Problem 17)
An ecologist wishes to select a point inside a circular sampling region according to a uniform distribution (in practice this could be done by first selecting a direction and then a distance from the center in that direction). Let X = the x coordinate of the point selected and Y = the y coordinate of the point selected. If the circle is centered at (0, 0) and has radius R, then the joint pdf of X and Y is
\[ f(x, y) = \begin{cases} \dfrac{1}{\pi R^2} & x^2 + y^2 \le R^2 \\ 0 & \text{otherwise} \end{cases} \]
What is the probability that the x coordinate of the selected point is within R/2 of the center of the circular region?

Recall the variant of Problem 12 above, with joint lifetime pdf $f(x, y) = x e^{-(x+y)}$ for $x \ge 0$ and $y \ge 0$. The marginal pdf's of X and Y are $f_X(x) = x e^{-x}$ and $f_Y(y) = e^{-y}$, respectively. We see that $f_X(x) \cdot f_Y(y) = f(x, y)$.

Definition
Two random variables X and Y are said to be independent if for every pair of x and y values,
\[ p(x, y) = p_X(x) \cdot p_Y(y) \quad \text{when } X \text{ and } Y \text{ are discrete, or} \]
\[ f(x, y) = f_X(x) \cdot f_Y(y) \quad \text{when } X \text{ and } Y \text{ are continuous.} \]
If this relation is not satisfied for all (x, y), then X and Y are said to be dependent.

Examples:
Problem 12. The joint pdf of X and Y is
\[ f(x, y) = \begin{cases} x e^{-x(1+y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
The marginal pdf's of X and Y are $f_X(x) = e^{-x}$ and $f_Y(y) = \frac{1}{(1+y)^2}$, respectively. We see that $f_X(x) \cdot f_Y(y) \ne f(x, y)$, so X and Y are dependent.

Our first example: tossing a fair die twice. If we let X = the outcome of the first toss and Y = the outcome of the second toss, then $p(x, y) = \frac{1}{36} = \frac{1}{6} \cdot \frac{1}{6} = p_X(x) \cdot p_Y(y)$. Obviously, the two tosses should be independent.

Our second example: the dining choices. From the marginal table computed earlier, $p_X(12) \cdot p_Y(12) = (.20)(.10) = .02 \ne .05 = p(12, 12)$, so X and Y are dependent.
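The factorization for the Problem 12 variant, and parts (a) and (d), can be cross-checked numerically. Below is a small sketch of mine (not from the original slides; SciPy assumed); the exact values are $P(X > 3) = 4e^{-3}$, $P(Y > 3) = e^{-3}$, and $P(X > 3, Y > 3) = 4e^{-6} \approx 0.0099$:

```python
import numpy as np
from scipy import integrate

# Joint pdf of the Problem 12 variant. dblquad integrates func(y, x) dy dx,
# with x over the first pair of limits and y over the second pair.
f = lambda y, x: x * np.exp(-(x + y))

total, _ = integrate.dblquad(f, 0, np.inf, 0, np.inf)  # ~1.0: valid joint pdf
both, _  = integrate.dblquad(f, 3, np.inf, 3, np.inf)  # part a: P(X > 3, Y > 3)

p_x = integrate.quad(lambda x: x * np.exp(-x), 3, np.inf)[0]  # part c: P(X > 3) via f_X
p_y = integrate.quad(lambda y: np.exp(-y), 3, np.inf)[0]      # P(Y > 3) via f_Y

print(total)                             # ~1.0
print(both, p_x * p_y, 4 * np.exp(-6))   # all ~0.00992: independence in action
print(1 - (1 - p_x) * (1 - p_y))         # part d: P(at least one lifetime > 3)
```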
Definition
If $X_1, X_2, \dots, X_n$ are all discrete random variables, the joint pmf of the variables is the function
\[ p(x_1, x_2, \dots, x_n) = P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n). \]
If the random variables are continuous, the joint pdf of $X_1, X_2, \dots, X_n$ is the function $f(x_1, x_2, \dots, x_n)$ such that for any n intervals $[a_1, b_1], \dots, [a_n, b_n]$,
\[ P(a_1 \le X_1 \le b_1, \dots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \dots, x_n)\,dx_n \cdots dx_1. \]

Example: Consider tossing a particular die six times. The probabilities for the outcomes of each toss are as follows:

  x      1    2    3    4    5    6
  p(x)  .15  .20  .25  .20  .15  .05

If we are interested in obtaining exactly three "1"s, this experiment can be modeled by the binomial distribution. However, if the question is "what is the probability of obtaining exactly three 1's, two 5's, and one 6?", the binomial distribution cannot do the job. Let $X_i$ = the number of i's from the experiment (six tosses). Then
\[ P(X_1 = 3, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 2, X_6 = 1) = \frac{6!}{3!\,2!\,1!} (.15)^3 (.15)^2 (.05)^1. \]

Multinomial Distribution:
1. The experiment consists of a sequence of n trials, where n is fixed in advance of the experiment;
2. Each trial can result in one of r possible outcomes;
3. The trials are independent;
4. The trials are identical, which means the probabilities for the outcomes of each trial are the same. We use $p_1, p_2, \dots, p_r$ to denote them ($p_i > 0$ and $\sum_{i=1}^r p_i = 1$).

Definition
An experiment for which Conditions 1–4 are satisfied is called a multinomial experiment. Let $X_i$ = the number of trials resulting in outcome i; then the joint pmf of $X_1, X_2, \dots, X_r$ is called a multinomial distribution.

Remark: The joint pmf is
\[ p(x_1, x_2, \dots, x_r) = \begin{cases} \dfrac{n!}{x_1!\,x_2! \cdots x_r!}\, p_1^{x_1} p_2^{x_2} \cdots p_r^{x_r} & 0 \le x_i \le n \text{ with } \sum_{i=1}^r x_i = n \\ 0 & \text{otherwise} \end{cases} \]
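To tie the formula to numbers, here is a brief sketch of mine (not from the slides; SciPy assumed) that evaluates the die example's probability both directly and via scipy.stats.multinomial:

```python
from math import factorial
from scipy.stats import multinomial

p = [0.15, 0.20, 0.25, 0.20, 0.15, 0.05]  # per-toss probabilities of faces 1..6
counts = [3, 0, 0, 0, 2, 1]               # three 1's, two 5's, one 6 in n = 6 tosses

# Direct use of the multinomial pmf (0! = 1, so the zero counts drop out).
coef = factorial(6) // (factorial(3) * factorial(2) * factorial(1))
print(coef * 0.15**3 * 0.15**2 * 0.05)    # ~0.000228

# The same value from SciPy's multinomial distribution.
print(multinomial.pmf(counts, n=6, p=p))
```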
Definition
The random variables $X_1, X_2, \dots, X_n$ are said to be independent if for every subset $X_{i_1}, X_{i_2}, \dots, X_{i_k}$ of the variables (each pair, each triple, and so on), the joint pmf or pdf of the subset is equal to the product of the marginal pmf's or pdf's.

E.g., one way to construct a multivariate normal distribution is to take the product of the pdf's of n independent standard normal rv's:
\[ f(x_1, x_2, \dots, x_n) = \frac{1}{\sqrt{2\pi}} e^{-x_1^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-x_2^2/2} \cdots \frac{1}{\sqrt{2\pi}} e^{-x_n^2/2} = \frac{1}{(\sqrt{2\pi})^n} e^{-(x_1^2 + x_2^2 + \cdots + x_n^2)/2}. \]

Recall Problem 12: two components of a minicomputer have the following joint pdf for their useful lifetimes X and Y:
\[ f(x, y) = \begin{cases} x e^{-x(1+y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
If we find out that the lifetime of the second component is 8 (Y = 8), what is the probability that the first component has a lifetime of more than 8, i.e., what is $P(X \ge 8 \mid Y = 8)$? We can answer this question by studying conditional probability distributions.

Definition
Let X and Y be two continuous rv's with joint pdf f(x, y) and marginal pdf $f_Y(y)$ for Y. Then for any y value for which $f_Y(y) > 0$, the conditional probability density function of X given that Y = y is
\[ f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}, \quad -\infty < x < \infty. \]
If X and Y are discrete, then the conditional probability mass function of X given that Y = y is
\[ p_{X|Y}(x \mid y) = \frac{p(x, y)}{p_Y(y)}, \quad -\infty < x < \infty. \]

Example (Problem 12 revisited): For the joint lifetime pdf above, what is $P(X \ge 8 \mid Y = 8)$? Since $f_Y(y) = \frac{1}{(1+y)^2}$,
\[ f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} = \begin{cases} x (1+y)^2 e^{-x(1+y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
Then
\[ P(X \ge 8 \mid Y = 8) = 1 - \int_{-\infty}^{8} f_{X|Y}(x \mid 8)\,dx = 1 - \int_0^8 81 x e^{-9x}\,dx = 73 e^{-72}. \]
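As a sanity check on that closed form, here is a short SymPy sketch of mine (not part of the original slides) that recomputes the conditional probability symbolically:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

joint = x * sp.exp(-x * (1 + y))             # f(x, y) on x >= 0, y >= 0
f_Y = sp.integrate(joint, (x, 0, sp.oo))     # marginal of Y: 1/(1 + y)**2
cond = sp.simplify(joint / f_Y)              # f_{X|Y}(x|y) = x(1+y)^2 e^{-x(1+y)}

prob = sp.integrate(cond.subs(y, 8), (x, 8, sp.oo))  # P(X >= 8 | Y = 8)
print(prob)                                          # 73*exp(-72)
```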