Jointly Distributed Random Variables

Definition
Let X and Y be two discrete random variables defined on the sample space S of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by

    p(x, y) = P(X = x \text{ and } Y = y)

(It must be the case that p(x, y) \ge 0 and \sum_x \sum_y p(x, y) = 1.)

For any event A consisting of pairs (x, y), the probability P[(X, Y) \in A] is obtained by summing the joint pmf over the pairs in A:

    P[(X, Y) \in A] = \sum\sum_{(x, y) \in A} p(x, y)

Definition
Let X and Y be two discrete random variables defined on the sample space S of an experiment with joint probability mass function p(x, y). Then the pmf's of each one of the variables alone are called the marginal probability mass functions, denoted by p_X(x) and p_Y(y), respectively. Furthermore,

    p_X(x) = \sum_y p(x, y) \qquad \text{and} \qquad p_Y(y) = \sum_x p(x, y)

Definition
Let X and Y be continuous random variables. A joint probability density function f(x, y) for these two variables is a function satisfying f(x, y) \ge 0 and \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1. For any two-dimensional set A,

    P[(X, Y) \in A] = \iint_A f(x, y)\, dx\, dy

In particular, if A is the two-dimensional rectangle \{(x, y) : a \le x \le b,\ c \le y \le d\}, then

    P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\, dy\, dx

Definition
Let X and Y be continuous random variables with joint pdf f(x, y). Then the marginal probability density functions of X and Y, denoted by f_X(x) and f_Y(y), respectively, are given by

    f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy \quad \text{for } -\infty < x < \infty

    f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx \quad \text{for } -\infty < y < \infty
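As a quick numeric illustration of the discrete definitions above (my addition, not part of the original slides; the joint pmf values are made up), the marginal pmf's are simply row and column sums of the joint pmf table:

    import numpy as np

    # Hypothetical joint pmf of X (rows) and Y (columns); the values are
    # invented for illustration but satisfy p(x, y) >= 0 and sum to 1.
    p = np.array([[0.10, 0.20],
                  [0.30, 0.40]])
    assert np.all(p >= 0) and np.isclose(p.sum(), 1.0)

    p_X = p.sum(axis=1)  # marginal pmf of X: sum over y for each fixed x
    p_Y = p.sum(axis=0)  # marginal pmf of Y: sum over x for each fixed y
    print(p_X)           # [0.3 0.7]
    print(p_Y)           # [0.4 0.6]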
Example (variant of Problem 12)
Two components of a minicomputer have the following joint pdf for their useful lifetimes X and Y:

    f(x, y) = \begin{cases} x e^{-(x + y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases}

a. What is the probability that the lifetimes of both components exceed 3?
b. What are the marginal pdf's of X and Y?
c. What is the probability that the lifetime X of the first component exceeds 3?
d. What is the probability that the lifetime of at least one component exceeds 3?

Example (Problem 17)
An ecologist wishes to select a point inside a circular sampling region according to a uniform distribution (in practice this could be done by first selecting a direction and then a distance from the center in that direction). Let X = the x coordinate of the point selected and Y = the y coordinate of the point selected. If the circle is centered at (0, 0) and has radius R, then the joint pdf of X and Y is

    f(x, y) = \begin{cases} \frac{1}{\pi R^2} & x^2 + y^2 \le R^2 \\ 0 & \text{otherwise} \end{cases}

What is the probability that the x coordinate of the selected point is within R/2 of the center of the circular region?

Recall the variant of Problem 12 above, with joint pdf f(x, y) = x e^{-(x + y)} for x \ge 0 and y \ge 0. The marginal pdf's of X and Y are f_X(x) = x e^{-x} and f_Y(y) = e^{-y}, respectively. We see that f_X(x) \cdot f_Y(y) = f(x, y).

Definition
Two random variables X and Y are said to be independent if for every pair of x and y values,

    p(x, y) = p_X(x) \cdot p_Y(y) \quad \text{when X and Y are discrete}

or

    f(x, y) = f_X(x) \cdot f_Y(y) \quad \text{when X and Y are continuous.}

If the above relation is not satisfied for all (x, y), then X and Y are said to be dependent.

Examples:
Problem 12. The joint pdf for X and Y is

    f(x, y) = \begin{cases} x e^{-x(1 + y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases}

The marginal pdf's of X and Y are f_X(x) = e^{-x} and f_Y(y) = \frac{1}{(1 + y)^2}, respectively. We see that f_X(x) \cdot f_Y(y) \ne f(x, y), so here X and Y are dependent.
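The following Python/scipy sketch (my addition, not from the slides) checks the factorization criterion numerically for both versions of the Problem 12 pdf, and evaluates part (a) of the earlier example. Note the asymmetry: a mismatch at a single point proves dependence, but agreement at a single point does not by itself prove independence, since the factorization must hold for all (x, y).

    import numpy as np
    from scipy import integrate

    def f_var(x, y):   # variant of Problem 12: the independent case
        return x * np.exp(-(x + y))

    def f_orig(x, y):  # original Problem 12: the dependent case
        return x * np.exp(-x * (1 + y))

    def marginal_X(f, x):
        # f_X(x): integrate the joint pdf over y (support is y >= 0)
        return integrate.quad(lambda y: f(x, y), 0, np.inf)[0]

    def marginal_Y(f, y):
        # f_Y(y): integrate the joint pdf over x (support is x >= 0)
        return integrate.quad(lambda x: f(x, y), 0, np.inf)[0]

    x0, y0 = 1.0, 2.0  # one arbitrary test point
    for name, f in [("variant", f_var), ("original", f_orig)]:
        ok = np.isclose(marginal_X(f, x0) * marginal_Y(f, y0), f(x0, y0))
        print(name, ok)  # variant: True; original: False

    # Part (a) of the variant: P(X > 3 and Y > 3), numerically.
    # (dblquad integrates func(y, x), hence the argument swap.)
    prob, _ = integrate.dblquad(lambda y, x: f_var(x, y),
                                3, np.inf, lambda x: 3, lambda x: np.inf)
    print(prob, 4 * np.exp(-6))  # both approximately 0.00991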
Examples:
Our first example: tossing a fair die twice. If we let X = the outcome of the first toss and Y = the outcome of the second toss, then we have

    p(x, y) = p_X(x) \cdot p_Y(y)

Obviously, the two tosses should be independent.

Our second example: dining choices.

    p(x, y)     y = 12   y = 15   y = 20   p(x, ·)
    x = 12       .05      .05      .10      .20
    x = 15       .05      .10      .35      .50
    x = 20       0        .20      .10      .30
    p(·, y)      .10      .35      .55

Here p_X(12) \cdot p_Y(12) = (.20)(.10) = .02 \ne .05 = p(12, 12), so X and Y are dependent.

Definition
If X_1, X_2, \ldots, X_n are all discrete random variables, the joint pmf of the variables is the function

    p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)

If the random variables are continuous, the joint pdf of X_1, X_2, \ldots, X_n is the function f(x_1, x_2, \ldots, x_n) such that for any n intervals [a_1, b_1], \ldots, [a_n, b_n],

    P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_n \cdots dx_1

Example: Consider tossing a particular die six times. The probabilities for the outcomes of each toss are given as follows:

    x      1    2    3    4    5    6
    p(x)  .15  .20  .25  .20  .15  .05

If we are interested in obtaining exactly three "1's", then this experiment can be modeled by the binomial distribution. However, if the question is "what is the probability of obtaining exactly three 1's, two 5's, and one 6", the binomial distribution cannot do the job.

Let X_i = the number of i's from the experiment (six tosses). Then

    P(X_1 = 3, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 2, X_6 = 1) = \frac{6!}{(3!)(2!)(1!)} (.15)^3 (.15)^2 (.05)^1

(the omitted 0! factors in the denominator all equal 1).
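As a sanity check (my addition, not in the original slides), scipy's multinomial distribution reproduces this hand computation; the count vector below encodes three 1's, two 5's, one 6, and X_2 = X_3 = X_4 = 0:

    from math import factorial
    from scipy.stats import multinomial

    p = [0.15, 0.20, 0.25, 0.20, 0.15, 0.05]  # P(face i), i = 1..6
    counts = [3, 0, 0, 0, 2, 1]               # three 1's, two 5's, one 6

    # By hand, following the formula on the slide:
    coef = factorial(6) // (factorial(3) * factorial(2) * factorial(1))
    by_hand = coef * 0.15**3 * 0.15**2 * 0.05**1
    print(by_hand)                             # about 2.278e-04

    # scipy agrees:
    print(multinomial.pmf(counts, n=6, p=p))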
Multinomial Distribution:
1. The experiment consists of a sequence of n trials, where n is fixed in advance of the experiment;
2. Each trial can result in one of r possible outcomes;
3. The trials are independent;
4. The trials are identical, which means the probabilities for the outcomes of each trial are the same. We use p_1, p_2, \ldots, p_r to denote them (p_i > 0 and \sum_{i=1}^{r} p_i = 1).

Definition
An experiment for which Conditions 1 — 4 are satisfied is called a multinomial experiment. Let X_i = the number of trials resulting in outcome i; then the joint pmf of X_1, X_2, \ldots, X_r is called a multinomial distribution.

Remark: The joint pmf is

    p(x_1, x_2, \ldots, x_r) = \begin{cases} \frac{n!}{(x_1!)(x_2!) \cdots (x_r!)} p_1^{x_1} p_2^{x_2} \cdots p_r^{x_r} & 0 \le x_i \le n \text{ with } \sum_{i=1}^{r} x_i = n \\ 0 & \text{otherwise} \end{cases}

Definition
The random variables X_1, X_2, \ldots, X_n are said to be independent if for every subset X_{i_1}, X_{i_2}, \ldots, X_{i_k} of the variables (each pair, each triple, and so on), the joint pmf or pdf of the subset is equal to the product of the marginal pmf's or pdf's.

e.g. One way to construct a multivariate normal distribution is to take the product of the pdf's of n independent standard normal rv's:

    f(x_1, x_2, \ldots, x_n) = \frac{1}{\sqrt{2\pi}} e^{-x_1^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-x_2^2/2} \cdots \frac{1}{\sqrt{2\pi}} e^{-x_n^2/2} = \frac{1}{(\sqrt{2\pi})^n} e^{-(x_1^2 + x_2^2 + \cdots + x_n^2)/2}
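A brief numeric check of this factorization (my addition, assuming numpy and scipy are available): at an arbitrary point, the product of standard normal marginals matches scipy's multivariate normal pdf with identity covariance, and matches the closed form above:

    import numpy as np
    from scipy.stats import norm, multivariate_normal

    n = 3
    x = np.array([0.5, -1.2, 2.0])   # an arbitrary point in R^3

    prod_marginals = np.prod(norm.pdf(x))   # product of the n marginal pdf's
    joint = multivariate_normal(np.zeros(n), np.eye(n)).pdf(x)
    closed_form = np.exp(-np.sum(x**2) / 2) / np.sqrt(2 * np.pi)**n

    assert np.isclose(prod_marginals, joint)
    assert np.isclose(joint, closed_form)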
Recall the following example (Problem 12): Two components of a minicomputer have the following joint pdf for their useful lifetimes X and Y:

    f(x, y) = \begin{cases} x e^{-x(1 + y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases}

If we find out that the lifetime of the second component is 8 (Y = 8), what is the probability that the first component has a lifetime of more than 8, i.e. what is P(X \ge 8 \mid Y = 8)? We can answer this question by studying conditional probability distributions.

Definition
Let X and Y be two continuous rv's with joint pdf f(x, y) and marginal pdf f_Y(y). Then for any y value for which f_Y(y) > 0, the conditional probability density function of X given that Y = y is

    f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}, \quad -\infty < x < \infty

If X and Y are discrete, then the conditional probability mass function of X given that Y = y is

    p_{X|Y}(x \mid y) = \frac{p(x, y)}{p_Y(y)}

Example (Problem 12 revisited): With the joint pdf above, what is P(X \ge 8 \mid Y = 8)? Using the marginal pdf f_Y(y) = 1/(1 + y)^2,

    f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} = \begin{cases} x (1 + y)^2 e^{-x(1 + y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases}

Then

    P(X \ge 8 \mid Y = 8) = 1 - \int_{-\infty}^{8} f_{X|Y}(x \mid 8)\, dx = 1 - \int_0^8 81 x e^{-9x}\, dx = 73 e^{-72}
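To double-check this algebra, here is a sympy sketch (my addition, not from the slides) that derives the marginal, the conditional pdf, and the final probability symbolically:

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)
    f = x * sp.exp(-x * (1 + y))          # joint pdf on x >= 0, y >= 0

    f_Y = sp.integrate(f, (x, 0, sp.oo))  # marginal of Y: 1/(1 + y)**2
    f_cond = sp.simplify(f / f_Y)         # x*(1 + y)**2*exp(-x*(1 + y))

    # P(X >= 8 | Y = 8): integrate the conditional pdf at y = 8 over x >= 8
    prob = sp.integrate(f_cond.subs(y, 8), (x, 8, sp.oo))
    print(sp.simplify(prob))              # expected: 73*exp(-72)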