6
RANDOM VECTORS
Basic Concepts
Often, a single random variable cannot adequately provide all of the information
needed about the outcome of an experiment. For example, tomorrow’s weather
is really best described by an array of random variables that includes wind speed,
wind direction, atmospheric pressure, relative humidity, and temperature. It would
be neither easy nor desirable to attempt to combine all of this information into a
single measurement.
We would like to extend the notion of a random variable to deal with an experiment
that results in several observations each time the experiment is run. For example,
let T be a random variable representing tomorrow’s maximum temperature and
let R be a random variable representing tomorrow’s total rainfall. It would be
reasonable to ask for the probability that tomorrow’s temperature is greater than
70◦ and tomorrow’s total rainfall is less than 0.1 inch. In other words, we wish to
determine the probability of the event
A = {T > 70, R < 0.1}.
Another question that we might like to have answered is, “What is the probability
that the temperature will be greater than 70◦ regardless of the rainfall?” To answer
this question, we would need to compute the probability of the event
B = {T > 70}.
In this chapter, we will build on our probability model and extend our definition of
a random variable to permit such calculations.
Definition
The first thing we must do is to precisely define what we mean by “an array of
random variables.”
Figure 6.1: A mapping using (X1, X2)
Definition 6.1. Let Ω be a sample space. An n-dimensional random variable or
random vector, (X1(·), X2(·), . . . , Xn(·)), is a vector of functions that assigns to
each point ω ∈ Ω a point (X1(ω), X2(ω), . . . , Xn(ω)) in n-dimensional Euclidean
space.
Example: Consider an experiment where a die is rolled twice. Let X1 denote the
number of the first roll, and X2 the number of the second roll. Then (X1 , X2 ) is a
two-dimensional random vector. A possible sample point in Ω is the outcome ω in
which the first roll shows a 4 and the second roll shows a 1; this ω is mapped into
the point (4, 1) as shown in Figure 6.1.
Figure 6.2: The region representing the event P (X ≤ x, Y ≤ y).
Joint Distributions
Now that we know the definition of a random vector, we can begin to use it to assign
probabilities to events. For any random vector, we can define a joint cumulative
distribution function for all of the components as follows:
Definition 6.2. Let (X1 , X2 , . . . , Xn ) be a random vector. The joint cumulative
distribution function for this random vector is given by
FX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ) ≡ P (X1 ≤ x1 , X2 ≤ x2 , . . . , Xn ≤ xn ).
In the two-dimensional case, the joint cumulative distribution function for the
random vector (X, Y ) evaluated at the point (x, y), namely
FX,Y (x, y),
is the probability that the experiment results in a two-dimensional value within the
shaded region shown in Figure 6.2.
Every joint cumulative distribution function must possess the following properties:
1. lim_{all xi→−∞} FX1,X2,...,Xn(x1, x2, . . . , xn) = 0
Figure 6.3: The possible outcomes from rolling a die twice.
2. lim_{all xi→+∞} FX1,X2,...,Xn(x1, x2, . . . , xn) = 1
3. As xi varies, with all other xj's (j ≠ i) fixed, FX1,X2,...,Xn(x1, x2, . . . , xn)
is a nondecreasing function of xi.
As in the case of one-dimensional random variables, we shall identify two major classifications of vector-valued random variables: discrete and continuous.
Although there are many common properties between these two types, we shall
discuss each separately.
Discrete Distributions
A random vector that can only assume at most a countable collection of discrete
values is said to be discrete. As an example, consider once again the example on
page 154 where a die is rolled twice. The possible values for either X1 or X2 are in
the set {1, 2, 3, 4, 5, 6}. Hence, the random vector (X1 , X2 ) can only take on one
of the 36 values shown in Figure 6.3.
If the die is fair, then each of the points can be considered to have a probability
mass of 1/36. This prompts us to define a joint probability mass function for this type
of random vector, as follows:
Definition 6.3. Let (X1, X2, . . . , Xn) be a discrete random vector. Then

pX1,X2,...,Xn(x1, x2, . . . , xn) ≡ P(X1 = x1, X2 = x2, . . . , Xn = xn)

is the joint probability mass function for the random vector (X1, X2, . . . , Xn).
Referring again to the example on page 154, we find that the joint probability mass
function for (X1 , X2 ) is given by
pX1,X2(x1, x2) = 1/36   for x1 = 1, 2, . . . , 6 and x2 = 1, 2, . . . , 6
Note that for any probability mass function,
FX1,X2,...,Xn(b1, b2, . . . , bn) = Σ_{x1≤b1} Σ_{x2≤b2} · · · Σ_{xn≤bn} pX1,X2,...,Xn(x1, x2, . . . , xn).
Therefore, if we wished to evaluate FX1 ,X2 (3.0, 4.5) we would sum all of the
probability mass in the shaded region shown in Figure 6.3, and obtain
FX1,X2(3.0, 4.5) = 12 · (1/36) = 1/3.
This is the probability that the first roll is less than or equal to 3 and the second roll
is less than or equal to 4.5.
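This brute-force summation is easy to check by machine. The short Python sketch below (ours, for illustration only; the names are arbitrary) tabulates the joint pmf of the two dice and adds the mass over the region {x1 ≤ 3.0, x2 ≤ 4.5}:

    from fractions import Fraction

    # Joint pmf of two fair dice: 1/36 at each of the 36 points of Figure 6.3.
    pmf = {(x1, x2): Fraction(1, 36) for x1 in range(1, 7) for x2 in range(1, 7)}

    def joint_cdf(b1, b2):
        # F(b1, b2): total mass on points with x1 <= b1 and x2 <= b2.
        return sum(p for (x1, x2), p in pmf.items() if x1 <= b1 and x2 <= b2)

    print(joint_cdf(3.0, 4.5))   # 1/3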
Every joint probability mass function must have the following properties:
1. pX1,X2,...,Xn(x1, x2, . . . , xn) ≥ 0 for any (x1, x2, . . . , xn).

2. Σ_{all x1} · · · Σ_{all xn} pX1,X2,...,Xn(x1, x2, . . . , xn) = 1.

3. P(E) = Σ_{(x1,...,xn)∈E} pX1,X2,...,Xn(x1, x2, . . . , xn) for any event E.
You should compare these properties with those of probability mass functions for
single-valued discrete random variables given in Chapter 5.
Continuous Distributions
Extending our notion of probability density functions to continuous random vectors
is a bit tricky, and the mathematical details of the problem are beyond the scope of
an introductory course. In essence, it is not possible to define the joint density as a
derivative of the cumulative distribution function as we did in the one-dimensional
case.
Let Rn denote n-dimensional Euclidean space. We sidestep the problem by defining
the density of an n-dimensional random vector to be a function that when integrated
over the set
{(x1 , . . . , xn ) ∈ Rn : x1 ≤ b1 , x2 ≤ b2 , . . . , xn ≤ bn }
will yield the value for the cumulative distribution function evaluated at
(b1 , b2 , . . . , bn ).
More formally, we have the following:
Definition 6.4. Let (X1 , . . . , Xn ) be a continuous random vector with joint
cumulative distribution function
FX1 ,...,Xn (x1 , . . . , xn ).
The function fX1 ,...,Xn (x1 , . . . , xn ) that satisfies the equation
FX1,...,Xn(b1, . . . , bn) = ∫_{−∞}^{b1} · · · ∫_{−∞}^{bn} fX1,...,Xn(t1, . . . , tn) dt1 · · · dtn
for all (b1 , . . . , bn ) is called the joint probability density function for the random
vector (X1 , . . . , Xn ).
Now, instead of obtaining a derived relationship between the density and the
cumulative distribution function by using integrals as anti-derivatives, we have
enforced such a relationship by the above definition.
Every joint probability density function must have the following properties:
1. fX1,X2,...,Xn(x1, x2, . . . , xn) ≥ 0 for any (x1, x2, . . . , xn).

2. ∫_{−∞}^{+∞} · · · ∫_{−∞}^{+∞} fX1,...,Xn(x1, . . . , xn) dx1 · · · dxn = 1.

3. P(E) = ∫ · · · ∫_E fX1,...,Xn(x1, . . . , xn) dx1 · · · dxn for any event E.
You should compare these properties with those of probability density functions
for single-valued continuous random variables given in Chapter 5.
Figure 6.4: Computing P (a1 < X1 ≤ b1, a2 < X2 ≤ b2).
In the one-dimensional case, we had the handy formula
P (a < X ≤ b) = FX (b) − FX (a).
This worked for any type of probability distribution. The situation in the multidimensional case is a little more complicated, with a comparable formula given
by
P (a1 < X1 ≤ b1 , a2 < X2 ≤ b2 ) =
FX1 ,X2 (b1 , b2 ) − FX1 ,X2 (a1 , b2 ) − FX1 ,X2 (b1 , a2 ) + FX1 ,X2 (a1 , a2 ).
You should be able to verify this formula for yourself by accounting for all of the
probability masses in the regions shown in Figure 6.4.
Example: Let (X, Y ) be a two-dimensional random variable with the following
joint probability density function (see Figure 6.5):
fX,Y(x, y) = 2 − y   if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2
           = 0       otherwise

Note that

∫_1^2 ∫_0^2 (2 − y) dx dy = 1.
Figure 6.5: A two-dimensional probability density function
Suppose we would like to compute P (X ≤ 1.0, Y ≤ 1.5). To do this, we calculate
the volume under the surface fX,Y (x, y) over the region {(x, y) : x ≤ 1, y ≤ 1.5}.
This region of integration is shown shaded (in green) in Figure 6.5. Performing the
integration, we get,
P(X ≤ 1.0, Y ≤ 1.5) = ∫_{−∞}^{1.5} ∫_{−∞}^{1.0} fX,Y(x, y) dx dy
                    = ∫_{1.0}^{1.5} ∫_{0}^{1.0} (2 − y) dx dy = 3/8.
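If you want to confirm this value numerically, a small Python sketch (assuming scipy is available; purely illustrative) integrates the density over the same region:

    from scipy.integrate import dblquad

    # dblquad integrates f(y, x) with x over [a, b] and y over [gfun(x), hfun(x)].
    f = lambda y, x: 2 - y   # the density on its support; we only integrate inside it

    prob, _ = dblquad(f, 0.0, 1.0, lambda x: 1.0, lambda x: 1.5)
    print(prob)   # about 0.375 = 3/8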
Marginal Distributions
Given the probability distribution for a vector-valued random variable (X1 , . . . Xn ),
we might ask the question, “Can we determine the distribution of X1 , disregarding
the other components?” The answer is yes, and the solution requires the careful
use of English rather than mathematics.
For example, in the two-dimensional case, we may be given a random vector (X, Y )
with joint cumulative distribution function FX,Y (x, y). Suppose we would like to
find the cumulative distribution function for X alone, i.e., FX(x). We know that
FX,Y (x, y) = P (X ≤ x, Y ≤ y)
and we are asking for
FX(x) = P(X ≤ x).    (1)
But in terms of both X and Y , expression 1 can be read: “the probability that X
takes on a value less than or equal to x and Y takes on any value.” Therefore, it
would make sense to say
FX(x) = P(X ≤ x)
      = P(X ≤ x, Y ≤ ∞)
      = lim_{y→∞} FX,Y(x, y).
Using this idea, we shall define what we will call the marginal cumulative distribution function:
Definition 6.5. Let (X1 , . . . , Xn ) be a random vector with joint cumulative distribution function FX1 ,...,Xn (x1 , . . . , xn ). The marginal cumulative distribution
function for X1 is given by
FX1(x1) = lim_{x2→∞} lim_{x3→∞} · · · lim_{xn→∞} FX1,X2,...,Xn(x1, x2, . . . , xn).
Notice that we can renumber the components of the random vector and call any one
of them X1 . So we can use the above definition to find the marginal cumulative
distribution function for any of the Xi ’s.
Although Definition 6.5 is a nice definition, it is more useful to examine marginal
probability mass functions and marginal probability density functions. For example, suppose we have a discrete random vector (X, Y ) with joint probability mass
function pX,Y (x, y). To find pX (x), we ask “What is the probability that X = x
regardless of the value that Y takes on?” This can be written as

pX(x) = P(X = x) = P(X = x, Y = any value) = Σ_{all y} pX,Y(x, y).
Example: In the die example (on page 154)
pX1,X2(x1, x2) = 1/36   for x1 = 1, 2, . . . , 6 and x2 = 1, 2, . . . , 6

To find pX1(2), for example, we compute

pX1(2) = P(X1 = 2) = Σ_{k=1}^{6} pX1,X2(2, k) = 6 · (1/36) = 1/6.
Table 6.1: Joint pmf for daily production
pX,Y(x, y)                     Y
                  1      2      3      4      5
          1     .05      0      0      0      0
          2     .15    .10      0      0      0
    X     3     .05    .05    .10      0      0
          4     .05   .025   .025      0      0
          5     .10    .10    .10    .10      0
This is hardly a surprising result, but it brings some comfort to know we can get it
from all of the mathematical machinery we’ve developed thus far.
Example: Let X be the total number of items produced in a day’s work at a factory,
and let Y be the number of defective items produced. Suppose that the probability
mass function for (X, Y ) is given by Table 6.1. Using this joint distribution, we
can see that the probability of producing 2 items with exactly 1 of those items being
defective is
pX,Y (2, 1) = 0.15.
To find the marginal probability mass function for the total daily production, X ,
we sum the probabilities over all possible values of Y for each fixed x:
pX (1) = pX,Y (1, 1) = 0.05
pX (2) = pX,Y (2, 1) + pX,Y (2, 2) = 0.15 + 0.10 = 0.25
pX (3) = pX,Y (3, 1) + pX,Y (3, 2) + pX,Y (3, 3) = 0.05 + 0.05 + 0.10 = 0.20
etc.
But notice that in these computations, we are simply adding the entries in all columns
for each row of Table 6.1. Doing this for Y as well as for X, we obtain Table 6.2.
So, for example, P(Y = 2) = pY(2) = 0.275. We simply look for the result in
the margin¹ for the entry Y = 2.

¹ Would you believe that this is why they are called marginal distributions?
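The row-and-column sums are easy to automate. Here is a short Python sketch (the dictionary encoding of Table 6.1 is ours) that reproduces the marginal pmf's of Table 6.2:

    # Nonzero entries of Table 6.1; omitted (x, y) pairs have probability 0.
    joint = {
        (1, 1): .05, (2, 1): .15, (3, 1): .05, (4, 1): .05,  (5, 1): .10,
        (2, 2): .10, (3, 2): .05, (4, 2): .025, (5, 2): .10,
        (3, 3): .10, (4, 3): .025, (5, 3): .10,
        (5, 4): .10,
    }

    p_X = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in range(1, 6)}
    p_Y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in range(1, 6)}
    print(p_X)   # about {1: .05, 2: .25, 3: .20, 4: .10, 5: .40}
    print(p_Y)   # about {1: .40, 2: .275, 3: .225, 4: .10, 5: 0}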
The procedure is similar for obtaining marginal probability density functions. Recall that a density, fX (x), itself is not a probability measure, but fX (x)dx, is. So
Table 6.2: Marginal pmf’s for daily production
pX,Y(x, y)                     Y
                  1      2      3      4      5    pX(x)
          1     .05      0      0      0      0      .05
          2     .15    .10      0      0      0      .25
    X     3     .05    .05    .10      0      0      .20
          4     .05   .025   .025      0      0      .10
          5     .10    .10    .10    .10      0      .40
      pY(y)     .40   .275   .225    .10      0
with a little loose-speaking integration notation we should be able to compute
fX(x) dx = P(x ≤ X < x + dx)
         = P(x ≤ X < x + dx, Y = any value)
         = [∫_{−∞}^{+∞} fX,Y(x, y) dy] dx

where y is the variable of integration in the above integral. Looking at this
relationship as

fX(x) dx = [∫_{−∞}^{+∞} fX,Y(x, y) dy] dx

it would seem reasonable to define

fX(x) ≡ ∫_{−∞}^{+∞} fX,Y(x, y) dy
and we therefore offer the following:
Definition 6.6. Let (X1, . . . , Xn) be a continuous random vector with joint
probability density function fX1,...,Xn. The marginal probability density function
for the random variable X1 is given by

fX1(x1) = ∫_{−∞}^{+∞} · · · ∫_{−∞}^{+∞} fX1,X2,...,Xn(x1, x2, . . . , xn) dx2 · · · dxn.
Notice that in both the discrete and continuous cases, we sum (or integrate) over
all possible values of the unwanted random variable components.
Figure 6.6: The support for the random vector
The trick in such problems is to ensure that your limits of integration are correct.
Drawing a picture of the region where there is positive probability mass (the support
of the distribution) often helps.
For the above example, the picture of the support would be as shown in Figure 6.6.
If the dotted line in Figure 6.6 indicates a particular value for x (call it a), by
integrating over all values of y , we are actually determining how much probability
mass has been placed along the line x = a. The integration process assigns all of
that mass to x = a in one-dimension. Repeating this process for each x yields the
desired probability density function.
Example: Let (X, Y ) be a two-dimensional continuous random variable with joint
probability density function
fX,Y(x, y) = x + y   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0       otherwise
Find the marginal probability density function for X .
Solution: Let’s consider the case where we fix x so that 0 ≤ x ≤ 1. We compute
fX(x) = ∫_{−∞}^{+∞} fX,Y(x, y) dy
      = ∫_0^1 (x + y) dy
      = [xy + y²/2]_{y=0}^{y=1}
      = x + 1/2

If x is outside the interval [0, 1], we have fX(x) = 0.

So, summarizing these computations we find that

fX(x) = x + 1/2   if 0 ≤ x ≤ 1
      = 0         otherwise

We will leave it to you to check that fX(x) is, in fact, a probability density
function by making sure that fX(x) ≥ 0 for all x and that ∫_{−∞}^{+∞} fX(x) dx = 1.
Example: Let (X, Y ) be the two-dimensional random variable with the following
joint probability density function (see Figure 6.5 on page 160)
fX,Y(x, y) = 2 − y   if 0 ≤ x ≤ 2 and 1 ≤ y ≤ 2
           = 0       otherwise
Find the marginal probability density function for X and the marginal probability
density function for Y .
Solution: Let’s first find the marginal probability density function for X . Consider
the case where we fix x so that 0 ≤ x ≤ 2. We compute
fX(x) = ∫_{−∞}^{+∞} fX,Y(x, y) dy
      = ∫_1^2 (2 − y) dy
      = [2y − y²/2]_{y=1}^{y=2}
      = 1/2

If x is outside the interval [0, 2], we have fX(x) = 0. Therefore,

fX(x) = 1/2   if 0 ≤ x ≤ 2
      = 0     otherwise
To find the marginal probability density function for Y , consider the case where
we fix y so that 1 ≤ y ≤ 2. We compute
fY(y) = ∫_{−∞}^{+∞} fX,Y(x, y) dx
      = ∫_0^2 (2 − y) dx
      = (2 − y) x |_{x=0}^{x=2}
      = 2(2 − y)

(Note that the x-integration runs over the full support 0 ≤ x ≤ 2.) If y is outside
the interval [1, 2], we have fY(y) = 0. Therefore,

fY(y) = 2(2 − y)   if 1 ≤ y ≤ 2
      = 0          otherwise
You should also double check that fX and fY are both probability density functions.
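A quick numerical check (Python with scipy assumed; the sketch and names are ours) confirms that both marginals are nonnegative and integrate to one over their supports:

    from scipy.integrate import quad

    f_X = lambda x: 0.5            # on 0 <= x <= 2
    f_Y = lambda y: 2 * (2 - y)    # on 1 <= y <= 2

    print(quad(f_X, 0, 2)[0])      # 1.0
    print(quad(f_Y, 1, 2)[0])      # 1.0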
Functions of Random Vectors
The technique for computing functions of one-dimensional random variables carries
over to the multi-dimensional case. Most of these problems just require a little
careful reasoning.
Example: Let (X1 , X2 ) be the die-tossing random vector of the example on
page 154. Find the probability mass function for the random variable Z = X1 +X2 ,
the sum of the two rolls.
Solution: We already know that pX1,X2(x1, x2) = 1/36 for x1 = 1, . . . , 6 and
x2 = 1, . . . , 6. We ask the question, “What are the possible values that Z can take on?”
The answer: “The integers 2, 3, 4, . . . , 12.” For example, Z equals 4 precisely
when any one of the mutually exclusive events
{X1 = 1, X2 = 3},
{X1 = 2, X2 = 2},
or
{X1 = 3, X2 = 1}
occurs. So,
pZ(4) = pX1,X2(1, 3) + pX1,X2(2, 2) + pX1,X2(3, 1) = 3/36.
Continuing in this manner, you should be able to verify that
pZ(2) = 1/36;    pZ(3) = 2/36;    pZ(4) = 3/36;
pZ(5) = 4/36;    pZ(6) = 5/36;    pZ(7) = 6/36;
pZ(8) = 5/36;    pZ(9) = 4/36;    pZ(10) = 3/36;
pZ(11) = 2/36;   pZ(12) = 1/36.
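The whole table can be generated by enumerating the 36 equally likely outcomes; the sketch below (ours, for checking only) does exactly that:

    from collections import Counter
    from fractions import Fraction

    # Count how many of the 36 outcomes give each sum z = x1 + x2.
    counts = Counter(x1 + x2 for x1 in range(1, 7) for x2 in range(1, 7))
    p_Z = {z: Fraction(c, 36) for z, c in sorted(counts.items())}
    print(p_Z[2], p_Z[7], p_Z[12])   # 1/36 1/6 1/36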
Example: Let (X, Y ) be the two-dimensional random variable given in the example on page 164. Find the cumulative distribution function for the random variable
Z =X +Y.
Solution: The support for the random variable Z is the interval [0, 2], so
FZ(z) = 0   if z < 0
      = ?   if 0 ≤ z ≤ 2
      = 1   if z > 2
For the case 0 ≤ z ≤ 2, we wish to evaluate
FZ (z) = P (Z ≤ z) = P (X + Y ≤ z).
In other words, we are computing the probability mass assigned to the shaded set
in Figure 6.7 as z varies from 0 to 2.
In establishing limits of integration, we notice that there are two cases to worry
about as shown in Figure 6.8:
Case I (z ≤ 1):

FZ(z) = ∫_0^z ∫_0^{z−y} (x + y) dx dy = (1/3)z³.

Case II (z > 1):

FZ(z) = ∫_0^{z−1} ∫_0^1 (x + y) dx dy + ∫_{z−1}^1 ∫_0^{z−y} (x + y) dx dy = z² − (1/3)z³ − 1/3.
Figure 6.7: The event {Z ≤ z}.
Figure 6.8: Two cases to worry about for the example
These two cases can be summarized as
FZ(z) = 0                    if z < 0
      = (1/3)z³              if 0 ≤ z ≤ 1
      = z² − (1/3)z³ − 1/3   if 1 < z ≤ 2
      = 1                    if z > 2
Notice what we thought at first to be one case (0 ≤ z ≤ 2) had to be divided into
two cases (0 ≤ z ≤ 1 and 1 < z ≤ 2).
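A Monte Carlo check of this cumulative distribution function is straightforward. The sketch below (illustrative only; the use of rejection sampling is our choice, not the text's) compares empirical and exact values of FZ for the density fX,Y(x, y) = x + y on the unit square:

    import random

    def sample_xy():
        # Rejection sampling from f(x, y) = x + y on the unit square (f <= 2 there).
        while True:
            x, y, u = random.random(), random.random(), random.random()
            if 2 * u <= x + y:
                return x, y

    def F_Z(z):
        return z**3 / 3 if z <= 1 else z**2 - z**3 / 3 - 1/3

    n = 200_000
    zs = [sum(sample_xy()) for _ in range(n)]
    for z in (0.5, 1.0, 1.5):
        print(z, sum(s <= z for s in zs) / n, F_Z(z))   # empirical vs. exact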
Independence of Random Variables
Definition 6.7. A sequence of n random variables X1, X2, . . . , Xn is independent
if and only if

FX1,X2,...,Xn(b1, b2, . . . , bn) = FX1(b1)FX2(b2) · · · FXn(bn)
for all values b1 , b2 , . . . , bn .
Definition 6.8. A sequence of n random variables X1 , X2 , . . . , Xn is a random
sample if and only if
1. X1 , X2 , . . . , Xn are independent, and
2. FXi (x) = F (x) for all x and for all i (i.e., each Xi has the same marginal
distribution, F (x)).
We say that a random sample is a vector of independent and identically distributed (i.i.d.) random variables.
Recall: An event A is independent of an event B if and only if
P (A ∩ B) = P (A)P (B).
Theorem 6.1. If X and Y are independent random variables then any event A
involving X alone is independent of any event B involving Y alone.
Testing for independence
Case I: Discrete
A discrete random variable X is independent of a discrete random variable Y if
and only if
pX,Y (x, y) = [pX (x)][pY (y)]
for all x and y .
Case II: Continuous
A continuous random variable X is independent of a continuous random variable
Y if and only if
fX,Y (x, y) = [fX (x)][fY (y)]
for all x and y .
Example: A company produces two types of compressors, grade A and grade B.
Let X denote the number of grade A compressors produced on a given day. Let
Y denote the number of grade B compressors produced on the same day. Suppose
that the joint probability mass function pX,Y (x, y) = P (X = x, Y = y) is given
by the following table:
pX,Y(x, y)         y
               0       1
         0   0.1     0.3
    x    1   0.2     0.1
         2   0.2     0.1
The random variables X and Y are not independent. Note that
pX,Y(0, 0) = 0.1 ≠ pX(0)pY(0) = (0.4)(0.5) = 0.2
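Checking the product condition cell by cell is mechanical; a short Python sketch (our encoding of the table) makes the failure explicit:

    joint = {(0, 0): 0.1, (1, 0): 0.2, (2, 0): 0.2,
             (0, 1): 0.3, (1, 1): 0.1, (2, 1): 0.1}

    p_X = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1, 2)}
    p_Y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

    ok = all(abs(joint[(x, y)] - p_X[x] * p_Y[y]) < 1e-12
             for x in (0, 1, 2) for y in (0, 1))
    print(ok)                              # False: X and Y are not independent
    print(joint[(0, 0)], p_X[0] * p_Y[0])  # about 0.1 versus 0.2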
Example: Suppose an electronic circuit contains two transistors. Let X be the
time to failure of transistor 1 and let Y be the time to failure of transistor 2.
fX,Y(x, y) = 4e^{−2(x+y)}   if x ≥ 0, y ≥ 0
           = 0              otherwise

The marginal densities are

fX(x) = 2e^{−2x}   if x ≥ 0
      = 0          otherwise

fY(y) = 2e^{−2y}   if y ≥ 0
      = 0          otherwise
We must check the probability density functions for (X, Y ), X and Y for all values
of (x, y).
For x ≥ 0 and y ≥ 0:
fX,Y(x, y) = 4e^{−2(x+y)} = fX(x)fY(y) = (2e^{−2x})(2e^{−2y})

For x ≥ 0 and y < 0:

fX,Y(x, y) = 0 = fX(x)fY(y) = 2e^{−2x}(0)

For x < 0 and y ≥ 0:

fX,Y(x, y) = 0 = fX(x)fY(y) = (0)2e^{−2y}

For x < 0 and y < 0:

fX,Y(x, y) = 0 = fX(x)fY(y) = (0)(0)
So the random variables X and Y are independent.
Expectation and random vectors
Suppose we are given a random vector (X, Y ) and a function g(x, y). Can we find
E(g(X, Y ))?
Theorem 6.2.
E(g(X, Y)) = Σ_{all x} Σ_{all y} g(x, y) pX,Y(x, y)    if (X, Y) is discrete

E(g(X, Y)) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fX,Y(x, y) dy dx    if (X, Y) is continuous
Example: Suppose X and Y have joint probability density function
fX,Y(x, y) = x + y   if 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
           = 0       otherwise
Let Z = XY . To find E(Z) = E(XY ) use Theorem 6.2 to get
E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy fX,Y(x, y) dx dy
      = ∫_0^1 ∫_0^1 xy(x + y) dx dy
      = ∫_0^1 ∫_0^1 (x²y + xy²) dx dy
      = ∫_0^1 [x³y/3 + x²y²/2]_{x=0}^{x=1} dy
      = ∫_0^1 (y/3 + y²/2) dy
      = [y²/6 + y³/6]_0^1 = 1/3
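As a check (scipy assumed; the code is only a sketch), the same expectation can be computed numerically from Theorem 6.2 with g(x, y) = xy:

    from scipy.integrate import dblquad

    val, _ = dblquad(lambda y, x: x * y * (x + y), 0, 1, lambda x: 0, lambda x: 1)
    print(val)   # about 0.3333 = 1/3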
We will prove the following results for the case when (X, Y) is a continuous random
vector. The proofs for the discrete case are similar, using summations rather than
integrals and probability mass functions rather than probability density functions.
Theorem 6.3. E(X + Y ) = E(X) + E(Y )
Proof. Using Theorem 6.2:
E(X + Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x + y) fX,Y(x, y) dx dy
         = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x fX,Y(x, y) dx dy + ∫_{−∞}^{∞} ∫_{−∞}^{∞} y fX,Y(x, y) dx dy
         = ∫_{−∞}^{∞} x [∫_{−∞}^{∞} fX,Y(x, y) dy] dx + ∫_{−∞}^{∞} y [∫_{−∞}^{∞} fX,Y(x, y) dx] dy
         = ∫_{−∞}^{∞} x fX(x) dx + ∫_{−∞}^{∞} y fY(y) dy
         = E(X) + E(Y)

Theorem 6.4. If X and Y are independent, then

E[h(X)g(Y)] = E[h(X)]E[g(Y)]
Proof. Using Theorem 6.2:
E[h(X)g(Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x)g(y) fX,Y(x, y) dx dy

since X and Y are independent. . .

            = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x)g(y) fX(x)fY(y) dx dy
            = ∫_{−∞}^{∞} g(y)fY(y) [∫_{−∞}^{∞} h(x)fX(x) dx] dy
            = ∫_{−∞}^{∞} g(y)fY(y) E(h(X)) dy

since E(h(X)) is a constant. . .

            = E(h(X)) ∫_{−∞}^{∞} g(y)fY(y) dy
            = E(h(X))E(g(Y))
Corollary 6.5. If X and Y are independent, then E(XY ) = E(X)E(Y )
Proof. Using Theorem 6.4, set h(x) = x and g(y) = y to get
E[h(X)g(Y )] = E(h(X))E(g(Y ))
E[XY ] = E(X)E(Y )
Definition 6.9. The covariance of the random variables X and Y is
Cov(X, Y ) ≡ E[(X − E(X))(Y − E(Y ))].
Note that Cov(X, X) = Var(X).
Theorem 6.6. For any random variables X and Y
Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X, Y )
Proof. Remember that the variance of a random variable W is defined as

Var(W) = E[(W − E(W))²]

Now let W = X + Y . . .

Var(X + Y) = E[(X + Y − E(X + Y))²]
           = E[(X + Y − E(X) − E(Y))²]
           = E[({X − E(X)} + {Y − E(Y)})²]

Now let a = {X − E(X)}, let b = {Y − E(Y)}, and expand (a + b)² to get. . .

           = E[{X − E(X)}² + {Y − E(Y)}² + 2{X − E(X)}{Y − E(Y)}]
           = E[{X − E(X)}²] + E[{Y − E(Y)}²] + 2E[{X − E(X)}{Y − E(Y)}]
           = Var(X) + Var(Y) + 2Cov(X, Y)
Theorem 6.7. Cov(X, Y ) = E(XY ) − E(X)E(Y )
Proof. Using Definition 6.9, we get
Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
          = E[XY − XE(Y) − E(X)Y + E(X)E(Y)]
          = E[XY] + E[−XE(Y)] + E[−E(X)Y] + E[E(X)E(Y)]

Since E[X], E[Y], and E[XY] are all constants. . .

          = E[XY] − E(Y)E[X] − E(X)E[Y] + E(X)E(Y)
          = E[XY] − E(X)E(Y)
Corollary 6.8. If X and Y are independent then Cov(X, Y ) = 0.
Proof.
If X and Y are independent then, from Corollary 6.5, E(XY ) =
E(X)E(Y ). We can then use Theorem 6.7:
Cov(X, Y ) = E(XY ) − E(X)E(Y ) = 0
Corollary 6.9. If X and Y are independent then Var(X+Y ) = Var(X)+Var(Y ).
Proof. If X and Y are independent then, from Theorem 6.6 and Corollary 6.8:
Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X, Y )
Var(X + Y ) = Var(X) + Var(Y ) + 2(0)
Var(X + Y ) = Var(X) + Var(Y )
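These identities are easy to test on a small discrete example. The sketch below (our encoding of the compressor pmf from the independence example above) computes Var(X + Y) directly and via Theorem 6.6:

    joint = {(0, 0): 0.1, (1, 0): 0.2, (2, 0): 0.2,
             (0, 1): 0.3, (1, 1): 0.1, (2, 1): 0.1}

    def E(g):
        # Expectation of g(X, Y) under the joint pmf (Theorem 6.2, discrete case).
        return sum(g(x, y) * p for (x, y), p in joint.items())

    EX, EY = E(lambda x, y: x), E(lambda x, y: y)
    VarX = E(lambda x, y: x**2) - EX**2
    VarY = E(lambda x, y: y**2) - EY**2
    Cov  = E(lambda x, y: x * y) - EX * EY
    VarS = E(lambda x, y: (x + y)**2) - E(lambda x, y: x + y)**2

    print(VarS, VarX + VarY + 2 * Cov)   # both about 0.64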
Definition 6.10. The covariance of two random variables X and Y is given by

Cov(X, Y) ≡ E[(X − E(X))(Y − E(Y))].
Definition 6.11. The correlation coefficient for two random variables X and Y
is given by
ρ(X, Y) ≡ Cov(X, Y) / √(Var(X) Var(Y)).
Theorems 6.10, 6.11 and 6.12 are stated without proof. Proofs of these results may
be found in the book by Meyer.²
Theorem 6.10. For any random variables X and Y ,
|ρ(X, Y )| ≤ 1.
Theorem 6.11. Suppose that |ρ(X, Y)| = 1. Then (with probability one),
Y = aX + b for some constants a and b. In other words: if the correlation
coefficient ρ is ±1, then Y is a linear function of X (with probability one).
The converse of this theorem is also true:
Theorem 6.12. Suppose that X and Y are two random variables, such that
Y = aX + b where a and b are constants. Then |ρ(X, Y )| = 1. If a > 0, then
ρ(X, Y ) = +1. If a < 0, then ρ(X, Y ) = −1.
² Meyer, P., Introductory probability theory and statistical applications, Addison-Wesley, Reading, MA, 1965.
Random vectors and conditional probability
Example: Consider the compressor problem again.
pX,Y(x, y)         y
               0       1
         0   0.1     0.3
    x    1   0.2     0.1
         2   0.2     0.1
Given that no grade B compressors were produced on a given day, what is the
probability that 2 grade A compressors were produced?
Solution:
P(X = 2 | Y = 0) = P(X = 2, Y = 0) / P(Y = 0)
                 = 0.2 / 0.5 = 2/5
Given that 2 compressors were produced on a given day, what is the probability
that one of them is a grade B compressor?
Solution:
P(Y = 1 | X + Y = 2) = P(Y = 1, X + Y = 2) / P(X + Y = 2)
                     = P(X = 1, Y = 1) / P(X + Y = 2)
                     = 0.1 / 0.3 = 1/3
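Both conditional probabilities are just ratios of sums over the table; a short Python sketch (our encoding) reproduces them:

    joint = {(0, 0): 0.1, (1, 0): 0.2, (2, 0): 0.2,
             (0, 1): 0.3, (1, 1): 0.1, (2, 1): 0.1}

    def P(event):
        return sum(p for (x, y), p in joint.items() if event(x, y))

    print(P(lambda x, y: x == 2 and y == 0) / P(lambda x, y: y == 0))          # 0.4 = 2/5
    print(P(lambda x, y: y == 1 and x + y == 2) / P(lambda x, y: x + y == 2))  # about 1/3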
Example: And, again, consider the two transistors. . .
fX,Y(x, y) = 4e^{−2(x+y)}   if x ≥ 0, y ≥ 0
           = 0              otherwise
Given that the total life time for the two transistors is less than two hours, what is
the probability that the first transistor lasted more than one hour?
Solution:
P(X > 1 | X + Y ≤ 2) = P(X > 1, X + Y ≤ 2) / P(X + Y ≤ 2)

We then compute

P(X > 1, X + Y ≤ 2) = ∫_1^2 ∫_0^{2−x} 4e^{−2(x+y)} dy dx = e^{−2} − 3e^{−4}

and

P(X + Y ≤ 2) = ∫_0^2 ∫_0^{2−x} 4e^{−2(x+y)} dy dx = 1 − 5e^{−4}

to get

P(X > 1 | X + Y ≤ 2) = (e^{−2} − 3e^{−4}) / (1 − 5e^{−4}) ≈ 0.0885
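Both integrals are over triangular regions and are easy to confirm numerically (scipy assumed; the sketch is illustrative only):

    from math import exp
    from scipy.integrate import dblquad

    f = lambda y, x: 4 * exp(-2 * (x + y))

    num, _ = dblquad(f, 1, 2, lambda x: 0, lambda x: 2 - x)   # P(X > 1, X + Y <= 2)
    den, _ = dblquad(f, 0, 2, lambda x: 0, lambda x: 2 - x)   # P(X + Y <= 2)
    print(num / den)                                          # about 0.0885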
Conditional distributions
Case I: Discrete
Let X and Y be random variables with joint probability mass function pX,Y (x, y)
and let pY (y) be the marginal probability mass function for Y .
We define the conditional probability mass function of X given Y as
pX|Y(x | y) = pX,Y(x, y) / pY(y)

whenever pY(y) > 0.
Case II: Continuous
Let X and Y be random variables with joint probability density function fX,Y (x, y)
and let fY (y) be the marginal probability density function for Y .
We define the conditional probability density function of X given Y as
fX|Y(x | y) = fX,Y(x, y) / fY(y)

whenever fY(y) > 0.
Law of total probability
Case I: Discrete
pX(x) = Σ_{all y} pX|Y(x | y) pY(y)
Case II: Continuous
fX(x) = ∫_{−∞}^{+∞} fX|Y(x | y) fY(y) dy
Self-Test Exercises for Chapter 6
For each of the following multiple-choice questions, choose the best response
among those provided. Answers can be found in Appendix B.
S6.1 Let X1, X2, X3, X4 be independent and identically distributed random variables,
each with P(Xi = 1) = 1/2 and P(Xi = 0) = 1/2. Let P(X1 + X2 + X3 + X4 = 3) = r;
then the value of r is

(A) 1/16
(B) 1/4
(C) 1/2
(D) 1
(E) none of the above.
S6.2 Let (X, Y ) be a discrete random vector with joint probability mass function
given by
pX,Y (0, 0) = 1/4
pX,Y (0, 1) = 1/4
pX,Y (1, 1) = 1/2
Then P (Y = 1) equals
(A) 1/3
(B) 1/4
(C) 1/2
(D) 3/4
(E) none of the above.
S6.3 Let (X, Y ) be a random vector with joint probability mass function given by
pX,Y(1, −1) = 1/4     pX,Y(1, 1) = 1/2     pX,Y(0, 0) = 1/4

Let pX(x) denote the marginal probability mass function for X. The value of pX(1) is

(A) 0
(B) 1/4
(C) 1/2
(D) 1
(E) none of the above.
S6.4 Let (X, Y ) be a random vector with joint probability mass function given by
pX,Y(1, −1) = 1/4     pX,Y(1, 1) = 1/2     pX,Y(0, 0) = 1/4

The value of P(XY > 0) is

(A) 0
(B) 1/4
(C) 1/2
(D) 1
(E) none of the above.
S6.5 Let (X, Y ) be a continuous random vector with joint probability density
function
fX,Y(x, y) = 1/2   if −1 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0     otherwise

Then P(X > 0) equals

(A) 0
(B) 1/4
(C) 1/2
(D) 1
(E) none of the above.
S6.6 Suppose (X, Y ) is a continuous random vector with joint probability density
function
fX,Y(x, y) = e^{−y}   if 0 ≤ x ≤ 1 and y ≥ 0
           = 0        otherwise

Then P(X > 1/2) equals

(A) 0
(B) 1/2
(C) e^{−1/2}
(D) (1/2)e^{−y}
(E) none of the above.
S6.7 Let (X, Y ) be a discrete random vector with joint probability mass function
given by
pX,Y (0, 0) = 1/3
pX,Y (0, 1) = 1/3
pX,Y (1, 0) = 1/3
Then P (X + Y = 1) equals
(A) 0
(B) 1/3
(C) 2/3
(D) 1
(E) none of the above.
S6.8 Let (X, Y ) be a discrete random vector with joint probability mass function
given by
pX,Y (0, 0) = 1/3
pX,Y (0, 1) = 1/3
pX,Y (1, 0) = 1/3
Let W = max{X, Y }. Then P (W = 1) equals
(A) 0
(B) 1/3
(C) 2/3
(D) 1
(E) none of the above.
S6.9 Suppose (X, Y ) is a continuous random vector with joint probability density
function
fX,Y(x, y) = y   if 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1
           = 0   otherwise

Then E(X²) equals
(A) 0
(B) 1
(C) 4/3
(D) 8/3
(E) none of the above.
S6.10 Suppose (X, Y ) is a continuous random vector with joint probability density
function
fX,Y(x, y) = y   if 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1
           = 0   otherwise
Then P (X < 1, Y < 0.5) equals
(A) 0
(B) 1/8
(C) 1/4
(D) 1/2
(E) none of the above.
S6.11 The weights of individual oranges are independent random variables, each
having an expected value of 6 ounces and standard deviation of 2 ounces.
Let Y denote the total net weight (in ounces) of a basket of n oranges. The
variance of Y is equal to
(A) 4√n
(B) 4n
(C) 4n2
(D) 4
(E) none of the above.
S6.12 Let X1 , X2 , X3 be independent and identically distributed random variables,
each with P (Xi = 1) = p and P (Xi = 0) = 1 − p. If P (X1 + X2 + X3 =
3) = r, then the value of p is
(A) 1/3
(B) 1/2
(C) r^{1/3}
(D) r³
(E) none of the above.
S6.13 If X and Y are random variables with Var(X) = Var(Y ), then Var(X + Y )
must equal
(A) 2Var(X)
(B) √2(Var(X))
(C) Var(2X)
(D) 4Var(X)
(E) none of the above.
S6.14 Let (X, Y ) be a continuous random vector with joint probability density
function
fX,Y(x, y) = 1/π   if x² + y² ≤ 1
           = 0     otherwise

Then P(X > 0) equals
(A) 1/(4π)
(B) 1/(2π)
(C) 1/2
(D) 1
(E) none of the above.
S6.15 Let X be a random variable. Then E[(X + 1)²] − (E[X + 1])² equals

(A) Var(X)
(B) 2E[(X + 1)²]
(C) 0
(D) 1
(E) none of the above.
S6.16 Let X , Y and Z be independent random variables with
E(X) = 1
E(Y ) = 0
E(Z) = 1
Let
W = X(Y + Z).
Then E(W ) equals
(A) 0
(B) E(X²)
(C) 1
(D) 2
(E) none of the above
S6.17 Toss a fair die twice. Let the random variable X represent the outcome of
the first toss and the random variable Y represent the outcome of the second
toss. What is the probability that X is odd and Y is even?
(A) 1/4
(B) 1/3
(C) 1/2
(D) 1
(E) none of the above
S6.18 Assume that (X, Y ) is random vector with joint probability density function
fX,Y(x, y) = k   for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and y ≤ x
           = 0   otherwise

where k is some positive constant. Let W = X − Y. To find the cumulative
distribution function for W we can compute

FW(w) = ∫_A^B ∫_C^D k dx dy

for 0 ≤ w ≤ 1. The limit of integration that should appear in position D is
(A) min{1, x − w}
(B) min{1, y − w}
(C) min{1, x − y}
(D) min{1, y + w}
(E) none of the above.
S6.19 Assume that (X, Y ) is a random vector with joint probability mass function
given by
pX,Y (−1, 0) = 1/4
pX,Y (0, 0) = 1/2
pX,Y (0, 1) = 1/4
Define the random variable W = XY . The value of P (W = 0) is
(A) 0
(B) 1/4
(C) 1/2
(D) 3/4
(E) none of the above.
S6.20 Let X be a random variable with E(X) = 0 and Var(X) = 1. Let Y be a
random variable with E(Y) = 0 and Var(Y) = 4. Then E(X² + Y²) equals
(A) 0
(B) 1
(C) 3
(D) 5
(E) none of the above.
S6.21 Suppose (X, Y ) is a continuous random vector with joint probability density
function
fX,Y(x, y) = 4xy   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0     otherwise

Then E[1/(XY)] equals
(A) +∞
(B) 0
(C) 1/2
(D) 4
(E) none of the above.
S6.22 The life lengths of two transistors in an electronic circuit form a random vector
(X, Y ) where X is the life length of transistor 1 and Y is the life length of
transistor 2. The joint probability density function of (X, Y ) is given by
fX,Y(x, y) = 2e^{−(x+2y)}   if x ≥ 0 and y ≥ 0
           = 0              otherwise

Then P(X + Y ≤ 1) equals

(A) ∫_0^1 ∫_0^1 2e^{−(x+2y)} dx dy
(B) ∫_0^1 ∫_0^{1−y} 2e^{−(x+2y)} dx dy
(C) ∫_0^1 ∫_y^1 2e^{−(x+2y)} dx dy
(D) ∫_0^1 ∫_{1−y}^1 2e^{−(x+2y)} dx dy
(E) none of the above.
Questions for Chapter 6
6.1 A factory can produce two types of gizmos, Type 1 and Type 2. Let X be a
random variable denoting the number of Type 1 gizmos produced on a given
day, and let Y be the number of Type 2 gizmos produced on the same day.
The joint probability mass function for X and Y is given by
pX,Y (1, 0) = 0.10;
pX,Y (1, 2) = 0.10;
pX,Y (2, 1) = 0.20;
pX,Y (3, 1) = 0.05;
pX,Y (1, 1) = 0.20;
pX,Y (1, 3) = 0.10;
pX,Y (2, 2) = 0.20;
pX,Y (3, 2) = 0.05;
(a) Compute P(X ≤ 2, Y = 2), P(X ≤ 2, Y ≠ 2) and
P (Y > 0).
(b) Find the marginal probability mass functions for X and for
Y.
(c) Find the distribution for the random variable Z = X + Y
which is the total daily production of gizmos.
6.2 A sample space Ω is a set consisting for four points, {ω1 , ω2 , ω3 , ω4 }. A
probability measure, P (·) assigns probabilities, as follows:
P({ω1}) = 1/2     P({ω2}) = 1/8
P({ω3}) = 1/4     P({ω4}) = 1/8
Random variables X , Y and Z are defined as
X(ω1) = 0,   Y(ω1) = 0,   Z(ω1) = 1
X(ω2) = 0,   Y(ω2) = 1,   Z(ω2) = 2
X(ω3) = 1,   Y(ω3) = 1,   Z(ω3) = 3
X(ω4) = 1,   Y(ω4) = 0,   Z(ω4) = 4
(a) Find the joint probability mass function for (X, Y, Z).
(b) Find the joint probability mass function for (X, Y ).
(c) Find the probability mass function for X .
(d) Find the probability mass function for the random variable
W = X + Y + Z.
(e) Find the probability mass function for the random variable
T = XY Z .
(f) Find the joint probability mass function for the random variable (W, T ).
(g) Find the probability mass function for the random variable
V = max{X, Y }.
6.3 Let (X, Y ) have a uniform distribution over the rectangle
with corners (0, 0), (0, 1), (1, 1), (1, 0). In other words, the probability density
function for (X, Y) is given by

fX,Y(x, y) = 1   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0   otherwise

Find the cumulative distribution function for the random variable Z = X + Y.
6.4 Two light bulbs are burned starting at time 0. The first one to fail burns out
at time X and the second one at time Y . Obviously X ≤ Y . The joint
probability density function for (X, Y ) is given by
fX,Y(x, y) = 2e^{−(x+y)}   if 0 < x < y < ∞
           = 0             otherwise
(a) Sketch the region in (x, y)-space for which the above probability density function assigns positive mass.
(b) Find the marginal probability density functions for X and
for Y .
(c) Let Z denote the excess life of the second bulb, i.e., let
Z = Y − X . Find the probability density function for Z .
(d) Compute P (Z > 1) and P (Y > 1).
6.5 If two random variables X and Y have a joint probability density function
given by
fX,Y(x, y) = 2   if x > 0, y > 0 and x + y < 1
           = 0   otherwise

(a) Find the probability that both random variables will take on a value less than 1/2.
(b) Find the probability that X will take on a value less than 1/4 and Y will take on a value greater than 1/2.
(c) Find the probability that the sum of the values taken on by the two random variables will exceed 2/3.
(d) Find the marginal probability density functions for X and for Y.
(e) Find the marginal probability density function for Z = X + Y.
6.6 Suppose (X, Y, Z) is a random vector with joint probability density function
fX,Y,Z(x, y, z) = α(xyz²)   if 0 < x < 1, 0 < y < 2 and 0 < z < 2
                = 0         otherwise

(a) Find the value of the constant α.
(b) Find the probability that X will take on a value less than 1/2 and Y and Z will both take on values less than 1.
(c) Find the probability density function for the random vector (X, Y).
6.7 Suppose that the random vector (X, Y ) has joint probability density function
fX,Y(x, y) = kx(x − y)   if 0 ≤ x ≤ 2 and |y| ≤ x
           = 0           otherwise
(a) Sketch the support of the distribution for (X, Y ) in the xy plane.
(b) Evaluate the constant k .
(c) Find the marginal probability density function for Y .
(d) Find the marginal probability density function for X .
(e) Find the probability density function for Z = X + Y .
6.8 The length, X , and the width, Y , of salt crystals form a random variable
(X, Y ) with joint probability density function
fX,Y(x, y) = x   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2
           = 0   otherwise
(a) Find the marginal probability density functions for X and
for Y .
6.9 If the joint probability density function of the price, P , of a commodity (in
dollars) and total sales, S , (in 10, 000 units) is given by
fS,P(s, p) = 5pe^{−ps}   if 0.20 < p < 0.40 and s > 0
           = 0           otherwise
(a) Find the marginal probability density functions for P and
for S .
(b) Find the conditional probability density function for S given
that P takes on the value p.
(c) Find the probability that sales will exceed 20, 000 units given
P = 0.25.
6.10 If X is the proportion of persons who will respond to one kind of mail-order
solicitation, Y is the proportion of persons who will respond to a second type of
mail-order solicitation, and the joint probability density function of X and Y is given by

fX,Y(x, y) = (2/5)(x + 4y)   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0               otherwise
(a) Find the marginal probability density functions for X and
for Y .
(b) Find the conditional probability density function for X given
that Y takes on the value y .
(c) Find the conditional probability density function for Y given
that X takes on the value x.
(d) Find the probability that there will be at least a 20% response
to the second type of mail-order solicitation.
(e) Find the probability that there will be at least a 20% response
to the second type of mail-order solicitation given that there
has only been a 10% response to the first kind of mail-order
solicitation.
6.11 An electronic component has two fuses. If an overload occurs, the time when
fuse 1 blows is a random variable, X , and the time when fuse 2 blows is a
random variable, Y . The joint probability density function of the random
vector (X, Y ) is given by
fX,Y(x, y) = e^{−y}   if 0 ≤ x ≤ 1 and y ≥ 0
           = 0        otherwise
(a) Compute P (X + Y ≥ 1).
(b) Are X and Y independent? Justify your answer.
6.12 The coordinates of a laser on a circular target are given by the random vector
(X, Y ) with the following probability density function:
fX,Y(x, y) = 1/π   if x² + y² ≤ 1
           = 0     otherwise
Hence, (X, Y ) has a uniform distribution on a disk of radius one centered at
(0, 0).
(a) Compute P(X² + Y² ≤ 0.25).
(b) Find the marginal distributions for X and Y .
(c) Compute P (X ≥ 0, Y > 0).
(d) Compute P (X ≥ 0 | Y > 0).
(e) Find the conditional distribution for X given Y = y . (Note:
Be careful of the case |y| = 1.)
6.13 Let X and Y be discrete random variables with the support of X denoted by
Θ and the support of Y denoted by Φ. Let pX be the marginal probability
mass function for X , let pX|Y be the conditional probability mass function
of X given Y , and let pY |X be the conditional probability mass function of
Y given X. Show that

pX|Y(x | y) = pY|X(y | x) pX(x) / Σ_{x∈Θ} pY|X(y | x) pX(x)

for any x ∈ Θ and y ∈ Φ.
6.14 The volumetric fractions of each of two compounds in a mixture are random variables X and Y , respectively. The random vector (X, Y ) has joint
probability density function
fX,Y(x, y) = 2   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 − x
           = 0   otherwise
(a) Determine the marginal probability density functions for X
and for Y .
(b) Are X and Y independent random variables? Justify your
answer.
(c) Compute P (X < 0.5 | Y = 0.5).
(d) Compute P (X < 0.5 | Y < 0.5).
6.15 Suppose (X, Y ) is a continuous random vector with joint probability density
function
fX,Y(x, y) = 2   if x > 0, y > 0 and x + y < 1
           = 0   otherwise
Find E(X + Y ).
6.16 Suppose (X, Y, Z) is a random vector with joint probability density function
fX,Y,Z(x, y, z) = (3/8)xyz²   if 0 < x < 1, 0 < y < 2, 0 < z < 2
                = 0           otherwise
Find E(X + Y + Z).
6.17 Suppose the joint probability density function of the price, P , of a commodity
(in dollars) and total sales, S , (in 10, 000 units) is given by
fS,P(s, p) = 5pe^{−ps}   if 0.20 < p < 0.40 and s > 0
           = 0           otherwise

Find E(S | P = 0.25).
6.18 Let X be the proportion of persons who will respond to one kind of mail-order
solicitation, and let Y be the proportion of persons who will respond to a second
type of mail-order solicitation. Suppose the joint probability density function of X
and Y is given by

fX,Y(x, y) = (2/5)(x + 4y)   if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0               otherwise
Find E(X | Y = y) for each 0 < y < 1.
6.19 Suppose that (X, Y ) is a random vector with the following joint probability
density function:
fX,Y(x, y) = 2x   if 0 < x < 1 and 0 < y < 1
           = 0    otherwise
Find the marginal probability density function for Y .
6.20 Suppose that (X, Y ) is a random vector with the following joint probability
density function:
fX,Y(x, y) = 2xe^{−y}   if 0 ≤ x ≤ 1 and y ≥ 0
           = 0          otherwise
(a) Find P (X > 0.5).
(b) Find the marginal probability density function for Y .
(c) Find E(X).
(d) Are X and Y independent random variables? Justify your
answer.
6.21 The Department of Metaphysics of Podunk University has 5 students, with
2 juniors and 3 seniors. Plans are being made for a party. Let X denote the
number of juniors who will attend the party, and let Y denote the number of
seniors who will attend.
After an extensive survey was conducted, it has been determined that the
random vector (X, Y ) has the following joint probability mass function:
pX,Y(x, y)               Y (seniors)
                     0      1      2      3
               0   .01    .25    .03    .01
X (juniors)    1   .04    .05    .15    .05
               2   .04    .12    .20    .05
(a) What is the probability that 3 or more students (juniors
and/or seniors) will show up at the party?
(b) What is the probability that more seniors than juniors will
show up at the party?
(c) If juniors are charged $5 for attending the party and seniors
are charged $10, what is the expected amount of money that
will be collected from all students?
(d) The two juniors arrive early at the party. What is the probability that no seniors will show up?
6.22 Let (X, Y ) be a continuous random vector with joint probability density
function given by
fX,Y(x, y) = 2   if x ≥ 0, y ≥ 0 and x + y ≤ 1
           = 0   otherwise
(a) Compute P (X < Y ).
(b) Find the marginal probability density function for X and the
marginal probability density function for Y .
(c) Find the conditional probability density function for Y given
that X = 1/2.
(d) Are X and Y independent random variables? Justify your
answer.
6.23 Let (X, Y ) be a continuous random vector with joint probability density
function given by
fX,Y(x, y) = 4xy   for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0     otherwise
(a) Find the marginal probability density function for X and the
marginal probability density function for Y .
(b) Are X and Y independent random variables? Justify your
answer.
(c) Find P (X + Y ≤ 1).
(d) Find P (X > 0.5 | Y ≤ 0.5).
6.24 Let (X, Y ) be a continuous random vector with joint probability density
function given by
fX,Y(x, y) = 6x²y   for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
           = 0      otherwise
(a) Find the marginal probability density function for X and the
marginal probability density function for Y .
(b) Are X and Y independent random variables? Justify your
answer.
(c) Find P (X + Y ≤ 1).
(d) Find E(X + Y ).
6.25 A box contains four balls numbered 1, 2, 3 and 4. A game consists of drawing
one of the balls at random from the box. It is not replaced. A second ball is
then drawn at random from the box.
Let X be the number on the first ball drawn, and let Y be the number on the
second ball.
(a) Find the joint probability mass function for the random vector (X, Y ).
(b) Compute P (Y ≥ 3 | X = 1).
6.26 Suppose that (X, Y ) is a random vector with the following joint probability
density function:
fX,Y(x, y) = 2xe^{−y}   if 0 < x < 1 and y > 0
           = 0          otherwise
(a) Compute P (X > 0.5, Y < 1.0).
(b) Compute P (X > 0.5 | Y < 1.0).
(c) Find the marginal probability density function for X .
(d) Find E(X + Y ).
6.27 Past studies have shown that juniors taking a particular probability course
receive grades according to a normal distribution with mean 65 and variance
36. Seniors taking the same course receive grades normally distributed with
a mean of 70 and a variance of 36. A probability class is composed of 75
juniors and 25 seniors.
(a) What is the probability that a student chosen at random from
the class will receive a grade in the 70’s (i.e., between 70
and 80)?
(b) If a student is chosen at random from the class, what is the
student’s expected grade?
(c) A student is chosen at random from the class and you are
told that the student has a grade in the 70’s. What is the
probability that the student is a junior?
6.28 (†) Suppose X and Y are positive, independent continuous random variables
with probability density functions fX (x) and fY (y), respectively. Let Z =
X/Y .
(a) Express the probability density function for Z in terms of an
integral that involves only fX and fY .
(b) Now suppose that X and Y can be positive and/or negative.
Express the probability density function for Z in terms of
an integral that involves only fX and fY . Compare your
answer to the case where X and Y are both positive.
ANSWERS TO SELF-TEST EXERCISES
Chapter 6
S6.1 B S6.2 D S6.3 E S6.4 C S6.5 C
S6.6 B S6.7 C S6.8 C S6.9 C S6.10 B
S6.11 B S6.12 C S6.13 E S6.14 C S6.15 A
S6.16 C S6.17 A S6.18 D S6.19 E S6.20 D
S6.21 D S6.22 B
Remarks
S6.19 The value of P(W = 0) is 1, which is not among choices (A) through (D); hence the answer is (E).