STAT 211
Joint Probability Distributions and Random Samples
The following examples demonstrate the joint probability mass function of two discrete random variables and its properties, the marginal probability mass functions, independent and dependent variables, conditional distributions, expected value, variance, covariance, and correlation. The relevant properties are explained along the way.
Example 1: Quality audit records are kept on the numbers of major and minor failures of circuit packs during burn-in of large electronic switching devices. They indicate that, for a device of this type, the random variables X (the number of major failures) and Y (the number of minor failures) can be described, at least approximately, by the accompanying joint distribution.

y \ x     0      1      2      Total
0         0.15   0.05   0.01   0.21
1         0.10   0.10   0.04   0.24
2         0.10   0.14   0.04   0.28
3         0.10   0.13   0.04   0.27
Total     0.45   0.42   0.13   1

a.
Find the marginal probability mass functions for X and Y.

x       0      1      2      otherwise
p(x)    0.45   0.42   0.13   0

y       0      1      2      3      otherwise
p(y)    0.21   0.24   0.28   0.27   0

b.
Are X and Y independent?
No. For example, P(X=0, Y=0) = 0.15, but P(X=0)P(Y=0) = (0.45)(0.21) = 0.0945.
c.
Find the expected value and the variance of X.
E(X) = 0.68, E(X^2) = 0.94, Var(X) = E(X^2) - [E(X)]^2 = 0.4776
d.
Find the expected value and the variance of Y.
E(Y) = 1.61, E(Y^2) = 3.79, Var(Y) = E(Y^2) - [E(Y)]^2 = 1.1979
e.
Find Cov(X,Y) and Corr(X,Y).
E(XY) = 1.25, Cov(X,Y) = E(XY) - E(X)E(Y) = 0.1552, Corr(X,Y) = Cov(X,Y)/sqrt(Var(X)Var(Y)) = 0.2052
f.
Find the conditional probability mass function of Y given that X=0, that is, given that there are no major failures.

y               0        1        2        3        otherwise
P(Y=y | X=0)    0.3333   0.2222   0.2222   0.2222   0

g.
What is the expected number of minor failures given that there were no major failures?
E(Y | X=0) =1.3333
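The calculations in parts (a) through (g) can be checked with a short script. This is only a minimal sketch: the dictionary `joint` and the helper `expect` are names chosen here for illustration, and the probabilities are copied from the joint distribution table of Example 1.

```python
from math import sqrt

# Joint pmf p(x, y) from the Example 1 table, keyed by (x, y).
joint = {
    (0, 0): 0.15, (1, 0): 0.05, (2, 0): 0.01,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.04,
    (0, 2): 0.10, (1, 2): 0.14, (2, 2): 0.04,
    (0, 3): 0.10, (1, 3): 0.13, (2, 3): 0.04,
}

def expect(g):
    """E[g(X, Y)]: sum of g(x, y) * p(x, y) over every cell of the table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

# Marginal pmfs: sum the joint pmf over the other variable.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(3)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(4)}

ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - ex ** 2
var_y = expect(lambda x, y: y * y) - ey ** 2
cov = expect(lambda x, y: x * y) - ex * ey
corr = cov / sqrt(var_x * var_y)

# Independence would require p(x, y) = p_X(x) * p_Y(y) in every cell.
independent = all(abs(p - p_x[x] * p_y[y]) < 1e-9 for (x, y), p in joint.items())

# Conditional pmf of Y given X = 0, and its mean E(Y | X = 0).
cond_y = {y: joint[(0, y)] / p_x[0] for y in range(4)}
e_y_given_x0 = sum(y * p for y, p in cond_y.items())

print("E(X), Var(X):", round(ex, 4), round(var_x, 4))
print("E(Y), Var(Y):", round(ey, 4), round(var_y, 4))
print("Cov, Corr:", round(cov, 4), round(corr, 4))
print("independent:", independent, " E(Y|X=0):", round(e_y_given_x0, 4))
```

Working from the joint table alone keeps every quantity tied to one definition, E[g(X,Y)] = sum of g(x,y)p(x,y), so the marginals and moments cannot drift out of agreement.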
h.
Suppose that demerits are assigned to devices of this type according to the formula D=2X+Y. Find the marginal probability mass function for D.

d         0      1      2      3      4      5      6      7      otherwise
P(D=d)    0.15   0.10   0.15   0.20   0.15   0.17   0.04   0.04   0

For example, d=0 only when X=0, Y=0, so P(D=0) = P(X=0,Y=0) = 0.15. Likewise, d=3 only when X=0, Y=3 or X=1, Y=1, so P(D=3) = P(X=0,Y=3) + P(X=1,Y=1) = 0.10 + 0.10 = 0.20.
i.
Suppose that demerits are assigned to devices of this type according to the formula D=2X+Y. Find the expected value and the variance of D.
E(D) = 2.97, E(D^2) = 12.55, Var(D) = 3.7291
j.
Suppose that demerits are assigned to devices of this type according to the formula U=Min(X,Y). Find the mean value and the variance of U.

u         0      1      2      otherwise
P(U=u)    0.51   0.41   0.08   0

For example, u=0 only when X=0,Y=0 or X=0,Y=1 or X=0,Y=2 or X=0,Y=3 or X=1,Y=0 or X=2,Y=0, so
P(U=0) = P(X=0,Y=0) + P(X=0,Y=1) + P(X=0,Y=2) + P(X=0,Y=3) + P(X=1,Y=0) + P(X=2,Y=0) = 0.51
E(U) = 0.57, E(U^2) = 0.73, Var(U) = 0.4051
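The distributions of D=2X+Y and U=Min(X,Y) both come from the same recipe: group the cells of the joint table by the value of the function and add their probabilities. The sketch below illustrates that recipe; `pmf_of` and `mean_var` are hypothetical helper names, not part of the notes.

```python
from collections import defaultdict

# Joint pmf p(x, y) from the Example 1 table, keyed by (x, y).
joint = {
    (0, 0): 0.15, (1, 0): 0.05, (2, 0): 0.01,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.04,
    (0, 2): 0.10, (1, 2): 0.14, (2, 2): 0.04,
    (0, 3): 0.10, (1, 3): 0.13, (2, 3): 0.04,
}

def pmf_of(g):
    """pmf of g(X, Y): add p(x, y) over all cells that map to the same value."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[g(x, y)] += p
    return dict(sorted(out.items()))

def mean_var(pmf):
    """Mean and variance of a discrete pmf given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    second = sum(v * v * p for v, p in pmf.items())
    return mean, second - mean ** 2

pmf_d = pmf_of(lambda x, y: 2 * x + y)   # demerits D = 2X + Y
pmf_u = pmf_of(min)                      # U = Min(X, Y)

for name, pmf in (("D", pmf_d), ("U", pmf_u)):
    mean, var = mean_var(pmf)
    print(name, {v: round(p, 2) for v, p in pmf.items()}, round(mean, 4), round(var, 4))
```

As a cross-check on part (i), Var(D) can also be obtained from Var(2X+Y) = 4Var(X) + Var(Y) + 4Cov(X,Y) = 4(0.4776) + 1.1979 + 4(0.1552) = 3.7291.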
The following example demonstrates the joint probability density function of two continuous random variables and its properties, the marginal probability density functions, independent and dependent variables, conditional distributions, expected value, variance, covariance, and correlation. The relevant properties are explained along the way.
Example 2: Suppose that a pair of random variables X and Y has the joint probability density

$$f(x,y) = \begin{cases} x(1-y) & \text{if } 0 \le x \le 2 \text{ and } 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

a.
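Find the marginal probability density functions for X and Y.

$$g(x) = \begin{cases} x/2 & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \qquad \text{and} \qquad h(y) = \begin{cases} 2(1-y) & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

b.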
Are X and Y independent?
Yes, because f(x,y) = g(x)h(y) for all x and y.
c.
Find the expected value and variance of X.
E(X) = 4/3, E(X^2) = 2, Var(X) = 2 - (4/3)^2 = 2/9 = 0.2222
d.
Find the expected value and variance of Y.
E(Y) = 1/3, E(Y^2) = 1/6, Var(Y) = 1/6 - (1/3)^2 = 1/18 = 0.0556
e.
Find the conditional probability density function of X given Y=0.6.

$$f(x \mid y) = \begin{cases} x/2 & \text{if } 0 \le x \le 2 \text{ and } 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

When y=0.6, f(x|y) = x/2 for 0 ≤ x ≤ 2.
f.
Find the conditional probability density function of Y given X=0.4.

$$f(y \mid x) = \begin{cases} 2(1-y) & \text{if } 0 \le x \le 2 \text{ and } 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

When x=0.4, f(y|x) = 2(1-y) for 0 ≤ y ≤ 1.
g.
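What is E(X|Y=0.6)?
E(X | Y=0.6) = E(X) = 4/3, because the conditional density of X given Y=0.6 is the same as the marginal density g(x).
h.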
Find the Cov(X,Y) and Corr(X,Y).
E(XY)=4/9 Cov(X,Y)=0 Corr(X,Y)=0
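The moments in Example 2 come from double integrals of the form E[g(X,Y)] = ∫∫ g(x,y) f(x,y) dx dy. As a rough numerical check, the sketch below approximates those integrals with a midpoint rule on a grid; the function names and the grid size `n=200` are choices made here, and the printed values are approximations rather than exact results.

```python
def f(x, y):
    """Joint density of Example 2: x(1 - y) on 0 <= x <= 2, 0 <= y <= 1."""
    return x * (1 - y) if 0 <= x <= 2 and 0 <= y <= 1 else 0.0

def expect(g, n=200):
    """Midpoint-rule approximation of the double integral of g(x, y) * f(x, y)."""
    hx, hy = 2 / n, 1 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            total += g(x, y) * f(x, y)
    return total * hx * hy

total_mass = expect(lambda x, y: 1.0)          # should be close to 1
ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - ex ** 2
var_y = expect(lambda x, y: y * y) - ey ** 2
cov = expect(lambda x, y: x * y) - ex * ey      # close to 0 under independence

print(round(total_mass, 4), round(ex, 4), round(var_x, 4))
print(round(ey, 4), round(var_y, 4), round(cov, 4))
```

A midpoint rule is used only because it needs no external libraries; a symbolic or adaptive integrator would serve equally well.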
Random Sample:
The random variables $X_1, X_2, \ldots, X_n$ are said to form a random sample of size n if
(i) The $X_i$'s are independent random variables.
(ii) Every $X_i$ has the same probability distribution.
The sampling distribution of $\bar{X}$ and the distribution of a linear combination of variables:
Let $X_1, X_2, \ldots, X_n$ be a random sample of size n with mean $E(X) = \mu$ and variance $Var(X) = \sigma^2$.
The mean of $\bar{X}$ is $\mu_{\bar{X}} = E(\bar{X}) = \mu$ and the variance of $\bar{X}$ is $\sigma^2_{\bar{X}} = Var(\bar{X}) = \sigma^2/n$.
The mean of $\sum_{i=1}^{n} X_i$ is $E\left(\sum_{i=1}^{n} X_i\right) = n\mu$ and the variance of $\sum_{i=1}^{n} X_i$ is $Var\left(\sum_{i=1}^{n} X_i\right) = n\sigma^2$.
If h(x) is a linear combination of the $X_i$'s, then the mean of h(x) is $\mu_{h(x)} = E(h(x))$ and the variance of h(x) is $\sigma^2_{h(x)} = Var(h(x))$.
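The results E(X̄) = μ and Var(X̄) = σ²/n can be verified exactly for a small case by listing every possible sample. The sketch below does this for a hypothetical population, a fair six-sided die, which is an assumption introduced here and does not appear in the notes; exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population (not from the notes): a fair six-sided die.
pmf = {v: Fraction(1, 6) for v in range(1, 7)}
mu = sum(v * p for v, p in pmf.items())
sigma2 = sum(v * v * p for v, p in pmf.items()) - mu ** 2

def moments_of_mean(n):
    """Exact mean and variance of the sample mean over all n-tuples of i.i.d. draws."""
    mean = second = Fraction(0)
    for sample in product(pmf, repeat=n):
        prob = Fraction(1)
        for v in sample:
            prob *= pmf[v]          # independence: probabilities multiply
        xbar = Fraction(sum(sample), n)
        mean += xbar * prob
        second += xbar ** 2 * prob
    return mean, second - mean ** 2

for n in (1, 2, 3, 4):
    mean, var = moments_of_mean(n)
    assert mean == mu and var == sigma2 / n
    print(n, mean, var)
```

Enumeration grows as 6^n, so this approach is only meant for very small n; it illustrates the formulas rather than replacing them.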
Example 3 (Exercise 5.42): A company maintains 3 offices in a certain region, each staffed by two employees. Information concerning yearly salaries (1000's of dollars) is as follows:

Office      1      1      2      2      3      3
Employee    1      2      3      4      5      6
Salary      19.7   23.6   20.2   23.6   15.8   19.7

(a) Suppose two of these employees are randomly selected from among the six (without replacement). Determine the sampling distribution of the sample mean salary $\bar{X}$.
S={(1,2),(1,3),(1,4),(1,5),(1,6),(2,1),(2,3),(2,4),(2,5),(2,6),(3,1),(3,2),(3,4),(3,5),(3,6),
(4,1),(4,2),(4,3),(4,5),(4,6),(5,1),(5,2),(5,3),(5,4),(5,6),(6,1),(6,2),(6,3),(6,4),(6,5)}. There are 30 outcomes.
$\bar{x}$ = 21.65 when (1,2),(2,1),(1,4),(4,1),(2,6),(6,2),(4,6),(6,4), with 8 outcomes
$\bar{x}$ = 19.95 when (1,3),(3,1),(3,6),(6,3), with 4 outcomes
$\bar{x}$ = 17.75 when (1,5),(5,1),(5,6),(6,5), with 4 outcomes
$\bar{x}$ = 19.70 when (1,6),(6,1),(4,5),(5,4),(2,5),(5,2), with 6 outcomes
$\bar{x}$ = 21.90 when (2,3),(3,2),(3,4),(4,3), with 4 outcomes
$\bar{x}$ = 23.60 when (2,4),(4,2), with 2 outcomes
$\bar{x}$ = 18.00 when (3,5),(5,3), with 2 outcomes

$\bar{x}$       17.75   18.00   19.70   19.95   21.65   21.90   23.60   otherwise
$p(\bar{x})$    4/30    2/30    6/30    4/30    8/30    4/30    2/30    0

$E(\bar{X}) = \sum \bar{x}\, p(\bar{x}) = 613/30 = 20.43$
(b) Suppose one of the three offices is randomly selected. Determine the sampling distribution of the sample mean salary $\bar{X}$.
S={(1,2),(2,1),(3,4),(4,3),(5,6),(6,5)}. There are 6 outcomes.
$\bar{x}$ = 21.65 when (1,2),(2,1), with probability 1/3
$\bar{x}$ = 21.90 when (3,4),(4,3), with probability 1/3
$\bar{x}$ = 17.75 when (5,6),(6,5), with probability 1/3
$E(\bar{X}) = 61.3/3 = 20.43$
Population mean = (19.7+23.6+20.2+23.6+15.8+19.7)/6=20.43
Additional: The sampling distribution of the range of salaries for part (b) is

range       3.4    3.9
p(range)    2/6    4/6

E(range) = $\sum$ range · p(range) = 3.7333, whereas the population range is 23.6 - 15.8 = 7.8.
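The sampling distributions in Example 3 can be reproduced by listing the equally likely outcomes and tallying the statistic of interest. The following is a sketch under that assumption; the helper names `distribution` and `mean` are introduced here, and salaries are stored as exact fractions so that equal sample means are grouped without floating-point noise.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Salaries in thousands of dollars, keyed by employee number (Example 3 table).
salary = {e: Fraction(s) for e, s in
          {1: "19.7", 2: "23.6", 3: "20.2", 4: "23.6", 5: "15.8", 6: "19.7"}.items()}
offices = [(1, 2), (3, 4), (5, 6)]   # the two employees in each office

def distribution(values):
    """pmf of a statistic when every listed outcome is equally likely."""
    counts = Counter(values)
    return {v: Fraction(c, len(values)) for v, c in sorted(counts.items())}

def mean(pmf):
    """Expected value of a pmf given as {value: probability}."""
    return sum(v * p for v, p in pmf.items())

# (a) All 30 ordered pairs of distinct employees.
dist_a = distribution([(salary[i] + salary[j]) / 2 for i, j in permutations(salary, 2)])
# (b) One office chosen at random; its two employees form the sample.
dist_b = distribution([(salary[i] + salary[j]) / 2 for i, j in offices])
# Range of salaries within the selected office.
dist_range = distribution([abs(salary[i] - salary[j]) for i, j in offices])

population_mean = sum(salary.values()) / len(salary)
for name, pmf in (("(a)", dist_a), ("(b)", dist_b), ("range", dist_range)):
    print(name, {float(v): str(p) for v, p in pmf.items()}, float(mean(pmf)))
print("population mean:", float(population_mean))
```

Both sampling schemes give an expected sample mean equal to the population mean, while the expected range within an office falls well short of the population range.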
If the $X_i$'s are a normally distributed random sample of size n with mean $\mu$ and variance $\sigma^2$, then $\bar{X}$ is also normally distributed with mean $\mu$ and variance $\sigma^2/n$, and

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

has a standard normal distribution with mean 0 and variance 1.
If the $X_i$'s are a normally distributed random sample of size n with mean $\mu$ and variance $\sigma^2$, then $\sum_{i=1}^{n} X_i$ is also normally distributed with mean $n\mu$ and variance $n\sigma^2$, and

$$Z = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}$$

has a standard normal distribution with mean 0 and variance 1.
If the $X_i$'s are a normally distributed random sample of size n with mean $\mu$ and variance $\sigma^2$, then any linear combination h(x) of the $X_i$'s is also normally distributed with mean E(h(x)) and variance Var(h(x)), and

$$Z = \frac{h(x) - E(h(x))}{\sqrt{Var(h(x))}}$$

has a standard normal distribution with mean 0 and variance 1.
Example 4 (Exercise 5.60): Five automobiles of the same type are to be driven on a 300-mile trip. Let $X_i$ be the observed fuel efficiency (mpg) for the i-th car.
The first two cars use economy brand gas, and their $X_i$'s are distributed N(20, 4) (mean 20, variance 4), i=1,2.
The last three cars use name brand gas, and their $X_i$'s are distributed N(21, 3.5) (mean 21, variance 3.5), i=3,4,5.
All five are independent.
Y is a measure of the difference in efficiency between economy gas and name brand gas.
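With $Y = \frac{X_1+X_2}{2} - \frac{X_3+X_4+X_5}{3}$,

$$E(Y) = E\left(\frac{X_1+X_2}{2} - \frac{X_3+X_4+X_5}{3}\right) = \frac{20+20}{2} - \frac{21+21+21}{3} = 20 - 21 = -1$$

$$Var(Y) = Var\left(\frac{X_1+X_2}{2} - \frac{X_3+X_4+X_5}{3}\right) = \frac{4+4}{4} + \frac{3.5+3.5+3.5}{9} = 2 + 1.1667 = 3.1667$$

Standardizing with $z = (y - (-1))/\sqrt{3.1667}$:
P(Y ≥ 0) = P(Z ≥ 0.56) = 0.2877
P(-1 ≤ Y ≤ 1) = P(Y ≤ 1) - P(Y < -1) = P(Z ≤ 1.12) - P(Z < 0) = 0.8686 - 0.5 = 0.3686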
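The mean and variance of a linear combination of independent normal variables, and the resulting probabilities, can be computed directly. The sketch below assumes Y = (X1+X2)/2 - (X3+X4+X5)/3 and uses Python's standard `statistics.NormalDist`; because z is not rounded to two decimals here, the probabilities can differ from table lookups in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Means and variances of X1..X5 from Example 4, and the coefficients of Y.
means = [20, 20, 21, 21, 21]
variances = [4, 4, 3.5, 3.5, 3.5]
coeffs = [1 / 2, 1 / 2, -1 / 3, -1 / 3, -1 / 3]

# For independent X_i: E(sum c_i X_i) = sum c_i mu_i, Var = sum c_i^2 sigma_i^2.
mean_y = sum(c * m for c, m in zip(coeffs, means))
var_y = sum(c * c * v for c, v in zip(coeffs, variances))

# Y is normal because it is a linear combination of independent normals.
y = NormalDist(mean_y, sqrt(var_y))
p_nonneg = 1 - y.cdf(0)            # P(Y >= 0)
p_within = y.cdf(1) - y.cdf(-1)    # P(-1 <= Y <= 1)

print(round(mean_y, 4), round(var_y, 4))
print(round(p_nonneg, 4), round(p_within, 4))
```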
Example 5 (Exercise 5.66): If two loads are applied to a cantilever beam, the bending moment at 0 due to the loads is $a_1X_1 + a_2X_2$, where $X_1 < X_2$ and $a_1 < a_2$, for independent $X_1$ and $X_2$.
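(a) With $a_1 = 5$, $a_2 = 10$, $E(X_1) = 2$, $E(X_2) = 4$, and standard deviations 0.5 and 1:
$E(5X_1 + 10X_2) = 5E(X_1) + 10E(X_2) = 5(2) + 10(4) = 50$
$Var(5X_1 + 10X_2) = 25Var(X_1) + 100Var(X_2) = 25(0.5^2) + 100(1^2) = 106.25$
Then the standard deviation is $\sqrt{106.25} = 10.308$.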
(b) $Y = 5X_1 + 10X_2 \sim N(50, 106.25)$

$$P(Y > 75) = P\left(Z > \frac{75 - E(Y)}{\sqrt{Var(Y)}}\right) = P\left(Z > \frac{75 - 50}{10.3078}\right) = P(Z > 2.43) = 0.0075$$
(c) Let $A_1$ and $A_2$ be independent random variables that are also independent of the $X_i$'s.
$E(A_1X_1 + A_2X_2) = E(A_1)E(X_1) + E(A_2)E(X_2) = 5(2) + 10(4) = 50$
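(d) $Var(A_1X_1 + A_2X_2) = E[\{(A_1X_1 + A_2X_2) - E(A_1X_1 + A_2X_2)\}^2] = E[\{(A_1X_1 + A_2X_2) - 50\}^2]$

$$= E(A_1^2)E(X_1^2) + E(A_2^2)E(X_2^2) + 2500 - 2(50)E(A_1)E(X_1) - 2(50)E(A_2)E(X_2) + 2E(A_1)E(X_1)E(A_2)E(X_2)$$

$$= 25.25(4.25) + 100.25(17) + 2500 - 100(5)(2) - 100(10)(4) + 2(5)(2)(10)(4) = 111.5625$$

Here each second moment is the variance plus the squared mean: $E(X_1^2) = 0.25 + 4 = 4.25$, $E(X_2^2) = 1 + 16 = 17$, $E(A_1^2) = 0.25 + 25 = 25.25$, and $E(A_2^2) = 0.25 + 100 = 100.25$.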
(e) If $Corr(X_1, X_2) = 0.5$, then $Cov(X_1, X_2) = Corr(X_1, X_2)\sqrt{Var(X_1)}\sqrt{Var(X_2)} = (0.5)(0.5)(1) = 0.25$, which means $X_1$ and $X_2$ are not independent. Then
$Var(5X_1 + 10X_2) = 25Var(X_1) + 100Var(X_2) + 2(5)(10)Cov(X_1, X_2) = 106.25 + 100(0.25) = 131.25$
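The parts of Example 5 can be recomputed in a few lines. In the sketch below, the means and standard deviations are read off the worked numbers above (for instance, E(A1^2) = 25.25 implies a standard deviation of 0.5 for A1); that reading is an assumption. Part (d) is computed by an alternative but equivalent route, Var(AX) = E(A^2)E(X^2) - [E(A)E(X)]^2 for independent A and X, summed over the two independent products.

```python
from math import sqrt
from statistics import NormalDist

# Values inferred from the worked numbers in Example 5 (an assumption).
mean_x, sd_x = (2, 4), (0.5, 1.0)     # loads X1, X2
mean_a, sd_a = (5, 10), (0.5, 0.5)    # random positions A1, A2 in parts (c)-(d)

# (a) Fixed positions a1 = 5, a2 = 10 with independent loads.
a = mean_a
mean_fixed = sum(ai * m for ai, m in zip(a, mean_x))
var_fixed = sum(ai ** 2 * s ** 2 for ai, s in zip(a, sd_x))

# (b) P(Y > 75) when Y is normal with the mean and variance from (a).
p_exceed = 1 - NormalDist(mean_fixed, sqrt(var_fixed)).cdf(75)

def var_product(ma, sa, mx, sx):
    """Var(A * X) for independent A and X, using E(A^2) = sa^2 + ma^2."""
    return (sa ** 2 + ma ** 2) * (sx ** 2 + mx ** 2) - (ma * mx) ** 2

# (d) A1*X1 and A2*X2 are independent, so their variances add.
var_random = sum(var_product(ma, sa, mx, sx)
                 for ma, sa, mx, sx in zip(mean_a, sd_a, mean_x, sd_x))

# (e) Correlated loads: add 2 * a1 * a2 * Cov(X1, X2).
cov12 = 0.5 * sd_x[0] * sd_x[1]
var_corr = var_fixed + 2 * a[0] * a[1] * cov12

print(mean_fixed, var_fixed, round(sqrt(var_fixed), 4))
print(round(p_exceed, 4), var_random, var_corr)
```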
Example 6 (Exercise 5.69): Three different roads feed into a particular freeway entrance.
The number of cars coming from each road onto the freeway is a random variable.

                      Road 1   Road 2   Road 3
Expected value        800      1000     600
Standard deviation    16       25       18

(a) What is the expected total number of cars entering the freeway at this point during the period?
$E(R_1 + R_2 + R_3) = E(R_1) + E(R_2) + E(R_3) = 800 + 1000 + 600 = 2400$
(b) What is the variance of the total number of cars entering the freeway at this point during the period?
With independence, $Var(R_1 + R_2 + R_3) = Var(R_1) + Var(R_2) + Var(R_3) = 16^2 + 25^2 + 18^2 = 1205$
(c) If $Cov(R_1, R_2) = 80$, $Cov(R_1, R_3) = 90$, and $Cov(R_2, R_3) = 100$, then $E(R_1 + R_2 + R_3) = 2400$ and
$Var(R_1 + R_2 + R_3) = Var(R_1) + Var(R_2) + Var(R_3) + 2Cov(R_1, R_2) + 2Cov(R_1, R_3) + 2Cov(R_2, R_3)$
$= 1205 + 2(80) + 2(90) + 2(100) = 1745$,
which gives the standard deviation as 41.77.
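The variance of a sum, with and without covariance terms, follows one formula: add the individual variances, then add twice each pairwise covariance. A minimal sketch using the Example 6 numbers is shown below; the function name `var_of_sum` is hypothetical.

```python
from math import sqrt

# Expected values, standard deviations, and covariances from Example 6.
means = [800, 1000, 600]
sds = [16, 25, 18]
covs = {(1, 2): 80, (1, 3): 90, (2, 3): 100}   # keyed by road pair

def var_of_sum(sds, covs=None):
    """Var(R1 + ... + Rk) = sum of variances + 2 * sum of pairwise covariances."""
    total = sum(s * s for s in sds)
    if covs:
        total += 2 * sum(covs.values())
    return total

expected_total = sum(means)
var_independent = var_of_sum(sds)          # part (b)
var_dependent = var_of_sum(sds, covs)      # part (c)

print(expected_total, var_independent, var_dependent, round(sqrt(var_dependent), 2))
```

The expected total does not change between parts (b) and (c): linearity of expectation holds whether or not the variables are independent.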
Central Limit Theorem: Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then, if n is sufficiently large (n>30), $\bar{X}$ has approximately a normal distribution with mean $\mu_{\bar{X}} = \mu$ and variance $\sigma^2_{\bar{X}} = \sigma^2/n$, and $\sum_{i=1}^{n} X_i$ has approximately a normal distribution with mean $\mu_{\sum X_i} = n\mu$ and variance $\sigma^2_{\sum X_i} = n\sigma^2$. The larger the value of n, the better the approximation.
Example 7: Let $X_1, X_2, \ldots, X_{100}$ denote the actual net weights of 100 randomly selected 50-lb bags of fertilizer. Suppose the expected weight of each bag is 50 and the variance is 1.
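(a) What is the probability that the average weight of 100 bags will be between 49.75 and 50.25?
The average weight of 100 bags is $\bar{X}$, with $\mu_{\bar{X}} = \mu = 50$ and $\sigma^2_{\bar{X}} = \sigma^2/n = 1/100 = 0.01$. By the CLT,

$$P(49.75 \le \bar{X} \le 50.25) \approx P\left(\frac{49.75 - 50}{\sqrt{0.01}} \le z \le \frac{50.25 - 50}{\sqrt{0.01}}\right) = P(-2.5 \le z \le 2.5)$$

= 0.9938 - 0.0062 = 0.9876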
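(b) What is the probability that the total weight of 100 bags will be between 4950 and 5000?
The total weight of 100 bags is $\sum_{i=1}^{n} X_i$, with $\mu_{\sum X_i} = n\mu = 5000$ and $\sigma^2_{\sum X_i} = n\sigma^2 = 100$. By the CLT,

$$P\left(4950 \le \sum X_i \le 5000\right) \approx P\left(\frac{4950 - 5000}{\sqrt{100}} \le z \le \frac{5000 - 5000}{\sqrt{100}}\right) = P(-5 \le z \le 0)$$

= 0.5 - 0 = 0.5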
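The two CLT calculations in Example 7 use the same normal approximation with different means and variances. The sketch below evaluates both with `statistics.NormalDist` from the Python standard library; the variable names are chosen here for illustration, and the results are approximations in the same sense as the CLT itself.

```python
from math import sqrt
from statistics import NormalDist

n, mu, sigma2 = 100, 50, 1   # sample size, mean, and variance from Example 7

# CLT: the sample mean is approximately N(mu, sigma^2 / n).
xbar = NormalDist(mu, sqrt(sigma2 / n))
# CLT: the sample total is approximately N(n * mu, n * sigma^2).
total = NormalDist(n * mu, sqrt(n * sigma2))

p_mean = xbar.cdf(50.25) - xbar.cdf(49.75)    # part (a)
p_total = total.cdf(5000) - total.cdf(4950)   # part (b)

print(round(p_mean, 4), round(p_total, 4))
```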