Probability & Statistics
Professor Wei Zhu
July 23rd
(1)
Let’s have some fun: A miner is trapped!
A miner is trapped in a mine containing 3 doors.
• The 1st door leads to a tunnel that will take him to safety after 3 hours.
• The 2nd door leads to a tunnel that returns him to the mine after 5 hours.
• The 3rd door leads to a tunnel that returns him to the mine after 7 hours.
At all times, he is equally likely to choose any one of the doors.
E(time to reach safety)?
Theorem (Law of Total Expectation):
E(X) = E_Y( E_{X|Y}[X|Y] )
Exercise: Prove this theorem for the situation when X and Y are both discrete
random variables.
Special Case: If A_1, A_2, ⋯, A_k is a partition of the whole outcome space (*that is, these events are mutually exclusive and exhaustive), then:

E(X) = ∑_{i=1}^{k} E(X|A_i) P(A_i)
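As a quick illustration (my own sketch, not part of the notes): conditioning on the first door chosen gives E = (1/3)(3) + (1/3)(5 + E) + (1/3)(7 + E), which solves to E = 15 hours. A minimal Python check of that algebra, using exact rational arithmetic:

```python
from fractions import Fraction

# Condition on the first door chosen (Law of Total Expectation):
#   E = (1/3)*3 + (1/3)*(5 + E) + (1/3)*(7 + E)
# Rearranging: (1 - 2/3)*E = (1/3)*(3 + 5 + 7), so E = 15.
p = Fraction(1, 3)            # each door is equally likely
E = p * (3 + 5 + 7) / (1 - 2 * p)
print(E)  # 15
```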
(2) *Review: MGF, its second function: The m.g.f. will generate the moments.
Moments:
1st (population) moment: E(X) = ∫ x ∙ f(x) dx
2nd (population) moment: E(X²) = ∫ x² ∙ f(x) dx
…
Kth (population) moment: E(X^k) = ∫ x^k ∙ f(x) dx
Take the Kth derivative of M_X(t) with respect to t, then set t = 0; we obtain the Kth moment of X as follows:
d/dt M_X(t) |_{t=0} = E(X)
d²/dt² M_X(t) |_{t=0} = E(X²)
…
d^k/dt^k M_X(t) |_{t=0} = E(X^k)
Note: The above general rules can be easily proven using calculus.
Exercise: Prove the above general relationships.
Example (proof of a special case): When X ~ N(μ, σ²), we want to verify the above equations for k = 1 & k = 2. Recall M_X(t) = exp(μt + ½σ²t²).

d/dt M_X(t) = e^{μt + ½σ²t²} ∙ (μ + σ²t)   (using the chain rule)

So when t = 0:
d/dt M_X(t) |_{t=0} = μ = E(X)

d²/dt² M_X(t) = d/dt [ e^{μt + ½σ²t²} ∙ (μ + σ²t) ]   (using the Product Rule)
= e^{μt + ½σ²t²} ∙ (μ + σ²t)² + e^{μt + ½σ²t²} ∙ σ²

d²/dt² M_X(t) |_{t=0} = μ² + σ² = E(X²)

Considering σ² = Var(X) = E(X²) − μ², this is consistent.
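The same k = 1, 2 checks can also be done numerically. Below is a sketch (my own, with hypothetical parameter values μ = 2, σ = 1.5) that differentiates M_X(t) = exp(μt + ½σ²t²) at t = 0 by central finite differences:

```python
import math

# My own numerical check (mu, sigma hypothetical): differentiate the N(mu, sigma^2)
# mgf M(t) = exp(mu*t + sigma^2*t^2/2) at t = 0 by central finite differences.
def M(t, mu=2.0, sigma=1.5):
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

h = 1e-5
m1 = (M(h) - M(-h)) / (2 * h)            # approximates M'(0)  = E(X) = mu
m2 = (M(h) - 2 * M(0.0) + M(-h)) / h**2  # approximates M''(0) = E(X^2) = mu^2 + sigma^2
print(m1, m2)  # close to mu = 2 and mu^2 + sigma^2 = 6.25
```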
(3) *Review: Joint distribution and independence
Definition. The joint cdf of two random variables X and Y is defined as:
F_{X,Y}(x, y) = F(x, y) = P(X ≤ x, Y ≤ y)
Definition. The joint pdf of two discrete random variables X and Y is defined as:
f_{X,Y}(x, y) = f(x, y) = P(X = x, Y = y)
Definition. The joint pdf of two continuous random variables X and Y is defined as:
f_{X,Y}(x, y) = f(x, y) = ∂²F(x, y) / ∂x∂y
Definition. The marginal pdf of the discrete random variable X or Y can be obtained by summation of their joint pdf as the following: f_X(x) = ∑_y f(x, y); f_Y(y) = ∑_x f(x, y).
Definition. The marginal pdf of the continuous random variable X or Y can be obtained by integration of the joint pdf as the following: f_X(x) = ∫_{−∞}^{∞} f(x, y) dy; f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx.
Definition. The conditional pdf of a random variable X or Y is defined as:
f(x|y) = f(x, y) / f_Y(y);  f(y|x) = f(x, y) / f_X(x)
Definition. The joint moment generating function of two random variables X and Y is defined as
M_{X,Y}(t₁, t₂) = E(e^{t₁X + t₂Y})
Note that we can obtain the marginal mgf for X or Y as follows:
M_X(t₁) = M_{X,Y}(t₁, 0) = E(e^{t₁X + 0∙Y}) = E(e^{t₁X});  M_Y(t₂) = M_{X,Y}(0, t₂) = E(e^{0∙X + t₂Y}) = E(e^{t₂Y})
Theorem. Two random variables X and Y are independent ⇔ (if and only if)
F_{X,Y}(x, y) = F_X(x) F_Y(y) ⇔ f_{X,Y}(x, y) = f_X(x) f_Y(y) ⇔ M_{X,Y}(t₁, t₂) = M_X(t₁) M_Y(t₂)
Definition. The covariance of two random variables X and Y is defined as
Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)].
Theorem. If two random variables X and Y are independent, then we have Cov(X, Y) = 0. (*Note: However, Cov(X, Y) = 0 does not necessarily mean that X and Y are independent.)
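A tiny illustration of that note (my own toy example, not from the notes): take X uniform on {−1, 0, 1} and Y = X². Then Y is a function of X, so they are clearly dependent, yet Cov(X, Y) = 0 exactly:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X^2: dependent (Y is a function of X),
# yet Cov(X, Y) = E(XY) - E(X)E(Y) = E(X^3) - 0 = 0.
support = [-1, 0, 1]
p = Fraction(1, 3)
EX  = sum(p * x        for x in support)   # E(X)   = 0
EY  = sum(p * x**2     for x in support)   # E(Y)   = 2/3
EXY = sum(p * x * x**2 for x in support)   # E(XY)  = E(X^3) = 0
cov = EXY - EX * EY
print(cov)  # 0
# ...yet X and Y are not independent: P(X=1, Y=0) = 0 != P(X=1)*P(Y=0) = 1/9
```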
(4) *Definitions: population correlation & sample correlation
Definition: The population (Pearson Product Moment) correlation coefficient ρ is
defined as:
ρ = Cov(X, Y) / √( Var(X) ∙ Var(Y) )
Definition: Let (X1 , Y1), …, (Xn , Yn) be a random sample from a given bivariate
population, then the sample (Pearson Product Moment) correlation coefficient r is
defined as:
r = ∑(Xᵢ − X̄)(Yᵢ − Ȳ) / √( [∑(Xᵢ − X̄)²][∑(Yᵢ − Ȳ)²] )
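For concreteness, a short sketch (my own, with a made-up sample) computing r exactly as defined above:

```python
import math

# hypothetical toy sample; the computation follows the definition of r above
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0, 3.0, 4.0, 6.0]
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))  # sum of cross-deviations
sxx = sum((x - xbar) ** 2 for x in xs)
syy = sum((y - ybar) ** 2 for y in ys)
r = sxy / math.sqrt(sxx * syy)
print(r)  # 11 / sqrt(130), about 0.965
```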
Karl Pearson FRS (originally named Carl; 27 March 1857 – 27 April 1936) was an influential English mathematician and biometrician. Sir Francis Galton FRS (16 February 1822 – 17 January 1911) was a British scientist and statistician.
(5) *Definition: Bivariate Normal Random Variable
(X, Y) ~ BN(μ_X, σ_X²; μ_Y, σ_Y²; ρ), where ρ is the correlation between X & Y.
The joint p.d.f. of (X, Y) is

f_{X,Y}(x, y) = 1 / (2π σ_X σ_Y √(1 − ρ²)) ∙ exp{ −1/(2(1 − ρ²)) [ ((x − μ_X)/σ_X)² − 2ρ ((x − μ_X)/σ_X)((y − μ_Y)/σ_Y) + ((y − μ_Y)/σ_Y)² ] }

Exercise: Please derive the mgf of the bivariate normal distribution.
Q5. Let X and Y be random variables with joint pdf

f_{X,Y}(x, y) = 1 / (2π σ_X σ_Y √(1 − ρ²)) ∙ exp{ −1/(2(1 − ρ²)) [ ((x − μ_X)/σ_X)² − 2ρ ((x − μ_X)/σ_X)((y − μ_Y)/σ_Y) + ((y − μ_Y)/σ_Y)² ] }

where −∞ < x < ∞, −∞ < y < ∞. Then X and Y are said to have the bivariate normal distribution. The joint moment generating function for X and Y is

M(t₁, t₂) = exp[ t₁μ_X + t₂μ_Y + ½(t₁²σ_X² + 2t₁t₂ρσ_Xσ_Y + t₂²σ_Y²) ].
(a) Find the marginal pdf's of X and Y;
(b) Prove that X and Y are independent if and only if ρ = 0.
(Here ρ is indeed the <population> correlation coefficient between X and Y.)
(c) Find the distribution of (X + Y).
(d) Find the conditional pdfs f(x|y) and f(y|x).
Solution:
(a)
The moment generating function of X can be given by
M_X(t) = M(t, 0) = exp[ μ_X t + ½σ_X² t² ].
Similarly, the moment generating function of Y can be given by
M_Y(t) = M(0, t) = exp[ μ_Y t + ½σ_Y² t² ].
Thus, X and Y are both marginally normally distributed, i.e.,
X ~ N(μ_X, σ_X²), and Y ~ N(μ_Y, σ_Y²).
The pdf of X is
f_X(x) = 1/(√(2π) σ_X) ∙ exp[ −(x − μ_X)²/(2σ_X²) ].
The pdf of Y is
f_Y(y) = 1/(√(2π) σ_Y) ∙ exp[ −(y − μ_Y)²/(2σ_Y²) ].
(b)
If ρ = 0, then
M(t₁, t₂) = exp[ t₁μ_X + t₂μ_Y + ½(t₁²σ_X² + t₂²σ_Y²) ] = M(t₁, 0) ∙ M(0, t₂)
Therefore, X and Y are independent.
If X and Y are independent, then
M(t₁, t₂) = M(t₁, 0) ∙ M(0, t₂) = exp[ t₁μ_X + t₂μ_Y + ½(t₁²σ_X² + t₂²σ_Y²) ]
= exp[ t₁μ_X + t₂μ_Y + ½(t₁²σ_X² + 2t₁t₂ρσ_Xσ_Y + t₂²σ_Y²) ]
Therefore, ρ = 0.
(c)
M_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX + tY}]
Recall that M(t₁, t₂) = E[e^{t₁X + t₂Y}]; therefore we can obtain M_{X+Y}(t) by setting t₁ = t₂ = t in M(t₁, t₂).
That is,
M_{X+Y}(t) = M(t, t) = exp[ tμ_X + tμ_Y + ½(t²σ_X² + 2t²ρσ_Xσ_Y + t²σ_Y²) ]
= exp[ (μ_X + μ_Y)t + ½(σ_X² + 2ρσ_Xσ_Y + σ_Y²)t² ]
∴ X + Y ~ N( μ = μ_X + μ_Y, σ² = σ_X² + 2ρσ_Xσ_Y + σ_Y² )
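A simulation sketch of this result (my own; all parameter values hypothetical). One standard way to draw bivariate normal pairs is X = μ_X + σ_X Z₁, Y = μ_Y + σ_Y(ρZ₁ + √(1−ρ²) Z₂) with Z₁, Z₂ independent standard normals:

```python
import random
import statistics

# My own sketch (parameter values hypothetical); builds (X, Y) bivariate normal
# from two independent standard normals Z1, Z2, then checks the sum X + Y.
random.seed(0)
mu_x, mu_y = 1.0, 2.0
sig_x, sig_y, rho = 2.0, 1.0, 0.5

def draw_sum():
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mu_x + sig_x * z1
    y = mu_y + sig_y * (rho * z1 + (1 - rho**2) ** 0.5 * z2)
    return x + y

s = [draw_sum() for _ in range(100_000)]
mean_s = statistics.fmean(s)
var_s = statistics.variance(s)
# mean close to mu_x + mu_y = 3; variance close to
# sig_x^2 + 2*rho*sig_x*sig_y + sig_y^2 = 7
print(mean_s, var_s)
```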
(d)
The conditional distribution of X given Y = y is given by

f(x|y) = f(x, y) / f_Y(y) = 1/(√(2π) σ_X √(1 − ρ²)) ∙ exp{ −[x − μ_X − ρ(σ_X/σ_Y)(y − μ_Y)]² / (2(1 − ρ²)σ_X²) }

Similarly, the conditional distribution of Y given X = x is

f(y|x) = f(x, y) / f_X(x) = 1/(√(2π) σ_Y √(1 − ρ²)) ∙ exp{ −[y − μ_Y − ρ(σ_Y/σ_X)(x − μ_X)]² / (2(1 − ρ²)σ_Y²) }

Therefore:
X | Y = y ~ N( μ_X + ρ(σ_X/σ_Y)(y − μ_Y), (1 − ρ²)σ_X² )
Y | X = x ~ N( μ_Y + ρ(σ_Y/σ_X)(x − μ_X), (1 − ρ²)σ_Y² )
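The conditional mean formula can be checked by simulation (my own sketch, all values hypothetical): draw bivariate normal pairs, keep those with Y in a narrow band around some y₀, and compare the average X against μ_X + ρ(σ_X/σ_Y)(y₀ − μ_Y):

```python
import random
import statistics

# My own sketch (values hypothetical): check E(X | Y = y0) against the formula.
random.seed(2)
mu_x, mu_y = 0.0, 0.0
sig_x, sig_y, rho = 1.0, 1.0, 0.8
y0 = 1.0

xs_in_band = []
for _ in range(300_000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mu_x + sig_x * z1
    y = mu_y + sig_y * (rho * z1 + (1 - rho**2) ** 0.5 * z2)
    if abs(y - y0) < 0.05:          # keep draws with Y in a narrow band around y0
        xs_in_band.append(x)

theory = mu_x + rho * (sig_x / sig_y) * (y0 - mu_y)   # = 0.8 here
cond_mean = statistics.fmean(xs_in_band)
print(theory, cond_mean)  # the two should be close
```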
Exercise:
1. Linear transformation: Let X ~ N(μ, σ²) and Y = aX + b, where a & b are constants. What is the distribution of Y?
2. The Z-Score Distribution: Let X ~ N(μ, σ²), and Z = (X − μ)/σ (i.e., Z = aX + b with a = 1/σ, b = −μ/σ). What is the distribution of Z?
3. Distribution of the Sample Mean: If X₁, X₂, …, Xₙ ~ i.i.d. N(μ, σ²), prove that X̄ ~ N(μ, σ²/n).
4. Some musing over the weekend: We know from the bivariate normal distribution that if X and Y follow a joint bivariate normal distribution, then each variable (X or Y) is univariate normal, and furthermore, their sum (X + Y) also follows a univariate normal distribution. Now my question is: do you think the sum of any two (univariate) normal random variables (*even those that do not have a joint bivariate normal distribution) would always follow a univariate normal distribution? Prove your claim if the answer is no.
Exercise -- Solutions:
1. Linear transformation: Let X ~ N(μ, σ²) and Y = aX + b, where a & b are constants. What is the distribution of Y?
Solution:
M_Y(t) = E(e^{tY}) = E[e^{t(aX + b)}] = E(e^{atX + bt}) = E(e^{atX} ∙ e^{bt})
= e^{bt} ∙ E(e^{atX}) = e^{bt} ∙ e^{aμt + ½a²σ²t²} = exp[ (aμ + b)t + ½a²σ²t² ]
Thus,
Y ~ N(aμ + b, a²σ²)
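A quick simulation check of this result (my own sketch, with hypothetical constants a = 3, b = 4 and X ~ N(1, 2²), so Y should be N(7, 36)):

```python
import random
import statistics

# My own sketch with hypothetical constants: a = 3, b = 4, X ~ N(1, 2^2),
# so Y = aX + b should be N(3*1 + 4, 9*4) = N(7, 36).
random.seed(3)
a, b, mu, sigma = 3.0, 4.0, 1.0, 2.0

ys = [a * random.gauss(mu, sigma) + b for _ in range(100_000)]
mean_y = statistics.fmean(ys)
var_y = statistics.variance(ys)
print(mean_y, var_y)  # close to a*mu + b = 7 and a^2 * sigma^2 = 36
```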
2. Distribution of the Z-score:
Let X ~ N(μ, σ²), and Z = (X − μ)/σ (i.e., Z = aX + b with a = 1/σ, b = −μ/σ).
What is the distribution of Z?
Solution (1), the mgf approach:
M_Z(t) = e^{−(μ/σ)t} ∙ M_X(t/σ) = e^{−(μ/σ)t} ∙ e^{μ(t/σ) + ½σ²(t/σ)²} = e^{½t²} → m.g.f. for N(0, 1)
Thus, Z = (X − μ)/σ ~ N(0, 1)
Now with one standard normal table, we will be able to calculate the probability of
any normal random variable:
P(a < X < b) = P( (a − μ)/σ < (X − μ)/σ < (b − μ)/σ )
= P( (a − μ)/σ < Z < (b − μ)/σ )
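In code, the standard normal cdf Φ can be computed from the error function via Φ(z) = (1 + erf(z/√2))/2, so no table lookup is needed. A sketch (the helper names are my own):

```python
import math

def phi(z):
    # standard normal cdf: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    # P(a < X < b) = Phi((b - mu)/sigma) - Phi((a - mu)/sigma)
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# e.g. X ~ N(0, 1): P(-1 < X < 1) is about 0.6827
print(normal_prob(-1, 1, 0, 1))
```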
Solution (2), the c.d.f. approach:
F_Z(z) = P(Z ≤ z) = P( (X − μ)/σ ≤ z ) = P(X ≤ μ + zσ) = F_X(μ + zσ)
f_Z(z) = d/dz F_Z(z) = d/dz F_X(μ + zσ) = f_X(μ + zσ) ∙ σ
= σ ∙ 1/(√(2π)σ) ∙ e^{−(μ + zσ − μ)²/(2σ²)} = 1/√(2π) ∙ e^{−½z²} → the p.d.f. for N(0, 1)
Solution (3), the p.d.f. approach:
f_X(x) = 1/(√(2π)σ) ∙ e^{−(x − μ)²/(2σ²)}
Let x = μ + zσ, and let the Jacobian be J:
J = dx/dz = d/dz (μ + zσ) = σ
f_Z(z) = |J| ∙ f_X(x) = σ ∙ 1/(√(2π)σ) ∙ e^{−(μ + zσ − μ)²/(2σ²)} = 1/√(2π) ∙ e^{−½z²}
∴ Z = (X − μ)/σ ~ N(0, 1)

3. Distribution of the Sample Mean: If X₁, X₂, …, Xₙ ~ i.i.d. N(μ, σ²), then X̄ ~ N(μ, σ²/n).
Solution:
M_X̄(t) = E(e^{tX̄}) = E( e^{t(X₁ + X₂ + ⋯ + Xₙ)/n} )
= E( e^{t*(X₁ + ⋯ + Xₙ)} ), where t* = t/n
= M_{X₁ + ⋯ + Xₙ}(t*) = M_{X₁}(t*) ⋯ M_{Xₙ}(t*)
= ( e^{μt* + ½σ²t*²} )ⁿ = e^{μn(t/n) + ½nσ²(t/n)²} = e^{μt + ½(σ²/n)t²}
∴ X̄ ~ N(μ, σ²/n)
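A simulation sketch of this result (my own, with hypothetical μ = 5, σ = 2, n = 16, so X̄ should be N(5, 4/16) = N(5, 0.25)):

```python
import random
import statistics

# My own sketch with hypothetical mu = 5, sigma = 2, n = 16:
# the sample mean of n iid N(mu, sigma^2) draws should be N(mu, sigma^2/n).
random.seed(1)
mu, sigma, n = 5.0, 2.0, 16

means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(50_000)]
mean_m = statistics.fmean(means)
var_m = statistics.variance(means)
print(mean_m, var_m)  # close to mu = 5 and sigma^2/n = 0.25
```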