Multivariate Probability Distributions (PPT)

Multivariate Probability Distributions
Multivariate Random Variables
• In many settings, we are interested in 2 or more
characteristics observed in experiments
• Often used to study the relationship among
characteristics and the prediction of one based on
the other(s)
• Three types of distributions:
– Joint: Distribution of outcomes across all combinations of
variable levels
– Marginal: Distribution of outcomes for a single variable
– Conditional: Distribution of outcomes for a single variable,
given the level(s) of the other variable(s)
Joint Distribution
Discrete Case (Probability Mass Function):
\[ p(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2) \ge 0 \]
\[ \sum_{\text{all } y_1} \sum_{\text{all } y_2} p(y_1, y_2) = 1 \]
\[ F(y_1, y_2) = P(Y_1 \le y_1, Y_2 \le y_2) = \sum_{t_1 = -\infty}^{y_1} \sum_{t_2 = -\infty}^{y_2} p(t_1, t_2) \]
Continuous Case (Probability Density Function):
\[ f(y_1, y_2) \ge 0 \]
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_2\, dy_1 = 1 \]
\[ F(y_1, y_2) = P(Y_1 \le y_1, Y_2 \le y_2) = \int_{-\infty}^{y_1} \int_{-\infty}^{y_2} f(t_1, t_2)\, dt_2\, dt_1 \]
Generalizes to any number of random variables
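As a rough illustration of the discrete case, the sketch below stores a small joint PMF as a Python dictionary, checks the two defining conditions, and evaluates the joint CDF by summation. The probability table and the helper names `is_valid_pmf` and `joint_cdf` are made up for this example; they do not come from the slides.

```python
from fractions import Fraction as Fr

# Hypothetical joint PMF p(y1, y2); the values are invented for illustration.
pmf = {
    (0, 0): Fr(2, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(2, 10),
    (2, 0): Fr(1, 10), (2, 1): Fr(2, 10),
}

def is_valid_pmf(p):
    """Check p(y1, y2) >= 0 everywhere and that the probabilities sum to 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

def joint_cdf(p, y1, y2):
    """F(y1, y2) = P(Y1 <= y1, Y2 <= y2): sum p(t1, t2) over t1 <= y1, t2 <= y2."""
    return sum(v for (t1, t2), v in p.items() if t1 <= y1 and t2 <= y2)

print(is_valid_pmf(pmf))     # True
print(joint_cdf(pmf, 1, 0))  # 2/5, i.e. p(0,0) + p(1,0)
```

Exact fractions are used so that the "sums to 1" check is not affected by floating-point rounding.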
Marginal Distributions
Discrete Case:
\[ p_1(y_1) = \sum_{\text{all } y_2} p(y_1, y_2) \qquad p_2(y_2) = \sum_{\text{all } y_1} p(y_1, y_2) \]
Continuous Case:
\[ f_1(y_1) = \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_2 \qquad f_2(y_2) = \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_1 \]
Generalizes to any number of random variables
(sum or integrate over all other variables)
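A minimal sketch of marginalization in the discrete case: each marginal is obtained by summing the joint PMF over the other variable. The table and the function name `marginal` are assumptions made for this example.

```python
from collections import defaultdict
from fractions import Fraction as Fr

# Hypothetical joint PMF p(y1, y2); the values are invented for illustration.
pmf = {
    (0, 0): Fr(2, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(2, 10),
    (2, 0): Fr(1, 10), (2, 1): Fr(2, 10),
}

def marginal(p, axis):
    """Sum the joint PMF over every variable except the one at index `axis`."""
    out = defaultdict(Fr)  # Fr() is zero, so missing keys start at 0
    for outcome, prob in p.items():
        out[outcome[axis]] += prob
    return dict(out)

p1 = marginal(pmf, 0)  # p1(y1): summed over all y2
p2 = marginal(pmf, 1)  # p2(y2): summed over all y1
print(p1)
print(p2)
```

Because `marginal` only looks at one coordinate of each outcome tuple, the same function works unchanged for joint PMFs over more than two variables.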
Conditional Distributions
• Describes the behavior of one variable, given level(s) of
other variable(s)
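Discrete Case:
\[ p(y_1 \mid y_2) = P(Y_1 = y_1 \mid Y_2 = y_2) = \frac{p(y_1, y_2)}{p_2(y_2)}, \qquad \sum_{\text{all } y_1} p(y_1 \mid y_2) = 1 \quad \forall\, y_2 \text{ s.t. } p_2(y_2) > 0 \]
\[ p(y_2 \mid y_1) = P(Y_2 = y_2 \mid Y_1 = y_1) = \frac{p(y_1, y_2)}{p_1(y_1)}, \qquad \sum_{\text{all } y_2} p(y_2 \mid y_1) = 1 \quad \forall\, y_1 \text{ s.t. } p_1(y_1) > 0 \]
Continuous Case:
\[ f(y_1 \mid y_2) = \frac{f(y_1, y_2)}{f_2(y_2)}, \qquad \int_{-\infty}^{\infty} f(y_1 \mid y_2)\, dy_1 = 1 \quad \forall\, y_2 \text{ s.t. } f_2(y_2) > 0 \]
\[ f(y_2 \mid y_1) = \frac{f(y_1, y_2)}{f_1(y_1)}, \qquad \int_{-\infty}^{\infty} f(y_2 \mid y_1)\, dy_2 = 1 \quad \forall\, y_1 \text{ s.t. } f_1(y_1) > 0 \]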
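The discrete conditional PMF can be sketched as "restrict the joint table to the conditioning level, then divide by the marginal". The table and the function name `conditional_y1_given_y2` are hypothetical, chosen only for this example.

```python
from fractions import Fraction as Fr

# Hypothetical joint PMF p(y1, y2); the values are invented for illustration.
pmf = {
    (0, 0): Fr(2, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(2, 10),
    (2, 0): Fr(1, 10), (2, 1): Fr(2, 10),
}

def conditional_y1_given_y2(p, y2):
    """Return p(y1 | y2) = p(y1, y2) / p2(y2) as a dict keyed by y1."""
    p2 = sum(prob for (_, t2), prob in p.items() if t2 == y2)  # marginal p2(y2)
    if p2 == 0:
        # The conditional distribution is only defined when p2(y2) > 0.
        raise ValueError("conditioning event has probability zero")
    return {t1: prob / p2 for (t1, t2), prob in p.items() if t2 == y2}

for level in (0, 1):
    cond = conditional_y1_given_y2(pmf, level)
    print(level, cond, sum(cond.values()))  # each conditional PMF sums to 1
```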
Expectations
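Discrete Case:
\[ E[g(Y_1, Y_2)] = \sum_{\text{all } y_1} \sum_{\text{all } y_2} g(y_1, y_2)\, p(y_1, y_2) \]
\[ E[Y_1] = \mu_1 = \sum_{\text{all } y_1} \sum_{\text{all } y_2} y_1\, p(y_1, y_2) = \sum_{\text{all } y_1} y_1 \sum_{\text{all } y_2} p(y_1, y_2) = \sum_{\text{all } y_1} y_1\, p_1(y_1) \]
\[ V(Y_1) = \sigma_1^2 = \sum_{\text{all } y_1} \sum_{\text{all } y_2} (y_1 - \mu_1)^2\, p(y_1, y_2) = \sum_{\text{all } y_1} (y_1 - \mu_1)^2\, p_1(y_1) \]
Continuous Case:
\[ E[g(Y_1, Y_2)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(y_1, y_2)\, f(y_1, y_2)\, dy_2\, dy_1 \]
\[ E[Y_1] = \mu_1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y_1\, f(y_1, y_2)\, dy_2\, dy_1 = \int_{-\infty}^{\infty} y_1 \left[ \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_2 \right] dy_1 = \int_{-\infty}^{\infty} y_1\, f_1(y_1)\, dy_1 \]
\[ V(Y_1) = \sigma_1^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y_1 - \mu_1)^2\, f(y_1, y_2)\, dy_2\, dy_1 = \int_{-\infty}^{\infty} (y_1 - \mu_1)^2\, f_1(y_1)\, dy_1 \]
Covariance of Y_1, Y_2:
\[ \begin{aligned} \mathrm{COV}(Y_1, Y_2) &= E[(Y_1 - \mu_1)(Y_2 - \mu_2)] = E[Y_1 Y_2 - Y_1 \mu_2 - \mu_1 Y_2 + \mu_1 \mu_2] \\ &= E[Y_1 Y_2] - \mu_2 E[Y_1] - \mu_1 E[Y_2] + \mu_1 \mu_2 = E[Y_1 Y_2] - \mu_1 \mu_2 \end{aligned} \]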
Expectations of Linear Functions
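\[ Y_1, \ldots, Y_n \equiv \text{random variables with } E(Y_i) = \mu_i \]
\[ X_1, \ldots, X_m \equiv \text{random variables with } E(X_j) = \xi_j \]
\[ U_1 = \sum_{i=1}^{n} a_i Y_i \qquad U_2 = \sum_{j=1}^{m} b_j X_j \qquad \{a_i\}, \{b_j\} \equiv \text{constants} \]
\[ \begin{aligned} E(U_1) &= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} (a_1 y_1 + \cdots + a_n y_n)\, f(y_1, \ldots, y_n)\, dy_n \cdots dy_1 \\ &= a_1 \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y_1\, f(y_1, \ldots, y_n)\, dy_n \cdots dy_1 + \cdots + a_n \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y_n\, f(y_1, \ldots, y_n)\, dy_n \cdots dy_1 \\ &= a_1 E(Y_1) + \cdots + a_n E(Y_n) = \sum_{i=1}^{n} a_i \mu_i \end{aligned} \]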
Variances of Linear Functions
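\[ Y_1, \ldots, Y_n \equiv \text{random variables with } E(Y_i) = \mu_i \]
\[ X_1, \ldots, X_m \equiv \text{random variables with } E(X_j) = \xi_j \]
\[ U_1 = \sum_{i=1}^{n} a_i Y_i \qquad U_2 = \sum_{j=1}^{m} b_j X_j \qquad \{a_i\}, \{b_j\} \equiv \text{constants} \]
\[ \begin{aligned} V(U_1) &= E\left[ (U_1 - E(U_1))^2 \right] = E\left[ \left( \sum_{i=1}^{n} a_i Y_i - \sum_{i=1}^{n} a_i \mu_i \right)^2 \right] = E\left[ \left( \sum_{i=1}^{n} a_i (Y_i - \mu_i) \right)^2 \right] \\ &= E\left[ \sum_{i=1}^{n} a_i^2 (Y_i - \mu_i)^2 + 2 \sum_{i=1}^{n-1} \sum_{i'=i+1}^{n} a_i (Y_i - \mu_i)\, a_{i'} (Y_{i'} - \mu_{i'}) \right] \\ &= \sum_{i=1}^{n} a_i^2 E\left[ (Y_i - \mu_i)^2 \right] + 2 \sum_{i=1}^{n-1} \sum_{i'=i+1}^{n} a_i a_{i'} E\left[ (Y_i - \mu_i)(Y_{i'} - \mu_{i'}) \right] \\ &= \sum_{i=1}^{n} a_i^2 V(Y_i) + 2 \sum_{i=1}^{n-1} \sum_{i'=i+1}^{n} a_i a_{i'}\, \mathrm{COV}(Y_i, Y_{i'}) \end{aligned} \]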
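The mean and variance formulas for a linear function can be checked numerically. The sketch below computes E(U) and V(U) for U = a1·Y1 + a2·Y2 directly from a joint PMF and again from the formulas in terms of means, variances, and covariances. The table, the constants in `a`, and the helper names are all assumptions made for this example.

```python
from fractions import Fraction as Fr

# Hypothetical joint PMF of (Y1, Y2); the values are invented for illustration.
pmf = {
    (0, 0): Fr(2, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(2, 10),
    (2, 0): Fr(1, 10), (2, 1): Fr(2, 10),
}
a = (2, -3)  # arbitrary constants a_1, a_2
n = len(a)

def expect(p, g):
    """E[g(Y)] where g takes the whole outcome tuple y = (y1, y2)."""
    return sum(g(y) * prob for y, prob in p.items())

def mean(p, i):
    return expect(p, lambda y: y[i])

def cov(p, i, k):
    """COV(Y_i, Y_k); with i == k this is V(Y_i)."""
    mi, mk = mean(p, i), mean(p, k)
    return expect(p, lambda y: (y[i] - mi) * (y[k] - mk))

def u(y):
    """The linear function U = sum_i a_i * Y_i evaluated at outcome y."""
    return sum(ai * yi for ai, yi in zip(a, y))

# Direct computation from the distribution of U.
mean_direct = expect(pmf, u)
var_direct = expect(pmf, lambda y: (u(y) - mean_direct) ** 2)

# Computation from the linear-function formulas.
mean_formula = sum(a[i] * mean(pmf, i) for i in range(n))
var_formula = (
    sum(a[i] ** 2 * cov(pmf, i, i) for i in range(n))
    + 2 * sum(a[i] * a[k] * cov(pmf, i, k)
              for i in range(n - 1) for k in range(i + 1, n))
)

print(mean_direct, mean_formula)  # 1/2 1/2
print(var_direct, var_formula)    # 69/20 69/20
```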
Covariance of Two Linear Functions
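\[ Y_1, \ldots, Y_n \equiv \text{random variables with } E(Y_i) = \mu_i \]
\[ X_1, \ldots, X_m \equiv \text{random variables with } E(X_j) = \xi_j \]
\[ U_1 = \sum_{i=1}^{n} a_i Y_i \qquad U_2 = \sum_{j=1}^{m} b_j X_j \qquad \{a_i\}, \{b_j\} \equiv \text{constants} \]
\[ \begin{aligned} \mathrm{COV}(U_1, U_2) &= \mathrm{COV}\left( \sum_{i=1}^{n} a_i Y_i,\ \sum_{j=1}^{m} b_j X_j \right) \\ &= E\left[ \left( \sum_{i=1}^{n} a_i Y_i - \sum_{i=1}^{n} a_i \mu_i \right) \left( \sum_{j=1}^{m} b_j X_j - \sum_{j=1}^{m} b_j \xi_j \right) \right] \\ &= E\left[ \sum_{i=1}^{n} a_i (Y_i - \mu_i) \sum_{j=1}^{m} b_j (X_j - \xi_j) \right] \\ &= \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j E\left[ (Y_i - \mu_i)(X_j - \xi_j) \right] = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j\, \mathrm{COV}(Y_i, X_j) \end{aligned} \]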
Multinomial Distribution
• Extension of Binomial Distribution to
experiments where each trial can end in exactly
one of k categories
• n independent trials
• Probability a trial results in category i is pi
• Yi is the number of trials resulting in category i
• p1+…+pk = 1
• Y1+…+Yk = n
Multinomial Distribution
p y1 ,..., yk   PY1  y1 ,..., Yk  yk  
n!
yk
y1

p1 ... pk
y1!... yk !
k
k
i 1
i 1
  yi  n,  pi  1, yi  0, pi  0
n!
pi ( yi ) 
piyi (1  pi ) n  yi
yi !(n  yi )!
yi  0,1,.., n
(Yi has a marginal binomial distributi on)
 E (Yi )  npi
V (Yi )  npi (1  pi )
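As a sketch of the multinomial PMF, the code below evaluates the formula with exact fractions, confirms the probabilities over all outcomes sum to 1, and checks that the marginal of Y1 matches the binomial formula at one point. The values of `n` and `probs` and the function name `multinomial_pmf` are assumptions made for this example.

```python
from fractions import Fraction as Fr
from itertools import product
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n! / (y1! ... yk!) * p1^y1 ... pk^yk, with n = sum(counts)."""
    coef = factorial(sum(counts))
    for y in counts:
        coef //= factorial(y)  # stays an integer at every step
    return coef * prod(p ** y for p, y in zip(probs, counts))

# Illustrative parameters: k = 3 categories, n = 3 trials.
probs = (Fr(1, 2), Fr(3, 10), Fr(1, 5))
n = 3

# All count vectors (y1, y2, y3) with y1 + y2 + y3 = n.
outcomes = [c for c in product(range(n + 1), repeat=len(probs)) if sum(c) == n]

total = sum(multinomial_pmf(c, probs) for c in outcomes)

# Marginal P(Y1 = 1) by summing the joint PMF, versus the binomial formula.
marginal_y1 = sum(multinomial_pmf(c, probs) for c in outcomes if c[0] == 1)
binomial_y1 = (Fr(factorial(n), factorial(1) * factorial(n - 1))
               * probs[0] ** 1 * (1 - probs[0]) ** (n - 1))

print(multinomial_pmf((1, 1, 1), probs))  # 9/50
print(total)                              # 1
print(marginal_y1, binomial_y1)           # 3/8 3/8
```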
Multinomial Distribution
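Covariance of Y_j, Y_j':
\[ U_i = \begin{cases} 1 & \text{if trial } i \text{ results in category } j \\ 0 & \text{otherwise} \end{cases} \qquad E(U_i) = 1(p_j) + 0(1 - p_j) = p_j \]
\[ V_i = \begin{cases} 1 & \text{if trial } i \text{ results in category } j' \\ 0 & \text{otherwise} \end{cases} \qquad E(V_i) = p_{j'} \]
\[ E(U_i V_i) = 1(0) + 0(1) = 0 \quad \text{(each trial can result in only one category)} \]
\[ \Rightarrow\ \mathrm{COV}(U_i, V_i) = E(U_i V_i) - E(U_i) E(V_i) = 0 - p_j p_{j'} = -p_j p_{j'} \]
\[ \mathrm{COV}(U_i, V_{i'}) = 0, \quad i \ne i' \quad \text{(by independence)} \]
\[ Y_j = \sum_{i=1}^{n} U_i \qquad Y_{j'} = \sum_{i=1}^{n} V_i \]
\[ \begin{aligned} \Rightarrow\ \mathrm{COV}(Y_j, Y_{j'}) &= \mathrm{COV}\left( \sum_{i=1}^{n} U_i,\ \sum_{i'=1}^{n} V_{i'} \right) = \sum_{i=1}^{n} \sum_{i'=1}^{n} \mathrm{COV}(U_i, V_{i'}) \\ &= \sum_{i=1}^{n} \mathrm{COV}(U_i, V_i) + \sum_{i=1}^{n} \sum_{i' \ne i} \mathrm{COV}(U_i, V_{i'}) = -n p_j p_{j'} \end{aligned} \]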
Conditional Expectations
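Discrete Case:
\[ E[Y_1 \mid y_2] = E[Y_1 \mid Y_2 = y_2] = \sum_{\text{all } y_1} y_1\, p(y_1 \mid y_2) \]
\[ V[Y_1 \mid y_2] = V[Y_1 \mid Y_2 = y_2] = \sum_{\text{all } y_1} \left( y_1 - E[Y_1 \mid y_2] \right)^2 p(y_1 \mid y_2) \]
Continuous Case:
\[ E[Y_1 \mid y_2] = E[Y_1 \mid Y_2 = y_2] = \int_{-\infty}^{\infty} y_1\, f(y_1 \mid y_2)\, dy_1 \]
\[ V[Y_1 \mid y_2] = V[Y_1 \mid Y_2 = y_2] = \int_{-\infty}^{\infty} \left( y_1 - E[Y_1 \mid y_2] \right)^2 f(y_1 \mid y_2)\, dy_1 \]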
When E[Y1 | y2] is viewed as a function of y2, that function is called the regression of Y1 on Y2
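A minimal sketch of the discrete conditional mean and variance, tabulating E[Y1 | y2] at each level of y2 (the regression of Y1 on Y2 for this toy table). The joint PMF and the function name `cond_mean_var` are assumptions made for this example.

```python
from fractions import Fraction as Fr

# Hypothetical joint PMF p(y1, y2); the values are invented for illustration.
pmf = {
    (0, 0): Fr(2, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(2, 10),
    (2, 0): Fr(1, 10), (2, 1): Fr(2, 10),
}

def cond_mean_var(p, y2):
    """Return (E[Y1 | Y2 = y2], V[Y1 | Y2 = y2]) from the joint PMF."""
    p2 = sum(prob for (_, t2), prob in p.items() if t2 == y2)
    cond = {t1: prob / p2 for (t1, t2), prob in p.items() if t2 == y2}
    mean = sum(t1 * q for t1, q in cond.items())
    var = sum((t1 - mean) ** 2 * q for t1, q in cond.items())
    return mean, var

# Regression of Y1 on Y2: the conditional mean as a function of y2.
regression = {y2: cond_mean_var(pmf, y2)[0] for y2 in (0, 1)}
print(regression[0], regression[1])  # 4/5 6/5
print(cond_mean_var(pmf, 0)[1])      # 14/25
```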
Unconditional and Conditional Mean

E Y1    y1 f1 ( y1 )dy1 



  y1  f ( y1 , y2 )dy2  dy1 
 





  y1  f ( y1 | y2 ) f 2 ( y2 )dy2  dy1 
 





   y1 f ( y1 | y2 )dy1  f 2 ( y2 )dy2 

 
 

  E Y1 | y2 f 2 ( y2 )dy2  EY2 E Y1 | Y2 


Unconditional and Conditional Variance


V Y1 | Y2   E Y | Y2  E Y1 | Y2 
2
1
2



 EY2 V Y1 | Y2   EY2 E Y | Y2  E Y1 | Y2  
2
1
2
   

 E Y  E E Y | Y   
 E Y  E Y   E E Y | Y   E Y  
 V Y   E E Y | Y   E E Y | Y  
 EY2 E Y | Y2  EY2 E Y1 | Y2  
2
2
1
2
1
2
Y2
1
2
2
2
1
2
1
Y2
1
2
2
1
2
1
Y2
1
2
 V Y1   VY2 E Y1 | Y2 
2
Y2
 V (Y1 )  E[V (Y1 | Y2 )]  V [ E (Y1 | Y2 )]
1
2
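Both decompositions, the unconditional mean as the expected conditional mean and the unconditional variance as "expected conditional variance plus variance of the conditional mean", can be checked exactly on a small discrete table. The joint PMF and the variable names below are assumptions made for this example.

```python
from fractions import Fraction as Fr

# Hypothetical joint PMF p(y1, y2); the values are invented for illustration.
pmf = {
    (0, 0): Fr(2, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(2, 10),
    (2, 0): Fr(1, 10), (2, 1): Fr(2, 10),
}

# Unconditional mean and variance of Y1, straight from the joint PMF.
mean_y1 = sum(y1 * prob for (y1, _), prob in pmf.items())
var_y1 = sum((y1 - mean_y1) ** 2 * prob for (y1, _), prob in pmf.items())

# Conditional mean and variance of Y1 at each level of Y2, with weights p2(y2).
levels = sorted({y2 for (_, y2) in pmf})
p2, cond_mean, cond_var = {}, {}, {}
for y2 in levels:
    p2[y2] = sum(prob for (_, t2), prob in pmf.items() if t2 == y2)
    cond = {t1: prob / p2[y2] for (t1, t2), prob in pmf.items() if t2 == y2}
    cond_mean[y2] = sum(t1 * q for t1, q in cond.items())
    cond_var[y2] = sum((t1 - cond_mean[y2]) ** 2 * q for t1, q in cond.items())

# Outer expectations and variance taken over the distribution of Y2.
e_of_cond_mean = sum(cond_mean[y2] * p2[y2] for y2 in levels)
e_of_cond_var = sum(cond_var[y2] * p2[y2] for y2 in levels)
v_of_cond_mean = sum((cond_mean[y2] - e_of_cond_mean) ** 2 * p2[y2] for y2 in levels)

print(mean_y1, e_of_cond_mean)                  # 1 1
print(var_y1, e_of_cond_var + v_of_cond_mean)   # 3/5 3/5
```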
Compounding
• In some situations, in both theory and practice, the
model treats a parameter as a random variable
• Defect rate (P) varies from day to day, and we count
the number of sampled defectives each day (Y)
– Pi ~ Beta(α, β)
– Yi | Pi ~ Bin(n, Pi)
• Number of customers arriving at a store (A) varies
from day to day, and we may measure the total sales
(Y) each day
– Ai ~ Poisson(λ)
– Yi | Ai ~ Bin(Ai, p)
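The defect-rate example can be explored with a short simulation. The sketch below draws a daily defect rate from a Beta distribution, then a binomial count given that rate, and compares the simulated mean and variance of Y with the values implied by E[Y] = E[E(Y | P)] and V(Y) = E[V(Y | P)] + V[E(Y | P)]. The parameter values, the seed, and the number of simulated days are arbitrary choices for illustration, not values from the slides.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(0)         # fixed seed so the run is reproducible
alpha, beta, n = 2.0, 3.0, 10  # assumed Beta parameters and daily sample size
days = 20000                   # number of simulated days (arbitrary)

def draw_day():
    """One day: P ~ Beta(alpha, beta), then Y | P ~ Bin(n, P)."""
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(n))  # n Bernoulli(p) trials

ys = [draw_day() for _ in range(days)]

# Theoretical moments of P for a Beta(alpha, beta) distribution.
mean_p = alpha / (alpha + beta)
var_p = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

# E[Y] = E[nP];  V(Y) = E[nP(1 - P)] + V[nP], using E[P^2] = var_p + mean_p^2.
mean_theory = n * mean_p
var_theory = n * (mean_p - var_p - mean_p ** 2) + n ** 2 * var_p

print(fmean(ys), mean_theory)
print(pvariance(ys), var_theory)
```

With these assumed parameters the variance of Y is larger than the variance of a binomial count with a fixed defect rate equal to E[P], which is the practical signature of compounding.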