1.017/1.010 Class 11 Multivariate Probability Multiple Random Variables

advertisement
1.017/1.010 Class 11
Multivariate Probability
Multiple Random Variables
Recall the dart tossing experiment from Class 4. Treat the 2 dart
coordinates as two different scalar random variables x and y.
In this experiment the experimental outcome is the location where the dart
lands. The random variables x and y both depend on this outcome (they
are defined over the same sample space). In this case we can define the
following events:
A = [ x (ξ ) ≤ x] B = [ y (ξ ) ≤ y ] C = [ x (ξ ) ≤ x, y (ξ ) ≤ y ] = A I B = AB
x and y are independent if A and B are independent events for all x and y:
P(C) = P(AB) = P(A)P(B)
Another example …
Consider a time series constructing from a sequence of random variables
defined at different times (a series of n seismic observations or stream
flows x1, x2, x3, …, xn.). Each possible time series can be viewed as an
outcome ξ of an underlying experiment. Events can be defined as above:
Ai = [ x i (ξ ) ≤ xi ]
Aij = [ x i (ξ ) ≤ xi , x j (ξ ) ≤ x j ] = Ai I A j = Ai A j
xi and xj are independent if:
P(Aij) = P(Ai Aj) = P(Ai)P(Aj)
Multivariate Probability Distributions
Multivariate cumulative distribution function (CDF), for x, y continuous
or discrete:
Fxy ( x, y ) = P[( x (ξ ) ≤ x)( y (ξ ) ≤ y )]
Multivariate probability mass function (PMF), for x, y discrete:
p xy ( xi , y j ) = P[( x (ξ ) = xi )( y (ξ ) = y j )]
Multivariate probability density function (PDF), for x, y continuous:
1
f xy ( x, y ) =
∂ 2 Fxy ( x, y )
∂x∂y
If x and y are independent:
Fxy ( x, y ) = P[ x ≤ x]P[ y ≤ y ] = Fx ( x) F y ( y )
p xy ( xi , y j ) = p x ( x i ) p y ( y j )
f xy ( x, y ) = f x ( x) f y ( y )
Computing Probabilities from Multivariate Density Functions
Probability that (x, y)∈ the region D:
P [( x, y ) ∈ D] =
f ( x, y ) dxdy
∫
( x, y ) ∈ D
xy
Covariance and Correlation
Dependence between random variables x and y is frequently described with the
covariance and correlation:
+∞
Cov ( x , y ) = E[( x - x )( y - y )] =
∫ ( x - x )( y - y) f
xy
( x,y ) dx dy
-∞
Correl ( x , y ) =
Cov ( x, y )
Cov ( x, y )
=
1/ 2
Std ( x ) Std ( y )
[Var ( x )Var ( y )]
Uncorrelated x and y: Cov(x, y) = Correl (x, y) = 0
Independence implies uncorrelated (but not necessarily vice versa)
Examples
Two independent exponential random variables (parameters ax and ay):
f xy ( x, y ) = f x ( x) f y ( y ) =
 x
 y
 x 1
1
1
y
exp −  exp −  =
exp −
− 
ax
 a x a y 
 a y  a x a y
 ax  a y
ax = E(x), ay = E(y), Correl(x,y) = 0
2
Two dependent normally distributed random variables (parameters µx, µy, σx ,
σy, and ρ):
f xy ( x, y ) =
1
2π C
0.5
  ( Z − µ )' C −1 ( Z − µ )  
exp− 

2

 
Z = vector of random variables = [x y]′
µ = vector of means = [ E(x) E(y) ] ′
 σ x2
C = covariance matrix = C = 
 ρσ xσ y
ρσ xσ y 

σ y2 
σx = Std(x), σy = Std(y), ρ = Correl(x,y)
|C| = determinant of C = σ x2σ y2 (1 − ρ 2 )
1
C = inverse of C =
C
-1
 σ y2

− ρσ xσ y
− ρσ xσ y 

σ x2 
Multivariate probability distributions are rarely used except when:
1. The random variables are independent
2. The random variables are dependent but normally distributed
Exercise:
Use the MATLAB function mvnrnd to generate scatterplots of correlated
bivariate normal samples. This function takes as arguments the means of x and
y and the covariance matrix defined above (called SIGMA in the MATLAB
documentation).
Assume E[x] = 0, E[y] = 0, σx = 1, σy = 0. Use mvnrnd to generate 100 (x, y)
realizations . Use plot to plot each of these as a point on the (x,y) plane (do not
connect the points). Vary the correlation coefficient ρ to examine its effect on
the scatter. Consider ρ = 0., 0.5, 0.9. Use subplot to put plots for all 3 ρ values
on one page.
Copyright 2003 Massachusetts Institute of Technology
Last modified Oct. 8, 2003
3
Download