Probability of Random Vectors

Multiple Random Variables

Each outcome of a random experiment may need to be described by a set of $N > 1$ random variables $\{x_1, \cdots, x_N\}$, or in vector form:
$$X = [x_1, \cdots, x_N]^T$$
which is called a random vector. In signal processing, $X$ is often used to represent a set of $N$ samples of a random signal $x(t)$ (a random process).

Joint Distribution Function and Density Function

The joint distribution function of a random vector $X$ is defined as
$$F_X(u_1, \cdots, u_N) = P(x_1 < u_1, \cdots, x_N < u_N) = \int_{-\infty}^{u_1} \cdots \int_{-\infty}^{u_N} p(\xi_1, \cdots, \xi_N)\, d\xi_1 \cdots d\xi_N$$
where $p(\xi_1, \cdots, \xi_N)$ is the joint density function of the random vector $X$.

Independent Variables

For convenience, let us first consider two of the $N$ variables and rename them as $x$ and $y$. These two variables are independent iff
$$P(A \cap B) = P(x < u,\ y < v) = P(x < u)\,P(y < v) = P(A)\,P(B)$$
where events $A$ and $B$ are defined as "$x < u$" and "$y < v$", respectively. This definition is equivalent to
$$p(x, y) = p(x)\,p(y)$$
as this will lead to
$$P(x < u,\ y < v) = \int_{-\infty}^{u} \int_{-\infty}^{v} p(\xi, \eta)\, d\xi\, d\eta = \int_{-\infty}^{u} \int_{-\infty}^{v} p(\xi)\,p(\eta)\, d\xi\, d\eta = \int_{-\infty}^{u} p(\xi)\, d\xi \int_{-\infty}^{v} p(\eta)\, d\eta = P(x < u)\,P(y < v)$$

Similarly, a set of $N$ variables are independent iff
$$p(x_1, \cdots, x_N) = p(x_1)\,p(x_2) \cdots p(x_N)$$

Mean Vector

The expectation or mean of random variable $x_i$ is defined as
$$\mu_i = E(x_i) \triangleq \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \xi_i\, p(\xi_1, \cdots, \xi_N)\, d\xi_1 \cdots d\xi_N$$

The mean vector of random vector $X$ is defined as
$$M = E(X) \triangleq [E(x_1), \cdots, E(x_N)]^T = [\mu_1, \cdots, \mu_N]^T$$
which can be interpreted as the center of gravity of an $N$-dimensional object with $p(x_1, \cdots, x_N)$ being the density function.

Covariance Matrix

The variance of random variable $x_i$ is defined as
$$\sigma_i^2 \triangleq E[(x_i - \mu_i)^2] = E(x_i^2) - \mu_i^2 = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} (\xi_i - \mu_i)^2\, p(\xi_1, \cdots, \xi_N)\, d\xi_1 \cdots d\xi_N$$

The covariance of $x_i$ and $x_j$ is defined as
$$\sigma_{ij}^2 = \mathrm{Cov}(x_i, x_j) \triangleq E[(x_i - \mu_i)(x_j - \mu_j)] = E(x_i x_j) - \mu_i \mu_j = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \xi_i \xi_j\, p(\xi_1, \cdots, \xi_N)\, d\xi_1 \cdots d\xi_N - \mu_i \mu_j$$

The covariance matrix of a random vector $X$ is defined as
$$\Sigma = E[(X - M)(X - M)^T] = E(XX^T) - MM^T = \begin{bmatrix} & \vdots & \\ \cdots & \sigma_{ij}^2 & \cdots \\ & \vdots & \end{bmatrix}_{N \times N}$$
where
$$\sigma_{ij}^2 = E(x_i x_j) - \mu_i \mu_j$$
is the covariance of $x_i$ and $x_j$. When $i = j$, $\sigma_i^2 = E(x_i^2) - \mu_i^2$ is the variance of $x_i$, which can be interpreted as the amount of information, or energy, contained in the $i$th component of the signal $X$. The total information or energy contained in $X$ is represented by
$$\mathrm{tr}\,\Sigma = \sum_{i=1}^{N} \sigma_i^2$$

$\Sigma$ is symmetric as $\sigma_{ij}^2 = \sigma_{ji}^2$. Moreover, it can be shown that $\Sigma$ is also positive definite, i.e., all its eigenvalues $\{\lambda_1, \cdots, \lambda_N\}$ are greater than zero, and we have
$$\mathrm{tr}\,\Sigma = \sum_{i=1}^{N} \lambda_i > 0$$
and
$$\det \Sigma = \prod_{i=1}^{N} \lambda_i > 0$$

Two variables $x_i$ and $x_j$ are uncorrelated iff $\sigma_{ij}^2 = 0$, i.e.,
$$E(x_i x_j) = E(x_i)\,E(x_j) = \mu_i \mu_j$$
If this is true for all $i \ne j$, then $X$ is called uncorrelated or decorrelated and its covariance matrix $\Sigma$ becomes a diagonal matrix with only nonzero $\sigma_i^2$ ($i = 1, \cdots, N$) on its diagonal. If the $x_i$ ($i = 1, \cdots, N$) are independent, i.e., $p(x_1, \cdots, x_N) = p(x_1) \cdots p(x_N)$, then it is easy to show that they are also uncorrelated. However, uncorrelated variables are not necessarily independent. (But uncorrelated variables with normal distribution are also independent.)

Autocorrelation Matrix

The autocorrelation matrix of $X$ is defined as
$$R \triangleq E(XX^T) = \begin{bmatrix} & \vdots & \\ \cdots & r_{ij} & \cdots \\ & \vdots & \end{bmatrix}_{N \times N}$$
where
$$r_{ij} \triangleq E(x_i x_j) = \sigma_{ij}^2 + \mu_i \mu_j$$
Obviously $R$ is symmetric and we have
$$\Sigma = R - MM^T$$
When $M = 0$, we have $\Sigma = R$. Two variables $x_i$ and $x_j$ are orthogonal iff $r_{ij} = 0$. Zero-mean random variables which are uncorrelated are also orthogonal.
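The identities above can be checked numerically. The following NumPy sketch is only an illustration, not part of the original notes: it uses an arbitrary 3-dimensional normal vector with chosen $M$ and $\Sigma$, and approximates the expectations by averages over a large number of samples (the estimation section at the end of these notes justifies this replacement).

```python
import numpy as np

# Hypothetical example: a 3-dimensional random vector with chosen mean and covariance.
M_true = np.array([1.0, -2.0, 0.5])
Sigma_true = np.array([[2.0, 0.6, 0.2],
                       [0.6, 1.0, 0.3],
                       [0.2, 0.3, 0.5]])

rng = np.random.default_rng(0)
K = 200_000                                  # number of observed outcomes X_j
X = rng.multivariate_normal(M_true, Sigma_true, size=K)   # K x N array, one outcome per row

# Approximate the expectations E(.) by sample averages.
M = X.mean(axis=0)                           # mean vector        M  = E(X)
R = (X.T @ X) / K                            # autocorrelation    R  = E(X X^T)
Sigma = R - np.outer(M, M)                   # covariance         Sigma = E(X X^T) - M M^T

# Sigma is symmetric, positive definite, and its trace equals both the sum of the
# variances sigma_i^2 (diagonal) and the sum of the eigenvalues lambda_i.
eigvals = np.linalg.eigvalsh(Sigma)
print(np.allclose(Sigma, Sigma.T))           # True: symmetric
print(np.all(eigvals > 0))                   # True: positive definite
print(np.trace(Sigma), Sigma.diagonal().sum(), eigvals.sum())   # all (nearly) equal
```

With $K$ large, the printed quantities agree with the properties stated above up to sampling error.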
Mean and Covariance under Unitary Transforms

A unitary (orthogonal) transform of $X$ is defined as
$$\begin{cases} Y = A^T X \\ X = A Y \end{cases}$$
where $A$ is a unitary (orthogonal) matrix, $A^T = A^{-1}$, and $Y$ is another random vector. The mean vector $M_Y$ and the covariance matrix $\Sigma_Y$ of $Y$ are related to the $M_X$ and $\Sigma_X$ of $X$ as shown below:
$$M_Y = E(Y) = E(A^T X) = A^T E(X) = A^T M_X$$
$$\Sigma_Y = E(YY^T) - M_Y M_Y^T = E(A^T XX^T A) - A^T M_X M_X^T A = A^T E(XX^T) A - A^T M_X M_X^T A = A^T [E(XX^T) - M_X M_X^T] A = A^T \Sigma_X A$$

A unitary transform does not change the trace of $\Sigma$:
$$\mathrm{tr}\,\Sigma_Y = \mathrm{tr}\,[E(YY^T) - M_Y M_Y^T] = E[\mathrm{tr}\,(YY^T)] - \mathrm{tr}\,(M_Y M_Y^T) = E(Y^T Y) - M_Y^T M_Y = E(X^T A A^T X) - M_X^T A A^T M_X = E(X^T X) - M_X^T M_X = \mathrm{tr}\,\Sigma_X$$
which means the total amount of energy or information contained in $X$ is not changed after a unitary transform $Y = A^T X$ (although its distribution among the $N$ components is changed).

Normal Distribution

The density function of a normally distributed random vector $X$ is:
$$p(x_1, \cdots, x_N) = N(X; M, \Sigma) = \frac{1}{(2\pi)^{N/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(X - M)^T \Sigma^{-1} (X - M)\right]$$
where $M$ and $\Sigma$ are the mean vector and covariance matrix of $X$, respectively. When $N = 1$, $\Sigma$ and $M$ become $\sigma^2$ and $\mu$, respectively, and the density function becomes the single-variable normal distribution.

To find the shape of a normal distribution, consider the iso-value hypersurface in the $N$-dimensional space determined by the equation
$$N(X; M, \Sigma) = c_0$$
where $c_0$ is a constant. This equation can be written as
$$(X - M)^T \Sigma^{-1} (X - M) = c_1$$
where $c_1$ is another constant related to $c_0$, $M$ and $\Sigma$. For $N = 2$ variables $x$ and $y$, we have
$$(X - M)^T \Sigma^{-1} (X - M) = [x - \mu_x,\ y - \mu_y] \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix} \begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix} = a(x - \mu_x)^2 + b(x - \mu_x)(y - \mu_y) + c(y - \mu_y)^2 = c_1$$
Here we have assumed
$$\Sigma^{-1} = \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix}$$
The above quadratic equation represents an ellipse (instead of any other quadratic curve) centered at $M = [\mu_x, \mu_y]^T$, because $\Sigma^{-1}$, as well as $\Sigma$, is positive definite:
$$\det \Sigma^{-1} = ac - b^2/4 > 0$$

When $N > 2$, the equation $N(X; M, \Sigma) = c_0$ represents a hyper-ellipsoid in the $N$-dimensional space. The center and spatial distribution of this ellipsoid are determined by $M$ and $\Sigma$, respectively.

In particular, when $X = [x_1, \cdots, x_N]^T$ is decorrelated, i.e., $\sigma_{ij}^2 = 0$ for all $i \ne j$, $\Sigma$ becomes a diagonal matrix
$$\Sigma = \mathrm{diag}[\sigma_1^2, \cdots, \sigma_N^2] = \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_N^2 \end{bmatrix}$$
and the equation $N(X; M, \Sigma) = c_0$ can be written as
$$(X - M)^T \Sigma^{-1} (X - M) = \sum_{i=1}^{N} \frac{(x_i - \mu_i)^2}{\sigma_i^2} = c_1$$
which represents a standard ellipsoid with all its axes parallel to those of the coordinate system.

Estimation of $M$ and $\Sigma$

When $p(x_1, \cdots, x_N)$ is not known, $M$ and $\Sigma$ cannot be found from their definitions. However, they can be estimated if a large number of outcomes ($X_j$, $j = 1, \cdots, K$) of the random experiment in question can be observed. The mean vector $M$ can be estimated as
$$\hat{M} = \frac{1}{K} \sum_{j=1}^{K} X_j$$
i.e., the $i$th element of $M$ is estimated as
$$\hat{\mu}_i = \frac{1}{K} \sum_{j=1}^{K} x_i^{(j)}$$
where $x_i^{(j)}$ is the $i$th element of $X_j$. The autocorrelation $R$ can be estimated as
$$\hat{R} = \frac{1}{K} \sum_{j=1}^{K} X_j X_j^T$$
And the covariance matrix $\Sigma$ can be estimated as
$$\hat{\Sigma} = \frac{1}{K} \sum_{j=1}^{K} X_j X_j^T - \hat{M}\hat{M}^T = \hat{R} - \hat{M}\hat{M}^T$$
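As a closing illustration (again a sketch with arbitrary values, not part of the original notes), the estimators $\hat{M}$, $\hat{R}$ and $\hat{\Sigma}$ can be computed from observed outcomes, and an orthogonal matrix $A$ built from the eigenvectors of $\hat{\Sigma}$ can serve as the unitary transform $Y = A^T X$: the transformed covariance $A^T \hat{\Sigma} A$ is (nearly) diagonal, i.e., $Y$ is decorrelated, while the trace (the total energy) is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical observed outcomes X_j, j = 1..K, of a 2-dimensional random vector.
M_true = np.array([3.0, -1.0])
Sigma_true = np.array([[2.0, 1.2],
                       [1.2, 1.5]])
K = 100_000
X = rng.multivariate_normal(M_true, Sigma_true, size=K)   # K x N array

# Estimates from the observed outcomes (the formulas of the last section).
M_hat = X.mean(axis=0)                       # M^ = (1/K) sum_j X_j
R_hat = (X.T @ X) / K                        # R^ = (1/K) sum_j X_j X_j^T
Sigma_hat = R_hat - np.outer(M_hat, M_hat)   # Sigma^ = R^ - M^ M^T

# Use the orthonormal eigenvectors of Sigma^ as the unitary (orthogonal) matrix A.
eigvals, A = np.linalg.eigh(Sigma_hat)       # columns of A are eigenvectors, A^T = A^{-1}
Sigma_Y = A.T @ Sigma_hat @ A                # covariance of Y = A^T X

print(np.allclose(A.T @ A, np.eye(2)))                      # True: A is orthogonal
print(np.round(Sigma_Y, 6))                                  # (nearly) diagonal: Y decorrelated
print(np.isclose(np.trace(Sigma_Y), np.trace(Sigma_hat)))    # True: trace (energy) preserved
```

Choosing $A$ as the eigenvector matrix of $\hat{\Sigma}$ is what makes $\Sigma_Y$ diagonal; any other orthogonal $A$ would still preserve the trace but would in general leave $Y$ correlated.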